scale computer code: Topics by Science.gov

Sample records for scale computer code

Cloud Computing for Complex Performance Codes.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Appel, Gordon John; Hadgu, Teklu; Klein, Brandon Thorin

This report describes the use of cloud computing services for running complex public domain performance assessment problems. The work consisted of two phases. Phase 1 demonstrated that complex codes could run on several differently configured servers and compute trivial small-scale problems in a commercial cloud infrastructure. Phase 2 focused on proving that non-trivial large-scale problems could be computed in the commercial cloud environment. The cloud computing effort was successfully applied using codes of interest to the geohydrology and nuclear waste disposal modeling community.
SCALE: A modular code system for performing Standardized Computer Analyses for Licensing Evaluation. Volume 1, Part 2: Control modules S1--H1; Revision 5

DOE Office of Scientific and Technical Information (OSTI.GOV)

NONE

SCALE--a modular code system for Standardized Computer Analyses for Licensing Evaluation--has been developed by Oak Ridge National Laboratory at the request of the US Nuclear Regulatory Commission. The SCALE system utilizes well-established computer codes and methods within standard analysis sequences that (1) allow an input format designed for the occasional user and/or novice, (2) automate the data processing and coupling between modules, and (3) provide accurate and reliable results. System development has been directed at problem-dependent cross-section processing and analysis of criticality safety, shielding, heat transfer, and depletion/decay problems. Since the initial release of SCALE in 1980, the code system has been heavily used for evaluation of nuclear fuel facility and package designs. This revision documents Version 4.3 of the system.
Cross-indexing of binary SIFT codes for large-scale image search.

PubMed

Liu, Zhen; Li, Houqiang; Zhang, Liyan; Zhou, Wengang; Tian, Qi

2014-05-01

In recent years, there has been growing interest in mapping visual features into compact binary codes for applications on large-scale image collections. Encoding high-dimensional data as compact binary codes reduces the memory cost for storage. It also improves computational efficiency, since similarity can be measured efficiently by Hamming distance. In this paper, we propose a novel flexible scale invariant feature transform (SIFT) binarization (FSB) algorithm for large-scale image search. The FSB algorithm explores the magnitude patterns of the SIFT descriptor. It is unsupervised, and the generated binary codes are demonstrated to be distance-preserving. In addition, we propose a new searching strategy to find target features based on cross-indexing in the binary SIFT space and the original SIFT space. We evaluate our approach on two publicly released data sets. The experiments on a large-scale partial duplicate image retrieval system demonstrate the effectiveness and efficiency of the proposed algorithm.
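The Hamming-distance comparison mentioned in this abstract can be illustrated with a minimal sketch. It is not the FSB algorithm or the authors' cross-indexing scheme; the 8-bit codes and the helper names `hamming_distance` and `nearest_code` are hypothetical, chosen only to show why comparing binary codes is cheap.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions at which two binary codes differ."""
    # XOR leaves a 1 wherever the codes disagree; count those bits.
    return bin(a ^ b).count("1")


def nearest_code(query: int, database: list[int]) -> int:
    """Return the index of the database code closest to the query."""
    return min(range(len(database)),
               key=lambda i: hamming_distance(query, database[i]))


# Hypothetical 8-bit codes standing in for binarized descriptors.
codes = [0b10110010, 0b10110011, 0b01001101]
query = 0b10110111

distances = [hamming_distance(query, c) for c in codes]
best = nearest_code(query, codes)
print(distances, best)
```

A linear scan like this is only for illustration; a real retrieval system would index the codes rather than compare the query against every entry.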
Parallel Scaling Characteristics of Selected NERSC User Project Codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Skinner, David; Verdier, Francesca; Anand, Harsh

This report documents parallel scaling characteristics of NERSC user project codes between Fiscal Year 2003 and the first half of Fiscal Year 2004 (Oct 2002-March 2004). The codes analyzed cover 60% of all the CPU hours delivered during that time frame on seaborg, a 6080 CPU IBM SP and the largest parallel computer at NERSC. The scale in terms of concurrency and problem size of the workload is analyzed. Drawing on batch queue logs, performance data and feedback from researchers we detail the motivations, benefits, and challenges of implementing highly parallel scientific codes on current NERSC High Performance Computing systems. An evaluation and outlook of the NERSC workload for Allocation Year 2005 is presented.
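Parallel scaling characteristics of this kind are commonly summarized as speedup and parallel efficiency relative to the smallest run. The sketch below uses the standard strong-scaling definitions; the timings are hypothetical and are not taken from the NERSC report, and `strong_scaling` is an illustrative helper name.

```python
def strong_scaling(timings: dict[int, float]) -> dict[int, tuple[float, float]]:
    """Map processor count -> (speedup, efficiency) relative to the smallest run.

    timings maps a processor count to the wall-clock time for a fixed problem size.
    """
    base_p = min(timings)
    base_t = timings[base_p]
    result = {}
    for p, t in sorted(timings.items()):
        speedup = base_t / t
        # Ideal speedup when going from base_p to p processors is p / base_p.
        efficiency = speedup * base_p / p
        result[p] = (speedup, efficiency)
    return result


# Hypothetical wall-clock times in seconds for one fixed-size problem.
timings = {64: 1000.0, 128: 520.0, 256: 290.0}
table = strong_scaling(timings)
for p, (s, e) in table.items():
    print(f"{p:5d} procs  speedup {s:5.2f}  efficiency {e:6.1%}")
```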
GPU implementation of the linear scaling three dimensional fragment method for large scale electronic structure calculations

NASA Astrophysics Data System (ADS)

Jia, Weile; Wang, Jue; Chi, Xuebin; Wang, Lin-Wang

2017-02-01

LS3DF, the linear scaling three-dimensional fragment method, is an efficient linear scaling ab initio total energy electronic structure calculation code based on a divide-and-conquer strategy. In this paper, we present our GPU implementation of the LS3DF code. Our test results show that the GPU code can calculate systems with about ten thousand atoms fully self-consistently in roughly 10 min using thousands of computing nodes. This makes the electronic structure calculations of 10,000-atom nanosystems routine work. This speed is 4.5-6 times faster than the CPU calculations using the same number of nodes on the Titan machine in the Oak Ridge leadership computing facility (OLCF). Such speedup is achieved by (a) careful redesign of the computationally heavy kernels and (b) redesign of the communication pattern for heterogeneous supercomputers.
Extremely Scalable Spiking Neuronal Network Simulation Code: From Laptops to Exascale Computers.

PubMed

Jordan, Jakob; Ippen, Tammo; Helias, Moritz; Kitayama, Itaru; Sato, Mitsuhisa; Igarashi, Jun; Diesmann, Markus; Kunkel, Susanne

2018-01-01

State-of-the-art software tools for neuronal network simulations scale to the largest computing systems available today and enable investigations of large-scale networks of up to 10 % of the human cortex at a resolution of individual neurons and synapses. Due to an upper limit on the number of incoming connections of a single neuron, network connectivity becomes extremely sparse at this scale. To manage computational costs, simulation software ultimately targeting the brain scale needs to fully exploit this sparsity. Here we present a two-tier connection infrastructure and a framework for directed communication among compute nodes accounting for the sparsity of brain-scale networks. We demonstrate the feasibility of this approach by implementing the technology in the NEST simulation code and we investigate its performance in different scaling scenarios of typical network simulations. Our results show that the new data structures and communication scheme prepare the simulation kernel for post-petascale high-performance computing facilities without sacrificing performance in smaller systems.
Extremely Scalable Spiking Neuronal Network Simulation Code: From Laptops to Exascale Computers

PubMed Central

Jordan, Jakob; Ippen, Tammo; Helias, Moritz; Kitayama, Itaru; Sato, Mitsuhisa; Igarashi, Jun; Diesmann, Markus; Kunkel, Susanne

2018-01-01

State-of-the-art software tools for neuronal network simulations scale to the largest computing systems available today and enable investigations of large-scale networks of up to 10 % of the human cortex at a resolution of individual neurons and synapses. Due to an upper limit on the number of incoming connections of a single neuron, network connectivity becomes extremely sparse at this scale. To manage computational costs, simulation software ultimately targeting the brain scale needs to fully exploit this sparsity. Here we present a two-tier connection infrastructure and a framework for directed communication among compute nodes accounting for the sparsity of brain-scale networks. We demonstrate the feasibility of this approach by implementing the technology in the NEST simulation code and we investigate its performance in different scaling scenarios of typical network simulations. Our results show that the new data structures and communication scheme prepare the simulation kernel for post-petascale high-performance computing facilities without sacrificing performance in smaller systems. PMID:29503613
The TeraShake Computational Platform for Large-Scale Earthquake Simulations

NASA Astrophysics Data System (ADS)

Cui, Yifeng; Olsen, Kim; Chourasia, Amit; Moore, Reagan; Maechling, Philip; Jordan, Thomas

Geoscientific and computer science researchers with the Southern California Earthquake Center (SCEC) are conducting a large-scale, physics-based, computationally demanding earthquake system science research program with the goal of developing predictive models of earthquake processes. The computational demands of this program continue to increase rapidly as these researchers seek to perform physics-based numerical simulations of earthquake processes for larger ... To meet the needs of this research program, a multiple-institution team coordinated by SCEC has integrated several scientific codes into a numerical modeling-based research tool we call the TeraShake computational platform (TSCP). A central component in the TSCP is a highly scalable earthquake wave propagation simulation program called the TeraShake anelastic wave propagation (TS-AWP) code. In this chapter, we describe how we extended an existing, stand-alone, well-validated, finite-difference, anelastic wave propagation modeling code into the highly scalable and widely used TS-AWP and then integrated this code into the TeraShake computational platform that provides end-to-end (initialization to analysis) research capabilities. We also describe the techniques used to enhance the TS-AWP parallel performance on TeraGrid supercomputers, as well as the TeraShake simulation phases including input preparation, run time, data archive management, and visualization. As a result of our efforts to improve its parallel efficiency, the TS-AWP has now shown highly efficient strong scaling on over 40K processors on IBM’s BlueGene/L Watson computer. In addition, the TSCP has developed into a computational system that is useful to many members of the SCEC community for performing large-scale earthquake simulations.
SCALE: A modular code system for performing standardized computer analyses for licensing evaluation. Functional modules F1--F8 -- Volume 2, Part 1, Revision 4

DOE Office of Scientific and Technical Information (OSTI.GOV)

Greene, N.M.; Petrie, L.M.; Westfall, R.M.

SCALE--a modular code system for Standardized Computer Analyses for Licensing Evaluation--has been developed by Oak Ridge National Laboratory at the request of the US Nuclear Regulatory Commission. The SCALE system utilizes well-established computer codes and methods within standard analysis sequences that (1) allow an input format designed for the occasional user and/or novice, (2) automate the data processing and coupling between modules, and (3) provide accurate and reliable results. System development has been directed at problem-dependent cross-section processing and analysis of criticality safety, shielding, heat transfer, and depletion/decay problems. Since the initial release of SCALE in 1980, the code system has been heavily used for evaluation of nuclear fuel facility and package designs. This revision documents Version 4.2 of the system. The manual is divided into three volumes: Volume 1--for the control module documentation; Volume 2--for functional module documentation; and Volume 3--for documentation of the data libraries and subroutine libraries.
Strategy Plan A Methodology to Predict the Uniformity of Double-Shell Tank Waste Slurries Based on Mixing Pump Operation

DOE Office of Scientific and Technical Information (OSTI.GOV)

J.A. Bamberger; L.M. Liljegren; P.S. Lowery

This document presents an analysis of the mechanisms influencing mixing within double-shell slurry tanks. A research program to characterize mixing of slurries within tanks has been proposed. The research program presents a combined experimental and computational approach to produce correlations describing the tank slurry concentration profile (and therefore uniformity) as a function of mixer pump operating conditions. The TEMPEST computer code was used to simulate both a full-scale (prototype) and scaled (model) double-shell waste tank to predict flow patterns resulting from a stationary jet centered in the tank. The simulation results were used to evaluate flow patterns in the tank and to determine whether flow patterns are similar between the full-scale prototype and an existing 1/12-scale model tank. The flow patterns were sufficiently similar to recommend conducting scoping experiments at 1/12-scale. Also, TEMPEST-modeled velocity profiles of the near-floor jet were compared to experimental measurements of the near-floor jet with good agreement. Reported values of physical properties of double-shell tank slurries were analyzed to evaluate the range of properties appropriate for conducting scaled experiments. One-twelfth scale scoping experiments are recommended to confirm the prioritization of the dimensionless groups (gravitational settling, Froude, and Reynolds numbers) that affect slurry suspension in the tank. Two of the proposed 1/12-scale test conditions were modeled using the TEMPEST computer code to observe the anticipated flow fields. This information will be used to guide selection of sampling probe locations. Additional computer modeling is being conducted to model a particulate-laden, rotating jet centered in the tank. The results of this modeling effort will be compared to the scaled experimental data to quantify the agreement between the code and the 1/12-scale experiment.
The scoping experiment results will guide selection of parameters to be varied in the follow-on experiments. Data from the follow-on experiments will be used to develop correlations to describe slurry concentration profile as a function of mixing pump operating conditions. These data will also be used to further evaluate the computer model applications. If the agreement between the experimental data and the code predictions is good, the computer code will be recommended for use to predict slurry uniformity in the tanks under various operating conditions. If the agreement between the code predictions and experimental results is not good, the experimental data correlations will be used to predict slurry uniformity in the tanks within the range of correlation applicability.
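The need to prioritize the dimensionless groups can be seen in a small sketch: with the same fluid, a 1/12-scale model that matches the prototype Froude number cannot also match its Reynolds number. The code uses the textbook definitions Fr = v / sqrt(g L) and Re = rho v L / mu, which are assumed here and may differ from the jet-specific forms used in the report. The velocity, length, density, and viscosity values are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def froude(velocity: float, length: float, g: float = G) -> float:
    """Textbook Froude number: inertial versus gravitational effects."""
    return velocity / math.sqrt(g * length)


def reynolds(density: float, velocity: float, length: float, viscosity: float) -> float:
    """Textbook Reynolds number: inertial versus viscous effects."""
    return density * velocity * length / viscosity


scale = 1.0 / 12.0                 # geometric scale of the model tank
rho, mu = 1200.0, 5.0e-3           # hypothetical slurry density (kg/m^3), viscosity (Pa s)
proto_v, proto_len = 18.0, 0.15    # hypothetical jet velocity (m/s) and nozzle diameter (m)

model_len = proto_len * scale
model_v = proto_v * math.sqrt(scale)   # velocity chosen so the Froude numbers match

fr_proto, fr_model = froude(proto_v, proto_len), froude(model_v, model_len)
re_proto = reynolds(rho, proto_v, proto_len, mu)
re_model = reynolds(rho, model_v, model_len, mu)

print(f"Froude:   prototype {fr_proto:.3f}, model {fr_model:.3f}")
print(f"Reynolds: prototype {re_proto:.3e}, model {re_model:.3e}")
print(f"Reynolds ratio (model/prototype): {re_model / re_proto:.4f}")
```

Under Froude matching the Reynolds number ratio equals the geometric scale raised to the power 3/2, so the model runs at a much lower Reynolds number than the prototype.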
RADSOURCE. Volume 1, Part 1, A scaling factor prediction computer program technical manual and code validation: Final report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vance, J.N.; Holderness, J.H.; James, D.W.

1992-12-01

Waste stream scaling factors based on sampling programs are vulnerable to one or more of the following factors: sample representativeness, analytic accuracy, and measurement sensitivity. As an alternative to sample analyses or as a verification of the sampling results, this project proposes the use of the RADSOURCE code, which accounts for the release of fuel-source radionuclides. Once the release rates of these nuclides from fuel are known, the code develops scaling factors for waste streams based on easily measured Cobalt-60 (Co-60) and Cesium-137 (Cs-137). The project team developed mathematical models to account for the appearance rate of 10CFR61 radionuclides in reactor coolant. They based these models on the chemistry and nuclear physics of the radionuclides involved. Next, they incorporated the models into a computer code that calculates plant waste stream scaling factors based on reactor coolant gamma-isotopic data. Finally, the team performed special sampling at 17 reactors to validate the models in the RADSOURCE code.
Micro- and meso-scale simulations of magnetospheric processes related to the aurora and substorm morphology

NASA Technical Reports Server (NTRS)

Swift, Daniel W.

1991-01-01

The primary methodology during the grant period has been the use of micro- or meso-scale simulations to address specific questions concerning magnetospheric processes related to the aurora and substorm morphology. This approach, while useful in providing some answers, has its limitations. Many of the problems relating to the magnetosphere are inherently global and kinetic. Effort during the last year of the grant period has increasingly focused on development of a global-scale hybrid code to model the entire, coupled magnetosheath - magnetosphere - ionosphere system. In particular, numerical procedures for curvilinear coordinate generation and exactly conservative differencing schemes for hybrid codes in curvilinear coordinates have been developed. The new computer algorithms and the massively parallel computer architectures now make this global code a feasible proposition. Support provided by this project has played an important role in laying the groundwork for the eventual development of a global-scale code to model and forecast magnetospheric weather.
Sub-Selective Quantization for Learning Binary Codes in Large-Scale Image Search.

PubMed

Li, Yeqing; Liu, Wei; Huang, Junzhou

2018-06-01

Recently with the explosive growth of visual content on the Internet, large-scale image search has attracted intensive attention. It has been shown that mapping high-dimensional image descriptors to compact binary codes can lead to considerable efficiency gains in both storage and performing similarity computation of images. However, most existing methods still suffer from expensive training devoted to large-scale binary code learning. To address this issue, we propose a sub-selection based matrix manipulation algorithm, which can significantly reduce the computational cost of code learning. As case studies, we apply the sub-selection algorithm to several popular quantization techniques including cases using linear and nonlinear mappings. Crucially, we can justify the resulting sub-selective quantization by proving its theoretic properties. Extensive experiments are carried out on three image benchmarks with up to one million samples, corroborating the efficacy of the sub-selective quantization method in terms of image retrieval.
Status and future of MUSE

NASA Astrophysics Data System (ADS)

Harfst, S.; Portegies Zwart, S.; McMillan, S.

2008-12-01

We present MUSE, a software framework for combining existing computational tools from different astrophysical domains into a single multi-physics, multi-scale application. MUSE facilitates the coupling of existing codes written in different languages by providing inter-language tools and by specifying an interface between each module and the framework that represents a balance between generality and computational efficiency. This approach allows scientists to use combinations of codes to solve highly-coupled problems without the need to write new codes for other domains or significantly alter their existing codes. MUSE currently incorporates the domains of stellar dynamics, stellar evolution and stellar hydrodynamics for studying generalized stellar systems. We have now reached a "Noah's Ark" milestone, with (at least) two available numerical solvers for each domain. MUSE can treat multi-scale and multi-physics systems in which the time- and size-scales are well separated, like simulating the evolution of planetary systems, small stellar associations, dense stellar clusters, galaxies and galactic nuclei. In this paper we describe two examples calculated using MUSE: the merger of two galaxies and an N-body simulation with live stellar evolution. In addition, we demonstrate an implementation of MUSE on a distributed computer which may also include special-purpose hardware, such as GRAPEs or GPUs, to accelerate computations. The current MUSE code base is publicly available as open source at http://muse.li.
pycola: N-body COLA method code

NASA Astrophysics Data System (ADS)

Tassev, Svetlin; Eisenstein, Daniel J.; Wandelt, Benjamin D.; Zaldarriaga, Matias

2015-09-01

pycola is a multithreaded Python/Cython N-body code, implementing the Comoving Lagrangian Acceleration (COLA) method in the temporal and spatial domains, which trades accuracy at small scales to gain computational speed without sacrificing accuracy at large scales. This is especially useful for cheaply generating large ensembles of accurate mock halo catalogs required to study galaxy clustering and weak lensing. The COLA method achieves its speed by calculating the large-scale dynamics exactly using LPT while letting the N-body code solve for the small scales, without requiring it to capture exactly the internal dynamics of halos.
Quantum error correction in crossbar architectures

NASA Astrophysics Data System (ADS)

Helsen, Jonas; Steudtner, Mark; Veldhorst, Menno; Wehner, Stephanie

2018-07-01

A central challenge for the scaling of quantum computing systems is the need to control all qubits in the system without a large overhead. A solution for this problem in classical computing comes in the form of so-called crossbar architectures. Recently we made a proposal for a large-scale quantum processor (Li et al arXiv:1711.03807 (2017)) to be implemented in silicon quantum dots. This system features a crossbar control architecture which limits parallel single-qubit control, but allows the scheme to overcome control scaling issues that form a major hurdle to large-scale quantum computing systems. In this work, we develop a language that makes it possible to easily map quantum circuits to crossbar systems, taking into account their architecture and control limitations. Using this language we show how to map well known quantum error correction codes such as the planar surface and color codes in this limited control setting with only a small overhead in time. We analyze the logical error behavior of this surface code mapping for estimated experimental parameters of the crossbar system and conclude that logical error suppression to a level useful for real quantum computation is feasible.
CPMIP: measurements of real computational performance of Earth system models in CMIP6

NASA Astrophysics Data System (ADS)

Balaji, Venkatramani; Maisonnave, Eric; Zadeh, Niki; Lawrence, Bryan N.; Biercamp, Joachim; Fladrich, Uwe; Aloisio, Giovanni; Benson, Rusty; Caubel, Arnaud; Durachta, Jeffrey; Foujols, Marie-Alice; Lister, Grenville; Mocavero, Silvia; Underwood, Seth; Wright, Garrett

2017-01-01

A climate model represents a multitude of processes on a variety of timescales and space scales: a canonical example of multi-physics multi-scale modeling. The underlying climate system is physically characterized by sensitive dependence on initial conditions, and natural stochastic variability, so very long integrations are needed to extract signals of climate change. Algorithms generally possess weak scaling and can be I/O and/or memory-bound. Such weak-scaling, I/O, and memory-bound multi-physics codes present particular challenges to computational performance. Traditional metrics of computational efficiency such as performance counters and scaling curves do not tell us enough about real sustained performance from climate models on different machines. They also do not provide a satisfactory basis for comparative information across models. We introduce a set of metrics that can be used for the study of computational performance of climate (and Earth system) models. These measures do not require specialized software or specific hardware counters, and should be accessible to anyone. They are independent of platform and underlying parallel programming models. We show how these metrics can be used to measure actually attained performance of Earth system models on different machines, and identify the most fruitful areas of research and development for performance engineering. We present results for these measures for a diverse suite of models from several modeling centers, and propose to use these measures as a basis for a CPMIP, a computational performance model intercomparison project (MIP).
Validation of a modified table to map the 1998 Abbreviated Injury Scale to the 2008 scale and the use of adjusted severities.

PubMed

Tohira, Hideo; Jacobs, Ian; Mountain, David; Gibson, Nick; Yeo, Allen; Ueno, Masato; Watanabe, Hiroaki

2011-12-01

The Abbreviated Injury Scale 2008 (AIS 2008) is the most recent injury coding system. A mapping table from the previous AIS 98 to AIS 2008 is available. However, this table contains AIS 98 codes that cannot be mapped to AIS 2008 codes. Furthermore, some AIS 98 codes can be mapped to multiple candidate AIS 2008 codes with different severities. We aimed to modify the original table to adjust the severities and to validate these changes. We modified the original table by adding links from unmappable AIS 98 codes to AIS 2008 codes. We applied the original table and our modified table to AIS 98 codes for major trauma patients. We also assigned candidate codes with different severities the weighted averages of their severities as an adjusted severity. The proportions of cases whose injury severity scores (ISSs) were computable were compared. We also compared the agreement of the ISS and New ISS (NISS) between manually determined AIS 2008 codes (MAN) and codes mapped by using our table (MAP) with unadjusted or adjusted severities. ISSs could be computed for all cases with our modified table and for 72.3% of cases with the original table. The agreement between MAN and MAP with respect to the ISS and NISS was substantial (intraclass correlation coefficient = 0.939 for ISS and 0.943 for NISS). Using adjusted severities, the agreements of the ISS and NISS improved to 0.953 (p = 0.11) and 0.963 (p = 0.007), respectively. Our modified mapping table seems to allow more ISSs to be computed than the original table. Severity scores exhibited substantial agreement between MAN and MAP. The use of adjusted severities improved these agreements further.
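For readers unfamiliar with the scores compared in this abstract, the sketch below computes the ISS and NISS from a list of coded injuries using their standard definitions: the ISS sums the squares of the highest severities in the three most severely injured body regions, the NISS sums the squares of the three highest severities regardless of region, and any severity of 6 sets the score to 75. The patient data and region labels are hypothetical, and the sketch does not reproduce the mapping table or the weighting used for adjusted severities in the study.

```python
from collections import defaultdict

Injury = tuple[str, float]  # (ISS body region, AIS severity)


def iss(injuries: list[Injury]) -> float:
    """Injury Severity Score: worst severity per region, top three regions squared."""
    if any(sev == 6 for _, sev in injuries):
        return 75  # an unsurvivable injury sets the score to its maximum
    worst = defaultdict(float)
    for region, sev in injuries:
        worst[region] = max(worst[region], sev)
    top3 = sorted(worst.values(), reverse=True)[:3]
    return sum(s * s for s in top3)


def niss(injuries: list[Injury]) -> float:
    """New ISS: the three most severe injuries squared, regardless of region."""
    if any(sev == 6 for _, sev in injuries):
        return 75
    top3 = sorted((sev for _, sev in injuries), reverse=True)[:3]
    return sum(s * s for s in top3)


# Hypothetical patient with two head injuries, one chest and one extremity injury.
patient = [("head", 4), ("head", 3), ("chest", 3), ("extremities", 2)]
print(iss(patient), niss(patient))
```

Because the functions accept non-integer severities, the same sketch would also accept averaged (adjusted) severities, although the resulting scores would then no longer be integers.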
Hybrid reduced order modeling for assembly calculations

DOE PAGES

Bang, Youngsuk; Abdel-Khalik, Hany S.; Jessee, Matthew A.; ...

2015-08-14

While the accuracy of assembly calculations has greatly improved due to the increase in computer power enabling more refined description of the phase space and use of more sophisticated numerical algorithms, the computational cost continues to increase, which limits the full utilization of their effectiveness for routine engineering analysis. Reduced order modeling is a mathematical vehicle that scales down the dimensionality of large-scale numerical problems to enable their repeated execution in the small computing environments often available to end users. This is done by capturing the most dominant underlying relationships between the model's inputs and outputs. Previous works demonstrated the use of reduced order modeling for a single physics code, such as a radiation transport calculation. This paper extends those works to coupled code systems as currently employed in assembly calculations. Finally, numerical tests are conducted using realistic SCALE assembly models with resonance self-shielding, neutron transport, and nuclides transmutation/depletion models representing the components of the coupled code system.
The coupling of fluids, dynamics, and controls on advanced architecture computers

NASA Technical Reports Server (NTRS)

Atwood, Christopher

1995-01-01

This grant provided for the demonstration of coupled controls, body dynamics, and fluids computations in a workstation cluster environment, and an investigation of the impact of peer-to-peer communication on flow solver performance and robustness. The findings of these investigations were documented in the conference articles. The attached publication, 'Towards Distributed Fluids/Controls Simulations', documents the solution and scaling of the coupled Navier-Stokes, Euler rigid-body dynamics, and state feedback control equations for a two-dimensional canard-wing. The poor scaling shown was due to serialized grid connectivity computation and Ethernet bandwidth limits. The scaling of a peer-to-peer communication flow code on an IBM SP-2 was also shown. The scaling of the code on the switched fabric-linked nodes was good, with a 2.4 percent loss due to communication of intergrid boundary point information. The code performance on 30 worker nodes was 1.7 μs/point/iteration, or a factor of three over a Cray C-90 head. The attached paper, 'Nonlinear Fluid Computations in a Distributed Environment', documents the effect of several computational rate enhancing methods on convergence. For the cases shown, the highest throughput was achieved using boundary updates at each step, with the manager process performing communication tasks only. Constrained domain decomposition of the implicit fluid equations did not degrade the convergence rate or final solution. The scaling of a coupled body/fluid dynamics problem on an Ethernet-linked cluster was also shown.

Hypercube matrix computation task

NASA Technical Reports Server (NTRS)

Calalo, Ruel H.; Imbriale, William A.; Jacobi, Nathan; Liewer, Paulett C.; Lockhart, Thomas G.; Lyzenga, Gregory A.; Lyons, James R.; Manshadi, Farzin; Patterson, Jean E.

1988-01-01

A major objective of the Hypercube Matrix Computation effort at the Jet Propulsion Laboratory (JPL) is to investigate the applicability of a parallel computing architecture to the solution of large-scale electromagnetic scattering problems. Three scattering analysis codes are being implemented and assessed on a JPL/California Institute of Technology (Caltech) Mark 3 Hypercube. The codes, which utilize different underlying algorithms, give a means of evaluating the general applicability of this parallel architecture. The three analysis codes being implemented are a frequency domain method of moments code, a time domain finite difference code, and a frequency domain finite elements code. These analysis capabilities are being integrated into an electromagnetics interactive analysis workstation which can serve as a design tool for the construction of antennas and other radiating or scattering structures. The first two years of work on the Hypercube Matrix Computation effort are summarized. The summary includes new developments and results as well as work previously reported in the Hypercube Matrix Computation Task: Final Report for 1986 to 1987 (JPL Publication 87-18).
Constructing Neuronal Network Models in Massively Parallel Environments.

PubMed

Ippen, Tammo; Eppler, Jochen M; Plesser, Hans E; Diesmann, Markus

2017-01-01

Recent advances in the development of data structures to represent spiking neuron network models enable us to exploit the complete memory of petascale computers for a single brain-scale network simulation. In this work, we investigate how well we can exploit the computing power of such supercomputers for the creation of neuronal networks. Using an established benchmark, we divide the runtime of simulation code into the phase of network construction and the phase during which the dynamical state is advanced in time. We find that on multi-core compute nodes network creation scales well with process-parallel code but exhibits a prohibitively large memory consumption. Thread-parallel network creation, in contrast, exhibits speedup only up to a small number of threads but has little overhead in terms of memory. We further observe that the algorithms creating instances of model neurons and their connections scale well for networks of ten thousand neurons, but do not show the same speedup for networks of millions of neurons. Our work uncovers that the lack of scaling of thread-parallel network creation is due to inadequate memory allocation strategies and demonstrates that thread-optimized memory allocators recover excellent scaling. An analysis of the loop order used for network construction reveals that more complex tests on the locality of operations significantly improve scaling and reduce runtime by allowing construction algorithms to step through large networks more efficiently than in existing code. The combination of these techniques increases performance by an order of magnitude and harnesses the increasingly parallel compute power of the compute nodes in high-performance clusters and supercomputers.
Constructing Neuronal Network Models in Massively Parallel Environments

PubMed Central

Ippen, Tammo; Eppler, Jochen M.; Plesser, Hans E.; Diesmann, Markus

2017-01-01

Recent advances in the development of data structures to represent spiking neuron network models enable us to exploit the complete memory of petascale computers for a single brain-scale network simulation. In this work, we investigate how well we can exploit the computing power of such supercomputers for the creation of neuronal networks. Using an established benchmark, we divide the runtime of simulation code into the phase of network construction and the phase during which the dynamical state is advanced in time. We find that on multi-core compute nodes network creation scales well with process-parallel code but exhibits a prohibitively large memory consumption. Thread-parallel network creation, in contrast, exhibits speedup only up to a small number of threads but has little overhead in terms of memory. We further observe that the algorithms creating instances of model neurons and their connections scale well for networks of ten thousand neurons, but do not show the same speedup for networks of millions of neurons. Our work uncovers that the lack of scaling of thread-parallel network creation is due to inadequate memory allocation strategies and demonstrates that thread-optimized memory allocators recover excellent scaling. An analysis of the loop order used for network construction reveals that more complex tests on the locality of operations significantly improve scaling and reduce runtime by allowing construction algorithms to step through large networks more efficiently than in existing code. The combination of these techniques increases performance by an order of magnitude and harnesses the increasingly parallel compute power of the compute nodes in high-performance clusters and supercomputers. PMID:28559808
Computational Predictions of the Performance Wright 'Bent End' Propellers

NASA Technical Reports Server (NTRS)

Wang, Xiang-Yu; Ash, Robert L.; Bobbitt, Percy J.; Prior, Edwin (Technical Monitor)

2002-01-01

Computational analysis of two 1911 Wright brothers 'Bent End' wooden propeller reproductions has been performed and compared with experimental test results from the Langley Full Scale Wind Tunnel. The purpose of the analysis was to check the consistency of the experimental results and to validate the reliability of the tests. This report is one part of the project on the propeller performance research of the Wright 'Bent End' propellers, intended to document the Wright brothers' pioneering propeller design contributions. Two computer codes were used in the computational predictions. The FLO-MG Navier-Stokes code is a CFD (Computational Fluid Dynamics) code based on the Navier-Stokes equations. It is mainly used to compute the lift coefficient and the drag coefficient at specified angles of attack at different radii. Those calculated data are the intermediate results of the computation and a part of the necessary input for the Propeller Design Analysis Code (based on the Adkins and Liebeck method), which is a propeller design code used to compute the propeller thrust coefficient, the propeller power coefficient and the propeller propulsive efficiency.
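The three output quantities named in this abstract have standard textbook definitions, which can be sketched as below. This is a minimal illustration assuming the usual non-dimensionalization by air density, rotational speed in revolutions per second, and propeller diameter; the function name and argument names are hypothetical, and nothing here is taken from the Propeller Design Analysis Code itself.

```python
def propeller_coefficients(thrust, power, rho, n, diameter, velocity):
    """Return (C_T, C_P, J, eta) from dimensional propeller data.

    thrust   : thrust T in newtons
    power    : shaft power P in watts
    rho      : air density in kg/m^3
    n        : rotational speed in revolutions per second
    diameter : propeller diameter D in metres
    velocity : flight speed V in m/s
    """
    c_t = thrust / (rho * n**2 * diameter**4)   # thrust coefficient
    c_p = power / (rho * n**3 * diameter**5)    # power coefficient
    j = velocity / (n * diameter)               # advance ratio
    eta = j * c_t / c_p                         # propulsive efficiency, equal to T*V/P
    return c_t, c_p, j, eta
```

The efficiency computed this way reduces to thrust power over shaft power, T·V/P, which gives a quick sanity check on any set of inputs.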
Posttest analysis of the 1:6-scale reinforced concrete containment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pfeiffer, P.A.; Kennedy, J.M.; Marchertas, A.H.

A prediction of the response of the Sandia National Laboratories 1:6-scale reinforced concrete containment model test was made by Argonne National Laboratory. ANL along with nine other organizations performed a detailed nonlinear response analysis of the 1:6-scale model containment subjected to overpressurization in the fall of 1986. The two-dimensional code TEMP-STRESS and the three-dimensional NEPTUNE code were utilized (1) to predict the global response of the structure, (2) to identify global failure sites and the corresponding failure pressures and (3) to identify some local failure sites and pressure levels. A series of axisymmetric models was studied with the two-dimensional computer program TEMP-STRESS. The comparison of these pretest computations with test data from the containment model has provided a test for the capability of the respective finite element codes to predict global failure modes, and hence serves as a validation of these codes. Only the two-dimensional analyses will be discussed in this paper. 3 refs., 10 figs.
PARVMEC: An Efficient, Scalable Implementation of the Variational Moments Equilibrium Code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Seal, Sudip K; Hirshman, Steven Paul; Wingen, Andreas

The ability to sustain magnetically confined plasma in a state of stable equilibrium is crucial for optimal and cost-effective operations of fusion devices like tokamaks and stellarators. The Variational Moments Equilibrium Code (VMEC) is the de-facto serial application used by fusion scientists to compute magnetohydrodynamics (MHD) equilibria and study the physics of three dimensional plasmas in confined configurations. Modern fusion energy experiments have larger system scales with more interactive experimental workflows, both demanding faster analysis turnaround times on computational workloads that are stressing the capabilities of sequential VMEC. In this paper, we present PARVMEC, an efficient, parallel version of its sequential counterpart, capable of scaling to thousands of processors on distributed memory machines. PARVMEC is a non-linear code, with multiple numerical physics modules, each with its own computational complexity. A detailed speedup analysis supported by scaling results on 1,024 cores of a Cray XC30 supercomputer is presented. Depending on the mode of PARVMEC execution, speedup improvements of one to two orders of magnitude are reported. PARVMEC equips fusion scientists for the first time with a state-of-the-art capability for rapid, high fidelity analyses of magnetically confined plasmas at unprecedented scales.
Design Aspects of the Rayleigh Convection Code

NASA Astrophysics Data System (ADS)

Featherstone, N. A.

2017-12-01

Understanding the long-term generation of planetary or stellar magnetic field requires complementary knowledge of the large-scale fluid dynamics pervading large fractions of the object's interior. Such large-scale motions are sensitive to the system's geometry which, in planets and stars, is spherical to a good approximation. As a result, computational models designed to study such systems often solve the MHD equations in spherical geometry, frequently employing a spectral approach involving spherical harmonics. We present computational and user-interface design aspects of one such modeling tool, the Rayleigh convection code, which is suitable for deployment on desktop and petascale-hpc architectures alike. In this poster, we will present an overview of this code's parallel design and its built-in diagnostics-output package. Rayleigh has been developed with NSF support through the Computational Infrastructure for Geodynamics and is expected to be released as open-source software in winter 2017/2018.
Addressing the challenges of standalone multi-core simulations in molecular dynamics

NASA Astrophysics Data System (ADS)

Ocaya, R. O.; Terblans, J. J.

2017-07-01

Computational modelling in material science involves mathematical abstractions of force fields between particles with the aim to postulate, develop and understand materials by simulation. The aggregated pairwise interactions of the material's particles lead to a deduction of its macroscopic behaviours. For practically meaningful macroscopic scales, large amounts of data are generated, leading to vast execution times. Simulation times of hours, days or weeks for moderately sized problems are not uncommon. The reduction of simulation times, improved result accuracy and the associated software and hardware engineering challenges are the main motivations for much of the ongoing research in the computational sciences. This contribution is concerned mainly with simulations that can be done on a "standalone" computer based on Message Passing Interfaces (MPI), parallel code running on hardware platforms with wide specifications, such as single/multi-processor, multi-core machines with minimal reconfiguration for upward scaling of computational power. The widely available, documented and standardized MPI library provides this functionality through the MPI_Comm_size(), MPI_Comm_rank() and MPI_Reduce() functions. A survey of the literature shows that relatively little is written with respect to the efficient extraction of the inherent computational power in a cluster. In this work, we discuss the main avenues available to tap into this extra power without compromising computational accuracy. We also present methods to overcome the high inertia encountered in single-node-based computational molecular dynamics. We begin by surveying the current state of the art and discuss what it takes to achieve parallelism, efficiency and enhanced computational accuracy through program threads and message passing interfaces. Several code illustrations are given. The pros and cons of writing raw code as opposed to using heuristic, third-party code are also discussed. The growing trend towards graphical processor units and virtual computing clouds for high-performance computing is also discussed. Finally, we present the comparative results of vacancy formation energy calculations using our own parallelized standalone code called Verlet-Stormer velocity (VSV) operating on 30,000 copper atoms. The code is based on the Sutton-Chen implementation of the Finnis-Sinclair pairwise embedded atom potential. A link to the code is also given.
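The rank/size/reduce pattern mentioned in this abstract can be illustrated without an MPI installation. The sketch below is not the VSV code: it emulates the decomposition in plain Python, with `rank` and `size` standing in for the values MPI_Comm_rank() and MPI_Comm_size() would return and a plain `sum` standing in for MPI_Reduce(). The energy is the standard Sutton-Chen form, and the default parameters (`eps`, `a`, `c`, `n`, `m`) are placeholders, not fitted values for copper.

```python
import math

def sutton_chen_partial(positions, rank, size, eps=1.0, a=1.0, c=1.0, n=9, m=6):
    """Energy contribution of the atoms owned by one rank.

    Atoms are dealt out round-robin: rank r owns atoms r, r+size, r+2*size, ...
    Per-atom energy: eps * (0.5 * sum_j (a/r_ij)**n - c * sqrt(sum_j (a/r_ij)**m)).
    """
    total = 0.0
    for i in range(rank, len(positions), size):
        pair = 0.0   # repulsive pair sum for atom i
        rho = 0.0    # local density for atom i
        for j, pos_j in enumerate(positions):
            if j == i:
                continue
            r = math.dist(positions[i], pos_j)
            pair += (a / r) ** n
            rho += (a / r) ** m
        total += eps * (0.5 * pair - c * math.sqrt(rho))
    return total

def total_energy(positions, size, **params):
    """Sum the per-rank partial energies; stands in for an MPI_Reduce with a sum."""
    return sum(sutton_chen_partial(positions, rank, size, **params)
               for rank in range(size))
```

Because each atom's energy is computed entirely by the rank that owns it, the total is independent of how many ranks share the work, up to floating-point summation order.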
Computer code for the prediction of nozzle admittance

NASA Technical Reports Server (NTRS)

Nguyen, Thong V.

1988-01-01

A procedure which can accurately characterize injector designs for large thrust (0.5 to 1.5 million pounds), high pressure (500 to 3000 psia) LOX/hydrocarbon engines is currently under development. In this procedure, a rectangular cross-sectional combustion chamber is to be used to simulate the lower transverse frequency modes of the large scale chamber. The chamber will be sized so that the first width mode of the rectangular chamber corresponds to the first tangential mode of the full-scale chamber. Test data to be obtained from the rectangular chamber will be used to assess the full scale engine stability. This requires the development of combustion stability models for rectangular chambers. As part of the combustion stability model development, a computer code, NOAD, based on existing theory, was developed to calculate the nozzle admittances for both rectangular and axisymmetric nozzles. This code is detailed.
Analysis of Delays in Transmitting Time Code Using an Automated Computer Time Distribution System

DTIC Science & Technology

1999-12-01

An automated computer time distribution system broadcasts standard time to users using computers and modems via...contributed to delays - software platform (50% of the delay), transmission speed of time codes (25%), telephone network (15%), modem and others (10%). The... modems, and telephone lines. Users dial the ACTS server to receive time traceable to the national time scale of Singapore, UTC(PSB). The users can in
COLAcode: COmoving Lagrangian Acceleration code

NASA Astrophysics Data System (ADS)

Tassev, Svetlin V.

2016-02-01

COLAcode is a serial particle mesh-based N-body code illustrating the COLA (COmoving Lagrangian Acceleration) method; it solves for Large Scale Structure (LSS) in a frame that is comoving with observers following trajectories calculated in Lagrangian Perturbation Theory (LPT). It differs from standard N-body codes by trading accuracy at small scales to gain computational speed without sacrificing accuracy at large scales. This is useful for generating large ensembles of accurate mock halo catalogs required to study galaxy clustering and weak lensing; such catalogs are needed to perform detailed error analysis for ongoing and future surveys of LSS.
Teaching CAD on the Apple Computer.

ERIC Educational Resources Information Center

Norton, Robert L.

1984-01-01

Describes a course designed to teach engineers how to accomplish computer graphics techniques on a limited scale with the Apple computer. The same mathematics and program code will also function for larger and more complex computers. Course content, instructional strategies, student evaluation, and recommendations are considered. (JN)
Simulating the heterogeneity in braided channel belt deposits: 1. A geometric-based methodology and code

NASA Astrophysics Data System (ADS)

Ramanathan, Ramya; Guin, Arijit; Ritzi, Robert W.; Dominic, David F.; Freedman, Vicky L.; Scheibe, Timothy D.; Lunt, Ian A.

2010-04-01

A geometric-based simulation methodology was developed and incorporated into a computer code to model the hierarchical stratal architecture, and the corresponding spatial distribution of permeability, in braided channel belt deposits. The code creates digital models of these deposits as a three-dimensional cubic lattice, which can be used directly in numerical aquifer or reservoir models for fluid flow. The digital models have stratal units defined from the kilometer scale to the centimeter scale. These synthetic deposits are intended to be used as high-resolution base cases in various areas of computational research on multiscale flow and transport processes, including the testing of upscaling theories. The input parameters are primarily univariate statistics. These include the mean and variance for characteristic lengths of sedimentary unit types at each hierarchical level, and the mean and variance of log-permeability for unit types defined at only the lowest level (smallest scale) of the hierarchy. The code has been written for both serial and parallel execution. The methodology is described in part 1 of this paper. In part 2 (Guin et al., 2010), models generated by the code are presented and evaluated.
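The final step described above, assigning permeability to lattice cells from the mean and variance of log-permeability of each lowest-level unit type, might be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it assumes natural-log permeability is normally distributed within a unit type, that cells are sampled independently, and that the lattice already holds a unit-type label per cell. The names `sample_permeability`, `unit_lattice` and `log_k_stats` are hypothetical.

```python
import math
import random

def sample_permeability(unit_lattice, log_k_stats, seed=0):
    """Draw one permeability value per lattice cell.

    unit_lattice : nested list [z][y][x] of unit-type labels
    log_k_stats  : dict mapping label -> (mean, variance) of ln(permeability)
    seed         : seed for reproducible draws
    """
    rng = random.Random(seed)

    def draw(label):
        mean, variance = log_k_stats[label]
        # random.gauss expects a standard deviation, not a variance.
        return math.exp(rng.gauss(mean, math.sqrt(variance)))

    return [[[draw(label) for label in row] for row in plane]
            for plane in unit_lattice]
```

The resulting nested list has the same shape as the input lattice, so it can be passed cell by cell to a flow model that expects one permeability per grid block.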
Simulating the Heterogeneity in Braided Channel Belt Deposits: Part 1. A Geometric-Based Methodology and Code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ramanathan, Ramya; Guin, Arijit; Ritzi, Robert W.

A geometric-based simulation methodology was developed and incorporated into a computer code to model the hierarchical stratal architecture, and the corresponding spatial distribution of permeability, in braided channel belt deposits. The code creates digital models of these deposits as a three-dimensional cubic lattice, which can be used directly in numerical aquifer or reservoir models for fluid flow. The digital models have stratal units defined from the km scale to the cm scale. These synthetic deposits are intended to be used as high-resolution base cases in various areas of computational research on multiscale flow and transport processes, including the testing of upscaling theories. The input parameters are primarily univariate statistics. These include the mean and variance for characteristic lengths of sedimentary unit types at each hierarchical level, and the mean and variance of log-permeability for unit types defined at only the lowest level (smallest scale) of the hierarchy. The code has been written for both serial and parallel execution. The methodology is described in Part 1 of this series. In Part 2, models generated by the code are presented and evaluated.
Toward a first-principles integrated simulation of tokamak edge plasmas

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chang, C S; Klasky, Scott A; Cummings, Julian

2008-01-01

Performance of ITER is anticipated to be highly sensitive to the edge plasma condition. The edge pedestal in ITER needs to be predicted from an integrated simulation of the necessary first-principles, multi-scale physics codes. The mission of the SciDAC Fusion Simulation Project (FSP) Prototype Center for Plasma Edge Simulation (CPES) is to deliver such a code integration framework by (1) building new kinetic codes XGC0 and XGC1, which can simulate the edge pedestal buildup; (2) using and improving the existing MHD codes ELITE, M3D-OMP, M3D-MPP and NIMROD, for study of large-scale edge instabilities called Edge Localized Modes (ELMs); and (3) integrating the codes into a framework using cutting-edge computer science technology. Collaborative effort among physics, computer science, and applied mathematics within CPES has created the first working version of the End-to-end Framework for Fusion Integrated Simulation (EFFIS), which can be used to study the pedestal-ELM cycles.
Computational prediction of propellant reorientation

NASA Technical Reports Server (NTRS)

Hochstein, John I.

1987-01-01

Viewgraphs from a presentation on computational prediction of propellant reorientation are given. Information is given on code verification, test conditions, predictions for a one-quarter scale cryogenic tank, pulsed settling, and preliminary results.
Simulating Coupling Complexity in Space Plasmas: First Results from a new code

NASA Astrophysics Data System (ADS)

Kryukov, I.; Zank, G. P.; Pogorelov, N. V.; Raeder, J.; Ciardo, G.; Florinski, V. A.; Heerikhuisen, J.; Li, G.; Petrini, F.; Shematovich, V. I.; Winske, D.; Shaikh, D.; Webb, G. M.; Yee, H. M.

2005-12-01

The development of codes that embrace 'coupling complexity' via the self-consistent incorporation of multiple physical scales and multiple physical processes in models has been identified by the NRC Decadal Survey in Solar and Space Physics as a crucial necessary development in simulation/modeling technology for the coming decade. The National Science Foundation, through its Information Technology Research (ITR) Program, is supporting our efforts to develop a new class of computational code for plasmas and neutral gases that integrates multiple scales and multiple physical processes and descriptions. We are developing a highly modular, parallelized, scalable code that incorporates multiple scales by synthesizing 3 simulation technologies: 1) computational fluid dynamics (hydrodynamics or magneto-hydrodynamics-MHD) for the large-scale plasma; 2) direct Monte Carlo simulation of atoms/neutral gas; and 3) transport code solvers to model highly energetic particle distributions. We are constructing the code so that a fourth simulation technology, hybrid simulations for microscale structures and particle distributions, can be incorporated in future work, but for the present, this aspect will be addressed at a test-particle level. This synthesis will provide a computational tool that will advance our understanding of the physics of neutral and charged gases enormously. Besides making major advances in basic plasma physics and neutral gas problems, this project will address 3 Grand Challenge space physics problems that reflect our research interests: 1) To develop a temporal global heliospheric model which includes the interaction of solar and interstellar plasma with neutral populations (hydrogen, helium, etc., and dust), test-particle kinetic pickup ion acceleration at the termination shock, anomalous cosmic ray production, interaction with galactic cosmic rays, while incorporating the time variability of the solar wind and the solar cycle.
2) To develop a coronal mass ejection and interplanetary shock propagation model for the inner and outer heliosphere, including, at a test-particle level, wave-particle interactions and particle acceleration at traveling shock waves and compression regions. 3) To develop an advanced Geospace General Circulation Model (GGCM) capable of realistically modeling space weather events, in particular the interaction with CMEs and geomagnetic storms. Furthermore, by implementing scalable run-time supports and sophisticated off- and on-line prediction algorithms, we anticipate important advances in the development of automatic and intelligent system software to optimize a wide variety of 'embedded' computations on parallel computers. Finally, public domain MHD and hydrodynamic codes had a transforming effect on space and astrophysics. We expect that our new generation, open source, public domain multi-scale code will have a similar transformational effect in a variety of disciplines, opening up new classes of problems to physicists and engineers alike.
THC-MP: High performance numerical simulation of reactive transport and multiphase flow in porous media

NASA Astrophysics Data System (ADS)

Wei, Xiaohui; Li, Weishan; Tian, Hailong; Li, Hongliang; Xu, Haixiao; Xu, Tianfu

2015-07-01

The numerical simulation of multiphase flow and reactive transport in porous media for complex subsurface problems is computationally intensive. To meet the increasing computational requirements, this paper presents a parallel computing method and architecture. Starting from TOUGHREACT, a well-established code for simulating subsurface multiphase flow and reactive transport problems, we developed THC-MP, a high-performance code for massively parallel computers that greatly extends the computational capability of the original code. The domain decomposition method was applied to the coupled numerical computing procedure in THC-MP. We designed the distributed data structure, implemented data initialization and exchange between the computing nodes, and implemented the core solving module using a hybrid parallel iterative and direct solver. Numerical accuracy of THC-MP was verified on a CO2 injection-induced reactive transport problem by comparing results from parallel computing with those from sequential computing (original code). Execution efficiency and code scalability were examined through field-scale carbon sequestration applications on a multicore cluster. The results demonstrate the enhanced performance of THC-MP on parallel computing facilities.
Spiking network simulation code for petascale computers.

PubMed

Kunkel, Susanne; Schmidt, Maximilian; Eppler, Jochen M; Plesser, Hans E; Masumoto, Gen; Igarashi, Jun; Ishii, Shin; Fukai, Tomoki; Morrison, Abigail; Diesmann, Markus; Helias, Moritz

2014-01-01

Brain-scale networks exhibit a breathtaking heterogeneity in the dynamical properties and parameters of their constituents. At cellular resolution, the entities of theory are neurons and synapses and over the past decade researchers have learned to manage the heterogeneity of neurons and synapses with efficient data structures. Already early parallel simulation codes stored synapses in a distributed fashion such that a synapse solely consumes memory on the compute node harboring the target neuron. As petaflop computers with some 100,000 nodes become increasingly available for neuroscience, new challenges arise for neuronal network simulation software: Each neuron contacts on the order of 10,000 other neurons and thus has targets only on a fraction of all compute nodes; furthermore, for any given source neuron, at most a single synapse is typically created on any compute node. From the viewpoint of an individual compute node, the heterogeneity in the synaptic target lists thus collapses along two dimensions: the dimension of the types of synapses and the dimension of the number of synapses of a given type. Here we present a data structure taking advantage of this double collapse using metaprogramming techniques. After introducing the relevant scaling scenario for brain-scale simulations, we quantitatively discuss the performance on two supercomputers. We show that the novel architecture scales to the largest petascale supercomputers available today.
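The claim that a source neuron places at most one synapse on a typical compute node follows from the two figures quoted in the abstract, and the arithmetic can be checked with a short sketch. The simplifying assumption here, which is not stated in the abstract, is that each of a neuron's targets lands on a node independently and uniformly at random; the function name is hypothetical.

```python
def synapses_per_node_distribution(n_nodes, n_targets):
    """Binomial probabilities that one node holds 0, 1, or 2+ synapses
    from a single source neuron with n_targets targets spread over n_nodes nodes."""
    p = 1.0 / n_nodes                                   # chance a given target is on this node
    p_zero = (1.0 - p) ** n_targets
    p_one = n_targets * p * (1.0 - p) ** (n_targets - 1)
    p_two_plus = 1.0 - p_zero - p_one
    return p_zero, p_one, p_two_plus

# Figures from the abstract: some 100,000 nodes, on the order of 10,000 targets.
p_zero, p_one, p_two_plus = synapses_per_node_distribution(100_000, 10_000)
```

Under these assumptions about 90% of nodes hold no synapse from a given source, and of the nodes that hold any, roughly 95% hold exactly one, which is the regime the abstract describes.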
Real science at the petascale.

PubMed

Saksena, Radhika S; Boghosian, Bruce; Fazendeiro, Luis; Kenway, Owain A; Manos, Steven; Mazzeo, Marco D; Sadiq, S Kashif; Suter, James L; Wright, David; Coveney, Peter V

2009-06-28

We describe computational science research that uses petascale resources to achieve scientific results at unprecedented scales and resolution. The applications span a wide range of domains, from investigation of fundamental problems in turbulence through computational materials science research to biomedical applications at the forefront of HIV/AIDS research and cerebrovascular haemodynamics. This work was mainly performed on the US TeraGrid 'petascale' resource, Ranger, at Texas Advanced Computing Center, in the first half of 2008 when it was the largest computing system in the world available for open scientific research. We have sought to use this petascale supercomputer optimally across application domains and scales, exploiting the excellent parallel scaling performance found on up to at least 32 768 cores for certain of our codes in the so-called 'capability computing' category as well as high-throughput intermediate-scale jobs for ensemble simulations in the 32-512 core range. Furthermore, this activity provides evidence that conventional parallel programming with MPI should be successful at the petascale in the short to medium term. We also report on the parallel performance of some of our codes on up to 65 636 cores on the IBM Blue Gene/P system at the Argonne Leadership Computing Facility, which has recently been named the fastest supercomputer in the world for open science.

Spiking network simulation code for petascale computers

PubMed Central

Kunkel, Susanne; Schmidt, Maximilian; Eppler, Jochen M.; Plesser, Hans E.; Masumoto, Gen; Igarashi, Jun; Ishii, Shin; Fukai, Tomoki; Morrison, Abigail; Diesmann, Markus; Helias, Moritz

2014-01-01

Brain-scale networks exhibit a breathtaking heterogeneity in the dynamical properties and parameters of their constituents. At cellular resolution, the entities of theory are neurons and synapses and over the past decade researchers have learned to manage the heterogeneity of neurons and synapses with efficient data structures. Already early parallel simulation codes stored synapses in a distributed fashion such that a synapse solely consumes memory on the compute node harboring the target neuron. As petaflop computers with some 100,000 nodes become increasingly available for neuroscience, new challenges arise for neuronal network simulation software: Each neuron contacts on the order of 10,000 other neurons and thus has targets only on a fraction of all compute nodes; furthermore, for any given source neuron, at most a single synapse is typically created on any compute node. From the viewpoint of an individual compute node, the heterogeneity in the synaptic target lists thus collapses along two dimensions: the dimension of the types of synapses and the dimension of the number of synapses of a given type. Here we present a data structure taking advantage of this double collapse using metaprogramming techniques. After introducing the relevant scaling scenario for brain-scale simulations, we quantitatively discuss the performance on two supercomputers. We show that the novel architecture scales to the largest petascale supercomputers available today. PMID:25346682
Exclusively visual analysis of classroom group interactions

NASA Astrophysics Data System (ADS)

Tucker, Laura; Scherr, Rachel E.; Zickler, Todd; Mazur, Eric

2016-12-01

Large-scale audiovisual data that measure group learning are time consuming to collect and analyze. As an initial step towards scaling qualitative classroom observation, we qualitatively coded classroom video using an established coding scheme with and without its audio cues. We find that interrater reliability is as high when using visual data only—without audio—as when using both visual and audio data to code. Also, interrater reliability is high when comparing use of visual and audio data to visual-only data. We see a small bias to code interactions as group discussion when visual and audio data are used compared with video-only data. This work establishes that meaningful educational observation can be made through visual information alone. Further, it suggests that after initial work to create a coding scheme and validate it in each environment, computer-automated visual coding could drastically increase the breadth of qualitative studies and allow for meaningful educational analysis on a far greater scale.
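The abstract does not say which interrater reliability statistic was used. As one common choice for two raters assigning categorical codes, Cohen's kappa can be sketched as below; treating it as the relevant measure is an assumption, and the function name and example labels are hypothetical.

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two raters who coded the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e the agreement expected by chance from each rater's marginals.
    """
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(codes_a)
    p_o = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_o - p_e) / (1.0 - p_e)
```

Comparing kappa for the video-only condition against the video-plus-audio condition is the kind of check the abstract describes, whatever statistic was actually used.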
Radiant Energy Measurements from a Scaled Jet Engine Axisymmetric Exhaust Nozzle for a Baseline Code Validation Case

NASA Technical Reports Server (NTRS)

Baumeister, Joseph F.

1994-01-01

A non-flowing, electrically heated test rig was developed to verify computer codes that calculate radiant energy propagation from nozzle geometries that represent aircraft propulsion nozzle systems. Since there are a variety of analysis tools used to evaluate thermal radiation propagation from partially enclosed nozzle surfaces, an experimental benchmark test case was developed for code comparison. This paper briefly describes the nozzle test rig and the developed analytical nozzle geometry used to compare the experimental and predicted thermal radiation results. A major objective of this effort was to make available the experimental results and the analytical model in a format to facilitate conversion to existing computer code formats. For code validation purposes this nozzle geometry represents one validation case for one set of analysis conditions. Since each computer code has advantages and disadvantages based on scope, requirements, and desired accuracy, the usefulness of this single nozzle baseline validation case can be limited for some code comparisons.
Hypercube matrix computation task

NASA Technical Reports Server (NTRS)

Calalo, R.; Imbriale, W.; Liewer, P.; Lyons, J.; Manshadi, F.; Patterson, J.

1987-01-01

The Hypercube Matrix Computation (Year 1986-1987) task investigated the applicability of a parallel computing architecture to the solution of large scale electromagnetic scattering problems. Two existing electromagnetic scattering codes were selected for conversion to the Mark III Hypercube concurrent computing environment. They were selected so that the underlying numerical algorithms utilized would be different, thereby providing a more thorough evaluation of the appropriateness of the parallel environment for these types of problems. The first code was a frequency domain method of moments solution, NEC-2, developed at Lawrence Livermore National Laboratory. The second code was a time domain finite difference solution of Maxwell's equations to solve for the scattered fields. Once the codes were implemented on the hypercube and verified to obtain correct solutions by comparing the results with those from sequential runs, several measures were used to evaluate the performance of the two codes. First, a comparison was provided of the problem size possible on the hypercube with 128 megabytes of memory for a 32-node configuration with that available in a typical sequential user environment of 4 to 8 megabytes. Then, the performance of the codes was analyzed for the computational speedup attained by the parallel architecture.
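To give a feel for the problem-size comparison mentioned above, the following back-of-the-envelope sketch estimates how many unknowns a dense method-of-moments matrix can hold in a given amount of memory. The assumptions are illustrative and not from the report: the N-by-N matrix is the only thing stored, each entry takes 8 bytes (a single-precision complex number), and a megabyte is 2^20 bytes.

```python
import math

MIB = 2**20  # bytes per megabyte, taken here as 2^20

def max_dense_unknowns(memory_bytes, bytes_per_entry=8):
    """Largest N such that an N x N dense matrix fits in memory_bytes."""
    return math.isqrt(memory_bytes // bytes_per_entry)

hypercube_n = max_dense_unknowns(128 * MIB)   # 32-node hypercube total
sequential_hi = max_dense_unknowns(8 * MIB)   # upper end of the sequential range
sequential_lo = max_dense_unknowns(4 * MIB)   # lower end of the sequential range
```

Under these assumptions the 128-megabyte hypercube holds a matrix with 4096 unknowns against 724 to 1024 on the sequential machine: memory grows by a factor of 16 to 32, but the number of unknowns only by its square root, a factor of 4 to about 5.7.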
High-speed inlet research program and supporting analysis

NASA Technical Reports Server (NTRS)

Coltrin, Robert E.

1990-01-01

The technology challenges faced by the high speed inlet designer are discussed by describing the considerations that went into the design of the Mach 5 research inlet. It is shown that the emerging three dimensional viscous computational fluid dynamics (CFD) flow codes, together with small scale experiments, can be used to guide larger scale full inlet systems research. Then, in turn, the results of the large scale research, if properly instrumented, can be used to validate or at least to calibrate the CFD codes.
NEAMS Update. Quarterly Report for October - December 2011.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bradley, K.

2012-02-16

The Advanced Modeling and Simulation Office within the DOE Office of Nuclear Energy (NE) has been charged with revolutionizing the design tools used to build nuclear power plants during the next 10 years. To accomplish this, the DOE has brought together the national laboratories, U.S. universities, and the nuclear energy industry to establish the Nuclear Energy Advanced Modeling and Simulation (NEAMS) Program. The mission of NEAMS is to modernize computer modeling of nuclear energy systems and improve the fidelity and validity of modeling results using contemporary software environments and high-performance computers. NEAMS will create a set of engineering-level codes aimed at designing and analyzing the performance and safety of nuclear power plants and reactor fuels. The truly predictive nature of these codes will be achieved by modeling the governing phenomena at the spatial and temporal scales that dominate the behavior. These codes will be executed within a simulation environment that orchestrates code integration with respect to spatial meshing, computational resources, and execution to give the user a common 'look and feel' for setting up problems and displaying results. NEAMS is building upon a suite of existing simulation tools, including those developed by the federal Scientific Discovery through Advanced Computing and Advanced Simulation and Computing programs. NEAMS also draws upon existing simulation tools for materials and nuclear systems, although many of these are limited in terms of scale, applicability, and portability (their ability to be integrated into contemporary software and hardware architectures). NEAMS investments have directly and indirectly supported additional NE research and development programs, including those devoted to waste repositories, safeguarded separations systems, and long-term storage of used nuclear fuel. NEAMS is organized into two broad efforts, each comprising four elements.
The quarterly highlights October-December 2011 are: (1) Version 1.0 of AMP, the fuel assembly performance code, was tested on the JAGUAR supercomputer and released on November 1, 2011, a detailed discussion of this new simulation tool is given; (2) A coolant sub-channel model and a preliminary UO{sub 2} smeared-cracking model were implemented in BISON, the single-pin fuel code, more information on how these models were developed and benchmarked is given; (3) The Object Kinetic Monte Carlo model was implemented to account for nucleation events in meso-scale simulations and a discussion of the significance of this advance is given; (4) The SHARP neutronics module, PROTEUS, was expanded to be applicable to all types of reactors, and a discussion of the importance of PROTEUS is given; (5) A plan has been finalized for integrating the high-fidelity, three-dimensional reactor code SHARP with both the systems-level code RELAP7 and the fuel assembly code AMP. This is a new initiative; (6) Work began to evaluate the applicability of AMP to the problem of dry storage of used fuel and to define a relevant problem to test the applicability; (7) A code to obtain phonon spectra from the force-constant matrix for a crystalline lattice has been completed. This important bridge between subcontinuum and continuum phenomena is discussed; (8) Benchmarking was begun on the meso-scale, finite-element fuels code MARMOT to validate its new variable splitting algorithm; (9) A very computationally demanding simulation of diffusion-driven nucleation of new microstructural features has been completed. 
An explanation of the difficulty of this simulation is given; (10) Experiments were conducted with deformed steel to validate a crystal plasticity finite-element code for body-centered cubic iron; (11) The Capability Transfer Roadmap was completed and published as an internal laboratory technical report; (12) The AMP fuel assembly code input generator was integrated into the NEAMS Integrated Computational Environment (NiCE). More details on the planned NEAMS computing environment are given; and (13) The NEAMS program website (neams.energy.gov) is nearly ready to launch.
Technology Infusion of CodeSonar into the Space Network Ground Segment

NASA Technical Reports Server (NTRS)

Benson, Markland J.

2009-01-01

This slide presentation reviews the applicability of CodeSonar to the Space Network software. CodeSonar is a commercial off the shelf system that analyzes programs written in C, C++ or Ada for defects in the code. Software engineers use CodeSonar results as an input to the existing source code inspection process. The study is focused on large scale software developed using formal processes. The systems studied are mission critical in nature but some use commodity computer systems.
Fully kinetic 3D simulations of the Hermean magnetosphere under realistic conditions: a new approach

NASA Astrophysics Data System (ADS)

Amaya, Jorge; Gonzalez-Herrero, Diego; Lembège, Bertrand; Lapenta, Giovanni

2017-04-01

Simulations of the magnetosphere of planets are usually performed using the MHD and the hybrid approaches. However, these two methods still rely on approximations for the computation of the pressure tensor, and require the neutrality of the plasma at every point of the domain by construction. These approximations undermine the role of electrons on the emergence of plasma features in the magnetosphere of planets. The high mobility of electrons, their characteristic time and space scales, and the lack of perfect neutrality, are the source of many observed phenomena in the magnetospheres, including the turbulence energy cascade, the magnetic reconnection, the particle acceleration in the shock front and the formation of current systems around the magnetosphere. Fully kinetic codes are extremely demanding of computing time, and have been unable to perform simulations of the full magnetosphere at the real scales of a planet with realistic plasma conditions. This is caused by two main reasons: 1) explicit codes must resolve the electron scales limiting the time and space discretisation, and 2) current versions of semi-implicit codes are unstable for cell sizes larger than a few Debye lengths. In this work we present new simulations performed with ECsim, an Energy Conserving semi-implicit method [1], that can overcome these two barriers. We compare the solutions obtained with ECsim with the solutions obtained by the classic semi-implicit code iPic3D [2]. The new simulations with ECsim demand a larger computational effort, but the time and space discretisations are larger than those in iPic3D allowing for a faster simulation time of the full planetary environment. The new code, ECsim, can reach a resolution allowing the capture of significant large scale physics without losing kinetic electron information, such as wave-electron interaction and non-Maxwellian electron velocity distributions [3].
The code is able to better capture the thickness of the different boundary layers of the magnetosphere of Mercury. Electron kinetics are consistent with the spatial and temporal scale resolutions. Simulations are compared with measurements from the MESSENGER spacecraft showing a better fit when compared against the classic fully kinetic code iPic3D. These results show that the new generation of Energy Conserving semi-implicit codes can be used for an accurate analysis and interpretation of particle data from magnetospheric missions like BepiColombo and MMS, including electron velocity distributions and electron temperature anisotropies. [1] Lapenta, G. (2016). Exactly Energy Conserving Implicit Moment Particle in Cell Formulation. arXiv preprint arXiv:1602.06326. [2] Markidis, S., & Lapenta, G. (2010). Multi-scale simulations of plasma with iPIC3D. Mathematics and Computers in Simulation, 80(7), 1509-1519. [3] Lapenta, G., Gonzalez-Herrero, D., & Boella, E. (2016). Multiple scale kinetic simulations with the energy conserving semi implicit particle in cell (PIC) method. arXiv preprint arXiv:1612.08289.
Visual analysis of inter-process communication for large-scale parallel computing.

PubMed

Muelder, Chris; Gygi, Francois; Ma, Kwan-Liu

2009-01-01

In serial computation, program profiling is often helpful for optimization of key sections of code. When moving to parallel computation, not only does the code execution need to be considered but also communication between the different processes which can induce delays that are detrimental to performance. As the number of processes increases, so does the impact of the communication delays on performance. For large-scale parallel applications, it is critical to understand how the communication impacts performance in order to make the code more efficient. There are several tools available for visualizing program execution and communications on parallel systems. These tools generally provide either views which statistically summarize the entire program execution or process-centric views. However, process-centric visualizations do not scale well as the number of processes gets very large. In particular, the most common representation of parallel processes is a Gantt chart with a row for each process. As the number of processes increases, these charts can become difficult to work with and can even exceed screen resolution. We propose a new visualization approach that affords more scalability and then demonstrate it on systems running with up to 16,384 processes.
Proceedings of the 14th International Conference on the Numerical Simulation of Plasmas

NASA Astrophysics Data System (ADS)

Partial Contents are as follows: Numerical Simulations of the Vlasov-Maxwell Equations by Coupled Particle-Finite Element Methods on Unstructured Meshes; Electromagnetic PIC Simulations Using Finite Elements on Unstructured Grids; Modelling Travelling Wave Output Structures with the Particle-in-Cell Code CONDOR; SST--A Single-Slice Particle Simulation Code; Graphical Display and Animation of Data Produced by Electromagnetic, Particle-in-Cell Codes; A Post-Processor for the PEST Code; Gray Scale Rendering of Beam Profile Data; A 2D Electromagnetic PIC Code for Distributed Memory Parallel Computers; 3-D Electromagnetic PIC Simulation on the NRL Connection Machine; Plasma PIC Simulations on MIMD Computers; Vlasov-Maxwell Algorithm for Electromagnetic Plasma Simulation on Distributed Architectures; MHD Boundary Layer Calculation Using the Vortex Method; and Eulerian Codes for Plasma Simulations.
New Challenges in Computational Thermal Hydraulics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yadigaroglu, George; Lakehal, Djamel

New needs and opportunities drive the development of novel computational methods for the design and safety analysis of light water reactors (LWRs). Some new methods are likely to be three dimensional. Coupling is expected between system codes, computational fluid dynamics (CFD) modules, and cascades of computations at scales ranging from the macro- or system scale to the micro- or turbulence scales, with the various levels continuously exchanging information back and forth. The ISP-42/PANDA and the international SETH project provide opportunities for testing applications of single-phase CFD methods to LWR safety problems. Although industrial single-phase CFD applications are commonplace, computational multifluid dynamics is still under development. However, first applications are appearing; the state of the art and its potential uses are discussed. The case study of condensation of steam/air mixtures injected from a downward-facing vent into a pool of water is a perfect illustration of a simulation cascade: At the top of the hierarchy of scales, system behavior can be modeled with a system code; at the central level, the volume-of-fluid method can be applied to predict large-scale bubbling behavior; at the bottom of the cascade, direct-contact condensation can be treated with direct numerical simulation, in which turbulent flow (in both the gas and the liquid), interfacial dynamics, and heat/mass transfer are directly simulated without resorting to models.
Concurrent electromagnetic scattering analysis

NASA Technical Reports Server (NTRS)

Patterson, Jean E.; Cwik, Tom; Ferraro, Robert D.; Jacobi, Nathan; Liewer, Paulett C.; Lockhart, Thomas G.; Lyzenga, Gregory A.; Parker, Jay

1989-01-01

The computational power of the hypercube parallel computing architecture is applied to the solution of large-scale electromagnetic scattering and radiation problems. Three analysis codes have been implemented. A Hypercube Electromagnetic Interactive Analysis Workstation was developed to aid in the design and analysis of metallic structures such as antennas and to facilitate the use of these analysis codes. The workstation provides a general user environment for specification of the structure to be analyzed and graphical representations of the results.
Using Abbreviated Injury Scale (AIS) codes to classify Computed Tomography (CT) features in the Marshall System.

PubMed

Lesko, Mehdi M; Woodford, Maralyn; White, Laura; O'Brien, Sarah J; Childs, Charmaine; Lecky, Fiona E

2010-08-06

The purpose of the Abbreviated Injury Scale (AIS) is to code various types of Traumatic Brain Injuries (TBI) based on their anatomical location and severity. The Marshall CT Classification is used to identify those subgroups of brain injured patients at higher risk of deterioration or mortality. The purpose of this study is to determine whether and how AIS coding can be translated to the Marshall Classification. Initially, a Marshall Class was allocated to each AIS code through cross-tabulation. This was agreed upon through several discussion meetings with experts from both fields (clinicians and AIS coders). Furthermore, in order to make this translation possible, some necessary assumptions with regard to coding and classification of mass lesions and brain swelling were essential, all of which were approved and made explicit. The proposed method involves two stages: first, to determine, via cross-tabulation, all possible Marshall Classes which a given patient can attract based on allocated AIS codes; and second, to assign one Marshall Class to each patient through an algorithm. This method can be easily programmed in computer software and would enable future important TBI research programs using trauma registry data.
Using Abbreviated Injury Scale (AIS) codes to classify Computed Tomography (CT) features in the Marshall System

PubMed Central

2010-01-01

Background: The purpose of the Abbreviated Injury Scale (AIS) is to code various types of Traumatic Brain Injuries (TBI) based on their anatomical location and severity. The Marshall CT Classification is used to identify those subgroups of brain injured patients at higher risk of deterioration or mortality. The purpose of this study is to determine whether and how AIS coding can be translated to the Marshall Classification. Methods: Initially, a Marshall Class was allocated to each AIS code through cross-tabulation. This was agreed upon through several discussion meetings with experts from both fields (clinicians and AIS coders). Furthermore, in order to make this translation possible, some necessary assumptions with regard to coding and classification of mass lesions and brain swelling were essential, all of which were approved and made explicit. Results: The proposed method involves two stages: first, to determine, via cross-tabulation, all possible Marshall Classes which a given patient can attract based on allocated AIS codes; and second, to assign one Marshall Class to each patient through an algorithm. Conclusion: This method can be easily programmed in computer software and would enable future important TBI research programs using trauma registry data. PMID:20691038
High-Threshold Fault-Tolerant Quantum Computation with Analog Quantum Error Correction

NASA Astrophysics Data System (ADS)

Fukui, Kosuke; Tomita, Akihisa; Okamoto, Atsushi; Fujii, Keisuke

2018-04-01

To implement fault-tolerant quantum computation with continuous variables, the Gottesman-Kitaev-Preskill (GKP) qubit has been recognized as an important technological element. However, it is still challenging to experimentally generate the GKP qubit with the required squeezing level, 14.8 dB, of the existing fault-tolerant quantum computation. To reduce this requirement, we propose a high-threshold fault-tolerant quantum computation with GKP qubits using topologically protected measurement-based quantum computation with the surface code. By harnessing analog information contained in the GKP qubits, we apply analog quantum error correction to the surface code. Furthermore, we develop a method to prevent the squeezing level from decreasing during the construction of the large-scale cluster states for the topologically protected, measurement-based, quantum computation. We numerically show that the required squeezing level can be relaxed to less than 10 dB, which is within the reach of the current experimental technology. Hence, this work can considerably alleviate this experimental requirement and take a step closer to the realization of large-scale quantum computation.
Learning Short Binary Codes for Large-scale Image Retrieval.

PubMed

Liu, Li; Yu, Mengyang; Shao, Ling

2017-03-01

Large-scale visual information retrieval has become an active research area in this big data era. Recently, hashing/binary coding algorithms have proved to be effective for scalable retrieval applications. Most existing hashing methods require relatively long binary codes (i.e., over hundreds of bits, sometimes even thousands of bits) to achieve reasonable retrieval accuracies. However, for some realistic and unique applications, such as on wearable or mobile devices, only short binary codes can be used for efficient image retrieval due to the limitation of computational resources or bandwidth on these devices. In this paper, we propose a novel unsupervised hashing approach called min-cost ranking (MCR) specifically for learning powerful short binary codes (i.e., usually with a code length shorter than 100 b) for scalable image retrieval tasks. By exploring the discriminative ability of each dimension of data, MCR can generate a one-bit binary code for each dimension and simultaneously rank the discriminative separability of each bit according to the proposed cost function. Only top-ranked bits with minimum cost-values are then selected and grouped together to compose the final salient binary codes. Extensive experimental results on large-scale retrieval demonstrate that MCR can achieve performance comparable to the state-of-the-art hashing algorithms but with significantly shorter codes, leading to much faster large-scale retrieval.
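To illustrate why short binary codes make retrieval cheap, the following minimal sketch ranks database items by Hamming distance to a query code. It is not the MCR method described in the abstract: the 8-bit codes, the database contents, and the function names `hamming` and `rank_by_hamming` are all assumptions made up for this example.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as integers."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: list) -> list:
    """Return database indices sorted by Hamming distance to the query."""
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))


# Hypothetical 8-bit codes; a real system would learn these from image features.
database = [0b10110010, 0b10110011, 0b01001101, 0b11110000]
query = 0b10110110
ranking = rank_by_hamming(query, database)
```

Each comparison is one XOR plus a bit count, so the cost per database item grows with the code length, which is the practical motivation for keeping codes short.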
GPU Multi-Scale Particle Tracking and Multi-Fluid Simulations of the Radiation Belts

NASA Astrophysics Data System (ADS)

Ziemba, T.; Carscadden, J.; O'Donnell, D.; Winglee, R.; Harnett, E.; Cash, M.

2007-12-01

The properties of the radiation belts can vary dramatically under the influence of magnetic storms and storm-time substorms. The task of understanding and predicting radiation belt properties is made difficult because their properties are determined by global processes as well as small-scale wave-particle interactions. A full solution to the problem will require major innovations in technique and computer hardware. The proposed work will demonstrate linked particle tracking codes with new multi-scale/multi-fluid global simulations that provide the first means to include small-scale processes within the global magnetospheric context. A large hurdle to the problem is having sufficient computer hardware that is able to handle the disparate temporal and spatial scale sizes. A major innovation of the work is that the codes are designed to run on graphics processing units (GPUs). GPUs are intrinsically highly parallelized systems that provide more than an order of magnitude greater computing speed than CPU-based systems, for little more cost than a high-end workstation. Recent advancements in GPU technologies allow for full IEEE float specifications with performance up to several hundred GFLOPs per GPU, and new software architectures have recently become available to ease the transition from graphics-based to scientific applications. This allows for a cheap alternative to standard supercomputing methods and should shorten the time to discovery. A demonstration of the code pushing more than 500,000 particles faster than real time is presented, and used to provide new insight into radiation belt dynamics.
High Performance Computing for Modeling Wind Farms and Their Impact

NASA Astrophysics Data System (ADS)

Mavriplis, D.; Naughton, J. W.; Stoellinger, M. K.

2016-12-01

As energy generated by wind penetrates further into our electrical system, modeling of power production, power distribution, and the economic impact of wind-generated electricity is growing in importance. The models used for this work can range in fidelity from simple codes that run on a single computer to those that require high performance computing capabilities. Over the past several years, high fidelity models have been developed and deployed on the NCAR-Wyoming Supercomputing Center's Yellowstone machine. One of the primary modeling efforts focuses on developing the capability to compute the behavior of a wind farm in complex terrain under realistic atmospheric conditions. Fully modeling this system requires simulating flows ranging from the continental scale to the flow over a wind turbine blade, down to the blade boundary level, fully 10 orders of magnitude in scale. To accomplish this, the simulations are broken up by scale, with information from the larger scales being passed to the lower scale models. In the code being developed, four scale levels are included: the continental weather scale, the local atmospheric flow in complex terrain, the wind plant scale, and the turbine scale. The current state of the models in the latter three scales will be discussed. These simulations are based on a high-order accurate dynamic overset and adaptive mesh approach, which runs at large scale on the NWSC Yellowstone machine. A second effort on modeling the economic impact of new wind development as well as improvement in wind plant performance and enhancements to the transmission infrastructure will also be discussed.
Observations on computational methodologies for use in large-scale, gradient-based, multidisciplinary design incorporating advanced CFD codes

NASA Technical Reports Server (NTRS)

Newman, P. A.; Hou, G. J.-W.; Jones, H. E.; Taylor, A. C., III; Korivi, V. M.

1992-01-01

How a combination of various computational methodologies could reduce the enormous computational costs envisioned in using advanced CFD codes in gradient based optimized multidisciplinary design (MdD) procedures is briefly outlined. Implications of these MdD requirements upon advanced CFD codes are somewhat different than those imposed by a single discipline design. A means for satisfying these MdD requirements for gradient information is presented which appears to permit: (1) some leeway in the CFD solution algorithms which can be used; (2) an extension to 3-D problems; and (3) straightforward use of other computational methodologies. Many of these observations have previously been discussed as possibilities for doing parts of the problem more efficiently; the contribution here is observing how they fit together in a mutually beneficial way.
Final Report for "Implimentation and Evaluation of Multigrid Linear Solvers into Extended Magnetohydrodynamic Codes for Petascale Computing"

DOE Office of Scientific and Technical Information (OSTI.GOV)

Srinath Vadlamani; Scott Kruger; Travis Austin

Extended magnetohydrodynamic (MHD) codes are used to model the large, slow-growing instabilities that are projected to limit the performance of the International Thermonuclear Experimental Reactor (ITER). The multiscale nature of the extended MHD equations requires an implicit approach. The current linear solvers needed for the implicit algorithm scale poorly because the resultant matrices are so ill-conditioned. A new solver is needed, especially one that scales to the petascale. The most successful scalable parallel processor solvers to date are multigrid solvers. Applying multigrid techniques to a set of equations whose fundamental modes are dispersive waves is a promising solution to CEMM problems. For Phase 1, we implemented multigrid preconditioners from the HYPRE project of the Center for Applied Scientific Computing at LLNL via PETSc of the DOE SciDAC TOPS for the real matrix systems of the extended MHD code NIMROD, which is one of the primary modeling codes of the OFES-funded Center for Extended Magnetohydrodynamic Modeling (CEMM) SciDAC. We implemented the multigrid solvers on the fusion test problem that allows for real matrix systems with success, and in the process learned about the details of NIMROD data structures and the difficulties of inverting NIMROD operators. The further success of this project will allow for efficient usage of future petascale computers at the National Leadership Facilities: Oak Ridge National Laboratory, Argonne National Laboratory, and National Energy Research Scientific Computing Center. The project will be a collaborative effort between computational plasma physicists and applied mathematicians at Tech-X Corporation, applied mathematicians at Front Range Scientific Computations, Inc. (who are collaborators on the HYPRE project), and other computational plasma physicists involved with the CEMM project.

Applications of Parallel Process HiMAP for Large Scale Multidisciplinary Problems

NASA Technical Reports Server (NTRS)

Guruswamy, Guru P.; Potsdam, Mark; Rodriguez, David; Kwak, Dochay (Technical Monitor)

2000-01-01

HiMAP is a three level parallel middleware that can be interfaced to a large scale global design environment for code independent, multidisciplinary analysis using high fidelity equations. Aerospace technology needs are rapidly changing. Computational tools compatible with the requirements of national programs such as space transportation are needed. Conventional computation tools are inadequate for modern aerospace design needs. Advanced, modular computational tools are needed, such as those that incorporate the technology of massively parallel processors (MPP).
Computational methods for coupling microstructural and micromechanical materials response simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

HOLM,ELIZABETH A.; BATTAILE,CORBETT C.; BUCHHEIT,THOMAS E.

2000-04-01

Computational materials simulations have traditionally focused on individual phenomena: grain growth, crack propagation, plastic flow, etc. However, real materials behavior results from a complex interplay between phenomena. In this project, the authors explored methods for coupling mesoscale simulations of microstructural evolution and micromechanical response. In one case, massively parallel (MP) simulations for grain evolution and microcracking in alumina stronglink materials were dynamically coupled. In the other, codes for domain coarsening and plastic deformation in CuSi braze alloys were iteratively linked. This program provided the first comparison of two promising ways to integrate mesoscale computer codes. Coupled microstructural/micromechanical codes were applied to experimentally observed microstructures for the first time. In addition to the coupled codes, this project developed a suite of new computational capabilities (PARGRAIN, GLAD, OOF, MPM, polycrystal plasticity, front tracking). The problem of plasticity length scale in continuum calculations was recognized and a solution strategy was developed. The simulations were experimentally validated on stockpile materials.
Pretest aerosol code comparisons for LWR aerosol containment tests LA1 and LA2

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wright, A.L.; Wilson, J.H.; Arwood, P.C.

The Light-Water-Reactor (LWR) Aerosol Containment Experiments (LACE) are being performed in Richland, Washington, at the Hanford Engineering Development Laboratory (HEDL) under the leadership of an international project board and the Electric Power Research Institute. These tests have two objectives: (1) to investigate, at large scale, the inherent aerosol retention behavior in LWR containments under simulated severe accident conditions, and (2) to provide an experimental data base for validating aerosol behavior and thermal-hydraulic computer codes. Aerosol computer-code comparison activities are being coordinated at the Oak Ridge National Laboratory. For each of the six LACE tests, "pretest" calculations (for code-to-code comparisons) and "posttest" calculations (for code-to-test data comparisons) are being performed. The overall goals of the comparison effort are (1) to provide code users with experience in applying their codes to LWR accident-sequence conditions and (2) to evaluate and improve the code models.
Self-Scheduling Parallel Methods for Multiple Serial Codes with Application to WOPWOP

NASA Technical Reports Server (NTRS)

Long, Lyle N.; Brentner, Kenneth S.

2000-01-01

This paper presents a scheme for efficiently running a large number of serial jobs on parallel computers. Two examples are given of computer programs that run relatively quickly, but often they must be run numerous times to obtain all the results needed. It is very common in science and engineering to have codes that are not massive computing challenges in themselves, but due to the number of instances that must be run, they do become large-scale computing problems. The two examples given here represent common problems in aerospace engineering: aerodynamic panel methods and aeroacoustic integral methods. The first example simply solves many systems of linear equations. This is representative of an aerodynamic panel code where someone would like to solve for numerous angles of attack. The complete code for this first example is included in the appendix so that it can be readily used by others as a template. The second example is an aeroacoustics code (WOPWOP) that solves the Ffowcs Williams Hawkings equation to predict the far-field sound due to rotating blades. In this example, one quite often needs to compute the sound at numerous observer locations, hence parallelization is utilized to automate the noise computation for a large number of observers.
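The self-scheduling pattern described above can be sketched with a worker pool in which each idle worker takes the next pending serial job. This is not the template code from the paper's appendix: the 2x2 solver `solve_2x2`, the job list, and the use of Python's `concurrent.futures` are assumptions chosen to keep the illustration compact.

```python
from concurrent.futures import ThreadPoolExecutor


def solve_2x2(system):
    """Solve a 2x2 system a*x + b*y = e, c*x + d*y = f by Cramer's rule."""
    (a, b), (c, d), (e, f) = system
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular system")
    return ((e * d - b * f) / det, (a * f - e * c) / det)


def run_all(jobs, workers=4):
    """Run independent serial jobs; idle workers pull the next job from the queue."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, even though the jobs
        # themselves finish in whatever order the workers complete them.
        return list(pool.map(solve_2x2, jobs))


# Hypothetical stand-in for "many angles of attack": one small system per case.
jobs = [((2.0, 1.0), (1.0, 3.0), (float(k), 2.0 * k)) for k in range(1, 6)]
solutions = run_all(jobs)
```

For CPU-bound serial codes, a process pool or a message-passing layer would typically replace the thread pool; the scheduling pattern stays the same.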
Influence of temperature fluctuations on infrared limb radiance: a new simulation code

NASA Astrophysics Data System (ADS)

Rialland, Valérie; Chervet, Patrick

2006-08-01

Airborne infrared limb-viewing detectors may be used as surveillance sensors in order to detect dim military targets. These systems' performances are limited by the inhomogeneous background in the sensor field of view which impacts strongly on target detection probability. This background clutter, which results from small-scale fluctuations of temperature, density or pressure must therefore be analyzed and modeled. Few existing codes are able to model atmospheric structures and their impact on limb-observed radiance. SAMM-2 (SHARC-4 and MODTRAN4 Merged), the Air Force Research Laboratory (AFRL) background radiance code, can be used to predict the radiance fluctuation as a result of a normalized temperature fluctuation, as a function of the line-of-sight. Various realizations of cluttered backgrounds can then be computed, based on these transfer functions and on a stochastic temperature field. The existing SIG (SHARC Image Generator) code was designed to compute the cluttered background which would be observed from a space-based sensor. Unfortunately, this code was not able to compute accurate scenes as seen by an airborne sensor especially for lines-of-sight close to the horizon. Recently, we developed a new code called BRUTE3D, adapted to our configuration. This approach is based on a method originally developed in the SIG model. This BRUTE3D code makes use of a three-dimensional grid of temperature fluctuations and of the SAMM-2 transfer functions to synthesize an image of radiance fluctuations according to sensor characteristics. This paper details the working principles of the code and presents some output results. The effects of the small-scale temperature fluctuations on infrared limb radiance as seen by an airborne sensor are highlighted.
PHoToNs–A parallel heterogeneous and threads oriented code for cosmological N-body simulation

NASA Astrophysics Data System (ADS)

Wang, Qiao; Cao, Zong-Yan; Gao, Liang; Chi, Xue-Bin; Meng, Chen; Wang, Jie; Wang, Long

2018-06-01

We introduce a new code for cosmological simulations, PHoToNs, which incorporates features for performing massive cosmological simulations on heterogeneous high performance computer (HPC) systems and threads oriented programming. PHoToNs adopts a hybrid scheme to compute gravitational force, with the conventional Particle-Mesh (PM) algorithm to compute the long-range force, the Tree algorithm to compute the short-range force and the direct summation Particle-Particle (PP) algorithm to compute gravity from very close particles. A self-similar, space-filling Peano-Hilbert curve is used to decompose the computing domain. Threads programming is used to manage the domain communication, PM calculation and synchronization more flexibly, as well as Dual Tree Traversal on the CPU+MIC platform. PHoToNs scales well, and the efficiency of the PP kernel reaches 68.6% of peak performance on MIC and 74.4% on CPU platforms. We also test the accuracy of the code against Gadget-2, which is widely used in the community, and find excellent agreement.
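To make the domain-decomposition step concrete, here is a minimal sketch of how a Hilbert curve can order grid cells so that contiguous chunks of the ordering form compact domains. It is a two-dimensional illustration only (PHoToNs works in three dimensions), it uses the standard textbook index mapping rather than anything taken from the PHoToNs source, and the names `hilbert_index` and `decompose` are hypothetical.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along the Hilbert curve on an n x n grid (n a power of two)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def decompose(cells, n, n_domains):
    """Sort cells along the curve and cut the ordering into contiguous chunks."""
    ordered = sorted(cells, key=lambda c: hilbert_index(n, c[0], c[1]))
    size = -(-len(ordered) // n_domains)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


if __name__ == "__main__":
    n = 4
    cells = [(x, y) for x in range(n) for y in range(n)]
    for rank, domain in enumerate(decompose(cells, n, 4)):
        print(rank, domain)
```

Cells that are neighbours along the curve are also neighbours in space, so each chunk stays spatially compact, which tends to keep the communication surface between domains small.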
Collection Efficiency and Ice Accretion Characteristics of Two Full Scale and One 1/4 Scale Business Jet Horizontal Tails

NASA Technical Reports Server (NTRS)

Bidwell, Colin S.; Papadakis, Michael

2005-01-01

Collection efficiency and ice accretion calculations have been made for a series of business jet horizontal tail configurations using a three-dimensional panel code, an adaptive grid code, and the NASA Glenn LEWICE3D grid based ice accretion code. The horizontal tail models included two full scale wing tips and a 25 percent scale model. Flow solutions for the horizontal tails were generated using the PMARC panel code. Grids used in the ice accretion calculations were generated using the adaptive grid code ICEGRID. The LEWICE3D grid based ice accretion program was used to calculate impingement efficiency and ice shapes. Ice shapes typifying rime and mixed icing conditions were generated for a 30 minute hold condition. All calculations were performed on an SGI Octane computer. The results have been compared to experimental flow and impingement data. In general, the calculated flow and collection efficiencies compared well with experiment, and the ice shapes appeared representative of the rime and mixed icing conditions for which they were calculated.
Porting a Hall MHD Code to a Graphic Processing Unit

NASA Technical Reports Server (NTRS)

Dorelli, John C.

2011-01-01

We present our experience porting a Hall MHD code to a Graphics Processing Unit (GPU). The code is a 2nd order accurate MUSCL-Hancock scheme which makes use of an HLL Riemann solver to compute numerical fluxes and second-order finite differences to compute the Hall contribution to the electric field. The divergence of the magnetic field is controlled with Dedner's hyperbolic divergence cleaning method. Preliminary benchmark tests indicate a speedup (relative to a single Nehalem core) of 58x for a double precision calculation. We discuss scaling issues which arise when distributing work across multiple GPUs in a CPU-GPU cluster.
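For readers unfamiliar with the HLL Riemann solver mentioned above, the textbook two-wave flux formula can be written as a short function. This is a generic CPU-side sketch in Python, not the author's GPU kernel; the function name `hll_flux` and the convention of passing states, fluxes and wave-speed bounds as plain lists and floats are assumptions made for illustration.

```python
def hll_flux(u_l, u_r, f_l, f_r, s_l, s_r):
    """HLL numerical flux at an interface.

    u_l, u_r : left and right conserved states (lists of components)
    f_l, f_r : physical fluxes evaluated at those states
    s_l, s_r : estimates of the slowest and fastest signal speeds (s_l <= s_r)
    """
    if s_l >= 0.0:
        # All waves move to the right: the interface sees the left state.
        return list(f_l)
    if s_r <= 0.0:
        # All waves move to the left: the interface sees the right state.
        return list(f_r)
    # Subsonic case: average over the fan bounded by s_l and s_r.
    return [
        (s_r * fl - s_l * fr + s_l * s_r * (ur - ul)) / (s_r - s_l)
        for ul, ur, fl, fr in zip(u_l, u_r, f_l, f_r)
    ]


if __name__ == "__main__":
    # Linear advection with unit speed, f(u) = u, bounded by speeds -1 and +1.
    print(hll_flux([2.0], [5.0], [2.0], [5.0], -1.0, 1.0))
```

In an MHD code the states would hold density, momentum, energy and magnetic field components, and the speed estimates would come from the fast magnetosonic speed; those details are outside this sketch.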
Requirements for migration of NSSD code systems from LTSS to NLTSS

NASA Technical Reports Server (NTRS)

Pratt, M.

1984-01-01

The purpose of this document is to address the requirements necessary for a successful conversion of the Nuclear Design (ND) application code systems to the NLTSS environment. The ND application code system community can be characterized as large-scale scientific computation carried out on supercomputers. NLTSS is a distributed operating system being developed at LLNL to replace the LTSS system currently in use. The implications of change are examined including a description of the computational environment and users in ND. The discussion then turns to requirements, first in a general way, followed by specific requirements, including a proposal for managing the transition.
Multi-scale approach to the modeling of fission gas discharge during hypothetical loss-of-flow accident in gen-IV sodium fast reactor

DOE Office of Scientific and Technical Information (OSTI.GOV)

Behafarid, F.; Shaver, D. R.; Bolotnov, I. A.

The required technological and safety standards for future Gen IV Reactors can only be achieved if advanced simulation capabilities become available, which combine high performance computing with the necessary level of modeling detail and high accuracy of predictions. The purpose of this paper is to present new results of multi-scale three-dimensional (3D) simulations of the inter-related phenomena, which occur as a result of fuel element heat-up and cladding failure, including the injection of a jet of gaseous fission products into a partially blocked Sodium Fast Reactor (SFR) coolant channel, and gas/molten sodium transport along the coolant channels. The computational approach to the analysis of the overall accident scenario is based on using two different inter-communicating computational multiphase fluid dynamics (CMFD) codes: a CFD code, PHASTA, and a RANS code, NPHASE-CMFD. Using the geometry and time history of cladding failure and the gas injection rate, direct numerical simulations (DNS), combined with the Level Set method, of two-phase turbulent flow have been performed by the PHASTA code. The model allows one to track the evolution of gas/liquid interfaces at a centimeter scale. The simulated phenomena include the formation and breakup of the jet of fission products injected into the liquid sodium coolant. The PHASTA outflow has been averaged over time to obtain mean phasic velocities and volumetric concentrations, as well as the liquid turbulent kinetic energy and turbulence dissipation rate, all of which have served as the input to the core-scale simulations using the NPHASE-CMFD code. A sliding window time averaging has been used to capture mean flow parameters for transient cases. The results presented in the paper include testing and validation of the proposed models, as well as the predictions of fission-gas/liquid-sodium transport along a multi-rod fuel assembly of SFR during a partial loss-of-flow accident. (authors)
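The sliding-window time averaging mentioned in this abstract can be illustrated with a short running-mean routine. The abstract does not give the window length or the averaged quantities, so the function name `sliding_window_mean` and the sample values below are purely illustrative.

```python
from collections import deque


def sliding_window_mean(samples, window):
    """Yield the mean of the most recent `window` samples once the window is full."""
    buf = deque()
    total = 0.0
    for value in samples:
        buf.append(value)
        total += value
        if len(buf) > window:
            # Drop the oldest sample so the sum always covers exactly one window.
            total -= buf.popleft()
        if len(buf) == window:
            yield total / window


if __name__ == "__main__":
    outflow_velocity = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up time series
    print(list(sliding_window_mean(outflow_velocity, 3)))
```

Unlike a single average over the whole run, the windowed mean follows slow changes in the signal, which is why it suits transient cases.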
Parallel Computation of the Regional Ocean Modeling System (ROMS)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, P; Song, Y T; Chao, Y

2005-04-05

The Regional Ocean Modeling System (ROMS) is a regional ocean general circulation modeling system solving the free surface, hydrostatic, primitive equations over varying topography. It is free software distributed world-wide for studying both complex coastal ocean problems and the basin-to-global scale ocean circulation. The original ROMS code could only be run on shared-memory systems. With the increasing need to simulate larger model domains with finer resolutions and on a variety of computer platforms, there is a need in the ocean-modeling community to have a ROMS code that can be run on any parallel computer ranging from 10 to hundreds of processors. Recently, we have explored parallelization for ROMS using the MPI programming model. In this paper, an efficient parallelization strategy for such a large-scale scientific software package, based on an existing shared-memory computing model, is presented. In addition, scientific applications and data-performance issues on a couple of SGI systems, including Columbia, the world's third-fastest supercomputer, are discussed.
Summary Report of Working Group 2: Computation

NASA Astrophysics Data System (ADS)

Stoltz, P. H.; Tsung, R. S.

2009-01-01

The working group on computation addressed three physics areas: (i) plasma-based accelerators (laser-driven and beam-driven), (ii) high gradient structure-based accelerators, and (iii) electron beam sources and transport [1]. Highlights of the talks in these areas included new models of breakdown on the microscopic scale, new three-dimensional multipacting calculations with both finite difference and finite element codes, and detailed comparisons of new electron gun models with standard models such as PARMELA. The group also addressed two areas of advances in computation: (i) new algorithms, including simulation in a Lorentz-boosted frame that can reduce computation time by orders of magnitude, and (ii) new hardware architectures, such as graphics processing units and Cell processors, that promise dramatic increases in computing power. Highlights of the talks in these areas included results from the first large-scale parallel finite element particle-in-cell code (PIC), many-order-of-magnitude speedup of, and details of porting, the VPIC code to the Roadrunner supercomputer. The working group featured two plenary talks, one by Brian Albright of Los Alamos National Laboratory on the performance of the VPIC code on the Roadrunner supercomputer, and one by David Bruhwiler of Tech-X Corporation on recent advances in computation for advanced accelerators. Highlights of the talk by Albright included the first one trillion particle simulations, a sustained performance of 0.3 petaflops, and an eight times speedup of science calculations, including back-scatter in laser-plasma interaction. Highlights of the talk by Bruhwiler included simulations of 10 GeV accelerator laser wakefield stages including external injection, and new developments in electromagnetic simulations of electron guns using finite difference and finite element approaches.
Summary Report of Working Group 2: Computation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stoltz, P. H.; Tsung, R. S.

2009-01-22

The working group on computation addressed three physics areas: (i) plasma-based accelerators (laser-driven and beam-driven), (ii) high gradient structure-based accelerators, and (iii) electron beam sources and transport [1]. Highlights of the talks in these areas included new models of breakdown on the microscopic scale, new three-dimensional multipacting calculations with both finite difference and finite element codes, and detailed comparisons of new electron gun models with standard models such as PARMELA. The group also addressed two areas of advances in computation: (i) new algorithms, including simulation in a Lorentz-boosted frame that can reduce computation time by orders of magnitude, and (ii) new hardware architectures, such as graphics processing units and Cell processors, that promise dramatic increases in computing power. Highlights of the talks in these areas included results from the first large-scale parallel finite element particle-in-cell code (PIC), many-order-of-magnitude speedup of, and details of porting, the VPIC code to the Roadrunner supercomputer. The working group featured two plenary talks, one by Brian Albright of Los Alamos National Laboratory on the performance of the VPIC code on the Roadrunner supercomputer, and one by David Bruhwiler of Tech-X Corporation on recent advances in computation for advanced accelerators. Highlights of the talk by Albright included the first one trillion particle simulations, a sustained performance of 0.3 petaflops, and an eight times speedup of science calculations, including back-scatter in laser-plasma interaction. Highlights of the talk by Bruhwiler included simulations of 10 GeV accelerator laser wakefield stages including external injection, and new developments in electromagnetic simulations of electron guns using finite difference and finite element approaches.
WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code

NASA Astrophysics Data System (ADS)

Mendygral, P. J.; Radcliffe, N.; Kandalla, K.; Porter, D.; O'Neill, B. J.; Nolting, C.; Edmon, P.; Donnert, J. M. F.; Jones, T. W.

2017-02-01

We present a new code for astrophysical magnetohydrodynamics specifically designed and optimized for high performance and scaling on modern and future supercomputers. We describe a novel hybrid OpenMP/MPI programming model that emerged from a collaboration between Cray, Inc. and the University of Minnesota. This design utilizes MPI-RMA optimized for thread scaling, which allows the code to run extremely efficiently at very high thread counts ideal for the latest generation of multi-core and many-core architectures. Such performance characteristics are needed in the era of “exascale” computing. We describe and demonstrate our high-performance design in detail with the intent that it may be used as a model for other, future astrophysical codes intended for applications demanding exceptional performance.
Modeling of impulsive propellant reorientation

NASA Technical Reports Server (NTRS)

Hochstein, John I.; Patag, Alfredo E.; Chato, David J.

1988-01-01

The impulsive propellant reorientation process is modeled using the Energy Calculations for Liquid Propellants in a Space Environment (ECLIPSE) code. A brief description of the process and the computational model is presented. Code validation is documented via comparison to experimentally derived data for small scale tanks. Predictions of reorientation performance are presented for two tanks designed for use in flight experiments and for a proposed full scale OTV tank. A new dimensionless parameter is developed to correlate reorientation performance in geometrically similar tanks. Its success is demonstrated.
The NEST Dry-Run Mode: Efficient Dynamic Analysis of Neuronal Network Simulation Code.

PubMed

Kunkel, Susanne; Schenck, Wolfram

2017-01-01

NEST is a simulator for spiking neuronal networks that commits to a general purpose approach: It allows for high flexibility in the design of network models, and its applications range from small-scale simulations on laptops to brain-scale simulations on supercomputers. Hence, developers need to test their code for various use cases and ensure that changes to code do not impair scalability. However, running a full set of benchmarks on a supercomputer takes up precious compute-time resources and can entail long queuing times. Here, we present the NEST dry-run mode, which enables comprehensive dynamic code analysis without requiring access to high-performance computing facilities. A dry-run simulation is carried out by a single process, which performs all simulation steps except communication as if it was part of a parallel environment with many processes. We show that measurements of memory usage and runtime of neuronal network simulations closely match the corresponding dry-run data. Furthermore, we demonstrate the successful application of the dry-run mode in the areas of profiling and performance modeling.
The NEST Dry-Run Mode: Efficient Dynamic Analysis of Neuronal Network Simulation Code

PubMed Central

Kunkel, Susanne; Schenck, Wolfram

2017-01-01

NEST is a simulator for spiking neuronal networks that commits to a general purpose approach: It allows for high flexibility in the design of network models, and its applications range from small-scale simulations on laptops to brain-scale simulations on supercomputers. Hence, developers need to test their code for various use cases and ensure that changes to code do not impair scalability. However, running a full set of benchmarks on a supercomputer takes up precious compute-time resources and can entail long queuing times. Here, we present the NEST dry-run mode, which enables comprehensive dynamic code analysis without requiring access to high-performance computing facilities. A dry-run simulation is carried out by a single process, which performs all simulation steps except communication as if it was part of a parallel environment with many processes. We show that measurements of memory usage and runtime of neuronal network simulations closely match the corresponding dry-run data. Furthermore, we demonstrate the successful application of the dry-run mode in the areas of profiling and performance modeling. PMID:28701946
Large-scale transmission-type multifunctional anisotropic coding metasurfaces in millimeter-wave frequencies

NASA Astrophysics Data System (ADS)

Cui, Tie Jun; Wu, Rui Yuan; Wu, Wei; Shi, Chuan Bo; Li, Yun Bo

2017-10-01

We propose fast and accurate designs for large-scale and low-profile transmission-type anisotropic coding metasurfaces with multiple functions in the millimeter-wave frequencies, based on the antenna-array method. The numerical simulation of an anisotropic coding metasurface with a size of 30λ × 30λ by the proposed method takes only 20 min, whereas the same simulation cannot be carried out with commercial software on personal computers because of the huge memory usage. To inspect the performance of coding metasurfaces in the millimeter-wave band, the working frequency is chosen as 60 GHz. Based on the convolution operations and holographic theory, the proposed multifunctional anisotropic coding metasurface exhibits different effects when excited by y-polarized and x-polarized incidences. This study extends the frequency range of coding metasurfaces, filling the gap between microwave and terahertz bands, and implying promising applications in millimeter-wave communication and imaging.
BlazeDEM3D-GPU A Large Scale DEM simulation code for GPUs

NASA Astrophysics Data System (ADS)

Govender, Nicolin; Wilke, Daniel; Pizette, Patrick; Khinast, Johannes

2017-06-01

Accurately predicting the dynamics of particulate materials is of importance to numerous scientific and industrial areas with applications ranging across particle scales from powder flow to ore crushing. Computational discrete element simulations are a viable option to aid in the understanding of particulate dynamics and the design of devices such as mixers, silos and ball mills, as laboratory-scale tests come at a significant cost. However, simulating an industrial-scale system consisting of tens of millions of particles can take months on large CPU clusters, making the Discrete Element Method (DEM) unfeasible for industrial applications. Simulations are therefore typically restricted to tens of thousands of particles with highly detailed particle shapes or a few million particles with often oversimplified particle shapes. However, a number of applications require accurate representation of the particle shape to capture the macroscopic behaviour of the particulate system. In this paper we give an overview of the recent extensions to the open source GPU based DEM code, BlazeDEM3D-GPU, that can simulate millions of polyhedra and tens of millions of spheres on a desktop computer with a single or multiple GPUs.
Performance Analysis, Design Considerations, and Applications of Extreme-Scale In Situ Infrastructures

DOE PAGES

Ayachit, Utkarsh; Bauer, Andrew; Duque, Earl P. N.; ...

2016-11-01

A key trend facing extreme-scale computational science is the widening gap between computational and I/O rates, and the challenge that follows is how to best gain insight from simulation data when it is increasingly impractical to save it to persistent storage for subsequent visual exploration and analysis. One approach to this challenge is centered around the idea of in situ processing, where visualization and analysis processing is performed while data is still resident in memory. Our paper examines several key design and performance issues related to the idea of in situ processing at extreme scale on modern platforms: scalability, overhead, performance measurement and analysis, comparison and contrast with a traditional post hoc approach, and interfacing with simulation codes. We illustrate these principles in practice with studies, conducted on large-scale HPC platforms, that include a miniapplication and multiple science application codes, one of which demonstrates in situ methods in use at greater than 1M-way concurrency.

Fractal Viscous Fingering in Fracture Networks

NASA Astrophysics Data System (ADS)

Boyle, E.; Sams, W.; Ferer, M.; Smith, D. H.

2007-12-01

We have used two very different physical models and computer codes to study miscible injection of a low-viscosity fluid into a simple fracture network, where it displaces a much-more viscous "defending" fluid through "rock" that is otherwise impermeable. The one code (NETfLow) is a standard pore level model, originally intended to treat laboratory-scale experiments; it assumes negligible mixing of the two fluids. The other code (NFFLOW) was written to treat reservoir-scale engineering problems; it explicitly treats the flow through the fractures and allows for significant mixing of the fluids at the interface. Both codes treat the fractures as parallel plates, of different effective apertures. Results are presented for the composition profiles from both codes. Independent of the degree of fluid-mixing, the profiles from both models have a functional form identical to that for fractal viscous fingering (i.e., diffusion limited aggregation, DLA). The two codes that solve the equations for different models gave similar results; together they suggest that the injection of a low-viscosity fluid into large-scale fracture networks may be much more significantly affected by fractal fingering than previously illustrated.
Superconducting quantum circuits at the surface code threshold for fault tolerance.

PubMed

Barends, R; Kelly, J; Megrant, A; Veitia, A; Sank, D; Jeffrey, E; White, T C; Mutus, J; Fowler, A G; Campbell, B; Chen, Y; Chen, Z; Chiaro, B; Dunsworth, A; Neill, C; O'Malley, P; Roushan, P; Vainsencher, A; Wenner, J; Korotkov, A N; Cleland, A N; Martinis, John M

2014-04-24

A quantum computer can solve hard problems, such as prime factoring, database searching and quantum simulation, at the cost of needing to protect fragile quantum states from error. Quantum error correction provides this protection by distributing a logical state among many physical quantum bits (qubits) by means of quantum entanglement. Superconductivity is a useful phenomenon in this regard, because it allows the construction of large quantum circuits and is compatible with microfabrication. For superconducting qubits, the surface code approach to quantum computing is a natural choice for error correction, because it uses only nearest-neighbour coupling and rapidly cycled entangling gates. The gate fidelity requirements are modest: the per-step fidelity threshold is only about 99 per cent. Here we demonstrate a universal set of logic gates in a superconducting multi-qubit processor, achieving an average single-qubit gate fidelity of 99.92 per cent and a two-qubit gate fidelity of up to 99.4 per cent. This places Josephson quantum computing at the fault-tolerance threshold for surface code error correction. Our quantum processor is a first step towards the surface code, using five qubits arranged in a linear array with nearest-neighbour coupling. As a further demonstration, we construct a five-qubit Greenberger-Horne-Zeilinger state using the complete circuit and full set of gates. The results demonstrate that Josephson quantum computing is a high-fidelity technology, with a clear path to scaling up to large-scale, fault-tolerant quantum circuits.
Modernization of the graphics post-processors of the Hamburg German Climate Computer Center Carbon Cycle Codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stevens, E.J.; McNeilly, G.S.

The existing National Center for Atmospheric Research (NCAR) code in the Hamburg Oceanic Carbon Cycle Circulation Model and the Hamburg Large-Scale Geostrophic Ocean General Circulation Model was modernized and reduced in size while still producing an equivalent end result. A reduction in the size of the existing code from more than 50,000 lines to approximately 7,500 lines in the new code has made the new code much easier to maintain. The existing code in the Hamburg models uses legacy NCAR graphics (including even emulated CALCOMP subroutines) to display graphical output. The new code uses only current (version 3.1) NCAR subroutines.
Performance Analysis, Modeling and Scaling of HPC Applications and Tools

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bhatele, Abhinav

2016-01-13

Efficient use of supercomputers at DOE centers is vital for maximizing system throughput, minimizing energy costs and enabling science breakthroughs faster. This requires complementary efforts along several directions to optimize the performance of scientific simulation codes and the underlying runtimes and software stacks. This in turn requires providing scalable performance analysis tools and modeling techniques that can provide feedback to physicists and computer scientists developing the simulation codes and runtimes respectively. The PAMS project is using time allocations on supercomputers at ALCF, NERSC and OLCF to further the goals described above by performing research along the following fronts: 1. Scaling Study of HPC applications; 2. Evaluation of Programming Models; 3. Hardening of Performance Tools; 4. Performance Modeling of Irregular Codes; and 5. Statistical Analysis of Historical Performance Data. We are a team of computer and computational scientists funded by both DOE/NNSA and DOE/ASCR programs such as ECRP, XStack (Traleika Glacier, PIPER), ExaOSR (ARGO), SDMAV II (MONA) and PSAAP II (XPACC). This allocation will enable us to study big data issues when analyzing performance on leadership computing class systems and to assist the HPC community in making the most effective use of these resources.
Cosmological neutrino simulations at extreme scale

DOE PAGES

Emberson, J. D.; Yu, Hao-Ran; Inman, Derek; ...

2017-08-01

Constraining neutrino mass remains an elusive challenge in modern physics. Precision measurements are expected from several upcoming cosmological probes of large-scale structure. Achieving this goal relies on an equal level of precision from theoretical predictions of neutrino clustering. Numerical simulations of the non-linear evolution of cold dark matter and neutrinos play a pivotal role in this process. We incorporate neutrinos into the cosmological N-body code CUBEP3M and discuss the challenges associated with pushing to the extreme scales demanded by the neutrino problem. We highlight code optimizations made to exploit modern high performance computing architectures and present a novel method of data compression that reduces the phase-space particle footprint from 24 bytes in single precision to roughly 9 bytes. We scale the neutrino problem to the Tianhe-2 supercomputer and provide details of our production run, named TianNu, which uses 86% of the machine (13,824 compute nodes). With a total of 2.97 trillion particles, TianNu is currently the world’s largest cosmological N-body simulation and improves upon previous neutrino simulations by two orders of magnitude in scale. We finish with a discussion of the unanticipated computational challenges that were encountered during the TianNu runtime.
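The byte counts quoted above are easy to check: three position and three velocity components stored as 4-byte single-precision floats take 6 × 4 = 24 bytes. The abstract does not describe the compression scheme itself, so the sketch below is only one hypothetical way to reach 9 bytes: each position is stored as a 1-byte offset inside a known coarse cell and each velocity as a 2-byte fixed-point number (3 × 1 + 3 × 2 = 9). The names `compress` and `decompress` and the parameters `cell_origin`, `cell_size` and `v_max` are assumptions, not the CUBEP3M implementation.

```python
import struct

# Three unsigned bytes plus three signed 16-bit integers, no padding: 9 bytes.
PACKED = struct.Struct("<3B3h")
# Six single-precision floats: the uncompressed 24-byte footprint.
RAW = struct.Struct("<6f")


def compress(pos, vel, cell_origin, cell_size, v_max):
    """Quantize a particle assumed to lie inside the given cell with |v| <= v_max."""
    q_pos = [min(255, int((p - o) / cell_size * 256)) for p, o in zip(pos, cell_origin)]
    q_vel = [max(-32767, min(32767, round(v / v_max * 32767))) for v in vel]
    return PACKED.pack(*q_pos, *q_vel)


def decompress(blob, cell_origin, cell_size, v_max):
    """Recover approximate position and velocity from the 9-byte record."""
    values = PACKED.unpack(blob)
    # Positions are restored to the centre of their quantization bin.
    pos = [o + (q + 0.5) / 256 * cell_size for q, o in zip(values[:3], cell_origin)]
    vel = [q / 32767 * v_max for q in values[3:]]
    return pos, vel


if __name__ == "__main__":
    origin, size, v_max = [10.0, 20.0, 30.0], 1.0, 1000.0
    blob = compress([10.3, 20.7, 30.1], [100.0, -250.0, 0.0], origin, size, v_max)
    print(RAW.size, "->", len(blob), "bytes")
    print(decompress(blob, origin, size, v_max))
```

The trade-off in any scheme of this kind is precision for memory: the position error is bounded by the cell size divided by 512 and the velocity error by `v_max` divided by 65534 in this sketch.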
Hybrid MPI-OpenMP Parallelism in the ONETEP Linear-Scaling Electronic Structure Code: Application to the Delamination of Cellulose Nanofibrils.

PubMed

Wilkinson, Karl A; Hine, Nicholas D M; Skylaris, Chris-Kriton

2014-11-11

We present a hybrid MPI-OpenMP implementation of Linear-Scaling Density Functional Theory within the ONETEP code. We illustrate its performance on a range of high performance computing (HPC) platforms comprising shared-memory nodes with fast interconnect. Our work has focused on applying OpenMP parallelism to the routines which dominate the computational load, attempting where possible to parallelize different loops from those already parallelized within MPI. This includes 3D FFT box operations, sparse matrix algebra operations, calculation of integrals, and Ewald summation. While the underlying numerical methods are unchanged, these developments represent significant changes to the algorithms used within ONETEP to distribute the workload across CPU cores. The new hybrid code exhibits much-improved strong scaling relative to the MPI-only code and permits calculations with a much higher ratio of cores to atoms. These developments result in a significantly shorter time to solution than was possible using MPI alone and facilitate the application of the ONETEP code to systems larger than previously feasible. We illustrate this with benchmark calculations from an amyloid fibril trimer containing 41,907 atoms. We use the code to study the mechanism of delamination of cellulose nanofibrils when undergoing sonification, a process which is controlled by a large number of interactions that collectively determine the structural properties of the fibrils. Many energy evaluations were needed for these simulations, and as these systems comprise up to 21,276 atoms this would not have been feasible without the developments described here.
Comparison of the LLNL ALE3D and AKTS Thermal Safety Computer Codes for Calculating Times to Explosion in ODTX and STEX Thermal Cookoff Experiments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wemhoff, A P; Burnham, A K

2006-04-05

Cross-comparison of the results of two computer codes for the same problem provides a mutual validation of their computational methods. This cross-validation exercise was performed for LLNL's ALE3D code and AKTS's Thermal Safety code, using the thermal ignition of HMX in two standard LLNL cookoff experiments: the One-Dimensional Time to Explosion (ODTX) test and the Scaled Thermal Explosion (STEX) test. The chemical kinetics model used in both codes was the extended Prout-Tompkins model, a relatively new addition to ALE3D. This model was applied using ALE3D's new pseudospecies feature. In addition, an advanced isoconversional kinetic approach was used in the AKTS code. The mathematical constants in the Prout-Tompkins code were calibrated using DSC data from hermetically sealed vessels and the LLNL optimization code Kinetics05. The isoconversional kinetic parameters were optimized using the AKTS Thermokinetics code. We found that the Prout-Tompkins model calculations agree fairly well between the two codes, and the isoconversional kinetic model gives results very similar to those of the Prout-Tompkins model. We also found that an autocatalytic approach in the beta-delta phase transition model does affect the times to explosion for some conditions, especially STEX-like simulations at ramp rates above 100 C/hr, and further exploration of that effect is warranted.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Sprague, Michael A.

Enabled by petascale supercomputing, the next generation of computer models for wind energy will simulate a vast range of scales and physics, spanning from turbine structural dynamics and blade-scale turbulence to mesoscale atmospheric flow. A single model covering all scales and physics is not feasible. Thus, these simulations will require the coupling of different models/codes, each for different physics, interacting at their domain boundaries.
On the Evolution of the Standard Genetic Code: Vestiges of Critical Scale Invariance from the RNA World in Current Prokaryote Genomes

PubMed Central

José, Marco V.; Govezensky, Tzipe; García, José A.; Bobadilla, Juan R.

2009-01-01

Herein we derive two genetic codes through which the primeval RNA code could have given rise to the standard genetic code (SGC). One of them, called extended RNA code type I, consists of all codons of the type RNY (purine-any base-pyrimidine) plus codons obtained by considering the RNA code but in the second (NYR type) and third (YRN type) reading frames. The extended RNA code type II comprises all codons of the type RNY plus codons that arise from transversions of the RNA code in the first (YNY type) and third (RNR type) nucleotide bases. In order to test whether putative nucleotide sequences in the RNA World and in both extended RNA codes share the same scaling and statistical properties as those encountered in current prokaryotes, we used the genomes of four Eubacteria and three Archaea. For each prokaryote, we obtained their respective genomes obeying the RNA code or the extended RNA codes types I and II. In each case, we estimated the scaling properties of triplet sequences via a renormalization group approach, and we calculated the frequency distributions of distances for each codon. Remarkably, the scaling properties of the distance series of some codons from the RNA code and most codons from both extended RNA codes turned out to be identical or very close to the scaling properties of codons of the SGC. To test for the robustness of these results, we show, via computer simulation experiments, that random mutations of current genomes, at a rate of 10^-10 per site per year during three billion years, were not enough to destroy the observed patterns. Therefore, we conclude that most current prokaryotes may still contain relics of the primeval RNA World and that both extended RNA codes may well represent two plausible evolutionary paths between the RNA code and the current SGC. PMID:19183813
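The codon classes named in this abstract can be enumerated directly from their definitions (R = purine, Y = pyrimidine, N = any base). The short sketch below builds the RNY set and the two extended sets, and shows why reading an RNY sequence in the second and third frames yields NYR and YRN triplets. The helper names `codons` and `read_frame` and the example sequence are illustrative, not taken from the paper.

```python
from itertools import product

CLASSES = {
    "R": "AG",    # purines
    "Y": "CU",    # pyrimidines
    "N": "ACGU",  # any base
}


def codons(pattern):
    """All triplets matching a three-letter pattern such as 'RNY'."""
    return {"".join(bases) for bases in product(*(CLASSES[s] for s in pattern))}


def read_frame(seq, offset):
    """Split a sequence into complete triplets starting at the given offset."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


rna_code = codons("RNY")
extended_type_1 = rna_code | codons("NYR") | codons("YRN")
extended_type_2 = rna_code | codons("YNY") | codons("RNR")

if __name__ == "__main__":
    print(len(rna_code), len(extended_type_1), len(extended_type_2))
    seq = "GCUACCAUC"  # three RNY codons: GCU, ACC, AUC
    print(read_frame(seq, 1))  # second frame: NYR-type triplets
    print(read_frame(seq, 2))  # third frame: YRN-type triplets
```

The three classes in each extended code are mutually disjoint (they disagree on whether at least one position is a purine or a pyrimidine), so each extended code contains 16 + 16 + 16 = 48 codons.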
Software engineering and automatic continuous verification of scientific software

NASA Astrophysics Data System (ADS)

Piggott, M. D.; Hill, J.; Farrell, P. E.; Kramer, S. C.; Wilson, C. R.; Ham, D.; Gorman, G. J.; Bond, T.

2011-12-01

Software engineering of scientific code is challenging for a number of reasons including pressure to publish and a lack of awareness of the pitfalls of software engineering by scientists. The Applied Modelling and Computation Group at Imperial College is a diverse group of researchers that employ best practice software engineering methods whilst developing open source scientific software. Our main code is Fluidity - a multi-purpose computational fluid dynamics (CFD) code that can be used for a wide range of scientific applications from earth-scale mantle convection, through basin-scale ocean dynamics, to laboratory-scale classic CFD problems, and is coupled to a number of other codes including nuclear radiation and solid modelling. Our software development infrastructure consists of a number of free tools that could be employed by any group that develops scientific code and has been developed over a number of years with many lessons learnt. A single code base is developed by over 30 people for which we use Bazaar for revision control, making good use of the strong branching and merging capabilities. Using features of Canonical's Launchpad platform, such as code review, blueprints for designing features and bug reporting gives the group, partners and other Fluidity users an easy-to-use platform to collaborate and allows the induction of new members of the group into an environment where software development forms a central part of their work. The code repositories are coupled to an automated test and verification system which performs over 20,000 tests, including unit tests, short regression tests, code verification and large parallel tests. Included in these tests are build tests on HPC systems, including local and UK National HPC services. The testing of code in this manner leads to a continuous verification process; not a discrete event performed once development has ceased. Much of the code verification is done via the "gold standard" of comparisons to analytical solutions via the method of manufactured solutions. By developing and verifying code in tandem we avoid a number of pitfalls in scientific software development and advocate similar procedures for other scientific code applications.
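To illustrate the method of manufactured solutions mentioned in this abstract, here is a minimal sketch on a toy problem unrelated to Fluidity itself. It assumes the 1D Poisson problem -u'' = f with a chosen solution u(x) = sin(pi x), derives the matching source term, and checks that a second-order finite-difference solver converges at the expected rate. All names are hypothetical.

```python
import math

def poisson_error(n):
    """Solve -u'' = f on (0, 1), u(0) = u(1) = 0, with n interior points.

    Manufactured solution: u(x) = sin(pi x), hence f(x) = pi^2 sin(pi x).
    Returns the maximum nodal error against the manufactured solution.
    """
    h = 1.0 / (n + 1)
    x = [(i + 1) * h for i in range(n)]
    rhs = [h * h * math.pi ** 2 * math.sin(math.pi * xi) for xi in x]

    # Thomas algorithm for the tridiagonal system (diag 2, off-diag -1).
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = -1.0 / 2.0
    d_prime[0] = rhs[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c_prime[i - 1]
        c_prime[i] = -1.0 / denom
        d_prime[i] = (rhs[i] + d_prime[i - 1]) / denom

    u = [0.0] * n
    u[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d_prime[i] - c_prime[i] * u[i + 1]

    return max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))

# Halving the mesh spacing should cut the error by about a factor of four.
coarse = poisson_error(15)   # h = 1/16
fine = poisson_error(31)     # h = 1/32
observed_order = math.log2(coarse / fine)
print(f"observed order of accuracy: {observed_order:.2f}")
```

In an automated test suite, a check like this would fail whenever a code change degrades the observed order of accuracy below the theoretical one.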
Quantum computation with realistic magic-state factories

NASA Astrophysics Data System (ADS)

O'Gorman, Joe; Campbell, Earl T.

2017-03-01

Leading approaches to fault-tolerant quantum computation dedicate a significant portion of the hardware to computational factories that churn out high-fidelity ancillas called magic states. Consequently, efficient and realistic factory design is of paramount importance. Here we present the most detailed resource assessment to date of magic-state factories within a surface code quantum computer, along the way introducing a number of techniques. We show that the block codes of Bravyi and Haah [Phys. Rev. A 86, 052329 (2012), 10.1103/PhysRevA.86.052329] have been systematically undervalued; we track correlated errors both numerically and analytically, providing fidelity estimates without appeal to the union bound. We also introduce a subsystem code realization of these protocols with constant time and low ancilla cost. Additionally, we confirm that magic-state factories have space-time costs that scale as a constant factor of surface code costs. We find that the magic-state factory required for postclassical factoring can be as small as 6.3 million data qubits, ignoring ancilla qubits, assuming 10^-4 error gates and the availability of long-range interactions.
An accurate and efficient laser-envelope solver for the modeling of laser-plasma accelerators

DOE PAGES

Benedetti, C.; Schroeder, C. B.; Geddes, C. G. R.; ...

2017-10-17

Detailed and reliable numerical modeling of laser-plasma accelerators (LPAs), where a short and intense laser pulse interacts with an underdense plasma over distances of up to a meter, is a formidably challenging task. This is due to the great disparity among the length scales involved in the modeling, ranging from the micron scale of the laser wavelength to the meter scale of the total laser-plasma interaction length. The use of the time-averaged ponderomotive force approximation, where the laser pulse is described by means of its envelope, enables efficient modeling of LPAs by removing the need to model the details of electron motion at the laser wavelength scale. Furthermore, it allows simulations in cylindrical geometry which captures relevant 3D physics at 2D computational cost. A key element of any code based on the time-averaged ponderomotive force approximation is the laser envelope solver. In this paper we present the accurate and efficient envelope solver used in the code INF & RNO (INtegrated Fluid & paRticle simulatioN cOde). The features of the INF & RNO laser solver enable an accurate description of the laser pulse evolution deep into depletion even at a reasonably low resolution, resulting in significant computational speed-ups.
An accurate and efficient laser-envelope solver for the modeling of laser-plasma accelerators

DOE Office of Scientific and Technical Information (OSTI.GOV)

Benedetti, C.; Schroeder, C. B.; Geddes, C. G. R.

Detailed and reliable numerical modeling of laser-plasma accelerators (LPAs), where a short and intense laser pulse interacts with an underdense plasma over distances of up to a meter, is a formidably challenging task. This is due to the great disparity among the length scales involved in the modeling, ranging from the micron scale of the laser wavelength to the meter scale of the total laser-plasma interaction length. The use of the time-averaged ponderomotive force approximation, where the laser pulse is described by means of its envelope, enables efficient modeling of LPAs by removing the need to model the details of electron motion at the laser wavelength scale. Furthermore, it allows simulations in cylindrical geometry which captures relevant 3D physics at 2D computational cost. A key element of any code based on the time-averaged ponderomotive force approximation is the laser envelope solver. In this paper we present the accurate and efficient envelope solver used in the code INF & RNO (INtegrated Fluid & paRticle simulatioN cOde). The features of the INF & RNO laser solver enable an accurate description of the laser pulse evolution deep into depletion even at a reasonably low resolution, resulting in significant computational speed-ups.
An accurate and efficient laser-envelope solver for the modeling of laser-plasma accelerators

NASA Astrophysics Data System (ADS)

Benedetti, C.; Schroeder, C. B.; Geddes, C. G. R.; Esarey, E.; Leemans, W. P.

2018-01-01

Detailed and reliable numerical modeling of laser-plasma accelerators (LPAs), where a short and intense laser pulse interacts with an underdense plasma over distances of up to a meter, is a formidably challenging task. This is due to the great disparity among the length scales involved in the modeling, ranging from the micron scale of the laser wavelength to the meter scale of the total laser-plasma interaction length. The use of the time-averaged ponderomotive force approximation, where the laser pulse is described by means of its envelope, enables efficient modeling of LPAs by removing the need to model the details of electron motion at the laser wavelength scale. Furthermore, it allows simulations in cylindrical geometry which captures relevant 3D physics at 2D computational cost. A key element of any code based on the time-averaged ponderomotive force approximation is the laser envelope solver. In this paper we present the accurate and efficient envelope solver used in the code INF&RNO (INtegrated Fluid & paRticle simulatioN cOde). The features of the INF&RNO laser solver enable an accurate description of the laser pulse evolution deep into depletion even at a reasonably low resolution, resulting in significant computational speed-ups.
The SERGISAI procedure for seismic risk assessment

NASA Astrophysics Data System (ADS)

Zonno, G.; Garcia-Fernandez, M.; Jimenez, M.J.; Menoni, S.; Meroni, F.; Petrini, V.

The European project SERGISAI developed a computational tool where a methodology for seismic risk assessment at different geographical scales has been implemented. Experts of various disciplines, including seismologists, engineers, planners, geologists, and computer scientists, co-operated in an actual multidisciplinary process to develop this tool. Standard procedural codes, Geographical Information Systems (GIS), and Artificial Intelligence (AI) techniques compose the whole system, that will enable the end user to carry out a complete seismic risk assessment at three geographical scales: regional, sub-regional and local. At present, single codes or models that have been incorporated are not new in general, but the modularity of the prototype, based on a user-friendly front-end, offers potential users the possibility of updating or replacing any code or model if desired. The proposed procedure is a first attempt to integrate tools, codes and methods for assessing expected earthquake damage, and it was mainly designed to become a useful support for civil defence and land use planning agencies. Risk factors have been treated in the most suitable way for each one, in terms of level of detail, kind of parameters and units of measure. Identifying various geographical scales is not a mere question of dimension; since entities to be studied correspond to areas defined by administrative and geographical borders. The procedure was applied in the following areas: Toscana in Italy, for the regional scale, the Garfagnana area in Toscana, for the sub-regional scale, and a part of Barcelona city, Spain, for the local scale.
A transient FETI methodology for large-scale parallel implicit computations in structural mechanics

NASA Technical Reports Server (NTRS)

Farhat, Charbel; Crivelli, Luis; Roux, Francois-Xavier

1992-01-01

Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because explicit schemes are also easier to parallelize than implicit ones. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet -- and perhaps will never -- be offset by the speed of parallel hardware. Therefore, it is essential to develop efficient and robust alternatives to direct methods that are also amenable to massively parallel processing because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient when simulating low-frequency dynamics. Here we present a domain decomposition method for implicit schemes that requires significantly less storage than factorization algorithms, that is several times faster than other popular direct and iterative methods, that can be easily implemented on both shared and local memory parallel processors, and that is both computationally and communication-wise efficient. The proposed transient domain decomposition method is an extension of the method of Finite Element Tearing and Interconnecting (FETI) developed by Farhat and Roux for the solution of static problems. Serial and parallel performance results on the CRAY Y-MP/8 and the iPSC-860/128 systems are reported and analyzed for realistic structural dynamics problems. These results establish the superiority of the FETI method over both the serial/parallel conjugate gradient algorithm with diagonal scaling and the serial/parallel direct method, and contrast the computational power of the iPSC-860/128 parallel processor with that of the CRAY Y-MP/8 system.
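The time step restriction discussed in this abstract can be made concrete with a rough estimate. This is a minimal sketch, assuming a 1D elastic bar where the wave speed is sqrt(E / rho) and the stable explicit step is bounded by element size over wave speed; the material values and function names are illustrative assumptions, not taken from the paper.

```python
import math

def stable_time_step(element_size, youngs_modulus, density, courant_number=1.0):
    """Courant-limited explicit time step for a 1D elastic bar (assumed model)."""
    wave_speed = math.sqrt(youngs_modulus / density)
    return courant_number * element_size / wave_speed

def explicit_steps_needed(duration, time_step):
    """Number of explicit steps required to cover a simulated duration."""
    return math.ceil(duration / time_step)

# Illustrative values: steel-like material, 1 cm elements, 1 s of response.
dt = stable_time_step(element_size=0.01, youngs_modulus=200e9, density=7850.0)
steps = explicit_steps_needed(duration=1.0, time_step=dt)
print(f"dt = {dt:.3e} s, steps for 1 s = {steps}")
```

Under these assumptions an explicit scheme needs hundreds of thousands of steps to cover one second of low-frequency response, which is the cost an unconditionally stable implicit scheme avoids by taking far larger steps.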
Multiplier Architecture for Coding Circuits

NASA Technical Reports Server (NTRS)

Wang, C. C.; Truong, T. K.; Shao, H. M.; Deutsch, L. J.

1986-01-01

Multipliers based on new algorithm for Galois-field (GF) arithmetic regular and expandable. Pipeline structures used for computing both multiplications and inverses. Designs suitable for implementation in very-large-scale integrated (VLSI) circuits. This general type of inverter and multiplier architecture especially useful in performing finite-field arithmetic of Reed-Solomon error-correcting codes and of some cryptographic algorithms.
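For readers unfamiliar with Galois-field arithmetic, the following is a minimal software sketch of multiplication and inversion in GF(2^8). It does not reproduce the pipelined VLSI architecture described in the abstract; the field size and the reduction polynomial 0x11D (a common choice in Reed-Solomon implementations) are assumptions made for illustration.

```python
# x^8 + x^4 + x^3 + x^2 + 1; assumed here, not taken from the abstract.
PRIMITIVE_POLY = 0x11D

def gf_mul(a, b):
    """Multiply two GF(2^8) elements by shift-and-add with reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^m) is XOR
        a <<= 1
        if a & 0x100:            # degree reached 8: reduce modulo the polynomial
            a ^= PRIMITIVE_POLY
        b >>= 1
    return result

def gf_inv(a):
    """Invert a nonzero element using a^(2^8 - 2) = a^254."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result = 1
    power = a
    exponent = 254
    while exponent:
        if exponent & 1:
            result = gf_mul(result, power)
        power = gf_mul(power, power)
        exponent >>= 1
    return result

print(hex(gf_mul(0x02, 0x80)))
```

Computing the inverse by repeated squaring and multiplication reuses the multiplier for both operations, which is in the same spirit as a design that handles multiplications and inverses with shared structures.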
Application of augmented-Lagrangian methods in meteorology: Comparison of different conjugate-gradient codes for large-scale minimization

NASA Technical Reports Server (NTRS)

Navon, I. M.

1984-01-01

A Lagrange multiplier method using techniques developed by Bertsekas (1982) was applied to solving the problem of enforcing simultaneous conservation of the nonlinear integral invariants of the shallow water equations on a limited area domain. This application of nonlinear constrained optimization is of the large dimensional type and the conjugate gradient method was found to be the only computationally viable method for the unconstrained minimization. Several conjugate-gradient codes were tested and compared for increasing accuracy requirements. Robustness and computational efficiency were the principal criteria.
Comparisons for ESTA-Task3: ASTEC, CESAM and CLÉS

NASA Astrophysics Data System (ADS)

Christensen-Dalsgaard, J.

The ESTA activity under the CoRoT project aims at testing the tools for computing stellar models and oscillation frequencies that will be used in the analysis of asteroseismic data from CoRoT and other large-scale upcoming asteroseismic projects. Here I report results of comparisons between calculations using the Aarhus code (ASTEC) and two other codes, for models that include diffusion and settling. It is found that there are likely deficiencies, requiring further study, in the ASTEC computation of models including convective cores.
KENO-VI Primer: A Primer for Criticality Calculations with SCALE/KENO-VI Using GeeWiz

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bowman, Stephen M

2008-09-01

The SCALE (Standardized Computer Analyses for Licensing Evaluation) computer software system developed at Oak Ridge National Laboratory is widely used and accepted around the world for criticality safety analyses. The well-known KENO-VI three-dimensional Monte Carlo criticality computer code is one of the primary criticality safety analysis tools in SCALE. The KENO-VI primer is designed to help a new user understand and use the SCALE/KENO-VI Monte Carlo code for nuclear criticality safety analyses. It assumes that the user has a college education in a technical field. There is no assumption of familiarity with Monte Carlo codes in general or with SCALE/KENO-VI in particular. The primer is designed to teach by example, with each example illustrating two or three features of SCALE/KENO-VI that are useful in criticality analyses. The primer is based on SCALE 6, which includes the Graphically Enhanced Editing Wizard (GeeWiz) Windows user interface. Each example uses GeeWiz to provide the framework for preparing input data and viewing output results. Starting with a Quickstart section, the primer gives an overview of the basic requirements for SCALE/KENO-VI input and allows the user to quickly run a simple criticality problem with SCALE/KENO-VI. The sections that follow Quickstart include a list of basic objectives at the beginning that identifies the goal of the section and the individual SCALE/KENO-VI features that are covered in detail in the sample problems in that section. Upon completion of the primer, a new user should be comfortable using GeeWiz to set up criticality problems in SCALE/KENO-VI. The primer provides a starting point for the criticality safety analyst who uses SCALE/KENO-VI. Complete descriptions are provided in the SCALE/KENO-VI manual. Although the primer is self-contained, it is intended as a companion volume to the SCALE/KENO-VI documentation. (The SCALE manual is provided on the SCALE installation DVD.) The primer provides specific examples of using SCALE/KENO-VI for criticality analyses; the SCALE/KENO-VI manual provides information on the use of SCALE/KENO-VI and all its modules. The primer also contains an appendix with sample input files.

WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mendygral, P. J.; Radcliffe, N.; Kandalla, K.

2017-02-01

We present a new code for astrophysical magnetohydrodynamics specifically designed and optimized for high performance and scaling on modern and future supercomputers. We describe a novel hybrid OpenMP/MPI programming model that emerged from a collaboration between Cray, Inc. and the University of Minnesota. This design utilizes MPI-RMA optimized for thread scaling, which allows the code to run extremely efficiently at very high thread counts ideal for the latest generation of multi-core and many-core architectures. Such performance characteristics are needed in the era of “exascale” computing. We describe and demonstrate our high-performance design in detail with the intent that it may be used as a model for other, future astrophysical codes intended for applications demanding exceptional performance.
Simulation of 2D Kinetic Effects in Plasmas using the Grid Based Continuum Code LOKI

NASA Astrophysics Data System (ADS)

Banks, Jeffrey; Berger, Richard; Chapman, Tom; Brunner, Stephan

2016-10-01

Kinetic simulation of multi-dimensional plasma waves through direct discretization of the Vlasov equation is a useful tool to study many physical interactions and is particularly attractive for situations where minimal fluctuation levels are desired, for instance, when measuring growth rates of plasma wave instabilities. However, direct discretization of phase space can be computationally expensive, and as a result there are few examples of published results using Vlasov codes in more than a single configuration space dimension. In an effort to fill this gap we have developed the Eulerian-based kinetic code LOKI that evolves the Vlasov-Poisson system in 2+2-dimensional phase space. The code is designed to reduce the cost of phase-space computation by using fully 4th order accurate conservative finite differencing, while retaining excellent parallel scalability that efficiently uses large scale computing resources. In this poster I will discuss the algorithms used in the code as well as some aspects of their parallel implementation using MPI. I will also overview simulation results of basic plasma wave instabilities relevant to laser plasma interaction, which have been obtained using the code.
Simulating the Thermal Response of High Explosives on Time Scales of Days to Microseconds

NASA Astrophysics Data System (ADS)

Yoh, Jack J.; McClelland, Matthew A.

2004-07-01

We present an overview of computational techniques for simulating the thermal cookoff of high explosives using a multi-physics hydrodynamics code, ALE3D. Recent improvements to the code have aided our computational capability in modeling the response of energetic materials systems exposed to extreme thermal environments, such as fires. We consider an idealized model process for a confined explosive involving the transition from slow heating to rapid deflagration in which the time scale changes from days to hundreds of microseconds. The heating stage involves thermal expansion and decomposition according to an Arrhenius kinetics model while a pressure-dependent burn model is employed during the explosive phase. We describe and demonstrate the numerical strategies employed to make the transition from slow to fast dynamics.
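The Arrhenius kinetics mentioned in this abstract follow the standard rate law k = A exp(-Ea / (R T)). The sketch below only illustrates how strongly such a rate depends on temperature; the pre-exponential factor and activation energy are hypothetical values, not parameters from ALE3D or the paper.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol K)

def arrhenius_rate(pre_exponential, activation_energy, temperature):
    """Arrhenius rate constant k = A * exp(-Ea / (R * T))."""
    return pre_exponential * math.exp(
        -activation_energy / (GAS_CONSTANT * temperature)
    )

# Hypothetical parameters chosen only for illustration.
A = 1.0e13          # 1/s
EA = 150.0e3        # J/mol

slow = arrhenius_rate(A, EA, temperature=400.0)
fast = arrhenius_rate(A, EA, temperature=500.0)
print(f"rate ratio between 500 K and 400 K: {fast / slow:.3e}")
```

A rate that grows by orders of magnitude over a modest temperature rise is one reason the time scale of such a simulation can shrink so sharply between the heating stage and the rapid reaction stage.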
Computer programs for smoothing and scaling airfoil coordinates

NASA Technical Reports Server (NTRS)

Morgan, H. L., Jr.

1983-01-01

Detailed descriptions are given of the theoretical methods and associated computer codes of a program to smooth and a program to scale arbitrary airfoil coordinates. The smoothing program utilizes both least-squares polynomial and least-squares cubic spline techniques to smooth iteratively the second derivatives of the y-axis airfoil coordinates with respect to a transformed x-axis system which unwraps the airfoil and stretches the nose and trailing-edge regions. The corresponding smooth airfoil coordinates are then determined by solving a tridiagonal matrix of simultaneous cubic-spline equations relating the y-axis coordinates and their corresponding second derivatives. A technique for computing the camber and thickness distribution of the smoothed airfoil is also discussed. The scaling program can then be used to scale the thickness distribution generated by the smoothing program to a specific maximum thickness which is then combined with the camber distribution to obtain the final scaled airfoil contour. Computer listings of the smoothing and scaling programs are included.
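The scaling step described in this abstract can be sketched as follows. This is a simplified illustration, assuming the thickness is applied vertically about the camber line at shared chordwise stations; the original programs may combine the distributions differently, and the function name and sample data are hypothetical.

```python
def scale_airfoil(camber, thickness, target_max_thickness):
    """Scale a thickness distribution and recombine it with the camber line.

    camber and thickness are equal-length lists sampled at the same
    chordwise stations; thickness is the full upper-minus-lower value.
    """
    factor = target_max_thickness / max(thickness)
    upper = [c + 0.5 * factor * t for c, t in zip(camber, thickness)]
    lower = [c - 0.5 * factor * t for c, t in zip(camber, thickness)]
    return upper, lower

# Illustrative data: 10% thick section rescaled to 12% of chord.
camber = [0.0, 0.02, 0.03, 0.02, 0.0]
thickness = [0.0, 0.08, 0.10, 0.06, 0.0]
upper, lower = scale_airfoil(camber, thickness, target_max_thickness=0.12)
print(max(u - l for u, l in zip(upper, lower)))
```

Scaling only the thickness leaves the camber line unchanged, so the mean of the upper and lower surfaces at each station stays where it was.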
HEC Applications on Columbia Project

NASA Technical Reports Server (NTRS)

Taft, Jim

2004-01-01

NASA's Columbia system consists of a cluster of twenty 512 processor SGI Altix systems. Each of these systems is 3 TFLOP/s in peak performance - approximately the same as the entire compute capability at NAS just one year ago. Each 512p system is a single system image machine with one Linux OS, one high performance file system, and one globally shared memory. The NAS Terascale Applications Group (TAG) is chartered to assist in scaling NASA's mission critical codes to at least 512p in order to significantly improve emergency response during flight operations, as well as provide significant improvements in the codes and the rate of scientific discovery across the scientific disciplines within NASA's Missions. Recent accomplishments are 4x improvements to codes in the ocean modeling community, 10x performance improvements in a number of computational fluid dynamics codes used in aero-vehicle design, and 5x improvements in a number of space science codes dealing in extreme physics. The TAG group will continue its scaling work to 2048p and beyond (10240 cpus) as the Columbia system becomes fully operational and the upgrades to the SGI NUMAlink memory fabric are in place. The NUMAlink upgrades dramatically improve system scalability for a single application. These upgrades will allow a number of codes to execute faster at higher fidelity than ever before on any other system, thus increasing the rate of scientific discovery even further.
Three-Dimensional Terahertz Coded-Aperture Imaging Based on Single Input Multiple Output Technology.

PubMed

Chen, Shuo; Luo, Chenggao; Deng, Bin; Wang, Hongqiang; Cheng, Yongqiang; Zhuang, Zhaowen

2018-01-19

As a promising radar imaging technique, terahertz coded-aperture imaging (TCAI) can achieve high-resolution, forward-looking, and staring imaging by producing spatiotemporal independent signals with coded apertures. In this paper, we propose a three-dimensional (3D) TCAI architecture based on single input multiple output (SIMO) technology, which can reduce the coding and sampling times sharply. The coded aperture applied in the proposed TCAI architecture loads either purposive or random phase modulation factor. In the transmitting process, the purposive phase modulation factor drives the terahertz beam to scan the divided 3D imaging cells. In the receiving process, the random phase modulation factor is adopted to modulate the terahertz wave to be spatiotemporally independent for high resolution. Considering human-scale targets, images of each 3D imaging cell are reconstructed one by one to decompose the global computational complexity, and then are synthesized together to obtain the complete high-resolution image. As for each imaging cell, the multi-resolution imaging method helps to reduce the computational burden on a large-scale reference-signal matrix. The experimental results demonstrate that the proposed architecture can achieve high-resolution imaging with much less time for 3D targets and has great potential in applications such as security screening, nondestructive detection, medical diagnosis, etc.
Deployment of the OSIRIS EM-PIC code on the Intel Knights Landing architecture

NASA Astrophysics Data System (ADS)

Fonseca, Ricardo

2017-10-01

Electromagnetic particle-in-cell (EM-PIC) codes such as OSIRIS have found widespread use in modelling the highly nonlinear and kinetic processes that occur in several relevant plasma physics scenarios, ranging from astrophysical settings to high-intensity laser plasma interaction. Being computationally intensive, these codes require large scale HPC systems, and a continuous effort in adapting the algorithm to new hardware and computing paradigms. In this work, we report on our efforts on deploying the OSIRIS code on the new Intel Knights Landing (KNL) architecture. Unlike the previous generation (Knights Corner), these boards are standalone systems, and introduce several new features, including the new AVX-512 instructions and on-package MCDRAM. We will focus on the parallelization and vectorization strategies followed, as well as memory management, and present a detailed performance evaluation of code performance in comparison with the CPU code. This work was partially supported by Fundação para a Ciência e Tecnologia (FCT), Portugal, through Grant No. PTDC/FIS-PLA/2940/2014.
Production Level CFD Code Acceleration for Hybrid Many-Core Architectures

NASA Technical Reports Server (NTRS)

Duffy, Austen C.; Hammond, Dana P.; Nielsen, Eric J.

2012-01-01

In this work, a novel graphics processing unit (GPU) distributed sharing model for hybrid many-core architectures is introduced and employed in the acceleration of a production-level computational fluid dynamics (CFD) code. The latest generation graphics hardware allows multiple processor cores to simultaneously share a single GPU through concurrent kernel execution. This feature has allowed the NASA FUN3D code to be accelerated in parallel with up to four processor cores sharing a single GPU. For codes to scale and fully use resources on these and the next generation machines, codes will need to employ some type of GPU sharing model, as presented in this work. Findings include the effects of GPU sharing on overall performance. A discussion of the inherent challenges that parallel unstructured CFD codes face in accelerator-based computing environments is included, with considerations for future generation architectures. This work was completed by the author in August 2010, and reflects the analysis and results of the time.
Scientific Discovery through Advanced Computing in Plasma Science

NASA Astrophysics Data System (ADS)

Tang, William

2005-03-01

Advanced computing is generally recognized to be an increasingly vital tool for accelerating progress in scientific research during the 21st Century. For example, the Department of Energy's "Scientific Discovery through Advanced Computing" (SciDAC) Program was motivated in large measure by the fact that formidable scientific challenges in its research portfolio could best be addressed by utilizing the combination of the rapid advances in super-computing technology together with the emergence of effective new algorithms and computational methodologies. The imperative is to translate such progress into corresponding increases in the performance of the scientific codes used to model complex physical systems such as those encountered in high temperature plasma research. If properly validated against experimental measurements and analytic benchmarks, these codes can provide reliable predictive capability for the behavior of a broad range of complex natural and engineered systems. This talk reviews recent progress and future directions for advanced simulations with some illustrative examples taken from the plasma science applications area. Significant recent progress has been made in both particle and fluid simulations of fine-scale turbulence and large-scale dynamics, giving increasingly good agreement between experimental observations and computational modeling. This was made possible by the combination of access to powerful new computational resources together with innovative advances in analytic and computational methods for developing reduced descriptions of physics phenomena spanning a huge range in time and space scales. In particular, the plasma science community has made excellent progress in developing advanced codes for which computer run-time and problem size scale well with the number of processors on massively parallel machines (MPPs). A good example is the effective usage of the full power of multi-teraflop (multi-trillion floating point computations per second) MPPs to produce three-dimensional, general geometry, nonlinear particle simulations which have accelerated progress in understanding the nature of plasma turbulence in magnetically-confined high temperature plasmas. These calculations, which typically utilized billions of particles for thousands of time-steps, would not have been possible without access to powerful present generation MPP computers and the associated diagnostic and visualization capabilities. In general, results from advanced simulations provide great encouragement for being able to include increasingly realistic dynamics to enable deeper physics insights into plasmas in both natural and laboratory environments. The associated scientific excitement should serve to stimulate improved cross-cutting collaborations with other fields and also to help attract bright young talent to the computational science area.
Computing with scale-invariant neural representations

NASA Astrophysics Data System (ADS)

Howard, Marc; Shankar, Karthik

The Weber-Fechner law is perhaps the oldest quantitative relationship in psychology. Consider the problem of the brain representing a function f(x). Different neurons have receptive fields that support different parts of the range, such that the ith neuron has a receptive field at x_i. Weber-Fechner scaling refers to the finding that the width of the receptive field scales with x_i as does the difference between the centers of adjacent receptive fields. Weber-Fechner scaling is exponentially resource-conserving. Neurophysiological evidence suggests that neural representations obey Weber-Fechner scaling in the visual system and perhaps other systems as well. We describe an optimality constraint that is solved by Weber-Fechner scaling, providing an information-theoretic rationale for this principle of neural coding. Weber-Fechner scaling can be generated within a mathematical framework using the Laplace transform. Within this framework, simple computations such as translation, correlation and cross-correlation can be accomplished. This framework can in principle be extended to provide a general computational language for brain-inspired cognitive computation on scale-invariant representations. Supported by NSF PHY 1444389 and the BU Initiative for the Physics and Mathematics of Neural Systems.
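The scaling property described in this abstract can be illustrated with geometrically spaced receptive-field centers. This is a minimal sketch under the assumption that centers follow x_i = x_min * ratio^i; it does not implement the Laplace-transform framework from the abstract, and the function names are hypothetical.

```python
import math

def receptive_field_centers(x_min, ratio, count):
    """Centers x_i = x_min * ratio**i, so spacing grows in proportion to x_i."""
    return [x_min * ratio ** i for i in range(count)]

def cells_needed(x_min, x_max, ratio):
    """Smallest number of geometrically spaced cells whose centers reach x_max."""
    return math.ceil(math.log(x_max / x_min) / math.log(ratio)) + 1

centers = receptive_field_centers(x_min=1.0, ratio=2.0, count=8)
relative_spacing = [(b - a) / a for a, b in zip(centers, centers[1:])]
print(relative_spacing)            # constant: spacing is proportional to x_i
print(cells_needed(1.0, 1000.0, 2.0), cells_needed(1.0, 1000000.0, 2.0))
```

Covering a range a thousand times larger only roughly doubles the number of cells in this sketch, which is the sense in which such a code conserves resources compared with uniform spacing.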
Simulating the interaction of the heliosphere with the local interstellar medium: MHD results from a finite volume approach, first bidimensional results

NASA Technical Reports Server (NTRS)

Chanteur, G.; Khanfir, R.

1995-01-01

We have designed a fully compressible MHD code working on unstructured meshes in order to be able to compute accurately sharp structures embedded in large scale simulations. The code is based on a finite volume method making use of a kinetic flux splitting. A bidimensional version of the code has been used to simulate the interaction of a moving interstellar medium, magnetized or unmagnetized, with a rotating and magnetized heliospheric plasma source. Being aware that these computations are not realistic due to the restriction to two dimensions, we present them to demonstrate the ability of this new code to handle this problem. An axisymmetric version, now under development, will be operational in a few months. Ultimately we plan to run a full 3D version.
PETSc Users Manual Revision 3.7

DOE Office of Scientific and Technical Information (OSTI.GOV)

Balay, Satish; Abhyankar, S.; Adams, M.

This manual describes the use of PETSc for the numerical solution of partial differential equations and related problems on high-performance computers. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is a suite of data structures and routines that provide the building blocks for the implementation of large-scale application codes on parallel (and serial) computers. PETSc uses the MPI standard for all message-passing communication.
PETSc Users Manual Revision 3.8

DOE Office of Scientific and Technical Information (OSTI.GOV)

Balay, S.; Abhyankar, S.; Adams, M.

This manual describes the use of PETSc for the numerical solution of partial differential equations and related problems on high-performance computers. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is a suite of data structures and routines that provide the building blocks for the implementation of large-scale application codes on parallel (and serial) computers. PETSc uses the MPI standard for all message-passing communication.
FORCE2: A state-of-the-art two-phase code for hydrodynamic calculations

NASA Astrophysics Data System (ADS)

Ding, Jianmin; Lyczkowski, R. W.; Burge, S. W.

1993-02-01

A three-dimensional computer code for two-phase flow named FORCE2 has been developed by Babcock and Wilcox (B & W) in close collaboration with Argonne National Laboratory (ANL). FORCE2 is capable of both transient and steady-state simulations. This Cartesian coordinates computer program is a finite control volume, industrial grade and quality embodiment of the pilot-scale FLUFIX/MOD2 code and contains features such as three-dimensional blockages, volume and surface porosities to account for various obstructions in the flow field, and distributed resistance modeling to account for pressure drops caused by baffles, distributor plates and large tube banks. Recently computed results demonstrated the significance of and necessity for three-dimensional models of hydrodynamics and erosion. This paper describes the process whereby ANL's pilot-scale FLUFIX/MOD2 models and numerics were implemented into FORCE2. A description of the quality control to assess the accuracy of the new code and the validation using some of the measured data from Illinois Institute of Technology (IIT) and the University of Illinois at Urbana-Champaign (UIUC) are given. It is envisioned that one day, FORCE2 with additional modules such as radiation heat transfer, combustion kinetics and multi-solids together with user-friendly pre- and post-processor software and tailored for massively parallel multiprocessor shared memory computational platforms will be used by industry and researchers to assist in reducing and/or eliminating the environmental and economic barriers which limit full consideration of coal, shale and biomass as energy sources, to retain energy security, and to remediate waste and ecological problems.
Exploring Asynchronous Many-Task Runtime Systems toward Extreme Scales

DOE Office of Scientific and Technical Information (OSTI.GOV)

Knight, Samuel; Baker, Gavin Matthew; Gamell, Marc

2015-10-01

Major exascale computing reports indicate a number of software challenges to meet the dramatic change of system architectures in the near future. While a several-orders-of-magnitude increase in parallelism is the most commonly cited of those, hurdles also include performance heterogeneity of compute nodes across the system, increased imbalance between computational capacity and I/O capabilities, frequent system interrupts, and complex hardware architectures. Asynchronous task-parallel programming models show great promise in addressing these issues, but are not yet fully understood nor developed sufficiently for computational science and engineering application codes. We address these knowledge gaps through quantitative and qualitative exploration of leading candidate solutions in the context of engineering applications at Sandia. In this poster, we evaluate MiniAero code ported to three leading candidate programming models (Charm++, Legion and UINTAH) to examine the feasibility of these models that permit insertion of new programming model elements into an existing code base.
Performance of a Block Structured, Hierarchical Adaptive Mesh Refinement Code on the 64k Node IBM BlueGene/L Computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Greenough, Jeffrey A.; de Supinski, Bronis R.; Yates, Robert K.

2005-04-25

We describe the performance of the block-structured Adaptive Mesh Refinement (AMR) code Raptor on the 32k node IBM BlueGene/L computer. This machine represents a significant step forward towards petascale computing. As such, it presents Raptor with many challenges for utilizing the hardware efficiently. In terms of performance, Raptor shows excellent weak and strong scaling when running in single level mode (no adaptivity). Hardware performance monitors show Raptor achieves an aggregate performance of 3.0 Tflops in the main integration kernel on the 32k system. Results from preliminary AMR runs on a prototype astrophysical problem demonstrate the efficiency of the current software when running at large scale. The BG/L system is enabling a physics problem to be considered that represents a factor of 64 increase in overall size compared to the largest ones of this type computed to date. Finally, we provide a description of the development work currently underway to address our inefficiencies.
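For readers unfamiliar with the terms, the two scaling measures mentioned in this abstract can be sketched as follows. Strong scaling keeps the total problem size fixed while adding cores; weak scaling keeps the work per core fixed. The timings below are invented purely for illustration and are not results from Raptor or any other code in this listing.

```python
def strong_scaling_efficiency(timings):
    """Fixed total problem size: ideal wall time falls as 1/cores.

    timings maps core count -> wall time; the smallest core count is the baseline.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    return {c: (base_time * base_cores) / (t * c) for c, t in sorted(timings.items())}

def weak_scaling_efficiency(timings):
    """Fixed work per core: ideal wall time stays constant as cores increase."""
    base_time = timings[min(timings)]
    return {c: base_time / t for c, t in sorted(timings.items())}

# Hypothetical wall times in seconds (made up for this sketch).
strong = strong_scaling_efficiency({1024: 100.0, 2048: 52.0, 4096: 28.0})
weak = weak_scaling_efficiency({1024: 100.0, 2048: 104.0, 4096: 110.0})
```

An efficiency of 1.0 means ideal scaling relative to the baseline run; values below 1.0 indicate the overhead added as the core count grows.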
NCI HPC Scaling and Optimisation in Climate, Weather, Earth system science and the Geosciences

NASA Astrophysics Data System (ADS)

Evans, B. J. K.; Bermous, I.; Freeman, J.; Roberts, D. S.; Ward, M. L.; Yang, R.

2016-12-01

The Australian National Computational Infrastructure (NCI) has a national focus in the Earth system sciences including climate, weather, ocean, water management, environment and geophysics. NCI leads a Program across its partners from the Australian science agencies and research communities to identify priority computational models to scale-up. Typically, these cases place a large overall demand on the available computer time, need to scale to higher resolutions, use excessive scarce resources such as large memory or bandwidth that limits, or in some cases, need to meet requirements for transition to a separate operational forecasting system, with set time-windows. The model codes include the UK Met Office Unified Model atmospheric model (UM), GFDL's Modular Ocean Model (MOM), both the UK Met Office's GC3 and Australian ACCESS coupled-climate systems (including sea ice), 4D-Var data assimilation and satellite processing, the Regional Ocean Model (ROMS), and WaveWatch3 as well as geophysics codes including hazards, magnetotellurics, seismic inversions, and geodesy. Many of these codes use significant compute resources both for research applications as well as within the operational systems. Some of these models are particularly complex, and their behaviour had not been critically analysed for effective use of the NCI supercomputer or how they could be improved. As part of the Program, we have established a common profiling methodology that uses a suite of open source tools for performing scaling analyses. The most challenging cases are profiling multi-model coupled systems where the component models have their own complex algorithms and performance issues. We have also found issues within the current suite of profiling tools, and no single tool fully exposes the nature of the code performance.
As a result of this work, international collaborations are now in place to ensure that improvements are incorporated within the community models, and our effort can be targeted in a coordinated way. The coordinations have involved user stakeholders, the model developer community, and dependent software libraries. For example, we have spent significant time characterising I/O scalability, and improving the use of libraries such as NetCDF and HDF5.
GPU-Accelerated Large-Scale Electronic Structure Theory on Titan with a First-Principles All-Electron Code

NASA Astrophysics Data System (ADS)

Huhn, William Paul; Lange, Björn; Yu, Victor; Blum, Volker; Lee, Seyong; Yoon, Mina

Density-functional theory has been well established as the dominant quantum-mechanical computational method in the materials community. Large accurate simulations become very challenging on small to mid-scale computers and require high-performance compute platforms to succeed. GPU acceleration is one promising approach. In this talk, we present a first implementation of all-electron density-functional theory in the FHI-aims code for massively parallel GPU-based platforms. Special attention is paid to the update of the density and to the integration of the Hamiltonian and overlap matrices, realized in a domain decomposition scheme on non-uniform grids. The initial implementation scales well across nodes on ORNL's Titan Cray XK7 supercomputer (8 to 64 nodes, 16 MPI ranks/node) and shows an overall speedup in runtime due to utilization of the K20X Tesla GPUs on each Titan node of 1.4x, with the charge density update showing a speedup of 2x. Further acceleration opportunities will be discussed. Work supported by the LDRD Program of ORNL managed by UT-Battelle, LLC, for the U.S. DOE and by the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725.
Optimization of large matrix calculations for execution on the Cray X-MP vector supercomputer

NASA Technical Reports Server (NTRS)

Hornfeck, William A.

1988-01-01

A considerable volume of large computational computer codes was developed for NASA over the past twenty-five years. This code represents algorithms developed for machines of earlier generation. With the emergence of the vector supercomputer as a viable, commercially available machine, an opportunity exists to evaluate optimization strategies to improve the efficiency of existing software. This result is primarily due to architectural differences in the latest generation of large-scale machines and the earlier, mostly uniprocessor, machines. A software package being used by NASA to perform computations on large matrices is described, and a strategy for conversion to the Cray X-MP vector supercomputer is also described.
Maestro and Castro: Simulation Codes for Astrophysical Flows

NASA Astrophysics Data System (ADS)

Zingale, Michael; Almgren, Ann; Beckner, Vince; Bell, John; Friesen, Brian; Jacobs, Adam; Katz, Maximilian P.; Malone, Christopher; Nonaka, Andrew; Zhang, Weiqun

2017-01-01

Stellar explosions are multiphysics problems: modeling them requires the coordinated input of gravity solvers, reaction networks, radiation transport, and hydrodynamics together with microphysics recipes to describe the physics of matter under extreme conditions. Furthermore, these models involve following a wide range of spatial and temporal scales, which puts tough demands on simulation codes. We developed the codes Maestro and Castro to meet the computational challenges of these problems. Maestro uses a low Mach number formulation of the hydrodynamics to efficiently model convection. Castro solves the fully compressible radiation hydrodynamics equations to capture the explosive phases of stellar phenomena. Both codes are built upon the BoxLib adaptive mesh refinement library, which prepares them for next-generation exascale computers. Common microphysics shared between the codes allows us to transfer a problem from the low Mach number regime in Maestro to the explosive regime in Castro. Importantly, both codes are freely available (https://github.com/BoxLib-Codes). We will describe the design of the codes and some of their science applications, as well as future development directions. Support for development was provided by NSF award AST-1211563 and DOE/Office of Nuclear Physics grant DE-FG02-87ER40317 to Stony Brook and by the Applied Mathematics Program of the DOE Office of Advanced Scientific Computing Research under US DOE contract DE-AC02-05CH11231 to LBNL.

Wasatch: An architecture-proof multiphysics development environment using a Domain Specific Language and graph theory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Saad, Tony; Sutherland, James C.

To address the coding and software challenges of modern hybrid architectures, we propose an approach to multiphysics code development for high-performance computing. This approach is based on using a Domain Specific Language (DSL) in tandem with a directed acyclic graph (DAG) representation of the problem to be solved that allows runtime algorithm generation. When coupled with a large-scale parallel framework, the result is a portable development framework capable of executing on hybrid platforms and handling the challenges of multiphysics applications. In addition, we share our experience developing a code in such an environment, an effort that spans an interdisciplinary team of engineers and computer scientists.
Wasatch: An architecture-proof multiphysics development environment using a Domain Specific Language and graph theory

DOE PAGES

Saad, Tony; Sutherland, James C.

2016-05-04

To address the coding and software challenges of modern hybrid architectures, we propose an approach to multiphysics code development for high-performance computing. This approach is based on using a Domain Specific Language (DSL) in tandem with a directed acyclic graph (DAG) representation of the problem to be solved that allows runtime algorithm generation. When coupled with a large-scale parallel framework, the result is a portable development framework capable of executing on hybrid platforms and handling the challenges of multiphysics applications. In addition, we share our experience developing a code in such an environment, an effort that spans an interdisciplinary team of engineers and computer scientists.
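The general idea of deriving an execution order from a dependency graph at runtime can be sketched with the Python standard library. This is not Wasatch's DSL or its API; the quantity names and their dependencies below are hypothetical and serve only to show how a DAG fixes a valid evaluation order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical quantities: each key maps to the set of quantities it depends on.
dependencies = {
    "density": set(),
    "velocity": set(),
    "momentum": {"density", "velocity"},
    "kinetic_energy": {"momentum", "velocity"},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())

for name in order:
    # A real framework would evaluate or schedule the expression here.
    print("evaluate", name)
```

Because the order is computed from the graph rather than written by hand, adding a new quantity only requires declaring what it depends on.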
Experimental program for real gas flow code validation at NASA Ames Research Center

NASA Technical Reports Server (NTRS)

Deiwert, George S.; Strawa, Anthony W.; Sharma, Surendra P.; Park, Chul

1989-01-01

The experimental program for validating real gas hypersonic flow codes at NASA Ames Research Center is described. Ground-based test facilities used include ballistic ranges, shock tubes and shock tunnels, arc jet facilities and heated-air hypersonic wind tunnels. Also included are large-scale computer systems for kinetic theory simulations and benchmark code solutions. Flight tests consist of the Aeroassist Flight Experiment, the Space Shuttle, Project Fire 2, and planetary probes such as Galileo, Pioneer Venus, and PAET.
Multidimensional Multiphysics Simulation of TRISO Particle Fuel

DOE Office of Scientific and Technical Information (OSTI.GOV)

J. D. Hales; R. L. Williamson; S. R. Novascone

2013-11-01

Multidimensional multiphysics analysis of TRISO-coated particle fuel using the BISON finite-element based nuclear fuels code is described. The governing equations and material models applicable to particle fuel and implemented in BISON are outlined. Code verification based on a recent IAEA benchmarking exercise is described, and excellent comparisons are reported. Multiple TRISO-coated particles of increasing geometric complexity are considered. It is shown that the code's ability to perform large-scale parallel computations permits application to complex 3D phenomena while very efficient solutions for either 1D spherically symmetric or 2D axisymmetric geometries are straightforward. Additionally, the flexibility to easily include new physical and material models and uncomplicated ability to couple to lower length scale simulations makes BISON a powerful tool for simulation of coated-particle fuel. Future code development activities and potential applications are identified.
Final Report for ALCC Allocation: Predictive Simulation of Complex Flow in Wind Farms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barone, Matthew F.; Ananthan, Shreyas; Churchfield, Matt

This report documents work performed using ALCC computing resources granted under a proposal submitted in February 2016, with the resource allocation period spanning the period July 2016 through June 2017. The award allocation was 10.7 million processor-hours at the National Energy Research Scientific Computing Center. The simulations performed were in support of two projects: the Atmosphere to Electrons (A2e) project, supported by the DOE EERE office; and the Exascale Computing Project (ECP), supported by the DOE Office of Science. The project team for both efforts consists of staff scientists and postdocs from Sandia National Laboratories and the National Renewable Energy Laboratory. At the heart of these projects is the open-source computational-fluid-dynamics (CFD) code, Nalu. Nalu solves the low-Mach-number Navier-Stokes equations using an unstructured-grid discretization. Nalu leverages the open-source Trilinos solver library and the Sierra Toolkit (STK) for parallelization and I/O. This report documents baseline computational performance of the Nalu code on problems of direct relevance to the wind plant physics application - namely, Large Eddy Simulation (LES) of an atmospheric boundary layer (ABL) flow and wall-modeled LES of a flow past a static wind turbine rotor blade. Parallel performance of Nalu and its constituent solver routines residing in the Trilinos library has been assessed previously under various campaigns. However, both Nalu and Trilinos have been, and remain, in active development and resources have not been available previously to rigorously track code performance over time. With the initiation of the ECP, it is important to establish and document baseline code performance on the problems of interest. This will allow the project team to identify and target any deficiencies in performance, as well as highlight any performance bottlenecks as we exercise the code on a greater variety of platforms and at larger scales.
The current study is rather modest in scale, examining performance on problem sizes of O(100 million) elements and core counts up to 8k cores. This will be expanded as more computational resources become available to the projects.
Comprehensive Benchmark Suite for Simulation of Particle Laden Flows Using the Discrete Element Method with Performance Profiles from the Multiphase Flow with Interface eXchanges (MFiX) Code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liu, Peiyuan; Brown, Timothy; Fullmer, William D.

Five benchmark problems are developed and simulated with the computational fluid dynamics and discrete element model code MFiX. The benchmark problems span dilute and dense regimes, consider statistically homogeneous and inhomogeneous (both clusters and bubbles) particle concentrations and a range of particle and fluid dynamic computational loads. Several variations of the benchmark problems are also discussed to extend the computational phase space to cover granular (particles only), bidisperse and heat transfer cases. A weak scaling analysis is performed for each benchmark problem and, in most cases, the scalability of the code appears reasonable up to approx. 10^3 cores. Profiling of the benchmark problems indicates that the most substantial computational time is being spent on particle-particle force calculations, drag force calculations and interpolating between discrete particle and continuum fields. Hardware performance analysis was also carried out showing significant Level 2 cache miss ratios and a rather low degree of vectorization. These results are intended to serve as a baseline for future developments to the code as well as a preliminary indicator of where to best focus performance optimizations.
Lightweight computational steering of very large scale molecular dynamics simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Beazley, D.M.; Lomdahl, P.S.

1996-09-01

We present a computational steering approach for controlling, analyzing, and visualizing very large scale molecular dynamics simulations involving tens to hundreds of millions of atoms. Our approach relies on extensible scripting languages and an easy to use tool for building extensions and modules. The system is extremely easy to modify, works with existing C code, is memory efficient, and can be used from inexpensive workstations and networks. We demonstrate how we have used this system to manipulate data from production MD simulations involving as many as 104 million atoms running on the CM-5 and Cray T3D. We also show how this approach can be used to build systems that integrate common scripting languages (including Tcl/Tk, Perl, and Python), simulation code, user extensions, and commercial data analysis packages.
Parallel and serial computing tools for testing single-locus and epistatic SNP effects of quantitative traits in genome-wide association studies

PubMed Central

Ma, Li; Runesha, H Birali; Dvorkin, Daniel; Garbe, John R; Da, Yang

2008-01-01

Background: Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) markers provide opportunities to detect epistatic SNPs associated with quantitative traits and to detect the exact mode of an epistasis effect. Computational difficulty is the main bottleneck for epistasis testing in large scale GWAS. Results: The EPISNPmpi and EPISNP computer programs were developed for testing single-locus and epistatic SNP effects on quantitative traits in GWAS, including tests of three single-locus effects for each SNP (SNP genotypic effect, additive and dominance effects) and five epistasis effects for each pair of SNPs (two-locus interaction, additive × additive, additive × dominance, dominance × additive, and dominance × dominance) based on the extended Kempthorne model. EPISNPmpi is the parallel computing program for epistasis testing in large scale GWAS and achieved excellent scalability for large scale analysis and portability for various parallel computing platforms. EPISNP is the serial computing program based on the EPISNPmpi code for epistasis testing in small scale GWAS using commonly available operating systems and computer hardware. Three serial computing utility programs were developed for graphical viewing of test results and epistasis networks, and for estimating CPU time and disk space requirements. Conclusion: The EPISNPmpi parallel computing program provides an effective computing tool for epistasis testing in large scale GWAS, and the epiSNP serial computing programs are convenient tools for epistasis analysis in small scale GWAS using commonly available computer hardware. PMID:18644146
Probabilistic simulation of multi-scale composite behavior

NASA Technical Reports Server (NTRS)

Liaw, D. G.; Shiao, M. C.; Singhal, S. N.; Chamis, Christos C.

1993-01-01

A methodology is developed to computationally assess the probabilistic composite material properties at all composite scale levels due to the uncertainties in the constituent (fiber and matrix) properties and in the fabrication process variables. The methodology is computationally efficient for simulating the probability distributions of material properties. The sensitivity of the probabilistic composite material property to each random variable is determined. This information can be used to reduce undesirable uncertainties in material properties at the macro scale of the composite by reducing the uncertainties in the most influential random variables at the micro scale. This methodology was implemented into the computer code PICAN (Probabilistic Integrated Composite ANalyzer). The accuracy and efficiency of this methodology are demonstrated by simulating the uncertainties in the material properties of a typical laminate and comparing the results with the Monte Carlo simulation method. The experimental data of composite material properties at all scales fall within the scatters predicted by PICAN.
Semantic Interoperability for Computational Mineralogy: Experiences of the eMinerals Consortium

NASA Astrophysics Data System (ADS)

Walker, A. M.; White, T. O.; Dove, M. T.; Bruin, R. P.; Couch, P. A.; Tyer, R. P.

2006-12-01

The use of atomic scale computer simulation of minerals to obtain information for geophysics and environmental science has grown enormously over the past couple of decades. It is now routine to probe mineral behavior in the Earth's deep interior and in the surface environment by borrowing methods and simulation codes from computational chemistry and physics. It is becoming increasingly important to use methods embodied in more than one of these codes to solve any single scientific problem. However, scientific codes are rarely designed for easy interoperability and data exchange; data formats are often code-specific, poorly documented and fragile, liable to frequent change between software versions, and even compiler versions. This means that the scientist's simple desire to use the methodological approaches offered by multiple codes is frustrated, and even the sharing of data between collaborators becomes fraught with difficulties. The eMinerals consortium was formed in the early stages of the UK eScience program with the aim of developing the tools needed to apply atomic scale simulation to environmental problems in a grid-enabled world, and to harness the computational power offered by grid technologies to address some outstanding mineralogical problems. One example of the kind of problem we can tackle is the origin of the compressibility anomaly in silica glass. By passing data directly between simulation and analysis tools we were able to probe this effect in more detail than has previously been possible and have shown how the anomaly is related to the details of the amorphous structure. In order to approach this kind of problem we have constructed a mini-grid, a small scale and extensible combined compute- and data-grid that allows the execution of many calculations in parallel, and the transparent storage of semantically-rich marked-up result data. Importantly, we automatically capture multiple kinds of metadata and key results from each calculation. 
We believe that the lessons learned and tools developed will be useful in many areas of science beyond computational mineralogy. Key tools that will be described include: a pure Fortran XML library (FoX) that presents XPath, SAX and DOM interfaces as well as permitting the easy production of valid XML from legacy Fortran programs; a job submission framework that automatically schedules calculations to remote grid resources, handles data staging and metadata capture; and a tool (AgentX) that maps concepts from an ontology onto locations in documents of various formats that we use to enable data exchange.
Efficient parallelization of analytic bond-order potentials for large-scale atomistic simulations

NASA Astrophysics Data System (ADS)

Teijeiro, C.; Hammerschmidt, T.; Drautz, R.; Sutmann, G.

2016-07-01

Analytic bond-order potentials (BOPs) provide a way to compute atomistic properties with controllable accuracy. For large-scale computations of heterogeneous compounds at the atomistic level, both the computational efficiency and memory demand of BOP implementations have to be optimized. Since the evaluation of BOPs is a local operation within a finite environment, the parallelization concepts known from short-range interacting particle simulations can be applied to improve the performance of these simulations. In this work, several efficient parallelization methods for BOPs that use three-dimensional domain decomposition schemes are described. The schemes are implemented into the bond-order potential code BOPfox, and their performance is measured in a series of benchmarks. Systems of up to several millions of atoms are simulated on a high performance computing system, and parallel scaling is demonstrated for up to thousands of processors.
The Magnetic Reconnection Code: an AMR-based fully implicit simulation suite

NASA Astrophysics Data System (ADS)

Germaschewski, K.; Bhattacharjee, A.; Ng, C.-S.

2006-12-01

Extended MHD models, which incorporate two-fluid effects, are promising candidates to enhance understanding of collisionless reconnection phenomena in laboratory, space and astrophysical plasma physics. In this paper, we introduce two simulation codes in the Magnetic Reconnection Code suite which integrate reduced and full extended MHD models. Numerical integration of these models comes with two challenges. First, small-scale spatial structures, e.g. thin current sheets, develop and must be well resolved by the code. Adaptive mesh refinement (AMR) is employed to provide high resolution where needed while maintaining good performance. Secondly, the two-fluid effects in extended MHD give rise to dispersive waves, which lead to a very stringent CFL condition for explicit codes, while reconnection happens on a much slower time scale. We use a fully implicit Crank-Nicolson time stepping algorithm. Since no efficient preconditioners are available for our system of equations, we instead use a direct solver to handle the inner linear solves. This requires us to actually compute the Jacobian matrix, which is handled by a code generator that calculates the derivative symbolically and then outputs code to calculate it.
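The motivation for implicit time stepping given in this abstract can be seen on a toy problem. The sketch below applies explicit Euler and Crank-Nicolson to the scalar linear test equation du/dt = lam * u with a fast decay rate. It is an assumed stand-in for illustration only: the codes described above solve nonlinear extended MHD systems with a Jacobian and a direct linear solver, none of which is reproduced here, and the values of lam and dt are arbitrary.

```python
def explicit_euler_step(u, lam, dt):
    """One explicit Euler step for du/dt = lam * u."""
    return u * (1.0 + dt * lam)

def crank_nicolson_step(u, lam, dt):
    """One Crank-Nicolson step: average of the right-hand side at old and new times.

    Solves u_new = u + 0.5 * dt * lam * (u + u_new) for u_new in closed form.
    """
    return u * (1.0 + 0.5 * dt * lam) / (1.0 - 0.5 * dt * lam)

def integrate(step, u0, lam, dt, n_steps):
    """Apply the given one-step method n_steps times."""
    u = u0
    for _ in range(n_steps):
        u = step(u, lam, dt)
    return u

lam = -1000.0  # fast decaying mode (arbitrary value for illustration)
dt = 0.01      # far larger than the explicit stability limit of 2 / |lam|

u_explicit = integrate(explicit_euler_step, 1.0, lam, dt, 50)
u_implicit = integrate(crank_nicolson_step, 1.0, lam, dt, 50)
```

With this step size the explicit update multiplies the solution by -9 each step and diverges, while the Crank-Nicolson update multiplies it by -2/3 and decays, which is why an implicit scheme can take steps set by the slow dynamics rather than by the fastest waves.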
Final Report Scalable Analysis Methods and In Situ Infrastructure for Extreme Scale Knowledge Discovery

DOE Office of Scientific and Technical Information (OSTI.GOV)

O'Leary, Patrick

The primary challenge motivating this project is the widening gap between the ability to compute information and to store it for subsequent analysis. This gap adversely impacts science code teams, who can perform analysis only on a small fraction of the data they calculate, resulting in the substantial likelihood of lost or missed science, when results are computed but not analyzed. Our approach is to perform as much analysis or visualization processing on data while it is still resident in memory, which is known as in situ processing. The idea of in situ processing was not new at the time of the start of this effort in 2014, but efforts in that space were largely ad hoc, and there was no concerted effort within the research community that aimed to foster production-quality software tools suitable for use by Department of Energy (DOE) science projects. Our objective was to produce and enable the use of production-quality in situ methods and infrastructure, at scale, on DOE high-performance computing (HPC) facilities, though we expected to have an impact beyond DOE due to the widespread nature of the challenges, which affect virtually all large-scale computational science efforts. To achieve this objective, we engaged in software technology research and development (R&D), in close partnerships with DOE science code teams, to produce software technologies that were shown to run efficiently at scale on DOE HPC platforms.
Making Ordered DNA and Protein Structures from Computer-Printed Transparency Film Cut-Outs

ERIC Educational Resources Information Center

Jittivadhna, Karnyupha; Ruenwongsa, Pintip; Panijpan, Bhinyo

2009-01-01

Instructions are given for building physical scale models of ordered structures of B-form DNA, protein [alpha]-helix, and parallel and antiparallel protein [beta]-pleated sheets made from colored computer printouts designed for transparency film sheets. Cut-outs from these sheets are easily assembled. Conventional color coding for atoms is used…
Exposure calculation code module for reactor core analysis: BURNER

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vondy, D.R.; Cunningham, G.W.

1979-02-01

The code module BURNER for nuclear reactor exposure calculations is presented. The computer requirements are shown, as are the reference data and interface data file requirements, and the programmed equations and procedure of calculation are described. The operating history of a reactor is followed over the period between solutions of the space, energy neutronics problem. The end-of-period nuclide concentrations are determined given the necessary information. A steady state, continuous fueling model is treated in addition to the usual fixed fuel model. The control options provide flexibility to select among an unusually wide variety of programmed procedures. The code also provides user options to make a number of auxiliary calculations and print such information as the local gamma source, cumulative exposure, and a fine scale power density distribution in a selected zone. The code is used locally in a system for computation which contains the VENTURE diffusion theory neutronics code and other modules.
Predictions of the spontaneous symmetry-breaking theory for visual code completeness and spatial scaling in single-cell learning rules.

PubMed

Webber, C J

2001-05-01

This article shows analytically that single-cell learning rules that give rise to oriented and localized receptive fields, when their synaptic weights are randomly and independently initialized according to a plausible assumption of zero prior information, will generate visual codes that are invariant under two-dimensional translations, rotations, and scale magnifications, provided that the statistics of their training images are sufficiently invariant under these transformations. Such codes span different image locations, orientations, and size scales with equal economy. Thus, single-cell rules could account for the spatial scaling property of the cortical simple-cell code. This prediction is tested computationally by training with natural scenes; it is demonstrated that a single-cell learning rule can give rise to simple-cell receptive fields spanning the full range of orientations, image locations, and spatial frequencies (except at the extreme high and low frequencies at which the scale invariance of the statistics of digitally sampled images must ultimately break down, because of the image boundary and the finite pixel resolution). Thus, no constraint on completeness, or any other coupling between cells, is necessary to induce the visual code to span wide ranges of locations, orientations, and size scales. This prediction is made using the theory of spontaneous symmetry breaking, which we have previously shown can also explain the data-driven self-organization of a wide variety of transformation invariances in neurons' responses, such as the translation invariance of complex cell response.
GeNN: a code generation framework for accelerated brain simulations

NASA Astrophysics Data System (ADS)

Yavuz, Esin; Turner, James; Nowotny, Thomas

2016-01-01

Large-scale numerical simulations of detailed brain circuit models are important for identifying hypotheses on brain functions and testing their consistency and plausibility. An ongoing challenge for simulating realistic models is, however, computational speed. In this paper, we present the GeNN (GPU-enhanced Neuronal Networks) framework, which aims to facilitate the use of graphics accelerators for computational models of large-scale neuronal networks to address this challenge. GeNN is an open source library that generates code to accelerate the execution of network simulations on NVIDIA GPUs, through a flexible and extensible interface, which does not require in-depth technical knowledge from the users. We present performance benchmarks showing that 200-fold speedup compared to a single core of a CPU can be achieved for a network of one million conductance based Hodgkin-Huxley neurons but that for other models the speedup can differ. GeNN is available for Linux, Mac OS X and Windows platforms. The source code, user manual, tutorials, Wiki, in-depth example projects and all other related information can be found on the project website http://genn-team.github.io/genn/.
GeNN: a code generation framework for accelerated brain simulations.

PubMed

Yavuz, Esin; Turner, James; Nowotny, Thomas

2016-01-07

Large-scale numerical simulations of detailed brain circuit models are important for identifying hypotheses on brain functions and testing their consistency and plausibility. An ongoing challenge for simulating realistic models is, however, computational speed. In this paper, we present the GeNN (GPU-enhanced Neuronal Networks) framework, which aims to facilitate the use of graphics accelerators for computational models of large-scale neuronal networks to address this challenge. GeNN is an open source library that generates code to accelerate the execution of network simulations on NVIDIA GPUs, through a flexible and extensible interface, which does not require in-depth technical knowledge from the users. We present performance benchmarks showing that 200-fold speedup compared to a single core of a CPU can be achieved for a network of one million conductance based Hodgkin-Huxley neurons but that for other models the speedup can differ. GeNN is available for Linux, Mac OS X and Windows platforms. The source code, user manual, tutorials, Wiki, in-depth example projects and all other related information can be found on the project website http://genn-team.github.io/genn/.
GeNN: a code generation framework for accelerated brain simulations

PubMed Central

Yavuz, Esin; Turner, James; Nowotny, Thomas

2016-01-01

Large-scale numerical simulations of detailed brain circuit models are important for identifying hypotheses on brain functions and testing their consistency and plausibility. An ongoing challenge for simulating realistic models is, however, computational speed. In this paper, we present the GeNN (GPU-enhanced Neuronal Networks) framework, which aims to facilitate the use of graphics accelerators for computational models of large-scale neuronal networks to address this challenge. GeNN is an open source library that generates code to accelerate the execution of network simulations on NVIDIA GPUs, through a flexible and extensible interface, which does not require in-depth technical knowledge from the users. We present performance benchmarks showing that 200-fold speedup compared to a single core of a CPU can be achieved for a network of one million conductance based Hodgkin-Huxley neurons but that for other models the speedup can differ. GeNN is available for Linux, Mac OS X and Windows platforms. The source code, user manual, tutorials, Wiki, in-depth example projects and all other related information can be found on the project website http://genn-team.github.io/genn/. PMID:26740369
Performance optimization of Qbox and WEST on Intel Knights Landing

NASA Astrophysics Data System (ADS)

Zheng, Huihuo; Knight, Christopher; Galli, Giulia; Govoni, Marco; Gygi, Francois

We present the optimization of electronic structure codes Qbox and WEST targeting the Intel® Xeon Phi™ processor, codenamed Knights Landing (KNL). Qbox is an ab-initio molecular dynamics code based on plane wave density functional theory (DFT) and WEST is a post-DFT code for excited state calculations within many-body perturbation theory. Both Qbox and WEST employ highly scalable algorithms which enable accurate large-scale electronic structure calculations on leadership class supercomputer platforms beyond 100,000 cores, such as Mira and Theta at the Argonne Leadership Computing Facility. In this work, features of the KNL architecture (e.g. hierarchical memory) are explored to achieve higher performance in key algorithms of the Qbox and WEST codes and to develop a road-map for further development targeting next-generation computing architectures. In particular, the optimizations of the Qbox and WEST codes on the KNL platform will target efficient large-scale electronic structure calculations of nanostructured materials exhibiting complex structures and prediction of their electronic and thermal properties for use in solar and thermal energy conversion devices. This work was supported by MICCoM, as part of Comp. Mats. Sci. Program funded by the U.S. DOE, Office of Sci., BES, MSE Division. This research used resources of the ALCF, which is a DOE Office of Sci. User Facility under Contract DE-AC02-06CH11357.

ELEFANT: a user-friendly multipurpose geodynamics code

NASA Astrophysics Data System (ADS)

Thieulot, C.

2014-07-01

A new finite element code for the solution of the Stokes and heat transport equations is presented. It has purposely been designed to address geological flow problems in two and three dimensions at crustal and lithospheric scales. The code relies on the Marker-in-Cell technique, and Lagrangian markers are used to track materials in the simulation domain, which allows recording of the integrated history of deformation; their (number) density is variable and dynamically adapted. A variety of rheologies has been implemented including nonlinear thermally activated dislocation and diffusion creep and brittle (or plastic) frictional models. The code is built on the Arbitrary Lagrangian Eulerian kinematic description: the computational grid deforms vertically and allows for a true free surface while the computational domain remains of constant width in the horizontal direction. The solution to the large system of algebraic equations resulting from the finite element discretisation and linearisation of the set of coupled partial differential equations to be solved is obtained by means of the efficient parallel direct solver MUMPS whose performance is thoroughly tested, or by means of the WISMP and AGMG iterative solvers. The code accuracy is assessed by means of many geodynamically relevant benchmark experiments which highlight specific features or algorithms, e.g., the implementation of the free surface stabilisation algorithm, the (visco-)plastic rheology implementation, the temperature advection, the capacity of the code to handle large viscosity contrasts. A two-dimensional application to salt tectonics presented as a case study illustrates the potential of the code to model large scale high resolution thermo-mechanically coupled free surface flows.
Implementation and Characterization of Three-Dimensional Particle-in-Cell Codes on Multiple-Instruction-Multiple-Data Massively Parallel Supercomputers

NASA Technical Reports Server (NTRS)

Lyster, P. M.; Liewer, P. C.; Decyk, V. K.; Ferraro, R. D.

1995-01-01

A three-dimensional electrostatic particle-in-cell (PIC) plasma simulation code has been developed on coarse-grain distributed-memory massively parallel computers with message passing communications. Our implementation is the generalization to three dimensions of the general concurrent particle-in-cell (GCPIC) algorithm. In the GCPIC algorithm, the particle computation is divided among the processors using a domain decomposition of the simulation domain. In a three-dimensional simulation, the domain can be partitioned into one-, two-, or three-dimensional subdomains ("slabs," "rods," or "cubes") and we investigate the efficiency of the parallel implementation of the push for all three choices. The present implementation runs on the Intel Touchstone Delta machine at Caltech, a multiple-instruction-multiple-data (MIMD) parallel computer with 512 nodes. We find that the parallel efficiency of the push is very high, with the ratio of communication to computation time in the range 0.3%-10.0%. The highest efficiency (> 99%) occurs for a large, scaled problem with 64^3 particles per processing node (approximately 134 million particles on 512 nodes) which has a push time of about 250 ns per particle per time step. We have also developed expressions for the timing of the code which are a function of both code parameters (number of grid points, particles, etc.) and machine-dependent parameters (effective FLOP rate, and the effective interprocessor bandwidths for the communication of particles and grid points). These expressions can be used to estimate the performance of scaled problems--including those with inhomogeneous plasmas--to other parallel machines once the machine-dependent parameters are known.
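The figures quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that parallel efficiency is defined as computation time divided by computation plus communication time, which the abstract does not state, and the function name parallel_efficiency is hypothetical.

```python
# Worked arithmetic for the scaled PIC problem quoted in the abstract.
# Assumption (not from the source): efficiency = t_comp / (t_comp + t_comm).

PARTICLES_PER_NODE = 64 ** 3   # 64^3 particles per processing node
NODES = 512                    # Intel Touchstone Delta node count

total_particles = PARTICLES_PER_NODE * NODES


def parallel_efficiency(comm_to_comp_ratio: float) -> float:
    """Efficiency under the assumed model, given t_comm / t_comp."""
    return 1.0 / (1.0 + comm_to_comp_ratio)


if __name__ == "__main__":
    print(f"total particles: {total_particles:,}")
    for ratio in (0.003, 0.10):  # the 0.3%-10.0% range from the abstract
        print(f"ratio {ratio:.3f} -> efficiency {parallel_efficiency(ratio):.4f}")
```

Under this assumed model, the lower end of the reported communication-to-computation range is consistent with the quoted efficiency above 99%, and the particle total matches the "approximately 134 million" figure.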
Intercomparison of 3D pore-scale flow and solute transport simulation methods

DOE PAGES

Mehmani, Yashar; Schoenherr, Martin; Pasquali, Andrea; ...

2015-09-28

Multiple numerical approaches have been developed to simulate porous media fluid flow and solute transport at the pore scale. These include 1) methods that explicitly model the three-dimensional geometry of pore spaces and 2) methods that conceptualize the pore space as a topologically consistent set of stylized pore bodies and pore throats. In previous work we validated a model of the first type, using computational fluid dynamics (CFD) codes employing a standard finite volume method (FVM), against magnetic resonance velocimetry (MRV) measurements of pore-scale velocities. Here we expand that validation to include additional models of the first type based on the lattice Boltzmann method (LBM) and smoothed particle hydrodynamics (SPH), as well as a model of the second type, a pore-network model (PNM). The PNM approach used in the current study was recently improved and demonstrated to accurately simulate solute transport in a two-dimensional experiment. While the PNM approach is computationally much less demanding than direct numerical simulation methods, the effect of conceptualizing complex three-dimensional pore geometries on solute transport in the manner of PNMs has not been fully determined. We apply all four approaches (FVM-based CFD, LBM, SPH and PNM) to simulate pore-scale velocity distributions and (for capable codes) nonreactive solute transport, and intercompare the model results. Comparisons are drawn both in terms of macroscopic variables (e.g., permeability, solute breakthrough curves) and microscopic variables (e.g., local velocities and concentrations). Generally good agreement was achieved among the various approaches, but some differences were observed depending on the model context. The intercomparison work was challenging because of variable capabilities of the codes, and inspired some code enhancements to allow consistent comparison of flow and transport simulations across the full suite of methods. This paper provides support for confidence in a variety of pore-scale modeling methods and motivates further development and application of pore-scale simulation methods.
Intercomparison of 3D pore-scale flow and solute transport simulation methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yang, Xiaofan; Mehmani, Yashar; Perkins, William A.

2016-09-01

Multiple numerical approaches have been developed to simulate porous media fluid flow and solute transport at the pore scale. These include 1) methods that explicitly model the three-dimensional geometry of pore spaces and 2) methods that conceptualize the pore space as a topologically consistent set of stylized pore bodies and pore throats. In previous work we validated a model of the first type, using computational fluid dynamics (CFD) codes employing a standard finite volume method (FVM), against magnetic resonance velocimetry (MRV) measurements of pore-scale velocities. Here we expand that validation to include additional models of the first type based on the lattice Boltzmann method (LBM) and smoothed particle hydrodynamics (SPH), as well as a model of the second type, a pore-network model (PNM). The PNM approach used in the current study was recently improved and demonstrated to accurately simulate solute transport in a two-dimensional experiment. While the PNM approach is computationally much less demanding than direct numerical simulation methods, the effect of conceptualizing complex three-dimensional pore geometries on solute transport in the manner of PNMs has not been fully determined. We apply all four approaches (FVM-based CFD, LBM, SPH and PNM) to simulate pore-scale velocity distributions and (for capable codes) nonreactive solute transport, and intercompare the model results. Comparisons are drawn both in terms of macroscopic variables (e.g., permeability, solute breakthrough curves) and microscopic variables (e.g., local velocities and concentrations). Generally good agreement was achieved among the various approaches, but some differences were observed depending on the model context. The intercomparison work was challenging because of variable capabilities of the codes, and inspired some code enhancements to allow consistent comparison of flow and transport simulations across the full suite of methods. This study provides support for confidence in a variety of pore-scale modeling methods and motivates further development and application of pore-scale simulation methods.
Multi-level discriminative dictionary learning with application to large scale image classification.

PubMed

Shen, Li; Sun, Gang; Huang, Qingming; Wang, Shuhui; Lin, Zhouchen; Wu, Enhua

2015-10-01

The sparse coding technique has shown flexibility and capability in image representation and analysis. It is a powerful tool in many visual applications. Some recent work has shown that incorporating the properties of the task (such as discrimination for a classification task) into dictionary learning is effective for improving the accuracy. However, the traditional supervised dictionary learning methods suffer from high computational complexity when dealing with a large number of categories, making them less satisfactory in large scale applications. In this paper, we propose a novel multi-level discriminative dictionary learning method and apply it to large scale image classification. Our method takes advantage of hierarchical category correlation to encode multi-level discriminative information. Each internal node of the category hierarchy is associated with a discriminative dictionary and a classification model. The dictionaries at different layers are learnt to capture the information of different scales. Moreover, each node at lower layers also inherits the dictionary of its parent, so that the categories at lower layers can be described with multi-scale information. The learning of dictionaries and associated classification models is jointly conducted by minimizing an overall tree loss. The experimental results on challenging data sets demonstrate that our approach achieves excellent accuracy and competitive computation cost compared with other sparse coding methods for large scale image classification.
Hybrid model for simulation of plasma jet injection in tokamak

NASA Astrophysics Data System (ADS)

Galkin, Sergei A.; Bogatu, I. N.

2016-10-01

The hybrid kinetic model of plasma treats the ions as kinetic particles and the electrons as a charge-neutralizing massless fluid. The model is essentially applicable when most of the energy is concentrated in the ions rather than in the electrons, i.e. it is well suited for the high-density hyper-velocity C60 plasma jet. The hybrid model separates the slower ion time scale from the faster electron time scale, which can then be disregarded. That is why hybrid codes consistently outperform the traditional PIC codes in computational efficiency, while still resolving kinetic ion effects. We discuss a 2D hybrid model and code with an exactly energy-conserving numerical algorithm and present some results of its application to simulation of C60 plasma jet penetration through a tokamak-like magnetic barrier. We also examine the 3D model/code extension and its possible applications to tokamak and ionospheric plasmas. The work is supported in part by US DOE DE-SC0015776 Grant.
Large-Scale Computation of Nuclear Magnetic Resonance Shifts for Paramagnetic Solids Using CP2K.

PubMed

Mondal, Arobendo; Gaultois, Michael W; Pell, Andrew J; Iannuzzi, Marcella; Grey, Clare P; Hutter, Jürg; Kaupp, Martin

2018-01-09

Large-scale computations of nuclear magnetic resonance (NMR) shifts for extended paramagnetic solids (pNMR) are reported using the highly efficient Gaussian-augmented plane-wave implementation of the CP2K code. Combining hyperfine couplings obtained with hybrid functionals with g-tensors and orbital shieldings computed using gradient-corrected functionals, contact, pseudocontact, and orbital-shift contributions to pNMR shifts are accessible. Due to the efficient and highly parallel performance of CP2K, a wide variety of materials with large unit cells can be studied with extended Gaussian basis sets. Validation of various approaches for the different contributions to pNMR shifts is done first for molecules in a large supercell in comparison with typical quantum-chemical codes. This is then extended to a detailed study of g-tensors for extended solid transition-metal fluorides and for a series of complex lithium vanadium phosphates. Finally, lithium pNMR shifts are computed for Li3V2(PO4)3, for which detailed experimental data are available. This has allowed an in-depth study of different approaches (e.g., full periodic versus incremental cluster computations of g-tensors and different functionals and basis sets for hyperfine computations) as well as a thorough analysis of the different contributions to the pNMR shifts. This study paves the way for a more-widespread computational treatment of NMR shifts for paramagnetic materials.
Atomic-scale Modeling of the Structure and Dynamics of Dislocations in Complex Alloys at High Temperatures

NASA Technical Reports Server (NTRS)

Daw, Murray S.; Mills, Michael J.

2003-01-01

We report on the progress made during the first year of the project. Most of the progress at this point has been on the theoretical and computational side. Here are the highlights: (1) A new code, tailored for high-end desktop computing, now combines modern Accelerated Dynamics (AD) with the well-tested Embedded Atom Method (EAM); (2) The new Accelerated Dynamics allows the study of relatively slow, thermally-activated processes, such as diffusion, which are much too slow for traditional Molecular Dynamics; (3) We have benchmarked the new AD code on a rather simple and well-known process: vacancy diffusion in copper; and (4) We have begun application of the AD code to the diffusion of vacancies in ordered intermetallics.
A multiphysics and multiscale software environment for modeling astrophysical systems

NASA Astrophysics Data System (ADS)

Portegies Zwart, Simon; McMillan, Steve; Harfst, Stefan; Groen, Derek; Fujii, Michiko; Nualláin, Breanndán Ó.; Glebbeek, Evert; Heggie, Douglas; Lombardi, James; Hut, Piet; Angelou, Vangelis; Banerjee, Sambaran; Belkus, Houria; Fragos, Tassos; Fregeau, John; Gaburov, Evghenii; Izzard, Rob; Jurić, Mario; Justham, Stephen; Sottoriva, Andrea; Teuben, Peter; van Bever, Joris; Yaron, Ofer; Zemp, Marcel

2009-05-01

We present MUSE, a software framework for combining existing computational tools for different astrophysical domains into a single multiphysics, multiscale application. MUSE facilitates the coupling of existing codes written in different languages by providing inter-language tools and by specifying an interface between each module and the framework that represents a balance between generality and computational efficiency. This approach allows scientists to use combinations of codes to solve highly coupled problems without the need to write new codes for other domains or significantly alter their existing codes. MUSE currently incorporates the domains of stellar dynamics, stellar evolution and stellar hydrodynamics for studying generalized stellar systems. We have now reached a "Noah's Ark" milestone, with (at least) two available numerical solvers for each domain. MUSE can treat multiscale and multiphysics systems in which the time- and size-scales are well separated, like simulating the evolution of planetary systems, small stellar associations, dense stellar clusters, galaxies and galactic nuclei. In this paper we describe three examples calculated using MUSE: the merger of two galaxies, the merger of two evolving stars, and a hybrid N-body simulation. In addition, we demonstrate an implementation of MUSE on a distributed computer which may also include special-purpose hardware, such as GRAPEs or GPUs, to accelerate computations. The current MUSE code base is publicly available as open source at http://muse.li.
A parallel-vector algorithm for rapid structural analysis on high-performance computers

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.

1990-01-01

A fast, accurate Choleski method for the solution of symmetric systems of linear equations is presented. This direct method is based on a variable-band storage scheme and takes advantage of column heights to reduce the number of operations in the Choleski factorization. The method employs parallel computation in the outermost DO-loop and vector computation via the 'loop unrolling' technique in the innermost DO-loop. The method avoids computations with zeros outside the column heights, and as an option, zeros inside the band. The close relationship between Choleski and Gauss elimination methods is examined. The minor changes required to convert the Choleski code to a Gauss code to solve non-positive-definite symmetric systems of equations are identified. The results for two large-scale structural analyses performed on supercomputers demonstrate the accuracy and speed of the method.
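The variable-band idea in the abstract above can be illustrated with a minimal serial sketch. This is not the authors' parallel-vector Fortran code: it is a plain Python illustration, assuming a symmetric positive-definite matrix stored column by column from the first nonzero row down to the diagonal, and the names skyline_cholesky and skyline_solve are hypothetical. It shows only how column heights limit the inner products, not the parallel outer loop or the unrolled inner loop.

```python
import math


def skyline_cholesky(cols):
    """Factor A = U^T U for a symmetric positive-definite matrix.

    cols[j] holds the entries of column j of the upper triangle, from the
    first stored row top[j] down to the diagonal (variable-band storage).
    Returns (u, top), where u uses the same layout as cols.
    """
    n = len(cols)
    top = [j - (len(cols[j]) - 1) for j in range(n)]
    u = [list(c) for c in cols]
    for j in range(n):
        for i in range(top[j], j + 1):
            # Only rows inside BOTH column heights can contribute.
            k0 = max(top[i], top[j])
            s = u[j][i - top[j]]
            for k in range(k0, i):
                s -= u[i][k - top[i]] * u[j][k - top[j]]
            if i < j:
                u[j][i - top[j]] = s / u[i][i - top[i]]
            else:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                u[j][j - top[j]] = math.sqrt(s)
    return u, top


def skyline_solve(u, top, b):
    """Solve A x = b given the factor from skyline_cholesky."""
    n = len(u)
    y = list(b)
    # Forward substitution with U^T (row i of U^T is column i of U).
    for i in range(n):
        for k in range(top[i], i):
            y[i] -= u[i][k - top[i]] * y[k]
        y[i] /= u[i][i - top[i]]
    # Back substitution with U, sweeping columns from last to first.
    x = [0.0] * n
    for j in range(n - 1, -1, -1):
        x[j] = y[j] / u[j][j - top[j]]
        for k in range(top[j], j):
            y[k] -= u[j][k - top[j]] * x[j]
    return x


if __name__ == "__main__":
    # A = [[4, 2, 0], [2, 5, 2], [0, 2, 5]]; the zero at (0, 2) is not stored.
    cols = [[4.0], [2.0, 5.0], [2.0, 5.0]]
    u, top = skyline_cholesky(cols)
    print(skyline_solve(u, top, [8.0, 18.0, 19.0]))
```

The factor keeps the same column profile as the input, which is why entries above each column's first stored row never need to be stored or touched.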
Computing and Visualizing the Complex Dynamics of Earthquake Fault Systems: Towards Ensemble Earthquake Forecasting

NASA Astrophysics Data System (ADS)

Rundle, J.; Rundle, P.; Donnellan, A.; Li, P.

2003-12-01

We consider the problem of the complex dynamics of earthquake fault systems, and whether numerical simulations can be used to define an ensemble forecasting technology similar to that used in weather and climate research. To effectively carry out such a program, we need 1) a topologically realistic model to simulate the fault system; 2) data sets to constrain the model parameters through a systematic program of data assimilation; 3) a computational technology making use of modern paradigms of high performance and parallel computing systems; and 4) software to visualize and analyze the results. In particular, we focus attention on a new version of our code Virtual California (version 2001) in which we model all of the major strike slip faults extending throughout California, from the Mexico-California border to the Mendocino Triple Junction. We use the historic data set of earthquakes with magnitude M > 6 to define the frictional properties of all 654 fault segments (degrees of freedom) in the model. Previous versions of Virtual California had used only 215 fault segments to model the strike slip faults in southern California. To compute the dynamics and the associated surface deformation, we use message passing as implemented in the MPICH standard distribution on a small Beowulf cluster consisting of 10 CPUs. We are also planning to run the code on significantly larger machines so that we can begin to examine much finer spatial scales of resolution, and to assess scaling properties of the code. We present results of simulations both as static images and as MPEG movies, so that the dynamical aspects of the computation can be assessed by the viewer. We also compute a variety of statistics from the simulations, including magnitude-frequency relations, and compare these with data from real fault systems.
The EMCC / DARPA Massively Parallel Electromagnetic Scattering Project

NASA Technical Reports Server (NTRS)

Woo, Alex C.; Hill, Kueichien C.

1996-01-01

The Electromagnetic Code Consortium (EMCC) was sponsored by the Advanced Research Projects Agency (ARPA) to demonstrate the effectiveness of massively parallel computing in large scale radar signature predictions. The EMCC/ARPA project consisted of three parts.
Nuclear Energy Advanced Modeling and Simulation Waste Integrated Performance and Safety Codes (NEAMS Waste IPSC).

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schultz, Peter Andrew

The objective of the U.S. Department of Energy Office of Nuclear Energy Advanced Modeling and Simulation Waste Integrated Performance and Safety Codes (NEAMS Waste IPSC) is to provide an integrated suite of computational modeling and simulation (M&S) capabilities to quantitatively assess the long-term performance of waste forms in the engineered and geologic environments of a radioactive-waste storage facility or disposal repository. Achieving the objective of modeling the performance of a disposal scenario requires describing processes involved in waste form degradation and radionuclide release at the subcontinuum scale, beginning with mechanistic descriptions of chemical reactions and chemical kinetics at the atomic scale, and upscaling into effective, validated constitutive models for input to high-fidelity continuum scale codes for coupled multiphysics simulations of release and transport. Verification and validation (V&V) is required throughout the system to establish evidence-based metrics for the level of confidence in M&S codes and capabilities, including at the subcontinuum scale and the constitutive models they inform or generate. This report outlines the nature of the V&V challenge at the subcontinuum scale, an approach to incorporate V&V concepts into subcontinuum scale modeling and simulation (M&S), and a plan to incrementally incorporate effective V&V into subcontinuum scale M&S destined for use in the NEAMS Waste IPSC work flow to meet requirements of quantitative confidence in the constitutive models informed by subcontinuum scale phenomena.
SATCOM antenna siting study on a P-3C using the NEC-BSC V3.1

NASA Technical Reports Server (NTRS)

Bensman, D.; Marhefka, R. J.

1990-01-01

The location of a UHF SATCOM antenna on a P-3C aircraft is studied using the NEC-Basic Scattering Code V3.1 (NEC-BSC3). The NEC-BSC3 is a computer code based on the uniform theory of diffraction. The code is first validated for this application using scale model measurements. In general, the comparisons are good except in 10 degree regions near the nose and tail of the aircraft. Patterns for various antenna locations are analyzed to achieve a prescribed performance.
Irreducible normalizer operators and thresholds for degenerate quantum codes with sublinear distances

NASA Astrophysics Data System (ADS)

Pryadko, Leonid P.; Dumer, Ilya; Kovalev, Alexey A.

2015-03-01

We construct a lower (existence) bound for the threshold of scalable quantum computation which is applicable to all stabilizer codes, including degenerate quantum codes with sublinear distance scaling. The threshold is based on enumerating irreducible operators in the normalizer of the code, i.e., those that cannot be decomposed into a product of two such operators with non-overlapping support. For quantum LDPC codes with logarithmic or power-law distances, we get threshold values which are parametrically better than the existing analytical bound based on percolation. The new bound also gives a finite threshold when applied to other families of degenerate quantum codes, e.g., the concatenated codes. This research was supported in part by the NSF Grant PHY-1416578 and by the ARO Grant W911NF-11-1-0027.
Computer Laboratory for Multi-scale Simulations of Novel Nanomaterials

DTIC Science & Technology

2014-09-15

schemes for multiscale modeling of polymers. Permselective ion-exchange membranes for protective clothing, fuel cells, and batteries are of special...polyelectrolyte membranes (PEM) with chemical warfare agents (CWA) and their simulants and (2) development of new simulation methods and computational...chemical potential using gauge cell method and calculation of density profiles. However, the code does not run in parallel environments. For mesoscale
Solutions of the Taylor-Green Vortex Problem Using High-Resolution Explicit Finite Difference Methods

NASA Technical Reports Server (NTRS)

DeBonis, James R.

2013-01-01

A computational fluid dynamics code that solves the compressible Navier-Stokes equations was applied to the Taylor-Green vortex problem to examine the code's ability to accurately simulate the vortex decay and subsequent turbulence. The code, WRLES (Wave Resolving Large-Eddy Simulation), uses explicit central-differencing to compute the spatial derivatives and explicit Low Dispersion Runge-Kutta methods for the temporal discretization. The flow was first studied and characterized using Bogey & Bailley's 13-point dispersion relation preserving (DRP) scheme. The kinetic energy dissipation rate, computed both directly and from the enstrophy field, vorticity contours, and the energy spectra are examined. Results are in excellent agreement with a reference solution obtained using a spectral method and provide insight into computations of turbulent flows. In addition, the following studies were performed: a comparison of 4th-, 8th-, 12th- and DRP spatial differencing schemes, the effect of the solution filtering on the results, the effect of large-eddy simulation sub-grid scale models, and the effect of high-order discretization of the viscous terms.
NASA's Information Power Grid: Large Scale Distributed Computing and Data Management

NASA Technical Reports Server (NTRS)

Johnston, William E.; Vaziri, Arsi; Hinke, Tom; Tanner, Leigh Ann; Feiereisen, William J.; Thigpen, William; Tang, Harry (Technical Monitor)

2001-01-01

Large-scale science and engineering are done through the interaction of people, heterogeneous computing resources, information systems, and instruments, all of which are geographically and organizationally dispersed. The overall motivation for Grids is to facilitate the routine interactions of these resources in order to support large-scale science and engineering. Multi-disciplinary simulations provide a good example of a class of applications that are very likely to require aggregation of widely distributed computing, data, and intellectual resources. Such simulations - e.g. whole system aircraft simulation and whole system living cell simulation - require integrating applications and data that are developed by different teams of researchers, frequently in different locations. The research teams are the only ones that have the expertise to maintain and improve the simulation code and/or the body of experimental data that drives the simulations. This results in an inherently distributed computing and data management environment.
Development of the US3D Code for Advanced Compressible and Reacting Flow Simulations

NASA Technical Reports Server (NTRS)

Candler, Graham V.; Johnson, Heath B.; Nompelis, Ioannis; Subbareddy, Pramod K.; Drayna, Travis W.; Gidzak, Vladimyr; Barnhardt, Michael D.

2015-01-01

Aerothermodynamics and hypersonic flows involve complex multi-disciplinary physics, including finite-rate gas-phase kinetics, finite-rate internal energy relaxation, gas-surface interactions with finite-rate oxidation and sublimation, transition to turbulence, large-scale unsteadiness, shock-boundary layer interactions, fluid-structure interactions, and thermal protection system ablation and thermal response. Many of the flows have a large range of length and time scales, requiring large computational grids, implicit time integration, and large solution run times. The University of Minnesota NASA US3D code was designed for the simulation of these complex, highly-coupled flows. It has many of the features of the well-established DPLR code, but uses unstructured grids and has many advanced numerical capabilities and physical models for multi-physics problems. The main capabilities of the code are described, the physical modeling approaches are discussed, the different types of numerical flux functions and time integration approaches are outlined, and the parallelization strategy is overviewed. Comparisons between US3D and the NASA DPLR code are presented, and several advanced simulations are presented to illustrate some of the novel features of the code.

Parallel Semi-Implicit Spectral Element Atmospheric Model

NASA Astrophysics Data System (ADS)

Fournier, A.; Thomas, S.; Loft, R.

2001-05-01

The shallow-water equations (SWE) have long been used to test atmospheric-modeling numerical methods. The SWE contain essential wave-propagation and nonlinear effects of more complete models. We present a semi-implicit (SI) improvement of the Spectral Element Atmospheric Model to solve the SWE (SEAM, Taylor et al. 1997, Fournier et al. 2000, Thomas & Loft 2000). SE methods are h-p finite element methods combining the geometric flexibility of size-h finite elements with the accuracy of degree-p spectral methods. Our work suggests that exceptional parallel-computation performance is achievable by a General-Circulation-Model (GCM) dynamical core, even at modest climate-simulation resolutions (>1°). The code derivation involves weak variational formulation of the SWE, Gauss(-Lobatto) quadrature over the collocation points, and Legendre cardinal interpolators. Appropriate weak variation yields a symmetric positive-definite Helmholtz operator. To meet the Ladyzhenskaya-Babuska-Brezzi inf-sup condition and avoid spurious modes, we use a staggered grid. The SI scheme combines leapfrog and Crank-Nicolson schemes for the nonlinear and linear terms respectively. The localization of operations to elements ideally fits the method to cache-based microprocessor computer architectures: derivatives are computed as collections of small (8x8), naturally cache-blocked matrix-vector products. SEAM also has desirable boundary-exchange communication, like finite-difference models. Timings on the IBM SP and Compaq ES40 supercomputers indicate that the SI code (20-min timestep) requires 1/3 the CPU time of the explicit code (2-min timestep) for T42 resolutions. Both codes scale nearly linearly out to 400 processors. We achieved single-processor performance up to 30% of peak for both codes on the 375-MHz IBM Power-3 processors. Fast computation and linear scaling lead to a useful climate-simulation dycore only if enough model time is computed per unit wall-clock time. An efficient SI solver is essential to substantially increase this rate. Parallel preconditioning for an iterative conjugate-gradient elliptic solver is described. We are building a GCM dycore capable of 200 GFLOPS sustained performance on clustered RISC/cache architectures using hybrid MPI/OpenMP programming.
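The timing figures in this abstract can be unpacked with a little arithmetic. The sketch below assumes that both codes simulate the same span of model time and that CPU time is simply the number of steps times the cost per step; the function name and this cost model are illustrative assumptions, not taken from the abstract.

```python
def per_step_cost_ratio(dt_si_min, dt_explicit_min, cpu_time_ratio):
    """Cost of one semi-implicit (SI) step relative to one explicit step.

    Assumption (not from the abstract): total CPU time is
    (number of steps) * (cost per step) for the same simulated period.
    cpu_time_ratio is SI CPU time divided by explicit CPU time.
    """
    # The SI code takes fewer steps: (steps_SI / steps_explicit) = dt_explicit / dt_SI.
    steps_ratio = dt_explicit_min / dt_si_min
    return cpu_time_ratio / steps_ratio


# 20-min SI timestep, 2-min explicit timestep, SI needs 1/3 of the CPU time.
ratio = per_step_cost_ratio(20.0, 2.0, 1.0 / 3.0)
print(f"one SI step costs about {ratio:.2f} explicit steps")
```

Under these assumptions each SI step is a little over three times as expensive as an explicit step, which is consistent with the abstract's emphasis on an efficient elliptic solver: the tenfold larger timestep pays off only while the per-step solver overhead stays well below a factor of ten.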
Large scale in vivo recordings to study neuronal biophysics.

PubMed

Giocomo, Lisa M

2015-06-01

Over the last several years, technological advances have enabled researchers to more readily observe single-cell membrane biophysics in awake, behaving animals. Studies utilizing these technologies have provided important insights into the mechanisms generating functional neural codes in both sensory and non-sensory cortical circuits. Crucial for a deeper understanding of how membrane biophysics control circuit dynamics however, is a continued effort to move toward large scale studies of membrane biophysics, in terms of the numbers of neurons and ion channels examined. Future work faces a number of theoretical and technical challenges on this front but recent technological developments hold great promise for a larger scale understanding of how membrane biophysics contribute to circuit coding and computation. Copyright © 2014 Elsevier Ltd. All rights reserved.
Wakefield Computations for the CLIC PETS using the Parallel Finite Element Time-Domain Code T3P

DOE Office of Scientific and Technical Information (OSTI.GOV)

Candel, A; Kabel, A.; Lee, L.

In recent years, SLAC's Advanced Computations Department (ACD) has developed the high-performance parallel 3D electromagnetic time-domain code, T3P, for simulations of wakefields and transients in complex accelerator structures. T3P is based on advanced higher-order Finite Element methods on unstructured grids with quadratic surface approximation. Optimized for large-scale parallel processing on leadership supercomputing facilities, T3P allows simulations of realistic 3D structures with unprecedented accuracy, aiding the design of the next generation of accelerator facilities. Applications to the Compact Linear Collider (CLIC) Power Extraction and Transfer Structure (PETS) are presented.
Scaling up Planetary Dynamo Modeling to Massively Parallel Computing Systems: The Rayleigh Code at ALCF

NASA Astrophysics Data System (ADS)

Featherstone, N. A.; Aurnou, J. M.; Yadav, R. K.; Heimpel, M. H.; Soderlund, K. M.; Matsui, H.; Stanley, S.; Brown, B. P.; Glatzmaier, G.; Olson, P.; Buffett, B. A.; Hwang, L.; Kellogg, L. H.

2017-12-01

In the past three years, CIG's Dynamo Working Group has successfully ported the Rayleigh Code to the Argonne Leadership Computing Facility's Mira BG/Q device. In this poster, we present some of our first results, showing simulations of 1) convection in the solar convection zone; 2) dynamo action in Earth's core and 3) convection in the jovian deep atmosphere. These simulations have made efficient use of 131 thousand cores, 131 thousand cores and 232 thousand cores, respectively, on Mira. In addition to our novel results, the joys and logistical challenges of carrying out such large runs will also be discussed.
Using reactive transport codes to provide mechanistic biogeochemistry representations in global land surface models: CLM-PFLOTRAN 1.0

DOE PAGES

Tang, G.; Yuan, F.; Bisht, G.; ...

2015-12-17

We explore coupling to a configurable subsurface reactive transport code as a flexible and extensible approach to biogeochemistry in land surface models; our goal is to facilitate testing of alternative models and incorporation of new understanding. A reaction network with the CLM-CN decomposition, nitrification, denitrification, and plant uptake is used as an example. We implement the reactions in the open-source PFLOTRAN code, coupled with the Community Land Model (CLM), and test at Arctic, temperate, and tropical sites. To make the reaction network designed for use in explicit time stepping in CLM compatible with the implicit time stepping used in PFLOTRAN, the Monod substrate rate-limiting function with a residual concentration is used to represent the limitation of nitrogen availability on plant uptake and immobilization. To achieve accurate, efficient, and robust numerical solutions, care needs to be taken to use scaling, clipping, or log transformation to avoid negative concentrations during the Newton iterations. With a tight relative update tolerance to avoid false convergence, an accurate solution can be achieved with about 50 % more computing time than CLM in point mode site simulations using either the scaling or clipping methods. The log transformation method takes 60–100 % more computing time than CLM. The computing time increases slightly for clipping and scaling; it increases substantially for log transformation as the half saturation decreases from 10⁻³ to 10⁻⁹ mol m⁻³, which normally results in decreasing nitrogen concentrations. The frequent occurrence of very low concentrations (e.g. below nanomolar) can increase the computing time for clipping or scaling by about 20 %; computing time can be doubled for log transformation. Caution needs to be taken in choosing the appropriate scaling factor because a small value caused by a negative update to a small concentration may diminish the update and result in false convergence even with very tight relative update tolerance. As some biogeochemical processes (e.g., methane and nitrous oxide production and consumption) involve very low half saturation and threshold concentrations, this work provides insights for addressing nonphysical negativity issues and facilitates the representation of a mechanistic biogeochemical description in earth system models to reduce climate prediction uncertainty.
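The negativity problem and the clipping remedy described in this abstract can be illustrated on a toy problem. The sketch below is not PFLOTRAN's implementation: it assumes a single species decaying as dc/dt = -rate * c / (c + half_sat), one backward-Euler step, and a hypothetical positive floor used for clipping. All function and parameter names are illustrative.

```python
def monod(c, half_sat):
    """Monod substrate rate-limiting factor c / (c + half_sat)."""
    return c / (c + half_sat)


def implicit_step(c_old, dt, rate, half_sat, floor=1e-20, tol=1e-12, max_iter=50):
    """One backward-Euler step of dc/dt = -rate * monod(c), solved by Newton.

    Each Newton iterate is clipped to a small positive floor (an assumed
    value) so that an overshooting update cannot produce a negative
    concentration. Convergence is judged by a tight relative update tolerance.
    """
    c = c_old
    for _ in range(max_iter):
        residual = c - c_old + dt * rate * monod(c, half_sat)
        jacobian = 1.0 + dt * rate * half_sat / (c + half_sat) ** 2
        c_new = max(c - residual / jacobian, floor)  # clipping step
        if abs(c_new - c) <= tol * abs(c_new):
            return c_new
        c = c_new
    raise RuntimeError("Newton iteration did not converge")


# Illustrative values only: the first unclipped Newton update would be negative here.
c_next = implicit_step(c_old=1e-6, dt=1.0, rate=1e-5, half_sat=1e-6)
print(c_next)
```

With these illustrative values the first raw Newton update lands below zero, the clip pulls it back to the floor, and the iteration then converges from below to a positive root. The scaling and log-transformation alternatives mentioned in the abstract would replace the `max(...)` line with a damped update or a change of variable, respectively.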
Enabling large-scale viscoelastic calculations via neural network acceleration

NASA Astrophysics Data System (ADS)

Robinson DeVries, P.; Thompson, T. B.; Meade, B. J.

2017-12-01

One of the most significant challenges involved in efforts to understand the effects of repeated earthquake cycle activity is the computational cost of large-scale viscoelastic earthquake cycle models. Deep artificial neural networks (ANNs) can be used to discover new, compact, and accurate computational representations of viscoelastic physics. Once found, these efficient ANN representations may replace computationally intensive viscoelastic codes and accelerate large-scale viscoelastic calculations by more than 50,000%. This magnitude of acceleration enables the modeling of geometrically complex faults over thousands of earthquake cycles across wider ranges of model parameters and at larger spatial and temporal scales than have been previously possible. Perhaps most interestingly from a scientific perspective, ANN representations of viscoelastic physics may lead to basic advances in the understanding of the underlying model phenomenology. We demonstrate the potential of artificial neural networks to illuminate fundamental physical insights with specific examples.
A numerical investigation of the scale-up effects on flow, heat transfer, and kinetics processes of FCC units.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chang, S. L.

1998-08-25

Fluid Catalytic Cracking (FCC) technology is the most important process used by the refinery industry to convert crude oil to valuable lighter products such as gasoline. Process development is generally very time consuming, especially when a small pilot unit is being scaled up to a large commercial unit, because of the lack of information to aid in the design of scaled-up units. Such information can now be obtained by analysis based on the pilot scale measurements and computer simulation that includes controlling physics of the FCC system. A computational fluid dynamics (CFD) code, ICRKFLO, has been developed at Argonne National Laboratory (ANL) and has been successfully applied to the simulation of catalytic petroleum cracking risers. It employs hybrid hydrodynamic-chemical kinetic coupling techniques, enabling the analysis of an FCC unit with complex chemical reaction sets containing tens or hundreds of subspecies. The code has been continuously validated based on pilot-scale experimental data. It is now being used to investigate the effects of scaled-up FCC units. Among FCC operating conditions, the feed injection conditions are found to have a strong impact on the product yields of scaled-up FCC units. The feed injection conditions appear to affect flow and heat transfer patterns, and the interaction of hydrodynamics and cracking kinetics causes the product yields to change accordingly.
Silicon CMOS architecture for a spin-based quantum computer.

PubMed

Veldhorst, M; Eenink, H G J; Yang, C H; Dzurak, A S

2017-12-15

Recent advances in quantum error correction codes for fault-tolerant quantum computing and physical realizations of high-fidelity qubits in multiple platforms give promise for the construction of a quantum computer based on millions of interacting qubits. However, the classical-quantum interface remains a nascent field of exploration. Here, we propose an architecture for a silicon-based quantum computer processor based on complementary metal-oxide-semiconductor (CMOS) technology. We show how a transistor-based control circuit together with charge-storage electrodes can be used to operate a dense and scalable two-dimensional qubit system. The qubits are defined by the spin state of a single electron confined in quantum dots, coupled via exchange interactions, controlled using a microwave cavity, and measured via gate-based dispersive readout. We implement a spin qubit surface code, showing the prospects for universal quantum computation. We discuss the challenges and focus areas that need to be addressed, providing a path for large-scale quantum computing.
A Parallel Sliding Region Algorithm to Make Agent-Based Modeling Possible for a Large-Scale Simulation: Modeling Hepatitis C Epidemics in Canada.

PubMed

Wong, William W L; Feng, Zeny Z; Thein, Hla-Hla

2016-11-01

Agent-based models (ABMs) are computer simulation models that define interactions among agents and simulate emergent behaviors that arise from the ensemble of local decisions. ABMs have been increasingly used to examine trends in infectious disease epidemiology. However, the main limitation of ABMs is the high computational cost for a large-scale simulation. To improve the computational efficiency for large-scale ABM simulations, we built a parallelizable sliding region algorithm (SRA) for ABM and compared it to a nonparallelizable ABM. We developed a complex agent network and performed two simulations to model hepatitis C epidemics based on the real demographic data from Saskatchewan, Canada. The first simulation used the SRA that processed on each postal code subregion subsequently. The second simulation processed the entire population simultaneously. It was concluded that the parallelizable SRA showed computational time saving with comparable results in a province-wide simulation. Using the same method, SRA can be generalized for performing a country-wide simulation. Thus, this parallel algorithm enables the possibility of using ABM for large-scale simulation with limited computational resources.
Inference in the brain: Statistics flowing in redundant population codes

PubMed Central

Pitkow, Xaq; Angelaki, Dora E

2017-01-01

It is widely believed that the brain performs approximate probabilistic inference to estimate causal variables in the world from ambiguous sensory data. To understand these computations, we need to analyze how information is represented and transformed by the actions of nonlinear recurrent neural networks. We propose that these probabilistic computations function by a message-passing algorithm operating at the level of redundant neural populations. To explain this framework, we review its underlying concepts, including graphical models, sufficient statistics, and message-passing, and then describe how these concepts could be implemented by recurrently connected probabilistic population codes. The relevant information flow in these networks will be most interpretable at the population level, particularly for redundant neural codes. We therefore outline a general approach to identify the essential features of a neural message-passing algorithm. Finally, we argue that to reveal the most important aspects of these neural computations, we must study large-scale activity patterns during moderately complex, naturalistic behaviors. PMID:28595050
Unsteady Aero Computation of a 1 1/2 Stage Large Scale Rotating Turbine

NASA Technical Reports Server (NTRS)

To, Wai-Ming

2012-01-01

This report is the documentation of the work performed for the Subsonic Rotary Wing Project under NASA's Fundamental Aeronautics Program. It was funded through Task Number NNC10E420T under GESS-2 Contract NNC06BA07B in the period of 10/1/2010 to 8/31/2011. The objective of the task is to provide support for the development of variable speed power turbine technology through application of computational fluid dynamics analyses. This includes work elements in mesh generation, multistage URANS simulations, and post-processing of the simulation results for comparison with the experimental data. The unsteady CFD calculations were performed with the TURBO code running in multistage single passage (phase lag) mode. Meshes for the blade rows were generated with the NASA developed TCGRID code. The CFD performance is assessed and improvements are recommended for future research in this area. For that, the United Technologies Research Center's 1 1/2 stage Large Scale Rotating Turbine was selected to be the candidate engine configuration for this computational effort because of the completeness and availability of the data.
Hybrid MPI/OpenMP Implementation of the ORAC Molecular Dynamics Program for Generalized Ensemble and Fast Switching Alchemical Simulations.

PubMed

Procacci, Piero

2016-06-27

We present a new release (6.0β) of the ORAC program [Marsili et al. J. Comput. Chem. 2010, 31, 1106-1116] with a hybrid OpenMP/MPI (open multiprocessing message passing interface) multilevel parallelism tailored for generalized ensemble (GE) and fast switching double annihilation (FS-DAM) nonequilibrium technology aimed at evaluating the binding free energy in drug-receptor systems on high performance computing platforms. The production of the GE or FS-DAM trajectories is handled using a weak scaling parallel approach on the MPI level only, while a strong scaling force decomposition scheme is implemented for intranode computations with shared memory access at the OpenMP level. The efficiency, simplicity, and inherent parallel nature of the ORAC implementation of the FS-DAM algorithm project the code as a possible effective tool for a second generation high throughput virtual screening in drug discovery and design. The code, along with documentation, testing, and ancillary tools, is distributed under the provisions of the General Public License and can be freely downloaded at www.chim.unifi.it/orac.
2013 R&D 100 Award: "Miniapps" Bolster High Performance Computing

ScienceCinema

Belak, Jim; Richards, David

2018-06-12

Two Livermore computer scientists served on a Sandia National Laboratories-led team that developed Mantevo Suite 1.0, the first integrated suite of small software programs, also called "miniapps," to be made available to the high performance computing (HPC) community. These miniapps facilitate the development of new HPC systems and the applications that run on them. Miniapps (miniature applications) serve as stripped down surrogates for complex, full-scale applications that can require a great deal of time and effort to port to a new HPC system because they often consist of hundreds of thousands of lines of code. The miniapps are a prototype that contains some or all of the essentials of the real application but with many fewer lines of code, making the miniapp more versatile for experimentation. This allows researchers to more rapidly explore options and optimize system design, greatly improving the chances the full-scale application will perform successfully. These miniapps have become essential tools for exploring complex design spaces because they can reliably predict the performance of full applications.
Multimodal Discriminative Binary Embedding for Large-Scale Cross-Modal Retrieval.

PubMed

Wang, Di; Gao, Xinbo; Wang, Xiumei; He, Lihuo; Yuan, Bo

2016-10-01

Multimodal hashing, which conducts effective and efficient nearest neighbor search across heterogeneous data on large-scale multimedia databases, has been attracting increasing interest, given the explosive growth of multimedia content on the Internet. Recent multimodal hashing research mainly aims at learning compact binary codes that preserve the semantic information given by labels. The overwhelming majority of these methods are similarity preserving approaches, which approximate the pairwise similarity matrix with Hamming distances between the to-be-learnt binary hash codes. However, these methods ignore the discriminative property in the hash learning process, which leaves hash codes from different classes indistinguishable and therefore reduces the accuracy and robustness of the nearest neighbor search. To this end, we present a novel multimodal hashing method, named multimodal discriminative binary embedding (MDBE), which focuses on learning discriminative hash codes. First, the proposed method formulates the hash function learning in terms of classification, where the binary codes generated by the learned hash functions are expected to be discriminative. Then, it exploits the label information to discover the shared structures inside heterogeneous data. Finally, the learned structures are preserved for hash codes to produce similar binary codes in the same class. Hence, the proposed MDBE can preserve both discriminability and similarity for hash codes, and will enhance retrieval accuracy. Thorough experiments on benchmark data sets demonstrate that the proposed method achieves excellent accuracy and competitive computational efficiency compared with the state-of-the-art methods for large-scale cross-modal retrieval tasks.
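The retrieval step that such binary codes enable can be sketched in a few lines. This is only an illustration of Hamming-distance nearest neighbor search over already-learned codes; it does not implement MDBE's learning procedure, and the 8-bit codes and item names below are made up for the example.

```python
def hamming(a, b):
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def nearest(query, database):
    """Return the key whose code has the smallest Hamming distance to query."""
    return min(database, key=lambda name: hamming(query, database[name]))


# Hypothetical 8-bit hash codes for items from two modalities.
codes = {
    "cat_image": 0b10110010,
    "cat_text": 0b10110011,
    "car_image": 0b01001101,
}

query_code = 0b10110110
best = nearest(query_code, codes)
print(best, hamming(query_code, codes[best]))
```

Because the comparison is an XOR followed by a bit count, the search cost per item is tiny, which is why compact binary codes suit large-scale retrieval. Codes from the same class ("cat" items here) sit close together in Hamming space only if the hash functions were learned to be discriminative, which is the property the abstract targets.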
Scalable Analysis Methods and In Situ Infrastructure for Extreme Scale Knowledge Discovery

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bethel, Wes

2016-07-24

The primary challenge motivating this team’s work is the widening gap between the ability to compute information and to store it for subsequent analysis. This gap adversely impacts science code teams, who are able to perform analysis only on a small fraction of the data they compute, resulting in the very real likelihood of lost or missed science, when results are computed but not analyzed. Our approach is to perform as much analysis or visualization processing on data while it is still resident in memory, an approach that is known as in situ processing. The idea of in situ processing was not new at the time of the start of this effort in 2014, but efforts in that space were largely ad hoc, and there was no concerted effort within the research community that aimed to foster production-quality software tools suitable for use by DOE science projects. In large part, our objective was to produce and enable use of production-quality in situ methods and infrastructure, at scale, on DOE HPC facilities, though we expected to have impact beyond DOE due to the widespread nature of the challenges, which affect virtually all large-scale computational science efforts. To achieve that objective, we assembled a unique team of researchers consisting of representatives from DOE national laboratories, academia, and industry, and engaged in software technology R&D, as well as in close partnerships with DOE science code teams, to produce software technologies that were shown to run effectively at scale on DOE HPC platforms.
Mendel-GPU: haplotyping and genotype imputation on graphics processing units

PubMed Central

Chen, Gary K.; Wang, Kai; Stram, Alex H.; Sobel, Eric M.; Lange, Kenneth

2012-01-01

Motivation: In modern sequencing studies, one can improve the confidence of genotype calls by phasing haplotypes using information from an external reference panel of fully typed unrelated individuals. However, the computational demands are so high that they prohibit researchers with limited computational resources from haplotyping large-scale sequence data. Results: Our graphics processing unit based software delivers haplotyping and imputation accuracies comparable to competing programs at a fraction of the computational cost and peak memory demand. Availability: Mendel-GPU, our OpenCL software, runs on Linux platforms and is portable across AMD and nVidia GPUs. Users can download both code and documentation at http://code.google.com/p/mendel-gpu/. Contact: gary.k.chen@usc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22954633
ALEGRA -- A massively parallel h-adaptive code for solid dynamics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Summers, R.M.; Wong, M.K.; Boucheron, E.A.

1997-12-31

ALEGRA is a multi-material, arbitrary-Lagrangian-Eulerian (ALE) code for solid dynamics designed to run on massively parallel (MP) computers. It combines the features of modern Eulerian shock codes, such as CTH, with modern Lagrangian structural analysis codes using an unstructured grid. ALEGRA is being developed for use on the teraflop supercomputers to conduct advanced three-dimensional (3D) simulations of shock phenomena important to a variety of systems. ALEGRA was designed with the Single Program Multiple Data (SPMD) paradigm, in which the mesh is decomposed into sub-meshes so that each processor gets a single sub-mesh with approximately the same number of elements. Using this approach the authors have been able to produce a single code that can scale from one processor to thousands of processors. A current major effort is to develop efficient, high precision simulation capabilities for ALEGRA, without the computational cost of using a global highly resolved mesh, through flexible, robust h-adaptivity of finite elements. H-adaptivity is the dynamic refinement of the mesh by subdividing elements, thus changing the characteristic element size and reducing numerical error. The authors are working on several major technical challenges that must be met to make effective use of HAMMER on MP computers.
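The load-balance goal of the SPMD decomposition (one sub-mesh per processor, each with approximately the same number of elements) can be sketched as follows. This is an illustrative assumption about how element counts might be split, not ALEGRA's decomposition algorithm; a real mesh partitioner would also try to minimize the shared boundary between sub-meshes.

```python
def partition_sizes(n_elements, n_procs):
    """Split n_elements across n_procs so that counts differ by at most one."""
    base, extra = divmod(n_elements, n_procs)
    # The first `extra` ranks each take one additional element.
    return [base + 1 if rank < extra else base for rank in range(n_procs)]


sizes = partition_sizes(10, 3)
print(sizes)
```

The same function covers the range the abstract mentions, from a single processor (one sub-mesh holding every element) to thousands of processors, since only the two integer arguments change.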
Overview of the NASA Glenn Flux Reconstruction Based High-Order Unstructured Grid Code

NASA Technical Reports Server (NTRS)

Spiegel, Seth C.; DeBonis, James R.; Huynh, H. T.

2016-01-01

A computational fluid dynamics code based on the flux reconstruction (FR) method is currently being developed at NASA Glenn Research Center to ultimately provide a large-eddy simulation capability that is both accurate and efficient for complex aeropropulsion flows. The FR approach offers a simple and efficient method that is easy to implement and accurate to an arbitrary order on common grid cell geometries. The governing compressible Navier-Stokes equations are discretized in time using various explicit Runge-Kutta schemes, with the default being the 3-stage/3rd-order strong stability preserving scheme. The code is written in modern Fortran (i.e., Fortran 2008) and parallelization is attained through MPI for execution on distributed-memory high-performance computing systems. An h-refinement study of the isentropic Euler vortex problem is able to empirically demonstrate the capability of the FR method to achieve super-accuracy for inviscid flows. Additionally, the code is applied to the Taylor-Green vortex problem, performing numerous implicit large-eddy simulations across a range of grid resolutions and solution orders. The solution found by a pseudo-spectral code is commonly used as a reference solution to this problem, and the FR code is able to reproduce this solution using approximately the same grid resolution. Finally, an examination of the code's performance demonstrates good parallel scaling, as well as an implementation of the FR method with a computational cost/degree-of-freedom/time-step that is essentially independent of the solution order of accuracy for structured geometries.
Investigating the impact of the Cielo Cray XE6 architecture on scientific application codes.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rajan, Mahesh; Barrett, Richard; Pedretti, Kevin Thomas Tauke

2010-12-01

Cielo, a Cray XE6, is the Department of Energy NNSA Advanced Simulation and Computing (ASC) campaign's newest capability machine. Rated at 1.37 PFLOPS, it consists of 8,944 dual-socket oct-core AMD Magny-Cours compute nodes, linked using Cray's Gemini interconnect. Its primary mission objective is to enable a suite of the ASC applications implemented using MPI to scale to tens of thousands of cores. Cielo is an evolutionary improvement to a successful architecture previously available to many of our codes, thus enabling a basis for understanding the capabilities of this new architecture. Using three codes strategically important to the ASC campaign, and supplemented with some micro-benchmarks that expose the fundamental capabilities of the XE6, we report on the performance characteristics and capabilities of Cielo.
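The machine figures quoted in this abstract can be combined to give the total core count and the rated performance per core. The short calculation below uses only the numbers stated above; the variable names are illustrative, and the per-core figure is a simple division of the rated total, not a measured value.

```python
# Figures taken from the abstract.
NODES = 8944
SOCKETS_PER_NODE = 2   # "dual-socket"
CORES_PER_SOCKET = 8   # "oct-core"
RATED_FLOPS = 1.37e15  # 1.37 PFLOPS

total_cores = NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET
gflops_per_core = RATED_FLOPS / total_cores / 1e9

print(total_cores)                # 143104
print(round(gflops_per_core, 2))  # 9.57
```

The result, roughly 143 thousand cores at a little under 10 GFLOPS each, puts the stated goal of scaling MPI applications to tens of thousands of cores in context: such a run would occupy a substantial fraction of the machine.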
DOE Office of Scientific and Technical Information (OSTI.GOV)

Carothers, Christopher D.; Meredith, Jeremy S.; Blanco, Marc

Performance modeling of extreme-scale applications on accurate representations of potential architectures is critical for designing next generation supercomputing systems because it is impractical to construct prototype systems at scale with new network hardware in order to explore designs and policies. However, these simulations often rely on static application traces that can be difficult to work with because of their size and lack of flexibility to extend or scale up without rerunning the original application. To address this problem, we have created a new technique for generating scalable, flexible workloads from real applications, and we have implemented a prototype, called Durango, that combines a proven analytical performance modeling language, Aspen, with the massively parallel HPC network modeling capabilities of the CODES framework. Our models are compact, parameterized and representative of real applications with computation events. They are not resource intensive to create and are portable across simulator environments. We demonstrate the utility of Durango by simulating the LULESH application in the CODES simulation environment on several topologies and show that Durango is practical to use for simulation without loss of fidelity, as quantified by simulation metrics. During our validation of Durango's generated communication model of LULESH, we found that the original LULESH miniapp code had a latent bug where the MPI_Waitall operation was used incorrectly. This finding underscores the potential need for a tool such as Durango, beyond its benefits for flexible workload generation and modeling. Additionally, we demonstrate the efficacy of Durango's direct integration approach, which links Aspen into CODES as part of the running network simulation model. Here, Aspen generates the application-level computation timing events, which in turn drive the start of a network communication phase. Results show that Durango's performance scales well when executing both torus and dragonfly network models on up to 4K Blue Gene/Q nodes using 32K MPI ranks. Durango also avoids the overheads and complexities associated with extreme-scale trace files.

Pressurized thermal shock: TEMPEST computer code simulation of thermal mixing in the cold leg and downcomer of a pressurized water reactor. [Creare 61 and 64

DOE Office of Scientific and Technical Information (OSTI.GOV)

Eyler, L.L.; Trent, D.S.

The TEMPEST computer program was used to simulate fluid and thermal mixing in the cold leg and downcomer of a pressurized water reactor under emergency core cooling high-pressure injection (HPI), which is of concern to the pressurized thermal shock (PTS) problem. Application of the code was made in performing an analysis simulation of a full-scale Westinghouse three-loop plant design cold leg and downcomer. Verification/assessment of the code was performed and analysis procedures developed using data from Creare 1/5-scale experimental tests. Results of three simulations are presented. The first is a no-loop-flow case with high-velocity, low-negative-buoyancy HPI in a 1/5-scale model of a cold leg and downcomer. The second is a no-loop-flow case with low-velocity, high-negative density (modeled with salt water) injection in a 1/5-scale model. Comparison of TEMPEST code predictions with experimental data for these two cases shows good agreement. The third simulation is a three-dimensional model of one loop of a full size Westinghouse three-loop plant design. Included in this latter simulation are loop components extending from the steam generator to the reactor vessel and a one-third sector of the vessel downcomer and lower plenum. No data were available for this case. For the Westinghouse plant simulation, thermally coupled conduction heat transfer in structural materials is included. The cold leg pipe and fluid mixing volumes of the primary pump, the stillwell, and the riser to the steam generator are included in the model. In the reactor vessel, the thermal shield, pressure vessel cladding, and pressure vessel wall are thermally coupled to the fluid and thermal mixing in the downcomer. The inlet plenum mixing volume is included in the model. A 10-min (real time) transient beginning at the initiation of HPI is computed to determine temperatures at the beltline of the pressure vessel wall.
A new way of setting the phases for cosmological multiscale Gaussian initial conditions

NASA Astrophysics Data System (ADS)

Jenkins, Adrian

2013-09-01

We describe how to define an extremely large discrete realization of a Gaussian white noise field that has a hierarchical structure and the property that the value of any part of the field can be computed quickly. Tiny subregions of such a field can be used to set the phase information for Gaussian initial conditions for individual cosmological simulations of structure formation. This approach has several attractive features: (i) the hierarchical structure based on an octree is particularly well suited for generating follow-up resimulation or zoom initial conditions; (ii) the phases are defined for all relevant physical scales in advance so that resimulation initial conditions are, by construction, consistent both with their parent simulation and with each other; (iii) the field can easily be made public by releasing a code to compute it - once public, phase information can be shared or published by specifying a spatial location within the realization. In this paper, we describe the principles behind creating such realizations. We define an example called Panphasia and in a companion paper by Jenkins and Booth (2013) make public a code to compute it. With 50 octree levels Panphasia spans a factor of more than 10^15 in linear scale - a range that significantly exceeds the ratio of the current Hubble radius to the putative cold dark matter free-streaming scale. We show how to modify a code used for making cosmological and resimulation initial conditions so that it can take the phase information from Panphasia and, using this code, we demonstrate that it is possible to make good quality resimulation initial conditions. We define a convention for publishing phase information from Panphasia and publish the initial phases for several of the Virgo Consortium's most recent cosmological simulations including the 303 billion particle MXXL simulation. Finally, for reference, we give the locations and properties of several dark matter haloes that can be resimulated within these volumes.
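The quoted dynamic range can be checked with one line of arithmetic, assuming (as is usual for an octree) that each level halves the cell side length. The short Python sketch below only illustrates that assumption; it says nothing about how Panphasia itself is implemented.

```python
# Assumption: every octree level splits a cell into 8 children, halving its side.
levels = 50
linear_range = 2 ** levels          # ratio of root-cell side to finest-cell side
print(linear_range)                 # 1125899906842624, i.e. about 1.13e15
exceeds_1e15 = linear_range > 10 ** 15
```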
Consequence modeling using the fire dynamics simulator.

PubMed

Ryder, Noah L; Sutula, Jason A; Schemel, Christopher F; Hamer, Andrew J; Van Brunt, Vincent

2004-11-11

The use of Computational Fluid Dynamics (CFD) and in particular Large Eddy Simulation (LES) codes to model fires provides an efficient tool for the prediction of large-scale effects that include plume characteristics, combustion product dispersion, and heat effects to adjacent objects. This paper illustrates the strengths of the Fire Dynamics Simulator (FDS), an LES code developed by the National Institute of Standards and Technology (NIST), through several small and large-scale validation runs and process safety applications. The paper presents two fire experiments--a small room fire and a large (15 m diameter) pool fire. The model results are compared to experimental data and demonstrate good agreement between the models and data. The validation work is then extended to demonstrate applicability to process safety concerns by detailing a model of a tank farm fire and a model of the ignition of a gaseous fuel in a confined space. In this simulation, a room was filled with propane, given time to disperse, and was then ignited. The model yields accurate results of the dispersion of the gas throughout the space. This information can be used to determine flammability and explosive limits in a space and can be used in subsequent models to determine the pressure and temperature waves that would result from an explosion. The model dispersion results were compared to an experiment performed by Factory Mutual. Using the above examples, this paper will demonstrate that FDS is ideally suited to build realistic models of process geometries in which large scale explosion and fire failure risks can be evaluated with several distinct advantages over more traditional CFD codes. Namely, transient solutions to fire and explosion growth can be produced with less sophisticated hardware (lower cost) than needed for traditional CFD codes (PC-type computer versus UNIX workstation) and can be solved for longer time histories (on the order of hundreds of seconds of computed time) with minimal computer resources and length of model run. Additionally, results that are produced can be analyzed, viewed, and tabulated during and following a model run within a PC environment. There are some tradeoffs, however, as rapid computations on PCs may require a sacrifice in the grid resolution or in the sub-grid modeling, depending on the size of the geometry modeled.
Parallel Higher-order Finite Element Method for Accurate Field Computations in Wakefield and PIC Simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Candel, A.; Kabel, A.; Lee, L.

Over the past years, SLAC's Advanced Computations Department (ACD), under SciDAC sponsorship, has developed a suite of 3D (2D) parallel higher-order finite element (FE) codes, T3P (T2P) and Pic3P (Pic2P), aimed at accurate, large-scale simulation of wakefields and particle-field interactions in radio-frequency (RF) cavities of complex shape. The codes are built on the FE infrastructure that supports SLAC's frequency domain codes, Omega3P and S3P, to utilize conformal tetrahedral (triangular) meshes, higher-order basis functions and quadratic geometry approximation. For time integration, they adopt an unconditionally stable implicit scheme. Pic3P (Pic2P) extends T3P (T2P) to treat charged-particle dynamics self-consistently using the PIC (particle-in-cell) approach, the first such implementation on a conformal, unstructured grid using Whitney basis functions. Examples from applications to the International Linear Collider (ILC), Positron Electron Project-II (PEP-II), Linac Coherent Light Source (LCLS) and other accelerators will be presented to compare the accuracy and computational efficiency of these codes versus their counterparts using structured grids.
Cove benchmark calculations using SAGUARO and FEMTRAN

DOE Office of Scientific and Technical Information (OSTI.GOV)

Eaton, R.R.; Martinez, M.J.

1986-10-01

Three small-scale, time-dependent, benchmarking calculations have been made using the finite element codes SAGUARO, to determine hydraulic head and water velocity profiles, and FEMTRAN, to predict the solute transport. Sand and hard rock porous materials were used. Time scales for the problems, which ranged from tens of hours to thousands of years, have posed no particular difficulty for the two codes. Studies have been performed to determine the effects of computational mesh, boundary conditions, velocity formulation and SAGUARO/FEMTRAN code-coupling on water and solute transport. Results showed that mesh refinement improved mass conservation. Varying the drain-tile size in COVE 1N had a weak effect on the rate at which the tile field drained. Excellent agreement with published COVE 1N data was obtained for the hydrological field and reasonable agreement for the solute-concentration predictions. The question remains whether these types of calculations can be carried out on repository-scale problems using material characteristic curves representing tuff with fractures.
On the Large-Scaling Issues of Cloud-based Applications for Earth Science Dat

NASA Astrophysics Data System (ADS)

Hua, H.

2016-12-01

Next generation science data systems are needed to address the incoming flood of data from new missions such as NASA's SWOT and NISAR, whose SAR data volumes and data throughput rates are an order of magnitude larger than those of present-day missions. Existing missions, such as OCO-2, may also require high turn-around time for processing different science scenarios where on-premise and even traditional HPC computing environments may not meet the high processing needs. Additionally, traditional means of procuring hardware on-premise are already limited due to facilities capacity constraints for these new missions. Experience has shown that embracing efficient cloud computing approaches for large-scale science data systems requires more than just moving existing code to cloud environments. At large cloud scales, we need to deal with scaling and cost issues. We present our experiences on deploying multiple instances of our hybrid-cloud computing science data system (HySDS) to support large-scale processing of Earth Science data products. We will explore optimization approaches to getting best performance out of hybrid-cloud computing as well as common issues that will arise when dealing with large-scale computing. Novel approaches were utilized to do processing on Amazon's spot market, which can potentially offer 75%-90% cost savings but with an unpredictable computing environment based on market forces.
Highly Scalable Asynchronous Computing Method for Partial Differential Equations: A Path Towards Exascale

NASA Astrophysics Data System (ADS)

Konduri, Aditya

Many natural and engineering systems are governed by nonlinear partial differential equations (PDEs) which result in multiscale phenomena, e.g. turbulent flows. Numerical simulations of these problems are computationally very expensive and demand extreme levels of parallelism. At realistic conditions, simulations are being carried out on massively parallel computers with hundreds of thousands of processing elements (PEs). It has been observed that communication between PEs as well as their synchronization at these extreme scales take up a significant portion of the total simulation time and result in poor scalability of codes. This issue is likely to pose a bottleneck in scalability of codes on future Exascale systems. In this work, we propose an asynchronous computing algorithm based on widely used finite difference methods to solve PDEs in which synchronization between PEs due to communication is relaxed at a mathematical level. We show that while stability is conserved when schemes are used asynchronously, accuracy is greatly degraded. Since message arrivals at PEs are random processes, so is the behavior of the error. We propose a new statistical framework in which we show that average errors always drop to first order regardless of the original scheme. We propose new asynchrony-tolerant schemes that maintain accuracy when synchronization is relaxed. The quality of the solution is shown to depend, not only on the physical phenomena and numerical schemes, but also on the characteristics of the computing machine. A novel algorithm using remote memory access communications has been developed to demonstrate excellent scalability of the method for large-scale computing. Finally, we present a path to extend this method in solving complex multi-scale problems on Exascale machines.
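The effect of relaxed synchronization can be mimicked on a single machine. The toy Python sketch below is not the algorithm of the thesis: it is a serial emulation, under assumptions chosen for this example, in which a periodic 1D heat equation is split into blocks (standing in for PEs) and values read across a block boundary may be several time levels old. The function names, the uniform random delay model, and all parameter values are hypothetical.

```python
import math
import random

def heat_step(history, r, n_pe, max_delay, rng):
    """One explicit step of u_t = u_xx; values from another block may be stale."""
    current = history[-1]
    n = len(current)
    block = n // n_pe                      # grid points owned by each emulated PE
    new = [0.0] * n
    for i in range(n):
        left, right = (i - 1) % n, (i + 1) % n
        u_left, u_right = current[left], current[right]
        if left // block != i // block:    # left neighbour belongs to another PE
            delay = rng.randint(0, max_delay)
            u_left = history[max(0, len(history) - 1 - delay)][left]
        if right // block != i // block:   # right neighbour belongs to another PE
            delay = rng.randint(0, max_delay)
            u_right = history[max(0, len(history) - 1 - delay)][right]
        new[i] = current[i] + r * (u_right - 2.0 * current[i] + u_left)
    return new

def solve(n, n_pe, steps, r, max_delay, seed=0):
    """Run `steps` time steps from u(x, 0) = sin(x) on [0, 2*pi)."""
    rng = random.Random(seed)
    dx = 2.0 * math.pi / n
    history = [[math.sin(i * dx) for i in range(n)]]
    for _ in range(steps):
        history.append(heat_step(history, r, n_pe, max_delay, rng))
        history = history[-(max_delay + 1):]   # keep only levels that can be read
    return history[-1]

def max_error(u, n, steps, r):
    """Maximum error against the exact solution exp(-t) * sin(x)."""
    dx = 2.0 * math.pi / n
    t = steps * r * dx * dx
    return max(abs(u[i] - math.exp(-t) * math.sin(i * dx)) for i in range(n))

N, PES, STEPS, R = 64, 4, 200, 0.25
sync_u = solve(N, PES, STEPS, R, max_delay=0)    # fully synchronous reference
async_u = solve(N, PES, STEPS, R, max_delay=3)   # halo data up to 3 levels old
sync_err = max_error(sync_u, N, STEPS, R)
async_err = max_error(async_u, N, STEPS, R)
print(f"synchronous error: {sync_err:.2e}, asynchronous error: {async_err:.2e}")
```

With `r <= 0.5` each update is a convex combination of earlier values, so the stale-data run stays bounded, which is consistent with the abstract's remark that stability is retained while accuracy suffers.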
Environment parameters and basic functions for floating-point computation

NASA Technical Reports Server (NTRS)

Brown, W. S.; Feldman, S. I.

1978-01-01

A language-independent proposal for environment parameters and basic functions for floating-point computation is presented. Basic functions are proposed to analyze, synthesize, and scale floating-point numbers. The model provides a small set of parameters and a small set of axioms along with sharp measures of roundoff error. The parameters and functions can be used to write portable and robust codes that deal intimately with the floating-point representation. Subject to underflow and overflow constraints, a number can be scaled by a power of the floating-point radix inexpensively and without loss of precision. A specific representation for FORTRAN is included.
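The three kinds of operation mentioned (analyze, synthesize, scale) have close analogues in many later languages. As an illustration only, the Python sketch below uses the standard-library functions `math.frexp` and `math.ldexp` together with `sys.float_info.radix`; these are Python's facilities, not the functions proposed in the report, and the sample value is arbitrary.

```python
import math
import sys

x = 0.15625                                  # arbitrary sample value (5/32)

radix = sys.float_info.radix                 # floating-point radix of this platform
mantissa, exponent = math.frexp(x)           # analyze: x == mantissa * 2**exponent
rebuilt = math.ldexp(mantissa, exponent)     # synthesize the number again
scaled = math.ldexp(x, 10)                   # scale by 2**10 without rounding

print(radix, mantissa, exponent, rebuilt, scaled)
```

Because the scaling only changes the exponent field, `math.ldexp(scaled, -10)` recovers `x` exactly as long as no overflow or underflow occurs, which mirrors the property stated in the abstract.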
Scalable hybrid computation with spikes.

PubMed

Sarpeshkar, Rahul; O'Halloran, Micah

2002-09-01

We outline a hybrid analog-digital scheme for computing with three important features that enable it to scale to systems of large complexity: First, like digital computation, which uses several one-bit precise logical units to collectively compute a precise answer to a computation, the hybrid scheme uses several moderate-precision analog units to collectively compute a precise answer to a computation. Second, frequent discrete signal restoration of the analog information prevents analog noise and offset from degrading the computation. And, third, a state machine enables complex computations to be created using a sequence of elementary computations. A natural choice for implementing this hybrid scheme is one based on spikes because spike-count codes are digital, while spike-time codes are analog. We illustrate how spikes afford easy ways to implement all three components of scalable hybrid computation. First, as an important example of distributed analog computation, we show how spikes can create a distributed modular representation of an analog number by implementing digital carry interactions between spiking analog neurons. Second, we show how signal restoration may be performed by recursive spike-count quantization of spike-time codes. And, third, we use spikes from an analog dynamical system to trigger state transitions in a digital dynamical system, which reconfigures the analog dynamical system using a binary control vector; such feedback interactions between analog and digital dynamical systems create a hybrid state machine (HSM). The HSM extends and expands the concept of a digital finite-state-machine to the hybrid domain. We present experimental data from a two-neuron HSM on a chip that implements error-correcting analog-to-digital conversion with the concurrent use of spike-time and spike-count codes. We also present experimental data from silicon circuits that implement HSM-based pattern recognition using spike-time synchrony. 
We outline how HSMs may be used to perform learning, vector quantization, spike pattern recognition and generation, and how they may be reconfigured.
RAY-RAMSES: a code for ray tracing on the fly in N-body simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barreira, Alexandre; Llinares, Claudio; Bose, Sownak

2016-05-01

We present a ray tracing code to compute integrated cosmological observables on the fly in AMR N-body simulations. Unlike conventional ray tracing techniques, our code takes full advantage of the time and spatial resolution attained by the N-body simulation by computing the integrals along the line of sight on a cell-by-cell basis through the AMR simulation grid. Moreover, since it runs on the fly in the N-body run, our code can produce maps of the desired observables without storing large (or any) amounts of data for post-processing. We implemented our routines in the RAMSES N-body code and tested the implementation using an example of weak lensing simulation. We analyse basic statistics of lensing convergence maps and find good agreement with semi-analytical methods. The ray tracing methodology presented here can be used in several cosmological analyses such as Sunyaev-Zel'dovich and integrated Sachs-Wolfe effect studies as well as modified gravity. Our code can also be used in cross-checks of the more conventional methods, which can be important in tests of theory systematics in preparation for upcoming large scale structure surveys.
GAPD: a GPU-accelerated atom-based polychromatic diffraction simulation code.

PubMed

E, J C; Wang, L; Chen, S; Zhang, Y Y; Luo, S N

2018-03-01

GAPD, a graphics-processing-unit (GPU)-accelerated atom-based polychromatic diffraction simulation code for direct, kinematics-based, simulations of X-ray/electron diffraction of large-scale atomic systems with mono-/polychromatic beams and arbitrary plane detector geometries, is presented. This code implements GPU parallel computation via both real- and reciprocal-space decompositions. With GAPD, direct simulations are performed of the reciprocal lattice node of ultralarge systems (∼5 billion atoms) and diffraction patterns of single-crystal and polycrystalline configurations with mono- and polychromatic X-ray beams (including synchrotron undulator sources), and validation, benchmark and application cases are presented.
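In kinematic (single-scattering) diffraction, the intensity at scattering vector q is the squared modulus of a sum of phase factors over all atoms. The plain-Python sketch below illustrates that sum for a tiny simple-cubic crystal; it is not GAPD's implementation. The constant scattering factor, the lattice size, and the function name `intensity` are assumptions made for this example, and a production code would use q-dependent atomic form factors and GPU parallelism.

```python
import cmath
import math

def intensity(positions, q, f=1.0):
    """Kinematic intensity |sum_j f * exp(i q . r_j)|**2 with a constant factor f."""
    qx, qy, qz = q
    amplitude = sum(f * cmath.exp(1j * (qx * x + qy * y + qz * z))
                    for x, y, z in positions)
    return abs(amplitude) ** 2

a = 1.0                                       # lattice constant (arbitrary units)
n = 4                                         # atoms per edge, 64 atoms in total
positions = [(a * i, a * j, a * k)
             for i in range(n) for j in range(n) for k in range(n)]

# At a reciprocal lattice node all atoms scatter in phase: intensity = N**2.
bragg = intensity(positions, (2.0 * math.pi / a, 0.0, 0.0))
# Halfway between nodes the contributions cancel for this even-sized crystal.
between = intensity(positions, (math.pi / a, 0.0, 0.0))
print(bragg, between)
```

The direct sum costs one complex exponential per atom per q-point, which is why decomposing the work over atoms (real space) or detector pixels (reciprocal space) matters for systems with billions of atoms.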
Adjoint-Based Sensitivity and Uncertainty Analysis for Density and Composition: A User’s Guide

DOE PAGES

Favorite, Jeffrey A.; Perko, Zoltan; Kiedrowski, Brian C.; ...

2017-03-01

The ability to perform sensitivity analyses using adjoint-based first-order sensitivity theory has existed for decades. This paper provides guidance on how adjoint sensitivity methods can be used to predict the effect of material density and composition uncertainties in critical experiments, including when these uncertain parameters are correlated or constrained. Two widely used Monte Carlo codes, MCNP6 (Ref. 2) and SCALE 6.2 (Ref. 3), are both capable of computing isotopic density sensitivities in continuous energy and angle. Additionally, Perkó et al. have shown how individual isotope density sensitivities, easily computed using adjoint methods, can be combined to compute constrained first-order sensitivities that may be used in the uncertainty analysis. This paper provides details on how the codes are used to compute first-order sensitivities and how the sensitivities are used in an uncertainty analysis. Constrained first-order sensitivities are computed in a simple example problem.
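First-order uncertainty propagation with correlated parameters is usually written as the "sandwich rule": the relative variance of the response is S^T C S, where S holds relative sensitivities and C the relative covariance matrix. The Python sketch below illustrates only that generic rule. The sensitivities, uncertainties, and correlation are invented numbers, not output from MCNP6 or SCALE, and the constrained-sensitivity construction described in the paper is not reproduced here.

```python
import math

def response_variance(sens, cov):
    """First-order (sandwich rule) relative variance: sum_ij S_i * C_ij * S_j."""
    n = len(sens)
    return sum(sens[i] * cov[i][j] * sens[j] for i in range(n) for j in range(n))

# Hypothetical relative sensitivities of k-eff to two isotope densities.
sens = [0.3, -0.1]
# Hypothetical relative standard deviations (1% and 2%) with correlation -0.5.
std = [0.01, 0.02]
corr = -0.5
cov = [[std[0] ** 2, corr * std[0] * std[1]],
       [corr * std[0] * std[1], std[1] ** 2]]

var_correlated = response_variance(sens, cov)
var_uncorrelated = response_variance(sens, [[std[0] ** 2, 0.0], [0.0, std[1] ** 2]])
print(math.sqrt(var_correlated), math.sqrt(var_uncorrelated))
```

In this made-up case the negative correlation combines with sensitivities of opposite sign, so ignoring the correlation would understate the response uncertainty.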
A comprehensive study of MPI parallelism in three-dimensional discrete element method (DEM) simulation of complex-shaped granular particles

NASA Astrophysics Data System (ADS)

Yan, Beichuan; Regueiro, Richard A.

2018-02-01

A three-dimensional (3D) DEM code for simulating complex-shaped granular particles is parallelized using the message-passing interface (MPI). The concepts of link-block, ghost/border layer, and migration layer are put forward for design of the parallel algorithm, and a theoretical function for 3D DEM scalability and memory usage is derived. Many performance-critical implementation details are managed optimally to achieve high performance and scalability, such as minimizing communication overhead, maintaining dynamic load balance, handling particle migrations across block borders, transmitting C++ dynamic objects of particles between MPI processes efficiently, and eliminating redundant contact information between adjacent MPI processes. The code executes on multiple US Department of Defense (DoD) supercomputers and is tested on up to 2048 compute nodes for simulating 10 million three-axis ellipsoidal particles. Performance analyses of the code including speedup, efficiency, scalability, and granularity across five orders of magnitude of simulation scale (number of particles) are provided, and they demonstrate high speedup and excellent scalability. It is also discovered that communication time is a decreasing function of the number of compute nodes in strong scaling measurements. The code's capability of simulating a large number of complex-shaped particles on modern supercomputers will be of value in both laboratory studies on micromechanical properties of granular materials and many realistic engineering applications involving granular materials.
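Speedup and parallel efficiency in a strong-scaling study are simple ratios of wall-clock times at fixed problem size. The Python sketch below shows the usual definitions; the timings are invented for illustration and are not results from the paper, and the function name `strong_scaling` is an assumption.

```python
def strong_scaling(timings):
    """Map node count -> (speedup, efficiency) relative to the smallest run.

    timings: dict of {number_of_nodes: wall_clock_seconds} at fixed problem size.
    """
    base_nodes = min(timings)
    base_time = timings[base_nodes]
    results = {}
    for nodes, seconds in sorted(timings.items()):
        speedup = base_time / seconds
        efficiency = speedup / (nodes / base_nodes)   # 1.0 means ideal scaling
        results[nodes] = (speedup, efficiency)
    return results

# Hypothetical wall-clock times in seconds for the same problem.
timings = {1: 100.0, 2: 52.0, 4: 28.0, 8: 16.0}
results = strong_scaling(timings)
for nodes, (speedup, efficiency) in results.items():
    print(f"{nodes:4d} nodes: speedup {speedup:5.2f}, efficiency {efficiency:5.1%}")
```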
Fabrication of Circuit QED Quantum Processors, Part 1: Extensible Footprint for a Superconducting Surface Code

NASA Astrophysics Data System (ADS)

Bruno, A.; Michalak, D. J.; Poletto, S.; Clarke, J. S.; Dicarlo, L.

Large-scale quantum computation hinges on the ability to preserve and process quantum information with higher fidelity by increasing redundancy in a quantum error correction code. We present the realization of a scalable footprint for superconducting surface code based on planar circuit QED. We developed a tileable unit cell for surface code with all I/O routed vertically by means of superconducting through-silicon vias (TSVs). We address some of the challenges encountered during the fabrication and assembly of these chips, such as the quality of etch of the TSV, the uniformity of the ALD TiN coating conformal to the TSV, and the reliability of superconducting indium contact between the chips and PCB. We compare measured performance to a detailed list of specifications required for the realization of quantum fault tolerance. Our demonstration using centimeter-scale chips can accommodate the 50 qubits needed to target the experimental demonstration of small-distance logical qubits. Research funded by Intel Corporation and IARPA.
A New Inversion Routine to Produce Vertical Electron-Density Profiles from Ionospheric Topside-Sounder Data

NASA Technical Reports Server (NTRS)

Wang, Yongli; Benson, Robert F.

2011-01-01

Two software applications have been produced specifically for the analysis of some million digital topside ionograms produced by a recent analog-to-digital conversion effort of selected analog telemetry tapes from the Alouette-2, ISIS-1 and ISIS-2 satellites. One, TOPIST (TOPside Ionogram Scalar with True-height algorithm) from the University of Massachusetts Lowell, is designed for the automatic identification of the topside-ionogram ionospheric-reflection traces and their inversion into vertical electron-density profiles Ne(h). TOPIST also has the capability of manual intervention. The other application, from the Goddard Space Flight Center based on the FORTRAN code of John E. Jackson from the 1960s, is designed as an IDL-based interactive program for the scaling of selected digital topside-sounder ionograms. The Jackson code has also been modified, with some effort, so as to run on modern computers. This modification was motivated by the need to scale selected ionograms from the millions of Alouette/ISIS topside-sounder ionograms that only exist on 35-mm film. During this modification, it became evident that it would be more efficient to design a new code, based on the capabilities of present-day computers, than to continue to modify the old code. Such a new code has been produced and here we will describe its capabilities and compare Ne(h) profiles produced from it with those produced by the Jackson code. The concept of the new code is to assume an initial Ne(h) and derive a final Ne(h) through an iteration process that makes the resulting apparent-height profile fit the scaled values within a certain error range. The new code can be used on the X-, O-, and Z-mode traces. It does not assume any predefined profile shape between two contiguous points, like the exponential rule used in Jackson's program. Instead, Monotone Piecewise Cubic Interpolation is applied in the global profile to keep the monotone nature of the profile, which also ensures better smoothness in the final profile than in Jackson's program. The new code uses the complete refractive index expression for a cold collisionless plasma and can accommodate the IGRF, T96, and other geomagnetic field models.
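Monotone piecewise cubic interpolation fits a cubic Hermite segment between each pair of points and limits the node slopes so that no overshoot appears in monotone data. The Python sketch below is a generic illustration, not the authors' routine: it uses a weighted harmonic mean of neighbouring secant slopes (one common choice in Fritsch-Carlson-type schemes), simple one-sided end slopes, and made-up sample data. The function names are assumptions.

```python
import bisect

def monotone_slopes(x, y):
    """Node slopes from a weighted harmonic mean of adjacent secant slopes."""
    n = len(x)
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    d = [(y[i + 1] - y[i]) / h[i] for i in range(n - 1)]   # secant slopes
    m = [0.0] * n
    m[0], m[-1] = d[0], d[-1]                              # one-sided end slopes
    for i in range(1, n - 1):
        if d[i - 1] * d[i] > 0.0:                          # same sign: no extremum
            w1 = 2.0 * h[i] + h[i - 1]
            w2 = h[i] + 2.0 * h[i - 1]
            m[i] = (w1 + w2) / (w1 / d[i - 1] + w2 / d[i])
        # otherwise the slope stays 0.0 so a local extremum is not overshot
    return m

def interpolate(x, y, m, xq):
    """Evaluate the cubic Hermite interpolant at xq (x must be increasing)."""
    i = max(0, min(bisect.bisect_right(x, xq) - 1, len(x) - 2))
    h = x[i + 1] - x[i]
    t = (xq - x[i]) / h
    h00 = (1.0 + 2.0 * t) * (1.0 - t) ** 2
    h10 = t * (1.0 - t) ** 2
    h01 = t * t * (3.0 - 2.0 * t)
    h11 = t * t * (t - 1.0)
    return h00 * y[i] + h10 * h * m[i] + h01 * y[i + 1] + h11 * h * m[i + 1]

# Made-up monotone data with an abrupt rise, where a plain cubic spline can overshoot.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 0.1, 0.15, 2.0, 2.05]
slopes = monotone_slopes(xs, ys)
samples = [interpolate(xs, ys, slopes, 0.01 * k) for k in range(401)]
is_monotone = all(b >= a - 1e-12 for a, b in zip(samples, samples[1:]))
print(is_monotone)
```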
Robust Nonlinear Neural Codes

NASA Astrophysics Data System (ADS)

Yang, Qianli; Pitkow, Xaq

2015-03-01

Most interesting natural sensory stimuli are encoded in the brain in a form that can only be decoded nonlinearly. But despite being a core function of the brain, nonlinear population codes are rarely studied and poorly understood. Interestingly, the few existing models of nonlinear codes are inconsistent with known architectural features of the brain. In particular, these codes have information content that scales with the size of the cortical population, even if that violates the data processing inequality by exceeding the amount of information entering the sensory system. Here we provide a valid theory of nonlinear population codes by generalizing recent work on information-limiting correlations in linear population codes. Although these generalized, nonlinear information-limiting correlations bound the performance of any decoder, they also make decoding more robust to suboptimal computation, allowing many suboptimal decoders to achieve nearly the same efficiency as an optimal decoder. Although these correlations are extremely difficult to measure directly, particularly for nonlinear codes, we provide a simple, practical test by which one can use choice-related activity in small populations of neurons to determine whether decoding is suboptimal or optimal and limited by correlated noise. We conclude by describing an example computation in the vestibular system where this theory applies. QY and XP were supported by a grant from the McNair foundation.
Nicholas Metropolis Award for Outstanding Doctoral Thesis Work in Computational Physics Talk: Understanding Nano-scale Electronic Systems via Large-scale Computation

NASA Astrophysics Data System (ADS)

Cao, Chao

2009-03-01

Nano-scale physical phenomena and processes, especially those in electronics, have drawn great attention in the past decade. Experiments have shown that electronic and transport properties of functionalized carbon nanotubes are sensitive to adsorption of gas molecules such as H2, NO2, and NH3. Similar measurements have also been performed to study adsorption of proteins on other semiconductor nano-wires. These experiments suggest that nano-scale systems can be useful for making future chemical and biological sensors. Aiming to understand the physical mechanisms underlying and governing property changes at nano-scale, we start off by investigating, via first-principles methods, the electronic structure of Pd-CNT before and after hydrogen adsorption, and continue with coherent electronic transport using non-equilibrium Green’s function techniques combined with density functional theory. Once our results are fully analyzed they can be used to interpret and understand experimental data, with a few difficult issues to be addressed. Finally, we discuss a newly developed multi-scale computing architecture, OPAL, that coordinates simultaneous execution of multiple codes. Inspired by the capabilities of this computing framework, we present a scenario of future modeling and simulation of multi-scale, multi-physical processes.
Energy scaling advantages of resistive memory crossbar based computation and its application to sparse coding

DOE Office of Scientific and Technical Information (OSTI.GOV)

Agarwal, Sapan; Quach, Tu-Thach; Parekh, Ojas

In this study, the exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-based architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.
Energy scaling advantages of resistive memory crossbar based computation and its application to sparse coding

DOE PAGES

Agarwal, Sapan; Quach, Tu -Thach; Parekh, Ojas; ...

2016-01-06

In this study, the exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-based architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.
LB3D: A parallel implementation of the Lattice-Boltzmann method for simulation of interacting amphiphilic fluids

NASA Astrophysics Data System (ADS)

Schmieschek, S.; Shamardin, L.; Frijters, S.; Krüger, T.; Schiller, U. D.; Harting, J.; Coveney, P. V.

2017-08-01

We introduce the lattice-Boltzmann code LB3D, version 7.1. Building on a parallel program and supporting tools which have enabled research utilising high performance computing resources for nearly two decades, LB3D version 7 provides a subset of the research code functionality as an open source project. Here, we describe the theoretical basis of the algorithm as well as computational aspects of the implementation. The software package is validated against simulations of meso-phases resulting from self-assembly in ternary fluid mixtures comprising immiscible and amphiphilic components such as water-oil-surfactant systems. The impact of the surfactant species on the dynamics of spinodal decomposition is tested, and a quantitative measurement of the permeability of a body centred cubic (BCC) model porous medium for a simple binary mixture is described. Single-core performance and scaling behaviour of the code are reported for simulations on current supercomputer architectures.

User's manual for the BNW-II optimization code for dry/wet-cooled power plants

DOE Office of Scientific and Technical Information (OSTI.GOV)

Braun, D.J.; Bamberger, J.A.; Braun, D.J.

1978-05-01

The User's Manual describes how to operate BNW-II, a computer code developed by the Pacific Northwest Laboratory (PNL) as a part of its activities under the Department of Energy (DOE) Dry Cooling Enhancement Program. The computer program offers a comprehensive method of evaluating the cost savings potential of dry/wet-cooled heat rejection systems. Going beyond simple "figure-of-merit" cooling tower optimization, this method includes such items as the cost of annual replacement capacity, and the optimum split between plant scale-up and replacement capacity, as well as the purchase and operating costs of all major heat rejection components. Hence the BNW-II code is a useful tool for determining potential cost savings of new dry/wet surfaces, new piping, or other components as part of an optimized system for a dry/wet-cooled plant.
Portable multi-node LQCD Monte Carlo simulations using OpenACC

NASA Astrophysics Data System (ADS)

Bonati, Claudio; Calore, Enrico; D'Elia, Massimo; Mesiti, Michele; Negro, Francesco; Sanfilippo, Francesco; Schifano, Sebastiano Fabio; Silvi, Giorgio; Tripiccione, Raffaele

This paper describes a state-of-the-art parallel Lattice QCD Monte Carlo code for staggered fermions, purposely designed to be portable across different computer architectures, including GPUs and commodity CPUs. Portability is achieved using the OpenACC parallel programming model, used to develop a code that can be compiled for several processor architectures. The paper focuses on parallelization on multiple computing nodes using OpenACC to manage parallelism within the node, and OpenMPI to manage parallelism among the nodes. We first discuss the available strategies for maximizing performance, then describe selected relevant details of the code, and finally measure the level of performance and scaling performance that we are able to achieve. The work focuses mainly on GPUs, which offer a high level of performance for this application, but also compares with results measured on other processors.
Programming for 1.6 Million cores: Early experiences with IBM's BG/Q SMP architecture

NASA Astrophysics Data System (ADS)

Glosli, James

2013-03-01

With the stall in clock cycle improvements a decade ago, the drive for computational performance has continued along a path of increasing core counts on a processor. The multi-core evolution has been expressed in both a symmetric multiprocessor (SMP) architecture and a CPU/GPU architecture. Debates rage in the high performance computing (HPC) community over which architecture best serves HPC. In this talk I will not attempt to resolve that debate but perhaps fuel it. I will discuss the experience of exploiting Sequoia, a 98304 node IBM Blue Gene/Q SMP at Lawrence Livermore National Laboratory. The advantages and challenges of leveraging the computational power of BG/Q will be detailed through the discussion of two applications. The first application is a Molecular Dynamics code called ddcMD. This is a code developed over the last decade at LLNL and ported to BG/Q. The second application is a cardiac modeling code called Cardioid. This is a code that was recently designed and developed at LLNL to exploit the fine scale parallelism of BG/Q's SMP architecture. Through the lenses of these efforts I'll illustrate the need to rethink how we express and implement our computational approaches. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Validation of hydrogen gas stratification and mixing models

DOE PAGES

Wu, Hsingtzu; Zhao, Haihua

2015-05-26

Two validation benchmarks confirm that the BMIX++ code is capable of simulating unintended hydrogen release scenarios efficiently. The BMIX++ (UC Berkeley mechanistic MIXing code in C++) code has been developed to accurately and efficiently predict the fluid mixture distribution and heat transfer in large stratified enclosures for accident analyses and design optimizations. The BMIX++ code uses a scaling based one-dimensional method to achieve a large reduction in computational effort compared to a 3-D computational fluid dynamics (CFD) simulation. Two BMIX++ benchmark models have been developed. One is for a single buoyant jet in an open space and another is for a large sealed enclosure with both a jet source and a vent near the floor. Both of them have been validated by comparisons with experimental data. Excellent agreement is observed. The entrainment coefficients of 0.09 and 0.08 are found to best fit the experimental data for hydrogen leaks with Froude numbers of 99 and 268, respectively. In addition, the BMIX++ simulation results of the average helium concentration for an enclosure with a vent and a single jet agree with the experimental data within a margin of about 10% for jet flow rates ranging from 1.21 × 10⁻⁴ to 3.29 × 10⁻⁴ m³/s. In conclusion, computing time for each BMIX++ model with a normal desktop computer is less than 5 min.
Extreme-Scale De Novo Genome Assembly

DOE Office of Scientific and Technical Information (OSTI.GOV)

Georganas, Evangelos; Hofmeyr, Steven; Egan, Rob

De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.
Full-Process Computer Model of Magnetron Sputter, Part I: Test Existing State-of-Art Components

DOE Office of Scientific and Technical Information (OSTI.GOV)

Walton, C C; Gilmer, G H; Wemhoff, A P

2007-09-26

This work is part of a larger project to develop a modeling capability for magnetron sputter deposition. The process is divided into four steps: plasma transport, target sputter, neutral gas and sputtered atom transport, and film growth, shown schematically in Fig. 1. Each of these is simulated separately in this Part 1 of the project, which is jointly funded between CMLS and Engineering. The Engineering portion is the plasma modeling, in step 1. The plasma modeling was performed using the Object-Oriented Particle-In-Cell code (OOPIC) from UC Berkeley [1]. Figure 2 shows the electron density in the simulated region, using magnetic field strength input from experiments by Bohlmark [2], where a scale of 1% is used. Figures 3 and 4 depict the magnetic field components that were generated using two-dimensional linear interpolation of Bohlmark's experimental data. The goal of the overall modeling tool is to understand, and later predict, relationships between parameters of film deposition we can change (such as gas pressure, gun voltage, and target-substrate distance) and key properties of the results (such as film stress, density, and stoichiometry). The simulation must use existing codes, either open-source or low-cost, not develop new codes. In part 1 (FY07) we identified and tested the best available code for each process step, then determined if it can cover the size and time scales we need in reasonable computation times. We also had to determine if the process steps are sufficiently decoupled that they can be treated separately, and identify any research-level issues preventing practical use of these codes. Part 2 will consider whether the codes can be (or need to be) made to talk to each other and integrated into a whole.
Performance of distributed multiscale simulations

PubMed Central

Borgdorff, J.; Ben Belgacem, M.; Bona-Casas, C.; Fazendeiro, L.; Groen, D.; Hoenen, O.; Mizeranschi, A.; Suter, J. L.; Coster, D.; Coveney, P. V.; Dubitzky, W.; Hoekstra, A. G.; Strand, P.; Chopard, B.

2014-01-01

Multiscale simulations model phenomena across natural scales using monolithic or component-based code, running on local or distributed resources. In this work, we investigate the performance of distributed multiscale computing of component-based models, guided by six multiscale applications with different characteristics and from several disciplines. Three modes of distributed multiscale computing are identified: supplementing local dependencies with large-scale resources, load distribution over multiple resources, and load balancing of small- and large-scale resources. We find that the first mode has the apparent benefit of increasing simulation speed, and the second mode can increase simulation speed if local resources are limited. Depending on resource reservation and model coupling topology, the third mode may result in a reduction of resource consumption. PMID:24982258
Performance analysis of parallel gravitational N-body codes on large GPU clusters

NASA Astrophysics Data System (ADS)

Huang, Si-Yi; Spurzem, Rainer; Berczik, Peter

2016-01-01

We compare the performance of two very different parallel gravitational N-body codes for astrophysical simulations on large Graphics Processing Unit (GPU) clusters, both of which are pioneers in their own fields as well as on certain mutual scales - NBODY6++ and Bonsai. We carry out benchmarks of the two codes by analyzing their performance, accuracy and efficiency through the modeling of structure decomposition and timing measurements. We find that both codes are heavily optimized to leverage the computational potential of GPUs as their performance has approached half of the maximum single precision performance of the underlying GPU cards. With such performance we predict that a speed-up of 200 - 300 can be achieved when up to 1k processors and GPUs are employed simultaneously. We discuss the quantitative information about comparisons of the two codes, finding that in the same cases Bonsai adopts larger time steps as well as larger relative energy errors than NBODY6++, typically ranging from 10 - 50 times larger, depending on the chosen parameters of the codes. Although the two codes are built for different astrophysical applications, in specified conditions they may overlap in performance at certain physical scales, thus allowing the user to choose either one by fine-tuning parameters accordingly.
Parallel algorithm for multiscale atomistic/continuum simulations using LAMMPS

NASA Astrophysics Data System (ADS)

Pavia, F.; Curtin, W. A.

2015-07-01

Deformation and fracture processes in engineering materials often require simultaneous descriptions over a range of length and time scales, with each scale using a different computational technique. Here we present a high-performance parallel 3D computing framework for executing large multiscale studies that couple an atomic domain, modeled using molecular dynamics and a continuum domain, modeled using explicit finite elements. We use the robust Coupled Atomistic/Discrete-Dislocation (CADD) displacement-coupling method, but without the transfer of dislocations between atoms and continuum. The main purpose of the work is to provide a multiscale implementation within an existing large-scale parallel molecular dynamics code (LAMMPS) that enables use of all the tools associated with this popular open-source code, while extending CADD-type coupling to 3D. Validation of the implementation includes the demonstration of (i) stability in finite-temperature dynamics using Langevin dynamics, (ii) elimination of wave reflections due to large dynamic events occurring in the MD region and (iii) the absence of spurious forces acting on dislocations due to the MD/FE coupling, for dislocations further than 10 Å from the coupling boundary. A first non-trivial example application of dislocation glide and bowing around obstacles is shown, for dislocation lengths of ∼50 nm using fewer than 1 000 000 atoms but reproducing results of extremely large atomistic simulations at much lower computational cost.
Application of Fast Multipole Methods to the NASA Fast Scattering Code

NASA Technical Reports Server (NTRS)

Dunn, Mark H.; Tinetti, Ana F.

2008-01-01

The NASA Fast Scattering Code (FSC) is a versatile noise prediction program designed to conduct aeroacoustic noise reduction studies. The equivalent source method is used to solve an exterior Helmholtz boundary value problem with an impedance type boundary condition. The solution process in FSC v2.0 requires direct manipulation of a large, dense system of linear equations, limiting the applicability of the code to small scales and/or moderate excitation frequencies. Recent advances in the use of Fast Multipole Methods (FMM) for solving scattering problems, coupled with sparse linear algebra techniques, suggest that a substantial reduction in computer resource utilization over conventional solution approaches can be obtained. Implementation of the single level FMM (SLFMM) and a variant of the Conjugate Gradient Method (CGM) into the FSC is discussed in this paper. The culmination of this effort, FSC v3.0, was used to generate solutions for three configurations of interest. Benchmarking against previously obtained simulations indicates that a twenty-fold reduction in computational memory and up to a four-fold reduction in computer time have been achieved on a single processor.
Research in Parallel Algorithms and Software for Computational Aerosciences

NASA Technical Reports Server (NTRS)

Domel, Neal D.

1996-01-01

Phase I is complete for the development of a Computational Fluid Dynamics parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.
Research in Parallel Algorithms and Software for Computational Aerosciences

NASA Technical Reports Server (NTRS)

Domel, Neal D.

1996-01-01

Phase 1 is complete for the development of a computational fluid dynamics (CFD) parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.
Parcels v0.9: prototyping a Lagrangian ocean analysis framework for the petascale age

NASA Astrophysics Data System (ADS)

Lange, Michael; van Sebille, Erik

2017-11-01

As ocean general circulation models (OGCMs) move into the petascale age, where the output of single simulations exceeds petabytes of storage space, tools to analyse the output of these models will need to scale up too. Lagrangian ocean analysis, where virtual particles are tracked through hydrodynamic fields, is an increasingly popular way to analyse OGCM output, by mapping pathways and connectivity of biotic and abiotic particulates. However, the current software stack of Lagrangian ocean analysis codes is not dynamic enough to cope with the increasing complexity, scale and need for customization of use-cases. Furthermore, most community codes are developed for stand-alone use, making it a nontrivial task to integrate virtual particles at runtime of the OGCM. Here, we introduce the new Parcels code, which was designed from the ground up to be sufficiently scalable to cope with petascale computing. We highlight its API design that combines flexibility and customization with the ability to optimize for HPC workflows, following the paradigm of domain-specific languages. Parcels is primarily written in Python, utilizing the wide range of tools available in the scientific Python ecosystem, while generating low-level C code and using just-in-time compilation for performance-critical computation. We show a worked-out example of its API, and validate the accuracy of the code against seven idealized test cases. This version 0.9 of Parcels is focused on laying out the API, with future work concentrating on support for curvilinear grids, optimization, efficiency and at-runtime coupling with OGCMs.
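The core idea of Lagrangian analysis described in this abstract, tracking virtual particles through a velocity field, can be sketched in a few lines. The example below is not the Parcels API: the `velocity`, `rk4_step`, and `advect` names are hypothetical, the velocity field is an analytic solid-body rotation rather than OGCM output, and fourth-order Runge-Kutta is chosen only for illustration since the abstract does not state an integration scheme.

```python
# Minimal particle-advection sketch (assumed names, not the Parcels API).
import math


def velocity(x, y):
    """Hypothetical steady field: solid-body rotation about the origin."""
    return -y, x


def rk4_step(x, y, dt):
    """Advance one particle by one fourth-order Runge-Kutta step."""
    k1x, k1y = velocity(x, y)
    k2x, k2y = velocity(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = velocity(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = velocity(x + dt * k3x, y + dt * k3y)
    x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    return x, y


def advect(particles, dt, n_steps):
    """Track every particle through the field for n_steps time steps."""
    for _ in range(n_steps):
        particles = [rk4_step(x, y, dt) for x, y in particles]
    return particles


# One full rotation period (2*pi) split into 1000 steps.
dt = 2 * math.pi / 1000
start = [(1.0, 0.0), (0.0, 2.0)]
final = advect(start, dt, n_steps=1000)
```

In this field each particle should return close to its starting point after one period, which gives a simple accuracy check of the kind an idealized test case provides.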
Modeling Subsurface Reactive Flows Using Leadership-Class Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mills, Richard T; Hammond, Glenn; Lichtner, Peter

2009-01-01

We describe our experiences running PFLOTRAN - a code for simulation of coupled hydro-thermal-chemical processes in variably saturated, non-isothermal, porous media - on leadership-class supercomputers, including initial experiences running on the petaflop incarnation of Jaguar, the Cray XT5 at the National Center for Computational Sciences at Oak Ridge National Laboratory. PFLOTRAN utilizes fully implicit time-stepping and is built on top of the Portable, Extensible Toolkit for Scientific Computation (PETSc). We discuss some of the hurdles to 'at scale' performance with PFLOTRAN and the progress we have made in overcoming them on leadership-class computer architectures.
SCALE Code System

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rearden, Bradley T.; Jessee, Matthew Anderson

The SCALE Code System is a widely-used modeling and simulation suite for nuclear safety analysis and design that is developed, maintained, tested, and managed by the Reactor and Nuclear Systems Division (RNSD) of Oak Ridge National Laboratory (ORNL). SCALE provides a comprehensive, verified and validated, user-friendly tool set for criticality safety, reactor and lattice physics, radiation shielding, spent fuel and radioactive source term characterization, and sensitivity and uncertainty analysis. Since 1980, regulators, licensees, and research institutions around the world have used SCALE for safety analysis and design. SCALE provides an integrated framework with dozens of computational modules including three deterministic and three Monte Carlo radiation transport solvers that are selected based on the desired solution strategy. SCALE includes current nuclear data libraries and problem-dependent processing tools for continuous-energy (CE) and multigroup (MG) neutronics and coupled neutron-gamma calculations, as well as activation, depletion, and decay calculations. SCALE includes unique capabilities for automated variance reduction for shielding calculations, as well as sensitivity and uncertainty analysis. SCALE’s graphical user interfaces assist with accurate system modeling, visualization of nuclear data, and convenient access to desired results.
SCALE Code System 6.2.1

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rearden, Bradley T.; Jessee, Matthew Anderson

The SCALE Code System is a widely-used modeling and simulation suite for nuclear safety analysis and design that is developed, maintained, tested, and managed by the Reactor and Nuclear Systems Division (RNSD) of Oak Ridge National Laboratory (ORNL). SCALE provides a comprehensive, verified and validated, user-friendly tool set for criticality safety, reactor and lattice physics, radiation shielding, spent fuel and radioactive source term characterization, and sensitivity and uncertainty analysis. Since 1980, regulators, licensees, and research institutions around the world have used SCALE for safety analysis and design. SCALE provides an integrated framework with dozens of computational modules including three deterministic and three Monte Carlo radiation transport solvers that are selected based on the desired solution strategy. SCALE includes current nuclear data libraries and problem-dependent processing tools for continuous-energy (CE) and multigroup (MG) neutronics and coupled neutron-gamma calculations, as well as activation, depletion, and decay calculations. SCALE includes unique capabilities for automated variance reduction for shielding calculations, as well as sensitivity and uncertainty analysis. SCALE’s graphical user interfaces assist with accurate system modeling, visualization of nuclear data, and convenient access to desired results.
Ultrafast and scalable cone-beam CT reconstruction using MapReduce in a cloud computing environment.

PubMed

Meng, Bowen; Pratx, Guillem; Xing, Lei

2011-12-01

Four-dimensional CT (4DCT) and cone beam CT (CBCT) are widely used in radiation therapy for accurate tumor target definition and localization. However, high-resolution and dynamic image reconstruction is computationally demanding because of the large amount of data processed. Efficient use of these imaging techniques in the clinic requires high-performance computing. The purpose of this work is to develop a novel ultrafast, scalable and reliable image reconstruction technique for 4D CBCT/CT using a parallel computing framework called MapReduce. We show the utility of MapReduce for solving large-scale medical physics problems in a cloud computing environment. In this work, we accelerated the Feldkamp-Davis-Kress (FDK) algorithm by porting it to Hadoop, an open-source MapReduce implementation. Gated phases from a 4DCT scan were reconstructed independently. Following the MapReduce formalism, Map functions were used to filter and backproject subsets of projections, and a Reduce function was used to aggregate those partial backprojections into the whole volume. MapReduce automatically parallelized the reconstruction process on a large cluster of computer nodes. As a validation, reconstruction of a digital phantom and an acquired CatPhan 600 phantom was performed on a commercial cloud computing environment using the proposed 4D CBCT/CT reconstruction algorithm. Speedup of reconstruction time is found to be roughly linear with the number of nodes employed. For instance, greater than 10 times speedup was achieved using 200 nodes for all cases, compared to the same code executed on a single machine. Without modifying the code, faster reconstruction is readily achievable by allocating more nodes in the cloud computing environment. Root mean square error between the images obtained using MapReduce and a single-threaded reference implementation was on the order of 10⁻⁷. Our study also proved that cloud computing with MapReduce is fault tolerant: the reconstruction completed successfully with identical results even when half of the nodes were manually terminated in the middle of the process. An ultrafast, reliable and scalable 4D CBCT/CT reconstruction method was developed using the MapReduce framework. Unlike other parallel computing approaches, the parallelization and speedup required little modification of the original reconstruction code. MapReduce provides an efficient and fault tolerant means of solving large-scale computing problems in a cloud computing environment.
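The Map/Reduce split described in this abstract (Map tasks backproject subsets of projections, a Reduce task sums the partial results into the whole volume) can be sketched as follows. This is a toy under stated assumptions: `backproject` is a stand-in that smears one value over a four-voxel 1-D volume, not the FDK algorithm, no Hadoop API is used, and all names are hypothetical.

```python
# Toy sketch of the Map/Reduce decomposition; not FDK and not Hadoop code.
from functools import reduce

VOLUME_SIZE = 4  # assumed tiny 1-D "volume" for illustration


def backproject(projection):
    """Stand-in for filter + backproject: one projection -> partial volume."""
    angle_index, value = projection
    return [value * (angle_index + 1) / (voxel + 1)
            for voxel in range(VOLUME_SIZE)]


def add_volumes(a, b):
    """Voxel-wise sum of two volumes."""
    return [x + y for x, y in zip(a, b)]


def map_task(subset):
    """Map: backproject a subset of projections into one partial volume."""
    partials = [backproject(p) for p in subset]
    return reduce(add_volumes, partials, [0.0] * VOLUME_SIZE)


def reduce_task(partial_volumes):
    """Reduce: aggregate the partial volumes into the whole volume."""
    return reduce(add_volumes, partial_volumes, [0.0] * VOLUME_SIZE)


projections = [(i, float(i + 1)) for i in range(8)]
subsets = [projections[i:i + 2] for i in range(0, 8, 2)]  # four Map tasks
volume = reduce_task([map_task(s) for s in subsets])
reference = map_task(projections)  # single-pass result for comparison
```

Because backprojection is a sum over projections, the Map tasks are independent of one another, which is what lets a framework distribute them across nodes or rerun them after a node failure. The partitioned and single-pass results agree up to floating-point rounding from the different summation order.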
Ultrafast and scalable cone-beam CT reconstruction using MapReduce in a cloud computing environment

PubMed Central

Meng, Bowen; Pratx, Guillem; Xing, Lei

2011-01-01

Purpose: Four-dimensional CT (4DCT) and cone beam CT (CBCT) are widely used in radiation therapy for accurate tumor target definition and localization. However, high-resolution and dynamic image reconstruction is computationally demanding because of the large amount of data processed. Efficient use of these imaging techniques in the clinic requires high-performance computing. The purpose of this work is to develop a novel ultrafast, scalable and reliable image reconstruction technique for 4D CBCT/CT using a parallel computing framework called MapReduce. We show the utility of MapReduce for solving large-scale medical physics problems in a cloud computing environment. Methods: In this work, we accelerated the Feldkamp–Davis–Kress (FDK) algorithm by porting it to Hadoop, an open-source MapReduce implementation. Gated phases from a 4DCT scan were reconstructed independently. Following the MapReduce formalism, Map functions were used to filter and backproject subsets of projections, and a Reduce function was used to aggregate those partial backprojections into the whole volume. MapReduce automatically parallelized the reconstruction process on a large cluster of computer nodes. As a validation, reconstruction of a digital phantom and an acquired CatPhan 600 phantom was performed on a commercial cloud computing environment using the proposed 4D CBCT/CT reconstruction algorithm. Results: Speedup of reconstruction time is found to be roughly linear with the number of nodes employed. For instance, greater than 10 times speedup was achieved using 200 nodes for all cases, compared to the same code executed on a single machine. Without modifying the code, faster reconstruction is readily achievable by allocating more nodes in the cloud computing environment. Root mean square error between the images obtained using MapReduce and a single-threaded reference implementation was on the order of 10⁻⁷. Our study also proved that cloud computing with MapReduce is fault tolerant: the reconstruction completed successfully with identical results even when half of the nodes were manually terminated in the middle of the process. Conclusions: An ultrafast, reliable and scalable 4D CBCT/CT reconstruction method was developed using the MapReduce framework. Unlike other parallel computing approaches, the parallelization and speedup required little modification of the original reconstruction code. MapReduce provides an efficient and fault tolerant means of solving large-scale computing problems in a cloud computing environment. PMID:22149842
Growth of multi-component alloy films with controlled graded chemical composition on sub-nanometer scale

DOEpatents

Bajt, Sasa; Vernon, Stephen P.

2005-03-15

The chemical composition of thin films is modulated during their growth. A computer code has been developed to design specific processes for producing a desired chemical composition for various deposition geometries. Good agreement between theoretical and experimental results was achieved.
Large-scale structural optimization

NASA Technical Reports Server (NTRS)

Sobieszczanski-Sobieski, J.

1983-01-01

Problems encountered by aerospace designers in attempting to optimize whole aircraft are discussed, along with possible solutions. Large scale optimization, as opposed to component-by-component optimization, is hindered by computational costs, software inflexibility, concentration on a single, rather than trade-off, design methodology and the incompatibility of large-scale optimization with single program, single computer methods. The software problem can be approached by placing the full analysis outside of the optimization loop. Full analysis is then performed only periodically. Problem-dependent software can be removed from the generic code using a systems programming technique, and can then embody the definitions of design variables, objective function and design constraints. Trade-off algorithms can be used at the design points to obtain quantitative answers. Finally, decomposing the large-scale problem into independent subproblems allows systematic optimization of the problems by an organization of people and machines.

The EDIT-COMGEOM Code

DTIC Science & Technology

1975-09-01

This report assumes a familiarity with the GIFT and MAGIC computer codes. The EDIT-COMGEOM code is a FORTRAN computer code. The EDIT-COMGEOM code...converts the target description data which was used in the MAGIC computer code to the target description data which can be used in the GIFT computer code
An efficient implementation of 3D high-resolution imaging for large-scale seismic data with GPU/CPU heterogeneous parallel computing

NASA Astrophysics Data System (ADS)

Xu, Jincheng; Liu, Wei; Wang, Jin; Liu, Linong; Zhang, Jianfeng

2018-02-01

De-absorption pre-stack time migration (QPSTM) compensates for the absorption and dispersion of seismic waves by introducing an effective Q parameter, thereby making it an effective tool for 3D, high-resolution imaging of seismic data. Although the optimal aperture obtained via stationary-phase migration reduces the computational cost of 3D QPSTM and yields 3D stationary-phase QPSTM, the associated computational efficiency is still the main problem in the processing of 3D, high-resolution images for real large-scale seismic data. In the current paper, we proposed a division method for large-scale, 3D seismic data to optimize the performance of stationary-phase QPSTM on clusters of graphics processing units (GPU). Then, we designed an imaging point parallel strategy to achieve an optimal parallel computing performance. Afterward, we adopted an asynchronous double buffering scheme for multi-stream to perform the GPU/CPU parallel computing. Moreover, several key optimization strategies of computation and storage based on the compute unified device architecture (CUDA) were adopted to accelerate the 3D stationary-phase QPSTM algorithm. Compared with the initial GPU code, the implementation of the key optimization steps, including thread optimization, shared memory optimization, register optimization and special function units (SFU), greatly improved the efficiency. A numerical example employing real large-scale, 3D seismic data showed that our scheme is nearly 80 times faster than the CPU-QPSTM algorithm. Our GPU/CPU heterogeneous parallel computing framework significantly reduces the computational cost and facilitates 3D high-resolution imaging for large-scale seismic data.
A Novel Multi-Scale Domain Overlapping CFD/STH Coupling Methodology for Multi-Dimensional Flows Relevant to Nuclear Applications

NASA Astrophysics Data System (ADS)

Grunloh, Timothy P.

The objective of this dissertation is to develop a 3-D domain-overlapping coupling method that leverages the superior flow field resolution of the Computational Fluid Dynamics (CFD) code STAR-CCM+ and the fast execution of the System Thermal Hydraulic (STH) code TRACE to efficiently and accurately model thermal hydraulic transport properties in nuclear power plants under complex conditions of regulatory and economic importance. The primary contribution is the novel Stabilized Inertial Domain Overlapping (SIDO) coupling method, which allows for on-the-fly correction of TRACE solutions for local pressures and velocity profiles inside multi-dimensional regions based on the results of the CFD simulation. The method is found to outperform the more frequently-used domain decomposition coupling methods. An STH code such as TRACE is designed to simulate large, diverse component networks, requiring simplifications to the fluid flow equations for reasonable execution times. Empirical correlations are therefore required for many sub-grid processes. The coarse grids used by TRACE diminish sensitivity to small scale geometric details such as Reactor Pressure Vessel (RPV) internals. A CFD code such as STAR-CCM+ uses much finer computational meshes that are sensitive to the geometric details of reactor internals. In turbulent flows, it is infeasible to fully resolve the flow solution, but the correlations used to model turbulence are at a low level. The CFD code can therefore resolve smaller scale flow processes. The development of a 3-D coupling method was carried out with the intention of improving predictive capabilities of transport properties in the downcomer and lower plenum regions of an RPV in reactor safety calculations. These regions are responsible for the multi-dimensional mixing effects that determine the distribution at the core inlet of quantities with reactivity implications, such as fluid temperature and dissolved neutron absorber concentration.
Scaling Optimization of the SIESTA MHD Code

NASA Astrophysics Data System (ADS)

Seal, Sudip; Hirshman, Steven; Perumalla, Kalyan

2013-10-01

SIESTA is a parallel three-dimensional plasma equilibrium code capable of resolving magnetic islands at high spatial resolutions for toroidal plasmas. Originally designed to exploit small-scale parallelism, SIESTA has now been scaled to execute efficiently over several thousands of processors P. This scaling improvement was accomplished with minimal intrusion to the execution flow of the original version. First, the efficiency of the iterative solutions was improved by integrating the parallel tridiagonal block solver code BCYCLIC. Krylov-space generation in GMRES was then accelerated using a customized parallel matrix-vector multiplication algorithm. Novel parallel Hessian generation algorithms were integrated and memory access latencies were dramatically reduced through loop nest optimizations and data layout rearrangement. These optimizations sped up equilibria calculations by factors of 30-50. It is possible to compute solutions with granularity N/P near unity on extremely fine radial meshes (N > 1024 points). Grid separation in SIESTA, which manifests itself primarily in the resonant components of the pressure far from rational surfaces, is strongly suppressed by finer meshes. Large problem sizes of up to 300 K simultaneous non-linear coupled equations have been solved on the NERSC supercomputers. Work supported by U.S. DOE under Contract DE-AC05-00OR22725 with UT-Battelle, LLC.
Correlated Errors in the Surface Code

NASA Astrophysics Data System (ADS)

Lopez, Daniel; Mucciolo, E. R.; Novais, E.

2012-02-01

A milestone in the development of quantum information technology would be the ability to design and operate a reliable quantum memory. The greatest obstacle to creating such a device has been decoherence due to the unavoidable interaction between the quantum system and its environment. Quantum error correction is therefore an essential ingredient of any quantum information device. A great deal of attention has been given to surface codes, since they have very good scaling properties. In this seminar, we discuss the time evolution of a qubit encoded in the logical basis of a surface code. The system interacts with a bosonic environment at zero temperature. Our results show how detrimental spatial and temporal correlations can be to the efficiency of the code.
JACOB: an enterprise framework for computational chemistry.

PubMed

Waller, Mark P; Dresselhaus, Thomas; Yang, Jack

2013-06-15

Here, we present just a collection of beans (JACOB): an integrated batch-based framework designed for the rapid development of computational chemistry applications. The framework expedites developer productivity by handling the generic infrastructure tier, and can be easily extended by user-specific scientific code. Paradigms from enterprise software engineering were rigorously applied to create a scalable, testable, secure, and robust framework. A centralized web application is used to configure and control the operation of the framework. The application-programming interface provides a set of generic tools for processing large-scale noninteractive jobs (e.g., systematic studies), or for coordinating systems integration (e.g., complex workflows). The code for the JACOB framework is open sourced and is available at: www.wallerlab.org/jacob. Copyright © 2013 Wiley Periodicals, Inc.
CoreTSAR: Core Task-Size Adapting Runtime

DOE PAGES

Scogland, Thomas R. W.; Feng, Wu-chun; Rountree, Barry; ...

2014-10-27

Heterogeneity continues to increase at all levels of computing, with the rise of accelerators such as GPUs, FPGAs, and other co-processors into everything from desktops to supercomputers. As a consequence, efficiently managing such disparate resources has become increasingly complex. CoreTSAR seeks to reduce this complexity by adaptively worksharing parallel-loop regions across compute resources without requiring any transformation of the code within the loop. Lastly, our results show performance improvements of up to three-fold over a current state-of-the-art heterogeneous task scheduler as well as linear performance scaling from a single GPU to four GPUs for many codes. In addition, CoreTSAR demonstrates a robust ability to adapt to both a variety of workloads and underlying system configurations.
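The adaptive worksharing idea in the abstract above can be illustrated with a minimal sketch. This is not CoreTSAR's actual scheduler: it assumes a simple scheme in which loop iterations are divided among devices in proportion to a measured throughput, and the function name `split_iterations` is hypothetical.

```python
def split_iterations(total_iters, throughputs):
    """Split total_iters across devices in proportion to measured throughput.

    Illustrative assumption only: `throughputs` holds iterations per second
    observed for each device on an earlier pass over the same loop.
    """
    total_rate = sum(throughputs)
    # Round each proportional share down to a whole number of iterations.
    shares = [int(total_iters * rate / total_rate) for rate in throughputs]
    # Give the iterations lost to rounding to the fastest devices first.
    leftover = total_iters - sum(shares)
    fastest_first = sorted(range(len(throughputs)),
                           key=lambda i: throughputs[i], reverse=True)
    for i in fastest_first[:leftover]:
        shares[i] += 1
    return shares


# Hypothetical example: a CPU and a GPU that is three times faster.
print(split_iterations(1000, [1.0, 3.0]))
```

Re-measuring throughput after each pass and calling the function again would let the split follow changes in workload, which is the adaptive behaviour the abstract describes in general terms.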
A hybrid gyrokinetic ion and isothermal electron fluid code for astrophysical plasma

NASA Astrophysics Data System (ADS)

Kawazura, Y.; Barnes, M.

2018-05-01

This paper describes a new code for simulating astrophysical plasmas that solves a hybrid model composed of gyrokinetic ions (GKI) and an isothermal electron fluid (ITEF) (Schekochihin et al. (2009) [9]). This model captures ion kinetic effects that are important near the ion gyro-radius scale while electron kinetic effects are ordered out by an electron-ion mass ratio expansion. The code is developed by incorporating the ITEF approximation into AstroGK, an Eulerian δf gyrokinetics code specialized to a slab geometry (Numata et al. (2010) [41]). The new code treats the linear terms in the ITEF equations implicitly while the nonlinear terms are treated explicitly. We show linear and nonlinear benchmark tests to prove the validity and applicability of the simulation code. Since the fast electron timescale is eliminated by the mass ratio expansion, the Courant-Friedrichs-Lewy condition is much less restrictive than in full gyrokinetic codes; the present hybrid code runs $\sim 2\sqrt{m_i/m_e} \sim 100$ times faster than AstroGK with a single ion species and kinetic electrons, where $m_i/m_e$ is the ion-electron mass ratio. The improvement of the computational time makes it feasible to execute ion scale gyrokinetic simulations with a high velocity space resolution and to run multiple simulations to determine the dependence of turbulent dynamics on parameters such as electron-ion temperature ratio and plasma beta.
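As a quick check of the quoted speed-up, the estimate can be evaluated for a hydrogen plasma. The proton-electron mass ratio used here is an assumption for illustration; the abstract does not state which ion species the figure refers to.

```latex
\[
  \frac{m_i}{m_e} \approx 1836
  \quad\Longrightarrow\quad
  2\sqrt{\frac{m_i}{m_e}} \approx 2 \times 42.85 \approx 86 ,
\]
```

which is consistent with the order-of-magnitude figure of about 100 given in the abstract.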
Integrated Electronic Warfare System Advanced Development Model (ADM); Appendix 1 - Functional Requirement Specification.

DTIC Science & Technology

1977-10-01

APPROVED DATE FUNCTION APPROVED jDATE WRITER J . K-olanek 2/6/76 REVISIONS CHK DESCRIPTION REV CHK DESCRIPTION IREV REVISION jJ ~ ~ ~~~ _ II SHEET NO...DOCUMENT (CDBDD) 45 5.5 COMPUTER PROGRAM PACKAGE (CPP)- j 45 5.6 COMPUTER PROGRAM OPERATOR’S MANUAL (CPOM) 45 5.7 COMPUTER PROGRAM TEST PLAN (CPTPL) 45...I LIST OF FIGURES Number Page 1 JEWS Simplified Block Diagram 4 2 System Controller Architecture 5 SIZE CODE IDENT NO DRAWING NO. A 49956 SCALE REV J
Computer-based coding of free-text job descriptions to efficiently identify occupations in epidemiological studies

PubMed Central

Russ, Daniel E.; Ho, Kwan-Yuet; Colt, Joanne S.; Armenti, Karla R.; Baris, Dalsu; Chow, Wong-Ho; Davis, Faith; Johnson, Alison; Purdue, Mark P.; Karagas, Margaret R.; Schwartz, Kendra; Schwenn, Molly; Silverman, Debra T.; Johnson, Calvin A.; Friesen, Melissa C.

2016-01-01

Background: Mapping job titles to standardized occupation classification (SOC) codes is an important step in identifying occupational risk factors in epidemiologic studies. Because manual coding is time-consuming and has moderate reliability, we developed an algorithm called SOCcer (Standardized Occupation Coding for Computer-assisted Epidemiologic Research) to assign SOC-2010 codes based on free-text job description components. Methods: Job title and task-based classifiers were developed by comparing job descriptions to multiple sources linking job and task descriptions to SOC codes. An industry-based classifier was developed based on the SOC prevalence within an industry. These classifiers were used in a logistic model trained using 14,983 jobs with expert-assigned SOC codes to obtain empirical weights for an algorithm that scored each SOC/job description. We assigned the highest scoring SOC code to each job. SOCcer was validated in two occupational data sources by comparing SOC codes obtained from SOCcer to expert assigned SOC codes and lead exposure estimates obtained by linking SOC codes to a job-exposure matrix. Results: For 11,991 case-control study jobs, SOCcer-assigned codes agreed with 44.5% and 76.3% of manually assigned codes at the 6- and 2-digit level, respectively. Agreement increased with the score, providing a mechanism to identify assignments needing review. Good agreement was observed between lead estimates based on SOCcer and manual SOC assignments (kappa: 0.6–0.8). Poorer performance was observed for inspection job descriptions, which included abbreviations and worksite-specific terminology. Conclusions: Although some manual coding will remain necessary, using SOCcer may improve the efficiency of incorporating occupation into large-scale epidemiologic studies. PMID:27102331
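The agreement figures reported at the 6- and 2-digit levels can be made concrete with a short sketch. This is not SOCcer code: the function name `agreement` and the sample codes are illustrative assumptions, and the sketch only assumes that SOC-2010 codes are written as six digits with a hyphen (for example `11-1011`), so that the first two digits give the broadest grouping.

```python
def agreement(assigned, expert, digits):
    """Fraction of jobs whose two codes match on their first `digits` digits."""
    if len(assigned) != len(expert):
        raise ValueError("code lists must have the same length")

    def prefix(code):
        # Drop the hyphen, then keep the leading digits being compared.
        return code.replace("-", "")[:digits]

    matches = sum(prefix(a) == prefix(e) for a, e in zip(assigned, expert))
    return matches / len(assigned)


# Illustrative data only, not taken from the study.
algorithm_codes = ["11-1011", "11-1021", "29-1141", "47-2061"]
expert_codes = ["11-1011", "11-9199", "29-1141", "53-3032"]

print(agreement(algorithm_codes, expert_codes, 6))  # exact-code agreement
print(agreement(algorithm_codes, expert_codes, 2))  # 2-digit agreement
```

Agreement at the 2-digit level can never be lower than at the 6-digit level, since any pair that matches on all six digits also matches on the first two.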
Data engineering systems: Computerized modeling and data bank capabilities for engineering analysis

NASA Technical Reports Server (NTRS)

Kopp, H.; Trettau, R.; Zolotar, B.

1984-01-01

The Data Engineering System (DES) is a computer-based system that organizes technical data and provides automated mechanisms for storage, retrieval, and engineering analysis. The DES combines the benefits of a structured data base system with automated links to large-scale analysis codes. While the DES provides the user with many of the capabilities of a computer-aided design (CAD) system, the systems are actually quite different in several respects. A typical CAD system emphasizes interactive graphics capabilities and organizes data in a manner that optimizes these graphics. On the other hand, the DES is a computer-aided engineering system intended for the engineer who must operationally understand an existing or planned design or who desires to carry out additional technical analysis based on a particular design. The DES emphasizes data retrieval in a form that not only provides the engineer access to search and display the data but also links the data automatically with the computer analysis codes.
Solving large scale structure in ten easy steps with COLA

NASA Astrophysics Data System (ADS)

Tassev, Svetlin; Zaldarriaga, Matias; Eisenstein, Daniel J.

2013-06-01

We present the COmoving Lagrangian Acceleration (COLA) method: an N-body method for solving for Large Scale Structure (LSS) in a frame that is comoving with observers following trajectories calculated in Lagrangian Perturbation Theory (LPT). Unlike standard N-body methods, the COLA method can straightforwardly trade accuracy at small scales in order to gain computational speed without sacrificing accuracy at large scales. This is especially useful for cheaply generating large ensembles of accurate mock halo catalogs required to study galaxy clustering and weak lensing, as those catalogs are essential for performing detailed error analysis for ongoing and future surveys of LSS. As an illustration, we ran a COLA-based N-body code on a box of size 100 Mpc/h with particles of mass $\approx 5 \times 10^{9}\,M_\odot/h$. Running the code with only 10 timesteps was sufficient to obtain an accurate description of halo statistics down to halo masses of at least $10^{11}\,M_\odot/h$. This is only at a modest speed penalty when compared to mocks obtained with LPT. A standard detailed N-body run is orders of magnitude slower than our COLA-based code. The speed-up we obtain with COLA is due to the fact that we calculate the large-scale dynamics exactly using LPT, while letting the N-body code solve for the small scales, without requiring it to capture exactly the internal dynamics of halos. Achieving a similar level of accuracy in halo statistics without the COLA method requires at least 3 times more timesteps than when COLA is employed.
Rapid solution of large-scale systems of equations

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.

1994-01-01

The analysis and design of complex aerospace structures requires the rapid solution of large systems of linear and nonlinear equations, eigenvalue extraction for buckling, vibration and flutter modes, structural optimization and design sensitivity calculation. Computers with multiple processors and vector capabilities can offer substantial computational advantages over traditional scalar computers for these analyses. These computers fall into two categories: shared memory computers and distributed memory computers. This presentation covers general-purpose, highly efficient algorithms for generation/assembly of element matrices, solution of systems of linear and nonlinear equations, eigenvalue and design sensitivity analysis and optimization. All algorithms are coded in FORTRAN for shared memory computers and many are adapted to distributed memory computers. The capability and numerical performance of these algorithms will be addressed.
Analytical investigation of critical phenomena in MHD power generators

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1980-07-31

Critical phenomena in the Arnold Engineering Development Center (AEDC) High Performance Demonstration Experiment (HPDE) and the US U-25 Experiment are analyzed. Also analyzed are the performance of a NASA-specified 500 MW(th) flow train and computations concerning critical issues for the scale-up of MHD generators. The HPDE is characterized by computational simulations of both the nominal conditions and the conditions during the experimental runs. The steady-state performance is discussed along with the Hall voltage overshoots during the start-up and shutdown transients. The results of simulations of the HPDE runs with codes from the Q3D and TRANSIENT code families are compared to the experimental results. The results of the simulations are in good agreement with the experimental data. Additional critical phenomena analyzed in the AEDC/HPDE are the optimal load schedules, parametric variations, the parametric dependence of the electrode voltage drops, the boundary layer behavior, near electrode phenomena with finite electrode segmentation, and current distribution in the end regions. The US U-25 experiment is characterized by computational simulations of the nominal operating conditions. The steady-state performance for the nominal design of the US U-25 experiment is analyzed, as is the dependence of performance on the mass flow rate. A NASA-specified 500 MW(th) MHD flow train is characterized for computer simulation and the electrical, transport, and thermodynamic properties at the inlet plane are analyzed. Issues for the scale-up of MHD power trains are discussed. The AEDC/HPDE performance is analyzed to compare these experimental results to scale-up rules.
Computation of unsteady transonic aerodynamics with steady state fixed by truncation error injection

NASA Technical Reports Server (NTRS)

Fung, K.-Y.; Fu, J.-K.

1985-01-01

A novel technique is introduced for efficient computations of unsteady transonic aerodynamics. The steady flow corresponding to body shape is maintained by truncation error injection while the perturbed unsteady flows corresponding to unsteady body motions are being computed. This allows the use of different grids comparable to the characteristic length scales of the steady and unsteady flows and, hence, allows efficient computation of the unsteady perturbations. An example of typical unsteady computation of flow over a supercritical airfoil shows that substantial savings in computation time and storage without loss of solution accuracy can easily be achieved. This technique is easy to apply and requires very few changes to existing codes.
On fully three-dimensional resistive wall mode and feedback stabilization computations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Strumberger, E.; Merkel, P.; Sempf, M.

2008-05-15

Resistive walls, located close to the plasma boundary, reduce the growth rates of external kink modes to resistive time scales. For such slowly growing resistive wall modes, the stabilization by an active feedback system becomes feasible. The fully three-dimensional stability code STARWALL and the feedback optimization code OPTIM have been developed [P. Merkel and M. Sempf, 21st IAEA Fusion Energy Conference 2006, Chengdu, China (International Atomic Energy Agency, Vienna, 2006), paper TH/P3-8] to compute the growth rates of resistive wall modes in the presence of nonaxisymmetric, multiply connected wall structures and to model the active feedback stabilization of these modes. In order to demonstrate the capabilities of the codes and to study the effect of the toroidal mode coupling caused by multiply connected wall structures, the codes are applied to test equilibria using the resistive wall structures currently under debate for ITER [M. Shimada et al., Nucl. Fusion 47, S1 (2007)] and ASDEX Upgrade [W. Koeppendoerfer et al., Proceedings of the 16th Symposium on Fusion Technology, London, 1990 (Elsevier, Amsterdam, 1991), Vol. 1, p. 208].
Combining Language Corpora with Experimental and Computational Approaches for Language Acquisition Research

ERIC Educational Resources Information Center

Monaghan, Padraic; Rowland, Caroline F.

2017-01-01

Historically, first language acquisition research was a painstaking process of observation, requiring the laborious hand coding of children's linguistic productions, followed by the generation of abstract theoretical proposals for how the developmental process unfolds. Recently, the ability to collect large-scale corpora of children's language…
Trinity Phase 2 Open Science: CTH

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ruggirello, Kevin Patrick; Vogler, Tracy

CTH is an Eulerian hydrocode developed by Sandia National Laboratories (SNL) to solve a wide range of shock wave propagation and material deformation problems. Adaptive mesh refinement is also used to improve efficiency for problems with a wide range of spatial scales. The code has a history of running on a variety of computing platforms ranging from desktops to massively parallel distributed-data systems. For the Trinity Phase 2 Open Science campaign, CTH was used to study mesoscale simulations of the hypervelocity penetration of granular SiC powders. The simulations were compared to experimental data. A scaling study of CTH up to 8192 KNL nodes was also performed, and several improvements were made to the code to improve the scalability.
Galaxy formation and physical bias

NASA Technical Reports Server (NTRS)

Cen, Renyue; Ostriker, Jeremiah P.

1992-01-01

We have supplemented our code, which computes the evolution of the physical state of a representative piece of the universe, to include not only the dynamics of dark matter (with a standard PM code) and the hydrodynamics of the gaseous component (including detailed collisional and radiative processes), but also galaxy formation on a heuristic but plausible basis. If, within a cell, the gas is Jeans unstable, collapsing, and cooling rapidly, it is transformed to galaxy subunits, which are then followed with a collisionless code. After grouping them into galaxies, we estimate the relative distributions of galaxies and dark matter and the relative velocities of galaxies and dark matter. In a large-scale CDM run of 80/h Mpc size with $8 \times 10^{6}$ cells and dark matter particles, we find that the physical bias b on the 8/h Mpc scale is about 1.6 and increases towards smaller scales, and that the velocity bias is about 0.8 on the same scale. The comparable HDM simulation is highly biased, with b = 2.7 on the 8/h Mpc scale. Implications of these results are discussed in the light of the COBE observations, which provide an accurate normalization for the initial power spectrum. CDM can be ruled out at greater than the 95 percent confidence level on the basis of too large a predicted small-scale velocity dispersion.
AX-GADGET: a new code for cosmological simulations of Fuzzy Dark Matter and Axion models

NASA Astrophysics Data System (ADS)

Nori, Matteo; Baldi, Marco

2018-05-01

We present a new module of the parallel N-Body code P-GADGET3 for cosmological simulations of light bosonic non-thermal dark matter, often referred to as Fuzzy Dark Matter (FDM). The dynamics of the FDM features a highly non-linear Quantum Potential (QP) that suppresses the growth of structures at small scales. Most previous attempts at FDM simulations either evolved suppressed initial conditions, completely neglecting the dynamical effects of the QP throughout cosmic evolution, or resorted to numerically challenging full-wave solvers. The code provides an interesting alternative, following the FDM evolution without impairing the overall performance. This is done by computing the QP acceleration through the Smoothed Particle Hydrodynamics (SPH) routines, with improved schemes to ensure precise and stable derivatives. As an extension of the P-GADGET3 code, it inherits all the additional physics modules implemented to date, opening a wide range of possibilities to constrain FDM models and explore their degeneracies with other physical phenomena. Simulations are compared with analytical predictions and results of other codes, validating the QP as a crucial player in structure formation at small scales.

A class of hybrid finite element methods for electromagnetics: A review

NASA Technical Reports Server (NTRS)

Volakis, J. L.; Chatterjee, A.; Gong, J.

1993-01-01

Integral equation methods have generally been the workhorse for antenna and scattering computations. In the case of antennas, they continue to be the prominent computational approach, but for scattering applications the requirement for large-scale computations has turned researchers' attention to near neighbor methods such as the finite element method, which has low O(N) storage requirements and is readily adaptable in modeling complex geometrical features and material inhomogeneities. In this paper, we review three hybrid finite element methods for simulating composite scatterers, conformal microstrip antennas, and finite periodic arrays. Specifically, we discuss the finite element method and its application to electromagnetic problems when combined with the boundary integral, absorbing boundary conditions, and artificial absorbers for terminating the mesh. Particular attention is given to large-scale simulations, methods, and solvers for achieving low memory requirements and code performance on parallel computing architectures.
Cosmic microwave background bispectrum from recombination.

PubMed

Huang, Zhiqi; Vernizzi, Filippo

2013-03-08

We compute the cosmic microwave background temperature bispectrum generated by nonlinearities at recombination on all scales. We use CosmoLib2nd, a numerical Boltzmann code at second order, to compute cosmic microwave background bispectra on the full sky. We consistently include all effects except gravitational lensing, which can be added to our result using standard methods. The bispectrum is peaked on squeezed triangles and agrees with the analytic approximation in the squeezed limit at the few percent level for all the scales where this is applicable. On smaller scales, we recover previous results on perturbed recombination. For cosmic-variance limited data to $\ell_{\max}=2000$, its signal-to-noise ratio is $S/N=0.47$, corresponding to $f_{\rm NL}^{\rm eff}=-2.79$, and will bias a local signal by $f_{\rm NL}^{\rm loc} \simeq 0.82$.
Ab initio density-functional calculations in materials science: from quasicrystals over microporous catalysts to spintronics.

PubMed

Hafner, Jürgen

2010-09-29

During the last 20 years computer simulations based on a quantum-mechanical description of the interactions between electrons and atomic nuclei have developed an increasingly important impact on materials science, not only in promoting a deeper understanding of the fundamental physical phenomena, but also enabling the computer-assisted design of materials for future technologies. The backbone of atomic-scale computational materials science is density-functional theory (DFT) which allows us to cast the intractable complexity of electron-electron interactions into the form of an effective single-particle equation determined by the exchange-correlation functional. Progress in DFT-based calculations of the properties of materials and of simulations of processes in materials depends on: (1) the development of improved exchange-correlation functionals and advanced post-DFT methods and their implementation in highly efficient computer codes, (2) the development of methods allowing us to bridge the gaps in the temperature, pressure, time and length scales between the ab initio calculations and real-world experiments and (3) the extension of the functionality of these codes, permitting us to treat additional properties and new processes. In this paper we discuss the current status of techniques for performing quantum-based simulations on materials and present some illustrative examples of applications to complex quasiperiodic alloys, cluster-support interactions in microporous acid catalysts and magnetic nanostructures.
Study of the integration of wind tunnel and computational methods for aerodynamic configurations

NASA Technical Reports Server (NTRS)

Browne, Lindsey E.; Ashby, Dale L.

1989-01-01

A study was conducted to determine the effectiveness of using a low-order panel code to estimate wind tunnel wall corrections. The corrections were found by two computations. The first computation included the test model and the surrounding wind tunnel walls, while in the second computation the wind tunnel walls were removed. The difference between the force and moment coefficients obtained by comparing these two cases allowed the determination of the wall corrections. The technique was verified by matching the test-section, wall-pressure signature from a wind tunnel test with the signature predicted by the panel code. To prove the viability of the technique, two cases were considered. The first was a two-dimensional high-lift wing with a flap that was tested in the 7- by 10-foot wind tunnel at NASA Ames Research Center. The second was a 1/32-scale model of the F/A-18 aircraft which was tested in the low-speed wind tunnel at San Diego State University. The panel code used was PMARC (Panel Method Ames Research Center). Results of this study indicate that the proposed wind tunnel wall correction method is comparable to other methods and that it also inherently includes the corrections due to model blockage and wing lift.
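The two-computation procedure described above reduces to a subtraction per coefficient, sketched below. The sign convention (free-air value minus in-tunnel value, added to the measured data), the function names, and all numbers are illustrative assumptions; none are taken from the study or from PMARC.

```python
def wall_corrections(with_walls, free_air):
    """Correction per coefficient: free-air result minus in-tunnel result."""
    return {name: free_air[name] - with_walls[name] for name in with_walls}


def apply_corrections(measured, corrections):
    """Add the computed corrections to coefficients measured in the tunnel."""
    return {name: measured[name] + corrections[name] for name in measured}


# Hypothetical panel-code results for the same model and angle of attack.
with_walls = {"CL": 1.250, "CD": 0.0610, "CM": -0.105}  # model plus tunnel walls
free_air = {"CL": 1.200, "CD": 0.0640, "CM": -0.100}    # walls removed

delta = wall_corrections(with_walls, free_air)

# Hypothetical wind tunnel measurement, corrected to free-air conditions.
measured = {"CL": 1.270, "CD": 0.0600, "CM": -0.108}
corrected = apply_corrections(measured, delta)
print(delta)
print(corrected)
```

Because both computations contain the full model, the difference carries blockage and lift-interference effects together, which matches the abstract's remark that these corrections are inherently included.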
Evolutionary Computation with Spatial Receding Horizon Control to Minimize Network Coding Resources

PubMed Central

Leeson, Mark S.

2014-01-01

The minimization of network coding resources, such as coding nodes and links, is a challenging task, not only because it is an NP-hard problem, but also because the problem scale is huge; for example, networks in the real world may have thousands or even millions of nodes and links. Genetic algorithms (GAs) have good potential for solving NP-hard problems like the network coding problem (NCP), but because they are population-based algorithms, serious scalability and applicability problems often arise when GAs are applied to large- or huge-scale systems. Inspired by the temporal receding horizon control in control engineering, this paper proposes a novel spatial receding horizon control (SRHC) strategy as a network partitioning technology, and then designs an efficient GA to tackle the NCP. Traditional network partitioning methods can be viewed as a special case of the proposed SRHC, that is, one-step-wide SRHC, whilst the method in this paper is a generalized N-step-wide SRHC, which can make better use of global information on network topologies. Besides the SRHC strategy, some useful designs are also reported in this paper. The advantages of the proposed SRHC and GA for the NCP are illustrated by extensive experiments, and they have good potential to be extended to other large-scale complex problems. PMID:24883371
Particle model of a cylindrical inductively coupled ion source

NASA Astrophysics Data System (ADS)

Ippolito, N. D.; Taccogna, F.; Minelli, P.; Cavenago, M.; Veltri, P.

2017-08-01

In spite of the wide use of RF sources, a complete understanding of the mechanisms regulating the RF-coupling of the plasma is still lacking, so self-consistent simulations of the involved physics are highly desirable. For this reason we are developing a 2.5D fully kinetic Particle-In-Cell Monte-Carlo-Collision (PIC-MCC) model of a cylindrical ICP-RF source, keeping the time step of the simulation small enough to resolve the plasma frequency scale. The grid cell dimension is currently about seven times larger than the average Debye length, because of the large computational demand of the code. It will be scaled down in the next phase of the development of the code. The filling gas is xenon, in order to minimize the time lost by the MCC collision module in the first stage of development of the code. The results presented here are preliminary, with the code already showing good robustness. The final goal will be the modeling of the NIO1 (Negative Ion Optimization phase 1) source, operating in Padua at Consorzio RFX.
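The two scales mentioned in this abstract, the Debye length and the plasma frequency, follow from standard formulas, and a short sketch shows how a cell size and time step could be derived from them. The density and temperature below are hypothetical placeholders, not values from the paper, and the factor 0.2 for the time step is an assumed rule of thumb rather than the authors' setting.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
Q_E = 1.602176634e-19     # elementary charge, C
M_E = 9.1093837015e-31    # electron mass, kg


def debye_length(n_e: float, t_e_ev: float) -> float:
    """Electron Debye length in m, for density n_e (m^-3) and temperature t_e_ev (eV)."""
    return math.sqrt(EPS0 * t_e_ev * Q_E / (n_e * Q_E**2))


def plasma_frequency(n_e: float) -> float:
    """Electron plasma angular frequency in rad/s, for density n_e (m^-3)."""
    return math.sqrt(n_e * Q_E**2 / (EPS0 * M_E))


if __name__ == "__main__":
    n_e, t_e_ev = 1.0e17, 5.0             # hypothetical plasma parameters
    lam = debye_length(n_e, t_e_ev)
    omega = plasma_frequency(n_e)
    cell = 7.0 * lam                      # "about seven times" the Debye length, as in the abstract
    dt = 0.2 / omega                      # assumed rule of thumb: omega_p * dt = 0.2
    print(f"Debye length     : {lam:.3e} m")
    print(f"Cell size (7x)   : {cell:.3e} m")
    print(f"Plasma frequency : {omega:.3e} rad/s")
    print(f"Time step        : {dt:.3e} s")
```

Shrinking the cell toward one Debye length multiplies the number of cells, which is consistent with the abstract's remark that the coarse grid is a concession to computational cost.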
New class of photonic quantum error correction codes

NASA Astrophysics Data System (ADS)

Silveri, Matti; Michael, Marios; Brierley, R. T.; Salmilehto, Juha; Albert, Victor V.; Jiang, Liang; Girvin, S. M.

We present a new class of quantum error correction codes for applications in quantum memories, communication and scalable computation. These codes are constructed from a finite superposition of Fock states and can exactly correct errors that are polynomial up to a specified degree in creation and destruction operators. Equivalently, they can perform approximate quantum error correction to any given order in time step for the continuous-time dissipative evolution under these errors. The codes are related to two-mode photonic codes but offer the advantage of requiring only a single photon mode to correct loss (amplitude damping), as well as the ability to correct other errors, e.g. dephasing. Our codes are also similar in spirit to photonic "cat codes" but have several advantages including smaller mean occupation number and exact rather than approximate orthogonality of the code words. We analyze how the rate of uncorrectable errors scales with the code complexity and discuss the unitary control for the recovery process. These codes are realizable with current superconducting qubit technology and can increase the fidelity of photonic quantum communication and memories.
Large-scale three-dimensional phase-field simulations for phase coarsening at ultrahigh volume fraction on high-performance architectures

NASA Astrophysics Data System (ADS)

Yan, Hui; Wang, K. G.; Jones, Jim E.

2016-06-01

A parallel algorithm for large-scale three-dimensional phase-field simulations of phase coarsening is developed and implemented on high-performance architectures. From the large-scale simulations, a new kinetics in phase coarsening in the region of ultrahigh volume fraction is found. The parallel implementation is capable of harnessing the greater computer power available from high-performance architectures. The parallelized code enables an increase in three-dimensional simulation system size up to a 512^3 grid cube. Through the parallelized code, practical runtime can be achieved for three-dimensional large-scale simulations, and the statistical significance of the results from these high-resolution parallel simulations is greatly improved over that obtainable from serial simulations. A detailed performance analysis on speed-up and scalability is presented, showing good scalability which improves with increasing problem size. In addition, a model for prediction of runtime is developed, which shows good agreement with actual runtime from numerical tests.
Application of the TEMPEST computer code to canister-filling heat transfer problems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Farnsworth, R.K.; Faletti, D.W.; Budden, M.J.

Pacific Northwest Laboratory (PNL) researchers used the TEMPEST computer code to simulate thermal cooldown behavior of nuclear waste glass after it was poured into steel canisters for long-term storage. The objective of this work was to determine the accuracy and applicability of the TEMPEST code when used to compute canister thermal histories. First, experimental data were obtained to provide the basis for comparing TEMPEST-generated predictions. Five canisters were instrumented with appropriately located radial and axial thermocouples. The canisters were filled using the pilot-scale ceramic melter (PSCM) at PNL. Each canister was filled in either a continuous or a batch filling mode. One of the canisters was also filled within a turntable simulant (a group of cylindrical shells with heat transfer resistances similar to those in an actual melter turntable). This was necessary to provide a basis for assessing the ability of the TEMPEST code to also model the transient cooling of canisters in a melter turntable. The continuous-fill model, Version M, was found to predict temperatures with more accuracy. The turntable simulant experiment demonstrated that TEMPEST can adequately model the asymmetric temperature field caused by the turntable geometry. Further, TEMPEST can acceptably predict the canister cooling history within a turntable, despite code limitations in computing simultaneous radiation and convection heat transfer between shells, along with uncertainty in stainless-steel surface emissivities. Based on the successful performance of TEMPEST Version M, development was initiated to incorporate 1) full viscous glass convection, 2) a dynamically adaptive grid that automatically follows the glass/air interface throughout the transient, and 3) a full enclosure radiation model to allow radiation heat transfer to non-nearest neighbor cells. 5 refs., 47 figs., 17 tabs.
Development of a Renormalization Group Approach to Multi-Scale Plasma Physics Computation

DTIC Science & Technology

2012-03-28

Final Technical Report (STTR Phase II), from 29-12-2008 to 16-95-2011: DEVELOPMENT OF A RENORMALIZATION GROUP APPROACH TO MULTI-SCALE
Hybrid Parallelization of Adaptive MHD-Kinetic Module in Multi-Scale Fluid-Kinetic Simulation Suite

DOE PAGES

Borovikov, Sergey; Heerikhuisen, Jacob; Pogorelov, Nikolai

2013-04-01

The Multi-Scale Fluid-Kinetic Simulation Suite has a computational tool set for solving partially ionized flows. In this paper we focus on recent developments of the kinetic module which solves the Boltzmann equation using the Monte-Carlo method. The module has been recently redesigned to utilize intra-node hybrid parallelization. We describe in detail the redesign process, implementation issues, and modifications made to the code. Finally, we conduct a performance analysis.
Modeling of pulsed propellant reorientation

NASA Technical Reports Server (NTRS)

Patag, A. E.; Hochstein, J. I.; Chato, D. J.

1989-01-01

Optimization of the propellant reorientation process can provide increased payload capability and extend the service life of spacecraft. The use of pulsed propellant reorientation to optimize the reorientation process is proposed. The ECLIPSE code was validated for modeling the reorientation process and is used to study pulsed reorientation in small-scale and full-scale propellant tanks. A dimensional analysis of the process is performed and the resulting dimensionless groups are used to present and correlate the computational predictions for reorientation performance.
Making extreme computations possible with virtual machines

NASA Astrophysics Data System (ADS)

Reuter, J.; Chokoufe Nejad, B.; Ohl, T.

2016-10-01

State-of-the-art algorithms generate scattering amplitudes for high-energy physics at leading order for high-multiplicity processes as compiled code (in Fortran, C or C++). For complicated processes the size of these libraries can become tremendous (many GiB). We show that amplitudes can be translated to byte-code instructions, which reduces the size by as much as one order of magnitude. The byte-code is interpreted by a Virtual Machine with runtimes comparable to compiled code and a better scaling with additional legs. We study the properties of this algorithm, as an extension of the Optimizing Matrix Element Generator (O'Mega). The bytecode matrix elements are available as alternative input for the event generator WHIZARD. The bytecode interpreter can be implemented very compactly, which will help with a future implementation on massively parallel GPUs.
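To make the idea of interpreting arithmetic as byte-code concrete, here is a minimal stack-machine sketch. The instruction set (`LOAD`, `CONST`, `ADD`, `MUL`) and the `run` function are invented for illustration; they are not the O'Mega or WHIZARD byte-code format, which the abstract does not describe.

```python
from enum import IntEnum
from typing import Sequence, Tuple


class Op(IntEnum):
    """Hypothetical opcodes for a toy stack machine."""
    LOAD = 0    # push inputs[arg]
    CONST = 1   # push constants[arg]
    ADD = 2     # pop two values, push their sum
    MUL = 3     # pop two values, push their product


def run(bytecode: Sequence[Tuple[Op, int]],
        inputs: Sequence[complex],
        constants: Sequence[complex]) -> complex:
    """Interpret the instruction list and return the single value left on the stack."""
    stack = []
    for op, arg in bytecode:
        if op == Op.LOAD:
            stack.append(inputs[arg])
        elif op == Op.CONST:
            stack.append(constants[arg])
        elif op == Op.ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == Op.MUL:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    if len(stack) != 1:
        raise ValueError("malformed program: stack should hold exactly one value")
    return stack[0]


# Program for the expression x0 * x1 + c0 * x0 (arg is ignored by ADD and MUL).
PROGRAM = [
    (Op.LOAD, 0), (Op.LOAD, 1), (Op.MUL, 0),
    (Op.CONST, 0), (Op.LOAD, 0), (Op.MUL, 0),
    (Op.ADD, 0),
]

if __name__ == "__main__":
    print(run(PROGRAM, inputs=[2 + 1j, 3 - 1j], constants=[0.5]))
```

The interpreter loop stays the same size however long the program is, so only the instruction list grows with the expression. That is the intuition behind trading large compiled libraries for compact byte-code.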
Development of small scale cluster computer for numerical analysis

NASA Astrophysics Data System (ADS)

Zulkifli, N. H. N.; Sapit, A.; Mohammed, A. N.

2017-09-01

In this study, two personal computers were successfully networked together to form a small-scale cluster. Each computer has a multicore processor with four cores, giving the cluster eight processors in total. The cluster runs Ubuntu 14.04 Linux with an MPI implementation (MPICH2). Two main tests were conducted on the cluster: a communication test and a performance test. The communication test, which used a simple MPI Hello program written in C, was done to make sure that the computers are able to pass the required information without any problem. In addition, a performance test was done to show that the cluster's calculation performance is much better than that of a single-CPU computer. In this performance test, the same code was run four times, using a single node, 2 processors, 4 processors, and 8 processors. The results show that with additional processors, the time required to solve the problem decreases. The time required for the calculation is halved when the number of processors is doubled. To conclude, we successfully developed a small-scale cluster computer from common hardware that is capable of higher computing power than a single-CPU computer, and this can be beneficial for research that requires high computing power, especially numerical analysis such as finite element analysis, computational fluid dynamics, and computational physics analysis.
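The scaling claim in this abstract (runtime halves when the processor count doubles) corresponds to ideal speedup and a parallel efficiency of 1. A short sketch of the standard definitions follows; the timings are made-up placeholders chosen to show ideal halving, not measurements from the study.

```python
from typing import Dict


def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S = T(1) / T(p)."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, n_procs: int) -> float:
    """Parallel efficiency E = S / p; 1.0 means ideal scaling."""
    return speedup(t_serial, t_parallel) / n_procs


def scaling_table(timings: Dict[int, float]) -> Dict[int, Dict[str, float]]:
    """Build speedup and efficiency for each processor count; needs a 1-processor entry."""
    t1 = timings[1]
    return {
        p: {"speedup": speedup(t1, t), "efficiency": efficiency(t1, t, p)}
        for p, t in sorted(timings.items())
    }


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds for 1, 2, 4 and 8 processors.
    placeholder_timings = {1: 80.0, 2: 40.0, 4: 20.0, 8: 10.0}
    for procs, row in scaling_table(placeholder_timings).items():
        print(procs, row)
```

In practice, communication between the two machines would be expected to pull efficiency below 1 at higher processor counts, so a table like this is a quick way to see where scaling departs from the ideal.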
Computational study of 3-D hot-spot initiation in shocked insensitive high-explosive

NASA Astrophysics Data System (ADS)

Najjar, F. M.; Howard, W. M.; Fried, L. E.; Manaa, M. R.; Nichols, A., III; Levesque, G.

2012-03-01

High-explosive (HE) material consists of large-sized grains with micron-sized embedded impurities and pores. Under various mechanical/thermal insults, these pores collapse, generating high-temperature regions leading to ignition. A hydrodynamic study has been performed to investigate the mechanisms of pore collapse and hot spot initiation in TATB crystals, employing a multiphysics code, ALE3D, coupled to the chemistry module, Cheetah. This computational study includes reactive dynamics. Two-dimensional high-resolution large-scale meso-scale simulations have been performed. The parameter space is systematically studied by considering various shock strengths, pore diameters and multiple pore configurations. Preliminary 3-D simulations are undertaken to quantify the 3-D dynamics.
Warp-X: A new exascale computing platform for beam–plasma simulations

DOE PAGES

Vay, J. -L.; Almgren, A.; Bell, J.; ...

2018-01-31

Turning the current experimental plasma accelerator state-of-the-art from a promising technology into mainstream scientific tools depends critically on high-performance, high-fidelity modeling of complex processes that develop over a wide range of space and time scales. As part of the U.S. Department of Energy's Exascale Computing Project, a team from Lawrence Berkeley National Laboratory, in collaboration with teams from SLAC National Accelerator Laboratory and Lawrence Livermore National Laboratory, is developing a new plasma accelerator simulation tool that will harness the power of future exascale supercomputers for high-performance modeling of plasma accelerators. We present the various components of the codes, such as the new Particle-In-Cell Scalable Application Resource (PICSAR) and the redesigned adaptive mesh refinement library AMReX, which are combined with redesigned elements of the Warp code, in the new WarpX software. Lastly, the code structure, status, early examples of applications and plans are discussed.
Warp-X: A new exascale computing platform for beam–plasma simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vay, J. -L.; Almgren, A.; Bell, J.

Turning the current experimental plasma accelerator state-of-the-art from a promising technology into mainstream scientific tools depends critically on high-performance, high-fidelity modeling of complex processes that develop over a wide range of space and time scales. As part of the U.S. Department of Energy's Exascale Computing Project, a team from Lawrence Berkeley National Laboratory, in collaboration with teams from SLAC National Accelerator Laboratory and Lawrence Livermore National Laboratory, is developing a new plasma accelerator simulation tool that will harness the power of future exascale supercomputers for high-performance modeling of plasma accelerators. We present the various components of the codes, such as the new Particle-In-Cell Scalable Application Resource (PICSAR) and the redesigned adaptive mesh refinement library AMReX, which are combined with redesigned elements of the Warp code, in the new WarpX software. Lastly, the code structure, status, early examples of applications and plans are discussed.
High Performance Radiation Transport Simulations on TITAN

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, Christopher G; Davidson, Gregory G; Evans, Thomas M

2012-01-01

In this paper we describe the Denovo code system. Denovo solves the six-dimensional, steady-state, linear Boltzmann transport equation, of central importance to nuclear technology applications such as reactor core analysis (neutronics), radiation shielding, nuclear forensics and radiation detection. The code features multiple spatial differencing schemes, state-of-the-art linear solvers, the Koch-Baker-Alcouffe (KBA) parallel-wavefront sweep algorithm for inverting the transport operator, a new multilevel energy decomposition method scaling to hundreds of thousands of processing cores, and a modern, novel code architecture that supports straightforward integration of new features. In this paper we discuss the performance of Denovo on the 10-20 petaflop ORNL GPU-based system, Titan. We describe algorithms and techniques used to exploit the capabilities of Titan's heterogeneous compute node architecture and the challenges of obtaining good parallel performance for this sparse hyperbolic PDE solver containing inherently sequential computations. Numerical results demonstrating Denovo performance on early Titan hardware are presented.
Large eddy simulation of fine water sprays: comparative analysis of two models and computer codes

NASA Astrophysics Data System (ADS)

Tsoy, A. S.; Snegirev, A. Yu.

2015-09-01

The model and the computer code FDS, albeit widely used in engineering practice to predict fire development, are not sufficiently validated for fire suppression by fine water sprays. In this work, the effect of numerical resolution of the large-scale turbulent pulsations on the accuracy of predicted time-averaged spray parameters is evaluated. Comparison of the simulation results obtained with the two versions of the model and code, as well as of the predicted and measured radial distributions of the liquid flow rate, revealed the need to apply monotonic and yet sufficiently accurate discrete approximations of the convective terms. Failure to do so delays jet break-up, otherwise induced by large turbulent eddies, and thereby excessively focuses the predicted flow around its axis. The effect of the pressure drop in the spray nozzle is also examined, and its increase is shown to cause only a weak increase in the evaporated fraction and vapor concentration despite the significant increase in flow velocity.
Multi-scale simulations of space problems with iPIC3D

NASA Astrophysics Data System (ADS)

Lapenta, Giovanni; Bettarini, Lapo; Markidis, Stefano

The implicit Particle-in-Cell method for the computer simulation of space plasma, and its implementation in a three-dimensional parallel code, called iPIC3D, are presented. The implicit integration in time of the Vlasov-Maxwell system removes the numerical stability constraints and enables kinetic plasma simulations at magnetohydrodynamics scales. Simulations of magnetic reconnection in plasma are presented to show the effectiveness of the algorithm. In particular we will show a number of simulations done for large-scale 3D systems using the physical mass ratio for hydrogen. Most notably, one simulation treats kinetically a box of tens of Earth radii in each direction and was conducted using about 16000 processors of the Pleiades NASA computer. The work is conducted in collaboration with the MMS-IDS theory team from University of Colorado (M. Goldman, D. Newman and L. Andersson). Reference: Stefano Markidis, Giovanni Lapenta, Rizwan-uddin, "Multi-scale simulations of plasma with iPIC3D," Mathematics and Computers in Simulation, available online 17 October 2009, http://dx.doi.org/10.1016/j.matcom.2009.08.038

Composite load spectra for select space propulsion structural components

NASA Technical Reports Server (NTRS)

Newell, James F.; Ho, Hing W.

1991-01-01

This report summarizes the development of: (1) correlation fields; (2) applications to liquid oxygen post; (3) models for pressure fluctuations and vibration load fluctuations; (4) additions to expert systems; and (5) scaling criteria. Implementation in computer code is also described. Demonstration sample cases are included with additional applications to engine duct and pipe bend.
Adaptive Encoding for Numerical Data Compression.

ERIC Educational Resources Information Center

Yokoo, Hidetoshi

1994-01-01

Discusses the adaptive compression of computer files of numerical data whose statistical properties are not given in advance. A new lossless coding method for this purpose, which utilizes Adelson-Velskii and Landis (AVL) trees, is proposed. The method is effective for any word length. Its application to the lossless compression of gray-scale images…
Performance of the fusion code GYRO on four generations of Cray computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fahey, Mark R

2014-01-01

GYRO is a code used for the direct numerical simulation of plasma microturbulence. It has been ported to a variety of modern MPP platforms including several modern commodity clusters, IBM SPs, and Cray XC, XT, and XE series machines. We briefly describe the mathematical structure of the equations, the data layout, and the redistribution scheme. Also, while the performance and scaling of GYRO on many of these systems has been shown before, here we show the comparative performance and scaling on four generations of Cray supercomputers including the newest addition, the Cray XC30. The more recently added hybrid OpenMP/MPI implementation also shows a great deal of promise on custom HPC systems that utilize fast CPUs and proprietary interconnects. Four machines of varying sizes were used in the experiment, all of which are located at the National Institute for Computational Sciences at the University of Tennessee at Knoxville and Oak Ridge National Laboratory. The advantages, limitations, and performance of using each system are discussed.
Solving large scale structure in ten easy steps with COLA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tassev, Svetlin; Zaldarriaga, Matias; Eisenstein, Daniel J.

2013-06-01

We present the COmoving Lagrangian Acceleration (COLA) method: an N-body method for solving for Large Scale Structure (LSS) in a frame that is comoving with observers following trajectories calculated in Lagrangian Perturbation Theory (LPT). Unlike standard N-body methods, the COLA method can straightforwardly trade accuracy at small scales in order to gain computational speed without sacrificing accuracy at large scales. This is especially useful for cheaply generating large ensembles of accurate mock halo catalogs required to study galaxy clustering and weak lensing, as those catalogs are essential for performing detailed error analysis for ongoing and future surveys of LSS. As an illustration, we ran a COLA-based N-body code on a box of size 100 Mpc/h with particles of mass ≈ 5 × 10^9 M_sun/h. Running the code with only 10 timesteps was sufficient to obtain an accurate description of halo statistics down to halo masses of at least 10^11 M_sun/h. This is only at a modest speed penalty when compared to mocks obtained with LPT. A standard detailed N-body run is orders of magnitude slower than our COLA-based code. The speed-up we obtain with COLA is due to the fact that we calculate the large-scale dynamics exactly using LPT, while letting the N-body code solve for the small scales, without requiring it to capture exactly the internal dynamics of halos. Achieving a similar level of accuracy in halo statistics without the COLA method requires at least 3 times more timesteps than when COLA is employed.
A computer program incorporating Pitzer's equations for calculation of geochemical reactions in brines

USGS Publications Warehouse

Plummer, Niel; Parkhurst, D.L.; Fleming, G.W.; Dunkle, S.A.

1988-01-01

The program named PHRQPITZ is a computer code capable of making geochemical calculations in brines and other electrolyte solutions to high concentrations using the Pitzer virial-coefficient approach for activity-coefficient corrections. Reaction-modeling capabilities include calculation of (1) aqueous speciation and mineral-saturation index, (2) mineral solubility, (3) mixing and titration of aqueous solutions, (4) irreversible reactions and mineral-water mass transfer, and (5) reaction path. The computed results for each aqueous solution include the osmotic coefficient, water activity, mineral saturation indices, mean activity coefficients, total activity coefficients, and scale-dependent values of pH, individual-ion activities, and individual-ion activity coefficients. A data base of Pitzer interaction parameters is provided at 25 °C for the system Na-K-Mg-Ca-H-Cl-SO4-OH-HCO3-CO3-CO2-H2O, and extended to include largely untested literature data for Fe(II), Mn(II), Sr, Ba, Li, and Br, with provision for calculations at temperatures other than 25 °C. An extensive literature review of published Pitzer interaction parameters for many inorganic salts is given. Also described is an interactive input code for PHRQPITZ called PITZINPT. (USGS)
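One of the outputs listed in this abstract, the mineral saturation index, has a standard definition: SI = log10(IAP / K), where IAP is the ion activity product and K the equilibrium constant of the dissolution reaction. The sketch below computes it from activities that are assumed to be known already; it does not implement the Pitzer equations, it is not PHRQPITZ code, and every numerical value is an illustrative placeholder.

```python
import math
from typing import Mapping


def log_ion_activity_product(activities: Mapping[str, float],
                             stoichiometry: Mapping[str, float]) -> float:
    """log10(IAP) = sum over species of nu_i * log10(a_i)."""
    return sum(nu * math.log10(activities[species])
               for species, nu in stoichiometry.items())


def saturation_index(log_iap: float, log_k: float) -> float:
    """SI = log10(IAP) - log10(K); SI > 0 supersaturated, SI < 0 undersaturated."""
    return log_iap - log_k


if __name__ == "__main__":
    # Hypothetical activities for a 1:1 carbonate mineral, MCO3 = M(2+) + CO3(2-).
    example_activities = {"M+2": 1.0e-4, "CO3-2": 1.0e-4}
    example_stoichiometry = {"M+2": 1.0, "CO3-2": 1.0}
    example_log_k = -8.48  # placeholder equilibrium constant for illustration
    log_iap = log_ion_activity_product(example_activities, example_stoichiometry)
    print(f"log IAP = {log_iap:.2f}, SI = {saturation_index(log_iap, example_log_k):.2f}")
```

In a brine the difficult step is obtaining the activities themselves, since each is a molality times an activity coefficient. That coefficient is what the Pitzer approach named in the abstract supplies.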
Progress in Computational Simulation of Earthquakes

NASA Technical Reports Server (NTRS)

Donnellan, Andrea; Parker, Jay; Lyzenga, Gregory; Judd, Michele; Li, P. Peggy; Norton, Charles; Tisdale, Edwin; Granat, Robert

2006-01-01

GeoFEST(P) is a computer program written for use in the QuakeSim project, which is devoted to development and improvement of means of computational simulation of earthquakes. GeoFEST(P) models interacting earthquake fault systems from the fault-nucleation to the tectonic scale. The development of GeoFEST(P) has involved coupling of two programs: GeoFEST and the Pyramid Adaptive Mesh Refinement Library. GeoFEST is a message-passing-interface-parallel code that utilizes a finite-element technique to simulate evolution of stress, fault slip, and plastic/elastic deformation in realistic materials like those of faulted regions of the crust of the Earth. The products of such simulations are synthetic observable time-dependent surface deformations on time scales from days to decades. The Pyramid Adaptive Mesh Refinement Library is a software library that facilitates the generation of computational meshes for solving physical problems. In an application of GeoFEST(P), a computational grid can be dynamically adapted as stress grows on a fault. Simulations on workstations using a few tens of thousands of stress and displacement finite elements can now be expanded to multiple millions of elements with greater than 98-percent scaled efficiency on many hundreds of parallel processors (see figure).
Energy and technology review: Engineering modeling

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cabayan, H.S.; Goudreau, G.L.; Ziolkowski, R.W.

1986-10-01

This report presents information concerning: Modeling Canonical Problems in Electromagnetic Coupling Through Apertures; Finite-Element Codes for Computing Electrostatic Fields; Finite-Element Modeling of Electromagnetic Phenomena; Modeling Microwave-Pulse Compression in a Resonant Cavity; Lagrangian Finite-Element Analysis of Penetration Mechanics; Crashworthiness Engineering; Computer Modeling of Metal-Forming Processes; Thermal-Mechanical Modeling of Tungsten Arc Welding; Modeling Air Breakdown Induced by Electromagnetic Fields; Iterative Techniques for Solving Boltzmann's Equations for p-Type Semiconductors; Semiconductor Modeling; and Improved Numerical-Solution Techniques in Large-Scale Stress Analysis.
Eddylicious: A Python package for turbulent inflow generation

NASA Astrophysics Data System (ADS)

Mukha, Timofey; Liefvendahl, Mattias

2018-01-01

A Python package for generating inflow for scale-resolving computer simulations of turbulent flow is presented. The purpose of the package is to unite existing inflow generation methods in a single code-base and make them accessible to users of various Computational Fluid Dynamics (CFD) solvers. The currently existing functionality consists of an accurate inflow generation method suitable for flows with a turbulent boundary layer inflow and input/output routines for coupling with the open-source CFD solver OpenFOAM.
Comparison Between Simulated and Experimentally Measured Performance of a Four Port Wave Rotor

NASA Technical Reports Server (NTRS)

Paxson, Daniel E.; Wilson, Jack; Welch, Gerard E.

2007-01-01

Performance and operability testing has been completed on a laboratory-scale, four-port wave rotor, of the type suitable for use as a topping cycle on a gas turbine engine. Many design aspects and performance estimates for the wave rotor were determined using a time-accurate, one-dimensional, computational fluid dynamics-based simulation code developed specifically for wave rotors. The code follows a single rotor passage as it moves past the various ports, which in this reference frame become boundary conditions. This paper compares wave rotor performance predicted with the code to that measured during laboratory testing. Both on- and off-design operating conditions were examined. Overall, the match between code and rig was found to be quite good. At operating points where there were disparities, the assumption of larger than expected internal leakage rates successfully realigned code predictions and laboratory measurements. Possible mechanisms for such leakage rates are discussed.
From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation

DOE PAGES

Blazewicz, Marek; Hinder, Ian; Koppelman, David M.; ...

2013-01-01

Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient manner for complex applications, without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization is based on higher-order finite differences on multi-block domains. Chemora's capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms.
Design Considerations of a Virtual Laboratory for Advanced X-ray Sources

NASA Astrophysics Data System (ADS)

Luginsland, J. W.; Frese, M. H.; Frese, S. D.; Watrous, J. J.; Heileman, G. L.

2004-11-01

The field of scientific computation has greatly advanced in the last few years, resulting in the ability to perform complex computer simulations that can predict the performance of real-world experiments in a number of fields of study. Among the forces driving this new computational capability is the advent of parallel algorithms, allowing calculations in three-dimensional space with realistic time scales. Electromagnetic radiation sources driven by high-voltage, high-current electron beams offer an area to further push the state-of-the-art in high fidelity, first-principles simulation tools. The physics of these x-ray sources combines kinetic plasma physics (electron beams) with dense fluid-like plasma physics (anode plasmas) and x-ray generation (bremsstrahlung). There are a number of mature techniques and software packages for dealing with the individual aspects of these sources, such as Particle-In-Cell (PIC), Magneto-Hydrodynamics (MHD), and radiation transport codes. The current effort is focused on developing an object-oriented software environment using the Rational© Unified Process and the Unified Modeling Language (UML) to provide a framework where multiple 3D parallel physics packages, such as a PIC code (ICEPIC), an MHD code (MACH), and an x-ray transport code (ITS), can co-exist in a system-of-systems approach to modeling advanced x-ray sources. Initial software design and assessments of the various physics algorithms' fidelity will be presented.
High-dimensional free-space optical communications based on orbital angular momentum coding

NASA Astrophysics Data System (ADS)

Zou, Li; Gu, Xiaofan; Wang, Le

2018-03-01

In this paper, we propose a high-dimensional free-space optical communication scheme using orbital angular momentum (OAM) coding. In the scheme, the transmitter encodes N bits of information by using a spatial light modulator to convert a Gaussian beam into a superposition of N OAM modes and a Gaussian mode; the receiver decodes the information through an OAM mode analyser, which consists of an MZ interferometer with a rotating Dove prism, a photoelectric detector, and a computer carrying out the fast Fourier transform. The scheme could realize high-dimensional free-space optical communication and decodes the information quickly and accurately. We have verified the feasibility of the scheme by exploiting 8 (4) OAM modes and a Gaussian mode to implement 256-ary (16-ary) coding free-space optical communication to transmit a 256-gray-scale (16-gray-scale) picture. The results show that a zero bit error rate has been achieved.
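The counting behind the 256-ary and 16-ary figures can be illustrated with a small sketch: if each of N OAM modes is either present or absent in the transmitted superposition, N modes carry N bits, giving 2^N symbols. The bit-to-mode assignment below (bit k maps to mode index k + 1) is an assumption made for illustration; the abstract does not specify the mapping. The Gaussian component and the interferometric analyser are not modeled.

```python
from typing import FrozenSet, Iterable


def encode(value: int, n_modes: int = 8) -> FrozenSet[int]:
    """Return the set of OAM mode indices switched on for a symbol value.

    Assumed mapping: bit k of `value` controls mode index k + 1.
    """
    if not 0 <= value < 2 ** n_modes:
        raise ValueError(f"value must be in [0, {2 ** n_modes - 1}]")
    return frozenset(k + 1 for k in range(n_modes) if (value >> k) & 1)


def decode(detected_modes: Iterable[int]) -> int:
    """Recover the symbol value from the set of detected mode indices."""
    return sum(1 << (mode - 1) for mode in set(detected_modes))


if __name__ == "__main__":
    gray_level = 165  # binary 10100101, one pixel of a 256-gray-scale picture
    modes = encode(gray_level)
    print(sorted(modes), decode(modes))
```

With `n_modes=4` the same functions cover the 16-ary case, where each symbol carries one 16-gray-scale pixel.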
JDFTx: Software for joint density-functional theory

DOE PAGES

Sundararaman, Ravishankar; Letchworth-Weaver, Kendra; Schwarz, Kathleen A.; ...

2017-11-14

Density-functional theory (DFT) has revolutionized computational prediction of atomic-scale properties from first principles in physics, chemistry and materials science. Continuing development of new methods is necessary for accurate predictions of new classes of materials and properties, and for connecting to nano- and mesoscale properties using coarse-grained theories. JDFTx is a fully-featured open-source electronic DFT software designed specifically to facilitate rapid development of new theories, models and algorithms. Using an algebraic formulation as an abstraction layer, compact C++11 code automatically performs well on diverse hardware including GPUs (Graphics Processing Units). This code hosts the development of joint density-functional theory (JDFT) that combines electronic DFT with classical DFT and continuum models of liquids for first-principles calculations of solvated and electrochemical systems. In addition, the modular nature of the code makes it easy to extend and interface with, facilitating the development of multi-scale toolkits that connect to ab initio calculations, e.g. photo-excited carrier dynamics combining electron and phonon calculations with electromagnetic simulations.
JDFTx: Software for joint density-functional theory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sundararaman, Ravishankar; Letchworth-Weaver, Kendra; Schwarz, Kathleen A.

Density-functional theory (DFT) has revolutionized computational prediction of atomic-scale properties from first principles in physics, chemistry and materials science. Continuing development of new methods is necessary for accurate predictions of new classes of materials and properties, and for connecting to nano- and mesoscale properties using coarse-grained theories. JDFTx is a fully-featured open-source electronic DFT software designed specifically to facilitate rapid development of new theories, models and algorithms. Using an algebraic formulation as an abstraction layer, compact C++11 code automatically performs well on diverse hardware including GPUs (Graphics Processing Units). This code hosts the development of joint density-functional theory (JDFT) that combines electronic DFT with classical DFT and continuum models of liquids for first-principles calculations of solvated and electrochemical systems. In addition, the modular nature of the code makes it easy to extend and interface with, facilitating the development of multi-scale toolkits that connect to ab initio calculations, e.g. photo-excited carrier dynamics combining electron and phonon calculations with electromagnetic simulations.
Scaling analysis applied to the NORVEX code development and thermal energy flight experiment

NASA Technical Reports Server (NTRS)

Skarda, J. Raymond Lee; Namkoong, David; Darling, Douglas

1991-01-01

A scaling analysis is used to study the dominant flow processes that occur in molten phase change material (PCM) under 1 g and microgravity conditions. Results of the scaling analysis are applied to the development of the NORVEX (NASA Oak Ridge Void Experiment) computer program and the preparation of the Thermal Energy Storage (TES) flight experiment. The NORVEX computer program, which is being developed to predict melting and freezing of the PCM with void formation in a 1 g or microgravity environment, is described. NORVEX predictions are compared with the scaling and similarity results. The approach to be used to validate NORVEX with TES flight data is also discussed. Similarity and scaling show that the inertial terms must be included as part of the momentum equation in either the 1 g or microgravity environment (a creeping flow assumption is invalid). A 10^-4 g environment was found to be a suitable microgravity environment for the proposed PCM.
An Assessment of Current Fan Noise Prediction Capability

NASA Technical Reports Server (NTRS)

Envia, Edmane; Woodward, Richard P.; Elliott, David M.; Fite, E. Brian; Hughes, Christopher E.; Podboy, Gary G.; Sutliff, Daniel L.

2008-01-01

In this paper, the results of an extensive assessment exercise carried out to establish the current state of the art for predicting fan noise at NASA are presented. Representative codes in the empirical, analytical, and computational categories were exercised and assessed against a set of benchmark acoustic data obtained from wind tunnel tests of three model-scale fans. The chosen codes were ANOPP, representing an empirical capability, RSI, representing an analytical capability, and LINFLUX, representing a computational aeroacoustics capability. The selected benchmark fans cover a wide range of fan pressure ratios and fan tip speeds, and are representative of modern turbofan engine designs. The assessment results indicate that the ANOPP code can predict the fan noise spectrum to within 4 dB of the measurement uncertainty band on a third-octave basis for the low and moderate tip speed fans, except at extreme aft emission angles. The RSI code can predict the fan broadband noise spectrum to within 1.5 dB of the experimental uncertainty band, provided the rotor-only contribution is taken into account. The LINFLUX code can predict interaction tone power levels to within experimental uncertainties at low and moderate fan tip speeds, but could deviate by as much as 6.5 dB outside the experimental uncertainty band at the highest tip speeds in some cases.
Overview of the NCC

NASA Technical Reports Server (NTRS)

Liu, Nan-Suey

2001-01-01

A multi-disciplinary design/analysis tool for combustion systems is critical for optimizing the low-emission, high-performance combustor design process. Based on discussions between the then NASA Lewis Research Center and the jet engine companies, an industry-government team was formed in early 1995 to develop the National Combustion Code (NCC), which is an integrated system of computer codes for the design and analysis of combustion systems. NCC has advanced features that address the need to meet designers' requirements such as "assured accuracy", "fast turnaround", and "acceptable cost". The NCC development team is comprised of Allison Engine Company (Allison), CFD Research Corporation (CFDRC), GE Aircraft Engines (GEAE), NASA Glenn Research Center (LeRC), and Pratt & Whitney (P&W). The "unstructured mesh" capability and "parallel computing" have been fundamental features of NCC from its inception. The NCC system is composed of a set of "elements" which includes a grid generator, main flow solver, turbulence module, turbulence and chemistry interaction module, chemistry module, spray module, radiation heat transfer module, data visualization module, and a post-processor for evaluating engine performance parameters. Each element may have contributions from several team members. Such a multi-source multi-element system needs to be integrated in a way that facilitates inter-module data communication, flexibility in module selection, and ease of integration. The development of the NCC beta version was essentially completed in June 1998. Technical details of the NCC elements are given in the Reference List. Elements such as the baseline flow solver, turbulence module, and the chemistry module have been extensively validated, and their parallel performance on large-scale parallel systems has been evaluated and optimized. However, the scalar PDF module and the spray module, as well as their coupling with the baseline flow solver, were developed in a small-scale distributed computing environment. As a result, the validation of the NCC beta version as a whole was quite limited. Current effort has been focused on the validation of the integrated code and the evaluation/optimization of its overall performance on large-scale parallel systems.
An Approximate Axisymmetric Viscous Shock Layer Aeroheating Method for Three-Dimensional Bodies

NASA Technical Reports Server (NTRS)

Brykina, Irina G.; Scott, Carl D.

1998-01-01

A technique is implemented for computing hypersonic aeroheating, shear stress, and other flow properties on the windward side of a three-dimensional (3D) blunt body. The technique uses a 2D/axisymmetric flow solver modified by scale factors for a corresponding equivalent axisymmetric body. Examples are given in which a 2D solver is used to calculate the flow at selected meridional planes on elliptic paraboloids in reentry flight. The report describes the equations and the codes used to convert the body surface parameters into input used to scale the 2D viscous shock layer equations in the axisymmetric viscous shock layer code. Very good agreement is obtained with solutions to finite rate chemistry 3D thin viscous shock layer equations for a finite rate catalytic body.
Development of a dynamic coupled hydro-geomechanical code and its application to induced seismicity

NASA Astrophysics Data System (ADS)

Miah, Md Mamun

This research describes the importance of hydro-geomechanical coupling in the geologic sub-surface environment arising from fluid injection at geothermal plants, large-scale geological CO2 sequestration for climate mitigation, enhanced oil recovery, and hydraulic fracturing during well construction in the oil and gas industries. A sequential computational code is developed to capture the multiphysics interaction behavior by linking a flow simulation code, TOUGH2, and a geomechanics modeling code, PyLith. The numerical formulation of each code is discussed to demonstrate their modeling capabilities. The computational framework involves sequential coupling and solution of two sub-problems: fluid flow through fractured and porous media, and reservoir geomechanics. For each time step of the flow calculation, the pressure field is passed to the geomechanics code to compute the effective stress field and fault slips. A simplified permeability model is implemented in the code that accounts for the permeability of porous and saturated rocks subject to confining stresses. The accuracy of the TOUGH-PyLith coupled simulator is tested by simulating Terzaghi's 1D consolidation problem. The modeling capability of coupled poroelasticity is validated by benchmarking it against Mandel's problem. The code is used to simulate both quasi-static and dynamic earthquake nucleation and slip distribution on a fault from the combined effect of far-field tectonic loading and fluid injection by using an appropriate fault constitutive friction model. Results from the quasi-static induced earthquake simulations show a delayed response in earthquake nucleation. This is attributed to the increased total stress in the domain and not accounting for pressure on the fault. However, this issue is resolved in the final chapter in simulating a single-event earthquake dynamic rupture. Simulation results show that fluid pressure has a positive effect on slip nucleation and subsequent crack propagation. This is confirmed by running a sensitivity analysis that shows an increase in injection well distance results in delayed slip nucleation and rupture propagation on the fault.
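Terzaghi's 1D consolidation problem, used above as an accuracy test, has a classical series solution that a coupled simulator can be compared against. The Python sketch below evaluates that textbook solution for a layer drained at z = 0 and sealed at z = H. The function name, argument names, and the number of series terms are illustrative assumptions; nothing here is taken from the TOUGH2 or PyLith code bases.

```python
import math


def terzaghi_excess_pressure(z, t, height, c_v, p0=1.0, n_terms=200):
    """Excess pore pressure p(z, t) for Terzaghi's 1D consolidation problem.

    Assumed setup: drained boundary at z = 0, impermeable boundary at
    z = height, uniform initial excess pressure p0, consolidation
    coefficient c_v. Series: p/p0 = sum_m (2/M) sin(M z/H) exp(-M^2 T_v),
    with M = (2m + 1) pi / 2 and T_v = c_v t / H^2.
    """
    t_v = c_v * t / height ** 2          # dimensionless time factor
    total = 0.0
    for m in range(n_terms):
        big_m = (2 * m + 1) * math.pi / 2
        total += (2.0 / big_m) * math.sin(big_m * z / height) * math.exp(-big_m ** 2 * t_v)
    return p0 * total


if __name__ == "__main__":
    # Pressure at the sealed base for a few time factors (H = 1, c_v = 1).
    for t in (0.001, 0.1, 1.0):
        print(t, terzaghi_excess_pressure(1.0, t, 1.0, 1.0))
```

A benchmark run would typically compare the simulated pressure profile with this expression at several times; at late times the leading term (4/pi) exp(-pi^2 T_v / 4) dominates at the sealed boundary.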
Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas (GPS - TTBP) Final Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chame, Jacqueline

2011-05-27

The goal of this project is the development of the Gyrokinetic Toroidal Code (GTC) Framework and its applications to problems related to the physics of turbulence and turbulent transport in tokamaks. The project involves physics studies, code development, noise effect mitigation, supporting computer science efforts, diagnostics and advanced visualizations, verification and validation. Its main scientific themes are mesoscale dynamics and non-locality effects on transport, the physics of secondary structures such as zonal flows, and strongly coherent wave-particle interaction phenomena at magnetic precession resonances. Special emphasis is placed on the implications of these themes for rho-star and current scalings and for the turbulent transport of momentum. GTC-TTBP also explores applications to electron thermal transport, particle transport, ITB formation and cross-cuts such as edge-core coupling, interaction of energetic particles with turbulence and neoclassical tearing mode trigger dynamics. Code development focuses on major initiatives in the development of full-f formulations and the capacity to simulate flux-driven transport. In addition to the full-f formulation, the project includes the development of numerical collision models and methods for coarse graining in phase space. Verification is pursued by linear stability study comparisons with the FULL and HD7 codes and by benchmarking with the GKV, GYSELA and other gyrokinetic simulation codes. Validation of gyrokinetic models of ion and electron thermal transport is pursued by systematic stressing comparisons with fluctuation and transport data from the DIII-D and NSTX tokamaks. The physics and code development research programs are supported by complementary efforts in computer sciences, high performance computing, and data management.

SCALE Code System 6.2.2

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rearden, Bradley T.; Jessee, Matthew Anderson

The SCALE Code System is a widely used modeling and simulation suite for nuclear safety analysis and design that is developed, maintained, tested, and managed by the Reactor and Nuclear Systems Division (RNSD) of Oak Ridge National Laboratory (ORNL). SCALE provides a comprehensive, verified and validated, user-friendly tool set for criticality safety, reactor physics, radiation shielding, radioactive source term characterization, and sensitivity and uncertainty analysis. Since 1980, regulators, licensees, and research institutions around the world have used SCALE for safety analysis and design. SCALE provides an integrated framework with dozens of computational modules including 3 deterministic and 3 Monte Carlo radiation transport solvers that are selected based on the desired solution strategy. SCALE includes current nuclear data libraries and problem-dependent processing tools for continuous-energy (CE) and multigroup (MG) neutronics and coupled neutron-gamma calculations, as well as activation, depletion, and decay calculations. SCALE includes unique capabilities for automated variance reduction for shielding calculations, as well as sensitivity and uncertainty analysis. SCALE’s graphical user interfaces assist with accurate system modeling, visualization of nuclear data, and convenient access to desired results. SCALE 6.2 represents one of the most comprehensive revisions in the history of SCALE, providing several new capabilities and significant improvements in many existing features.
Aerodynamics of small-scale vertical-axis wind turbines

NASA Astrophysics Data System (ADS)

Paraschivoiu, I.; Desy, P.

1985-12-01

The purpose of this work is to study the influence of various rotor parameters on the aerodynamic performance of a small-scale Darrieus wind turbine. To do this, a straight-bladed Darrieus rotor is calculated by using the double-multiple-streamtube model, including the streamtube expansion effects through the rotor (CARDAAX computer code) and the dynamic-stall effects. The straight-bladed Darrieus turbine is, as expected, more efficient than the curved-bladed rotor, but for a given solidity it operates at higher wind speeds.
Results of comparative RBMK neutron computation using VNIIEF codes (cell computation, 3D statics, 3D kinetics). Final report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grebennikov, A.N.; Zhitnik, A.K.; Zvenigorodskaya, O.A.

1995-12-31

In conformity with the protocol of the Workshop under Contract "Assessment of RBMK reactor safety using modern Western Codes", VNIIEF performed a neutronics computation series to compare western and VNIIEF codes and assess whether VNIIEF codes are suitable for RBMK type reactor safety assessment computation. The work was carried out in close collaboration with M.I. Rozhdestvensky and L.M. Podlazov, NIKIET employees. The effort involved: (1) cell computations with the WIMS, EKRAN codes (improved modification of the LOMA code) and the S-90 code (VNIIEF Monte Carlo), covering cell, polycell, and burnup computation; (2) 3D computation of static states with the KORAT-3D and NEU codes and comparison with results of computation with the NESTLE code (USA). The computations were performed in the geometry and using the neutron constants presented by the American party; (3) 3D computation of neutron kinetics with the KORAT-3D and NEU codes. These computations were performed in two formulations, both being developed in collaboration with NIKIET. The formulation of the first problem agrees as closely as possible with one of the NESTLE problems and imitates gas bubble travel through a core. The second problem is a model of the RBMK as a whole with imitation of control and protection system (CPS) control movement in a core.
Numerical modeling of separated flows at moderate Reynolds numbers appropriate for turbine blades and unmanned aero vehicles

NASA Astrophysics Data System (ADS)

Castiglioni, Giacomo

Flows over airfoils and blades in rotating machinery, for unmanned and micro-aerial vehicles, wind turbines, and propellers consist of a laminar boundary layer near the leading edge that is often followed by a laminar separation bubble and transition to turbulence further downstream. Typical Reynolds averaged Navier-Stokes turbulence models are inadequate for such flows. Direct numerical simulation is the most reliable, but is also the most computationally expensive alternative. This work assesses the capability of immersed boundary methods and large eddy simulations to reduce the computational requirements for such flows and still provide high quality results. Two-dimensional and three-dimensional simulations of a laminar separation bubble on a NACA-0012 airfoil at Re_c = 5x10^4 and at 5° of incidence have been performed with an immersed boundary code and a commercial code using body fitted grids. Several sub-grid scale models have been implemented in both codes and their performance evaluated. For the two-dimensional simulations with the immersed boundary method the results show good agreement with the direct numerical simulation benchmark data for the pressure coefficient Cp and the friction coefficient Cf, but only when using dissipative numerical schemes. There is evidence that this behavior can be attributed to the ability of dissipative schemes to damp numerical noise coming from the immersed boundary. For the three-dimensional simulations the results show a good prediction of the separation point, but an inaccurate prediction of the reattachment point unless full direct numerical simulation resolution is used. 
The commercial code shows good agreement with the direct numerical simulation benchmark data in both two and three-dimensional simulations, but the presence of significant, unquantified numerical dissipation prevents a conclusive assessment of the actual prediction capabilities of very coarse large eddy simulations with low order schemes in general cases. Additionally, a two-dimensional sweep of angles of attack from 0° to 5° is performed showing a qualitative prediction of the jump in lift and drag coefficients due to the appearance of the laminar separation bubble. The numerical dissipation inhibits the predictive capabilities of large eddy simulations whenever it is of the same order of magnitude or larger than the sub-grid scale dissipation. The need to estimate the numerical dissipation is most pressing for low-order methods employed by commercial computational fluid dynamics codes. Following the recent work of Schranner et al., the equations and procedure for estimating the numerical dissipation rate and the numerical viscosity in a commercial code are presented. The method allows for the computation of the numerical dissipation rate and numerical viscosity in the physical space for arbitrary sub-domains in a self-consistent way, using only information provided by the code in question. The method is first tested for a three-dimensional Taylor-Green vortex flow in a simple cubic domain and compared with benchmark results obtained using an accurate, incompressible spectral solver. Afterwards the same procedure is applied for the first time to a realistic flow configuration, specifically to the above discussed laminar separation bubble flow over a NACA 0012 airfoil. 
The method appears to be quite robust and its application reveals that for the code and the flow in question the numerical dissipation can be significantly larger than the viscous dissipation or the dissipation of the classical Smagorinsky sub-grid scale model, confirming the previously qualitative finding.
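The comparison drawn above between viscous dissipation and the dissipation of the classical Smagorinsky sub-grid scale model can be made concrete with a short Python sketch. It uses the standard Smagorinsky eddy viscosity, nu_t = (C_s Δ)^2 |S|, and evaluates both dissipation rates from a single velocity-gradient tensor. The constant C_s = 0.17 is a commonly quoted value assumed here for illustration, the function names are hypothetical, and estimating the numerical dissipation itself (the subject of the thesis) is not attempted.

```python
import math


def strain_rate_magnitude(grad_u):
    """|S| = sqrt(2 S_ij S_ij), with S_ij = (du_i/dx_j + du_j/dx_i) / 2."""
    total = 0.0
    for i in range(3):
        for j in range(3):
            s_ij = 0.5 * (grad_u[i][j] + grad_u[j][i])
            total += s_ij * s_ij
    return math.sqrt(2.0 * total)


def smagorinsky_viscosity(grad_u, delta, c_s=0.17):
    """Eddy viscosity nu_t = (c_s * delta)^2 |S|; c_s = 0.17 is an assumed value."""
    return (c_s * delta) ** 2 * strain_rate_magnitude(grad_u)


def dissipation_rates(grad_u, delta, nu, c_s=0.17):
    """Return (viscous, sub-grid) dissipation rates: nu |S|^2 and nu_t |S|^2."""
    s = strain_rate_magnitude(grad_u)
    return nu * s * s, smagorinsky_viscosity(grad_u, delta, c_s) * s * s


if __name__ == "__main__":
    # Simple shear du/dy = 10 1/s on a grid with filter width 0.1 m (made-up numbers).
    shear = [[0.0, 10.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(dissipation_rates(shear, delta=0.1, nu=1.0e-5))
```

Any estimate of the numerical dissipation rate in a given sub-domain could then be compared against these two reference values, which is the kind of comparison the abstract reports.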
A surface code quantum computer in silicon

PubMed Central

Hill, Charles D.; Peretz, Eldad; Hile, Samuel J.; House, Matthew G.; Fuechsle, Martin; Rogge, Sven; Simmons, Michelle Y.; Hollenberg, Lloyd C. L.

2015-01-01

The exceptionally long quantum coherence times of phosphorus donor nuclear spin qubits in silicon, coupled with the proven scalability of silicon-based nano-electronics, make them attractive candidates for large-scale quantum computing. However, the high threshold of topological quantum error correction can only be captured in a two-dimensional array of qubits operating synchronously and in parallel—posing formidable fabrication and control challenges. We present an architecture that addresses these problems through a novel shared-control paradigm that is particularly suited to the natural uniformity of the phosphorus donor nuclear spin qubit states and electronic confinement. The architecture comprises a two-dimensional lattice of donor qubits sandwiched between two vertically separated control layers forming a mutually perpendicular crisscross gate array. Shared-control lines facilitate loading/unloading of single electrons to specific donors, thereby activating multiple qubits in parallel across the array on which the required operations for surface code quantum error correction are carried out by global spin control. The complexities of independent qubit control, wave function engineering, and ad hoc quantum interconnects are explicitly avoided. With many of the basic elements of fabrication and control based on demonstrated techniques and with simulated quantum operation below the surface code error threshold, the architecture represents a new pathway for large-scale quantum information processing in silicon and potentially in other qubit systems where uniformity can be exploited. PMID:26601310
A surface code quantum computer in silicon.

PubMed

Hill, Charles D; Peretz, Eldad; Hile, Samuel J; House, Matthew G; Fuechsle, Martin; Rogge, Sven; Simmons, Michelle Y; Hollenberg, Lloyd C L

2015-10-01

The exceptionally long quantum coherence times of phosphorus donor nuclear spin qubits in silicon, coupled with the proven scalability of silicon-based nano-electronics, make them attractive candidates for large-scale quantum computing. However, the high threshold of topological quantum error correction can only be captured in a two-dimensional array of qubits operating synchronously and in parallel, posing formidable fabrication and control challenges. We present an architecture that addresses these problems through a novel shared-control paradigm that is particularly suited to the natural uniformity of the phosphorus donor nuclear spin qubit states and electronic confinement. The architecture comprises a two-dimensional lattice of donor qubits sandwiched between two vertically separated control layers forming a mutually perpendicular crisscross gate array. Shared-control lines facilitate loading/unloading of single electrons to specific donors, thereby activating multiple qubits in parallel across the array on which the required operations for surface code quantum error correction are carried out by global spin control. The complexities of independent qubit control, wave function engineering, and ad hoc quantum interconnects are explicitly avoided. With many of the basic elements of fabrication and control based on demonstrated techniques and with simulated quantum operation below the surface code error threshold, the architecture represents a new pathway for large-scale quantum information processing in silicon and potentially in other qubit systems where uniformity can be exploited.
Automated target recognition using passive radar and coordinated flight models

NASA Astrophysics Data System (ADS)

Ehrman, Lisa M.; Lanterman, Aaron D.

2003-09-01

Rather than emitting pulses, passive radar systems rely on illuminators of opportunity, such as TV and FM radio, to illuminate potential targets. These systems are particularly attractive since they allow receivers to operate without emitting energy, rendering them covert. Many existing passive radar systems estimate the locations and velocities of targets. This paper focuses on adding an automatic target recognition (ATR) component to such systems. Our approach to ATR compares the Radar Cross Section (RCS) of targets detected by a passive radar system to the simulated RCS of known targets. To make the comparison as accurate as possible, the received signal model accounts for aircraft position and orientation, propagation losses, and antenna gain patterns. The estimated positions become inputs for an algorithm that uses a coordinated flight model to compute probable aircraft orientation angles. The Fast Illinois Solver Code (FISC) simulates the RCS of several potential target classes as they execute the estimated maneuvers. The RCS is then scaled by the Advanced Refractive Effects Prediction System (AREPS) code to account for propagation losses that occur as functions of altitude and range. The Numerical Electromagnetic Code (NEC2) computes the antenna gain pattern, so that the RCS can be further scaled. The Rician model compares the RCS of the illuminated aircraft with those of the potential targets. This comparison results in target identification.
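The step in which estimated positions feed a coordinated flight model can be sketched with the standard level coordinated-turn relation, tan(bank) = V * turn_rate / g. The Python example below is only an illustration of that relation applied to three consecutive track points; it is not the authors' algorithm, and the function names, the flat 2D track, and the finite-difference estimate of turn rate are all assumptions.

```python
import math

G = 9.80665  # standard gravity, m/s^2


def bank_angle(speed, turn_rate):
    """Bank angle (rad) of a level coordinated turn: tan(bank) = V * omega / g."""
    return math.atan2(speed * turn_rate, G)


def bank_from_track(points, dt):
    """Estimate the bank angle from three (x, y) positions spaced dt seconds apart.

    Heading comes from successive displacement vectors; a positive result
    means a counterclockwise (left) turn in this x-y convention.
    """
    (x0, y0), (x1, y1), (x2, y2) = points
    h1 = math.atan2(y1 - y0, x1 - x0)
    h2 = math.atan2(y2 - y1, x2 - x1)
    # Wrap the heading change into [-pi, pi).
    dh = (h2 - h1 + math.pi) % (2 * math.pi) - math.pi
    speed = 0.5 * (math.hypot(x1 - x0, y1 - y0) + math.hypot(x2 - x1, y2 - y1)) / dt
    return bank_angle(speed, dh / dt)


if __name__ == "__main__":
    # 100 m/s at a 3 deg/s turn rate (illustrative numbers only).
    print(math.degrees(bank_angle(100.0, math.radians(3.0))))
```

In a pipeline like the one described, an orientation estimated this way would determine the aspect angle at which the simulated RCS is looked up.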
The DoD's High Performance Computing Modernization Program - Ensuing the National Earth Systems Prediction Capability Becomes Operational

NASA Astrophysics Data System (ADS)

Burnett, W.

2016-12-01

The Department of Defense's (DoD) High Performance Computing Modernization Program (HPCMP) provides high performance computing to address the most significant challenges in computational resources, software application support and nationwide research and engineering networks. Today, the HPCMP has a critical role in ensuring the National Earth System Prediction Capability (N-ESPC) achieves initial operational status in 2019. A 2015 study commissioned by the HPCMP found that N-ESPC computational requirements will exceed interconnect bandwidth capacity due to the additional load from data assimilation and passing connecting data between ensemble codes. Memory bandwidth and I/O bandwidth will continue to be significant bottlenecks for the Navy's Hybrid Coordinate Ocean Model (HYCOM) scalability - by far the major driver of computing resource requirements in the N-ESPC. The study also found that few of the N-ESPC model developers have detailed plans to ensure their respective codes scale through 2024. Three HPCMP initiatives are designed to directly address and support these issues: Productivity Enhancement, Technology, Transfer and Training (PETTT), the HPCMP Applications Software Initiative (HASI), and Frontier Projects. PETTT supports code conversion by providing assistance, expertise and training in scalable and high-end computing architectures. HASI addresses the continuing need for modern application software that executes effectively and efficiently on next-generation high-performance computers. Frontier Projects enable research and development that could not be achieved using typical HPCMP resources by providing multi-disciplinary teams access to exceptional amounts of high performance computing resources. Finally, the Navy's DoD Supercomputing Resource Center (DSRC) currently operates a 6 Petabyte system, of which Naval Oceanography receives 15% of operational computational system use, or approximately 1 Petabyte of the processing capability. 
The DSRC will provide the DoD with future computing assets to initially operate the N-ESPC in 2019. This talk will further describe how DoD's HPCMP will ensure N-ESPC becomes operational, efficiently and effectively, using next-generation high performance computing.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Heroux, Michael; Lethin, Richard

Programming models and environments play the essential roles in high performance computing of enabling the conception, design, implementation and execution of science and engineering application codes. Programmer productivity is strongly influenced by the effectiveness of our programming models and environments, as is software sustainability since our codes have lifespans measured in decades, so the advent of new computing architectures, increased concurrency, concerns for resilience, and the increasing demands for high-fidelity, multi-physics, multi-scale and data-intensive computations mean that we have new challenges to address as part of our fundamental R&D requirements. Fortunately, we also have new tools and environments that make design, prototyping and delivery of new programming models easier than ever. The combination of new and challenging requirements and new, powerful toolsets enables significant synergies for the next generation of programming models and environments R&D. This report presents the topics discussed and results from the 2014 DOE Office of Science Advanced Scientific Computing Research (ASCR) Programming Models & Environments Summit, and subsequent discussions among the summit participants and contributors to topics in this report.
The fusion code XGC: Enabling kinetic study of multi-scale edge turbulent transport in ITER [Book Chapter]

DOE Office of Scientific and Technical Information (OSTI.GOV)

D'Azevedo, Eduardo; Abbott, Stephen; Koskela, Tuomas

The XGC fusion gyrokinetic code combines state-of-the-art, portable computational and algorithmic technologies to enable complicated multiscale simulations of turbulence and transport dynamics in ITER edge plasma on the largest US open-science computer, the CRAY XK7 Titan, at its maximal heterogeneous capability, which have not been possible before due to a factor of over 10 shortage in the time-to-solution for less than 5 days of wall-clock time for one physics case. Frontier techniques such as nested OpenMP parallelism, adaptive parallel I/O, staging I/O and data reduction using dynamic and asynchronous applications interactions, dynamic repartitioning for balancing computational work in pushing particles and in grid related work, scalable and accurate discretization algorithms for non-linear Coulomb collisions, and communication-avoiding subcycling technology for pushing particles on both CPUs and GPUs are also utilized to dramatically improve the scalability and time-to-solution, hence enabling the difficult kinetic ITER edge simulation on a present-day leadership class computer.
Seals Research at Texas A/M University

NASA Technical Reports Server (NTRS)

Morrison, Gerald L.

1991-01-01

The Turbomachinery Laboratory at Texas A&M has been providing experimental data and computational codes for the design of seals for many years. The program began with the development of a Halon-based seal test rig. This facility provided information about the effective stiffness and damping in whirling seals. The Halon effectively simulated cryogenic fluids. Another test facility was developed (using air as the working fluid) where the stiffness and damping matrices can be determined. These data were used to develop bulk flow models of the seal's effect upon rotating machinery; in conjunction with this research, a bulk flow model for calculation of performance and rotordynamic coefficients of annular pressure seals of arbitrary non-uniform clearance for barotropic fluids such as LH2, LOX, LN2, and CH4 was developed. This program is very efficient (fast) and converges for very large eccentricities. Currently, work is being performed on a bulk flow analysis of the effects of the impeller-shroud interaction upon the stability of pumps. The data were used along with data from other researchers to develop an empirical leakage prediction code for MSFC. Presently, the flow fields inside labyrinth and annular seals are being studied in detail. An advanced 3-D Doppler anemometer system is being used to measure the mean velocity and entire Reynolds stress tensor distribution throughout the seals. Concentric and statically eccentric seals were studied; presently, whirling seals are being studied. The data obtained are providing valuable information about the flow phenomena occurring inside the seals, as well as a data base for comparison with numerical predictions and for turbulence model development. A finite difference computer code was developed for solving the Reynolds averaged Navier-Stokes equations inside labyrinth seals. A multi-scale k-epsilon turbulence model is currently being evaluated. A new seal geometry was designed and patented using a computer code. A large-scale, 2-D seal flow visualization facility is also being developed.
MicroHH 1.0: a computational fluid dynamics code for direct numerical simulation and large-eddy simulation of atmospheric boundary layer flows

NASA Astrophysics Data System (ADS)

van Heerwaarden, Chiel C.; van Stratum, Bart J. H.; Heus, Thijs; Gibbs, Jeremy A.; Fedorovich, Evgeni; Mellado, Juan Pedro

2017-08-01

This paper describes MicroHH 1.0, a new and open-source (www.microhh.org) computational fluid dynamics code for the simulation of turbulent flows in the atmosphere. It is primarily made for direct numerical simulation but also supports large-eddy simulation (LES). The paper covers the description of the governing equations, their numerical implementation, and the parameterizations included in the code. Furthermore, the paper presents the validation of the dynamical core in the form of convergence and conservation tests, and comparison of simulations of channel flows and slope flows against well-established test cases. The full numerical model, including the associated parameterizations for LES, has been tested for a set of cases under stable and unstable conditions, under the Boussinesq and anelastic approximations, and with dry and moist convection under stationary and time-varying boundary conditions. The paper presents performance tests showing good scaling from 256 to 32 768 processes. The graphical processing unit (GPU)-enabled version of the code can reach a speedup of more than an order of magnitude for simulations that fit in the memory of a single GPU.
Unsteady transonic flow calculations for realistic aircraft configurations

NASA Technical Reports Server (NTRS)

Batina, John T.; Seidel, David A.; Bland, Samuel R.; Bennett, Robert M.

1987-01-01

A transonic unsteady aerodynamic and aeroelasticity code has been developed for application to realistic aircraft configurations. The new code is called CAP-TSD which is an acronym for Computational Aeroelasticity Program - Transonic Small Disturbance. The CAP-TSD code uses a time-accurate approximate factorization (AF) algorithm for solution of the unsteady transonic small-disturbance equation. The AF algorithm is very efficient for solution of steady and unsteady transonic flow problems. It can provide accurate solutions in only several hundred time steps yielding a significant computational cost savings when compared to alternative methods. The new code can treat complete aircraft geometries with multiple lifting surfaces and bodies including canard, wing, tail, control surfaces, launchers, pylons, fuselage, stores, and nacelles. Applications are presented for a series of five configurations of increasing complexity to demonstrate the wide range of geometrical applicability of CAP-TSD. These results are in good agreement with available experimental steady and unsteady pressure data. Calculations for the General Dynamics one-ninth scale F-16C aircraft model are presented to demonstrate application to a realistic configuration. Unsteady results for the entire F-16C aircraft undergoing a rigid pitching motion illustrated the capability required to perform transonic unsteady aerodynamic and aeroelastic analyses for such configurations.
Recent Progress in the Development of a Multi-Layer Green's Function Code for Ion Beam Transport

NASA Technical Reports Server (NTRS)

Tweed, John; Walker, Steven A.; Wilson, John W.; Tripathi, Ram K.

2008-01-01

To meet the challenge of future deep space programs, an accurate and efficient engineering code for analyzing the shielding requirements against high-energy galactic heavy radiation is needed. To address this need, a new Green's function code capable of simulating high charge and energy ions with either laboratory or space boundary conditions is currently under development. The computational model consists of combinations of physical perturbation expansions based on the scales of atomic interaction, multiple scattering, and nuclear reactive processes with use of the Neumann-asymptotic expansions with non-perturbative corrections. The code contains energy loss due to straggling, nuclear attenuation, nuclear fragmentation with energy dispersion and downshifts. Previous reports show that the new code accurately models the transport of ion beams through a single slab of material. Current research efforts are focused on enabling the code to handle multiple layers of material and the present paper reports on progress made towards that end.
Verification and Validation of the k-kL Turbulence Model in FUN3D and CFL3D Codes

NASA Technical Reports Server (NTRS)

Abdol-Hamid, Khaled S.; Carlson, Jan-Renee; Rumsey, Christopher L.

2015-01-01

The implementation of the k-kL turbulence model using multiple computational fluid dynamics (CFD) codes is reported herein. The k-kL model is a two-equation turbulence model based on Abdol-Hamid's closure and Menter's modification to Rotta's two-equation model. Rotta shows that a reliable transport equation can be formed from the turbulent length scale L, and the turbulent kinetic energy k. Rotta's equation is well suited for term-by-term modeling and displays useful features compared to other two-equation models. An important difference is that this formulation leads to the inclusion of higher-order velocity derivatives in the source terms of the scale equations. This can enhance the ability of the Reynolds-averaged Navier-Stokes (RANS) solvers to simulate unsteady flows. The present report documents the formulation of the model as implemented in the CFD codes FUN3D and CFL3D. Methodology, verification and validation examples are shown. Attached and separated flow cases are documented and compared with experimental data. The results show generally very good comparisons with canonical and experimental data, as well as matching results code-to-code. The results from this formulation are similar or better than results using the SST turbulence model.
Blueprint for a microwave trapped ion quantum computer.

PubMed

Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G; Mølmer, Klaus; Devitt, Simon J; Wunderlich, Christof; Hensinger, Winfried K

2017-02-01

The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion-based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation-based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error-threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects.
Application of Adjoint Method and Spectral-Element Method to Tomographic Inversion of Regional Seismological Structure Beneath Japanese Islands

NASA Astrophysics Data System (ADS)

Tsuboi, S.; Miyoshi, T.; Obayashi, M.; Tono, Y.; Ando, K.

2014-12-01

Recent progress in large-scale computing, combining waveform modeling techniques with high-performance computing facilities, has demonstrated the possibility of performing full-waveform inversion of three-dimensional (3D) seismological structure inside the Earth. We apply the adjoint method (Liu and Tromp, 2006) to obtain the 3D structure beneath the Japanese Islands. First, we implemented the spectral-element method on the K computer in Kobe, Japan. We optimized SPECFEM3D_GLOBE (Komatitsch and Tromp, 2002) using OpenMP so that the code fits the hybrid architecture of the K computer. We can now use 82,134 nodes of the K computer (657,072 cores) to compute synthetic waveforms with an accuracy of about 1 s for a realistic 3D Earth model, at a performance of 1.2 PFLOPS. We use this optimized SPECFEM3D_GLOBE code, take one chunk around the Japanese Islands from the global mesh, and compute synthetic seismograms with an accuracy of about 10 s. We use the GAP-P2 mantle tomography model (Obayashi et al., 2009) as the initial 3D model and use as many of the broadband seismic stations available in this region as possible to perform the inversion. We then use time windows for body waves and surface waves to compute adjoint sources and calculate adjoint kernels for seismic structure. We have performed several iterations and obtained an improved 3D structure beneath the Japanese Islands. The result demonstrates that waveform misfits between observed and theoretical seismograms improve as the iteration proceeds. We are now preparing to use much shorter periods in our synthetic waveform computation and to obtain seismic structure for basin-scale models, such as the Kanto basin, where there are dense seismic networks and high seismic activity. Acknowledgements: This research was partly supported by MEXT Strategic Program for Innovative Research. We used F-net seismograms of the National Research Institute for Earth Science and Disaster Prevention.
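The machine figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the three numbers given in the abstract (node count, core count, sustained performance); the per-core and per-node rates are derived values, not figures reported by the authors.

```python
# Figures quoted in the abstract.
nodes = 82_134
cores = 657_072
sustained_flops = 1.2e15  # 1.2 PFLOPS

# Derived quantities (not stated in the abstract).
cores_per_node = cores / nodes
gflops_per_core = sustained_flops / cores / 1e9
gflops_per_node = sustained_flops / nodes / 1e9

print(f"{cores_per_node:.0f} cores per node")
print(f"{gflops_per_core:.2f} GFLOPS per core")
print(f"{gflops_per_node:.1f} GFLOPS per node")
```

The core count is exactly eight times the node count, so the two figures in the abstract are mutually consistent; the sustained rate works out to roughly 1.83 GFLOPS per core.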
Computer-based coding of free-text job descriptions to efficiently identify occupations in epidemiological studies.

PubMed

Russ, Daniel E; Ho, Kwan-Yuet; Colt, Joanne S; Armenti, Karla R; Baris, Dalsu; Chow, Wong-Ho; Davis, Faith; Johnson, Alison; Purdue, Mark P; Karagas, Margaret R; Schwartz, Kendra; Schwenn, Molly; Silverman, Debra T; Johnson, Calvin A; Friesen, Melissa C

2016-06-01

Mapping job titles to standardised occupation classification (SOC) codes is an important step in identifying occupational risk factors in epidemiological studies. Because manual coding is time-consuming and has moderate reliability, we developed an algorithm called SOCcer (Standardized Occupation Coding for Computer-assisted Epidemiologic Research) to assign SOC-2010 codes based on free-text job description components. Job title and task-based classifiers were developed by comparing job descriptions to multiple sources linking job and task descriptions to SOC codes. An industry-based classifier was developed based on the SOC prevalence within an industry. These classifiers were used in a logistic model trained using 14 983 jobs with expert-assigned SOC codes to obtain empirical weights for an algorithm that scored each SOC/job description. We assigned the highest scoring SOC code to each job. SOCcer was validated in 2 occupational data sources by comparing SOC codes obtained from SOCcer to expert assigned SOC codes and lead exposure estimates obtained by linking SOC codes to a job-exposure matrix. For 11 991 case-control study jobs, SOCcer-assigned codes agreed with 44.5% and 76.3% of manually assigned codes at the 6-digit and 2-digit level, respectively. Agreement increased with the score, providing a mechanism to identify assignments needing review. Good agreement was observed between lead estimates based on SOCcer and manual SOC assignments (κ 0.6-0.8). Poorer performance was observed for inspection job descriptions, which included abbreviations and worksite-specific terminology. Although some manual coding will remain necessary, using SOCcer may improve the efficiency of incorporating occupation into large-scale epidemiological studies. Published by the BMJ Publishing Group Limited.
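The two evaluation steps described above (assigning the highest-scoring code, then measuring agreement with expert codes at the 6-digit and 2-digit levels) can be sketched as follows. This is an illustration only, not the SOCcer implementation: the function names, the score dictionary, and the three example jobs are hypothetical, and the codes are assumed to follow the SOC-2010 `NN-NNNN` layout.

```python
def best_code(scores):
    """Return the SOC code with the highest score for one job (hypothetical scores)."""
    return max(scores, key=scores.get)


def agreement(assigned, expert, digits):
    """Fraction of jobs whose assigned and expert codes share the first `digits` digits."""
    def prefix(code):
        return code.replace("-", "")[:digits]

    matches = sum(prefix(a) == prefix(e) for a, e in zip(assigned, expert))
    return matches / len(expert)


# Hypothetical scores for a single job description.
scores = {"11-1011": 0.12, "11-1021": 0.55, "29-1141": 0.03}
top = best_code(scores)

# Hypothetical assigned and expert codes for three jobs.
assigned = ["11-1011", "29-1141", "47-2061"]
expert = ["11-1021", "29-1141", "53-3032"]
six_digit = agreement(assigned, expert, 6)
two_digit = agreement(assigned, expert, 2)
```

In this toy data the first job matches only at the 2-digit (major group) level, which is why 2-digit agreement is higher than 6-digit agreement, mirroring the direction of the 76.3% versus 44.5% result reported in the abstract.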
Software Engineering for Scientific Computer Simulations

NASA Astrophysics Data System (ADS)

Post, Douglass E.; Henderson, Dale B.; Kendall, Richard P.; Whitney, Earl M.

2004-11-01

Computer simulation is becoming a very powerful tool for analyzing and predicting the performance of fusion experiments. Simulation efforts are evolving from including only a few effects to many effects, from small teams with a few people to large teams, and from workstations and small processor count parallel computers to massively parallel platforms. Successfully making this transition requires attention to software engineering issues. We report on the conclusions drawn from a number of case studies of large scale scientific computing projects within DOE, academia and the DoD. The major lessons learned include attention to sound project management including setting reasonable and achievable requirements, building a good code team, enforcing customer focus, carrying out verification and validation and selecting the optimum computational mathematics approaches.
Three-dimensional computer model for the atmospheric general circulation experiment

NASA Technical Reports Server (NTRS)

Roberts, G. O.

1984-01-01

An efficient, flexible, three-dimensional, hydrodynamic, computer code has been developed for a spherical cap geometry. The code will be used to simulate NASA's Atmospheric General Circulation Experiment (AGCE). The AGCE is a spherical, baroclinic experiment which will model the large-scale dynamics of our atmosphere; it has been proposed to NASA for future Spacelab flights. In the AGCE a radial dielectric body force will simulate gravity, with hot fluid tending to move outwards. In order that this force be dominant, the AGCE must be operated in a low gravity environment such as Spacelab. The full potential of the AGCE will only be realized by working in conjunction with an accurate computer model. Proposed experimental parameter settings will be checked first using model runs. Then actual experimental results will be compared with the model predictions. This interaction between experiment and theory will be very valuable in determining the nature of the AGCE flows and hence their relationship to analytical theories and actual atmospheric dynamics.

A Computationally Efficient Parallel Levenberg-Marquardt Algorithm for Large-Scale Big-Data Inversion

NASA Astrophysics Data System (ADS)

Lin, Y.; O'Malley, D.; Vesselinov, V. V.

2015-12-01

Inverse modeling seeks model parameters given a set of observed state variables. However, for many practical problems, because the observed data sets are often large and the model parameters are often numerous, conventional methods for solving the inverse problem can be computationally expensive. We have developed a new, computationally efficient Levenberg-Marquardt method for solving large-scale inverse problems. Levenberg-Marquardt methods require the solution of a dense linear system of equations, which can be prohibitively expensive to compute for large-scale inverse problems. Our novel method projects the original large-scale linear problem down to a Krylov subspace, such that the dimensionality of the measurements can be significantly reduced. Furthermore, instead of solving the linear system for every Levenberg-Marquardt damping parameter, we store the Krylov subspace computed when solving for the first damping parameter and recycle it for all the following damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved by using these computational techniques. We apply this new inverse modeling method to invert for a random transmissivity field. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) at each computational node in the model domain. The inversion is also aided by the use of regularization techniques. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). Julia is an advanced high-level scientific programming language that allows for efficient memory management and utilization of high-performance computational resources. Compared with a Levenberg-Marquardt method using standard linear inversion techniques, our Levenberg-Marquardt method yields a speed-up ratio of 15 in a multi-core computational environment and a speed-up ratio of 45 in a single-core computational environment. Therefore, our new inverse modeling method is a powerful tool for large-scale applications.
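The idea of recycling one Krylov subspace across damping parameters can be illustrated with a small pure-Python sketch. It is not the authors' Julia/MADS code: it assumes the damped system has the common form (JᵀJ + λI)δ = −Jᵀr, and it relies on the fact that the Krylov subspace generated by JᵀJ + λI and a vector b is the same for every λ, so one basis serves all damping values. All function names are hypothetical.

```python
from math import sqrt


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    return [dot(row, x) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            for j in range(c, n + 1):
                M[i][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def krylov_basis(J, b, k):
    """Orthonormal basis of span{b, (J^T J) b, ...}, up to k vectors, built once."""
    Jt = transpose(J)
    nb = sqrt(dot(b, b))
    basis = [[x / nb for x in b]]
    while len(basis) < k:
        w = matvec(Jt, matvec(J, basis[-1]))
        for _ in range(2):  # two Gram-Schmidt passes for numerical stability
            for v in basis:
                c = dot(w, v)
                w = [wi - c * vi for wi, vi in zip(w, v)]
        nw = sqrt(dot(w, w))
        if nw < 1e-12:  # subspace is exhausted; stop early
            break
        basis.append([x / nw for x in w])
    return basis


def lm_steps(J, r, lambdas, k):
    """Levenberg-Marquardt steps for several damping values from one shared basis."""
    b = [-g for g in matvec(transpose(J), r)]  # right-hand side -J^T r
    V = krylov_basis(J, b, k)                  # computed once, reused below
    JV = [matvec(J, v) for v in V]
    m = len(V)
    T = [[dot(JV[i], JV[j]) for j in range(m)] for i in range(m)]  # V^T J^T J V
    rhs = [dot(v, b) for v in V]
    steps = {}
    for lam in lambdas:
        A = [[T[i][j] + (lam if i == j else 0.0) for j in range(m)] for i in range(m)]
        y = solve(A, rhs)  # small m-by-m solve per damping value
        steps[lam] = [sum(y[i] * V[i][p] for i in range(m)) for p in range(len(b))]
    return steps
```

Calling `lm_steps(J, r, [0.1, 1.0, 10.0], k)` costs one basis construction plus one small k-by-k solve per damping value; with k much smaller than the number of parameters, this is where the savings would come from. When k equals the number of parameters, the projected solution coincides with the direct solve up to rounding.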
Development of the Tensoral Computer Language

NASA Technical Reports Server (NTRS)

Ferziger, Joel; Dresselhaus, Eliot

1996-01-01

The research scientist or engineer wishing to perform large scale simulations or to extract useful information from existing databases is required to have expertise in the details of the particular database, the numerical methods and the computer architecture to be used. This poses a significant practical barrier to the use of simulation data. The goal of this research was to develop a high-level computer language called Tensoral, designed to remove this barrier. The Tensoral language provides a framework in which efficient generic data manipulations can be easily coded and implemented. First of all, Tensoral is general. The fundamental objects in Tensoral represent tensor fields and the operators that act on them. The numerical implementation of these tensors and operators is completely and flexibly programmable. New mathematical constructs and operators can be easily added to the Tensoral system. Tensoral is compatible with existing languages. Tensoral tensor operations co-exist in a natural way with a host language, which may be any sufficiently powerful computer language such as Fortran, C, or Vectoral. Tensoral is very-high-level. Tensor operations in Tensoral typically act on entire databases (i.e., arrays) at one time and may, therefore, correspond to many lines of code in a conventional language. Tensoral is efficient. Tensoral is a compiled language. Database manipulations are simplified, optimized, and scheduled by the compiler, eventually resulting in efficient machine code to implement them.
Development and application of the GIM code for the Cyber 203 computer

NASA Technical Reports Server (NTRS)

Stalnaker, J. F.; Robinson, M. A.; Rawlinson, E. G.; Anderson, P. G.; Mayne, A. W.; Spradley, L. W.

1982-01-01

The GIM computer code for fluid dynamics research was developed. Enhancement of the computer code, implicit algorithm development, turbulence model implementation, chemistry model development, interactive input module coding and wing/body flowfield computation are described. The GIM quasi-parabolic code development was completed, and the code used to compute a number of example cases. Turbulence models, algebraic and differential equations, were added to the basic viscous code. An equilibrium reacting chemistry model and implicit finite difference scheme were also added. Development was completed on the interactive module for generating the input data for GIM. Solutions for inviscid hypersonic flow over a wing/body configuration are also presented.
Recent advances and future prospects for Monte Carlo

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, Forrest B

2010-01-01

The history of Monte Carlo methods is closely linked to that of computers: The first known Monte Carlo program was written in 1947 for the ENIAC; a pre-release of the first Fortran compiler was used for Monte Carlo in 1957; Monte Carlo codes were adapted to vector computers in the 1980s, clusters and parallel computers in the 1990s, and teraflop systems in the 2000s. Recent advances include hierarchical parallelism, combining threaded calculations on multicore processors with message-passing among different nodes. With the advances in computing, Monte Carlo codes have evolved with new capabilities and new ways of use. Production codes such as MCNP, MVP, MONK, TRIPOLI and SCALE are now 20-30 years old (or more) and are very rich in advanced features. The former 'method of last resort' has now become the first choice for many applications. Calculations are now routinely performed on office computers, not just on supercomputers. Current research and development efforts are investigating the use of Monte Carlo methods on FPGAs, GPUs, and many-core processors. Other far-reaching research is exploring ways to adapt Monte Carlo methods to future exaflop systems that may have 1M or more concurrent computational processes.
Recommendations for open data science.

PubMed

Gymrek, Melissa; Farjoun, Yossi

2016-01-01

Life science research increasingly relies on large-scale computational analyses. However, the code and data used for these analyses are often lacking in publications. To maximize scientific impact, reproducibility, and reuse, it is crucial that these resources are made publicly available and are fully transparent. We provide recommendations for improving the openness of data-driven studies in life sciences.
Program Helps Generate Boundary-Element Mathematical Models

NASA Technical Reports Server (NTRS)

Goldberg, R. K.

1995-01-01

Composite Model Generation-Boundary Element Method (COM-GEN-BEM) computer program significantly reduces time and effort needed to construct boundary-element mathematical models of continuous-fiber composite materials at micro-mechanical (constituent) scale. Generates boundary-element models compatible with BEST-CMS boundary-element code for analysis of micromechanics of composite material. Written in PATRAN Command Language (PCL).
A Proposal of Monitoring and Forecasting Method for Crustal Activity in and around Japan with 3-dimensional Heterogeneous Medium Using a Large-scale High-fidelity Finite Element Simulation

NASA Astrophysics Data System (ADS)

Hori, T.; Agata, R.; Ichimura, T.; Fujita, K.; Yamaguchi, T.; Takahashi, N.

2017-12-01

Although continuous, dense surface deformation data can now be obtained on land and partly on the sea floor, the data are not fully utilized for monitoring and forecasting of crustal activity, such as spatio-temporal variation in slip velocity on the plate interface including earthquakes, seismic wave propagation, and crustal deformation. To construct a system for monitoring and forecasting, it is necessary to develop a physics-based data analysis system including (1) a structural model with the 3D geometry of the plate interface and material properties such as elasticity and viscosity, (2) calculation code for crustal deformation and seismic wave propagation using (1), and (3) inverse analysis or data assimilation code for both structure and fault slip using (1) and (2). To accomplish this, it is at least necessary to develop highly reliable large-scale simulation code to calculate crustal deformation and seismic wave propagation for 3D heterogeneous structure. An unstructured FE non-linear seismic wave simulation code has been developed. This achieved physics-based urban earthquake simulation enhanced by 1.08 T DOF x 6.6 K time steps. A high-fidelity FEM simulation code with a mesh generator has also been developed to calculate crustal deformation in and around Japan with complicated surface topography and subducting plate geometry on a 1 km mesh. This crustal deformation code has been improved and has achieved 2.05 T DOF with 45 m resolution on the plate interface. This high-resolution analysis enables computation of the change in stress acting on the plate interface. Further, for inverse analyses, a waveform inversion code for modeling 3D crustal structure has been developed, and the high-fidelity FEM code has been improved to apply an adjoint method for estimating fault slip and asthenosphere viscosity. Hence, we have large-scale simulation and analysis tools for monitoring. We are developing methods for forecasting the slip velocity variation on the plate interface. Although the prototype is for an elastic half-space model, we are applying it to 3D heterogeneous structure with the high-fidelity FE model. Furthermore, large-scale simulation codes for monitoring are being implemented on GPU clusters, and analysis tools are being developed to include other functions such as examination of model errors.
Computational Infrastructure for Geodynamics (CIG)

NASA Astrophysics Data System (ADS)

Gurnis, M.; Kellogg, L. H.; Bloxham, J.; Hager, B. H.; Spiegelman, M.; Willett, S.; Wysession, M. E.; Aivazis, M.

2004-12-01

Solid earth geophysicists have a long tradition of writing scientific software to address a wide range of problems. In particular, computer simulations came into wide use in geophysics during the decade after the plate tectonic revolution. Solution schemes and numerical algorithms that developed in other areas of science, most notably engineering, fluid mechanics, and physics, were adapted with considerable success to geophysics. This software has largely been the product of individual efforts and although this approach has proven successful, its strength for solving problems of interest is now starting to show its limitations as we try to share codes and algorithms or when we want to recombine codes in novel ways to produce new science. With funding from the NSF, the US community has embarked on a Computational Infrastructure for Geodynamics (CIG) that will develop, support, and disseminate community-accessible software for the greater geodynamics community from model developers to end-users. The software is being developed for problems involving mantle and core dynamics, crustal and earthquake dynamics, magma migration, seismology, and other related topics. With a high level of community participation, CIG is leveraging state-of-the-art scientific computing into a suite of open-source tools and codes. The infrastructure that we are now starting to develop will consist of: (a) a coordinated effort to develop reusable, well-documented and open-source geodynamics software; (b) the basic building blocks - an infrastructure layer - of software by which state-of-the-art modeling codes can be quickly assembled; (c) extension of existing software frameworks to interlink multiple codes and data through a superstructure layer; (d) strategic partnerships with the larger world of computational science and geoinformatics; and (e) specialized training and workshops for both the geodynamics and broader Earth science communities. The CIG initiative has already started to leverage and develop long-term strategic partnerships with open source development efforts within the larger thrusts of scientific computing and geoinformatics. These strategic partnerships are essential as the frontier has moved into multi-scale and multi-physics problems in which many investigators now want to use simulation software for data interpretation, data assimilation, and hypothesis testing.
Extreme Scale Computing to Secure the Nation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, D L; McGraw, J R; Johnson, J R

2009-11-10

Since the dawn of modern electronic computing in the mid 1940's, U.S. national security programs have been dominant users of every new generation of high-performance computer. Indeed, the first general-purpose electronic computer, ENIAC (the Electronic Numerical Integrator and Computer), was used to calculate the expected explosive yield of early thermonuclear weapons designs. Even the U.S. numerical weather prediction program, another early application for high-performance computing, was initially funded jointly by sponsors that included the U.S. Air Force and Navy, agencies interested in accurate weather predictions to support U.S. military operations. For the decades of the cold war, national security requirements continued to drive the development of high performance computing (HPC), including advancement of the computing hardware and development of sophisticated simulation codes to support weapons and military aircraft design, numerical weather prediction as well as data-intensive applications such as cryptography and cybersecurity. U.S. national security concerns continue to drive the development of high-performance computers and software in the U.S. and in fact, events following the end of the cold war have driven an increase in the growth rate of computer performance at the high-end of the market. This mainly derives from our nation's observance of a moratorium on underground nuclear testing beginning in 1992, followed by our voluntary adherence to the Comprehensive Test Ban Treaty (CTBT) beginning in 1995. The CTBT prohibits further underground nuclear tests, which in the past had been a key component of the nation's science-based program for assuring the reliability, performance and safety of U.S. nuclear weapons. In response to this change, the U.S. Department of Energy (DOE) initiated the Science-Based Stockpile Stewardship (SBSS) program in response to the Fiscal Year 1994 National Defense Authorization Act, which requires, 'in the absence of nuclear testing, a program to: (1) Support a focused, multifaceted program to increase the understanding of the enduring stockpile; (2) Predict, detect, and evaluate potential problems of the aging of the stockpile; (3) Refurbish and re-manufacture weapons and components, as required; and (4) Maintain the science and engineering institutions needed to support the nation's nuclear deterrent, now and in the future'. This program continues to fulfill its national security mission by adding significant new capabilities for producing scientific results through large-scale computational simulation coupled with careful experimentation, including sub-critical nuclear experiments permitted under the CTBT. To develop the computational science and the computational horsepower needed to support its mission, SBSS initiated the Accelerated Strategic Computing Initiative, later renamed the Advanced Simulation & Computing (ASC) program (sidebar: 'History of ASC Computing Program Computing Capability'). The modern 3D computational simulation capability of the ASC program supports the assessment and certification of the current nuclear stockpile through calibration with past underground test (UGT) data. While an impressive accomplishment, continued evolution of national security mission requirements will demand computing resources at a significantly greater scale than we have today. In particular, continued observance and potential Senate confirmation of the Comprehensive Test Ban Treaty (CTBT) together with the U.S. administration's promise for a significant reduction in the size of the stockpile and the inexorable aging and consequent refurbishment of the stockpile all demand increasing refinement of our computational simulation capabilities. Assessment of the present and future stockpile with increased confidence of the safety and reliability without reliance upon calibration with past or future test data is a long-term goal of the ASC program. This will be accomplished through significant increases in the scientific bases that underlie the computational tools. Computer codes must be developed that replace phenomenology with increased levels of scientific understanding together with an accompanying quantification of uncertainty. These advanced codes will place significantly higher demands on the computing infrastructure than do the current 3D ASC codes. This article discusses not only the need for a future computing capability at the exascale for the SBSS program, but also considers high performance computing requirements for broader national security questions. For example, the increasing concern over potential nuclear terrorist threats demands a capability to assess threats and potential disablement technologies as well as a rapid forensic capability for determining a nuclear weapons design from post-detonation evidence (nuclear counterterrorism).
Asymptotic Expansion Homogenization for Multiscale Nuclear Fuel Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hales, J. D.; Tonks, M. R.; Chockalingam, K.

2015-03-01

Engineering scale nuclear fuel performance simulations can benefit by utilizing high-fidelity models running at a lower length scale. Lower length-scale models provide a detailed view of the material behavior that is used to determine the average material response at the macroscale. These lower length-scale calculations may provide insight into material behavior where experimental data is sparse or nonexistent. This multiscale approach is especially useful in the nuclear field, since irradiation experiments are difficult and expensive to conduct. The lower length-scale models complement the experiments by influencing the types of experiments required and by reducing the total number of experiments needed. This multiscale modeling approach is a central motivation in the development of the BISON-MARMOT fuel performance codes at Idaho National Laboratory. These codes seek to provide more accurate and predictive solutions for nuclear fuel behavior. One critical aspect of multiscale modeling is the ability to extract the relevant information from the lower length-scale simulations. One approach, the asymptotic expansion homogenization (AEH) technique, has proven to be an effective method for determining homogenized material parameters. The AEH technique prescribes a system of equations to solve at the microscale that are used to compute homogenized material constants for use at the engineering scale. In this work, we employ AEH to explore the effect of evolving microstructural thermal conductivity and elastic constants on nuclear fuel performance. We show that the AEH approach fits cleanly into the BISON and MARMOT codes and provides a natural, multidimensional homogenization capability.
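As a minimal illustration of what a "homogenized material constant" means, consider the simplest case, a periodic stack of flat layers. For that case homogenization has a closed-form answer: the effective conductivity across the layers is the volume-weighted harmonic mean, and along the layers it is the arithmetic mean. The sketch below is not the BISON or MARMOT implementation; the function names and the sample values are hypothetical.

```python
def across_layers_conductivity(fractions, conductivities):
    """Effective conductivity for heat flow normal to the layers.

    For a periodic laminate this is the volume-weighted harmonic mean.
    """
    if abs(sum(fractions) - 1.0) > 1e-12:
        raise ValueError("volume fractions must sum to 1")
    return 1.0 / sum(f / k for f, k in zip(fractions, conductivities))


def along_layers_conductivity(fractions, conductivities):
    """Effective conductivity for heat flow parallel to the layers.

    For a periodic laminate this is the volume-weighted arithmetic mean.
    """
    if abs(sum(fractions) - 1.0) > 1e-12:
        raise ValueError("volume fractions must sum to 1")
    return sum(f * k for f, k in zip(fractions, conductivities))


if __name__ == "__main__":
    # Hypothetical two-phase laminate: equal volume fractions,
    # conductivities in arbitrary but consistent units.
    fractions = [0.5, 0.5]
    conductivities = [1.0, 3.0]
    print(across_layers_conductivity(fractions, conductivities))
    print(along_layers_conductivity(fractions, conductivities))
```

The two results differ, so even this toy microstructure yields an anisotropic homogenized conductivity. General microstructures require solving the microscale equations numerically rather than using a closed form.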
A large-scale solar dynamics observatory image dataset for computer vision applications.

PubMed

Kucuk, Ahmet; Banda, Juan M; Angryk, Rafal A

2017-01-01

The National Aeronautics and Space Administration (NASA) Solar Dynamics Observatory (SDO) mission has given us unprecedented insight into the Sun's activity. By capturing approximately 70,000 images a day, this mission has created one of the richest and biggest repositories of solar image data available to mankind. With such massive amounts of information, researchers have been able to produce great advances in detecting solar events. In this resource, we compile SDO solar data into a single repository in order to provide the computer vision community with a standardized and curated large-scale dataset of several hundred thousand solar events found on high resolution solar images. This publicly available resource, along with the generation source code, will accelerate computer vision research on NASA's solar image data by reducing the amount of time spent performing data acquisition and curation from the multiple sources we have compiled. By improving the quality of the data with thorough curation, we anticipate wider adoption and interest across the computer vision and solar physics communities.
Extending the length and time scales of Gram-Schmidt Lyapunov vector computations

NASA Astrophysics Data System (ADS)

Costa, Anthony B.; Green, Jason R.

2013-08-01

Lyapunov vectors have found growing interest recently due to their ability to characterize systems out of thermodynamic equilibrium. The computation of orthogonal Gram-Schmidt vectors requires multiplication and QR decomposition of large matrices, which grow as N^2 (with the particle count N). This expense has limited such calculations to relatively small systems and short time scales. Here, we detail two implementations of an algorithm for computing Gram-Schmidt vectors. The first is a distributed-memory message-passing method using ScaLAPACK. The second uses the newly released MAGMA library for GPUs. We compare the performance of both codes for Lennard-Jones fluids from N=100 to 1300 between Intel Nehalem/InfiniBand DDR and NVIDIA C2050 architectures. To the best of our knowledge, these are the largest systems for which the Gram-Schmidt Lyapunov vectors have been computed, and the first time their calculation has been GPU-accelerated. We conclude that Lyapunov vector calculations can be significantly extended in length and time by leveraging the power of GPU-accelerated linear algebra.
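The multiply-then-reorthogonalize loop behind Gram-Schmidt vectors can be sketched on a toy problem. The example below is not the authors' ScaLAPACK or MAGMA code. It assumes a constant 2x2 Jacobian (the Arnold cat map) in place of the Lennard-Jones tangent dynamics, and a hand-written modified Gram-Schmidt in place of a library QR. All function names are hypothetical.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


def gram_schmidt(columns):
    """Modified Gram-Schmidt: return orthonormal columns and the diagonal of R."""
    q_columns, r_diagonal = [], []
    for column in columns:
        w = list(column)
        for q in q_columns:
            projection = sum(a * b for a, b in zip(q, w))
            w = [wi - projection * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(x * x for x in w))
        q_columns.append([x / norm for x in w])
        r_diagonal.append(norm)
    return q_columns, r_diagonal


def lyapunov_exponents(jacobian, steps=200):
    """Estimate Lyapunov exponents by repeated propagation and QR.

    The orthonormal columns after each step are the Gram-Schmidt vectors;
    the time-averaged log of diag(R) gives the exponents.
    """
    n = len(jacobian)
    q_columns = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]
    log_sums = [0.0] * n
    for _ in range(steps):
        propagated = [mat_vec(jacobian, q) for q in q_columns]
        q_columns, r_diagonal = gram_schmidt(propagated)
        log_sums = [s + math.log(r) for s, r in zip(log_sums, r_diagonal)]
    return [s / steps for s in log_sums]


if __name__ == "__main__":
    cat_map = [[2.0, 1.0], [1.0, 1.0]]  # area-preserving, so exponents sum to zero
    print(lyapunov_exponents(cat_map))
```

In a particle simulation the Jacobian changes every time step and has dimension proportional to N, which is why the matrix products and QR factorizations dominate the cost.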
Quantum computing applied to calculations of molecular energies: CH2 benchmark.

PubMed

Veis, Libor; Pittner, Jiří

2010-11-21

Quantum computers are appealing for their ability to solve some tasks much faster than their classical counterparts. It was shown in [Aspuru-Guzik et al., Science 309, 1704 (2005)] that they, if available, would be able to perform the full configuration interaction (FCI) energy calculations with a polynomial scaling. This is in contrast to conventional computers where FCI scales exponentially. We have developed a code for simulation of quantum computers and implemented our version of the quantum FCI algorithm. We provide a detailed description of this algorithm and the results of the assessment of its performance on the four lowest lying electronic states of the CH2 molecule. This molecule was chosen as a benchmark, since its two lowest lying ^1A_1 states exhibit a multireference character at the equilibrium geometry. It has been shown that with a suitably chosen initial state of the quantum register, one is able to achieve the probability amplification regime of the iterative phase estimation algorithm even in this case.
Accelerating three-dimensional FDTD calculations on GPU clusters for electromagnetic field simulation.

PubMed

Nagaoka, Tomoaki; Watanabe, Soichi

2012-01-01

Electromagnetic simulation with anatomically realistic computational human models using the finite-difference time-domain (FDTD) method has recently been performed in a number of fields in biomedical engineering. To improve the method's calculation speed and realize large-scale computing with the computational human model, we adapt three-dimensional FDTD code to a multi-GPU cluster environment with Compute Unified Device Architecture and Message Passing Interface. Our multi-GPU cluster system consists of three nodes, with seven GPU boards (NVIDIA Tesla C2070) mounted on each node. We examined the performance of the FDTD calculation in the multi-GPU cluster environment. We confirmed that the FDTD calculation on the multi-GPU cluster is faster than that on a multi-GPU single workstation, and we also found that the GPU cluster system calculates faster than a vector supercomputer. In addition, our GPU cluster system allowed us to perform the large-scale FDTD calculation because we were able to use GPU memory of over 100 GB.
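For readers unfamiliar with FDTD, the core of the method is a leapfrog update of staggered electric and magnetic fields. The sketch below is a one-dimensional, single-threaded analogue in normalized units, not the three-dimensional CUDA/MPI code described above. The grid size, Gaussian source, and function name are hypothetical choices for illustration.

```python
import math


def run_fdtd_1d(n_cells=200, n_steps=60, source_cell=100, courant=0.5):
    """Advance a 1D Yee-grid FDTD simulation and return the final Ez field.

    Units are normalized so that the Courant number is the only coefficient.
    Ez at both ends is never updated, which acts as a perfectly conducting wall.
    """
    ez = [0.0] * n_cells          # electric field at integer grid points
    hy = [0.0] * (n_cells - 1)    # magnetic field at half-integer grid points
    for step in range(n_steps):
        # Update H from the spatial difference of E.
        for i in range(n_cells - 1):
            hy[i] += courant * (ez[i + 1] - ez[i])
        # Update E from the spatial difference of H (interior points only).
        for i in range(1, n_cells - 1):
            ez[i] += courant * (hy[i] - hy[i - 1])
        # Soft Gaussian source added on top of the updated field.
        ez[source_cell] += math.exp(-((step - 20.0) / 6.0) ** 2)
    return ez


if __name__ == "__main__":
    field = run_fdtd_1d()
    print(max(abs(value) for value in field))
```

Each cell depends only on its nearest neighbours, which is what makes the scheme well suited to GPUs: in a multi-GPU setting only the boundary layers between subdomains need to be exchanged each step.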
Blade Displacement Predictions for the Full-Scale UH-60A Airloads Rotor

NASA Technical Reports Server (NTRS)

Biedron, Robert T.; Lee-Rausch, Elizabeth M.

2014-01-01

An unsteady Reynolds-Averaged Navier-Stokes solver for unstructured grids is loosely coupled to a rotorcraft comprehensive code and used to simulate two different test conditions from a wind-tunnel test of a full-scale UH-60A rotor. Performance data and sectional airloads from the simulation are compared with corresponding tunnel data to assess the level of fidelity of the aerodynamic aspects of the simulation. The focus then turns to a comparison of the blade displacements, both rigid (blade root) and elastic. Comparisons of computed root motions are made with data from three independent measurement systems. Finally, comparisons are made between computed elastic bending and elastic twist, and the corresponding measurements obtained from a photogrammetry system. Overall the correlation between computed and measured displacements was good, especially for the root pitch and lag motions and the elastic bending deformation. The correlation of root lead-lag motion and elastic twist deformation was less favorable.
An Optimization Code for Nonlinear Transient Problems of a Large Scale Multidisciplinary Mathematical Model

NASA Astrophysics Data System (ADS)

Takasaki, Koichi

This paper presents a program for the multidisciplinary optimization and identification problem of the nonlinear model of large aerospace vehicle structures. The program constructs the global matrix of the dynamic system in the time direction by the p-version finite element method (pFEM), and the basic matrix for each pFEM node in the time direction is described by a sparse matrix similarly to the static finite element problem. The algorithm used by the program does not require the Hessian matrix of the objective function and so has low memory requirements. It also has a relatively low computational cost, and is suited to parallel computation. The program was integrated as a solver module of the multidisciplinary analysis system CUMuLOUS (Computational Utility for Multidisciplinary Large scale Optimization of Undense System) which is under development by the Aerospace Research and Development Directorate (ARD) of the Japan Aerospace Exploration Agency (JAXA).
Testing and Validating Gadget2 for GPUs

NASA Astrophysics Data System (ADS)

Wibking, Benjamin; Holley-Bockelmann, K.; Berlind, A. A.

2013-01-01

We are currently upgrading a version of Gadget2 (Springel et al., 2005) that is optimized for NVIDIA's CUDA GPU architecture (Frigaard, unpublished) to work with the latest libraries and graphics cards. Preliminary tests of its performance indicate a ~40x speedup in the particle force tree approximation calculation, with overall speedup of 5-10x for cosmological simulations run with GPUs compared to running on the same CPU cores without GPU acceleration. We believe this speedup can be reasonably increased by an additional factor of two with further optimization, including overlap of computation on CPU and GPU. Tests of single-precision GPU numerical fidelity currently indicate accuracy of the mass function and the spectral power density to within a few percent of extended-precision CPU results with the unmodified form of Gadget. Additionally, we plan to test and optimize the GPU code for Millennium-scale "grand challenge" simulations of >10^9 particles, a scale that has been previously untested with this code, with the aid of the NSF XSEDE flagship GPU-based supercomputing cluster codenamed "Keeneland." Current work involves additional validation of numerical results, extending the numerical precision of the GPU calculations to double precision, and evaluating performance/accuracy tradeoffs. We believe that this project, if successful, will yield substantial computational performance benefits to the N-body research community as the next generation of GPU supercomputing resources becomes available, both increasing the electrical power efficiency of ever-larger computations (making simulations possible a decade from now at scales and resolutions unavailable today) and accelerating the pace of research in the field.
Large Scale Earth's Bow Shock with Northern IMF as Simulated by PIC Code in Parallel with MHD Model

NASA Astrophysics Data System (ADS)

Baraka, Suleiman

2016-06-01

In this paper, we propose a 3D kinetic model (particle-in-cell, PIC) for the description of the large-scale Earth's bow shock. The proposed version is stable and does not require huge or extensive computer resources. Because PIC simulations work with scaled plasma and field parameters, we also propose to validate our code by comparing its results with the available MHD simulations under the same scaled solar wind (SW) and interplanetary magnetic field (IMF) conditions. We report new results from the two models. In both codes the Earth's bow shock position is found to be ≈14.8 R_E along the Sun-Earth line, and ≈29 R_E on the dusk side. Those findings are consistent with past in situ observations. Both simulations reproduce the theoretical jump conditions at the shock. However, the PIC code density and temperature distributions are inflated and slightly shifted sunward when compared to the MHD results. Kinetic electron motions and reflected ions upstream may cause this sunward shift. Species distributions in the foreshock region are depicted within the transition of the shock (measured ≈2 c/ω_pi for Θ_Bn = 90° and M_MS = 4.7) and in the downstream. The size of the foot jump in the magnetic field at the shock is measured to be 1.7 c/ω_pi. In the foreshock region, the thermal velocity is found equal to 213 km s^-1 at 15 R_E and is equal to 63 km s^-1 at 12 R_E (magnetosheath region). Despite the large cell size of the current version of the PIC code, it is powerful enough to retain the macrostructure of planetary magnetospheres in a very short time, and thus it can be used for pedagogical test purposes. It is also likely complementary with MHD to deepen our understanding of the large-scale magnetosphere.
Numerical investigation of roughness effects in aircraft icing calculations

NASA Astrophysics Data System (ADS)

Matheis, Brian Daniel

2008-10-01

Icing codes are playing a role of increasing significance in the design and certification of ice protected aircraft surfaces. However, in the interest of computational efficiency certain small scale physics of the icing problem are grossly approximated by the codes. One such small scale phenomenon is the effect of ice roughness on the development of the surface water film and on the convective heat transfer. This study uses computational methods to study the potential effect of ice roughness on both of these small scale phenomena. First, a two-dimensional condensed layer code is used to examine the effect of roughness on surface water development. It is found that the Couette approximation within the film breaks down as the wall shear goes to zero, depending on the film thickness. Roughness elements with initial flow separation in the air induce flow separation in the water layer at steady state, causing a trapping of the film. The amount of trapping for different roughness configurations is examined. Second, a three-dimensional incompressible Navier-Stokes code is developed to examine large scale ice roughness on the leading edge. The effect on the convective heat transfer and potential effect on the surface water dynamics is examined for a number of distributed roughness parameters including Reynolds number, roughness height, streamwise extent, roughness spacing and roughness shape. In most cases the roughness field increases the net average convective heat transfer on the leading edge while narrowing surface shear lines, indicating a choking of the surface water flow. Both effects show significant variation on the scale of the ice roughness. Both the change in heat transfer as well as the potential change in surface water dynamics are presented in terms of the development of singularities in the surface shear pattern.
Of particular interest is the effect of the smooth zone upstream of the roughness which shows both a relatively large increase in convective heat transfer as well as excessive choking of the surface shear lines at the upstream end of the roughness field. A summary of the heat transfer results is presented for both the averaged heat transfer as well as the maximum heat transfer over each roughness element, indicating that the roughness Reynolds number is the primary parameter which characterizes the behavior of the roughness for the problem of interest.
Enhancing Scalability and Efficiency of the TOUGH2_MP for Linux Clusters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Keni; Wu, Yu-Shu

2006-04-17

TOUGH2_MP, the parallel version of the TOUGH2 code, has been enhanced by implementing more efficient communication schemes. This enhancement is achieved through reducing the number of small-size messages and the volume of large messages. The message exchange speed is further improved by using non-blocking communications for both linear and nonlinear iterations. In addition, we have modified the AZTEC parallel linear-equation solver to use non-blocking communication. Through improved code structuring and bug fixing, the new version of the code is now more stable, while demonstrating similar or even better nonlinear iteration convergence speed than the original TOUGH2 code. As a result, the new version of TOUGH2_MP is improved significantly in its efficiency. In this paper, the scalability and efficiency of the parallel code are demonstrated by solving two large-scale problems. The testing results indicate that speedup of the code may depend on both problem size and complexity. In general, the code has excellent scalability in memory requirement as well as computing time.

Multi-scale computational modeling of developmental biology.

PubMed

Setty, Yaki

2012-08-01

Normal development of multicellular organisms is regulated by a highly complex process in which a set of precursor cells proliferate, differentiate and move, forming over time a functioning tissue. To handle their complexity, developmental systems can be studied over distinct scales. The dynamics of each scale is determined by the collective activity of entities at the scale below it. I describe a multi-scale computational approach for modeling developmental systems and detail the methodology through a synthetic example of a developmental system that retains key features of real developmental systems. I discuss the simulation of the system as it emerges from cross-scale and intra-scale interactions and describe how an in silico study can be carried out by modifying these interactions in a way that mimics in vivo experiments. I highlight biological features of the results through a comparison with findings in Caenorhabditis elegans germline development and finally discuss the applications of the approach in real developmental systems and propose future extensions. The source code of the model of the synthetic developmental system can be found at www.wisdom.weizmann.ac.il/~yaki/MultiScaleModel. Contact: yaki.setty@gmail.com. Supplementary data are available at Bioinformatics online.
The Design of PSB-VVER Experiments Relevant to Accident Management

NASA Astrophysics Data System (ADS)

Nevo, Alessandro Del; D'Auria, Francesco; Mazzini, Marino; Bykov, Michael; Elkin, Ilya V.; Suslov, Alexander

Experimental programs carried out in integral test facilities are relevant for validating the best estimate thermal-hydraulic codes, which are used for accident analyses, design of accident management procedures, licensing of nuclear power plants, etc. The validation process, in fact, is based on well designed experiments. It consists of comparing the measured and calculated parameters and determining whether a computer code has an adequate capability in predicting the major phenomena expected to occur in the course of transients and/or accidents. University of Pisa was responsible for the numerical design of the 12 experiments executed in the PSB-VVER facility, operated at Electrogorsk Research and Engineering Center (Russia), in the framework of the TACIS 2.03/97 Contract 3.03.03 Part A, EC financed. The paper describes the methodology adopted at University of Pisa, starting from the scenarios foreseen in the final test matrix until the execution of the experiments. This process considers three key topics: a) the scaling issue and the simulation, with unavoidable distortions, of the expected performance of the reference nuclear power plants; b) the code assessment process involving the identification of phenomena challenging the code models; c) the features of the concerned integral test facility (scaling limitations, control logics, data acquisition system, instrumentation, etc.). The activities performed in this respect are discussed, and emphasis is also given to the relevance of the thermal losses to the environment. This issue particularly affects the small scaled facilities and has relevance to the scaling approach related to the power and volume of the facility.
Finite-difference method Stokes solver (FDMSS) for 3D pore geometries: Software development, validation and case studies

NASA Astrophysics Data System (ADS)

Gerke, Kirill M.; Vasilyev, Roman V.; Khirevich, Siarhei; Collins, Daniel; Karsanina, Marina V.; Sizonenko, Timofey O.; Korost, Dmitry V.; Lamontagne, Sébastien; Mallants, Dirk

2018-05-01

Permeability is one of the fundamental properties of porous media and is required for large-scale Darcian fluid flow and mass transport models. Whilst permeability can be measured directly at a range of scales, there are increasing opportunities to evaluate permeability from pore-scale fluid flow simulations. We introduce the free software Finite-Difference Method Stokes Solver (FDMSS) that solves Stokes equation using a finite-difference method (FDM) directly on voxelized 3D pore geometries (i.e. without meshing). Based on explicit convergence studies, validation on sphere packings with analytically known permeabilities, and comparison against lattice-Boltzmann and other published FDM studies, we conclude that FDMSS provides a computationally efficient and accurate basis for single-phase pore-scale flow simulations. By implementing an efficient parallelization and code optimization scheme, permeability inferences can now be made from 3D images of up to 10^9 voxels using modern desktop computers. Case studies demonstrate the broad applicability of the FDMSS software for both natural and artificial porous media.
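Once a pore-scale velocity field is available, permeability usually follows from Darcy's law, k = mu * q * L / dP, where q is the velocity averaged over the whole sample volume (solid voxels count as zero). The sketch below shows that post-processing step only. It is a generic illustration, not code from FDMSS, and the function name and sample numbers are hypothetical.

```python
import math


def darcy_permeability(voxel_velocities, viscosity, sample_length, pressure_drop):
    """Return permeability in m^2 from a voxelized velocity field.

    voxel_velocities: flow-direction velocity in every voxel of the sample,
        in m/s, with 0.0 for solid voxels (so the mean is the Darcy flux).
    viscosity: dynamic viscosity in Pa s.
    sample_length: sample length along the flow direction in m.
    pressure_drop: pressure difference across the sample in Pa.
    """
    if not voxel_velocities:
        raise ValueError("velocity field is empty")
    if pressure_drop <= 0.0:
        raise ValueError("pressure drop must be positive")
    darcy_flux = sum(voxel_velocities) / len(voxel_velocities)
    return viscosity * darcy_flux * sample_length / pressure_drop


if __name__ == "__main__":
    # Hypothetical 4-voxel sample: two solid voxels, two pore voxels.
    velocities = [0.0, 0.0, 2.0e-3, 2.0e-3]
    k = darcy_permeability(velocities, viscosity=1.0e-3,
                           sample_length=1.0e-3, pressure_drop=1.0)
    print(k, "m^2")
```

Averaging over all voxels rather than over pore voxels only is the design choice that matters here: it yields the superficial (Darcy) velocity, which is the quantity Darcy's law is written in.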
Large-scale functional models of visual cortex for remote sensing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brumby, Steven P; Kenyon, Garrett; Rasmussen, Craig E

Neuroscience has revealed many properties of neurons and of the functional organization of visual cortex that are believed to be essential to human vision, but are missing in standard artificial neural networks. Equally important may be the sheer scale of visual cortex requiring approximately 1 petaflop of computation. In a year, the retina delivers approximately 1 petapixel to the brain, leading to massively large opportunities for learning at many levels of the cortical system. We describe work at Los Alamos National Laboratory (LANL) to develop large-scale functional models of visual cortex on LANL's Roadrunner petaflop supercomputer. An initial run of a simple region V1 code achieved 1.144 petaflops during trials at the IBM facility in Poughkeepsie, NY (June 2008). Here, we present criteria for assessing when a set of learned local representations is 'complete' along with general criteria for assessing computer vision models based on their projected scaling behavior. Finally, we extend one class of biologically-inspired learning models to problems of remote sensing imagery.
Sharply curved turn around duct flow predictions using spectral partitioning of the turbulent kinetic energy and a pressure modified wall law

NASA Technical Reports Server (NTRS)

Santi, L. Michael

1986-01-01

Computational predictions of turbulent flow in sharply curved 180 degree turn around ducts are presented. The CNS2D computer code is used to solve the equations of motion for two-dimensional incompressible flows transformed to a nonorthogonal body-fitted coordinate system. This procedure incorporates the pressure velocity correction algorithm SIMPLE-C to iteratively solve a discretized form of the transformed equations. A multiple scale turbulence model based on simplified spectral partitioning is employed to obtain closure. Flow field predictions utilizing the multiple scale model are compared to features predicted by the traditional single scale k-epsilon model. Tuning parameter sensitivities of the multiple scale model applied to turn around duct flows are also determined. In addition, a wall function approach based on a wall law suitable for incompressible turbulent boundary layers under strong adverse pressure gradients is tested. Turn around duct flow characteristics utilizing this modified wall law are presented and compared to results based on a standard wall treatment.
Gravitational radiation from rotating gravitational collapse

NASA Technical Reports Server (NTRS)

Stark, Richard F.

1989-01-01

The efficiency of gravitational wave emission from axisymmetric rotating collapse to a black hole was found to be very low: ΔE/Mc^2 < 7 x 10^-4. The main waveform shape is well defined and nearly independent of the details of the collapse. Such a signature will allow pattern recognition techniques to be used when searching experimental data. These results (which can be scaled in mass) were obtained using a fully general relativistic computer code that evolves rotating axisymmetric configurations and directly computes their gravitational radiation emission.
DESCRIPTION OF THE ENIAC CONVERTER CODE

DTIC Science & Technology

The report is intended as a working manual for personnel preparing problems for the ENIAC. It should also serve as a guide to those groups who have...computing problems that could be solved on the ENIAC. The report discusses the ENIAC from the point of view of the coder, describing its memory as well...accomplishes as well as how to use each instruction. A few remarks are made on the more general subject of problem preparation for large scale computers in general based on the experience of operating the ENIAC. (Author)
Effects of cosmic rays on single event upsets

NASA Technical Reports Server (NTRS)

Venable, D. D.; Zajic, V.; Lowe, C. W.; Olidapupo, A.; Fogarty, T. N.

1989-01-01

Assistance was provided to the Brookhaven Single Event Upset (SEU) Test Facility. Computer codes were developed for fragmentation and secondary radiation affecting Very Large Scale Integration (VLSI) in space. A computer controlled CV (HP4192) test was developed for Terman analysis. Also developed were high speed parametric tests which are independent of operator judgment and a charge pumping technique for measurement of D_it(E). The X-ray secondary effects and parametric degradation as a function of dose rate were simulated. The SPICE simulation of static RAMs with various resistor filters was tested.
New Developments in Modeling MHD Systems on High Performance Computing Architectures

NASA Astrophysics Data System (ADS)

Germaschewski, K.; Raeder, J.; Larson, D. J.; Bhattacharjee, A.

2009-04-01

Modeling the wide range of time and length scales present even in fluid models of plasmas like MHD and X-MHD (Extended MHD including two-fluid effects like the Hall term, electron inertia, and electron pressure gradient) is challenging even on state-of-the-art supercomputers. In recent years, HPC capacity has continued to grow exponentially, but at the expense of making the computer systems more and more difficult to program in order to get maximum performance. In this paper, we will present a new approach to managing the complexity caused by the need to write efficient codes: separating the numerical description of the problem, in our case a discretized right hand side (r.h.s.), from the actual implementation of efficiently evaluating it. An automatic code generator is used to describe the r.h.s. in a quasi-symbolic form while leaving the translation into efficient and parallelized code to a computer program itself. We implemented this approach for OpenGGCM (Open General Geospace Circulation Model), a model of the Earth's magnetosphere, which was accelerated by a factor of three on regular x86 architecture and a factor of 25 on the Cell BE architecture (commonly known for its deployment in Sony's PlayStation 3).
National Combustion Code Validated Against Lean Direct Injection Flow Field Data

NASA Technical Reports Server (NTRS)

Iannetti, Anthony C.

2003-01-01

Most combustion processes have, in some way or another, a recirculating flow field. This recirculation stabilizes the reaction zone, or flame, but an unnecessarily large recirculation zone can result in high nitrogen oxide (NOx) values for combustion systems. The size of this recirculation zone is crucial to the performance of state-of-the-art, low-emissions hardware. If this is a large-scale combustion process, the flow field will probably be turbulent and, therefore, three-dimensional. This research dealt primarily with flow fields resulting from lean direct injection (LDI) concepts, as described in Research & Technology 2001. LDI is a concept that depends heavily on the design of the swirler. The LDI concept has the potential to reduce NOx values from 50 to 70 percent of current values, with good flame stability characteristics. It is cost effective and (hopefully) beneficial to do most of the design work for an LDI swirler using computer-aided design (CAD) and computer-aided engineering (CAE) tools. Computational fluid dynamics (CFD) codes are CAE tools that can calculate three-dimensional flows in complex geometries. However, CFD codes are only beginning to correctly calculate the flow fields for complex devices, and the related combustion models usually remove a large portion of the flow physics.
Xyce parallel electronic simulator users guide, version 6.1

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R; Mei, Ting; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers; A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message-passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Xyce parallel electronic simulator users' guide, Version 6.0.1.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R; Mei, Ting; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message-passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Xyce parallel electronic simulator users guide, version 6.0.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R; Mei, Ting; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message-passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Cloud4Psi: cloud computing for 3D protein structure similarity searching.

PubMed

Mrozek, Dariusz; Małysiak-Mrozek, Bożena; Kłapciński, Artur

2014-10-01

Popular methods for 3D protein structure similarity searching, especially those that generate high-quality alignments such as Combinatorial Extension (CE) and Flexible structure Alignment by Chaining Aligned fragment pairs allowing Twists (FATCAT) are still time consuming. As a consequence, performing similarity searching against large repositories of structural data requires increased computational resources that are not always available. Cloud computing provides huge amounts of computational power that can be provisioned on a pay-as-you-go basis. We have developed the cloud-based system that allows scaling of the similarity searching process vertically and horizontally. Cloud4Psi (Cloud for Protein Similarity) was tested in the Microsoft Azure cloud environment and provided good, almost linearly proportional acceleration when scaled out onto many computational units. Cloud4Psi is available as Software as a Service for testing purposes at: http://cloud4psi.cloudapp.net/. For source code and software availability, please visit the Cloud4Psi project home page at http://zti.polsl.pl/dmrozek/science/cloud4psi.htm. © The Author 2014. Published by Oxford University Press.
Cloud4Psi: cloud computing for 3D protein structure similarity searching

PubMed Central

Mrozek, Dariusz; Małysiak-Mrozek, Bożena; Kłapciński, Artur

2014-01-01

Summary: Popular methods for 3D protein structure similarity searching, especially those that generate high-quality alignments such as Combinatorial Extension (CE) and Flexible structure Alignment by Chaining Aligned fragment pairs allowing Twists (FATCAT) are still time consuming. As a consequence, performing similarity searching against large repositories of structural data requires increased computational resources that are not always available. Cloud computing provides huge amounts of computational power that can be provisioned on a pay-as-you-go basis. We have developed the cloud-based system that allows scaling of the similarity searching process vertically and horizontally. Cloud4Psi (Cloud for Protein Similarity) was tested in the Microsoft Azure cloud environment and provided good, almost linearly proportional acceleration when scaled out onto many computational units. Availability and implementation: Cloud4Psi is available as Software as a Service for testing purposes at: http://cloud4psi.cloudapp.net/. For source code and software availability, please visit the Cloud4Psi project home page at http://zti.polsl.pl/dmrozek/science/cloud4psi.htm. Contact: dariusz.mrozek@polsl.pl PMID:24930141
CHOLLA: A New Massively Parallel Hydrodynamics Code for Astrophysical Simulation

NASA Astrophysics Data System (ADS)

Schneider, Evan E.; Robertson, Brant E.

2015-04-01

We present Computational Hydrodynamics On ParaLLel Architectures (Cholla), a new three-dimensional hydrodynamics code that harnesses the power of graphics processing units (GPUs) to accelerate astrophysical simulations. Cholla models the Euler equations on a static mesh using state-of-the-art techniques, including the unsplit Corner Transport Upwind algorithm, a variety of exact and approximate Riemann solvers, and multiple spatial reconstruction techniques including the piecewise parabolic method (PPM). Using GPUs, Cholla evolves the fluid properties of thousands of cells simultaneously and can update over 10 million cells per GPU-second while using an exact Riemann solver and PPM reconstruction. Owing to the massively parallel architecture of GPUs and the design of the Cholla code, astrophysical simulations with physically interesting grid resolutions (≳256^3) can easily be computed on a single device. We use the Message Passing Interface library to extend calculations onto multiple devices and demonstrate nearly ideal scaling beyond 64 GPUs. A suite of test problems highlights the physical accuracy of our modeling and provides a useful comparison to other codes. We then use Cholla to simulate the interaction of a shock wave with a gas cloud in the interstellar medium, showing that the evolution of the cloud is highly dependent on its density structure. We reconcile the computed mixing time of a turbulent cloud with a realistic density distribution destroyed by a strong shock with the existing analytic theory for spherical cloud destruction by describing the system in terms of its median gas density.
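To put the quoted throughput in perspective, the following is a rough back-of-the-envelope sketch in Python. It assumes that "update" means advancing every cell of a 256^3 grid by one time step, and it uses the abstract's "over 10 million cells per GPU-second" as a lower bound on the rate; neither the variable names nor this interpretation come from the Cholla paper itself.

```python
# Back-of-the-envelope estimate; both input numbers are taken from the abstract above.
cells = 256 ** 3                     # number of cells in a 256^3 grid
cells_per_gpu_second = 10_000_000    # quoted as "over 10 million", so a lower bound

# Upper bound on the wall time for one full-grid update on a single GPU.
seconds_per_update = cells / cells_per_gpu_second

print(f"{cells:,} cells -> at most about {seconds_per_update:.2f} s per full-grid update")
```

Under these assumptions a single time step on a 256^3 grid takes under two seconds on one device, which is consistent with the abstract's claim that such resolutions are practical without multi-GPU runs.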
Implementation of a 3D mixing layer code on parallel computers

NASA Technical Reports Server (NTRS)

Roe, K.; Thakur, R.; Dang, T.; Bogucz, E.

1995-01-01

This paper summarizes our progress and experience in the development of a Computational-Fluid-Dynamics code on parallel computers to simulate three-dimensional spatially-developing mixing layers. In this initial study, the three-dimensional time-dependent Euler equations are solved using a finite-volume explicit time-marching algorithm. The code was first programmed in Fortran 77 for sequential computers. The code was then converted for use on parallel computers using the conventional message-passing technique, while we have not been able to compile the code with the present version of HPF compilers.
An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture

DOE PAGES

Mironov, Vladimir; Moskovsky, Alexander; D’Mello, Michael; ...

2017-10-04

The Hartree-Fock (HF) method in the quantum chemistry package GAMESS represents one of the most irregular algorithms in computation today. Major steps in the calculation are the irregular computation of electron repulsion integrals (ERIs) and the building of the Fock matrix. These are the central components of the main Self Consistent Field (SCF) loop, the key hotspot in Electronic Structure (ES) codes. By threading the MPI ranks in the official release of the GAMESS code, we not only speed up the main SCF loop (4x to 6x for large systems), but also achieve a significant (>2x) reduction in the overall memory footprint. These improvements are a direct consequence of memory access optimizations within the MPI ranks. We benchmark our implementation against the official release of the GAMESS code on the Intel® Xeon Phi™ supercomputer. Here, scaling numbers are reported on up to 7,680 cores on Intel Xeon Phi coprocessors.
GW/Bethe-Salpeter calculations for charged and model systems from real-space DFT

NASA Astrophysics Data System (ADS)

Strubbe, David A.

GW and Bethe-Salpeter (GW/BSE) calculations use mean-field input from density-functional theory (DFT) calculations to compute excited states of a condensed-matter system. Many parts of a GW/BSE calculation are efficiently performed in a plane-wave basis, and extensive effort has gone into optimizing and parallelizing plane-wave GW/BSE codes for large-scale computations. Most straightforwardly, plane-wave DFT can be used as a starting point, but real-space DFT is also an attractive starting point: it is systematically convergeable like plane waves, can take advantage of efficient domain parallelization for large systems, and is well suited physically for finite and especially charged systems. The flexibility of a real-space grid also allows convenient calculations on non-atomic model systems. I will discuss the interfacing of a real-space (TD)DFT code (Octopus, www.tddft.org/programs/octopus) with a plane-wave GW/BSE code (BerkeleyGW, www.berkeleygw.org), consider performance issues and accuracy, and present some applications to simple and paradigmatic systems that illuminate fundamental properties of these approximations in many-body perturbation theory.
The bioelectric code: An ancient computational medium for dynamic control of growth and form.

PubMed

Levin, Michael; Martyniuk, Christopher J

2018-02-01

What determines large-scale anatomy? DNA does not directly specify geometrical arrangements of tissues and organs, and a process of encoding and decoding for morphogenesis is required. Moreover, many species can regenerate and remodel their structure despite drastic injury. The ability to obtain the correct target morphology from a diversity of initial conditions reveals that the morphogenetic code implements a rich system of pattern-homeostatic processes. Here, we describe an important mechanism by which cellular networks implement pattern regulation and plasticity: bioelectricity. All cells, not only nerves and muscles, produce and sense electrical signals; in vivo, these processes form bioelectric circuits that harness individual cell behaviors toward specific anatomical endpoints. We review emerging progress in reading and re-writing anatomical information encoded in bioelectrical states, and discuss the approaches to this problem from the perspectives of information theory, dynamical systems, and computational neuroscience. Cracking the bioelectric code will enable much-improved control over biological patterning, advancing basic evolutionary developmental biology as well as enabling numerous applications in regenerative medicine and synthetic bioengineering. Copyright © 2017 Elsevier B.V. All rights reserved.

A Biosequence-based Approach to Software Characterization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oehmen, Christopher S.; Peterson, Elena S.; Phillips, Aaron R.

For many applications, it is desirable to have some process for recognizing when software binaries are closely related without relying on them to be identical or have identical segments. Some examples include monitoring utilization of high performance computing centers or service clouds, detecting freeware in licensed code, and enforcing application whitelists. But doing so in a dynamic environment is a nontrivial task because most approaches to software similarity require extensive and time-consuming analysis of a binary, or they fail to recognize executables that are similar but nonidentical. Presented herein is a novel biosequence-based method for quantifying similarity of executable binaries. Using this method, it is shown in an example application on large-scale multi-author codes that 1) the biosequence-based method has a statistical performance in recognizing and distinguishing between a collection of real-world high performance computing applications better than 90% of ideal; and 2) an example of using family tree analysis to tune identification for a code subfamily can achieve better than 99% of ideal performance.
Investigation of thermal protection systems effects on viscid and inviscid flow fields for manned entry systems

NASA Technical Reports Server (NTRS)

Bartlett, E. P.; Morse, H. L.; Tong, H.

1971-01-01

Procedures and methods for predicting aerothermodynamic heating to delta orbiter shuttle vehicles were reviewed. A number of approximate methods were found to be adequate for large scale parameter studies, but are considered inadequate for final design calculations. It is recommended that final design calculations be based on a computer code which accounts for nonequilibrium chemistry, streamline spreading, entropy swallowing, and turbulence. It is further recommended that this code be developed with the intent that it can be directly coupled with an exact inviscid flow field calculation when the latter becomes available. A nonsimilar, equilibrium chemistry computer code (BLIMP) was used to evaluate the effects of entropy swallowing, turbulence, and various three dimensional approximations. These solutions were compared with available wind tunnel data. It was found that, for wind tunnel conditions, the effects of entropy swallowing and three dimensionality are small for laminar boundary layers, but entropy swallowing causes a significant increase in turbulent heat transfer. However, it is noted that even small effects (say, 10-20%) may be important for the shuttle reusability concept.
Spectral-Element Seismic Wave Propagation Codes for both Forward Modeling in Complex Media and Adjoint Tomography

NASA Astrophysics Data System (ADS)

Smith, J. A.; Peter, D. B.; Tromp, J.; Komatitsch, D.; Lefebvre, M. P.

2015-12-01

We present both SPECFEM3D_Cartesian and SPECFEM3D_GLOBE open-source codes, representing high-performance numerical wave solvers simulating seismic wave propagation for local-, regional-, and global-scale application. These codes are suitable for both forward propagation in complex media and tomographic imaging. Both solvers compute highly accurate seismic wave fields using the continuous Galerkin spectral-element method on unstructured meshes. Lateral variations in compressional- and shear-wave speeds, density, as well as 3D attenuation Q models, topography and fluid-solid coupling are all readily included in both codes. For global simulations, effects due to rotation, ellipticity, the oceans, 3D crustal models, and self-gravitation are additionally included. Both packages provide forward and adjoint functionality suitable for adjoint tomography on high-performance computing architectures. We highlight the most recent release of the global version which includes improved performance, simultaneous MPI runs, OpenCL and CUDA support via an automatic source-to-source transformation library (BOAST), parallel I/O readers and writers for databases using ADIOS and seismograms using the recently developed Adaptable Seismic Data Format (ASDF) with built-in provenance. This makes our spectral-element solvers current state-of-the-art, open-source community codes for high-performance seismic wave propagation on arbitrarily complex 3D models. Together with these solvers, we provide full-waveform inversion tools to image the Earth's interior at unprecedented resolution.
TRAC-PF1 code verification with data from the OTIS test facility. [Once-Through Integral System]

DOE Office of Scientific and Technical Information (OSTI.GOV)

Childerson, M.T.; Fujita, R.K.

1985-01-01

A computer code (TRAC-PF1/MOD1) developed for predicting transient thermal and hydraulic integral nuclear steam supply system (NSSS) response was benchmarked. Post-small break loss-of-coolant accident (LOCA) data from a scaled, experimental facility, designated the Once-Through Integral System (OTIS), were obtained for the Babcock and Wilcox NSSS and compared to TRAC predictions. The OTIS tests provided a challenging small break LOCA data set for TRAC verification. The major phases of a small break LOCA observed in the OTIS tests included pressurizer draining and loop saturation, intermittent reactor coolant system circulation, boiler-condenser mode, and the initial stages of refill. The TRAC code was successful in predicting OTIS loop conditions (system pressures and temperatures) after modification of the steam generator model. In particular, the code predicted both pool and auxiliary-feedwater initiated boiler-condenser mode heat transfer.
Improvements on non-equilibrium and transport Green function techniques: The next-generation TRANSIESTA

NASA Astrophysics Data System (ADS)

Papior, Nick; Lorente, Nicolás; Frederiksen, Thomas; García, Alberto; Brandbyge, Mads

2017-03-01

We present novel methods implemented within the non-equilibrium Green function code (NEGF) TRANSIESTA based on density functional theory (DFT). Our flexible, next-generation DFT-NEGF code handles devices with one or multiple electrodes (N_e ≥ 1) with individual chemical potentials and electronic temperatures. We describe its novel methods for electrostatic gating, contour optimizations, and assertion of charge conservation, as well as the newly implemented algorithms for optimized and scalable matrix inversion, performance-critical pivoting, and hybrid parallelization. Additionally, a generic NEGF "post-processing" code (TBTRANS/PHTRANS) for electron and phonon transport is presented with several novelties such as Hamiltonian interpolations, N_e ≥ 1 electrode capability, bond-currents, generalized interface for user-defined tight-binding transport, transmission projection using eigenstates of a projected Hamiltonian, and fast inversion algorithms for large-scale simulations easily exceeding 10^6 atoms on workstation computers. The new features of both codes are demonstrated and benchmarked for relevant test systems.
Avionic Data Bus Integration Technology

DTIC Science & Technology

1991-12-01

address the hardware-software interaction between a digital data bus and an avionic system. Very Large Scale Integration (VLSI) ICs and multiversion ...the SCP. In 1984, the Sperry Corporation developed a fault tolerant system which employed multiversion programming, voting, and monitoring for error... MULTIVERSION PROGRAMMING. N-version programming. 226 N-VERSION PROGRAMMING. The independent coding of a number, N, of redundant computer programs that
Corrigendum to "Nearest neighbor imputation of species-level, plot-scale forest structure attributes from LiDAR data"

Treesearch

Andrew T. Hudak; Nicholas L. Crookston; Jeffrey S. Evans; David E. Hall; Michael J. Falkowski

2009-01-01

The authors regret that an error was discovered in the code within the R software package, yaImpute (Crookston & Finley, 2008), which led to incorrect results reported in the above article. The Most Similar Neighbor (MSN) method computes the distance between reference observations and target observations in a projected space defined using canonical correlation...
Post-Test Analysis of 11% Break at PSB-VVER Experimental Facility using Cathare 2 Code

NASA Astrophysics Data System (ADS)

Sabotinov, Luben; Chevrier, Patrick

The best-estimate French thermal-hydraulic computer code CATHARE 2 Version 2.5_1 was used for post-test analysis of the experiment “11% upper plenum break”, conducted at the large-scale test facility PSB-VVER in Russia. The PSB rig is a 1:300 scaled model of the VVER-1000 NPP. A computer model has been developed for CATHARE 2 V2.5_1, taking into account all important components of the PSB facility: reactor model (lower plenum, core, bypass, upper plenum, downcomer), 4 separated loops, pressurizer, horizontal multitube steam generators, and break section. The secondary side is represented by a recirculation model. A large number of sensitivity calculations have been performed regarding break modeling, reactor pressure vessel modeling, counter current flow modeling, hydraulic losses, and heat losses. The comparison between calculated and experimental results shows good prediction of the basic thermal-hydraulic phenomena and parameters such as pressures, temperatures, void fractions, loop seal clearance, etc. The experimental and calculation results are very sensitive regarding the fuel cladding temperature, which shows a periodical nature. With the applied CATHARE 1D modeling, the global thermal-hydraulic parameters and the core heat up have been reasonably predicted.
Interface COMSOL-PHREEQC (iCP), an efficient numerical framework for the solution of coupled multiphysics and geochemistry

NASA Astrophysics Data System (ADS)

Nardi, Albert; Idiart, Andrés; Trinchero, Paolo; de Vries, Luis Manuel; Molinero, Jorge

2014-08-01

This paper presents the development, verification and application of an efficient interface, denoted as iCP, which couples two standalone simulation programs: the general purpose Finite Element framework COMSOL Multiphysics® and the geochemical simulator PHREEQC. The main goal of the interface is to maximize the synergies between the aforementioned codes, providing a numerical platform that can efficiently simulate a wide number of multiphysics problems coupled with geochemistry. iCP is written in Java and uses the IPhreeqc C++ dynamic library and the COMSOL Java-API. Given the large computational requirements of the aforementioned coupled models, special emphasis has been placed on numerical robustness and efficiency. To this end, the geochemical reactions are solved in parallel by balancing the computational load over multiple threads. First, a benchmark exercise is used to test the reliability of iCP regarding flow and reactive transport. Then, a large scale thermo-hydro-chemical (THC) problem is solved to show the code capabilities. The results of the verification exercise are successfully compared with those obtained using PHREEQC and the application case demonstrates the scalability of a large scale model, at least up to 32 threads.
Numerically modelling the large scale coronal magnetic field

NASA Astrophysics Data System (ADS)

Panja, Mayukh; Nandi, Dibyendu

2016-07-01

The solar corona spews out vast amounts of magnetized plasma into the heliosphere, which has a direct impact on the Earth's magnetosphere. Thus it is important that we develop an understanding of the dynamics of the solar corona. With our present technology it has not been possible to generate 3D magnetic maps of the solar corona; this warrants the use of numerical simulations to study the coronal magnetic field. A very popular method of doing this is to extrapolate the photospheric magnetic field using NLFF or PFSS codes. However, the extrapolations at different time intervals are completely independent of each other and do not capture the temporal evolution of magnetic fields. On the other hand, full MHD simulations of the global coronal field, apart from being computationally very expensive, would be physically less transparent, owing to the large number of free parameters that are typically used in such codes. This brings us to the magnetofrictional model, which is relatively simpler and computationally more economical. We have developed a magnetofrictional model in 3D spherical polar coordinates to study the large scale global coronal field. Here we present studies of changing connectivities between active regions in response to photospheric motions.
40 CFR 194.23 - Models and computer codes.

Code of Federal Regulations, 2013 CFR

2013-07-01

... 40 Protection of Environment 26 2013-07-01 2013-07-01 false Models and computer codes. 194.23... General Requirements § 194.23 Models and computer codes. (a) Any compliance application shall include: (1... obtain stable solutions; (iv) Computer models accurately implement the numerical models; i.e., computer...
40 CFR 194.23 - Models and computer codes.

Code of Federal Regulations, 2012 CFR

2012-07-01

... 40 Protection of Environment 26 2012-07-01 2011-07-01 true Models and computer codes. 194.23... General Requirements § 194.23 Models and computer codes. (a) Any compliance application shall include: (1... obtain stable solutions; (iv) Computer models accurately implement the numerical models; i.e., computer...
40 CFR 194.23 - Models and computer codes.

Code of Federal Regulations, 2014 CFR

2014-07-01

... 40 Protection of Environment 25 2014-07-01 2014-07-01 false Models and computer codes. 194.23... General Requirements § 194.23 Models and computer codes. (a) Any compliance application shall include: (1... obtain stable solutions; (iv) Computer models accurately implement the numerical models; i.e., computer...
40 CFR 194.23 - Models and computer codes.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 40 Protection of Environment 24 2010-07-01 2010-07-01 false Models and computer codes. 194.23... General Requirements § 194.23 Models and computer codes. (a) Any compliance application shall include: (1... obtain stable solutions; (iv) Computer models accurately implement the numerical models; i.e., computer...
40 CFR 194.23 - Models and computer codes.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 40 Protection of Environment 25 2011-07-01 2011-07-01 false Models and computer codes. 194.23... General Requirements § 194.23 Models and computer codes. (a) Any compliance application shall include: (1... obtain stable solutions; (iv) Computer models accurately implement the numerical models; i.e., computer...
Simulation-Based Probabilistic Seismic Hazard Assessment Using System-Level, Physics-Based Models: Assembling Virtual California

NASA Astrophysics Data System (ADS)

Rundle, P. B.; Rundle, J. B.; Morein, G.; Donnellan, A.; Turcotte, D.; Klein, W.

2004-12-01

The research community is rapidly moving towards the development of an earthquake forecast technology based on the use of complex, system-level earthquake fault system simulations. Using these topologically and dynamically realistic simulations, it is possible to develop ensemble forecasting methods similar to those used in weather and climate research. To effectively carry out such a program, one needs 1) a topologically realistic model to simulate the fault system; 2) data sets to constrain the model parameters through a systematic program of data assimilation; 3) a computational technology making use of modern paradigms of high performance and parallel computing systems; and 4) software to visualize and analyze the results. In particular, we focus attention on a new version of our code Virtual California (version 2001) in which we model all of the major strike slip faults in California, from the Mexico-California border to the Mendocino Triple Junction. Virtual California is a "backslip model", meaning that the long term rate of slip on each fault segment in the model is matched to the observed rate. We use the historic data set of earthquakes with magnitude M > 6 to define the frictional properties of 650 fault segments (degrees of freedom) in the model. To compute the dynamics and the associated surface deformation, we use message passing as implemented in the MPICH standard distribution on a Beowulf cluster consisting of >10 CPUs. We will also report results from implementing the code on significantly larger machines so that we can begin to examine much finer spatial scales of resolution, and to assess scaling properties of the code. We present results of simulations both as static images and as mpeg movies, so that the dynamical aspects of the computation can be assessed by the viewer. We compute a variety of statistics from the simulations, including magnitude-frequency relations, and compare these with data from real fault systems.
We report recent results on use of Virtual California for probabilistic earthquake forecasting for several sub-groups of major faults in California. These methods have the advantage that system-level fault interactions are explicitly included, as well as laboratory-based friction laws.
Writing analytic element programs in Python.

PubMed

Bakker, Mark; Kelson, Victor A

2009-01-01

The analytic element method is a mesh-free approach for modeling ground water flow at both the local and the regional scale. With the advent of the Python object-oriented programming language, it has become relatively easy to write analytic element programs. In this article, an introduction is given to the basic principles of the analytic element method and of the Python programming language. A simple, yet flexible, object-oriented design is presented for analytic element codes using multiple inheritance. New types of analytic elements may be added without the need for any changes in the existing part of the code. The presented code may be used to model flow to wells (with either a specified discharge or drawdown) and streams (with a specified head). The code may be extended by any hydrogeologist with a healthy appetite for writing computer code to solve more complicated ground water flow problems. Copyright © 2009 The Author(s). Journal Compilation © 2009 National Ground Water Association.
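The design idea in this abstract, where each element contributes its own analytic solution and the model simply superimposes them, can be illustrated with a minimal Python sketch. This is not the authors' code: the class names (`Element`, `Well`, `Constant`, `Model`), the confined-aquifer assumption (head equals discharge potential divided by transmissivity), and the use of single rather than multiple inheritance are all illustrative assumptions.

```python
import math


class Element:
    """Hypothetical base class: every element contributes a discharge potential."""

    def potential(self, x, y):
        raise NotImplementedError


class Well(Element):
    """Well with specified discharge Q (positive = pumping), steady confined flow."""

    def __init__(self, xw, yw, discharge, radius=0.1):
        self.xw, self.yw = xw, yw
        self.discharge = discharge
        self.radius = radius  # potential is held constant inside the well radius

    def potential(self, x, y):
        r = max(math.hypot(x - self.xw, y - self.yw), self.radius)
        return self.discharge / (2.0 * math.pi) * math.log(r)


class Constant(Element):
    """Uniform background potential."""

    def __init__(self, value):
        self.value = value

    def potential(self, x, y):
        return self.value


class Model:
    """Superimposes the potentials of all elements; assumes a confined aquifer."""

    def __init__(self, transmissivity):
        self.transmissivity = transmissivity
        self.elements = []

    def add(self, element):
        self.elements.append(element)
        return element

    def head(self, x, y):
        total = sum(e.potential(x, y) for e in self.elements)
        return total / self.transmissivity


model = Model(transmissivity=100.0)
model.add(Constant(2000.0))
model.add(Well(0.0, 0.0, discharge=500.0))
print(model.head(1.0, 0.0), model.head(10.0, 0.0))
```

Because `Model.head` only calls `potential` on each element, a new element type (for example a stream segment) can be added as another subclass without touching the existing classes, which is the extensibility property the abstract describes.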
Reactive transport modeling in the subsurface environment with OGS-IPhreeqc

NASA Astrophysics Data System (ADS)

He, Wenkui; Beyer, Christof; Fleckenstein, Jan; Jang, Eunseon; Kalbacher, Thomas; Naumov, Dimitri; Shao, Haibing; Wang, Wenqing; Kolditz, Olaf

2015-04-01

Worldwide, sustainable water resource management becomes an increasingly challenging task due to the growth of population and extensive applications of fertilizer in agriculture. Moreover, climate change causes further stresses to both water quantity and quality. Reactive transport modeling in the coupled soil-aquifer system is a viable approach to assess the impacts of different land use and groundwater exploitation scenarios on the water resources. However, the application of this approach is usually limited in spatial scale and to simplified geochemical systems due to the huge computational expense involved. Such computational expense is not only caused by solving the high non-linearity of the initial boundary value problems of water flow in the unsaturated zone numerically with rather fine spatial and temporal discretization for the correct mass balance and numerical stability, but also by the intensive computational task of quantifying geochemical reactions. In the present study, a flexible and efficient tool for large scale reactive transport modeling in variably saturated porous media and its applications are presented. The open source scientific software OpenGeoSys (OGS) is coupled with the IPhreeqc module of the geochemical solver PHREEQC. The new coupling approach makes full use of advantages from both codes: OGS provides a flexible choice of different numerical approaches for simulation of water flow in the vadose zone such as the pressure-based or mixed forms of Richards equation; whereas the IPhreeqc module leads to a simplification of data storage and its communication with OGS, which greatly facilitates the coupling and code updating. 
Moreover, a parallelization scheme with MPI (Message Passing Interface) is applied, in which the computational task of water flow and mass transport is partitioned through domain decomposition, whereas the efficient parallelization of geochemical reactions is achieved by smart allocation of computational workload over multiple compute nodes. The plausibility of the new coupling is verified by several benchmark tests. In addition, the efficiency of the new coupling approach is demonstrated by its application in a large scale scenario, in which the environmental fate of pesticides in a complex soil-aquifer system is studied.
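The abstract does not say how the geochemical workload is allocated across compute nodes. As a minimal sketch of the simplest possible scheme, the Python function below splits the grid cells into near-equal contiguous blocks, one per rank, which is workable because equilibrium chemistry can be solved for each cell independently. The function name `block_partition` and the even-split strategy are assumptions for illustration, not the OGS-IPhreeqc implementation, and no MPI calls are made here.

```python
def block_partition(n_cells, n_ranks):
    """Split cell indices 0..n_cells-1 into n_ranks contiguous half-open ranges.

    The first (n_cells % n_ranks) ranks receive one extra cell, so block
    sizes differ by at most one.
    """
    base, extra = divmod(n_cells, n_ranks)
    ranges = []
    start = 0
    for rank in range(n_ranks):
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# Each rank would run the geochemical solver only on its own range of cells.
for rank, (lo, hi) in enumerate(block_partition(10, 3)):
    print(f"rank {rank}: cells {lo}..{hi - 1}")
```

An even split is only a baseline: if the cost of the chemistry varies strongly between cells (for example near a reaction front), a scheme that weights cells by expected cost would balance the load better.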
Reactive transport modeling in variably saturated porous media with OGS-IPhreeqc

NASA Astrophysics Data System (ADS)

He, W.; Beyer, C.; Fleckenstein, J. H.; Jang, E.; Kalbacher, T.; Shao, H.; Wang, W.; Kolditz, O.

2014-12-01

Worldwide, sustainable water resource management becomes an increasingly challenging task due to the growth of population and extensive applications of fertilizer in agriculture. Moreover, climate change causes further stresses to both water quantity and quality. Reactive transport modeling in the coupled soil-aquifer system is a viable approach to assess the impacts of different land use and groundwater exploitation scenarios on the water resources. However, the application of this approach is usually limited in spatial scale and to simplified geochemical systems due to the huge computational expense involved. Such computational expense is not only caused by solving the high non-linearity of the initial boundary value problems of water flow in the unsaturated zone numerically with rather fine spatial and temporal discretization for the correct mass balance and numerical stability, but also by the intensive computational task of quantifying geochemical reactions. In the present study, a flexible and efficient tool for large scale reactive transport modeling in variably saturated porous media and its applications are presented. The open source scientific software OpenGeoSys (OGS) is coupled with the IPhreeqc module of the geochemical solver PHREEQC. The new coupling approach makes full use of advantages from both codes: OGS provides a flexible choice of different numerical approaches for simulation of water flow in the vadose zone such as the pressure-based or mixed forms of Richards equation; whereas the IPhreeqc module leads to a simplification of data storage and its communication with OGS, which greatly facilitates the coupling and code updating. 
Moreover, a parallelization scheme with MPI (Message Passing Interface) is applied, in which the computational task of water flow and mass transport is partitioned through domain decomposition, whereas the efficient parallelization of geochemical reactions is achieved by smart allocation of computational workload over multiple compute nodes. The plausibility of the new coupling is verified by several benchmark tests. In addition, the efficiency of the new coupling approach is demonstrated by its application in a large scale scenario, in which the environmental fate of pesticides in a complex soil-aquifer system is studied.
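As a rough illustration of the workload allocation idea in the abstract above, the following Python sketch splits the grid cells that need a geochemical calculation into nearly equal contiguous blocks, one per compute rank. The function name `partition_cells` and the even block rule are assumptions made for illustration; the abstract does not say how OGS-IPhreeqc actually assigns cells, and no MPI calls are made here.

```python
from typing import List


def partition_cells(n_cells: int, n_ranks: int) -> List[range]:
    """Split cell indices 0..n_cells-1 into n_ranks nearly equal contiguous blocks.

    Hypothetical helper: an even block partition is assumed, not taken from the paper.
    """
    if n_cells < 0:
        raise ValueError("n_cells must be non-negative")
    if n_ranks <= 0:
        raise ValueError("n_ranks must be positive")
    base, extra = divmod(n_cells, n_ranks)
    blocks = []
    start = 0
    for rank in range(n_ranks):
        # The first `extra` ranks take one additional cell each.
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


if __name__ == "__main__":
    for rank, block in enumerate(partition_cells(10, 3)):
        print(rank, list(block))
```

With 10 cells and 3 ranks the block sizes are 4, 3 and 3. In a real MPI run each rank would pass only its own block to the geochemical solver.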
Computational Astrophysical Magnetohydrodynamics

NASA Astrophysics Data System (ADS)

Norman, M. L.

1994-05-01

Cosmic magnetic fields have intrigued and vexed astrophysicists seeking to understand their complex dynamics in a wide variety of astronomical settings. Magnetic fields are believed to play an important role in regulating star formation in molecular clouds, providing an effective viscosity in accretion disks, accelerating astrophysical jets, and influencing the large scale structure of the ISM of disk galaxies. Radio observations of supernova remnants and extragalactic radio jets prove that magnetic fields are fundamentally linked to astrophysical particle acceleration. Magnetic fields exist on cosmological scales as shown by the existence of radio halos in clusters of galaxies. Theoretical investigation of these and other phenomena requires numerical simulations due to the inherent complexity of MHD, but until now neither the computer power nor the numerical algorithms existed to mount a serious attack on the most important problems. That has now changed. Advances in parallel computing and numerical algorithms now permit the simulation of fully nonlinear, time-dependent astrophysical MHD in 2D and 3D. In this talk, I will describe the ZEUS codes for astrophysical MHD developed at the Laboratory for Computational Astrophysics (LCA) at the University of Illinois. These codes are now available to the national community. The numerical algorithms and test suite used to validate them are briefly discussed. Several applications of ZEUS to topics listed above are presented. An extension of ZEUS to model ambipolar diffusion in weakly ionized plasmas is illustrated. I discuss how continuing exponential growth in computer power and new numerical algorithms under development will allow us to tackle two grand challenges: compressible MHD turbulence and relativistic MHD. This work is partially supported by grants NSF AST-9201113 and NASA NAG 5-2493.

Parallel workflow manager for non-parallel bioinformatic applications to solve large-scale biological problems on a supercomputer.

PubMed

Suplatov, Dmitry; Popova, Nina; Zhumatiy, Sergey; Voevodin, Vladimir; Švedas, Vytas

2016-04-01

Rapid expansion of online resources providing access to genomic, structural, and functional information associated with biological macromolecules opens an opportunity to gain a deeper understanding of the mechanisms of biological processes through systematic analysis of large datasets. This, however, requires novel strategies to optimally utilize computer processing power. Some methods in bioinformatics and molecular modeling require extensive computational resources. Other algorithms have fast implementations which take at most several hours to analyze a common input on a modern desktop station; however, because they are invoked many times for a large number of subtasks, the full task requires significant computing power. Therefore, an efficient computational solution to large-scale biological problems requires both a wise parallel implementation of resource-hungry methods and a smart workflow to manage multiple invocations of relatively fast algorithms. In this work, new software, mpiWrapper, has been developed to accommodate non-parallel implementations of scientific algorithms within the parallel supercomputing environment. The Message Passing Interface has been implemented to exchange information between nodes. Two specialized threads - one for task management and communication, and another for subtask execution - are invoked on each processing unit to avoid deadlock while using blocking calls to MPI. The mpiWrapper can be used to launch all conventional Linux applications without the need to modify their original source codes and supports resubmission of subtasks on node failure. We show that this approach can be used to process huge amounts of biological data efficiently by running non-parallel programs in parallel mode on a supercomputer. The C++ source code and documentation are available from http://biokinet.belozersky.msu.ru/mpiWrapper .
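The idea of running many unmodified serial programs side by side, described above, can be sketched in Python with a thread pool standing in for MPI ranks. This is not mpiWrapper's code or interface: `run_command`, `run_with_retry`, and `run_all` are hypothetical names, and resubmission is simplified to retrying a command that exits with a nonzero code, whereas the abstract describes resubmission on node failure.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], int]


def run_command(cmd: Sequence[str]) -> int:
    """Launch one unmodified command-line program and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_with_retry(cmd: Sequence[str], runner: Runner = run_command,
                   max_retries: int = 1) -> int:
    """Run a subtask; resubmit it up to max_retries times if it fails."""
    code = runner(cmd)
    retries = 0
    while code != 0 and retries < max_retries:
        code = runner(cmd)
        retries += 1
    return code


def run_all(commands: Sequence[Sequence[str]], workers: int = 4,
            runner: Runner = run_command, max_retries: int = 1) -> List[int]:
    """Run all subtasks on a pool of workers; return exit codes in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_with_retry, cmd, runner, max_retries)
                   for cmd in commands]
        return [future.result() for future in futures]
```

The `runner` argument is there so the scheduling logic can be exercised without launching real processes; by default each subtask is started with `subprocess.run`.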
Coding for parallel execution of hardware-in-the-loop millimeter-wave scene generation models on multicore SIMD processor architectures

NASA Astrophysics Data System (ADS)

Olson, Richard F.

2013-05-01

Rendering of point scatterer based radar scenes for millimeter wave (mmW) seeker tests in real-time hardware-in-the-loop (HWIL) scene generation requires efficient algorithms and vector-friendly computer architectures for complex signal synthesis. New processor technology from Intel implements an extended 256-bit vector SIMD instruction set (AVX, AVX2) in a multi-core CPU design providing peak execution rates of hundreds of GigaFLOPS (GFLOPS) on one chip. Real world mmW scene generation code can approach peak SIMD execution rates only after careful algorithm and source code design. An effective software design will maintain high computing intensity emphasizing register-to-register SIMD arithmetic operations over data movement between CPU caches or off-chip memories. Engineers at the U.S. Army Aviation and Missile Research, Development and Engineering Center (AMRDEC) applied two basic parallel coding methods to assess new 256-bit SIMD multi-core architectures for mmW scene generation in HWIL. These include use of POSIX threads built on vector library functions and more portable, high-level parallel code based on compiler technology (e.g. OpenMP pragmas and SIMD autovectorization). Since CPU technology is rapidly advancing toward high processor core counts and TeraFLOPS peak SIMD execution rates, it is imperative that coding methods be identified which produce efficient and maintainable parallel code. This paper describes the algorithms used in point scatterer target model rendering, the parallelization of those algorithms, and the execution performance achieved on an AVX multi-core machine using the two basic parallel coding methods. The paper concludes with estimates for scale-up performance on upcoming multi-core technology.
Interactomes to Biological Phase Space: a call to begin thinking at a new level in computational biology.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Davidson, George S.; Brown, William Michael

2007-09-01

Techniques for high throughput determinations of interactomes, together with high resolution protein colocalization maps within organelles and through membranes, will soon create a vast resource. With these data, biological descriptions, akin to the high dimensional phase spaces familiar to physicists, will become possible. These descriptions will capture sufficient information to make possible realistic, system-level models of cells. The descriptions and the computational models they enable will require powerful computing techniques. This report is offered as a call to the computational biology community to begin thinking at this scale and as a challenge to develop the required algorithms and codes to make use of the new data.
Computational thermochemistry: Automated generation of scale factors for vibrational frequencies calculated by electronic structure model chemistries

NASA Astrophysics Data System (ADS)

Yu, Haoyu S.; Fiedler, Lucas J.; Alecu, I. M.; Truhlar, Donald G.

2017-01-01

We present a Python program, FREQ, for determining the optimal scale factors for calculating harmonic vibrational frequencies, fundamental vibrational frequencies, and zero-point vibrational energies from electronic structure calculations. The program utilizes a previously published scale factor optimization model (Alecu et al., 2010) to efficiently obtain all three scale factors from a set of computed vibrational harmonic frequencies. In order to obtain the three scale factors, the user only needs to provide zero-point energies of 15 or 6 selected molecules. If the user has access to the Gaussian 09 or Gaussian 03 program, we provide the option for the user to run the program by entering the keywords for a certain method and basis set in the Gaussian 09 or Gaussian 03 program. Four other Python programs, input.py, input6, pbs.py, and pbs6.py, are also provided for generating Gaussian 09 or Gaussian 03 input and PBS files. The program can also be used with data from any other electronic structure package. A manual describing how to use this program is included in the code package.
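A least-squares fit gives one simple way to picture what a scale factor optimization does. The Python sketch below finds the single factor λ that minimizes the sum of squared differences between scaled computed values c and reference values r, which works out to λ = Σ(c·r) / Σ(c²). Treating the optimization as this plain least-squares problem is an assumption; the abstract does not spell out the criterion FREQ uses, and `least_squares_scale_factor` is a hypothetical name.

```python
from typing import Sequence


def least_squares_scale_factor(computed: Sequence[float],
                               reference: Sequence[float]) -> float:
    """Return the factor lam that minimizes sum((lam * c - r) ** 2).

    Setting the derivative with respect to lam to zero gives
    lam = sum(c * r) / sum(c * c).
    """
    if len(computed) == 0 or len(computed) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    numerator = sum(c * r for c, r in zip(computed, reference))
    denominator = sum(c * c for c in computed)
    if denominator == 0:
        raise ValueError("computed values must not all be zero")
    return numerator / denominator
```

In use, `computed` would hold calculated zero-point energies for the selected molecules and `reference` the corresponding benchmark values, in the same units and the same order.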
CFD Predictions for Transonic Performance of the ERA Hybrid Wing-Body Configuration

NASA Technical Reports Server (NTRS)

Deere, Karen A.; Luckring, James M.; McMillin, S. Naomi; Flamm, Jeffrey D.; Roman, Dino

2016-01-01

A computational study focused on transonic cruise performance conditions was performed for a Hybrid Wing Body configuration. In the absence of experimental data, two fully independent computational fluid dynamics analyses were conducted to add confidence to the estimated transonic performance predictions. The primary analysis was performed by Boeing with the structured overset-mesh code OVERFLOW. The secondary analysis was performed by NASA Langley Research Center with the unstructured-mesh code USM3D. Both analyses were performed at full-scale flight conditions and included three configurations customary to drag buildup and interference analysis: a powered complete configuration, the configuration with the nacelle/pylon removed, and the powered nacelle in isolation. The results in this paper are focused primarily on transonic performance up to cruise and through drag rise. Comparisons between the CFD results were very good despite some minor geometric differences in the two analyses.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Peratt, A.L.; Mostrom, M.A.

With the availability of 80--125 MHz microprocessors, the methodology developed for the simulation of problems in pulsed power and plasma physics on modern day supercomputers is now amenable to application on a wide range of platforms including laptops and workstations. While execution speeds with these processors do not match those of large scale computing machines, resources such as computer-aided-design (CAD) and graphical analysis codes are available to automate simulation setup and process data. This paper reports on the adaptation of IVORY, a three-dimensional, fully-electromagnetic, particle-in-cell simulation code, to this platform independent CAD environment. The primary purpose of this talk is to demonstrate how rapidly a pulsed power/plasma problem can be scoped out by an experimenter on a dedicated workstation. Demonstrations include a magnetically insulated transmission line, power flow in a graded insulator stack, a relativistic klystron oscillator, and the dynamics of a coaxial thruster for space applications.
Optimally combining dynamical decoupling and quantum error correction.

PubMed

Paz-Silva, Gerardo A; Lidar, D A

2013-01-01

Quantum control and fault-tolerant quantum computing (FTQC) are two of the cornerstones on which the hope of realizing a large-scale quantum computer is pinned, yet only preliminary steps have been taken towards formalizing the interplay between them. Here we explore this interplay using the powerful strategy of dynamical decoupling (DD), and show how it can be seamlessly and optimally integrated with FTQC. To this end we show how to find the optimal decoupling generator set (DGS) for various subspaces relevant to FTQC, and how to simultaneously decouple them. We focus on stabilizer codes, which represent the largest contribution to the size of the DGS, showing that the intuitive choice comprising the stabilizers and logical operators of the code is in fact optimal, i.e., minimizes a natural cost function associated with the length of DD sequences. Our work brings hybrid DD-FTQC schemes, and their potentially considerable advantages, closer to realization.
Optimally combining dynamical decoupling and quantum error correction

PubMed Central

Paz-Silva, Gerardo A.; Lidar, D. A.

2013-01-01

Quantum control and fault-tolerant quantum computing (FTQC) are two of the cornerstones on which the hope of realizing a large-scale quantum computer is pinned, yet only preliminary steps have been taken towards formalizing the interplay between them. Here we explore this interplay using the powerful strategy of dynamical decoupling (DD), and show how it can be seamlessly and optimally integrated with FTQC. To this end we show how to find the optimal decoupling generator set (DGS) for various subspaces relevant to FTQC, and how to simultaneously decouple them. We focus on stabilizer codes, which represent the largest contribution to the size of the DGS, showing that the intuitive choice comprising the stabilizers and logical operators of the code is in fact optimal, i.e., minimizes a natural cost function associated with the length of DD sequences. Our work brings hybrid DD-FTQC schemes, and their potentially considerable advantages, closer to realization. PMID:23559088
Experience Paper: Software Engineering and Community Codes Track in ATPESC

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dubey, Anshu; Riley, Katherine M.

The Argonne Training Program in Extreme Scale Computing (ATPESC) was started by the Argonne National Laboratory with the objective of expanding the ranks of better prepared users of high performance computing (HPC) machines. One of the unique aspects of the program was the inclusion of a software engineering and community codes track. The inclusion was motivated by the observation that projects with a good scientific and software process were better able to meet their scientific goals. In this paper we present our experience of running the software track from the beginning of the program until now. We discuss the motivations, the reception, and the evolution of the track over the years. We welcome discussion and input from the community to enhance the track in ATPESC, and also to facilitate inclusion of similar tracks in other HPC oriented training programs.
Comparison of two LES codes for wind turbine wake studies

NASA Astrophysics Data System (ADS)

Sarlak, H.; Pierella, F.; Mikkelsen, R.; Sørensen, J. N.

2014-06-01

For the third time, a blind test comparison was conducted in Norway in 2013, comparing numerical simulations of the rotor Cp and Ct and wake profiles with the experimental results. As the only large eddy simulation study among the participants, the results of the Technical University of Denmark (DTU), obtained with their in-house CFD solver EllipSys3D, proved more reliable than the other models in capturing the wake profiles and the turbulence intensities downstream of the turbine. It was therefore suggested at the workshop that other LES codes be investigated to compare their performance with EllipSys3D. The aim of this paper is to investigate two CFD solvers, DTU's in-house code EllipSys3D and the open-source toolbox OpenFoam, for a set of actuator line based LES computations. Two types of simulations are performed: the wake behind a single rotor and the wake behind a cluster of three inline rotors. Results are compared in terms of velocity deficit, turbulence kinetic energy and eddy viscosity. It is seen that both codes predict similar near-wake flow structures with the exception of OpenFoam's simulations without the subgrid-scale model. The differences begin to increase with increasing distance from the upstream rotor. From the single rotor simulations, EllipSys3D is found to predict a slower wake recovery in the case of uniform laminar flow. From the 3-rotor computations, it is seen that the difference between the codes is smaller, as the disturbance created by the downstream rotors causes breakdown of the wake structures and more homogeneous flow structures. It is finally observed that OpenFoam computations are more sensitive to the SGS models.
Spectral-element Seismic Wave Propagation on CUDA/OpenCL Hardware Accelerators

NASA Astrophysics Data System (ADS)

Peter, D. B.; Videau, B.; Pouget, K.; Komatitsch, D.

2015-12-01

Seismic wave propagation codes are essential tools to investigate a variety of wave phenomena in the Earth. Furthermore, they can now be used for seismic full-waveform inversions in regional- and global-scale adjoint tomography. Although these seismic wave propagation solvers are crucial ingredients to improve the resolution of tomographic images to answer important questions about the nature of Earth's internal processes and subsurface structure, their practical application is often limited due to high computational costs. They thus need high-performance computing (HPC) facilities to improve the current state of knowledge. At present, numerous large HPC systems embed many-core architectures such as graphics processing units (GPUs) to enhance numerical performance. Such hardware accelerators can be programmed using either the CUDA programming environment or the OpenCL language standard. CUDA software development targets NVIDIA graphic cards while OpenCL was adopted by additional hardware accelerators, e.g. AMD graphic cards, ARM-based processors as well as Intel Xeon Phi coprocessors. For seismic wave propagation simulations using the open-source spectral-element code package SPECFEM3D_GLOBE, we incorporated an automatic source-to-source code generation tool (BOAST) which allows us to use meta-programming of all computational kernels for forward and adjoint runs. Using our BOAST kernels, we generate optimized source code for both CUDA and OpenCL languages within the source code package. Thus, seismic wave simulations are now able to fully utilize CUDA and OpenCL hardware accelerators. We show benchmarks of forward seismic wave propagation simulations using SPECFEM3D_GLOBE on CUDA/OpenCL GPUs, validating results and comparing performances for different simulations and hardware usages.
Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers

DOE PAGES

Wang, Bei; Ethier, Stephane; Tang, William; ...

2017-06-29

The Gyrokinetic Toroidal Code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5D Vlasov-Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size with sufficient resolution, new thread-level optimizations have been introduced as well as a key additional domain decomposition. GTC-P's multiple levels of parallelism, including inter-node 2D domain decomposition and particle decomposition, as well as intra-node shared memory partition and vectorization have enabled pushing the scalability of the PIC method to extreme computational scales. In this paper, we describe the methods developed to build a highly parallelized PIC code across a broad range of supercomputer designs. This particularly includes implementations on heterogeneous systems using NVIDIA GPU accelerators and Intel Xeon Phi (MIC) co-processors and performance comparisons with state-of-the-art homogeneous HPC systems such as Blue Gene/Q. New discovery science capabilities in the magnetic fusion energy application domain are enabled, including investigations of Ion-Temperature-Gradient (ITG) driven turbulence simulations with unprecedented spatial resolution and long temporal duration. Performance studies with realistic fusion experimental parameters are carried out on multiple supercomputing systems spanning a wide range of cache capacities, cache-sharing configurations, memory bandwidth, interconnects and network topologies. These performance comparisons using a realistic discovery-science-capable domain application code provide valuable insights on optimization techniques across one of the broadest sets of current high-end computing platforms worldwide.
Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Bei; Ethier, Stephane; Tang, William

The Gyrokinetic Toroidal Code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5D Vlasov-Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size with sufficient resolution, new thread-level optimizations have been introduced as well as a key additional domain decomposition. GTC-P's multiple levels of parallelism, including inter-node 2D domain decomposition and particle decomposition, as well as intra-node shared memory partition and vectorization have enabled pushing the scalability of the PIC method to extreme computational scales. In this paper, we describe the methods developed to build a highly parallelized PIC code across a broad range of supercomputer designs. This particularly includes implementations on heterogeneous systems using NVIDIA GPU accelerators and Intel Xeon Phi (MIC) co-processors and performance comparisons with state-of-the-art homogeneous HPC systems such as Blue Gene/Q. New discovery science capabilities in the magnetic fusion energy application domain are enabled, including investigations of Ion-Temperature-Gradient (ITG) driven turbulence simulations with unprecedented spatial resolution and long temporal duration. Performance studies with realistic fusion experimental parameters are carried out on multiple supercomputing systems spanning a wide range of cache capacities, cache-sharing configurations, memory bandwidth, interconnects and network topologies. These performance comparisons using a realistic discovery-science-capable domain application code provide valuable insights on optimization techniques across one of the broadest sets of current high-end computing platforms worldwide.
Stroke Severity Affects Timing: Time From Stroke Code Activation to Initial Imaging is Longer in Patients With Milder Strokes.

PubMed

Kwei, Kimberly T; Liang, John; Wilson, Natalie; Tuhrim, Stanley; Dhamoon, Mandip

2018-05-01

Optimizing the time it takes to get a potential stroke patient to imaging is essential in a rapid stroke response. At our hospital, door-to-imaging time is comprised of 2 time periods: the time before a stroke is recognized, followed by the period after the stroke code is called during which the stroke team assesses and brings the patient to the computed tomography scanner. To control for delays due to triage, we isolated the time period after a potential stroke has been recognized, as few studies have examined the biases of stroke code responders. This "code-to-imaging time" (CIT) encompassed the time from stroke code activation to initial imaging, and we hypothesized that perception of stroke severity would affect how quickly stroke code responders act. In consecutively admitted ischemic stroke patients at The Mount Sinai Hospital emergency department, we tested associations between National Institutes of Health Stroke Scale scores (NIHSS), continuously and at different cutoffs, and CIT using spline regression, t tests for univariate analysis, and multivariable linear regression adjusting for age, sex, and race/ethnicity. In our study population, mean CIT was 26 minutes, and mean presentation NIHSS was 8. In univariate and multivariate analyses comparing CIT between mild and severe strokes, stroke scale scores <4 were associated with longer response times. Milder strokes are associated with a longer CIT with a threshold effect at a NIHSS of 4.
Pyramid image codes

NASA Technical Reports Server (NTRS)

Watson, Andrew B.

1990-01-01

All vision systems, both human and machine, transform the spatial image into a coded representation. Particular codes may be optimized for efficiency or to extract useful image features. Researchers explored image codes based on primary visual cortex in man and other primates. Understanding these codes will advance the art in image coding, autonomous vision, and computational human factors. In cortex, imagery is coded by features that vary in size, orientation, and position. Researchers have devised a mathematical model of this transformation, called the Hexagonal oriented Orthogonal quadrature Pyramid (HOP). In a pyramid code, features are segregated by size into layers, with fewer features in the layers devoted to large features. Pyramid schemes provide scale invariance, and are useful for coarse-to-fine searching and for progressive transmission of images. The HOP Pyramid is novel in three respects: (1) it uses a hexagonal pixel lattice, (2) it uses oriented features, and (3) it accurately models most of the prominent aspects of primary visual cortex. The transform uses seven basic features (kernels), which may be regarded as three oriented edges, three oriented bars, and one non-oriented blob. Application of these kernels to non-overlapping seven-pixel neighborhoods yields six oriented, high-pass pyramid layers, and one low-pass (blob) layer.
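The layer structure described above can be tallied with a few lines of Python. The sketch assumes the image has 7^n pixels and that the seven kernels are reapplied to the low-pass layer at each level, which is the usual pyramid recursion but is not stated explicitly in the abstract; `hop_layer_sizes` is a hypothetical name, and only coefficient counts are computed, not the transform itself.

```python
from typing import List, Tuple


def hop_layer_sizes(n_pixels: int) -> Tuple[List[int], int]:
    """Return (high-pass coefficient counts per level, final low-pass count).

    Each non-overlapping seven-pixel neighborhood yields six oriented
    high-pass coefficients and one low-pass coefficient (assumed recursion).
    """
    if n_pixels < 1:
        raise ValueError("n_pixels must be positive")
    high_pass = []
    low_pass = n_pixels
    while low_pass > 1:
        if low_pass % 7 != 0:
            raise ValueError("n_pixels must be a power of 7")
        neighborhoods = low_pass // 7
        high_pass.append(6 * neighborhoods)  # six oriented layers at this level
        low_pass = neighborhoods             # one blob value per neighborhood
    return high_pass, low_pass
```

For a 343-pixel image this gives 294, 42 and 6 high-pass coefficients at successive levels plus one low-pass value, 343 numbers in total, so under these assumptions the code has as many coefficients as the image has pixels and coarser levels hold fewer features.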
TRANSURANUS: a fuel rod analysis code ready for use

NASA Astrophysics Data System (ADS)

Lassmann, K.

1992-06-01

TRANSURANUS is a computer program for the thermal and mechanical analysis of fuel rods in nuclear reactors and was developed at the European Institute for Transuranium Elements (TUI). The TRANSURANUS code consists of a clearly defined mechanical-mathematical framework into which physical models can easily be incorporated. Besides its flexibility for different fuel rod designs, the TRANSURANUS code can deal with very different situations, as given for instance in an experiment, under normal, off-normal and accident conditions. The time scale of the problems to be treated may range from milliseconds to years. The code has a comprehensive material data bank for oxide, mixed oxide, carbide and nitride fuels, Zircaloy and steel claddings and different coolants. During its development great effort was spent on obtaining an extremely flexible tool which is easy to handle and exhibits very fast running times. The total development effort is approximately 40 man-years. In recent years interest in using this code has grown, and it is in use in several organisations, both research and private industry. The code is now available to all interested parties. The paper outlines the main features and capabilities of the TRANSURANUS code and its validation, and also treats some practical aspects.
ogs6 - a new concept for porous-fractured media simulations

NASA Astrophysics Data System (ADS)

Naumov, Dmitri; Bilke, Lars; Fischer, Thomas; Rink, Karsten; Wang, Wenqing; Watanabe, Norihiro; Kolditz, Olaf

2015-04-01

OpenGeoSys (OGS) is a scientific open-source initiative for numerical simulation of thermo-hydro-mechanical/chemical (THMC) processes in porous and fractured media, continuously developed since the mid-eighties. The basic concept is to provide a flexible numerical framework for solving coupled multi-field problems. OGS mainly targets applications in environmental geoscience, e.g. in the fields of contaminant hydrology, water resources management, waste deposits, or geothermal energy systems, but it has also been successfully applied to new topics in energy storage recently. OGS is actively participating in several international benchmarking initiatives, e.g. DECOVALEX (waste management), CO2BENCH (CO2 storage and sequestration), SeSBENCH (reactive transport processes) and HM-Intercomp (coupled hydrosystems). Despite the broad applicability of OGS in geo-, hydro- and energy-sciences, shortcomings in computational efficiency became obvious, and the code structure became too complicated for further efficient development. OGS-5 was designed for object-oriented FEM applications. However, in many multi-field problems a certain flexibility of tailored numerical schemes is essential. Therefore, a new concept was designed to overcome existing bottlenecks. The paradigms for ogs6 are: - Flexibility of numerical schemes (FEM/FVM/FDM), - Computational efficiency (PetaScale ready), - Developer- and user-friendly. ogs6 has a module-oriented architecture based on thematic libraries (e.g. MeshLib, NumLib) on the large scale and uses an object-oriented approach for the small scale interfaces. Usage of a linear algebra library (Eigen3) for the mathematical operations together with the ISO C++11 standard increases the expressiveness of the code and makes it more developer-friendly. The new C++ standard also makes the template meta-programming code used for compile-time optimizations more compact.
We have transitioned the main code development to the GitHub code hosting system (https://github.com/ufz/ogs). The very flexible revision control system Git, in combination with issue tracking, developer feedback and the code review options, improves the code quality and the development process in general. The continuous testing procedure of the benchmarks as it was established for OGS-5 is maintained. Additionally, unit testing, which is automatically triggered by any code changes, is executed by two continuous integration frameworks (Jenkins CI, Travis CI) which build and test the code on different operating systems (Windows, Linux, Mac OS), in multiple configurations and with different compilers (GCC, Clang, Visual Studio). To improve the testing possibilities further, XML-based file input formats are introduced, helping with automatic validation of the user contributed benchmarks. The first ogs6 prototype version 6.0.1 has been implemented for solving generic elliptic problems. The next steps envisaged are transient, non-linear and coupled problems. Literature: [1] Kolditz O, Shao H, Wang W, Bauer S (eds) (2014): Thermo-Hydro-Mechanical-Chemical Processes in Fractured Porous Media: Modelling and Benchmarking - Closed Form Solutions. In: Terrestrial Environmental Sciences, Vol. 1, Springer, Heidelberg, ISBN 978-3-319-11893-2, 315pp. http://www.springer.com/earth+sciences+and+geography/geology/book/978-3-319-11893-2 [2] Naumov D (2015): Computational Fluid Dynamics in Unconsolidated Sediments: Model Generation and Discrete Flow Simulations, PhD thesis, Technische Universität Dresden.
Variability in interhospital trauma data coding and scoring: A challenge to the accuracy of aggregated trauma registries.

PubMed

Arabian, Sandra S; Marcus, Michael; Captain, Kevin; Pomphrey, Michelle; Breeze, Janis; Wolfe, Jennefer; Bugaev, Nikolay; Rabinovici, Reuven

2015-09-01

Analyses of data aggregated in state and national trauma registries provide the platform for clinical, research, development, and quality improvement efforts in trauma systems. However, the interhospital variability and accuracy in data abstraction and coding have not yet been directly evaluated. This multi-institutional, Web-based, anonymous study examines interhospital variability and accuracy in data coding and scoring by registrars. Eighty-two American College of Surgeons (ACS)/state-verified Level I and II trauma centers were invited to determine different data elements including diagnostic, procedure, and Abbreviated Injury Scale (AIS) coding as well as selected National Trauma Data Bank definitions for the same fictitious case. Variability and accuracy in data entries were assessed by the maximal percent agreement among the registrars for the tested data elements, and 95% confidence intervals were computed to compare this level of agreement to the ideal value of 100%. Variability and accuracy in all elements were compared (χ² testing) based on Trauma Quality Improvement Program (TQIP) membership, level of trauma center, ACS verification, and registrar's certifications. Fifty registrars (61%) completed the survey. The overall accuracy for all tested elements was 64%. Variability was noted in all examined parameters except for the place of occurrence code in all groups and the lower extremity AIS code in Level II trauma centers and in the Certified Specialist in Trauma Registry- and Certified Abbreviated Injury Scale Specialist-certified registrar groups. No differences in variability were noted when groups were compared based on TQIP membership, level of center, ACS verification, and registrar's certifications, except for prehospital Glasgow Coma Scale (GCS), where TQIP respondents agreed more than non-TQIP centers (p = 0.004). There is variability and inaccuracy in interhospital data coding and scoring of injury information.
This finding casts doubt on the validity of registry data used in all aspects of trauma care and injury surveillance.
The Scalable Checkpoint/Restart Library

DOE Office of Scientific and Technical Information (OSTI.GOV)

Moody, A.

The Scalable Checkpoint/Restart (SCR) library provides an interface that codes may use to write out and read in application-level checkpoints in a scalable fashion. In the current implementation, checkpoint files are cached in local storage (hard disk or RAM disk) on the compute nodes. This technique provides scalable aggregate bandwidth and uses storage resources that are fully dedicated to the job. This approach addresses the two common drawbacks of checkpointing a large-scale application to a shared parallel file system, namely, limited bandwidth and file system contention. In fact, on current platforms, SCR scales linearly with the number of compute nodes. It has been benchmarked as high as 720 GB/s on 1094 nodes of Atlas, which is nearly two orders of magnitude faster than the parallel file system.
Scaling a Convection-Resolving RCM to Near-Global Scales

NASA Astrophysics Data System (ADS)

Leutwyler, D.; Fuhrer, O.; Chadha, T.; Kwasniewski, G.; Hoefler, T.; Lapillonne, X.; Lüthi, D.; Osuna, C.; Schar, C.; Schulthess, T. C.; Vogt, H.

2017-12-01

In recent years, the first decade-long kilometer-scale resolution RCM simulations have been performed on continental-scale computational domains. However, the size of the planet Earth is still an order of magnitude larger and thus the computational implications of performing global climate simulations at this resolution are challenging. We explore the gap between the currently established RCM simulations and global simulations by scaling the GPU accelerated version of the COSMO model to a near-global computational domain. To this end, the evolution of an idealized moist baroclinic wave has been simulated over the course of 10 days with a grid spacing of up to 930 m. The computational mesh employs 36'000 x 16'001 x 60 grid points and covers 98.4% of the planet's surface. The code shows perfect weak scaling up to 4'888 nodes of the Piz Daint supercomputer and yields 0.043 simulated years per day (SYPD), which is approximately one seventh of the 0.2-0.3 SYPD required to conduct AMIP-type simulations. However, at half the resolution (1.9 km) we observed 0.23 SYPD. Besides the formation of frontal precipitating systems containing embedded explicitly-resolved convective motions, the simulations reveal a secondary instability that leads to cut-off warm-core cyclonic vortices in the cyclone's core once the grid spacing is refined to the kilometer scale. The explicit representation of embedded moist convection and the representation of the previously unresolved instabilities exhibit a physically different behavior in comparison to coarser-resolution simulations. The study demonstrates that global climate simulations using kilometer-scale resolution are imminent and serves as a baseline benchmark for global climate model applications and future exascale supercomputing systems.
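The throughput figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, a 365-day year is assumed, and it does nothing more than restate the numbers given above (mesh size, 0.043 SYPD, and the 0.2-0.3 SYPD target).

```python
# Illustrative arithmetic for the figures quoted in the abstract.
# All names are hypothetical; a 365-day year is an assumption.

DAYS_PER_YEAR = 365.0

# Mesh quoted in the abstract: 36'000 x 16'001 x 60 grid points.
grid_points = 36_000 * 16_001 * 60

reported_sypd = 0.043                  # simulated years per wall-clock day
target_low, target_high = 0.2, 0.3     # range quoted for AMIP-type runs

# How far the reported throughput is from each end of the target range.
shortfall_low = target_low / reported_sypd
shortfall_high = target_high / reported_sypd

# Wall-clock time implied for the 10 simulated days of the baroclinic wave.
simulated_days = 10
wallclock_hours = (simulated_days / DAYS_PER_YEAR) / reported_sypd * 24

print(f"grid points: {grid_points:,}")
print(f"shortfall factor: {shortfall_low:.1f}x to {shortfall_high:.1f}x")
print(f"wall-clock time for {simulated_days} simulated days: {wallclock_hours:.1f} h")
```

The "one seventh" in the abstract corresponds to the upper end of the target range; against the lower end the gap is closer to a factor of five.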

Application of a hybrid MPI/OpenMP approach for parallel groundwater model calibration using multi-core computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tang, Guoping; D'Azevedo, Ed F; Zhang, Fan

2010-01-01

Calibration of groundwater models involves hundreds to thousands of forward solutions, each of which may solve many transient coupled nonlinear partial differential equations, resulting in a computationally intensive problem. We describe a hybrid MPI/OpenMP approach to exploit two levels of parallelism in software and hardware to reduce calibration time on multi-core computers. HydroGeoChem 5.0 (HGC5) is parallelized using OpenMP for direct solutions for a reactive transport model application, and a field-scale coupled flow and transport model application. In the reactive transport model, a single parallelizable loop is identified to account for over 97% of the total computational time using GPROF. Addition of a few lines of OpenMP compiler directives to the loop yields a speedup of about 10 on a 16-core compute node. For the field-scale model, parallelizable loops in 14 of 174 HGC5 subroutines that require 99% of the execution time are identified. As these loops are parallelized incrementally, the scalability is found to be limited by a loop where Cray PAT detects cache miss rates of over 90%. With this loop rewritten, a speedup similar to that of the first application is achieved. The OpenMP-parallelized code can be run efficiently on multiple workstations in a network or multiple compute nodes on a cluster as slaves using parallel PEST to speed up model calibration. To run calibration on clusters as a single task, the Levenberg-Marquardt algorithm is added to HGC5 with the Jacobian calculation and lambda search parallelized using MPI. With this hybrid approach, 100-200 compute cores are used to reduce the calibration time from weeks to a few hours for these two applications. This approach is applicable to most of the existing groundwater model codes for many applications.
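The reported speedup of about 10 on 16 cores is consistent with Amdahl's law for a code whose parallelizable part is roughly 97% of the run time. The following is a minimal sketch of that check, not part of HGC5; the function name is hypothetical and the 97% figure is taken as exact for illustration, although the abstract says "over 97%".

```python
def amdahl_speedup(parallel_fraction: float, n_workers: int) -> float:
    """Upper bound on speedup when only `parallel_fraction` of the run time
    can be spread over `n_workers` (Amdahl's law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_workers)


# Assumption: exactly 97% of the time sits in the parallelizable loop.
bound_16_cores = amdahl_speedup(0.97, 16)

# Limit for an unbounded number of cores: 1 / serial fraction.
asymptotic_bound = 1.0 / (1.0 - 0.97)

print(f"bound on 16 cores: {bound_16_cores:.2f}")
print(f"asymptotic bound:  {asymptotic_bound:.1f}")
```

Under this assumption the bound on 16 cores is about 11, so a measured speedup of about 10 leaves little room for further gains from that loop alone.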
SCALE Continuous-Energy Eigenvalue Sensitivity Coefficient Calculations

DOE PAGES

Perfetti, Christopher M.; Rearden, Bradley T.; Martin, William R.

2016-02-25

Sensitivity coefficients describe the fractional change in a system response that is induced by changes to system parameters and nuclear data. The Tools for Sensitivity and UNcertainty Analysis Methodology Implementation (TSUNAMI) code within the SCALE code system makes use of eigenvalue sensitivity coefficients for an extensive number of criticality safety applications, including quantifying the data-induced uncertainty in the eigenvalue of critical systems, assessing the neutronic similarity between different critical systems, and guiding nuclear data adjustment studies. The need to model geometrically complex systems with improved fidelity and the desire to extend TSUNAMI analysis to advanced applications has motivated the development of a methodology for calculating sensitivity coefficients in continuous-energy (CE) Monte Carlo applications. The Contributon-Linked eigenvalue sensitivity/Uncertainty estimation via Tracklength importance CHaracterization (CLUTCH) and Iterated Fission Probability (IFP) eigenvalue sensitivity methods were recently implemented in the CE-KENO framework of the SCALE code system to enable TSUNAMI-3D to perform eigenvalue sensitivity calculations using continuous-energy Monte Carlo methods. This work provides a detailed description of the theory behind the CLUTCH method and describes in detail its implementation. This work explores the improvements in eigenvalue sensitivity coefficient accuracy that can be gained through the use of continuous-energy sensitivity methods and also compares several sensitivity methods in terms of computational efficiency and memory requirements.
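The definition in the first sentence, a fractional change in the response per fractional change in a parameter, can be made concrete with a toy model. The sketch below is not the CLUTCH or IFP method and uses no SCALE interface; it estimates the coefficient by direct perturbation (central differences) of a hypothetical one-group infinite-medium multiplication factor, with made-up cross-section values.

```python
def k_inf(nu: float, sigma_f: float, sigma_c: float) -> float:
    """Toy one-group infinite-medium multiplication factor (hypothetical model)."""
    return nu * sigma_f / (sigma_f + sigma_c)


def relative_sensitivity(func, params: dict, name: str, rel_step: float = 1e-4) -> float:
    """Estimate S = (dR/R) / (dp/p) by a central difference in parameter `name`."""
    up, down = dict(params), dict(params)
    up[name] = params[name] * (1.0 + rel_step)
    down[name] = params[name] * (1.0 - rel_step)
    base = func(**params)
    return (func(**up) - func(**down)) / (2.0 * rel_step * base)


# Made-up one-group data, for illustration only.
params = {"nu": 2.4, "sigma_f": 0.05, "sigma_c": 0.07}

s_fission = relative_sensitivity(k_inf, params, "sigma_f")
s_capture = relative_sensitivity(k_inf, params, "sigma_c")

print(f"sensitivity of k to sigma_f: {s_fission:+.4f}")
print(f"sensitivity of k to sigma_c: {s_capture:+.4f}")
```

For this toy model the analytic values are +σc/(σf+σc) and -σc/(σf+σc), so a 1% increase in the fission cross section raises k by about 0.58%, and a 1% increase in capture lowers it by the same amount. Direct perturbation needs two extra solves per parameter, which is one reason adjoint-based and importance-based methods are attractive in Monte Carlo codes.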
3D Reconnection and SEP Considerations in the CME-Flare Problem

NASA Astrophysics Data System (ADS)

Moschou, S. P.; Cohen, O.; Drake, J. J.; Sokolov, I.; Borovikov, D.; Alvarado Gomez, J. D.; Garraffo, C.

2017-12-01

Reconnection is known to play a major role in particle acceleration in both solar and astrophysical regimes, yet little is known about its connection with the global scales and about its contribution to the generation of SEPs relative to other acceleration mechanisms, such as the shock at a fast CME front, in the presence of a global structure such as a CME. Coupling efforts that combine both particle and global scales are necessary to answer questions about the fundamentals of the energetic processes involved. We present such a coupled modeling effort that looks into particle acceleration through reconnection in a self-consistent CME-flare model in both particle and fluid regimes. Of special interest is the supra-thermal component of the acceleration due to the reconnection, which will later collide with the solar atmospheric material of the denser chromospheric layer and radiate in hard X-rays and γ-rays for supra-thermal electrons and protons, respectively. Two cutting-edge computational codes are used to capture the global CME and flare dynamics, specifically a two-fluid MHD code and a 3D PIC code for the flare scales. Finally, we are connecting the simulations with current observations in different wavelengths in an effort to shed light on the unified CME-flare picture.
Blueprint for a microwave trapped ion quantum computer

PubMed Central

Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G.; Mølmer, Klaus; Devitt, Simon J.; Wunderlich, Christof; Hensinger, Winfried K.

2017-01-01

The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion–based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation–based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error–threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects. PMID:28164154
Gibbs sampling on large lattice with GMRF

NASA Astrophysics Data System (ADS)

Marcotte, Denis; Allard, Denis

2018-02-01

Gibbs sampling is routinely used to sample truncated Gaussian distributions. These distributions naturally occur when associating latent Gaussian fields to category fields obtained by discrete simulation methods like multipoint, sequential indicator simulation and object-based simulation. The latent Gaussians are often used in data assimilation and history matching algorithms. When Gibbs sampling is applied on a large lattice, the computing cost can become prohibitive. The usual practice of using local neighborhoods is unsatisfactory as it can diverge and it does not reproduce exactly the desired covariance. A better approach is to use Gaussian Markov Random Fields (GMRF), which make it possible to compute the conditional distributions at any point without having to compute and invert the full covariance matrix. As the GMRF is locally defined, it allows simultaneous updating of all points that do not share neighbors (coding sets). We propose a new simultaneous Gibbs updating strategy on coding sets that can be efficiently computed by convolution and applied with an acceptance/rejection method in the truncated case. We study empirically the speed of convergence and the effects of the choice of boundary conditions, of the correlation range and of GMRF smoothness. We show that the convergence is slower in the Gaussian case on the torus than for the finite case studied in the literature. However, in the truncated Gaussian case, we show that short-scale correlation is quickly restored and the conditioning categories at each lattice point imprint the long-scale correlation. Hence our approach makes it realistic to apply Gibbs sampling on large 2D or 3D lattices with the desired GMRF covariance.
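The coding-set idea can be sketched on a small periodic lattice. The example below is a simplified illustration, not the authors' algorithm: it assumes a first-order GMRF whose precision matrix has `a` on the diagonal and -1 for the four nearest neighbors, a checkerboard split into two coding sets, a sign constraint as the truncation, and a retry cap in the rejection step that keeps the current value when no draw is accepted. All names are hypothetical.

```python
import random


def neighbor_sum(x, i, j):
    """Sum of the four nearest neighbors on the torus (periodic lattice)."""
    n = len(x)
    return (x[(i - 1) % n][j] + x[(i + 1) % n][j]
            + x[i][(j - 1) % n] + x[i][(j + 1) % n])


def gibbs_sweep(x, a, rng, category=None, max_tries=1000):
    """One sweep over the two checkerboard coding sets.

    Assumed model: precision matrix with `a` (> 4) on the diagonal and -1 for
    nearest neighbors, so x_ij | rest ~ N(neighbor_sum / a, 1 / a).
    If `category` is given, x_ij must be > 0 where category[i][j] is True and
    <= 0 elsewhere; draws violating this are rejected and redrawn.
    """
    n = len(x)
    if n % 2:
        raise ValueError("checkerboard coding sets on a torus need an even size")
    sd = a ** -0.5
    for parity in (0, 1):
        # Sites of one parity have all their neighbors in the other parity,
        # so every site in this coding set can be updated from frozen values.
        updates = {}
        for i in range(n):
            for j in range(n):
                if (i + j) % 2 != parity:
                    continue
                mean = neighbor_sum(x, i, j) / a
                value = x[i][j]            # kept if every draw is rejected
                for _ in range(max_tries):
                    draw = rng.gauss(mean, sd)
                    if category is None or (draw > 0) == category[i][j]:
                        value = draw
                        break
                updates[(i, j)] = value
        for (i, j), value in updates.items():
            x[i][j] = value
    return x


rng = random.Random(0)
n, a = 8, 4.5
# Hypothetical category field: 4 x 4 blocks of alternating sign.
category = [[(i // 4 + j // 4) % 2 == 0 for j in range(n)] for i in range(n)]
x = [[0.5 if category[i][j] else -0.5 for j in range(n)] for i in range(n)]
for _ in range(50):
    gibbs_sweep(x, a, rng, category)
```

The neighbor sum is a convolution of the field with the GMRF stencil, which is why a vectorized implementation can update a whole coding set in one pass; plain loops are used here only to keep the sketch dependency-free.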
Computer Description of Black Hawk Helicopter

DTIC Science & Technology

1979-06-01

Model Combinatorial Geometry Models Black Hawk Helicopter Helicopter GIFT Computer Code Geometric Description of Targets 20. ABSTRACT...description was made using the technique of combinatorial geometry (COM-GEOM) and will be used as input to the GIFT computer code which generates ... The data used by the COVART computer code was generated by the Geometric Information for Targets (GIFT) computer code. This report documents
Effect of asynchrony on numerical simulations of fluid flow phenomena

NASA Astrophysics Data System (ADS)

Konduri, Aditya; Mahoney, Bryan; Donzis, Diego

2015-11-01

Designing scalable CFD codes on massively parallel computers is a challenge. This is mainly due to the large number of communications between processing elements (PEs) and their synchronization, leading to idling of PEs. Indeed, communication will likely be the bottleneck in the scalability of codes on Exascale machines. Our recent work on asynchronous computing for PDEs based on finite differences has shown that it is possible to relax synchronization between PEs at a mathematical level. Computations then proceed regardless of the status of communication, reducing the idle time of PEs and improving the scalability. However, the accuracy of the schemes is greatly affected. We have proposed asynchrony-tolerant (AT) schemes to address this issue. In this work, we study the effect of asynchrony on the solution of fluid flow problems using standard and AT schemes. We show that asynchrony creates additional scales with low energy content. The specific wavenumbers affected can be shown to be due to two distinct effects: the randomness in the arrival of messages and the corresponding switching between schemes. Understanding these errors allows us to control them effectively, making the method feasible for solving turbulent flows at realistic conditions on future computing systems.
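What "relaxing synchronization at a mathematical level" means for a finite-difference stencil can be illustrated with a toy problem. The sketch below is an assumption-laden illustration, not the authors' code and not an asynchrony-tolerant scheme: it advances the 1D heat equation on a periodic grid split into chunks standing in for PEs, and lets values read across a chunk boundary be a random number of time levels stale. All names and parameters are hypothetical.

```python
import math
import random


def simulate(n_points, n_pes, n_steps, r, max_delay, seed=0):
    """Explicit heat-equation steps on a periodic grid split into n_pes chunks.

    Values read across a chunk boundary may be up to `max_delay` time levels
    old, mimicking halo data that has not arrived yet. max_delay = 0 gives the
    ordinary synchronous scheme. Assumes n_points is divisible by n_pes.
    """
    rng = random.Random(seed)
    chunk = n_points // n_pes
    u = [math.sin(2.0 * math.pi * i / n_points) for i in range(n_points)]
    history = [u]                       # history[0] is the newest time level
    for _ in range(n_steps):
        new = []
        for i in range(n_points):
            neighbor_values = []
            for nb in ((i - 1) % n_points, (i + 1) % n_points):
                if nb // chunk != i // chunk:
                    # Halo point owned by another PE: possibly stale data.
                    delay = rng.randint(0, min(max_delay, len(history) - 1))
                    neighbor_values.append(history[delay][nb])
                else:
                    neighbor_values.append(u[nb])
            left, right = neighbor_values
            new.append(u[i] + r * (left - 2.0 * u[i] + right))
        u = new
        history = [u] + history[:max_delay]
    return u


N, STEPS, R = 64, 200, 0.25
sync = simulate(N, n_pes=4, n_steps=STEPS, r=R, max_delay=0)
stale = simulate(N, n_pes=4, n_steps=STEPS, r=R, max_delay=3)
max_difference = max(abs(a - b) for a, b in zip(sync, stale))
print(f"max difference between synchronous and delayed runs: {max_difference:.3e}")
```

The difference between the two runs originates at the chunk boundaries and spreads into the interior, which is the kind of error the abstract attributes to random message arrival.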
Toward Petascale Biologically Plausible Neural Networks

NASA Astrophysics Data System (ADS)

Long, Lyle

This talk will describe an approach to achieving petascale neural networks. Artificial intelligence has been oversold for many decades. Computers in the beginning could only do about 16,000 operations per second. Computer processing power, however, has been doubling every two years thanks to Moore's law, and growing even faster due to massively parallel architectures. Finally, 60 years after the first AI conference we have computers on the order of the performance of the human brain (10^16 operations per second). The main issues now are algorithms, software, and learning. We have excellent models of neurons, such as the Hodgkin-Huxley model, but we do not know how the human neurons are wired together. With careful attention to efficient parallel computing, event-driven programming, table lookups, and memory minimization, massive-scale simulations can be performed. The code that will be described was written in C++ and uses the Message Passing Interface (MPI). It uses the full Hodgkin-Huxley neuron model, not a simplified model. It also allows arbitrary network structures (deep, recurrent, convolutional, all-to-all, etc.). The code is scalable, and has, so far, been tested on up to 2,048 processor cores using 10^7 neurons and 10^9 synapses.
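The growth argument in this abstract can be checked with simple arithmetic. The sketch below only restates the quoted figures (16,000 operations per second, doubling every two years, 60 years, 10^16 operations per second); the variable names are hypothetical, and the split between "doubling" and "parallelism" is an illustrative reading rather than a claim made in the talk.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
# All names are hypothetical.

EARLY_OPS_PER_SECOND = 16_000       # "about 16,000 operations per second"
YEARS_ELAPSED = 60                  # "60 years after the first AI conference"
DOUBLING_PERIOD_YEARS = 2           # "doubling every two years"
BRAIN_OPS_PER_SECOND = 10**16       # quoted estimate for the human brain

doublings = YEARS_ELAPSED // DOUBLING_PERIOD_YEARS
doubling_only_ops = EARLY_OPS_PER_SECOND * 2**doublings

# Factor still missing after doubling alone; the abstract attributes the
# faster-than-doubling growth to massively parallel architectures.
remaining_factor = BRAIN_OPS_PER_SECOND / doubling_only_ops

print(f"doublings: {doublings}")
print(f"ops/s from doubling alone: {doubling_only_ops:.2e}")
print(f"remaining factor: {remaining_factor:.0f}")
```

Thirty doublings take 16,000 operations per second to about 1.7 x 10^13, so a further factor of several hundred is needed to reach 10^16.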
Validation and Performance Comparison of Numerical Codes for Tsunami Inundation

NASA Astrophysics Data System (ADS)

Velioglu, D.; Kian, R.; Yalciner, A. C.; Zaytsev, A.

2015-12-01

In inundation zones, tsunami motion turns from wave motion to flow of water. Modelling of this phenomenon is a complex problem since there are many parameters affecting the tsunami flow. In this respect, the performance of numerical codes that analyze tsunami inundation patterns becomes important. The computation of water surface elevation is not sufficient for proper analysis of tsunami behaviour in shallow water zones and on land, and hence for the development of mitigation strategies. Velocity and velocity patterns are also crucial parameters and have to be computed at the highest accuracy. There are numerous numerical codes that can be used for simulating tsunami inundation. In this study, the FLOW 3D and NAMI DANCE codes are selected for validation and performance comparison. FLOW 3D simulates linear and nonlinear propagating surface waves as well as long waves by solving the three-dimensional Navier-Stokes (3D-NS) equations. FLOW 3D is used specifically for flood problems. NAMI DANCE uses a finite difference computational method to solve linear and nonlinear forms of the shallow water equations (NSWE) in long wave problems, specifically tsunamis. In this study, these codes are validated and their performances are compared using two benchmark problems which were discussed at the 2015 National Tsunami Hazard Mitigation Program (NTHMP) Annual Meeting in Portland, USA. One of the problems is an experiment of a single long-period wave propagating up a piecewise linear slope and onto a small-scale model of the town of Seaside, Oregon. The other benchmark problem is an experiment of a single solitary wave propagating up a triangular shaped shelf with an island feature located at the offshore point of the shelf. The computed water surface elevation and velocity data are compared with the measured data. The comparisons showed that both codes are in fairly good agreement with each other and with the benchmark data. All results are presented with discussions and comparisons.
The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement No 603839 (Project ASTARTE - Assessment, Strategy and Risk Reduction for Tsunamis in Europe)
Comparison of OpenFOAM and EllipSys3D actuator line methods with (NEW) MEXICO results

NASA Astrophysics Data System (ADS)

Nathan, J.; Meyer Forsting, A. R.; Troldborg, N.; Masson, C.

2017-05-01

The Actuator Line Method has existed for more than a decade and has become a well-established choice for simulating wind rotors in computational fluid dynamics. Numerous implementations exist and are used in the wind energy research community. These codes were verified against experimental data such as the MEXICO experiment. Often the verification against other codes was made on a very broad scale. Therefore this study attempts first a validation by comparing two different implementations, namely an adapted version of SOWFA/OpenFOAM and EllipSys3D, and also a verification by comparing against experimental results from the MEXICO and NEW MEXICO experiments.
An edge-based solution-adaptive method applied to the AIRPLANE code

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Thomas, Scott D.; Cliff, Susan E.

1995-01-01

Computational methods to solve large-scale realistic problems in fluid flow can be made more efficient and cost effective by using them in conjunction with dynamic mesh adaption procedures that perform simultaneous coarsening and refinement to capture flow features of interest. This work couples the tetrahedral mesh adaption scheme, 3D_TAG, with the AIRPLANE code to solve complete aircraft configuration problems in transonic and supersonic flow regimes. Results indicate that the near-field sonic boom pressure signature of a cone-cylinder is improved, the oblique and normal shocks are better resolved on a transonic wing, and the bow shock ahead of an unstarted inlet is better defined.
Implementation, capabilities, and benchmarking of Shift, a massively parallel Monte Carlo radiation transport code

DOE PAGES

Pandya, Tara M.; Johnson, Seth R.; Evans, Thomas M.; ...

2015-12-21

This paper discusses the implementation, capabilities, and validation of Shift, a massively parallel Monte Carlo radiation transport package developed and maintained at Oak Ridge National Laboratory. It has been developed to scale well from laptop to small computing clusters to advanced supercomputers. Special features of Shift include hybrid capabilities for variance reduction such as CADIS and FW-CADIS, and advanced parallel decomposition and tally methods optimized for scalability on supercomputing architectures. Shift has been validated and verified against various reactor physics benchmarks and compares well to other state-of-the-art Monte Carlo radiation transport codes such as MCNP5, CE KENO-VI, and OpenMC. Some specific benchmarks used for verification and validation include the CASL VERA criticality test suite and several Westinghouse AP1000® problems. These benchmark and scaling studies show promising results.
Simulation Studies of Mechanical Properties of Novel Silica Nano-structures

NASA Astrophysics Data System (ADS)

Muralidharan, Krishna; Torras Costa, Joan; Trickey, Samuel B.

2006-03-01

Advances in nanotechnology and the importance of silica as a technological material continue to stimulate computational study of the properties of possible novel silica nanostructures. Thus we have done classical molecular dynamics (MD) and multi-scale quantum mechanical (QM/MD) simulation studies of the mechanical properties of single-wall and multi-wall silica nano-rods of varying dimensions. Such nano-rods have been predicted by Mallik et al. to be unusually strong in tensile failure. Here we compare failure mechanisms of such nano-rods under tension, compression, and bending. The concurrent multi-scale QM/MD studies use the general PUPIL system (Torras et al.). In this case, PUPIL provides automated interoperation of the MNDO Transfer Hamiltonian QM code (Taylor et al.) and a locally written MD code. Embedding of the QM-forces domain is via the scheme of Mallik et al. Work supported by NSF ITR award DMR-0325553.
User manual for semi-circular compact range reflector code: Version 2

NASA Technical Reports Server (NTRS)

Gupta, Inder J.; Burnside, Walter D.

1987-01-01

A computer code has been developed at the Ohio State University ElectroScience Laboratory to analyze a semi-circular paraboloidal reflector with or without a rolled edge at the top and a skirt at the bottom. The code can be used to compute the total near field of the reflector or its individual components at a given distance from the center of the paraboloid. The code computes the fields along a radial, horizontal, vertical or axial cut at that distance. Thus, it is very effective in computing the size of the sweet spot for a semi-circular compact range reflector. This report describes the operation of the code. Various input and output statements are explained. Some results obtained using the computer code are presented to illustrate the code's capability as well as being samples of input/output sets.
NWChem: A comprehensive and scalable open-source solution for large scale molecular simulations

NASA Astrophysics Data System (ADS)

Valiev, M.; Bylaska, E. J.; Govind, N.; Kowalski, K.; Straatsma, T. P.; Van Dam, H. J. J.; Wang, D.; Nieplocha, J.; Apra, E.; Windus, T. L.; de Jong, W. A.

2010-09-01

The latest release of NWChem delivers an open-source computational chemistry package with extensive capabilities for large scale simulations of chemical and biological systems. Utilizing a common computational framework, diverse theoretical descriptions can be used to provide the best solution for a given scientific problem. Scalable parallel implementations and modular software design enable efficient utilization of current computational architectures. This paper provides an overview of NWChem focusing primarily on the core theoretical modules provided by the code and their parallel performance. Program summary. Program title: NWChem Catalogue identifier: AEGI_v1_0 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEGI_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Open Source Educational Community License No. of lines in distributed program, including test data, etc.: 11 709 543 No. of bytes in distributed program, including test data, etc.: 680 696 106 Distribution format: tar.gz Programming language: Fortran 77, C Computer: all Linux based workstations and parallel supercomputers, Windows and Apple machines Operating system: Linux, OS X, Windows Has the code been vectorised or parallelized?: Code is parallelized Classification: 2.1, 2.2, 3, 7.3, 7.7, 16.1, 16.2, 16.3, 16.10, 16.13 Nature of problem: Large-scale atomistic simulations of chemical and biological systems require efficient and reliable methods for ground and excited solutions of many-electron Hamiltonian, analysis of the potential energy surface, and dynamics. Solution method: Ground and excited solutions of many-electron Hamiltonian are obtained utilizing density-functional theory, many-body perturbation approach, and coupled cluster expansion. These solutions or a combination thereof with classical descriptions are then used to analyze potential energy surface and perform dynamical simulations.
Additional comments: Full documentation is provided in the distribution file. This includes an INSTALL file giving details of how to build the package. A set of test runs is provided in the examples directory. The distribution file for this program is over 90 Mbytes and therefore is not delivered directly when download or email is requested. Instead, an HTML file giving details of how the program can be obtained is sent. Running time: Running time depends on the size of the chemical system, complexity of the method, number of CPUs and the computational task. It ranges from several seconds for serial DFT energy calculations on a few atoms to several hours for parallel coupled cluster energy calculations on tens of atoms or ab-initio molecular dynamics simulation on hundreds of atoms.
Advances and Challenges In Uncertainty Quantification with Application to Climate Prediction, ICF design and Science Stockpile Stewardship

NASA Astrophysics Data System (ADS)

Klein, R.; Woodward, C. S.; Johannesson, G.; Domyancic, D.; Covey, C. C.; Lucas, D. D.

2012-12-01

Uncertainty Quantification (UQ) is a critical field within 21st century simulation science that resides at the very center of the web of emerging predictive capabilities. The science of UQ holds the promise of giving much greater meaning to the results of complex large-scale simulations, allowing for quantifying and bounding uncertainties. This powerful capability will yield new insights into scientific predictions (e.g. Climate) of great impact on both national and international arenas, allow informed decisions on the design of critical experiments (e.g. ICF capsule design, MFE, NE) in many scientific fields, and assign confidence bounds to scientifically predictable outcomes (e.g. nuclear weapons design). In this talk I will discuss a major new strategic initiative (SI) we have developed at Lawrence Livermore National Laboratory to advance the science of Uncertainty Quantification at LLNL focusing in particular on (a) the research and development of new algorithms and methodologies of UQ as applied to multi-physics multi-scale codes, (b) incorporation of these advancements into a global UQ Pipeline (i.e. a computational superstructure) that will simplify user access to sophisticated tools for UQ studies as well as act as a self-guided, self-adapting UQ engine for UQ studies on extreme computing platforms and (c) use laboratory applications as a test bed for new algorithms and methodologies. The initial SI focus has been on applications for the quantification of uncertainty associated with Climate prediction, but the validated UQ methodologies we have developed are now being fed back into Science Based Stockpile Stewardship (SSS) and ICF UQ efforts. 
To make advancements in several of these UQ grand challenges, I will focus in this talk on the following three research areas in our Strategic Initiative: error estimation in multi-physics and multi-scale codes; tackling the "Curse of High Dimensionality"; and development of an advanced UQ Computational Pipeline to enable complete UQ workflow and analysis for ensemble runs at the extreme scale (e.g. exascale) with self-guiding adaptation in the UQ Pipeline engine. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 and was funded by the Uncertainty Quantification Strategic Initiative Laboratory Directed Research and Development Project at LLNL under project tracking code 10-SI-013 (UCRL LLNL-ABS-569112).
Turbulence modeling for Francis turbine water passages simulation

NASA Astrophysics Data System (ADS)

Maruzewski, P.; Hayashi, H.; Munch, C.; Yamaishi, K.; Hashii, T.; Mombelli, H. P.; Sugow, Y.; Avellan, F.

2010-08-01

The application of Computational Fluid Dynamics (CFD) to hydraulic machines requires the ability to handle turbulent flows and to take into account the effects of turbulence on the mean flow. Nowadays, Direct Numerical Simulation (DNS) is still not a good candidate for hydraulic machine simulations because of its high computational cost. Large Eddy Simulation (LES), although in the same cost category as DNS, could be an alternative, whereby only the small-scale turbulent fluctuations are modeled and the larger-scale fluctuations are computed directly. Nevertheless, Reynolds-Averaged Navier-Stokes (RANS) models have become the widespread standard basis for numerous hydraulic machine design procedures. However, for many applications involving wall-bounded flows and attached boundary layers, various hybrid combinations of LES and RANS are being considered, such as Detached Eddy Simulation (DES), whereby the RANS approximation is kept in the regions where the boundary layers are attached to the solid walls. Furthermore, the accuracy of CFD simulations is highly dependent on the grid quality, in terms of grid uniformity in complex configurations. Moreover, any successful structured or unstructured CFD code has to offer a wide range of models, from classic RANS models to complex hybrid models. The aim of this study is to compare the behavior of turbulent simulations on both structured and unstructured grid topologies with two different CFD codes applied to the same Francis turbine. Hence, the study is intended to outline the discrepancies encountered in predicting the wake of turbine blades when using either the standard k-epsilon model or the SST shear stress model in a steady CFD simulation. Finally, comparisons are made with experimental data from the EPFL Laboratory for Hydraulic Machines reduced scale model measurements.
Uncertainty Quantification Techniques of SCALE/TSUNAMI

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rearden, Bradley T; Mueller, Don

2011-01-01

The Standardized Computer Analysis for Licensing Evaluation (SCALE) code system developed at Oak Ridge National Laboratory (ORNL) includes Tools for Sensitivity and Uncertainty Analysis Methodology Implementation (TSUNAMI). The TSUNAMI code suite can quantify the predicted change in system responses, such as k_eff, reactivity differences, or ratios of fluxes or reaction rates, due to changes in the energy-dependent, nuclide-reaction-specific cross-section data. Where uncertainties in the neutron cross-section data are available, the sensitivity of the system to the cross-section data can be applied to propagate the uncertainties in the cross-section data to an uncertainty in the system response. Uncertainty quantification is useful for identifying potential sources of computational biases and highlighting parameters important to code validation. Traditional validation techniques often examine one or more average physical parameters to characterize a system and identify applicable benchmark experiments. However, with TSUNAMI, correlation coefficients are developed by propagating the uncertainties in neutron cross-section data to uncertainties in the computed responses for experiments and safety applications through sensitivity coefficients. The bias in the experiments, as a function of their correlation coefficient with the intended application, is extrapolated to predict the bias and bias uncertainty in the application through trending analysis or generalized linear least squares techniques, often referred to as 'data adjustment.' Even with advanced tools to identify benchmark experiments, analysts occasionally find that the application models include some feature or material for which adequately similar benchmark experiments do not exist to support validation. For example, a criticality safety analyst may want to take credit for the presence of fission products in spent nuclear fuel. In such cases, analysts sometimes rely on 'expert judgment' to select an additional administrative margin to account for a gap in the validation data or to conclude that the impact on the calculated bias and bias uncertainty is negligible. As a result of advances in computer programs and the evolution of cross-section covariance data, analysts can use the sensitivity and uncertainty analysis tools in the TSUNAMI codes to estimate the potential impact on the application-specific bias and bias uncertainty resulting from nuclides not represented in available benchmark experiments. This paper presents the application of methods described in a companion paper.
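The propagation step described in the abstract above is commonly written as a "sandwich" product of sensitivity coefficients and a relative covariance matrix. The following is a minimal sketch of that idea in plain Python, not the TSUNAMI implementation; the two-parameter covariance matrix, the sensitivity vectors, and all function names are hypothetical and chosen only for illustration.

```python
import math

def sandwich(s_a, cov, s_b):
    """Return the bilinear form s_a * cov * s_b^T for plain Python lists."""
    return sum(s_a[i] * cov[i][j] * s_b[j]
               for i in range(len(s_a))
               for j in range(len(s_b)))

def response_uncertainty(s, cov):
    """Relative standard deviation of a response with sensitivities s."""
    return math.sqrt(sandwich(s, cov, s))

def correlation(s_app, s_exp, cov):
    """Correlation between two responses that share the same nuclear data."""
    shared = sandwich(s_app, cov, s_exp)
    return shared / (response_uncertainty(s_app, cov) *
                     response_uncertainty(s_exp, cov))

# Hypothetical relative covariance of two cross-section parameters.
cov = [[0.0004, 0.0001],
       [0.0001, 0.0009]]
# Hypothetical sensitivities (relative change in k_eff per relative
# change in each parameter) for an application and a benchmark experiment.
s_app = [0.30, -0.10]
s_exp = [0.25, -0.05]

print("application uncertainty:", response_uncertainty(s_app, cov))
print("correlation with experiment:", correlation(s_app, s_exp, cov))
```

A correlation near 1 would suggest that the experiment exercises the same nuclear data as the application; real analyses use many energy groups, nuclides, and reactions rather than two parameters.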
Hanford meteorological station computer codes: Volume 9, The quality assurance computer codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Burk, K.W.; Andrews, G.L.

1989-02-01

The Hanford Meteorological Station (HMS) was established in 1944 on the Hanford Site to collect and archive meteorological data and provide weather forecasts and related services for the Hanford Site. The HMS is located approximately 1/2 mile east of the 200 West Area and is operated by PNL for the US Department of Energy. Meteorological data are collected from various sensors and equipment located on and off the Hanford Site. These data are stored in data bases on the Digital Equipment Corporation (DEC) VAX 11/750 at the HMS (hereafter referred to as the HMS computer). Files from those data bases are routinely transferred to the Emergency Management System (EMS) computer at the Unified Dose Assessment Center (UDAC). To ensure the quality and integrity of the HMS data, a set of Quality Assurance (QA) computer codes has been written. The codes will be routinely used by the HMS system manager or the data base custodian. The QA codes provide detailed output files that will be used in correcting erroneous data. The following sections in this volume describe the implementation and operation of QA computer codes. The appendices contain detailed descriptions, flow charts, and source code listings of each computer code. 2 refs.
Advanced computations in plasma physics

NASA Astrophysics Data System (ADS)

Tang, W. M.

2002-05-01

Scientific simulation in tandem with theory and experiment is an essential tool for understanding complex plasma behavior. In this paper we review recent progress and future directions for advanced simulations in magnetically confined plasmas with illustrative examples chosen from magnetic confinement research areas such as microturbulence, magnetohydrodynamics, magnetic reconnection, and others. Significant recent progress has been made in both particle and fluid simulations of fine-scale turbulence and large-scale dynamics, giving increasingly good agreement between experimental observations and computational modeling. This was made possible by innovative advances in analytic and computational methods for developing reduced descriptions of physics phenomena spanning widely disparate temporal and spatial scales together with access to powerful new computational resources. In particular, the fusion energy science community has made excellent progress in developing advanced codes for which computer run-time and problem size scale well with the number of processors on massively parallel machines (MPP's). A good example is the effective usage of the full power of multi-teraflop (multi-trillion floating point computations per second) MPP's to produce three-dimensional, general geometry, nonlinear particle simulations which have accelerated progress in understanding the nature of turbulence self-regulation by zonal flows. It should be emphasized that these calculations, which typically utilized billions of particles for thousands of time-steps, would not have been possible without access to powerful present generation MPP computers and the associated diagnostic and visualization capabilities. In general, results from advanced simulations provide great encouragement for being able to include increasingly realistic dynamics to enable deeper physics insights into plasmas in both natural and laboratory environments. 
The associated scientific excitement should serve to stimulate improved cross-cutting collaborations with other fields and also to help attract bright young talent to plasma science.

MPD thruster technology

NASA Technical Reports Server (NTRS)

Myers, Roger M.

1991-01-01

In-house magnetoplasmadynamic (MPD) thruster technology is discussed. The study focused on steady state thrusters at powers of less than 1 MW. Performance measurement and diagnostics technologies were developed for high power thrusters. Also developed was an MPD computer code. The stated goals of the program are to establish: performance and life limitation; influence of applied fields; propellant effects; and scaling laws. The presentation is mostly through graphs and charts.
Hierarchical nonlinear behavior of hot composite structures

NASA Technical Reports Server (NTRS)

Murthy, P. L. N.; Chamis, C. C.; Singhal, S. N.

1993-01-01

Hierarchical computational procedures are described to simulate the multiple scale thermal/mechanical behavior of high temperature metal matrix composites (HT-MMC) in the following three broad areas: (1) behavior of HT-MMC's from micromechanics to laminate via METCAN (Metal Matrix Composite Analyzer), (2) tailoring of HT-MMC behavior for optimum specific performance via MMLT (Metal Matrix Laminate Tailoring), and (3) HT-MMC structural response for hot structural components via HITCAN (High Temperature Composite Analyzer). Representative results from each area are presented to illustrate the effectiveness of computational simulation procedures and accompanying computer codes. The sample case results show that METCAN can be used to simulate material behavior such as the entire creep span; MMLT can be used to concurrently tailor the fabrication process and the interphase layer for optimum performance such as minimum residual stresses; and HITCAN can be used to predict the structural behavior such as the deformed shape due to component fabrication. These codes constitute virtual portable desk-top test laboratories for characterizing HT-MMC laminates, tailoring the fabrication process, and qualifying structural components made from them.
Simulation of LHC events on a million threads

NASA Astrophysics Data System (ADS)

Childers, J. T.; Uram, T. D.; LeCompte, T. J.; Papka, M. E.; Benjamin, D. P.

2015-12-01

Demand for Grid resources is expected to double during LHC Run II as compared to Run I; the capacity of the Grid, however, will not double. The HEP community must consider how to bridge this computing gap by targeting larger compute resources and using the available compute resources as efficiently as possible. Argonne's Mira, the fifth fastest supercomputer in the world, can run roughly five times the number of parallel processes that the ATLAS experiment typically uses on the Grid. We ported Alpgen, a serial x86 code, to run as a parallel application under MPI on the Blue Gene/Q architecture. By analysis of the Alpgen code, we reduced the memory footprint to allow running 64 threads per node, utilizing the four hardware threads available per core on the PowerPC A2 processor. Event generation and unweighting, typically run as independent serial phases, are coupled together in a single job in this scenario, reducing intermediate writes to the filesystem. By these optimizations, we have successfully run LHC proton-proton physics event generation at the scale of a million threads, filling two-thirds of Mira.
X-33 Aerodynamic and Aeroheating Computations for Wind Tunnel and Flight Conditions

NASA Technical Reports Server (NTRS)

Hollis, Brian R.; Thompson, Richard A.; Murphy, Kelly J.; Nowak, Robert J.; Riley, Christopher J.; Wood, William A.; Alter, Stephen J.; Prabhu, Ramadas K.

1999-01-01

This report provides an overview of hypersonic Computational Fluid Dynamics research conducted at the NASA Langley Research Center to support the Phase II development of the X-33 vehicle. The X-33, which is being developed by Lockheed-Martin in partnership with NASA, is an experimental Single-Stage-to-Orbit demonstrator that is intended to validate critical technologies for a full-scale Reusable Launch Vehicle. As part of the development of the X-33, CFD codes have been used to predict the aerodynamic and aeroheating characteristics of the vehicle. Laminar and turbulent predictions were generated for the X-33 vehicle using two finite-volume, Navier-Stokes solvers. Inviscid solutions were also generated with an Euler code. Computations were performed for Mach numbers of 4.0 to 10.0 at angles-of-attack from 10 deg to 48 deg with body flap deflections of 0, 10 and 20 deg. Comparisons between predictions and wind tunnel aerodynamic and aeroheating data are presented in this paper. Aeroheating and aerodynamic predictions for flight conditions are also presented.
Assessing the Progress of Trapped-Ion Processors Towards Fault-Tolerant Quantum Computation

NASA Astrophysics Data System (ADS)

Bermudez, A.; Xu, X.; Nigmatullin, R.; O'Gorman, J.; Negnevitsky, V.; Schindler, P.; Monz, T.; Poschinger, U. G.; Hempel, C.; Home, J.; Schmidt-Kaler, F.; Biercuk, M.; Blatt, R.; Benjamin, S.; Müller, M.

2017-10-01

A quantitative assessment of the progress of small prototype quantum processors towards fault-tolerant quantum computation is a problem of current interest in experimental and theoretical quantum information science. We introduce a necessary and fair criterion for quantum error correction (QEC), which must be achieved in the development of these quantum processors before their sizes are sufficiently big to consider the well-known QEC threshold. We apply this criterion to benchmark the ongoing effort in implementing QEC with topological color codes using trapped-ion quantum processors and, more importantly, to guide the future hardware developments that will be required in order to demonstrate beneficial QEC with small topological quantum codes. In doing so, we present a thorough description of a realistic trapped-ion toolbox for QEC and a physically motivated error model that goes beyond standard simplifications in the QEC literature. We focus on laser-based quantum gates realized in two-species trapped-ion crystals in high-optical aperture segmented traps. Our large-scale numerical analysis shows that, with the foreseen technological improvements described here, this platform is a very promising candidate for fault-tolerant quantum computation.
Simulation of droplet impact onto a deep pool for large Froude numbers in different open-source codes

NASA Astrophysics Data System (ADS)

Korchagova, V. N.; Kraposhin, M. V.; Marchevsky, I. K.; Smirnova, E. V.

2017-11-01

A droplet impact on a deep pool can induce macro-scale or micro-scale effects such as a crown splash, a high-speed jet, or the formation of secondary droplets or thin liquid films. The outcome depends on the diameter and velocity of the droplet, liquid properties, effects of external forces and other factors that a ratio of dimensionless criteria can account for. In the present research, we assumed that the droplet and the pool consist of the same viscous incompressible liquid. We took surface tension into account but neglected gravity forces. We used two open-source codes (OpenFOAM and Gerris) for our computations. We review the possibility of using these codes for simulation of processes in free-surface flows that may take place after a droplet impact on the pool. Both codes simulated several modes of droplet impact. We estimated the effect of liquid properties with respect to the Reynolds number and Weber number. Numerical simulation enabled us to find boundaries between different modes of droplet impact on a deep pool and to plot corresponding mode maps. The ratio of liquid density to that of the surrounding gas induces several changes in mode maps. Increasing this density ratio suppresses the crown splash.
Facilitating Internet-Scale Code Retrieval

ERIC Educational Resources Information Center

Bajracharya, Sushil Krishna

2010-01-01

Internet-Scale code retrieval deals with the representation, storage, and access of relevant source code from a large amount of source code available on the Internet. Internet-Scale code retrieval systems support common emerging practices among software developers related to finding and reusing source code. In this dissertation we focus on some…
Thermal-chemical Mantle Convection Models With Adaptive Mesh Refinement

NASA Astrophysics Data System (ADS)

Leng, W.; Zhong, S.

2008-12-01

In numerical modeling of mantle convection, resolution is often crucial for resolving small-scale features. Adaptive mesh refinement (AMR) techniques allow local mesh refinement wherever high resolution is needed, while leaving other regions with relatively low resolution. Both computational efficiency for large-scale simulation and accuracy for small-scale features can thus be achieved with AMR. Based on the octree data structure [Tu et al. 2005], we implement the AMR techniques in 2-D mantle convection models. For pure thermal convection models, benchmark tests show that our code can achieve high accuracy with a relatively small number of elements both for isoviscous cases (i.e. 7492 AMR elements vs. 65536 uniform elements) and for temperature-dependent viscosity cases (i.e. 14620 AMR elements vs. 65536 uniform elements). We further implement the tracer method in the models for simulating thermal-chemical convection. By appropriately adding and removing tracers according to the refinement of the meshes, our code successfully reproduces the benchmark results in van Keken et al. [1997] with far fewer elements and tracers compared with uniform-mesh models (i.e. 7552 AMR elements vs. 16384 uniform elements, and ~83000 tracers vs. ~410000 tracers). The boundaries of the chemical piles in our AMR code can be easily refined to the scales of a few kilometers for the Earth's mantle and the tracers are concentrated near the chemical boundaries to precisely trace the evolution of the boundaries. Our AMR code is thus well suited to studying thermal-chemical convection problems that need high resolution to resolve the evolution of chemical boundaries, such as the entrainment problems [Sleep, 1988].
Measurement Requirements for Improved Modeling of Arcjet Facility Flows

NASA Technical Reports Server (NTRS)

Fletcher, Douglas G.

2000-01-01

Current efforts to develop new reusable launch vehicles and to pursue low-cost robotic planetary missions have led to a renewed interest in understanding arc-jet flows. Part of this renewed interest is concerned with improving the understanding of arc-jet test results and the potential use of available computational-fluid-dynamic (CFD) codes to aid in this effort. These CFD codes have been extensively developed and tested for application to nonequilibrium, hypersonic flow modeling. It is envisioned, perhaps naively, that the application of these CFD codes to the simulation of arc-jet flows would serve two purposes: first, the codes would help to characterize the nonequilibrium nature of the arc-jet flows; and second, arc-jet experiments could potentially be used to validate the flow models. These two objectives are, to some extent, mutually exclusive. However, the purpose of the present discussion is to address what role CFD codes can play in the current arc-jet flow characterization effort, and whether or not the simulation of arc-jet facility tests can be used to evaluate some of the modeling that is used to formulate these codes. This presentation is organized into several sections. In the introductory section, the development of large-scale, constricted-arc test facilities within NASA is reviewed, and the current state of flow diagnostics using conventional instrumentation is summarized. The motivation for using CFD to simulate arc-jet flows is addressed in the next section, and the basic requirements for CFD models that would be used for these simulations are briefly discussed. This section is followed by a more detailed description of experimental measurements that are needed to initiate credible simulations and to evaluate their fidelity in the different flow regions of an arc-jet facility. Observations from a recent combined computational and experimental investigation of shock-layer flows in a large-scale arc-jet facility are then used to illustrate the current state of development of diagnostic instrumentation, CFD simulations, and general knowledge in the field of arc-jet characterization. Finally, the main points are summarized and recommendations for future efforts are given.
Ab initio MD simulations of Mg2SiO4 liquid at high pressures and temperatures relevant to the Earth's mantle

NASA Astrophysics Data System (ADS)

Martin, G. B.; Kirtman, B.; Spera, F. J.

2010-12-01

Computational studies implementing Density Functional Theory (DFT) methods have become very popular in the Materials Sciences in recent years. DFT codes are now used routinely to simulate properties of geomaterials, mainly silicates and geochemically important metals such as Fe. These materials are ubiquitous in the Earth's mantle and core and in terrestrial exoplanets. Because of computational limitations, most First Principles Molecular Dynamics (FPMD) calculations are done on systems of only 100 atoms for a few picoseconds. While this approach can be useful for calculating physical quantities related to crystal structure, vibrational frequency, and other lattice-scale properties (especially in crystals), it would be useful to be able to compute larger systems, especially for extracting transport properties and coordination statistics. Previous studies have used codes such as VASP where CPU time increases as N^2, making calculations on systems of more than 100 atoms computationally very taxing. SIESTA (Soler, et al. 2002) is an order-N (linear-scaling) DFT code that enables electronic structure and MD computations on larger systems (N 1000) by making approximations such as localized numerical orbitals. Here we test the applicability of SIESTA to simulate geosilicates in the liquid and glass state. We have used SIESTA for MD simulations of liquid Mg2SiO4 at various state points pertinent to the Earth's mantle and congruous with those calculated in a previous DFT study using the VASP code (DeKoker, et al. 2008). The core electronic wave functions of Mg, Si, and O were approximated using pseudopotentials with a core cutoff radius of 1.38, 1.0, and 0.61 Angstroms respectively. The Ceperley-Alder parameterization of the Local Density Approximation (LDA) was used as the exchange-correlation functional. Known systematic overbinding of LDA was corrected with the addition of a pressure term, P 1.6 GPa, which is the pressure calculated by SIESTA at the experimental zero-pressure volume of forsterite under static conditions (Stixrude and Lithgow-Bertelloni 2005). Results are reported here that show SIESTA calculations of T and P on densities in the range of 2.7 - 5.0 g/cc of liquid Mg2SiO4 are similar to the VASP calculations of DeKoker et al. (2008), which used the same functional. This opens the possibility of conducting fast ab initio MD simulations of geomaterials with hundreds of atoms.
User's manual for semi-circular compact range reflector code

NASA Technical Reports Server (NTRS)

Gupta, Inder J.; Burnside, Walter D.

1986-01-01

A computer code was developed to analyze a semi-circular paraboloidal reflector antenna with a rolled edge at the top and a skirt at the bottom. The code can be used to compute the total near field of the antenna or its individual components at a given distance from the center of the paraboloid. Thus, it is very effective in computing the size of the sweet spot for RCS or antenna measurement. The operation of the code is described. Various input and output statements are explained. Some results obtained using the computer code are presented to illustrate the code's capability as well as being samples of input/output sets.
Highly fault-tolerant parallel computation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Spielman, D.A.

We re-introduce the coded model of fault-tolerant computation in which the input and output of a computational device are treated as words in an error-correcting code. A computational device correctly computes a function in the coded model if its input and output, once decoded, are a valid input and output of the function. In the coded model, it is reasonable to hope to simulate all computational devices by devices whose size is greater by a constant factor but which are exponentially reliable even if each of their components can fail with some constant probability. We consider fine-grained parallel computations in which each processor has a constant probability of producing the wrong output at each time step. We show that any parallel computation that runs for time t on w processors can be performed reliably on a faulty machine in the coded model using w log^{O(1)} w processors and time t log^{O(1)} w. The failure probability of the computation will be at most t · exp(-w^{1/4}). The codes used to communicate with our fault-tolerant machines are generalized Reed-Solomon codes and can thus be encoded and decoded in O(n log^{O(1)} n) sequential time and are independent of the machine they are used to communicate with. We also show how coded computation can be used to self-correct many linear functions in parallel with arbitrarily small overhead.
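To give a feel for the bounds quoted in the abstract above, the sketch below evaluates the failure bound t · exp(-w^{1/4}) and a polylogarithmic resource overhead. It is only an illustration: the abstract leaves the O(1) exponent and the logarithm base unspecified, so the parameter `c` and the use of base-2 logarithms are assumptions, and the function names are hypothetical.

```python
import math

def failure_bound(t, w):
    """Upper bound t * exp(-w^(1/4)) on the failure probability."""
    return t * math.exp(-(w ** 0.25))

def coded_resources(t, w, c=1.0):
    """Processors and time of the reliable simulation, up to constants.

    c stands in for the unspecified O(1) exponent of the polylog factor;
    base-2 logarithms are an arbitrary choice made for this sketch.
    """
    polylog = math.log2(w) ** c
    return w * polylog, t * polylog

# Hypothetical workload: 10^6 time steps on 10^8 processors.
t, w = 10**6, 10**8
print("failure bound:", failure_bound(t, w))
print("processors, time:", coded_resources(t, w, c=2.0))
```

The bound is only informative when w is large: for small machines t · exp(-w^{1/4}) can exceed 1, whereas for w = 10^8 the exponential factor is exp(-100).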
TRANSITION FROM KINETIC TO MHD BEHAVIOR IN A COLLISIONLESS PLASMA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Parashar, Tulasi N.; Matthaeus, William H.; Shay, Michael A.

The study of kinetic effects in heliospheric plasmas requires representation of dynamics at sub-proton scales, but in most cases the system is driven by magnetohydrodynamic (MHD) activity at larger scales. The latter requirement challenges available computational resources, which raises the question of how large such a system must be to exhibit MHD traits at large scales while kinetic behavior is accurately represented at small scales. Here we study this implied transition from kinetic to MHD-like behavior using particle-in-cell (PIC) simulations, initialized using an Orszag–Tang Vortex. The PIC code treats protons, as well as electrons, kinetically, and we address the question of interest by examining several different indicators of MHD-like behavior.
An emulator for minimizing computer resources for finite element analysis

NASA Technical Reports Server (NTRS)

Melosh, R.; Utku, S.; Islam, M.; Salama, M.

1984-01-01

A computer code, SCOPE, has been developed for predicting the computer resources required for a given analysis code, computer hardware, and structural problem. The cost of running the code is a small fraction (about 3 percent) of the cost of performing the actual analysis. However, its accuracy in predicting the CPU and I/O resources depends intrinsically on the accuracy of calibration data that must be developed once for the computer hardware and the finite element analysis code of interest. Testing of the SCOPE code on the AMDAHL 470 V/8 computer and the ELAS finite element analysis program indicated small I/O errors (3.2 percent), larger CPU errors (17.8 percent), and negligible total errors (1.5 percent).
A validated non-linear Kelvin-Helmholtz benchmark for numerical hydrodynamics

NASA Astrophysics Data System (ADS)

Lecoanet, D.; McCourt, M.; Quataert, E.; Burns, K. J.; Vasil, G. M.; Oishi, J. S.; Brown, B. P.; Stone, J. M.; O'Leary, R. M.

2016-02-01

The non-linear evolution of the Kelvin-Helmholtz instability is a popular test for code verification. To date, most Kelvin-Helmholtz problems discussed in the literature are ill-posed: they do not converge to any single solution with increasing resolution. This precludes comparisons among different codes and severely limits the utility of the Kelvin-Helmholtz instability as a test problem. The lack of a reference solution has led various authors to assert the accuracy of their simulations based on ad hoc proxies, e.g. the existence of small-scale structures. This paper proposes well-posed two-dimensional Kelvin-Helmholtz problems with smooth initial conditions and explicit diffusion. We show that in many cases numerical errors/noise can seed spurious small-scale structure in Kelvin-Helmholtz problems. We demonstrate convergence to a reference solution using both ATHENA, a Godunov code, and DEDALUS, a pseudo-spectral code. Problems with constant initial density throughout the domain are relatively straightforward for both codes. However, problems with an initial density jump (which are the norm in astrophysical systems) exhibit rich behaviour and are more computationally challenging. In the latter case, ATHENA simulations are prone to an instability of the inner rolled-up vortex; this instability is seeded by grid-scale errors introduced by the algorithm, and disappears as resolution increases. Both ATHENA and DEDALUS exhibit late-time chaos. Inviscid simulations are riddled with extremely vigorous secondary instabilities which induce more mixing than simulations with explicit diffusion. Our results highlight the importance of running well-posed test problems with demonstrated convergence to a reference solution. To facilitate future comparisons, we include as supplementary material the resolved, converged solutions to the Kelvin-Helmholtz problems in this paper in machine-readable form.
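Demonstrating convergence to a reference solution, as the abstract above describes, typically comes down to measuring an error norm at several resolutions and checking how fast it falls. The following is a minimal sketch of that bookkeeping, not the procedure used with ATHENA or DEDALUS; the function names and the sample numbers are hypothetical.

```python
import math

def l2_error(field, reference):
    """Root-mean-square difference between a solution and a reference,
    both sampled at the same points and given as flat lists."""
    if len(field) != len(reference):
        raise ValueError("field and reference must have the same length")
    total = sum((a - b) ** 2 for a, b in zip(field, reference))
    return math.sqrt(total / len(field))

def observed_order(n_coarse, err_coarse, n_fine, err_fine):
    """Estimate p in err ~ n^(-p) from errors at two resolutions."""
    return math.log(err_coarse / err_fine) / math.log(n_fine / n_coarse)

# Hypothetical errors against a reference solution at two resolutions.
print(observed_order(256, 4.0e-3, 512, 1.0e-3))
```

For an ill-posed setup the measured error would stop decreasing, or fail to settle on a consistent order, as the resolution grows, which is the symptom the abstract attributes to most Kelvin-Helmholtz tests in the literature.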
A generalized one-dimensional computer code for turbomachinery cooling passage flow calculations

NASA Technical Reports Server (NTRS)

Kumar, Ganesh N.; Roelke, Richard J.; Meitner, Peter L.

1989-01-01

A generalized one-dimensional computer code for analyzing the flow and heat transfer in the turbomachinery cooling passages was developed. This code is capable of handling rotating cooling passages with turbulators, 180 degree turns, pin fins, finned passages, by-pass flows, tip cap impingement flows, and flow branching. The code is an extension of a one-dimensional code developed by P. Meitner. In the subject code, correlations for both heat transfer coefficient and pressure loss computations were developed to model each of the above-mentioned types of coolant passages. The code has the capability of independently computing the friction factor and heat transfer coefficient on each side of a rectangular passage. Either the mass flow at the inlet to the channel or the exit plane pressure can be specified. For a specified inlet total temperature, inlet total pressure, and exit static pressure, the code computes the flow rates through the main branch and the subbranches and the flow through the tip cap for impingement cooling, in addition to computing the coolant pressure, temperature, and heat transfer coefficient distribution in each coolant flow branch. Predictions from the subject code for both nonrotating and rotating passages agree well with experimental data. The code was used to analyze the cooling passage of a research cooled radial rotor.
Efficient Proximity Computation Techniques Using ZIP Code Data for Smart Cities

PubMed Central

Murdani, Muhammad Harist; Hong, Bonghee

2018-01-01

In this paper, we are interested in computing ZIP code proximity from two perspectives, proximity between two ZIP codes (Ad-Hoc) and neighborhood proximity (Top-K). Such a computation can be used for ZIP code-based target marketing as one of the smart city applications. A naïve approach to this computation is the usage of the distance between ZIP codes. We redefine a distance metric combining the centroid distance with the intersecting road network between ZIP codes by using a weighted sum method. Furthermore, we prove that the results of our combined approach conform to the characteristics of distance measurement. We have proposed a general and heuristic approach for computing Ad-Hoc proximity, while for computing Top-K proximity, we have proposed a general approach only. Our experimental results indicate that our approaches are verifiable and effective in reducing the execution time and search space. PMID:29587366
Efficient Proximity Computation Techniques Using ZIP Code Data for Smart Cities.

PubMed

Murdani, Muhammad Harist; Kwon, Joonho; Choi, Yoon-Ho; Hong, Bonghee

2018-03-24

In this paper, we are interested in computing ZIP code proximity from two perspectives, proximity between two ZIP codes (Ad-Hoc) and neighborhood proximity (Top-K). Such a computation can be used for ZIP code-based target marketing as one of the smart city applications. A naïve approach to this computation is the usage of the distance between ZIP codes. We redefine a distance metric combining the centroid distance with the intersecting road network between ZIP codes by using a weighted sum method. Furthermore, we prove that the results of our combined approach conform to the characteristics of distance measurement. We have proposed a general and heuristic approach for computing Ad-Hoc proximity, while for computing Top-K proximity, we have proposed a general approach only. Our experimental results indicate that our approaches are verifiable and effective in reducing the execution time and search space.
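The weighted-sum metric and the two query types in the abstract above can be sketched as follows. This is not the authors' formulation: the abstract does not say how the road-network term is defined, so the `road_overlap` score (a value in [0, 1], higher meaning more shared roads), the normalised centroid distance, the weight `alpha`, and the placeholder area labels are all assumptions made for illustration.

```python
import heapq

def combined_distance(centroid_dist, road_overlap, alpha=0.5):
    """Weighted sum of a normalised centroid distance and a road term.

    centroid_dist: centroid distance scaled to [0, 1] (assumed input).
    road_overlap:  hypothetical score in [0, 1]; more shared roads
                   means the two areas are treated as closer.
    alpha:         weight of the centroid term, between 0 and 1.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * centroid_dist + (1.0 - alpha) * (1.0 - road_overlap)

def top_k(query, candidates, k, alpha=0.5):
    """Return the k candidate labels closest to `query` (Top-K proximity).

    candidates maps a label to (centroid_dist, road_overlap), both
    measured relative to `query`.
    """
    scored = ((combined_distance(d, r, alpha), label)
              for label, (d, r) in candidates.items() if label != query)
    return [label for _, label in heapq.nsmallest(k, scored)]

# Placeholder areas with made-up measurements relative to area "Q".
candidates = {"A": (0.2, 0.9), "B": (0.1, 0.0), "C": (0.8, 0.5)}
print(combined_distance(0.2, 0.9))   # Ad-Hoc proximity of one pair
print(top_k("Q", candidates, k=2))   # Top-K neighbourhood of "Q"
```

In this toy data, area "B" has the nearest centroid but shares no roads with "Q", so the combined metric ranks "A" ahead of it; that reordering is the kind of effect a pure centroid distance cannot capture.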
The future of PanDA in ATLAS distributed computing

NASA Astrophysics Data System (ADS)

De, K.; Klimentov, A.; Maeno, T.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Petrosyan, A.; Schovancova, J.; Vaniachine, A.; Wenaus, T.

2015-12-01

Experiments at the Large Hadron Collider (LHC) face unprecedented computing challenges. Heterogeneous resources are distributed worldwide at hundreds of sites, thousands of physicists analyse the data remotely, the volume of processed data is beyond the exabyte scale, while data processing requires more than a few billion hours of computing usage per year. The PanDA (Production and Distributed Analysis) system was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. In the process, the old batch job paradigm of locally managed computing in HEP was discarded in favour of a far more automated, flexible and scalable model. The success of PanDA in ATLAS is leading to widespread adoption and testing by other experiments. PanDA is the first exascale workload management system in HEP, already operating at more than a million computing jobs per day, and processing over an exabyte of data in 2013. There are many new challenges that PanDA will face in the near future, in addition to new challenges of scale, heterogeneity and increasing user base. PanDA will need to handle rapidly changing computing infrastructure, will require factorization of code for easier deployment, will need to incorporate additional information sources including network metrics in decision making, be able to control network circuits, handle dynamically sized workload processing, provide improved visualization, and face many other challenges. In this talk we will focus on the new features, planned or recently implemented, that are relevant to the next decade of distributed computing workload management using PanDA.
Exploring a QoS Driven Scheduling Approach for Peer-to-Peer Live Streaming Systems with Network Coding

PubMed Central

Cui, Laizhong; Lu, Nan; Chen, Fu

2014-01-01

Most large-scale peer-to-peer (P2P) live streaming systems use a mesh to organize peers and leverage pull scheduling to transmit packets, which provides robustness in dynamic environments. However, pull scheduling brings large packet delay. Network coding makes push scheduling feasible in mesh P2P live streaming and improves efficiency, but it may also introduce extra delays and coding computational overhead. To improve the packet delay, streaming quality, and coding overhead, we propose a QoS driven push scheduling approach. The main contributions of this paper are: (i) we introduce a new network coding method to increase the content diversity and reduce the complexity of scheduling; (ii) we formulate the push scheduling as an optimization problem and transform it to a min-cost flow problem for solving it in polynomial time; (iii) we propose a push scheduling algorithm to reduce the coding overhead and do extensive experiments to validate the effectiveness of our approach. Compared with previous approaches, the simulation results demonstrate that packet delay, continuity index, and coding ratio of our system can be significantly improved, especially in dynamic environments. PMID:25114968

Center for Technology for Advanced Scientific Component Software (TASCS)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kostadin, Damevski

A resounding success of the Scientific Discovery through Advanced Computing (SciDAC) program is that high-performance computational science is now universally recognized as a critical aspect of scientific discovery [71], complementing both theoretical and experimental research. As scientific communities prepare to exploit unprecedented computing capabilities of emerging leadership-class machines for multi-model simulations at the extreme scale [72], it is more important than ever to address the technical and social challenges of geographically distributed teams that combine expertise in domain science, applied mathematics, and computer science to build robust and flexible codes that can incorporate changes over time. The Center for Technology for Advanced Scientific Component Software (TASCS) tackles these issues by exploiting component-based software development to facilitate collaborative high-performance scientific computing.
OpenGeoSys: Performance-Oriented Computational Methods for Numerical Modeling of Flow in Large Hydrogeological Systems

NASA Astrophysics Data System (ADS)

Naumov, D.; Fischer, T.; Böttcher, N.; Watanabe, N.; Walther, M.; Rink, K.; Bilke, L.; Shao, H.; Kolditz, O.

2014-12-01

OpenGeoSys (OGS) is a scientific open source code for numerical simulation of thermo-hydro-mechanical-chemical processes in porous and fractured media. Its basic concept is to provide a flexible numerical framework for solving multi-field problems for applications in geoscience and hydrology, e.g. CO2 storage, geothermal power plant forecast simulation, salt water intrusion, and water resources management. Advances in computational mathematics have revolutionized the variety and nature of the problems that environmental scientists and engineers can address, and intensive code development in recent years now enables the solution of much larger numerical problems and applications. However, solving environmental processes along the water cycle at large scales, such as complete catchments or reservoirs, remains a computationally challenging task. Therefore, we started a new OGS code development with a focus on execution speed and parallelization. In the new version, a local data structure concept improves the instruction and data cache performance by a tight bundling of data with an element-wise numerical integration loop. Dedicated analysis methods enable the investigation of memory-access patterns in the local and global assembler routines, which leads to further data structure optimization for an additional performance gain. The concept is presented together with a technical code analysis of the recent development and a large case study including transient flow simulation in the unsaturated / saturated zone of the Thuringian Syncline, Germany. The analysis is performed on a high-resolution mesh (up to 50M elements) with embedded fault structures.
Fault-tolerance thresholds for the surface code with fabrication errors

NASA Astrophysics Data System (ADS)

Auger, James M.; Anwar, Hussain; Gimeno-Segovia, Mercedes; Stace, Thomas M.; Browne, Dan E.

2017-10-01

The construction of topological error correction codes requires the ability to fabricate a lattice of physical qubits embedded on a manifold with a nontrivial topology such that the quantum information is encoded in the global degrees of freedom (i.e., the topology) of the manifold. However, the manufacturing of large-scale topological devices will undoubtedly suffer from fabrication errors (permanent faulty components such as missing physical qubits or failed entangling gates), introducing permanent defects into the topology of the lattice and hence significantly reducing the distance of the code and the quality of the encoded logical qubits. In this work we investigate how fabrication errors affect the performance of topological codes, using the surface code as the test bed. A known approach to mitigate defective lattices involves the use of primitive swap gates in a long sequence of syndrome extraction circuits. Instead, we show that in the presence of fabrication errors the syndrome can be determined using the supercheck operator approach and the outcome of the defective gauge stabilizer generators without any additional computational overhead or use of swap gates. We report numerical fault-tolerance thresholds in the presence of both qubit fabrication and gate fabrication errors using a circuit-based noise model and the minimum-weight perfect-matching decoder. Our numerical analysis is most applicable to two-dimensional chip-based technologies, but the techniques presented here can be readily extended to other topological architectures. We find that in the presence of 8% qubit fabrication errors, the surface code can still tolerate a computational error rate of up to 0.1%.
Programming Probabilistic Structural Analysis for Parallel Processing Computer

NASA Technical Reports Server (NTRS)

Sues, Robert H.; Chen, Heh-Chyun; Twisdale, Lawrence A.; Chamis, Christos C.; Murthy, Pappu L. N.

1991-01-01

The ultimate goal of this research program is to make Probabilistic Structural Analysis (PSA) computationally efficient and hence practical for the design environment by achieving large scale parallelism. The paper identifies the multiple levels of parallelism in PSA, identifies methodologies for exploiting this parallelism, describes the development of a parallel stochastic finite element code, and presents results of two example applications. It is demonstrated that speeds within five percent of those theoretically possible can be achieved. A special-purpose numerical technique, the stochastic preconditioned conjugate gradient method, is also presented and demonstrated to be extremely efficient for certain classes of PSA problems.
Electromagnetic plasma simulation in realistic geometries

NASA Astrophysics Data System (ADS)

Brandon, S.; Ambrosiano, J. J.; Nielsen, D.

1991-08-01

Particle-in-Cell (PIC) calculations have become an indispensable tool to model the nonlinear collective behavior of charged particle species in electromagnetic fields. Traditional finite difference codes, such as CONDOR (2-D) and ARGUS (3-D), are used extensively to design experiments and develop new concepts. A wide variety of physical processes can be modeled simply and efficiently by these codes. However, experiments have become more complex. Geometrical shapes and length scales are becoming increasingly difficult to model. Spatial resolution requirements for the electromagnetic calculation force large grids and small time steps. Many hours of CRAY YMP time may be required to complete a 2-D calculation, and many more for 3-D calculations. In principle, the number of mesh points and particles need only be increased until all relevant physical processes are resolved. In practice, the size of a calculation is limited by the computer budget. As a result, experimental design is being limited by the ability to calculate, not by the experimenter's ingenuity or understanding of the physical processes involved. Several approaches to meet these computational demands are being pursued. Traditional PIC codes continue to be the major design tools. These codes are being actively maintained, optimized, and extended to handle larger and more complex problems. Two new formulations are being explored to relax the geometrical constraints of the finite difference codes. A modified finite volume test code, TALUS, uses a data structure compatible with that of standard finite difference meshes. This allows a basic conformal boundary/variable grid capability to be retrofitted to CONDOR. We are also pursuing an unstructured grid finite element code, MadMax. The unstructured mesh approach provides maximum flexibility in the geometrical model while also allowing local mesh refinement.
Extending the length and time scales of Gram–Schmidt Lyapunov vector computations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Costa, Anthony B., E-mail: acosta@northwestern.edu; Green, Jason R., E-mail: jason.green@umb.edu; Department of Chemistry, University of Massachusetts Boston, Boston, MA 02125

Lyapunov vectors have found growing interest recently due to their ability to characterize systems out of thermodynamic equilibrium. The computation of orthogonal Gram–Schmidt vectors requires multiplication and QR decomposition of large matrices, which grow as N^2 (with the particle count N). This expense has limited such calculations to relatively small systems and short time scales. Here, we detail two implementations of an algorithm for computing Gram–Schmidt vectors. The first is a distributed-memory message-passing method using ScaLAPACK. The second uses the newly-released MAGMA library for GPUs. We compare the performance of both codes for Lennard–Jones fluids from N=100 to 1300 between Intel Nehalem/Infiniband DDR and NVIDIA C2050 architectures. To the best of our knowledge, these are the largest systems for which the Gram–Schmidt Lyapunov vectors have been computed, and the first time their calculation has been GPU-accelerated. We conclude that Lyapunov vector calculations can be significantly extended in length and time by leveraging the power of GPU-accelerated linear algebra.
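The core loop behind Gram–Schmidt Lyapunov calculations (propagate a set of tangent vectors, re-orthonormalize them, and accumulate the logarithms of the stretching factors) can be sketched on a tiny system. The example below is not the authors' ScaLAPACK or MAGMA implementation: it substitutes the two-dimensional Hénon map for the Lennard–Jones fluid and uses an explicit Gram–Schmidt step in place of a library QR routine. The function names and parameters are illustrative.

```python
import math

A, B = 1.4, 0.3  # classic Henon map parameters (a stand-in for the fluid dynamics)

def henon(x, y):
    """One step of the Henon map."""
    return 1.0 - A * x * x + y, B * x

def jacobian(x, y):
    """Jacobian of the Henon map at (x, y), as a tuple of rows."""
    return ((-2.0 * A * x, 1.0), (B, 0.0))

def apply(J, v):
    """Matrix-vector product for a 2x2 matrix."""
    return (J[0][0] * v[0] + J[0][1] * v[1], J[1][0] * v[0] + J[1][1] * v[1])

def lyapunov_spectrum(steps=20000, transient=1000):
    x, y = 0.0, 0.0
    for _ in range(transient):          # discard the initial transient
        x, y = henon(x, y)
    e1, e2 = (1.0, 0.0), (0.0, 1.0)     # orthonormal tangent vectors
    s1 = s2 = 0.0
    for _ in range(steps):
        J = jacobian(x, y)
        w1, w2 = apply(J, e1), apply(J, e2)
        # Gram-Schmidt: normalise w1, then remove its component from w2.
        n1 = math.hypot(w1[0], w1[1])
        e1 = (w1[0] / n1, w1[1] / n1)
        proj = w2[0] * e1[0] + w2[1] * e1[1]
        u2 = (w2[0] - proj * e1[0], w2[1] - proj * e1[1])
        n2 = math.hypot(u2[0], u2[1])
        e2 = (u2[0] / n2, u2[1] / n2)
        s1 += math.log(n1)              # the norms play the role of diag(R) in QR
        s2 += math.log(n2)
        x, y = henon(x, y)
    return s1 / steps, s2 / steps

lambda1, lambda2 = lyapunov_spectrum()
```

Because the map's Jacobian determinant has constant magnitude B, the two exponents must sum to ln B, which gives a simple consistency check. For an N-particle system the tangent-space matrices are far larger, which is why the abstract turns to distributed and GPU linear algebra.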
Volume accumulator design analysis computer codes

NASA Technical Reports Server (NTRS)

Whitaker, W. D.; Shimazaki, T. T.

1973-01-01

The computer codes, VANEP and VANES, were written and used to aid in the design and performance calculation of the volume accumulator units (VAU) for the 5-kWe reactor thermoelectric system. VANEP computes the VAU design which meets the primary coolant loop VAU volume and pressure performance requirements. VANES computes the performance of the VAU design, determined from the VANEP code, at the conditions of the secondary coolant loop. The codes can also compute the performance characteristics of the VAUs under conditions of possible modes of failure which still permit continued system operation.
"Hour of Code": Can It Change Students' Attitudes toward Programming?

ERIC Educational Resources Information Center

Du, Jie; Wimmer, Hayden; Rada, Roy

2016-01-01

The Hour of Code is a one-hour introduction to computer science organized by Code.org, a non-profit dedicated to expanding participation in computer science. This study investigated the impact of the Hour of Code on students' attitudes towards computer programming and their knowledge of programming. A sample of undergraduate students from two…
A compositional reservoir simulator on distributed memory parallel computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rame, M.; Delshad, M.

1995-12-31

This paper presents the application of distributed memory parallel computers to field scale reservoir simulations using a parallel version of UTCHEM, The University of Texas Chemical Flooding Simulator. The model is a general purpose highly vectorized chemical compositional simulator that can simulate a wide range of displacement processes at both field and laboratory scales. The original simulator was modified to run on both distributed memory parallel machines (Intel iPSC/860 and Delta, Connection Machine 5, Kendall Square 1 and 2, and CRAY T3D) and a cluster of workstations. A domain decomposition approach has been taken towards parallelization of the code. A portion of the discrete reservoir model is assigned to each processor by a set-up routine that attempts a data layout as even as possible from the load-balance standpoint. Each of these subdomains is extended so that data can be shared between adjacent processors for stencil computation. The added routines that make parallel execution possible are written in a modular fashion that makes the porting to new parallel platforms straightforward. Results of the distributed memory computing performance of the parallel simulator are presented for field scale applications such as tracer flood and polymer flood. A comparison of the wall-clock times for the same problems on a vector supercomputer is also presented.
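The decomposition described above (an even split of the grid, with each subdomain extended by neighbouring cells so that a stencil can be evaluated locally) can be sketched in one dimension. This is a serial illustration under simplifying assumptions, not UTCHEM code: the function names are hypothetical, the stencil is a plain 3-point average, and the loop over subdomains stands in for work that separate processors would do after exchanging ghost cells.

```python
def stencil(u):
    """3-point average on interior points; the two end values are kept unchanged."""
    interior = [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]

def split_even(n, parts):
    """Return (start, stop) index ranges whose sizes differ by at most one cell."""
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for p in range(parts):
        stop = start + base + (1 if p < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds

def decomposed_stencil(u, parts):
    """Apply the stencil subdomain by subdomain, using one ghost cell per side."""
    n = len(u)
    out = []
    for start, stop in split_even(n, parts):
        lo = max(start - 1, 0)          # ghost cell borrowed from the left neighbour
        hi = min(stop + 1, n)           # ghost cell borrowed from the right neighbour
        local = stencil(u[lo:hi])
        offset = start - lo             # skip the left ghost cell, if any
        out.extend(local[offset:offset + (stop - start)])  # keep only owned cells
    return out

field = [float(i * i) for i in range(10)]
serial = stencil(field)
parallel_style = decomposed_stencil(field, parts=3)
```

The decomposed result matches the serial one exactly because every owned cell sees the same neighbours through its ghost cells. In a message-passing code, the slice `u[lo:hi]` would be replaced by a halo exchange between adjacent processors.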
Validation of the "HAMP" mapping algorithm: a tool for long-term trauma research studies in the conversion of AIS 2005 to AIS 98.

PubMed

Adams, Derk; Schreuder, Astrid B; Salottolo, Kristin; Settell, April; Goss, J Richard

2011-07-01

There are significant changes in the abbreviated injury scale (AIS) 2005 system, which make it impractical to compare patients coded in AIS version 98 with patients coded in AIS version 2005. Harborview Medical Center created a computer algorithm "Harborview AIS Mapping Program (HAMP)" to automatically convert AIS 2005 to AIS 98 injury codes. The mapping was validated using 6 months of double-coded patient injury records from a Level I Trauma Center. HAMP was used to determine how closely individual AIS and injury severity scores (ISS) were converted from AIS 2005 to AIS 98 versions. The kappa statistic was used to measure the agreement between manually determined codes and HAMP-derived codes. Seven hundred forty-nine patient records were used for validation. For the conversion of AIS codes, the measure of agreement between HAMP and manually determined codes was κ = 0.84 (95% confidence interval, 0.82-0.86). The algorithm errors were smaller in magnitude than the manually determined coding errors. For the conversion of ISS, the agreement between HAMP and manually determined ISS was κ = 0.81 (95% confidence interval, 0.78-0.84). The HAMP algorithm successfully converted injuries coded in AIS 2005 to AIS 98. This algorithm will be useful when comparing trauma patient clinical data across populations coded in different versions, especially for longitudinal studies.
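For readers unfamiliar with the kappa statistic, a minimal sketch of the unweighted Cohen's kappa for two sets of codes follows. The abstract does not say whether the study used a weighted or unweighted variant, so the unweighted form here is an assumption, and the sample severity codes are made up for illustration.

```python
from collections import Counter

def cohen_kappa(codes_a, codes_b):
    """Unweighted Cohen's kappa: (observed - expected) / (1 - expected).

    Undefined (division by zero) when expected agreement is exactly 1.
    """
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(codes_a)
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    count_a, count_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: product of the two marginal frequencies, summed over labels.
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Hypothetical severity codes: manually determined versus algorithm-derived.
manual = [3, 3, 2, 2]
mapped = [3, 2, 2, 2]
kappa = cohen_kappa(manual, mapped)
```

Here the raw agreement is 0.75 and the chance agreement is 0.5, so kappa is 0.5. Kappa discounts the agreement that two coders would reach by chance alone, which is why it is preferred over a raw percentage match.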
Scalability improvements to NRLMOL for DFT calculations of large molecules

NASA Astrophysics Data System (ADS)

Diaz, Carlos Manuel

Advances in high performance computing (HPC) have provided a way to treat large, computationally demanding tasks using thousands of processors. With the development of more powerful HPC architectures, the need to create efficient and scalable code has grown more important. Electronic structure calculations are valuable in understanding experimental observations and are routinely used for new materials predictions. For electronic structure calculations, the memory and computation time grow with the number of atoms; memory requirements scale as N^2, where N is the number of atoms. While the recent advances in HPC offer platforms with large numbers of cores, the limited amount of memory available on a given node and the poor scalability of the electronic structure code hinder efficient usage of these platforms. This thesis will present some developments to overcome these bottlenecks in order to study large systems. These developments, which are implemented in the NRLMOL electronic structure code, involve the use of sparse matrix storage formats and linear algebra on sparse and distributed matrices. These developments, along with other related developments, now allow ground state density functional calculations using up to 25,000 basis functions and excited state calculations using up to 17,000 basis functions while utilizing all cores on a node. An example on a light-harvesting triad molecule is described. Finally, future plans to further improve the scalability will be presented.
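To make the memory argument concrete, the sketch below shows one common sparse storage format, compressed sparse row (CSR), and a matrix-vector product on it. The abstract does not name the format used in NRLMOL, so CSR is an assumption chosen for illustration, and the function names are hypothetical.

```python
def to_csr(dense):
    """Convert a dense row-major matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))     # where the next row starts in `values`
    return values, col_idx, row_ptr

def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x touching only the stored non-zero entries."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y

dense = [[2, 0, 0],
         [0, 0, 3],
         [0, 4, 0]]
values, col_idx, row_ptr = to_csr(dense)
y = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

Storage is proportional to the number of non-zero entries plus one pointer per row, rather than to the full N^2 entries of the dense matrix.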
Talking about Code: Integrating Pedagogical Code Reviews into Early Computing Courses

ERIC Educational Resources Information Center

Hundhausen, Christopher D.; Agrawal, Anukrati; Agarwal, Pawan

2013-01-01

Given the increasing importance of soft skills in the computing profession, there is good reason to provide students with more opportunities to learn and practice those skills in undergraduate computing courses. Toward that end, we have developed an active learning approach for computing education called the "Pedagogical Code Review"…
MPI implementation of PHOENICS: A general purpose computational fluid dynamics code

NASA Astrophysics Data System (ADS)

Simunovic, S.; Zacharia, T.; Baltas, N.; Spalding, D. B.

1995-03-01

PHOENICS is a suite of computational analysis programs that are used for simulation of fluid flow, heat transfer, and dynamical reaction processes. The parallel version of the solver EARTH for the Computational Fluid Dynamics (CFD) program PHOENICS has been implemented using the Message Passing Interface (MPI) standard. Implementation of the MPI version of PHOENICS makes this computational tool portable to a wide range of parallel machines and enables the use of high performance computing for large scale computational simulations. MPI libraries are available on several parallel architectures, making the program usable across different architectures as well as on heterogeneous computer networks. The Intel Paragon NX and MPI versions of the program have been developed and tested on massively parallel supercomputers Intel Paragon XP/S 5, XP/S 35, and Kendall Square Research, and on the multiprocessor SGI Onyx computer at Oak Ridge National Laboratory. The preliminary testing results of the developed program have shown scalable performance for reasonably sized computational domains.
MPI implementation of PHOENICS: A general purpose computational fluid dynamics code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Simunovic, S.; Zacharia, T.; Baltas, N.

1995-04-01

PHOENICS is a suite of computational analysis programs that are used for simulation of fluid flow, heat transfer, and dynamical reaction processes. The parallel version of the solver EARTH for the Computational Fluid Dynamics (CFD) program PHOENICS has been implemented using the Message Passing Interface (MPI) standard. Implementation of the MPI version of PHOENICS makes this computational tool portable to a wide range of parallel machines and enables the use of high performance computing for large scale computational simulations. MPI libraries are available on several parallel architectures, making the program usable across different architectures as well as on heterogeneous computer networks. The Intel Paragon NX and MPI versions of the program have been developed and tested on massively parallel supercomputers Intel Paragon XP/S 5, XP/S 35, and Kendall Square Research, and on the multiprocessor SGI Onyx computer at Oak Ridge National Laboratory. The preliminary testing results of the developed program have shown scalable performance for reasonably sized computational domains.
Efficient exploration of cosmology dependence in the EFT of LSS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cataneo, Matteo; Foreman, Simon; Senatore, Leonardo, E-mail: matteoc@dark-cosmology.dk, E-mail: sfore@stanford.edu, E-mail: senatore@stanford.edu

The most effective use of data from current and upcoming large scale structure (LSS) and CMB observations requires the ability to predict the clustering of LSS with very high precision. The Effective Field Theory of Large Scale Structure (EFTofLSS) provides an instrument for performing analytical computations of LSS observables with the required precision in the mildly nonlinear regime. In this paper, we develop efficient implementations of these computations that allow for an exploration of their dependence on cosmological parameters. They are based on two ideas. First, once an observable has been computed with high precision for a reference cosmology, for a new cosmology the same can be easily obtained with comparable precision just by adding the difference in that observable, evaluated with much less precision. Second, most cosmologies of interest are sufficiently close to the Planck best-fit cosmology that observables can be obtained from a Taylor expansion around the reference cosmology. These ideas are implemented for the matter power spectrum at two loops and are released as public codes. When applied to cosmologies that are within 3σ of the Planck best-fit model, the first method evaluates the power spectrum in a few minutes on a laptop, with results that have 1% or better precision, while with the Taylor expansion the same quantity is instantly generated with similar precision. The ideas and codes we present may easily be extended for other applications or higher-precision results.
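The two ideas in this abstract can be written schematically as follows. The notation is an assumption introduced for illustration and is not taken from the paper: $\theta$ denotes the cosmological parameters, $\theta_{\rm ref}$ the reference cosmology, $P(k;\theta)$ the matter power spectrum, and the subscripts "hi" and "lo" label high- and low-precision evaluations.

```latex
% Idea 1: high-precision reference plus a cheaply evaluated difference
P(k;\theta) \;\approx\; P_{\rm hi}(k;\theta_{\rm ref})
  + \bigl[\, P_{\rm lo}(k;\theta) - P_{\rm lo}(k;\theta_{\rm ref}) \,\bigr]

% Idea 2: Taylor expansion around the reference cosmology
P(k;\theta) \;\approx\; P(k;\theta_{\rm ref})
  + \sum_i \left.\frac{\partial P}{\partial \theta_i}\right|_{\theta_{\rm ref}}
    \bigl(\theta_i - \theta_{{\rm ref},i}\bigr)
  + \frac{1}{2} \sum_{i,j}
    \left.\frac{\partial^2 P}{\partial \theta_i\,\partial \theta_j}\right|_{\theta_{\rm ref}}
    \bigl(\theta_i - \theta_{{\rm ref},i}\bigr)\bigl(\theta_j - \theta_{{\rm ref},j}\bigr)
  + \cdots
```

In the first expression, the numerical errors of the two low-precision evaluations largely cancel when the new cosmology is close to the reference, so the result keeps roughly the precision of the reference calculation. The order at which the paper truncates the Taylor series is not stated in the abstract.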
Efficient exploration of cosmology dependence in the EFT of LSS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cataneo, Matteo; Foreman, Simon; Senatore, Leonardo

The most effective use of data from current and upcoming large scale structure (LSS) and CMB observations requires the ability to predict the clustering of LSS with very high precision. The Effective Field Theory of Large Scale Structure (EFTofLSS) provides an instrument for performing analytical computations of LSS observables with the required precision in the mildly nonlinear regime. In this paper, we develop efficient implementations of these computations that allow for an exploration of their dependence on cosmological parameters. They are based on two ideas. First, once an observable has been computed with high precision for a reference cosmology, for a new cosmology the same can be easily obtained with comparable precision just by adding the difference in that observable, evaluated with much less precision. Second, most cosmologies of interest are sufficiently close to the Planck best-fit cosmology that observables can be obtained from a Taylor expansion around the reference cosmology. These ideas are implemented for the matter power spectrum at two loops and are released as public codes. When applied to cosmologies that are within 3σ of the Planck best-fit model, the first method evaluates the power spectrum in a few minutes on a laptop, with results that have 1% or better precision, while with the Taylor expansion the same quantity is instantly generated with similar precision. Finally, the ideas and codes we present may easily be extended for other applications or higher-precision results.
Large-scale molecular dynamics simulation of DNA: implementation and validation of the AMBER98 force field in LAMMPS.

PubMed

Grindon, Christina; Harris, Sarah; Evans, Tom; Novik, Keir; Coveney, Peter; Laughton, Charles

2004-07-15

Molecular modelling played a central role in the discovery of the structure of DNA by Watson and Crick. Today, such modelling is done on computers: the more powerful these computers are, the more detailed and extensive can be the study of the dynamics of such biological macromolecules. To fully harness the power of modern massively parallel computers, however, we need to develop and deploy algorithms which can exploit the structure of such hardware. The Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) is a scalable molecular dynamics code including long-range Coulomb interactions, which has been specifically designed to function efficiently on parallel platforms. Here we describe the implementation of the AMBER98 force field in LAMMPS and its validation for molecular dynamics investigations of DNA structure and flexibility against the benchmark of results obtained with the long-established code AMBER6 (Assisted Model Building with Energy Refinement, version 6). Extended molecular dynamics simulations on the hydrated DNA dodecamer d(CTTTTGCAAAAG)(2), which has previously been the subject of extensive dynamical analysis using AMBER6, show that it is possible to obtain excellent agreement in terms of static, dynamic and thermodynamic parameters between AMBER6 and LAMMPS. In comparison with AMBER6, LAMMPS shows greatly improved scalability in massively parallel environments, opening up the possibility of efficient simulations of order-of-magnitude larger systems and/or for order-of-magnitude greater simulation times.
Efficient exploration of cosmology dependence in the EFT of LSS

DOE PAGES

Cataneo, Matteo; Foreman, Simon; Senatore, Leonardo

2017-04-18

The most effective use of data from current and upcoming large scale structure (LSS) and CMB observations requires the ability to predict the clustering of LSS with very high precision. The Effective Field Theory of Large Scale Structure (EFTofLSS) provides an instrument for performing analytical computations of LSS observables with the required precision in the mildly nonlinear regime. In this paper, we develop efficient implementations of these computations that allow for an exploration of their dependence on cosmological parameters. They are based on two ideas. First, once an observable has been computed with high precision for a reference cosmology, for a new cosmology the same can be easily obtained with comparable precision just by adding the difference in that observable, evaluated with much less precision. Second, most cosmologies of interest are sufficiently close to the Planck best-fit cosmology that observables can be obtained from a Taylor expansion around the reference cosmology. These ideas are implemented for the matter power spectrum at two loops and are released as public codes. When applied to cosmologies that are within 3σ of the Planck best-fit model, the first method evaluates the power spectrum in a few minutes on a laptop, with results that have 1% or better precision, while with the Taylor expansion the same quantity is instantly generated with similar precision. Finally, the ideas and codes we present may easily be extended for other applications or higher-precision results.
An Eye Model for Computational Dosimetry Using A Multi-Scale Voxel Phantom

NASA Astrophysics Data System (ADS)

Caracappa, Peter F.; Rhodes, Ashley; Fiedler, Derek

2014-06-01

The lens of the eye is a radiosensitive tissue with cataract formation being the major concern. Recently reduced recommended dose limits to the lens of the eye have made understanding the dose to this tissue of increased importance. Due to memory limitations, the voxel resolution of computational phantoms used for radiation dose calculations is too coarse to accurately represent the dimensions of the eye. A revised eye model is constructed using physiological data for the dimensions of radiosensitive tissues, and is then transformed into a high-resolution voxel model. This eye model is combined with an existing set of whole body models to form a multi-scale voxel phantom, which is used with the MCNPX code to calculate radiation dose from various exposure types. This phantom provides an accurate representation of the radiation transport through the structures of the eye. Two alternate methods of including a high-resolution eye model within an existing whole body model are developed. The accuracy and performance of each method are compared against existing computational phantoms.
DENA: A Configurable Microarchitecture and Design Flow for Biomedical DNA-Based Logic Design.

PubMed

Beiki, Zohre; Jahanian, Ali

2017-10-01

DNA is known as the building block for storing the codes of life and transferring genetic features through the generations. However, it has been found that DNA strands can be used for a new type of computation that opens fascinating horizons in computational medicine. Significant contributions have been made to the design of DNA-based logic gates for medical and computational applications, but there are serious challenges in designing medium and large-scale DNA circuits. In this paper, a new microarchitecture and corresponding design flow is proposed to facilitate the design of multistage large-scale DNA logic systems. Feasibility and efficiency of the proposed microarchitecture are evaluated by implementing a full adder and, then, its cascadability is determined by implementing a multistage 8-bit adder. Simulation results show the advantages of the proposed design style and microarchitecture in terms of the scalability, implementation cost, and signal integrity of the DNA-based logic system compared to the traditional approaches.
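The logic being implemented, a full adder cascaded into a multistage 8-bit adder, can be sketched at the Boolean level. This models only the logic function and says nothing about the DNA strand-level implementation or the DENA design flow; the function names are hypothetical, and a ripple-carry arrangement is assumed as the cascading scheme.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out) for input bits a, b, carry_in."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out

def ripple_add8(x, y):
    """Cascade eight full adders; each stage's carry feeds the next stage."""
    carry = 0
    result = 0
    for i in range(8):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry    # 8-bit sum and the final carry-out

total, carry_out = ripple_add8(200, 100)
```

Each stage depends on the carry from the previous one, so cascadability matters: any signal degradation in a stage's output propagates down the chain.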

Guidelines for developing vectorizable computer programs

NASA Technical Reports Server (NTRS)

Miner, E. W.

1982-01-01

Some fundamental principles for developing computer programs which are compatible with array-oriented computers are presented. The emphasis is on basic techniques for structuring computer codes which are applicable in FORTRAN and do not require a special programming language or exact a significant penalty on a scalar computer. Researchers who are using numerical techniques to solve problems in engineering can apply these basic principles and thus develop transportable computer programs (in FORTRAN) which contain much vectorizable code. The vector architecture of the ASC is discussed so that the requirements of array processing can be better appreciated. The "vectorization" of a finite-difference viscous shock-layer code is used as an example to illustrate the benefits and some of the difficulties involved. Increases in computing speed with vectorization are illustrated with results from the viscous shock-layer code and from a finite-element shock tube code. The applicability of these principles was substantiated through running programs on other computers with array-associated computing characteristics, such as the Hewlett-Packard (H-P) 1000-F.
The Helicopter Antenna Radiation Prediction Code (HARP)

NASA Technical Reports Server (NTRS)

Klevenow, F. T.; Lynch, B. G.; Newman, E. H.; Rojas, R. G.; Scheick, J. T.; Shamansky, H. T.; Sze, K. Y.

1990-01-01

The first nine months' effort in the development of a user-oriented computer code, referred to as the HARP code, for analyzing the radiation from helicopter antennas is described. The HARP code uses modern computer graphics to aid in the description and display of the helicopter geometry. At low frequencies the helicopter is modeled by polygonal plates, and the method of moments is used to compute the desired patterns. At high frequencies the helicopter is modeled by a composite ellipsoid and flat plates, and computations are made using the geometrical theory of diffraction. The HARP code will provide a user-friendly interface, employing modern computer graphics, to help the user describe the helicopter geometry, select the method of computation, construct the desired high or low frequency model, and display the results.
Scaling Critical Zone analysis tasks from desktop to the cloud utilizing contemporary distributed computing and data management approaches: A case study for project based learning of Cyberinfrastructure concepts

NASA Astrophysics Data System (ADS)

Swetnam, T. L.; Pelletier, J. D.; Merchant, N.; Callahan, N.; Lyons, E.

2015-12-01

Earth science is making rapid advances through effective utilization of large-scale data repositories such as aerial LiDAR and access to NSF-funded cyberinfrastructures (e.g. the OpenTopography.org data portal, iPlant Collaborative, and XSEDE). Scaling analysis tasks that are traditionally developed using desktops, laptops or computing clusters to effectively leverage national and regional scale cyberinfrastructure poses unique challenges and barriers to adoption. To address some of these challenges, in Fall 2014 'Applied Cyberinfrastructure Concepts', a project-based learning course (ISTA 420/520) at the University of Arizona, focused on developing scalable models of 'Effective Energy and Mass Transfer' (EEMT, MJ m^-2 yr^-1) for use by the NSF Critical Zone Observatories (CZO) project. EEMT is a quantitative measure of the flux of available energy to the critical zone, and its computation involves inputs that have broad applicability (e.g. solar insolation). The course comprised 25 students with varying levels of computational skills and no prior domain background in the geosciences, who collaborated with domain experts to develop the scalable workflow. The original workflow, relying on the open-source QGIS platform on a laptop, was scaled to effectively utilize cloud environments (OpenStack), UA Campus HPC systems, iRODS, and other XSEDE and OSG resources. The project utilizes public data, e.g. DEMs produced by OpenTopography.org and climate data from Daymet, which are processed using GDAL, GRASS and SAGA and the Makeflow and Work-queue task management software packages. Students were placed into collaborative groups to develop the separate aspects of the project. They were allowed to change teams, alter workflows, and design and develop novel code.
The students were able to identify all necessary dependencies, recompile source onto the target execution platforms, and demonstrate a functional workflow, which was further improved upon by one of the group leaders over Spring 2015. All of the code, documentation and workflow description are currently available on GitHub and a public data portal is in development. We present a case study of how students reacted to the challenge of a real science problem, their interactions with end-users, what went right, and what could be done better in the future.
HAlign-II: efficient ultra-large multiple sequence alignment and phylogenetic tree reconstruction with distributed and parallel computing.

PubMed

Wan, Shixiang; Zou, Quan

2017-01-01

Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. The extreme increase in next-generation sequencing data has resulted in a shortage of efficient ultra-large biological sequence alignment approaches that can cope with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files of more than 1 GB) sequence analyses. Based on HAlign and the Spark distributed computing system, we implement a highly cost-efficient and time-efficient tool, HAlign-II, to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. Experiments on large-scale DNA and protein data sets, with files larger than 1 GB, showed that HAlign-II could save time and space. It outperformed the current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences. HAlign-II shows extremely high memory efficiency and scales well with increases in computing resources. HAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II with open-source codes and datasets was established at http://lab.malab.cn/soft/halign.
Enhanced fault-tolerant quantum computing in d-level systems.

PubMed

Campbell, Earl T

2014-12-05

Error-correcting codes protect quantum information and form the basis of fault-tolerant quantum computing. Leading proposals for fault-tolerant quantum computation require codes with an exceedingly rare property, a transversal non-Clifford gate. Codes with the desired property are presented for d-level qudit systems with prime d. The codes use n=d-1 qudits and can detect up to ∼d/3 errors. We quantify the performance of these codes for one approach to quantum computation known as magic-state distillation. Unlike prior work, we find performance is always enhanced by increasing d.
Convergence acceleration of the Proteus computer code with multigrid methods

NASA Technical Reports Server (NTRS)

Demuren, A. O.; Ibraheem, S. O.

1992-01-01

Presented here is the first part of a study to implement convergence acceleration techniques based on the multigrid concept in the Proteus computer code. A review is given of previous studies on the implementation of multigrid methods in computer codes for compressible flow analysis. Also presented is a detailed stability analysis of upwind and central-difference based numerical schemes for solving the Euler and Navier-Stokes equations. Results are given of a convergence study of the Proteus code on computational grids of different sizes. The results presented here form the foundation for the implementation of multigrid methods in the Proteus code.
Implementation of radiation shielding calculation methods. Volume 1: Synopsis of methods and summary of results

NASA Technical Reports Server (NTRS)

Capo, M. A.; Disney, R. K.

1971-01-01

The work performed in the following areas is summarized: (1) A realistic nuclear-propelled vehicle was analyzed using the Marshall Space Flight Center (MSFC) computer code package. This code package includes one- and two-dimensional discrete ordinate transport, point kernel, and single scatter techniques, as well as cross section preparation and data processing codes. (2) Techniques were developed to improve the automated data transfer in the coupled computation method of the computer code package and to improve the utilization of this code package on the Univac-1108 computer system. (3) The MSFC master data libraries were updated.
Nonuniform code concatenation for universal fault-tolerant quantum computing

NASA Astrophysics Data System (ADS)

Nikahd, Eesa; Sedighi, Mehdi; Saheb Zamani, Morteza

2017-09-01

Using transversal gates is a straightforward and efficient technique for fault-tolerant quantum computing. Since transversal gates alone cannot be computationally universal, they must be combined with other approaches such as magic state distillation, code switching, or code concatenation to achieve universality. In this paper we propose an alternative approach for universal fault-tolerant quantum computing, mainly based on the code concatenation approach proposed in [T. Jochym-O'Connor and R. Laflamme, Phys. Rev. Lett. 112, 010505 (2014), 10.1103/PhysRevLett.112.010505], but in a nonuniform fashion. The proposed approach is described based on nonuniform concatenation of the 7-qubit Steane code with the 15-qubit Reed-Muller code, as well as the 5-qubit code with the 15-qubit Reed-Muller code, which lead to 49-qubit and 47-qubit codes, respectively. These codes can correct an arbitrary single physical error and support a universal set of fault-tolerant gates, without using magic state distillation.
Parallel-vector computation for structural analysis and nonlinear unconstrained optimization problems

NASA Technical Reports Server (NTRS)

Nguyen, Duc T.

1990-01-01

Practical engineering applications can often be formulated as constrained optimization problems. There are several solution algorithms for solving a constrained optimization problem. One approach is to convert a constrained problem into a series of unconstrained problems. Furthermore, unconstrained solution algorithms can be used as part of the constrained solution algorithms. Structural optimization is an iterative process in which one starts with an initial design, and a finite element structural analysis is then performed to calculate the response of the system (such as displacements, stresses, eigenvalues, etc.). Based upon the sensitivity information on the objective and constraint functions, an optimizer such as ADS or IDESIGN can be used to find the new, improved design. For the structural analysis phase, the equation solver for the system of simultaneous linear equations plays a key role, since it is needed for static, eigenvalue, or dynamic analysis. For practical, large-scale structural analysis-synthesis applications, computational time can be excessively large. Thus, it is necessary to have a new structural analysis-synthesis code which employs new solution algorithms to exploit both parallel and vector capabilities offered by modern, high performance computers such as the Convex, Cray-2 and Cray-YMP computers. The objective of this research project is, therefore, to incorporate the latest development in the parallel-vector equation solver, PVSOLVE, into a widely popular finite-element production code such as SAP-4. Furthermore, several nonlinear unconstrained optimization subroutines have also been developed and tested under a parallel computer environment. The unconstrained optimization subroutines are not only useful in their own right, but they can also be incorporated into a more popular constrained optimization code, such as ADS.
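The idea of converting a constrained problem into a series of unconstrained problems, mentioned in the abstract above, can be illustrated with a quadratic penalty method. This is only a sketch under stated assumptions: the abstract does not say which conversion the author used, and the toy objective, the constraint, the function names, and the step-size rule below are all hypothetical and unrelated to ADS, IDESIGN, or PVSOLVE.

```python
def penalized_gradient(x, y, mu):
    """Gradient of f(x, y) + mu * g(x, y)**2 for a hypothetical toy problem.

    f(x, y) = (x - 2)**2 + (y - 1)**2   objective (assumed for illustration)
    g(x, y) = x + y - 2 = 0             equality constraint (assumed)
    """
    g = x + y - 2.0
    return (2.0 * (x - 2.0) + 2.0 * mu * g,
            2.0 * (y - 1.0) + 2.0 * mu * g)


def minimize_unconstrained(x, y, mu, iterations=2000):
    """Plain gradient descent on the penalized (unconstrained) objective."""
    # Step = 1 / (largest Hessian eigenvalue), which is 2 + 4*mu for this toy problem.
    step = 1.0 / (2.0 + 4.0 * mu)
    for _ in range(iterations):
        gx, gy = penalized_gradient(x, y, mu)
        x -= step * gx
        y -= step * gy
    return x, y


def penalty_method(x=0.0, y=0.0, mu=1.0, growth=10.0, outer_steps=6):
    """Solve a sequence of unconstrained problems with a growing penalty weight."""
    for _ in range(outer_steps):
        # Warm-start each unconstrained solve from the previous solution.
        x, y = minimize_unconstrained(x, y, mu)
        mu *= growth
    return x, y


if __name__ == "__main__":
    print(penalty_method())
```

Each outer step is an ordinary unconstrained minimization; as the penalty weight grows, the iterates approach the constrained optimum, which for this toy problem is the point (1.5, 0.5) on the line x + y = 2.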
The use of imprecise processing to improve accuracy in weather & climate prediction

NASA Astrophysics Data System (ADS)

Düben, Peter D.; McNamara, Hugh; Palmer, T. N.

2014-08-01

The use of stochastic processing hardware and low precision arithmetic in atmospheric models is investigated. Stochastic processors allow hardware-induced faults in calculations, sacrificing bit-reproducibility and precision in exchange for improvements in performance and potentially accuracy of forecasts, due to a reduction in power consumption that could allow higher resolution. A similar trade-off is achieved using low precision arithmetic, with improvements in computation and communication speed and savings in storage and memory requirements. As high-performance computing becomes more massively parallel and power intensive, these two approaches may be important stepping stones in the pursuit of global cloud-resolving atmospheric modelling. The impact of both hardware induced faults and low precision arithmetic is tested using the Lorenz '96 model and the dynamical core of a global atmosphere model. In the Lorenz '96 model there is a natural scale separation; the spectral discretisation used in the dynamical core also allows large and small scale dynamics to be treated separately within the code. Such scale separation allows the impact of lower-accuracy arithmetic to be restricted to components close to the truncation scales and hence close to the necessarily inexact parametrised representations of unresolved processes. By contrast, the larger scales are calculated using high precision deterministic arithmetic. Hardware faults from stochastic processors are emulated using a bit-flip model with different fault rates. Our simulations show that both approaches to inexact calculations do not substantially affect the large scale behaviour, provided they are restricted to act only on smaller scales. By contrast, results from the Lorenz '96 simulations are superior when small scales are calculated on an emulated stochastic processor than when those small scales are parametrised. 
This suggests that inexact calculations at the small scale could reduce computation and power costs without adversely affecting the quality of the simulations. This would allow higher resolution models to be run at the same computational cost.
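A bit-flip fault model of the kind mentioned above can be sketched in a few lines. The abstract does not describe the authors' emulator, so everything below is an assumption made for illustration: the function names, the choice to inject faults only into the result of a multiplication, and the restriction of flips to the 52 fraction bits of an IEEE 754 double.

```python
import random
import struct


def flip_bit(value, bit):
    """Return the float64 `value` with one bit of its IEEE 754 pattern inverted."""
    (pattern,) = struct.unpack("<Q", struct.pack("<d", value))
    (faulty,) = struct.unpack("<d", struct.pack("<Q", pattern ^ (1 << bit)))
    return faulty


def faulty_multiply(a, b, fault_rate, rng, max_bit=51):
    """Multiply a and b; with probability `fault_rate`, flip one fraction bit.

    Bits 0..51 are the fraction (mantissa) bits of a float64, so a flip
    changes the result by less than a factor of two. Limiting faults to
    these bits is an assumption of this sketch.
    """
    result = a * b
    if rng.random() < fault_rate:
        result = flip_bit(result, rng.randrange(max_bit + 1))
    return result


if __name__ == "__main__":
    rng = random.Random(0)
    print([faulty_multiply(2.0, 3.0, 0.5, rng) for _ in range(5)])
```

In the spirit of the scale separation described in the abstract, a model would call such a faulty operation only when updating small-scale variables and keep ordinary arithmetic for the large scales.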
Development of a GPU Compatible Version of the Fast Radiation Code RRTMG

NASA Astrophysics Data System (ADS)

Iacono, M. J.; Mlawer, E. J.; Berthiaume, D.; Cady-Pereira, K. E.; Suarez, M.; Oreopoulos, L.; Lee, D.

2012-12-01

The absorption of solar radiation and emission/absorption of thermal radiation are crucial components of the physics that drive Earth's climate and weather. Therefore, accurate radiative transfer calculations are necessary for realistic climate and weather simulations. Efficient radiation codes have been developed for this purpose, but their accuracy requirements still necessitate that as much as 30% of the computational time of a GCM is spent computing radiative fluxes and heating rates. The overall computational expense constitutes a limitation on a GCM's predictive ability if it becomes an impediment to adding new physics to or increasing the spatial and/or vertical resolution of the model. The emergence of Graphics Processing Unit (GPU) technology, which will allow the parallel computation of multiple independent radiative calculations in a GCM, will lead to a fundamental change in the competition between accuracy and speed. Processing time previously consumed by radiative transfer will now be available for the modeling of other processes, such as physics parameterizations, without any sacrifice in the accuracy of the radiative transfer. Furthermore, fast radiation calculations can be performed much more frequently and will allow the modeling of radiative effects of rapid changes in the atmosphere. The fast radiation code RRTMG, developed at Atmospheric and Environmental Research (AER), is utilized operationally in many dynamical models throughout the world. We will present the results from the first stage of an effort to create a version of the RRTMG radiation code designed to run efficiently in a GPU environment. This effort will focus on the RRTMG implementation in GEOS-5. 
RRTMG has an internal pseudo-spectral vector of length of order 100 that, when combined with the much greater length of the global horizontal grid vector from which the radiation code is called in GEOS-5, makes RRTMG/GEOS-5 particularly suited to achieving a significant speed improvement through GPU technology. This large number of independent cases will allow us to take full advantage of the computational power of the latest GPUs, ensuring that all thread cores in the GPU remain active, a key criterion for obtaining significant speedup. The CUDA (Compute Unified Device Architecture) Fortran compiler developed by PGI and Nvidia will allow us to construct this parallel implementation on the GPU while remaining in the Fortran language. This implementation will scale very well across various CUDA-supported GPUs such as the recently released Fermi Nvidia cards. We will present the computational speed improvements of the GPU-compatible code relative to the standard CPU-based RRTMG with respect to a very large and diverse suite of atmospheric profiles. This suite will also be utilized to demonstrate the minimal impact of the code restructuring on the accuracy of radiation calculations. The GPU-compatible version of RRTMG will be directly applicable to future versions of GEOS-5, but it is also likely to provide significant associated benefits for other GCMs that employ RRTMG.
Implementation of a 3D version of ponderomotive guiding center solver in particle-in-cell code OSIRIS

NASA Astrophysics Data System (ADS)

Helm, Anton; Vieira, Jorge; Silva, Luis; Fonseca, Ricardo

2016-10-01

Laser-driven accelerators have gained increased attention over the past decades. Typical modeling techniques for laser wakefield acceleration (LWFA) are based on particle-in-cell (PIC) simulations. PIC simulations, however, are very computationally expensive due to the disparity of the relevant scales, ranging from the laser wavelength, in the micrometer range, to the acceleration length, currently beyond the ten centimeter range. To minimize the gap between these disparate scales, the ponderomotive guiding center (PGC) algorithm is a promising approach. By describing the evolution of the laser pulse envelope separately, only the scales larger than the plasma wavelength are required to be resolved in the PGC algorithm, leading to speedups of several orders of magnitude. Previous work was limited to two dimensions. Here we present the implementation of the 3D version of a PGC solver into the massively parallel, fully relativistic PIC code OSIRIS. We extended the solver to include periodic boundary conditions and parallelization in all spatial dimensions. We present benchmarks for distributed and shared memory parallelization. We also discuss the stability of the PGC solver.
The accurate particle tracer code

NASA Astrophysics Data System (ADS)

Wang, Yulei; Liu, Jian; Qin, Hong; Yu, Zhi; Yao, Yicun

2017-11-01

The Accurate Particle Tracer (APT) code is designed for systematic large-scale applications of geometric algorithms for particle dynamical simulations. Based on a large variety of advanced geometric algorithms, APT possesses long-term numerical accuracy and stability, which are critical for solving multi-scale and nonlinear problems. To provide a flexible and convenient I/O interface, the Lua and HDF5 libraries are used. Following a three-step procedure, users can efficiently extend the libraries of electromagnetic configurations, external non-electromagnetic forces, particle pushers, and initialization approaches by use of the extendible module. APT has been used in simulations of key physical problems, such as runaway electrons in tokamaks and energetic particles in the Van Allen belt. As an important realization, the APT-SW version has been successfully deployed on the world's fastest computer, the Sunway TaihuLight supercomputer, by supporting the master-slave architecture of Sunway many-core processors. Based on large-scale simulations of a runaway beam under parameters of the ITER tokamak, it is revealed that the magnetic ripple field can disperse the pitch-angle distribution significantly and improve the confinement of the energetic runaway beam at the same time.
Advances in Parallelization for Large Scale Oct-Tree Mesh Generation

NASA Technical Reports Server (NTRS)

O'Connell, Matthew; Karman, Steve L.

2015-01-01

Despite great advancements in the parallelization of numerical simulation codes over the last 20 years, it is still common to perform grid generation in serial. Generating large scale grids in serial often requires using special "grid generation" compute machines that can have more than ten times the memory of average machines. While some parallel mesh generation techniques have been proposed, generating very large meshes for LES or aeroacoustic simulations is still a challenging problem. An automated method for the parallel generation of very large scale off-body hierarchical meshes is presented here. This work enables large scale parallel generation of off-body meshes by using a novel combination of parallel grid generation techniques and a hybrid "top down" and "bottom up" oct-tree method. Meshes are generated using hardware commonly found in parallel compute clusters. The capability to generate very large meshes is demonstrated by the generation of off-body meshes surrounding complex aerospace geometries. Results are shown including a one billion cell mesh generated around a Predator Unmanned Aerial Vehicle geometry, which was generated on 64 processors in under 45 minutes.
Multi-Functional Composite Fatigue

NASA Technical Reports Server (NTRS)

Minnetyan, Levon; Chamis, Christos C.

2008-01-01

Damage and fracture of composites subjected to monotonically increasing static, tension-tension cyclic, pressurization, and flexural cyclic loading are evaluated via a recently developed composite mechanics code that allows the user to focus on composite response at infinitely small scales. Constituent material properties, stress and strain limits are scaled up to the laminate level to evaluate the overall damage and durability. Results show the number of cycles to failure at different temperatures. A procedure is outlined for use of computational simulation data in the assessment of damage tolerance, determination of sensitive parameters affecting fracture, and interpretation of results with insight for design decisions.
Large Eddy simulation of turbulence: A subgrid scale model including shear, vorticity, rotation, and buoyancy

NASA Technical Reports Server (NTRS)

Canuto, V. M.

1994-01-01

The Reynolds numbers that characterize geophysical and astrophysical turbulence (Re approximately equals 10(exp 8) for the planetary boundary layer and Re approximately equals 10(exp 14) for the Sun's interior) are too large to allow a direct numerical simulation (DNS) of the fundamental Navier-Stokes and temperature equations. In fact, the spatial number of grid points N approximately Re(exp 9/4) exceeds the computational capability of today's supercomputers. Alternative treatments are the ensemble-time average approach, and/or the volume average approach. Since the first method (Reynolds stress approach) is largely analytical, the resulting turbulence equations entail manageable computational requirements and can thus be linked to a stellar evolutionary code or, in the geophysical case, to general circulation models. In the volume average approach, one carries out a large eddy simulation (LES) which resolves numerically the largest scales, while the unresolved scales must be treated theoretically with a subgrid scale model (SGS). Contrary to the ensemble average approach, the LES+SGS approach has considerable computational requirements. Even if this prevents (for the time being) a LES+SGS model to be linked to stellar or geophysical codes, it is still of the greatest relevance as an 'experimental tool' to be used, inter alia, to improve the parameterizations needed in the ensemble average approach. Such a methodology has been successfully adopted in studies of the convective planetary boundary layer. Experience with the LES+SGS approach from different fields has shown that its reliability depends on the healthiness of the SGS model for numerical stability as well as for physical completeness. At present, the most widely used SGS model, the Smagorinsky model, accounts for the effect of the shear induced by the large resolved scales on the unresolved scales but does not account for the effects of buoyancy, anisotropy, rotation, and stable stratification.
The latter phenomenon, which affects both geophysical and astrophysical turbulence (e.g., oceanic structure and convective overshooting in stars), has been singularly difficult to account for in turbulence modeling. For example, the widely used model of Deardorff has not been confirmed by recent LES results. As of today, there is no SGS model capable of incorporating buoyancy, rotation, shear, anisotropy, and stable stratification (gravity waves). In this paper, we construct such a model which we call CM (complete model). We also present a hierarchy of simpler algebraic models (called AM) of varying complexity. Finally, we present a set of models which are simplified even further (called SM), the simplest of which is the Smagorinsky-Lilly model. The incorporation of these models into the presently available LES codes should begin with the SM, to be followed by the AM and finally by the CM.
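To make the resolution argument concrete, the grid-point estimate quoted in the abstract can be evaluated for its two Reynolds numbers. This is plain arithmetic on the abstract's own figures and ignores any prefactor in the scaling.

```latex
N \sim Re^{9/4}: \qquad
Re = 10^{8} \;\Rightarrow\; N \sim 10^{18}, \qquad
Re = 10^{14} \;\Rightarrow\; N \sim 10^{31.5} \approx 3 \times 10^{31}.
```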
Large Eddy simulation of turbulence: A subgrid scale model including shear, vorticity, rotation, and buoyancy

NASA Astrophysics Data System (ADS)

Canuto, V. M.

1994-06-01

The Reynolds numbers that characterize geophysical and astrophysical turbulence (Re approximately equals 10^8 for the planetary boundary layer and Re approximately equals 10^14 for the Sun's interior) are too large to allow a direct numerical simulation (DNS) of the fundamental Navier-Stokes and temperature equations. In fact, the spatial number of grid points N approximately Re^(9/4) exceeds the computational capability of today's supercomputers. Alternative treatments are the ensemble-time average approach, and/or the volume average approach. Since the first method (Reynolds stress approach) is largely analytical, the resulting turbulence equations entail manageable computational requirements and can thus be linked to a stellar evolutionary code or, in the geophysical case, to general circulation models. In the volume average approach, one carries out a large eddy simulation (LES) which resolves numerically the largest scales, while the unresolved scales must be treated theoretically with a subgrid scale model (SGS). Contrary to the ensemble average approach, the LES+SGS approach has considerable computational requirements. Even if this prevents (for the time being) a LES+SGS model to be linked to stellar or geophysical codes, it is still of the greatest relevance as an 'experimental tool' to be used, inter alia, to improve the parameterizations needed in the ensemble average approach. Such a methodology has been successfully adopted in studies of the convective planetary boundary layer. Experience with the LES+SGS approach from different fields has shown that its reliability depends on the healthiness of the SGS model for numerical stability as well as for physical completeness. At present, the most widely used SGS model, the Smagorinsky model, accounts for the effect of the shear induced by the large resolved scales on the unresolved scales but does not account for the effects of buoyancy, anisotropy, rotation, and stable stratification.
The latter phenomenon, which affects both geophysical and astrophysical turbulence (e.g., oceanic structure and convective overshooting in stars), has been singularly difficult to account for in turbulence modeling. For example, the widely used model of Deardorff has not been confirmed by recent LES results. As of today, there is no SGS model capable of incorporating buoyancy, rotation, shear, anisotropy, and stable stratification (gravity waves). In this paper, we construct such a model which we call CM (complete model). We also present a hierarchy of simpler algebraic models (called AM) of varying complexity. Finally, we present a set of models which are simplified even further (called SM), the simplest of which is the Smagorinsky-Lilly model. The incorporation of these models into the presently available LES codes should begin with the SM, to be followed by the AM and finally by the CM.
Green's function methods in heavy ion shielding

NASA Technical Reports Server (NTRS)

Wilson, John W.; Costen, Robert C.; Shinn, Judy L.; Badavi, Francis F.

1993-01-01

An analytic solution to the heavy ion transport in terms of Green's function is used to generate a highly efficient computer code for space applications. The efficiency of the computer code is accomplished by a nonperturbative technique extending Green's function over the solution domain. The computer code can also be applied to accelerator boundary conditions to allow code validation in laboratory experiments.
Magnetic Feature Tracking in the SDO Era: Past Sacrifices, Recent Advances, and Future Possibilities

NASA Astrophysics Data System (ADS)

Lamb, D. A.; DeForest, C. E.; Van Kooten, S.

2014-12-01

When implementing computer vision codes, a common reaction to the high angular resolution and the high cadence of SDO's image products has been to reduce the resolution and cadence of the data so that it "looks like" SOHO data. This can be partially justified on physical grounds: if the phenomenon that a computer vision code is trying to detect was characterized in low-resolution, low cadence data, then the higher quality data may not be needed. But sacrificing at least two, and sometimes all four main advantages of SDO's imaging data (the other two being a higher duty cycle and additional data products) threatens to also discard the perhaps more subtle discoveries waiting to be made: a classic baby-with-the-bath-water situation. In this presentation, we discuss some of the sacrifices made in implementing SWAMIS-EF, an automatic emerging magnetic flux region detection code for SDO/HMI, and how those sacrifices simultaneously simplified and complicated development of the code. SWAMIS-EF is a feature-finding code, and we will describe some situations and analyses in which a feature-finding code excels, and some in which a different type of algorithm may produce more favorable results. In particular, because the solar magnetic field is irreducibly complex at the currently observed spatial scales, searching for phenomena such as flux emergence using even semi-strict physical criteria often leads to large numbers of false or missed detections. This undesirable behavior can be mitigated by relaxing the imposed physical criteria, but here too there are tradeoffs: decreased numbers of missed detections may increase the number of false detections if the selection criteria are not both sensitive and specific to the searched-for phenomenon. 
Finally, we describe some recent steps we have taken to overcome these obstacles, by fully embracing the high resolution, high cadence SDO data, optimizing and partially parallelizing our existing code as a first step to allow fast magnetic feature tracking of full resolution HMI magnetograms. Even with the above caveats, if used correctly such a tool can provide a wealth of information on the positions, motions, and patterns of features, enabling large, cross-scale analyses that can answer important questions related to the solar dynamo and to coronal heating.
TOPICAL REVIEW: Advances and challenges in computational plasma science

NASA Astrophysics Data System (ADS)

Tang, W. M.; Chan, V. S.

2005-02-01

Scientific simulation, which provides a natural bridge between theory and experiment, is an essential tool for understanding complex plasma behaviour. Recent advances in simulations of magnetically confined plasmas are reviewed in this paper, with illustrative examples, chosen from associated research areas such as microturbulence, magnetohydrodynamics and other topics. Progress has been stimulated, in particular, by the exponential growth of computer speed along with significant improvements in computer technology. The advances in both particle and fluid simulations of fine-scale turbulence and large-scale dynamics have produced increasingly good agreement between experimental observations and computational modelling. This was enabled by two key factors: (a) innovative advances in analytic and computational methods for developing reduced descriptions of physics phenomena spanning widely disparate temporal and spatial scales and (b) access to powerful new computational resources. Excellent progress has been made in developing codes for which computer run-time and problem-size scale well with the number of processors on massively parallel processors (MPPs). Examples include the effective usage of the full power of multi-teraflop (multi-trillion floating point computations per second) MPPs to produce three-dimensional, general geometry, nonlinear particle simulations that have accelerated advances in understanding the nature of turbulence self-regulation by zonal flows. These calculations, which typically utilized billions of particles for thousands of time-steps, would not have been possible without access to powerful present generation MPP computers and the associated diagnostic and visualization capabilities. In looking towards the future, the current results from advanced simulations provide great encouragement for being able to include increasingly realistic dynamics to enable deeper physics insights into plasmas in both natural and laboratory environments. 
This should produce the scientific excitement which will help to (a) stimulate enhanced cross-cutting collaborations with other fields and (b) attract the bright young talent needed for the future health of the field of plasma science.

Advances and challenges in computational plasma science

NASA Astrophysics Data System (ADS)

Tang, W. M.

2005-02-01

Scientific simulation, which provides a natural bridge between theory and experiment, is an essential tool for understanding complex plasma behaviour. Recent advances in simulations of magnetically confined plasmas are reviewed in this paper, with illustrative examples, chosen from associated research areas such as microturbulence, magnetohydrodynamics and other topics. Progress has been stimulated, in particular, by the exponential growth of computer speed along with significant improvements in computer technology. The advances in both particle and fluid simulations of fine-scale turbulence and large-scale dynamics have produced increasingly good agreement between experimental observations and computational modelling. This was enabled by two key factors: (a) innovative advances in analytic and computational methods for developing reduced descriptions of physics phenomena spanning widely disparate temporal and spatial scales and (b) access to powerful new computational resources. Excellent progress has been made in developing codes for which computer run-time and problem-size scale well with the number of processors on massively parallel processors (MPPs). Examples include the effective usage of the full power of multi-teraflop (multi-trillion floating point computations per second) MPPs to produce three-dimensional, general geometry, nonlinear particle simulations that have accelerated advances in understanding the nature of turbulence self-regulation by zonal flows. These calculations, which typically utilized billions of particles for thousands of time-steps, would not have been possible without access to powerful present generation MPP computers and the associated diagnostic and visualization capabilities. In looking towards the future, the current results from advanced simulations provide great encouragement for being able to include increasingly realistic dynamics to enable deeper physics insights into plasmas in both natural and laboratory environments. 
This should produce the scientific excitement which will help to (a) stimulate enhanced cross-cutting collaborations with other fields and (b) attract the bright young talent needed for the future health of the field of plasma science.
Comprehensive Micromechanics-Analysis Code - Version 4.0

NASA Technical Reports Server (NTRS)

Arnold, S. M.; Bednarcyk, B. A.

2005-01-01

Version 4.0 of the Micromechanics Analysis Code With Generalized Method of Cells (MAC/GMC) has been developed as an improved means of computational simulation of advanced composite materials. The previous version of MAC/GMC was described in "Comprehensive Micromechanics-Analysis Code" (LEW-16870), NASA Tech Briefs, Vol. 24, No. 6 (June 2000), page 38. To recapitulate: MAC/GMC is a computer program that predicts the elastic and inelastic thermomechanical responses of continuous and discontinuous composite materials with arbitrary internal microstructures and reinforcement shapes. The predictive capability of MAC/GMC rests on a model known as the generalized method of cells (GMC) - a continuum-based model of micromechanics that provides closed-form expressions for the macroscopic response of a composite material in terms of the properties, sizes, shapes, and responses of the individual constituents or phases that make up the material. Enhancements in version 4.0 include a capability for modeling thermomechanically and electromagnetically coupled ("smart") materials; a more-accurate (high-fidelity) version of the GMC; a capability to simulate discontinuous plies within a laminate; additional constitutive models of materials; expanded yield-surface-analysis capabilities; and expanded failure-analysis and life-prediction capabilities on both the microscopic and macroscopic scales.
Analytical modeling of operating characteristics of premixing-prevaporizing fuel-air mixing passages. Volume 2: User's manual

NASA Technical Reports Server (NTRS)

Anderson, O. L.; Chiappetta, L. M.; Edwards, D. E.; Mcvey, J. B.

1982-01-01

A user's manual describing the operation of three computer codes (ADD code, PTRAK code, and VAPDIF code) is presented. The general features of the computer codes, the input/output formats, run streams, and sample input cases are described.
Automated apparatus and method of generating native code for a stitching machine

NASA Technical Reports Server (NTRS)

Miller, Jeffrey L. (Inventor)

2000-01-01

A computer system automatically generates CNC code for a stitching machine. The computer determines the locations of a present stitching point and a next stitching point. If a constraint is not found between the present stitching point and the next stitching point, the computer generates code for making a stitch at the next stitching point. If a constraint is found, the computer generates code for changing a condition (e.g., direction) of the stitching machine's stitching head.
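The decision rule in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the patented method: the instruction strings `STITCH` and `CHANGE_HEAD_CONDITION` and the `has_constraint` callback are hypothetical names, and the abstract does not say what a real constraint test or real CNC output looks like.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]


def generate_stitch_code(
    points: List[Point],
    has_constraint: Callable[[Point, Point], bool],
) -> List[str]:
    """Walk consecutive stitching points and emit one pseudo-instruction per step.

    The instruction names are placeholders, not real CNC codes.
    """
    program: List[str] = []
    for present, nxt in zip(points, points[1:]):
        if has_constraint(present, nxt):
            # Constraint found: change a condition of the stitching head
            # (for example its direction) instead of stitching.
            program.append(f"CHANGE_HEAD_CONDITION before {nxt}")
        else:
            # No constraint: make a stitch at the next stitching point.
            program.append(f"STITCH at {nxt}")
    return program


# Hypothetical constraint: a move that keeps the same x coordinate.
demo = generate_stitch_code(
    [(0, 0), (1, 0), (1, 1)],
    has_constraint=lambda a, b: a[0] == b[0],
)
```

Passing the constraint test in as a callback keeps the traversal logic separate from whatever geometric check an actual machine would need.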
Computer codes developed and under development at Lewis

NASA Technical Reports Server (NTRS)

Chamis, Christos C.

1992-01-01

The objective of this summary is to provide a brief description of: (1) codes developed or under development at LeRC; and (2) the development status of IPACS with some typical early results. The computer codes that have been developed and/or are under development at LeRC are listed in the accompanying charts. This list includes: (1) the code acronym; (2) select physics descriptors; (3) current enhancements; and (4) present (9/91) code status with respect to its availability and documentation. The computer codes list is grouped by related functions such as: (1) composite mechanics; (2) composite structures; (3) integrated and 3-D analysis; (4) structural tailoring; and (5) probabilistic structural analysis. These codes provide a broad computational simulation infrastructure (technology base-readiness) for assessing the structural integrity/durability/reliability of propulsion systems. These codes serve two other very important functions: they provide an effective means of technology transfer; and they constitute a depository of corporate memory.
Contributions of the ARM Program to Radiative Transfer Modeling for Climate and Weather Applications

NASA Technical Reports Server (NTRS)

Mlawer, Eli J.; Iacono, Michael J.; Pincus, Robert; Barker, Howard W.; Oreopoulos, Lazaros; Mitchell, David L.

2016-01-01

Accurate climate and weather simulations must account for all relevant physical processes and their complex interactions. Each of these atmospheric, ocean, and land processes must be considered on an appropriate spatial and temporal scale, which leads these simulations to require a substantial computational burden. One especially critical physical process is the flow of solar and thermal radiant energy through the atmosphere, which controls planetary heating and cooling and drives the large-scale dynamics that moves energy from the tropics toward the poles. Radiation calculations are therefore essential for climate and weather simulations, but are themselves quite complex even without considering the effects of variable and inhomogeneous clouds. Clear-sky radiative transfer calculations have to account for thousands of absorption lines due to water vapor, carbon dioxide, and other gases, which are irregularly distributed across the spectrum and have shapes dependent on pressure and temperature. The line-by-line (LBL) codes that treat these details have a far greater computational cost than can be afforded by global models. Therefore, the crucial requirement for accurate radiation calculations in climate and weather prediction models must be satisfied by fast solar and thermal radiation parameterizations with a high level of accuracy that has been demonstrated through extensive comparisons with LBL codes. See attachment for continuation.
Advanced Computation in Plasma Physics

NASA Astrophysics Data System (ADS)

Tang, William

2001-10-01

Scientific simulation in tandem with theory and experiment is an essential tool for understanding complex plasma behavior. This talk will review recent progress and future directions for advanced simulations in magnetically-confined plasmas with illustrative examples chosen from areas such as microturbulence, magnetohydrodynamics, magnetic reconnection, and others. Significant recent progress has been made in both particle and fluid simulations of fine-scale turbulence and large-scale dynamics, giving increasingly good agreement between experimental observations and computational modeling. This was made possible by innovative advances in analytic and computational methods for developing reduced descriptions of physics phenomena spanning widely disparate temporal and spatial scales together with access to powerful new computational resources. In particular, the fusion energy science community has made excellent progress in developing advanced codes for which computer run-time and problem size scale well with the number of processors on massively parallel machines (MPP's). A good example is the effective usage of the full power of multi-teraflop MPP's to produce 3-dimensional, general geometry, nonlinear particle simulations which have accelerated progress in understanding the nature of turbulence self-regulation by zonal flows. It should be emphasized that these calculations, which typically utilized billions of particles for tens of thousands of time-steps, would not have been possible without access to powerful present generation MPP computers and the associated diagnostic and visualization capabilities. In general, results from advanced simulations provide great encouragement for being able to include increasingly realistic dynamics to enable deeper physics insights into plasmas in both natural and laboratory environments. 
The associated scientific excitement should serve to stimulate improved cross-cutting collaborations with other fields and also to help attract bright young talent to plasma science.
Measurement and computer simulation of antennas on ships and aircraft for results of operational reliability

NASA Astrophysics Data System (ADS)

Kubina, Stanley J.

1989-09-01

The review of the status of computational electromagnetics by Miller and the exposition by Burke of the developments in one of the more important computer codes in the application of the electric field integral equation method, the Numerical Electromagnetic Code (NEC), coupled with Molinet's summary of progress in techniques based on the Geometrical Theory of Diffraction (GTD), provide a clear perspective on the maturity of the modern discipline of computational electromagnetics and its potential. Audone's exposition of the application to the computation of Radar Scattering Cross-section (RCS) is an indication of the breadth of practical applications and his exploitation of modern near-field measurement techniques reminds one of progress in the measurement discipline which is essential to the validation or calibration of computational modeling methodology when applied to complex structures such as aircraft and ships. The latter monograph also presents some comparison results with computational models. Some of the results presented for scale model and flight measurements show some serious disagreements in the lobe structure which would require some detailed examination. This also applies to the radiation patterns obtained by flight measurement compared with those obtained using wire-grid models and integral equation modeling methods. In the examples which follow, an attempt is made to match measurement results completely over the entire 2 to 30 MHz HF range for antennas on a large patrol aircraft. The problem of validating computer models of HF antennas on a helicopter and the use of computer models to generate radiation pattern information which cannot be obtained by measurements are discussed. The use of NEC computer models to analyze top-side ship configurations where measurement results are not available and only self-validation measures are available or at best comparisons with an alternate GTD computer modeling technique is also discussed.
Project MANTIS: A MANTle Induction Simulator for coupling geodynamic and electromagnetic modeling

NASA Astrophysics Data System (ADS)

Weiss, C. J.

2009-12-01

A key component to testing geodynamic hypotheses resulting from the 3D mantle convection simulations is the ability to easily translate the predicted physicochemical state to the model space relevant for an independent geophysical observation, such as earth's seismic, geodetic or electromagnetic response. In this contribution a new parallel code for simulating low-frequency, global-scale electromagnetic induction phenomena is introduced that has the same Earth discretization as the popular CitcomS mantle convection code. Hence, projection of the CitcomS model into the model space of electrical conductivity is greatly simplified, and focuses solely on the node-to-node, physics-based relationship between these Earth parameters without the need for "upscaling", "downscaling", averaging or harmonizing with some other model basis such as spherical harmonics. Preliminary performance tests of the MANTIS code on shared and distributed memory parallel compute platforms show favorable scaling (>70% efficiency) for up to 500 processors. As with CitcomS, an OpenDX visualization widget (VISMAN) is also provided for 3D rendering and interactive interrogation of model results. Details of the MANTIS code will be briefly discussed here, focusing on compatibility with CitcomS modeling, as will be preliminary results in which the electromagnetic response of a CitcomS model is evaluated. VISMAN rendering of electrical tomography-derived electrical conductivity model overlain by a 1x1 deg crustal conductivity map. Grey scale represents the log_10 magnitude of conductivity [S/m]. Arrows are horizontal components of a hypothetical magnetospheric source field used to electromagnetically excite the conductivity model.
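The efficiency figure quoted above can be read against the conventional strong-scaling definition, efficiency = T(1) / (p * T(p)). The abstract does not state which definition or timings were used, so the sketch below is only an assumption about how such a number is usually computed, and the timings in the example are invented for illustration.

```python
def parallel_efficiency(t_serial: float, t_parallel: float, n_procs: int) -> float:
    """Strong-scaling efficiency: ideal run time on n_procs divided by measured time."""
    if t_parallel <= 0 or n_procs <= 0:
        raise ValueError("run time and processor count must be positive")
    speedup = t_serial / t_parallel
    return speedup / n_procs


# Made-up timings (not from the MANTIS tests): 1000 s on one processor,
# 2.8 s on 500 processors.
efficiency = parallel_efficiency(1000.0, 2.8, 500)
```

With these invented numbers the speedup is about 357 on 500 processors, which corresponds to an efficiency a little above 0.7.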
The NASA Neutron Star Grand Challenge: The coalescences of Neutron Star Binary System

NASA Astrophysics Data System (ADS)

Suen, Wai-Mo

1998-04-01

NASA funded a Grand Challenge Project (9/1996-1999) for the development of a multi-purpose numerical treatment for relativistic astrophysics and gravitational wave astronomy. The coalescence of binary neutron stars is chosen as the model problem for the code development. The institutes involved in it are the Argonne Lab, Livermore lab, Max-Planck Institute at Potsdam, StonyBrook, U of Illinois and Washington U. We have recently succeeded in constructing a highly optimized parallel code which is capable of solving the full Einstein equations coupled with relativistic hydrodynamics, running at over 50 GFLOPS on a T3E (the second milestone point of the project). We are presently working on the head-on collisions of two neutron stars, and the inclusion of realistic equations of state into the code. The code will be released to the relativity and astrophysics community in April of 1998. With the full dynamics of the spacetime, relativistic hydro and microphysics all combined into a unified 3D code for the first time, many interesting large scale calculations in general relativistic astrophysics can now be carried out on massively parallel computers.
Algorithms for GPU-based molecular dynamics simulations of complex fluids: Applications to water, mixtures, and liquid crystals.

PubMed

Kazachenko, Sergey; Giovinazzo, Mark; Hall, Kyle Wm; Cann, Natalie M

2015-09-15

A custom code for molecular dynamics simulations has been designed to run on CUDA-enabled NVIDIA graphics processing units (GPUs). The double-precision code simulates multicomponent fluids, with intramolecular and intermolecular forces, coarse-grained and atomistic models, holonomic constraints, Nosé-Hoover thermostats, and the generation of distribution functions. Algorithms to compute Lennard-Jones and Gay-Berne interactions, and the electrostatic force using Ewald summations, are discussed. A neighbor list is introduced to improve scaling with respect to system size. Three test systems are examined: SPC/E water; an n-hexane/2-propanol mixture; and a liquid crystal mesogen, 2-(4-butyloxyphenyl)-5-octyloxypyrimidine. Code performance is analyzed for each system. With one GPU, a 33-119 fold increase in performance is achieved compared with the serial code while the use of two GPUs leads to a 69-287 fold improvement and three GPUs yield a 101-377 fold speedup. © 2015 Wiley Periodicals, Inc.
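A neighbor list improves scaling because each particle is compared only with nearby particles rather than with every other particle. The following is a minimal CPU sketch of that idea using a cell grid and a Lennard-Jones pair energy. It is not the authors' CUDA implementation: the function names, the non-periodic box, and the reduced units (`epsilon = sigma = 1`) are assumptions made for illustration.

```python
import itertools
import math
from collections import defaultdict


def build_neighbor_list(positions, cutoff):
    """Return sorted index pairs (i, j), i < j, separated by less than `cutoff`.

    Particles are binned into cubic cells of edge `cutoff`, so each particle is
    tested only against the 27 surrounding cells (no periodic boundaries).
    """
    cells = defaultdict(list)
    for idx, pos in enumerate(positions):
        cells[tuple(math.floor(c / cutoff) for c in pos)].append(idx)

    pairs = set()
    for cell, members in cells.items():
        for offset in itertools.product((-1, 0, 1), repeat=3):
            other = tuple(c + o for c, o in zip(cell, offset))
            # .get avoids creating empty cells while iterating over the dict.
            for i in members:
                for j in cells.get(other, ()):
                    if i < j and math.dist(positions[i], positions[j]) < cutoff:
                        pairs.add((i, j))
    return sorted(pairs)


def lennard_jones_energy(positions, pairs, epsilon=1.0, sigma=1.0):
    """Sum the 12-6 Lennard-Jones energy over the listed pairs."""
    total = 0.0
    for i, j in pairs:
        r = math.dist(positions[i], positions[j])
        sr6 = (sigma / r) ** 6
        total += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return total


# Two particles at the potential minimum plus one far outside the cutoff.
atoms = [(0.0, 0.0, 0.0), (2 ** (1 / 6), 0.0, 0.0), (5.0, 5.0, 5.0)]
neighbors = build_neighbor_list(atoms, cutoff=2.5)
energy = lennard_jones_energy(atoms, neighbors)
```

For roughly uniform density the cell search costs time proportional to the number of particles, compared with the quadratic cost of testing all pairs.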
Detection of non-coding RNA in bacteria and archaea using the DETR'PROK Galaxy pipeline.

PubMed

Toffano-Nioche, Claire; Luo, Yufei; Kuchly, Claire; Wallon, Claire; Steinbach, Delphine; Zytnicki, Matthias; Jacq, Annick; Gautheret, Daniel

2013-09-01

RNA-seq experiments are now routinely used for the large scale sequencing of transcripts. In bacteria or archaea, such deep sequencing experiments typically produce 10-50 million fragments that cover most of the genome, including intergenic regions. In this context, the precise delineation of the non-coding elements is challenging. Non-coding elements include untranslated regions (UTRs) of mRNAs, independent small RNA genes (sRNAs) and transcripts produced from the antisense strand of genes (asRNA). Here we present a computational pipeline (DETR'PROK: detection of ncRNAs in prokaryotes) based on the Galaxy framework that takes as input a mapping of deep sequencing reads and performs successive steps of clustering, comparison with existing annotation and identification of transcribed non-coding fragments classified into putative 5' UTRs, sRNAs and asRNAs. We provide a step-by-step description of the protocol using real-life example data sets from Vibrio splendidus and Escherichia coli. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
Reactivity effects in VVER-1000 of the third unit of the kalinin nuclear power plant at physical start-up. Computations in ShIPR intellectual code system with library of two-group cross sections generated by UNK code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zizin, M. N.; Zimin, V. G.; Zizina, S. N., E-mail: zizin@adis.vver.kiae.ru

2010-12-15

The ShIPR intellectual code system for mathematical simulation of nuclear reactors includes a set of computing modules implementing the preparation of macro cross sections on the basis of the two-group library of neutron-physics cross sections obtained for the SKETCH-N nodal code. This library is created by using the UNK code for 3D diffusion computation of first VVER-1000 fuel loadings. Computation of neutron fields in the ShIPR system is performed using the DP3 code in the two-group diffusion approximation in 3D triangular geometry. The efficiency of all groups of control rods for the first fuel loading of the third unit of the Kalinin Nuclear Power Plant is computed. The temperature, barometric, and density effects of reactivity as well as the reactivity coefficient due to the concentration of boric acid in the reactor were computed additionally. Results of computations are compared with the experiment.
Reactivity effects in VVER-1000 of the third unit of the kalinin nuclear power plant at physical start-up. Computations in ShIPR intellectual code system with library of two-group cross sections generated by UNK code

NASA Astrophysics Data System (ADS)

Zizin, M. N.; Zimin, V. G.; Zizina, S. N.; Kryakvin, L. V.; Pitilimov, V. A.; Tereshonok, V. A.

2010-12-01

The ShIPR intellectual code system for mathematical simulation of nuclear reactors includes a set of computing modules implementing the preparation of macro cross sections on the basis of the two-group library of neutron-physics cross sections obtained for the SKETCH-N nodal code. This library is created by using the UNK code for 3D diffusion computation of first VVER-1000 fuel loadings. Computation of neutron fields in the ShIPR system is performed using the DP3 code in the two-group diffusion approximation in 3D triangular geometry. The efficiency of all groups of control rods for the first fuel loading of the third unit of the Kalinin Nuclear Power Plant is computed. The temperature, barometric, and density effects of reactivity as well as the reactivity coefficient due to the concentration of boric acid in the reactor were computed additionally. Results of computations are compared with the experiment.
Users manual and modeling improvements for axial turbine design and performance computer code TD2-2

NASA Technical Reports Server (NTRS)

Glassman, Arthur J.

1992-01-01

Computer code TD2 computes design point velocity diagrams and performance for multistage, multishaft, cooled or uncooled, axial flow turbines. This streamline analysis code was recently modified to upgrade modeling related to turbine cooling and to the internal loss correlation. These modifications are presented in this report along with descriptions of the code's expanded input and output. This report serves as the users manual for the upgraded code, which is named TD2-2.
An Object-Oriented Approach to Writing Computational Electromagnetics Codes

NASA Technical Reports Server (NTRS)

Zimmerman, Martin; Mallasch, Paul G.

1996-01-01

Presently, most computer software development in the Computational Electromagnetics (CEM) community employs the structured programming paradigm, particularly using the Fortran language. Other segments of the software community began switching to an Object-Oriented Programming (OOP) paradigm in recent years to help ease design and development of highly complex codes. This paper examines design of a time-domain numerical analysis CEM code using the OOP paradigm, comparing OOP code and structured programming code in terms of software maintenance, portability, flexibility, and speed.
Transient Solid Dynamics Simulations on the Sandia/Intel Teraflop Computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Attaway, S.; Brown, K.; Gardner, D.

1997-12-31

Transient solid dynamics simulations are among the most widely used engineering calculations. Industrial applications include vehicle crashworthiness studies, metal forging, and powder compaction prior to sintering. These calculations are also critical to defense applications including safety studies and weapons simulations. The practical importance of these calculations and their computational intensiveness make them natural candidates for parallelization. This has proved to be difficult, and existing implementations fail to scale to more than a few dozen processors. In this paper we describe our parallelization of PRONTO, Sandia's transient solid dynamics code, via a novel algorithmic approach that utilizes multiple decompositions for different key segments of the computations, including the material contact calculation. This latter calculation is notoriously difficult to perform well in parallel, because it involves dynamically changing geometry, global searches for elements in contact, and unstructured communications among the compute nodes. Our approach scales to at least 3600 compute nodes of the Sandia/Intel Teraflop computer (the largest set of nodes to which we have had access to date) on problems involving millions of finite elements. On this machine we can simulate models using more than ten million elements in a few tenths of a second per timestep, and solve problems more than 3000 times faster than a single processor Cray Jedi.
Source Stacking for Numerical Wavefield Computations - Application to Global Scale Seismic Mantle Tomography

NASA Astrophysics Data System (ADS)

MacLean, L. S.; Romanowicz, B. A.; French, S.

2015-12-01

Seismic wavefield computations using the Spectral Element Method are now regularly used to recover tomographic images of the upper mantle and crust at the local, regional, and global scales (e.g. Fichtner et al., GJI, 2009; Tape et al., Science 2010; Lekic and Romanowicz, GJI, 2011; French and Romanowicz, GJI, 2014). However, the heaviness of the computations remains a challenge, and contributes to limiting the resolution of the produced images. Using source stacking, as suggested by Capdeville et al. (GJI, 2005), can considerably speed up the process by reducing the wavefield computations to only one per each set of N sources. This method was demonstrated through synthetic tests on low frequency datasets, and therefore should work for global mantle tomography. However, the large amplitudes of surface waves dominate the stacked seismograms and these cases can no longer be separated by windowing in the time domain. We have developed a processing approach that helps address this issue and demonstrate its usefulness through a series of synthetic tests performed at long periods (T >60 s) on toy upper mantle models. The summed synthetics are computed using the CSEM code (Capdeville et al., 2002). As for the inverse part of the procedure, we use a quasi-Newton method, computing Fréchet derivatives and Hessian using normal mode perturbation theory.
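Source stacking relies on the linearity of the wave equation in the source term: one simulation driven by the sum of N sources gives the same record as the sum of N separate simulations. The toy below illustrates only that property. It assumes a 1-D convolutional forward model with a single shared Green's function, which is far simpler than a spectral-element computation, and all names and values are hypothetical.

```python
def simulate(source, greens):
    """Toy linear forward model: convolve a source time series with a Green's function."""
    out = [0.0] * (len(source) + len(greens) - 1)
    for i, s in enumerate(source):
        for j, g in enumerate(greens):
            out[i + j] += s * g
    return out


def stack(traces):
    """Sum equal-length time series sample by sample."""
    return [sum(samples) for samples in zip(*traces)]


greens = [1.0, 0.5]
sources = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, -1.0]]

# N separate simulations, summed afterwards.
summed_after = stack([simulate(s, greens) for s in sources])
# One simulation of the stacked source.
stacked_first = simulate(stack(sources), greens)
```

Both routes give the same trace, while the second needs one forward computation instead of three. The stacked trace no longer separates the contributions of individual sources, which is the difficulty the abstract describes for large-amplitude surface waves.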
Weather Research and Forecasting Model with Vertical Nesting Capability

DOE Office of Scientific and Technical Information (OSTI.GOV)

2014-08-01

The Weather Research and Forecasting (WRF) model with vertical nesting capability is an extension of the WRF model, which is available in the public domain, from www.wrf-model.org. The new code modifies the nesting procedure, which passes lateral boundary conditions between computational domains in the WRF model. Previously, the same vertical grid was required on all domains, while the new code allows different vertical grids to be used on concurrently run domains. This new functionality improves WRF's ability to produce high-resolution simulations of the atmosphere by allowing a wider range of scales to be efficiently resolved and more accurate lateral boundary conditions to be provided through the nesting procedure.
Computing Properties of Hadrons, Nuclei and Nuclear Matter from Quantum Chromodynamics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Savage, Martin J.

This project was part of a coordinated software development effort which the nuclear physics lattice QCD community pursues in order to ensure that lattice calculations can make optimal use of present and forthcoming leadership-class and dedicated hardware, including those of the national laboratories, and to prepare for the exploitation of future computational resources in the exascale era. The UW team improved and extended software libraries used in lattice QCD calculations related to multi-nucleon systems, enhanced production running codes related to load balancing multi-nucleon production on large-scale computing platforms, developed SQLite (addressable database) interfaces to efficiently archive and analyze multi-nucleon data, and developed a Mathematica interface for the SQLite databases.

Three-Dimensional Terahertz Coded-Aperture Imaging Based on Matched Filtering and Convolutional Neural Network.

PubMed

Chen, Shuo; Luo, Chenggao; Wang, Hongqiang; Deng, Bin; Cheng, Yongqiang; Zhuang, Zhaowen

2018-04-26

As a promising radar imaging technique, terahertz coded-aperture imaging (TCAI) can achieve high-resolution, forward-looking, and staring imaging by producing spatiotemporal independent signals with coded apertures. However, there are still two problems in three-dimensional (3D) TCAI. Firstly, the large-scale reference-signal matrix based on meshing the 3D imaging area creates a heavy computational burden, thus leading to unsatisfactory efficiency. Secondly, it is difficult to resolve the target under low signal-to-noise ratio (SNR). In this paper, we propose a 3D imaging method based on matched filtering (MF) and convolutional neural network (CNN), which can reduce the computational burden and achieve high-resolution imaging for low SNR targets. In terms of the frequency-hopping (FH) signal, the original echo is processed with MF. By extracting the processed echo in different spike pulses separately, targets in different imaging planes are reconstructed simultaneously to decompose the global computational complexity, and then are synthesized together to reconstruct the 3D target. Based on the conventional TCAI model, we deduce and build a new TCAI model based on MF. Furthermore, the convolutional neural network (CNN) is designed to teach the MF-TCAI how to reconstruct the low SNR target better. The experimental results demonstrate that the MF-TCAI achieves impressive performance on imaging ability and efficiency under low SNR. Moreover, the MF-TCAI has learned to better resolve the low-SNR 3D target with the help of CNN. In summary, the proposed 3D TCAI can achieve: (1) low-SNR high-resolution imaging by using MF; (2) efficient 3D imaging by downsizing the large-scale reference-signal matrix; and (3) intelligent imaging with CNN. Therefore, the TCAI based on MF and CNN has great potential in applications such as security screening, nondestructive detection, medical diagnosis, etc.
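The matched-filtering step mentioned above can be illustrated generically: the received echo is correlated with the known transmitted waveform, and the correlation peak marks the echo delay. The sketch below is a plain real-valued matched filter, not the MF-TCAI model derived in the paper. The function names and the sample values are assumptions chosen for illustration.

```python
def matched_filter(received, template):
    """Correlate `received` with `template` and return the output at each lag."""
    n, m = len(received), len(template)
    return [
        sum(received[lag + k] * template[k] for k in range(m))
        for lag in range(n - m + 1)
    ]


def estimate_delay(received, template):
    """Return the lag at which the matched-filter output is largest."""
    output = matched_filter(received, template)
    return max(range(len(output)), key=output.__getitem__)


# Hypothetical waveform embedded two samples into an otherwise empty echo.
template = [1.0, -1.0, 1.0]
echo = [0.0, 0.0, 1.0, -1.0, 1.0, 0.0, 0.0]
response = matched_filter(echo, template)
delay = estimate_delay(echo, template)
```

The output peaks where the echo lines up with the template, which is what lets echoes at different delays be pulled apart before any image reconstruction.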
Underworld results as a triple (shopping list, posterior, priors)

NASA Astrophysics Data System (ADS)

Quenette, S. M.; Moresi, L. N.; Abramson, D.

2013-12-01

When studying long-term lithosphere deformation and other such large-scale, spatially distinct and behaviour rich problems, there is a natural trade-off between the meaning of a model, the observations used to validate the model and the ability to compute over this space. For example, many models of varying lithologies, rheological properties and underlying physics may reasonably match (or not match) observables. To compound this problem, each realisation is computationally intensive, requiring high resolution, algorithm tuning and code tuning to contemporary computer hardware. It is often intractable to use sampling based assimilation methods, but with better optimisation, the window of tractability becomes wider. The ultimate goal is to find a sweet-spot where a formal assimilation method is used, and where a model affines to observations. It's natural to think of this as an inverse problem, in which the underlying physics may be fixed and the rheological properties and possibly the lithologies themselves are unknown. What happens when we push this approach and treat some portion of the underlying physics as an unknown? At its extreme this is an intractable problem. However, there is an analogy here with how we develop software for these scientific problems. What happens when we treat the changing part of a largely complete code as an unknown, where the changes are working towards this sweet-spot? When posed as a Bayesian inverse problem the result is a triple - the model changes, the real priors and the real posterior. Not only does this give meaning to the process by which a code changes, it forms a mathematical bridge from an inverse problem to compiler optimisations given such changes. As a stepping stone example we show a regional scale heat flow model with constraining observations, and the inverse process including increasing complexity in the software. 
The implementation uses Underworld-GT (Underworld plus research extras to import geology and export geothermic measures, etc). Underworld uses StGermain, an early (partial) implementation of the theories described here.
Time-scale invariance as an emergent property in a perceptron with realistic, noisy neurons

PubMed Central

Buhusi, Catalin V.; Oprisan, Sorinel A.

2013-01-01

In most species, interval timing is time-scale invariant: errors in time estimation scale up linearly with the estimated duration. In mammals, time-scale invariance is ubiquitous over behavioral, lesion, and pharmacological manipulations. For example, dopaminergic drugs induce an immediate, whereas cholinergic drugs induce a gradual, scalar change in timing. Behavioral theories posit that time-scale invariance derives from particular computations, rules, or coding schemes. In contrast, we discuss a simple neural circuit, the perceptron, whose output neurons fire in a clockwise fashion (interval timing) based on the pattern of coincidental activation of its input neurons. We show numerically that time-scale invariance emerges spontaneously in a perceptron with realistic neurons, in the presence of noise. Under the assumption that dopaminergic drugs modulate the firing of input neurons, and that cholinergic drugs modulate the memory representation of the criterion time, we show that a perceptron with realistic neurons reproduces the pharmacological clock and memory patterns, and their time-scale invariance, in the presence of noise. These results suggest that rather than being a signature of higher-order cognitive processes or specific computations related to timing, time-scale invariance may spontaneously emerge in a massively-connected brain from the intrinsic noise of neurons and circuits, thus providing the simplest explanation for the ubiquity of scale invariance of interval timing. PMID:23518297
Computer Description of the Field Artillery Ammunition Supply Vehicle

DTIC Science & Technology

1983-04-01

Combinatorial Geometry (COM-GEOM); GIFT Computer Code; Computer Target Description. A...input to the GIFT computer code to generate target vulnerability data. ...Combinatorial Geometry (COM-GEOM) description. The "Geometric Information for Targets" (GIFT) computer code accepts the COM-GEOM description and
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.

Code of Federal Regulations, 2011 CFR

2011-10-01

... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.

Code of Federal Regulations, 2012 CFR

2012-10-01

... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.

Code of Federal Regulations, 2014 CFR

2014-10-01

... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.

Code of Federal Regulations, 2010 CFR

2010-10-01

... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar data produced for...
Antenna pattern study, task 2

NASA Technical Reports Server (NTRS)

Harper, Warren

1989-01-01

Two electromagnetic scattering codes, NEC-BSC and ESP3, were delivered and installed on a NASA VAX computer for use by Marshall Space Flight Center antenna design personnel. The tasks were to update the existing codes and certain supplementary software, to install the codes on a computer that will be delivered to the customer, to provide a capability for graphic display of the data computed by the codes, and to assist the customer in solving specific problems that demonstrate the use of the codes. With the exception of one code revision, all of these tasks were performed.
The NASA computer aided design and test system

NASA Technical Reports Server (NTRS)

Gould, J. M.; Juergensen, K.

1973-01-01

A family of computer programs facilitating the design, layout, evaluation, and testing of digital electronic circuitry is described. CADAT (computer aided design and test system) is intended for use by NASA and its contractors and is aimed predominantly at providing cost effective microelectronic subsystems based on custom designed metal oxide semiconductor (MOS) large scale integrated circuits (LSIC's). CADAT software can be easily adopted by installations with a wide variety of computer hardware configurations. Its structure permits ease of update to more powerful component programs and to newly emerging LSIC technologies. The components of the CADAT system are described stressing the interaction of programs rather than detail of coding or algorithms. The CADAT system provides computer aids to derive and document the design intent, includes powerful automatic layout software, permits detailed geometry checks and performance simulation based on mask data, and furnishes test pattern sequences for hardware testing.
Geant4 Computing Performance Benchmarking and Monitoring

DOE PAGES

Dotti, Andrea; Elvira, V. Daniel; Folger, Gunter; ...

2015-12-23

Performance evaluation and analysis of large scale computing applications is essential for optimal use of resources. As detector simulation is one of the most compute intensive tasks and Geant4 is the simulation toolkit most widely used in contemporary high energy physics (HEP) experiments, it is important to monitor Geant4 through its development cycle for changes in computing performance and to identify problems and opportunities for code improvements. All Geant4 development and public releases are being profiled with a set of applications that utilize different input event samples, physics parameters, and detector configurations. Results from multiple benchmarking runs are compared to previous public and development reference releases to monitor CPU and memory usage. Observed changes are evaluated and correlated with code modifications. Besides the full summary of call stack and memory footprint, a detailed call graph analysis is available to Geant4 developers for further analysis. The set of software tools used in the performance evaluation procedure, both in sequential and multi-threaded modes, includes FAST, IgProf and Open|Speedshop. In conclusion, the scalability of the CPU time and memory performance in multi-threaded applications is evaluated by measuring event throughput and memory gain as a function of the number of threads for selected event samples.
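The thread-scaling evaluation mentioned at the end of this abstract can be illustrated with a small Python sketch that turns event-throughput measurements into speedup and parallel efficiency. This is not the Geant4 benchmarking tool chain: the function name `scaling_table` is hypothetical, and the numbers in `example_throughput` are placeholders invented for the example, not Geant4 measurements.

```python
from typing import Dict, List, Tuple


def scaling_table(throughput: Dict[int, float]) -> List[Tuple[int, float, float]]:
    """Return (threads, speedup, efficiency) rows, sorted by thread count.

    throughput maps a thread count to events processed per second.
    Speedup is measured relative to the single-thread entry, and
    efficiency is speedup divided by the thread count (1.0 = linear scaling).
    """
    if 1 not in throughput:
        raise ValueError("a single-thread measurement is required as the baseline")
    baseline = throughput[1]
    rows = []
    for threads in sorted(throughput):
        speedup = throughput[threads] / baseline
        rows.append((threads, speedup, speedup / threads))
    return rows


# Placeholder values for illustration only (events per second).
example_throughput = {1: 10.0, 2: 19.0, 4: 36.0}

table = scaling_table(example_throughput)
for threads, speedup, efficiency in table:
    print(f"{threads:2d} threads  speedup = {speedup:.2f}  efficiency = {efficiency:.2f}")
```

An efficiency that stays near 1.0 as threads are added indicates near-linear throughput scaling; a falling value points to contention or serial sections worth profiling.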
Cloudy's Journey from FORTRAN to C, Why and How

NASA Astrophysics Data System (ADS)

Ferland, G. J.

Cloudy is a large-scale plasma simulation code that is widely used across the astronomical community as an aid in the interpretation of spectroscopic data. The cover of the ADAS VI book featured predictions of the code. The FORTRAN 77 source code has always been freely available on the Internet, contributing to its widespread use. The coming of PCs and Linux has fundamentally changed the computing environment. Modern Fortran compilers (F90 and F95) are not freely available. A common-use code must be written in either FORTRAN 77 or C to be Open Source/GNU/Linux friendly. F77 has serious drawbacks - modern language constructs cannot be used, students do not have skills in this language, and it does not contribute to their future employability. It became clear that the code would have to be ported to C to have a viable future. I describe the approach I used to convert Cloudy from FORTRAN 77 with MILSPEC extensions to ANSI/ISO 89 C. Cloudy is now openly available as a C code, and will evolve to C++ as gcc and standard C++ mature. Cloudy looks to a bright future with a modern language.
Advanced Test Reactor Core Modeling Update Project Annual Report for Fiscal Year 2012

DOE Office of Scientific and Technical Information (OSTI.GOV)

David W. Nigg, Principal Investigator; Kevin A. Steuhm, Project Manager

Legacy computational reactor physics software tools and protocols currently used for support of Advanced Test Reactor (ATR) core fuel management and safety assurance, and to some extent, experiment management, are inconsistent with the state of modern nuclear engineering practice, and are difficult, if not impossible, to properly verify and validate (V&V) according to modern standards. Furthermore, the legacy staff knowledge required for application of these tools and protocols from the 1960s and 1970s is rapidly being lost due to staff turnover and retirements. In late 2009, the Idaho National Laboratory (INL) initiated a focused effort, the ATR Core Modeling Update Project, to address this situation through the introduction of modern high-fidelity computational software and protocols. This aggressive computational and experimental campaign will have a broad strategic impact on the operation of the ATR, both in terms of improved computational efficiency and accuracy for support of ongoing DOE programs as well as in terms of national and international recognition of the ATR National Scientific User Facility (NSUF). The ATR Core Modeling Update Project, targeted for full implementation in phase with the next anticipated ATR Core Internals Changeout (CIC) in the 2014-2015 time frame, began during the last quarter of Fiscal Year 2009, and has just completed its third full year. Key accomplishments so far have encompassed both computational as well as experimental work. A new suite of stochastic and deterministic transport theory based reactor physics codes and their supporting nuclear data libraries (HELIOS, KENO6/SCALE, NEWT/SCALE, ATTILA, and an extended implementation of MCNP5) has been installed at the INL under various licensing arrangements. Corresponding models of the ATR and ATRC are now operational with all five codes, demonstrating the basic feasibility of the new code packages for their intended purpose.
Of particular importance, a set of as-run core depletion HELIOS calculations for all ATR cycles since August 2009, Cycle 145A through Cycle 151B, was successfully completed during 2012. This major effort supported a decision late in the year to proceed with the phased incorporation of the HELIOS methodology into the ATR Core Safety Analysis Package (CSAP) preparation process, in parallel with the established PDQ-based methodology, beginning late in Fiscal Year 2012. Acquisition of the advanced SERPENT (VTT-Finland) and MC21 (DOE-NR) Monte Carlo stochastic neutronics simulation codes was also initiated during the year and some initial applications of SERPENT to ATRC experiment analysis were demonstrated. These two new codes will offer significant additional capability, including the possibility of full-3D Monte Carlo fuel management support capabilities for the ATR at some point in the future. Finally, a capability for rigorous sensitivity analysis and uncertainty quantification based on the TSUNAMI system has been implemented and initial computational results have been obtained. This capability will have many applications as a tool for understanding the margins of uncertainty in the new models as well as for validation experiment design and interpretation.
Scaling Studies for Advanced High Temperature Reactor Concepts, Final Technical Report: October 2014—December 2017

DOE Office of Scientific and Technical Information (OSTI.GOV)

Woods, Brian; Gutowska, Izabela; Chiger, Howard

Computer simulations of nuclear reactor thermal-hydraulic phenomena are often used in the design and licensing of nuclear reactor systems. In order to assess the accuracy of these computer simulations, computer codes and methods are often validated against experimental data. This experimental data must be of sufficiently high quality in order to conduct a robust validation exercise. In addition, this experimental data is generally collected at experimental facilities that are of a smaller scale than the reactor systems that are being simulated due to cost considerations. Therefore, smaller scale test facilities must be designed and constructed in such a fashion as to ensure that the prototypical behavior of a particular nuclear reactor system is preserved. The work completed through this project has resulted in scaling analyses and conceptual design development for a test facility capable of collecting code validation data for the following high temperature gas reactor systems and events: 1. passive natural circulation core cooling system, 2. pebble bed gas reactor concept, 3. General Atomics Energy Multiplier Module reactor, and 4. prismatic block design steam-water ingress event. In the event that code validation data for these systems or events is needed in the future, significant progress in the design of an appropriate integral-type test facility has already been completed as a result of this project. Where applicable, the next step would be to begin the detailed design development and material procurement. As part of this project applicable scaling analyses were completed and test facility design requirements developed. Conceptual designs were developed for the implementation of these design requirements at the Oregon State University (OSU) High Temperature Test Facility (HTTF).
The original HTTF is based on a ¼-scale model of a high temperature gas reactor concept with the capability for both forced and natural circulation flow through a prismatic core with an electrical heat source. The peak core region temperature capability is 1400°C. As part of this project, an inventory of test facilities that could be used for these experimental programs was completed. Several of these facilities showed some promise; however, upon further investigation it became clear that only the OSU HTTF had the power and/or peak temperature limits that would allow for the experimental programs envisioned herein. Thus the conceptual design and feasibility study development focused on examining the feasibility of configuring the current HTTF to collect validation data for these experimental programs. In addition to the scaling analyses and conceptual design development, a test plan was developed for the envisioned modified test facility. This test plan included a discussion on an appropriate shakedown test program as well as the specific matrix tests. Finally, a feasibility study was completed to determine the cost and schedule considerations that would be important to any test program developed to investigate these designs and events.
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.

Code of Federal Regulations, 2013 CFR

2013-10-01

... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... funds; (ii) Studies, analyses, test data, or similar data produced for this contract, when the study...
Parallel Computation of the Jacobian Matrix for Nonlinear Equation Solvers Using MATLAB

NASA Technical Reports Server (NTRS)

Rose, Geoffrey K.; Nguyen, Duc T.; Newman, Brett A.

2017-01-01

Demonstrating speedup for parallel code on a multicore shared memory PC can be challenging in MATLAB due to underlying parallel operations that are often opaque to the user. This can limit potential for improvement of serial code even for the so-called embarrassingly parallel applications. One such application is the computation of the Jacobian matrix inherent to most nonlinear equation solvers. Computation of this matrix represents the primary bottleneck in nonlinear solver speed such that commercial finite element (FE) and multi-body-dynamic (MBD) codes attempt to minimize computations. A timing study using MATLAB's Parallel Computing Toolbox was performed for numerical computation of the Jacobian. Several approaches for implementing parallel code were investigated while only the single program multiple data (spmd) method using composite objects provided positive results. Parallel code speedup is demonstrated but the goal of linear speedup through the addition of processors was not achieved due to PC architecture.
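The Jacobian is called embarrassingly parallel because each column of a finite-difference estimate depends on one perturbed evaluation of the residual function and on nothing else. The Python sketch below shows that structure; it is not the MATLAB spmd implementation studied in the report. The forward-difference scheme, the step size `h`, the function names, and `example_system` are assumptions made for illustration. A thread pool is used only to show the independent-task layout; pure-Python work of this kind generally does not speed up under CPython threads.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Vector = List[float]


def jacobian_column(f: Callable[[Sequence[float]], Vector],
                    x: Sequence[float], j: int, h: float) -> Vector:
    """Forward-difference estimate of column j: (f(x + h*e_j) - f(x)) / h."""
    base = f(x)
    perturbed = list(x)
    perturbed[j] += h
    shifted = f(perturbed)
    return [(s - b) / h for s, b in zip(shifted, base)]


def jacobian(f: Callable[[Sequence[float]], Vector], x: Sequence[float],
             h: float = 1e-6, workers: int = 4) -> List[Vector]:
    """Build the Jacobian row by row from independently computed columns."""
    n = len(x)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each column is an independent task; pool.map preserves column order.
        columns = list(pool.map(lambda j: jacobian_column(f, x, j, h), range(n)))
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def example_system(x: Sequence[float]) -> Vector:
    """Hypothetical residuals: f0 = x0^2 + x1, f1 = x0 * x1."""
    return [x[0] ** 2 + x[1], x[0] * x[1]]


J = jacobian(example_system, [1.0, 2.0])
for row in J:
    print(["%.4f" % value for value in row])
```

At the point (1, 2) the analytic Jacobian of this example system is [[2, 1], [2, 1]], so the printed estimate can be checked by hand. Replacing the thread pool with separate processes or workers is where the overheads discussed in the abstract would come into play.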
Flexible Inhibitor Fluid-Structure Interaction Simulation in RSRM.

NASA Astrophysics Data System (ADS)

Wasistho, Bono

2005-11-01

We employ our tightly coupled fluid/structure/combustion simulation code 'Rocstar-3' for solid propellant rocket motors to study 3D flows past rigid and flexible inhibitors in the Reusable Solid Rocket Motor (RSRM). We perform high resolution simulations of a section of the rocket near the center joint slot at 100 seconds after ignition, using inflow conditions based on less detailed 3D simulations of the full RSRM. Our simulations include both inviscid and turbulent flows (using an LES dynamic subgrid-scale model), and explore the interaction between the inhibitor and the resulting fluid flow. The response of the solid components is computed by an implicit finite element solver. The internal mesh motion scheme in our block-structured fluid solver enables our code to handle significant changes in geometry. We compute turbulent statistics and determine the compound instabilities originating from the natural hydrodynamic instabilities and the inhibitor motion. The ultimate goal is to study the effect of inhibitor flexing on the turbulent field.
Comparison of Computed and Measured Vortex Evolution for a UH-60A Rotor in Forward Flight

NASA Technical Reports Server (NTRS)

Ahmad, Jasim Uddin; Yamauchi, Gloria K.; Kao, David L.

2013-01-01

A Computational Fluid Dynamics (CFD) simulation using the Navier-Stokes equations was performed to determine the evolutionary and dynamical characteristics of the vortex flowfield for a highly flexible aeroelastic UH-60A rotor in forward flight. The experimental wake data were acquired using Particle Image Velocimetry (PIV) during a test of the full-scale UH-60A rotor in the National Full-Scale Aerodynamics Complex 40- by 80-Foot Wind Tunnel. The PIV measurements were made in a stationary cross-flow plane at 90 deg rotor azimuth. The CFD simulation was performed using the OVERFLOW CFD solver loosely coupled with the rotorcraft comprehensive code CAMRAD II. Characteristics of vortices captured in the PIV plane from different blades are compared with CFD calculations. The blade airloads were calculated using two different turbulence models. A limited spatial, temporal, and CFD/comprehensive-code coupling sensitivity analysis was performed in order to verify the unsteady helicopter simulations with a moving rotor grid system.
What makes computational open source software libraries successful?

NASA Astrophysics Data System (ADS)

Bangerth, Wolfgang; Heister, Timo

2013-01-01

Software is the backbone of scientific computing. Yet, while we regularly publish detailed accounts about the results of scientific software, and while there is a general sense of which numerical methods work well, our community is largely unaware of best practices in writing the large-scale, open source scientific software upon which our discipline rests. This is particularly apparent in the commonly held view that writing successful software packages is largely the result of simply ‘being a good programmer’ when in fact there are many other factors involved, for example the social skill of community building. In this paper, we consider what we have found to be the necessary ingredients for successful scientific software projects and, in particular, for software libraries upon which the vast majority of scientific codes are built today. In particular, we discuss the roles of code, documentation, communities, project management and licenses. We also briefly comment on the impact on academic careers of engaging in software projects.
User Instructions for the Systems Assessment Capability, Rev. 1, Computer Codes Volume 3: Utility Codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Eslinger, Paul W.; Aaberg, Rosanne L.; Lopresti, Charles A.

2004-09-14

This document contains detailed user instructions for a suite of utility codes developed for Rev. 1 of the Systems Assessment Capability. The suite of computer codes for Rev. 1 of Systems Assessment Capability performs many functions.
