Sample records for tempest parallel programming

  1. Tempest: Accelerated MS/MS database search software for heterogeneous computing platforms

    PubMed Central

    Adamo, Mark E.; Gerber, Scott A.

    2017-01-01

    MS/MS database search algorithms derive a set of candidate peptide sequences from in silico digest of a protein sequence database, and compute theoretical fragmentation patterns to match these candidates against observed MS/MS spectra. The original Tempest publication described these operations mapped to a CPU-GPU model, in which the CPU generates peptide candidates that are asynchronously sent to a discrete GPU to be scored against experimental spectra in parallel (Milloy et al., 2012). The current version of Tempest expands this model, incorporating OpenCL to offer seamless parallelization across multicore CPUs, GPUs, integrated graphics chips, and general-purpose coprocessors. Three protocols describe how to configure and run a Tempest search, including discussion of how to leverage Tempest's unique feature set to produce optimal results. PMID:27603022
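
    The abstract describes a producer/consumer split: the CPU digests the protein database into candidates while an accelerator scores them concurrently. A minimal sketch of that pattern only (not Tempest's actual code; the digest rule and the score below are toy placeholders) uses a host thread feeding a bounded queue that a worker drains, the worker standing in for the device-side scoring kernel:

      # Sketch of the CPU-producer / accelerator-consumer pattern described
      # above. The "digest" and "score" are illustrative placeholders.
      import queue
      import threading

      def generate_candidates(database, out_q):
          """Host side: push candidate peptides; None signals end of stream."""
          for protein in database:
              for peptide in protein.split("K"):  # toy tryptic-style digest
                  if peptide:
                      out_q.put(peptide)
          out_q.put(None)

      def score_candidates(in_q, scores):
          """Stand-in for the device kernel: score candidates as they arrive."""
          while (peptide := in_q.get()) is not None:
              scores[peptide] = float(len(peptide))  # placeholder similarity

      q = queue.Queue(maxsize=1024)  # bounded, like a fixed device batch buffer
      scores = {}
      worker = threading.Thread(target=score_candidates, args=(q, scores))
      worker.start()
      generate_candidates(["MKTAYIAKQR", "GPSEKHWLK"], q)
      worker.join()
      print(scores)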

  2. Tempest: Accelerated MS/MS Database Search Software for Heterogeneous Computing Platforms.

    PubMed

    Adamo, Mark E; Gerber, Scott A

    2016-09-07

    MS/MS database search algorithms derive a set of candidate peptide sequences from in silico digest of a protein sequence database, and compute theoretical fragmentation patterns to match these candidates against observed MS/MS spectra. The original Tempest publication described these operations mapped to a CPU-GPU model, in which the CPU (central processing unit) generates peptide candidates that are asynchronously sent to a discrete GPU (graphics processing unit) to be scored against experimental spectra in parallel. The current version of Tempest expands this model, incorporating OpenCL to offer seamless parallelization across multicore CPUs, GPUs, integrated graphics chips, and general-purpose coprocessors. Three protocols describe how to configure and run a Tempest search, including discussion of how to leverage Tempest's unique feature set to produce optimal results. Copyright © 2016 John Wiley & Sons, Inc.

  3. Tempest: GPU-CPU computing for high-throughput database spectral matching.

    PubMed

    Milloy, Jeffrey A; Faherty, Brendan K; Gerber, Scott A

    2012-07-06

    Modern mass spectrometers are now capable of producing hundreds of thousands of tandem (MS/MS) spectra per experiment, making the translation of these fragmentation spectra into peptide matches a common bottleneck in proteomics research. When coupled with experimental designs that enrich for post-translational modifications such as phosphorylation and/or include isotopically labeled amino acids for quantification, additional burdens are placed on this computational infrastructure by shotgun sequencing. To address this issue, we have developed a new database searching program that utilizes the massively parallel compute capabilities of a graphical processing unit (GPU) to produce peptide spectral matches in a very high throughput fashion. Our program, named Tempest, combines efficient database digestion and MS/MS spectral indexing on a CPU with fast similarity scoring on a GPU. In our implementation, the entire similarity score, including the generation of full theoretical peptide candidate fragmentation spectra and its comparison to experimental spectra, is conducted on the GPU. Although Tempest uses the classical SEQUEST XCorr score as a primary metric for evaluating similarity for spectra collected at unit resolution, we have developed a new "Accelerated Score" for MS/MS spectra collected at high resolution that is based on a computationally inexpensive dot product but exhibits scoring accuracy similar to that of the classical XCorr. In our experience, Tempest provides compute-cluster level performance in an affordable desktop computer.
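
    The "Accelerated Score" above is described only as a computationally inexpensive dot product with XCorr-like accuracy. As a hedged sketch of that general idea (the bin width, m/z ceiling, and max-per-bin rule here are assumptions, not Tempest's published parameters), two peak lists can be binned onto a fixed high-resolution m/z grid and compared with a normalized dot product:

      # Generic binned-spectrum dot product; all binning parameters are
      # illustrative assumptions, not Tempest's actual settings.
      import numpy as np

      def binned(mz, intensity, bin_width=0.02, mz_max=2000.0):
          """Bin a peak list onto a fixed m/z grid and L2-normalize it."""
          spec = np.zeros(int(mz_max / bin_width))
          idx = (np.asarray(mz) / bin_width).astype(int)
          np.maximum.at(spec, idx, intensity)  # keep max intensity per bin
          return spec / (np.linalg.norm(spec) or 1.0)

      observed    = binned([245.1, 389.2, 502.3], [1.0, 0.6, 0.8])
      theoretical = binned([245.1, 389.2, 617.4], [1.0, 1.0, 1.0])
      print("dot-product score:", float(observed @ theoretical))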

  4. Electric Propulsion Test & Evaluation Methodologies for Plasma in the Environments of Space and Testing (EP TEMPEST) (Briefing Charts)

    DTIC Science & Technology

    2015-04-01

    in the Environments of Space and Testing (EP TEMPEST) - Program Review (Briefing Charts) ...of Space and Testing (EP TEMPEST) AFOSR T&E Program Review, 13-17 April 2015, Dr. Daniel L. Brown, In-Space Propulsion Branch (RQRS), Aerospace Systems... Distribution Statement A: Approved for public release; distribution is unlimited. EP TEMPEST (Lab Task, FY14-FY16) Program Goals and Objectives. Title: Electric

  5. Performance of a parallel thermal-hydraulics code TEMPEST

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fann, G.I.; Trent, D.S.

    The authors describe the parallelization of the Tempest thermal-hydraulics code. The serial version of this code is used for production-quality 3-D thermal-hydraulics simulations. Good speedup was obtained with a parallel diagonally preconditioned BiCGStab non-symmetric linear solver, using a spatial domain decomposition approach for the semi-iterative pressure-based and mass-conserved algorithm. The test case used here to illustrate the performance of the BiCGStab solver is a 3-D natural convection problem modeled using finite volume discretization in cylindrical coordinates. The BiCGStab solver replaced the LSOR-ADI method for solving the pressure equation in TEMPEST. BiCGStab also solves the coupled thermal energy equation. Scaling performance of three problem sizes (221,220 nodes; 358,120 nodes; and 701,220 nodes) is presented. These problems were run on two different parallel machines: the IBM-SP and the SGI PowerChallenge. The largest problem attains a speedup of 68 on a 128-processor IBM-SP. In real terms, this is over 34 times faster than the fastest serial production time using the LSOR-ADI solver.
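
    The solver combination named above (BiCGStab with diagonal preconditioning) is easy to reproduce in miniature. The sketch below is a serial stand-in using SciPy on an assumed nonsymmetric tridiagonal test matrix; it illustrates only the Jacobi-preconditioned BiCGStab solve, not TEMPEST's distributed domain decomposition:

      # Serial Jacobi-preconditioned BiCGStab on a toy nonsymmetric system.
      import numpy as np
      import scipy.sparse as sp
      from scipy.sparse.linalg import LinearOperator, bicgstab

      n = 200
      # Nonsymmetric convection-diffusion-like operator on a 1D grid
      A = sp.diags([-1.2, 2.5, -0.8], [-1, 0, 1], shape=(n, n), format="csr")
      b = np.ones(n)

      diag = A.diagonal()
      M = LinearOperator((n, n), matvec=lambda r: r / diag)  # Jacobi preconditioner

      x, info = bicgstab(A, b, M=M, maxiter=500)
      print("converged" if info == 0 else f"info={info}",
            "residual:", np.linalg.norm(b - A @ x))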

  6. Tempest Neoclassical Simulation of Fusion Edge Plasmas

    NASA Astrophysics Data System (ADS)

    Xu, X. Q.; Xiong, Z.; Cohen, B. I.; Cohen, R. H.; Dorr, M.; Hittinger, J.; Kerbel, G. D.; Nevins, W. M.; Rognlien, T. D.

    2006-04-01

    We are developing a continuum gyrokinetic full-F code, TEMPEST, to simulate edge plasmas. The geometry is that of a fully diverted tokamak and so includes boundary conditions for both closed magnetic flux surfaces and open field lines. The code, presently 4-dimensional (2D2V), includes kinetic ions and electrons, a gyrokinetic Poisson solver for electric field, and the nonlinear Fokker-Planck collision operator. Here we present the simulation results of neoclassical transport with Boltzmann electrons. In a large aspect ratio circular geometry, excellent agreement is found for neoclassical equilibrium with parallel flows in the banana regime without a temperature gradient. In divertor geometry, it is found that the endloss of particles and energy induces pedestal-like density and temperature profiles inside the magnetic separatrix and parallel flow stronger than the neoclassical predictions in the SOL. The impact of the X-point divertor geometry on the self-consistent electric field and geo-acoustic oscillations will be reported. We will also discuss the status of extending TEMPEST into a 5-D code.

  7. Verification of TEMPEST with neoclassical transport theory

    NASA Astrophysics Data System (ADS)

    Xiong, Z.; Cohen, B. I.; Cohen, R. H.; Dorr, M.; Hittinger, J.; Kerbel, G.; Nevins, W. M.; Rognlien, T.; Umansky, M.; Xu, X.

    2006-10-01

    TEMPEST is an edge gyro-kinetic continuum code developed to study boundary plasma transport over the region extending from the H-mode pedestal across the separatrix to the divertor plates. For benchmark purposes, we present results from the 4D (2r,2v) TEMPEST for both steady-state transport and time-dependent Geodesic Acoustic Modes (GAMs). We focus on an annular region inside the separatrix of a circular cross-section tokamak where analytical and numerical results are available. The parallel flow velocity and radial particle flux are obtained for different collisional regimes and compared with previous neoclassical results. The effect of radial electric field and the transition to steep edge gradients is emphasized. The dynamical response of GAMs is also shown and compared to recent theory.

  8. TEMPEST: A three-dimensional time-dependent computer program for hydrothermal analysis: Volume 1, Numerical methods and input instructions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trent, D.S.; Eyler, L.L.; Budden, M.J.

    This document describes the numerical methods, current capabilities, and the use of the TEMPEST (Version L, MOD 2) computer program. TEMPEST is a transient, three-dimensional, hydrothermal computer program that is designed to analyze a broad range of coupled fluid dynamic and heat transfer systems of particular interest to the Fast Breeder Reactor thermal-hydraulic design community. The full three-dimensional, time-dependent equations of motion, continuity, and heat transport are solved for either laminar or turbulent fluid flow, including heat diffusion and generation in both solid and liquid materials. 10 refs., 22 figs., 2 tabs.

  9. TEMPEST: A three-dimensional time-dependent computer program for hydrothermal analysis: Volume 1, Numerical methods and input instructions: Revision 2

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trent, D.S.; Eyler, L.L.

    TEMPEST offers simulation capabilities over a wide range of hydrothermal problems that are definable by input instructions. These capabilities are summarized by categories as follows: modeling capabilities; program control; and I/O control. 10 refs., 22 figs., 2 tabs. (LSP)

  10. TEMPEST simulations of the plasma transport in a single-null tokamak geometry

    NASA Astrophysics Data System (ADS)

    Xu, X. Q.; Bodi, K.; Cohen, R. H.; Krasheninnikov, S.; Rognlien, T. D.

    2010-06-01

    We present edge kinetic ion transport simulations of tokamak plasmas in magnetic divertor geometry using the fully nonlinear (full-f) continuum code TEMPEST. Besides neoclassical transport, a term for divergence of anomalous kinetic radial flux is added to mock up the effect of turbulent transport. To study the relative roles of neoclassical and anomalous transport, TEMPEST simulations were carried out for plasma transport and flow dynamics in a single-null tokamak geometry, including the pedestal region that extends across the separatrix into the scrape-off layer and private flux region. A series of TEMPEST simulations were conducted to investigate the transition of midplane pedestal heat flux and flow from the neoclassical to the turbulent limit and the transition of divertor heat flux and flow from the kinetic to the fluid regime via an anomalous transport scan and a density scan. The TEMPEST simulation results demonstrate that turbulent transport (as modelled by large diffusion) plays a similar role to collisional decorrelation of particle orbits and that the large turbulent transport (large diffusion) leads to an apparent Maxwellianization of the particle distribution. We also show the transition of parallel heat flux and flow at the entrance to the divertor plates from the fluid to the kinetic regime. For an absorbing divertor plate boundary condition, a non-half-Maxwellian is found due to the balance between upstream radial anomalous transport and energetic ion endloss.

  11. Simulation of the wastewater temperature in sewers with TEMPEST.

    PubMed

    Dürrenmatt, David J; Wanner, Oskar

    2008-01-01

    TEMPEST is a new interactive simulation program for the estimation of the wastewater temperature in sewers. Intuitive graphical user interfaces assist the user in managing data, performing calculations and plotting results. The program calculates the dynamics and longitudinal spatial profiles of the wastewater temperature in sewer lines. Interactions between wastewater, sewer air and surrounding soil are modeled in TEMPEST by mass balance equations, rate expressions found in the literature and a new empirical model of the airflow in the sewer. TEMPEST was developed as a tool which can be applied in practice, i.e., it requires as few input data as possible. These data include the upstream wastewater discharge and temperature, geometric and hydraulic parameters of the sewer, material properties of the sewer pipe and surrounding soil, ambient conditions, and estimates of the capacity of openings for air exchange between sewer and environment. Based on a case study it is shown how TEMPEST can be applied to estimate the decrease of the downstream wastewater temperature caused by heat recovery from the sewer. Because the efficiency of nitrification strongly depends on the wastewater temperature, this application is of practical relevance for situations in which the sewer ends at a nitrifying wastewater treatment plant.
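
    TEMPEST's actual formulation couples mass-balance equations for wastewater, sewer air, and surrounding soil; as a one-equation caricature only (all coefficients invented), the steady-state longitudinal profile reduces to an exponential relaxation of the wastewater temperature toward an effective ambient temperature:

      # Toy longitudinal profile: dT/dx = -k (T - T_env); not TEMPEST's model.
      import numpy as np

      T_in, T_env = 15.0, 8.0          # upstream and ambient temperature (deg C)
      k = 1.0e-4                       # bulk heat-exchange coefficient (1/m), assumed
      x = np.linspace(0.0, 5000.0, 6)  # distance along the sewer (m)

      T = T_env + (T_in - T_env) * np.exp(-k * x)
      for xi, Ti in zip(x, T):
          print(f"x = {xi:6.0f} m   T = {Ti:5.2f} deg C")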

  12. TEMPEST Simulations of the Plasma Transport in a Single-Null Tokamak Geometry

    DOE PAGES

    Xu, X. Q.; Bodi, K.; Cohen, R. H.; ...

    2010-05-28

    We present edge kinetic ion transport simulations of tokamak plasmas in magnetic divertor geometry using the fully nonlinear (full-f) continuum code TEMPEST. Besides neoclassical transport, a term for divergence of anomalous kinetic radial flux is added to mock up the effect of turbulent transport. In order to study the relative roles of neoclassical and anomalous transport, TEMPEST simulations were carried out for plasma transport and flow dynamics in a single-null tokamak geometry, including the pedestal region that extends across the separatrix into the scrape-off layer and private flux region. A series of TEMPEST simulations was conducted to investigate the transition of midplane pedestal heat flux and flow from the neoclassical to the turbulent limit and the transition of divertor heat flux and flow from the kinetic to the fluid regime via an anomalous transport scan and a density scan. The TEMPEST simulation results demonstrate that turbulent transport (as modelled by large diffusion) plays a similar role to collisional decorrelation of particle orbits and that the large turbulent transport (large diffusion) leads to an apparent Maxwellianization of the particle distribution. Moreover, we show the transition of parallel heat flux and flow at the entrance to the divertor plates from the fluid to the kinetic regime. For an absorbing divertor plate boundary condition, a non-half-Maxwellian is found due to the balance between upstream radial anomalous transport and energetic ion endloss.

  13. TEMPEST Simulations of the Plasma Transport in a Single-Null Tokamak Geometry

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, X. Q.; Bodi, K.; Cohen, R. H.

    We present edge kinetic ion transport simulations of tokamak plasmas in magnetic divertor geometry using the fully nonlinear (full-f) continuum code TEMPEST. Besides neoclassical transport, a term for divergence of anomalous kinetic radial flux is added to mock up the effect of turbulent transport. In order to study the relative roles of neoclassical and anomalous transport, TEMPEST simulations were carried out for plasma transport and flow dynamics in a single-null tokamak geometry, including the pedestal region that extends across the separatrix into the scrape-off layer and private flux region. A series of TEMPEST simulations was conducted to investigate the transition of midplane pedestal heat flux and flow from the neoclassical to the turbulent limit and the transition of divertor heat flux and flow from the kinetic to the fluid regime via an anomalous transport scan and a density scan. The TEMPEST simulation results demonstrate that turbulent transport (as modelled by large diffusion) plays a similar role to collisional decorrelation of particle orbits and that the large turbulent transport (large diffusion) leads to an apparent Maxwellianization of the particle distribution. Moreover, we show the transition of parallel heat flux and flow at the entrance to the divertor plates from the fluid to the kinetic regime. For an absorbing divertor plate boundary condition, a non-half-Maxwellian is found due to the balance between upstream radial anomalous transport and energetic ion endloss.

  14. Collisional tests and an extension of the TEMPEST continuum gyrokinetic code

    NASA Astrophysics Data System (ADS)

    Cohen, R. H.; Dorr, M.; Hittinger, J.; Kerbel, G.; Nevins, W. M.; Rognlien, T.; Xiong, Z.; Xu, X. Q.

    2006-04-01

    An important requirement of a kinetic code for edge plasmas is the ability to accurately treat the effect of collisions over a broad range of collisionalities. To test the interaction of collisions and parallel streaming, TEMPEST has been compared with published analytic and numerical (Monte Carlo, bounce-averaged Fokker-Planck) results for endloss of particles confined by combined electrostatic and magnetic wells. Good agreement is found over a wide range of collisionality, confining potential and mirror ratio, and the required velocity space resolution is modest. We also describe progress toward extension of (4-dimensional) TEMPEST into a "kinetic edge transport code" (a kinetic counterpart of UEDGE). The extension includes averaging of the gyrokinetic equations over fast timescales and approximating the averaged quadratic terms by diffusion terms which respect the boundaries of inaccessible regions in phase space. F. Najmabadi, R.W. Conn and R.H. Cohen, Nucl. Fusion 24, 75 (1984); T.D. Rognlien and T.A. Cutler, Nucl. Fusion 20, 1003 (1980).

  15. TEMPEST. Transient 3-D Thermal-Hydraulic

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Eyler, L.L.

    TEMPEST is a transient, three-dimensional, hydrothermal program that is designed to analyze a range of coupled fluid dynamic and heat transfer systems of particular interest to the Fast Breeder Reactor (FBR) thermal-hydraulic design community. The full three-dimensional, time-dependent equations of motion, continuity, and heat transport are solved for either laminar or turbulent fluid flow, including heat diffusion and generation in both solid and liquid materials. The equations governing mass, momentum, and energy conservation for incompressible flows and small density variations (Boussinesq approximation) are solved using finite-difference techniques. Analyses may be conducted in either cylindrical or Cartesian coordinate systems. Turbulence is treated using a two-equation model. Two auxiliary plotting programs, SEQUEL and MANPLOT, for use with TEMPEST output are included. SEQUEL may be operated in batch or interactive mode; it generates data required for vector plots, contour plots of scalar quantities, line plots, grid and boundary plots, and time-history plots. MANPLOT reads the SEQUEL-generated data and creates the hardcopy plots. TEMPEST can be a valuable hydrothermal design analysis tool in areas outside the intended FBR thermal-hydraulic design community.
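
    The Boussinesq approximation named above keeps density variations only in the gravity term, via a linearization about a reference state. A minimal illustration (water-like reference values, not TEMPEST inputs):

      # Boussinesq buoyancy: rho = rho0 * (1 - beta * (T - T0)), with only
      # the (rho - rho0) difference entering the momentum equation via gravity.
      rho0, beta, T0, g = 1000.0, 2.1e-4, 20.0, 9.81

      def buoyancy_force(T):
          """Buoyancy source per unit volume, -g * (rho - rho0)."""
          rho = rho0 * (1.0 - beta * (T - T0))
          return -g * (rho - rho0)

      for T in (20.0, 25.0, 30.0):
          print(f"T = {T:4.1f} C   buoyancy = {buoyancy_force(T):+.3f} N/m^3")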

  16. Edge gyrokinetic theory and continuum simulations

    NASA Astrophysics Data System (ADS)

    Xu, X. Q.; Xiong, Z.; Dorr, M. R.; Hittinger, J. A.; Bodi, K.; Candy, J.; Cohen, B. I.; Cohen, R. H.; Colella, P.; Kerbel, G. D.; Krasheninnikov, S.; Nevins, W. M.; Qin, H.; Rognlien, T. D.; Snyder, P. B.; Umansky, M. V.

    2007-08-01

    The following results are presented from the development and application of TEMPEST, a fully nonlinear (full-f) five-dimensional (3d2v) gyrokinetic continuum edge-plasma code. (1) As a test of the interaction of collisions and parallel streaming, TEMPEST is compared with published analytic and numerical results for endloss of particles confined by combined electrostatic and magnetic wells. Good agreement is found over a wide range of collisionality, confining potential and mirror ratio, and the required velocity space resolution is modest. (2) In a large-aspect-ratio circular geometry, excellent agreement is found for a neoclassical equilibrium with parallel ion flow in the banana regime with zero temperature gradient and radial electric field. (3) The four-dimensional (2d2v) version of the code produces the first self-consistent simulation results of collisionless damping of geodesic acoustic modes and zonal flow (Rosenbluth-Hinton residual) with Boltzmann electrons using a full-f code. The electric field is also found to agree with the standard neoclassical expression for steep density and ion temperature gradients in the plateau regime. In divertor geometry, it is found that the endloss of particles and energy induces parallel flow stronger than the core neoclassical predictions in the SOL.

  17. TEMPEST-D MM-Wave Radiometer

    NASA Astrophysics Data System (ADS)

    Padmanabhan, S.; Gaier, T.; Reising, S. C.; Lim, B.; Stachnik, R. A.; Jarnot, R.; Berg, W. K.; Kummerow, C. D.; Chandrasekar, V.

    2016-12-01

    The TEMPEST-D radiometer is a five-frequency millimeter-wave radiometer at 89, 165, 176, 180, and 182 GHz. The direct-detection architecture of the radiometer reduces its power consumption and eliminates the need for a local oscillator, reducing complexity. The instrument includes a blackbody calibrator and a scanning reflector, which enable precision calibration and cross-track scanning. The MMIC-based millimeter-wave radiometers take advantage of technology developed under extensive investment by the NASA Earth Science Technology Office (ESTO). The five-frequency millimeter-wave radiometer is built by the Jet Propulsion Laboratory (JPL), which has produced a number of state-of-the-art spaceborne microwave radiometers, such as the Microwave Limb Sounder (MLS), the Advanced Microwave Radiometer (AMR) for Jason-2/OSTM and Jason-3, and the Juno Microwave Radiometer (MWR). The TEMPEST-D instrument design is based on a 165 to 182 GHz radiometer design inherited from RACE and an 89 GHz receiver developed under the ESTO ACT-08 and IIP-10 programs at Colorado State University (CSU) and JPL. The TEMPEST reflector scan and calibration methodology is adapted from the Advanced Technology Microwave Sounder (ATMS) and has been validated on the Global Hawk unmanned aerial vehicle (UAV) using the High Altitude MMIC Sounding Radiometer (HAMSR) instrument. This presentation will focus on the design, development and performance of the TEMPEST-D radiometer instrument. The flow-down of the TEMPEST-D mission objectives to instrument-level requirements will also be discussed.

  18. TEMPEST: A three-dimensional time-dependent computer program for hydrothermal analysis: Volume 2, Assessment and verification results

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Eyler, L L; Trent, D S; Budden, M J

    During the course of the TEMPEST computer code development a concurrent effort was conducted to assess the code's performance and the validity of computed results. The results of this work are presented in this document. The principal objective of this effort was to assure the code's computational correctness for a wide range of hydrothermal phenomena typical of fast breeder reactor application. 47 refs., 94 figs., 6 tabs.

  19. Simulations of 4D edge transport and dynamics using the TEMPEST gyro-kinetic code

    NASA Astrophysics Data System (ADS)

    Rognlien, T. D.; Cohen, B. I.; Cohen, R. H.; Dorr, M. R.; Hittinger, J. A. F.; Kerbel, G. D.; Nevins, W. M.; Xiong, Z.; Xu, X. Q.

    2006-10-01

    Simulation results are presented for tokamak edge plasmas with a focus on the 4D (2r,2v) option of the TEMPEST continuum gyro-kinetic code. A detailed description of a variety of kinetic simulations is reported, including neoclassical radial transport from Coulomb collisions, electric field generation, dynamic response to perturbations by geodesic acoustic modes, and parallel transport on open magnetic-field lines. Comparison is made between the characteristics of the plasma solutions on closed and open magnetic-field line regions separated by a magnetic separatrix, and simple physical models are used to qualitatively explain the differences observed in mean flow and electric-field generation. The status of extending the simulations to 5D turbulence will be summarized. The code structure used in this ongoing project is also briefly described, together with future plans.

  20. Tempest - Efficient Computation of Atmospheric Flows Using High-Order Local Discretization Methods

    NASA Astrophysics Data System (ADS)

    Ullrich, P. A.; Guerra, J. E.

    2014-12-01

    The Tempest Framework composes several compact numerical methods to facilitate intercomparison of atmospheric flow calculations on the sphere and in rectangular domains. This framework includes implementations of Spectral Element, Discontinuous Galerkin, Flux Reconstruction, and Hybrid Finite Element methods with the goal of achieving optimal accuracy in the solution of atmospheric problems. Several advantages of this approach are discussed, such as improved pressure gradient calculation, numerical stability via vertical/horizontal splitting, and arbitrary order of accuracy. The local numerical discretization allows for high-performance parallel computation and efficient inclusion of parameterizations. These techniques are used in conjunction with a non-conformal, locally refined, cubed-sphere grid for global simulations and standard Cartesian grids for simulations at the mesoscale. A complete implementation of the methods described is demonstrated in a non-hydrostatic setting.
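
    The locality argument above is easy to see in one dimension. The sketch below is a deliberately crude stand-in (a first-order upwind finite-volume step, far below Tempest's high-order elements): each cell updates from its immediate upwind neighbor only, which is exactly the property that makes such discretizations parallelize well:

      # One first-order upwind finite-volume step for 1D advection on a
      # periodic domain; illustrative only.
      import numpy as np

      nx, L, u, cfl = 200, 1.0, 1.0, 0.5
      dx = L / nx
      dt = cfl * dx / u
      x = (np.arange(nx) + 0.5) * dx
      q = np.exp(-200.0 * (x - 0.3) ** 2)  # initial Gaussian pulse

      for _ in range(100):
          flux = u * np.roll(q, 1)          # upwind face flux F_{i-1/2}
          q = q - dt / dx * (u * q - flux)  # conservative update

      print("peak after transport:", float(q.max()), "near x =", float(x[q.argmax()]))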

  1. TEMPEST: Twin Electric Magnetospheric Probes Exploring on Spiral Trajectories--A Proposal to the Medium Class Explorer Program

    NASA Technical Reports Server (NTRS)

    1995-01-01

    The objective of the Twin Electric Magnetospheric Probes Exploring on Spiral Trajectories (TEMPEST) mission is to understand the nature and causes of magnetic storm conditions in the magnetosphere whether they be manifested classically in the buildup of the ring current, or (as recently discovered) by storms of relativistic electrons that cause the deep dielectric charging responsible for disabling satellites in synchronous orbit, or by the release of energy into the auroral ionosphere and the plasma sheet during substorms.

  2. Continuum Edge Gyrokinetic Theory and Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, X Q; Xiong, Z; Dorr, M R

    The following results are presented from the development and application of TEMPEST, a fully nonlinear (full-f) five-dimensional (3d2v) gyrokinetic continuum edge-plasma code. (1) As a test of the interaction of collisions and parallel streaming, TEMPEST is compared with published analytic and numerical results for endloss of particles confined by combined electrostatic and magnetic wells. Good agreement is found over a wide range of collisionality, confining potential, and mirror ratio; and the required velocity space resolution is modest. (2) In a large-aspect-ratio circular geometry, excellent agreement is found for a neoclassical equilibrium with parallel ion flow in the banana regime with zero temperature gradient and radial electric field. (3) The four-dimensional (2d2v) version of the code produces the first self-consistent simulation results of collisionless damping of geodesic acoustic modes and zonal flow (Rosenbluth-Hinton residual) with Boltzmann electrons using a full-f code. The electric field is also found to agree with the standard neoclassical expression for steep density and ion temperature gradients in the banana regime. In divertor geometry, it is found that the endloss of particles and energy induces parallel flow stronger than the core neoclassical predictions in the SOL. (5) Our 5D gyrokinetic formulation yields a set of nonlinear electrostatic gyrokinetic equations that apply to both neoclassical and turbulence simulations.

  3. The TeMPEST Transit Search: Preliminary Results

    NASA Astrophysics Data System (ADS)

    Baliber, N. R.; Cochran, W. D.

    The Texas, McDonald Photometric Extrasolar Search for Transits, TeMPEST, is a photometric search for transits of extrasolar giant planets orbiting at distances less than approximately 0.1 AU from their parent stars. This survey is being conducted with the McDonald Observatory 0.76 meter Prime Focus Camera (PFC), which provides a 46.2 x 46.2 arcmin field of view. From August through December 2001, we obtained our first full season of data on two fields in the Galactic plane, one in the constellation Cassiopeia and the other in Camelopardus. In these two fields, V-band time-series photometry with a cadence of about 9 minutes has been performed on over 5000 stars with sufficient precision, better than 0.01 mag, to detect transits of close-orbiting Jovian planets. We present representative light curves from variable stars and an eclipsing system from our 2001 data. The TeMPEST project is funded by the NASA Origins program.
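
    The quoted 0.01-mag precision requirement follows from simple arithmetic: a close-in Jovian planet blocks roughly (Rp/Rs)^2 of the stellar flux, which for Jupiter-and-Sun-like radii is about one percent:

      # Expected transit depth for a Jupiter-sized planet and a Sun-like star.
      import math

      Rp, Rs = 71492.0, 696340.0                  # radii in km
      depth = (Rp / Rs) ** 2                      # fractional flux drop (~0.0105)
      delta_mag = -2.5 * math.log10(1.0 - depth)  # same drop in magnitudes

      print(f"depth = {depth:.4f} of the flux, delta_m = {delta_mag:.4f} mag")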

  4. Navigation, Guidance and Control For the CICADA Expendable Micro Air Vehicle

    DTIC Science & Technology

    2015-01-01

    aircraft, as shown in Figure 5a. A Tempest UAV mothership was used as the host platform for the CICADA vehicles. Figure 5b shows how two CICADAs were...mounted on wing pylon drop mechanisms located on each wing of the Tempest. The Tempest was needed to carry the CICADAs back within range of the recovery...carried the Tempest and CICADA combination to a maximum altitude of 57,000 feet above sea-level. At that point, Tempest was released from the balloon and

  5. 48 CFR 239.7102-2 - Compromising emanations-TEMPEST or other standard.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ...-TEMPEST or other standard. 239.7102-2 Section 239.7102-2 Federal Acquisition Regulations System DEFENSE... INFORMATION TECHNOLOGY Security and Privacy for Computer Systems 239.7102-2 Compromising emanations—TEMPEST or... i.e., an established National TEMPEST standard (e.g., NACSEM 5100, NACSIM 5100A) or a standard used by...

  6. 48 CFR 239.7102-2 - Compromising emanations-TEMPEST or other standard.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ...-TEMPEST or other standard. 239.7102-2 Section 239.7102-2 Federal Acquisition Regulations System DEFENSE... INFORMATION TECHNOLOGY Security and Privacy for Computer Systems 239.7102-2 Compromising emanations—TEMPEST or... i.e., an established National TEMPEST standard (e.g., NACSEM 5100, NACSIM 5100A) or a standard used by...

  7. 48 CFR 239.7102-2 - Compromising emanations-TEMPEST or other standard.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ...-TEMPEST or other standard. 239.7102-2 Section 239.7102-2 Federal Acquisition Regulations System DEFENSE... INFORMATION TECHNOLOGY Security and Privacy for Computer Systems 239.7102-2 Compromising emanations—TEMPEST or... i.e., an established National TEMPEST standard (e.g., NACSEM 5100, NACSIM 5100A) or a standard used by...

  8. 48 CFR 239.7102-2 - Compromising emanations-TEMPEST or other standard.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ...-TEMPEST or other standard. 239.7102-2 Section 239.7102-2 Federal Acquisition Regulations System DEFENSE... INFORMATION TECHNOLOGY Security and Privacy for Computer Systems 239.7102-2 Compromising emanations—TEMPEST or... i.e., an established National TEMPEST standard (e.g., NACSEM 5100, NACSIM 5100A) or a standard used by...

  9. 48 CFR 239.7102-2 - Compromising emanations-TEMPEST or other standard.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ...-TEMPEST or other standard. 239.7102-2 Section 239.7102-2 Federal Acquisition Regulations System DEFENSE... INFORMATION TECHNOLOGY Security and Privacy for Computer Systems 239.7102-2 Compromising emanations—TEMPEST or... i.e., an established National TEMPEST standard (e.g., NACSEM 5100, NACSIM 5100A) or a standard used by...

  10. New Web Server - the Java Version of Tempest - Produced

    NASA Technical Reports Server (NTRS)

    York, David W.; Ponyik, Joseph G.

    2000-01-01

    A new software design and development effort has produced a Java (Sun Microsystems, Inc.) version of the award-winning Tempest software (refs. 1 and 2). In 1999, the Embedded Web Technology (EWT) team received a prestigious R&D 100 Award for Tempest, Java Version. In this article, "Tempest" will refer to the Java version of Tempest, a World Wide Web server for desktop or embedded systems. Tempest was designed at the NASA Glenn Research Center at Lewis Field to run on any platform for which a Java Virtual Machine (JVM, Sun Microsystems, Inc.) exists. The JVM acts as a translator between the native code of the platform and the byte code of Tempest, which is compiled in Java. These byte code files are Java executables with a ".class" extension. Multiple byte code files can be zipped together as a "*.jar" file for more efficient transmission over the Internet. Today's popular browsers, such as Netscape (Netscape Communications Corporation) and Internet Explorer (Microsoft Corporation) have built-in Virtual Machines to display Java applets.
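
    As a loose, language-shifted analogue of the idea above (a compact web server suitable for desktop or embedded use; Python here rather than Java, consistent with this document's other examples), the standard library alone is enough to stand up a minimal HTTP status page:

      # Minimal embedded-style HTTP server; an illustrative analogue only,
      # unrelated to NASA's Tempest code itself.
      from http.server import BaseHTTPRequestHandler, HTTPServer

      class Handler(BaseHTTPRequestHandler):
          def do_GET(self):
              self.send_response(200)
              self.send_header("Content-Type", "text/plain")
              self.end_headers()
              self.wfile.write(b"telemetry: ok\n")  # device status payload

      HTTPServer(("127.0.0.1", 8080), Handler).serve_forever()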

  11. Preliminary testing of turbulence and radionuclide transport modeling in deep ocean environment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Onishi, Y.; Dummuller, D.C.; Trent, D.S.

    Pacific Northwest Laboratory (PNL) performed a study for the US Environmental Protection Agency's Office of Radiation Programs to (1) identify candidate models for regional modeling of low-level waste ocean disposal sites in the mid-Atlantic ocean; (2) evaluate mathematical representation of the models' eddy viscosity/dispersion coefficients; and (3) evaluate the adequacy of the k-ε turbulence model and the feasibility of one of the candidate models, TEMPEST©/FLESCOT©, for deep-ocean applications on a preliminary basis. PNL identified the TEMPEST©/FLESCOT©, FLOWER, Blumberg's, and RMA 10 models as appropriate candidates for the regional radionuclide modeling. Among these models, TEMPEST/FLESCOT is currently the only model that solves distributions of flow, turbulence (with the k-ε model), salinity, water temperature, sediment, dissolved contaminants, and sediment-sorbed contaminants. Solving the Navier-Stokes equations using higher-order correlations is not practical for regional modeling because of the prohibitive computational requirements; turbulence modeling is therefore a more practical approach. PNL applied the three-dimensional code TEMPEST©/FLESCOT© with the k-ε model to a very simple, hypothetical, two-dimensional, deep-ocean case, producing at least qualitatively appropriate results. However, more detailed testing should be performed to further validate the code. 46 refs., 39 figs., 6 tabs.

  12. Tempest: Tools for Addressing the Needs of Next-Generation Climate Models

    NASA Astrophysics Data System (ADS)

    Ullrich, P. A.; Guerra, J. E.; Pinheiro, M. C.; Fong, J.

    2015-12-01

    Tempest is a comprehensive simulation-to-science infrastructure that tackles the needs of next-generation, high-resolution, data intensive climate modeling activities. This project incorporates three key components: TempestDynamics, a global modeling framework for experimental numerical methods and high-performance computing; TempestRemap, a toolset for arbitrary-order conservative and consistent remapping between unstructured grids; and TempestExtremes, a suite of detection and characterization tools for identifying weather extremes in large climate datasets. In this presentation, the latest advances with the implementation of this framework will be discussed, and a number of projects now utilizing these tools will be featured.
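
    The defining property of TempestRemap's conservative remapping can be shown with a first-order 1D analogue (a far simpler cousin of its arbitrary-order spherical algorithms): target cell values are overlap-weighted source averages, so the grid integral is preserved exactly:

      # First-order conservative remap between two 1D grids via cell overlaps.
      import numpy as np

      def conservative_remap(src_edges, src_vals, tgt_edges):
          tgt_vals = np.zeros(len(tgt_edges) - 1)
          for j in range(len(tgt_edges) - 1):
              lo, hi = tgt_edges[j], tgt_edges[j + 1]
              for i in range(len(src_edges) - 1):
                  overlap = max(0.0, min(hi, src_edges[i + 1]) - max(lo, src_edges[i]))
                  tgt_vals[j] += src_vals[i] * overlap
              tgt_vals[j] /= hi - lo
          return tgt_vals

      src_edges = np.linspace(0.0, 1.0, 6)  # 5 source cells
      tgt_edges = np.linspace(0.0, 1.0, 9)  # 8 target cells
      src_vals = np.sin(np.pi * 0.5 * (src_edges[:-1] + src_edges[1:]))
      tgt_vals = conservative_remap(src_edges, src_vals, tgt_edges)

      integral = lambda e, v: float(np.sum(v * np.diff(e)))
      print("source integral:", integral(src_edges, src_vals))
      print("target integral:", integral(tgt_edges, tgt_vals))  # matches exactly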

  13. Enabling Global Observations of Clouds and Precipitation on Fine Spatio-Temporal Scales from CubeSat Constellations: Temporal Experiment for Storms and Tropical Systems Technology Demonstration (TEMPEST-D)

    NASA Astrophysics Data System (ADS)

    Reising, S. C.; Todd, G.; Padmanabhan, S.; Lim, B.; Heneghan, C.; Kummerow, C.; Chandra, C. V.; Berg, W. K.; Brown, S. T.; Pallas, M.; Radhakrishnan, C.

    2017-12-01

    The Temporal Experiment for Storms and Tropical Systems (TEMPEST) mission concept consists of a constellation of 5 identical 6U-Class satellites observing storms at 5 millimeter-wave frequencies with 5-10 minute temporal sampling to observe the time evolution of clouds and their transition to precipitation. Such a small satellite mission would enable the first global measurements of clouds and precipitation on the time scale of tens of minutes and the corresponding spatial scale of a few km. TEMPEST is designed to improve the understanding of cloud processes by providing critical information on temporal signatures of precipitation and helping to constrain one of the largest sources of uncertainty in cloud models. TEMPEST millimeter-wave radiometers are able to perform remote observations of the cloud interior to observe microphysical changes as the cloud begins to precipitate or ice accumulates inside the storm. The TEMPEST technology demonstration (TEMPEST-D) mission is in progress to raise the TRL of the instrument and spacecraft systems from 6 to 9 as well as to demonstrate radiometer measurement and differential drag capabilities required to deploy a constellation of 6U-Class satellites in a single orbital plane. The TEMPEST-D millimeter-wave radiometer instrument provides observations at 89, 165, 176, 180 and 182 GHz using a single compact instrument designed for 6U-Class satellites. The direct-detection topology of the radiometer receiver substantially reduces both its power consumption and design complexity compared to heterodyne receivers. The TEMPEST-D instrument performs precise, end-to-end calibration using a cross-track scanning reflector to view an ambient blackbody calibration target and the cosmic microwave background every scan period. The TEMPEST-D radiometer instrument has been fabricated and successfully tested under environmental conditions (vibration, thermal cycling and vacuum) expected in low-Earth orbit. TEMPEST-D began in Aug. 2015, with a rapid 2.5-year development to deliver a complete spacecraft with integrated payload by Feb. 2018. TEMPEST-D has been manifested through NASA's CubeSat Launch Initiative (CSLI) and is planned for launch on ELaNa-23 on Cygnus Antares II to the ISS in Mar. 2018. The TEMPEST-D satellite is expected to be deployed into a 400-km orbit at 51.6° inclination a few months after arrival at the ISS.

  14. LAVA web-based remote simulation: enhancements for education and technology innovation

    NASA Astrophysics Data System (ADS)

    Lee, Sang Il; Ng, Ka Chun; Orimoto, Takashi; Pittenger, Jason; Horie, Toshi; Adam, Konstantinos; Cheng, Mosong; Croffie, Ebo H.; Deng, Yunfei; Gennari, Frank E.; Pistor, Thomas V.; Robins, Garth; Williamson, Mike V.; Wu, Bo; Yuan, Lei; Neureuther, Andrew R.

    2001-09-01

    The Lithography Analysis using Virtual Access (LAVA) web site at http://cuervo.eecs.berkeley.edu/Volcano/ has been enhanced with new optical and deposition applets, graphical infrastructure and linkage to parallel execution on networks of workstations. More than ten new graphical user interface applets have been designed to support education, illustrate novel concepts from research, and explore usage of parallel machines. These applets have been improved through feedback and classroom use. Over the last year LAVA provided industry and academic communities with 1,300 sessions and 700 rigorous simulations per month across the SPLAT, SAMPLE2D, SAMPLE3D, TEMPEST, STORM, and BEBS simulators.

  15. Optimum Vessel Performance in Evolving Nonlinear Wave Fields

    DTIC Science & Technology

    2012-11-01

    TEMPEST, the new, nonlinear, time-domain ship motion code being developed by the Navy. ...domain ship motion code TEMPEST. The radiation and diffraction forces in the level 3.0 version of TEMPEST will be computed by the body-exact strip theory...nonlinear responses of a ship to a seaway are being incorporated into version 3 of TEMPEST, the new, nonlinear, time-domain ship motion code that

  16. Application of the TEMPEST computer code for simulating hydrogen distribution in model containment structures. [PWR; BWR

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trent, D.S.; Eyler, L.L.

    In this study several aspects of simulating hydrogen distribution in geometric configurations relevant to reactor containment structures were investigated using the TEMPEST computer code. Of particular interest was the performance of the TEMPEST turbulence model in a density-stratified environment. Computed results illustrated that the TEMPEST numerical procedures predicted the measured phenomena with good accuracy under a variety of conditions and that the turbulence model used is a viable approach in complex turbulent flow simulation.

  17. 32 CFR 623.7 - Reports.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... Tempest Rapid Materiel Report in message form and sent electrically. The message report will be prepared according to Army Regulation 500-60. (2) Daily message reports. Tempest Rapid Daily Materiel Reports of Army... line. (3) Final reports. In addition to the final Tempest Rapid Daily Materiel Report, a final report...

  18. Koinonia: The Requirements and Vision for an Unclassified Information-Sharing System

    DTIC Science & Technology

    2013-06-01

    of an effort to share information with multinational partners in the Multinational Planning Augmentation Team (MPAT) (Tempest Express Fact Sheet 2011... Tempest fact sheet. GlobalSecurity.org. May 7, 2011. Accessed May 3, 2013. http://www.globalsecurity.org/military/ops/tempest-express.htm U.S

  19. 78 FR 25531 - Requested Administrative Waiver of the Coastwise Trade Laws: Vessel TEMPEST; Invitation for...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-05-01

    ... DEPARTMENT OF TRANSPORTATION Maritime Administration [Docket No. MARAD-2013-0046] Requested Administrative Waiver of the Coastwise Trade Laws: Vessel TEMPEST; Invitation for Public Comments AGENCY... TEMPEST is: Intended Commercial Use of Vessel: ``Offshore wreck diving.'' Geographic Region: Maine, New...

  20. 32 CFR 623.7 - Reports.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... Tempest Rapid Materiel Report in message form and sent electrically. The message report will be prepared according to Army Regulation 500-60. (2) Daily message reports. Tempest Rapid Daily Materiel Reports of Army... line. (3) Final reports. In addition to the final Tempest Rapid Daily Materiel Report, a final report...

  1. The LTS timing analysis program :

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Armstrong, Darrell Jewell; Schwarz, Jens

    The LTS Timing Analysis program described in this report uses signals from the Tempest Lasers, Pulse Forming Lines, and Laser Spark Detectors to carry out calculations to quantify and monitor the performance of the Z-Accelerator's laser-triggered SF6 switches. The program analyzes Z-shots beginning with Z2457, when Laser Spark Detector data became available for all lines.

  2. Remote Sensing of Precipitation from 6U-Class Small Satellite Constellations: Temporal Experiment for Storms and Tropical Systems Technology Demonstration (TEMPEST-D)

    NASA Astrophysics Data System (ADS)

    Reising, S. C.; Gaier, T.; Kummerow, C. D.; Chandra, C. V.; Padmanabhan, S.; Lim, B.; Heneghan, C.; Berg, W. K.; Olson, J. P.; Brown, S. T.; Carvo, J.; Pallas, M.

    2016-12-01

    The Temporal Experiment for Storms and Tropical Systems (TEMPEST) mission concept consists of a constellation of 5 identical 6U-Class nanosatellites observing at 5 millimeter-wave frequencies with 5-minute temporal sampling to observe the time evolution of clouds and their transition to precipitation. The TEMPEST concept is designed to improve the understanding of cloud processes, by providing critical information on the time evolution of cloud and precipitation microphysics and helping to constrain one of the largest sources of uncertainty in climate models. TEMPEST millimeter-wave radiometers are able to make observations in the cloud to observe changes as the cloud begins to precipitate or ice accumulates inside the storm. Such a constellation deployed near 400 km altitude and 50°-65° inclination is expected to capture more than 3 million observations of precipitation during a one-year mission, including over 100,000 deep convective events. The TEMPEST Technology Demonstration (TEMPEST-D) mission will be deployed to raise the TRL of the instrument and key satellite systems as well as to demonstrate measurement capabilities required for a constellation of 6U-Class nanosatellites to directly observe the temporal development of clouds and study the conditions that control their transition from non-precipitating to precipitating clouds. A partnership among Colorado State University (Lead Institution), NASA/Caltech Jet Propulsion Laboratory and Blue Canyon Technologies, TEMPEST-D will provide observations at five millimeter-wave frequencies from 89 to 183 GHz using a single compact instrument that is well suited for the 6U-Class architecture. The top-level requirements for the 90-day TEMPEST-D mission are to: (1) demonstrate precision inter-satellite calibration between TEMPEST-D and one other orbiting radiometer (e.g. GPM or MHS) measuring at similar frequencies; and (2) demonstrate orbital drag maneuvers to control altitude, as verified by GPS, sufficient to achieve relative positioning in a constellation of 6U-Class nanosatellites. The TEMPEST-D 6U-Class satellite is planned to be delivered in July 2017 for launch through NASA CSLI no later than March 2018.

  3. Shakespeare for the 1990s: A Multicultural Tempest.

    ERIC Educational Resources Information Center

    Carey-Webb, Allen

    1993-01-01

    Argues that William Shakespeare's "The Tempest" is the play best suited for the high school English curriculum of the 1990s. Discusses historical and critical aspects of the play's key themes. Shows ways of using the play in high school classes, and describes 19 works to read alongside "The Tempest." (HB)

  4. Tempest simulations of kinetic GAM mode and neoclassical turbulence

    NASA Astrophysics Data System (ADS)

    Xu, X. Q.; Dimits, A. M.

    2007-11-01

    TEMPEST is a nonlinear five-dimensional (3d2v) gyrokinetic continuum code for studies of H-mode edge plasma neoclassical transport and turbulence in real divertor geometry. The 4D TEMPEST code correctly produces the frequency and collisionless damping of GAMs and zonal flow with fully nonlinear Boltzmann electrons in homogeneous plasmas. For large q = 4 to 9, the TEMPEST simulations show that a series of resonances at higher harmonics, v|| = ωG q R0/n with n = 4, becomes effective. The TEMPEST simulations also show that the GAM exists in the edge plasma pedestal for steep density and temperature gradients, and that an initial GAM relaxes to the standard neoclassical residual with neoclassical transport, rather than the Rosenbluth-Hinton residual, due to the presence of ion-ion collisions. The enhanced GAM damping explains experimental BES measurements of the edge q scaling of the GAM amplitude. Our 5D gyrokinetic code is built on the 4D TEMPEST neoclassical code, with extension to a fifth dimension in the toroidal direction and with 3D domain decomposition. Progress on performing 5D neoclassical turbulence simulations will be reported.
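
    Reading the garbled expression above as the transit-resonance condition v|| = ωG q R0/n (an editorial reconstruction), the role of the higher harmonics is simple arithmetic: larger n pulls the resonant parallel velocity down toward the thermal bulk, so at large q more particles can resonate with the GAM. All numbers below are invented for illustration:

      # Resonant parallel velocities v_par = omega_G * q * R0 / n.
      omega_G = 3.0e4  # GAM angular frequency (rad/s), assumed
      R0 = 1.7         # major radius (m), assumed

      for q in (4, 9):
          for n in (1, 2, 4):
              v_par = omega_G * q * R0 / n
              print(f"q={q}  n={n}  resonant v_par = {v_par:9.3e} m/s")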

  5. [Cardiologic emergencies and natural disaster. Prospective study with Xynthia tempest].

    PubMed

    Trebouet, E; Lipp, D; Dimet, J; Orion, L; Fradin, P

    2011-02-01

    Stress-induced cardiomyopathy and ischemic cardiopathy have been described after natural disasters such as earthquakes. The aim was to count stress-induced cardiomyopathies and ischemic cardiopathies immediately after the Xynthia tempest, which damaged the Vendean coast in February 2010, in order to study their epidemiology. Included patients were living in a tempest-damaged village, were admitted to the Vendee hospital immediately after or in the week following the tempest, and presented a suspected acute coronary syndrome or stress-induced cardiomyopathy. Among the 3350 inhabitants of the two damaged Vendean towns, we counted three acute coronary syndromes, two Tako-Tsubo cardiomyopathies, and one coronary spasm: five women and one man, with an average age of 76. Ischemic cardiopathy and stress-induced cardiomyopathy were over-represented in this tempest-damaged population, a phenomenon that has been little described. Copyright © 2010 Elsevier Masson SAS. All rights reserved.

  6. Scientific Overview of Temporal Experiment for Storms and Tropical Systems (TEMPEST) Program

    NASA Astrophysics Data System (ADS)

    Chandra, C. V.; Reising, S. C.; Kummerow, C. D.; van den Heever, S. C.; Todd, G.; Padmanabhan, S.; Brown, S. T.; Lim, B.; Haddad, Z. S.; Koch, T.; Berg, G.; L'Ecuyer, T.; Munchak, S. J.; Luo, Z. J.; Boukabara, S. A.; Ruf, C. S.

    2014-12-01

    Over the past decade and a half, we have gained a better understanding of the role of clouds and precipitation in Earth's water cycle, energy budget and climate, from focused Earth science observational satellite missions. However, these missions provide only a snapshot at one point in time of the cloud's development. Processes that govern cloud system development occur primarily on time scales of the order of 5-30 minutes that are generally not observable from low Earth orbiting satellites. Geostationary satellites, in contrast, have higher temporal resolution but at present are limited to visible and infrared wavelengths that observe only the tops of clouds. This observing gap was noted by the National Research Council's Earth Science Decadal Survey in 2007. Uncertainties in global climate models are significantly affected by processes that govern the formation and dissipation of clouds that largely control the global water and energy budgets. Current uncertainties in cloud parameterization within climate models lead to drastically different climate outcomes. With all evidence suggesting that the precipitation onset may be governed by factors such as atmospheric stability, it becomes critical to have at least first-order observations globally in diverse climate regimes. Similar arguments are valid for ice processes, where more efficient ice formation and precipitation tend to leave fewer ice clouds behind that have different but equally important impacts on the Earth's energy budget and resulting temperature trends. TEMPEST is a unique program that will provide a small constellation of inexpensive CubeSats with millimeter-wave radiometers to address key science needs related to cloud and precipitation processes. Because these processes are most critical in the development of climate models that will soon run at scales that explicitly resolve clouds, the TEMPEST program will directly focus on examining, validating and improving the parameterizations currently used in cloud-scale models. The time evolution of cloud and precipitation microphysics is dependent upon parameterized process rates. The outcome of TEMPEST will provide a first-order understanding of how individual assumptions in current cloud model parameterizations behave in diverse climate regimes.

  7. Application of the TEMPEST computer code to canister-filling heat transfer problems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Farnsworth, R.K.; Faletti, D.W.; Budden, M.J.

    Pacific Northwest Laboratory (PNL) researchers used the TEMPEST computer code to simulate thermal cooldown behavior of nuclear waste glass after it was poured into steel canisters for long-term storage. The objective of this work was to determine the accuracy and applicability of the TEMPEST code when used to compute canister thermal histories. First, experimental data were obtained to provide the basis for comparing TEMPEST-generated predictions. Five canisters were instrumented with appropriately located radial and axial thermocouples. The canisters were filled using the pilot-scale ceramic melter (PSCM) at PNL. Each canister was filled in either a continuous or a batch filling mode. One of the canisters was also filled within a turntable simulant (a group of cylindrical shells with heat transfer resistances similar to those in an actual melter turntable). This was necessary to provide a basis for assessing the ability of the TEMPEST code to also model the transient cooling of canisters in a melter turntable. The continuous-fill model, Version M, was found to predict temperatures with more accuracy. The turntable simulant experiment demonstrated that TEMPEST can adequately model the asymmetric temperature field caused by the turntable geometry. Further, TEMPEST can acceptably predict the canister cooling history within a turntable, despite code limitations in computing simultaneous radiation and convection heat transfer between shells, along with uncertainty in stainless-steel surface emissivities. Based on the successful performance of TEMPEST Version M, development was initiated to incorporate 1) full viscous glass convection, 2) a dynamically adaptive grid that automatically follows the glass/air interface throughout the transient, and 3) a full enclosure radiation model to allow radiation heat transfer to non-nearest-neighbor cells. 5 refs., 47 figs., 17 tabs.
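
    For orientation only, the physics of the cooldown problem above can be caricatured with a lumped-capacitance model: one glass mass losing heat by convection and radiation from the canister wall. Every parameter below is invented; the actual TEMPEST simulations resolve the full spatial temperature field:

      # Lumped-capacitance cooldown with convective plus radiative losses.
      sigma = 5.670e-8                    # Stefan-Boltzmann constant (W/m^2/K^4)
      m, c = 1500.0, 900.0                # glass mass (kg), heat capacity (J/kg/K)
      A, h, eps = 3.0, 8.0, 0.6           # area (m^2), conv. coeff., emissivity (assumed)
      T, T_amb, dt = 1300.0, 300.0, 10.0  # temperatures (K) and time step (s)

      for step in range(int(48 * 3600 / dt)):  # integrate 48 hours
          q_loss = h * A * (T - T_amb) + eps * sigma * A * (T**4 - T_amb**4)
          T -= q_loss * dt / (m * c)
          if step % 2160 == 0:                 # report every 6 hours
              print(f"t = {step * dt / 3600:5.1f} h   T = {T:7.1f} K")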

  8. TEMPEST II--A NEUTRON THERMALIZATION CODE

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shudde, R.H.; Dyer, J.

    The TEMPEST II neutron thermalization code, written in Fortran for the IBM 709 or 7090, calculates thermal neutron flux spectra based upon the Wigner-Wilkins equation, the Wilkins equation, or the Maxwellian distribution. When a neutron spectrum is obtained, TEMPEST II provides microscopic and macroscopic cross section averages over that spectrum. Equations used by the code and sample input and output data are given. (auth)
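
    The spectrum-averaging step described above is the textbook operation <sigma> = integral(sigma*phi dE) / integral(phi dE). A small sketch for a Maxwellian flux spectrum and a 1/v absorber (illustrative values; analytically the average should approach (sqrt(pi)/2)*sigma0, about 0.886*sigma0):

      # Maxwellian-flux-averaged cross section for a 1/v absorber.
      import numpy as np

      kT = 0.0253                        # thermal energy (eV) at ~293.6 K
      E = np.linspace(1e-5, 1.0, 20000)  # energy grid (eV); uniform spacing
      phi = E * np.exp(-E / kT)          # Maxwellian flux spectrum phi(E)

      sigma0 = 2.0                       # cross section at E = kT (barns)
      sigma = sigma0 * np.sqrt(kT / E)   # 1/v energy dependence

      sigma_avg = float((sigma * phi).sum() / phi.sum())  # uniform dE cancels
      print(f"spectrum-averaged sigma = {sigma_avg:.3f} barns")  # ~1.772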

  9. Recent Advances in the Tempest UAS for In-Situ Measurements in Highly-Dynamic Environments

    NASA Astrophysics Data System (ADS)

    Argrow, B. M.; Frew, E.; Houston, A. L.; Weiss, C.

    2014-12-01

    The spring 2010 deployment of the Tempest UAS during the VORTEX2 field campaign verified that a small UAS, supported by a customized mobile communications, command, and control (C3) architecture, could simultaneously satisfy Federal Aviation Administration (FAA) airspace requirements and make in-situ thermodynamic measurements in supercell thunderstorms. A multi-hole airdata probe was recently integrated into the Tempest UAS airframe, and verification flights were made in spring 2013 to collect in-situ wind measurements behind gust fronts produced by supercell thunderstorms in northeast Colorado. Using instantaneous aircraft attitude estimates from the autopilot, the in-situ measurements were converted to inertial wind estimates, and estimates of uncertainty in the wind measurements were examined. To date, the limited deployments of the Tempest UAS have primarily focused on addressing the engineering and regulatory requirements to conduct supercell research, and the Tempest UAS team of engineers and meteorologists is preparing for deployments with the focus on collecting targeted data for meteorological exploration and hypothesis testing. We describe the recent expansion of the operations area and altitude ceiling of the Tempest UAS, engineering issues for accurate inertial wind estimates, new concepts of operation that include the simultaneous deployment of multiple aircraft with mobile ground stations, and our current effort to develop a capability for the Tempest UAS to perform autonomous path planning to maximize energy harvesting from the local wind field for increased endurance.

  10. Tempest in a Therapeutic Community: Implementation and Evaluation Issues for Faith-Based Programming

    ERIC Educational Resources Information Center

    Scott, Diane L.; Crow, Matthew S.; Thompson, Carla J.

    2010-01-01

    The therapeutic community (TC) is an increasingly utilized intervention model in corrections settings. Rarely do these TCs include faith-based curriculum other than that included in Alcoholics Anonymous or Narcotics Anonymous programs as does the faith-based TC that serves as the basis for this article. Borrowing from the successful TC model, the…

  11. Suggestions to Gain Deeper Understanding of Magnetic Fields in Astrophysics Classrooms

    NASA Astrophysics Data System (ADS)

    Woolsey, Lauren N.

    2016-01-01

    I present two tools that could be used in an undergraduate or graduate classroom to aid in developing intuition of magnetic fields, how they are measured, and how they affect large scale phenomena like the solar wind. The first tool is a Mathematica widget I developed that simulates observations of magnetic field in the Interstellar Medium (ISM) using the weak Zeeman effect. Woolsey (2015, JAESE) discusses the relevant background information about what structures in the ISM produce a strong enough effect and which molecules are used to make the measurement and why. This widget could be used in an entry level astronomy course as a way to show how astronomers actually make certain types of measurements and allow students to practice inquiry-based learning to understand how different aspects of the ISM environment strengthen or weaken the observed signal. The second tool is a Python model of the solar wind, The Efficient Modified Parker Equation Solving Tool (TEMPEST), that is publicly available on GitHub (https://github.com/lnwoolsey/tempest). I discuss possible short-term projects or investigations that could be done using the programs in the TEMPEST library that are suitable for upper-level undergraduates or in graduate level coursework (Woolsey, 2015, JRAEO).
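
    For readers who want a feel for the kind of calculation TEMPEST generalizes, the hedged sketch below solves the classic isothermal Parker solar-wind equation (this is not TEMPEST's own API; see the GitHub repository above for the actual library). It finds the wind speed from the transcendental Parker relation (v/c_s)^2 - ln(v/c_s)^2 = 4 ln(r/r_c) + 4 r_c/r - 3, taking the subsonic branch inside the critical radius r_c and the supersonic branch outside.

      # Sketch: isothermal Parker wind speed vs. radius (illustrative parameters).
      import numpy as np
      from scipy.optimize import brentq

      GM_sun = 1.327e20                  # m^3 s^-2
      c_s = 1.2e5                        # isothermal sound speed, m/s (~1.5 MK corona)
      r_c = GM_sun / (2 * c_s**2)        # critical (sonic) radius
      R_sun = 6.96e8                     # m

      def residual(w, x):                # w = v/c_s, x = r/r_c
          return w**2 - np.log(w**2) - 4*np.log(x) - 4/x + 3

      for r in np.array([2.0, 5.0, 20.0, 215.0]) * R_sun:   # 215 R_sun ~ 1 AU
          x = r / r_c
          bracket = (1.0, 10.0) if x > 1 else (1e-6, 1.0)   # supersonic vs. subsonic branch
          w = brentq(residual, *bracket, args=(x,))
          print(f"r = {r/R_sun:6.1f} R_sun: v ~ {w*c_s/1e3:6.1f} km/s")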

  12. Numerical Methods for Nonlinear Fokker-Planck Collision Operator in TEMPEST

    NASA Astrophysics Data System (ADS)

    Kerbel, G.; Xiong, Z.

    2006-10-01

    Early implementations of the Fokker-Planck collision operator and moment computations in TEMPEST used low-order polynomial interpolation schemes to reuse conservative operators developed for speed/pitch-angle (v, θ) coordinates. When this approach proved to be too inaccurate, we developed an alternative higher-order interpolation scheme for the Rosenbluth potentials and a high-order finite volume method in TEMPEST (E, μ) coordinates. The collision operator is thus generated by using the expansion technique in (v, θ) coordinates for the diffusion coefficients only, and the fluxes for the conservative differencing are then computed directly in the TEMPEST (E, μ) coordinates. Combined with a cut-cell treatment at the turning-point boundary, this new approach is shown to have much better accuracy and conservation properties.
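
    The conservation property mentioned here comes from flux-form ("conservative") differencing: each interface flux is added to one cell and subtracted from its neighbor, so the discrete total changes only through boundary fluxes. The toy 1D advection sketch below demonstrates the idea; it is not the actual 4D collision operator.

      # Sketch: flux-form finite-volume update conserves the total to round-off.
      import numpy as np

      n, dx, dt, u = 200, 0.01, 0.002, 1.0
      f = np.exp(-((np.arange(n) * dx - 1.0) / 0.1)**2)   # cell averages, periodic domain

      def step(f):
          F = u * np.roll(f, 1)                         # upwind flux through each left face (u > 0)
          return f - dt / dx * (np.roll(F, -1) - F)     # f_i -= dt/dx (F_{i+1/2} - F_{i-1/2})

      total0 = f.sum() * dx
      for _ in range(100):
          f = step(f)
      print("relative change in total:", abs(f.sum() * dx - total0) / total0)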

  13. 32 CFR 2001.51 - Technical security.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... Surveillance Countermeasures and TEMPEST necessary to detect or deter exploitation of classified information..., TEMPEST Countermeasures for Facilities, and SPB Issuance 6-97, National Policy on Technical Surveillance...

  14. 32 CFR 2001.51 - Technical security.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... Surveillance Countermeasures and TEMPEST necessary to detect or deter exploitation of classified information..., TEMPEST Countermeasures for Facilities, and SPB Issuance 6-97, National Policy on Technical Surveillance...

  15. 32 CFR 2001.51 - Technical security.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Surveillance Countermeasures and TEMPEST necessary to detect or deter exploitation of classified information..., TEMPEST Countermeasures for Facilities, and SPB Issuance 6-97, National Policy on Technical Surveillance...

  16. 32 CFR 2001.51 - Technical security.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Surveillance Countermeasures and TEMPEST necessary to detect or deter exploitation of classified information..., TEMPEST Countermeasures for Facilities, and SPB Issuance 6-97, National Policy on Technical Surveillance...

  17. 32 CFR 2001.51 - Technical security.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... Surveillance Countermeasures and TEMPEST necessary to detect or deter exploitation of classified information..., TEMPEST Countermeasures for Facilities, and SPB Issuance 6-97, National Policy on Technical Surveillance...

  18. Temporal Experiment for Storms and Tropical Systems Technology Demonstration (TEMPEST-D): Risk Reduction for 6U-Class Nanosatellite Constellations

    NASA Astrophysics Data System (ADS)

    Reising, S. C.; Todd, G.; Kummerow, C. D.; Chandrasekar, V.; Padmanabhan, S.; Lim, B.; Brown, S. T.; van den Heever, S. C.; L'Ecuyer, T.; Ruf, C. S.; Luo, Z. J.; Munchak, S. J.; Haddad, Z. S.; Boukabara, S. A.

    2015-12-01

    The Temporal Experiment for Storms and Tropical Systems Technology Demonstration (TEMPEST-D) is designed to demonstrate required technology to enable a constellation of 6U-Class nanosatellites to directly observe the time evolution of clouds and study the conditions that control the transition of clouds to precipitation using high-temporal resolution observations. TEMPEST millimeter-wave radiometers in the 90-GHz to 183-GHz frequency range penetrate into the cloud to observe key changes as the cloud begins to precipitate or ice accumulates inside the storm. The evolution of ice formation in clouds is important for climate prediction since it largely drives Earth's radiation budget. TEMPEST improves understanding of cloud processes and helps to constrain one of the largest sources of uncertainty in climate models. TEMPEST-D provides observations at five millimeter-wave frequencies from 90 to 183 GHz using a single compact instrument that is well suited for the 6U-Class architecture and fits well within the capabilities of NASA's CubeSat Launch Initiative (CSLI), for which TEMPEST-D was approved in 2015. For a potential future mission of one year of operations, five identical 6U-Class satellites deployed in the same orbital plane with 5-10 minute spacing at ~400 km altitude and 50°-65° inclination are expected to capture 3 million observations of precipitation, including 100,000 deep convective events. TEMPEST is designed to provide critical information on the time evolution of cloud and precipitation microphysics, yielding a first-order understanding of the behavior of assumptions in current cloud-model parameterizations in diverse climate regimes.

  19. TEMPEST-D Spacecraft

    NASA Image and Video Library

    2018-05-17

    The complete TEMPEST-D spacecraft shown with the solar panels deployed. RainCube, CubeRRT and TEMPEST-D are currently integrated aboard Orbital ATK's Cygnus spacecraft and are awaiting launch on an Antares rocket. After the CubeSats have arrived at the station, they will be deployed into low-Earth orbit and will begin their missions to test these new technologies useful for predicting weather, ensuring data quality, and helping researchers better understand storms. https://photojournal.jpl.nasa.gov/catalog/PIA22458

  20. Crisis Stability and Long-Range Strike: A Comparative Analysis of Fighters, Bombers, and Missiles

    DTIC Science & Technology

    2013-01-01

    1947–1949 crisis, Pakistan: seize Kashmir; Hawker Tempests, but not brandished or employed; no [war]. India: deny Pakistan control of Kashmir; Hawker... Tempests, but not brandished or employed; no war. Berlin 1948–1949 crisis, USSR: force western powers out of west Berlin; bombers and strike aircraft...resolution to border dispute; B-24, B-57, HF-24, Tempest, Mystère IV, Ouragan aircraft, but none brandished or employed; no war. (Table C.1)

  1. TEMPEST code simulations of hydrogen distribution in reactor containment structures. Final report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trent, D.S.; Eyler, L.L.

    The mass transport version of the TEMPEST computer code was used to simulate hydrogen distribution in geometric configurations relevant to reactor containment structures. Predicted results of Battelle-Frankfurt hydrogen distribution tests 1 to 6, and 12 are presented. Agreement between predictions and experimental data is good. Best agreement is obtained using the k-epsilon turbulence model in TEMPEST in flow cases where turbulent diffusion and stable stratification are dominant mechanisms affecting transport. The code's general analysis capabilities are summarized.

  2. Tempest in a teapot: utility advertising

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ciscel, D.H.

    Utility sales programs represent a form of organizational slack. It is an expense that can be traded off in times of administrative stress, providing a satisfactory payment to the consumer while maintaining the integrity of the present institutional arrangement. Because it is a trade-off commodity, regulatory control of utility advertising will remain a "tempest in a teapot." Marketing programs are an integral part of the selling process in the modern corporation, and severe restrictions on advertising must be temporary in nature. Court cases have pointed out that utility companies need to inform the consumer about the use of the product and to promote demand for the product. These actions will be considered legally reasonable no matter what the final disposition of current environmental regulations and energy restrictions. In fact, as acceptable social solutions develop for environmental and energy supply problems, the pressure on utility advertising can be expected to fall proportionately. However, the utility still represents the largest industrial concern in most locales. The utility advertising program makes the company even more visible. When there is public dissatisfaction with the more complex parts of the utility delivery system, the raucous voice of outrage will emerge from this tempestuous teapot.

  3. Electronic Warfare Test and Evaluation (Essai et evaluation en matiere de guerre electronique)

    DTIC Science & Technology

    2012-12-01

    Largest known chamber is 80 x 76 x 21 m. Shielding and quiet zones: usually ≥100 dB over at least 0.5 – 18 GHz. TEMPEST grade. Quiet zones: one or...accommodated as an afterthought. The highest level of RF/EO/IR/UV security control is offered by TEMPEST-grade aircraft-sized anechoic chambers. 6.9.7 SUT...aircraft-sized, RF- and laser-shielded anechoic chamber, shielded rooms, and an EW Sub-System Test Laboratory, all TEMPEST grade. It is co-located with the

  4. Quicksilver: Middleware for Scalable Self-Regenerative Systems

    DTIC Science & Technology

    2006-04-01

    Applications can be coded in any of about 25 programming languages ranging from the obvious ones to some very obscure languages, such as OCaml ...technology. Like Tempest, Quicksilver can support applications written in any of a wide range of programming languages supported by .NET. However, whereas...so that developers can work in standard languages and with standard tools and still exploit those solutions. Vendors need to see some success

  5. Nonlinear Full-f Edge Gyrokinetic Turbulence Simulations

    NASA Astrophysics Data System (ADS)

    Xu, X. Q.; Dimits, A. M.; Umansky, M. V.

    2008-11-01

    TEMPEST is a nonlinear full-f 5D electrostatic gyrokinetic code for simulations of neoclassical and turbulent transport in tokamak plasmas. Given an initial density perturbation, 4D TEMPEST simulations show that the kinetic GAM exists in the edge in the form of outgoing waves [1], that its radial scale is set by plasma profiles, and that ion temperature inhomogeneity is necessary for GAM radial propagation. Starting from an initial Maxwellian distribution with uniform poloidal profiles on flux surfaces, 5D TEMPEST simulations in flux coordinates with a Boltzmann electron model in circular geometry show the development of a neoclassical equilibrium and the generation of the neoclassical electric field due to neoclassical polarization, followed by a growth of instability due to the spatial gradients. 5D TEMPEST simulations of kinetic GAM turbulent generation, radial propagation, and its impact on transport will be reported. [1] X. Q. Xu, Phys. Rev. E., 78 (2008).

  6. Anticipating Reader Response: Why I Chose "The Tempest" for English Literature Survey.

    ERIC Educational Resources Information Center

    Jones, Dan C.

    1985-01-01

    Argues in favor of a reader-response approach to the process of selecting the literary works students read in introductory or survey courses. Offers a rationale for using "The Tempest" in such a course. (FL)

  7. Idempotent Methods for Continuous Time Nonlinear Stochastic Control

    DTIC Science & Technology

    2012-09-13

    Performing organization: Stochastech Corporation dba Tempest Technologies, 8939 S. Sepulveda Boulevard, Suite 506, Los Angeles, CA 90045. ...Stochastic Control Problems, Ben G. Fitzpatrick, Tempest Technologies, 8939 S. Sepulveda Boulevard, Suite 506, Los Angeles, CA 90045. Sponsored by

  8. TeMPEST: the Texas, McDonald Photometric Extrasolar Search for Transits

    NASA Astrophysics Data System (ADS)

    Baliber, N. R.; Cochran, W. D.

    2001-11-01

    The TeMPEST project is a photometric search for transits of extrasolar giant planets orbiting within ~0.1 AU of their parent stars. As is the case with HD 209458, the only known transiting system, measurements of the photometric dimming of stars with transiting planets, along with radial velocity (RV) data, will provide information on physical characteristics (mass, radius, and mean density) of these planets. Further study of HD 209458 b and planets like it might reveal their reflectivity, putting further constraints on their surface temperatures, as well as allow measurement of the composition of their outer atmospheres. To detect these types of systems, we use the McDonald Observatory 0.76m Prime Focus Camera (PFC), which provides a 46.2 arcmin square field. We are currently obtaining our first full season of data, and by early 2002 will have sufficient data to follow approximately 5,000 stars with the precision necessary to detect transits of close-orbiting Jovian planets. We also present data on the detection of the transit of the planet orbiting HD 209458 using the 0.76m PFC. These data are consistent with the partial occultation of the light from the star caused by the transit of an opaque disc of radius 1.4 R_Jup. The TeMPEST project is funded by the NASA Origins program.
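
    The quoted disc size maps to an observable through simple transit arithmetic: the fractional dimming of a central transit of an opaque disc is (R_p/R_star)^2. The hedged sketch below runs that calculation for a 1.4 R_Jup disc, assuming a stellar radius of ~1.15 R_sun (a round illustrative value, not taken from this record).

      # Sketch: central transit depth = (planet radius / stellar radius)^2.
      R_jup, R_sun = 7.149e7, 6.96e8           # meters
      R_p, R_star = 1.4 * R_jup, 1.15 * R_sun  # 1.4 R_Jup disc; assumed stellar radius

      depth = (R_p / R_star)**2
      print(f"transit depth ~ {100 * depth:.2f}%")   # ~1.6% for these assumed values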

  9. Tempest gas turbine extends EGT product line

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chellini, R.

    With the introduction of the 7.8 MW (mechanical output) Tempest gas turbine, EGT has extended the company's line of small industrial turbines. The new Tempest machine, featuring a 7.5 MW electric output and a 33% thermal efficiency, ranks above the company's single-shaft Typhoon gas turbine, rated 3.2 and 4.9 MW, and the 6.3 MW Tornado gas turbine. All three machines are well-suited for use in combined heat and power (CHP) plants, as demonstrated by the fact that close to 50% of the 150 Typhoon units sold are for CHP applications. This experience has induced EGT, of Lincoln, England, to announce the introduction of the new gas turbine prior to completion of the testing program. The present single-shaft machine is expected to be used mainly for industrial cogeneration. This market segment, covering the needs of paper mills, hospitals, chemical plants, the ceramic industry, etc., is a typical local market. Cogeneration plants are engineered according to local needs and have to be assisted by local organizations. For this reason, to efficiently cover the world market, EGT has selected a number of associates that will receive from Lincoln completely engineered machine packages and will engineer the cogeneration system according to custom requirements. These partners will also assist the customer and dispose locally of the spares required for maintenance operations.

  10. Tempest Simulations of Collisionless Damping of the Geodesic-Acoustic Mode in Edge-Plasma Pedestals

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, X. Q.; Xiong, Z.; Nevins, W. M.

    The fully nonlinear (full-f) four-dimensional TEMPEST gyrokinetic continuum code correctly produces the frequency and collisionless damping of geodesic-acoustic modes (GAMs) and zonal flow, with fully nonlinear Boltzmann electrons for the inverse aspect ratio ε scan and the tokamak safety factor q scan in homogeneous plasmas. TEMPEST simulations show that the GAMs exist in the edge pedestal for steep density and temperature gradients in the form of outgoing waves. The enhanced GAM damping may explain experimental beam emission spectroscopy measurements on the edge q scaling of the GAM amplitude.

  11. Tempest Simulations of Collisionless Damping of the Geodesic-Acoustic Mode in Edge-Plasma Pedestals

    NASA Astrophysics Data System (ADS)

    Xu, X. Q.; Xiong, Z.; Gao, Z.; Nevins, W. M.; McKee, G. R.

    2008-05-01

    The fully nonlinear (full-f) four-dimensional TEMPEST gyrokinetic continuum code correctly produces the frequency and collisionless damping of geodesic-acoustic modes (GAMs) and zonal flow, with fully nonlinear Boltzmann electrons for the inverse aspect ratio ε scan and the tokamak safety factor q scan in homogeneous plasmas. TEMPEST simulations show that the GAMs exist in the edge pedestal for steep density and temperature gradients in the form of outgoing waves. The enhanced GAM damping may explain experimental beam emission spectroscopy measurements on the edge q scaling of the GAM amplitude.

  12. TEMPEST simulations of collisionless damping of the geodesic-acoustic mode in edge-plasma pedestals.

    PubMed

    Xu, X Q; Xiong, Z; Gao, Z; Nevins, W M; McKee, G R

    2008-05-30

    The fully nonlinear (full-f) four-dimensional TEMPEST gyrokinetic continuum code correctly produces the frequency and collisionless damping of geodesic-acoustic modes (GAMs) and zonal flow, with fully nonlinear Boltzmann electrons for the inverse aspect ratio scan and the tokamak safety factor q scan in homogeneous plasmas. TEMPEST simulations show that the GAMs exist in the edge pedestal for steep density and temperature gradients in the form of outgoing waves. The enhanced GAM damping may explain experimental beam emission spectroscopy measurements on the edge q scaling of the GAM amplitude.

  13. Exploring the temporal structure of heterochronous sequences using TempEst (formerly Path-O-Gen).

    PubMed

    Rambaut, Andrew; Lam, Tommy T; Max Carvalho, Luiz; Pybus, Oliver G

    2016-01-01

    Gene sequences sampled at different points in time can be used to infer molecular phylogenies on a natural timescale of months or years, provided that the sequences in question undergo measurable amounts of evolutionary change between sampling times. Data sets with this property are termed heterochronous and have become increasingly common in several fields of biology, most notably the molecular epidemiology of rapidly evolving viruses. Here we introduce the cross-platform software tool, TempEst (formerly known as Path-O-Gen), for the visualization and analysis of temporally sampled sequence data. Given a molecular phylogeny and the dates of sampling for each sequence, TempEst uses an interactive regression approach to explore the association between genetic divergence through time and sampling dates. TempEst can be used to (1) assess whether there is sufficient temporal signal in the data to proceed with phylogenetic molecular clock analysis, and (2) identify sequences whose genetic divergence and sampling date are incongruent. Examination of the latter can help identify data quality problems, including errors in data annotation, sample contamination, sequence recombination, or alignment error. We recommend that all users of the molecular clock models implemented in BEAST first check their data using TempEst prior to analysis.
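
    The regression TempEst performs is conceptually simple: fit root-to-tip genetic divergence against sampling date, read the evolutionary rate off the slope and the root age off the x-intercept, and flag large residuals as possible data-quality problems. The hedged sketch below illustrates the idea with invented numbers; TempEst itself is an interactive tool, not this script.

      # Sketch: root-to-tip regression for temporal signal (hypothetical data).
      import numpy as np

      dates      = np.array([2003.1, 2005.6, 2008.2, 2010.9, 2013.4, 2015.8])
      divergence = np.array([0.012, 0.019, 0.024, 0.032, 0.037, 0.044])  # subs/site

      rate, intercept = np.polyfit(dates, divergence, 1)   # slope = evolutionary rate
      tmrca = -intercept / rate                            # x-intercept = root date estimate
      resid = divergence - (rate * dates + intercept)

      print(f"rate ~ {rate:.2e} subs/site/year, root date ~ {tmrca:.1f}")
      print("largest residual (candidate QC problem): index", np.argmax(np.abs(resid)))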

  14. Dynamics of kinetic geodesic-acoustic modes and the radial electric field in tokamak neoclassical plasmas

    NASA Astrophysics Data System (ADS)

    Xu, X. Q.; Belli, E.; Bodi, K.; Candy, J.; Chang, C. S.; Cohen, R. H.; Colella, P.; Dimits, A. M.; Dorr, M. R.; Gao, Z.; Hittinger, J. A.; Ko, S.; Krasheninnikov, S.; McKee, G. R.; Nevins, W. M.; Rognlien, T. D.; Snyder, P. B.; Suh, J.; Umansky, M. V.

    2009-06-01

    We present edge gyrokinetic simulations of tokamak plasmas using the fully non-linear (full-f) continuum code TEMPEST. A non-linear Boltzmann model is used for the electrons. The electric field is obtained by solving the 2D gyrokinetic Poisson equation. We demonstrate the following. (1) High harmonic resonances (n > 2) significantly enhance geodesic-acoustic mode (GAM) damping at high q (tokamak safety factor), and are necessary to explain the damping observed in our TEMPEST q-scans and consistent with the experimental measurements of the scaling of the GAM amplitude with edge q95 in the absence of obvious evidence that there is a strong q-dependence of the turbulent drive and damping of the GAM. (2) The kinetic GAM exists in the edge for steep density and temperature gradients in the form of outgoing waves, its radial scale is set by the ion temperature profile, and ion temperature inhomogeneity is necessary for GAM radial propagation. (3) The development of the neoclassical electric field evolves through different phases of relaxation, including GAMs, their radial propagation and their long-time collisional decay. (4) Natural consequences of orbits in the pedestal and scrape-off layer region in divertor geometry are substantial non-Maxwellian ion distributions and parallel flow characteristics qualitatively like those observed in experiments.

  15. Enhancing Electromagnetic Side-Channel Analysis in an Operational Environment

    DTIC Science & Technology

    2013-09-01

    phenomenon of compromising power and EM emissions has been known and exploited for decades. Declassified TEMPEST documents reveal vulnerabilities of...Components. One technique to detect potentially compromising emissions is to use a wide-band receiver tuned to a specific frequency. High-end TEMPEST

  16. TEMPEST code modifications and testing for erosion-resisting sludge simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Onishi, Y.; Trent, D.S.

    The TEMPEST computer code has been used to address many waste retrieval operational and safety questions regarding waste mobilization, mixing, and gas retention. Because the amount of sludge retrieved from the tank is directly related to the sludge yield strength and the shear stress acting upon it, it is important to incorporate the sludge yield strength into simulations of erosion-resisting tank waste retrieval operations. This report describes current efforts to modify the TEMPEST code to simulate pump jet mixing of erosion-resisting tank wastes and the models used to test for erosion of waste sludge with yield strength. Test results for solid deposition and diluent/slurry jet injection into sludge layers in simplified tank conditions show that the modified TEMPEST code has a basic ability to simulate both the mobility and immobility of the sludges with yield strength. Further testing, modification, calibration, and verification of the sludge mobilization/immobilization model are planned using erosion data as they apply to waste tank sludges.

  17. Temporal Experiment for Storms and Tropical Systems (TEMPEST) CubeSat Constellation

    NASA Astrophysics Data System (ADS)

    Reising, S. C.; Todd, G.; Padmanabhan, S.; Brown, S. T.; Lim, B.; Kummerow, C. D.; Chandra, C. V.; van den Heever, S. C.; L'Ecuyer, T. S.; Luo, Z. J.; Haddad, Z. S.; Munchak, S. J.; Ruf, C. S.; Berg, G.; Koch, T.; Boukabara, S. A.

    2014-12-01

    TEMPEST addresses key science needs related to cloud and precipitation processes using a constellation of five CubeSats with identical five-frequency millimeter-wave radiometers spaced 5-10 minutes apart in orbit. The deployment of CubeSat constellations on satellite launches of opportunity allows Earth system observations to be accomplished with greater robustness, shorter repeat times and at a small fraction of the cost of typical Earth Science missions. The current suite of Earth-observing satellites is capable of measuring precipitation parameters using radar or radiometric observations. However, these low Earth-orbiting satellites provide only a snapshot of each storm, due to their repeat-pass times of many hours to days. With typical convective events lasting 1-2 hours, it is highly unlikely that the time evolution of clouds through the onset of precipitation will be observed with current assets. The TEMPEST CubeSat constellation directly observes the time evolution of clouds and identifies changes in time to detect the moment of the onset of precipitation. The TEMPEST millimeter-wave radiometers penetrate into the cloud to directly observe changes as the cloud begins to precipitate or ice accumulates inside the storm. The evolution of ice formation in clouds is important for climate prediction because it largely drives Earth's radiation budget. TEMPEST improves understanding of cloud processes and helps to constrain one of the largest sources of uncertainty in climate models. TEMPEST provides observations at five millimeter-wave frequencies from 90 to 183 GHz using a single compact instrument that is well suited for a 6U CubeSat architecture and fits well within the NASA CubeSat Launch Initiative (CSLI) capabilities. Five identical CubeSats deployed in the same orbital plane with 5-10 minute spacing at 390-450 km altitude and 50-65 degree inclination capture 3 million observations of precipitation, including 100,000 deep convective events in a one-year mission. TEMPEST provides critical information on the time evolution of cloud and precipitation microphysics, thereby yielding a first-order understanding of how assumptions in current cloud-model parameterizations behave in diverse climate regimes.

  18. TEMPEST Simulations of Collisionless Damping of Geodesic-Acoustic Mode in Edge Plasma Pedestal

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, X Q; Xiong, Z; Nevins, W M

    The fully nonlinear (full-f) 4D TEMPEST gyrokinetic continuum code produces the frequency and collisionless damping of GAM and zonal flow with fully nonlinear Boltzmann electrons for the inverse aspect ratio ε-scan and the tokamak safety factor q-scan in homogeneous plasmas. The TEMPEST simulation shows that GAM exists in the edge plasma pedestal for steep density and temperature gradients, and that an initial GAM relaxes to the standard neoclassical residual, rather than the Rosenbluth-Hinton residual, due to the presence of ion-ion collisions. The enhanced GAM damping explains experimental BES measurements on the edge q scaling of the GAM amplitude.

  19. TEMPEST Simulations of Collisionless Damping of Geodesic-Acoustic Mode in Edge Plasma Pedestal

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, X; Xiong, Z; Nevins, W

    The fully nonlinear 4D TEMPEST gyrokinetic continuum code produces the frequency and collisionless damping of the geodesic-acoustic mode (GAM) and zonal flow with fully nonlinear Boltzmann electrons for the inverse aspect ratio ε-scan and the tokamak safety factor q-scan in homogeneous plasmas. The TEMPEST simulation shows that GAM exists in the edge plasma pedestal for steep density and temperature gradients, and that an initial GAM relaxes to the standard neoclassical residual, rather than the Rosenbluth-Hinton residual, due to the presence of ion-ion collisions. The enhanced GAM damping explains experimental BES measurements on the edge q scaling of the GAM amplitude.

  20. Photometric Detection of Extra-Solar Planets

    NASA Technical Reports Server (NTRS)

    Hatzes, Artie P.; Cochran, William D.

    2004-01-01

    This NASA Origins Program grant supported the TeMPEST (Texas McDonald Photometric Extrasolar Search for Transits) program at McDonald Observatory, which searches for transits of extrasolar planets across the disks of their parent stars. The basic approach is to use a wide-field ground-based telescope (in our case the McDonald Observatory 0.76m telescope and its Prime Focus Corrector) to search for transits of short-period (1-15 day orbits) close-in hot-Jupiter planets in orbit around a large sample of field stars. The next task is to search these data streams for possible transit events. We collected our first set of test data for this program using the 0.76 m PFC in the summer of 1998. From those data, we developed the optimal observing procedures, including tailoring the stellar density, exposure times, and filters to best suit the instrument and project. In the summer of 1999, we obtained the first partial season of data on a dedicated field in the constellation Cygnus. These data were used to develop and refine the reduction and analysis procedures to produce high-precision photometry and search for transits in the resulting light curves. The TeMPEST project subsequently obtained three full seasons of data on six different fields using the McDonald Observatory 0.76m PFC.

  1. Shakespeare's Poetics of Play-Making and Therapeutic Action in "The Tempest."

    ERIC Educational Resources Information Center

    Reed, Melissa Ann

    2000-01-01

    Practices Kenneth Burke's rhetoric of empathic identification to read and understand six levels of consubstantiality between Shakespeare and his Elizabethan audience blueprinted by the authorized text of "The Tempest." Offers implications for the contemporary practices of poetry and drama therapy with participants capable of…

  2. Novel Algorithm/Hardware Partnerships for Real-Time Nonlinear Control

    DTIC Science & Technology

    2014-02-28

    Tempest Technologies, 28 February 2014. Abstract: The real-time implementation of controls in nonlinear systems remains one of the great...button for resetting the FPGA board in the Max-Plus MVM FPGA system. We utilize the built-in 32MB BPI flash as storage for the Tempest Max-Plus MVM

  3. Asia-Pacific: A Selected Bibliography

    DTIC Science & Technology

    2013-01-01

    www.rsis.edu.sg/publications/Perspective/RSIS0842009.pdf Kurlantzick, Joshua. "Avoiding a Tempest in the South China Sea." Council on Foreign Relations...September 2, 2010. http://www.cfr.org/china/avoiding-tempest-south-china-sea/p22858 Kurlantzick, Joshua. "Growing U.S. Role in South China Sea

  4. A "Tempest" Project: Shakespeare and Critical Conflicts.

    ERIC Educational Resources Information Center

    McCann, Thomas M.; Flanagan, Joseph M.

    2002-01-01

    Describes a 4-week unit of study that focuses on Shakespeare's "The Tempest," a text that has been especially controversial in today's climate of increased multicultural awareness. Involves students in a larger conversation about the possibilities for reading and interpreting literature and prepares them to write mature analyses of the…

  5. "The Tempest": A Negotiable Meta-Panopticon

    ERIC Educational Resources Information Center

    Motlagh, Hanieh Mehr

    2015-01-01

    In "The Tempest", Shakespeare represents a world in which the model of a panopticon within a panopticon reveals how the power relations functions. All the major and minor characters establish panopticons which start from their own bodies and soul and move toward the larger one which belongs to that of Prospero as the higher order who has…

  6. Naval Mine Countermeasures: The Achilles Heel of U.S. Homeland Defense

    DTIC Science & Technology

    2013-05-20

    vii. 5 Ibid, viii. 6 Mark Tempest, “Port Security: Sea Mines, UWIEDS and Other Threats,” EagleSpeak. May 1, 2008, http://observer.guardian.co.uk...2013. http://news.yahoo.com/blogs/lookout/nypd-ray-kelly-boston-marathon-bombings-173922894.html. Tempest, Mark. “Port Security: Sea Mines, UWIEDS

  7. Dramatic Prelude: Using Drama To Introduce Classic Literature to Young Readers.

    ERIC Educational Resources Information Center

    Winstead, Anita

    1997-01-01

    This paper describes the work done by a third-grade class to write and present adaptations of Dickens'"A Christmas Carol" and Shakespeare's "The Tempest." Students explored the authors' lives and collectively wrote their own renditions of the stories; the entire short text of their version of "The Tempest" is included. (PB)

  8. Observations of Convective Development from Repeat Pass Radiometry during CalWaters 2015: Outlook for the TEMPEST Mission

    NASA Astrophysics Data System (ADS)

    Brown, S. T.

    2015-12-01

    The Temporal Experiment for Storms and Tropical Systems (TEMPEST), which was recently selected as a NASA Earth Ventures technology demonstration mission, uses a constellation of five CubeSats flying in formation to provide observations of developing precipitation with a temporal resolution of 5 minutes. The observations are made using small mm-wave radiometers with frequencies ranging from 90 to 183 GHz, which are sensitive to the integrated ice water path above the precipitation layer in the storm. This paper describes TEMPEST-like observations that were made with the High Altitude MMIC Sounding Radiometer (HAMSR) on the ER-2 during CalWaters 2015. HAMSR is a mm-wave airborne radiometer with 25 channels in three bands: 50, 118, and 183 GHz. During the campaign, a small isolated area of convection was identified by the ER-2 pilot, and 5 overpasses of the area were made with about 5 minutes between each pass. The HAMSR data reveal two convective cells, one of which was diminishing and one of which was developing. The mm-wave channels near the 183 GHz water vapor line clearly show the change in the vertical extent of the storm with time, a proxy for vertical velocity. These data demonstrate the potential for TEMPEST-like observations from an orbital vantage point. This paper will provide an overview of the measurements, an analysis of the observations, and offer perspectives for the TEMPEST mission.

  9. Temporal Experiment for Storms and Tropical Systems Technology Demonstration (TEMPEST-D): Risk Reduction for 6U-Class Nanosatellite Constellations

    NASA Astrophysics Data System (ADS)

    Reising, Steven C.; Gaier, Todd C.; Kummerow, Christian D.; Padmanabhan, Sharmila; Lim, Boon H.; Brown, Shannon T.; Heneghan, Cate; Chandra, Chandrasekar V.; Olson, Jon; Berg, Wesley

    2016-04-01

    TEMPEST-D will reduce the risk, cost and development time of a future constellation of 6U-Class nanosatellites to directly observe the time evolution of clouds and study the conditions that control the transition from non-precipitating to precipitating clouds using high-temporal resolution observations. TEMPEST-D provides passive millimeter-wave observations using a compact instrument that fits well within the size, weight and power (SWaP) requirements of the 6U-Class satellite architecture. TEMPEST-D is suitable for launch through NASA's CubeSat Launch Initiative (CSLI), for which it was selected in February 2015. By measuring the temporal evolution of clouds from the moment of the onset of precipitation, a TEMPEST constellation mission would improve our understanding of cloud processes and help to constrain one of the largest sources of uncertainty in climate models. Knowledge of clouds, cloud processes and precipitation is essential to our understanding of climate change. Uncertainties in the representation of key processes that govern the formation and dissipation of clouds and, in turn, control the global water and energy budgets lead to substantially different predictions of future climate in current models. TEMPEST millimeter-wave radiometers with five frequencies from 89 GHz to 182 GHz penetrate into the cloud to observe key changes as precipitation begins or ice accumulates inside the storm. The evolution of ice formation in clouds is important for climate prediction and a key factor in Earth's radiation budget. TEMPEST is designed to provide critical information on the time evolution of cloud and precipitation, yielding a first-order understanding of assumptions and uncertainties in current cloud parameterizations in general circulation models in diverse climate regimes. For a potential future one-year operational mission, five identical 6U-Class satellites would be deployed in the same orbital plane with 5- to 10-minute spacing, in an orbit similar to the International Space Station resupply missions, i.e. at ~400 km altitude and ~51° inclination. A one-year mission would capture 3 million observations of precipitation greater than 1 mm/hour rain rate, including at least 100,000 deep convective events. Passive drag-adjusting maneuvers would separate the five CubeSats in the same orbital plane by 5-10 minutes each, similar to deployment techniques to be used by NASA's Cyclone Global Navigation Satellite Systems (CYGNSS) mission.

  10. Liberation Tigers of Tamil Elam, Aum Shinrikyo, Al Qaeda, and the Syrian Crisis: Nonstate Actors Acquiring WMD

    DTIC Science & Technology

    2013-12-01

    Qaeda’s Tactics and Targets (Alexandria, VA: Tempest Publishing, 2003), 52; Jason Burke, Al-Qa’ida Casting a Shadow of Terror (London: I.B. Tauris...Aimee Ibrahim. The al-Qaeda Threat: An Analytical Guide to al Qaeda’s Tactics and Targets. Alexandria, VA: Tempest Publishing, 2003. Warrick, Joby

  11. Cleaning Up and Maintenance in the Wake of an Urban School Administration Tempest.

    ERIC Educational Resources Information Center

    Murtadha-Watts, Khuala

    2000-01-01

    Describes the context of a city corporation's attempt to initiate educational reform, focusing on two city school administrators, a newly hired Latina superintendent and an African American female assistant superintendent. Uses the metaphor of a tempest to describe the tension between the urge for rapid reform of the new superintendent and the…

  12. Numerical Solution of the Gyrokinetic Poisson Equation in TEMPEST

    NASA Astrophysics Data System (ADS)

    Dorr, Milo; Cohen, Bruce; Cohen, Ronald; Dimits, Andris; Hittinger, Jeffrey; Kerbel, Gary; Nevins, William; Rognlien, Thomas; Umansky, Maxim; Xiong, Andrew; Xu, Xueqiao

    2006-10-01

    The gyrokinetic Poisson (GKP) model in the TEMPEST continuum gyrokinetic edge plasma code yields the electrostatic potential due to the charge density of electrons and an arbitrary number of ion species, including the effects of gyroaveraging in the limit kρ ≪ 1. The TEMPEST equations are integrated as a differential algebraic system involving a nonlinear system solve via Newton-Krylov iteration. The GKP preconditioner block is inverted using a multigrid preconditioned conjugate gradient (CG) algorithm. Electrons are treated as kinetic or adiabatic. The Boltzmann relation in the adiabatic option employs flux surface averaging to maintain neutrality within field lines and is solved self-consistently with the GKP equation. A decomposition procedure circumvents the near singularity of the GKP Jacobian block that otherwise degrades CG convergence.
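
    For context, preconditioned conjugate gradient is the standard workhorse for symmetric positive-definite solves like this one. The hedged sketch below uses a Jacobi (diagonal) preconditioner standing in for TEMPEST's multigrid, applied to a toy 1D Poisson problem; it illustrates the algorithm, not the TEMPEST implementation.

      # Sketch: preconditioned conjugate gradient with a Jacobi preconditioner.
      import numpy as np

      def pcg(A, b, M_inv, tol=1e-10, maxit=500):
          x = np.zeros_like(b)
          r = b - A @ x
          z = M_inv * r                      # apply diagonal preconditioner
          p = z.copy()
          rz = r @ z
          for _ in range(maxit):
              Ap = A @ p
              alpha = rz / (p @ Ap)
              x += alpha * p
              r -= alpha * Ap
              if np.linalg.norm(r) < tol:
                  break
              z = M_inv * r
              rz_new = r @ z
              p = z + (rz_new / rz) * p
              rz = rz_new
          return x

      n = 100                                # 1D Poisson test problem: -u'' = 1
      A = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
      b = np.ones(n) / (n + 1)**2
      x = pcg(A, b, M_inv=1.0 / np.diag(A))
      print("final residual:", np.linalg.norm(b - A @ x))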

  13. Combatting Electoral Traces: The Dutch Tempest Discussion and Beyond

    NASA Astrophysics Data System (ADS)

    Pieters, Wolter

    In the Dutch e-voting debate, the crucial issue leading to the abandonment of all electronic voting machines was compromising radiation, or tempest: it would be possible to eavesdrop on the choice of the voter by capturing the radiation from the machine. Other countries, however, do not seem to be bothered by this risk. In this paper, we use actor-network theory to analyse the socio-technical origins of the Dutch tempest issue in e-voting, and we introduce concepts for discussing its implications for e-voting beyond the Netherlands. We introduce the term electoral traces to denote any physical, digital or social evidence of a voter’s choices in an election. From this perspective, we provide a framework for risk classification as well as an overview of countermeasures against such traces.

  14. Simulation of Plasma Transport in a Toroidal Annulus with TEMPEST

    NASA Astrophysics Data System (ADS)

    Xiong, Z.

    2005-10-01

    TEMPEST is an edge gyro-kinetic continuum code currently under development at LLNL to study boundary plasma transport over a region extending from inside the H-mode pedestal across the separatrix to the divertor plates. Here we report simulation results from the 4D (θ, ψ, E, μ) TEMPEST, for benchmarking purposes, in an annulus region immediately inside the separatrix of a large aspect ratio, circular cross-section tokamak. Besides the normal poloidal trapping regions, there are radially inaccessible regions at fixed poloidal angle, energy, and magnetic moment due to the radial variation of the B field. To handle such cases, a fifth-order WENO differencing scheme is used in the radial direction. The particle and heat transport coefficients are obtained for different collisional regimes and compared with neo-classical transport theory.

  15. Modeling study of deposition locations in the 291-Z plenum

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mahoney, L.A.; Glissmeyer, J.A.

    The TEMPEST (Trent and Eyler 1991) and PART5 computer codes were used to predict the probable locations of particle deposition in the suction-side plenum of the 291-Z building, the exhaust fan building for the 234-5Z, 236-Z, and 232-Z buildings in the 200 Area of the Hanford Site. The TEMPEST code provided velocity fields for the airflow through the plenum. These velocity fields were then used with TEMPEST to model near-floor particle concentrations without particle sticking (100% resuspension). The same velocity fields were also used with PART5 to model particle deposition with sticking (0% resuspension). Some of the parameters whose importance was tested were particle size, point of injection, and exhaust fan configuration.

  16. Director, Operational Test and Evaluation FY 2015 Annual Report

    DTIC Science & Technology

    2016-01-01

    review. For example, where a wind turbine project was found to have the potential to seriously degrade radar cross section testing at the Naval Air...Assessment Plan U.S. Special Operations Command Tempest Wind 2015 Assessment Plan U.S. Transportation Command Turbo Challenge 2015 Final Assessment...U.S. Air Forces Central Command 2015 May 2015 U.S. Special Operations Command-Pacific Tempest Wind 2014 May 2015 North American Aerospace Defense

  17. Correlation models for waste tank sludges and slurries

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mahoney, L.A.; Trent, D.S.

    This report presents the results of work conducted to support the TEMPEST computer modeling under the Flammable Gas Program (FGP) and to further the comprehension of the physical processes occurring in the Hanford waste tanks. The end products of this task are correlation models (sets of algorithms) that can be added to the TEMPEST computer code to improve the reliability of its simulation of the physical processes that occur in Hanford tanks. The correlation models can be used to augment not only the TEMPEST code but also other computer codes that can simulate sludge motion and flammable gas retention. This report presents the correlation models, also termed submodels, that have been developed to date. The submodel-development process is an ongoing effort designed to increase our understanding of sludge behavior and improve our ability to realistically simulate the sludge fluid characteristics that have an impact on safety analysis. The effort has employed both literature searches and data correlation to provide an encyclopedia of tank waste properties in forms that are relatively easy to use in modeling waste behavior. These properties submodels will be used in other tasks to simulate waste behavior in the tanks. Density, viscosity, yield strength, surface tension, heat capacity, thermal conductivity, salt solubility, and ammonia and water vapor pressures were compiled for solutions and suspensions of sodium nitrate and other salts (where data were available), and the data were correlated by linear regression. In addition, data for simulated Hanford waste tank supernatant were correlated to provide density, solubility, surface tension, and vapor pressure submodels for multi-component solutions containing sodium hydroxide, sodium nitrate, sodium nitrite, and sodium aluminate.

  18. Impact of Xynthia tempest on viral contamination of shellfish.

    PubMed

    Grodzki, Marco; Ollivier, Joanna; Le Saux, Jean-Claude; Piquet, Jean-Côme; Noyer, Mathilde; Le Guyader, Françoise S

    2012-05-01

    Viral contamination in oyster and mussel samples was evaluated after a massive storm with hurricane-force winds, named "Xynthia tempest," destroyed a number of sewage treatment plants in an area harboring many shellfish farms. Although up to 90% of samples were found to be contaminated 2 days after the disaster, detected viral concentrations were low. A 1-month follow-up showed a rapid decrease in the number of positive samples, even for norovirus.

  19. Evaluation of Information Leakage via Electromagnetic Emanation and Effectiveness of Tempest

    NASA Astrophysics Data System (ADS)

    Tanaka, Hidema

    It is well known that there is a relationship between electromagnetic emanation and the information being processed in IT devices such as personal computers and smart cards. By analyzing such electromagnetic emanation, an eavesdropper may be able to recover some of that information, so emanation is a real threat to information security. In this paper, we show how to estimate the amount of information that is leaked as electromagnetic emanation. We treat the space between the IT device and the receiver as a communication channel, and we define the amount of information leakage via electromagnetic emanations by that channel's capacity. Using experimental Tempest results, we give example estimates of the amount of information leakage. From the value of the channel capacity, we can calculate the amount of information per pixel in the reconstructed image. Finally, we evaluate the effectiveness of Tempest fonts generated by the Gaussian method and their security threshold.
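
    The leakage measure defined in this entry reduces to ordinary Shannon capacity: treat the emanation path as a noisy channel, compute C = B log2(1 + SNR), and divide by the pixel rate of the reconstructed display to get bits per pixel. The hedged sketch below runs that arithmetic with invented numbers, purely to show the shape of the estimate.

      # Sketch: emanation leakage as AWGN channel capacity (hypothetical numbers).
      import numpy as np

      B = 50e6                               # receiver bandwidth, Hz
      snr = 10 ** (3.0 / 10)                 # 3 dB measured signal-to-noise ratio

      C = B * np.log2(1 + snr)               # channel capacity, bits/s
      pixel_rate = 640 * 480 * 60            # pixels/s of the reconstructed display
      print(f"capacity ~ {C/1e6:.1f} Mbit/s, ~ {C/pixel_rate:.2f} bits per pixel")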

  20. Tempest: Mesoscale test case suite results and the effect of order-of-accuracy on pressure gradient force errors

    NASA Astrophysics Data System (ADS)

    Guerra, J. E.; Ullrich, P. A.

    2014-12-01

    Tempest is a new non-hydrostatic atmospheric modeling framework that allows for investigation and intercomparison of high-order numerical methods. It is composed of a dynamical core based on a finite-element formulation of arbitrary order operating on cubed-sphere and Cartesian meshes with topography. The underlying technology is briefly discussed, including a novel Hybrid Finite Element Method (HFEM) vertical coordinate coupled with high-order Implicit/Explicit (IMEX) time integration to control vertically propagating sound waves. Here, we show results from a suite of mesoscale test cases from the literature that demonstrate the accuracy, performance, and properties of Tempest on regular Cartesian meshes. The test cases include wave propagation behavior, Kelvin-Helmholtz instabilities, and flow interaction with topography. Comparisons are made to existing results, highlighting improvements made in resolving atmospheric dynamics in the vertical direction, where many existing methods are deficient.
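
    The IMEX idea mentioned here is easy to see in miniature: treat the stiff term (standing in for fast vertical acoustics) implicitly, so the time step is not limited by its rate, while the slow term stays explicit. The hedged first-order sketch below shows the principle only; Tempest itself uses high-order IMEX schemes.

      # Sketch: first-order IMEX step for du/dt = f_slow(u) - k_fast*u.
      import numpy as np

      k_fast = 1.0e3                   # stiff rate (fast wave-like term)
      f_slow = np.cos                  # slow, nonstiff forcing (illustrative)

      u, dt = 0.5, 1.0e-2              # note dt * k_fast = 10 >> 1, yet the step is stable
      for _ in range(100):
          # explicit update for f_slow, exact implicit solve for the linear stiff term
          u = (u + dt * f_slow(u)) / (1.0 + dt * k_fast)
      print("steady state (~ cos(u)/k_fast):", u)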

  1. 5D Tempest simulations of kinetic edge turbulence

    NASA Astrophysics Data System (ADS)

    Xu, X. Q.; Xiong, Z.; Cohen, B. I.; Cohen, R. H.; Dorr, M. R.; Hittinger, J. A.; Kerbel, G. D.; Nevins, W. M.; Rognlien, T. D.; Umansky, M. V.; Qin, H.

    2006-10-01

    Results are presented from the development and application of TEMPEST, a nonlinear five-dimensional (3d2v) gyrokinetic continuum code. The simulation results and theoretical analysis include studies of H-mode edge plasma neoclassical transport and turbulence in real divertor geometry and its relationship to plasma flow generation with zero external momentum input, including the important orbit-squeezing effect due to the large electric field flow-shear in the edge. In order to extend the code to 5D, we have formulated a set of fully nonlinear electrostatic gyrokinetic equations and a fully nonlinear gyrokinetic Poisson's equation which is valid for both neoclassical and turbulence simulations. Our 5D gyrokinetic code is built on the 4D version of the TEMPEST neoclassical code, extended to a fifth dimension in the binormal direction. The code is able to simulate either a full torus or a toroidal segment. Progress on performing 5D turbulence simulations will be reported.

  2. Simulation framework for electromagnetic effects in plasmonics, filter apertures, wafer scattering, grating mirrors, and nano-crystals

    NASA Astrophysics Data System (ADS)

    Ceperley, Daniel Peter

    This thesis presents a Finite-Difference Time-Domain simulation framework as well as both scientific observations and quantitative design data for emerging optical devices. These emerging applications required the development of simulation capabilities to carefully control numerical experimental conditions, isolate and quantify specific scattering processes, and overcome memory and run-time limitations on large device structures. The framework consists of a new version 7 of TEMPEST and auxiliary tools implemented as Matlab scripts. In improving the geometry representation and absorbing boundary conditions from TEMPEST v6, accuracy has been sustained while key extensions have yielded application-specific speed and accuracy improvements. These extensions include pulsed methods, PML for plasmon termination, and plasmon and scattered-field sources. The auxiliary tools include application-specific methods such as signal flow graphs of plasmon couplers, Bloch mode expansions of sub-wavelength grating waves, and back-propagation methods to characterize edge scattering in diffraction masks. Each application posed different numerical hurdles and physical questions for the simulation framework. The Terrestrial Planet Finder Coronagraph required accurate modeling of diffraction mask structures too large for solely FDTD analysis. This analysis was achieved through a combination of targeted TEMPEST simulations and a full system simulator based on thin-mask scalar diffraction models by Ball Aerospace for JPL. TEMPEST simulation showed that vertical sidewalls were the strongest scatterers, adding nearly 2λ of light per mask edge, which could be reduced by 20° undercuts. TEMPEST assessment of coupling in rapid thermal annealing was complicated by extremely sub-wavelength features and fine meshes. Near 100% coupling and low variability was confirmed even in the presence of unidirectional dense metal gates. Accurate analysis of surface plasmon coupling efficiency by small surface features required capabilities to isolate these features and cleanly illuminate them with plasmons and plane waves. These features were shown to have coupling cross-sections up to and slightly exceeding their physical size. Long run-times for TEMPEST simulations of finite-length gratings were overcome with a signal flow graph method. With these methods, a plasmon coupler with over a 10λ 100% capture length was demonstrated. Simulation of 3D nano-particle arrays utilized TEMPEST v7's pulsed methods to minimize the number of multi-day simulations. These simulations led to the discovery that interstitial plasmons were responsible for resonant absorption and transmission but not reflection. Simulation of a sub-wavelength grating mirror using pulsed sources to map resonant spectra showed that neither coupled guided waves nor coupled isolated resonators accurately described the operation. However, a new model based on vertical propagation of lateral Bloch modes with zero phase progression efficiently characterized the device and provided principles for designing similar devices at other wavelengths.

  3. Space Weathering Perspectives on Europa Amidst the Tempest of the Jupiter Magnetospheric System

    NASA Technical Reports Server (NTRS)

    Cooper, J. F.; Hartle, R. E.; Lipatov, A. S.; Sittler, E. C.; Cassidy, T. A.; Ip, W.-H.

    2010-01-01

    Europa resides within a "perfect storm" tempest of extreme external field, plasma, and energetic particle interactions with the magnetospheric system of Jupiter. Missions to Europa must survive, functionally operate, make useful measurements, and return critical science data, while also providing full context on this ocean moon's response to the extreme environment. Related general perspectives on space weathering in the solar system are applied to mission and instrument science requirements for Europa.

  4. Solar Wind Acceleration: Modeling Effects of Turbulent Heating in Open Flux Tubes

    NASA Astrophysics Data System (ADS)

    Woolsey, Lauren N.; Cranmer, Steven R.

    2014-06-01

    We present two self-consistent coronal heating models that determine the properties of the solar wind generated and accelerated in magnetic field geometries that are open to the heliosphere. These models require only the radial magnetic field profile as input. The first code, ZEPHYR (Cranmer et al. 2007) is a 1D MHD code that includes the effects of turbulent heating created by counter-propagating Alfven waves rather than relying on empirical heating functions. We present the analysis of a large grid of modeled flux tubes (> 400) and the resulting solar wind properties. From the models and results, we recreate the observed anti-correlation between wind speed at 1 AU and the so-called expansion factor, a parameterization of the magnetic field profile. We also find that our models follow the same observationally-derived relation between temperature at 1 AU and wind speed at 1 AU. We continue our analysis with a newly-developed code written in Python called TEMPEST (The Efficient Modified-Parker-Equation-Solving Tool) that runs an order of magnitude faster than ZEPHYR due to a set of simplifying relations between the input magnetic field profile and the temperature and wave reflection coefficient profiles. We present these simplifying relations as a useful result in themselves as well as the anti-correlation between wind speed and expansion factor also found with TEMPEST. Due to the nature of the algorithm TEMPEST utilizes to find solar wind solutions, we can effectively separate the two primary ways in which Alfven waves contribute to solar wind acceleration: 1) heating the surrounding gas through a turbulent cascade and 2) providing a separate source of wave pressure. We intend to make TEMPEST easily available to the public and suggest that TEMPEST can be used as a valuable tool in the forecasting of space weather, either as a stand-alone code or within an existing modeling framework.

  5. Neoclassical simulation of tokamak plasmas using the continuum gyrokinetic code TEMPEST.

    PubMed

    Xu, X Q

    2008-07-01

    We present gyrokinetic neoclassical simulations of tokamak plasmas with a self-consistent electric field using a fully nonlinear (full-f) continuum code TEMPEST in a circular geometry. A set of gyrokinetic equations are discretized on a five-dimensional computational grid in phase space. The present implementation is a method of lines approach where the phase-space derivatives are discretized with finite differences, and implicit backward differencing formulas are used to advance the system in time. The fully nonlinear Boltzmann model is used for electrons. The neoclassical electric field is obtained by solving the gyrokinetic Poisson equation with self-consistent poloidal variation. With a four-dimensional (ψ, θ, E, μ) version of the TEMPEST code, we compute the radial particle and heat fluxes, the geodesic-acoustic mode, and the development of the neoclassical electric field, which we compare with neoclassical theory using a Lorentz collision model. The present work provides a numerical scheme for self-consistently studying important dynamical aspects of neoclassical transport and electric field in toroidal magnetic fusion devices.

  6. Neoclassical simulation of tokamak plasmas using the continuum gyrokinetic code TEMPEST

    NASA Astrophysics Data System (ADS)

    Xu, X. Q.

    2008-07-01

    We present gyrokinetic neoclassical simulations of tokamak plasmas with a self-consistent electric field using a fully nonlinear (full-f) continuum code TEMPEST in a circular geometry. A set of gyrokinetic equations are discretized on a five-dimensional computational grid in phase space. The present implementation is a method of lines approach where the phase-space derivatives are discretized with finite differences, and implicit backward differencing formulas are used to advance the system in time. The fully nonlinear Boltzmann model is used for electrons. The neoclassical electric field is obtained by solving the gyrokinetic Poisson equation with self-consistent poloidal variation. With a four-dimensional (ψ, θ, E, μ) version of the TEMPEST code, we compute the radial particle and heat fluxes, the geodesic-acoustic mode, and the development of the neoclassical electric field, which we compare with neoclassical theory using a Lorentz collision model. The present work provides a numerical scheme for self-consistently studying important dynamical aspects of neoclassical transport and electric field in toroidal magnetic fusion devices.

  7. Implementation of an anomalous radial transport model for continuum kinetic edge codes

    NASA Astrophysics Data System (ADS)

    Bodi, K.; Krasheninnikov, S. I.; Cohen, R. H.; Rognlien, T. D.

    2007-11-01

    Radial plasma transport in magnetic fusion devices is often dominated by plasma turbulence rather than by neoclassical collisional transport. Continuum kinetic edge codes [such as the (2d,2v) transport version of TEMPEST and also EGK] compute the collisional transport directly, but there is a need to model the anomalous transport from turbulence for long-time transport simulations. Such a model is presented, and results are shown for its implementation in the TEMPEST gyrokinetic edge code. The model includes velocity-dependent convection and diffusion coefficients expressed as Hermite polynomials in velocity. The Hermite coefficients can be set, e.g., by specifying the ratio of particle to energy transport as in fluid transport codes. The anomalous transport terms preserve the property of no particle flux into unphysical regions of velocity space. TEMPEST simulations are presented showing the separate control of particle and energy anomalous transport, and comparisons are made with neoclassical transport also included.
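
    As a rough illustration of the velocity dependence described above, the sketch below evaluates an anomalous diffusivity expanded in (probabilists') Hermite polynomials of normalized velocity; the coefficient values are hypothetical, not taken from the paper:

```python
# Hedged sketch: a velocity-dependent anomalous diffusivity expanded in
# Hermite polynomials, D(v) = D0 * sum_n c_n He_n(v / v_t).
import numpy as np
from numpy.polynomial.hermite_e import hermeval

v_t = 1.0                          # thermal velocity (normalization)
coeffs = [1.0, 0.0, 0.3]           # He_0, He_1, He_2 weights (hypothetical)

def D_anomalous(v, D0=1.0):
    """Evaluate the Hermite-expanded diffusivity at velocity v."""
    return D0 * hermeval(v / v_t, coeffs)

v = np.linspace(-3 * v_t, 3 * v_t, 7)
print(np.round(D_anomalous(v), 3))   # larger D at high |v| with this choice
```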

  8. Development of a Task-Exposure Matrix (TEM) for Pesticide Use (TEMPEST).

    PubMed

    Dick, F D; Semple, S E; van Tongeren, M; Miller, B G; Ritchie, P; Sherriff, D; Cherrie, J W

    2010-06-01

    Pesticides have been associated with increased risks for a range of conditions, including Parkinson's disease, but identifying the agents responsible has proven challenging. Improved pesticide exposure estimates would increase the power of epidemiological studies to detect such an association if one exists. Categories of pesticide use were identified from the tasks reported in a previous community-based case-control study in Scotland. Typical pesticides used in each task in each decade were identified from published scientific and grey literature and from expert interviews, with the number of potential agents collapsed into 10 groups of pesticides. A pesticide usage database was then created, using the task list and the typical pesticide groups employed in those tasks across seven decades spanning the period 1945-2005. Information about the method of application and concentration of pesticides used in these tasks was then incorporated into the database. A list of 81 tasks involving pesticide exposure in Scotland over seven decades was generated, producing a total of 846 task-by-pesticide-by-decade combinations. A Task-Exposure Matrix for PESTicides (TEMPEST) was produced by two occupational hygienists, who quantified the likely probability and intensity of inhalation and dermal exposures for each pesticide group for a given use during each decade. TEMPEST provides a basis for assessing exposures to specific pesticide groups in Scotland covering the period 1945-2005. The methods used to develop TEMPEST could be used in a retrospective assessment of occupational exposure to pesticides for Scottish epidemiological studies or adapted for use in other countries.
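
    Structurally, a task-exposure matrix of this kind is a lookup from (task, decade) to per-pesticide-group probability and intensity scores. A minimal sketch of one possible encoding, with entirely hypothetical entries, is:

```python
# Illustrative task-exposure-matrix lookup; task names, groups, and numbers
# are hypothetical placeholders, not values from the published matrix.
TEM = {
    # (task, decade) -> {pesticide_group: (probability, intensity 0-3)}
    ("sheep dipping", 1970): {"organophosphates": (0.9, 3)},
    ("grain store fumigation", 1960): {"fumigants": (0.7, 2)},
}

def exposure(task, decade, group):
    """Return (probability, intensity), or (0, 0) if the combination is absent."""
    return TEM.get((task, decade), {}).get(group, (0, 0))

print(exposure("sheep dipping", 1970, "organophosphates"))   # (0.9, 3)
```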

  9. Turbulence-driven Coronal Heating and Improvements to Empirical Forecasting of the Solar Wind

    NASA Astrophysics Data System (ADS)

    Woolsey, Lauren N.; Cranmer, Steven R.

    2014-06-01

    Forecasting models of the solar wind often rely on simple parameterizations of the magnetic field that ignore the effects of the full magnetic field geometry. In this paper, we present the results of two solar wind prediction models that consider the full magnetic field profile and include the effects of Alfvén waves on coronal heating and wind acceleration. The one-dimensional magnetohydrodynamic code ZEPHYR self-consistently finds solar wind solutions without the need for empirical heating functions. Another one-dimensional code, introduced in this paper (The Efficient Modified-Parker-Equation-Solving Tool, TEMPEST), can act as a smaller, stand-alone code for use in forecasting pipelines. TEMPEST is written in Python and will become a publicly available library of functions that is easy to adapt and expand. We discuss important relations between the magnetic field profile and properties of the solar wind that can be used to independently validate prediction models. ZEPHYR provides the foundation and calibration for TEMPEST, and ultimately we will use these models to predict observations and explain space weather created by the bulk solar wind. We are able to reproduce with both models the general anticorrelation seen in comparisons of observed wind speed at 1 AU and the flux tube expansion factor. There is significantly less spread when comparing the results of the two models than between ZEPHYR and a traditional flux tube expansion relation. We suggest that the new code, TEMPEST, will become a valuable tool in the forecasting of space weather.

  10. TEMPEST: A computer code for three-dimensional analysis of transient fluid dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fort, J.A.

    TEMPEST (Transient Energy Momentum and Pressure Equations Solutions in Three dimensions) is a powerful tool for solving engineering problems in nuclear energy, waste processing, chemical processing, and environmental restoration because it performs 3-D, time-dependent computational fluid dynamics and heat transfer analysis. It is a family of codes with two primary versions, an N-Version (available to the public) and a T-Version (not currently available to the public). This handout discusses its capabilities, applications, numerical algorithms, development status, and availability and assistance.

  11. TEMperature Pressure ESTimation of a homogeneous boiling fuel-steel mixture in an LMFBR core. [TEMPEST code]

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pyun, J.J.; Majumdar, D.

    The paper describes TEMPEST, a simple computer program for the temperature and pressure estimation of a boiling fuel-steel pool in an LMFBR core. The time scale of interest of this program is large, of the order of ten seconds. Further, the vigorous boiling in the pool will generate a large contact area, and hence a large heat transfer, between fuel and steel. The pool is assumed to be a uniform mixture of fuel and steel, and consequently vapor production is also assumed to be uniform throughout the pool. The pool is allowed to expand in volume if there is steel melting at the walls. In this program, the total mass of liquid and vapor fuel is always kept constant, but the total steel mass in the pool may change by steel wall melting. Because of a lack of clear understanding of the physical phenomena associated with the progression of a fuel-steel mixture at high temperature, various input options have been built in to enable one to perform parametric studies. For example, the heat transfer from the pool to the surrounding steel structure may be controlled by input values for the heat transfer coefficients, or the heat transfer may be calculated by a correlation obtained from the literature. Similarly, condensation of vapor on the top wall can be specified by input values of the condensation coefficient; the program can otherwise calculate condensation according to the non-equilibrium model predictions. Melt-through rates of the surrounding steel walls can be specified by a fixed melt rate or determined as a fraction of the heat loss that goes to steel melting. The melted steel is raised to the pool temperature before it is joined with the pool material. Several applications of this program to various fuel-steel pools in the FFTF and the CRBR cores are discussed.

  12. Neoclassical Simulation of Tokamak Plasmas using Continuum Gyrokinetic Code TEMPEST

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, X Q

    We present gyrokinetic neoclassical simulations of tokamak plasmas with a self-consistent electric field, for the first time, using a fully nonlinear (full-f) continuum code TEMPEST in a circular geometry. A set of gyrokinetic equations is discretized on a five-dimensional computational grid in phase space. The present implementation is a Method of Lines approach in which the phase-space derivatives are discretized with finite differences and implicit backward differencing formulas are used to advance the system in time. The fully nonlinear Boltzmann model is used for electrons. The neoclassical electric field is obtained by solving the gyrokinetic Poisson equation with self-consistent poloidal variation. With our 4D (ψ, θ, ε, μ) version of the TEMPEST code we compute the radial particle and heat fluxes, the Geodesic-Acoustic Mode (GAM), and the development of the neoclassical electric field, which we compare with neoclassical theory using a Lorentz collision model. The present work provides a numerical scheme and a new capability for self-consistently studying important aspects of neoclassical transport and rotations in toroidal magnetic fusion devices.

  13. Ground based research in microgravity materials processing

    NASA Technical Reports Server (NTRS)

    Workman, Gary L.; Rathz, Tom

    1994-01-01

    The core activities performed during this period have concerned tracking the TEMPEST experiments on the shuttle with drops of Zr, Ni, and Nb alloys. In particular, many Zr drops are being made to better define the recalescence characteristics of that system so that accurate comparisons of the drop tube results with TEMPEST can be made. A new liner with minimal reflectivity has been inserted into the drop tube to improve the recalescence measurements of the falling drops. A first installation was carried out to take the geometric measurements that ensure a proper fit. The stovepipe sections are currently in the shop at MSFC being painted with low-reflectivity black paint. Work has also continued on setting up the MEL apparatus obtained from Oak Ridge in the downstairs laboratory at the Drop Tube Facilities. Ground-based experiments on the same metals being processed on TEMPEST are planned for the MEL. The flight schedules for the KC-135 experiments are still to be determined.

  14. The next-generation ESL continuum gyrokinetic edge code

    NASA Astrophysics Data System (ADS)

    Cohen, R.; Dorr, M.; Hittinger, J.; Rognlien, T.; Colella, P.; Martin, D.

    2009-05-01

    The Edge Simulation Laboratory (ESL) project is developing continuum-based approaches to kinetic simulation of edge plasmas. A new code is being developed, based on a conservative formulation and fourth-order discretization of full-f gyrokinetic equations in parallel-velocity, magnetic-moment coordinates. The code exploits mapped multiblock grids to deal with the geometric complexities of the edge region, and utilizes a new flux limiter [P. Colella and M.D. Sekora, JCP 227, 7069 (2008)] to suppress unphysical oscillations about discontinuities while maintaining high-order accuracy elsewhere. The code is just becoming operational; we will report initial tests for neoclassical orbit calculations in closed-flux surface and limiter (closed plus open flux surfaces) geometry. It is anticipated that the algorithmic refinements in the new code will address the slow numerical instability that was observed in some long simulations with the existing TEMPEST code. We will also discuss the status and plans for physics enhancements to the new code.
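
    The Colella-Sekora limiter cited above is specific to that fourth-order method, but the general idea of limiting to suppress oscillations at discontinuities can be sketched with the much simpler minmod limiter in a 1D finite-volume advection update:

```python
# Minmod-limited MUSCL step for u_t + u_x = 0 on a periodic grid: the limiter
# keeps the update free of new over/undershoots at the discontinuity.
import numpy as np

def minmod(a, b):
    """Zero if slopes disagree in sign; otherwise the smaller magnitude."""
    return np.where(a * b > 0, np.where(np.abs(a) < np.abs(b), a, b), 0.0)

def advect_step(u, cfl=0.5):
    left = u - np.roll(u, 1)
    right = np.roll(u, -1) - u
    slope = minmod(left, right)
    # limited reconstruction at the right face of each cell, upwinded
    face = u + 0.5 * (1.0 - cfl) * slope
    return u - cfl * (face - np.roll(face, 1))

u = np.where((np.arange(200) > 50) & (np.arange(200) < 100), 1.0, 0.0)
for _ in range(100):
    u = advect_step(u)
print(u.min(), u.max())   # stays within [0, 1]: no spurious oscillations
```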

  15. Neoclassical orbit calculations with a full-f code for tokamak edge plasmas

    NASA Astrophysics Data System (ADS)

    Rognlien, T. D.; Cohen, R. H.; Dorr, M.; Hittinger, J.; Xu, X. Q.; Colella, P.; Martin, D.

    2008-11-01

    Ion distribution function modifications are considered for the case of neoclassical orbit widths comparable to plasma radial-gradient scale lengths. Implementation of proper boundary conditions at divertor plates in the continuum TEMPEST code, including the effect of drifts in determining the direction of total flow, enables such calculations in single-null divertor geometry, with and without an electrostatic potential. The resultant poloidal asymmetries in densities, temperatures, and flows are discussed. For long-time simulations, a slow numerical instability develops, even in simplified (circular) geometry with no end loss, which aids in identifying the mixed treatment of the parallel and radial convection terms as the cause. The new Edge Simulation Laboratory code, which is expected to be operational soon, has algorithmic refinements that should address the instability. We will present any available results from the new code on this problem, as well as geodesic acoustic mode tests.

  16. Mourning in the psychoanalytic situation and in Shakespeare's The Tempest.

    PubMed

    Houlding, Sybil

    2015-01-01

    Recognizing that mourning builds psychic structure, the author highlights the ubiquitous and essential nature of mourning in the psychoanalytic situation. Reality testing is intimately connected to mourning and is the warp on which psychic structure is woven in the analytic situation. Reality testing necessarily involves opportunities for mourning and thus will be present in every analytic hour. The confrontation with reality is the basis for all processes of mourning, or for creating defenses against this painful experience. The author views mourning as fundamentally a transformational process, and Shakespeare's The Tempest is used to illustrate this aspect of mourning. © 2015 The Psychoanalytic Quarterly, Inc.

  17. Pressurized thermal shock: TEMPEST computer code simulation of thermal mixing in the cold leg and downcomer of a pressurized water reactor. [Creare 61 and 64]

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Eyler, L.L.; Trent, D.S.

    The TEMPEST computer program was used to simulate fluid and thermal mixing in the cold leg and downcomer of a pressurized water reactor under emergency core cooling high-pressure injection (HPI), which is of concern for the pressurized thermal shock (PTS) problem. Application of the code was made in performing an analysis simulation of a full-scale Westinghouse three-loop plant design cold leg and downcomer. Verification/assessment of the code was performed and analysis procedures developed using data from Creare 1/5-scale experimental tests. Results of three simulations are presented. The first is a no-loop-flow case with high-velocity, low-negative-buoyancy HPI in a 1/5-scale model of a cold leg and downcomer. The second is a no-loop-flow case with low-velocity, high-negative-density (modeled with salt water) injection in a 1/5-scale model. Comparison of TEMPEST code predictions with experimental data for these two cases shows good agreement. The third simulation is a three-dimensional model of one loop of a full-scale Westinghouse three-loop plant design. Included in this latter simulation are loop components extending from the steam generator to the reactor vessel and a one-third sector of the vessel downcomer and lower plenum. No data were available for this case. For the Westinghouse plant simulation, thermally coupled conduction heat transfer in structural materials is included. The cold leg pipe and fluid mixing volumes of the primary pump, the stillwell, and the riser to the steam generator are included in the model. In the reactor vessel, the thermal shield, pressure vessel cladding, and pressure vessel wall are thermally coupled to the fluid and thermal mixing in the downcomer. The inlet plenum mixing volume is included in the model. A 10-min (real time) transient beginning at the initiation of HPI is computed to determine temperatures at the beltline of the pressure vessel wall.

  18. Numerical simulation to determine the effects of incident wind shear and turbulence level on the flow around a building

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Y.Q.; Huber, A.H.; Arya, S.P.S.

    The effects of incident shear and turbulence on flow around a cubical building are being investigated by a turbulent kinetic energy/dissipation (k-ε) model (TEMPEST). The numerical simulations demonstrate significant effects due to the differences in the incident flow. The addition of upstream turbulence and shear results in a reduced size of the cavity directly behind the building. The accuracy of the numerical simulations is verified by comparing the predicted mean flow fields with the available wind-tunnel measurements of Castro and Robins (1977). Comparing their results with the experimental data, the authors show that the TEMPEST model can reasonably simulate the mean flow.
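
    For reference, the k-ε closure underlying this class of model has the standard textbook form below (Launder-Spalding constants C_μ = 0.09, σ_k = 1.0, σ_ε = 1.3, C_1ε = 1.44, C_2ε = 1.92; the TEMPEST implementation may differ in detail):

```latex
% Standard k-epsilon closure (textbook form):
\nu_t = C_\mu \frac{k^2}{\varepsilon}, \qquad
\frac{Dk}{Dt} = \nabla \cdot \left[ \left( \nu + \frac{\nu_t}{\sigma_k} \right) \nabla k \right]
                + P_k - \varepsilon, \qquad
\frac{D\varepsilon}{Dt} = \nabla \cdot \left[ \left( \nu + \frac{\nu_t}{\sigma_\varepsilon} \right) \nabla \varepsilon \right]
                + C_{1\varepsilon} \frac{\varepsilon}{k} P_k - C_{2\varepsilon} \frac{\varepsilon^2}{k}
```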

  19. Strategy Plan A Methodology to Predict the Uniformity of Double-Shell Tank Waste Slurries Based on Mixing Pump Operation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    J.A. Bamberger; L.M. Liljegren; P.S. Lowery

    This document presents an analysis of the mechanisms influencing mixing within double-shell slurry tanks. A research program to characterize mixing of slurries within tanks has been proposed. The research program presents a combined experimental and computational approach to produce correlations describing the tank slurry concentration profile (and therefore uniformity) as a function of mixer pump operating conditions. The TEMPEST computer code was used to simulate both a full-scale (prototype) and scaled (model) double-shell waste tank to predict flow patterns resulting from a stationary jet centered in the tank. The simulation results were used to evaluate flow patterns in the tank and to determine whether flow patterns are similar between the full-scale prototype and an existing 1/12-scale model tank. The flow patterns were sufficiently similar to recommend conducting scoping experiments at 1/12-scale. Also, TEMPEST-modeled velocity profiles of the near-floor jet were compared with experimental measurements, with good agreement. Reported values of physical properties of double-shell tank slurries were analyzed to evaluate the range of properties appropriate for conducting scaled experiments. One-twelfth scale scoping experiments are recommended to confirm the prioritization of the dimensionless groups (gravitational settling, Froude, and Reynolds numbers) that affect slurry suspension in the tank. Two of the proposed 1/12-scale test conditions were modeled using the TEMPEST computer code to observe the anticipated flow fields. This information will be used to guide selection of sampling probe locations. Additional computer modeling is being conducted to model a particulate-laden, rotating jet centered in the tank. The results of this modeling effort will be compared to the scaled experimental data to quantify the agreement between the code and the 1/12-scale experiment. The scoping experiment results will guide selection of parameters to be varied in the follow-on experiments. Data from the follow-on experiments will be used to develop correlations to describe the slurry concentration profile as a function of mixing pump operating conditions. These data will also be used to further evaluate the computer model applications. If the agreement between the experimental data and the code predictions is good, the computer code will be recommended for use to predict slurry uniformity in the tanks under various operating conditions. If the agreement between the code predictions and experimental results is not good, the experimental data correlations will be used to predict slurry uniformity in the tanks within the range of correlation applicability.

  20. Multiphase, multi-electrode Joule heat computations for glass melter and in situ vitrification simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lowery, P.S.; Lessor, D.L.

    Waste glass melter and in situ vitrification (ISV) processes represent the combination of electrical, thermal, and fluid flow phenomena to produce a stable waste-form product. Computational modeling of the thermal and fluid flow aspects of these processes provides a useful tool for assessing the potential performance of proposed system designs. These computations can be performed at a fraction of the cost of experiments. Consequently, computational modeling of vitrification systems can also provide an economical means for assessing the suitability of a proposed process application. The computational model described in this paper employs finite difference representations of the basic continuum conservation laws governing the thermal, fluid flow, and electrical aspects of the vitrification process, i.e., conservation of mass, momentum, energy, and electrical charge. The resulting code is a member of the TEMPEST family of codes developed at the Pacific Northwest Laboratory (operated by Battelle for the US Department of Energy). This paper provides an overview of the numerical approach employed in TEMPEST. In addition, results from several TEMPEST simulations of sample waste glass melter and ISV processes are provided to illustrate the insights to be gained from computational modeling of these processes. 3 refs., 13 figs.

  1. A psychoanalytic study of Edward de Vere's The Tempest.

    PubMed

    Waugaman, Richard M

    2009-01-01

    There is now abundant evidence that Freud was correct in believing Edward de Vere (1550-1604) wrote under the pseudonym "William Shakespeare." One common reaction is "What difference does it make?" I address that question by examining many significant connections between de Vere's life and The Tempest. Such studies promise to bring our understanding of Shakespeare's works back into line with our usual psychoanalytic approach to literature, which examines how a great writer's imagination weaves a new creation out of the threads of his or her life experiences. One source of the intense controversy about de Vere's authorship is our idealization of the traditional author, about whom we know so little that, as Freud noted, we can imagine his personality was as fine as his works.

  2. Four-Dimensional Continuum Gyrokinetic Code: Neoclassical Simulation of Fusion Edge Plasmas

    NASA Astrophysics Data System (ADS)

    Xu, X. Q.

    2005-10-01

    We are developing a continuum gyrokinetic code, TEMPEST, to simulate edge plasmas. Our code represents velocity space via a grid in equilibrium energy and magnetic moment variables, and configuration space via poloidal magnetic flux and poloidal angle. The geometry is that of a fully diverted tokamak (single or double null) and so includes boundary conditions for both closed magnetic flux surfaces and open field lines. The 4-dimensional code includes kinetic electrons and ions, and electrostatic field-solver options, and simulates neoclassical transport. The present implementation is a Method of Lines approach where spatial finite-differences (higher order upwinding) and implicit time advancement are used. We present results of initial verification and validation studies: transition from collisional to collisionless limits of parallel end-loss in the scrape-off layer, self-consistent electric field, and the effect of the real X-point geometry and edge plasma conditions on the standard neoclassical theory, including a comparison of our 4D code with other kinetic neoclassical codes and experiments.

  3. TEMPEST simulations of the neoclassical transport in a single-null tokamak geometry

    NASA Astrophysics Data System (ADS)

    Xu, X. Q.; Cohen, R. H.; Rognlien, T. D.

    2009-05-01

    TEMPEST simulations were carried out for plasma transport and flow dynamics in a single-null tokamak geometry. The ion distribution at the core radial boundary is a fixed Maxwellian F_M with N0 = N(ψ0) and Ti0 = Ti(ψ0) = 300 eV, and the exterior radial boundary uses a Neumann condition, ∂Fi(ψ,θ,ε,μ)/∂ψ|ψw = 0, throughout a simulation. Given these boundary conditions and initial profiles, the interior plasma in the simulations should evolve into a neoclassical steady state. A volume source term in the private-flux region is included, representing ionization in that region, to achieve the neoclassical steady state. A series of TEMPEST simulations is conducted to investigate the scaling characteristics of the neoclassical transport and flow as a function of ν*i via a density scan. Here ν*i is the effective collision frequency, defined by ν*i = ε^(-3/2) νii √2 q R0/vTi, where νii is the ion-ion collision frequency, ε is the inverse aspect ratio, and vTi is the ion thermal velocity. Simulation results show significant poloidal variation of the density and ion temperature profiles due to the end-loss mechanism at the divertor plates. Each region (edge, scrape-off layer, and private flux) reaches its dynamical steady state on its own time scale, set by the different physical processes. The impact of the self-consistent electric field on transport and flow will be presented.
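
    The collisionality definition above transcribes directly into code; a small helper, with hypothetical input values and SI-consistent units assumed:

```python
# Direct transcription of nu*_i = eps^(-3/2) * nu_ii * sqrt(2) * q * R0 / v_Ti;
# the numbers below are placeholders, not values from the paper.
import math

def nu_star_i(nu_ii, q, R0, v_Ti, eps):
    """Effective ion collisionality for the density scan described above."""
    return eps ** -1.5 * nu_ii * math.sqrt(2.0) * q * R0 / v_Ti

print(nu_star_i(nu_ii=1.0e3, q=3.0, R0=1.7, v_Ti=1.7e5, eps=0.3))
```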

  4. Efficient Computation of Atmospheric Flows with Tempest: Development of Next-Generation Climate and Weather Prediction Algorithms at Non-Hydrostatic Scales

    NASA Astrophysics Data System (ADS)

    Guerra, J. E.; Ullrich, P. A.

    2015-12-01

    Tempest is a next-generation global climate and weather simulation platform designed to allow experimentation with numerical methods at very high spatial resolutions. The atmospheric fluid equations are discretized by continuous/discontinuous finite elements in the horizontal and by a staggered nodal finite element method (SNFEM) in the vertical, coupled with implicit/explicit time integration. At global horizontal resolutions below 10 km, many important questions remain on optimal techniques for solving the fluid equations. We present results from a suite of meso-scale test cases to validate the performance of the SNFEM applied in the vertical. Internal gravity wave, mountain wave, convective, and Cartesian baroclinic instability tests will be shown at various vertical orders of accuracy and compared with known results.
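
    The implicit/explicit (IMEX) coupling mentioned here is easiest to see on a scalar model problem: treat the stiff term implicitly and the slow forcing explicitly. The first-order toy below (not Tempest's actual integrators, which are higher order) shows why the splitting allows time steps far larger than the stiff scale:

```python
# IMEX Euler for y' = -lam*y + cos(omega*t): stiff relaxation implicit,
# slow forcing explicit.
import numpy as np

lam = 1.0e4          # stiff relaxation rate (treated implicitly)
omega = 2.0          # slow forcing frequency (treated explicitly)

def imex_euler(y0, dt, nsteps):
    y, t = y0, 0.0
    for _ in range(nsteps):
        explicit = np.cos(omega * t)            # non-stiff piece, old time level
        # implicit Euler for the stiff piece:
        y = (y + dt * explicit) / (1.0 + dt * lam)
        t += dt
    return y

# stable even with dt >> 1/lam, which would blow up a fully explicit method
print(imex_euler(y0=1.0, dt=1e-2, nsteps=1000))
```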

  5. Approach to Shakespeare.

    ERIC Educational Resources Information Center

    Bannerman, Andrew

    1969-01-01

    For an introduction to Shakespeare's "Tempest," dramatic interest and tension were created in the classroom through taped interviews with survivors of present-day sea disasters, student improvisations of scenes, music, and historical accounts of shipwrecks. (MF)

  6. Efficient Computation of Atmospheric Flows with Tempest: Validation of Next-Generation Climate and Weather Prediction Algorithms at Non-Hydrostatic Scales

    NASA Astrophysics Data System (ADS)

    Guerra, Jorge; Ullrich, Paul

    2016-04-01

    Tempest is a next-generation global climate and weather simulation platform designed to allow experimentation with numerical methods for a wide range of spatial resolutions. The atmospheric fluid equations are discretized by continuous/discontinuous finite elements in the horizontal and by a staggered nodal finite element method (SNFEM) in the vertical, coupled with implicit/explicit time integration. At horizontal resolutions below 10 km, many important questions remain on optimal techniques for solving the fluid equations. We present results from a suite of idealized test cases to validate the performance of the SNFEM applied in the vertical with an emphasis on flow features and dynamic behavior. Internal gravity wave, mountain wave, convective bubble, and Cartesian baroclinic instability tests will be shown at various vertical orders of accuracy and compared with known results.

  7. Of Tales and Tempests.

    ERIC Educational Resources Information Center

    Bottoms, Janet

    1996-01-01

    Examines the prose versions of Shakespeare plays written for children by Charles and Mary Lamb, Bernard Miles, and Leon Garfield. Suggests that the content ranges far from Shakespeare's originals and promotes values that should be questioned critically. (TB)

  8. Serenity Above, Tempests Below

    NASA Image and Video Library

    2006-01-10

    Whiffs of cloud dance in Saturn's atmosphere, while the dim crescent of Rhea (1,528 kilometers, or 949 miles, across) hangs in the distance. The dark ringplane cuts a diagonal across the top left corner of this view.

  9. Calculation of ion distribution functions and neoclassical transport in the edge of single-null divertor tokamaks

    NASA Astrophysics Data System (ADS)

    Rognlien, T. D.; Cohen, R. H.; Xu, X. Q.

    2007-11-01

    The ion distribution function in the H-mode pedestal region and outward across the magnetic separatrix is expected to have a substantial non-Maxwellian character owing to the large banana orbits and steep gradients in temperature and density. The 4D (2r,2v) version of the TEMPEST continuum gyrokinetic code is used with a Coulomb collision model to calculate the ion distribution in a single-null tokamak geometry throughout the pedestal/scrape-off-layer regions. The mean density, parallel velocity, and energy radial profiles are shown at various poloidal locations. The collisions cause neoclassical energy transport through the pedestal that is then lost to the divertor plates along the open field lines outside the separatrix. The resulting heat flux profiles at the inner and outer divertor plates are presented and discussed, including asymmetries that depend on the B-field direction. Of particular focus is the effect on ion profiles and fluxes of a radial electric field exhibiting a deep well just inside the separatrix, which reduces the width of the banana orbits by the well-known squeezing effect.

  10. Turbulence-driven coronal heating and improvements to empirical forecasting of the solar wind

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Woolsey, Lauren N.; Cranmer, Steven R.

    Forecasting models of the solar wind often rely on simple parameterizations of the magnetic field that ignore the effects of the full magnetic field geometry. In this paper, we present the results of two solar wind prediction models that consider the full magnetic field profile and include the effects of Alfvén waves on coronal heating and wind acceleration. The one-dimensional magnetohydrodynamic code ZEPHYR self-consistently finds solar wind solutions without the need for empirical heating functions. Another one-dimensional code, introduced in this paper (The Efficient Modified-Parker-Equation-Solving Tool, TEMPEST), can act as a smaller, stand-alone code for use in forecasting pipelines. TEMPEST is written in Python and will become a publicly available library of functions that is easy to adapt and expand. We discuss important relations between the magnetic field profile and properties of the solar wind that can be used to independently validate prediction models. ZEPHYR provides the foundation and calibration for TEMPEST, and ultimately we will use these models to predict observations and explain space weather created by the bulk solar wind. We are able to reproduce with both models the general anticorrelation seen in comparisons of observed wind speed at 1 AU and the flux tube expansion factor. There is significantly less spread when comparing the results of the two models than between ZEPHYR and a traditional flux tube expansion relation. We suggest that the new code, TEMPEST, will become a valuable tool in the forecasting of space weather.

  11. A photometric search for transiting planets

    NASA Astrophysics Data System (ADS)

    Baliber, Nairn Reese

    In the decade since the discovery of the first planet orbiting a main-sequence star other than the Sun, more than 160 planets have been detected in orbit around other stars, most of them discovered by measuring the velocity of the reflexive motion of their parent stars caused by the gravitational pull of the planets. These discoveries produced a population of planets much different from those in our Solar System and created interest in other methods of detecting such planets. One such method is searching for transits, the slight photometric dimming of a star caused by a close-orbiting, Jupiter-sized planet passing between the star and our line of sight once per orbit. We report results from TeMPEST, the Texas, McDonald Photometric Extrasolar Search for Transits, a transit survey conducted with the McDonald Observatory 0.76 m Prime Focus Corrector (PFC). We monitored five fields of stars in the plane of the Milky Way over the course of two and a half years. We created a photometry pipeline to perform high-precision differential photometry on all of the images, and used a software detection algorithm to detect transit signals in the light curves. Although no transits were found, we calculated our detection probability by determining the fraction of the stars monitored by TeMPEST which were suitable to show transits, measuring the probability of detecting transit signals based on the temporal coverage of our fields, and measuring our detection efficiency by inserting false transits into TeMPEST data to see what fraction could be recovered by our automatic detection software. We conclude that over our entire data set we generated an effective sample of 2660 stars, a sample in which any star showing a transit would have been detected. We found no convincing transits in our data, but current statistics from radial velocity surveys indicate that only about one in 1300 of these stars should show transits. These numbers are consistent with the lack of transits produced by TeMPEST and the small number of transits found by other surveys. We therefore discuss methods by which a transit survey's effective sample may be increased to make such surveys productive in a reasonable amount of time.
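
    The false-transit injection used to measure detection efficiency amounts to multiplying a box-shaped dip into a light curve. A minimal sketch, with illustrative depth and duration values rather than TeMPEST's, is:

```python
# Inject a box-shaped transit into a synthetic light curve; flat star plus
# Gaussian photometric noise stands in for the real survey data.
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(0.0, 60.0, 0.01)                   # days
flux = 1.0 + 3e-3 * rng.standard_normal(t.size)

def inject_transit(flux, t, period, t0, duration, depth):
    """Multiply a box-shaped transit of given depth into the light curve."""
    phase = (t - t0) % period
    in_transit = (phase < duration / 2) | (phase > period - duration / 2)
    out = flux.copy()
    out[in_transit] *= 1.0 - depth
    return out

injected = inject_transit(flux, t, period=3.5, t0=1.2, duration=0.12, depth=0.01)
print(injected[((t - 1.2) % 3.5) < 0.05].mean())   # ~0.99: the dip is visible
```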

  12. Teaching Modules for Nine Plays by Shakespeare.

    ERIC Educational Resources Information Center

    Smith, Denzell

    The nine modules presented in this paper are designed to guide students in a one-semester Shakespeare Course through the reading of three Shakespearean tragedies ("Hamlet,""Othello," and "Macbeth"), three comedies ("Midsummer Night's Dream,""Merchant of Venice," and "The Tempest"), and…

  13. Word Magic: Shakespeare's Rhetoric for Gifted Students.

    ERIC Educational Resources Information Center

    Kester, Ellen S.

    Intended for teachers of gifted students in grades 4-12, the curriculum uses six of Shakespeare's comedies ("The Taming of the Shrew,""The Tempest,""Twelfth Night,""The Comedy of Errors,""As You Like It," and "A Midsummer Night's Dream") as materials for nurturing intellectual and…

  14. 'Towers in the Tempest' Computer Animation Submission

    NASA Technical Reports Server (NTRS)

    Shirah, Greg

    2008-01-01

    The following describes a computer animation that has been submitted to the ACM/SIGGRAPH 2008 computer graphics conference: 'Towers in the Tempest' clearly communicates recent scientific research into how hurricanes intensify. This intensification can be caused by a phenomenon called a 'hot tower.' For the first time, research meteorologists have run complex atmospheric simulations at a very fine temporal resolution of 3 minutes. Combining this simulation data with satellite observations enables detailed study of 'hot towers.' The science of 'hot towers' is described using satellite observation data, conceptual illustrations, and volumetric atmospheric simulation data. The movie starts by showing a 'hot tower' in Hurricane Bonnie, observed in three-dimensional precipitation radar data from NASA's Tropical Rainfall Measuring Mission (TRMM) spacecraft. Next, the dynamics of a hurricane and the formation of 'hot towers' are briefly explained using conceptual illustrations. Finally, volumetric cloud, wind, and vorticity data from a supercomputer simulation of Hurricane Bonnie are shown using volume-rendering techniques such as ray marching.
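
    Volume ray marching itself is simple to sketch: step along each ray through a density field, accumulating emission and opacity front to back. The toy below substitutes a synthetic Gaussian blob for the hurricane data:

```python
# Front-to-back volume ray marching over a synthetic density field.
import numpy as np

def density(p):
    """Synthetic cloud: a Gaussian blob centered at the origin."""
    return np.exp(-np.dot(p, p) / 0.5)

def march_ray(origin, direction, n_steps=128, step=0.05, sigma=4.0):
    color, transmittance = 0.0, 1.0
    p = np.asarray(origin, dtype=float)
    d = np.asarray(direction, dtype=float)
    d /= np.linalg.norm(d)
    for _ in range(n_steps):
        rho = density(p)
        alpha = 1.0 - np.exp(-sigma * rho * step)   # opacity of this segment
        color += transmittance * alpha * rho        # emission weighted by visibility
        transmittance *= 1.0 - alpha
        if transmittance < 1e-3:                    # early ray termination
            break
        p = p + step * d
    return color

print(march_ray(origin=[-2.0, 0.0, 0.0], direction=[1.0, 0.0, 0.0]))
```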

  15. Davies the Manipulator of "The Salterton Trilogy."

    ERIC Educational Resources Information Center

    Tedford, Barbara W.

    Some critics of Robertson Davies' three novels that comprise the Salterton trilogy, "Tempest-Tost" (1951), "Leaven of Malice" (1954), and "A Mixture of Frailties" (1958) complain of their creaky novelistic machinery, suggesting that they merely show an essayist, or journalist, becoming a novelist. These three novels,…

  16. Possible Pasts: Historiography and Legitimation in "Henry VIII."

    ERIC Educational Resources Information Center

    Kamps, Ivo

    1996-01-01

    Aims to rehabilitate the reputation of Shakespeare's "Henry VIII" and emphasizes its potential usefulness in the classroom by reconsidering it in the context of Renaissance history writing. Shows how "Henry VIII" can be taught as a commentary on or seen as a continuation of incipient themes in "The Tempest" and…

  17. SIMULATING THE EFFECTS OF UPSTREAM TURBULENCE ON DISPERSION AROUND A BUILDING

    EPA Science Inventory

    The effects of high turbulence versus no turbulence in a sheared boundary-layer flow approaching a building are being investigated by a turbulent kinetic energy/dissipation (k-e) model (TEMPEST). The effects on both the mean flow and the concentration field around a cubical build...

  18. Tempest, Arizona: Criminal Epistemologies and the Rhetorical Possibilities of Raza Studies

    ERIC Educational Resources Information Center

    Serna, Elias

    2013-01-01

    This essay looks at Ethnic Studies activism in Arizona through a rhetorical lens in order to highlight epistemological aspects of activities such as a high school Chicano Literature class, Roberto "Dr. Cintli" Rodriguez's journalism, and student activism to defend the Mexican-American Studies Department. Taking rhetoric's premise that…

  19. A Decade of Inquiry: Tempest in a Teacup?

    ERIC Educational Resources Information Center

    Merwin, William

    This review covers research literature concerning the inquiry method reported by "Social Education" from 1960 to 1968. Studies were selected which dealt with either the learning outcomes or adaptability aspects of inquiry. It is cautiously concluded that under certain conditions inquiry is at least as effective as more traditional…

  20. Tinkering Change vs. System Change

    ERIC Educational Resources Information Center

    Hubbard, Russ

    2009-01-01

    In this article, the author makes a distinction between two kinds of change: tinkering change and systemic change. Tinkering change includes reforms intended to address a specific deficiency or practice. Such tinkering change can be contrasted to what Shakespeare termed "sea change" in "The Tempest" ("a sea change into something rich and strange")…

  1. Shakespeare in an Elementary School Setting.

    ERIC Educational Resources Information Center

    Wood, Robin H.

    1997-01-01

    For almost 50 years, the 8th-grade graduating class at a New Jersey private elementary school has presented an expertly produced Shakespeare play, alternating between "The Tempest" and "A Midsummer Night's Dream." The whole school becomes involved, from younger kids reading story versions of the plays, to older kids making…

  2. Discipline Issues: Is There a Tempest Brewing in B.C. Schools?

    ERIC Educational Resources Information Center

    Fraser, Stephen R.

    1987-01-01

    Educational policy in British Columbia does not distinguish between special needs and regular class students in relation to discipline practices. Although Canadian courts have generally upheld the rights of school boards rather than the unspecified rights of special needs children, a recent court case suggests the possibility of change. (JW)

  3. NUMERICAL SIMULATION TO DETERMINE THE EFFECTS OF INCIDENT WIND SHEAR AND TURBULENCE LEVEL ON THE FLOW AROUND A BUILDING

    EPA Science Inventory

    The effects of incident shear and turbulence on flow around a cubical building are being investigated by a turbulent kinetic energy dissipation (k-e) model (TEMPEST). The numerical simulations demonstrate significant effects due to the differences in the incident flow. The addition...

  4. Tempests into Rainbows. Managing Turbulence.

    ERIC Educational Resources Information Center

    Fleming, Robben W.

    This autobiography recounts personal experiences as a college professor and administrator (President of the University of Michigan) during the 1960s to the early 1980s. The 17 chapters discuss early years growing up in Illinois; college years; employment with the federal government; enlistment in the Army; the war in Germany; the end of the war,…

  5. A Triple Tropical Tempest Train: Karina, Lowell, Marie

    NASA Image and Video Library

    2014-08-22

    NASA and NOAA satellites are studying the triple tropical tempests that are now romping through the Eastern Pacific Ocean. NOAA's GOES-West satellite captured Tropical Storm Karina, Tropical Storm Lowell and newly formed Tropical Storm Marie on August 22. NOAA's GOES-West satellite captured all three storms in an infrared image at 0900 UTC (5 a.m. EDT); Tropical Storm Lowell clearly dwarfs Karina to its west and Marie to the east. The infrared image was created at NASA/NOAA's GOES Project at the NASA Goddard Space Flight Center in Greenbelt, Maryland. For more information about Lowell, visit: www.nasa.gov/content/goddard/12e-eastern-pacific-ocean/ For more information about Karina, visit: www.nasa.gov/content/goddard/karina-eastern-pacific/ Rob Gutro, NASA's Goddard Space Flight Center

  6. Fatal Amusements: Contemplating the Tempest of Contemporary Media and American Culture

    ERIC Educational Resources Information Center

    Strate, Lance

    2016-01-01

    Our use of the electronic media to conduct serious discourse raises the question of whether "we are amusing ourselves to death," as Neil Postman argued. The approach known as "media ecology," the study of media as environments, which emphasizes the need to understand context and find balance, provides a basis for the analysis…

  7. Transforming Conceptual Space into a Creative Learning Place: Crossing a Threshold

    ERIC Educational Resources Information Center

    Moffat, Kirstine; McKim, Anne

    2016-01-01

    This article describes, discusses and reflects on a teaching and learning experiment in a first year BA course. Students were led out of the lecture room to a different space, the New Place Theatre. While this move out of the usual teaching space was appropriate for the text being studied, William Shakespeare's "The Tempest", the…

  8. Watered by Tempests: Hurricanes in the Cultural Fabric of the United Houma Nation

    ERIC Educational Resources Information Center

    D'Oney, J. Daniel

    2008-01-01

    Hurricanes Katrina and Rita affected hundreds of thousands in southern Louisiana. To say that they touched people of every stripe and color dramatically is a gross understatement. Aside from the loss of life and property damage, families were uprooted, traditions disrupted, and one of the largest migrations in American history forced on a state…

  9. Monstrous No More: How Shakespeare's Caliban and the Community College Student Aspire Together

    ERIC Educational Resources Information Center

    Gold Wright, Jill Y.

    2006-01-01

    Many students enter classes like the Shakespeare character Caliban, knowing books to be powerful but feeling eluded by them, unable to access their knowledge. Author Jill Wright shares new-found inspiration and insight she discovered while co-directing Act III, Scene ii of William Shakespeare's "The Tempest" and suddenly realized a…

  10. "Score Choice": A Tempest in a Teapot?

    ERIC Educational Resources Information Center

    Hoover, Eric

    2009-01-01

    A new option that allows students to choose which of their test scores to send to colleges has generated renewed criticism of the College Board. College Board officials tout the option, called Score Choice, as a way to ease test taker anxiety. Some prominent admissions officials have publicly described Score Choice as a sales tactic that will…

  11. Television and the Crisis in the Humanities.

    ERIC Educational Resources Information Center

    Burns, Gary

    It is indeed a problem, perhaps even a crisis, that many Americans are ignorant of "The Tempest," the Civil War, the location of the Persian Gulf, the Constitution, or the chief justice of the Supreme Court. However, if conservative humanists continue to ostracize, scorn, and ignore both media studies and the media themselves, the result will not…

  12. A Tutorial on Parallel and Concurrent Programming in Haskell

    NASA Astrophysics Data System (ADS)

    Peyton Jones, Simon; Singh, Satnam

    This practical tutorial introduces the features available in Haskell for writing parallel and concurrent programs. We first describe how to write semi-explicit parallel programs by using annotations to express opportunities for parallelism and to help control the granularity of parallelism for effective execution on modern operating systems and processors. We then describe the mechanisms provided by Haskell for writing explicitly parallel programs, with a focus on the use of software transactional memory to help share information between threads. Finally, we show how nested data parallelism can be used to write deterministically parallel programs, which allows programmers to use rich data types in data-parallel programs that are automatically transformed into flat data-parallel versions for efficient execution on multi-core processors.

  13. Lessons from the great egret: Cosmopolitan species as environmental guides

    NASA Astrophysics Data System (ADS)

    Lewis, Celia

    This dissertation is an experiment in environmental learning. The cosmopolitan species, the great egret (Egretta alba), is used as a guide to learning about local environmental history and local ecology in four places: Long Island Sound, USA; Delaware Bay, USA; Neusiedler See, Austria; and the Hunter River Valley in New South Wales, Australia. It is also used as a guide to the development of a cosmopolitan environmental perspective. The development of this broad perspective is based on the thesis that knowledge of the ecology of species mobility and cosmopolitanism may bring to light ecological connections within and between places, and that human migration and cultural mobility are also part of the ecological history of the environment. The concept of species guides is reviewed in nature literature, including examples from the works of Richard Nelson, Robert Michael Pyle, Terry Tempest Williams, Scott Weidensaul, and Peter Matthiessen. The author visits egret colonies, interviews biologists working at these sites, and develops narratives about the environmental history and the cultural history of each site, and the connections between egrets and humans in those places. Parallels are drawn between the migrant and cosmopolitan nature of great egrets and other species, and of the human species, and how recognition of these similarities can lead to a cosmopolitan environmental perspective.

  14. Drag and Cooling Tests in the 24 ft Wind Tunnel on a Centaurus-Buckingham Wing Nacelle Installation. Part 3. Tests with High Speed Cowl Entry (Tempest Type)

    DTIC Science & Technology

    1946-07-01

    Royal Aircraft Establishment, Farnborough, Hants. Drag and cooling tests in the 24 ft wind tunnel on a Centaurus-Buckingham wing nacelle installation, Part ... Illustrations: Centaurus-Buckingham nacelle installation with high-entry-velocity cowl (cooling fan not fitted); installation of Centaurus ...

  15. A Struggle Well Worth Having: The Uses of Theatre-in-education (TIE) for Learning

    ERIC Educational Resources Information Center

    Cooper, Chris

    2004-01-01

    In this article Chris Cooper conveys something of his passionate belief in the importance of attending to the preconditions of learning. He stresses the crucial role of the imagination in this, bringing, as he puts it, creativity to the process of learning. His account of a drama project based on "The Tempest" provides important insights into the…

  16. Learning and Teaching in the Arts. Research Monograph 4.

    ERIC Educational Resources Information Center

    Aiken, Henry David

    This paper, part of a research monograph series, focuses on a philosophy of education which is humanistic. The author discusses theories of art education, using as an example of visual art Giorgione's "The Tempest". A synopsis of what needs to be known in order to appreciate the various levels of significance in a great work of visual art precedes…

  17. Transforming Pedagogies: Encouraging Pre-Service Teachers to Engage the Power of the Arts in Their Approach to Teaching and Learning

    ERIC Educational Resources Information Center

    McLaren, Mary-Rose; Arnold, Julie

    2016-01-01

    This paper describes and analyses, through the use of case studies, two experiences of transformative learning in an undergraduate arts education unit. Pre-service teachers designed and engaged with arts-based curriculum activities, created their own artwork, participated in a modified production of The Tempest and kept a reflective journal. These…

  18. No Menstrual Cyclicity in Mood and Interpersonal Behaviour in Nine Women with Self-Reported Premenstrual Syndrome.

    PubMed

    Bosman, Renske C; Albers, Casper J; de Jong, Jettie; Batalas, Nikolaos; Aan Het Rot, Marije

    2018-06-06

    Before diagnosing premenstrual dysphoric disorder (PMDD), 2 months of prospective assessment are required to confirm menstrual cyclicity in symptoms. For a diagnosis of premenstrual syndrome (PMS), this is not required. Women with PMDD and PMS often report that their symptoms interfere with mood and social functioning, and are said to show cyclical changes in interpersonal behaviour, but this has not been examined using a prospective approach. We sampled cyclicity in mood and interpersonal behaviour for 2 months in women with self-reported PMS. Participants met the criteria for PMS on the Premenstrual Symptoms Screening Tool (PSST), a retrospective questionnaire. For 2 menstrual cycles, after each social interaction, they used the online software TEMPEST to record on their smartphones how they felt and behaved. We examined within-person variability in negative affect, positive affect, quarrelsomeness, and agreeableness. Participants evaluated TEMPEST positively. However, we found no evidence for menstrual cyclicity in mood and interpersonal behaviour in any of the individual women (n = 9). Retrospective questionnaires such as the PSST may lead to oversampling of PMS. The diagnosis of PMS, like that of PMDD, might require 2 months of prospective assessment. © 2018 S. Karger AG, Basel.

  19. Dilution physics modeling: Dissolution/precipitation chemistry

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Onishi, Y.; Reid, H.C.; Trent, D.S.

    This report documents progress made to date on integrating dilution/precipitation chemistry and new physical models into the TEMPEST thermal-hydraulics computer code. Implementation of dissolution/precipitation chemistry models is necessary for predicting nonhomogeneous, time-dependent, physical/chemical behavior of tank wastes with and without a variety of possible engineered remediation and mitigation activities. Such behavior includes chemical reactions, gas retention, solids resuspension, solids dissolution and generation, solids settling/rising, and convective motion of physical and chemical species. Thus this model development is important from the standpoint of predicting the consequences of various engineered activities, such as mitigation by dilution, retrieval, or pretreatment, that can affect safe operations. The integration of a dissolution/precipitation chemistry module allows the various phase species concentrations to enter into the physical calculations that affect the TEMPEST hydrodynamic flow calculations. The yield strength model of non-Newtonian sludge correlates yield to a power function of solids concentration. Likewise, shear stress is concentration-dependent, and the dissolution/precipitation chemistry calculations develop the species concentration evolution that produces fluid flow resistance changes. Dilution of waste with pure water, molar concentrations of sodium hydroxide, and other chemical streams can be analyzed for the reactive species changes and hydrodynamic flow characteristics.
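
    The yield-strength correlation described (yield stress as a power function of solids concentration) is a one-liner; the prefactor and exponent below are placeholders, since the abstract does not give the fitted values:

```python
# Bingham-type yield stress as a power law in solids concentration;
# a and b are illustrative, not the report's fitted coefficients.
def yield_stress(solids_frac, a=10.0, b=3.0):
    """tau_y = a * C**b  [Pa], with C the solids volume fraction."""
    return a * solids_frac ** b

for c in (0.1, 0.2, 0.4):
    print(c, yield_stress(c))
```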

  20. Simulating the effects of upstream turbulence on dispersion around a building

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Y.Q.; Arya, S.P.S.; Huber, A.H.

    The effects of high turbulence versus no turbulence in a sheared boundary-layer flow approaching a building are being investigated by a turbulent kinetic energy/dissipation model (TEMPEST). The effects on both the mean flow and the concentration field around a cubical building are presented. The numerical simulations demonstrate significant effects due to the differences in the incident flow. The addition of upstream turbulence results in a reduced size of the cavity directly behind the building. The velocity deficits in the wake strongly depend on the upstream turbulence intensities. The accuracy of the numerical simulations is verified by comparing the predicted mean flow and concentration fields with the wind tunnel measurements of Castro and Robins (1977) and Robins and Castro (1977, 1975). Comparing the results with experimental data, the authors show that the TEMPEST model can reasonably simulate the mean flow. The numerical simulations of the concentration fields due to a source on the roof-top of the building are presented. Both the value and the position of the maximum ground-level concentration are changed dramatically due to the effects of the upstream level of turbulence.

  1. Central ridge of Newfoundland: Little explored, potential large

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Silva, N.R. De

    The Central ridge on the northeastern Grand Banks off Newfoundland represents a large area with known hydrocarbon accumulations and the potential for giant fields. It covers some 17,000 sq km with water less than 400 m deep. The first major hydrocarbon discovery on the Newfoundland Grand Banks is giant Hibernia field in the Jeanne d'Arc basin. Hibernia field, discovered in 1979, has reserves of 666 million bbl and is due onstream in 1997. Since Hibernia, 14 other discoveries have been made on the Grand Banks, with three on the Central ridge. Oil was first discovered on the Central ridge in 1980 with the Mobil et al. South Tempest G-88 well. In 1982 gas was discovered with the Mobil et al. North Dana I-43 well 30 km northeast of the earlier discovery. In 1983 gas and condensate were discovered with the Husky-Bow Valley et al. Trave E-87 well 20 km south of the South Tempest well. These discoveries are held under significant discovery licenses, and an additional 2,400 sq km are held under exploration licenses. The paper discusses the history of the basin, the reservoir source traps, and the basin potential.

  2. Automatic Generation of Directive-Based Parallel Programs for Shared Memory Parallel Systems

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Yan, Jerry; Frumkin, Michael

    2000-01-01

    The shared-memory programming model is a very effective way to achieve parallelism on shared-memory parallel computers. As great progress has been made in hardware and software technologies, the performance of parallel programs using compiler directives has improved substantially. The introduction of OpenMP directives, the industry standard for shared-memory programming, has minimized the issue of portability. Due to its ease of programming and its good performance, the technique has become very popular. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate directive-based OpenMP parallel programs. We outline techniques used in the implementation of the tool and present test results on the NAS parallel benchmarks and ARC3D, a CFD application. This work demonstrates the great potential of using computer-aided tools to quickly port parallel programs and achieve good performance.
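
    Python has no OpenMP directives, but Numba's parallel mode offers a comparable directive-style shared-memory loop. The sketch below is a rough analogue of the loop-level parallelism such tools generate, not actual CAPTools output:

```python
# Directive-style shared-memory loop in Python via Numba's parallel mode;
# iterations of prange are distributed across cores, much like an
# OpenMP "parallel for".
import numpy as np
from numba import njit, prange

@njit(parallel=True)
def saxpy(a, x, y):
    out = np.empty_like(x)
    for i in prange(x.shape[0]):
        out[i] = a * x[i] + y[i]
    return out

x = np.arange(1e6)
y = np.ones_like(x)
print(saxpy(2.0, x, y)[:3])   # [1. 3. 5.]
```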

  3. Director, Operational Test and Evaluation FY 2014 Annual Report

    DTIC Science & Technology

    2015-01-01

    Federal Departments and Agencies. Mitigation measures such as curtailment of wind turbine operations during test periods, identification of alternative...impact of wind turbines on ground-based and airborne radars, and this investment may help mitigate interference of wind turbines with test range...Frequency Active (SURTASS CLFA) Test Plan Tactical Unmanned Aircraft System Tactical Common Data Link (Shadow) FOT&E OTA Test Plan Tempest Wind 2014

  4. "The Tempest" in an English Teapot: Colonialism and the Measure of a Man in Zadie Smith's "White Teeth"

    ERIC Educational Resources Information Center

    Gustar, Jennifer J.

    2010-01-01

    Zadie Smith's "White Teeth" argues that we can take responsibility for the future if we refuse to act in thrall to the legacies of the past, which favour one human life over another, and act instead with the conviction that all lives are "lives" (Judith Butler). "White Teeth" examines the colonial legacy of violence…

  5. Parallel computation for biological sequence comparison: comparing a portable model to the native model for the Intel Hypercube.

    PubMed

    Nadkarni, P M; Miller, P L

    1991-01-01

    A parallel program for inter-database sequence comparison was developed on the Intel Hypercube using two models of parallel programming. One version was built using machine-specific Hypercube parallel programming commands. The other version was built using Linda, a machine-independent parallel programming language. The two versions of the program provide a case study comparing these two approaches to parallelization in an important biological application area. Benchmark tests with both programs gave comparable results with a small number of processors. As the number of processors was increased, the Linda version was somewhat less efficient. The Linda version was also run without change on Network Linda, a virtual parallel machine running on a network of desktop workstations.
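
    Linda coordinates processes through a shared tuple space: out() deposits a tuple, in() withdraws a matching tuple (blocking until one exists), and rd() reads without removing. The toy C++ sketch below mimics that master/worker style with a thread-safe bag of (tag, value) pairs; it is an illustrative analogy written for this summary, not the C-Linda interface the study actually used:

        #include <condition_variable>
        #include <cstdio>
        #include <map>
        #include <mutex>
        #include <queue>
        #include <string>
        #include <thread>
        #include <vector>

        // Toy tuple space: tuples are (tag, int) pairs. Real Linda matches
        // on arbitrary tuple shapes; this is a deliberately minimal stand-in.
        class TupleSpace {
            std::mutex m;
            std::condition_variable cv;
            std::map<std::string, std::queue<int>> bag;
        public:
            void out(const std::string& tag, int v) {   // deposit a tuple
                { std::lock_guard<std::mutex> lk(m); bag[tag].push(v); }
                cv.notify_all();
            }
            int in(const std::string& tag) {            // withdraw (blocking)
                std::unique_lock<std::mutex> lk(m);
                cv.wait(lk, [&]{ return !bag[tag].empty(); });
                int v = bag[tag].front(); bag[tag].pop();
                return v;
            }
        };

        int main() {
            TupleSpace ts;
            const int ntasks = 8, nworkers = 4;

            // Master deposits work tuples; workers withdraw them in any
            // order -- the decoupling that makes the style machine-independent.
            for (int i = 0; i < ntasks; ++i) ts.out("task", i);

            std::vector<std::thread> workers;
            for (int w = 0; w < nworkers; ++w)
                workers.emplace_back([&]{
                    for (;;) {
                        int t = ts.in("task");
                        if (t < 0) break;                // poison pill: stop
                        ts.out("result", t * t);         // "score" the task
                    }
                });

            int sum = 0;
            for (int i = 0; i < ntasks; ++i) sum += ts.in("result");
            for (int w = 0; w < nworkers; ++w) ts.out("task", -1);
            for (auto& th : workers) th.join();
            std::printf("sum of results: %d\n", sum);
            return 0;
        }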

  6. Parallel programming with Easy Java Simulations

    NASA Astrophysics Data System (ADS)

    Esquembre, F.; Christian, W.; Belloni, M.

    2018-01-01

    Nearly all of today's processors are multicore, and ideally programming and algorithm development utilizing the entire processor should be introduced early in the computational physics curriculum. Parallel programming is often not introduced because it requires a new programming environment and uses constructs that are unfamiliar to many teachers. We describe how we decrease the barrier to parallel programming by using a Java-based programming environment to treat problems in the usual undergraduate curriculum. We use the Easy Java Simulations programming and authoring tool to create the program's graphical user interface together with objects based on those developed by Kaminsky [Building Parallel Programs (Course Technology, Boston, 2010)] to handle common parallel programming tasks. Shared-memory parallel implementations of physics problems, such as time evolution of the Schrödinger equation, are available as source code and as ready-to-run programs from the AAPT-ComPADRE digital library.

  7. Genetic Parallel Programming: design and implementation.

    PubMed

    Cheang, Sin Man; Leung, Kwong Sak; Lee, Kin Hong

    2006-01-01

    This paper presents a novel Genetic Parallel Programming (GPP) paradigm for evolving parallel programs running on a Multi-Arithmetic-Logic-Unit (Multi-ALU) Processor (MAP). The MAP is a Multiple Instruction-streams, Multiple Data-streams (MIMD), general-purpose register machine that can be implemented on modern Very Large-Scale Integrated Circuits (VLSIs) in order to evaluate genetic programs at high speed. For human programmers, writing parallel programs is more difficult than writing sequential programs. However, experimental results show that GPP evolves parallel programs with less computational effort than that of their sequential counterparts. It creates a new approach to evolving a feasible problem solution in parallel program form and then serializes it into a sequential program if required. The effectiveness and efficiency of GPP are investigated using a suite of 14 well-studied benchmark problems. Experimental results show that GPP speeds up evolution substantially.

  8. Parallel computation for biological sequence comparison: comparing a portable model to the native model for the Intel Hypercube.

    PubMed Central

    Nadkarni, P. M.; Miller, P. L.

    1991-01-01

    A parallel program for inter-database sequence comparison was developed on the Intel Hypercube using two models of parallel programming. One version was built using machine-specific Hypercube parallel programming commands. The other version was built using Linda, a machine-independent parallel programming language. The two versions of the program provide a case study comparing these two approaches to parallelization in an important biological application area. Benchmark tests with both programs gave comparable results with a small number of processors. As the number of processors was increased, the Linda version was somewhat less efficient. The Linda version was also run without change on Network Linda, a virtual parallel machine running on a network of desktop workstations. PMID:1807632

  9. English in "The Tempest": The Value of Metaphor and Re-Imagining Grammar in English

    ERIC Educational Resources Information Center

    Macken-Horarik, Mary

    2013-01-01

    Garth Boomer's thinking influenced many of those working in school English during the time he was alive. The ripple effects of his legacy continue to be felt. For the author, it is Boomer's interests in metaphor and meaning that resonate most. The use of tropes and figure is a distinctive feature of his writing and offers a rich…

  10. The Multicolored Patchwork Portraiture of an Effective Veteran High School Special Education Teacher Amidst the Tempest of the High Stakes Testing Movement

    ERIC Educational Resources Information Center

    Bicehouse, Vaughn L.

    2010-01-01

    This single-subject study used the art and science of portraiture to illuminate a veteran special education teacher who is meeting the needs of her students with disabilities. This qualitative study was not done for the purposes of generalization but rather to show how this remarkable and effective special educator acts as an inspirational…

  11. Trio of Tempests

    NASA Image and Video Library

    2017-10-04

    Three distinct active regions with towering arches above them rotated into view over a three-day period (Sept. 24-26, 2017). In extreme ultraviolet light, charged particles that are spinning along the ever-changing magnetic field lines above the active regions make the lines visible. To give some sense of scale, the largest arches rose up many times the size of Earth. Movies are available at https://photojournal.jpl.nasa.gov/catalog/PIA22038

  12. Fully Nonlinear Edge Gyrokinetic Simulations of Kinetic Geodesic-Acoustic Modes and Boundary Flows

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, X Q; Belli, E; Bodi, K

    We present edge gyrokinetic neoclassical simulations of tokamak plasmas using the fully nonlinear (full-f) continuum code TEMPEST. A nonlinear Boltzmann model is used for the electrons. The electric field is obtained by solving the 2D gyrokinetic Poisson equation. We demonstrate the following: (1) High harmonic resonances (n > 2) significantly enhance geodesic-acoustic mode (GAM) damping at high q (tokamak safety factor), and are necessary to explain both the damping observed in our TEMPEST q-scans and experimental measurements of the scaling of the GAM amplitude with edge q95, in the absence of obvious evidence that there is a strong q dependence of the turbulent drive and damping of the GAM. (2) The kinetic GAM exists in the edge for steep density and temperature gradients in the form of outgoing waves, its radial scale is set by the ion temperature profile, and ion temperature inhomogeneity is necessary for GAM radial propagation. (3) The development of the neoclassical electric field evolves through different phases of relaxation, including GAMs, their radial propagation, and their long-time collisional decay. (4) Natural consequences of orbits in the pedestal and scrape-off layer region in divertor geometry are substantial non-Maxwellian ion distributions and flow characteristics qualitatively like those observed in experiments.

  13. Spring Dust Storm Smothers Beijing

    NASA Technical Reports Server (NTRS)

    2002-01-01

    A few days earlier than usual, a large, dense plume of dust blew southward and eastward from the desert plains of Mongolia, quite smothering to the residents of Beijing. Citizens of northeastern China call this annual event the 'shachenbao,' or 'dust cloud tempest,' which normally occurs during springtime. The dust storm hit Beijing on Friday night, March 15, and began coating everything with a fine, pale brown layer of grit. The region is quite dry, a problem some believe has been exacerbated by decades of deforestation. According to Chinese government estimates, roughly 1 million tons of desert dust and sand blow into Beijing each year. This true-color image was made using two adjacent swaths of data from the Sea-viewing Wide Field-of-view Sensor (SeaWiFS), flying aboard the OrbView-2 satellite, on March 17, 2002. The massive dust storm (brownish pixels) can easily be distinguished from clouds (bright white pixels) as it blows across northern Japan and eastward toward the open Pacific Ocean. The black regions are gaps between SeaWiFS' viewing swaths and represent areas where no data were collected. Image courtesy the SeaWiFS Project, NASA/Goddard Space Flight Center, and ORBIMAGE

  14. Bilingual parallel programming

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Foster, I.; Overbeek, R.

    1990-01-01

    Numerous experiments have demonstrated that computationally intensive algorithms support adequate parallelism to exploit the potential of large parallel machines. Yet successful parallel implementations of serious applications are rare. The limiting factor is clearly programming technology. None of the approaches to parallel programming that have been proposed to date -- whether parallelizing compilers, language extensions, or new concurrent languages -- seem to adequately address the central problems of portability, expressiveness, efficiency, and compatibility with existing software. In this paper, we advocate an alternative approach to parallel programming based on what we call bilingual programming. We present evidence that this approach provides an effective solution to parallel programming problems. The key idea in bilingual programming is to construct the upper levels of applications in a high-level language while coding selected low-level components in low-level languages. This approach permits the advantages of a high-level notation (expressiveness, elegance, conciseness) to be obtained without the cost in performance normally associated with high-level approaches. In addition, it provides a natural framework for reusing existing code.

  15. Application Portable Parallel Library

    NASA Technical Reports Server (NTRS)

    Cole, Gary L.; Blech, Richard A.; Quealy, Angela; Townsend, Scott

    1995-01-01

    The Application Portable Parallel Library (APPL) computer program is a subroutine-based message-passing software library intended to provide a consistent interface to the variety of multiprocessor computers on the market today. It minimizes the effort needed to move an application program from one computer to another: the user develops an application program once and then easily moves it from the parallel computer on which it was created to another parallel computer ("parallel computer" here also includes a heterogeneous collection of networked computers). APPL is written in the C language, with one FORTRAN 77 subroutine, for UNIX-based computers and is callable from application programs written in C or FORTRAN 77.

  16. Electric Propulsion Test and Evaluation Methodologies for Plasma in the Environments of Space and Testing (EP TEMPEST)

    DTIC Science & Technology

    2016-04-14

    Swanson AEDC Path 1: Magnetized electron transport impeded across magnetic field lines; transport via electron-particle collisions Path 2*: Electron...T&E (higher pressure, metallic walls) → Impacts stability, performance, plume properties, thruster lifetime Magnetic Field Lines Plasma Plume...Development of T&E Methodologies • Current-Voltage- Magnetic Field (I-V-B) Mapping • Facility Interaction Studies • Background Pressure • Plasma Wall

  17. Tempest on the Hudson: The Struggle for "Equal Pay for Equal Work" in the New York City Public Schools, 1907-1911.

    ERIC Educational Resources Information Center

    Doherty, Robert E.

    1979-01-01

    Traces trends in salaries paid to male and female public school teachers in New York City during a four-year period in the early twentieth century. Findings indicate that, in direct opposition to the situation around the turn of the century, few school districts in the 1970s differentiated salaries on the basis of sex. (DB)

  18. Partitioning problems in parallel, pipelined and distributed computing

    NASA Technical Reports Server (NTRS)

    Bokhari, S.

    1985-01-01

    The problem of optimally assigning the modules of a parallel program over the processors of a multiple computer system is addressed. A Sum-Bottleneck path algorithm is developed that permits the efficient solution of many variants of this problem under some constraints on the structure of the partitions. In particular, the following problems are solved optimally for a single-host, multiple satellite system: partitioning multiple chain structured parallel programs, multiple arbitrarily structured serial programs and single tree structured parallel programs. In addition, the problems of partitioning chain structured parallel programs across chain connected systems and across shared memory (or shared bus) systems are also solved under certain constraints. All solutions for parallel programs are equally applicable to pipelined programs. These results extend prior research in this area by explicitly taking concurrency into account and permit the efficient utilization of multiple computer architectures for a wide range of problems of practical interest.
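
    The flavor of these chain-partitioning problems can be conveyed by a simplified instance: cut a chain of module weights into k contiguous blocks (one per processor) so that the heaviest block, the bottleneck, is as light as possible. The sketch below solves it with a straightforward O(k n^2) dynamic program rather than the paper's Sum-Bottleneck path algorithm; the module weights are invented for illustration:

        #include <algorithm>
        #include <cstdio>
        #include <vector>

        // Minimize the maximum block weight when cutting a chain of modules
        // into k contiguous blocks (one block per processor).
        double chainBottleneck(const std::vector<double>& w, int k) {
            int n = static_cast<int>(w.size());
            std::vector<double> prefix(n + 1, 0.0);
            for (int i = 0; i < n; ++i) prefix[i + 1] = prefix[i] + w[i];
            auto blockSum = [&](int lo, int hi) { return prefix[hi] - prefix[lo]; };

            const double INF = 1e300;
            // best[p][i]: minimal bottleneck for the first i modules on p processors.
            std::vector<std::vector<double>> best(k + 1, std::vector<double>(n + 1, INF));
            best[0][0] = 0.0;
            for (int p = 1; p <= k; ++p)
                for (int i = 1; i <= n; ++i)
                    for (int j = p - 1; j < i; ++j)   // j = end of previous block
                        best[p][i] = std::min(best[p][i],
                                              std::max(best[p - 1][j], blockSum(j, i)));
            return best[k][n];
        }

        int main() {
            std::vector<double> weights = {4, 2, 7, 1, 5, 3, 6, 2};  // module costs
            // Optimal 3-way split is {4,2,7} {1,5,3} {6,2}, bottleneck 13.
            std::printf("bottleneck on 3 processors: %.1f\n",
                        chainBottleneck(weights, 3));
            return 0;
        }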

  19. Automatic Generation of OpenMP Directives and Its Application to Computational Fluid Dynamics Codes

    NASA Technical Reports Server (NTRS)

    Yan, Jerry; Jin, Haoqiang; Frumkin, Michael; Yan, Jerry (Technical Monitor)

    2000-01-01

    The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress has been made in hardware and software technologies, the performance of parallel programs with compiler directives has improved substantially. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate OpenMP-based parallel programs with nominal user assistance. We outline techniques used in the implementation of the tool and discuss the application of this tool on the NAS Parallel Benchmarks and several computational fluid dynamics codes. This work demonstrates the great potential of using the tool to quickly port parallel programs and also achieve good performance that exceeds that of some commercial tools.

  20. An interactive parallel programming environment applied in atmospheric science

    NASA Technical Reports Server (NTRS)

    vonLaszewski, G.

    1996-01-01

    This article introduces an interactive parallel programming environment (IPPE) that simplifies the generation and execution of parallel programs. One of the tasks of the environment is to generate message-passing parallel programs for homogeneous and heterogeneous computing platforms. The parallel programs are represented by using visual objects. This is accomplished with the help of a graphical programming editor that is implemented in Java and enables portability to a wide variety of computer platforms. In contrast to other graphical programming systems, reusable parts of the programs can be stored in a program library to support rapid prototyping. In addition, runtime performance data on different computing platforms is collected in a database. A selection process determines dynamically the software and the hardware platform to be used to solve the problem in minimal wall-clock time. The environment is currently being tested on a Grand Challenge problem, the NASA four-dimensional data assimilation system.

  1. Support for Debugging Automatically Parallelized Programs

    NASA Technical Reports Server (NTRS)

    Hood, Robert; Jost, Gabriele; Biegel, Bryan (Technical Monitor)

    2001-01-01

    This viewgraph presentation provides information on the technical aspects of debugging computer code that has been automatically converted for use in a parallel computing system. Shared memory parallelization and distributed memory parallelization entail separate and distinct challenges for a debugging program. A prototype system has been developed which integrates various tools for the debugging of automatically parallelized programs including the CAPTools Database which provides variable definition information across subroutines as well as array distribution information.

  2. A Tutorial Introduction to Bayesian Models of Cognitive Development

    DTIC Science & Technology

    2011-01-01

    typewriter with an infinite amount of paper. There is a space of documents that it is capable of producing, which includes things like The Tempest and does...not include, say, a Vermeer painting or a poem written in Russian. This typewriter represents a means of generating the hypothesis space for a Bayesian...learner: each possible document that can be typed on it is a hypothesis, the infinite set of documents producible by the typewriter is the latent

  3. Numerical simulation of jet mixing concepts in Tank 241-SY-101

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trent, D.S.; Michener, T.E.

    The episodic gas release events (GREs) that have characterized the behavior of Tank 241-SY-101 for the past several years are thought to result from gases generated by the waste material in the tank that become trapped in the layer of settled solids at the bottom. Several concepts for mitigating the GREs have been proposed. One concept involves mobilizing the solid particles with mixing jets. The rationale behind this idea is to prevent formation of a consolidated layer of settled solids at the bottom of the tank, thus inhibiting the accumulation of gas bubbles in this layer. Numerical simulations were conducted using the TEMPEST computer code to assess the viability and effectiveness of the proposed jet discharge concepts and operating parameters. Before these parametric studies were commenced, a series of turbulent jet studies was conducted that established the adequacy of the TEMPEST code for this application. Configurations studied for Tank 241-SY-101 include centrally located downward-discharging jets, draft tubes, and horizontal jets that are either stationary or rotating. Parameter studies included varying the jet discharge velocity, jet diameter, discharge elevation, and material properties. A total of 18 simulations were conducted and are reported in this document. The effect of gas bubbles on the mixing dynamics was not included within the scope of this study.

  4. L-band Soil Moisture Mapping using Small UnManned Aerial Systems

    NASA Astrophysics Data System (ADS)

    Dai, E.

    2015-12-01

    Soil moisture is of fundamental importance to many hydrological, biological and biogeochemical processes, plays an important role in the development and evolution of convective weather and precipitation, and impacts water resource management, agriculture, and flood runoff prediction. The launch of NASA's Soil Moisture Active/Passive (SMAP) mission in 2015 promises to provide global measurements of soil moisture and surface freeze/thaw state at fixed crossing times and spatial resolutions as low as 5 km for some products. However, there exists a need for measurements of soil moisture on smaller spatial scales and at arbitrary diurnal times for SMAP validation, precision agriculture, and evaporation and transpiration studies of boundary layer heat transport. The Lobe Differencing Correlation Radiometer (LDCR) provides a means of mapping soil moisture on spatial scales as small as several meters (i.e., the height of the platform). Compared with various other proposed methods of validation, based either on in situ measurements [1,2] or on existing airborne sensors suitable for manned aircraft deployment [3], the integrated design of the LDCR on a lightweight small UAS (sUAS) is capable of providing sub-watershed (~km scale) coverage at very high spatial resolution (~15 m) suitable for scaling studies, and at comparatively low operator cost. Mounted on the Tempest sUAS, the LDCR can map soil moisture at resolutions on the order of the platform's altitude.

  5. Thermal modeling of tanks 241-AW-101 and 241-AN-104 with the TEMPEST code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Antoniak, Z.I.; Recknagle, K.P.

    The TEMPEST code was exercised in a preliminary study of the thermal behavior of double-shell Tanks 241-AW-101 and 241-AN-104. The two-dimensional model used is derived from our earlier studies of heat transfer from Tank 241-SY-101. Several changes were made to the model to simulate the waste and conditions in 241-AW-101 and 241-AN-104. The nonconvective waste layer was assumed to be 254 cm (100 in.) thick for Tank 241-AW-101, and 381 cm (150 in.) in Tank 241-AN-104. The remaining waste was assumed, for each tank, to consist of a convective layer with a 7.6-cm (3-inch) crust on top. The waste heat loads for 241-AW-101 and 241-AN-104 were taken to be 10 kW (3.4E4 Btu/hr) and 12 kW (4.0E4 Btu/hr), respectively. Present model predictions of maximum and convecting waste temperatures are within 1.7 °C (3 °F) of those measured in Tanks 241-AW-101 and 241-AN-104. The difference between the predicted and measured temperature is comparable to the uncertainty of the measurement equipment. These models, therefore, are suitable for estimating the temperatures within the tanks in the event of changing air flows, waste levels, and/or waste configurations.

  6. Defect printability of ArF alternative phase-shift mask: a critical comparison of simulation and experiment

    NASA Astrophysics Data System (ADS)

    Ozawa, Ken; Komizo, Tooru; Ohnuma, Hidetoshi

    2002-07-01

    An alternative phase shift mask (alt-PSM) is a promising device for extending optical lithography to finer design rules. There have been few reports, however, on the mask's ability to identify phase defects. We report here an alt-PSM of a single-trench type with undercut for ArF exposure, with programmed phase defects used to evaluate defect printability by measuring aerial images with a Zeiss MSM193 measuring system. The experimental results are simulated using the TEMPEST program. First, a critical comparison of the simulation and the experiment is conducted. The actual measured topographies of quartz defects are used in the simulation. Moreover, a general simulation study on defect printability using an alt-PSM for ArF exposure is conducted. The defect dimensions, which produce critical CD errors, are determined by simulation that takes into account the full 3-dimensional structure of phase defects as well as a simplified structure. The critical dimensions of an isolated bump defect identified by the alt-PSM of a single-trench type with undercut for ArF exposure are 300 nm in bottom dimension and 74 degrees in height (phase) for the real shape, where the depth of wet-etching is 100 nm and the CD error limit is +/- 5 percent.

  7. Defect printability of alternating phase-shift mask: a critical comparison of simulation and experiment

    NASA Astrophysics Data System (ADS)

    Ozawa, Ken; Komizo, Tooru; Kikuchi, Koji; Ohnuma, Hidetoshi; Kawahira, Hiroichi

    2002-07-01

    An alternative phase shift mask (alt-PSM) is a promising device for extending optical lithography to finer design rules. There have been few reports, however, on the mask's ability to identify phase defects. We report here an alt-PSM of a dual-trench type for KrF exposure, with programmed quartz defects used to evaluate defect printability by measuring aerial images with a Zeiss MSM100 measuring system. The experimental results are simulated using the TEMPEST program. First, a critical comparison of the simulation and the experiment is conducted. The actual measured topographies of quartz defects are used in the simulation. Moreover, a general simulation study on defect printability using an alt-PSM for ArF exposure is conducted. The defect dimensions, which produce critical CD errors, are determined by simulation that takes into account the full 3-dimensional structure of phase defects as well as a simplified structure. The critical dimensions of an isolated defect identified by the alt-PSM of a single-trench type for ArF exposure are 240 nm in bottom diameter and 50 degrees in height (phase) for the cylindrical shape and 240 nm in bottom diameter and 90 degrees in height (phase) for the rotating trapezoidal shape, where the CD error limit is +/- 5%.

  8. Architecture Adaptive Computing Environment

    NASA Technical Reports Server (NTRS)

    Dorband, John E.

    2006-01-01

    Architecture Adaptive Computing Environment (aCe) is a software system that includes a language, compiler, and run-time library for parallel computing. aCe was developed to enable programmers to write programs, more easily than was previously possible, for a variety of parallel computing architectures. Heretofore, it has been perceived to be difficult to write parallel programs for parallel computers and more difficult to port the programs to different parallel computing architectures. In contrast, aCe is supportable on all high-performance computing architectures. Currently, it is supported on LINUX clusters. aCe uses parallel programming constructs that facilitate writing of parallel programs. Such constructs were used in single-instruction/multiple-data (SIMD) programming languages of the 1980s, including Parallel Pascal, Parallel Forth, C*, *LISP, and MasPar MPL. In aCe, these constructs are extended and implemented for both SIMD and multiple-instruction/multiple-data (MIMD) architectures. Two new constructs incorporated in aCe are those of (1) scalar and virtual variables and (2) pre-computed paths. The scalar-and-virtual-variables construct increases flexibility in optimizing memory utilization in various architectures. The pre-computed-paths construct enables the compiler to pre-compute part of a communication operation once, rather than computing it every time the communication operation is performed.

  9. The Automatic Parallelisation of Scientific Application Codes Using a Computer Aided Parallelisation Toolkit

    NASA Technical Reports Server (NTRS)

    Ierotheou, C.; Johnson, S.; Leggett, P.; Cross, M.; Evans, E.; Jin, Hao-Qiang; Frumkin, M.; Yan, J.; Biegel, Bryan (Technical Monitor)

    2001-01-01

    The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. Historically, the lack of a programming standard for using directives and the rather limited performance due to poor scalability affected the take-up of this programming model. Significant progress has since been made in hardware and software technologies; as a result, the performance of parallel programs with compiler directives has also improved. The introduction of an industrial standard for shared-memory programming with directives, OpenMP, has also addressed the issue of portability. In this study, we have extended the computer aided parallelization toolkit (developed at the University of Greenwich) to automatically generate OpenMP-based parallel programs with nominal user assistance. We outline the way in which loop types are categorized and how efficient OpenMP directives can be defined and placed using the in-depth interprocedural analysis that is carried out by the toolkit. We also discuss the application of the toolkit on the NAS Parallel Benchmarks and a number of real-world application codes. This work not only demonstrates the great potential of using the toolkit to quickly parallelize serial programs but also the good performance achievable on up to 300 processors for hybrid message passing and directive-based parallelizations.

  10. Using OpenMP vs. Threading Building Blocks for Medical Imaging on Multi-cores

    NASA Astrophysics Data System (ADS)

    Kegel, Philipp; Schellmann, Maraike; Gorlatch, Sergei

    We compare two parallel programming approaches for multi-core systems: the well-known OpenMP and the recently introduced Threading Building Blocks (TBB) library by Intel®. The comparison is made using the parallelization of a real-world numerical algorithm for medical imaging. We develop several parallel implementations, and compare them w.r.t. programming effort, programming style and abstraction, and runtime performance. We show that TBB requires a considerable program re-design, whereas with OpenMP simple compiler directives are sufficient. While TBB appears to be less appropriate for parallelizing existing implementations, it fosters a good programming style and higher abstraction level for newly developed parallel programs. Our experimental measurements on a dual quad-core system demonstrate that OpenMP slightly outperforms TBB in our implementation.
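
    The contrast the authors draw can be seen in miniature: the same pixel-wise operation parallelized once with an OpenMP directive on the existing loop and once with TBB's parallel_for, which requires restructuring the loop body into a callable. This generic C++ sketch (not the medical-imaging algorithm from the paper) builds with -fopenmp and -ltbb:

        #include <tbb/blocked_range.h>
        #include <tbb/parallel_for.h>
        #include <cstdio>
        #include <vector>

        // OpenMP style: one directive on top of the existing sequential loop.
        void scaleOpenMP(std::vector<float>& img, float gain) {
            #pragma omp parallel for
            for (long i = 0; i < static_cast<long>(img.size()); ++i)
                img[i] *= gain;
        }

        // TBB style: the loop body moves into a callable over an index range,
        // which is why retrofitting TBB onto existing code needs a re-design.
        void scaleTBB(std::vector<float>& img, float gain) {
            tbb::parallel_for(tbb::blocked_range<size_t>(0, img.size()),
                              [&](const tbb::blocked_range<size_t>& r) {
                                  for (size_t i = r.begin(); i != r.end(); ++i)
                                      img[i] *= gain;
                              });
        }

        int main() {
            std::vector<float> image(1 << 20, 1.0f);
            scaleOpenMP(image, 2.0f);
            scaleTBB(image, 0.5f);
            std::printf("image[0] = %f\n", image[0]);  // back to 1.0
            return 0;
        }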

  11. An object-oriented approach to nested data parallelism

    NASA Technical Reports Server (NTRS)

    Sheffler, Thomas J.; Chatterjee, Siddhartha

    1994-01-01

    This paper describes an implementation technique for integrating nested data parallelism into an object-oriented language. Data-parallel programming employs sets of data called 'collections' and expresses parallelism as operations performed over the elements of a collection. When the elements of a collection are also collections, there is the possibility of 'nested data parallelism.' Few current programming languages, however, support nested data parallelism. In an object-oriented framework, a collection is a single object. Its type defines the parallel operations that may be applied to it. Our goal is to design and build an object-oriented data-parallel programming environment supporting nested data parallelism. Our initial approach is built upon three fundamental additions to C++. We add new parallel base types by implementing them as classes, and add a new parallel collection type called a 'vector' that is implemented as a template. Only one new language feature is introduced: the 'foreach' construct, which is the basis for exploiting elementwise parallelism over collections. The strength of the method lies in the compilation strategy, which translates nested data-parallel C++ into ordinary C++. Extracting the potential parallelism in nested 'foreach' constructs is called 'flattening' nested parallelism. We show how to flatten 'foreach' constructs using a simple program transformation; a sketch of the idea follows. Our prototype system produces vector code which has been successfully run on workstations, a CM-2, and a CM-5.
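
    The flattening transformation can be sketched independently of the paper's C++ extensions: a nested foreach over a collection of collections becomes a single flat parallel loop over all inner elements, with a segment descriptor recording where each inner collection starts. A minimal C++/OpenMP analogue (names invented; not the prototype's generated vector code):

        #include <cstdio>
        #include <numeric>
        #include <vector>

        int main() {
            // A nested collection: each row has a different length.
            std::vector<std::vector<double>> rows = {
                {1, 2, 3}, {4}, {5, 6}, {7, 8, 9, 10}};

            // Flatten: one data array plus a segment descriptor of offsets.
            std::vector<double> flat;
            std::vector<size_t> offset = {0};
            for (auto& r : rows) {
                flat.insert(flat.end(), r.begin(), r.end());
                offset.push_back(flat.size());
            }

            // The nested "foreach row { foreach element ... }" becomes one
            // flat parallel loop, so load balance no longer depends on row
            // lengths.
            #pragma omp parallel for
            for (long i = 0; i < static_cast<long>(flat.size()); ++i)
                flat[i] *= 2.0;

            // The segment descriptor recovers the nested structure on demand.
            for (size_t r = 0; r + 1 < offset.size(); ++r) {
                double s = std::accumulate(flat.begin() + offset[r],
                                           flat.begin() + offset[r + 1], 0.0);
                std::printf("row %zu sum: %g\n", r, s);
            }
            return 0;
        }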

  12. The BLAZE language: A parallel language for scientific programming

    NASA Technical Reports Server (NTRS)

    Mehrotra, P.; Vanrosendale, J.

    1985-01-01

    A Pascal-like scientific programming language, Blaze, is described. Blaze contains array arithmetic, forall loops, and APL-style accumulation operators, which allow natural expression of fine grained parallelism. It also employs an applicative or functional procedure invocation mechanism, which makes it easy for compilers to extract coarse grained parallelism using machine specific program restructuring. Thus Blaze should allow one to achieve highly parallel execution on multiprocessor architectures, while still providing the user with conceptually sequential control flow. A central goal in the design of Blaze is portability across a broad range of parallel architectures. The multiple levels of parallelism present in Blaze code, in principle, allow a compiler to extract the types of parallelism appropriate for the given architecture while neglecting the remainder. The features of Blaze are described, and it is shown how this language would be used in typical scientific programming.

  13. Adapting high-level language programs for parallel processing using data flow

    NASA Technical Reports Server (NTRS)

    Standley, Hilda M.

    1988-01-01

    EASY-FLOW, a very high-level data flow language, is introduced for the purpose of adapting programs written in a conventional high-level language to a parallel environment. The level of parallelism provided is of the large-grained variety in which parallel activities take place between subprograms or processes. A program written in EASY-FLOW is a set of subprogram calls as units, structured by iteration, branching, and distribution constructs. A data flow graph may be deduced from an EASY-FLOW program.

  14. Collectively loading programs in a multiple program multiple data environment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.

    Techniques are disclosed for loading programs efficiently in a parallel computing system. In one embodiment, nodes of the parallel computing system receive a load description file which indicates, for each program of a multiple program multiple data (MPMD) job, nodes which are to load the program. The nodes determine, using collective operations, a total number of programs to load and a number of programs to load in parallel. The nodes further generate a class route for each program to be loaded in parallel, where the class route generated for a particular program includes only those nodes on which the program needs to be loaded. For each class route, a node is selected using a collective operation to be a load leader which accesses a file system to load the program associated with a class route and broadcasts the program via the class route to other nodes which require the program.
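
    A rough MPI analogue of the load-leader pattern: one rank per group reads the program image from the file system and broadcasts the bytes to the remaining ranks, so the file system sees one reader instead of thousands. This is an illustrative sketch of the pattern only; the disclosed technique relies on class routes, a collective-network mechanism with no direct MPI equivalent:

        #include <mpi.h>
        #include <cstdio>
        #include <vector>

        int main(int argc, char** argv) {
            MPI_Init(&argc, &argv);
            int rank;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            std::vector<char> image;   // program bytes to be loaded everywhere
            long size = 0;

            if (rank == 0) {           // the "load leader" touches the file system
                if (argc < 2) MPI_Abort(MPI_COMM_WORLD, 1);
                FILE* f = std::fopen(argv[1], "rb");
                if (!f) MPI_Abort(MPI_COMM_WORLD, 1);
                std::fseek(f, 0, SEEK_END);
                size = std::ftell(f);
                std::fseek(f, 0, SEEK_SET);
                image.resize(size);
                if (std::fread(image.data(), 1, size, f) !=
                    static_cast<size_t>(size))
                    MPI_Abort(MPI_COMM_WORLD, 1);
                std::fclose(f);
            }

            // Everyone else receives the image over the network instead of
            // hammering the shared file system with independent reads.
            MPI_Bcast(&size, 1, MPI_LONG, 0, MPI_COMM_WORLD);
            if (rank != 0) image.resize(size);
            MPI_Bcast(image.data(), size, MPI_BYTE, 0, MPI_COMM_WORLD);

            std::printf("rank %d holds %ld bytes of the image\n", rank, size);
            MPI_Finalize();
            return 0;
        }

    MPI_Bcast is itself typically tree-structured inside the library, which is the same intuition as routing the broadcast along a class route.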

  15. Response to comments by Adam Smiarowski and Shane Mulè on: Christensen, N., and Lawrie, K., 2012. Resolution analyses for selecting an appropriate airborne electromagnetic (AEM) system, Exploration Geophysics, 43, 213-227

    NASA Astrophysics Data System (ADS)

    Christensen, Niels B.; Lawrie, Ken

    2015-06-01

    We analyse and compare the resolution improvement that can be obtained from including x-component data in the inversion of AEM data from the SkyTEM and TEMPEST systems. Except for the resistivity of the bottom layer, the SkyTEM system, even without including x-component data, has the better resolution of the parameters of the analysed models.

  16. TEMPEST/N33.5. Computational Fluid Dynamics Package For Incompressible, 3D, Time Dependent Pro

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trent, Dr.D.S.; Eyler, Dr.L.L.

    TEMPEST/N33.5 provides numerical solutions to general incompressible flow problems with coupled heat transfer in fluids and solids. Turbulence is modeled with a k-ε model, and gas, liquid, or solid constituents may be included with the bulk flow. Problems may be modeled in Cartesian or cylindrical coordinates. Limitations include incompressible flow, the Boussinesq approximation, and passive constituents. No direct steady-state solution is available; steady state is obtained as the limit of a transient.

  17. RainCube 6U CubeSat

    NASA Image and Video Library

    2018-05-17

    The RainCube 6U CubeSat with fully-deployed antenna. RainCube, CubeRRT and TEMPEST-D are currently integrated aboard Orbital ATK's Cygnus spacecraft and are awaiting launch on an Antares rocket. After the CubeSats have arrived at the station, they will be deployed into low-Earth orbit and will begin their missions to test these new technologies useful for predicting weather, ensuring data quality, and helping researchers better understand storms. https://photojournal.jpl.nasa.gov/catalog/PIA22457

  18. The BLAZE language - A parallel language for scientific programming

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush; Van Rosendale, John

    1987-01-01

    A Pascal-like scientific programming language, BLAZE, is described. BLAZE contains array arithmetic, forall loops, and APL-style accumulation operators, which allow natural expression of fine grained parallelism. It also employs an applicative or functional procedure invocation mechanism, which makes it easy for compilers to extract coarse grained parallelism using machine specific program restructuring. Thus BLAZE should allow one to achieve highly parallel execution on multiprocessor architectures, while still providing the user with conceptually sequential control flow. A central goal in the design of BLAZE is portability across a broad range of parallel architectures. The multiple levels of parallelism present in BLAZE code, in principle, allow a compiler to extract the types of parallelism appropriate for the given architecture while neglecting the remainder. The features of BLAZE are described and it is shown how this language would be used in typical scientific programming.

  19. MPI_XSTAR: MPI-based Parallelization of the XSTAR Photoionization Program

    NASA Astrophysics Data System (ADS)

    Danehkar, Ashkbiz; Nowak, Michael A.; Lee, Julia C.; Smith, Randall K.

    2018-02-01

    We describe a program for the parallel implementation of multiple runs of XSTAR, a photoionization code that is used to predict the physical properties of an ionized gas from its emission and/or absorption lines. The parallelization program, called MPI_XSTAR, has been developed and implemented in the C++ language by using the Message Passing Interface (MPI) protocol, a conventional standard of parallel computing. We have benchmarked parallel multiprocessing executions of XSTAR, using MPI_XSTAR, against a serial execution of XSTAR, in terms of the parallelization speedup and the computing resource efficiency. Our experience indicates that the parallel execution runs significantly faster than the serial execution; however, the efficiency in terms of computing resource usage decreases as the number of processors used in the parallel computation increases.
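
    The structure MPI_XSTAR exploits is embarrassingly parallel: the individual XSTAR runs are independent, so they can simply be dealt out across MPI ranks. The round-robin sketch below illustrates the general pattern; run_case is a hypothetical stand-in for launching one XSTAR execution, and the real program's scheduling may differ:

        #include <mpi.h>
        #include <cstdio>

        // Hypothetical stand-in for one independent run (e.g., one XSTAR
        // invocation with its own parameter set). Independence of the runs
        // is what makes the problem trivially parallel.
        static void run_case(int id) {
            std::printf("case %d done\n", id);
        }

        int main(int argc, char** argv) {
            MPI_Init(&argc, &argv);
            int rank, nprocs;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

            const int ncases = 100;                 // total independent runs
            for (int id = rank; id < ncases; id += nprocs)
                run_case(id);                       // round-robin deal-out

            // Wall time is set by the slowest rank, which is one reason
            // efficiency drops as more processors are added.
            MPI_Barrier(MPI_COMM_WORLD);
            if (rank == 0) std::printf("all %d cases finished\n", ncases);
            MPI_Finalize();
            return 0;
        }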

  20. IOPA: I/O-aware parallelism adaption for parallel programs

    PubMed Central

    Liu, Tao; Liu, Yi; Qian, Chen; Qian, Depei

    2017-01-01

    With the development of multi-/many-core processors, applications need to be written as parallel programs to improve execution efficiency. For data-intensive applications that use multiple threads to read/write files simultaneously, an I/O sub-system can easily become a bottleneck when too many of these types of threads exist; on the contrary, too few threads will cause insufficient resource utilization and hurt performance. Therefore, programmers must pay much attention to parallelism control to find the appropriate number of I/O threads for an application. This paper proposes a parallelism control mechanism named IOPA that can adjust the parallelism of applications to adapt to the I/O capability of a system and balance computing resources and I/O bandwidth. The programming interface of IOPA is also provided to programmers to simplify parallel programming. IOPA is evaluated using multiple applications with both solid state and hard disk drives. The results show that the parallel applications using IOPA can achieve higher efficiency than those with a fixed number of threads. PMID:28278236
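
    The underlying idea, measuring whether added I/O threads still pay off and stopping when the I/O subsystem saturates, can be sketched without the IOPA interface itself. The C++ sketch below hill-climbs on thread count using a dummy I/O task; the function names and the 5% threshold are invented for illustration:

        #include <chrono>
        #include <cstdio>
        #include <functional>
        #include <thread>
        #include <vector>

        // Time one batch of I/O-like work executed by `nthreads` threads.
        static double batchSeconds(int nthreads,
                                   const std::function<void(int)>& work) {
            auto t0 = std::chrono::steady_clock::now();
            std::vector<std::thread> pool;
            for (int t = 0; t < nthreads; ++t) pool.emplace_back(work, t);
            for (auto& th : pool) th.join();
            return std::chrono::duration<double>(
                       std::chrono::steady_clock::now() - t0).count();
        }

        int main() {
            // Dummy stand-in for an I/O-bound task; real code would
            // read or write files here.
            auto ioTask = [](int) {
                std::this_thread::sleep_for(std::chrono::milliseconds(20));
            };

            // Hill-climb: grow the thread count while throughput improves,
            // stop when the I/O subsystem saturates and gains vanish.
            int best = 1;
            double bestRate = 1.0 / batchSeconds(1, ioTask);
            for (int n = 2; n <= 16; n *= 2) {
                double rate = n / batchSeconds(n, ioTask);  // tasks per second
                if (rate <= bestRate * 1.05) break;         // <5% gain: stop
                bestRate = rate;
                best = n;
            }
            std::printf("adapted I/O parallelism: %d threads\n", best);
            return 0;
        }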

  1. IOPA: I/O-aware parallelism adaption for parallel programs.

    PubMed

    Liu, Tao; Liu, Yi; Qian, Chen; Qian, Depei

    2017-01-01

    With the development of multi-/many-core processors, applications need to be written as parallel programs to improve execution efficiency. For data-intensive applications that use multiple threads to read/write files simultaneously, an I/O sub-system can easily become a bottleneck when too many of these types of threads exist; on the contrary, too few threads will cause insufficient resource utilization and hurt performance. Therefore, programmers must pay much attention to parallelism control to find the appropriate number of I/O threads for an application. This paper proposes a parallelism control mechanism named IOPA that can adjust the parallelism of applications to adapt to the I/O capability of a system and balance computing resources and I/O bandwidth. The programming interface of IOPA is also provided to programmers to simplify parallel programming. IOPA is evaluated using multiple applications with both solid state and hard disk drives. The results show that the parallel applications using IOPA can achieve higher efficiency than those with a fixed number of threads.

  2. Thermal-hydraulic simulation of natural convection decay heat removal in the High Flux Isotope Reactor (HFIR) using RELAP5 and TEMPEST: Part 2, Interpretation and validation of results

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ruggles, A.E.; Morris, D.G.

    The RELAP5/MOD2 code was used to predict the thermal-hydraulic behavior of the HFIR core during decay heat removal through boiling natural circulation. The low system pressure and low mass flux values associated with boiling natural circulation are far from conditions for which RELAP5 is well exercised. Therefore, some simple hand calculations are used herein to establish the physics of the results. The interpretation and validation effort is divided between the time average flow conditions and the time varying flow conditions. The time average flow conditions are evaluated using a lumped parameter model and heat balance. The Martinelli-Nelson correlations are used to model the two-phase pressure drop and void fraction vs flow quality relationship within the core region. Systems of parallel channels are susceptible to both density wave oscillations and pressure drop oscillations. Periodic variations in the mass flux and exit flow quality of individual core channels are predicted by RELAP5. These oscillations are consistent with those observed experimentally and are of the density wave type. The impact of the time varying flow properties on local wall superheat is bounded herein. The conditions necessary for Ledinegg flow excursions are identified. These conditions do not fall within the envelope of decay heat levels relevant to HFIR in boiling natural circulation. 14 refs., 5 figs., 1 tab.

  3. Amid the Tempest: An Observational View of Magnetic Reconnection in Explosions on the Sun

    NASA Astrophysics Data System (ADS)

    Qiu, Jiong

    2007-05-01

    Viewed through telescopes, the Sun is a restless star. Frequently, impulsive brightenings in the Sun's atmosphere, known as solar flares, are observed across a broad range of the electromagnetic spectrum. It is considered that solar flares are driven by magnetic reconnection, when anti-parallel magnetic field lines collide and reconnect with each other, efficiently converting free magnetic energy into heating plasmas and accelerating charged particles. Over the past decades, solar physicists have discovered observational signatures as indirect evidence for magnetic reconnection. Careful analyses of these observations lead to evaluation of key physical parameters of magnetic reconnection. Growing efforts have been extended to understand the process of magnetic reconnection in some of the most spectacular explosions on the Sun in the form of coronal mass ejections (CMEs). Often accompanied by flares, nearly once a day, a large bundle of plasma wrapped in magnetic field lines is violently hurled out of the Sun into interplanetary space. This is a CME. CMEs are driven magnetically, although the exact mechanisms remain in heated debate. Among many mysteries of CMEs, a fundamental question has been the origin of the specific magnetic structure of CMEs, some reaching the earth and being observed in-situ as a nested set of helical field lines, or a magnetic flux rope. Analyses of interplanetary magnetic flux ropes and their solar progenitors, including flares and CMEs, provide an observational insight into the role of magnetic reconnection at the early stage of flux rope eruption.

  4. Parallel language constructs for tensor product computations on loosely coupled architectures

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush; Vanrosendale, John

    1989-01-01

    Distributed memory architectures offer high levels of performance and flexibility, but have proven awkward to program. Current languages for nonshared memory architectures provide a relatively low-level programming environment and are poorly suited to modular programming and to the construction of libraries. A set of language primitives designed to allow the specification of parallel numerical algorithms at a higher level is described. The focus is on tensor product array computations, a simple but important class of numerical algorithms. The problem of programming 1-D kernel routines, such as parallel tridiagonal solvers, is addressed first; it is then examined how such parallel kernels can be combined to form parallel tensor product algorithms.
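
    What makes tensor product algorithms natural to parallelize is that a 1-D kernel is applied independently along every grid line in one dimension, then along every line in the next. A small C++/OpenMP sketch of that structure, with a toy 3-point smoothing kernel standing in for a parallel tridiagonal solve:

        #include <algorithm>
        #include <cstdio>
        #include <vector>

        // Toy 1-D kernel (3-point average) standing in for, e.g., a
        // tridiagonal solve applied along one line of the grid.
        static void kernel1D(std::vector<double>& line) {
            std::vector<double> out(line);
            for (size_t i = 1; i + 1 < line.size(); ++i)
                out[i] = (line[i - 1] + line[i] + line[i + 1]) / 3.0;
            line.swap(out);
        }

        int main() {
            const int n = 512;
            std::vector<double> grid(n * n, 1.0);
            grid[n * (n / 2) + n / 2] = 100.0;   // a spike to smooth out

            // Tensor-product sweep: all rows are independent 1-D problems...
            #pragma omp parallel for
            for (int r = 0; r < n; ++r) {
                std::vector<double> line(grid.begin() + r * n,
                                         grid.begin() + (r + 1) * n);
                kernel1D(line);
                std::copy(line.begin(), line.end(), grid.begin() + r * n);
            }
            // ...and then all columns are, too.
            #pragma omp parallel for
            for (int c = 0; c < n; ++c) {
                std::vector<double> line(n);
                for (int r = 0; r < n; ++r) line[r] = grid[r * n + c];
                kernel1D(line);
                for (int r = 0; r < n; ++r) grid[r * n + c] = line[r];
            }
            std::printf("center after sweep: %g\n",
                        grid[n * (n / 2) + n / 2]);
            return 0;
        }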

  5. Methods for design and evaluation of parallel computing systems (The PISCES project)

    NASA Technical Reports Server (NTRS)

    Pratt, Terrence W.; Wise, Robert; Haught, Mary Jo

    1989-01-01

    The PISCES project started in 1984 under the sponsorship of the NASA Computational Structural Mechanics (CSM) program. A PISCES 1 programming environment and parallel FORTRAN were implemented in 1984 for the DEC VAX (using UNIX processes to simulate parallel processes). This system was used for experimentation with parallel programs for scientific applications and AI (dynamic scene analysis) applications. PISCES 1 was ported to a network of Apollo workstations by N. Fitzgerald.

  6. Computer-aided programming for message-passing system; Problems and a solution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, M.Y.; Gajski, D.D.

    1989-12-01

    As the number of processors and the complexity of problems to be solved increase, programming multiprocessing systems becomes more difficult and error-prone. Program development tools are necessary since programmers are not able to develop complex parallel programs efficiently. Parallel models of computation, parallelization problems, and tools for computer-aided programming (CAP) are discussed. As an example, a CAP tool that performs scheduling and inserts communication primitives automatically is described. It also generates the performance estimates and other program quality measures to help programmers in improving their algorithms and programs.

  7. Parallel implementation of an adaptive and parameter-free N-body integrator

    NASA Astrophysics Data System (ADS)

    Pruett, C. David; Ingham, William H.; Herman, Ralph D.

    2011-05-01

    Previously, Pruett et al. (2003) [3] described an N-body integrator of arbitrarily high order M with an asymptotic operation count of O(MN). The algorithm's structure lends itself readily to data parallelization, which we document and demonstrate here in the integration of point-mass systems subject to Newtonian gravitation. High order is shown to benefit parallel efficiency. The resulting N-body integrator is robust, parameter-free, highly accurate, and adaptive in both time-step and order. Moreover, it exhibits linear speedup on distributed parallel processors, provided that each processor is assigned at least a handful of bodies.
    Program summary
    Program title: PNB.f90
    Catalogue identifier: AEIK_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEIK_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: Standard CPC license, http://cpc.cs.qub.ac.uk/licence/licence.html
    No. of lines in distributed program, including test data, etc.: 3052
    No. of bytes in distributed program, including test data, etc.: 68 600
    Distribution format: tar.gz
    Programming language: Fortran 90 and OpenMPI
    Computer: All shared or distributed memory parallel processors
    Operating system: Unix/Linux
    Has the code been vectorized or parallelized?: The code has been parallelized but has not been explicitly vectorized.
    RAM: Dependent upon N
    Classification: 4.3, 4.12, 6.5
    Nature of problem: High-accuracy numerical evaluation of trajectories of N point masses, each subject to Newtonian gravitation.
    Solution method: Parallel and adaptive extrapolation in time via power series of arbitrary degree.
    Running time: 5.1 s for the demo program supplied with the package.

  8. Parallel solution of sparse one-dimensional dynamic programming problems

    NASA Technical Reports Server (NTRS)

    Nicol, David M.

    1989-01-01

    Parallel computation offers the potential for quickly solving large computational problems. However, it is often a non-trivial task to effectively use parallel computers. Solution methods must sometimes be reformulated to exploit parallelism; the reformulations are often more complex than their slower serial counterparts. We illustrate these points by studying the parallelization of sparse one-dimensional dynamic programming problems, those which do not obviously admit substantial parallelization. We propose a new method for parallelizing such problems, develop analytic models which help us to identify problems which parallelize well, and compare the performance of our algorithm with existing algorithms on a multiprocessor.

  9. 76 FR 66309 - Pilot Program for Parallel Review of Medical Products; Correction

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-10-26

    ... DEPARTMENT OF HEALTH AND HUMAN SERVICES Centers for Medicare and Medicaid Services [CMS-3180-N2] Food and Drug Administration [Docket No. FDA-2010-N-0308] Pilot Program for Parallel Review of Medical... technologies to participate in a program of parallel FDA-CMS review. The document was published with an...

  10. Massively parallel implementation of 3D-RISM calculation with volumetric 3D-FFT.

    PubMed

    Maruyama, Yutaka; Yoshida, Norio; Tadano, Hiroto; Takahashi, Daisuke; Sato, Mitsuhisa; Hirata, Fumio

    2014-07-05

    A new three-dimensional reference interaction site model (3D-RISM) program for massively parallel machines combined with the volumetric 3D fast Fourier transform (3D-FFT) was developed and tested on the RIKEN K supercomputer. The ordinary parallel 3D-RISM program has a limitation on the degree of parallelization because of the limitations of the slab-type 3D-FFT; the volumetric 3D-FFT relieves this limitation drastically. We tested the 3D-RISM calculation on a large, fine calculation cell (2048³ grid points) on 16,384 nodes, each having eight CPU cores. The new 3D-RISM program achieved excellent parallel scalability on the RIKEN K supercomputer. As a benchmark application, we employed the program, combined with molecular dynamics simulation, to analyze the oligomerization process of the chymotrypsin inhibitor 2 mutant. The results demonstrate that the massively parallel 3D-RISM program is effective for analyzing the hydration properties of large biomolecular systems. Copyright © 2014 Wiley Periodicals, Inc.

  11. F-Nets and Software Cabling: Deriving a Formal Model and Language for Portable Parallel Programming

    NASA Technical Reports Server (NTRS)

    DiNucci, David C.; Saini, Subhash (Technical Monitor)

    1998-01-01

    Parallel programming is still being based upon antiquated sequence-based definitions of the terms "algorithm" and "computation", resulting in programs which are architecture dependent and difficult to design and analyze. By focusing on obstacles inherent in existing practice, a more portable model is derived here, which is then formalized into a model called Soviets which utilizes a combination of imperative and functional styles. This formalization suggests more general notions of algorithm and computation, as well as insights into the meaning of structured programming in a parallel setting. To illustrate how these principles can be applied, a very-high-level graphical architecture-independent parallel language, called Software Cabling, is described, with many of the features normally expected from today's computer languages (e.g. data abstraction, data parallelism, and object-based programming constructs).

  12. Directions in parallel programming: HPF, shared virtual memory and object parallelism in pC++

    NASA Technical Reports Server (NTRS)

    Bodin, Francois; Priol, Thierry; Mehrotra, Piyush; Gannon, Dennis

    1994-01-01

    Fortran and C++ are the dominant programming languages used in scientific computation. Consequently, extensions to these languages are the most popular for programming massively parallel computers. We discuss two such approaches to parallel Fortran and one approach to C++. The High Performance Fortran Forum has designed HPF with the intent of supporting data parallelism on Fortran 90 applications. HPF works by asking the user to help the compiler distribute and align the data structures with the distributed memory modules in the system. Fortran-S takes a different approach in which the data distribution is managed by the operating system and the user provides annotations to indicate parallel control regions. In the case of C++, we look at pC++ which is based on a concurrent aggregate parallel model.

  13. Using CLIPS in the domain of knowledge-based massively parallel programming

    NASA Technical Reports Server (NTRS)

    Dvorak, Jiri J.

    1994-01-01

    The Program Development Environment (PDE) is a tool for massively parallel programming of distributed-memory architectures. Adopting a knowledge-based approach, the PDE eliminates the complexity introduced by parallel hardware with distributed memory and offers complete transparency with respect to parallelism exploitation. The knowledge-based part of the PDE is realized in CLIPS. Its principal task is to find an efficient parallel realization of the application specified by the user in a comfortable, abstract, domain-oriented formalism. A large collection of fine-grain parallel algorithmic skeletons, represented as COOL objects in a tree hierarchy, contains the algorithmic knowledge. A hybrid knowledge base with rule modules and procedural parts, encoding expertise about the application domain, parallel programming, software engineering, and parallel hardware, enables a high degree of automation in the software development process. In this paper, important aspects of the implementation of the PDE using CLIPS and COOL are shown, including the embedding of CLIPS with C++-based parts of the PDE. The appropriateness of the chosen approach and of the CLIPS language for knowledge-based software engineering are discussed.

  14. Evolving binary classifiers through parallel computation of multiple fitness cases.

    PubMed

    Cagnoni, Stefano; Bergenti, Federico; Mordonini, Monica; Adorni, Giovanni

    2005-06-01

    This paper describes two versions of a novel approach to developing binary classifiers, based on two evolutionary computation paradigms: cellular programming and genetic programming. Such an approach achieves high computation efficiency both during evolution and at runtime. Evolution speed is optimized by allowing multiple solutions to be computed in parallel. Runtime performance is optimized explicitly using parallel computation in the case of cellular programming or implicitly taking advantage of the intrinsic parallelism of bitwise operators on standard sequential architectures in the case of genetic programming. The approach was tested on a digit recognition problem and compared with a reference classifier.
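
    The 'intrinsic parallelism of bitwise operators' is worth making concrete: packing one binary feature of 64 fitness cases into each bit of a 64-bit word lets a single AND, OR, or XOR evaluate all 64 cases at once on a plain sequential CPU. A hedged C++ sketch (the 'evolved' expression here is an arbitrary toy, not one produced by the authors' system):

        #include <bitset>
        #include <cstdint>
        #include <cstdio>

        int main() {
            // Each bit position is one fitness case; each word is one binary
            // feature evaluated across all 64 cases simultaneously.
            std::uint64_t featA  = 0xF0F0F0F0F0F0F0F0ULL;
            std::uint64_t featB  = 0xCCCCCCCCCCCCCCCCULL;
            std::uint64_t target = 0xAAAAAAAAAAAAAAAAULL;  // desired labels

            // A toy "evolved" expression: (A AND B) XOR (NOT A). One pass of
            // bitwise ops classifies all 64 fitness cases at once.
            std::uint64_t predicted = (featA & featB) ^ ~featA;

            // Fitness = number of cases where prediction matches the target.
            int fitness = static_cast<int>(
                std::bitset<64>(~(predicted ^ target)).count());
            std::printf("correct on %d of 64 fitness cases\n", fitness);
            return 0;
        }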

  15. Implementations of BLAST for parallel computers.

    PubMed

    Jülich, A

    1995-02-01

    The BLAST sequence comparison programs have been ported to a variety of parallel computers: the shared-memory machine Cray Y-MP 8/864 and the distributed-memory architectures Intel iPSC/860 and nCUBE. Additionally, the programs were ported to run on workstation clusters. We explain the parallelization techniques and consider the pros and cons of these methods. The BLAST programs are very well suited for parallelization for a moderate number of processors. We illustrate our results using the program blastp as an example. As input data for blastp, a 799-residue protein query sequence and the protein database PIR were used.

  16. Programming parallel architectures: The BLAZE family of languages

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush

    1988-01-01

    Programming multiprocessor architectures is a critical research issue. An overview is given of the various approaches to programming these architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive since they remove much of the burden of exploiting parallel architectures from the user. Also described is recent work by the author in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described, as well as the relations of this work to other current language research projects.

  17. Exploiting Vector and Multicore Parallelism for Recursive, Data- and Task-Parallel Programs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ren, Bin; Krishnamoorthy, Sriram; Agrawal, Kunal

    Modern hardware contains parallel execution resources that are well suited for data parallelism (vector units) and for task parallelism (multicores). However, most work on parallel scheduling focuses on one type of hardware or the other. In this work, we present a scheduling framework that allows for a unified treatment of task and data parallelism. Our key insight is an abstraction, task blocks, that uniformly handles data-parallel iterations and task-parallel tasks, allowing them to be scheduled on vector units or executed independently on multicore processors. Our framework allows us to define schedulers that can dynamically select between executing task blocks on vector units or on multicores. We show that these schedulers are asymptotically optimal and deliver the maximum amount of parallelism available in computation trees. To evaluate our schedulers, we develop program transformations that can convert mixed data- and task-parallel programs into task-block-based programs. Using a prototype instantiation of our scheduling framework, we show that, on an 8-core system, we can simultaneously exploit vector and multicore parallelism to achieve 14×-108× speedup over sequential baselines.

  18. High-performance computing — an overview

    NASA Astrophysics Data System (ADS)

    Marksteiner, Peter

    1996-08-01

    An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, and various parallel computers such as symmetric multiprocessors, workstation clusters, and massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, and optimization for RISC processors; parallel programming techniques such as shared-memory parallelism, message passing, and data parallelism; and numerical libraries.

  19. The Design and Evaluation of "CAPTools"--A Computer Aided Parallelization Toolkit

    NASA Technical Reports Server (NTRS)

    Yan, Jerry; Frumkin, Michael; Hribar, Michelle; Jin, Haoqiang; Waheed, Abdul; Johnson, Steve; Cross, Jark; Evans, Emyr; Ierotheou, Constantinos; Leggett, Pete

    1998-01-01

    Writing applications for high performance computers is a challenging task. Although writing code by hand still offers the best performance, it is extremely costly and often not very portable. The Computer Aided Parallelization Tools (CAPTools) are a toolkit designed to help automate the mapping of sequential FORTRAN scientific applications onto multiprocessors. CAPTools consists of the following major components: an inter-procedural dependence analysis module that incorporates user knowledge; a 'self-propagating' data partitioning module driven via user guidance; an execution control mask generation and optimization module with which the user can fine-tune parallel processing of individual partitions; a program transformation/restructuring facility for source code clean-up and optimization; a set of browsers through which the user interacts with CAPTools at each stage of the parallelization process; and a code generator supporting multiple programming paradigms on various multiprocessors. Besides describing the rationale behind the architecture of CAPTools, this paper illustrates the parallelization process via case studies involving structured and unstructured meshes. The programming process and the performance of the generated parallel programs are compared against other programming alternatives based on the NAS Parallel Benchmarks, ARC3D, and other scientific applications. Based on these results, the feasibility of constructing architecture-independent parallel applications is discussed.

  20. Real-time implementations of image segmentation algorithms on shared memory multicore architecture: a survey (Conference Presentation)

    NASA Astrophysics Data System (ADS)

    Akil, Mohamed

    2017-05-01

    Real-time processing is getting more and more important in many image processing applications. Image segmentation is one of the most fundamental tasks in image analysis. As a consequence, many different approaches for image segmentation have been proposed. The watershed transform is a well-known image segmentation tool and a very data-intensive task. To accelerate watershed algorithms and obtain real-time processing, parallel architectures and programming models for multicore computing have been developed. This paper focuses on a survey of approaches for the parallel implementation of sequential watershed algorithms on multicore general-purpose CPUs: homogeneous multicore processors with shared memory. To achieve an efficient parallel implementation, it is necessary to explore different strategies (parallelization/distribution/distributed scheduling) combined with different acceleration and optimization techniques to enhance parallelism. In this paper, we compare various parallelizations of sequential watershed algorithms on shared-memory multicore architectures. We analyze the performance measurements of each parallel implementation and the impact of the different sources of overhead on its performance. In this comparison study, we also discuss the advantages and disadvantages of the parallel programming models; in particular, we compare OpenMP (an application programming interface for multiprocessing) with Pthreads (POSIX Threads) to illustrate the impact of each parallel programming model on the performance of the parallel implementations.
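
    As a minimal illustration of the OpenMP side of such a comparison (a generic C sketch, not code from the survey), a single directive parallelizes a pixel loop, whereas an equivalent Pthreads version would have to create the threads, partition the index range, and join by hand:

        #include <omp.h>
        #include <stdio.h>

        #define N 1048576

        int main(void) {
            static float in[N], out[N];
            /* Shared-memory loop parallelism: iterations are split among the
               cores; a Pthreads version would do this partitioning manually. */
            #pragma omp parallel for schedule(static)
            for (int i = 1; i < N - 1; i++)
                out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0f;  /* toy 1-D filter */
            printf("ran with up to %d threads\n", omp_get_max_threads());
            return 0;
        }

    Compiled with OpenMP enabled (e.g., cc -fopenmp), the loop runs multithreaded; without it, the pragma is ignored and the same code runs serially.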

  1. Multiprocessor speed-up, Amdahl's Law, and the Activity Set Model of parallel program behavior

    NASA Technical Reports Server (NTRS)

    Gelenbe, Erol

    1988-01-01

    An important issue in the effective use of parallel processing is the estimation of the speed-up one may expect as a function of the number of processors used. Amdahl's Law has traditionally provided a guideline to this issue, although it appears excessively pessimistic in the light of recent experimental results. In this note, Amdahl's Law is amended by giving greater importance to the capacity of a program to make effective use of parallel processing, while also recognizing that imbalance in the workload of the processors is bound to occur. An activity set model of parallel program behavior is then introduced, along with the corresponding parallelism index of a program, leading to upper and lower bounds on the speed-up.
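
    For reference, the classical Amdahl speedup for a parallel fraction f on p processors is S(p) = 1 / ((1 - f) + f/p). The C sketch below illustrates the kind of amendment described with a hypothetical imbalance factor b >= 1 of my own choosing (not the paper's activity-set formulation), which stretches the parallel term:

        #include <stdio.h>

        /* Amdahl speedup with a crude load-imbalance factor b >= 1: the
           parallel part runs at the pace of the most loaded processor. */
        static double speedup(double f, int p, double b) {
            return 1.0 / ((1.0 - f) + b * f / p);
        }

        int main(void) {
            for (int p = 1; p <= 64; p *= 2)
                printf("p=%2d  balanced=%6.2f  imbalanced=%6.2f\n",
                       p, speedup(0.95, p, 1.0), speedup(0.95, p, 1.2));
            return 0;
        }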

  2. Experiences with hypercube operating system instrumentation

    NASA Technical Reports Server (NTRS)

    Reed, Daniel A.; Rudolph, David C.

    1989-01-01

    The difficulty of conceptualizing the interactions among a large number of processors makes it hard both to identify the sources of inefficiencies and to determine how a parallel program could be made more efficient. This paper describes an instrumentation system that can trace the execution of distributed-memory parallel programs by recording the occurrence of parallel program events. The resulting event traces can be used to compile summary statistics that provide a global view of program performance. In addition, visualization tools permit the graphic display of event traces. Visual presentation of performance data is particularly useful, indeed necessary, for large-scale parallel computers; the enormous volume of performance data mandates visual display.

  3. Communications oriented programming of parallel iterative solutions of sparse linear systems

    NASA Technical Reports Server (NTRS)

    Patrick, M. L.; Pratt, T. W.

    1986-01-01

    Parallel algorithms are developed for a class of scientific computational problems by partitioning the problems into smaller problems which may be solved concurrently. The effectiveness of the resulting parallel solutions is determined by the amount and frequency of communication and synchronization and the extent to which communication can be overlapped with computation. Three different parallel algorithms for solving the same class of problems are presented, and their effectiveness is analyzed from this point of view. The algorithms are programmed using a new programming environment. Run-time statistics and experience obtained from the execution of these programs assist in measuring the effectiveness of these algorithms.

  4. Parallel programming of saccades during natural scene viewing: evidence from eye movement positions.

    PubMed

    Wu, Esther X W; Gilani, Syed Omer; van Boxtel, Jeroen J A; Amihai, Ido; Chua, Fook Kee; Yen, Shih-Cheng

    2013-10-24

    Previous studies have shown that saccade plans during natural scene viewing can be programmed in parallel. This evidence comes mainly from temporal indicators, i.e., fixation durations and latencies. In the current study, we asked whether eye movement positions recorded during scene viewing also reflect parallel programming of saccades. As participants viewed scenes in preparation for a memory task, their inspection of the scene was suddenly disrupted by a transition to another scene. We examined whether saccades after the transition were invariably directed immediately toward the center or were contingent on saccade onset times relative to the transition. The results, which showed a dissociation in eye movement behavior between two groups of saccades after the scene transition, supported the parallel programming account. Saccades with relatively long onset times (>100 ms) after the transition were directed immediately toward the center of the scene, probably to restart scene exploration. Saccades with short onset times (<100 ms) moved to the center only one saccade later. Our data on eye movement positions provide novel evidence of parallel programming of saccades during scene viewing. Additionally, results from the analyses of intersaccadic intervals were also consistent with the parallel programming hypothesis.

  5. PyPele Rewritten To Use MPI

    NASA Technical Reports Server (NTRS)

    Hockney, George; Lee, Seungwon

    2008-01-01

    A computer program known as PyPele, originally written as a Python-language extension module of a C++ program, has been rewritten in pure Python. The original version of PyPele dispatches and coordinates parallel-processing tasks on cluster computers and provides a conceptual framework for spacecraft-mission-design and -analysis software tools to run in an embarrassingly parallel mode. The original version of PyPele uses SSH (Secure Shell, a set of standards and an associated network protocol for establishing a secure channel between a local and a remote computer) to coordinate parallel processing. Instead of SSH, the present Python version of PyPele uses the Message Passing Interface (MPI, an unofficial de facto standard language-independent application programming interface for message passing on a parallel computer) while keeping the same user interface. The use of MPI instead of SSH and the preservation of the original PyPele user interface make it possible for parallel application programs written previously for the original version of PyPele to run on MPI-based cluster computers. As a result, engineers using the previously written application programs can take advantage of embarrassing parallelism without needing to rewrite those programs.
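
    A minimal C/MPI sketch of the embarrassingly parallel pattern PyPele supports (an illustration, not PyPele code): each rank claims a strided share of independent tasks, and a single collective call combines the results.

        #include <mpi.h>
        #include <stdio.h>

        int main(int argc, char **argv) {
            int rank, size;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            /* No inter-task communication: rank r handles tasks r, r+size, ... */
            enum { NTASKS = 128 };
            double local = 0.0;
            for (int t = rank; t < NTASKS; t += size)
                local += 0.5 * t;          /* stand-in for one analysis task */

            double total;
            MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
            if (rank == 0) printf("combined result: %g\n", total);
            MPI_Finalize();
            return 0;
        }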

  6. TEMPEST Level-0 Theory

    DTIC Science & Technology

    2011-11-01

    The report defines the trajectory of the ship-fixed reference system relative to an earth-fixed reference system. One axis system is fixed to the ship and moves with all the motions of the ship; the earth-fixed axis system is fixed to the earth; a third, intermediate axis system is also required. An empirical term is added to account for the turbulence in the propeller slipstream.

  7. A survey of parallel programming tools

    NASA Technical Reports Server (NTRS)

    Cheng, Doreen Y.

    1991-01-01

    This survey examines 39 parallel programming tools. Focus is placed on those tool capabilities needed for parallel scientific programming rather than for general computer science. The tools are classified with the current and future needs of the Numerical Aerodynamic Simulator (NAS) in mind: existing and anticipated NAS supercomputers and workstations; operating systems; programming languages; and applications. They are divided into four categories: suggested acquisitions; tools already brought in; tools worth tracking; and tools eliminated from further consideration at this time.

  8. Backtracking and Re-execution in the Automatic Debugging of Parallelized Programs

    NASA Technical Reports Server (NTRS)

    Matthews, Gregory; Hood, Robert; Johnson, Stephen; Leggett, Peter; Biegel, Bryan (Technical Monitor)

    2002-01-01

    In this work we describe a new approach that uses relative debugging to find differences in computation between a serial program and a parallel version of that program. We use a combination of re-execution and backtracking in order to find the first difference in computation that may ultimately lead to an incorrect value that the user has indicated. In our prototype implementation we use static analysis information from a parallelization tool in order to perform the backtracking as well as the mapping required between serial and parallel computations.

  9. Performance Modeling and Measurement of Parallelized Code for Distributed Shared Memory Multiprocessors

    NASA Technical Reports Server (NTRS)

    Waheed, Abdul; Yan, Jerry

    1998-01-01

    This paper presents a model to evaluate the performance and overhead of parallelizing sequential code using compiler directives for multiprocessing on distributed shared memory (DSM) systems. With the increasing popularity of shared address space architectures, it is essential to understand their performance impact on programs that benefit from shared memory multiprocessing. We present a simple model to characterize the performance of programs that are parallelized using compiler directives for shared memory multiprocessing. We parallelized the sequential implementation of the NAS benchmarks using native Fortran77 compiler directives for an Origin2000, a DSM system based on a cache-coherent Non-Uniform Memory Access (ccNUMA) architecture. We report measurement-based performance of these parallelized benchmarks from four perspectives: efficacy of the parallelization process; scalability; parallelization overhead; and comparison with hand-parallelized and -optimized versions of the same benchmarks. Our results indicate that sequential programs can conveniently be parallelized for DSM systems using compiler directives, but realizing performance gains as predicted by the performance model depends primarily on minimizing architecture-specific data locality overhead.

  10. An OpenACC-Based Unified Programming Model for Multi-accelerator Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, Jungwon; Lee, Seyong; Vetter, Jeffrey S

    2015-01-01

    This paper proposes a novel SPMD programming model of OpenACC. Our model integrates the different granularities of parallelism from vector-level parallelism to node-level parallelism into a single, unified model based on OpenACC. It allows programmers to write programs for multiple accelerators using a uniform programming model whether they are in shared or distributed memory systems. We implement a prototype of our model and evaluate its performance with a GPU-based supercomputer using three benchmark applications.
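
    For orientation, a minimal OpenACC loop in C (a generic sketch; the paper's unified SPMD extensions are not shown) illustrates the directive style the model builds on:

        #include <stdio.h>

        #define N 1000000

        int main(void) {
            static float a[N], b[N];
            for (int i = 0; i < N; i++) a[i] = (float)i;

            /* One directive expresses the offloaded, data-parallel loop; the
               compiler maps it onto gangs and vector lanes of the device. */
            #pragma acc parallel loop copyin(a[0:N]) copyout(b[0:N])
            for (int i = 0; i < N; i++)
                b[i] = 2.0f * a[i];

            printf("b[42] = %.1f\n", b[42]);   /* expect 84.0 */
            return 0;
        }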

  11. Comparing the OpenMP, MPI, and Hybrid Programming Paradigm on an SMP Cluster

    NASA Technical Reports Server (NTRS)

    Jost, Gabriele; Jin, Hao-Qiang; anMey, Dieter; Hatay, Ferhat F.

    2003-01-01

    Clusters of SMP (Symmetric Multi-Processors) nodes provide support for a wide range of parallel programming paradigms. The shared address space within each node is suitable for OpenMP parallelization. Message passing can be employed within and across the nodes of a cluster. Multiple levels of parallelism can be achieved by combining message passing and OpenMP parallelization. Which programming paradigm is the best will depend on the nature of the given problem, the hardware components of the cluster, the network, and the available software. In this study we compare the performance of different implementations of the same CFD benchmark application, using the same numerical algorithm but employing different programming paradigms.
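
    A minimal hybrid sketch in C (illustrative, not the benchmark code): MPI carries the coarse-grained parallelism across the SMP nodes, while an OpenMP directive exploits the shared address space within each node.

        #include <mpi.h>
        #include <omp.h>
        #include <stdio.h>

        int main(int argc, char **argv) {
            int provided, rank;
            /* FUNNELED: only the master thread of each rank makes MPI calls. */
            MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            double local = 0.0;
            /* Loop-level OpenMP parallelism inside each MPI process. */
            #pragma omp parallel for reduction(+:local)
            for (int i = 0; i < 1000000; i++)
                local += 1.0e-6;            /* stand-in for per-node work */

            double global;
            /* Message passing combines per-node results across the cluster. */
            MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
            if (rank == 0) printf("global sum = %g\n", global);
            MPI_Finalize();
            return 0;
        }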

  12. OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems.

    PubMed

    Stone, John E; Gohara, David; Shi, Guochun

    2010-05-01

    We provide an overview of the key architectural features of recent microprocessor designs and describe the programming model and abstractions provided by OpenCL, a new parallel programming standard targeting these architectures.
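
    The heart of the OpenCL model is a kernel executed by many work-items on whatever device is selected. A minimal host program in C (a generic sketch, not from the paper; error checks and cleanup omitted for brevity) might look like:

        #include <CL/cl.h>
        #include <stdio.h>

        /* Kernel source: one work-item scales one element; the same source
           runs on CPUs, GPUs, and other OpenCL devices. */
        static const char *src =
            "__kernel void scale(__global float *v, const float s) {\n"
            "    size_t i = get_global_id(0);\n"
            "    v[i] = s * v[i];\n"
            "}\n";

        int main(void) {
            enum { N = 1024 };
            float data[N];
            for (int i = 0; i < N; i++) data[i] = 1.0f;

            cl_platform_id plat;  cl_device_id dev;
            clGetPlatformIDs(1, &plat, NULL);
            clGetDeviceIDs(plat, CL_DEVICE_TYPE_DEFAULT, 1, &dev, NULL);
            cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
            cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);

            cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
                                        sizeof data, data, NULL);
            cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
            clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
            cl_kernel k = clCreateKernel(prog, "scale", NULL);
            float s = 2.0f;
            clSetKernelArg(k, 0, sizeof buf, &buf);
            clSetKernelArg(k, 1, sizeof s, &s);

            size_t gsz = N;
            clEnqueueNDRangeKernel(q, k, 1, NULL, &gsz, NULL, 0, NULL, NULL);
            clEnqueueReadBuffer(q, buf, CL_TRUE, 0, sizeof data, data, 0, NULL, NULL);
            printf("data[0] = %.1f\n", data[0]);   /* expect 2.0 */
            return 0;
        }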

  13. Comparing the OpenMP, MPI, and Hybrid Programming Paradigm on an SMP Cluster

    NASA Technical Reports Server (NTRS)

    Jost, Gabriele; Jin, Haoqiang; anMey, Dieter; Hatay, Ferhat F.

    2003-01-01

    With the advent of parallel hardware and software technologies, users are faced with the challenge of choosing a programming paradigm best suited for the underlying computer architecture. With the current trend in parallel computer architectures towards clusters of shared memory symmetric multi-processors (SMP), parallel programming techniques have evolved to support parallelism beyond a single level. Which programming paradigm is best will depend on the nature of the given problem, the hardware architecture, and the available software. In this study we compare different programming paradigms for the parallelization of a selected benchmark application on a cluster of SMP nodes. We compare the timings of different implementations of the same CFD benchmark application employing the same numerical algorithm on a cluster of Sun Fire SMP nodes. The rest of the paper is structured as follows: in section 2 we briefly discuss the programming models under consideration; we describe our compute platform in section 3; the different implementations of our benchmark code are described in section 4 and the performance results are presented in section 5; we conclude our study in section 6.

  14. Rubus: A compiler for seamless and extensible parallelism.

    PubMed

    Adnan, Muhammad; Aslam, Faisal; Nawaz, Zubair; Sarwar, Syed Mansoor

    2017-01-01

    Nowadays, a typical processor may have multiple processing cores on a single chip. Furthermore, a special-purpose processing unit called the Graphics Processing Unit (GPU), originally designed for 2D/3D games, is now available for general-purpose use in computers and mobile devices. However, traditional programming languages, which were designed to work with machines having single-core CPUs, cannot efficiently utilize the parallelism available on multi-core processors. Therefore, to exploit the extraordinary processing power of multi-core processors, researchers are working on new tools and techniques to facilitate parallel programming. To this end, languages like CUDA and OpenCL have been introduced, which can be used to write code with parallelism. The main shortcoming of these languages is that the programmer needs to specify all the complex details manually in order to parallelize the code across multiple cores. Therefore, the code written in these languages is difficult to understand, debug, and maintain. Furthermore, parallelizing legacy code can require rewriting a significant portion of it in CUDA or OpenCL, which can consume significant time and resources. Thus, the amount of parallelism achieved is proportional to the skills of the programmer and the time spent on code optimizations. This paper proposes a new open-source compiler, Rubus, to achieve seamless parallelism. The Rubus compiler relieves the programmer from manually specifying the low-level details. It analyses and transforms a sequential program into a parallel program automatically, without any user intervention. This achieves massive speedup and better utilization of the underlying hardware without requiring a programmer's expertise in parallel programming. For five different benchmarks, an average speedup of 34.54 times was achieved by Rubus compared to Java on a basic GPU having only 96 cores, and for a matrix multiplication benchmark an average speedup of 84 times was achieved on the same GPU. Moreover, Rubus achieves this performance without drastically increasing the memory footprint of a program.

  15. Rubus: A compiler for seamless and extensible parallelism

    PubMed Central

    Adnan, Muhammad; Aslam, Faisal; Sarwar, Syed Mansoor

    2017-01-01

    Nowadays, a typical processor may have multiple processing cores on a single chip. Furthermore, a special-purpose processing unit called the Graphics Processing Unit (GPU), originally designed for 2D/3D games, is now available for general-purpose use in computers and mobile devices. However, traditional programming languages, which were designed to work with machines having single-core CPUs, cannot efficiently utilize the parallelism available on multi-core processors. Therefore, to exploit the extraordinary processing power of multi-core processors, researchers are working on new tools and techniques to facilitate parallel programming. To this end, languages like CUDA and OpenCL have been introduced, which can be used to write code with parallelism. The main shortcoming of these languages is that the programmer needs to specify all the complex details manually in order to parallelize the code across multiple cores. Therefore, the code written in these languages is difficult to understand, debug, and maintain. Furthermore, parallelizing legacy code can require rewriting a significant portion of it in CUDA or OpenCL, which can consume significant time and resources. Thus, the amount of parallelism achieved is proportional to the skills of the programmer and the time spent on code optimizations. This paper proposes a new open-source compiler, Rubus, to achieve seamless parallelism. The Rubus compiler relieves the programmer from manually specifying the low-level details. It analyses and transforms a sequential program into a parallel program automatically, without any user intervention. This achieves massive speedup and better utilization of the underlying hardware without requiring a programmer's expertise in parallel programming. For five different benchmarks, an average speedup of 34.54 times was achieved by Rubus compared to Java on a basic GPU having only 96 cores, and for a matrix multiplication benchmark an average speedup of 84 times was achieved on the same GPU. Moreover, Rubus achieves this performance without drastically increasing the memory footprint of a program. PMID:29211758

  16. Efficient partitioning and assignment of programs for multiprocessor execution

    NASA Technical Reports Server (NTRS)

    Standley, Hilda M.

    1993-01-01

    The general problem studied is that of segmenting or partitioning programs for distribution across a multiprocessor system. Efficient partitioning and assignment of program elements are of great importance, since the time consumed in this overhead activity may easily dominate the computation, effectively eliminating any gains made by the use of parallelism. In this study, the partitioning of sequentially structured programs (written in FORTRAN) is evaluated. Heuristics developed for similar applications are examined. Finally, a model for queueing networks with finite queues is developed, which may be used to analyze multiprocessor system architectures that take a shared-memory approach to the partitioning problem. The properties of sequentially written programs form obstacles to large-scale (procedure- or subroutine-level) parallelization. Data dependencies of even the minutest nature, reflecting the sequential development of the program, severely limit parallelism. The design of heuristic algorithms is tied to the experience gained in the parallel splitting. Parallelism obtained through the physical separation of data has seen some success, especially at the data element level. Data parallelism on a grander scale requires models that accurately reflect the effects of blocking caused by finite queues. A model for approximating the performance of finite queueing networks is developed. This model makes use of the decomposition approach combined with the efficiency of product-form solutions.

  17. A mechanism for efficient debugging of parallel programs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Miller, B.P.; Choi, J.D.

    1988-01-01

    This paper addresses the design and implementation of an integrated debugging system for parallel programs running on shared memory multi-processors (SMMP). The authors describe the use of flowback analysis to provide information on causal relationships between events in a program's execution without re-executing the program for debugging. The authors introduce a mechanism called incremental tracing that, by using semantic analyses of the debugged program, makes flowback analysis practical with only a small amount of trace generated during execution. They extend flowback analysis to apply to parallel programs and describe a method to detect race conditions in the interactions of the cooperating processes.

  18. OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems

    PubMed Central

    Stone, John E.; Gohara, David; Shi, Guochun

    2010-01-01

    We provide an overview of the key architectural features of recent microprocessor designs and describe the programming model and abstractions provided by OpenCL, a new parallel programming standard targeting these architectures. PMID:21037981

  19. Vector scattering analysis of TPF coronagraph pupil masks

    NASA Astrophysics Data System (ADS)

    Ceperley, Daniel P.; Neureuther, Andrew R.; Lieber, Michael D.; Kasdin, N. Jeremy; Shih, Ta-Ming

    2004-10-01

    Rigorous finite-difference time-domain electromagnetic simulation is used to simulate the scattering from prototypical pupil mask cross-section geometries and to quantify the differences from the normally assumed ideal on-off behavior. Shaped pupil-plane masks are a promising technology for the TPF coronagraph mission. However, the stringent requirements placed on the optics require that the detailed edge-effect behavior of these masks be examined carefully. End-to-end optical system simulation is essential, and an important aspect is the polarization- and cross-section-dependent edge effects that are the subject of this paper. Pupil-plane masks are similar in many respects to photomasks used in the integrated circuit industry. Simulation capabilities such as the FDTD simulator TEMPEST, developed for analyzing polarization and intensity imbalance effects in nonplanar phase-shifting photomasks, offer a leg up in analyzing coronagraph masks. However, the accuracy in magnitude and phase required for modeling a coronagraph system is extremely demanding, and previously inconsequential errors may be of the same order of magnitude as the physical phenomena under study. In this paper, the effects of thick masks, finite-conductivity metals, and various cross-section geometries on the transmission of pupil-plane masks are illustrated. Undercutting the edge shape of Cr masks improves the effective opening width to within λ/5 of the actual opening, but TE and TM polarizations require opposite compensations. The deviation from ideal is examined at the reference plane of the mask opening. Numerical errors in TEMPEST, such as numerical dispersion, perfectly-matched-layer reflections, and source haze, are also discussed, along with techniques for mitigating their impacts.

  20. Genetic algorithms using SISAL parallel programming language

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tejada, S.

    1994-05-06

    Genetic algorithms are a mathematical optimization technique developed by John Holland at the University of Michigan [1]. The SISAL programming language possesses many of the characteristics desired to implement genetic algorithms. SISAL is a deterministic, functional programming language which is inherently parallel. Because SISAL is functional and based on mathematical concepts, genetic algorithms can be efficiently translated into the language. Several of the steps involved in genetic algorithms, such as mutation, crossover, and fitness evaluation, can be parallelized using SISAL. In this paper I discuss the implementation and performance of parallel genetic algorithms in SISAL.
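
    Since no SISAL code is reproduced here, the sketch below shows the same idea, independent fitness evaluations computed in parallel, in C with OpenMP (a swapped-in technique chosen for illustration), using the toy OneMax fitness function:

        #include <omp.h>
        #include <stdio.h>
        #include <stdlib.h>

        #define POP   256
        #define GENES 32

        /* Fitness of one individual: count of 1-genes (the "OneMax" toy
           problem); each evaluation is independent, so the loop parallelizes. */
        static int fitness(const unsigned char *g) {
            int f = 0;
            for (int i = 0; i < GENES; i++) f += g[i];
            return f;
        }

        int main(void) {
            static unsigned char pop[POP][GENES];
            static int fit[POP];
            for (int i = 0; i < POP; i++)
                for (int j = 0; j < GENES; j++) pop[i][j] = rand() & 1;

            /* The fitness-evaluation step, one of the parallelizable GA steps. */
            #pragma omp parallel for
            for (int i = 0; i < POP; i++)
                fit[i] = fitness(pop[i]);

            int best = 0;
            for (int i = 1; i < POP; i++) if (fit[i] > fit[best]) best = i;
            printf("best fitness: %d\n", fit[best]);
            return 0;
        }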

  1. An Expert System for the Development of Efficient Parallel Code

    NASA Technical Reports Server (NTRS)

    Jost, Gabriele; Chun, Robert; Jin, Hao-Qiang; Labarta, Jesus; Gimenez, Judit

    2004-01-01

    We have built the prototype of an expert system to assist the user in the development of efficient parallel code. The system was integrated into the parallel programming environment that is currently being developed at NASA Ames. The expert system interfaces to tools for automatic parallelization and performance analysis. It uses static program structure information and performance data in order to automatically determine causes of poor performance and to make suggestions for improvements. In this paper we give an overview of our programming environment, describe the prototype implementation of our expert system, and demonstrate its usefulness with several case studies.

  2. Optics Program Modified for Multithreaded Parallel Computing

    NASA Technical Reports Server (NTRS)

    Lou, John; Bedding, Dave; Basinger, Scott

    2006-01-01

    A powerful high-performance computer program for simulating and analyzing adaptive and controlled optical systems has been developed by modifying the serial version of the Modeling and Analysis for Controlled Optical Systems (MACOS) program to impart capabilities for multithreaded parallel processing on computing systems ranging from supercomputers down to Symmetric Multiprocessing (SMP) personal computers. The modifications included the incorporation of OpenMP, a portable and widely supported application programming interface that can be used to explicitly add multithreaded parallelism to an application program under a shared-memory programming model. OpenMP was applied to parallelize ray-tracing calculations, one of the major computing components in MACOS. Multithreading is also used in the diffraction propagation of light in MACOS, based on pthreads (POSIX Threads, where POSIX denotes the Portable Operating System Interface for UNIX). In tests of the parallelized version of MACOS, the speedup in ray-tracing calculations was found to be linear, or proportional to the number of processors, while the speedup in diffraction calculations ranged from 50 to 60 percent, depending on the type and number of processors. The parallelized version of MACOS is portable, and, to the user, its interface is basically the same as that of the original serial version of MACOS.
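
    A minimal pthreads sketch in C (illustrative; not MACOS source) of the pattern described: each thread processes a contiguous block of rays, and the partial results are merged after pthread_join.

        #include <pthread.h>
        #include <stdio.h>

        #define NTHREADS 4
        #define NRAYS    1000000

        static double partial[NTHREADS];

        /* Each thread traces its block of rays into a private accumulator,
           then stores the result in its own slot of the shared array. */
        static void *trace_block(void *arg) {
            long id = (long)arg;
            double acc = 0.0;
            long lo = id * (NRAYS / NTHREADS), hi = lo + NRAYS / NTHREADS;
            for (long r = lo; r < hi; r++)
                acc += 1.0 / (r + 1.0);     /* stand-in for tracing one ray */
            partial[id] = acc;
            return NULL;
        }

        int main(void) {
            pthread_t tid[NTHREADS];
            for (long i = 0; i < NTHREADS; i++)
                pthread_create(&tid[i], NULL, trace_block, (void *)i);
            double sum = 0.0;
            for (long i = 0; i < NTHREADS; i++) {
                pthread_join(tid[i], NULL); /* merge only after the thread ends */
                sum += partial[i];
            }
            printf("sum = %f\n", sum);
            return 0;
        }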

  3. Solving Integer Programs from Dependence and Synchronization Problems

    DTIC Science & Technology

    1993-03-01

    Solving Integer Programs from Dependence and Synchronization Problems. Jaspal Subhlok, March 1993, CMU-CS-93-130, School of Computer Science. The method is an exact and efficient way of solving integer programming problems arising in dependence and synchronization analysis of parallel programs. Keywords: exact dependence testing, integer programming, parallelizing compilers, parallel program analysis, synchronization analysis.

  4. Hamlet's Transformation.

    NASA Astrophysics Data System (ADS)

    Usher, P. D.

    1997-12-01

    William Shakespeare's Hamlet has much evidence to suggest that the Bard was aware of the cosmological models of his time, specifically the geocentric bounded Ptolemaic and Tychonic models, and the infinite Diggesian. Moreover, Shakespeare describes how the Ptolemaic model is to be transformed to the Diggesian. Hamlet's "transformation" is the reason that Claudius, who personifies the Ptolemaic model, summons Rosencrantz and Guildenstern, who personify the Tychonic. Pantometria, written by Leonard Digges and his son Thomas in 1571, contains the first technical use of the word "transformation." At age thirty, Thomas Digges went on to propose his Perfit Description, as alluded to in Act Five where Hamlet's age is given as thirty. In Act Five as well, the words "bore" and "arms" refer to Thomas' vocation as muster-master and his scientific interest in ballistics. England's leading astronomer was also the father of the poet whose encomium introduced the First Folio of 1623. His oldest child Dudley became a member of the Virginia Company and facilitated the writing of The Tempest. Taken as a whole, such manifold connections to Thomas Digges support Hotson's contention that Shakespeare knew the Digges family. Rosencrantz and Guildenstern in Hamlet bear Danish names because they personify the Danish model, while the king's name is latinized like that of Claudius Ptolemaeus. The reason Shakespeare anglicized "Amleth" to "Hamlet" was because he saw a parallel between Book Three of Saxo Grammaticus and the eventual triumph of the Diggesian model. But Shakespeare eschewed Book Four, creating this particular ending from an infinity of other possibilities because it "suited his purpose," viz. to celebrate the concept of a boundless universe of stars like the Sun.

  5. The FORCE - A highly portable parallel programming language

    NASA Technical Reports Server (NTRS)

    Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

    1989-01-01

    This paper explains why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.

  6. The FORCE: A highly portable parallel programming language

    NASA Technical Reports Server (NTRS)

    Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

    1989-01-01

    Here, it is explained why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.

  7. Characterizing and Mitigating Work Time Inflation in Task Parallel Programs

    DOE PAGES

    Olivier, Stephen L.; de Supinski, Bronis R.; Schulz, Martin; ...

    2013-01-01

    Task parallelism raises the level of abstraction in shared memory parallel programming to simplify the development of complex applications. However, task parallel applications can exhibit poor performance due to thread idleness, scheduling overheads, and work time inflation – additional time spent by threads in a multithreaded computation beyond the time required to perform the same work in a sequential computation. We identify the contributions of each factor to lost efficiency in various task parallel OpenMP applications and diagnose the causes of work time inflation in those applications. Increased data access latency can cause significant work time inflation in NUMA systems. Our locality framework for task parallel OpenMP programs mitigates this cause of work time inflation. Our extensions to the Qthreads library demonstrate that locality-aware scheduling can improve performance up to 3X compared to the Intel OpenMP task scheduler.

  8. Distributed and parallel Ada and the Ada 9X recommendations

    NASA Technical Reports Server (NTRS)

    Volz, Richard A.; Goldsack, Stephen J.; Theriault, R.; Waldrop, Raymond S.; Holzbacher-Valero, A. A.

    1992-01-01

    Recently, the DoD has sponsored work towards a new version of Ada, intended to support the construction of distributed systems. The revised version, often called Ada 9X, will become the new standard sometime in the 1990s. It is intended that Ada 9X should provide language features giving limited support for distributed system construction. The requirements for such features are given. Many of the most advanced computer applications involve embedded systems that are comprised of parallel processors or networks of distributed computers. If Ada is to become the widely adopted language envisioned by many, it is essential that suitable compilers and tools be available to facilitate the creation of distributed and parallel Ada programs for these applications. The major language issues impacting distributed and parallel programming are reviewed, and some principles upon which distributed/parallel language systems should be built are suggested. Based upon these, alternative language concepts for distributed/parallel programming are analyzed.

  9. Implementation and performance of parallel Prolog interpreter

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wei, S.; Kale, L.V.; Balkrishna, R.

    1988-01-01

    In this paper, the authors discuss the implementation of a parallel Prolog interpreter on different parallel machines. The implementation is based on the REDUCE-OR process model, which exploits both AND and OR parallelism in logic programs. It is machine independent, as it runs on top of the chare kernel, a machine-independent parallel programming system. The authors also give the performance of the interpreter running a diverse set of benchmark programs on parallel machines, including the shared memory systems Alliant FX/8, Sequent, and MultiMax, and the non-shared-memory Intel iPSC/32 hypercube, in addition to its performance on a multiprocessor simulation system.

  10. Support for Debugging Automatically Parallelized Programs

    NASA Technical Reports Server (NTRS)

    Hood, Robert; Jost, Gabriele

    2001-01-01

    This viewgraph presentation provides information on support tools available for the automatic parallelization of computer programs. CAPTools, a support tool developed at the University of Greenwich, transforms, with user guidance, existing sequential Fortran code into parallel message-passing code. Comparison routines are then run for debugging purposes, in essence ensuring that the code transformation was accurate.

  11. Parallelizing serial code for a distributed processing environment with an application to high frequency electromagnetic scattering

    NASA Astrophysics Data System (ADS)

    Work, Paul R.

    1991-12-01

    This thesis investigates the parallelization of existing serial programs in computational electromagnetics for use in a parallel environment. Existing algorithms for calculating the radar cross section of an object are covered, and a ray-tracing code is chosen for implementation on a parallel machine. Current parallel architectures are introduced and a suitable parallel machine is selected for the implementation of the chosen ray-tracing algorithm. The standard techniques for the parallelization of serial codes are discussed, including load balancing and decomposition considerations, and appropriate methods for the parallelization effort are selected. A load balancing algorithm is modified to increase the efficiency of the application, and a high level design of the structure of the serial program is presented. A detailed design of the modifications for the parallel implementation is also included, with both the high level and the detailed design specified in a high level design language called UNITY. The correctness of the design is proven using UNITY and standard logic operations. The theoretical and empirical results show that it is possible to achieve an efficient parallel application for a serial computational electromagnetic program where the characteristics of the algorithm and the target architecture critically influence the development of such an implementation.

  12. The Automated Instrumentation and Monitoring System (AIMS): Design and Architecture. 3.2

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Schmidt, Melisa; Schulbach, Cathy; Bailey, David (Technical Monitor)

    1997-01-01

    Whether a researcher is designing the 'next parallel programming paradigm', another 'scalable multiprocessor', or investigating resource allocation algorithms for multiprocessors, a facility that enables parallel program execution to be captured and displayed is invaluable. Careful analysis of such information can help computer and software architects to capture, and therefore exploit, behavioral variations among and within various parallel programs to take advantage of specific hardware characteristics. A software tool-set that facilitates performance evaluation of parallel applications on multiprocessors has been put together at NASA Ames Research Center under the sponsorship of NASA's High Performance Computing and Communications Program over the past five years. The Automated Instrumentation and Monitoring System (AIMS) has three major software components: a source code instrumentor which automatically inserts active event recorders into program source code before compilation; a run-time performance monitoring library which collects performance data; and a visualization tool-set which reconstructs program execution based on the data collected. Besides being used as a prototype for developing new techniques for instrumenting, monitoring and presenting parallel program execution, AIMS is also being incorporated into the run-time environments of various hardware testbeds to evaluate their impact on user productivity. Currently, the execution of FORTRAN and C programs on the Intel Paragon and PALM workstations can be automatically instrumented and monitored. Performance data thus collected can be displayed graphically on various workstations. The process of performance tuning with AIMS is illustrated using various NAS Parallel Benchmarks. This report includes a description of the internal architecture of AIMS and a listing of the source code.

  13. What Multilevel Parallel Programs do when you are not Watching: A Performance Analysis Case Study Comparing MPI/OpenMP, MLP, and Nested OpenMP

    NASA Technical Reports Server (NTRS)

    Jost, Gabriele; Labarta, Jesus; Gimenez, Judit

    2004-01-01

    With the current trend in parallel computer architectures towards clusters of shared memory symmetric multi-processors, parallel programming techniques have evolved that support parallelism beyond a single level. When comparing the performance of applications based on different programming paradigms, it is important to differentiate between the influence of the programming model itself and other factors, such as implementation-specific behavior of the operating system (OS) or architectural issues. Rewriting a large scientific application to employ a new programming paradigm is usually a time-consuming and error-prone task. Before embarking on such an endeavor it is important to determine that there is really a gain that would not be possible with the current implementation. A detailed performance analysis is crucial to clarify these issues. The multilevel programming paradigms considered in this study are hybrid MPI/OpenMP, MLP, and nested OpenMP. The hybrid MPI/OpenMP approach is based on using MPI [7] for the coarse-grained parallelization and OpenMP [9] for fine-grained loop-level parallelism. The MPI programming paradigm assumes a private address space for each process; data is transferred by explicitly exchanging messages via calls to the MPI library. This model was originally designed for distributed memory architectures but is also suitable for shared memory systems. The second paradigm under consideration is MLP, which was developed by Taft. The approach is similar to MPI/OpenMP, using a mix of coarse-grain process-level parallelization and loop-level OpenMP parallelization. As is the case with MPI, a private address space is assumed for each process. The MLP approach was developed for ccNUMA architectures and explicitly takes advantage of the availability of shared memory: a shared memory arena accessible by all processes is required, and communication is done by reading from and writing to the shared memory.

  14. Performance Evaluation in Network-Based Parallel Computing

    NASA Technical Reports Server (NTRS)

    Dezhgosha, Kamyar

    1996-01-01

    Network-based parallel computing is emerging as a cost-effective alternative for solving many problems which would otherwise require supercomputers or massively parallel computers. The primary objective of this project has been to conduct experimental research on performance evaluation for clustered parallel computing. First, a testbed was established by augmenting our existing network of Sun SPARC workstations with PVM (Parallel Virtual Machine), a software system for linking clusters of machines. Second, a set of three basic applications was selected, consisting of a parallel search, a parallel sort, and a parallel matrix multiplication. These application programs were implemented in the C programming language under PVM. Third, we conducted performance evaluation under various configurations and problem sizes. Alternative parallel computing models and workload allocations for application programs were explored. The performance metric was limited to elapsed time or response time, which in the context of parallel computing can be expressed in terms of speedup. The results reveal that the overhead of communication latency between processes is in many cases the restricting factor to performance; that is, coarse-grain parallelism, which requires less frequent communication between processes, will result in higher performance in network-based computing. Finally, we are in the final stages of installing an Asynchronous Transfer Mode (ATM) switch and four ATM interfaces (each 155 Mbps), which will allow us to extend our study to newer applications, performance metrics, and configurations.

  15. WFIRST: Science from the Guest Investigator and Parallel Observation Programs

    NASA Astrophysics Data System (ADS)

    Postman, Marc; Nataf, David; Furlanetto, Steve; Milam, Stephanie; Robertson, Brant; Williams, Ben; Teplitz, Harry; Moustakas, Leonidas; Geha, Marla; Gilbert, Karoline; Dickinson, Mark; Scolnic, Daniel; Ravindranath, Swara; Strolger, Louis; Peek, Joshua; Marc Postman

    2018-01-01

    The Wide Field InfraRed Survey Telescope (WFIRST) mission will provide an extremely rich archival dataset that will enable a broad range of scientific investigations beyond the initial objectives of the proposed key survey programs. The scientific impact of WFIRST will thus be significantly expanded by a robust Guest Investigator (GI) archival research program. We will present examples of GI research opportunities ranging from studies of the properties of a variety of Solar System objects, surveys of the outer Milky Way halo, comprehensive studies of cluster galaxies, to unique and new constraints on the epoch of cosmic re-ionization and the assembly of galaxies in the early universe.WFIRST will also support the acquisition of deep wide-field imaging and slitless spectroscopic data obtained in parallel during campaigns with the coronagraphic instrument (CGI). These parallel wide-field imager (WFI) datasets can provide deep imaging data covering several square degrees at no impact to the scheduling of the CGI program. A competitively selected program of well-designed parallel WFI observation programs will, like the GI science above, maximize the overall scientific impact of WFIRST. We will give two examples of parallel observations that could be conducted during a proposed CGI program centered on a dozen nearby stars.

  16. Optimal Elevation and Configuration of Hanford's Double-Shell Tank Waste Mixer Pumps

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Onishi, Yasuo; Yokuda, Satoru T.; Majumder, Catherine A.

    The objective of this study was to compare the mixing performance of the Lawrence pump, which has injection nozzles at the top, with an alternative pump that has injection nozzles at the bottom, and to determine the optimal elevation for the alternative pump. Sixteen cases were evaluated: two sludge thicknesses at eight levels. A two-step evaluation approach was used: Step 1 to evaluate all 16 cases with the non-rotating mixer pump model and Step 2 to further evaluate four of those cases with the more realistic rotating mixer pump model. The TEMPEST code was used.

  17. Analytical collisionless damping rate of geodesic acoustic mode

    NASA Astrophysics Data System (ADS)

    Ren, H.; Xu, X. Q.

    2016-10-01

    Collisionless damping of the geodesic acoustic mode (GAM) is analytically investigated by considering the finite-orbit-width (FOW) resonance effect to 3rd order in the gyro-kinetic equations. A concise and transparent expression for the damping rate is presented for the first time. Good agreement is found between the analytical damping rate and the previous TEMPEST simulation result (Xu et al 2008 Phys. Rev. Lett. 100 215001) for systematic q scans. Our result also shows that the FOW effect must be taken into account to 3rd order to obtain sufficient accuracy.

  18. Parallelized direct execution simulation of message-passing parallel programs

    NASA Technical Reports Server (NTRS)

    Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.

    1994-01-01

    As massively parallel computers proliferate, there is growing interest in finding ways by which the performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing compilers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution where one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine, such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization, specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, the Large Application Parallel Simulation Environment (LAPSE), which we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well, typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.

  19. Using Coarrays to Parallelize Legacy Fortran Applications: Strategy and Case Study

    DOE PAGES

    Radhakrishnan, Hari; Rouson, Damian W. I.; Morris, Karla; ...

    2015-01-01

    This paper summarizes a strategy for parallelizing a legacy Fortran 77 program using the object-oriented (OO) and coarray features that entered Fortran in the 2003 and 2008 standards, respectively. OO programming (OOP) facilitates the construction of an extensible suite of model-verification and performance tests that drive the development. Coarray parallel programming facilitates a rapid evolution from a serial application to a parallel application capable of running on multicore processors and many-core accelerators in shared and distributed memory. We delineate 17 code modernization steps used to refactor and parallelize the program and study the resulting performance. Our initial studies were done using the Intel Fortran compiler on a 32-core shared memory server. Scaling behavior was very poor, and profile analysis using TAU showed that the bottleneck in the performance was due to our implementation of a collective, sequential summation procedure. We were able to improve the scalability and achieve nearly linear speedup by replacing the sequential summation with a parallel, binary tree algorithm. We also tested the Cray compiler, which provides its own collective summation procedure; Intel provides no collective reductions. With Cray, the program shows linear speedup even in distributed-memory execution. We anticipate similar results with other compilers once they support the new collective procedures proposed for Fortran 2015.
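
    The binary-tree summation that removed the bottleneck can be sketched in C with MPI point-to-point messages (an analogue of the paper's coarray version, not its actual code): partial sums meet pairwise, so the total reaches rank 0 in log2(P) steps instead of P - 1.

        #include <mpi.h>
        #include <stdio.h>

        int main(int argc, char **argv) {
            int rank, size;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            double sum = rank + 1.0;        /* each rank's local partial sum */
            /* At step s, ranks with offset s within a 2s-wide group send
               their sum down the tree and drop out; the rest accumulate. */
            for (int step = 1; step < size; step *= 2) {
                if (rank % (2 * step) == step) {
                    MPI_Send(&sum, 1, MPI_DOUBLE, rank - step, 0, MPI_COMM_WORLD);
                    break;
                } else if (rank % (2 * step) == 0 && rank + step < size) {
                    double other;
                    MPI_Recv(&other, 1, MPI_DOUBLE, rank + step, 0,
                             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                    sum += other;
                }
            }
            if (rank == 0)
                printf("tree sum = %g (expect %g)\n", sum, size * (size + 1) / 2.0);
            MPI_Finalize();
            return 0;
        }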

  20. Programming Probabilistic Structural Analysis for Parallel Processing Computer

    NASA Technical Reports Server (NTRS)

    Sues, Robert H.; Chen, Heh-Chyun; Twisdale, Lawrence A.; Chamis, Christos C.; Murthy, Pappu L. N.

    1991-01-01

    The ultimate goal of this research program is to make Probabilistic Structural Analysis (PSA) computationally efficient and hence practical for the design environment by achieving large scale parallelism. The paper identifies the multiple levels of parallelism in PSA, identifies methodologies for exploiting this parallelism, describes the development of a parallel stochastic finite element code, and presents results of two example applications. It is demonstrated that speeds within five percent of those theoretically possible can be achieved. A special-purpose numerical technique, the stochastic preconditioned conjugate gradient method, is also presented and demonstrated to be extremely efficient for certain classes of PSA problems.

  1. Concurrent extensions to the FORTRAN language for parallel programming of computational fluid dynamics algorithms

    NASA Technical Reports Server (NTRS)

    Weeks, Cindy Lou

    1986-01-01

    Experiments were conducted at NASA Ames Research Center to define multi-tasking software requirements for multiple-instruction, multiple-data stream (MIMD) computer architectures. The focus was on specifying solutions for algorithms in the field of computational fluid dynamics (CFD). The program objectives were to allow researchers to produce usable parallel application software as soon as possible after acquiring MIMD computer equipment, to provide researchers with an easy-to-learn and easy-to-use parallel software language which could be implemented on several different MIMD machines, and to enable researchers to list preferred design specifications for future MIMD computer architectures. Analysis of CFD algorithms indicated that extensions of an existing programming language, adaptable to new computer architectures, provided the best solution to meeting program objectives. The CoFORTRAN Language was written in response to these objectives and to provide researchers a means to experiment with parallel software solutions to CFD algorithms on machines with parallel architectures.

  2. Performance Implications of Synchronization Support for Parallel FORTRAN Programs

    DTIC Science & Technology

    1991-06-17

    The applications used in this study are BDNA and FLO52. BDNA is a molecular dynamics simulator for biomolecules in water... parallelism structures and loop granularity. In the BDNA program, most of the parallel loops are not nested and the iterations are 200-1000 instructions long... The BDNA curve in Figure 21 (cumulative percentage, BDNA versus FLO52) shows that for this program only 17% of all...

  3. Parallelization of Program to Optimize Simulated Trajectories (POST3D)

    NASA Technical Reports Server (NTRS)

    Hammond, Dana P.; Korte, John J. (Technical Monitor)

    2001-01-01

    This paper describes the parallelization of the Program to Optimize Simulated Trajectories (POST3D). POST3D uses a gradient-based optimization algorithm that reaches an optimum design point by moving from one design point to the next. The gradient calculations required to complete the optimization process dominate the computational time and have been parallelized using a Single Program Multiple Data (SPMD) approach on a distributed-memory NUMA (non-uniform memory access) architecture. The SGI Origin2000 was used for the tests presented.
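
    Finite-difference gradient components are mutually independent, which is what makes this an easy SPMD target: each rank perturbs its share of the design variables and the pieces are then combined. A hedged sketch in C with MPI; objective() is a hypothetical stand-in for the trajectory simulation, and none of these names come from POST3D:

        #include <mpi.h>
        #include <string.h>
        #include <stdio.h>

        #define N 8   /* number of design variables (illustrative) */

        static double objective(const double *x)   /* hypothetical cost function */
        {
            double f = 0.0;
            for (int i = 0; i < N; i++) f += (x[i] - 1.0) * (x[i] - 1.0);
            return f;
        }

        int main(int argc, char **argv)
        {
            MPI_Init(&argc, &argv);
            int rank, nprocs;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

            double x[N] = {0};                 /* current design point */
            double g_local[N] = {0}, g[N];
            const double h = 1e-6, f0 = objective(x);

            /* Each rank evaluates a round-robin subset of the N perturbed
             * runs; these evaluations dominate the runtime. */
            for (int i = rank; i < N; i += nprocs) {
                double xp[N];
                memcpy(xp, x, sizeof xp);
                xp[i] += h;
                g_local[i] = (objective(xp) - f0) / h;  /* forward difference */
            }
            /* Combine the disjoint contributions onto every rank. */
            MPI_Allreduce(g_local, g, N, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

            if (rank == 0)
                printf("dF/dx0 = %g (expect about -2)\n", g[0]);
            MPI_Finalize();
            return 0;
        }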

  4. Selective, Embedded, Just-In-Time Specialization (SEJITS): Portable Parallel Performance from Sequential, Productive, Embedded Domain-Specific Languages

    DTIC Science & Technology

    2012-12-01

    ...application performance, but usually must rely on efficiency programmers who are experts in explicit parallel programming to achieve it. Since such efficiency...

  5. Empirical valence bond models for reactive potential energy surfaces: a parallel multilevel genetic program approach.

    PubMed

    Bellucci, Michael A; Coker, David F

    2011-07-28

    We describe a new method for constructing empirical valence bond potential energy surfaces using a parallel multilevel genetic program (PMLGP). Genetic programs can be used to perform an efficient search through function space and parameter space to find the best functions and sets of parameters that fit energies obtained by ab initio electronic structure calculations. Building on the traditional genetic program approach, the PMLGP utilizes a hierarchy of genetic programming on two different levels. The lower level genetic programs are used to optimize coevolving populations in parallel, while the higher level genetic program (HLGP) is used to optimize the genetic operator probabilities of the lower level genetic programs. The HLGP allows the algorithm to dynamically learn the mutation or combination of mutations that most effectively increases the fitness of the populations, causing a significant increase in accuracy and efficiency. The algorithm is tested against a standard parallel genetic program with a variety of one-dimensional test cases. Subsequently, the PMLGP is utilized to obtain an accurate empirical valence bond model for proton transfer in 3-hydroxy-gamma-pyrone in the gas phase and in protic solvent. © 2011 American Institute of Physics.
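
    The two-level idea, an upper-level search tuning the operators of a lower-level search, can be shown with a greatly simplified sketch in C: a parameter-fitting genetic search whose mutation rate is adapted from realized fitness gains, standing in for the HLGP. The population size, fitness function, and update rule below are all illustrative, not from the paper:

        #include <stdio.h>
        #include <stdlib.h>
        #include <math.h>

        #define POP  32
        #define DIM  4
        #define GENS 50

        static double fitness(const double *p)       /* toy target: p[i] = i */
        {
            double err = 0.0;
            for (int i = 0; i < DIM; i++) err += (p[i] - i) * (p[i] - i);
            return -err;                             /* higher is better */
        }

        static double urand(void) { return rand() / (RAND_MAX + 1.0); }

        int main(void)
        {
            double pop[POP][DIM], best = -1e30, mut_rate = 0.2;
            for (int i = 0; i < POP; i++)
                for (int j = 0; j < DIM; j++) pop[i][j] = 8.0 * urand() - 4.0;

            for (int g = 0; g < GENS; g++) {
                double prev_best = best;
                /* Lower level: tournament selection plus mutation. */
                for (int i = 0; i < POP; i++) {
                    int a = rand() % POP, b = rand() % POP;
                    int w = fitness(pop[a]) > fitness(pop[b]) ? a : b;
                    for (int j = 0; j < DIM; j++) {
                        pop[i][j] = pop[w][j];
                        if (urand() < mut_rate)
                            pop[i][j] += 0.5 * (urand() - 0.5);
                    }
                    double f = fitness(pop[i]);
                    if (f > best) best = f;
                }
                /* Upper level: widen the search when progress stalls,
                 * narrow it while the population is still improving. */
                mut_rate = (best > prev_best) ? mut_rate * 0.9 + 0.01
                                              : fmin(0.8, mut_rate * 1.1);
            }
            printf("best fitness %.4f, final mutation rate %.3f\n",
                   best, mut_rate);
            return 0;
        }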

  6. Concepts of Concurrent Programming

    DTIC Science & Technology

    1990-04-01

    ...to the material presented. [Carriero89] Carriero, N., and Gelernter, D. "How to Write Parallel Programs: A Guide to the Perplexed." ACM... between the architectures on which programs can be executed and the application domains from which problems are drawn. Our goal is to show how programs...

  7. NavP: Structured and Multithreaded Distributed Parallel Programming

    NASA Technical Reports Server (NTRS)

    Pan, Lei; Xu, Jingling

    2006-01-01

    This slide presentation reviews some of the issues around distributed parallel programming. It compares and contrasts two methods of programming: Single Program Multiple Data (SPMD) and Navigational Programming (NavP). It then reviews the distributed sequential computing (DSC) method and the methodology of NavP. Case studies are presented. It also reviews the work being done to enable the NavP system.

  8. High Performance Programming Using Explicit Shared Memory Model on Cray T3D1

    NASA Technical Reports Server (NTRS)

    Simon, Horst D.; Saini, Subhash; Grassi, Charles

    1994-01-01

    The Cray T3D system is the first-phase system in Cray Research, Inc.'s (CRI) three-phase massively parallel processing (MPP) program. This system features a heterogeneous architecture that closely couples DEC's Alpha microprocessors and CRI's parallel-vector technology, i.e., the Cray Y-MP and Cray C90. An overview of the Cray T3D hardware and available programming models is presented. Under the Cray Research adaptive Fortran (CRAFT) model, four programming methods (data parallel, work sharing, message passing using PVM, and explicit shared memory) are available to users. However, at this time the data parallel and work sharing programming models are not available to the user community. The differences between standard PVM and CRI's PVM are highlighted with performance measurements such as latencies and communication bandwidths. We have found that neither standard PVM nor CRI's PVM exploits the hardware capabilities of the T3D. The reasons for the poor performance of PVM as a native message-passing library are presented. This is illustrated by the performance of the NAS Parallel Benchmarks (NPB) programmed in the explicit shared memory model on the Cray T3D. In general, the performance of standard PVM is about 4 to 5 times lower than that obtained using the explicit shared memory model. A similar degradation is seen on the CM-5, where the performance of applications using the native message-passing library CMMD is also about 4 to 5 times lower than with data parallel methods. The issues involved in programming in the explicit shared memory model (such as barriers, synchronization, and invalidating and aligning the data cache) are discussed. Comparative performance of the NPB using the explicit shared memory programming model on the Cray T3D and other highly parallel systems such as the TMC CM-5, Intel Paragon, Cray C90, IBM SP1, etc. is presented.

  9. On program restructuring, scheduling, and communication for parallel processor systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Polychronopoulos, Constantine D.

    1986-08-01

    This dissertation discusses several software and hardware aspects of program execution on large-scale, high-performance parallel processor systems. The issues covered are program restructuring, partitioning, scheduling and interprocessor communication, synchronization, and hardware design issues of specialized units. All this work was performed with a single goal: to maximize program speedup, or equivalently, to minimize parallel execution time. Parafrase, a Fortran restructuring compiler, was used to transform programs into parallel form and conduct experiments. Two new program restructuring techniques are presented, loop coalescing and subscript blocking. Compile-time and run-time scheduling schemes are covered extensively. Depending on the program construct, these algorithms generate optimal or near-optimal schedules. For the case of arbitrarily nested hybrid loops, two optimal scheduling algorithms for dynamic and static scheduling are presented. Simulation results are given for a new dynamic scheduling algorithm. The performance of this algorithm is compared to that of self-scheduling. Techniques for program partitioning and minimization of interprocessor communication for idealized program models and for real Fortran programs are also discussed. The close relationship between scheduling, interprocessor communication, and synchronization becomes apparent at several points in this work. Finally, the impact of various types of overhead on program speedup and experimental results are presented.
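
    Loop coalescing, one of the two restructuring techniques named here, flattens a loop nest into a single loop so that the full iteration space becomes one pool of work for the scheduler. A minimal before/after sketch in C (illustrative only; the dissertation's setting is Fortran):

        #define N 100
        #define M 50

        /* Before: a doubly nested loop with N*M independent iterations. */
        void before(double a[N][M])
        {
            for (int i = 0; i < N; i++)
                for (int j = 0; j < M; j++)
                    a[i][j] = 2.0 * a[i][j];
        }

        /* After coalescing: one loop over k = 0..N*M-1, from which the
         * original indices are recovered; a scheduler can now hand out
         * the N*M iterations in evenly sized chunks. */
        void after(double a[N][M])
        {
            for (int k = 0; k < N * M; k++) {
                int i = k / M;
                int j = k % M;
                a[i][j] = 2.0 * a[i][j];
            }
        }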

  10. Disseminating near-real-time hazards information and flood maps in the Philippines through Web-GIS.

    PubMed

    A Lagmay, Alfredo Mahar Francisco; Racoma, Bernard Alan; Aracan, Ken Adrian; Alconis-Ayco, Jenalyn; Saddi, Ivan Lester

    2017-09-01

    The Philippines, being a locus of tropical cyclones, tsunamis, earthquakes and volcanic eruptions, is a hotbed of disasters. These natural hazards inflict loss of lives and costly damage to property. Situated in a region where climatic and geophysical tempest is common, the Philippines will inevitably suffer from calamities similar to those experienced recently. With continued development and population growth in hazard-prone areas, it is expected that damage to infrastructure and human losses will persist and even rise unless appropriate measures are immediately implemented by government. In 2012, the Philippines launched a responsive program for disaster prevention and mitigation called the Nationwide Operational Assessment of Hazards (Project NOAH), specifically for government warning agencies to be able to provide a 6 hr lead-time warning to vulnerable communities against impending floods and to use advanced technology to enhance current geo-hazard vulnerability maps. To disseminate such critical information to as wide an audience as possible, a Web-GIS using mashups of freely available source codes and application program interfaces (APIs) was developed and can be found at the URLs http://noah.dost.gov.ph and http://noah.up.edu.ph/. This Web-GIS tool is now heavily used by local government units in the Philippines in their disaster prevention and mitigation efforts and can be replicated in countries that have a proactive approach to address the impacts of natural hazards but lack sufficient funds. Copyright © 2017. Published by Elsevier B.V.

  11. Modelling parallel programs and multiprocessor architectures with AXE

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Fineman, Charles E.

    1991-01-01

    AXE, An Experimental Environment for Parallel Systems, was designed to model and simulate parallel systems at the process level. It provides an integrated environment for specifying computation models, multiprocessor architectures, data collection, and performance visualization. AXE is being used at NASA-Ames for developing resource management strategies, parallel problem formulation, multiprocessor architectures, and operating system issues related to the High Performance Computing and Communications Program. AXE's simple, structured user interface enables the user to model parallel programs and machines precisely and efficiently. Its quick turn-around time keeps the user interested and productive. AXE models multicomputers. The user may easily modify various architectural parameters including the number of sites, connection topologies, and overhead for operating system activities. Parallel computations in AXE are represented as collections of autonomous computing objects known as players. Their use and behavior are described. Performance data of the multiprocessor model can be observed on a color screen. These include CPU and message routing bottlenecks, and the dynamic status of the software.

  12. Web Based Parallel Programming Workshop for Undergraduate Education.

    ERIC Educational Resources Information Center

    Marcus, Robert L.; Robertson, Douglass

    Central State University (Ohio), under a contract with Nichols Research Corporation, has developed a World Wide Web based workshop on high performance computing entitled "IBM SP2 Parallel Programming Workshop." The research is part of the DoD (Department of Defense) High Performance Computing Modernization Program. The research…

  13. SequenceL: Automated Parallel Algorithms Derived from CSP-NT Computational Laws

    NASA Technical Reports Server (NTRS)

    Cooke, Daniel; Rushton, Nelson

    2013-01-01

    With the introduction of new parallel architectures like the Cell and multicore chips from IBM, Intel, AMD, and ARM, as well as the petascale processing available for high-end computing, a larger number of programmers will need to write parallel codes. Adding the parallel control structure to the sequence, selection, and iterative control constructs increases the complexity of code development, which often results in increased development costs and decreased reliability. SequenceL is a high-level programming language, that is, a programming language that is closer to a human's way of thinking than to a machine's. Historically, high-level languages have resulted in decreased development costs and increased reliability, at the expense of performance. In recent applications at JSC and in industry, SequenceL has demonstrated the usual advantages of high-level programming in terms of low cost and high reliability. SequenceL programs, however, have run at speeds typically comparable with, and in many cases faster than, their counterparts written in C and C++ when run on single-core processors. Moreover, SequenceL is able to generate parallel executables automatically for multicore hardware, gaining parallel speedups without any extra effort from the programmer beyond what is required to write the sequential/single-core code. A SequenceL-to-C++ translator has been developed that automatically renders readable multithreaded C++ from a combination of a SequenceL program and sample data input. The SequenceL language is based on two fundamental computational laws, Consume-Simplify-Produce (CSP) and Normalize-Transpose (NT), which enable it to automate the creation of parallel algorithms from high-level code that has no annotations of parallelism whatsoever. In our anecdotal experience, SequenceL development has been in every case less costly than development of the same algorithm in sequential (that is, single-core, single process) C or C++, and an order of magnitude less costly than development of comparable parallel code. Moreover, SequenceL not only automatically parallelizes the code, but since it is based on CSP-NT, it is provably race free, thus eliminating the largest quality challenge the parallelized software developer faces.

  14. Instrumentation, performance visualization, and debugging tools for multiprocessors

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Fineman, Charles E.; Hontalas, Philip J.

    1991-01-01

    The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessor architectures. However, without effective means to monitor (and visualize) program execution, debugging and tuning parallel programs become intractably difficult as program complexity increases with the number of processors. Research on performance evaluation tools for multiprocessors is being carried out at ARC. Besides investigating new techniques for instrumenting, monitoring, and presenting the state of parallel program execution in a coherent and user-friendly manner, prototypes of software tools are being incorporated into the run-time environments of various hardware testbeds to evaluate their impact on user productivity. Our current tool set, the Ames Instrumentation Systems (AIMS), incorporates features from various software systems developed in academia and industry. The execution of FORTRAN programs on the Intel iPSC/860 can be automatically instrumented and monitored. Performance data collected in this manner can be displayed graphically on workstations supporting X-Windows. We have successfully compared various parallel algorithms for computational fluid dynamics (CFD) applications in collaboration with scientists from the Numerical Aerodynamic Simulation Systems Division. By performing these comparisons, we show that performance monitors and debuggers such as AIMS are practical and can illuminate the complex dynamics that occur within parallel programs.
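
    At bottom, the source-level instrumentation that AIMS automates reduces to emitting timestamped enter/exit records that a display tool can later reconstruct. A minimal hand-rolled sketch in C; the macro names and trace format here are invented for illustration and are not the AIMS format:

        #include <stdio.h>
        #include <time.h>

        static FILE *trace;

        static double now(void)                 /* seconds, monotonic clock */
        {
            struct timespec ts;
            clock_gettime(CLOCK_MONOTONIC, &ts);
            return ts.tv_sec + 1e-9 * ts.tv_nsec;
        }

        /* Begin/end event records for a named code region. */
        #define EVENT_BEGIN(name) fprintf(trace, "B %s %.9f\n", (name), now())
        #define EVENT_END(name)   fprintf(trace, "E %s %.9f\n", (name), now())

        int main(void)
        {
            trace = fopen("trace.log", "w");

            EVENT_BEGIN("solver");
            double s = 0.0;
            for (int i = 1; i <= 1000000; i++)  /* the monitored work */
                s += 1.0 / i;
            EVENT_END("solver");

            fclose(trace);
            printf("sum = %f (trace written to trace.log)\n", s);
            return 0;
        }

    A post-processor pairs the B/E records to recover per-region timings, which is the raw material for the graphical displays the abstract describes.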

  15. Testing New Programming Paradigms with NAS Parallel Benchmarks

    NASA Technical Reports Server (NTRS)

    Jin, H.; Frumkin, M.; Schultz, M.; Yan, J.

    2000-01-01

    Over the past decade, high performance computing has evolved rapidly, not only in hardware architectures but also in the increasing complexity of real applications. Technologies have been developed that aim at scaling up to thousands of processors on both distributed and shared memory systems. Development of parallel programs on these computers is always a challenging task. Today, writing parallel programs with message passing (e.g. MPI) is the most popular way of achieving scalability and high performance. However, writing message passing programs is difficult and error prone. In recent years, new efforts have been made to define new parallel programming paradigms. The best examples are HPF (based on data parallelism) and OpenMP (based on shared memory parallelism). Both provide simple and clear extensions to sequential programs, thus greatly simplifying the tedious tasks encountered in writing message passing programs. HPF is independent of the memory hierarchy; however, due to the immaturity of compiler technology, its performance is still questionable. Although the use of parallel compiler directives is not new, OpenMP offers a portable solution in the shared-memory domain. Another important development involves the tremendous progress in the internet and its associated technology. Although still in its infancy, Java promises portability in a heterogeneous environment and offers the possibility to "compile once and run anywhere." To test these new technologies, we implemented new parallel versions of the NAS Parallel Benchmarks (NPBs) with HPF and OpenMP directives, and extended the work with Java and Java threads. The purpose of this study is to examine the effectiveness of alternative programming paradigms. NPBs consist of five kernels and three simulated applications that mimic the computation and data movement of large scale computational fluid dynamics (CFD) applications. We started with the serial version included in NPB2.3. Optimization of memory and cache usage was applied to several benchmarks, noticeably BT and SP, resulting in better sequential performance. In order to overcome the lack of an HPF performance model and guide the development of the HPF codes, we employed an empirical performance model for several primitives found in the benchmarks. We encountered a few limitations of HPF, such as the lack of support for the "REDISTRIBUTION" directive and no easy way to handle irregular computation. The parallelization with OpenMP directives was done at the outermost loop level to achieve the largest granularity. The performance of six HPF and OpenMP benchmarks is compared with that of their MPI counterparts for the Class A problem size in the figure on the next page. These results were obtained on an SGI Origin2000 (195 MHz) with the MIPSpro-f77 compiler 7.2.1 for the OpenMP and MPI codes and the PGI pghpf-2.4.3 compiler with MPI interface for the HPF programs.
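
    The outermost-loop strategy mentioned above can be shown in a minimal C/OpenMP sketch: the directive goes on the outer loop only, so each thread receives a large contiguous slab of iterations (the stencil update below is illustrative, not an NPB kernel):

        #include <stdio.h>
        #include <omp.h>

        #define NI 512
        #define NJ 512

        static double u[NI][NJ], unew[NI][NJ];

        int main(void)
        {
            /* ... initialize u with boundary and interior values ... */

            #pragma omp parallel for   /* outermost loop only: coarse grain */
            for (int i = 1; i < NI - 1; i++)
                for (int j = 1; j < NJ - 1; j++)
                    unew[i][j] = 0.25 * (u[i-1][j] + u[i+1][j]
                                       + u[i][j-1] + u[i][j+1]);

            printf("threads available: %d\n", omp_get_max_threads());
            return 0;
        }

    Parallelizing the inner j loop instead would spawn a team once per row, paying the fork/join overhead NI times for much smaller chunks of work.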

  16. Temporal Trends and Temperature-Related Incidence of Electrical Storm: The TEMPEST Study (Temperature-Related Incidence of Electrical Storm).

    PubMed

    Guerra, Federico; Bonelli, Paolo; Flori, Marco; Cipolletta, Laura; Carbucicchio, Corrado; Izquierdo, Maite; Kozluk, Edward; Shivkumar, Kalyanam; Vaseghi, Marmar; Patani, Francesca; Cupido, Claudio; Pala, Salvatore; Ruiz-Granell, Ricardo; Ferrero, Angel; Tondo, Claudio; Capucci, Alessandro

    2017-03-01

    The occurrence of ventricular tachyarrhythmias seems to follow circadian, daily, and seasonal distributions. Our aim is to identify potential temporal patterns of electrical storm (ES), in which a cluster of ventricular tachycardias or ventricular fibrillation negatively affects short- and long-term survival. The TEMPEST study (Circannual Pattern and Temperature-Related Incidence of Electrical Storm) is a patient-level, pooled analysis of previously published data sets. Study selection criteria included diagnosis of ES, absence of acute coronary syndrome as the arrhythmic trigger, and ≥10 patients included. At the end of the selection and collection processes, 5 centers had the data set from their article pooled into the present registry. Temperature data and sunrise and sunset hours were retrieved from Weather Underground, the largest weather database available online. The total sample included 246 patients presenting with ES (221 men; age: 65±9 years). Each ES episode included a median of 7 ventricular tachycardia/ventricular fibrillation episodes. Fifty-nine percent of patients experienced ES during daytime hours (P<0.001). The prevalence of ES was significantly higher during workdays, with Saturdays and Sundays registering the lowest rates of ES (10.4% and 7.2%, respectively, versus a 16.5% daily mean from Monday to Friday; P<0.001). ES occurrence was significantly associated with an increased monthly temperature range when compared with the month before (P=0.003). ES incidence is not homogeneous over time but seems to have a clustered pattern, with a higher incidence during daytime hours and working days. ES is associated with an increase in monthly temperature variation. https://www.crd.york.ac.uk. Unique identifier: CRD42013003744. © 2017 American Heart Association, Inc.

  17. Regional TEMPEST survey in north-east Namibia

    NASA Astrophysics Data System (ADS)

    Peters, Geoffrey; Street, Gregory; Kahimise, Ivor; Hutchins, David

    2015-09-01

    A regional scale TEMPEST airborne electromagnetic survey was flown in north-east Namibia in 2011. With broad line spacing (4 km) and a relatively low-powered, fixed-wing system, the approach was intended to provide a regional geo-electric map of the area, rather than direct detection of potential mineral deposits. A key component of the geo-electric profiling was to map the relative thickness of the Kalahari sediments, which are up to 200 m thick and obscure most of the bedrock in the area. Knowledge of the thickness would allow explorers to better predict the costs of exploration under the Kalahari. An additional aim was to determine if bedrock conductors were detectable beneath the Kalahari cover. The system succeeded in measuring the Kalahari thickness where this cover was relatively thin and moderately conductive. Limitations in depth penetration mean that it is not possible to map the thickness in the centre of the survey area, and in much of the northern half of the survey area. Additional problems arise due to the variable conductivity of the Kalahari cover. Where the conductivity of the Kalahari sediment is close to that of the basement, there is no discernible contrast to delineate the base of the Kalahari. Basement conductors are visible beneath the more thinly covered areas such as in the north-west and south of the survey area. The remainder of the survey area generally comprises deeper, more conductive cover, and for the most part basement conductors cannot be detected. A qualitative comparison with VTEM data shows comparable results in terms of regional mapping, and suggests that even more powerful systems such as the VTEM may not detect discrete conductors beneath the thick conductive parts of the Kalahari cover.

  18. Parallel computation with the force

    NASA Technical Reports Server (NTRS)

    Jordan, H. F.

    1985-01-01

    A methodology, called the force, supports the construction of programs to be executed in parallel by a force of processes. The number of processes in the force is unspecified, but potentially very large. The force idea is embodied in a set of macros which produce multiprocessor FORTRAN code and has been studied on two shared memory multiprocessors of fairly different character. The method has simplified the writing of highly parallel programs within a limited class of parallel algorithms and is being extended to cover a broader class. The individual parallel constructs which comprise the force methodology are discussed. Of central concern are their semantics, implementation on different architectures, and performance implications.

  19. Performance Analysis of Multilevel Parallel Applications on Shared Memory Architectures

    NASA Technical Reports Server (NTRS)

    Biegel, Bryan A. (Technical Monitor); Jost, G.; Jin, H.; Labarta, J.; Gimenez, J.; Caubet, J.

    2003-01-01

    Parallel programming paradigms include process-level parallelism, thread-level parallelism, and multilevel parallelism. This viewgraph presentation describes a detailed performance analysis of these paradigms for Shared Memory Architecture (SMA). The analysis uses the Paraver Performance Analysis System. The presentation includes diagrams of a flow of useful computations.

  20. 76 FR 62808 - Pilot Program for Parallel Review of Medical Products

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-10-11

    ... voluntary participation in the pilot program, as well as the guiding principles the Agencies intend to... 57045), parallel review is intended to reduce the time between FDA marketing approval and CMS national...

  1. Algorithms and programming tools for image processing on the MPP

    NASA Technical Reports Server (NTRS)

    Reeves, A. P.

    1985-01-01

    Topics addressed include: data mapping and rotational algorithms for the Massively Parallel Processor (MPP); Parallel Pascal language; documentation for the Parallel Pascal Development system; and a description of the Parallel Pascal language used on the MPP.

  2. Execution models for mapping programs onto distributed memory parallel computers

    NASA Technical Reports Server (NTRS)

    Sussman, Alan

    1992-01-01

    The problem of exploiting the parallelism available in a program to efficiently employ the resources of the target machine is addressed. The problem is discussed in the context of building a mapping compiler for a distributed memory parallel machine. The paper describes using execution models to drive the process of mapping a program in the most efficient way onto a particular machine. Through analysis of the execution models for several mapping techniques for one class of programs, we show that the selection of the best technique for a particular program instance can make a significant difference in performance. On the other hand, the results of benchmarks from an implementation of a mapping compiler show that our execution models are accurate enough to select the best mapping technique for a given program.

  3. Program Correctness, Verification and Testing for Exascale (Corvette)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sen, Koushik; Iancu, Costin; Demmel, James W

    The goal of this project is to provide tools to assess the correctness of parallel programs written using hybrid parallelism. There is a dire lack of both theoretical and engineering know-how in the area of finding bugs in hybrid or large scale parallel programs, which our research aims to change. In the project we have demonstrated novel approaches in several areas: 1. Low overhead automated and precise detection of concurrency bugs at scale. 2. Using low overhead bug detection tools to guide speculative program transformations for performance. 3. Techniques to reduce the concurrency required to reproduce a bug using partial program restart/replay. 4. Techniques to provide reproducible execution of floating point programs. 5. Techniques for tuning the floating point precision used in codes.
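
    The kind of concurrency bug such tools must detect at scale can be reproduced in a few lines. A classic lost-update data race in C with pthreads (a generic illustration, not a Corvette test case):

        #include <pthread.h>
        #include <stdio.h>

        static long counter = 0;             /* shared, unprotected */

        static void *work(void *arg)
        {
            (void)arg;
            for (int i = 0; i < 1000000; i++)
                counter++;                   /* racy read-modify-write */
            return NULL;
        }

        int main(void)
        {
            pthread_t t1, t2;
            pthread_create(&t1, NULL, work, NULL);
            pthread_create(&t2, NULL, work, NULL);
            pthread_join(t1, NULL);
            pthread_join(t2, NULL);
            /* Expected 2000000; a smaller value exposes lost updates. */
            printf("counter = %ld\n", counter);
            return 0;
        }

    The increment compiles to a load, an add, and a store; when the two threads interleave between the load and the store, one update is silently lost, and whether that happens varies from run to run, which is exactly why low-overhead automated detection and replay support matter.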

  4. Parallel Computing Strategies for Irregular Algorithms

    NASA Technical Reports Server (NTRS)

    Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

    2002-01-01

    Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.

  5. Parallelization of NAS Benchmarks for Shared Memory Multiprocessors

    NASA Technical Reports Server (NTRS)

    Waheed, Abdul; Yan, Jerry C.; Saini, Subhash (Technical Monitor)

    1998-01-01

    This paper presents our experiences of parallelizing the sequential implementation of NAS benchmarks using compiler directives on SGI Origin2000 distributed shared memory (DSM) system. Porting existing applications to new high performance parallel and distributed computing platforms is a challenging task. Ideally, a user develops a sequential version of the application, leaving the task of porting to new generations of high performance computing systems to parallelization tools and compilers. Due to the simplicity of programming shared-memory multiprocessors, compiler developers have provided various facilities to allow the users to exploit parallelism. Native compilers on SGI Origin2000 support multiprocessing directives to allow users to exploit loop-level parallelism in their programs. Additionally, supporting tools can accomplish this process automatically and present the results of parallelization to the users. We experimented with these compiler directives and supporting tools by parallelizing sequential implementation of NAS benchmarks. Results reported in this paper indicate that with minimal effort, the performance gain is comparable with the hand-parallelized, carefully optimized, message-passing implementations of the same benchmarks.

  6. Trace-Driven Debugging of Message Passing Programs

    NASA Technical Reports Server (NTRS)

    Frumkin, Michael; Hood, Robert; Lopez, Louis; Bailey, David (Technical Monitor)

    1998-01-01

    In this paper we report on features added to a parallel debugger to simplify the debugging of parallel message passing programs. These features include replay, setting consistent breakpoints based on interprocess event causality, a parallel undo operation, and communication supervision. These features all use trace information collected during the execution of the program being debugged. We used a number of different instrumentation techniques to collect traces. We also implemented trace displays using two different trace visualization systems. The implementation was tested on an SGI Power Challenge cluster and a network of SGI workstations.

  7. Exploiting Symmetry on Parallel Architectures.

    NASA Astrophysics Data System (ADS)

    Stiller, Lewis Benjamin

    1995-01-01

    This thesis describes techniques for the design of parallel programs that solve well-structured problems with inherent symmetry. Part I demonstrates the reduction of such problems to generalized matrix multiplication by a group-equivariant matrix. Fast techniques for this multiplication are described, including factorization, orbit decomposition, and Fourier transforms over finite groups. Our algorithms entail interaction between two symmetry groups: one arising at the software level from the problem's symmetry and the other arising at the hardware level from the processors' communication network. Part II illustrates the applicability of our symmetry-exploitation techniques by presenting a series of case studies of the design and implementation of parallel programs. First, a parallel program that solves chess endgames by factorization of an associated dihedral group-equivariant matrix is described. This code runs faster than previous serial programs and discovered a number of new results. Second, parallel algorithms for Fourier transforms for finite groups are developed, and preliminary parallel implementations for group transforms of dihedral and of symmetric groups are described. Applications in learning, vision, pattern recognition, and statistics are proposed. Third, parallel implementations solving several computational science problems are described, including the direct n-body problem, convolutions arising from molecular biology, and some communication primitives such as broadcast and reduce. Some of our implementations ran orders of magnitude faster than previous techniques, and were used in the investigation of various physical phenomena.

  8. MPI implementation of PHOENICS: A general purpose computational fluid dynamics code

    NASA Astrophysics Data System (ADS)

    Simunovic, S.; Zacharia, T.; Baltas, N.; Spalding, D. B.

    1995-03-01

    PHOENICS is a suite of computational analysis programs that are used for simulation of fluid flow, heat transfer, and dynamical reaction processes. The parallel version of the solver EARTH for the Computational Fluid Dynamics (CFD) program PHOENICS has been implemented using the Message Passing Interface (MPI) standard. Implementation of the MPI version of PHOENICS makes this computational tool portable to a wide range of parallel machines and enables the use of high performance computing for large scale computational simulations. MPI libraries are available on several parallel architectures, making the program usable across different architectures as well as on heterogeneous computer networks. The Intel Paragon NX and MPI versions of the program have been developed and tested on the massively parallel supercomputers Intel Paragon XP/S 5, XP/S 35, and Kendall Square Research, and on the multiprocessor SGI Onyx computer at Oak Ridge National Laboratory. The preliminary testing results of the developed program have shown scalable performance for reasonably sized computational domains.

  9. MPI implementation of PHOENICS: A general purpose computational fluid dynamics code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Simunovic, S.; Zacharia, T.; Baltas, N.

    1995-04-01

    PHOENICS is a suite of computational analysis programs that are used for simulation of fluid flow, heat transfer, and dynamical reaction processes. The parallel version of the solver EARTH for the Computational Fluid Dynamics (CFD) program PHOENICS has been implemented using the Message Passing Interface (MPI) standard. Implementation of the MPI version of PHOENICS makes this computational tool portable to a wide range of parallel machines and enables the use of high performance computing for large scale computational simulations. MPI libraries are available on several parallel architectures, making the program usable across different architectures as well as on heterogeneous computer networks. The Intel Paragon NX and MPI versions of the program have been developed and tested on the massively parallel supercomputers Intel Paragon XP/S 5, XP/S 35, and Kendall Square Research, and on the multiprocessor SGI Onyx computer at Oak Ridge National Laboratory. The preliminary testing results of the developed program have shown scalable performance for reasonably sized computational domains.

  10. Parallel hyperbolic PDE simulation on clusters: Cell versus GPU

    NASA Astrophysics Data System (ADS)

    Rostrup, Scott; De Sterck, Hans

    2010-12-01

    Increasingly, high-performance computing is looking towards data-parallel computational devices to enhance computational performance. Two technologies that have received significant attention are IBM's Cell Processor and NVIDIA's CUDA programming model for graphics processing unit (GPU) computing. In this paper we investigate the acceleration of parallel hyperbolic partial differential equation simulation on structured grids with explicit time integration on clusters with Cell and GPU backends. The message passing interface (MPI) is used for communication between nodes at the coarsest level of parallelism. Optimizations of the simulation code at the several finer levels of parallelism that the data-parallel devices provide are described in terms of data layout, data flow and data-parallel instructions. Optimized Cell and GPU performance are compared with reference code performance on a single x86 central processing unit (CPU) core in single and double precision. We further compare the CPU, Cell and GPU platforms on a chip-to-chip basis, and compare performance on single cluster nodes with two CPUs, two Cell processors or two GPUs in a shared memory configuration (without MPI). We finally compare performance on clusters with 32 CPUs, 32 Cell processors, and 32 GPUs using MPI. Our GPU cluster results use NVIDIA Tesla GPUs with GT200 architecture, but some preliminary results on recently introduced NVIDIA GPUs with the next-generation Fermi architecture are also included. This paper provides computational scientists and engineers who are considering porting their codes to accelerator environments with insight into how structured grid based explicit algorithms can be optimized for clusters with Cell and GPU accelerators. It also provides insight into the speed-up that may be gained on current and future accelerator architectures for this class of applications.
    Program summary:
    Program title: SWsolver
    Catalogue identifier: AEGY_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEGY_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: GPL v3
    No. of lines in distributed program, including test data, etc.: 59 168
    No. of bytes in distributed program, including test data, etc.: 453 409
    Distribution format: tar.gz
    Programming language: C, CUDA
    Computer: Parallel computing clusters. Individual compute nodes may consist of x86 CPU, Cell processor, or x86 CPU with attached NVIDIA GPU accelerator.
    Operating system: Linux
    Has the code been vectorised or parallelized?: Yes. Tested on 1-128 x86 CPU cores, 1-32 Cell processors, and 1-32 NVIDIA GPUs.
    RAM: Tested on problems requiring up to 4 GB per compute node.
    Classification: 12
    External routines: MPI, CUDA, IBM Cell SDK
    Nature of problem: MPI-parallel simulation of shallow water equations using a high-resolution 2D hyperbolic equation solver on regular Cartesian grids for x86 CPU, Cell processor, and NVIDIA GPU using CUDA.
    Solution method: SWsolver provides 3 implementations of a high-resolution 2D shallow water equation solver on regular Cartesian grids, for CPU, Cell processor, and NVIDIA GPU. Each implementation uses MPI to divide work across a parallel computing cluster.
    Additional comments: Sub-program numdiff is used for the test run.

  11. Edge Simulation Laboratory Progress and Plans

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cohen, R

    The Edge Simulation Laboratory (ESL) is a project to develop a gyrokinetic code for MFE edge plasmas based on continuum (Eulerian) techniques. ESL is a base-program activity of OFES, with an allied algorithm research activity funded by the OASCR base math program. ESL OFES funds directly support about 0.8 FTE of career staff at LLNL, a postdoc and a small fraction of an FTE at GA, and a graduate student at UCSD. In addition, the allied OASCR program funds about 1/2 FTE each in the computations directorates at LBNL and LLNL. OFES ESL funding for LLNL and UCSD began in fall 2005, while funding for GA and the math team began about a year ago. ESL's continuum approach is a complement to the PIC-based methods of the CPES Project, and was selected (1) because of concerns about noise issues associated with PIC in the high-density-contrast environment of the edge pedestal, (2) to be able to exploit advanced numerical methods developed for fluid codes, and (3) to build upon the successes of core continuum gyrokinetic codes such as GYRO, GS2 and GENE. The ESL project presently has three components: TEMPEST, a full-f, full-geometry (single-null divertor, or arbitrary-shape closed flux surfaces) code in E, μ (energy, magnetic-moment) coordinates; EGK, a simple-geometry rapid-prototype code; and the math component, which is developing and implementing algorithms for a next-generation code. Progress would be accelerated if we could find funding for a fourth, computer science, component, which would develop software infrastructure, provide user support, and address needs for data handling and analysis. We summarize the status and plans for the three funded activities.

  12. ORCA Project: Research on high-performance parallel computer programming environments. Final report, 1 Apr-31 Mar 90

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Snyder, L.; Notkin, D.; Adams, L.

    1990-03-31

    This task relates to research on programming massively parallel computers. Previous work on the Ensemble concept of programming was extended, and investigation into nonshared memory models of parallel computation was undertaken. Previous work on the Ensemble concept defined a set of programming abstractions and was used to organize the programming task into three distinct levels: composition of machine instructions, composition of processes, and composition of phases. It was applied to shared memory models of computation. During the present research period, these concepts were extended to nonshared memory models. During this period, one Ph.D. thesis was completed, and one book chapter and six conference proceedings papers were published.

  13. Architecture-Adaptive Computing Environment: A Tool for Teaching Parallel Programming

    NASA Technical Reports Server (NTRS)

    Dorband, John E.; Aburdene, Maurice F.

    2002-01-01

    Recently, networked and cluster computation have become very popular. This paper is an introduction to a new C-based parallel language for architecture-adaptive programming, aCe C. The primary purpose of aCe (Architecture-adaptive Computing Environment) is to encourage programmers to implement applications on parallel architectures by providing them the assurance that future architectures will be able to run their applications with a minimum of modification. A secondary purpose is to encourage computer architects to develop new types of architectures by providing an easily implemented software development environment and a library of test applications. This new language should be an ideal tool to teach parallel programming. In this paper, we focus on some fundamental features of aCe C.

  14. The parallel programming of voluntary and reflexive saccades.

    PubMed

    Walker, Robin; McSorley, Eugene

    2006-06-01

    A novel two-step paradigm was used to investigate the parallel programming of consecutive, stimulus-elicited ('reflexive') and endogenous ('voluntary') saccades. The mean latency of voluntary saccades, made following the first reflexive saccades in two-step conditions, was significantly reduced compared to that of voluntary saccades made in the single-step control trials. The latency of the first reflexive saccades was modulated by the requirement to make a second saccade: first saccade latency increased when a second voluntary saccade was required in the opposite direction to the first saccade, and decreased when a second saccade was required in the same direction as the first reflexive saccade. A second experiment confirmed the basic effect and also showed that a second reflexive saccade may be programmed in parallel with a first voluntary saccade. The results support the view that voluntary and reflexive saccades can be programmed in parallel on a common motor map.

  15. Incremental Parallelization of Non-Data-Parallel Programs Using the Charon Message-Passing Library

    NASA Technical Reports Server (NTRS)

    VanderWijngaart, Rob F.

    2000-01-01

    Message passing is among the most popular techniques for parallelizing scientific programs on distributed-memory architectures. The reasons for its success are wide availability (MPI), efficiency, and full tuning control provided to the programmer. A major drawback, however, is that incremental parallelization, as offered by compiler directives, is not generally possible, because all data structures have to be changed throughout the program simultaneously. Charon remedies this situation through mappings between distributed and non-distributed data. It allows breaking up the parallelization into small steps, guaranteeing correctness at every stage. Several tools are available to help convert legacy codes into high-performance message-passing programs. They usually target data-parallel applications, whose loops carrying most of the work can be distributed among all processors without much dependency analysis. Others do a full dependency analysis and then convert the code virtually automatically. Even more toolkits are available that aid construction from scratch of message passing programs. None, however, allows piecemeal translation of codes with complex data dependencies (i.e. non-data-parallel programs) into message passing codes. The Charon library (available in both C and Fortran) provides incremental parallelization capabilities by linking legacy code arrays with distributed arrays. During the conversion process, non-distributed and distributed arrays exist side by side, and simple mapping functions allow the programmer to switch between the two in any location in the program. Charon also provides wrapper functions that leave the structure of the legacy code intact, but that allow execution on truly distributed data. Finally, the library provides a rich set of communication functions that support virtually all patterns of remote data demands in realistic structured grid scientific programs, including transposition, nearest-neighbor communication, pipelining, gather/scatter, and redistribution. At the end of the conversion process most intermediate Charon function calls will have been removed, the non-distributed arrays will have been deleted, and virtually the only remaining Charon function calls are the high-level, highly optimized communications. Distribution of the data is under complete control of the programmer, although a wide range of useful distributions is easily available through predefined functions. A crucial aspect of the library is that it does not allocate space for distributed arrays, but accepts programmer-specified memory. This has two major consequences. First, codes parallelized using Charon do not suffer from encapsulation; user data is always directly accessible. This provides high efficiency, and also retains the possibility of using message passing directly for highly irregular communications. Second, non-distributed arrays can be interpreted as (trivial) distributions in the Charon sense, which allows them to be mapped to truly distributed arrays, and vice versa. This is the mechanism that enables incremental parallelization. In this paper we provide a brief introduction of the library and then focus on the actual steps in the parallelization process, using some representative examples from, among others, the NAS Parallel Benchmarks. We show how a complicated two-dimensional pipeline (the prototypical non-data-parallel algorithm) can be constructed with ease. To demonstrate the flexibility of the library, we give examples of the stepwise, efficient parallel implementation of nonlocal boundary conditions common in aircraft simulations, as well as the construction of the sequence of grids required for multigrid.
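
    The enabling idea, switching between a legacy global array and a distributed one through simple index mappings, can be illustrated with a toy block distribution in C. The helper names below are invented for illustration and are not Charon's actual API:

        #include <stdio.h>

        #define NGLOBAL 100

        /* Hypothetical helpers standing in for Charon-style mappings:
         * each rank owns a contiguous slice of the global index range. */
        static int local_count(int nglobal, int nprocs, int rank)
        {
            return nglobal / nprocs + (rank < nglobal % nprocs ? 1 : 0);
        }

        static int local_to_global(int nglobal, int nprocs, int rank, int il)
        {
            int base = 0;
            for (int r = 0; r < rank; r++)
                base += local_count(nglobal, nprocs, r);
            return base + il;
        }

        int main(void)
        {
            int nprocs = 4;   /* pretend cluster size, for illustration */
            for (int rank = 0; rank < nprocs; rank++) {
                int n = local_count(NGLOBAL, nprocs, rank);
                printf("rank %d owns %d elements, globals %d..%d\n",
                       rank, n,
                       local_to_global(NGLOBAL, nprocs, rank, 0),
                       local_to_global(NGLOBAL, nprocs, rank, n - 1));
            }
            return 0;
        }

    Because such mappings are cheap and explicit, a loop can be converted to operate on the distributed view while the rest of the program still reads the legacy array, which is what makes the small-steps conversion the abstract describes possible.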

  16. 78 FR 76628 - Pilot Program for Parallel Review of Medical Products; Extension of the Duration of the Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-12-18

    ...The Food and Drug Administration (FDA) and the Centers for Medicare and Medicaid Services (CMS) (the Agencies) are announcing the extension of the "Pilot Program for Parallel Review of Medical Products." The Agencies have decided to continue the program as currently designed for an additional period of 2 years from the date of publication of this notice.

  17. Integrated Task and Data Parallel Programming

    NASA Technical Reports Server (NTRS)

    Grimshaw, A. S.

    1998-01-01

    This research investigates the combination of task and data parallel language constructs within a single programming language. There are a number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments: In February I presented a paper at Frontiers '95 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda: Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program. Additional 1995 Activities: During the fall I collaborated with Andrew Grimshaw and Adam Ferrari to write a book chapter which will be included in Parallel Processing in C++ edited by Gregory Wilson. I also finished two courses, Compilers and Advanced Compilers, in 1995. These courses complete my class requirements at the University of Virginia. I have only my dissertation research and defense to complete.

  18. Integrated Task And Data Parallel Programming: Language Design

    NASA Technical Reports Server (NTRS)

    Grimshaw, Andrew S.; West, Emily A.

    1998-01-01

    This research investigates the combination of task and data parallel language constructs within a single programming language. There are a number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments: In February I presented a paper at Frontiers '95 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda: Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program. Additional 1995 Activities: During the fall I collaborated with Andrew Grimshaw and Adam Ferrari to write a book chapter which will be included in Parallel Processing in C++ edited by Gregory Wilson. I also finished two courses, Compilers and Advanced Compilers, in 1995. These courses complete my class requirements at the University of Virginia. I have only my dissertation research and defense to complete.

  19. Automatic Management of Parallel and Distributed System Resources

    NASA Technical Reports Server (NTRS)

    Yan, Jerry; Ngai, Tin Fook; Lundstrom, Stephen F.

    1990-01-01

    Viewgraphs on automatic management of parallel and distributed system resources are presented. Topics covered include: parallel applications; intelligent management of multiprocessing systems; performance evaluation of parallel architecture; dynamic concurrent programs; compiler-directed system approach; lattice gaseous cellular automata; and sparse matrix Cholesky factorization.

  20. Describing, using 'recognition cones'. [parallel-series model with English-like computer program

    NASA Technical Reports Server (NTRS)

    Uhr, L.

    1973-01-01

    A parallel-serial 'recognition cone' model is examined, taking into account the model's ability to describe scenes of objects. An actual program is presented in an English-like language. The concept of a 'description' is discussed together with possible types of descriptive information. Questions regarding the level and the variety of detail are considered along with approaches for improving the serial representations of parallel systems.

  1. PISCES: An environment for parallel scientific computation

    NASA Technical Reports Server (NTRS)

    Pratt, T. W.

    1985-01-01

    The parallel implementation of scientific computing environment (PISCES) is a project to provide high-level programming environments for parallel MIMD computers. Pisces 1, the first of these environments, is a FORTRAN 77 based environment which runs under the UNIX operating system. The Pisces 1 user programs in Pisces FORTRAN, an extension of FORTRAN 77 for parallel processing. The major emphasis in the Pisces 1 design is in providing a carefully specified virtual machine that defines the run-time environment within which Pisces FORTRAN programs are executed. Each implementation then provides the same virtual machine, regardless of differences in the underlying architecture. The design is intended to be portable to a variety of architectures. Currently Pisces 1 is implemented on a network of Apollo workstations and on a DEC VAX uniprocessor via simulation of the task level parallelism. An implementation for the Flexible Computing Corp. FLEX/32 is under construction. An introduction to the Pisces 1 virtual computer and the FORTRAN 77 extensions is presented. An example of an algorithm for the iterative solution of a system of equations is given. The most notable features of the design are the provision for several granularities of parallelism in programs and the provision of a window mechanism for distributed access to large arrays of data.

  2. Security of information in IT systems

    NASA Astrophysics Data System (ADS)

    Kaliczynska, Malgorzata

    2005-02-01

    The aim of the paper is to increase human awareness of the dangers connected with social engineering methods of obtaining information. The article demonstrates psychological and sociological methods of influencing people that are used in attacks on IT systems. Little-known techniques are presented concerning one of the greater threats: electromagnetic emission, or the corona effect. Moreover, the work shows methods of protecting against this type of danger. The paper also provides information on devices made according to the TEMPEST technology. The article not only discusses the methods of gathering information, but also explains how to protect against its uncontrolled loss.

  3. Eigensolver for a Sparse, Large Hermitian Matrix

    NASA Technical Reports Server (NTRS)

    Tisdale, E. Robert; Oyafuso, Fabiano; Klimeck, Gerhard; Brown, R. Chris

    2003-01-01

    A parallel-processing computer program finds a few eigenvalues in a sparse Hermitian matrix that contains as many as 100 million diagonal elements. This program finds the eigenvalues faster, using less memory, than do other, comparable eigensolver programs. This program implements a Lanczos algorithm in the American National Standards Institute/ International Organization for Standardization (ANSI/ISO) C computing language, using the Message Passing Interface (MPI) standard to complement an eigensolver in PARPACK. [PARPACK (Parallel Arnoldi Package) is an extension, to parallel-processing computer architectures, of ARPACK (Arnoldi Package), which is a collection of Fortran 77 subroutines that solve large-scale eigenvalue problems.] The eigensolver runs on Beowulf clusters of computers at the Jet Propulsion Laboratory (JPL).
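    The core of any Lanczos-based eigensolver is the three-term recurrence that projects the matrix onto a small tridiagonal one whose eigenvalues approximate the extremal eigenvalues of the original operator. The C sketch below shows only that bare recurrence for a real symmetric operator; it is illustrative, not the program described above (PARPACK's implicit restarting, the MPI distribution, and complex Hermitian arithmetic are all omitted), and apply_A is a hypothetical stand-in for the application's sparse matrix-vector product.

        #include <math.h>
        #include <stdlib.h>

        /* Placeholder operator: a diagonal (trivially sparse, symmetric) matrix. */
        static void apply_A(const double *diag, const double *v, double *w, int n)
        {
            for (int i = 0; i < n; i++) w[i] = diag[i] * v[i];
        }

        /* m steps of the Lanczos recurrence; alpha[] and beta[] receive the
           entries of the small tridiagonal matrix. */
        void lanczos(const double *diag, int n, int m, double *alpha, double *beta)
        {
            double *v = calloc(n, sizeof *v);       /* current Lanczos vector  */
            double *v_prev = calloc(n, sizeof *v);  /* previous Lanczos vector */
            double *w = malloc(n * sizeof *w);
            double b = 0.0;
            v[0] = 1.0;                             /* arbitrary unit start vector */
            for (int j = 0; j < m; j++) {
                apply_A(diag, v, w, n);
                double a = 0.0;
                for (int i = 0; i < n; i++) a += w[i] * v[i];
                alpha[j] = a;
                /* orthogonalize against the two most recent Lanczos vectors */
                for (int i = 0; i < n; i++) w[i] -= a * v[i] + b * v_prev[i];
                b = 0.0;
                for (int i = 0; i < n; i++) b += w[i] * w[i];
                beta[j] = b = sqrt(b);
                if (b == 0.0) break;                /* invariant subspace found */
                for (int i = 0; i < n; i++) {
                    v_prev[i] = v[i];
                    v[i] = w[i] / b;
                }
            }
            free(v); free(v_prev); free(w);
        }

    In the distributed setting, each MPI rank owns a slice of every Lanczos vector, so the dot products above become local partial sums combined by a global reduction.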

  4. Parallelization of elliptic solver for solving 1D Boussinesq model

    NASA Astrophysics Data System (ADS)

    Tarwidi, D.; Adytia, D.

    2018-03-01

    In this paper, a parallel implementation of an elliptic solver for the 1D Boussinesq model is presented. The numerical solution of the Boussinesq model is obtained by applying a staggered-grid scheme to the continuity, momentum, and elliptic equations of the model. The tridiagonal system emerging from the numerical scheme of the elliptic equation is solved by the cyclic reduction algorithm. The parallel implementation of cyclic reduction is executed on multicore processors with shared memory architectures using OpenMP. To measure the performance of the parallel program, the number of grid points is varied from 2^8 to 2^14. Two numerical test cases, propagation of a solitary wave and of a standing wave, are used to evaluate the parallel program, and the numerical results are verified against the analytical solutions for both waves. The best speedups for the solitary and standing wave test cases are about 2.07 with 2^14 grid points and 1.86 with 2^13 grid points, respectively, both obtained with 8 threads. The best parallel efficiencies are 76.2% and 73.5% for the solitary and standing wave test cases, respectively.
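    For reference, one forward-elimination sweep of cyclic reduction parallelizes directly, because every equation being reduced reads only its two neighbours at the current stride and no two updated rows overlap. The C/OpenMP sketch below is a minimal illustration of that sweep; the names are invented here and this is not the authors' code.

        /* One forward-reduction sweep of cyclic reduction for the tridiagonal
           system a[i]*x[i-stride] + b[i]*x[i] + c[i]*x[i+stride] = d[i].
           Rows updated in this sweep never read each other, so the loop can
           be distributed across OpenMP threads. */
        void cr_forward_step(int n, int stride, double *a, double *b,
                             double *c, double *d)
        {
            #pragma omp parallel for
            for (int i = 2 * stride - 1; i < n; i += 2 * stride) {
                int lo = i - stride;
                int hi = i + stride;
                double alpha = -a[i] / b[lo];
                double beta  = (hi < n) ? -c[i] / b[hi] : 0.0;
                b[i] += alpha * c[lo] + ((hi < n) ? beta * a[hi] : 0.0);
                d[i] += alpha * d[lo] + ((hi < n) ? beta * d[hi] : 0.0);
                a[i]  = alpha * a[lo];                 /* couples to x[i-2*stride] */
                c[i]  = (hi < n) ? beta * c[hi] : 0.0; /* couples to x[i+2*stride] */
            }
        }

    Calling this with stride = 1, 2, 4, ... reduces the system to a single equation; a mirrored back-substitution phase then recovers the eliminated unknowns.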

  5. 3-D parallel program for numerical calculation of gas dynamics problems with heat conductivity on distributed memory computational systems (CS)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sofronov, I.D.; Voronin, B.L.; Butnev, O.I.

    1997-12-31

    The aim of the work performed is to develop a 3D parallel program for numerical calculation of gas dynamics problems with heat conductivity on distributed memory computational systems (CS), satisfying the condition that the numerical results be independent of the number of processors involved. Two basically different approaches to the structure of massively parallel computations have been developed. The first approach uses a 3D data matrix decomposition that is reconstructed at each temporal cycle and is a development of parallelization algorithms for multiprocessor CS with shareable memory. The second approach is based on a 3D data matrix decomposition that is not reconstructed during a temporal cycle. The program was developed on the 8-processor CS MP-3 made at VNIIEF and was adapted to the massively parallel CS Meiko-2 at LLNL by joint efforts of the VNIIEF and LLNL staffs. A large number of numerical experiments has been carried out with different numbers of processors, up to 256, and the efficiency of parallelization has been evaluated as a function of processor number and parameters.
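    The second, static decomposition strategy corresponds to what MPI expresses with its Cartesian topology helpers. The self-contained C sketch below shows how each rank can derive its 3D block of the mesh; the 256^3 mesh size and all names are assumptions for illustration, not details of the VNIIEF program.

        #include <mpi.h>
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            MPI_Init(&argc, &argv);
            int size, rank;
            MPI_Comm_size(MPI_COMM_WORLD, &size);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            /* Factor the ranks into a 3D processor grid. */
            int dims[3] = {0, 0, 0}, periods[3] = {0, 0, 0}, coords[3];
            MPI_Dims_create(size, 3, dims);

            MPI_Comm cart;
            MPI_Cart_create(MPI_COMM_WORLD, 3, dims, periods, 0, &cart);
            MPI_Cart_coords(cart, rank, 3, coords);

            /* Each rank owns one contiguous block of the global mesh. */
            const int N[3] = {256, 256, 256};   /* assumed global mesh size */
            for (int d = 0; d < 3; d++) {
                int lo = coords[d] * N[d] / dims[d];
                int hi = (coords[d] + 1) * N[d] / dims[d];
                printf("rank %d owns [%d,%d) along dimension %d\n",
                       rank, lo, hi, d);
            }

            MPI_Comm_free(&cart);
            MPI_Finalize();
            return 0;
        }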

  6. Support for Debugging Automatically Parallelized Programs

    NASA Technical Reports Server (NTRS)

    Jost, Gabriele; Hood, Robert; Biegel, Bryan (Technical Monitor)

    2001-01-01

    We describe a system that simplifies the process of debugging programs produced by computer-aided parallelization tools. The system uses relative debugging techniques to compare serial and parallel executions in order to show where the computations begin to differ. If the original serial code is correct, errors due to parallelization will be isolated by the comparison. One of the primary goals of the system is to minimize the effort required of the user. To that end, the debugging system uses information produced by the parallelization tool to drive the comparison process. In particular the debugging system relies on the parallelization tool to provide information about where variables may have been modified and how arrays are distributed across multiple processes. User effort is also reduced through the use of dynamic instrumentation. This allows us to modify the program execution without changing the way the user builds the executable. The use of dynamic instrumentation also permits us to compare the executions in a fine-grained fashion and only involve the debugger when a difference has been detected. This reduces the overhead of executing instrumentation.

  7. Relative Debugging of Automatically Parallelized Programs

    NASA Technical Reports Server (NTRS)

    Jost, Gabriele; Hood, Robert; Biegel, Bryan (Technical Monitor)

    2002-01-01

    We describe a system that simplifies the process of debugging programs produced by computer-aided parallelization tools. The system uses relative debugging techniques to compare serial and parallel executions in order to show where the computations begin to differ. If the original serial code is correct, errors due to parallelization will be isolated by the comparison. One of the primary goals of the system is to minimize the effort required of the user. To that end, the debugging system uses information produced by the parallelization tool to drive the comparison process. In particular, the debugging system relies on the parallelization tool to provide information about where variables may have been modified and how arrays are distributed across multiple processes. User effort is also reduced through the use of dynamic instrumentation. This allows us to modify the program execution without changing the way the user builds the executable. The use of dynamic instrumentation also permits us to compare the executions in a fine-grained fashion and only involve the debugger when a difference has been detected. This reduces the overhead of executing instrumentation.

  8. Paralex: An Environment for Parallel Programming in Distributed Systems

    DTIC Science & Technology

    1991-12-07

    distributed systems is comparable to assembly language programming for traditional sequential systems - the user must resort to low-level primitives ... to accomplish data encoding/decoding, communication, remote execution, synchronization, failure detection and recovery. It is our belief that ... synchronization. Finally, composing parallel programs by interconnecting sequential computations allows automatic support for heterogeneity and fault tolerance

  9. Interfacing Computer Aided Parallelization and Performance Analysis

    NASA Technical Reports Server (NTRS)

    Jost, Gabriele; Jin, Haoqiang; Labarta, Jesus; Gimenez, Judit; Biegel, Bryan A. (Technical Monitor)

    2003-01-01

    When porting sequential applications to parallel computer architectures, the program developer will typically go through several cycles of source code optimization and performance analysis. We have started a project to develop an environment where the user can jointly navigate through program structure and performance data information in order to make efficient optimization decisions. In a prototype implementation we have interfaced the CAPO computer-aided parallelization tool with the Paraver performance analysis tool. We describe both tools and their interface and give an example of how the interface helps within the program development cycle of a benchmark code.

  10. LDRD final report on massively-parallel linear programming : the parPCx system.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Parekh, Ojas; Phillips, Cynthia Ann; Boman, Erik Gunnar

    2005-02-01

    This report summarizes the research and development performed from October 2002 to September 2004 at Sandia National Laboratories under the Laboratory-Directed Research and Development (LDRD) project "Massively-Parallel Linear Programming". We developed a linear programming (LP) solver designed to use a large number of processors. LP is the optimization of a linear objective function subject to linear constraints. Companies and universities have expended huge efforts over decades to produce fast, stable serial LP solvers. Previous parallel codes run on shared-memory systems and have little or no distribution of the constraint matrix. We have seen no reports of general LP solver runs on large numbers of processors. Our parallel LP code is based on an efficient serial implementation of Mehrotra's interior-point predictor-corrector algorithm (PCx). The computational core of this algorithm is the assembly and solution of a sparse linear system. We have substantially rewritten the PCx code and based it on Trilinos, the parallel linear algebra library developed at Sandia. Our interior-point method can use either direct or iterative solvers for the linear system. To achieve a good parallel data distribution of the constraint matrix, we use a (pre-release) version of a hypergraph partitioner from the Zoltan partitioning library. We describe the design and implementation of our new LP solver called parPCx and give preliminary computational results. We summarize a number of issues related to efficient parallel solution of LPs with interior-point methods, including data distribution, numerical stability, and solving the core linear system using both direct and iterative methods. We describe a number of applications of LP specific to US Department of Energy mission areas and we summarize our efforts to integrate parPCx (and parallel LP solvers in general) into Sandia's massively-parallel integer programming solver PICO (Parallel Integer and Combinatorial Optimizer). We conclude with directions for long-term future algorithmic research and for near-term development that could improve the performance of parPCx.

  11. A high-speed linear algebra library with automatic parallelism

    NASA Technical Reports Server (NTRS)

    Boucher, Michael L.

    1994-01-01

    Parallel or distributed processing is key to getting the highest performance from workstations. However, designing and implementing efficient parallel algorithms is difficult and error-prone. It is even more difficult to write code that is both portable to and efficient on many different computers. Finally, it is harder still to satisfy the above requirements and include the reliability and ease of use required of commercial software intended for use in a production environment. As a result, the application of parallel processing technology to commercial software has been extremely limited, even though there are numerous computationally demanding programs that would significantly benefit from application of parallel processing. This paper describes DSSLIB, which is a library of subroutines that perform many of the time-consuming computations in engineering and scientific software. DSSLIB combines the high efficiency and speed of parallel computation with a serial programming model that eliminates many undesirable side-effects of typical parallel code. The result is a simple way to incorporate the power of parallel processing into commercial software without compromising maintainability, reliability, or ease of use. This gives significant advantages over less powerful non-parallel entries in the market.

  12. Dual and parallel postdoctoral training programs: implications for the osteopathic medical profession.

    PubMed

    Burkhart, Diane N; Lischka, Terri A

    2011-04-01

    Students in colleges of osteopathic medicine have several options when considering postdoctoral training programs. In addition to training programs approved solely by the American Osteopathic Association or accredited solely by the Accreditation Council for Graduate Medical Education (ACGME), students can pursue programs accredited by both organizations (ie, dually accredited programs) or osteopathic programs that occur side-by-side with ACGME programs (ie, parallel programs). In the present article, we report on the availability and growth of these 2 training options and describe their benefits and drawbacks for trainees and the osteopathic medical profession as a whole.

  13. The paradigm compiler: Mapping a functional language for the connection machine

    NASA Technical Reports Server (NTRS)

    Dennis, Jack B.

    1989-01-01

    The Paradigm Compiler implements a new approach to compiling programs written in high level languages for execution on highly parallel computers. The general approach is to identify the principal data structures constructed by the program and to map these structures onto the processing elements of the target machine. The mapping is chosen to maximize performance as determined through compile time global analysis of the source program. The source language is Sisal, a functional language designed for scientific computations, and the target language is Paris, the published low level interface to the Connection Machine. The data structures considered are multidimensional arrays whose dimensions are known at compile time. Computations that build such arrays usually offer opportunities for highly parallel execution; they are data parallel. The Connection Machine is an attractive target for these computations, and the parallel for construct of the Sisal language is a convenient high level notation for data parallel algorithms. The principles and organization of the Paradigm Compiler are discussed.

  14. Charon Toolkit for Parallel, Implicit Structured-Grid Computations: Functional Design

    NASA Technical Reports Server (NTRS)

    VanderWijngaart, Rob F.; Kutler, Paul (Technical Monitor)

    1997-01-01

    In a previous report the design concepts of Charon were presented. Charon is a toolkit that aids engineers in developing scientific programs for structured-grid applications to be run on MIMD parallel computers. It constitutes an augmentation of the general-purpose MPI-based message-passing layer, and provides the user with a hierarchy of tools for rapid prototyping and validation of parallel programs, and subsequent piecemeal performance tuning. Here we describe the implementation of the domain decomposition tools used for creating data distributions across sets of processors. We also present the hierarchy of parallelization tools that allows smooth translation of legacy code (or a serial design) into a parallel program. Along with the actual tool descriptions, we will present the considerations that led to the particular design choices. Many of these are motivated by the requirement that Charon must be useful within the traditional computational environments of Fortran 77 and C. Only the Fortran 77 syntax will be presented in this report.

  15. Integrated Network Decompositions and Dynamic Programming for Graph Optimization (INDDGO)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    The INDDGO software package offers a set of tools for finding exact solutions to graph optimization problems via tree decompositions and dynamic programming algorithms. Currently the framework offers serial and parallel (distributed memory) algorithms for finding tree decompositions and solving the maximum weighted independent set problem. The parallel dynamic programming algorithm is implemented on top of the MADNESS task-based runtime.

  16. Exploiting loop level parallelism in nonprocedural dataflow programs

    NASA Technical Reports Server (NTRS)

    Gokhale, Maya B.

    1987-01-01

    This paper discusses how loop-level parallelism is detected in a nonprocedural dataflow program, and how a procedural program with concurrent loops is scheduled. Also discussed is a program restructuring technique which may be applied to recursive equations so that concurrent loops may be generated for a seemingly iterative computation. A compiler which generates C code for the language described has been implemented. The scheduling component of the compiler and the restructuring transformation are described.

  17. Tolerant (parallel) Programming

    NASA Technical Reports Server (NTRS)

    DiNucci, David C.; Bailey, David H. (Technical Monitor)

    1997-01-01

    In order to be truly portable, a program must be tolerant of a wide range of development and execution environments, and a parallel program is just one which must be tolerant of a very wide range. This paper first defines the term "tolerant programming", then describes many layers of tools to accomplish it. The primary focus is on F-Nets, a formal model for expressing computation as a folded partial-ordering of operations, thereby providing an architecture-independent expression of tolerant parallel algorithms. For implementing F-Nets, Cooperative Data Sharing (CDS) is a subroutine package for implementing communication efficiently in a large number of environments (e.g. shared memory and message passing). Software Cabling (SC), a very-high-level graphical programming language for building large F-Nets, possesses many of the features normally expected from today's computer languages (e.g. data abstraction, array operations). Finally, L2^3 is a CASE tool which facilitates the construction, compilation, execution, and debugging of SC programs.

  18. Resolutions of the Coulomb operator: VIII. Parallel implementation using the modern programming language X10.

    PubMed

    Limpanuparb, Taweetham; Milthorpe, Josh; Rendell, Alistair P

    2014-10-30

    Use of the modern parallel programming language X10 for computing long-range Coulomb and exchange interactions is presented. By using X10, a partitioned global address space language with support for task parallelism and the explicit representation of data locality, the resolution of the Ewald operator can be parallelized in a straightforward manner, using both intranode and internode parallelism. We evaluate four different schemes for dynamic load balancing of the integral calculation using X10's work-stealing runtime, and report performance results for long-range HF energy calculations of large molecules with high-quality basis sets running on up to 1024 cores of a high-performance cluster machine. Copyright © 2014 Wiley Periodicals, Inc.
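    X10's work-stealing runtime has no direct counterpart in lower-level languages, but the load-balancing problem it solves, integral batches whose costs vary wildly from task to task, can be roughly approximated in C with OpenMP's dynamic schedule, which hands chunks of iterations to whichever thread becomes idle. The sketch below is an analogy only, not the paper's X10 code; integral_batch is a hypothetical stand-in whose cost deliberately varies by task.

        #include <stdio.h>

        /* Stand-in for an integral batch whose cost varies strongly with i. */
        static double integral_batch(int i)
        {
            double s = 0.0;
            for (int k = 0; k < (i % 97 + 1) * 1000; k++)
                s += 1.0 / (k + 1.0);
            return s;
        }

        int main(void)
        {
            const int ntasks = 4096;
            double total = 0.0;
            /* dynamic schedule: idle threads grab the next chunk of 8 tasks,
               so uneven batch costs do not leave threads waiting */
            #pragma omp parallel for schedule(dynamic, 8) reduction(+ : total)
            for (int i = 0; i < ntasks; i++)
                total += integral_batch(i);
            printf("accumulated value: %f\n", total);
            return 0;
        }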

  19. New NAS Parallel Benchmarks Results

    NASA Technical Reports Server (NTRS)

    Yarrow, Maurice; Saphir, William; VanderWijngaart, Rob; Woo, Alex; Kutler, Paul (Technical Monitor)

    1997-01-01

    NPB2 (NAS (NASA Advanced Supercomputing) Parallel Benchmarks 2) is an implementation, based on Fortran and the MPI (message passing interface) message passing standard, of the original NAS Parallel Benchmark specifications. NPB2 programs are run with little or no tuning, in contrast to NPB vendor implementations, which are highly optimized for specific architectures. NPB2 results complement, rather than replace, NPB results. Because they have not been optimized by vendors, NPB2 implementations approximate the performance a typical user can expect for a portable parallel program on distributed memory parallel computers. Together these results provide an insightful comparison of the real-world performance of high-performance computers. New NPB2 features: New implementation (CG), new workstation class problem sizes, new serial sample versions, more performance statistics.

  20. Exploring types of play in an adapted robotics program for children with disabilities.

    PubMed

    Lindsay, Sally; Lam, Ashley

    2018-04-01

    Play is an important occupation in a child's development. Children with disabilities often have fewer opportunities to engage in meaningful play than typically developing children. The purpose of this study was to explore the types of play (i.e., solitary, parallel and co-operative) within an adapted robotics program for children with disabilities aged 6-8 years. This study draws on detailed observations of each of the six robotics workshops and interviews with 53 participants (21 children, 21 parents and 11 programme staff). Our findings showed that four children engaged in solitary play, where all but one showed signs of moving towards parallel play. Six children demonstrated parallel play during all workshops. The remainder of the children showed mixed play types (solitary, parallel and/or co-operative) throughout the robotics workshops. We observed more parallel and co-operative, and less solitary, play as the programme progressed. Ten different children displayed co-operative behaviours throughout the workshops. The interviews highlighted how staff supported children's engagement in the programme, while parents reported on their child's development of play skills. An adapted LEGO® robotics program has potential to develop the play skills of children with disabilities in moving from solitary towards more parallel and co-operative play. Implications for rehabilitation: Educators and clinicians working with children who have disabilities should consider the potential of LEGO® robotics programs for developing their play skills. Clinicians should consider how the extent of their involvement in prompting and facilitating children's engagement and play within a robotics program may influence children's ability to interact with their peers. Educators and clinicians should incorporate both structured and unstructured free-play elements within a robotics program to facilitate children's social development.

  1. Parallelized CCHE2D flow model with CUDA Fortran on Graphics Process Units

    USDA-ARS?s Scientific Manuscript database

    This paper presents the CCHE2D implicit flow model parallelized using CUDA Fortran programming technique on Graphics Processing Units (GPUs). A parallelized implicit Alternating Direction Implicit (ADI) solver using Parallel Cyclic Reduction (PCR) algorithm on GPU is developed and tested. This solve...

  2. Multiprocessor smalltalk: Implementation, performance, and analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pallas, J.I.

    1990-01-01

    Multiprocessor Smalltalk demonstrates the value of object-oriented programming on a multiprocessor. Its implementation and analysis shed light on three areas: concurrent programming in an object-oriented language without special extensions, implementation techniques for adapting to multiprocessors, and performance factors in the resulting system. Adding parallelism to Smalltalk code is easy, because programs already use control abstractions like iterators. Smalltalk's basic control and concurrency primitives (lambda expressions, processes and semaphores) can be used to build parallel control abstractions, including parallel iterators, parallel objects, atomic objects, and futures. Language extensions for concurrency are not required. This implementation demonstrates that it is possible to build an efficient parallel object-oriented programming system and illustrates techniques for doing so. Three modification tools (serialization, replication, and reorganization) adapted the Berkeley Smalltalk interpreter to the Firefly multiprocessor. Multiprocessor Smalltalk's performance shows that the combination of multiprocessing and object-oriented programming can be effective: speedups (relative to the original serial version) exceed 2.0 for five processors on all the benchmarks; the median efficiency is 48%. Analysis shows both where performance is lost and how to improve and generalize the experimental results. Changes in the interpreter to support concurrency add at most 12% overhead; better access to per-process variables could eliminate much of that. Changes in the user code to express concurrency add as much as 70% overhead; this overhead could be reduced to 54% if blocks (lambda expressions) were reentrant. Performance is also lost when the program cannot keep all five processors busy.

  3. Implementing the PM Programming Language using MPI and OpenMP - a New Tool for Programming Geophysical Models on Parallel Systems

    NASA Astrophysics Data System (ADS)

    Bellerby, Tim

    2015-04-01

    PM (Parallel Models) is a new parallel programming language specifically designed for writing environmental and geophysical models. The language is intended to enable implementers to concentrate on the science behind the model rather than the details of running on parallel hardware. At the same time PM leaves the programmer in control - all parallelisation is explicit and the parallel structure of any given program may be deduced directly from the code. This paper describes a PM implementation based on the Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) standards, looking at issues involved with translating the PM parallelisation model to MPI/OpenMP protocols and considering performance in terms of the competing factors of finer-grained parallelisation and increased communication overhead. In order to maximise portability, the implementation stays within the MPI 1.3 standard as much as possible, with MPI-2 MPI-IO file handling the only significant exception. Moreover, it does not assume a thread-safe implementation of MPI. PM adopts a two-tier abstract representation of parallel hardware. A PM processor is a conceptual unit capable of efficiently executing a set of language tasks, with a complete parallel system consisting of an abstract N-dimensional array of such processors. PM processors may map to single cores executing tasks using cooperative multi-tasking, to multiple cores or even to separate processing nodes, efficiently sharing tasks using algorithms such as work stealing. While tasks may move between hardware elements within a PM processor, they may not move between processors without specific programmer intervention. Tasks are assigned to processors using a nested parallelism approach, building on ideas from Reyes et al. (2009). The main program owns all available processors. When the program enters a parallel statement then either processors are divided out among the newly generated tasks (number of new tasks < number of processors) or tasks are divided out among the available processors (number of tasks > number of processors). Nested parallel statements may further subdivide the processor set owned by a given task. Tasks or processors are distributed evenly by default, but uneven distributions are possible under programmer control. It is also possible to explicitly enable child tasks to migrate within the processor set owned by their parent task, reducing load imbalance at the potential cost of increased inter-processor message traffic. PM incorporates some programming structures from the earlier MIST language presented at a previous EGU General Assembly, while adopting a significantly different underlying parallelisation model and type system. PM code is available at www.pm-lang.org under an unrestrictive MIT license. Reference: Ruymán Reyes, Antonio J. Dorta, Francisco Almeida, Francisco de Sande, 2009. Automatic Hybrid MPI+OpenMP Code Generation with llc, Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science Volume 5759, 185-195
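    The nested subdivision of processors among tasks resembles, at the lowest level, what raw MPI expresses with communicator splitting. The C sketch below is a hedged illustration (it is not PM code) of a parallel statement with two child tasks, each receiving half of the parent's processor set.

        #include <mpi.h>
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            MPI_Init(&argc, &argv);
            int rank, size;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            /* Two child tasks, fewer tasks than processors: the parent's
               processors are divided out among the tasks. */
            int task = (rank < size / 2) ? 0 : 1;
            MPI_Comm child;
            MPI_Comm_split(MPI_COMM_WORLD, task, rank, &child);

            int crank, csize;
            MPI_Comm_rank(child, &crank);
            MPI_Comm_size(child, &csize);
            printf("world rank %d -> task %d (rank %d of %d)\n",
                   rank, task, crank, csize);

            /* A nested parallel statement would split 'child' again. */
            MPI_Comm_free(&child);
            MPI_Finalize();
            return 0;
        }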

  4. Parallel transformation of K-SVD solar image denoising algorithm

    NASA Astrophysics Data System (ADS)

    Liang, Youwen; Tian, Yu; Li, Mei

    2017-02-01

    Images obtained by observing the sun through a large telescope are always degraded by noise due to the low SNR. The K-SVD denoising algorithm can effectively remove Gaussian white noise, but training dictionaries for sparse representations is a time-consuming task, due to the large size of the data involved and the complexity of the training algorithms. In this paper, OpenMP parallel programming is used to transform the serial algorithm into a parallel version, following a data-parallelism model. The biggest change is that multiple atoms, rather than a single atom, are updated simultaneously. The denoising effect and acceleration performance are tested after completion of the parallel algorithm. The speedup of the program is 13.563 when using 16 cores. This parallel version can fully utilize multi-core CPU hardware resources, greatly reduces running time, and is easily ported to multi-core platforms.
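    The change described above, updating several atoms at once, amounts to distributing the dictionary-update loop over threads. The C/OpenMP sketch below shows only that pattern; update_atom is a hypothetical placeholder (the real K-SVD update replaces a column with the dominant singular vector of its residual), and updating atoms concurrently is an approximation of the strictly sequential K-SVD sweep.

        #include <math.h>

        #define DIM     64        /* signal dimension (assumed)  */
        #define N_ATOMS 256       /* dictionary size (assumed)   */

        static double D[DIM][N_ATOMS];   /* dictionary, one atom per column */

        /* Placeholder atom update: renormalize column k. */
        static void update_atom(int k)
        {
            double norm2 = 0.0;
            for (int i = 0; i < DIM; i++) norm2 += D[i][k] * D[i][k];
            if (norm2 > 0.0) {
                double inv = 1.0 / sqrt(norm2);
                for (int i = 0; i < DIM; i++) D[i][k] *= inv;
            }
        }

        void update_dictionary(void)
        {
            /* Each column touches disjoint memory, so many atoms can be
               updated simultaneously across threads. */
            #pragma omp parallel for schedule(dynamic)
            for (int k = 0; k < N_ATOMS; k++)
                update_atom(k);
        }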

  5. Diderot: a Domain-Specific Language for Portable Parallel Scientific Visualization and Image Analysis.

    PubMed

    Kindlmann, Gordon; Chiw, Charisee; Seltzer, Nicholas; Samuels, Lamont; Reppy, John

    2016-01-01

    Many algorithms for scientific visualization and image analysis are rooted in the world of continuous scalar, vector, and tensor fields, but are programmed in low-level languages and libraries that obscure their mathematical foundations. Diderot is a parallel domain-specific language that is designed to bridge this semantic gap by providing the programmer with a high-level, mathematical programming notation that allows direct expression of mathematical concepts in code. Furthermore, Diderot provides parallel performance that takes advantage of modern multicore processors and GPUs. The high-level notation allows a concise and natural expression of the algorithms and the parallelism allows efficient execution on real-world datasets.

  6. Array processor architecture

    NASA Technical Reports Server (NTRS)

    Barnes, George H. (Inventor); Lundstrom, Stephen F. (Inventor); Shafer, Philip E. (Inventor)

    1983-01-01

    A high speed parallel array data processing architecture fashioned under a computational envelope approach includes a data base memory for secondary storage of programs and data, and a plurality of memory modules interconnected to a plurality of processing modules by a connection network of the Omega gender. Programs and data are fed from the data base memory to the plurality of memory modules, whence the programs are fed through the connection network to the array of processors (one copy of each program for each processor). Execution of the programs occurs with the processors operating normally quite independently of each other in a multiprocessing fashion. For data-dependent operations and other suitable operations, all processors are instructed to finish one given task or program branch before all are instructed to proceed in parallel processing fashion on the next instruction. Even when functioning in the parallel processing mode, however, the processors are not lock-stepped but execute their own copy of the program individually unless or until another overall processor array synchronization instruction is issued.

  7. A parallel solver for huge dense linear systems

    NASA Astrophysics Data System (ADS)

    Badia, J. M.; Movilla, J. L.; Climente, J. I.; Castillo, M.; Marqués, M.; Mayo, R.; Quintana-Ortí, E. S.; Planelles, J.

    2011-11-01

    HDSS (Huge Dense Linear System Solver) is a Fortran Application Programming Interface (API) to facilitate the parallel solution of very large dense systems to scientists and engineers. The API makes use of parallelism to yield an efficient solution of the systems on a wide range of parallel platforms, from clusters of processors to massively parallel multiprocessors. It exploits out-of-core strategies to leverage the secondary memory in order to solve huge linear systems on the order of 100 000 equations. The API is based on the parallel linear algebra library PLAPACK, and on its Out-Of-Core (OOC) extension POOCLAPACK. Both PLAPACK and POOCLAPACK use the Message Passing Interface (MPI) as the communication layer and BLAS to perform the local matrix operations. The API provides a friendly interface to the users, hiding almost all the technical aspects related to the parallel execution of the code and the use of the secondary memory to solve the systems. In particular, the API can automatically select the best way to store and solve the systems, depending on the dimension of the system, the number of processes and the main memory of the platform. Experimental results on several parallel platforms report high performance, reaching more than 1 TFLOP with 64 cores to solve a system with more than 200 000 equations and more than 10 000 right-hand side vectors.

    New version program summary
    Program title: Huge Dense System Solver (HDSS)
    Catalogue identifier: AEHU_v1_1
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEHU_v1_1.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
    No. of lines in distributed program, including test data, etc.: 87 062
    No. of bytes in distributed program, including test data, etc.: 1 069 110
    Distribution format: tar.gz
    Programming language: Fortran 90, C
    Computer: Parallel architectures: multiprocessors, computer clusters
    Operating system: Linux/Unix
    Has the code been vectorized or parallelized?: Yes, includes MPI primitives
    RAM: Tested for up to 190 GB
    Classification: 6.5
    External routines: MPI (http://www.mpi-forum.org/), BLAS (http://www.netlib.org/blas/), PLAPACK (http://www.cs.utexas.edu/~plapack/), POOCLAPACK (ftp://ftp.cs.utexas.edu/pub/rvdg/PLAPACK/pooclapack.ps) (code for PLAPACK and POOCLAPACK is included in the distribution)
    Catalogue identifier of previous version: AEHU_v1_0
    Journal reference of previous version: Comput. Phys. Comm. 182 (2011) 533
    Does the new version supersede the previous version?: Yes
    Nature of problem: Huge-scale dense systems of linear equations, Ax = B, beyond standard LAPACK capabilities.
    Solution method: The linear systems are solved by means of parallelized routines based on the LU factorization, using efficient secondary storage algorithms when the available main memory is insufficient.
    Reasons for new version: In many applications we need to guarantee high accuracy in the solution of very large linear systems, which can be achieved by using double-precision arithmetic.
    Summary of revisions: Version 1.1 can be used to solve linear systems using double-precision arithmetic. New version of the initialization routine: the user can choose the kind of arithmetic and the values of several parameters of the environment.
    Running time: About 5 hours to solve a system with more than 200 000 equations and more than 10 000 right-hand side vectors using double-precision arithmetic on an eight-node commodity cluster with a total of 64 Intel cores.

  8. Implementation and performance of FDPS: a framework for developing parallel particle simulation codes

    NASA Astrophysics Data System (ADS)

    Iwasawa, Masaki; Tanikawa, Ataru; Hosono, Natsuki; Nitadori, Keigo; Muranushi, Takayuki; Makino, Junichiro

    2016-08-01

    We present the basic idea, implementation, measured performance, and performance model of FDPS (Framework for Developing Particle Simulators). FDPS is an application-development framework which helps researchers to develop simulation programs using particle methods for large-scale distributed-memory parallel supercomputers. A particle-based simulation program for distributed-memory parallel computers needs to perform domain decomposition, exchange of particles which are not in the domain of each computing node, and gathering of the particle information in other nodes which is necessary for interaction calculation. Also, even if distributed-memory parallel computers are not used, in order to reduce the amount of computation, algorithms such as the Barnes-Hut tree algorithm or the Fast Multipole Method should be used in the case of long-range interactions. For short-range interactions, some methods to limit the calculation to neighbor particles are required. FDPS provides all of these functions, which are necessary for efficient parallel execution of particle-based simulations, as "templates," which are independent of the actual data structure of particles and the functional form of the particle-particle interaction. By using FDPS, researchers can write their programs with the amount of work necessary to write a simple, sequential and unoptimized program of O(N^2) calculation cost, and yet the program, once compiled with FDPS, will run efficiently on large-scale parallel supercomputers. A simple gravitational N-body program can be written in around 120 lines. We report the actual performance of these programs and the performance model. The weak scaling performance is very good, and almost linear speed-up was obtained for up to the full system of the K computer. The minimum calculation time per timestep is in the range of 30 ms (N = 10^7) to 300 ms (N = 10^9). These are currently limited by the time for the calculation of the domain decomposition and communication necessary for the interaction calculation. We discuss how we can overcome these bottlenecks.
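    For scale, the sequential, unoptimized O(N^2) kernel that a user would hand to such a framework looks roughly like the C sketch below (direct-summation gravity with softening). All names here are illustrative assumptions; the actual FDPS interface is built on C++ templates.

        #include <math.h>
        #include <stddef.h>

        typedef struct { double x[3]; double a[3]; double m; } Particle;

        /* Direct-summation gravitational acceleration, O(N^2). */
        void gravity(Particle *p, size_t n, double eps2)
        {
            for (size_t i = 0; i < n; i++) {
                p[i].a[0] = p[i].a[1] = p[i].a[2] = 0.0;
                for (size_t j = 0; j < n; j++) {
                    if (j == i) continue;
                    double dx = p[j].x[0] - p[i].x[0];
                    double dy = p[j].x[1] - p[i].x[1];
                    double dz = p[j].x[2] - p[i].x[2];
                    double r2 = dx * dx + dy * dy + dz * dz + eps2;
                    double inv_r3 = 1.0 / (r2 * sqrt(r2));
                    p[i].a[0] += p[j].m * inv_r3 * dx;
                    p[i].a[1] += p[j].m * inv_r3 * dy;
                    p[i].a[2] += p[j].m * inv_r3 * dz;
                }
            }
        }

    Given such a kernel, the framework supplies the domain decomposition, particle exchange, and tree construction needed to run it efficiently in parallel.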

  9. Concurrency-based approaches to parallel programming

    NASA Technical Reports Server (NTRS)

    Kale, L.V.; Chrisochoides, N.; Kohl, J.; Yelick, K.

    1995-01-01

    The inevitable transition to parallel programming can be facilitated by appropriate tools, including languages and libraries. After describing the needs of applications developers, this paper presents three specific approaches aimed at development of efficient and reusable parallel software for irregular and dynamic-structured problems. A salient feature of all three approaches is their exploitation of concurrency within a processor. Benefits of individual approaches such as these can be leveraged by an interoperability environment which permits modules written using different approaches to co-exist in single applications.

  10. Reliability models for dataflow computer systems

    NASA Technical Reports Server (NTRS)

    Kavi, K. M.; Buckles, B. P.

    1985-01-01

    The demands for concurrent operation within a computer system and the representation of parallelism in programming languages have yielded a new form of program representation known as data flow (DENN 74, DENN 75, TREL 82a). A new model based on data flow principles for parallel computations and parallel computer systems is presented. Necessary conditions for liveness and deadlock freeness in data flow graphs are derived. The data flow graph is used as a model to represent asynchronous concurrent computer architectures including data flow computers.

  11. Method for resource control in parallel environments using program organization and run-time support

    NASA Technical Reports Server (NTRS)

    Ekanadham, Kattamuri (Inventor); Moreira, Jose Eduardo (Inventor); Naik, Vijay Krishnarao (Inventor)

    2001-01-01

    A system and method for dynamic scheduling and allocation of resources to parallel applications during the course of their execution. By establishing well-defined interactions between an executing job and the parallel system, the system and method support dynamic reconfiguration of processor partitions, dynamic distribution and redistribution of data, communication among cooperating applications, and various other monitoring actions. The interactions occur only at specific points in the execution of the program where the aforementioned operations can be performed efficiently.

  12. Method for resource control in parallel environments using program organization and run-time support

    NASA Technical Reports Server (NTRS)

    Ekanadham, Kattamuri (Inventor); Moreira, Jose Eduardo (Inventor); Naik, Vijay Krishnarao (Inventor)

    1999-01-01

    A system and method for dynamic scheduling and allocation of resources to parallel applications during the course of their execution. By establishing well-defined interactions between an executing job and the parallel system, the system and method support dynamic reconfiguration of processor partitions, dynamic distribution and redistribution of data, communication among cooperating applications, and various other monitoring actions. The interactions occur only at specific points in the execution of the program where the aforementioned operations can be performed efficiently.

  13. Parallel community climate model: Description and user's guide

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Drake, J.B.; Flanery, R.E.; Semeraro, B.D.

    This report gives an overview of a parallel version of the NCAR Community Climate Model, CCM2, implemented for MIMD massively parallel computers using a message-passing programming paradigm. The parallel implementation was developed on an Intel iPSC/860 with 128 processors and on the Intel Delta with 512 processors, and the initial target platform for the production version of the code is the Intel Paragon with 2048 processors. Because the implementation uses standard, portable message-passing libraries, the code has been easily ported to other multiprocessors supporting a message-passing programming paradigm. The parallelization strategy used is to decompose the problem domain into geographical patches and assign each processor the computation associated with a distinct subset of the patches. With this decomposition, the physics calculations involve only grid points and data local to a processor and are performed in parallel. Using parallel algorithms developed for the semi-Lagrangian transport, the fast Fourier transform and the Legendre transform, both physics and dynamics are computed in parallel with minimal data movement and modest change to the original CCM2 source code. Sequential or parallel history tapes are written and input files (in history tape format) are read sequentially by the parallel code to promote compatibility with production use of the model on other computer systems. A validation exercise has been performed with the parallel code and is detailed along with some performance numbers on the Intel Paragon and the IBM SP2. A discussion of reproducibility of results is included. A user's guide for the PCCM2 version 2.1 on the various parallel machines completes the report. Procedures for compilation, setup and execution are given. A discussion of code internals is included for those who may wish to modify and use the program in their own research.

  14. The 2nd Symposium on the Frontiers of Massively Parallel Computations

    NASA Technical Reports Server (NTRS)

    Mills, Ronnie (Editor)

    1988-01-01

    Programming languages, computer graphics, neural networks, massively parallel computers, SIMD architecture, algorithms, digital terrain models, sort computation, simulation of charged particle transport on the massively parallel processor and image processing are among the topics discussed.

  15. The Goddard Space Flight Center Program to develop parallel image processing systems

    NASA Technical Reports Server (NTRS)

    Schaefer, D. H.

    1972-01-01

    Parallel image processing, defined as image processing in which all points of an image are operated upon simultaneously, is discussed. Coherent optical, noncoherent optical, and electronic methods are considered as parallel image processing techniques.

  16. Parallel Volunteer Learning during Youth Programs

    ERIC Educational Resources Information Center

    Lesmeister, Marilyn K.; Green, Jeremy; Derby, Amy; Bothum, Candi

    2012-01-01

    Lack of time is a hindrance for volunteers to participate in educational opportunities, yet volunteer success in an organization is tied to the orientation and education they receive. Meeting diverse educational needs of volunteers can be a challenge for program managers. Scheduling a Volunteer Learning Track for chaperones that is parallel to a…

  17. Mechanism to support generic collective communication across a variety of programming models

    DOEpatents

    Almasi, Gheorghe [Ardsley, NY]; Dozsa, Gabor [Ardsley, NY]; Kumar, Sameer [White Plains, NY]

    2011-07-19

    A system and method for supporting collective communications on a plurality of processors that use different parallel programming paradigms, in one aspect, may comprise a schedule defining one or more tasks in a collective operation, an executor that executes the task, a multisend module to perform one or more data transfer functions associated with the tasks, and a connection manager that controls one or more connections and identifies an available connection. The multisend module uses the available connection in performing the one or more data transfer functions. A plurality of processors that use different parallel programming paradigms can use a common implementation of the schedule module, the executor module, the connection manager and the multisend module via a language adaptor specific to a parallel programming paradigm implemented on a processor.

  18. Programming with Intervals

    NASA Astrophysics Data System (ADS)

    Matsakis, Nicholas D.; Gross, Thomas R.

    Intervals are a new, higher-level primitive for parallel programming with which programmers directly construct the program schedule. Programs using intervals can be statically analyzed to ensure that they do not deadlock or contain data races. In this paper, we demonstrate the flexibility of intervals by showing how to use them to emulate common parallel control-flow constructs like barriers and signals, as well as higher-level patterns such as bounded-buffer producer-consumer. We have implemented intervals as a publicly available library for Java and Scala.

  19. Classification of hyperspectral imagery using MapReduce on a NVIDIA graphics processing unit (Conference Presentation)

    NASA Astrophysics Data System (ADS)

    Ramirez, Andres; Rahnemoonfar, Maryam

    2017-04-01

    A hyperspectral image provides a multidimensional picture rich in data, consisting of hundreds of spectral dimensions. Analyzing the spectral and spatial information of such an image with linear and non-linear algorithms results in high computational time. In order to overcome this problem, this research presents a system using a MapReduce-Graphics Processing Unit (GPU) model that can help analyze a hyperspectral image through the use of parallel hardware and a parallel programming model, which is simpler to handle than other low-level parallel programming models. Additionally, Hadoop was used as an open-source version of the MapReduce parallel programming model. This research compared classification accuracy and timing results between the Hadoop and GPU systems, tested against the following cases: a combined CPU and GPU test case, a CPU-only test case, and a test case where no dimensional reduction was applied.

  20. File concepts for parallel I/O

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1989-01-01

    The subject of input/output (I/O) was often neglected in the design of parallel computer systems, although for many problems I/O rates will limit the speedup attainable. The I/O problem is addressed by considering the role of files in parallel systems. The notion of parallel files is introduced. Parallel files provide for concurrent access by multiple processes, and utilize parallelism in the I/O system to improve performance. Parallel files can also be used conventionally by sequential programs. A set of standard parallel file organizations is proposed, and implementations using multiple storage devices are suggested. Problem areas are also identified and discussed.

  1. Program For Parallel Discrete-Event Simulation

    NASA Technical Reports Server (NTRS)

    Beckman, Brian C.; Blume, Leo R.; Geiselman, John S.; Presley, Matthew T.; Wedel, John J., Jr.; Bellenot, Steven F.; Diloreto, Michael; Hontalas, Philip J.; Reiher, Peter L.; Weiland, Frederick P.

    1991-01-01

    The user does not have to add any special logic to aid in synchronization. The Time Warp Operating System (TWOS) computer program is a special-purpose operating system designed to support parallel discrete-event simulation. It is a complete implementation of the Time Warp mechanism, and supports only simulations and other computations designed for virtual time. The Time Warp Simulator (TWSIM) subdirectory contains a sequential simulation engine interface-compatible with TWOS. TWOS and TWSIM are written in, and support simulations in, the C programming language.

  2. LLMapReduce: Multi-Lingual Map-Reduce for Supercomputing Environments

    DTIC Science & Technology

    2015-11-20

    1990s. Popularized by Google [36] and Apache Hadoop [37], map-reduce has become a staple technology of the ever-growing big data community ... Lexington, MA, U.S.A. ... The map-reduce parallel programming model has become extremely popular in the big data community. Many big data ... to big data users running on a supercomputer. LLMapReduce dramatically simplifies map-reduce programming by providing simple parallel programming

  3. Creating a Parallel Version of VisIt for Microsoft Windows

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Whitlock, B J; Biagas, K S; Rawson, P L

    2011-12-07

    VisIt is a popular, free interactive parallel visualization and analysis tool for scientific data. Users can quickly generate visualizations from their data, animate them through time, manipulate them, and save the resulting images or movies for presentations. VisIt was designed from the ground up to work on many scales of computers from modest desktops up to massively parallel clusters. VisIt is comprised of a set of cooperating programs. All programs can be run locally or in client/server mode in which some run locally and some run remotely on compute clusters. The VisIt program most able to harness today's computing power is the VisIt compute engine. The compute engine is responsible for reading simulation data from disk, processing it, and sending results or images back to the VisIt viewer program. In a parallel environment, the compute engine runs several processes, coordinating using the Message Passing Interface (MPI) library. Each MPI process reads some subset of the scientific data and filters the data in various ways to create useful visualizations. By using MPI, VisIt has been able to scale well into the thousands of processors on large computers such as dawn and graph at LLNL. The advent of multicore CPUs has made parallelism the 'new' way to achieve increasing performance. With today's computers having at least 2 cores and in many cases up to 8 and beyond, it is more important than ever to deploy parallel software that can use that computing power not only on clusters but also on the desktop. We have created a parallel version of VisIt for Windows that uses Microsoft's MPI implementation (MSMPI) to process data in parallel on the Windows desktop as well as on a Windows HPC cluster running Microsoft Windows Server 2008. Initial desktop parallel support for Windows was deployed in VisIt 2.4.0. Windows HPC cluster support has been completed and will appear in the VisIt 2.5.0 release. We plan to continue supporting parallel VisIt on Windows so our users will be able to take full advantage of their multicore resources.

  4. Electric tempest in a teacup: The tea leaf analogy to microfluidic blood plasma separation

    NASA Astrophysics Data System (ADS)

    Yeo, Leslie Y.; Friend, James R.; Arifin, Dian R.

    2006-09-01

    In a similar fashion to Einstein's tea leaf paradox, the rotational liquid flow induced by ionic wind above a liquid surface can trap suspended microparticles by a helical motion, spinning them down towards a bottom stagnation point. The motion is similar to Batchelor [Q. J. Mech. Appl. Math. 4, 29 (1951)] flows occurring between stationary and rotating disks and arises due to a combination of the primary azimuthal and secondary bulk meridional recirculation that produces a centrifugal and enhanced inward radial force near the chamber bottom. The technology is thus useful for microfluidic particle trapping/concentration; the authors demonstrate its potential for rapid erythrocyte/blood plasma separation for miniaturized medical diagnostic kits.

  5. Debugging Fortran on a shared memory machine

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Allen, T.R.; Padua, D.A.

    1987-01-01

    Debugging on a parallel processor is more difficult than debugging on a serial machine because errors in a parallel program may introduce nondeterminism. The approach to parallel debugging presented here attempts to reduce the problem of debugging on a parallel machine to that of debugging on a serial machine by automatically detecting nondeterminism. 20 refs., 6 figs.

  6. A portable MPI-based parallel vector template library

    NASA Technical Reports Server (NTRS)

    Sheffler, Thomas J.

    1995-01-01

    This paper discusses the design and implementation of a polymorphic collection library for distributed address-space parallel computers. The library provides a data-parallel programming model for C++ by providing three main components: a single generic collection class, generic algorithms over collections, and generic algebraic combining functions. Collection elements are the fourth component of a program written using the library and may be either of the built-in types of C or of user-defined types. Many ideas are borrowed from the Standard Template Library (STL) of C++, although a restricted programming model is proposed because of the distributed address-space memory model assumed. Whereas the STL provides standard collections and implementations of algorithms for uniprocessors, this paper advocates standardizing interfaces that may be customized for different parallel computers. Just as the STL attempts to increase programmer productivity through code reuse, a similar standard for parallel computers could provide programmers with a standard set of algorithms portable across many different architectures. The efficacy of this approach is verified by examining performance data collected from an initial implementation of the library running on an IBM SP-2 and an Intel Paragon.

  7. A Portable MPI-Based Parallel Vector Template Library

    NASA Technical Reports Server (NTRS)

    Sheffler, Thomas J.

    1995-01-01

    This paper discusses the design and implementation of a polymorphic collection library for distributed address-space parallel computers. The library provides a data-parallel programming model for C++ by providing three main components: a single generic collection class, generic algorithms over collections, and generic algebraic combining functions. Collection elements are the fourth component of a program written using the library and may be either of the built-in types of C or of user-defined types. Many ideas are borrowed from the Standard Template Library (STL) of C++, although a restricted programming model is proposed because of the distributed address-space memory model assumed. Whereas the STL provides standard collections and implementations of algorithms for uniprocessors, this paper advocates standardizing interfaces that may be customized for different parallel computers. Just as the STL attempts to increase programmer productivity through code reuse, a similar standard for parallel computers could provide programmers with a standard set of algorithms portable across many different architectures. The efficacy of this approach is verified by examining performance data collected from an initial implementation of the library running on an IBM SP-2 and an Intel Paragon.

  8. Parallel computation and the basis system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smith, G.R.

    1993-05-01

    A software package has been written that can facilitate efforts to develop powerful, flexible, and easy-to-use programs that can run in single-processor, massively parallel, and distributed computing environments. Particular attention has been given to the difficulties posed by a program consisting of many science packages that represent subsystems of a complicated, coupled system. Methods have been found to maintain independence of the packages by hiding data structures without increasing the communications costs in a parallel computing environment. Concepts developed in this work are demonstrated by a prototype program that uses library routines from two existing software systems, Basis and Parallel Virtual Machine (PVM). Most of the details of these libraries have been encapsulated in routines and macros that could be rewritten for alternative libraries that possess certain minimum capabilities. The prototype software uses a flexible master-and-slaves paradigm for parallel computation and supports domain decomposition with message passing for partitioning work among slaves. Facilities are provided for accessing variables that are distributed among the memories of slaves assigned to subdomains. The software is named PROTOPAR.

  9. Performance Evaluation of Remote Memory Access (RMA) Programming on Shared Memory Parallel Computers

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Jost, Gabriele; Biegel, Bryan A. (Technical Monitor)

    2002-01-01

    The purpose of this study is to evaluate the feasibility of remote memory access (RMA) programming on shared memory parallel computers. We discuss different RMA based implementations of selected CFD application benchmark kernels and compare them to corresponding message passing based codes. For the message-passing implementation we use MPI point-to-point and global communication routines. For the RMA based approach we consider two different libraries supporting this programming model. One is a shared memory parallelization library (SMPlib) developed at NASA Ames, the other is the MPI-2 extensions to the MPI Standard. We give timing comparisons for the different implementation strategies and discuss the performance.
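
    The two programming styles compared in the study differ in who participates in a transfer. A minimal sketch of each, assuming the MPI-2 one-sided extensions for the RMA case (the SMPlib interface is not shown here):

      // One-sided RMA versus two-sided message passing; run with >= 2 ranks.
      #include <mpi.h>
      #include <cstdio>

      int main(int argc, char** argv) {
          MPI_Init(&argc, &argv);
          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          double buf = 0.0;

          // One-sided (RMA): expose buf in a window; rank 0 writes directly
          // into rank 1's memory with no matching receive call.
          MPI_Win win;
          MPI_Win_create(&buf, sizeof(double), sizeof(double),
                         MPI_INFO_NULL, MPI_COMM_WORLD, &win);
          MPI_Win_fence(0, win);
          if (rank == 0) {
              double value = 3.14;
              MPI_Put(&value, 1, MPI_DOUBLE, /*target rank*/ 1,
                      /*target displacement*/ 0, 1, MPI_DOUBLE, win);
          }
          MPI_Win_fence(0, win);   // all ranks synchronize; the put is complete
          if (rank == 1) std::printf("received via RMA: %f\n", buf);
          MPI_Win_free(&win);

          // Two-sided equivalent: sender and receiver must both participate.
          if (rank == 0) {
              double value = 3.14;
              MPI_Send(&value, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
          } else if (rank == 1) {
              MPI_Recv(&buf, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
          }
          MPI_Finalize();
          return 0;
      }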

  10. The Automated Instrumentation and Monitoring System (AIMS) reference manual

    NASA Technical Reports Server (NTRS)

    Yan, Jerry; Hontalas, Philip; Listgarten, Sherry

    1993-01-01

    Whether a researcher is designing the 'next parallel programming paradigm,' another 'scalable multiprocessor,' or investigating resource allocation algorithms for multiprocessors, a facility that enables parallel program execution to be captured and displayed is invaluable. Careful analysis of execution traces can help computer designers and software architects to uncover system behavior and to take advantage of specific application characteristics and hardware features. A software tool kit that facilitates performance evaluation of parallel applications on multiprocessors is described. The Automated Instrumentation and Monitoring System (AIMS) has four major software components: a source code instrumentor, which automatically inserts active event recorders into the program's source code before compilation; a run-time performance-monitoring library, which collects performance data; a trace file animation and analysis tool kit, which reconstructs program execution from the trace file; and a trace post-processor, which compensates for data collection overhead. Besides being used as a prototype for developing new techniques for instrumenting, monitoring, and visualizing parallel program execution, AIMS is also being incorporated into the run-time environments of various hardware test beds to evaluate their impact on user productivity. Currently, AIMS instrumentors accept FORTRAN and C parallel programs written for Intel's NX operating system on the iPSC family of multicomputers. A run-time performance-monitoring library for the iPSC/860 is included in this release. We plan to release monitors for other platforms (such as PVM and TMC's CM-5) in the near future. Performance data collected can be graphically displayed on workstations (e.g., Sun SPARC and SGI) supporting X-Windows (in particular, X11R5, Motif 1.1.3).
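
    The effect of a source code instrumentor can be pictured as wrapping statements with event recorders. The macro below is a hand-written, hypothetical stand-in for what AIMS inserts automatically; the real system records to a trace file, and its post-processor compensates for the recording overhead.

      // Illustrative event recording around a program phase (not AIMS's API).
      #include <cstdio>
      #include <ctime>

      // Emit begin/end events with a clock() timestamp around a statement;
      // a post-processor could later subtract the recording cost itself.
      #define TRACE(name, stmt)                                                 \
          do {                                                                  \
              std::fprintf(stderr, "BEGIN %s %ld\n", name, (long)std::clock()); \
              stmt;                                                             \
              std::fprintf(stderr, "END   %s %ld\n", name, (long)std::clock()); \
          } while (0)

      static void compute() { /* application work goes here */ }

      int main() {
          TRACE("compute_phase", compute());
          return 0;
      }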

  11. Tensor contraction engine: Abstraction and automated parallel implementation of configuration-interaction, coupled-cluster, and many-body perturbation theories

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hirata, So

    2003-11-20

    We develop a symbolic manipulation program and program generator (Tensor Contraction Engine or TCE) that automatically derives the working equations of a well-defined model of second-quantized many-electron theories and synthesizes efficient parallel computer programs on the basis of these equations. Provided an ansatz of a many-electron theory model, TCE performs valid contractions of creation and annihilation operators according to Wick's theorem, consolidates identical terms, and reduces the expressions into the form of multiple tensor contractions acted upon by permutation operators. Subsequently, it determines the binary contraction order for each multiple tensor contraction with the minimal operation and memory cost, factorizes common binary contractions (defines intermediate tensors), and identifies reusable intermediates. The resulting ordered list of binary tensor contractions, additions, and index permutations is translated into an optimized program that is combined with the NWChem and UTChem computational chemistry software packages. The programs synthesized by TCE take advantage of spin symmetry, Abelian point-group symmetry, and index permutation symmetry at every stage of calculations to minimize the number of arithmetic operations and the storage requirement, adjust the peak local memory usage by index range tiling, and support parallel I/O interfaces and dynamic load balancing for parallel executions. We demonstrate the utility of TCE through automatic derivation and implementation of parallel programs for various models of configuration-interaction theory (CISD, CISDT, CISDTQ), many-body perturbation theory [MBPT(2), MBPT(3), MBPT(4)], and coupled-cluster theory (LCCD, CCD, LCCSD, CCSD, QCISD, CCSDT, and CCSDTQ).
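
    The payoff of choosing a binary contraction order can be seen on a toy case. Evaluating D(a,c) = sum_{b,d} A(a,b) B(b,d) C(d,c) directly costs O(n^4) operations, while factorizing through the intermediate T(a,d) = sum_b A(a,b) B(b,d) gives two ordinary matrix products at O(n^3) each. The sketch below (illustrative C++, not TCE output) makes the intermediate explicit.

      // Toy illustration of binary-contraction factorization via T = A*B.
      #include <vector>

      using Matrix = std::vector<std::vector<double>>;

      // Binary contraction over one shared index: an ordinary matrix product,
      // costing O(n^3) operations for n x n operands.
      Matrix contract(const Matrix& X, const Matrix& Y) {
          size_t n = X.size();
          Matrix Z(n, std::vector<double>(n, 0.0));
          for (size_t i = 0; i < n; ++i)
              for (size_t k = 0; k < n; ++k)
                  for (size_t j = 0; j < n; ++j)
                      Z[i][j] += X[i][k] * Y[k][j];
          return Z;
      }

      Matrix triple(const Matrix& A, const Matrix& B, const Matrix& C) {
          Matrix T = contract(A, B);   // intermediate tensor; reusable if the
          return contract(T, C);       // same product appears in another term
      }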

  12. Hybrid-view programming of nuclear fusion simulation code in the PGAS parallel programming language XcalableMP

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tsugane, Keisuke; Boku, Taisuke; Murai, Hitoshi

    Recently, the Partitioned Global Address Space (PGAS) parallel programming model has emerged as a usable distributed memory programming model. XcalableMP (XMP) is a PGAS parallel programming language that extends base languages such as C and Fortran with directives in OpenMP-like style. XMP supports a global-view model that allows programmers to define global data and to map them to a set of processors, which execute the distributed global data as a single thread. In XMP, the concept of a coarray is also employed for local-view programming. In this study, we port Gyrokinetic Toroidal Code - Princeton (GTC-P), which is a three-dimensional gyrokinetic PIC code developed at Princeton University to study the microturbulence phenomenon in magnetically confined fusion plasmas, to XMP as an example of hybrid memory model coding with the global-view and local-view programming models. In local-view programming, the coarray notation is simple and intuitive compared with Message Passing Interface (MPI) programming, while the performance is comparable to that of the MPI version. Because the global-view programming model is suitable for expressing the data parallelism for a field of grid space data, we implement a hybrid-view version that uses a global-view programming model to compute the field and a local-view programming model to compute the movement of particles. The performance of the hybrid-view version is degraded by 20% compared with the original MPI version, but it facilitates more natural data expression for static grid space data (in the global-view model) and dynamic particle data (in the local-view model), and it also increases the readability of the code for higher productivity.

  13. Hybrid-view programming of nuclear fusion simulation code in the PGAS parallel programming language XcalableMP

    DOE PAGES

    Tsugane, Keisuke; Boku, Taisuke; Murai, Hitoshi; ...

    2016-06-01

    Recently, the Partitioned Global Address Space (PGAS) parallel programming model has emerged as a usable distributed memory programming model. XcalableMP (XMP) is a PGAS parallel programming language that extends base languages such as C and Fortran with directives in OpenMP-like style. XMP supports a global-view model that allows programmers to define global data and to map them to a set of processors, which execute the distributed global data as a single thread. In XMP, the concept of a coarray is also employed for local-view programming. In this study, we port Gyrokinetic Toroidal Code - Princeton (GTC-P), which is a three-dimensional gyrokinetic PIC code developed at Princeton University to study the microturbulence phenomenon in magnetically confined fusion plasmas, to XMP as an example of hybrid memory model coding with the global-view and local-view programming models. In local-view programming, the coarray notation is simple and intuitive compared with Message Passing Interface (MPI) programming, while the performance is comparable to that of the MPI version. Because the global-view programming model is suitable for expressing the data parallelism for a field of grid space data, we implement a hybrid-view version that uses a global-view programming model to compute the field and a local-view programming model to compute the movement of particles. The performance of the hybrid-view version is degraded by 20% compared with the original MPI version, but it facilitates more natural data expression for static grid space data (in the global-view model) and dynamic particle data (in the local-view model), and it also increases the readability of the code for higher productivity.

  14. Parent-Child Parallel-Group Intervention for Childhood Aggression in Hong Kong

    ERIC Educational Resources Information Center

    Fung, Annis L. C.; Tsang, Sandra H. K. M.

    2006-01-01

    This article reports the original evidence-based outcome study on a parent-child parallel-group Anger Coping Training (ACT) program for children aged 8-10 with reactive aggression and their parents in Hong Kong. This research program involved experimental and control groups with pre- and post-comparison. Quantitative data collection…

  15. Parallel computer vision

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Uhr, L.

    1987-01-01

    This book is written by research scientists involved in the development of massively parallel, but hierarchically structured, algorithms, architectures, and programs for image processing, pattern recognition, and computer vision. The book gives an integrated picture of the programs and algorithms that are being developed, and also of the multi-computer hardware architectures for which these systems are designed.

  16. Parallel Performance of a Combustion Chemistry Simulation

    DOE PAGES

    Skinner, Gregg; Eigenmann, Rudolf

    1995-01-01

    We used a description of a combustion simulation's mathematical and computational methods to develop a version for parallel execution. The result was a reasonable performance improvement on small numbers of processors. We applied several important programming techniques, which we describe, in optimizing the application. This work has implications for programming languages, compiler design, and software engineering.

  17. Algorithms and programming tools for image processing on the MPP, part 2

    NASA Technical Reports Server (NTRS)

    Reeves, Anthony P.

    1986-01-01

    A number of algorithms were developed for image warping and pyramid image filtering. Techniques were investigated for the parallel processing of a large number of independent, irregularly shaped regions on the MPP. In addition, some utilities for dealing with very long vectors and for sorting were developed. Documentation pages for the algorithms which are available for distribution are given. The performance of the MPP for a number of basic data manipulations was determined. From these results it is possible to predict the efficiency of the MPP for a number of algorithms and applications. The Parallel Pascal development system, which is a portable programming environment for the MPP, was improved, and better documentation, including a tutorial, was written. This environment allows programs for the MPP to be developed on any conventional computer system; it consists of a set of system programs and a library of general-purpose Parallel Pascal functions. The algorithms were tested on the MPP, and a presentation on the development system was made to the MPP users group. The UNIX version of the Parallel Pascal system was distributed to a number of new sites.

  18. Scalable Unix commands for parallel processors : a high-performance implementation.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ong, E.; Lusk, E.; Gropp, W.

    2001-06-22

    We describe a family of MPI applications we call the Parallel Unix Commands. These commands are natural parallel versions of common Unix user commands such as ls, ps, and find, together with a few similar commands particular to the parallel environment. We describe the design and implementation of these programs and present some performance results on a 256-node Linux cluster. The Parallel Unix Commands are open source and freely available.
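
    The underlying pattern is simple: every rank runs the local part of a command, and one rank collects and prints the results. A minimal sketch of a parallel `hostname` in this style (illustrative C++/MPI, not the suite's actual implementation; the real tools run commands such as ls or ps the same way):

      // Each rank reports its hostname; rank 0 gathers and prints them all.
      #include <mpi.h>
      #include <unistd.h>
      #include <cstdio>
      #include <vector>

      int main(int argc, char** argv) {
          MPI_Init(&argc, &argv);
          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          char name[64] = {0};
          gethostname(name, sizeof(name) - 1);    // local part of the command

          std::vector<char> all;                  // gathered output at rank 0
          if (rank == 0) all.resize(64 * size);
          MPI_Gather(name, 64, MPI_CHAR,
                     rank == 0 ? all.data() : nullptr, 64, MPI_CHAR,
                     0, MPI_COMM_WORLD);
          if (rank == 0)
              for (int r = 0; r < size; ++r)
                  std::printf("rank %d: %s\n", r, &all[64 * r]);
          MPI_Finalize();
          return 0;
      }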

  19. Parallel language constructs for tensor product computations on loosely coupled architectures

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush; Van Rosendale, John

    1989-01-01

    A set of language primitives designed to allow the specification of parallel numerical algorithms at a higher level is described. The authors focus on tensor product array computations, a simple but important class of numerical algorithms. They consider first the problem of programming one-dimensional kernel routines, such as parallel tridiagonal solvers, and then look at how such parallel kernels can be combined to form parallel tensor product algorithms.

  20. A CS1 pedagogical approach to parallel thinking

    NASA Astrophysics Data System (ADS)

    Rague, Brian William

    Almost all collegiate programs in Computer Science offer an introductory course in programming primarily devoted to communicating the foundational principles of software design and development. The ACM designates this introduction to computer programming course for first-year students as CS1, during which methodologies for solving problems within a discrete computational context are presented. Logical thinking is highlighted, guided primarily by a sequential approach to algorithm development and made manifest by typically using the latest, commercially successful programming language. In response to the most recent developments in accessible multicore computers, instructors of these introductory classes may wish to include training on how to design workable parallel code. Novel issues arise when programming concurrent applications which can make teaching these concepts to beginning programmers a seemingly formidable task. Student comprehension of design strategies related to parallel systems should be monitored to ensure an effective classroom experience. This research investigated the feasibility of integrating parallel computing concepts into the first-year CS classroom. To quantitatively assess student comprehension of parallel computing, an experimental educational study using a two-factor mixed group design was conducted to evaluate two instructional interventions in addition to a control group: (1) topic lecture only, and (2) topic lecture with laboratory work using a software visualization Parallel Analysis Tool (PAT) specifically designed for this project. A new evaluation instrument developed for this study, the Perceptions of Parallelism Survey (PoPS), was used to measure student learning regarding parallel systems. The results from this educational study show a statistically significant main effect among the repeated measures, implying that student comprehension levels of parallel concepts as measured by the PoPS improve immediately after the delivery of any initial three-week CS1 level module when compared with student comprehension levels just prior to starting the course. Survey results measured during the ninth week of the course reveal that performance levels remained high compared to pre-course performance scores. A second result produced by this study reveals no statistically significant interaction effect between the intervention method and student performance as measured by the evaluation instrument over three separate testing periods. However, visual inspection of survey score trends and the low p-value generated by the interaction analysis (0.062) indicate that further studies may verify improved concept retention levels for the lecture w/PAT group.

  1. YAPPA: a Compiler-Based Parallelization Framework for Irregular Applications on MPSoCs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lovergine, Silvia; Tumeo, Antonino; Villa, Oreste

    Modern embedded systems include hundreds of cores. Because of the difficulty in providing a fast, coherent memory architecture, these systems usually rely on non-coherent, non-uniform memory architectures with private memories for each core. However, programming these systems poses significant challenges. The developer must extract large amounts of parallelism, while orchestrating communication among cores to optimize application performance. These issues become even more significant with irregular applications, which present data sets that are difficult to partition, unpredictable memory accesses, unbalanced control flow, and fine-grained communication. Hand-optimizing every single aspect is hard and time-consuming, and it often does not lead to the expected performance. There is a growing gap between such complex and highly parallel architectures and the high-level languages used to describe the specification, which were designed for simpler systems and do not consider these new issues. In this paper we introduce YAPPA (Yet Another Parallel Programming Approach), a compilation framework for the automatic parallelization of irregular applications on modern MPSoCs, based on LLVM. We start by considering an efficient parallel programming approach for irregular applications on distributed memory systems. We then propose a set of transformations that can reduce the development and optimization effort. The results of our initial prototype confirm the correctness of the proposed approach.

  2. PISCES 2 users manual

    NASA Technical Reports Server (NTRS)

    Pratt, Terrence W.

    1987-01-01

    PISCES 2 is a programming environment and set of extensions to Fortran 77 for parallel programming. It is intended to provide a basis for writing programs for scientific and engineering applications on parallel computers in a way that is relatively independent of the particular details of the underlying computer architecture. This user's manual provides a complete description of the PISCES 2 system as it is currently implemented on the 20 processor Flexible FLEX/32 at NASA Langley Research Center.

  3. A language comparison for scientific computing on MIMD architectures

    NASA Technical Reports Server (NTRS)

    Jones, Mark T.; Patrick, Merrell L.; Voigt, Robert G.

    1989-01-01

    Choleski's method for solving banded symmetric, positive definite systems is implemented on a multiprocessor computer using three Fortran-based parallel programming languages: the Force, PISCES, and Concurrent FORTRAN. The capabilities of the languages for expressing parallelism and their user-friendliness are discussed, including readability of the code, debugging assistance offered, and expressiveness of the languages. The performance of the different implementations is compared. It is argued that PISCES, using the Force for medium-grained parallelism, is the appropriate choice for programming Choleski's method on the multiprocessor computer Flex/32.

  4. Code Parallelization with CAPO: A User Manual

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Frumkin, Michael; Yan, Jerry; Biegel, Bryan (Technical Monitor)

    2001-01-01

    A software tool has been developed to assist the parallelization of scientific codes. This tool, CAPO, extends an existing parallelization toolkit, CAPTools, developed at the University of Greenwich, to generate OpenMP parallel codes for shared memory architectures. It is an interactive toolkit that transforms a serial Fortran application code into an equivalent parallel version - in a small fraction of the time normally required for a manual parallelization. We first discuss the way in which loop types are categorized and how efficient OpenMP directives can be defined and inserted into the existing code using in-depth interprocedural analysis. The use of the toolkit on a number of application codes, ranging from benchmarks to real-world applications, is then presented. This demonstrates the great potential of using the toolkit to quickly parallelize serial programs as well as the good performance achievable on a large number of processors. The second part of the document gives references to the parameters and the graphic user interface implemented in the toolkit. Finally, a set of tutorials is included for hands-on experience with this toolkit.

  5. Thread concept for automatic task parallelization in image analysis

    NASA Astrophysics Data System (ADS)

    Lueckenhaus, Maximilian; Eckstein, Wolfgang

    1998-09-01

    Parallel processing of image analysis tasks is an essential method to speed up image processing and helps to exploit the full capacity of distributed systems. However, writing parallel code is a difficult and time-consuming process and often leads to an architecture-dependent program that has to be re-implemented when changing the hardware. Therefore it is highly desirable to do the parallelization automatically. For this we have developed a special kind of thread concept for image analysis tasks. Threads derived from one subtask may share objects and run in the same context but may follow different threads of execution and work on different data in parallel. In this paper we describe the basics of our thread concept and show how it can be used as the basis of an automatic task parallelization to speed up image processing. We further illustrate the design and implementation of an agent-based system that uses image analysis threads for generating and processing parallel programs by taking into account the available hardware. The tests made with our system prototype show that the thread concept combined with the agent paradigm is suitable to speed up image processing by an automatic parallelization of image analysis tasks.

  6. Preconditioned implicit solvers for the Navier-Stokes equations on distributed-memory machines

    NASA Technical Reports Server (NTRS)

    Ajmani, Kumud; Liou, Meng-Sing; Dyson, Rodger W.

    1994-01-01

    The GMRES method is parallelized, and combined with local preconditioning to construct an implicit parallel solver to obtain steady-state solutions for the Navier-Stokes equations of fluid flow on distributed-memory machines. The new implicit parallel solver is designed to preserve the convergence rate of the equivalent 'serial' solver. A static domain decomposition is used to partition the computational domain amongst the available processing nodes of the parallel machine. The SPMD (Single-Program Multiple-Data) programming model is combined with message-passing tools to develop the parallel code on a 32-node Intel Hypercube and a 512-node Intel Delta machine. The implicit parallel solver is validated for internal and external flow problems, and is found to agree identically with flow solutions obtained on a Cray Y-MP/8. A peak computational speed of 2300 MFlops/sec has been achieved on 512 nodes of the Intel Delta machine, for a problem size of 1024 K equations (256 K grid points).
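
    The static domain decomposition described above boils down to each node owning a strip of the grid plus ghost cells that are refreshed by message passing every iteration. A minimal SPMD sketch of such a halo exchange (illustrative C++/MPI; the preconditioned GMRES kernel itself is omitted):

      // 1-D strip decomposition with one layer of ghost cells per side.
      #include <mpi.h>
      #include <vector>

      int main(int argc, char** argv) {
          MPI_Init(&argc, &argv);
          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          const int local_n = 256;                   // interior points per rank
          std::vector<double> u(local_n + 2, 1.0);   // +2 ghost cells
          int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
          int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

          // Exchange boundary values; MPI_PROC_NULL neighbors are no-ops.
          MPI_Sendrecv(&u[1],           1, MPI_DOUBLE, left,  0,
                       &u[local_n + 1], 1, MPI_DOUBLE, right, 0,
                       MPI_COMM_WORLD, MPI_STATUS_IGNORE);
          MPI_Sendrecv(&u[local_n],     1, MPI_DOUBLE, right, 1,
                       &u[0],           1, MPI_DOUBLE, left,  1,
                       MPI_COMM_WORLD, MPI_STATUS_IGNORE);
          // ... local solver step on u[1..local_n] would follow here ...
          MPI_Finalize();
          return 0;
      }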

  7. Turbulent Heating and Wave Pressure in Solar Wind Acceleration Modeling: New Insights to Empirical Forecasting of the Solar Wind

    NASA Astrophysics Data System (ADS)

    Woolsey, L. N.; Cranmer, S. R.

    2013-12-01

    The study of solar wind acceleration has made several important advances recently due to improvements in modeling techniques. Existing code and simulations test the competing theories for coronal heating, which include reconnection/loop-opening (RLO) models and wave/turbulence-driven (WTD) models. In order to compare and contrast the validity of these theories, we need flexible tools that predict the emergent solar wind properties from a wide range of coronal magnetic field structures such as coronal holes, pseudostreamers, and helmet streamers. ZEPHYR (Cranmer et al. 2007) is a one-dimensional magnetohydrodynamics code that includes Alfven wave generation and reflection and the resulting turbulent heating to accelerate solar wind in open flux tubes. We present the ZEPHYR output for a wide range of magnetic field geometries to show the effect of the magnetic field profiles on wind properties. We also investigate the competing acceleration mechanisms found in ZEPHYR to determine the relative importance of increased gas pressure from turbulent heating and the separate pressure source from the Alfven waves. To do so, we developed a code that will become publicly available for solar wind prediction. This code, TEMPEST, provides an outflow solution based on only one input: the magnetic field strength as a function of height above the photosphere. It uses correlations found in ZEPHYR between the magnetic field strength at the source surface and the temperature profile of the outflow solution to compute the wind speed profile based on the increased gas pressure from turbulent heating. With this initial solution, TEMPEST then adds in the Alfven wave pressure term to the modified Parker equation and iterates to find a stable solution for the wind speed. This code, therefore, can make predictions of the wind speeds that will be observed at 1 AU based on extrapolations from magnetogram data, providing a useful tool for empirical forecasting of the solar wind.
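
    The iteration the abstract describes can be pictured as a fixed-point computation: start from the wind-speed profile implied by the turbulent-heating correlation alone, then repeatedly fold in the wave-pressure contribution and re-solve until the profile stops changing. The functions below are toy stand-ins, not TEMPEST's actual correlations or equations; only the iterate-to-convergence shape is taken from the abstract.

      // Hedged sketch of an iterate-to-convergence wind-speed solver (C++).
      #include <algorithm>
      #include <cmath>
      #include <vector>

      // Stand-in for the ZEPHYR-derived heating correlation: a wind speed
      // profile from the magnetic field alone (form and units invented).
      std::vector<double> speed_from_heating(const std::vector<double>& B) {
          std::vector<double> v(B.size());
          for (size_t i = 0; i < B.size(); ++i) v[i] = 100.0 / (1.0 + B[i]);
          return v;
      }

      // Stand-in for re-solving with an added wave-pressure term: relax
      // toward a profile boosted by a constant wave contribution.
      std::vector<double> add_wave_pressure(const std::vector<double>& B,
                                            const std::vector<double>& v) {
          std::vector<double> out(v.size());
          for (size_t i = 0; i < v.size(); ++i)
              out[i] = 0.5 * (v[i] + 100.0 / (1.0 + B[i]) + 5.0);
          return out;
      }

      std::vector<double> solve_wind(const std::vector<double>& B, double tol) {
          std::vector<double> v = speed_from_heating(B);   // gas pressure only
          for (int iter = 0; iter < 100; ++iter) {
              std::vector<double> v_new = add_wave_pressure(B, v);
              double delta = 0.0;
              for (size_t i = 0; i < v.size(); ++i)
                  delta = std::max(delta, std::fabs(v_new[i] - v[i]));
              v = v_new;
              if (delta < tol) break;                      // stable solution
          }
          return v;
      }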

  8. PIPS-SBB: A Parallel Distributed-Memory Branch-and-Bound Algorithm for Stochastic Mixed-Integer Programs

    DOE PAGES

    Munguia, Lluis-Miquel; Oxberry, Geoffrey; Rajan, Deepak

    2016-05-01

    Stochastic mixed-integer programs (SMIPs) deal with optimization under uncertainty at many levels of the decision-making process. When solved as extensive-formulation mixed-integer programs, problem instances can exceed available memory on a single workstation. In order to overcome this limitation, we present PIPS-SBB: a distributed-memory parallel stochastic MIP solver that takes advantage of parallelism at multiple levels of the optimization process. We also show promising results on the SIPLIB benchmark by combining methods known for accelerating Branch and Bound (B&B) methods with new ideas that leverage the structure of SMIPs. Finally, we expect the performance of PIPS-SBB to improve further as more functionality is added in the future.

  9. On the utility of threads for data parallel programming

    NASA Technical Reports Server (NTRS)

    Fahringer, Thomas; Haines, Matthew; Mehrotra, Piyush

    1995-01-01

    Threads provide a useful programming model for asynchronous behavior because of their ability to encapsulate units of work that can then be scheduled for execution at runtime, based on the dynamic state of a system. Recently, the threaded model has been applied to the domain of data parallel scientific codes, and initial reports indicate that the threaded model can produce performance gains over non-threaded approaches, primarily through the use of overlapping useful computation with communication latency. However, overlapping computation with communication is possible without the benefit of threads if the communication system supports asynchronous primitives, and this comparison has not been made in previous papers. This paper provides a critical look at the utility of lightweight threads as applied to data parallel scientific programming.
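
    The comparison the paper calls for can be made concrete without threads at all: asynchronous message-passing primitives let useful computation proceed while a transfer is in flight. A minimal sketch (illustrative C++/MPI; ranks are paired, and odd leftover ranks simply skip the exchange):

      // Overlapping computation with communication via MPI_Isend/MPI_Irecv.
      #include <mpi.h>
      #include <vector>

      int main(int argc, char** argv) {
          MPI_Init(&argc, &argv);
          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);
          std::vector<double> halo(1024, 1.0), interior(1 << 20, 1.0);
          MPI_Request req = MPI_REQUEST_NULL;

          int partner = rank ^ 1;     // pair ranks 0-1, 2-3, ...
          if (partner < size) {
              if (rank % 2 == 0)
                  MPI_Irecv(halo.data(), (int)halo.size(), MPI_DOUBLE, partner,
                            0, MPI_COMM_WORLD, &req);
              else
                  MPI_Isend(halo.data(), (int)halo.size(), MPI_DOUBLE, partner,
                            0, MPI_COMM_WORLD, &req);
          }
          double sum = 0.0;                       // useful work proceeds
          for (double x : interior) sum += x;     // while the transfer runs
          MPI_Wait(&req, MPI_STATUS_IGNORE);      // block only when needed
          MPI_Finalize();
          return 0;
      }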

  10. Enabling Requirements-Based Programming for Highly-Dependable Complex Parallel and Distributed Systems

    NASA Technical Reports Server (NTRS)

    Hinchey, Michael G.; Rash, James L.; Rouff, Christopher A.

    2005-01-01

    The manual application of formal methods in system specification has produced successes, but in the end, despite any claims and assertions by practitioners, there is no provable relationship between a manually derived system specification or formal model and the customer's original requirements. Complex parallel and distributed systems present the worst-case implications for today's dearth of viable approaches for achieving system dependability. No avenue other than formal methods constitutes a serious contender for resolving the problem, and so recognition of requirements-based programming has come at a critical juncture. We describe a new, NASA-developed automated requirements-based programming method that can be applied to certain classes of systems, including complex parallel and distributed systems, to achieve a high degree of dependability.

  11. A design methodology for portable software on parallel computers

    NASA Technical Reports Server (NTRS)

    Nicol, David M.; Miller, Keith W.; Chrisman, Dan A.

    1993-01-01

    This final report for research that was supported by grant number NAG-1-995 documents our progress in addressing two difficulties in parallel programming. The first difficulty is developing software that will execute quickly on a parallel computer. The second difficulty is transporting software between dissimilar parallel computers. In general, we expect that more hardware-specific information will be included in software designs for parallel computers than in designs for sequential computers. This inclusion is an instance of portability being sacrificed for high performance. New parallel computers are being introduced frequently. Trying to keep one's software on the current high performance hardware, a software developer almost continually faces yet another expensive software transportation. The problem of the proposed research is to create a design methodology that helps designers to more precisely control both portability and hardware-specific programming details. The proposed research emphasizes programming for scientific applications. We completed our study of the parallelizability of a subsystem of the NASA Earth Radiation Budget Experiment (ERBE) data processing system. This work is summarized in section two. A more detailed description is provided in Appendix A ('Programming Practices to Support Eventual Parallelism'). Mr. Chrisman, a graduate student, wrote and successfully defended a Ph.D. dissertation proposal which describes our research associated with the issues of software portability and high performance. The list of research tasks are specified in the proposal. The proposal 'A Design Methodology for Portable Software on Parallel Computers' is summarized in section three and is provided in its entirety in Appendix B. We are currently studying a proposed subsystem of the NASA Clouds and the Earth's Radiant Energy System (CERES) data processing system. This software is the proof-of-concept for the Ph.D. dissertation. We have implemented and measured the performance of a portion of this subsystem on the Intel iPSC/2 parallel computer. These results are provided in section four. Our future work is summarized in section five, our acknowledgements are stated in section six, and references for published papers associated with NAG-1-995 are provided in section seven.

  12. Massively parallel sparse matrix function calculations with NTPoly

    NASA Astrophysics Data System (ADS)

    Dawson, William; Nakajima, Takahito

    2018-04-01

    We present NTPoly, a massively parallel library for computing the functions of sparse, symmetric matrices. The theory of matrix functions is a well-developed framework with a wide range of applications including differential equations, graph theory, and electronic structure calculations. One particularly important application area is diagonalization-free methods in quantum chemistry. When the input and output of the matrix function are sparse, methods based on polynomial expansions can be used to compute matrix functions in linear time. We present a library based on these methods that can compute a variety of matrix functions. Distributed memory parallelization is based on a communication-avoiding sparse matrix multiplication algorithm. OpenMP task parallelization is utilized to implement hybrid parallelization. We describe NTPoly's interface and show how it can be integrated with programs written in many different programming languages. We demonstrate the merits of NTPoly by performing large scale calculations on the K computer.
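
    The polynomial-expansion idea at the heart of such libraries needs only matrix products and additions, which remain cheap as long as the matrices stay sparse. A toy Horner evaluation of p(A) (dense, serial C++ for brevity; NTPoly's own distributed sparse kernels are not shown):

      // Evaluate p(A) = c0*I + c1*A + ... + cm*A^m by Horner's rule.
      #include <vector>

      using Matrix = std::vector<std::vector<double>>;

      Matrix multiply(const Matrix& A, const Matrix& B) {
          size_t n = A.size();
          Matrix C(n, std::vector<double>(n, 0.0));
          for (size_t i = 0; i < n; ++i)
              for (size_t k = 0; k < n; ++k)
                  for (size_t j = 0; j < n; ++j)
                      C[i][j] += A[i][k] * B[k][j];
          return C;
      }

      // Horner: start with P = cm*I, then repeat P = P*A + ck*I for k = m-1..0.
      Matrix poly_of_matrix(const Matrix& A, const std::vector<double>& c) {
          size_t n = A.size();
          Matrix P(n, std::vector<double>(n, 0.0));
          for (size_t i = 0; i < n; ++i) P[i][i] = c.back();   // cm * I
          for (int k = (int)c.size() - 2; k >= 0; --k) {
              P = multiply(P, A);
              for (size_t i = 0; i < n; ++i) P[i][i] += c[k];  // + ck * I
          }
          return P;
      }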

  13. pWeb: A High-Performance, Parallel-Computing Framework for Web-Browser-Based Medical Simulation.

    PubMed

    Halic, Tansel; Ahn, Woojin; De, Suvranu

    2014-01-01

    This work presents pWeb, a new language and compiler for parallelization of client-side compute-intensive web applications such as surgical simulations. The recently introduced HTML5 standard has enabled creating unprecedented applications on the web. The low performance of the web browser relative to native applications, however, remains the bottleneck for computationally intensive applications, including visualization of complex scenes, real-time physical simulations, and image processing. The new proposed language is built upon web workers for multithreaded programming in HTML5. The language provides the fundamental functionalities of parallel programming languages as well as the fork/join parallel model, which is not supported by web workers. The language compiler automatically generates an equivalent parallel script that complies with the HTML5 standard. A case study on realistic rendering for surgical simulations demonstrates enhanced performance with a compact set of instructions.
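
    pWeb itself targets HTML5 web workers, but the fork/join model it supplies on top of them is language-neutral: fork a set of subtasks, then join on their results. A sketch of the same pattern with C++ futures (illustrative only, not pWeb syntax):

      // Fork/join in miniature: launch subtasks, then join and combine.
      #include <future>
      #include <vector>
      #include <cstdio>

      static double render_chunk(int chunk) {
          return chunk * 1.5;   // stand-in for a slice of rendering work
      }

      int main() {
          std::vector<std::future<double>> forks;
          for (int c = 0; c < 8; ++c)        // fork: launch subtasks
              forks.push_back(std::async(std::launch::async, render_chunk, c));
          double total = 0.0;
          for (auto& f : forks) total += f.get();   // join: wait and combine
          std::printf("total = %f\n", total);
          return 0;
      }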

  14. Automatic data partitioning on distributed memory multicomputers. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Gupta, Manish

    1992-01-01

    Distributed-memory parallel computers are increasingly being used to provide high levels of performance for scientific applications. Unfortunately, such machines are not very easy to program. A number of research efforts seek to alleviate this problem by developing compilers that take over the task of generating communication. The communication overheads and the extent of parallelism exploited in the resulting target program are determined largely by the manner in which data is partitioned across different processors of the machine. Most of the compilers provide no assistance to the programmer in the crucial task of determining a good data partitioning scheme. A novel approach is presented, the constraints-based approach, to the problem of automatic data partitioning for numeric programs. In this approach, the compiler identifies some desirable requirements on the distribution of various arrays being referenced in each statement, based on performance considerations. These desirable requirements are referred to as constraints. For each constraint, the compiler determines a quality measure that captures its importance with respect to the performance of the program. The quality measure is obtained through static performance estimation, without actually generating the target data-parallel program with explicit communication. Each data distribution decision is taken by combining all the relevant constraints. The compiler attempts to resolve any conflicts between constraints such that the overall execution time of the parallel program is minimized. This approach has been implemented as part of a compiler called Paradigm, that accepts Fortran 77 programs, and specifies the partitioning scheme to be used for each array in the program. We have obtained results on some programs taken from the Linpack and Eispack libraries, and the Perfect Benchmarks. These results are quite promising, and demonstrate the feasibility of automatic data partitioning for a significant class of scientific application programs with regular computations.

  15. GRADSPMHD: A parallel MHD code based on the SPH formalism

    NASA Astrophysics Data System (ADS)

    Vanaverbeke, S.; Keppens, R.; Poedts, S.

    2014-03-01

    We present GRADSPMHD, a completely Lagrangian parallel magnetohydrodynamics code based on the SPH formalism. The implementation of the equations of SPMHD in the “GRAD-h” formalism assembles known results, including the derivation of the discretized MHD equations from a variational principle, the inclusion of time-dependent artificial viscosity, resistivity and conductivity terms, as well as the inclusion of a mixed hyperbolic/parabolic correction scheme for satisfying the ∇·B = 0 constraint on the magnetic field. The code uses a tree-based formalism for neighbor finding and can optionally use the tree code for computing the self-gravity of the plasma. The structure of the code closely follows the framework of our parallel GRADSPH FORTRAN 90 code which we added previously to the CPC program library. We demonstrate the capabilities of GRADSPMHD by running 1, 2, and 3 dimensional standard benchmark tests and we find good agreement with previous work done by other researchers. The code is also applied to the problem of simulating the magnetorotational instability in 2.5D shearing box tests as well as in global simulations of magnetized accretion disks. We find good agreement with available results on this subject in the literature. Finally, we discuss the performance of the code on a parallel supercomputer with distributed memory architecture. Catalogue identifier: AERP_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AERP_v1_0.html. Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland. Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html. No. of lines in distributed program, including test data, etc.: 620503. No. of bytes in distributed program, including test data, etc.: 19837671. Distribution format: tar.gz. Programming language: FORTRAN 90/MPI. Computer: HPC cluster. Operating system: Unix. Has the code been vectorized or parallelized?: Yes, parallelized using MPI. RAM: ~30 MB for a Sedov test including 15625 particles on a single CPU. Classification: 12. Nature of problem: Evolution of a plasma in the ideal MHD approximation. Solution method: The equations of magnetohydrodynamics are solved using the SPH method. Running time: The test provided takes approximately 20 min using 4 processors.

  16. Analysis of Parallel Algorithms on SMP Node and Cluster of Workstations Using Parallel Programming Models with New Tile-based Method for Large Biological Datasets.

    PubMed

    Shrimankar, D D; Sathe, S R

    2016-01-01

    Sequence alignment is an important tool for describing the relationships between DNA sequences. Many sequence alignment algorithms exist, differing in efficiency, in their models of the sequences, and in the relationship between sequences. The focus of this study is to obtain an optimal alignment between two sequences of biological data, particularly DNA sequences. The algorithm is discussed with particular emphasis on time, speedup, and efficiency optimizations. Parallel programming presents a number of critical challenges to application developers. Today's supercomputers often consist of clusters of SMP nodes. Programming paradigms such as OpenMP and MPI are used to write parallel codes for such architectures. However, OpenMP programs cannot be scaled beyond a single SMP node, whereas programs written in MPI can span multiple SMP nodes, at the cost of internode communication overhead. In this work, we explore the tradeoffs between using OpenMP and MPI. We demonstrate that communication overhead is significant even in OpenMP loop execution and increases with the number of participating cores. We also present a communication model to approximate the overhead from communication in OpenMP loops. Our results are striking and hold for a large variety of input data files. We have developed our own load balancing and cache optimization techniques for the message-passing model. Our experimental results show that these techniques give optimal performance of our parallel algorithm for various input parameters, such as sequence size and tile size, on a wide variety of multicore architectures.
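
    The OpenMP half of this tradeoff fits in a few lines: a tiled loop is spread across the cores of one SMP node, while scaling past the node would require MPI and the internode communication cost the authors model. A minimal sketch (illustrative C++; the tile sizes and scoring are stand-ins, not the authors' alignment kernel):

      // Tile-parallel scoring loop on one SMP node with OpenMP.
      #include <omp.h>
      #include <vector>
      #include <cstdio>

      int main() {
          const int tiles = 1024, tile_size = 4096;
          std::vector<double> score(tiles, 0.0);

          #pragma omp parallel for schedule(dynamic)
          for (int t = 0; t < tiles; ++t) {
              double s = 0.0;
              for (int i = 0; i < tile_size; ++i)
                  s += (t * tile_size + i) % 7;   // stand-in for tile scoring
              score[t] = s;
          }
          std::printf("threads used: %d, first tile score: %f\n",
                      omp_get_max_threads(), score[0]);
          return 0;
      }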

  17. Analysis of Parallel Algorithms on SMP Node and Cluster of Workstations Using Parallel Programming Models with New Tile-based Method for Large Biological Datasets

    PubMed Central

    Shrimankar, D. D.; Sathe, S. R.

    2016-01-01

    Sequence alignment is an important tool for describing the relationships between DNA sequences. Many sequence alignment algorithms exist, differing in efficiency, in their models of the sequences, and in the relationship between sequences. The focus of this study is to obtain an optimal alignment between two sequences of biological data, particularly DNA sequences. The algorithm is discussed with particular emphasis on time, speedup, and efficiency optimizations. Parallel programming presents a number of critical challenges to application developers. Today's supercomputers often consist of clusters of SMP nodes. Programming paradigms such as OpenMP and MPI are used to write parallel codes for such architectures. However, OpenMP programs cannot be scaled beyond a single SMP node, whereas programs written in MPI can span multiple SMP nodes, at the cost of internode communication overhead. In this work, we explore the tradeoffs between using OpenMP and MPI. We demonstrate that communication overhead is significant even in OpenMP loop execution and increases with the number of participating cores. We also present a communication model to approximate the overhead from communication in OpenMP loops. Our results are striking and hold for a large variety of input data files. We have developed our own load balancing and cache optimization techniques for the message-passing model. Our experimental results show that these techniques give optimal performance of our parallel algorithm for various input parameters, such as sequence size and tile size, on a wide variety of multicore architectures. PMID:27932868

  18. Parallel Logic Programming and Parallel Systems Software and Hardware

    DTIC Science & Technology

    1989-07-29

    Conference, Dallas TX, January 1985. [Rous75] Roussel, P., "PROLOG: Manuel de Reference et d'Utilisation", Groupe d'Intelligence Artificielle, Universite d... Tools were provided for software development using artificial intelligence techniques. AI software for massively parallel architectures was started. 1. Introduction: We describe research conducted...

  19. The force on the flex: Global parallelism and portability

    NASA Technical Reports Server (NTRS)

    Jordan, H. F.

    1986-01-01

    A parallel programming methodology, called the force, supports the construction of programs to be executed in parallel by an unspecified, but potentially large, number of processes. The methodology was originally developed on a pipelined, shared memory multiprocessor, the Denelcor HEP, and embodies the primitive operations of the force in a set of macros which expand into multiprocessor Fortran code. A small set of primitives is sufficient to write large parallel programs, and the system has been used to produce 10,000-line programs in computational fluid dynamics. The level of complexity of the force primitives is intermediate: high enough to mask detailed architectural differences between multiprocessors, but low enough to give the user control over performance. The system is being ported to a medium-scale multiprocessor, the Flex/32, which is a 20-processor system with a mixture of shared and local memory. Memory organization and the type of processor synchronization supported by the hardware on the two machines lead to some differences in efficient implementations of the force primitives, but the user interface remains the same. An initial implementation was done by retargeting the macros to Flexible Computer Corporation's ConCurrent C language. Subsequently, the macros were modified to directly produce the system calls which form the basis for ConCurrent C. The implementation of the Fortran-based system is in step with Flexible Computer Corporation's implementation of a Fortran system in the parallel environment.

  20. Cellular automata with object-oriented features for parallel molecular network modeling.

    PubMed

    Zhu, Hao; Wu, Yinghui; Huang, Sui; Sun, Yan; Dhar, Pawan

    2005-06-01

    Cellular automata are an important modeling paradigm for studying the dynamics of large, parallel systems composed of multiple, interacting components. However, to model biological systems, cellular automata need to be extended beyond the large-scale parallelism and intensive communication in order to capture two fundamental properties characteristic of complex biological systems: hierarchy and heterogeneity. This paper proposes extensions to a cellular automata language, Cellang, to meet this purpose. The extended language, with object-oriented features, can be used to describe the structure and activity of parallel molecular networks within cells. Capabilities of this new programming language include object structure to define molecular programs within a cell, floating-point data type and mathematical functions to perform quantitative computation, message passing capability to describe molecular interactions, as well as new operators, statements, and built-in functions. We discuss relevant programming issues of these features, including the object-oriented description of molecular interactions with molecule encapsulation, message passing, and the description of heterogeneity and anisotropy at the cell and molecule levels. By enabling the integration of modeling at the molecular level with system behavior at cell, tissue, organ, or even organism levels, the program will help improve our understanding of how complex and dynamic biological activities are generated and controlled by parallel functioning of molecular networks. Index Terms: cellular automata, modeling, molecular network, object-oriented.

  1. Efficient Thread Labeling for Monitoring Programs with Nested Parallelism

    NASA Astrophysics Data System (ADS)

    Ha, Ok-Kyoon; Kim, Sun-Sook; Jun, Yong-Kee

    It is difficult and cumbersome to detect data races that occur in an execution of parallel programs. Any on-the-fly race detection technique using Lamport's happened-before relation needs a thread labeling scheme for generating unique identifiers which maintain logical concurrency information for the parallel threads. NR labeling is an efficient thread labeling scheme for the fork-join program model with nested parallelism, because its efficiency depends only on the nesting depth for every fork and join operation. This paper presents an improved NR labeling, called e-NR labeling, in which every thread generates its label by inheriting the pointer to its ancestor list from the parent thread or by updating the pointer in a constant amount of time and space. This labeling is more efficient than NR labeling, because its efficiency does not depend on the nesting depth for every fork and join operation. Some experiments were performed with OpenMP programs having nesting depths of three or four and maximum parallelism varying from 10,000 to 1,000,000. The results show that e-NR is 5 times faster than NR labeling and 4.3 times faster than OS labeling in the average time for creating and maintaining the thread labels. In the average space required for labeling, it is 3.5 times smaller than NR labeling and 3 times smaller than OS labeling.

  2. User-Defined Data Distributions in High-Level Programming Languages

    NASA Technical Reports Server (NTRS)

    Diaconescu, Roxana E.; Zima, Hans P.

    2006-01-01

    One of the characteristic features of today's high-performance computing systems is a physically distributed memory. Efficient management of locality is essential for meeting key performance requirements for these architectures. The standard technique for dealing with this issue has involved the extension of traditional sequential programming languages with explicit message passing, in the context of a processor-centric view of parallel computation. This has resulted in complex and error-prone assembly-style codes in which algorithms and communication are inextricably interwoven. This paper presents a high-level approach to the design and implementation of data distributions. Our work is motivated by the need to improve the current parallel programming methodology by introducing a paradigm supporting the development of efficient and reusable parallel code. This approach is currently being implemented in the context of a new programming language called Chapel, which is designed in the HPCS project Cascade.

  3. Block-Parallel Data Analysis with DIY2

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morozov, Dmitriy; Peterka, Tom

    DIY2 is a programming model and runtime for block-parallel analytics on distributed-memory machines. Its main abstraction is block-structured data parallelism: data are decomposed into blocks; blocks are assigned to processing elements (processes or threads); computation is described as iterations over these blocks, and communication between blocks is defined by reusable patterns. By expressing computation in this general form, the DIY2 runtime is free to optimize the movement of blocks between slow and fast memories (disk and flash vs. DRAM) and to concurrently execute blocks residing in memory with multiple threads. This enables the same program to execute in-core, out-of-core, serial, parallel, single-threaded, multithreaded, or combinations thereof. This paper describes the implementation of the main features of the DIY2 programming model and optimizations to improve performance. DIY2 is evaluated on benchmark test cases to establish baseline performance for several common patterns and on larger complete analysis codes running on large-scale HPC machines.
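
    The abstraction is easy to picture: data are decomposed into blocks, blocks are assigned to processes, and the computation is written as an iteration over local blocks. The sketch below is a generic rendering of this pattern in C++/MPI, not the actual DIY2 API; a real runtime could stage blocks between memory and disk inside the foreach.

      // Generic block-structured data parallelism (illustrative, not DIY2).
      #include <mpi.h>
      #include <vector>
      #include <functional>

      struct Block { int gid; std::vector<double> data; };  // one decomposition unit

      void foreach_block(std::vector<Block>& blocks,
                         const std::function<void(Block&)>& f) {
          for (Block& b : blocks) f(b);   // a runtime could stage blocks
      }                                   // in and out of core here

      int main(int argc, char** argv) {
          MPI_Init(&argc, &argv);
          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);
          const int total_blocks = 64;    // round-robin assignment to ranks
          std::vector<Block> mine;
          for (int gid = rank; gid < total_blocks; gid += size)
              mine.push_back(Block{gid, std::vector<double>(1024, 0.0)});
          foreach_block(mine, [](Block& b) {
              for (double& x : b.data) x += b.gid;   // per-block computation
          });
          MPI_Finalize();
          return 0;
      }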

  4. Parallel Rendering of Large Time-Varying Volume Data

    NASA Technical Reports Server (NTRS)

    Garbutt, Alexander E.

    2005-01-01

    Interactive visualization of large time-varying 3D volume datasets has been and still is a great challenge to the modern computational world. It stretches the limits of the memory capacity, the disk space, the network bandwidth, and the CPU speed of a conventional computer. In this SURF project, we propose to develop a parallel volume rendering program on SGI's Prism, a cluster computer equipped with state-of-the-art graphics hardware. The proposed program combines both parallel computing and hardware rendering in order to achieve an interactive rendering rate. We use 3D texture mapping and a hardware shader to implement 3D volume rendering on each workstation. We use SGI's VisServer to enable remote rendering using Prism's graphics hardware. And last, we will integrate this new program with ParVox, a parallel distributed visualization system developed at JPL. At the end of the project, we will demonstrate remote interactive visualization using this new hardware volume renderer on JPL's Prism system using a time-varying dataset from selected JPL applications.

  5. Solving Partial Differential Equations in a data-driven multiprocessor environment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gaudiot, J.L.; Lin, C.M.; Hosseiniyar, M.

    1988-12-31

    Partial differential equations can be found in a host of engineering and scientific problems. The emergence of new parallel architectures has spurred research in the definition of parallel PDE solvers. Concurrently, highly programmable systems such as data-flow architectures have been proposed for the exploitation of large-scale parallelism. The implementation of some partial differential equation solvers (such as the Jacobi method) on a tagged-token data-flow graph is demonstrated here. Asynchronous methods (chaotic relaxation) are studied, and new scheduling approaches (the Token No-Labeling scheme) are introduced in order to support the implementation of the asynchronous methods in a data-driven environment. New high-level data-flow language program constructs are introduced in order to handle chaotic operations. Finally, the performance of the program graphs is demonstrated by a deterministic simulation of a message-passing data-flow multiprocessor. An analysis of the overhead in the data-flow graphs is undertaken to demonstrate the limits of parallel operations in data-flow PDE program graphs.
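
    The Jacobi relaxation mentioned above is the canonical example of a sweep whose updates are mutually independent, which is what makes it attractive for data-driven execution: every new value depends only on values from the previous sweep. A minimal serial sketch for a 1-D Laplace problem (illustrative C++):

      // Jacobi relaxation for u'' = 0 on a 1-D grid with fixed boundaries.
      #include <algorithm>
      #include <cmath>
      #include <vector>

      std::vector<double> jacobi(int n, double left, double right, double tol) {
          std::vector<double> u(n, 0.0), next(n, 0.0);
          u.front() = next.front() = left;
          u.back()  = next.back()  = right;
          double delta = tol + 1.0;
          while (delta > tol) {
              delta = 0.0;
              // All updates in this sweep read only the previous sweep's
              // values, so they are independent and parallelizable.
              for (int i = 1; i < n - 1; ++i) {
                  next[i] = 0.5 * (u[i - 1] + u[i + 1]);
                  delta = std::max(delta, std::fabs(next[i] - u[i]));
              }
              u.swap(next);
          }
          return u;
      }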

  6. cljam: a library for handling DNA sequence alignment/map (SAM) with parallel processing.

    PubMed

    Takeuchi, Toshiki; Yamada, Atsuo; Aoki, Takashi; Nishimura, Kunihiro

    2016-01-01

    Next-generation sequencing can determine DNA bases, and the results of sequence alignments are generally stored in files in the Sequence Alignment/Map (SAM) format or its compressed binary version (BAM). SAMtools is a typical tool for dealing with files in the SAM/BAM format. SAMtools has various functions, including detection of variants, visualization of alignments, indexing, extraction of parts of the data and loci, and conversion of file formats. It is written in C and executes quickly. However, SAMtools requires an additional implementation to be used in parallel with, for example, OpenMP (Open Multi-Processing) libraries. Given the accumulation of next-generation sequencing data, a simple parallelization program that can support cloud and PC cluster environments is needed. We have developed cljam using the Clojure programming language, which simplifies parallel programming, to handle SAM/BAM data. Cljam can run in a Java runtime environment (e.g., Windows, Linux, Mac OS X) with Clojure. Cljam can process and analyze SAM/BAM files in parallel and at high speed. The execution time with cljam is almost the same as with SAMtools. The cljam code is written in Clojure and has fewer lines than other similar tools.

  7. Visual analysis of inter-process communication for large-scale parallel computing.

    PubMed

    Muelder, Chris; Gygi, Francois; Ma, Kwan-Liu

    2009-01-01

    In serial computation, program profiling is often helpful for optimization of key sections of code. When moving to parallel computation, not only does the code execution need to be considered but also the communication between the different processes, which can induce delays that are detrimental to performance. As the number of processes increases, so does the impact of the communication delays on performance. For large-scale parallel applications, it is critical to understand how the communication impacts performance in order to make the code more efficient. There are several tools available for visualizing program execution and communications on parallel systems. These tools generally provide either views which statistically summarize the entire program execution or process-centric views. However, process-centric visualizations do not scale well as the number of processes gets very large. In particular, the most common representation of parallel processes is a Gantt chart with a row for each process. As the number of processes increases, these charts can become difficult to work with and can even exceed screen resolution. We propose a new visualization approach that affords more scalability and then demonstrate it on systems running with up to 16,384 processes.

  8. Parallel machine architecture and compiler design facilities

    NASA Technical Reports Server (NTRS)

    Kuck, David J.; Yew, Pen-Chung; Padua, David; Sameh, Ahmed; Veidenbaum, Alex

    1990-01-01

    The objective is to provide an integrated simulation environment for studying and evaluating various issues in designing parallel systems, including machine architectures, parallelizing compiler techniques, and parallel algorithms. The status of the Delta project (whose objective is to provide a facility that allows rapid prototyping of parallelizing compilers that can target different machine architectures) is summarized. Included are surveys of the program manipulation tools developed, the environmental software supporting Delta, and the compiler research projects in which Delta has played a role.

  9. The OpenMP Implementation of NAS Parallel Benchmarks and its Performance

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Frumkin, Michael; Yan, Jerry

    1999-01-01

    As the new ccNUMA architecture became popular in recent years, parallel programming with compiler directives on these machines has evolved to accommodate new needs. In this study, we examine the effectiveness of OpenMP directives for parallelizing the NAS Parallel Benchmarks. Implementation details will be discussed and performance will be compared with the MPI implementation. We have demonstrated that OpenMP can achieve very good results for parallelization on a shared memory system, but effective use of memory and cache is very important.

  10. DOE SBIR Phase-1 Report on Hybrid CPU-GPU Parallel Development of the Eulerian-Lagrangian Barracuda Multiphase Program

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dr. Dale M. Snider

    2011-02-28

    This report gives the results of the Phase-1 work on demonstrating greater than 10x speedup of the Barracuda computer program using parallel methods and GPU (graphics processing unit) processors. Phase-1 demonstrated a 12x speedup on a typical Barracuda function using the GPU. The test problem used about 5 million particles and 250,000 Eulerian grid cells. The relative speedup, compared to a single CPU, increases with the number of particles, exceeding 12x for larger problems. Phase-1 work provided a path for data-structure modifications that give good parallel performance while keeping a friendly environment for new physics development and code maintenance. The implementation of the data-structure changes will be in Phase-2. Phase-1 laid the groundwork for the complete parallelization of Barracuda in Phase-2, with the added benefit that the parallel-programming practices implemented in Phase-1 give immediate speedup in the current serial Barracuda code. The Phase-1 tasks were completed successfully, laying the framework for Phase-2; the detailed results of Phase-1 are within this document. In general, the speedup of one function would be expected to be higher than the speedup of the entire code because of I/O functions and communication between the algorithms. However, because one of the most difficult Barracuda algorithms was parallelized in Phase-1, and because the advanced parallelization methods and optimization techniques identified in Phase-1 will be used in Phase-2, an overall Barracuda code speedup (relative to a single CPU) of greater than 10x is expected. This means that a job which takes 30 days to complete will be done in 3 days. The tasks completed in Phase-1 are: Task 1, profile the entire Barracuda code and select which subroutines are to be parallelized (see Section Choosing a Function to Accelerate); Task 2, select a GPU consultant company and jointly parallelize subroutines (CPFD chose the small business EMPhotonics as the Phase-1 technical partner; see Section Technical Objective and Approach); Task 3, integrate parallel subroutines into Barracuda (see Section Results from Phase-1 and its subsections); Task 4, test, refine, and optimize the parallel methodology (see Section Results from Phase-1 and Section Result Comparison Program); Task 5, integrate Phase-1 parallel subroutines into Barracuda and release (see Section Results from Phase-1 and its subsections); Task 6, roadmap for Phase-2 (see Section Plan for Phase-2). With the completion of Phase-1 we have the base understanding needed to completely parallelize Barracuda. An overview of the work to move Barracuda to a parallelized code is given in Plan for Phase-2.

  11. Monitoring Data-Structure Evolution in Distributed Message-Passing Programs

    NASA Technical Reports Server (NTRS)

    Sarukkai, Sekhar R.; Beers, Andrew; Woodrow, Thomas S. (Technical Monitor)

    1996-01-01

    Monitoring the evolution of data structures in parallel and distributed programs is critical for debugging their semantics and performance. However, the current state of the art in tracking and presenting data-structure information in parallel and distributed environments is cumbersome and does not scale. In this paper we present a methodology that automatically tracks memory bindings (not the actual contents) of static and dynamic data structures of message-passing C programs using PVM. With the help of a number of examples, we show that in addition to determining the impact of memory-allocation overheads on program performance, graphical views can help in debugging the semantics of program execution. Scalable animations of virtual-address bindings of source-level data structures are used for debugging the semantics of parallel programs across all processors. In conjunction with lightweight core files, this technique can complement traditional debuggers on single processors. Detailed information (such as data-structure contents) on specific nodes can be determined using traditional debuggers after the data-structure evolution leading to the semantic error is observed graphically.

  12. Command/response protocols and concurrent software

    NASA Technical Reports Server (NTRS)

    Bynum, W. L.

    1987-01-01

    A version of the program to control the parallel-jaw gripper is documented. The parallel-jaw end-effector hardware and the Intel 8031 processor used to control the end-effector are briefly described. A general overview of the controller program is given, and a complete description of the program's structure and design is provided. There are three appendices: a memory map of the on-chip RAM, a cross-reference listing of the self-scheduling routines, and a summary of the top-level and monitor commands.

  13. Computer programs for adjusting the mechanical properties of 2-inch dimension lumber for changes in moisture content

    Treesearch

    James W. Evans; Jane K. Evans; David W. Green

    1990-01-01

    This paper presents computer programs for adjusting the mechanical properties of 2-in. dimension lumber for changes in moisture content. The mechanical properties adjusted are modulus of rupture, ultimate tensile stress parallel to the grain, ultimate compressive stress parallel to the grain, and flexural modulus of elasticity. The models are valid for moisture contents...

  14. NWChem: A comprehensive and scalable open-source solution for large scale molecular simulations

    NASA Astrophysics Data System (ADS)

    Valiev, M.; Bylaska, E. J.; Govind, N.; Kowalski, K.; Straatsma, T. P.; Van Dam, H. J. J.; Wang, D.; Nieplocha, J.; Apra, E.; Windus, T. L.; de Jong, W. A.

    2010-09-01

    The latest release of NWChem delivers an open-source computational chemistry package with extensive capabilities for large-scale simulations of chemical and biological systems. Utilizing a common computational framework, diverse theoretical descriptions can be used to provide the best solution for a given scientific problem. Scalable parallel implementations and modular software design enable efficient utilization of current computational architectures. This paper provides an overview of NWChem, focusing primarily on the core theoretical modules provided by the code and their parallel performance.

    Program summary
    Program title: NWChem
    Catalogue identifier: AEGI_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEGI_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: Open Source Educational Community License
    No. of lines in distributed program, including test data, etc.: 11 709 543
    No. of bytes in distributed program, including test data, etc.: 680 696 106
    Distribution format: tar.gz
    Programming language: Fortran 77, C
    Computer: all Linux-based workstations and parallel supercomputers, Windows and Apple machines
    Operating system: Linux, OS X, Windows
    Has the code been vectorized or parallelized?: Code is parallelized
    Classification: 2.1, 2.2, 3, 7.3, 7.7, 16.1, 16.2, 16.3, 16.10, 16.13
    Nature of problem: Large-scale atomistic simulations of chemical and biological systems require efficient and reliable methods for ground and excited solutions of the many-electron Hamiltonian, analysis of the potential energy surface, and dynamics.
    Solution method: Ground and excited solutions of the many-electron Hamiltonian are obtained utilizing density-functional theory, the many-body perturbation approach, and coupled-cluster expansion. These solutions, or a combination thereof with classical descriptions, are then used to analyze the potential energy surface and perform dynamical simulations.
    Additional comments: Full documentation is provided in the distribution file, including an INSTALL file giving details of how to build the package. A set of test runs is provided in the examples directory. The distribution file for this program is over 90 Mbytes and therefore is not delivered directly when download or e-mail is requested; instead, an HTML file giving details of how the program can be obtained is sent.
    Running time: Running time depends on the size of the chemical system, the complexity of the method, the number of CPUs, and the computational task. It ranges from several seconds for serial DFT energy calculations on a few atoms to several hours for parallel coupled-cluster energy calculations on tens of atoms or ab initio molecular dynamics simulations on hundreds of atoms.

  15. The Tempest: Difficult to Control Asthma in Adolescence.

    PubMed

    Burg, Gregory T; Covar, Ronina; Oland, Alyssa A; Guilbert, Theresa W

    Severe asthma is associated with significant morbidity and is a highly heterogeneous disorder. Severe asthma in adolescence has some unique elements compared with the features of severe asthma a medical provider would see in younger children or adults. A specific focus on psychological issues and adherence highlights some of the challenges in the management of asthma in adolescents. Treatment of adolescents with severe asthma now includes 3 approved biologic phenotype-directed therapies. Therapies available to adults may be beneficial to adolescents with severe asthma. Research into predictors of specific treatment response by phenotypes is ongoing. Optimal treatment strategies are not yet defined and warrant further investigation. Copyright © 2018 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.

  16. Science Planning for the TROPIX Mission

    NASA Technical Reports Server (NTRS)

    Russell, C. T.

    1998-01-01

    The objective of the study grant was to undertake the planning needed to execute meaningful solar electric propulsion missions in the magnetosphere and beyond. The first mission examined was the Transfer Orbit Plasma Investigation Experiment (TROPIX), which would spiral outward through the magnetosphere. The next mission examined was to the moon and an asteroid; entitled Diana, it was proposed to NASA in October 1994. Two similar missions were conceived in 1996, entitled CNR (Comet Nucleus Rendezvous) and MBAR (Main Belt Asteroid Rendezvous); the latter was proposed again in 1998. All four of these missions were unsuccessfully proposed to the NASA Discovery program. Nevertheless, we were partially successful in that the Deep Space 1 (DS1) mission was eventually carried out, nearly duplicating our CNR mission. Returning to the magnetosphere, we studied and proposed to the Medium-class Explorer (MIDEX) program a mission called TEMPEST in 1995. This mission included two solar electric spacecraft that would spiral outward in the magnetosphere: one at near 90° inclination and one in the equatorial plane. This mission was not selected for flight. Next we proposed a single SEP vehicle to carry Energetic Neutral Atom (ENA) imagers and in situ observations to complement the IMAGE mission, providing the data needed to properly interpret the IMAGE observations. This mission, called SESAME, was submitted unsuccessfully in 1997. One proposal was successful: a study grant was awarded to examine a four-spacecraft solar electric mission named Global Magnetospheric Dynamics. This study was completed and a report on the mission is attached, but events overtook this design, and a separate study team was selected to design a classical chemical mission as a Solar Terrestrial Probe. Competing proposals, such as through the MIDEX opportunity, were expressly forbidden. A bibliography is attached.

  17. Petascale Simulation Initiative Tech Base: FY2007 Final Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    May, J; Chen, R; Jefferson, D

    The Petascale Simulation Initiative began as an LDRD project in the middle of Fiscal Year 2004. The goal of the project was to develop techniques to allow large-scale scientific simulation applications to better exploit the massive parallelism that will come with computers running at petaflops per second. One of the major products of this work was the design and prototype implementation of a programming model and a runtime system that lets applications extend data-parallel applications to use task parallelism. By adopting task parallelism, applications can use processing resources more flexibly, exploit multiple forms of parallelism, and support more sophisticated multiscalemore » and multiphysics models. Our programming model was originally called the Symponents Architecture but is now known as Cooperative Parallelism, and the runtime software that supports it is called Coop. (However, we sometimes refer to the programming model as Coop for brevity.) We have documented the programming model and runtime system in a submitted conference paper [1]. This report focuses on the specific accomplishments of the Cooperative Parallelism project (as we now call it) under Tech Base funding in FY2007. Development and implementation of the model under LDRD funding alone proceeded to the point of demonstrating a large-scale materials modeling application using Coop on more than 1300 processors by the end of FY2006. Beginning in FY2007, the project received funding from both LDRD and the Computation Directorate Tech Base program. Later in the year, after the three-year term of the LDRD funding ended, the ASC program supported the project with additional funds. The goal of the Tech Base effort was to bring Coop from a prototype to a production-ready system that a variety of LLNL users could work with. Specifically, the major tasks that we planned for the project were: (1) Port SARS [former name of the Coop runtime system] to another LLNL platform, probably Thunder or Peloton (depending on when Peloton becomes available); (2) Improve SARS's robustness and ease-of-use, and develop user documentation; and (3) Work with LLNL code teams to help them determine how Symponents could benefit their applications. The original funding request was $296,000 for the year, and we eventually received $252,000. The remainder of this report describes our efforts and accomplishments for each of the goals listed above.« less

  18. Automated Instrumentation, Monitoring and Visualization of PVM Programs Using AIMS

    NASA Technical Reports Server (NTRS)

    Mehra, Pankaj; VanVoorst, Brian; Yan, Jerry; Lum, Henry, Jr. (Technical Monitor)

    1994-01-01

    We present views and analysis of the execution of several PVM (Parallel Virtual Machine) codes for Computational Fluid Dynamics on a network of Sparcstations, including: (1) NAS Parallel Benchmarks CG and MG; (2) a multi-partitioning algorithm for NAS Parallel Benchmark SP; and (3) an overset grid flowsolver. These views and analysis were obtained using our Automated Instrumentation and Monitoring System (AIMS) version 3.0, a toolkit for debugging the performance of PVM programs. We describe the architecture, operation, and application of AIMS. The AIMS toolkit contains: (1) Xinstrument, which can automatically instrument various computational and communication constructs in message-passing parallel programs; (2) Monitor, a library of runtime trace-collection routines; (3) VK (Visual Kernel), an execution-animation tool with source-code clickback; and (4) Tally, a tool for statistical analysis of execution profiles. Currently, Xinstrument can handle C and Fortran 77 programs using PVM 3.2.x; Monitor has been implemented and tested on Sun 4 systems running SunOS 4.1.2; and VK uses X11R5 and Motif 1.2. Data and views obtained using AIMS clearly illustrate several characteristic features of executing parallel programs on networked workstations: (1) the impact of long message latencies; (2) the impact of multiprogramming overheads and associated load imbalance; (3) cache and virtual-memory effects; and (4) significant skews between workstation clocks. Interestingly, AIMS can compensate for constant skew (zero drift) by calibrating the skew between a parent and its spawned children. In addition, AIMS' skew-compensation algorithm can adjust timestamps in a way that eliminates physically impossible communications (e.g., messages going backwards in time). Our current efforts are directed toward creating new views to explain the observed performance of PVM programs. Features planned for the near future include: (1) ConfigView, showing the physical topology of the virtual machine, inferred using specially formatted IP (Internet Protocol) packets; and (2) LoadView, synchronous animation of PVM-program execution and resource-utilization patterns.

  19. A parallel row-based algorithm with error control for standard-cell replacement on a hypercube multiprocessor

    NASA Technical Reports Server (NTRS)

    Sargent, Jeff Scott

    1988-01-01

    A new row-based parallel algorithm for standard-cell placement targeted for execution on a hypercube multiprocessor is presented. Key features of this implementation include a dynamic simulated-annealing schedule, row-partitioning of the VLSI chip image, and two novel approaches to controlling error in parallel cell-placement algorithms: Heuristic Cell-Coloring and Adaptive (Parallel Move) Sequence Control. Heuristic Cell-Coloring identifies sets of noninteracting cells that can be moved repeatedly, and in parallel, with no buildup of error in the placement cost. Adaptive Sequence Control allows multiple parallel cell moves to take place between global cell-position updates. This feedback mechanism is based on an error bound derived analytically from the traditional annealing move-acceptance profile. Placement results are presented for real industry circuits, and the performance of an implementation on the Intel iPSC/2 Hypercube is summarized. The runtime of this algorithm is 5 to 16 times faster than a previous program developed for the Hypercube, while producing placements of equivalent quality. An integrated place-and-route program for the Intel iPSC/2 Hypercube is currently being developed.

  1. Work stealing for GPU-accelerated parallel programs in a global address space framework

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Arafat, Humayun; Dinan, James; Krishnamoorthy, Sriram

    Task parallelism is an attractive approach to automatically load balance the computation in a parallel system and adapt to dynamism exhibited by parallel systems. Exploiting task parallelism through work stealing has been extensively studied in shared and distributed-memory contexts. In this paper, we study the design of a system that uses work stealing for dynamic load balancing of task-parallel programs executed on hybrid distributed-memory CPU-graphics processing unit (GPU) systems in a global-address space framework. We take into account the unique nature of the accelerator model employed by GPUs, the significant performance difference between GPU and CPU execution as a function of problem size, and the distinct CPU and GPU memory domains. We consider various alternatives in designing a distributed work stealing algorithm for CPU-GPU systems, while taking into account the impact of task distribution and data movement overheads. These strategies are evaluated using microbenchmarks that capture various execution configurations as well as the state-of-the-art CCSD(T) application module from the computational chemistry domain.
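
    A minimal sketch of the work-stealing idea the paper builds on, using POSIX threads in C: each thread pops tasks from its own deque and, when idle, steals from a victim's opposite end. Mutex-protected deques are used for simplicity (production runtimes use lock-free deques), and the sketch deliberately omits the GPU data-movement and task-distribution costs that this paper's design weighs; all names and the placeholder workload are illustrative.

        #include <pthread.h>
        #include <stdio.h>

        #define NTHREADS 4
        #define NTASKS   64
        #define QCAP     256

        /* One mutex-protected deque per thread; owners pop at the tail,
           thieves steal at the head. */
        typedef struct {
            int items[QCAP];
            int head, tail;
            pthread_mutex_t lock;
        } deque_t;

        static deque_t dq[NTHREADS];
        static int completed = 0;
        static pthread_mutex_t done_lock = PTHREAD_MUTEX_INITIALIZER;

        static int pop_own(deque_t *q, int *task) {   /* owner's end */
            pthread_mutex_lock(&q->lock);
            int ok = q->tail > q->head;
            if (ok) *task = q->items[--q->tail];
            pthread_mutex_unlock(&q->lock);
            return ok;
        }

        static int steal(deque_t *q, int *task) {     /* thief's end */
            pthread_mutex_lock(&q->lock);
            int ok = q->tail > q->head;
            if (ok) *task = q->items[q->head++];
            pthread_mutex_unlock(&q->lock);
            return ok;
        }

        /* Placeholder workload; a real system would weigh CPU vs. GPU
           execution and data-movement cost here. */
        static void run_task(int t) {
            volatile double x = 0.0;
            for (int i = 0; i < 100000 * (t % 5 + 1); i++) x += i;
        }

        static void *worker(void *arg) {
            int id = (int)(long)arg, task;
            for (;;) {
                pthread_mutex_lock(&done_lock);
                int done = completed >= NTASKS;
                pthread_mutex_unlock(&done_lock);
                if (done) return NULL;
                int got = pop_own(&dq[id], &task);
                for (int v = 1; !got && v < NTHREADS; v++)   /* idle: try victims */
                    got = steal(&dq[(id + v) % NTHREADS], &task);
                if (got) {
                    run_task(task);
                    pthread_mutex_lock(&done_lock);
                    completed++;
                    pthread_mutex_unlock(&done_lock);
                }
            }
        }

        int main(void) {
            pthread_t th[NTHREADS];
            for (int i = 0; i < NTHREADS; i++)
                pthread_mutex_init(&dq[i].lock, NULL);
            for (int t = 0; t < NTASKS; t++)      /* seed all work on one deque */
                dq[0].items[dq[0].tail++] = t;
            for (long i = 0; i < NTHREADS; i++)
                pthread_create(&th[i], NULL, worker, (void *)i);
            for (int i = 0; i < NTHREADS; i++)
                pthread_join(th[i], NULL);
            printf("completed %d tasks with %d threads\n", completed, NTHREADS);
            return 0;
        }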

  2. Parallel and serial computing tools for testing single-locus and epistatic SNP effects of quantitative traits in genome-wide association studies

    PubMed Central

    Ma, Li; Runesha, H Birali; Dvorkin, Daniel; Garbe, John R; Da, Yang

    2008-01-01

    Background: Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) markers provide opportunities to detect epistatic SNPs associated with quantitative traits and to detect the exact mode of an epistasis effect. Computational difficulty is the main bottleneck for epistasis testing in large-scale GWAS. Results: The EPISNPmpi and EPISNP computer programs were developed for testing single-locus and epistatic SNP effects on quantitative traits in GWAS, including tests of three single-locus effects for each SNP (SNP genotypic effect, additive and dominance effects) and five epistasis effects for each pair of SNPs (two-locus interaction, additive × additive, additive × dominance, dominance × additive, and dominance × dominance) based on the extended Kempthorne model. EPISNPmpi is the parallel computing program for epistasis testing in large-scale GWAS; it achieved excellent scalability for large-scale analysis and portability across various parallel computing platforms. EPISNP is the serial computing program, based on the EPISNPmpi code, for epistasis testing in small-scale GWAS using commonly available operating systems and computer hardware. Three serial utility programs were developed for graphical viewing of test results and epistasis networks, and for estimating CPU time and disk space requirements. Conclusion: The EPISNPmpi parallel computing program provides an effective computing tool for epistasis testing in large-scale GWAS, and the EPISNP serial computing program is a convenient tool for epistasis analysis in small-scale GWAS using commonly available computer hardware. PMID:18644146
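
    The pairwise tests are independent, which is what makes this computation amenable to MPI parallelization. The sketch below distributes hypothetical SNP-pair tests across ranks by interleaving and combines the counts with a reduction; test_pair is a placeholder, not the extended Kempthorne test the paper implements.

        #include <mpi.h>
        #include <stdio.h>

        #define NSNP 1000   /* hypothetical SNP count */

        /* Hypothetical stand-in for one pairwise epistasis test; returns 1
           if the pair passes a significance screen. */
        static int test_pair(int i, int j) { return ((i * 31 + j) % 97) == 0; }

        int main(int argc, char **argv)
        {
            int rank, size;
            long local_hits = 0, hits = 0, pair = 0;

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            /* Interleave the N*(N-1)/2 pairs across ranks; every test is
               independent, which is why the problem scales well. */
            for (int i = 0; i < NSNP; i++)
                for (int j = i + 1; j < NSNP; j++, pair++)
                    if (pair % size == rank)
                        local_hits += test_pair(i, j);

            MPI_Reduce(&local_hits, &hits, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
            if (rank == 0) printf("%ld candidate pairs\n", hits);
            MPI_Finalize();
            return 0;
        }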

  3. The language parallel Pascal and other aspects of the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Reeves, A. P.; Bruner, J. D.

    1982-01-01

    A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.

  4. Automatic recognition of vector and parallel operations in a higher level language

    NASA Technical Reports Server (NTRS)

    Schneck, P. B.

    1971-01-01

    A compiler is described that recognizes statements of a FORTRAN program suited for fast execution on a parallel or pipeline machine such as the Illiac-4, Star, or ASC. The technique employs interval analysis to provide flow information to the vector/parallel recognizer. Where profitable, the compiler changes scalar variables to subscripted variables. The output of the compiler is an extension of FORTRAN that shows parallel and vector operations explicitly.

  5. Understanding and Improving High-Performance I/O Subsystems

    NASA Technical Reports Server (NTRS)

    El-Ghazawi, Tarek A.; Frieder, Gideon; Clark, A. James

    1996-01-01

    This research program was conducted in the framework of the NASA Earth and Space Science (ESS) evaluations led by Dr. Thomas Sterling. In addition to many research findings important to NASA and the resulting publications, the program helped orient the doctoral research of two students toward parallel input/output in high-performance computing. Further, the experimental results in the case of the MasPar were very useful to MasPar, with whose technical management the P.I. has had many interactions. The contributions of this program are drawn from three experimental studies conducted on different high-performance computing testbeds/platforms, and are therefore presented in three segments as follows: 1. Evaluating the parallel input/output subsystems of NASA high-performance computing testbeds, namely the MasPar MP-1 and MP-2; 2. Characterizing the physical input/output request patterns of NASA ESS applications, using the Beowulf platform; and 3. Dynamic scheduling techniques for hiding I/O latency in parallel applications such as sparse matrix computations; this third study was conducted on the Intel Paragon and also provided an experimental evaluation of the Parallel File System (PFS) and parallel input/output on the Paragon. This report is organized as follows: the summary of findings discusses the results of each of the aforementioned studies, and three appendices, each containing a key scholarly research paper, detail the work of the individual studies.

  6. Methodologies and Tools for Tuning Parallel Programs: 80% Art, 20% Science, and 10% Luck

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Bailey, David (Technical Monitor)

    1996-01-01

    The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessors. However, without effective means to monitor and analyze program execution, tuning the performance of parallel programs becomes exponentially difficult as program complexity and machine size increase. In the past few years, the ubiquitous introduction of performance tuning tools from various supercomputer vendors (Intel's ParAide, TMC's PRISM, CRI's Apprentice, and Convex's CXtrace) seems to indicate the maturity of performance instrumentation/monitoring/tuning technologies and the vendors' and customers' recognition of their importance. However, a few important questions remain: What kinds of performance bottlenecks can these tools detect (or correct)? How time consuming is the performance tuning process? What important technical issues remain to be tackled in this area? This workshop reviews the fundamental concepts involved in analyzing and improving the performance of parallel and heterogeneous message-passing programs. Several alternative strategies are contrasted, and for each we describe how currently available tuning tools (e.g., AIMS, ParAide, PRISM, Apprentice, CXtrace, ATExpert, Pablo, IPS-2) can be used to facilitate the process. We characterize the effectiveness of the tools and methodologies based on actual user experiences at NASA Ames Research Center. Finally, we discuss their limitations and outline recent approaches taken by vendors and the research community to address them.

  7. Parallel Computing for Probabilistic Response Analysis of High Temperature Composites

    NASA Technical Reports Server (NTRS)

    Sues, R. H.; Lua, Y. J.; Smith, M. D.

    1994-01-01

    The objective of this Phase I research was to establish the required software and hardware strategies to achieve large scale parallelism in solving PCM problems. To meet this objective, several investigations were conducted. First, we identified the multiple levels of parallelism in PCM and the computational strategies to exploit these parallelisms. Next, several software and hardware efficiency investigations were conducted. These involved the use of three different parallel programming paradigms and solution of two example problems on both a shared-memory multiprocessor and a distributed-memory network of workstations.

  8. Real-Time MENTAT programming language and architecture

    NASA Technical Reports Server (NTRS)

    Grimshaw, Andrew S.; Silberman, Ami; Liu, Jane W. S.

    1989-01-01

    Real-time MENTAT, a programming environment designed to simplify the task of programming real-time applications in distributed and parallel environments, is described. It is based on the same data-driven computation model and object-oriented programming paradigm as MENTAT. It provides an easy-to-use mechanism to exploit parallelism, language constructs for the expression and enforcement of timing constraints, and run-time support for scheduling and executing real-time programs. The real-time MENTAT programming language is an extended C++. The extensions are added to facilitate automatic detection of data flow and generation of data flow graphs, to express the timing constraints of individual granules of computation, and to provide scheduling directives for the runtime system. A high-level view of the real-time MENTAT system architecture and programming language constructs is provided.

  9. Computational strategies for three-dimensional flow simulations on distributed computer systems. Ph.D. Thesis Semiannual Status Report, 15 Aug. 1993 - 15 Feb. 1994

    NASA Technical Reports Server (NTRS)

    Weed, Richard Allen; Sankar, L. N.

    1994-01-01

    An increasing amount of research activity in computational fluid dynamics has been devoted to the development of efficient algorithms for parallel computing systems. The increasing performance-to-price ratio of engineering workstations has led to research into procedures for implementing a parallel computing system composed of distributed workstations. This thesis proposal outlines an ongoing research program to develop efficient strategies for performing three-dimensional flow analysis on distributed computing systems. The PVM parallel programming interface was used to modify an existing three-dimensional flow solver, the TEAM code developed by Lockheed for the Air Force, to function as a parallel flow solver on clusters of workstations. Steady flow solutions were generated for three different wing and body geometries to validate the code and evaluate its performance. The proposed research will extend the parallel code development to determine the most efficient strategies for unsteady flow simulations.

  10. Communication library for run-time visualization of distributed, asynchronous data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rowlan, J.; Wightman, B.T.

    1994-04-01

    In this paper we present a method for collecting and visualizing data generated by a parallel computational simulation during run time. Data distributed across multiple processes is sent across parallel communication lines to a remote workstation, which sorts and queues the data for visualization. We have implemented our method in a set of tools called PORTAL (for Parallel aRchitecture data-TrAnsfer Library). The tools comprise generic routines for sending data from a parallel program (callable from either C or FORTRAN), a semi-parallel communication scheme currently built upon Unix Sockets, and a real-time connection to the scientific visualization program AVS. Our method is most valuable when used to examine large datasets that can be efficiently generated and do not need to be stored on disk. The PORTAL source libraries, detailed documentation, and a working example can be obtained by anonymous ftp from info.mcs.anl.gov from the file portal.tar.Z from the directory pub/portal.

  11. Multiprogramming performance degradation - Case study on a shared memory multiprocessor

    NASA Technical Reports Server (NTRS)

    Dimpsey, R. T.; Iyer, R. K.

    1989-01-01

    The performance degradation due to multiprogramming overhead is quantified for a parallel-processing machine. Measurements of real workloads were taken, and a moderate correlation was found between the completion time of a program and the amount of system overhead measured during program execution. Experiments in controlled environments were then conducted to calculate a lower bound on the performance degradation of parallel jobs caused by multiprogramming overhead. The results show that the multiprogramming overhead of parallel jobs consumes at least 4 percent of the processor time; when two or more serial jobs are introduced into the system, this amount increases to 5.3 percent.

  12. Acceleration of Radiance for Lighting Simulation by Using Parallel Computing with OpenCL

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zuo, Wangda; McNeil, Andrew; Wetter, Michael

    2011-09-06

    We report on the acceleration of annual daylighting simulations for fenestration systems in the Radiance ray-tracing program. The algorithm was optimized to reduce both the redundant data input/output operations and the floating-point operations. To further accelerate the simulation speed, the calculation for matrix multiplications was implemented using parallel computing on a graphics processing unit. We used OpenCL, which is a cross-platform parallel programming language. Numerical experiments show that the combination of the above measures can speed up the annual daylighting simulations 101.7 times or 28.6 times when the sky vector has 146 or 2306 elements, respectively.
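
    For illustration, a generic OpenCL matrix multiplication in C is sketched below; it is not the Radiance code itself, but it shows the kind of kernel and host-side setup involved in offloading matrix multiplications with OpenCL. Error checking is omitted for brevity, and the matrix size and contents are arbitrary.

        #include <stdio.h>
        #include <stdlib.h>
        #include <CL/cl.h>

        #define N 256  /* matrix dimension (arbitrary) */

        /* Kernel source: one work-item computes one element of C = A * B. */
        static const char *src =
            "__kernel void matmul(__global const float *A, __global const float *B,\n"
            "                     __global float *C, int n) {\n"
            "    int row = get_global_id(0), col = get_global_id(1);\n"
            "    float acc = 0.0f;\n"
            "    for (int k = 0; k < n; k++) acc += A[row*n + k] * B[k*n + col];\n"
            "    C[row*n + col] = acc;\n"
            "}\n";

        int main(void)
        {
            size_t bytes = N * N * sizeof(float), global[2] = { N, N };
            float *A = malloc(bytes), *B = malloc(bytes), *C = malloc(bytes);
            for (int i = 0; i < N * N; i++) { A[i] = 1.0f; B[i] = 2.0f; }

            /* Boilerplate setup on the default device. */
            cl_platform_id plat;  clGetPlatformIDs(1, &plat, NULL);
            cl_device_id dev;     clGetDeviceIDs(plat, CL_DEVICE_TYPE_DEFAULT, 1, &dev, NULL);
            cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
            cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);
            cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
            clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
            cl_kernel k = clCreateKernel(prog, "matmul", NULL);

            cl_mem dA = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, bytes, A, NULL);
            cl_mem dB = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, bytes, B, NULL);
            cl_mem dC = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, bytes, NULL, NULL);

            int n = N;
            clSetKernelArg(k, 0, sizeof(cl_mem), &dA);
            clSetKernelArg(k, 1, sizeof(cl_mem), &dB);
            clSetKernelArg(k, 2, sizeof(cl_mem), &dC);
            clSetKernelArg(k, 3, sizeof(int), &n);

            /* Launch an N x N grid of work-items, then read back the result. */
            clEnqueueNDRangeKernel(q, k, 2, NULL, global, NULL, 0, NULL, NULL);
            clEnqueueReadBuffer(q, dC, CL_TRUE, 0, bytes, C, 0, NULL, NULL);

            printf("C[0] = %f (expect %f)\n", C[0], 2.0f * N);
            return 0;
        }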

  13. SPSS and SAS programs for determining the number of components using parallel analysis and Velicer's MAP test.

    PubMed

    O'Connor, B P

    2000-08-01

    Popular statistical software packages do not have the proper procedures for determining the number of components in factor and principal components analyses. Parallel analysis and Velicer's minimum average partial (MAP) test are validated procedures, recommended widely by statisticians. However, many researchers continue to use alternative, simpler, but flawed procedures, such as the eigenvalues-greater-than-one rule. Use of the proper procedures might be increased if these procedures could be conducted within familiar software environments. This paper describes brief and efficient programs for using SPSS and SAS to conduct parallel analyses and the MAP test.

  14. Message Passing and Shared Address Space Parallelism on an SMP Cluster

    NASA Technical Reports Server (NTRS)

    Shan, Hongzhang; Singh, Jaswinder P.; Oliker, Leonid; Biswas, Rupak; Biegel, Bryan (Technical Monitor)

    2002-01-01

    Currently, message passing (MP) and shared address space (SAS) are the two leading parallel programming paradigms. MP has been standardized with MPI, and is the more common and mature approach; however, code development can be extremely difficult, especially for irregularly structured computations. SAS offers substantial ease of programming, but may suffer from performance limitations due to poor spatial locality and high protocol overhead. In this paper, we compare the performance of and the programming effort required for six applications under both programming models on a 32-processor PC-SMP cluster, a platform that is becoming increasingly attractive for high-end scientific computing. Our application suite consists of codes that typically do not exhibit scalable performance under shared-memory programming due to their high communication-to-computation ratios and/or complex communication patterns. Results indicate that SAS can achieve about half the parallel efficiency of MPI for most of our applications, while being competitive for the others. A hybrid MPI+SAS strategy shows only a small performance advantage over pure MPI in some cases. Finally, improved implementations of two MPI collective operations on PC-SMP clusters are presented.
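
    A hybrid MP+SAS program of the kind compared here can be sketched in C as MPI across nodes with OpenMP threads inside each rank. The sketch below block-partitions a reduction across ranks and threads; the problem (a harmonic sum) is a placeholder for the applications studied.

        #include <mpi.h>
        #include <omp.h>
        #include <stdio.h>

        #define N 1000000

        int main(int argc, char **argv)
        {
            int rank, size, provided;
            MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            /* Block-partition the index range across MPI processes (message
               passing between nodes), then split each block across OpenMP
               threads (shared address space within a node). */
            long lo = (long)N * rank / size, hi = (long)N * (rank + 1) / size;
            double local = 0.0, total = 0.0;

            #pragma omp parallel for reduction(+:local)
            for (long i = lo; i < hi; i++)
                local += 1.0 / ((double)i + 1.0);

            MPI_Allreduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
            if (rank == 0) printf("harmonic sum H(%d) = %.6f\n", N, total);
            MPI_Finalize();
            return 0;
        }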

  15. Investigation of the applicability of a functional programming model to fault-tolerant parallel processing for knowledge-based systems

    NASA Technical Reports Server (NTRS)

    Harper, Richard

    1989-01-01

    In a fault-tolerant parallel computer, a functional programming model can facilitate distributed checkpointing, error recovery, load balancing, and graceful degradation. Such a model has been implemented on the Draper Fault-Tolerant Parallel Processor (FTPP). When used in conjunction with the FTPP's fault detection and masking capabilities, this implementation results in a graceful degradation of system performance after faults. Three graceful degradation algorithms have been implemented and are presented. A user interface has been implemented which requires minimal cognitive overhead by the application programmer, masking such complexities as the system's redundancy, distributed nature, variable complement of processing resources, load balancing, fault occurrence and recovery. This user interface is described and its use demonstrated. The applicability of the functional programming style to the Activation Framework, a paradigm for intelligent systems, is then briefly described.

  16. The revised solar array synthesis computer program

    NASA Technical Reports Server (NTRS)

    1970-01-01

    The Revised Solar Array Synthesis Computer Program is described. It is a general-purpose program that computes solar-array output characteristics while accounting for temperature, incidence angle, charged-particle irradiation, and other degradation effects on various solar-array configurations in either circular or elliptical orbits. Array configurations may consist of up to 75 solar-cell panels arranged in any series-parallel combination not exceeding three series-connected panels in a parallel string and no more than 25 parallel strings in an array. Up to 100 separate solar-array current-voltage characteristics, corresponding to 100 equal-time increments during the sunlight-illuminated portion of an orbit or to any 100 user-specified combinations of incidence angle and temperature, can be computed and printed out during one complete computer execution. Individual panel incidence angles may be computed and printed out at the user's option.

  17. High Performance Programming Using Explicit Shared Memory Model on the Cray T3D

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Simon, Horst D.; Lasinski, T. A. (Technical Monitor)

    1994-01-01

    The Cray T3D is the first-phase system in Cray Research Inc.'s (CRI) three-phase massively parallel processing program. In this report we describe the architecture of the T3D, as well as the CRAFT (Cray Research Adaptive Fortran) programming model, and contrast it with PVM, which is also supported on the T3D. We present some performance data based on the NAS Parallel Benchmarks to illustrate both architectural and software features of the T3D.

  18. Parallel computation and the Basis system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smith, G.R.

    1992-12-16

    A software package has been written that can facilitate efforts to develop powerful, flexible, and easy-to-use programs that can run in single-processor, massively parallel, and distributed computing environments. Particular attention has been given to the difficulties posed by a program consisting of many science packages that represent subsystems of a complicated, coupled system. Methods have been found to maintain independence of the packages by hiding data structures without increasing the communication costs in a parallel computing environment. Concepts developed in this work are demonstrated by a prototype program that uses library routines from two existing software systems, Basis and Parallel Virtual Machine (PVM). Most of the details of these libraries have been encapsulated in routines and macros that could be rewritten for alternative libraries that possess certain minimum capabilities. The prototype software uses a flexible master-and-slaves paradigm for parallel computation and supports domain decomposition with message passing for partitioning work among slaves. Facilities are provided for accessing variables that are distributed among the memories of slaves assigned to subdomains. The software is named PROTOPAR.
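
    The master-and-slaves paradigm described here can be sketched as a message-passing task farm. The sketch below uses MPI rather than the Basis/PVM libraries the prototype encapsulates, and do_work is a placeholder for a science-package computation on one subdomain; run it with at least two ranks.

        #include <mpi.h>
        #include <stdio.h>

        #define NTASKS   40
        #define TAG_WORK 1
        #define TAG_STOP 2

        /* Placeholder for a science-package computation on one subdomain. */
        static double do_work(int task) { return 0.5 * task; }

        int main(int argc, char **argv)
        {
            int rank, size;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            if (rank == 0) {            /* master: deal out tasks on demand */
                int next = 0, active = 0;
                double sum = 0.0, r;
                MPI_Status st;
                for (int w = 1; w < size; w++) {     /* prime every worker */
                    if (next < NTASKS) {
                        MPI_Send(&next, 1, MPI_INT, w, TAG_WORK, MPI_COMM_WORLD);
                        next++; active++;
                    } else {
                        MPI_Send(&next, 0, MPI_INT, w, TAG_STOP, MPI_COMM_WORLD);
                    }
                }
                while (active > 0) {    /* collect, then reassign or stop */
                    MPI_Recv(&r, 1, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
                             MPI_COMM_WORLD, &st);
                    sum += r;
                    if (next < NTASKS) {
                        MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK, MPI_COMM_WORLD);
                        next++;
                    } else {
                        MPI_Send(&next, 0, MPI_INT, st.MPI_SOURCE, TAG_STOP, MPI_COMM_WORLD);
                        active--;
                    }
                }
                printf("master: sum of %d task results = %f\n", NTASKS, sum);
            } else {                    /* worker: compute until told to stop */
                MPI_Status st;
                int task;
                for (;;) {
                    MPI_Recv(&task, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
                    if (st.MPI_TAG == TAG_STOP) break;
                    double r = do_work(task);
                    MPI_Send(&r, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
                }
            }
            MPI_Finalize();
            return 0;
        }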

  19. Orthorectification by Using Gpgpu Method

    NASA Astrophysics Data System (ADS)

    Sahin, H.; Kulur, S.

    2012-07-01

    Owing to the nature of graphics processing, newly released graphics products offer highly parallel processing units with high memory bandwidth and computational power of more than a teraflop. Modern GPUs are not only powerful graphics engines but also highly parallel programmable processors, with far greater computing capability and memory bandwidth than central processing units (CPUs). Data-parallel computation can be briefly described as mapping data elements to parallel processing threads. The rapid development of GPU programmability and capability has attracted researchers dealing with complex problems that demand heavy computation, giving rise to the concepts of "general-purpose computation on graphics processing units" (GPGPU) and "stream processing". Graphics processors are powerful yet inexpensive and affordable hardware, and so have become an alternative to conventional processors; graphics chips that were once fixed-function application hardware have been transformed into modern, powerful, programmable processors that meet general computing needs. The main difficulty is that GPUs use programming models unlike current programming methods: efficient GPU programming requires re-coding the algorithm to respect the limitations and structure of the graphics hardware, and these many-core processors cannot be programmed with traditional event-driven techniques. GPUs are especially effective when the same computing steps must be repeated over many data elements at high accuracy, delivering results quickly and accurately; by comparison, CPUs, which perform one computation at a time according to flow control, are slower for such workloads. This study covers how general-purpose parallel programming and the computational power of GPUs can be used in photogrammetric applications, especially direct georeferencing. The direct georeferencing algorithm was coded using the GPGPU method and the CUDA (Compute Unified Device Architecture) programming language, and the results were compared with a traditional CPU implementation. In a second application, projective rectification was coded using the GPGPU method and CUDA, and sample images of various sizes were evaluated against the results of the CPU program. The GPGPU method is especially useful for repeating the same computations on highly dense data, where it finds the solution quickly and accurately.

  20. Three pillars for achieving quantum mechanical molecular dynamics simulations of huge systems: Divide-and-conquer, density-functional tight-binding, and massively parallel computation.

    PubMed

    Nishizawa, Hiroaki; Nishimura, Yoshifumi; Kobayashi, Masato; Irle, Stephan; Nakai, Hiromi

    2016-08-05

    The linear-scaling divide-and-conquer (DC) quantum chemical methodology is applied to density-functional tight-binding (DFTB) theory to develop a massively parallel program that achieves on-the-fly molecular reaction dynamics simulations of huge systems from scratch. Functions to perform large-scale geometry optimization and molecular dynamics on the DC-DFTB potential energy surface are implemented in the program, called DC-DFTB-K. A novel interpolation-based algorithm is developed for parallelizing the determination of the Fermi level in the DC method. The performance of the DC-DFTB-K program is assessed using a laboratory computer and the K computer. Numerical tests show the high efficiency of DC-DFTB-K: a single-point energy gradient calculation of a one-million-atom system is completed within 60 s using 7290 nodes of the K computer. © 2016 Wiley Periodicals, Inc.

  1. Automated Performance Prediction of Message-Passing Parallel Programs

    NASA Technical Reports Server (NTRS)

    Block, Robert J.; Sarukkai, Sekhar; Mehra, Pankaj; Woodrow, Thomas S. (Technical Monitor)

    1995-01-01

    The increasing use of massively parallel supercomputers to solve large-scale scientific problems has generated a need for tools that can predict scalability trends of applications written for these machines. Much work has been done to create simple models that represent important characteristics of parallel programs, such as latency, network contention, and communication volume. But many of these methods still require substantial manual effort to represent an application in the model's format. The MK toolkit described in this paper is the result of an ongoing effort to automate the formation of analytic expressions of program execution time, with a minimum of programmer assistance. In this paper we demonstrate the feasibility of our approach by extending previous work to detect and model communication patterns automatically, with and without overlapped computations. The predictions derived from these models agree, within reasonable limits, with execution times of programs measured on the Intel iPSC/860 and Paragon. Further, we demonstrate the use of MK in selecting optimal computational grain size and studying various scalability metrics.
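
    The kind of closed-form execution-time expression such a tool derives can be illustrated with a toy model in C: computation that scales as n/p plus a logarithmic reduction term with latency and bandwidth components. All coefficients below are hypothetical, not measurements from the iPSC/860 or Paragon.

        #include <stdio.h>
        #include <math.h>

        /* Hypothetical analytic model of a message-passing solver: compute
           cost scales as n/p, plus a log2(p)-deep reduction whose cost has
           a per-message latency (alpha) and per-byte (beta) term. */
        static double t_pred(double n, int p)
        {
            const double t_flop = 2e-9;       /* s per grid-point update  */
            const double alpha  = 50e-6;      /* s per message (latency)  */
            const double beta   = 1e-9;       /* s per byte (bandwidth)   */
            const double msg    = 8.0 * 1024; /* bytes per reduction step */
            return t_flop * n / p + log2((double)p) * (alpha + beta * msg);
        }

        int main(void)
        {
            double n = 1e8;                   /* problem size */
            for (int p = 1; p <= 512; p *= 2)
                printf("p=%4d  predicted T=%8.4f s  speedup=%6.1f\n",
                       p, t_pred(n, p), t_pred(n, 1) / t_pred(n, p));
            return 0;
        }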

  2. Connectionist Models and Parallelism in High Level Vision.

    DTIC Science & Technology

    1985-01-01

    Only fragments of this record survive from the report documentation page (Grant N00014-82-K-0193; Jerome A. Feldman; Computer Science Department). Recoverable abstract text: "Computer science is just beginning to look seriously at parallel computation: it may turn out that... the chair. The program includes intermediate level networks that compute more complex joints and ones that compute parallelograms in the image. These..."

  3. Transient Finite Element Computations on a Variable Transputer System

    NASA Technical Reports Server (NTRS)

    Smolinski, Patrick J.; Lapczyk, Ireneusz

    1993-01-01

    A parallel program to analyze transient finite element problems was written and implemented on a system of transputer processors. The program uses an explicit time-integration algorithm, which eliminates the need for equation solving and makes it more suitable for parallel computation. An interprocessor communication scheme was developed for arbitrary two-dimensional grid processor configurations. Several 3-D problems were analyzed on a system with a small number of processors.
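
    The reason explicit time integration suits parallel execution can be shown with a small sketch in C: each new nodal value depends only on already-known quantities, so no global system of equations must be solved and every node can be updated independently. The sketch below advances a 1D chain of unit masses with a symplectic-Euler variant of the explicit update; all parameters are illustrative.

        #include <stdio.h>

        #define NODES 100
        #define STEPS 1000

        int main(void)
        {
            /* 1D chain of unit masses joined by springs of stiffness K.
               The update uses only current values, so each node could be
               assigned to a different processor with no equation solving. */
            const double K = 100.0, dt = 0.01;   /* dt within stability limit */
            double u[NODES] = {0}, v[NODES] = {0}, a[NODES] = {0};

            u[NODES / 2] = 0.1;                  /* initial displacement */

            for (int s = 0; s < STEPS; s++) {
                for (int i = 1; i < NODES - 1; i++)   /* force -> acceleration */
                    a[i] = K * (u[i-1] - 2.0 * u[i] + u[i+1]);
                for (int i = 1; i < NODES - 1; i++) { /* explicit update */
                    v[i] += dt * a[i];
                    u[i] += dt * v[i];
                }
            }
            printf("u[mid] after %d steps: %f\n", STEPS, u[NODES / 2]);
            return 0;
        }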

  4. Scheduling for Locality in Shared-Memory Multiprocessors

    DTIC Science & Technology

    1993-05-01

    Only fragments of this dissertation's documentation page survive. The recoverable abstract text examines the effect of architecture trends on parallel program performance, explains the implications of this trend for popular parallel programming models, and proposes system software for decomposition and scheduling algorithms. Subject terms: shared-memory multiprocessors; architecture trends; loop scheduling.

  5. An Empirical Development of Parallelization Guidelines for Time-Driven Simulation

    DTIC Science & Technology

    1989-12-01

    Only fragments of this thesis's acknowledgments and abstract survive. Recoverable abstract text: "...program parallelization. In this research effort a Ballistic Missile Defense (BMD) time driven simulation program, developed by DESE Research and... continuously, or continuously with discrete changes superimposed. The distinguishing feature of these simulations is the interaction between discretely..."

  6. Force user's manual, revised

    NASA Technical Reports Server (NTRS)

    Jordan, Harry F.; Benten, Muhammad S.; Arenstorf, Norbert S.; Ramanan, Aruna V.

    1987-01-01

    A methodology for writing parallel programs for shared memory multiprocessors has been formalized as an extension to the Fortran language and implemented as a macro preprocessor. The extended language is known as the Force, and this manual describes how to write Force programs and execute them on the Flexible Computer Corporation Flex/32, the Encore Multimax and the Sequent Balance computers. The parallel extension macros are described in detail, but knowledge of Fortran is assumed.

  7. Parallel Programming Paradigms

    DTIC Science & Technology

    1987-07-01

    Only fragments of this report's documentation page survive (distribution unlimited; work supported in part by Grant 8416878 and by Office of Naval Research Contracts No. N00014-86-K-0264 and No. N00014-85-K-0328). Recoverable abstract text: "...processors to fetch from the same memory cell (list head) and thus seems to favor a shared memory implementation [37]. In this dissertation, we..."

  8. Dynamic programming in parallel boundary detection with application to ultrasound intima-media segmentation.

    PubMed

    Zhou, Yuan; Cheng, Xinyao; Xu, Xiangyang; Song, Enmin

    2013-12-01

    Segmentation of the carotid artery intima-media in longitudinal ultrasound images, for measuring its thickness to predict cardiovascular disease, can be simplified as detecting two nearly parallel boundaries within a certain distance range when plaque with irregular shapes is not considered. In this paper, we improve the implementation of two dynamic programming (DP) based approaches to parallel boundary detection: dual dynamic programming (DDP) and piecewise linear dual dynamic programming (PL-DDP). Then, a novel DP-based approach, dual line detection (DLD), which translates the original 2-D curve position to a 4-D parameter space representing two line segments in a local image segment, is proposed to solve the problem while maintaining efficiency and rotation invariance. To apply DLD to ultrasound intima-media segmentation, it is embedded in a framework that employs an edge map, obtained by multiplying the responses of two edge detectors with different scales, and a coupled snake model that simultaneously deforms the two contours to maintain parallelism. The experimental results on synthetic images and carotid arteries of clinical ultrasound images indicate improved performance of the proposed DLD compared to DDP and PL-DDP, with respect to accuracy and efficiency. Copyright © 2013 Elsevier B.V. All rights reserved.
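
    The single-boundary special case of this dynamic programming formulation can be sketched in C: a column-by-column recurrence in which the boundary row may shift by at most one per column, maximizing cumulative edge strength. DDP and the proposed DLD extend the state to two rows subject to a distance constraint; the toy edge map below stands in for the multiscale edge responses.

        #include <stdio.h>

        #define ROWS 8
        #define COLS 10

        int main(void)
        {
            /* Toy edge-strength map; in the application this would come from
               the two multiscale edge detectors. Higher = stronger edge. */
            int edge[ROWS][COLS];
            for (int r = 0; r < ROWS; r++)
                for (int c = 0; c < COLS; c++)
                    edge[r][c] = (r == 3) ? 10 : (r == 4 ? 8 : 1);

            int dp[ROWS][COLS], from[ROWS][COLS];
            for (int r = 0; r < ROWS; r++) dp[r][0] = edge[r][0];

            /* Column-by-column DP: the boundary moves at most one row per
               column, which enforces smoothness. */
            for (int c = 1; c < COLS; c++)
                for (int r = 0; r < ROWS; r++) {
                    int best = -1, arg = r;
                    for (int d = -1; d <= 1; d++) {
                        int pr = r + d;
                        if (pr < 0 || pr >= ROWS) continue;
                        if (dp[pr][c-1] > best) { best = dp[pr][c-1]; arg = pr; }
                    }
                    dp[r][c] = edge[r][c] + best;
                    from[r][c] = arg;
                }

            int r = 0;          /* pick the best endpoint, then backtrack */
            for (int i = 1; i < ROWS; i++)
                if (dp[i][COLS-1] > dp[r][COLS-1]) r = i;
            int path[COLS];
            path[COLS-1] = r;
            for (int c = COLS - 1; c > 0; c--) path[c-1] = r = from[r][c];

            printf("boundary rows:");
            for (int c = 0; c < COLS; c++) printf(" %d", path[c]);
            printf("\n");
            return 0;
        }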

  9. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shipman, Galen M.

    These are the slides for a presentation on programming models in HPC, at the Los Alamos National Laboratory's Parallel Computing Summer School. The following topics are covered: Flynn's Taxonomy of computer architectures; single instruction single data; single instruction multiple data; multiple instruction multiple data; address space organization; definition of Trinity (Intel Xeon-Phi is a MIMD architecture); single program multiple data; multiple program multiple data; ExMatEx workflow overview; definition of a programming model, programming languages, runtime systems; programming model and environments; MPI (Message Passing Interface); OpenMP; Kokkos (Performance Portable Thread-Parallel Programming Model); Kokkos abstractions, patterns, policies, and spaces; RAJA, a systematic approach to node-level portability and tuning; overview of the Legion Programming Model; mapping tasks and data to hardware resources; interoperability: supporting task-level models; Legion S3D execution and performance details; workflow, integration of external resources into the programming model.

  10. Massively parallel data processing for quantitative total flow imaging with optical coherence microscopy and tomography

    NASA Astrophysics Data System (ADS)

    Sylwestrzak, Marcin; Szlag, Daniel; Marchand, Paul J.; Kumar, Ashwin S.; Lasser, Theo

    2017-08-01

    We present an application of massively parallel processing to quantitative flow measurement data acquired using spectral optical coherence microscopy (SOCM). The need for massive signal processing of these particular datasets has been a major hurdle for many applications based on SOCM. In view of this difficulty, we implemented and adapted quantitative total flow estimation algorithms on graphics processing units (GPU) and achieved a 150-fold reduction in processing time compared to a former CPU implementation. As SOCM constitutes the microscopy counterpart to spectral optical coherence tomography (SOCT), the developed processing procedure can be applied to both imaging modalities. We present the developed DLL library integrated in MATLAB (with an example) and have included the source code for adaptations and future improvements.

    Program summary
    Catalogue identifier: AFBT_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AFBT_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: GNU GPLv3
    No. of lines in distributed program, including test data, etc.: 913552
    No. of bytes in distributed program, including test data, etc.: 270876249
    Distribution format: tar.gz
    Programming language: CUDA/C, MATLAB
    Computer: Intel x64 CPU, GPU supporting CUDA technology
    Operating system: 64-bit Windows 7 Professional
    Has the code been vectorized or parallelized?: Yes; CPU code has been vectorized in MATLAB, and CUDA code has been parallelized
    RAM: Dependent on the user's parameters, typically between several gigabytes and several tens of gigabytes
    Classification: 6.5, 18
    Nature of problem: Speeding up data processing in optical coherence microscopy
    Solution method: Utilization of a GPU for massively parallel data processing
    Additional comments: Compiled DLL library with source code and documentation, and an example of utilization (MATLAB script with raw data)
    Running time: 1.8 s for one B-scan (150× faster in comparison to the CPU data processing time)

  11. Use Computer-Aided Tools to Parallelize Large CFD Applications

    NASA Technical Reports Server (NTRS)

    Jin, H.; Frumkin, M.; Yan, J.

    2000-01-01

    Porting applications to high-performance parallel computers is always a challenging task: it is time consuming and costly. With rapid progress in hardware architectures and the increasing complexity of real applications in recent years, the problem has become even more severe. Today, scalability and high performance mostly involve handwritten parallel programs using message-passing libraries (e.g., MPI), but this process is very difficult and often error-prone. The recent reemergence of shared-memory parallel (SMP) architectures, such as the cache-coherent Non-Uniform Memory Access (ccNUMA) architecture used in the SGI Origin 2000, shows good prospects for scaling beyond hundreds of processors. Programming on an SMP is simplified by working in a globally accessible address space; the user can supply compiler directives, such as OpenMP, to parallelize the code. As an industry standard for portable implementation of parallel programs on SMPs, OpenMP is a set of compiler directives and callable runtime library routines that extend Fortran, C, and C++ to express shared-memory parallelism. It promises an incremental path for parallel conversion of existing software, as well as scalability and performance for a complete rewrite or an entirely new development. Perhaps the main disadvantage of programming with directives is that inserted directives do not necessarily enhance performance; in the worst cases, they can create erroneous results. While vendors have provided tools for error-checking and profiling, automation of directive insertion has been very limited and often fails on large programs, primarily for lack of a thorough enough data-dependence analysis. To overcome this deficiency, we have developed a toolkit, CAPO, to automatically insert OpenMP directives in Fortran programs and apply certain degrees of optimization. CAPO takes advantage of the detailed interprocedural dependence analysis provided by CAPTools, developed by the University of Greenwich, to reduce potential errors made by users. Earlier tests on the NAS Benchmarks and ARC3D demonstrated good success of this tool. In this study, we have applied CAPO to parallelize three large applications in the area of computational fluid dynamics (CFD): OVERFLOW, TLNS3D, and INS3D. These codes are widely used for solving the Navier-Stokes equations with complicated boundary conditions and turbulence models in multiple zones, and each comprises 50K to 100K lines of Fortran 77. As an example, CAPO took 77 hours to complete the data-dependence analysis of OVERFLOW on a workstation (SGI, 175 MHz R10K processor). A fair amount of effort was spent correcting false dependences arising from a lack of necessary knowledge during the analysis. Even so, CAPO provides an easy way for the user to interact with the parallelization process, and the OpenMP version was generated within a day after the analysis was completed. Because of the sequential algorithms involved, code sections in TLNS3D and INS3D needed to be restructured by hand to produce more efficient parallel code. An included figure shows preliminary test results for the generated OVERFLOW with several single-zone test cases; the MPI data points for the small test case were taken from a hand-coded MPI version. CAPO's version achieved an 18-fold speedup on 32 nodes of the SGI O2K and, for the small test case, outperformed the MPI version. These results are very encouraging, but further work is needed.
For example, although CAPO attempts to place directives on the outer- most parallel loops in an interprocedural framework, it does not insert directives based on the best manual strategy. In particular, it lacks the support of parallelization at the multi-zone level. Future work will emphasize on the development of methodology to work in a multi-zone level and with a hybrid approach. Development of tools to perform more complicated code transformation is also needed.
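
    For readers unfamiliar with directive-based parallelization, the sketch below shows (in C for brevity; CAPO itself targets Fortran) the kind of loop-level OpenMP directive such a tool inserts once dependence analysis has proven the iterations independent. The example is illustrative, not CAPO output.

        #include <stdio.h>
        #include <omp.h>

        #define N 1000000

        int main(void) {
            static double a[N], b[N];
            /* A directive-based tool marks loops whose iterations the
               dependence analysis has proven independent. */
            #pragma omp parallel for
            for (int i = 0; i < N; i++)
                a[i] = 2.0 * b[i] + 1.0;
            printf("ran with up to %d threads\n", omp_get_max_threads());
            return 0;
        }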

  12. Northeast Artificial Intelligence Consortium Annual Report - 1988 Parallel Vision. Volume 9

    DTIC Science & Technology

    1989-10-01

    Annual report supporting the Northeast Artificial Intelligence Consortium (NAIC). Volume 9, Parallel Vision, was submitted by Christopher M. Brown and Randal C. Nelson, Syracuse University.

  13. High Performance Input/Output for Parallel Computer Systems

    NASA Technical Reports Server (NTRS)

    Ligon, W. B.

    1996-01-01

    The goal of our project is to study the I/O characteristics of parallel applications used in Earth Science data processing systems such as Regional Data Centers (RDCs) or EOSDIS. Our approach is to study the runtime behavior of typical programs and the effect of key parameters of the I/O subsystem, both under simulation and with direct experimentation on parallel systems. Our three-year activity has focused on two items: developing a test bed that facilitates experimentation with parallel I/O, and studying representative programs from the Earth science data processing application domain. The Parallel Virtual File System (PVFS) has been developed for use on a number of platforms including the Tiger Parallel Architecture Workbench (TPAW) simulator, the Intel Paragon, a cluster of DEC Alpha workstations, and the Beowulf system (at CESDIS). PVFS provides considerable flexibility in configuring I/O in a UNIX-like environment. Access to key performance parameters facilitates experimentation. We have studied several key applications from levels 1, 2 and 3 of the typical RDC processing scenario, including instrument calibration and navigation, image classification, and numerical modeling codes. We have also considered large-scale scientific database codes used to organize image data.

  14. The Research of the Parallel Computing Development from the Angle of Cloud Computing

    NASA Astrophysics Data System (ADS)

    Peng, Zhensheng; Gong, Qingge; Duan, Yanyu; Wang, Yun

    2017-10-01

    Cloud computing is the development of parallel computing, distributed computing and grid computing. The development of cloud computing makes parallel computing come into people's lives. Firstly, this paper expounds the concept of cloud computing and introduces several traditional parallel programming models. Secondly, it analyzes and studies the principles, advantages and disadvantages of OpenMP, MPI and MapReduce, respectively. Finally, it compares the MPI and OpenMP models with MapReduce from the angle of cloud computing. The results of this paper are intended to provide a reference for the development of parallel computing.
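
    As a concrete point of comparison between the models discussed (a hypothetical sketch, not code from the paper), the same global sum can be computed with an OpenMP reduction inside each process and an MPI reduction across processes; compile with an MPI wrapper and OpenMP enabled (e.g., mpicc -fopenmp).

        #include <mpi.h>
        #include <stdio.h>

        int main(int argc, char **argv) {
            MPI_Init(&argc, &argv);
            int rank;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            /* OpenMP: shared-memory reduction within one process. */
            double local = 0.0;
            #pragma omp parallel for reduction(+:local)
            for (int i = 0; i < 1000; i++)
                local += 0.5 * i;

            /* MPI: message-passing reduction across processes. */
            double global = 0.0;
            MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

            if (rank == 0)
                printf("global sum = %f\n", global);
            MPI_Finalize();
            return 0;
        }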

  15. A real-time MPEG software decoder using a portable message-passing library

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kwong, Man Kam; Tang, P.T. Peter; Lin, Biquan

    1995-12-31

    We present a real-time MPEG software decoder that uses message-passing libraries such as MPL, p4 and MPI. The parallel MPEG decoder currently runs on the IBM SP system but can be easily ported to other parallel machines. This paper discusses our parallel MPEG decoding algorithm as well as the parallel programming environment in which it runs. Several technical issues are discussed, including balancing of decoding speed, memory limitations, I/O capacities, and optimization of MPEG decoding components. This project shows that a real-time portable software MPEG decoder is feasible on a general-purpose parallel machine.
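
    The record gives no implementation detail, but a common decomposition for such decoders is a static round-robin assignment of independently decodable units (e.g., groups of pictures) to ranks; a minimal MPI sketch under that assumption, with decode_gop as a hypothetical stand-in for the real decoding routine:

        #include <mpi.h>
        #include <stdio.h>

        /* Hypothetical stand-in for decoding one group of pictures. */
        static void decode_gop(int gop_index) {
            printf("decoding group of pictures %d\n", gop_index);
        }

        int main(int argc, char **argv) {
            MPI_Init(&argc, &argv);
            int rank, size, num_gops = 64;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);
            /* Round-robin: rank r decodes units r, r+size, r+2*size, ... */
            for (int g = rank; g < num_gops; g += size)
                decode_gop(g);
            MPI_Finalize();
            return 0;
        }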

  16. Memory access in shared virtual memory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berrendorf, R.

    1992-01-01

    Shared virtual memory (SVM) is a virtual memory layer with a single address space on top of a distributed real memory on parallel computers. We examine the behavior and performance of SVM running a parallel program with medium-grained, loop-level parallelism on top of it. A simulator for the underlying parallel architecture can be used to examine the behavior of SVM more deeply. The influence of several parameters, such as the number of processors, page size, cold or warm start, and restricted page replication, is studied.

  17. Memory access in shared virtual memory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berrendorf, R.

    1992-09-01

    Shared virtual memory (SVM) is a virtual memory layer with a single address space on top of a distributed real memory on parallel computers. We examine the behavior and performance of SVM running a parallel program with medium-grained, loop-level parallelism on top of it. A simulator for the underlying parallel architecture can be used to examine the behavior of SVM more deeply. The influence of several parameters, such as the number of processors, page size, cold or warm start, and restricted page replication, is studied.

  18. Support of Multidimensional Parallelism in the OpenMP Programming Model

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Jost, Gabriele

    2003-01-01

    OpenMP is the current standard for shared-memory programming. While providing ease of parallel programming, the OpenMP programming model also has limitations that often affect the scalability of applications. Examples of these limitations are work distribution and point-to-point synchronization among threads. We propose extensions to the OpenMP programming model that allow the user to easily distribute the work in multiple dimensions and synchronize the workflow among the threads. The proposed extensions include four new constructs and the associated runtime library. They do not require changes to the source code and can be implemented based on the existing OpenMP standard. We illustrate the concept in a prototype translator and test it with benchmark codes and a cloud modeling code.
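
    For context, later OpenMP revisions (3.0 onward) provide a limited built-in form of multidimensional work distribution through the collapse clause; a minimal sketch of that baseline (an illustration, not the extensions proposed in the paper):

        #include <stdio.h>

        #define NX 512
        #define NY 512

        static double grid[NX][NY];

        int main(void) {
            /* collapse(2) merges the i and j loops into one iteration
               space, so threads share work in both dimensions. */
            #pragma omp parallel for collapse(2)
            for (int i = 0; i < NX; i++)
                for (int j = 0; j < NY; j++)
                    grid[i][j] = (double)(i + j);
            printf("grid[1][1] = %f\n", grid[1][1]);
            return 0;
        }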

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dritz, K.W.; Boyle, J.M.

    This paper addresses the problem of measuring and analyzing the performance of fine-grained parallel programs running on shared-memory multiprocessors. Such processors use locking (either directly in the application program, or indirectly in a subroutine library or the operating system) to serialize accesses to global variables. Given sufficiently high rates of locking, the chief factor preventing linear speedup (besides lack of adequate inherent parallelism in the application) is lock contention - the blocking of processes that are trying to acquire a lock currently held by another process. We show how a high-resolution, low-overhead clock may be used to measure both lock contention and lack of parallel work. Several ways of presenting the results are covered, culminating in a method for calculating, in a single multiprocessing run, both the speedup actually achieved and the speedup lost to contention for each lock and to lack of parallel work. The speedup losses are reported in the same units, "processor-equivalents," as the speedup achieved. Both are obtained without having to perform the usual one-process comparison run. We also chronicle a variety of experiments motivated by actual results obtained with our measurement method. The insights into program performance that we gained from these experiments helped us to refine the parts of our programs concerned with communication and synchronization. Ultimately these improvements reduced lock contention to a negligible amount and yielded nearly linear speedup in applications not limited by lack of parallel work. We describe two generally applicable strategies ("code motion out of critical regions" and "critical-region fissioning") for reducing lock contention and one ("lock/variable fusion") applicable only on certain architectures.
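
    The measurement idea can be sketched with modern POSIX APIs (an illustration only; the paper predates these interfaces and used its own high-resolution clock): wrap each acquisition and charge the time spent blocked to the lock.

        #include <pthread.h>
        #include <stdio.h>
        #include <time.h>

        static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
        static double wait_seconds = 0.0;   /* accumulated contention */

        static double now(void) {
            struct timespec ts;
            clock_gettime(CLOCK_MONOTONIC, &ts);
            return ts.tv_sec + ts.tv_nsec * 1e-9;
        }

        /* Wrap the lock so time spent blocked is charged to it. */
        static void timed_lock(void) {
            double t0 = now();
            pthread_mutex_lock(&lock);
            wait_seconds += now() - t0;   /* protected by the lock itself */
        }

        int main(void) {
            timed_lock();
            pthread_mutex_unlock(&lock);
            printf("time blocked on lock: %g s\n", wait_seconds);
            return 0;
        }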

  20. ICASE Computer Science Program

    NASA Technical Reports Server (NTRS)

    1985-01-01

    The Institute for Computer Applications in Science and Engineering computer science program is discussed in outline form. Information is given on such topics as problem decomposition, algorithm development, programming languages, and parallel architectures.

  1. "A tempest in a cocktail glass": mothers, alcohol, and television, 1977-1996.

    PubMed

    Golden, J

    2000-06-01

    This article examines the portrayal of pregnancy and alcohol in thirty-six national network evening news broadcasts (ABC, CBS, NBC). Early coverage focused on white, middle-class women, as scientific authorities and government officials warned against drinking during pregnancy. After 1987, however, women who drank during pregnancy were depicted as members of minority groups and as a danger to society. The thematic transition began before warning labels appeared on alcoholic beverages and gained strength from official government efforts to prevent fetal alcohol syndrome. The greatest impetus for the revised discourse, however, was the eruption of a "moral panic" over crack cocaine use. By linking fetal harm to substance abuse, the panic suggested it was in the public's interest to control the behavior of pregnant women.

  2. TEMPEST in a gallimaufry: applying multilevel systems theory to person-in-context research.

    PubMed

    Peck, Stephen C

    2007-12-01

    Terminological ambiguity and inattention to personal and contextual multilevel systems undermine personality, self, and identity theories. Hierarchical and heterarchical systems theories are used to describe contents and processes existing within and across three interrelated multilevel systems: levels of organization, representation, and integration. Materially nested levels of organization are used to distinguish persons from contexts and personal from social identity. Functionally nested levels of representation are used to distinguish personal identity from the sense of identity and symbolic (belief) from iconic (schema) systems. Levels of integration are hypothesized to unfold separately but interdependently across levels of representation. Multilevel system configurations clarify alternative conceptualizations of traits and contextualized identity. Methodological implications for measurement and analysis (e.g., integrating variable- and pattern-centered methods) are briefly described.

  3. Comparisons of anomalous and collisional radial transport with a continuum kinetic edge code

    NASA Astrophysics Data System (ADS)

    Bodi, K.; Krasheninnikov, S.; Cohen, R.; Rognlien, T.

    2009-05-01

    Modeling of anomalous (turbulence-driven) radial transport in controlled-fusion plasmas is necessary for long-time transport simulations. Here the focus is on continuum kinetic edge codes such as the (2-D, 2-V) transport version of TEMPEST, NEO, and the code being developed by the Edge Simulation Laboratory, but the model also has wider application. Our previously developed anomalous diagonal transport matrix model with velocity-dependent convection and diffusion coefficients allows contact with typical fluid transport models (e.g., UEDGE). Results are presented that combine the anomalous transport model and collisional transport owing to ion drift orbits, utilizing a Krook collision operator that conserves density and energy. Comparison is made of the relative magnitudes and possible synergistic effects of the two processes for typical tokamak device parameters.

  4. EUV phase-shifting masks and aberration monitors

    NASA Astrophysics Data System (ADS)

    Deng, Yunfei; Neureuther, Andrew R.

    2002-07-01

    Rigorous electromagnetic simulation with TEMPEST is used to examine the use of phase-shifting masks in EUV lithography. The effects of oblique incident illumination and mask patterning by ion-mixing of multilayers are analyzed. Oblique incident illumination causes streamers at absorber edges and causes position shifting in aerial images. The diffraction waves between ion-mixed and pristine multilayers are observed. The phase shifting caused by stepped substrates is simulated, and the images show that it succeeds in creating phase-shifting effects. The diffraction process at the phase boundary is also analyzed. As an example of EUV phase-shifting masks, a coma pattern-and-probe-based aberration monitor is simulated and aerial images are formed under different levels of coma aberration. The probe signal rises quickly as coma increases, as designed.

  5. Overview of Edge Simulation Laboratory (ESL)

    NASA Astrophysics Data System (ADS)

    Cohen, R. H.; Dorr, M.; Hittinger, J.; Rognlien, T.; Umansky, M.; Xiong, A.; Xu, X.; Belli, E.; Candy, J.; Snyder, P.; Colella, P.; Martin, D.; Sternberg, T.; van Straalen, B.; Bodi, K.; Krasheninnikov, S.

    2006-10-01

    The ESL is a new collaboration to build a full-f electromagnetic gyrokinetic code for tokamak edge plasmas using continuum methods. Target applications are edge turbulence and transport (neoclassical and anomalous), and edge-localized modes. Initially the project has three major threads: (i) verification and validation of TEMPEST, the project's initial (electrostatic) edge code, which can be run in 4D (neoclassical and transport-timescale applications) or 5D (turbulence); (ii) design of the next-generation code, which will include more complete physics (electromagnetics, fluid equation option, improved collisions) and advanced numerics (fully conservative, high-order discretization, mapped multiblock grids, adaptivity); and (iii) rapid-prototype codes to explore the issues attached to solving fully nonlinear gyrokinetics with steep radial gradients. We present a brief summary of the status of each of these activities.

  6. Ethical issues in nanomedicine: Tempest in a teapot?

    PubMed

    Allon, Irit; Ben-Yehudah, Ahmi; Dekel, Raz; Solbakk, Jan-Helge; Weltring, Klaus-Michael; Siegal, Gil

    2017-03-01

    Nanomedicine offers remarkable options for new therapeutic avenues. As methods in nanomedicine advance, ethical questions conjunctly arise. Nanomedicine is an exceptional niche in several aspects, as it reflects risks and uncertainties not encountered in other areas of medical research or practice. Nanomedicine partially overlaps, partially interlocks and partially exceeds other medical disciplines. Some interpreters agree that advances in nanotechnology may pose varied ethical challenges, whilst others argue that these challenges are not new and that nanotechnology basically echoes recurrent bioethical dilemmas. The purpose of this article is to discuss some of the ethical issues related to nanomedicine and to reflect on whether nanomedicine generates ethical challenges of a new and unique nature. Such a determination should have implications for regulatory processes and for professional conduct and protocols in the future.

  7. Tempest in a sugar-coated lab vial.

    PubMed

    Dragun, Duska; Philippe, Aurélie

    2018-06-23

    Angiotensin II type 1 receptor (AT1R) is a classical G-protein-coupled receptor (GPCR) displaying a complex structure consisting of seven transmembrane helices connected by intracellular and extracellular loops. Besides angiotensin II binding within transmembrane sites and mechanically induced ligand-free activation, AT1R can also be activated by agonistic autoantibodies (AT1R-Abs) recognizing conformational epitopes contained in the second extracellular loop. Direct pathophysiologic involvement of AT1R-Abs is well established in several autoimmune contexts and in organ transplantation (1). A commercially available sandwich ELISA that preserves the native receptor conformation relies on cell-membrane AT1R extracts from human AT1R-overexpressing Chinese hamster ovary (CHO) cells as a solid phase.

  8. Simulation of exposure and alignment for nanoimprint lithography

    NASA Astrophysics Data System (ADS)

    Deng, Yunfei; Neureuther, Andrew R.

    2002-07-01

    Rigorous electromagnetic simulation with TEMPEST is used to examine the exposure and alignment processes for nano-imprint lithography with attenuating thin-film molds. Parameters in the design of topographical features of the nano-imprint system and material choices of the components are analyzed. The small feature size limits light transmission through the feature. While little can be done with auxiliary structures to attract light into small holes, the use of an absorbing material with a low real part of the refractive index, such as silver, helps mitigate the problem. Results on complementary alignment marks show that the small transmission through the metal layer and the vertical separation of the two alignment marks create leakage equivalent to a 1-nm misalignment, but satisfactory alignment can be obtained by measuring alignment signals over a +/- 30 nm range.

  9. The neural basis of parallel saccade programming: an fMRI study.

    PubMed

    Hu, Yanbo; Walker, Robin

    2011-11-01

    The neural basis of parallel saccade programming was examined in an event-related fMRI study using a variation of the double-step saccade paradigm. Two double-step conditions were used: one enabled the second saccade to be partially programmed in parallel with the first saccade while in a second condition both saccades had to be prepared serially. The intersaccadic interval, observed in the parallel programming (PP) condition, was significantly reduced compared with latency in the serial programming (SP) condition and also to the latency of single saccades in control conditions. The fMRI analysis revealed greater activity (BOLD response) in the frontal and parietal eye fields for the PP condition compared with the SP double-step condition and when compared with the single-saccade control conditions. By contrast, activity in the supplementary eye fields was greater for the double-step condition than the single-step condition but did not distinguish between the PP and SP requirements. The role of the frontal eye fields in PP may be related to the advanced temporal preparation and increased salience of the second saccade goal that may mediate activity in other downstream structures, such as the superior colliculus. The parietal lobes may be involved in the preparation for spatial remapping, which is required in double-step conditions. The supplementary eye fields appear to have a more general role in planning saccade sequences that may be related to error monitoring and the control over the execution of the correct sequence of responses.

  10. Supercomputing '91; Proceedings of the 4th Annual Conference on High Performance Computing, Albuquerque, NM, Nov. 18-22, 1991

    NASA Technical Reports Server (NTRS)

    1991-01-01

    Various papers on supercomputing are presented. The general topics addressed include: program analysis/data dependence, memory access, distributed memory code generation, numerical algorithms, supercomputer benchmarks, latency tolerance, parallel programming, applications, processor design, networks, performance tools, mapping and scheduling, characterization affecting performance, parallelism packaging, computing climate change, combinatorial algorithms, hardware and software performance issues, system issues. (No individual items are abstracted in this volume)

  11. Automatic differentiation for design sensitivity analysis of structural systems using multiple processors

    NASA Technical Reports Server (NTRS)

    Nguyen, Duc T.; Storaasli, Olaf O.; Qin, Jiangning; Qamar, Ramzi

    1994-01-01

    An automatic differentiation tool (ADIFOR) is incorporated into a finite element based structural analysis program for shape and non-shape design sensitivity analysis of structural systems. The entire analysis and sensitivity procedures are parallelized and vectorized for high performance computation. Small scale examples to verify the accuracy of the proposed program and a medium scale example to demonstrate the parallel vector performance on multiple CRAY C90 processors are included.

  12. The EMCC / DARPA Massively Parallel Electromagnetic Scattering Project

    NASA Technical Reports Server (NTRS)

    Woo, Alex C.; Hill, Kueichien C.

    1996-01-01

    The Electromagnetic Code Consortium (EMCC) was sponsored by the Advanced Research Projects Agency (ARPA) to demonstrate the effectiveness of massively parallel computing in large-scale radar signature predictions. The EMCC/ARPA project consisted of three parts.

  13. Parallel line analysis: multifunctional software for the biomedical sciences

    NASA Technical Reports Server (NTRS)

    Swank, P. R.; Lewis, M. L.; Damron, K. L.; Morrison, D. R.

    1990-01-01

    An easy to use, interactive FORTRAN program for analyzing the results of parallel line assays is described. The program is menu driven and consists of five major components: data entry, data editing, manual analysis, manual plotting, and automatic analysis and plotting. Data can be entered from the terminal or from previously created data files. The data editing portion of the program is used to inspect and modify data and to statistically identify outliers. The manual analysis component is used to test the assumptions necessary for parallel line assays using analysis of covariance techniques and to determine potency ratios with confidence limits. The manual plotting component provides a graphic display of the data on the terminal screen or on a standard line printer. The automatic portion runs through multiple analyses without operator input. Data may be saved in a special file to expedite input at a future time.

  14. Optimized and parallelized implementation of the electronegativity equalization method and the atom-bond electronegativity equalization method.

    PubMed

    Vareková, R Svobodová; Koca, J

    2006-02-01

    The most common way to calculate charge distribution in a molecule is ab initio quantum mechanics (QM). Some faster alternatives to QM have also been developed, the so-called "equalization methods" EEM and ABEEM, which are based on DFT. We have implemented and optimized the EEM and ABEEM methods and created the EEM SOLVER and ABEEM SOLVER programs. It has been found that the most time-consuming part of equalization methods is the reduction of the matrix belonging to the equation system generated by the method. Therefore, for both methods this part was replaced by the parallel algorithm WIRS and implemented within the PVM environment. The parallelized versions of the programs EEM SOLVER and ABEEM SOLVER showed promising results, especially on a single computer with several processors (compact PVM). The implemented programs are available through the Web page http://ncbr.chemi.muni.cz/~n19n/eem_abeem.

  15. Automation of Data Traffic Control on DSM Architecture

    NASA Technical Reports Server (NTRS)

    Frumkin, Michael; Jin, Hao-Qiang; Yan, Jerry

    2001-01-01

    The design of distributed shared memory (DSM) computers liberates users from the duty to distribute data across processors and allows for the incremental development of parallel programs using, for example, OpenMP or Java threads. The DSM architecture greatly simplifies the development of parallel programs having good performance on a few processors. However, achieving good program scalability on DSM computers requires that the user understand data flow in the application and use various techniques to avoid data traffic congestion. In this paper we discuss a number of such techniques, including data blocking, data placement, data transposition and page size control, and evaluate their efficiency on the NAS (NASA Advanced Supercomputing) Parallel Benchmarks. We also present a tool which automates the detection of constructs causing data congestion in Fortran array-oriented codes and advises the user on code transformations for improving data traffic in the application.
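
    Of the techniques listed, data blocking is the simplest to show in isolation; a minimal C sketch (illustrative, not from the paper) of tiling a matrix transpose so each block is reused while still cache-resident:

        #include <stdio.h>

        #define N 1024
        #define B 64    /* tile edge, tuned to the cache size */

        static double a[N][N], b[N][N];

        int main(void) {
            /* Blocked transpose: without tiling, the column-order writes
               to b miss the cache on nearly every access; with B x B
               tiles, both operands stay cache-resident inside a tile. */
            for (int ii = 0; ii < N; ii += B)
                for (int jj = 0; jj < N; jj += B)
                    for (int i = ii; i < ii + B; i++)
                        for (int j = jj; j < jj + B; j++)
                            b[j][i] = a[i][j];
            printf("b[3][5] = %f\n", b[3][5]);
            return 0;
        }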

  16. Developing Information Power Grid Based Algorithms and Software

    NASA Technical Reports Server (NTRS)

    Dongarra, Jack

    1998-01-01

    This exploratory study initiated our effort to understand performance modeling on parallel systems. The basic goal of performance modeling is to understand and predict the performance of a computer program or set of programs on a computer system. Performance modeling has numerous applications, including evaluation of algorithms, optimization of code implementations, parallel library development, comparison of system architectures, parallel system design, and procurement of new systems. Our work lays the basis for the construction of parallel libraries that allow for the reconstruction of application codes on several distinct architectures so as to assure performance portability. Following our strategy, once the requirements of applications are well understood, one can then construct a library in a layered fashion. The top level of this library will consist of architecture-independent geometric, numerical, and symbolic algorithms that are needed by the sample of applications. These routines should be written in a language that is portable across the targeted architectures.

  17. Discrete sensitivity derivatives of the Navier-Stokes equations with a parallel Krylov solver

    NASA Technical Reports Server (NTRS)

    Ajmani, Kumud; Taylor, Arthur C., III

    1994-01-01

    This paper solves an 'incremental' form of the sensitivity equations derived by differentiating the discretized thin-layer Navier Stokes equations with respect to certain design variables of interest. The equations are solved with a parallel, preconditioned Generalized Minimal RESidual (GMRES) solver on a distributed-memory architecture. The 'serial' sensitivity analysis code is parallelized by using the Single Program Multiple Data (SPMD) programming model, domain decomposition techniques, and message-passing tools. Sensitivity derivatives are computed for low and high Reynolds number flows over a NACA 1406 airfoil on a 32-processor Intel Hypercube, and found to be identical to those computed on a single-processor Cray Y-MP. It is estimated that the parallel sensitivity analysis code has to be run on 40-50 processors of the Intel Hypercube in order to match the single-processor processing time of a Cray Y-MP.

  18. Memory-based frame synchronizer. [for digital communication systems

    NASA Technical Reports Server (NTRS)

    Stattel, R. J.; Niswander, J. K. (Inventor)

    1981-01-01

    A frame synchronizer for use in digital communications systems wherein data formats can be easily and dynamically changed is described. The use of memory array elements provides increased flexibility in format selection and sync word selection, in addition to real-time reconfiguration ability. The frame synchronizer comprises a serial-to-parallel converter which converts a serial input data stream to a constantly changing parallel data output. This parallel data output is supplied to programmable sync word recognizers, each consisting of a multiplexer and a random access memory (RAM). The multiplexer is connected to both the parallel data output and an address bus which may be connected to a microprocessor or computer for purposes of programming the sync word recognizer. The RAM is used as an associative memory or decoder and is programmed to identify a specific sync word. Additional programmable RAMs are used as counter decoders to define word bit length, frame word length, and paragraph frame length.

  19. Parallel algorithms for modeling flow in permeable media. Annual report, February 15, 1995 - February 14, 1996

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    G.A. Pope; K. Sephernoori; D.C. McKinney

    1996-03-15

    This report describes the application of distributed-memory parallel programming techniques to a compositional simulator called UTCHEM. The University of Texas Chemical Flooding reservoir simulator (UTCHEM) is a general-purpose vectorized chemical flooding simulator that models the transport of chemical species in three-dimensional, multiphase flow through permeable media. The parallel version of UTCHEM addresses solving large-scale problems by reducing the amount of time that is required to obtain the solution as well as providing a flexible and portable programming environment. In this work, the original parallel version of UTCHEM was modified and ported to the CRAY T3D and CRAY T3E, distributed-memory, multiprocessor computers, using CRAY-PVM as the interprocessor communication library. Also, the data communication routines were modified such that portability of the original code across different computer architectures was made possible.

  20. Parallel/distributed direct method for solving linear systems

    NASA Technical Reports Server (NTRS)

    Lin, Avi

    1990-01-01

    A new family of parallel schemes for directly solving linear systems is presented and analyzed. It is shown that these schemes exhibit near-optimal performance and enjoy several important features: (1) for large enough linear systems, the design of the appropriate parallel algorithm is insensitive to the number of processors, as performance grows monotonically with them; (2) the schemes are especially good for large matrices, with dimensions large relative to the number of processors in the system; (3) they can be used in both distributed parallel computing environments and tightly coupled parallel computing systems; and (4) this set of algorithms can be mapped onto any parallel architecture without any major programming difficulties or algorithmic changes.

  1. Method, systems, and computer program products for implementing function-parallel network firewall

    DOEpatents

    Fulp, Errin W [Winston-Salem, NC; Farley, Ryan J [Winston-Salem, NC

    2011-10-11

    Methods, systems, and computer program products for providing function-parallel firewalls are disclosed. According to one aspect, a function-parallel firewall includes a first firewall node for filtering received packets using a first portion of a rule set including a plurality of rules. The first portion includes less than all of the rules in the rule set. At least one second firewall node filters packets using a second portion of the rule set. The second portion includes at least one rule in the rule set that is not present in the first portion. The first and second portions together include all of the rules in the rule set.

  2. A computer program for converting rectangular coordinates to latitude-longitude coordinates

    USGS Publications Warehouse

    Rutledge, A.T.

    1989-01-01

    A computer program was developed for converting the coordinates of any rectangular grid on a map to coordinates on a grid that is parallel to lines of equal latitude and longitude. Using this program in conjunction with groundwater flow models, the user can extract data and results from models with varying grid orientations and place these data into a grid structure that is oriented parallel to lines of equal latitude and longitude. All cells in the rectangular grid must have equal dimensions, and all cells in the latitude-longitude grid measure one minute by one minute. This program is applicable if the map used shows lines of equal latitude as arcs and lines of equal longitude as straight lines, and it assumes that the Earth's surface can be approximated as a sphere. The program user enters the row number, column number, and latitude and longitude of the midpoint of the cell for three test cells on the rectangular grid. The latitude and longitude of the boundaries of the rectangular grid also are entered. By solving sets of simultaneous linear equations, the program calculates coefficients that are used for making the conversion. As an option in the program, the user may build a groundwater model file based on a grid that is parallel to lines of equal latitude and longitude. The program reads a data file based on the rectangular coordinates and automatically forms the new data file. (USGS)
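
    The three-test-cell step amounts to fitting six coefficients from two 3x3 linear systems; the sketch below illustrates the latitude half with an affine map and made-up control points (the actual program accounts for the spherical geometry described above):

        #include <stdio.h>

        /* Determinant of a 3x3 matrix. */
        static double det3(double m[3][3]) {
            return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                 - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                 + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
        }

        /* Solve the 3x3 system A x = y by Cramer's rule. */
        static void solve3(double A[3][3], double y[3], double x[3]) {
            double d = det3(A);
            for (int c = 0; c < 3; c++) {
                double M[3][3];
                for (int i = 0; i < 3; i++)
                    for (int j = 0; j < 3; j++)
                        M[i][j] = (j == c) ? y[i] : A[i][j];
                x[c] = det3(M) / d;
            }
        }

        int main(void) {
            /* Three test cells (row, col) and their known latitudes;
               the values are made up for illustration. */
            double row[3] = {1, 1, 40}, col[3] = {1, 60, 30};
            double lat[3] = {41.00, 40.95, 40.20};
            double A[3][3], c[3];
            for (int i = 0; i < 3; i++) {
                A[i][0] = 1.0; A[i][1] = row[i]; A[i][2] = col[i];
            }
            solve3(A, lat, c);   /* lat = c0 + c1*row + c2*col */
            printf("lat(20, 20) = %.4f\n", c[0] + 20 * c[1] + 20 * c[2]);
            return 0;
        }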

  3. A distributed version of the NASA Engine Performance Program

    NASA Technical Reports Server (NTRS)

    Cours, Jeffrey T.; Curlett, Brian P.

    1993-01-01

    Distributed NEPP, a version of the NASA Engine Performance Program, uses the original NEPP code but executes it in a distributed computer environment. Multiple workstations connected by a network increase the program's speed and, more importantly, the complexity of the cases it can handle in a reasonable time. Distributed NEPP uses the public domain software package, called Parallel Virtual Machine, allowing it to execute on clusters of machines containing many different architectures. It includes the capability to link with other computers, allowing them to process NEPP jobs in parallel. This paper discusses the design issues and granularity considerations that entered into programming Distributed NEPP and presents the results of timing runs.

  4. Performance of the Heavy Flavor Tracker (HFT) detector in star experiment at RHIC

    NASA Astrophysics Data System (ADS)

    Alruwaili, Manal

    With technology advancing, processor counts are becoming massive; today's supercomputer processing will be available on desktops in the next decade. For mass-scale application software development on the massively parallel computing available on desktops, existing popular languages with large libraries have to be augmented with new constructs and paradigms that exploit massively parallel computing and distributed memory models while retaining user-friendliness. Currently available object-oriented languages for massively parallel computing, such as Chapel, X10 and UPC++, exploit distributed computing, data-parallel computing and thread-level parallelism at the process level in the PGAS (Partitioned Global Address Space) memory model. However, they do not incorporate: 1) extensions for object distribution that exploit the PGAS model; 2) the flexibility to migrate or clone an object between places to exploit load balancing; or 3) the programming paradigms that result from integrating data- and thread-level parallelism with object distribution. In the proposed thesis, I compare different languages in the PGAS model; propose new constructs that extend C++ with object distribution, object migration and object cloning; and integrate PGAS-based process constructs with these extensions on distributed objects. A new paradigm, MIDD (Multiple Invocation Distributed Data), is also presented, in which different copies of the same class can be invoked and work on different elements of distributed data concurrently using remote method invocations. I present the new constructs, their grammar and their behavior. The new constructs are explained using simple programs that utilize them.

  5. Endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface of a parallel computer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Archer, Charles J; Blocksome, Michael A; Cernohous, Bob R

    Methods, apparatuses, and computer program products for endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface (`PAMI`) of a parallel computer are provided. Embodiments include establishing by a parallel application a data communications geometry, the geometry specifying a set of endpoints that are used in collective operations of the PAMI, including associating with the geometry a list of collective algorithms valid for use with the endpoints of the geometry. Embodiments also include registering in each endpoint in the geometry a dispatch callback function for a collective operation and executing without blocking, through a single one of the endpoints in the geometry, an instruction for the collective operation.

  6. Using the Parallel Computing Toolbox with MATLAB on the Peregrine System

    Science.gov Websites

    tic
    parpool;   % start a pool of workers (opening lines reconstructed; the source snippet begins mid-script)
    fprintf('Starting the parallel pool took %g seconds.\n', toc)
    % "single program multiple data"
    spmd
        fprintf('Worker %d says Hello World!\n', labindex)
    end
    delete(gcp); % close the parallel pool
    exit

    To run the script on a compute node, create the file helloWorld.sub:

    #!/bin/bash
    #PBS -l walltime=05:00
    #PBS -l nodes=1
    #PBS -N

  7. Address tracing for parallel machines

    NASA Technical Reports Server (NTRS)

    Stunkel, Craig B.; Janssens, Bob; Fuchs, W. Kent

    1991-01-01

    Recently implemented parallel system address-tracing methods based on several metrics are surveyed. The issues specific to collection of traces for both shared and distributed memory parallel computers are highlighted. Five general categories of address-trace collection methods are examined: hardware-captured, interrupt-based, simulation-based, altered microcode-based, and instrumented program-based traces. The problems unique to shared memory and distributed memory multiprocessors are examined separately.

  8. By Hand or Not By-Hand: A Case Study of Alternative Approaches to Parallelize CFD Applications

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Bailey, David (Technical Monitor)

    1997-01-01

    While parallel processing promises to speed up applications by several orders of magnitude, the performance achieved still depends upon several factors, including the multiprocessor architecture, system software, data distribution and alignment, as well as the methods used for partitioning the application and mapping its components onto the architecture. The existence of the Gordon Bell Prize given out at Supercomputing every year suggests that while good performance can be attained for real applications on general-purpose multiprocessors, the large investment in manpower and time still has to be repeated for each application-machine combination. As applications and machine architectures become more complex, the cost and time delays of obtaining performance by hand will become prohibitive. Computer users today can turn to three possible avenues for help: parallel libraries, parallel languages and compilers, and interactive parallelization tools. The success of these methodologies, in turn, depends on proper application of data dependency analysis, program structure recognition and transformation, and performance prediction, as well as exploitation of user-supplied knowledge. NASA has been developing multidisciplinary applications on highly parallel architectures under the High Performance Computing and Communications Program. Over the past six years, transitions of the underlying hardware and system software have forced the scientists to spend a large effort to migrate and recode their applications. Various attempts to exploit software tools to automate the parallelization process have not produced favorable results. In this paper, we report our most recent experience with CAPTOOL, a package developed at Greenwich University. We have chosen CAPTOOL for three reasons: 1. CAPTOOL accepts a FORTRAN 77 program as input. This suggests its potential applicability to a large collection of legacy codes currently in use. 2. CAPTOOL employs domain decomposition to obtain parallelism. Although the fact that not all kinds of parallelism are handled may seem unappealing, many NASA applications in computational aerosciences as well as earth and space sciences are amenable to domain decomposition. 3. CAPTOOL generates code for a large variety of environments employed across NASA centers: MPI/PVM on networks of workstations to the IBM SP2 and CRAY T3D.

  9. MLP: A Parallel Programming Alternative to MPI for New Shared Memory Parallel Systems

    NASA Technical Reports Server (NTRS)

    Taft, James R.

    1999-01-01

    Recent developments at the NASA AMES Research Center's NAS Division have demonstrated that the new generation of NUMA based Symmetric Multi-Processing systems (SMPs), such as the Silicon Graphics Origin 2000, can successfully execute legacy vector oriented CFD production codes at sustained rates far exceeding processing rates possible on dedicated 16 CPU Cray C90 systems. This high level of performance is achieved via shared memory based Multi-Level Parallelism (MLP). This programming approach, developed at NAS and outlined below, is distinct from the message passing paradigm of MPI. It offers parallelism at both the fine and coarse grained level, with communication latencies that are approximately 50-100 times lower than typical MPI implementations on the same platform. Such latency reductions offer the promise of performance scaling to very large CPU counts. The method draws on, but is also distinct from, the newly defined OpenMP specification, which uses compiler directives to support a limited subset of multi-level parallel operations. The NAS MLP method is general, and applicable to a large class of NASA CFD codes.

  10. Verification of Electromagnetic Physics Models for Parallel Computing Architectures in the GeantV Project

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Amadio, G.; et al.

    An intensive R&D and programming effort is required to accomplish new challenges posed by future experimental high-energy particle physics (HEP) programs. The GeantV project aims to narrow the gap between the performance of the existing HEP detector simulation software and the ideal performance achievable, exploiting latest advances in computing technology. The project has developed a particle detector simulation prototype capable of transporting in parallel particles in complex geometries exploiting instruction-level microparallelism (SIMD and SIMT), task-level parallelism (multithreading) and high-level parallelism (MPI), leveraging both the multi-core and the many-core opportunities. We present preliminary verification results concerning the electromagnetic (EM) physics models developed for parallel computing architectures within the GeantV project. In order to exploit the potential of vectorization and accelerators and to make the physics models effectively parallelizable, advanced sampling techniques have been implemented and tested. In this paper we introduce a set of automated statistical tests in order to verify the vectorized models by checking their consistency with the corresponding Geant4 models and to validate them against experimental data.

  11. Cache Locality Optimization for Recursive Programs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lifflander, Jonathan; Krishnamoorthy, Sriram

    We present an approach to optimize the cache locality for recursive programs by dynamically splicing--recursively interleaving--the execution of distinct function invocations. By utilizing data effect annotations, we identify concurrency and data reuse opportunities across function invocations and interleave them to reduce reuse distance. We present algorithms that efficiently track effects in recursive programs, detect interference and dependencies, and interleave execution of function invocations using user-level (non-kernel) lightweight threads. To enable multi-core execution, a program is parallelized using a nested fork/join programming model. Our cache optimization strategy is designed to work in the context of a random work stealing scheduler. We present an implementation using the MIT Cilk framework that demonstrates significant improvements in sequential and parallel performance, competitive with a state-of-the-art compile-time optimizer for loop programs and a domain-specific optimizer for stencil programs.
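
    The nested fork/join shape mentioned above is easy to see in a small task-parallel sketch; here OpenMP tasks stand in for the Cilk runtime actually used in the paper (illustrative only):

        #include <stdio.h>

        /* Naive recursive Fibonacci: each call forks two child tasks
           and joins them, giving the nested fork/join structure. */
        static long fib(int n) {
            if (n < 2) return n;
            long x, y;
            #pragma omp task shared(x)
            x = fib(n - 1);
            #pragma omp task shared(y)
            y = fib(n - 2);
            #pragma omp taskwait    /* join both children */
            return x + y;
        }

        int main(void) {
            long r;
            #pragma omp parallel
            #pragma omp single      /* one task tree, many workers */
            r = fib(20);
            printf("fib(20) = %ld\n", r);
            return 0;
        }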

  12. A Comparison of Three Programming Models for Adaptive Applications

    NASA Technical Reports Server (NTRS)

    Shan, Hong-Zhang; Singh, Jaswinder Pal; Oliker, Leonid; Biswas, Rupak; Kwak, Dochan (Technical Monitor)

    2000-01-01

    We study the performance and programming effort for two major classes of adaptive applications under three leading parallel programming models. We find that all three models can achieve scalable performance on state-of-the-art multiprocessor machines. The basic parallel algorithms needed for different programming models to deliver their best performance are similar, but the implementations differ greatly, far beyond the fact of using explicit messages versus implicit loads/stores. Compared with MPI and SHMEM, CC-SAS (cache-coherent shared address space) provides substantial ease of programming at the conceptual and program orchestration levels, which often leads to performance gains. However, it may also suffer from the poor spatial locality of physically distributed shared data on large numbers of processors. Our CC-SAS implementation of the PARMETIS partitioner itself runs faster than in the other two programming models, and generates more balanced results for our application.

  13. Implementing a Parallel Image Edge Detection Algorithm Based on the Otsu-Canny Operator on the Hadoop Platform.

    PubMed

    Cao, Jianfang; Chen, Lichao; Wang, Min; Tian, Yun

    2018-01-01

    The Canny operator is widely used to detect edges in images. However, as the size of the image dataset increases, the edge detection performance of the Canny operator decreases and its runtime becomes excessive. To improve the runtime and edge detection performance of the Canny operator, in this paper, we propose a parallel design and implementation for an Otsu-optimized Canny operator using a MapReduce parallel programming model that runs on the Hadoop platform. The Otsu algorithm is used to optimize the Canny operator's dual threshold and improve the edge detection performance, while the MapReduce parallel programming model facilitates parallel processing for the Canny operator to solve the processing speed and communication cost problems that occur when the Canny edge detection algorithm is applied to big data. For the experiments, we constructed datasets of different scales from the Pascal VOC2012 image database. The proposed parallel Otsu-Canny edge detection algorithm performs better than other traditional edge detection algorithms. The parallel approach reduced the running time by approximately 67.2% on a Hadoop cluster architecture consisting of 5 nodes with a dataset of 60,000 images. Overall, our approach speeds up the system by approximately 3.4 times when processing large-scale datasets, which demonstrates the obvious superiority of our method. The proposed algorithm demonstrates both better edge detection performance and improved time performance.

  14. Parallel-Processing Software for Correlating Stereo Images

    NASA Technical Reports Server (NTRS)

    Klimeck, Gerhard; Deen, Robert; Mcauley, Michael; DeJong, Eric

    2007-01-01

    A computer program implements parallel-processing algorithms for correlating images of terrain acquired by stereoscopic pairs of digital stereo cameras on an exploratory robotic vehicle (e.g., a Mars rover). Such correlations are used to create three-dimensional computational models of the terrain for navigation. In this program, the scene viewed by the cameras is segmented into subimages. Each subimage is assigned to one of a number of central processing units (CPUs) operating simultaneously.

  15. Optimization by nonhierarchical asynchronous decomposition

    NASA Technical Reports Server (NTRS)

    Shankar, Jayashree; Ribbens, Calvin J.; Haftka, Raphael T.; Watson, Layne T.

    1992-01-01

    Large scale optimization problems are tractable only if they are somehow decomposed. Hierarchical decompositions are inappropriate for some types of problems and do not parallelize well. Sobieszczanski-Sobieski has proposed a nonhierarchical decomposition strategy for nonlinear constrained optimization that is naturally parallel. Despite some successes on engineering problems, the algorithm as originally proposed fails on simple two dimensional quadratic programs. The algorithm is carefully analyzed for quadratic programs, and a number of modifications are suggested to improve its robustness.

  16. Final Report: Correctness Tools for Petascale Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mellor-Crummey, John

    2014-10-27

    In the course of developing parallel programs for leadership computing systems, subtle programming errors often arise that are extremely difficult to diagnose without tools. To meet this challenge, the University of Maryland, the University of Wisconsin-Madison, and Rice University worked to develop lightweight tools to help code developers pinpoint a variety of program correctness errors that plague parallel scientific codes. The aim of this project was to develop software tools that help diagnose program errors including memory leaks, memory access errors, round-off errors, and data races. Research at Rice University focused on developing algorithms and data structures to support efficient monitoring of multithreaded programs for memory access errors and data races. This is a final report about research and development work at Rice University as part of this project.

  17. ng: What next-generation languages can teach us about HENP frameworks in the manycore era

    NASA Astrophysics Data System (ADS)

    Binet, Sébastien

    2011-12-01

    Current High Energy and Nuclear Physics (HENP) frameworks were written before multicore systems became widely deployed. A 'single-thread' execution model naturally emerged from that environment; however, this no longer fits the processing model at the dawn of the manycore era. Although previous work focused on minimizing the changes to be applied to the LHC frameworks (because of the data-taking phase) while still trying to reap the benefits of the parallel-enhanced CPU architectures, this paper explores what new languages could bring to the design of next-generation frameworks. Parallel programming is still in an intensive phase of R&D and no silver bullet exists despite the 30+ years of literature on the subject. Yet several parallel programming styles have emerged: actors, message passing, communicating sequential processes, task-based programming, data flow programming, ... to name a few. We present prototyping work on a next-generation framework in new and expressive languages (Python and Go) to investigate how code clarity and robustness are affected and what the downsides of using languages younger than FORTRAN/C/C++ are.

  18. DIALIGN P: fast pair-wise and multiple sequence alignment using parallel processors.

    PubMed

    Schmollinger, Martin; Nieselt, Kay; Kaufmann, Michael; Morgenstern, Burkhard

    2004-09-09

    Parallel computing is frequently used to speed up computationally expensive tasks in Bioinformatics. Herein, a parallel version of the multi-alignment program DIALIGN is introduced. We propose two ways of dividing the program into independent sub-routines that can be run on different processors: (a) pair-wise sequence alignments that are used as a first step to multiple alignment account for most of the CPU time in DIALIGN. Since alignments of different sequence pairs are completely independent of each other, they can be distributed to multiple processors without any effect on the resulting output alignments. (b) For alignments of large genomic sequences, we use a heuristic that splits sequences into sub-sequences based on a previously introduced anchored alignment procedure. For our test sequences, this combined approach reduces the program running time of DIALIGN by up to 97%. By distributing sub-routines to multiple processors, the running time of DIALIGN can be crucially improved. With these improvements, it is possible to apply the program in large-scale genomics and proteomics projects that were previously beyond its scope.
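
    Strategy (a) parallelizes naturally because each sequence pair is an independent job; a minimal OpenMP sketch of the pair distribution, with align_pair as a hypothetical stand-in for the DIALIGN pairwise routine:

        #include <stdio.h>

        #define NSEQ 8

        /* Hypothetical stand-in for one pairwise alignment. */
        static void align_pair(int i, int j) {
            printf("aligning sequences %d and %d\n", i, j);
        }

        int main(void) {
            /* All pairs (i, j) with i < j are independent jobs, so they
               can be handed to threads in any order. */
            #pragma omp parallel for collapse(2) schedule(dynamic)
            for (int i = 0; i < NSEQ; i++)
                for (int j = 0; j < NSEQ; j++)
                    if (i < j)
                        align_pair(i, j);
            return 0;
        }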

  19. A Programming Framework for Scientific Applications on CPU-GPU Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Owens, John

    2013-03-24

    At a high level, my research interests center around designing, programming, and evaluating computer systems that use new approaches to solve interesting problems. The rapid change of technology allows a variety of different architectural approaches to computationally difficult problems, and a constantly shifting set of constraints and trends makes the solutions to these problems both challenging and interesting. One of the most important recent trends in computing has been a move to commodity parallel architectures. This sea change is motivated by the industry's inability to continue to profitably increase performance on a single processor and the consequent move to multiple parallel processors. In the period of review, my most significant work has been leading a research group looking at the use of the graphics processing unit (GPU) as a general-purpose processor. GPUs can potentially deliver superior performance on a broad range of problems compared to their CPU counterparts, but effectively mapping complex applications to a parallel programming model with an emerging programming environment is a significant and important research problem.

  20. Transputer parallel processing at NASA Lewis Research Center

    NASA Technical Reports Server (NTRS)

    Ellis, Graham K.

    1989-01-01

    The transputer parallel processing lab at NASA Lewis Research Center (LeRC) consists of 69 processors (transputers) that can be connected into various networks for use in general purpose concurrent processing applications. The main goal of the lab is to develop concurrent scientific and engineering application programs that will take advantage of the computational speed increases available on a parallel processor over the traditional sequential processor. Current research involves the development of basic programming tools. These tools will help standardize program interfaces to specific hardware by providing a set of common libraries for applications programmers. The thrust of the current effort is in developing a set of tools for graphics rendering/animation. The applications programmer currently has two options for on-screen plotting. One option can be used for static graphics displays and the other can be used for animated motion. The option for static display involves the use of 2-D graphics primitives that can be called from within an application program. These routines perform the standard 2-D geometric graphics operations in real-coordinate space as well as allowing multiple windows on a single screen.

  1. A mathematical model to predict the effect of heat recovery on the wastewater temperature in sewers.

    PubMed

    Dürrenmatt, David J; Wanner, Oskar

    2014-01-01

    Raw wastewater contains considerable amounts of energy that can be recovered by means of a heat pump and a heat exchanger installed in the sewer. The technique is well established, and there are approximately 50 facilities in Switzerland, many of which have been successfully using this technique for years. The planning of new facilities requires predictions of the effect of heat recovery on the wastewater temperature in the sewer because altered wastewater temperatures may cause problems for the biological processes used in wastewater treatment plants and receiving waters. A mathematical model is presented that calculates the discharge in a sewer conduit and the spatial profiles and dynamics of the temperature in the wastewater, sewer headspace, pipe, and surrounding soil. The model was implemented in the simulation program TEMPEST and was used to evaluate measured time series of discharge and temperatures. It was found that the model adequately reproduces the measured data and that the temperature and thermal conductivity of the soil and the distance between the sewer pipe and undisturbed soil are the most sensitive model parameters. The temporary storage of heat in the pipe wall and the exchange of heat between wastewater and the pipe wall are the most important processes for heat transfer. The model can be used as a tool to determine the optimal site for heat recovery and the maximal amount of extractable heat. Copyright © 2013 Elsevier Ltd. All rights reserved.
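
    Independent of the full sewer model, the first-order effect of heat recovery follows from a simple energy balance, dT = P / (rho * Q * cp); the sketch below uses illustrative numbers, not TEMPEST parameters:

        #include <stdio.h>

        int main(void) {
            /* Energy balance: extracting power P from a discharge Q
               lowers the downstream temperature by P / (rho * Q * cp).
               All values are illustrative. */
            double P   = 200e3;     /* heat extracted, W */
            double Q   = 0.05;      /* wastewater discharge, m^3/s */
            double rho = 1000.0;    /* density of wastewater, kg/m^3 */
            double cp  = 4186.0;    /* specific heat, J/(kg K) */
            double dT  = P / (rho * Q * cp);
            printf("temperature drop: %.2f K\n", dT);   /* about 0.96 K */
            return 0;
        }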

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chrisochoides, N.; Sukup, F.

    In this paper we present a parallel implementation of the Bowyer-Watson (BW) algorithm using the task-parallel programming model. The BW algorithm constitutes an ideal mesh refinement strategy for implementing a large class of unstructured mesh generation techniques on both sequential and parallel computers, because it avoids the need for global mesh refinement. Its implementation on distributed memory multicomputers using the traditional data-parallel model has proven very inefficient due to the excessive synchronization needed among processors. In this paper we demonstrate that with the task-parallel model we can tolerate the synchronization costs inherent in data-parallel methods by exploiting concurrency at the processor level. Our preliminary performance data indicate that the task-parallel approach: (i) is almost four times faster than existing data-parallel methods, (ii) scales linearly, and (iii) introduces minimal overhead compared to the "best" sequential implementation of the BW algorithm.
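
    The geometric core of BW refinement is the incircle predicate, which identifies the cavity of triangles whose circumcircles contain a newly inserted point; cavities that do not interact can be retriangulated concurrently, which is what the task-parallel model exploits. The C/OpenMP sketch below shows the predicate and a task skeleton over insertion points; it is a conceptual illustration that omits the shared-mesh conflict handling a real implementation needs, and is not the authors' code.

      #include <stdio.h>
      #include <omp.h>

      typedef struct { double x, y; } Pt;

      /* Incircle predicate: > 0 when d lies inside the circumcircle of
       * the counterclockwise triangle (a, b, c); standard 3x3 determinant. */
      double incircle(Pt a, Pt b, Pt c, Pt d) {
          double ax = a.x - d.x, ay = a.y - d.y;
          double bx = b.x - d.x, by = b.y - d.y;
          double cx = c.x - d.x, cy = c.y - d.y;
          return (ax*ax + ay*ay) * (bx*cy - by*cx)
               - (bx*bx + by*by) * (ax*cy - ay*cx)
               + (cx*cx + cy*cy) * (ax*by - ay*bx);
      }

      int main(void) {
          Pt a = {0,0}, b = {1,0}, c = {0,1};       /* one CCW triangle */
          Pt pts[4] = {{.2,.2},{.9,.9},{.5,.1},{2,2}};

          /* Task-parallel skeleton: each insertion point whose cavity is
           * independent becomes a task (conflict detection omitted). */
          #pragma omp parallel
          #pragma omp single
          for (int i = 0; i < 4; i++) {
              #pragma omp task firstprivate(i)
              {
                  int inside = incircle(a, b, c, pts[i]) > 0.0;
                  printf("point %d: %s cavity\n", i, inside ? "in" : "not in");
                  /* ...retriangulate the cavity here... */
              }
          }
          return 0;
      }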

  3. Simulation of Hanford Tank 241-C-106 Waste Release into Tank 241-AY-102

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    KP Recknagle; Y Onishi

    Waste stored in Hanford single-shell Tank 241-C-106 will be sluiced with a supernatant liquid from double-shell Tank 241-AY-102 (AY-102) at the U.S. Department of Energy's Hanford Site in Eastern Washington. The resulting slurry, containing up to 30 wt% solids, will then be transferred to Tank AY-102. During the sluicing process, it is important to know the mass of the solids being transferred into AY-102. One of the primary instruments used to measure solids transfer is an E+ densitometer located near the periphery of the tank at riser 15S. This study was undertaken to assess how well a densitometer measurement could represent the total mass of solids transferred if a uniform lateral distribution was assumed. The study evaluated the C-106 slurry mixing and accumulation in Tank AY-102 for the following five cases: Case 1, 3 wt% slurry in 6.4-m AY-102 waste; Case 2, 3 wt% slurry in 4.3-m AY-102 waste; Case 3, 30 wt% slurry in 6.4-m AY-102 waste; Case 4, 30 wt% slurry in 4.3-m AY-102 waste; Case 5, 30 wt% slurry in 5.0-m AY-102 waste. The time-dependent, three-dimensional TEMPEST computer code was used to simulate solid deposition and accumulation during the injection of the C-106 slurry into AY-102 through four injection nozzles. The TEMPEST computer code was applied previously to other Hanford tanks (AP-102, SY-102, AZ-101, SY-101, AY-102, and C-106) to model tank waste mixing with rotating pump jets, gas rollover events, waste transfer from one tank to another, and pump-out retrieval of the sluiced waste. The model results indicate that the solid depth accumulated at the densitometer is within 5% of the average depth accumulation. Thus the reading of the densitometer is expected to represent the total mass of the transferred solids reasonably well.

  4. Object-Oriented Implementation of the NAS Parallel Benchmarks using Charm++

    NASA Technical Reports Server (NTRS)

    Krishnan, Sanjeev; Bhandarkar, Milind; Kale, Laxmikant V.

    1996-01-01

    This report describes experiences with implementing the NAS Computational Fluid Dynamics benchmarks using a parallel object-oriented language, Charm++. Our main objective in implementing the NAS CFD kernel benchmarks was to develop a code that could be used to easily experiment with different domain decomposition strategies and dynamic load balancing. We also wished to leverage the object-orientation provided by the Charm++ parallel object-oriented language, to develop reusable abstractions that would simplify the process of developing parallel applications. We first describe the Charm++ parallel programming model and the parallel object array abstraction, then go into detail about each of the Scalar Pentadiagonal (SP) and Lower/Upper Triangular (LU) benchmarks, along with performance results. Finally we conclude with an evaluation of the methodology used.

  5. Analysis of the study skills of undergraduate pharmacy students of the University of Zambia School of Medicine.

    PubMed

    Ezeala, Christian Chinyere; Siyanga, Nalucha

    2015-01-01

    This study aimed to compare the study skills of two groups of undergraduate pharmacy students in the School of Medicine, University of Zambia, using the Study Skills Assessment Questionnaire (SSAQ), with the goal of analysing students' study skills and identifying factors that affect them. A questionnaire was distributed to 67 participants from both programs using stratified random sampling. Completed questionnaires were rated according to participants' study skills. The total scores and scores within subscales were analysed and compared quantitatively. Questionnaires were distributed to 37 students in the regular program and to 30 students in the parallel program. The response rate was 100%. Students had moderate to good study skills: 22 respondents (32.8%) showed good study skills, while 45 respondents (67.2%) were found to have moderate study skills. Students in the parallel program demonstrated significantly better study skills (mean SSAQ score 185.4±14.5), particularly in time management and writing, than the students in the regular program (mean SSAQ score 175±25.4; P<0.05). No significant differences were found according to age, gender, residential or marital status, or level of study. The students in the parallel program had better time management and writing skills, probably due to their prior work experience. More intensive training in time management and writing skills is needed for students in the regular program.

  6. The NAS parallel benchmarks

    NASA Technical Reports Server (NTRS)

    Bailey, D. H.; Barszcz, E.; Barton, J. T.; Carter, R. L.; Lasinski, T. A.; Browning, D. S.; Dagum, L.; Fatoohi, R. A.; Frederickson, P. O.; Schreiber, R. S.

    1991-01-01

    A new set of benchmarks has been developed for the performance evaluation of highly parallel supercomputers in the framework of the NASA Ames Numerical Aerodynamic Simulation (NAS) Program. These consist of five 'parallel kernel' benchmarks and three 'simulated application' benchmarks. Together they mimic the computation and data movement characteristics of large-scale computational fluid dynamics applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification: all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.

  7. High-energy physics software parallelization using database techniques

    NASA Astrophysics Data System (ADS)

    Argante, E.; van der Stok, P. D. V.; Willers, I.

    1997-02-01

    A programming model for software parallelization, called CoCa, is introduced that copes with problems caused by typical features of high-energy physics software. By basing CoCa on the database transaction paradigm, the complexity induced by the parallelization is largely transparent to the programmer, resulting in a higher level of abstraction than native message passing software. CoCa is implemented on a Meiko CS-2 and on a SUN SPARCcenter 2000 parallel computer. On the CS-2, the performance is comparable to that of native PVM and MPI.
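
    The transaction idea can be pictured as workers atomically claiming an event from a shared queue, processing it, and committing the result, so that aborted work can simply be retried. The C/pthreads sketch below shows this claim-process-commit pattern in miniature; the structure and names are illustrative assumptions, not the CoCa interface.

      #include <stdio.h>
      #include <pthread.h>

      #define NEVENTS 8

      static int next_event = 0;                 /* shared work-queue index */
      static long results[NEVENTS];
      static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

      /* Worker: atomically claim an event, process it, commit the result.
       * In a transaction-based model an aborted claim would be retried. */
      static void *worker(void *arg) {
          (void)arg;
          for (;;) {
              pthread_mutex_lock(&lock);         /* begin "transaction" */
              int ev = next_event < NEVENTS ? next_event++ : -1;
              pthread_mutex_unlock(&lock);
              if (ev < 0) return NULL;           /* queue drained */
              results[ev] = (long)ev * ev;       /* stand-in for event processing */
          }
      }

      int main(void) {
          pthread_t t[4];
          for (int i = 0; i < 4; i++) pthread_create(&t[i], NULL, worker, NULL);
          for (int i = 0; i < 4; i++) pthread_join(t[i], NULL);
          for (int i = 0; i < NEVENTS; i++) printf("event %d -> %ld\n", i, results[i]);
          return 0;
      }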

  8. Development, Verification and Validation of Parallel, Scalable Volume of Fluid CFD Program for Propulsion Applications

    NASA Technical Reports Server (NTRS)

    West, Jeff; Yang, H. Q.

    2014-01-01

    There are many instances involving liquid/gas interfaces and their dynamics in the design of liquid-engine-powered rockets such as the Space Launch System (SLS). Some examples of these applications are: propellant tank draining and slosh, subcritical-condition injector analysis for gas generators, preburners and thrust chambers, water deluge mitigation for launch-induced environments, and even solid rocket motor liquid slag dynamics. Commercially available CFD programs simulating gas/liquid interfaces using the Volume of Fluid approach are currently limited in their parallel scalability. In 2010, for instance, an internal NASA/MSFC review of three commercial tools revealed that parallel scalability was seriously compromised at 8 CPUs and no additional speedup was possible after 32 CPUs. Other non-interface CFD applications at the time were demonstrating useful parallel scalability up to 4,096 processors or more. Based on this review, NASA/MSFC initiated an effort to implement a Volume of Fluid capability within the unstructured-mesh, pressure-based CFD program Loci-STREAM. After verification was achieved by comparing results to the commercial CFD program CFD-ACE+, and validation by direct comparison with data, Loci-STREAM-VoF is now the production CFD tool for propellant slosh force and slosh damping rate simulations at NASA/MSFC. In these applications, good parallel scalability has been demonstrated for problem sizes of tens of millions of cells and thousands of CPU cores. Ongoing efforts are focused on the application of Loci-STREAM-VoF to predict the transient flow patterns of water on the SLS Mobile Launch Platform, in order to support the phasing of water for launch environment mitigation so that detrimental effects on the vehicle are avoided.
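
    In the Volume of Fluid approach, each cell carries a fluid volume fraction in [0, 1] that is advected with the flow, and the interface lives in cells with fractional values. The C sketch below advects a 1D volume-fraction field with a first-order upwind update; it is a deliberately simplified stand-in for the geometric interface reconstruction used in production codes such as Loci-STREAM-VoF.

      #include <stdio.h>

      #define N 20

      /* 1D first-order upwind advection of a volume-fraction field.
       * Illustrative only: production VoF codes use geometric interface
       * reconstruction to keep the interface sharp. */
      int main(void) {
          double f[N] = {0}, fn[N];
          for (int i = 2; i < 8; i++) f[i] = 1.0;   /* slug of liquid */
          double u = 1.0, dx = 1.0, dt = 0.5;       /* CFL = u*dt/dx = 0.5 */

          for (int step = 0; step < 10; step++) {
              for (int i = 0; i < N; i++) {
                  double upwind = (i > 0) ? f[i-1] : 0.0;  /* zero inflow */
                  fn[i] = f[i] - u * dt / dx * (f[i] - upwind);
                  if (fn[i] < 0.0) fn[i] = 0.0;     /* keep fraction in [0,1] */
                  if (fn[i] > 1.0) fn[i] = 1.0;
              }
              for (int i = 0; i < N; i++) f[i] = fn[i];
          }
          for (int i = 0; i < N; i++) printf("%4.2f ", f[i]);
          printf("\n");
          return 0;
      }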

  9. mm_par2.0: An object-oriented molecular dynamics simulation program parallelized using a hierarchical scheme with MPI and OPENMP

    NASA Astrophysics Data System (ADS)

    Oh, Kwang Jin; Kang, Ji Hoon; Myung, Hun Joo

    2012-02-01

    We have revised the general purpose parallel molecular dynamics simulation program mm_par using object-oriented programming. We parallelized the revised version using a hierarchical scheme in order to utilize more processors for a given system size. Benchmark results are presented here. New version program summary: Program title: mm_par2.0 Catalogue identifier: ADXP_v2_0 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADXP_v2_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC license, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 2 390 858 No. of bytes in distributed program, including test data, etc.: 25 068 310 Distribution format: tar.gz Programming language: C++ Computer: Any system operated by Linux or Unix Operating system: Linux Classification: 7.7 External routines: We provide wrappers for the FFTW [1] and Intel MKL [2] FFT routines; the Numerical Recipes [3] FFT, random number generator, and eigenvalue solver routines; the SPRNG [4] random number generator; the Mersenne Twister [5] random number generator; and a space-filling curve routine. Catalogue identifier of previous version: ADXP_v1_0 Journal reference of previous version: Comput. Phys. Comm. 174 (2006) 560 Does the new version supersede the previous version?: Yes Nature of problem: Structural, thermodynamic, and dynamical properties of fluids and solids from microscopic to mesoscopic scales. Solution method: Molecular dynamics simulation in the NVE, NVT, and NPT ensembles; Langevin dynamics simulation; dissipative particle dynamics simulation. Reasons for new version: First, object-oriented programming has been used, which is known to be open for extension and closed for modification, and to be better for maintenance. Second, version 1.0 was based on the atom decomposition and domain decomposition schemes [6] for parallelization. However, atom decomposition is not popular due to its poor scalability. The domain decomposition scheme scales better, but it is still limited in utilizing the large number of cores on recent petascale computers by the requirement that the domain size be larger than the potential cutoff distance. To go beyond this limitation, a hierarchical parallelization scheme has been adopted in the new version and implemented using MPI [7] and OPENMP [8]. Summary of revisions: (1) Object-oriented programming has been used. (2) A hierarchical parallelization scheme has been adopted. (3) The SPME routine has been fully parallelized with parallel 3D FFT using a volumetric decomposition scheme [9]. K.J.O. thanks Mr. Seung Min Lee for useful discussion on programming and debugging. Running time: Running time depends on the system size and the methods used. For a test system containing a protein (PDB id: 5DHFR) with the CHARMM22 force field [10] and 7023 TIP3P [11] waters in a simulation box of dimensions 62.23 Å×62.23 Å×62.23 Å, the benchmark results are given in Fig. 1. Here the potential cutoff distance was set to 12 Å and the switching function was applied from 10 Å for the force calculation in real space. For the SPME [12] calculation, Kx, Ky, and Kz were set to 64 and the interpolation order was set to 4. For the fast Fourier transform, we used the Intel MKL library. All bonds including hydrogen atoms were constrained using the SHAKE/RATTLE algorithms [13,14]. The code was compiled using Intel compiler version 11.1 and mvapich2 version 1.5. Fig. 2 shows performance gains from the CUDA-enabled version [15] of mm_par for the 5DHFR simulation in water on an Intel Core2Quad 2.83 GHz and a GeForce GTX 580. Even though mm_par2.0 has not yet been ported to the GPU, these data are useful for estimating its expected GPU performance. (Figure captions: Fig. 1, timing results for 1000 MD steps, where 1, 2, 4, and 8 denote the number of OPENMP threads; Fig. 2, timing results for 1000 MD steps from double precision simulation on CPU, single precision simulation on GPU, and double precision simulation on GPU.)
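
    The hierarchical scheme pairs coarse-grained domain decomposition across MPI ranks with fine-grained loop-level parallelism across OPENMP threads inside each rank, so the usable thread count is not limited by the cutoff-distance constraint on domain size. The C skeleton below shows the hybrid pattern; the work loop is a placeholder, not mm_par's actual force kernels.

      #include <stdio.h>
      #include <mpi.h>
      #include <omp.h>

      /* Hybrid MPI+OpenMP skeleton: domains across ranks, threads within.
       * The loop body is a placeholder for a real MD kernel. */
      int main(int argc, char **argv) {
          int provided, rank, nranks;
          MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &nranks);

          int nlocal = 1000000 / nranks;        /* particles owned by this rank */
          double local_energy = 0.0;

          #pragma omp parallel for reduction(+:local_energy)
          for (int i = 0; i < nlocal; i++)
              local_energy += 1e-6 * i;         /* stand-in for pair interactions */

          double total;
          MPI_Reduce(&local_energy, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
          if (rank == 0)
              printf("total energy: %g (%d ranks x %d threads)\n",
                     total, nranks, omp_get_max_threads());
          MPI_Finalize();
          return 0;
      }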

  10. Analysis and selection of optimal function implementations in massively parallel computer

    DOEpatents

    Archer, Charles Jens [Rochester, MN; Peters, Amanda [Rochester, MN; Ratterman, Joseph D [Rochester, MN

    2011-05-31

    An apparatus, program product and method optimize the operation of a parallel computer system by, in part, collecting performance data for a set of implementations of a function capable of being executed on the parallel computer system based upon the execution of the set of implementations under varying input parameters in a plurality of input dimensions. The collected performance data may be used to generate selection program code that is configured to call selected implementations of the function in response to a call to the function under varying input parameters. The collected performance data may be used to perform more detailed analysis to ascertain the comparative performance of the set of implementations of the function under the varying input parameters.
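
    The mechanism amounts to empirical autotuning: time each implementation of a function across the input dimensions of interest, record which one wins where, and dispatch future calls accordingly. The C sketch below emulates this with runtime selection between two summation variants; the variants, the timing method, and the dispatch are illustrative assumptions, not the patented mechanism.

      #include <stdio.h>
      #include <time.h>

      /* Two interchangeable implementations of the same function. */
      static double sum_simple(const double *a, int n) {
          double s = 0.0;
          for (int i = 0; i < n; i++) s += a[i];
          return s;
      }
      static double sum_unrolled(const double *a, int n) {
          double s0 = 0.0, s1 = 0.0;
          int i = 0;
          for (; i + 1 < n; i += 2) { s0 += a[i]; s1 += a[i+1]; }
          if (i < n) s0 += a[i];
          return s0 + s1;
      }

      typedef double (*impl_t)(const double *, int);

      /* Empirically pick the faster implementation for this input size
       * (clock() is coarse; a real autotuner measures more carefully). */
      static impl_t select_impl(const double *a, int n) {
          impl_t impls[2] = { sum_simple, sum_unrolled };
          impl_t best = impls[0];
          double best_t = 1e30;
          for (int k = 0; k < 2; k++) {
              clock_t t0 = clock();
              volatile double r = impls[k](a, n);   /* keep the call live */
              double t = (double)(clock() - t0) / CLOCKS_PER_SEC;
              (void)r;
              if (t < best_t) { best_t = t; best = impls[k]; }
          }
          return best;
      }

      int main(void) {
          static double a[1 << 20];
          for (int i = 0; i < (1 << 20); i++) a[i] = 1.0;
          impl_t f = select_impl(a, 1 << 20);       /* "generated" dispatch */
          printf("sum = %.1f\n", f(a, 1 << 20));
          return 0;
      }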

  11. The Portals 4.0 network programming interface.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barrett, Brian W.; Brightwell, Ronald Brian; Pedretti, Kevin

    2012-11-01

    This report presents a specification for the Portals 4.0 network programming interface. Portals 4.0 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals 4.0 is well suited to massively parallel processing and embedded systems. Portals 4.0 represents an adaptation of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandia's Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4.0 is targeted to the next generation of machines employing advanced network interface architectures that support enhanced offload capabilities.

  12. Automating FEA programming

    NASA Technical Reports Server (NTRS)

    Sharma, Naveen

    1992-01-01

    In this paper we briefly describe a combined symbolic and numeric approach for solving mathematical models on parallel computers. An experimental software system, PIER, is being developed in Common Lisp to synthesize the computationally intensive and domain-formulation-dependent phases of finite element analysis (FEA) solution methods. Quantities for domain formulation, such as shape functions and element stiffness matrices, are automatically derived using symbolic mathematical computation. The problem-specific information and derived formulae are then used to generate (parallel) numerical code for the FEA solution steps. A constructive approach is taken to specifying the numerical program design. The code generator compiles application-oriented input specifications into (parallel) FORTRAN77 routines with the help of built-in knowledge of the particular problem, the numerical solution methods, and the target computer.
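
    The synthesis step can be pictured as: derive the element quantities symbolically, then emit a routine that evaluates them numerically. The C sketch below skips the symbolic derivation and simply emits a FORTRAN77 routine for the well-known 1D linear bar element stiffness k = (EA/L)[[1,-1],[-1,1]]; it is a toy stand-in for PIER's pipeline, and the emitted routine name is invented.

      #include <stdio.h>

      /* Toy code generator: emit a FORTRAN77 routine for the 1D linear
       * bar element stiffness k = (E*A/L) * [[1,-1],[-1,1]].  A stand-in
       * for PIER's symbolic-derivation + code-synthesis pipeline. */
      int main(void) {
          const char *terms[2][2] = { {" 1.0D0", "-1.0D0"},
                                      {"-1.0D0", " 1.0D0"} };
          printf("      SUBROUTINE KBAR1D(E, A, L, KE)\n");
          printf("      DOUBLE PRECISION E, A, L, KE(2,2), C\n");
          printf("      C = E*A/L\n");
          for (int i = 0; i < 2; i++)
              for (int j = 0; j < 2; j++)
                  printf("      KE(%d,%d) = C * (%s)\n", i + 1, j + 1, terms[i][j]);
          printf("      RETURN\n      END\n");
          return 0;
      }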

  13. Development for SSV on a parallel processing system (PARAGON)

    NASA Astrophysics Data System (ADS)

    Gothard, Benny M.; Allmen, Mark; Carroll, Michael J.; Rich, Dan

    1995-12-01

    A goal of the surrogate semi-autonomous vehicle (SSV) program is to have multiple vehicles navigate autonomously and cooperatively with other vehicles. This paper describes the process and tools used in porting UGV/SSV (unmanned ground vehicle) autonomous mobility and target recognition algorithms from a SISD (single instruction single data) processor architecture (i.e., a Sun SPARC workstation running C/UNIX) to a MIMD (multiple instruction multiple data) parallel processor architecture (i.e., PARAGON, a parallel set of i860 processors running C/UNIX). It discusses the gains in performance and the pitfalls of such a venture. It also examines the merits of this processor architecture (based on this conceptual prototyping effort) and of the programming paradigm for meeting the final SSV demonstration requirements.

  14. GASPRNG: GPU accelerated scalable parallel random number generator library

    NASA Astrophysics Data System (ADS)

    Gao, Shuang; Peterson, Gregory D.

    2013-04-01

    Graphics processors represent a promising technology for accelerating computational science applications. Many computational science applications require fast and scalable random number generation with good statistical properties, so they use the Scalable Parallel Random Number Generators library (SPRNG). We present the GPU Accelerated SPRNG library (GASPRNG) to accelerate SPRNG in GPU-based high performance computing systems. GASPRNG includes code for a host CPU and CUDA code for execution on NVIDIA graphics processing units (GPUs), along with a programming interface to support various usage models for pseudorandom numbers and computational science applications executing on the CPU, GPU, or both. This paper describes the implementation approach used to produce high performance and also describes how to use the programming interface. The programming interface allows a user to use GASPRNG in the same way as SPRNG on traditional serial or parallel computers, as well as to develop tightly coupled programs executing primarily on the GPU. We also describe how to install and use GASPRNG. To help illustrate linking with GASPRNG, various demonstration codes are included for the different usage models. GASPRNG on a single GPU shows up to 280x speedup over SPRNG on a single CPU core and is able to scale for larger systems in the same manner as SPRNG. Because GASPRNG generates identical streams of pseudorandom numbers as SPRNG, users can be confident about the quality of GASPRNG for scalable computational science applications. Catalogue identifier: AEOI_v1_0 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEOI_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: UTK license. No. of lines in distributed program, including test data, etc.: 167900 No. of bytes in distributed program, including test data, etc.: 1422058 Distribution format: tar.gz Programming language: C and CUDA. Computer: Any PC or workstation with NVIDIA GPU (Tested on Fermi GTX480, Tesla C1060, Tesla M2070). Operating system: Linux with CUDA version 4.0 or later. Should also run on MacOS, Windows, or UNIX. Has the code been vectorized or parallelized?: Yes. Parallelized using MPI directives. RAM: 512 MB-732 MB (main memory on host CPU, depending on the data type of random numbers) / 512 MB (GPU global memory) Classification: 4.13, 6.5. Nature of problem: Many computational science applications are able to consume large numbers of random numbers. For example, Monte Carlo simulations can consume limitless random numbers as long as computing resources allow. Moreover, parallel computational science applications require independent streams of random numbers to attain statistically significant results. The SPRNG library provides this capability, but at a significant computational cost. The GASPRNG library presented here accelerates the generators of independent streams of random numbers using graphics processing units (GPUs). Solution method: Multiple copies of random number generators in GPUs allow a computational science application to consume large numbers of random numbers from independent, parallel streams. GASPRNG is a random number generator library that allows a computational science application to employ multiple copies of random number generators to boost performance. Users can interface GASPRNG with software code executing on microprocessors and/or GPUs.
Running time: The tests provided take a few minutes to run.
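
    The essential usage requirement is one independent, reproducible stream per worker. Without relying on the SPRNG interface itself, the C/OpenMP sketch below gives each thread its own splitmix64 generator seeded from its stream number, a miniature of the usage model described above; the generator and seeding scheme are illustrative choices, not GASPRNG's.

      #include <stdio.h>
      #include <stdint.h>
      #include <omp.h>

      /* splitmix64: a small, well-known mixing generator, used here to
       * give each thread an independent, reproducible stream.  This is
       * illustrative only; SPRNG/GASPRNG use their own parameterized
       * generator families. */
      static uint64_t splitmix64(uint64_t *state) {
          uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);
          z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
          z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
          return z ^ (z >> 31);
      }

      int main(void) {
          #pragma omp parallel
          {
              int stream = omp_get_thread_num();             /* stream number */
              uint64_t state = 0x1234ULL + (uint64_t)stream; /* per-stream seed */
              double sum = 0.0;
              for (int i = 0; i < 1000; i++)                 /* uniform in [0,1) */
                  sum += (splitmix64(&state) >> 11) * (1.0 / 9007199254740992.0);
              printf("stream %d: mean = %.4f\n", stream, sum / 1000.0);
          }
          return 0;
      }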

  15. Fenix, A Fault Tolerant Programming Framework for MPI Applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gamel, Marc; Teranihi, Keita; Valenzuela, Eric

    2016-10-05

    Fenix provides APIs to allow the users to add fault tolerance capability to MPI-based parallel programs in a transparent manner. Fenix-enabled programs can run through process failures during program execution using a pool of spare processes accommodated by Fenix.
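
    Conceptually, running through a failure means the application resumes from a known-good state instead of aborting the job. The plain-MPI sketch below shows only the checkpoint/rollback skeleton that such a framework automates, using an error handler that returns instead of aborting; it is not the Fenix API, and real recovery additionally requires communicator repair and enrollment of spare processes.

      #include <stdio.h>
      #include <mpi.h>

      /* Checkpoint/rollback skeleton.  Fenix automates this pattern
       * transparently and repairs the communicator with spare processes;
       * here we only show the shape of the loop in plain MPI. */
      int main(int argc, char **argv) {
          MPI_Init(&argc, &argv);
          /* Return errors to the caller instead of aborting the job. */
          MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

          double state = 0.0, checkpoint = 0.0;
          for (int step = 0; step < 10; step++) {
              state += 1.0;                            /* application work */
              if (step % 5 == 4) checkpoint = state;   /* periodic checkpoint */

              int rc = MPI_Barrier(MPI_COMM_WORLD);    /* may report a failure */
              if (rc != MPI_SUCCESS) {
                  state = checkpoint;        /* roll back to known-good state */
                  /* ...communicator repair / spare-process enrollment here... */
              }
          }
          printf("final state = %.1f\n", state);
          MPI_Finalize();
          return 0;
      }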

  16. Fully Parallel MHD Stability Analysis Tool

    NASA Astrophysics Data System (ADS)

    Svidzinski, Vladimir; Galkin, Sergei; Kim, Jin-Soo; Liu, Yueqiang

    2014-10-01

    Progress on the full parallelization of the plasma stability code MARS will be reported. MARS calculates eigenmodes in 2D axisymmetric toroidal equilibria in MHD-kinetic plasma models. It is a powerful tool for studying MHD and MHD-kinetic instabilities and is widely used by the fusion community. The parallel version of MARS is intended for simulations on local parallel clusters. It will be an efficient tool for simulating MHD instabilities with low, intermediate, and high toroidal mode numbers within both the fluid and kinetic plasma models already implemented in MARS. Parallelization of the code includes parallelization of the construction of the matrix for the eigenvalue problem and parallelization of the inverse iteration algorithm implemented in MARS for the solution of the formulated eigenvalue problem. Construction of the matrix is parallelized by distributing the load among processors assigned to different magnetic surfaces. Parallelization of the solution of the eigenvalue problem is achieved by repeating the steps of the present MARS algorithm using parallel libraries and procedures. Initial results of the code parallelization will be reported. Work is supported by the U.S. DOE SBIR program.
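
    Inverse iteration finds the eigenmode whose eigenvalue is closest to a chosen shift sigma by repeatedly solving (A - sigma*I) y = x and normalizing; in the parallel code, the matrix construction and each such solve are what get distributed. The serial C sketch below shows the numerical kernel itself on a tiny dense matrix, with a naive Gaussian elimination solve; it illustrates the algorithm, not MARS.

      #include <stdio.h>
      #include <math.h>

      #define N 3

      /* Solve M y = b by naive Gaussian elimination (no pivoting; fine
       * for this diagonally dominant demo, not for production use). */
      static void solve(double M[N][N], double b[N], double y[N]) {
          double A[N][N];
          for (int i = 0; i < N; i++) {
              for (int j = 0; j < N; j++) A[i][j] = M[i][j];
              y[i] = b[i];
          }
          for (int k = 0; k < N; k++)
              for (int i = k + 1; i < N; i++) {
                  double f = A[i][k] / A[k][k];
                  for (int j = k; j < N; j++) A[i][j] -= f * A[k][j];
                  y[i] -= f * y[k];
              }
          for (int i = N - 1; i >= 0; i--) {
              for (int j = i + 1; j < N; j++) y[i] -= A[i][j] * y[j];
              y[i] /= A[i][i];
          }
      }

      int main(void) {
          /* Eigenvalues of A are 3-sqrt(3), 3, and 3+sqrt(3). */
          double A[N][N] = {{4,1,0},{1,3,1},{0,1,2}};
          double sigma = 1.5;               /* shift: target eigenvalue near 1.5 */
          double M[N][N], x[N] = {1,1,1}, y[N];
          for (int i = 0; i < N; i++)
              for (int j = 0; j < N; j++)
                  M[i][j] = A[i][j] - (i == j ? sigma : 0.0);

          for (int it = 0; it < 30; it++) { /* inverse iteration */
              solve(M, x, y);
              double norm = 0.0;
              for (int i = 0; i < N; i++) norm += y[i] * y[i];
              norm = sqrt(norm);
              for (int i = 0; i < N; i++) x[i] = y[i] / norm;
          }
          /* Rayleigh quotient recovers the eigenvalue nearest sigma. */
          double lam = 0.0;
          for (int i = 0; i < N; i++) {
              double Ax = 0.0;
              for (int j = 0; j < N; j++) Ax += A[i][j] * x[j];
              lam += x[i] * Ax;
          }
          printf("eigenvalue near %.1f: %.6f\n", sigma, lam);
          return 0;
      }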

  17. GPU COMPUTING FOR PARTICLE TRACKING

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nishimura, Hiroshi; Song, Kai; Muriki, Krishna

    2011-03-25

    This is a feasibility study of using a modern Graphics Processing Unit (GPU) to parallelize an accelerator particle tracking code. To demonstrate the massive parallelization features provided by GPU computing, a simplified TracyGPU program is developed for dynamic aperture calculation. Performance, issues, and challenges from introducing the GPU are also discussed. General purpose computation on Graphics Processing Units (GPGPU) brings massive parallel computing capabilities to numerical calculation. However, the unique architecture of the GPU requires a comprehensive understanding of the hardware and programming model in order to optimize existing applications well. In the field of accelerator physics, the dynamic aperture calculation of a storage ring, which is often the most time-consuming part of accelerator modeling and simulation, can benefit from the GPU due to its embarrassingly parallel nature, which fits well with the GPU programming model. In this paper, we use the Tesla C2050 GPU, which consists of 14 multiprocessors (MPs) with 32 cores each, for a total of 448 cores, to host thousands of threads dynamically. A thread is a logical execution unit of the program on the GPU. In the GPU programming model, threads are grouped into a collection of blocks. Within each block, multiple threads share the same code and up to 48 KB of shared memory. Multiple thread blocks form a grid, which is executed as a GPU kernel. A simplified code that is a subset of Tracy++ [2] is developed to demonstrate the possibility of using the GPU to speed up the dynamic aperture calculation by having each thread track a particle.
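
    The one-thread-per-particle mapping is the essential idea: each logical thread derives a global index and advances its own particle independently. The C sketch below emulates that mapping with OpenMP on the CPU (in a CUDA kernel the loop index would instead come from blockIdx and threadIdx); the toy Henon-like one-turn map and the crude aperture test are assumptions, not Tracy++'s physics.

      #include <stdio.h>
      #include <math.h>
      #include <omp.h>

      #define NPART  1024
      #define NTURNS 1000

      /* CPU analogue of the GPU "one thread per particle" mapping: each
       * iteration of the parallel loop plays the role of one GPU thread.
       * The quadratic one-turn map is a toy stand-in for real tracking. */
      int main(void) {
          static double x[NPART], px[NPART];
          for (int i = 0; i < NPART; i++) {
              x[i]  = 1.5 * (i + 1) / NPART;   /* initial amplitudes up to 1.5 */
              px[i] = 0.0;
          }

          double mu  = 6.283185307 * 0.31;     /* assumed betatron phase advance */
          double cmu = cos(mu), smu = sin(mu);
          int survived = 0;

          #pragma omp parallel for reduction(+:survived)
          for (int i = 0; i < NPART; i++) {    /* one "thread" per particle */
              double xi = x[i], pi = px[i];
              for (int t = 0; t < NTURNS; t++) {
                  double xn = cmu * xi + smu * pi;
                  pi = -smu * xi + cmu * pi + xn * xn;  /* sextupole-like kick */
                  xi = xn;
                  if (fabs(xi) > 10.0) break;  /* particle lost: outside aperture */
              }
              if (fabs(xi) <= 10.0) survived++;
          }
          printf("%d of %d particles survive %d turns\n", survived, NPART, NTURNS);
          return 0;
      }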

  18. Hypergraph partitioning implementation for parallelizing matrix-vector multiplication using CUDA GPU-based parallel computing

    NASA Astrophysics Data System (ADS)

    Murni, Bustamam, A.; Ernastuti, Handhika, T.; Kerami, D.

    2017-07-01

    Calculation of matrix-vector multiplication in real-world problems often involves large matrices of arbitrary size. Therefore, parallelization is needed to speed up the calculation process, which usually takes a long time. The graph partitioning techniques discussed in previous studies cannot be used to parallelize matrix-vector multiplication for matrices of arbitrary size, because graph partitioning assumes a square, symmetric matrix. Hypergraph partitioning techniques overcome this shortcoming. This paper addresses the efficient parallelization of matrix-vector multiplication through hypergraph partitioning techniques using CUDA GPU-based parallel computing. CUDA (compute unified device architecture) is a parallel computing platform and programming model that was created by NVIDIA and implemented on the GPU (graphics processing unit).
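
    For a matrix stored in compressed sparse row (CSR) form, the natural parallel unit is a row: y[i] is the sum of val[k]*x[col_idx[k]] over that row's nonzeros. A hypergraph partitioner decides which rows go to which processor so that communication of x entries is minimized. The C/OpenMP sketch below shows the row-parallel CSR kernel with a trivial contiguous partition (the loop schedule) standing in for a hypergraph partition, and CPU threads standing in for CUDA threads.

      #include <stdio.h>
      #include <omp.h>

      /* CSR sparse matrix-vector multiply, parallelized by rows.  A real
       * hypergraph partitioner would assign rows to processors to minimize
       * communication of x entries; here the partition is just the loop
       * schedule. */
      int main(void) {
          /* 4x4 example matrix in CSR form: row_ptr, col_idx, val. */
          int    row_ptr[5] = {0, 2, 3, 5, 6};
          int    col_idx[6] = {0, 1, 1, 0, 3, 2};
          double val[6]     = {2, 1, 3, 4, 5, 6};
          double x[4] = {1, 1, 1, 1}, y[4];

          #pragma omp parallel for
          for (int i = 0; i < 4; i++) {        /* one "thread" per row */
              double s = 0.0;
              for (int k = row_ptr[i]; k < row_ptr[i + 1]; k++)
                  s += val[k] * x[col_idx[k]];
              y[i] = s;
          }
          for (int i = 0; i < 4; i++) printf("y[%d] = %.1f\n", i, y[i]);
          return 0;
      }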

  19. Large-scale parallel lattice Boltzmann-cellular automaton model of two-dimensional dendritic growth

    NASA Astrophysics Data System (ADS)

    Jelinek, Bohumir; Eshraghi, Mohsen; Felicelli, Sergio; Peters, John F.

    2014-03-01

    An extremely scalable lattice Boltzmann (LB)-cellular automaton (CA) model for simulations of two-dimensional (2D) dendritic solidification under forced convection is presented. The model incorporates effects of phase change, solute diffusion, melt convection, and heat transport. The LB model represents the diffusion, convection, and heat transfer phenomena. The dendrite growth is driven by a difference between the actual and the equilibrium liquid composition at the solid-liquid interface. The CA technique is deployed to track the new interface cells. The computer program was parallelized using the Message Passing Interface (MPI) technique. Parallel scaling of the algorithm was studied and major scalability bottlenecks were identified. Efficiency loss attributable to the high memory bandwidth requirement of the algorithm was observed when using multiple cores per processor. Parallel writing of the output variables of interest was implemented in the binary Hierarchical Data Format 5 (HDF5) to improve the output performance and to simplify visualization. Calculations were carried out in single precision arithmetic without significant loss in accuracy, resulting in a 50% reduction of memory and computational time requirements. The presented solidification model shows very good scalability up to centimeter-size domains, including more than ten million dendrites. Catalogue identifier: AEQZ_v1_0 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEQZ_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, UK Licensing provisions: Standard CPC license, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 29,767 No. of bytes in distributed program, including test data, etc.: 3,131,367 Distribution format: tar.gz Programming language: Fortran 90. Computer: Linux PC and clusters. Operating system: Linux. Has the code been vectorized or parallelized?: Yes. Program is parallelized using MPI. Number of processors used: 1-50,000 RAM: Memory requirements depend on the grid size Classification: 6.5, 7.7. External routines: MPI (http://www.mcs.anl.gov/research/projects/mpi/), HDF5 (http://www.hdfgroup.org/HDF5/) Nature of problem: Dendritic growth in undercooled Al-3 wt% Cu alloy melt under forced convection. Solution method: The lattice Boltzmann model solves the diffusion, convection, and heat transfer phenomena. The cellular automaton technique is deployed to track the solid/liquid interface. Restrictions: Heat transfer is calculated uncoupled from the fluid flow. Thermal diffusivity is constant. Unusual features: A novel technique, utilizing periodic duplication of a pre-grown “incubation” domain, is applied for the scale-up test. Running time: Running time varies from minutes to days depending on the domain size and number of computational cores.
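
    The solute diffusion handled by the LB model can be illustrated by its simplest ingredient: an explicit diffusion update on a 2D grid, with each MPI rank owning a strip of rows and exchanging one-cell halo rows every step. The C/MPI sketch below shows that halo-exchange step only, under assumed grid sizes; the full model couples it to D2Q9 lattice Boltzmann streaming/collision and the CA interface-capture rule.

      #include <stdio.h>
      #include <string.h>
      #include <mpi.h>

      #define NX    64        /* columns */
      #define NROWS 16        /* interior rows per rank (assumed uniform) */

      /* Explicit 2D diffusion with a 1D row decomposition and halo
       * exchange; rows 0 and NROWS+1 are halo rows. */
      int main(int argc, char **argv) {
          MPI_Init(&argc, &argv);
          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);
          int up = rank > 0        ? rank - 1 : MPI_PROC_NULL;
          int dn = rank < size - 1 ? rank + 1 : MPI_PROC_NULL;

          static double c[NROWS + 2][NX], cn[NROWS + 2][NX];
          memset(c, 0, sizeof c);
          if (rank == 0) c[1][NX / 2] = 1.0;     /* point source of solute */

          double D = 0.2;                        /* diffusion number D*dt/dx^2 */
          for (int step = 0; step < 100; step++) {
              /* exchange halo rows with neighboring ranks */
              MPI_Sendrecv(c[1], NX, MPI_DOUBLE, up, 0,
                           c[NROWS + 1], NX, MPI_DOUBLE, dn, 0,
                           MPI_COMM_WORLD, MPI_STATUS_IGNORE);
              MPI_Sendrecv(c[NROWS], NX, MPI_DOUBLE, dn, 1,
                           c[0], NX, MPI_DOUBLE, up, 1,
                           MPI_COMM_WORLD, MPI_STATUS_IGNORE);
              for (int i = 1; i <= NROWS; i++)
                  for (int j = 1; j < NX - 1; j++)
                      cn[i][j] = c[i][j] + D * (c[i-1][j] + c[i+1][j]
                                              + c[i][j-1] + c[i][j+1] - 4.0 * c[i][j]);
              memcpy(c, cn, sizeof c);           /* halos refreshed next step */
          }
          double local = 0.0, total;
          for (int i = 1; i <= NROWS; i++)
              for (int j = 0; j < NX; j++) local += c[i][j];
          MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
          if (rank == 0)
              printf("total solute = %.6f (conserved up to boundary losses)\n", total);
          MPI_Finalize();
          return 0;
      }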

  20. A numerical differentiation library exploiting parallel architectures

    NASA Astrophysics Data System (ADS)

    Voglis, C.; Hadjidoukas, P. E.; Lagaris, I. E.; Papageorgiou, D. G.

    2009-08-01

    We present a software library for numerically estimating first and second order partial derivatives of a function by finite differencing. Various truncation schemes are offered, resulting in corresponding formulas that are accurate to order O(h), O(h²), and O(h⁴), h being the differencing step. The derivatives are calculated via forward, backward, and central differences. Care has been taken that only feasible points are used in the case where bound constraints are imposed on the variables. The Hessian may be approximated either from function or from gradient values. There are three versions of the software: a sequential version, an OpenMP version for shared memory architectures, and an MPI version for distributed systems (clusters). The parallel versions exploit the multiprocessing capability offered by computer clusters as well as by modern multi-core systems; due to the independent character of the derivative computation, the speedup scales almost linearly with the number of available processors/cores. Program summary: Program title: NDL (Numerical Differentiation Library) Catalogue identifier: AEDG_v1_0 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEDG_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 73 030 No. of bytes in distributed program, including test data, etc.: 630 876 Distribution format: tar.gz Programming language: ANSI FORTRAN-77, ANSI C, MPI, OPENMP Computer: Distributed systems (clusters), shared memory systems Operating system: Linux, Solaris Has the code been vectorised or parallelized?: Yes RAM: The library uses O(N) internal storage, N being the dimension of the problem Classification: 4.9, 4.14, 6.5 Nature of problem: The numerical estimation of derivatives at several accuracy levels is a common requirement in many computational tasks, such as optimization, solution of nonlinear systems, etc. A parallel implementation that exploits systems with multiple CPUs is very important for large scale and computationally expensive problems. Solution method: Finite differencing is used with a carefully chosen step that minimizes the sum of the truncation and round-off errors. The parallel versions employ both OpenMP and MPI libraries. Restrictions: The library uses only double precision arithmetic. Unusual features: The software takes into account bound constraints, in the sense that only feasible points are used to evaluate the derivatives, and given the level of the desired accuracy, the proper formula is automatically employed. Running time: Running time depends on the function's complexity. The test run took 15 ms for the serial distribution, 0.6 s for the OpenMP version, and 4.2 s for the MPI version on 2 processors.
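
    The central formulas are the ordinary finite differences, e.g. the O(h²) central approximations f'(x) ≈ (f(x+h) - f(x-h))/(2h) and f''(x) ≈ (f(x+h) - 2f(x) + f(x-h))/h². The C/OpenMP sketch below applies the first of these per coordinate, with the gradient components computed independently in a parallel loop, which is exactly why the speedup scales with the core count; the step size used here is a simple rule of thumb, not the library's optimized step selection.

      #include <stdio.h>
      #include <math.h>
      #include <float.h>
      #include <omp.h>

      #define N 4

      /* Example objective: f(x) = sum_i x_i^2, so df/dx_i = 2*x_i. */
      static double f(const double *x) {
          double s = 0.0;
          for (int i = 0; i < N; i++) s += x[i] * x[i];
          return s;
      }

      int main(void) {
          double x[N] = {1.0, -2.0, 0.5, 3.0}, grad[N];
          /* Rule-of-thumb step for central differences (the library
           * instead picks steps balancing truncation and round-off). */
          double h = pow(DBL_EPSILON, 1.0 / 3.0);

          /* Each partial derivative is independent, so the loop is
           * embarrassingly parallel: the source of the linear speedup. */
          #pragma omp parallel for
          for (int i = 0; i < N; i++) {
              double xp[N], xm[N];
              for (int j = 0; j < N; j++) { xp[j] = x[j]; xm[j] = x[j]; }
              xp[i] += h; xm[i] -= h;
              grad[i] = (f(xp) - f(xm)) / (2.0 * h);  /* O(h^2) central difference */
          }
          for (int i = 0; i < N; i++)
              printf("df/dx%d = %.6f (exact %.6f)\n", i, grad[i], 2.0 * x[i]);
          return 0;
      }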
