Sample records for unam-cray supercomputing conference

  1. CRAY mini manual. Revision D

    NASA Technical Reports Server (NTRS)

    Tennille, Geoffrey M.; Howser, Lona M.

    1993-01-01

    This document briefly describes the use of the CRAY supercomputers that are an integral part of the Supercomputing Network Subsystem of the Central Scientific Computing Complex at LaRC. Features of the CRAY supercomputers are covered, including: FORTRAN, C, PASCAL, architectures of the CRAY-2 and CRAY Y-MP, the CRAY UNICOS environment, batch job submittal, debugging, performance analysis, parallel processing, utilities unique to CRAY, and documentation. The document is intended for all CRAY users as a ready reference to frequently asked questions and to more detailed information contained in the vendor manuals. It is appropriate for both the novice and the experienced user.

  2. Edison - A New Cray Supercomputer Advances Discovery at NERSC

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dosanjh, Sudip; Parkinson, Dula; Yelick, Kathy

    2014-02-06

    When a supercomputing center installs a new system, users are invited to make heavy use of the computer as part of the rigorous testing. In this video, find out what top scientists have discovered using Edison, a Cray XC30 supercomputer, and how NERSC's newest supercomputer will accelerate their future research.

  3. Edison - A New Cray Supercomputer Advances Discovery at NERSC

    ScienceCinema

    Dosanjh, Sudip; Parkinson, Dula; Yelick, Kathy; Trebotich, David; Broughton, Jeff; Antypas, Katie; Lukic, Zarija; Borrill, Julian; Draney, Brent; Chen, Jackie

    2018-01-16

    When a supercomputing center installs a new system, users are invited to make heavy use of the computer as part of the rigorous testing. In this video, find out what top scientists have discovered using Edison, a Cray XC30 supercomputer, and how NERSC's newest supercomputer will accelerate their future research.

  4. NAS technical summaries: Numerical aerodynamic simulation program, March 1991 - February 1992

    NASA Technical Reports Server (NTRS)

    1992-01-01

    NASA created the Numerical Aerodynamic Simulation (NAS) Program in 1987 to focus resources on solving critical problems in aeroscience and related disciplines by utilizing the power of the most advanced supercomputers available. The NAS Program provides scientists with the necessary computing power to solve today's most demanding computational fluid dynamics problems and serves as a pathfinder in integrating leading-edge supercomputing technologies, thus benefiting other supercomputer centers in Government and industry. This report contains selected scientific results from the 1991-92 NAS Operational Year, March 4, 1991 to March 3, 1992, which is the fifth year of operation. During this year, the scientific community was given access to a Cray-2 and a Cray Y-MP. The Cray-2, the first generation supercomputer, has four processors, 256 megawords of central memory, and a total sustained speed of 250 million floating point operations per second. The Cray Y-MP, the second generation supercomputer, has eight processors and a total sustained speed of one billion floating point operations per second. Additional memory was installed this year, doubling capacity from 128 to 256 megawords of solid-state storage-device memory. Because of its higher performance, the Cray Y-MP delivered approximately 77 percent of the total number of supercomputer hours used during this year.

  5. Full speed ahead for software

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wolfe, A.

    1986-03-10

    Supercomputing software is moving into high gear, spurred by the rapid spread of supercomputers into new applications. The critical challenge is how to develop tools that will make it easier for programmers to write applications that take advantage of vectorizing in the classical supercomputer and the parallelism that is emerging in supercomputers and minisupercomputers. Writing parallel software is a challenge that every programmer must face because parallel architectures are springing up across the range of computing. Cray is developing a host of tools for programmers. Tools to support multitasking (in supercomputer parlance, multitasking means dividing up a single program to run on multiple processors) are high on Cray's agenda. On tap for multitasking is Premult, dubbed a microtasking tool. As a preprocessor for Cray's CFT77 FORTRAN compiler, Premult will provide fine-grain multitasking.

  6. 1993 Gordon Bell Prize Winners

    NASA Technical Reports Server (NTRS)

    Karp, Alan H.; Simon, Horst; Heller, Don; Cooper, D. M. (Technical Monitor)

    1994-01-01

    The Gordon Bell Prize recognizes significant achievements in the application of supercomputers to scientific and engineering problems. In 1993, finalists were named for work in three categories: (1) Performance, which recognizes those who solved a real problem in the quickest elapsed time. (2) Price/performance, which encourages the development of cost-effective supercomputing. (3) Compiler-generated speedup, which measures how well compiler writers are facilitating the programming of parallel processors. The winners were announced November 17 at the Supercomputing 93 conference in Portland, Oregon. Gordon Bell, an independent consultant in Los Altos, California, is sponsoring $2,000 in prizes each year for 10 years to promote practical parallel processing research. This is the sixth year of the prize, which Computer administers. Something unprecedented in Gordon Bell Prize competition occurred this year: A computer manufacturer was singled out for recognition. Nine entries reporting results obtained on the Cray C90 were received, seven of the submissions orchestrated by Cray Research. Although none of these entries showed sufficiently high performance to win outright, the judges were impressed by the breadth of applications that ran well on this machine, all nine running at more than a third of the peak performance of the machine.

  7. Parallel computation in a three-dimensional elastic-plastic finite-element analysis

    NASA Technical Reports Server (NTRS)

    Shivakumar, K. N.; Bigelow, C. A.; Newman, J. C., Jr.

    1992-01-01

    A CRAY parallel processing technique called autotasking was implemented in a three-dimensional elasto-plastic finite-element code. The technique was evaluated on two CRAY supercomputers, a CRAY 2 and a CRAY Y-MP. Autotasking was implemented in all major portions of the code, except the matrix equations solver. Compiler directives alone were not able to properly multitask the code; user-inserted directives were required to achieve better performance. It was noted that the connect time, rather than wall-clock time, was more appropriate to determine speedup in multiuser environments. For a typical example problem, a speedup of 2.1 (1.8 when the solution time was included) was achieved in a dedicated environment and 1.7 (1.6 with solution time) in a multiuser environment on a four-processor CRAY 2 supercomputer. The speedup on a three-processor CRAY Y-MP was about 2.4 (2.0 with solution time) in a multiuser environment.
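
    For reference, the speedup figures quoted above translate into parallel efficiency in the usual way; using the dedicated-run number (a speedup of 2.1 on the four-processor CRAY 2), the implied efficiency is roughly 50 percent. A minimal sketch of the arithmetic:

        \[ S_p = \frac{T_1}{T_p}, \qquad E_p = \frac{S_p}{p}, \qquad E_4 \approx \frac{2.1}{4} \approx 0.53. \]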

  8. Some Problems and Solutions in Transferring Ecosystem Simulation Codes to Supercomputers

    NASA Technical Reports Server (NTRS)

    Skiles, J. W.; Schulbach, C. H.

    1994-01-01

    Many computer codes for the simulation of ecological systems have been developed in the last twenty-five years. This development took place initially on main-frame computers, then mini-computers, and more recently, on micro-computers and workstations. Recent recognition of ecosystem science as a High Performance Computing and Communications Program Grand Challenge area emphasizes supercomputers (both parallel and distributed systems) as the next set of tools for ecological simulation. Transferring ecosystem simulation codes to such systems is not a matter of simply compiling and executing existing code on the supercomputer since there are significant differences in the system architectures of sequential, scalar computers and parallel and/or vector supercomputers. To more appropriately match the application to the architecture (necessary to achieve reasonable performance), the parallelism (if it exists) of the original application must be exploited. We discuss our work in transferring a general grassland simulation model (developed on a VAX in the FORTRAN computer programming language) to a Cray Y-MP. We describe the Cray's shared-memory vector architecture and discuss our rationale for selecting the Cray. We describe porting the model to the Cray and executing and verifying a baseline version, and we discuss the changes we made to exploit the parallelism in the application and to improve code execution. As a result, the Cray executed the model 30 times faster than the VAX 11/785 and 10 times faster than a Sun 4 workstation. We achieved an additional speed-up of approximately 30 percent over the original Cray run by using the compiler's vectorizing capabilities and the machine's ability to put subroutines and functions "in-line" in the code. With the modifications, the code still runs at only about 5% of the Cray's peak speed because it makes ineffective use of the vector processing capabilities of the Cray. We conclude with a discussion and future plans.

  9. Exploiting Thread Parallelism for Ocean Modeling on Cray XC Supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sarje, Abhinav; Jacobsen, Douglas W.; Williams, Samuel W.

    The incorporation of increasing core counts in modern processors used to build state-of-the-art supercomputers is driving application development towards exploitation of thread parallelism, in addition to distributed memory parallelism, with the goal of delivering efficient high-performance codes. In this work we describe the use of threading, and our experiences with it, in a real-world ocean modeling application code, MPAS-Ocean. We present detailed performance analysis and comparisons of various approaches and configurations for threading on the Cray XC series supercomputers.

  10. Transferring ecosystem simulation codes to supercomputers

    NASA Technical Reports Server (NTRS)

    Skiles, J. W.; Schulbach, C. H.

    1995-01-01

    Many ecosystem simulation computer codes have been developed in the last twenty-five years. This development took place initially on main-frame computers, then mini-computers, and more recently, on micro-computers and workstations. Supercomputing platforms (both parallel and distributed systems) have been largely unused, however, because of the perceived difficulty in accessing and using the machines. Also, significant differences in the system architectures of sequential, scalar computers and parallel and/or vector supercomputers must be considered. We have transferred a grassland simulation model (developed on a VAX) to a Cray Y-MP/C90. We describe porting the model to the Cray and the changes we made to exploit the parallelism in the application and improve code execution. The Cray executed the model 30 times faster than the VAX and 10 times faster than a Unix workstation. We achieved an additional speedup of 30 percent by using the compiler's vectorizing and 'in-line' capabilities. The code runs at only about 5 percent of the Cray's peak speed because it ineffectively uses the vector and parallel processing capabilities of the Cray. We expect that by restructuring the code, it could execute an additional six to ten times faster.

  11. Performance Evaluation of Supercomputers using HPCC and IMB Benchmarks

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Ciotti, Robert; Gunney, Brian T. N.; Spelce, Thomas E.; Koniges, Alice; Dossa, Don; Adamidis, Panagiotis; Rabenseifner, Rolf; Tiyyagura, Sunil R.; Mueller, Matthias

    2006-01-01

    The HPC Challenge (HPCC) benchmark suite and the Intel MPI Benchmark (IMB) are used to compare and evaluate the combined performance of processor, memory subsystem, and interconnect fabric of five leading supercomputers - SGI Altix BX2, Cray X1, Cray Opteron Cluster, Dell Xeon cluster, and NEC SX-8. These five systems use five different networks (SGI NUMALINK4, Cray network, Myrinet, InfiniBand, and NEC IXS). The complete set of HPCC benchmarks is run on each of these systems. Additionally, we present Intel MPI Benchmarks (IMB) results to study the performance of 11 MPI communication functions on these systems.

  12. Improved Access to Supercomputers Boosts Chemical Applications.

    ERIC Educational Resources Information Center

    Borman, Stu

    1989-01-01

    Supercomputing is described in terms of computing power and abilities. The increase in availability of supercomputers for use in chemical calculations and modeling is reported. Efforts of the National Science Foundation and Cray Research are highlighted. (CW)

  13. PREFACE: HITES 2012: 'Horizons of Innovative Theories, Experiments, and Supercomputing in Nuclear Physics'

    NASA Astrophysics Data System (ADS)

    Hecht, K. T.

    2012-12-01

    This volume contains the contributions of the speakers of an international conference in honor of Jerry Draayer's 70th birthday, entitled 'Horizons of Innovative Theories, Experiments and Supercomputing in Nuclear Physics'. The list of contributors includes not only international experts in these fields, but also many former collaborators, former graduate students, and former postdoctoral fellows of Jerry Draayer, stressing innovative theories such as special symmetries and supercomputing, both of particular interest to Jerry. The organizers of the conference intended to honor Jerry Draayer not only for his seminal contributions in these fields, but also for his administrative skills at the departmental, university, national, and international levels. Signed: Ted Hecht, University of Michigan. Scientific Advisory Committee: Ani Aprahamian (University of Notre Dame); Baha Balantekin (University of Wisconsin); Bruce Barrett (University of Arizona); Umit Catalyurek (Ohio State University); David Dean (Oak Ridge National Laboratory); Jutta Escher, Chair (Lawrence Livermore National Laboratory); Jorge Hirsch (UNAM, Mexico); David Rowe (University of Toronto); Brad Sherrill (Michigan State University); Joel Tohline (Louisiana State University); Edward Zganjar (Louisiana State University). Organizing Committee: Jeff Blackmon (Louisiana State University); Mark Caprio (University of Notre Dame); Tomas Dytrych (Louisiana State University); Ana Georgieva (INRNE, Bulgaria); Kristina Launey, Co-chair (Louisiana State University); Gabriella Popa (Ohio University Zanesville); James Vary, Co-chair (Iowa State University). Local Organizing Committee: Laura Linhardt (Louisiana State University); Charlie Rasco (Louisiana State University); Karen Richard, Coordinator (Louisiana State University).

  14. Demonstration of Cost-Effective, High-Performance Computing at Performance and Reliability Levels Equivalent to a 1994 Vector Supercomputer

    NASA Technical Reports Server (NTRS)

    Babrauckas, Theresa

    2000-01-01

    The Affordable High Performance Computing (AHPC) project demonstrated that high-performance computing based on a distributed network of computer workstations is a cost-effective alternative to vector supercomputers for running CPU- and memory-intensive design and analysis tools. The AHPC project created an integrated system called a Network Supercomputer. By connecting computer workstations through a network and utilizing the workstations when they are idle, the resulting distributed-workstation environment has the same performance and reliability levels as the Cray C90 vector supercomputer at less than 25 percent of the C90 cost. In fact, the cost comparison between a Cray C90 supercomputer and Sun workstations showed that a set of networked workstations equivalent to a C90 costs approximately 8 percent as much as the C90.

  15. Introducing Argonne’s Theta Supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    Theta, the Argonne Leadership Computing Facility’s (ALCF) new Intel-Cray supercomputer, is officially open to the research community. Theta’s massively parallel, many-core architecture puts the ALCF on the path to Aurora, the facility’s future Intel-Cray system. Capable of nearly 10 quadrillion calculations per second, Theta enables researchers to break new ground in scientific investigations that range from modeling the inner workings of the brain to developing new materials for renewable energy applications.

  16. Optimization of large matrix calculations for execution on the Cray X-MP vector supercomputer

    NASA Technical Reports Server (NTRS)

    Hornfeck, William A.

    1988-01-01

    A considerable number of large computational codes were developed for NASA over the past twenty-five years. These codes embody algorithms developed for machines of an earlier generation. With the emergence of the vector supercomputer as a viable, commercially available machine, an opportunity exists to evaluate optimization strategies to improve the efficiency of existing software. This opportunity arises primarily from architectural differences between the latest generation of large-scale machines and the earlier, mostly uniprocessor, machines. A software package being used by NASA to perform computations on large matrices is described, and a strategy for its conversion to the Cray X-MP vector supercomputer is also described.

  17. Data communication requirements for the advanced NAS network

    NASA Technical Reports Server (NTRS)

    Levin, Eugene; Eaton, C. K.; Young, Bruce

    1986-01-01

    The goal of the Numerical Aerodynamic Simulation (NAS) Program is to provide a powerful computational environment for advanced research and development in aeronautics and related disciplines. The present NAS system consists of a Cray 2 supercomputer connected by a data network to a large mass storage system, to sophisticated local graphics workstations, and by remote communications to researchers throughout the United States. The program plan is to continue acquiring the most powerful supercomputers as they become available. In the 1987/1988 time period it is anticipated that a computer with 4 times the processing speed of a Cray 2 will be obtained and by 1990 an additional supercomputer with 16 times the speed of the Cray 2. The implications of this 20-fold increase in processing power on the data communications requirements are described. The analysis was based on models of the projected workload and system architecture. The results are presented together with the estimates of their sensitivity to assumptions inherent in the models.

  18. Optimization of Supercomputer Use on EADS II System

    NASA Technical Reports Server (NTRS)

    Ahmed, Ardsher

    1998-01-01

    The main objective of this research was to optimize supercomputer use to achieve better throughput and utilization of supercomputers and to help facilitate the movement of non-supercomputing (inappropriate for supercomputer) codes to mid-range systems for better use of Government resources at Marshall Space Flight Center (MSFC). This work involved the survey of architectures available on EADS II and monitoring customer (user) applications running on a CRAY T90 system.

  19. LASL benchmark performance 1978. [CDC STAR-100, 6600, 7600, Cyber 73, and CRAY-1]

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McKnight, A.L.

    1979-08-01

    This report presents the results of running several benchmark programs on a CDC STAR-100, a Cray Research CRAY-1, a CDC 6600, a CDC 7600, and a CDC Cyber 73. The benchmark effort included CRAY-1's at several installations running different operating systems and compilers. This benchmark is part of an ongoing program at Los Alamos Scientific Laboratory to collect performance data and monitor the development trend of supercomputers. 3 tables.

  20. Survey of new vector computers: The CRAY 1S from CRAY Research; the CYBER 205 from CDC; and the parallel computer from ICL - architecture and programming

    NASA Technical Reports Server (NTRS)

    Gentzsch, W.

    1982-01-01

    Problems which can arise with vector and parallel computers are discussed in a user-oriented context. Emphasis is placed on the algorithms used and the programming techniques adopted. Three recently developed supercomputers are examined and typical application examples are given in CRAY FORTRAN, CYBER 205 FORTRAN and DAP (distributed array processor) FORTRAN. The systems' performance is compared. The addition of parts of two N x N arrays is considered. The influence of the architecture on the algorithms and programming language is demonstrated. Numerical analysis of magnetohydrodynamic differential equations by an explicit difference method is illustrated, showing very good results for all three systems. The prognosis for supercomputer development is assessed.
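
    The array-addition example mentioned in this record is, at heart, a unit-stride loop nest of the kind vectorizing compilers pipeline directly. Below is a minimal C sketch of such a kernel (the paper's versions are written in CRAY FORTRAN, CYBER 205 FORTRAN, and DAP FORTRAN; the function name and block bounds here are illustrative only):

        #include <stddef.h>

        /* Add a rectangular sub-block of two N x N arrays, C = A + B.
         * The inner loop touches contiguous (unit-stride) elements,
         * which is the access pattern vector hardware handles best. */
        void add_block(size_t n, const double *a, const double *b, double *c,
                       size_t row0, size_t row1, size_t col0, size_t col1)
        {
            for (size_t i = row0; i < row1; ++i) {
                const double *ai = a + i * n;
                const double *bi = b + i * n;
                double *ci = c + i * n;
                for (size_t j = col0; j < col1; ++j)
                    ci[j] = ai[j] + bi[j];   /* candidate for vectorization */
            }
        }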

  1. A performance comparison of the Cray-2 and the Cray X-MP

    NASA Technical Reports Server (NTRS)

    Schmickley, Ronald; Bailey, David H.

    1986-01-01

    A suite of thirteen large Fortran benchmark codes was run on Cray-2 and Cray X-MP supercomputers. These codes were a mix of compute-intensive scientific application programs (mostly Computational Fluid Dynamics) and some special vectorized computation exercise programs. For the general class of programs tested on the Cray-2, most of which were not specially tuned for speed, the floating point operation rates varied under a variety of system load configurations from 40 percent up to 125 percent of X-MP performance rates. It is concluded that the Cray-2, in the original system configuration studied (without memory pseudo-banking), will run untuned Fortran code, on average, at about 70 percent of X-MP speeds.

  2. Chemical calculations on Cray computers

    NASA Technical Reports Server (NTRS)

    Taylor, Peter R.; Bauschlicher, Charles W., Jr.; Schwenke, David W.

    1989-01-01

    The influence of recent developments in supercomputing on computational chemistry is discussed with particular reference to Cray computers and their pipelined vector/limited parallel architectures. After reviewing Cray hardware and software, the performance of different elementary program structures is examined, and effective methods for improving program performance are outlined. The computational strategies appropriate for obtaining optimum performance in applications to quantum chemistry and dynamics are discussed. Finally, some discussion is given of new developments and future hardware and software improvements.

  3. Optimal Full Information Synthesis for Flexible Structures Implemented on Cray Supercomputers

    NASA Technical Reports Server (NTRS)

    Lind, Rick; Balas, Gary J.

    1995-01-01

    This paper considers an algorithm for synthesis of optimal controllers for full information feedback. The synthesis procedure reduces to a single linear matrix inequality which may be solved via established convex optimization algorithms. The computational cost of the optimization is investigated. It is demonstrated that the problem dimension and corresponding matrices can become large for practical engineering problems, making the algorithm impractical on standard workstations for large-order systems. A flexible structure is presented as a design example. Control synthesis requires several days on a workstation but may be solved in a reasonable amount of time using a Cray supercomputer.

  4. FFTs in external or hierarchical memory

    NASA Technical Reports Server (NTRS)

    Bailey, David H.

    1989-01-01

    A description is given of advanced techniques for computing an ordered FFT on a computer with external or hierarchical memory. These algorithms (1) require as few as two passes through the external data set, (2) use strictly unit stride, long vector transfers between main memory and external storage, (3) require only a modest amount of scratch space in main memory, and (4) are well suited for vector and parallel computation. Performance figures are included for implementations of some of these algorithms on Cray supercomputers. Of interest is the fact that a main memory version outperforms the current Cray library FFT routines on the Cray-2, the Cray X-MP, and the Cray Y-MP systems. Using all eight processors on the Cray Y-MP, this main memory routine runs at nearly 2 Gflops.
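
    As background for such two-pass, unit-stride FFTs: when the transform length factors as N = N1*N2, the DFT splits into a batch of length-N1 transforms, a twiddle-factor multiplication, and a batch of length-N2 transforms, with transposes supplying the long unit-stride transfers. This is the generic "four-step" splitting, written below in LaTeX; it is offered as orientation and is not necessarily the exact formulation used in the paper:

        \[ X[k_1 + N_1 k_2] = \sum_{n_2=0}^{N_2-1} \omega_{N_2}^{n_2 k_2}\, \omega_{N}^{n_2 k_1} \sum_{n_1=0}^{N_1-1} x[N_2 n_1 + n_2]\, \omega_{N_1}^{n_1 k_1}, \qquad \omega_{M} = e^{-2\pi i/M}. \]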

  5. Internal computational fluid mechanics on supercomputers for aerospace propulsion systems

    NASA Technical Reports Server (NTRS)

    Andersen, Bernhard H.; Benson, Thomas J.

    1987-01-01

    The accurate calculation of three-dimensional internal flowfields for application towards aerospace propulsion systems requires computational resources available only on supercomputers. A survey is presented of three-dimensional calculations of hypersonic, transonic, and subsonic internal flowfields conducted at the Lewis Research Center. A steady state Parabolized Navier-Stokes (PNS) solution of flow in a Mach 5.0, mixed compression inlet, a Navier-Stokes solution of flow in the vicinity of a terminal shock, and a PNS solution of flow in a diffusing S-bend with vortex generators are presented and discussed. All of these calculations were performed on either the NAS Cray-2 or the Lewis Research Center Cray XMP.

  6. Multitasking domain decomposition fast Poisson solvers on the Cray Y-MP

    NASA Technical Reports Server (NTRS)

    Chan, Tony F.; Fatoohi, Rod A.

    1990-01-01

    The results of multitasking implementation of a domain decomposition fast Poisson solver on eight processors of the Cray Y-MP are presented. The object of this research is to study the performance of domain decomposition methods on a Cray supercomputer and to analyze the performance of different multitasking techniques using highly parallel algorithms. Two implementations of multitasking are considered: macrotasking (parallelism at the subroutine level) and microtasking (parallelism at the do-loop level). A conventional FFT-based fast Poisson solver is also multitasked. The results of different implementations are compared and analyzed. A speedup of over 7.4 on the Cray Y-MP running in a dedicated environment is achieved for all cases.
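
    As a present-day analogue of the two levels of multitasking described above (the original work used Cray macrotasking and microtasking directives rather than OpenMP, so this is only an illustrative sketch), loop-level and subroutine-level parallelism can be expressed as follows:

        #include <omp.h>

        /* Loop-level parallelism (the analogue of microtasking): iterations
         * of a single loop are divided among the available processors. */
        void saxpy(int n, double a, const double *x, double *y)
        {
            #pragma omp parallel for
            for (int i = 0; i < n; ++i)
                y[i] = a * x[i] + y[i];
        }

        /* Subroutine-level parallelism (the analogue of macrotasking): two
         * independent pieces of work run concurrently as separate tasks. */
        void scale_two_fields(int n, double *pressure, double *velocity)
        {
            #pragma omp parallel sections
            {
                #pragma omp section
                for (int i = 0; i < n; ++i)   /* task 1 */
                    pressure[i] *= 0.5;

                #pragma omp section
                for (int i = 0; i < n; ++i)   /* task 2 */
                    velocity[i] *= 2.0;
            }
        }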

  7. A parallel finite-difference method for computational aerodynamics

    NASA Technical Reports Server (NTRS)

    Swisshelm, Julie M.

    1989-01-01

    A finite-difference scheme for solving complex three-dimensional aerodynamic flow on parallel-processing supercomputers is presented. The method consists of a basic flow solver with multigrid convergence acceleration, embedded grid refinements, and a zonal equation scheme. Multitasking and vectorization have been incorporated into the algorithm. Results obtained include multiprocessed flow simulations from the Cray X-MP and Cray-2. Speedups as high as 3.3 for the two-dimensional case and 3.5 for segments of the three-dimensional case have been achieved on the Cray-2. The entire solver attained a factor of 2.7 improvement over its unitasked version on the Cray-2. The performance of the parallel algorithm on each machine is analyzed.

  8. Antenna pattern control using impedance surfaces

    NASA Technical Reports Server (NTRS)

    Balanis, Constantine A.; Liu, Kefeng

    1992-01-01

    During this research period, we effectively transferred existing computer codes from the CRAY supercomputer to workstation-based systems. The workstation-based version of our code preserved the accuracy of the numerical computations while giving a much better turn-around time than the CRAY supercomputer. This relieved us of heavy dependence on the supercomputer account budget and made the codes developed in this research project more feasible for applications. The analysis of pyramidal horns with impedance surfaces was our major focus during this research period. Three different modeling algorithms for analyzing lossy impedance surfaces were investigated and compared with measured data. Through this investigation, we discovered that a hybrid Fourier transform technique, which uses the eigenmodes in the stepped waveguide section and the Fourier-transformed field distributions across the stepped discontinuities for lossy impedance coatings, gives better accuracy in analyzing lossy coatings. After further refinement of the present technique, we will perform an accurate radiation pattern synthesis in the coming reporting period.

  9. Using Strassen's algorithm to accelerate the solution of linear systems

    NASA Technical Reports Server (NTRS)

    Bailey, David H.; Lee, King; Simon, Horst D.

    1990-01-01

    Strassen's algorithm for fast matrix-matrix multiplication has been implemented for matrices of arbitrary shapes on the CRAY-2 and CRAY Y-MP supercomputers. Several techniques have been used to reduce the scratch space requirement for this algorithm while simultaneously preserving a high level of performance. When the resulting Strassen-based matrix multiply routine is combined with some routines from the new LAPACK library, LU decomposition can be performed with rates significantly higher than those achieved by conventional means. We succeeded in factoring a 2048 x 2048 matrix on the CRAY Y-MP at a rate equivalent to 325 MFLOPS.
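
    For context on why a Strassen-based multiply pays off: replacing the eight block multiplications of the conventional 2 x 2 block product with seven gives the recurrence below, so the operation count grows more slowly than the classical O(n^3):

        \[ T(n) = 7\,T(n/2) + O(n^2) \;\Rightarrow\; T(n) = O\!\left(n^{\log_2 7}\right) \approx O\!\left(n^{2.81}\right). \]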

  10. FAST: A multi-processed environment for visualization of computational fluid dynamics

    NASA Technical Reports Server (NTRS)

    Bancroft, Gordon V.; Merritt, Fergus J.; Plessel, Todd C.; Kelaita, Paul G.; Mccabe, R. Kevin

    1991-01-01

    Three-dimensional, unsteady, multi-zoned fluid dynamics simulations over full scale aircraft are typical of the problems being investigated at NASA Ames' Numerical Aerodynamic Simulation (NAS) facility on Cray-2 and Cray Y-MP supercomputers. With multiple processor workstations available in the 10-30 Mflop range, we feel that these new developments in scientific computing warrant a new approach to the design and implementation of analysis tools. These larger, more complex problems create a need for new visualization techniques not possible with the existing software or systems available as of this writing. The visualization techniques will change as the supercomputing environment, and hence the scientific methods employed, evolves even further. The Flow Analysis Software Toolkit (FAST), an implementation of a software system for fluid mechanics analysis, is discussed.

  11. A parallel algorithm for generation and assembly of finite element stiffness and mass matrices

    NASA Technical Reports Server (NTRS)

    Storaasli, O. O.; Carmona, E. A.; Nguyen, D. T.; Baddourah, M. A.

    1991-01-01

    A new algorithm is proposed for parallel generation and assembly of the finite element stiffness and mass matrices. The proposed assembly algorithm is based on a node-by-node approach rather than the more conventional element-by-element approach. The new algorithm's generality and computation speed-up when using multiple processors are demonstrated for several practical applications on multi-processor Cray Y-MP and Cray 2 supercomputers.

  12. Input/output behavior of supercomputing applications

    NASA Technical Reports Server (NTRS)

    Miller, Ethan L.

    1991-01-01

    The collection and analysis of supercomputer I/O traces and their use in a collection of buffering and caching simulations are described. This serves two purposes. First, it gives a model of how individual applications running on supercomputers request file system I/O, allowing system designers to optimize I/O hardware and file system algorithms to that model. Second, the buffering simulations show what resources are needed to maximize the CPU utilization of a supercomputer given a very bursty I/O request rate. By using read-ahead and write-behind in a large solid state disk, one or two applications were sufficient to fully utilize a Cray Y-MP CPU.

  13. Supercomputer algorithms for efficient linear octree encoding of three-dimensional brain images.

    PubMed

    Berger, S B; Reis, D J

    1995-02-01

    We designed and implemented algorithms for three-dimensional (3-D) reconstruction of brain images from serial sections using two important supercomputer architectures, vector and parallel. These architectures were represented by the Cray YMP and Connection Machine CM-2, respectively. The programs operated on linear octree representations of the brain data sets, and achieved 500-800 times acceleration when compared with a conventional laboratory workstation. As the need for higher resolution data sets increases, supercomputer algorithms may offer a means of performing 3-D reconstruction well above current experimental limits.
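
    For readers unfamiliar with linear octrees: rather than storing pointer-linked nodes, each octant is identified by a locational key, and the tree is kept as a sorted list of keys, a layout that suits both vector and data-parallel machines. The C sketch below shows one common key (Morton, or Z-order, obtained by interleaving coordinate bits); it is a generic illustration and not necessarily the encoding used by the authors:

        #include <stdint.h>

        /* Spread the low 21 bits of x so that two zero bits separate
         * consecutive payload bits (standard bit-dilation constants). */
        static uint64_t dilate3(uint64_t x)
        {
            x &= 0x1fffff;                               /* keep 21 bits */
            x = (x | x << 32) & 0x1f00000000ffffULL;
            x = (x | x << 16) & 0x1f0000ff0000ffULL;
            x = (x | x << 8)  & 0x100f00f00f00f00fULL;
            x = (x | x << 4)  & 0x10c30c30c30c30c3ULL;
            x = (x | x << 2)  & 0x1249249249249249ULL;
            return x;
        }

        /* Morton key: interleave the bits of x, y, z so that spatially
         * nearby voxels tend to receive nearby keys in the sorted list. */
        uint64_t morton3d(uint32_t x, uint32_t y, uint32_t z)
        {
            return dilate3(x) | (dilate3(y) << 1) | (dilate3(z) << 2);
        }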

  14. Solving large sparse eigenvalue problems on supercomputers

    NASA Technical Reports Server (NTRS)

    Philippe, Bernard; Saad, Youcef

    1988-01-01

    An important problem in scientific computing consists in finding a few eigenvalues and corresponding eigenvectors of a very large and sparse matrix. The most popular methods to solve these problems are based on projection techniques on appropriate subspaces. The main attraction of these methods is that they only require the use of the matrix in the form of matrix by vector multiplications. The implementations on supercomputers of two such methods for symmetric matrices, namely Lanczos' method and Davidson's method, are compared. Since one of the most important operations in these two methods is the multiplication of vectors by the sparse matrix, methods of performing this operation efficiently are discussed. The advantages and the disadvantages of each method are compared and implementation aspects are discussed. Numerical experiments on a one processor CRAY 2 and CRAY X-MP are reported. Possible parallel implementations are also discussed.
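
    For a symmetric matrix A, the heart of the Lanczos process referred to above is a three-term recurrence whose only large-scale operation is the sparse matrix-vector product, which is why that kernel dominates the implementations discussed:

        \[ \beta_{j+1} q_{j+1} = A q_j - \alpha_j q_j - \beta_j q_{j-1}, \qquad \alpha_j = q_j^{T} A q_j, \]

    and the eigenvalues of the resulting tridiagonal matrix (with diagonal entries \alpha_j and off-diagonal entries \beta_j) approximate the extremal eigenvalues of A.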

  15. SNS programming environment user's guide

    NASA Technical Reports Server (NTRS)

    Tennille, Geoffrey M.; Howser, Lona M.; Humes, D. Creig; Cronin, Catherine K.; Bowen, John T.; Drozdowski, Joseph M.; Utley, Judith A.; Flynn, Theresa M.; Austin, Brenda A.

    1992-01-01

    The computing environment is briefly described for the Supercomputing Network Subsystem (SNS) of the Central Scientific Computing Complex of NASA Langley. The major SNS computers are a CRAY-2, a CRAY Y-MP, a CONVEX C-210, and a CONVEX C-220. The software is described that is common to all of these computers, including: the UNIX operating system, computer graphics, networking utilities, mass storage, and mathematical libraries. Also described is file management, validation, SNS configuration, documentation, and customer services.

  16. Scaling of data communications for an advanced supercomputer network

    NASA Technical Reports Server (NTRS)

    Levin, E.; Eaton, C. K.; Young, Bruce

    1986-01-01

    The goal of NASA's Numerical Aerodynamic Simulation (NAS) Program is to provide a powerful computational environment for advanced research and development in aeronautics and related disciplines. The present NAS system consists of a Cray 2 supercomputer connected by a data network to a large mass storage system, to sophisticated local graphics workstations and by remote communication to researchers throughout the United States. The program plan is to continue acquiring the most powerful supercomputers as they become available. The implications of a projected 20-fold increase in processing power on the data communications requirements are described.

  17. Engineering PFLOTRAN for Scalable Performance on Cray XT and IBM BlueGene Architectures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mills, Richard T; Sripathi, Vamsi K; Mahinthakumar, Gnanamanika

    We describe PFLOTRAN - a code for simulation of coupled hydro-thermal-chemical processes in variably saturated, non-isothermal, porous media - and the approaches we have employed to obtain scalable performance on some of the largest scale supercomputers in the world. We present detailed analyses of I/O and solver performance on Jaguar, the Cray XT5 at Oak Ridge National Laboratory, and Intrepid, the IBM BlueGene/P at Argonne National Laboratory, that have guided our choice of algorithms.

  18. A performance comparison of scalar, vector, and concurrent vector computers including supercomputers for modeling transport of reactive contaminants in groundwater

    NASA Astrophysics Data System (ADS)

    Tripathi, Vijay S.; Yeh, G. T.

    1993-06-01

    Sophisticated and highly computation-intensive models of transport of reactive contaminants in groundwater have been developed in recent years. Application of such models to real-world contaminant transport problems, e.g., simulation of groundwater transport of 10-15 chemically reactive elements (e.g., toxic metals) and relevant complexes and minerals in two and three dimensions over a distance of several hundred meters, requires high-performance computers including supercomputers. Although not widely recognized as such, the computational complexity and demand of these models compare with well-known computation-intensive applications including weather forecasting and quantum chemical calculations. A survey of the performance of a variety of available hardware, as measured by the run times for a reactive transport model HYDROGEOCHEM, showed that while supercomputers provide the fastest execution times for such problems, relatively low-cost reduced instruction set computer (RISC) based scalar computers provide the best performance-to-price ratio. Because supercomputers like the Cray X-MP are inherently multiuser resources, often the RISC computers also provide much better turnaround times. Furthermore, RISC-based workstations provide the best platforms for "visualization" of groundwater flow and contaminant plumes. The most notable result, however, is that current workstations costing less than $10,000 provide performance within a factor of 5 of a Cray X-MP.

  19. Understanding the Cray X1 System

    NASA Technical Reports Server (NTRS)

    Cheung, Samson

    2004-01-01

    This paper helps the reader understand the characteristics of the Cray X1 vector supercomputer system, and provides hints and information to enable the reader to port codes to the system. It provides a comparison between the basic performance of the X1 platform and other platforms that are available at NASA Ames Research Center. A set of codes, solving the Laplacian equation with different parallel paradigms, is used to understand some features of the X1 compiler. An example code from the NAS Parallel Benchmarks is used to demonstrate performance optimization on the X1 platform.

  20. NAS (Numerical Aerodynamic Simulation Program) technical summaries, March 1989 - February 1990

    NASA Technical Reports Server (NTRS)

    1990-01-01

    Given here are selected scientific results from the Numerical Aerodynamic Simulation (NAS) Program's third year of operation. During this year, the scientific community was given access to a Cray-2 and a Cray Y-MP supercomputer. Topics covered include flow field analysis of fighter wing configurations, large-scale ocean modeling, the Space Shuttle flow field, advanced computational fluid dynamics (CFD) codes for rotary-wing airloads and performance prediction, turbulence modeling of separated flows, airloads and acoustics of rotorcraft, vortex-induced nonlinearities on submarines, and standing oblique detonation waves.

  1. Heart Fibrillation and Parallel Supercomputers

    NASA Technical Reports Server (NTRS)

    Kogan, B. Y.; Karplus, W. J.; Chudin, E. E.

    1997-01-01

    The Luo and Rudy 3 cardiac cell mathematical model is implemented on the parallel supercomputer CRAY T3D. The splitting algorithm, combined with a variable time step and an explicit method of integration, provides reasonable solution times and almost perfect scaling for rectilinear wave propagation. The computer simulation makes it possible to observe new phenomena: the break-up of spiral waves caused by intracellular calcium dynamics, and the non-uniformity of the calcium distribution in space during the onset of the spiral wave.

  2. EFFECTS OF TUMORS ON INHALED PHARMACOLOGIC DRUGS: II. PARTICLE MOTION

    EPA Science Inventory

    ABSTRACT

    Computer simulations were conducted to describe drug particle motion in human lung bifurcations with tumors. The computations used FIDAP with a Cray T90 supercomputer. The objective was to better understand particle behavior as affected by particle characteristics...

  3. Evaluating the networking characteristics of the Cray XC-40 Intel Knights Landing-based Cori supercomputer at NERSC

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Doerfler, Douglas; Austin, Brian; Cook, Brandon

    There are many potential issues associated with deploying the Intel Xeon Phi™ (code named Knights Landing [KNL]) manycore processor in a large-scale supercomputer. One in particular is the ability to fully utilize the high-speed communications network, given that the serial performance of a Xeon Phi™ core is a fraction of a Xeon® core. In this paper, we take a look at the trade-offs associated with allocating enough cores to fully utilize the Aries high-speed network versus cores dedicated to computation, e.g., the trade-off between MPI and OpenMP. In addition, we evaluate new features of Cray MPI in support of KNL, such as internode optimizations. We also evaluate one-sided programming models such as Unified Parallel C. We quantify the impact of the above trade-offs and features using a suite of National Energy Research Scientific Computing Center applications.
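
    To make the MPI-versus-OpenMP trade-off concrete: a hybrid code varies the number of MPI ranks per node against the number of OpenMP threads per rank, and the sweet spot depends on how much concurrency the network needs. The C sketch below is a generic illustration of that structure, not code from the paper; the problem size and work loop are placeholders:

        #include <mpi.h>
        #include <omp.h>
        #include <stdio.h>

        /* Hybrid decomposition: MPI ranks split the index range, and
         * OpenMP threads split each rank's share of the loop. Fewer
         * ranks with more threads reduces messages per node; more
         * ranks increases injection concurrency on the network. */
        int main(int argc, char **argv)
        {
            int provided, rank, nranks;
            MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &nranks);

            const long n = 1L << 24;                /* illustrative size */
            long lo = rank * (n / nranks);
            long hi = (rank == nranks - 1) ? n : lo + n / nranks;

            double local = 0.0, total = 0.0;
            #pragma omp parallel for reduction(+ : local)
            for (long i = lo; i < hi; ++i)
                local += 1.0 / (double)(i + 1);     /* stand-in for work */

            MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
            if (rank == 0)
                printf("ranks=%d threads=%d sum=%f\n",
                       nranks, omp_get_max_threads(), total);
            MPI_Finalize();
            return 0;
        }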

  4. Enabling Diverse Software Stacks on Supercomputers using High Performance Virtual Clusters.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Younge, Andrew J.; Pedretti, Kevin; Grant, Ryan

    While large-scale simulations have been the hallmark of the High Performance Computing (HPC) community for decades, Large Scale Data Analytics (LSDA) workloads are gaining attention within the scientific community not only as a processing component to large HPC simulations, but also as standalone scientific tools for knowledge discovery. With the path towards Exascale, new HPC runtime systems are also emerging in a way that differs from classical distributed computing models. However, system software for such capabilities on the latest extreme-scale DOE supercomputers needs to be enhanced to more appropriately support these types of emerging software ecosystems. In this paper, we propose the use of Virtual Clusters on advanced supercomputing resources to enable systems to support not only HPC workloads, but also emerging big data stacks. Specifically, we have deployed the KVM hypervisor within Cray's Compute Node Linux on an XC-series supercomputer testbed. We also use libvirt and QEMU to manage and provision VMs directly on compute nodes, leveraging Ethernet-over-Aries network emulation. To our knowledge, this is the first known use of KVM on a true MPP supercomputer. We investigate the overhead of our solution using HPC benchmarks, both evaluating single-node performance as well as weak scaling of a 32-node virtual cluster. Overall, we find that single-node performance of our solution using KVM on a Cray is very efficient, with near-native performance. However, overhead increases by up to 20% as virtual cluster size increases, due to limitations of the Ethernet-over-Aries bridged network. Furthermore, we deploy Apache Spark with large data analysis workloads in a Virtual Cluster, effectively demonstrating how diverse software ecosystems can be supported by High Performance Virtual Clusters.

  5. The CSM testbed software system: A development environment for structural analysis methods on the NAS CRAY-2

    NASA Technical Reports Server (NTRS)

    Gillian, Ronnie E.; Lotts, Christine G.

    1988-01-01

    The Computational Structural Mechanics (CSM) Activity at Langley Research Center is developing methods for structural analysis on modern computers. To facilitate that research effort, an applications development environment has been constructed to insulate the researcher from the many computer operating systems of a widely distributed computer network. The CSM Testbed development system was ported to the Numerical Aerodynamic Simulator (NAS) Cray-2, at the Ames Research Center, to provide a high end computational capability. This paper describes the implementation experiences, the resulting capability, and the future directions for the Testbed on supercomputers.

  6. RISC Processors and High Performance Computing

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Bailey, David H.; Lasinski, T. A. (Technical Monitor)

    1995-01-01

    In this tutorial, we will discuss the top five current RISC microprocessors: the IBM Power2, which is used in the IBM RS6000/590 workstation and in the IBM SP2 parallel supercomputer; the DEC Alpha, which is in the DEC Alpha workstation and in the Cray T3D; the MIPS R8000, which is used in the SGI Power Challenge; the HP PA-RISC 7100, which is used in the HP 700 series workstations and in the Convex Exemplar; and the Cray proprietary processor, which is used in the new Cray J916. The architecture of these microprocessors will first be presented. The effective performance of these processors will then be compared, both by citing standard benchmarks and also in the context of implementing real applications. In the process, different programming models such as data parallel (CM Fortran and HPF) and message passing (PVM and MPI) will be introduced and compared. The latest NAS Parallel Benchmark (NPB) absolute performance and performance per dollar figures will be presented. The next generation of the NPB will also be described. The tutorial will conclude with a discussion of general trends in the field of high performance computing, including likely future developments in hardware and software technology, and the relative roles of vector supercomputers, tightly coupled parallel computers, and clusters of workstations. This tutorial will provide a unique cross-machine comparison not available elsewhere.

  7. Efficient multitasking of Choleski matrix factorization on CRAY supercomputers

    NASA Technical Reports Server (NTRS)

    Overman, Andrea L.; Poole, Eugene L.

    1991-01-01

    A Choleski method is described and used to solve linear systems of equations that arise in large scale structural analysis. The method uses a novel variable-band storage scheme and is structured to exploit fast local memory caches while minimizing data access delays between main memory and vector registers. Several parallel implementations of this method are described for the CRAY-2 and CRAY Y-MP computers demonstrating the use of microtasking and autotasking directives. A portable parallel language, FORCE, is used for comparison with the microtasked and autotasked implementations. Results are presented comparing the matrix factorization times for three representative structural analysis problems from runs made in both dedicated and multi-user modes on both computers. CPU and wall clock timings are given for the parallel implementations and are compared to single processor timings of the same algorithm.
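
    For reference, the Choleski factorization computed by such solvers writes the symmetric positive-definite system matrix as A = L L^T, with the entries of L obtained column by column from

        \[ \ell_{jj} = \sqrt{a_{jj} - \sum_{k=1}^{j-1} \ell_{jk}^{2}}, \qquad \ell_{ij} = \frac{1}{\ell_{jj}} \left( a_{ij} - \sum_{k=1}^{j-1} \ell_{ik}\, \ell_{jk} \right), \quad i > j; \]

    variable-band storage schemes typically keep, for each column, only the entries between the first nonzero and the diagonal.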

  8. Supercomputer analysis of purine and pyrimidine metabolism leading to DNA synthesis.

    PubMed

    Heinmets, F

    1989-06-01

    A model system is established to analyze purine and pyrimidine metabolism leading to DNA synthesis. The principal aim is to explore the flow and regulation of terminal deoxynucleoside triphosphates (dNTPs) under various input and parametric conditions. A series of flow equations is established, which are subsequently converted to differential equations. These are programmed (in Fortran) and analyzed on a Cray X-MP/48 supercomputer. The pool concentrations are presented as a function of time under conditions in which various pertinent parameters of the system are modified. The system is formulated as 100 differential equations.
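
    The conversion of flow equations into differential equations described above amounts to writing a mass balance for each metabolite pool. Schematically (a generic form for orientation, not the paper's specific 100-equation system):

        \[ \frac{dC_i}{dt} = \sum_{j} v_{j \to i}(\mathbf{C}) - \sum_{k} v_{i \to k}(\mathbf{C}), \]

    where C_i is the concentration of pool i and the v terms are the flow rates into and out of that pool.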

  9. TOP500 Sublist for November 2001

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack J.

    2001-11-09

    18th Edition of TOP500 List of World's Fastest Supercomputers Released MANNHEIM, GERMANY; KNOXVILLE, TENN.; BERKELEY, CALIF. In what has become a much-anticipated event in the world of high-performance computing, the 18th edition of the TOP500 list of the world's fastest supercomputers was released today (November 9, 2001). The latest edition of the twice-yearly ranking finds IBM as the leader in the field, with 32 percent in terms of installed systems and 37 percent in terms of total performance of all the installed systems. In a surprise move Hewlett-Packard captured the second place with 30 percent of the systems. Most of these systems are smaller in size and as a consequence HP's share of installed performance is smaller with 15 percent. This is still enough for second place in this category. SGI, Cray and Sun follow in the number of TOP500 systems with 41 (8 percent), 39 (8 percent), and 31 (6 percent) respectively. In the category of installed performance Cray Inc. keeps the third position with 11 percent ahead of SGI (8 percent) and Compaq (8 percent).

  10. Barrier-breaking performance for industrial problems on the CRAY C916

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Graffunder, S.K.

    1993-12-31

    Nine applications, including third-party codes, were submitted to the Gordon Bell Prize committee showing the CRAY C916 supercomputer providing record-breaking time to solution for industrial problems in several disciplines. Performance was obtained by balancing raw hardware speed; effective use of large, real, shared memory; compiler vectorization and autotasking; hand optimization; asynchronous I/O techniques; and new algorithms. The highest GFLOPS performance for the submissions was 11.1 GFLOPS out of a peak advertised performance of 16 GFLOPS for the CRAY C916 system. One program achieved a 15.45 speedup from the compiler with just two hand-inserted directives to scope variables properly for the mathematical library. New I/O techniques hide tens of gigabytes of I/O behind parallel computations. Finally, new iterative solver algorithms have demonstrated times to solution on 1 CPU as high as 70 times faster than the best direct solvers.

  11. Performance of the fusion code GYRO on four generations of Cray computers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fahey, Mark R

    2014-01-01

    GYRO is a code used for the direct numerical simulation of plasma microturbulence. It has been ported to a variety of modern MPP platforms including several modern commodity clusters, IBM SPs, and Cray XC, XT, and XE series machines. We briefly describe the mathematical structure of the equations, the data layout, and the redistribution scheme. Also, while the performance and scaling of GYRO on many of these systems has been shown before, here we show the comparative performance and scaling on four generations of Cray supercomputers including the newest addition - the Cray XC30. The more recently added hybrid OpenMP/MPI implementation also shows a great deal of promise on custom HPC systems that utilize fast CPUs and proprietary interconnects. Four machines of varying sizes were used in the experiment, all of which are located at the National Institute for Computational Sciences at the University of Tennessee at Knoxville and Oak Ridge National Laboratory. The advantages, limitations, and performance of using each system are discussed.

  12. Gigaflop performance on a CRAY-2: Multitasking a computational fluid dynamics application

    NASA Technical Reports Server (NTRS)

    Tennille, Geoffrey M.; Overman, Andrea L.; Lambiotte, Jules J.; Streett, Craig L.

    1991-01-01

    The methodology is described for converting a large, long-running applications code that executed on a single processor of a CRAY-2 supercomputer to a version that executed efficiently on multiple processors. Although the conversion of every application is different, a discussion of the types of modification used to achieve gigaflop performance is included to assist others in the parallelization of applications for CRAY computers, especially those that were developed for other computers. An existing application, from the discipline of computational fluid dynamics, that had utilized over 2000 hrs of CPU time on CRAY-2 during the previous year was chosen as a test case to study the effectiveness of multitasking on a CRAY-2. The nature of dominant calculations within the application indicated that a sustained computational rate of 1 billion floating-point operations per second, or 1 gigaflop, might be achieved. The code was first analyzed and modified for optimal performance on a single processor in a batch environment. After optimal performance on a single CPU was achieved, the code was modified to use multiple processors in a dedicated environment. The results of these two efforts were merged into a single code that had a sustained computational rate of over 1 gigaflop on a CRAY-2. Timings and analysis of performance are given for both single- and multiple-processor runs.

  13. Supercomputers for engineering analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goudreau, G.L.; Benson, D.J.; Hallquist, J.O.

    1986-07-01

    The Cray-1 and Cray X-MP/48 experience in engineering computations at the Lawrence Livermore National Laboratory is surveyed. The fully vectorized explicit DYNA and implicit NIKE finite element codes are discussed with respect to solid and structural mechanics. The main efficiencies for production analyses are currently obtained by simple CFT compiler exploitation of pipeline architecture for inner do-loop optimization. Current development of outer-loop multitasking is also discussed. Applications emphasis will be on 3D examples spanning earth penetrator loads analysis, target lethality assessment, and crashworthiness. The use of a vectorized large deformation shell element in both DYNA and NIKE has substantially expanded 3D nonlinear capability. 25 refs., 7 figs.

  14. A vectorized Lanczos eigensolver for high-performance computers

    NASA Technical Reports Server (NTRS)

    Bostic, Susan W.

    1990-01-01

    The computational strategies used to implement a Lanczos-based-method eigensolver on the latest generation of supercomputers are described. Several examples of structural vibration and buckling problems are presented that show the effects of using optimization techniques to increase the vectorization of the computational steps. The data storage and access schemes and the tools and strategies that best exploit the computer resources are presented. The method is implemented on the Convex C220, the Cray 2, and the Cray Y-MP computers. Results show that very good computation rates are achieved for the most computationally intensive steps of the Lanczos algorithm and that the Lanczos algorithm is many times faster than other methods extensively used in the past.

  15. Particle simulation on heterogeneous distributed supercomputers

    NASA Technical Reports Server (NTRS)

    Becker, Jeffrey C.; Dagum, Leonardo

    1993-01-01

    We describe the implementation and performance of a three dimensional particle simulation distributed between a Thinking Machines CM-2 and a Cray Y-MP. These are connected by a combination of two high-speed networks: a high-performance parallel interface (HIPPI) and an optical network (UltraNet). This is the first application to use this configuration at NASA Ames Research Center. We describe our experience implementing and using the application and report the results of several timing measurements. We show that the distribution of applications across disparate supercomputing platforms is feasible and has reasonable performance. In addition, several practical aspects of the computing environment are discussed.

  16. Accelerating Virtual High-Throughput Ligand Docking: current technology and case study on a petascale supercomputer.

    PubMed

    Ellingson, Sally R; Dakshanamurthy, Sivanesan; Brown, Milton; Smith, Jeremy C; Baudry, Jerome

    2014-04-25

    In this paper we give the current state of high-throughput virtual screening. We describe a case study of using a task-parallel MPI (Message Passing Interface) version of Autodock4 [1], [2] to run a virtual high-throughput screen of one-million compounds on the Jaguar Cray XK6 Supercomputer at Oak Ridge National Laboratory. We include a description of scripts developed to increase the efficiency of the predocking file preparation and postdocking analysis. A detailed tutorial, scripts, and source code for this MPI version of Autodock4 are available online at http://www.bio.utk.edu/baudrylab/autodockmpi.htm.

  17. Performance of a Bounce-Averaged Global Model of Super-Thermal Electron Transport in the Earth's Magnetic Field

    NASA Technical Reports Server (NTRS)

    McGuire, Tim

    1998-01-01

    In this paper, we report the results of our recent research on the application of a multiprocessor Cray T916 supercomputer in modeling super-thermal electron transport in the Earth's magnetic field. In general, this mathematical model requires numerical solution of a system of partial differential equations. The code we use for this model is moderately vectorized. By using Amdahl's Law for vector processors, it can be verified that the code is about 60% vectorized on a Cray computer. Speedup factors on the order of 2.5 were obtained compared to the unvectorized code. In the following sections, we discuss the methodology of improving the code. In addition to our goal of optimizing the code for solution on the Cray computer, we had the goal of scalability in mind. Scalability combines the concepts of portability with near-linear speedup. Specifically, a scalable program is one whose performance is portable across many different architectures with differing numbers of processors for many different problem sizes. Though we have access to a Cray at this time, the goal was to also have code which would run well on a variety of architectures.
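
    The figures quoted above (about 60% vectorized, speedup on the order of 2.5) are consistent with Amdahl's Law, which caps the overall speedup at 1/(1-f) when only a fraction f of the work is accelerated. A short illustrative calculation follows; the formula is standard, and the range of assumed vector-unit speedups is hypothetical.

    ```python
    # Amdahl's Law for a partially vectorized code: if a fraction f of the work runs
    # on the vector unit with speedup s, the overall speedup is 1 / ((1 - f) + f / s).
    # With f = 0.6, the limit as s grows is 1 / 0.4 = 2.5, matching the observed ~2.5x.
    f = 0.60
    for s in (2.0, 4.0, 8.0, 16.0, float("inf")):
        overall = 1.0 / ((1.0 - f) + f / s)
        print(f"vector-unit speedup {s:>5}: overall speedup {overall:.2f}")
    ```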

  18. Fast and Accurate Simulation of the Cray XMT Multithreaded Supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Villa, Oreste; Tumeo, Antonino; Secchi, Simone

    Irregular applications, such as data mining and analysis or graph-based computations, show unpredictable memory/network access patterns and control structures. Highly multithreaded architectures with large processor counts, like the Cray MTA-1, MTA-2 and XMT, appear to address their requirements better than commodity clusters. However, the research on highly multithreaded systems is currently limited by the lack of adequate architectural simulation infrastructures due to issues such as size of the machines, memory footprint, simulation speed, accuracy and customization. At the same time, Shared-memory MultiProcessors (SMPs) with multi-core processors have become an attractive platform to simulate large scale machines. In this paper, we introduce a cycle-level simulator of the highly multithreaded Cray XMT supercomputer. The simulator runs unmodified XMT applications. We discuss how we tackled the challenges posed by its development, detailing the techniques introduced to make the simulation as fast as possible while maintaining a high accuracy. By mapping XMT processors (ThreadStorm with 128 hardware threads) to host computing cores, the simulation speed remains constant as the number of simulated processors increases, up to the number of available host cores. The simulator supports zero-overhead switching among different accuracy levels at run-time and includes a network model that takes into account contention. On a modern 48-core SMP host, our infrastructure simulates a large set of irregular applications 500 to 2000 times slower than real time when compared to a 128-processor XMT, while remaining within 10% of accuracy. Emulation is only from 25 to 200 times slower than real time.

  19. Research on Spectroscopy, Opacity, and Atmospheres

    NASA Technical Reports Server (NTRS)

    Kurucz, Robert L.

    1999-01-01

    A web site has been set up to make the calculations accessible (i.e., cfakus.harvard.edu). This data can also be accessed by FTP. It has all of the atomic and diatomic molecular data, tables of distribution function opacities, grids of model atmospheres, colors, fluxes, etc., programs that are ready for distribution, and most of the recent papers developed during this grant. Atlases and computed spectra will be added as they are completed. New atomic and molecular calculations will be added as they are completed. The atomic programs that had been running on a Cray at the San Diego Supercomputer Center can now run on the Vaxes and Alpha. The work started with Ni and Co because there were new laboratory analyses that included isotopic and hyperfine splitting. Those calculations are described in the appended abstract for the 6th Atomic Spectroscopy and Oscillator Strengths meeting in Victoria last summer. A surprising finding is that quadrupole transitions have been grossly in error because mixing with higher levels has not been included. All levels up through n=9 for Fe I and II, the spectra for which the most information is available, are now included. After Fe I and Fe II, all other spectra are "easy". ATLAS12, the opacity sampling program for computing models with arbitrary abundances, has been put on the web server. A new distribution function opacity program for workstations that replaces the one used on the Cray at the San Diego Supercomputer Center has been written. Each set of abundances would take 100 Cray hours costing $100,000.

  20. Performance analysis of three dimensional integral equation computations on a massively parallel computer. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    Logan, Terry G.

    1994-01-01

    The purpose of this study is to investigate the performance of integral equation computations using a numerical source field-panel method in a massively parallel processing (MPP) environment. A comparative study of the computational performance of the MPP CM-5 computer and the conventional Cray Y-MP supercomputer for a three-dimensional flow problem is made. A serial FORTRAN code is converted into a parallel CM-FORTRAN code. Performance results are obtained on the CM-5 with 32, 64, and 128 nodes, along with those on the Cray Y-MP with a single processor. The comparison of the performance indicates that the parallel CM-FORTRAN code nearly matches or outperforms the equivalent serial FORTRAN code for some cases.

  1. ATLAS and LHC computing on CRAY

    NASA Astrophysics Data System (ADS)

    Sciacca, F. G.; Haug, S.; ATLAS Collaboration

    2017-10-01

    Access and exploitation of large scale computing resources, such as those offered by general purpose HPC centres, is one important measure for ATLAS and the other Large Hadron Collider experiments in order to meet the challenge posed by the full exploitation of the future data within the constraints of flat budgets. We report on the effort of moving the Swiss WLCG T2 computing, serving ATLAS, CMS and LHCb, from a dedicated cluster to the large Cray systems at the Swiss National Supercomputing Centre CSCS. These systems not only offer very efficient hardware, cooling and highly competent operators, but also have large backfill potential due to their size and multidisciplinary usage, and potential gains due to economies of scale. Technical solutions, performance, expected return and future plans are discussed.

  2. A Performance Evaluation of the Cray X1 for Scientific Applications

    NASA Technical Reports Server (NTRS)

    Oliker, Leonid; Biswas, Rupak; Borrill, Julian; Canning, Andrew; Carter, Jonathan; Djomehri, M. Jahed; Shan, Hongzhang; Skinner, David

    2004-01-01

    The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to build high-end capability and capacity computers because of their generality, scalability, and cost effectiveness. However, the recent development of massively parallel vector systems is having a significant effect on the supercomputing landscape. In this paper, we compare the performance of the recently released Cray X1 vector system with that of the cacheless NEC SX-6 vector machine, and the superscalar cache-based IBM Power3 and Power4 architectures for scientific applications. Overall results demonstrate that the X1 is quite promising, but performance improvements are expected as the hardware, systems software, and numerical libraries mature. Code reengineering to effectively utilize the complex architecture may also lead to significant efficiency enhancements.

  3. Climate Data Assimilation on a Massively Parallel Supercomputer

    NASA Technical Reports Server (NTRS)

    Ding, Hong Q.; Ferraro, Robert D.

    1996-01-01

    We have designed and implemented a set of highly efficient and highly scalable algorithms for an unstructured computational package, the PSAS data assimilation package, as demonstrated by detailed performance analysis of systematic runs on up to 512 nodes of an Intel Paragon. The preconditioned Conjugate Gradient solver achieves a sustained 18 Gflops performance. Consequently, we achieve an unprecedented 100-fold reduction in time to solution on the Intel Paragon over a single head of a Cray C90. This not only exceeds the daily performance requirement of the Data Assimilation Office at NASA's Goddard Space Flight Center, but also makes it possible to explore much larger and more challenging data assimilation problems which are unthinkable on a traditional computer platform such as the Cray C90.
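
    The solver at the heart of the performance figures above is a preconditioned conjugate gradient (PCG) iteration. The sketch below shows the generic PCG loop for a symmetric positive definite system; the Jacobi (diagonal) preconditioner and the small dense test matrix are placeholders for illustration and are not the PSAS implementation, which operates on a large unstructured problem.

    ```python
    # Generic preconditioned conjugate gradient (PCG) loop; the Jacobi (diagonal)
    # preconditioner here is only a placeholder for a problem-specific one.
    import numpy as np

    def pcg(A, b, M_inv, tol=1e-8, max_iter=500):
        x = np.zeros_like(b)
        r = b - A @ x
        z = M_inv(r)
        p = z.copy()
        rz = r @ z
        for _ in range(max_iter):
            Ap = A @ p
            alpha = rz / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            if np.linalg.norm(r) < tol * np.linalg.norm(b):
                break
            z = M_inv(r)
            rz_new = r @ z
            p = z + (rz_new / rz) * p
            rz = rz_new
        return x

    rng = np.random.default_rng(1)
    Q = rng.standard_normal((300, 300))
    A = Q @ Q.T + 300 * np.eye(300)          # symmetric positive definite test matrix
    b = rng.standard_normal(300)
    x = pcg(A, b, lambda r: r / np.diag(A))  # Jacobi preconditioner
    print(np.linalg.norm(A @ x - b))         # residual should be tiny
    ```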

  4. High Performance Distributed Computing in a Supercomputer Environment: Computational Services and Applications Issues

    NASA Technical Reports Server (NTRS)

    Kramer, Williams T. C.; Simon, Horst D.

    1994-01-01

    This tutorial is intended as a practical guide for the uninitiated to the main topics and themes of high-performance computing (HPC), with particular emphasis on distributed computing. The intent is first to provide some guidance and directions in the rapidly growing field of scientific computing using both massively parallel and traditional supercomputers. Because of their considerable potential computational power, loosely or tightly coupled clusters of workstations are increasingly considered as a third alternative to both the more conventional supercomputers based on a small number of powerful vector processors and massively parallel processors. Even though many research issues concerning the effective use of workstation clusters and their integration into a large scale production facility are still unresolved, such clusters are already used for production computing. In this tutorial we will utilize the unique experience gained at the NAS facility at NASA Ames Research Center. Over the last five years at NAS, massively parallel supercomputers such as the Connection Machines CM-2 and CM-5 from Thinking Machines Corporation and the iPSC/860 (Touchstone Gamma Machine) and Paragon machines from Intel were used in a production supercomputer center alongside traditional vector supercomputers such as the Cray Y-MP and C90.

  5. High performance computing applications in neurobiological research

    NASA Technical Reports Server (NTRS)

    Ross, Muriel D.; Cheng, Rei; Doshay, David G.; Linton, Samuel W.; Montgomery, Kevin; Parnas, Bruce R.

    1994-01-01

    The human nervous system is a massively parallel processor of information. The vast numbers of neurons, synapses and circuits are daunting to those seeking to understand the neural basis of consciousness and intellect. Pervasive obstacles are the lack of knowledge of the detailed, three-dimensional (3-D) organization of even a simple neural system and the paucity of large scale, biologically relevant computer simulations. We use high performance graphics workstations and supercomputers to study the 3-D organization of gravity sensors as a prototype architecture foreshadowing more complex systems. Scaled-down simulations run on a Silicon Graphics workstation and scaled-up, three-dimensional versions run on the Cray Y-MP and CM-5 supercomputers.

  6. Close to real life. [solving for transonic flow about lifting airfoils using supercomputers

    NASA Technical Reports Server (NTRS)

    Peterson, Victor L.; Bailey, F. Ron

    1988-01-01

    NASA's Numerical Aerodynamic Simulation (NAS) facility for CFD modeling of highly complex aerodynamic flows employs as its basic hardware two Cray-2s, an ETA-10 Model Q, an Amdahl 5880 mainframe computer that furnishes both support processing and access to 300 Gbytes of disk storage, several minicomputers and superminicomputers, and a Thinking Machines 16,000-device 'Connection Machine' processor. NAS, which was the first supercomputer facility to standardize operating-system and communication software on all processors, has done important Space Shuttle aerodynamics simulations and will be critical to the configurational refinement of the National Aerospace Plane and its integrated powerplant, which will involve complex, high temperature reactive gasdynamic computations.

  7. CONVEX mini manual

    NASA Technical Reports Server (NTRS)

    Tennille, Geoffrey M.; Howser, Lona M.

    1993-01-01

    The use of the CONVEX computers that are an integral part of the Supercomputing Network Subsystems (SNS) of the Central Scientific Computing Complex of LaRC is briefly described. Features of the CONVEX computers that are significantly different from the CRAY supercomputers are covered, including: FORTRAN, C, architecture of the CONVEX computers, the CONVEX environment, batch job submittal, debugging, performance analysis, utilities unique to CONVEX, and documentation. This revision reflects the addition of the Applications Compiler and the X-based debugger, CXdb. The document is intended for all CONVEX users as a ready reference to frequently asked questions and to more detailed information contained in the vendor manuals. It is appropriate for both the novice and the experienced user.

  8. A Performance Evaluation of the Cray X1 for Scientific Applications

    NASA Technical Reports Server (NTRS)

    Oliker, Leonid; Biswas, Rupak; Borrill, Julian; Canning, Andrew; Carter, Jonathan; Djomehri, M. Jahed; Shan, Hongzhang; Skinner, David

    2003-01-01

    The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to build high-end capability and capacity computers because of their generality, scalability, and cost effectiveness. However, the recent development of massively parallel vector systems is having a significant effect on the supercomputing landscape. In this paper, we compare the performance of the recently-released Cray X1 vector system with that of the cacheless NEC SX-6 vector machine, and the superscalar cache-based IBM Power3 and Power4 architectures for scientific applications. Overall results demonstrate that the X1 is quite promising, but performance improvements are expected as the hardware, systems software, and numerical libraries mature. Code reengineering to effectively utilize the complex architecture may also lead to significant efficiency enhancements.

  9. Application of high-performance computing to numerical simulation of human movement

    NASA Technical Reports Server (NTRS)

    Anderson, F. C.; Ziegler, J. M.; Pandy, M. G.; Whalen, R. T.

    1995-01-01

    We have examined the feasibility of using massively-parallel and vector-processing supercomputers to solve large-scale optimization problems for human movement. Specifically, we compared the computational expense of determining the optimal controls for the single support phase of gait using a conventional serial machine (SGI Iris 4D25), a MIMD parallel machine (Intel iPSC/860), and a parallel-vector-processing machine (Cray Y-MP 8/864). With the human body modeled as a 14 degree-of-freedom linkage actuated by 46 musculotendinous units, computation of the optimal controls for gait could take up to 3 months of CPU time on the Iris. Both the Cray and the Intel are able to reduce this time to practical levels. The optimal solution for gait can be found with about 77 hours of CPU on the Cray and with about 88 hours of CPU on the Intel. Although the overall speeds of the Cray and the Intel were found to be similar, the unique capabilities of each machine are better suited to different portions of the computational algorithm used. The Intel was best suited to computing the derivatives of the performance criterion and the constraints whereas the Cray was best suited to parameter optimization of the controls. These results suggest that the ideal computer architecture for solving very large-scale optimal control problems is a hybrid system in which a vector-processing machine is integrated into the communication network of a MIMD parallel machine.

  10. An analysis of file migration in a UNIX supercomputing environment

    NASA Technical Reports Server (NTRS)

    Miller, Ethan L.; Katz, Randy H.

    1992-01-01

    The supercomputer center at the National Center for Atmospheric Research (NCAR) migrates large numbers of files to and from its mass storage system (MSS) because there is insufficient space to store them on the Cray supercomputer's local disks. This paper presents an analysis of file migration data collected over two years. The analysis shows that requests to the MSS are periodic, with one day and one week periods. Read requests to the MSS account for the majority of the periodicity, as write requests are relatively constant over the course of a week. Additionally, reads show a far greater fluctuation than writes over a day and a week, since reads are driven by human users while writes are machine-driven.

  11. Adaptation of MSC/NASTRAN to a supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gloudeman, J.F.; Hodge, J.C.

    1982-01-01

    MSC/NASTRAN is a large-scale general purpose digital computer program which solves a wide variety of engineering analysis problems by the finite element method. The program capabilities include static and dynamic structural analysis (linear and nonlinear), heat transfer, acoustics, electromagnetism and other types of field problems. It is used worldwide by large and small companies in such diverse fields as automotive, aerospace, civil engineering, shipbuilding, offshore oil, industrial equipment, chemical engineering, biomedical research, optics and government research. The paper presents the significant aspects of the adaptation of MSC/NASTRAN to the Cray-1. First, the general architecture and predominant functional use of MSC/NASTRAN are discussed to help explain the imperatives and the challenges of this undertaking. The key characteristics of the Cray-1 which influenced the decision to undertake this effort are then reviewed to help identify performance targets. An overview of the MSC/NASTRAN adaptation effort is then given to help define the scope of the project. Finally, some measures of MSC/NASTRAN's operational performance on the Cray-1 are given, along with a few guidelines to help avoid improper interpretation. 17 references.

  12. Optimization strategies for molecular dynamics programs on Cray computers and scalar work stations

    NASA Astrophysics Data System (ADS)

    Unekis, Michael J.; Rice, Betsy M.

    1994-12-01

    We present results of timing runs and different optimization strategies for a prototype molecular dynamics program that simulates shock waves in a two-dimensional (2-D) model of a reactive energetic solid. The performance of the program may be improved substantially by simple changes to the Fortran or by employing various vendor-supplied compiler optimizations. The optimum strategy varies among the machines used and will vary depending upon the details of the program. The effect of various compiler options and vendor-supplied subroutine calls is demonstrated. Comparison is made between two scalar workstations (IBM RS/6000 Model 370 and Model 530) and several Cray supercomputers (X-MP/48, Y-MP8/128, and C-90/16256). We find that for a scientific application program dominated by sequential, scalar statements, a relatively inexpensive high-end workstation such as the IBM RS/6000 RISC series will outperform single-processor performance of the Cray X-MP/48 and perform competitively with single-processor performance of the Y-MP8/128 and C-90/16256.

  13. Scheduling for Parallel Supercomputing: A Historical Perspective of Achievable Utilization

    NASA Technical Reports Server (NTRS)

    Jones, James Patton; Nitzberg, Bill

    1999-01-01

    The NAS facility has operated parallel supercomputers for the past 11 years, including the Intel iPSC/860, Intel Paragon, Thinking Machines CM-5, IBM SP-2, and Cray Origin 2000. Across this wide variety of machine architectures, across a span of 10 years, across a large number of different users, and through thousands of minor configuration and policy changes, the utilization of these machines shows three general trends: (1) scheduling using a naive FIFO first-fit policy results in 40-60% utilization, (2) switching to the more sophisticated dynamic backfilling scheduling algorithm improves utilization by about 15 percentage points (yielding about 70% utilization), and (3) reducing the maximum allowable job size further increases utilization. Most surprising is the consistency of these trends. Over the lifetime of the NAS parallel systems, we made hundreds, perhaps thousands, of small changes to hardware, software, and policy, yet, utilization was affected little. In particular these results show that the goal of achieving near 100% utilization while supporting a real parallel supercomputing workload is unrealistic.
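
    The backfilling policy credited above with the roughly 15-point utilization gain lets a small, short job jump ahead of the queue head whenever it can run on the currently idle nodes without delaying the head job's reservation. A toy sketch of that decision rule follows (an EASY-style simplification; the node counts and walltimes are made up and the real NAS schedulers track far more state).

    ```python
    # Toy sketch of the backfilling idea: a queued job may start ahead of the queue
    # head if it fits in the idle nodes AND either finishes before the head job's
    # reserved start time or leaves the head job's reserved nodes untouched.
    def can_backfill(job, free_nodes, now, head_start, head_nodes, total_nodes):
        fits_now = job["nodes"] <= free_nodes
        done_before_head = now + job["walltime"] <= head_start
        fits_beside_head = job["nodes"] <= total_nodes - head_nodes
        return fits_now and (done_before_head or fits_beside_head)

    # Example: 128-node machine, 40 nodes idle, queue head reserves 100 nodes at t=50.
    job = {"nodes": 32, "walltime": 30}
    print(can_backfill(job, free_nodes=40, now=10,
                       head_start=50, head_nodes=100,
                       total_nodes=128))   # True: the job finishes at t=40, before t=50
    ```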

  14. Deploying Darter A Cray XC30 System

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fahey, Mark R; Budiardja, Reuben D; Crosby, Lonnie D

    The University of Tennessee, Knoxville acquired a Cray XC30 supercomputer, called Darter, with a peak performance of 248.9 Teraflops. Darter was deployed in late March of 2013 with a very aggressive production timeline - the system was deployed, accepted, and placed into production in only 2 weeks. The Spring Experiment for the Center for Analysis and Prediction of Storms (CAPS) largely drove the accelerated timeline, as the experiment was scheduled to start in mid-April. The Consortium for Advanced Simulation of Light Water Reactors (CASL) project also needed access and was able to meet their tight deadlines on the newly acquired XC30. Darter's accelerated deployment and operations schedule resulted in substantial scientific impacts within the research community as well as immediate real-world impacts such as early severe tornado warnings.

  15. Synthesis, hydrolysis rates, supercomputer modeling, and antibacterial activity of bicyclic tetrahydropyridazinones.

    PubMed

    Jungheim, L N; Boyd, D B; Indelicato, J M; Pasini, C E; Preston, D A; Alborn, W E

    1991-05-01

    Bicyclic tetrahydropyridazinones, such as 13, where X are strongly electron-withdrawing groups, were synthesized to investigate their antibacterial activity. These delta-lactams are homologues of bicyclic pyrazolidinones 15, which were the first non-beta-lactam-containing compounds reported to bind to penicillin-binding proteins (PBPs). The delta-lactam compounds exhibit poor antibacterial activity despite having reactivity comparable to the gamma-lactams. Molecular modeling, based on semiempirical molecular orbital calculations on a Cray X-MP supercomputer, predicted that the reason for the inactivity is steric bulk hindering high-affinity binding of the compounds to PBPs, as well as high conformational flexibility of the tetrahydropyridazinone ring hampering effective alignment of the molecule in the active site. Subsequent PBP binding experiments confirmed that this class of compounds does not bind to PBPs.

  16. Vectorized program architectures for supercomputer-aided circuit design

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rizzoli, V.; Ferlito, M.; Neri, A.

    1986-01-01

    Vector processors (supercomputers) can be effectively employed in MIC or MMIC applications to solve problems of large numerical size such as broad-band nonlinear design or statistical design (yield optimization). In order to fully exploit the capabilities of vector hardware, any program architecture must be structured accordingly. This paper presents a possible approach to the ''semantic'' vectorization of microwave circuit design software. Speed-up factors of the order of 50 can be obtained on a typical vector processor (Cray X-MP), with respect to the most powerful scalar computers (CDC 7600), with cost reductions of more than one order of magnitude. This could broaden the horizon of microwave CAD techniques to include problems that are practically out of the reach of conventional systems.

  17. Comparison of the MPP with other supercomputers for LANDSAT data processing

    NASA Technical Reports Server (NTRS)

    Ozga, Martin

    1987-01-01

    The Massively Parallel Processor (MPP) is compared to the CRAY X-MP and the CYBER-205 for LANDSAT data processing. The maximum likelihood classification algorithm is the basis for comparison, since this algorithm is simple to implement and vectorizes very well. The algorithm was implemented on all three machines and tested by classifying the same full scene of LANDSAT multispectral scan data. Timings are compared, as well as features of the machines and available software.

  18. Experiences From NASA/Langley's DMSS Project

    NASA Technical Reports Server (NTRS)

    1996-01-01

    There is a trend in institutions with high performance computing and data management requirements to explore mass storage systems with peripherals directly attached to a high speed network. The Distributed Mass Storage System (DMSS) Project at the NASA Langley Research Center (LaRC) has placed such a system into production use. This paper will present the experiences, both good and bad, we have had with this system since putting it into production usage. The system is comprised of: 1) National Storage Laboratory (NSL)/UniTree 2.1, 2) IBM 9570 HIPPI attached disk arrays (both RAID 3 and RAID 5), 3) IBM RS6000 server, 4) HIPPI/IPI3 third party transfers between the disk array systems and the supercomputer clients, a CRAY Y-MP and a CRAY 2, 5) a "warm spare" file server, 6) transition software to convert from CRAY's Data Migration Facility (DMF) based system to DMSS, 7) an NSC PS32 HIPPI switch, and 8) a STK 4490 robotic library accessed from the IBM RS6000 block mux interface. This paper will cover: the performance of the DMSS in the following areas: file transfer rates, migration and recall, and file manipulation (listing, deleting, etc.); the appropriateness of a workstation class of file server for NSL/UniTree with LaRC's present storage requirements in mind; the role of the third party transfers between the supercomputers and the DMSS disk array systems in DMSS; a detailed comparison (both in performance and functionality) between the DMF and DMSS systems; LaRC's enhancements to the NSL/UniTree system administration environment; the mechanism for DMSS to provide file server redundancy; the statistics on the availability of DMSS; and the design and experiences with the locally developed transparent transition software which allowed us to make over 1.5 million DMF files available to NSL/UniTree with minimal system outage.

  19. Parallel-vector out-of-core equation solver for computational mechanics

    NASA Technical Reports Server (NTRS)

    Qin, J.; Agarwal, T. K.; Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.

    1993-01-01

    A parallel/vector out-of-core equation solver is developed for shared-memory computers, such as the Cray Y-MP machine. The input/output (I/O) time is reduced by using the asynchronous BUFFER IN and BUFFER OUT statements, which can be executed simultaneously with the CPU instructions. The parallel and vector capability provided by the supercomputers is also exploited to enhance the performance. Numerical applications in large-scale structural analysis are given to demonstrate the efficiency of the present out-of-core solver.
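
    The asynchronous BUFFER IN/BUFFER OUT statements let the solver overlap disk transfers of out-of-core matrix blocks with computation on the block already in memory. Below is a rough Python analogue of that overlap, using a background thread as the transfer engine; the block layout, read_block(), and work() are stand-ins invented for this sketch rather than anything from the paper's Fortran solver.

    ```python
    # Rough analogue of overlapping I/O with computation (as BUFFER IN/BUFFER OUT do
    # on the Cray): a background thread reads the next block while the CPU works on
    # the current one. Block contents and the "work" step are purely illustrative.
    import numpy as np
    from concurrent.futures import ThreadPoolExecutor

    def read_block(i):                        # stand-in for reading block i from disk
        return np.full((512, 512), float(i))

    def work(block):                          # stand-in for factoring/eliminating a block
        return np.linalg.norm(block)

    n_blocks = 8
    with ThreadPoolExecutor(max_workers=1) as io:
        pending = io.submit(read_block, 0)    # asynchronous "BUFFER IN" of the first block
        for i in range(n_blocks):
            block = pending.result()          # wait only if the transfer is not finished
            if i + 1 < n_blocks:
                pending = io.submit(read_block, i + 1)   # prefetch the next block
            print(i, work(block))             # computation overlaps the prefetch
    ```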

  20. Nuclear shell model code CRUNCHER

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Resler, D.A.; Grimes, S.M.

    1988-05-01

    A new nuclear shell model code CRUNCHER, patterned after the code VLADIMIR, has been developed. While CRUNCHER and VLADIMIR employ the techniques of an uncoupled basis and the Lanczos process, improvements in the new code allow it to handle much larger problems than the previous code and to perform them more efficiently. Tests involving a moderately sized calculation indicate that CRUNCHER running on a SUN 3/260 workstation requires approximately one-half the central processing unit (CPU) time required by VLADIMIR running on a CRAY-1 supercomputer.

  1. Researchers Mine Information from Next-Generation Subsurface Flow Simulations

    DOE PAGES

    Gedenk, Eric D.

    2015-12-01

    A research team based at Virginia Tech University leveraged computing resources at the US Department of Energy's (DOE's) Oak Ridge National Laboratory to explore subsurface multiphase flow phenomena that can't be experimentally observed. Using the Cray XK7 Titan supercomputer at the Oak Ridge Leadership Computing Facility, the team took Micro-CT images of subsurface geologic systems and created two-phase flow simulations. The team's model development has implications for computational research pertaining to carbon sequestration, oil recovery, and contaminant transport.

  2. Space Shuttle Main Engine structural analysis and data reduction/evaluation. Volume 3A: High pressure oxidizer turbo-pump preburner pump housing stress analysis report

    NASA Technical Reports Server (NTRS)

    Shannon, Robert V., Jr.

    1989-01-01

    The model generation and structural analysis performed for the High Pressure Oxidizer Turbopump (HPOTP) preburner pump volute housing located on the main pump end of the HPOTP in the Space Shuttle Main Engine are summarized. An ANSYS finite element model of the volute housing was built and executed. A static structural analysis was performed on the Engineering Analysis and Data System (EADS) Cray-XMP supercomputer.

  3. Parallel FEM Simulation of Electromechanics in the Heart

    NASA Astrophysics Data System (ADS)

    Xia, Henian; Wong, Kwai; Zhao, Xiaopeng

    2011-11-01

    Cardiovascular disease is the leading cause of death in America. Computer simulation of the complicated dynamics of the heart could provide valuable quantitative guidance for diagnosis and treatment of heart problems. In this paper, we present an integrated numerical model which encompasses the interaction of cardiac electrophysiology, electromechanics, and mechanoelectrical feedback. The model is solved by the finite element method on a Linux cluster and the Cray XT5 supercomputer, Kraken. Dynamical influences between the effects of electromechanical coupling and mechanoelectrical feedback are shown.

  4. Using a multifrontal sparse solver in a high performance, finite element code

    NASA Technical Reports Server (NTRS)

    King, Scott D.; Lucas, Robert; Raefsky, Arthur

    1990-01-01

    We consider the performance of the finite element method on a vector supercomputer. The computationally intensive parts of the finite element method are typically the individual element forms and the solution of the global stiffness matrix, both of which are vectorized in high performance codes. To further increase throughput, new algorithms are needed. We compare a multifrontal sparse solver to a traditional skyline solver in a finite element code on a vector supercomputer. The multifrontal solver uses the Multiple Minimum Degree reordering heuristic to reduce the number of operations required to factor a sparse matrix, and full matrix computational kernels (e.g., BLAS3) to enhance vector performance. The net result is an order-of-magnitude reduction in run time for a finite element application on one processor of a Cray X-MP.
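
    A large part of the multifrontal solver's advantage comes from fill-reducing reordering (Multiple Minimum Degree) applied before factorization. The sketch below uses SciPy's SuperLU interface, which exposes a minimum-degree-style column ordering, to show how the ordering changes fill-in on a synthetic sparse matrix; this is SuperLU rather than the paper's multifrontal code, and the test matrix is illustrative.

    ```python
    # Fill-reducing orderings matter for sparse direct solvers. SciPy's SuperLU
    # wrapper exposes a minimum-degree-style ordering ("MMD_AT_PLUS_A"), related in
    # spirit to the Multiple Minimum Degree heuristic cited in the paper.
    import scipy.sparse as sp
    from scipy.sparse.linalg import splu

    # 2-D Laplacian on a 60x60 grid: a typical sparse "stiffness-like" test matrix.
    n = 60
    T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
    A = (sp.kron(sp.eye(n), T) + sp.kron(T, sp.eye(n))).tocsc()

    for ordering in ("NATURAL", "MMD_AT_PLUS_A", "COLAMD"):
        lu = splu(A, permc_spec=ordering)
        print(f"{ordering:>14}: nonzeros in L+U = {lu.L.nnz + lu.U.nnz}")
    ```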

  5. Optical clock distribution in supercomputers using polyimide-based waveguides

    NASA Astrophysics Data System (ADS)

    Bihari, Bipin; Gan, Jianhua; Wu, Linghui; Liu, Yujie; Tang, Suning; Chen, Ray T.

    1999-04-01

    Guided-wave optics is a promising way to deliver high-speed clock signals in supercomputers with minimized clock skew. Si-CMOS compatible polymer-based waveguides for optoelectronic interconnects and packaging have been fabricated and characterized. A 1-to-48 fanout optoelectronic interconnection layer (OIL) structure based on Ultradel 9120/9020 for the high-speed massive clock signal distribution for a Cray T-90 supercomputer board has been constructed. The OIL employs multimode polymeric channel waveguides in conjunction with surface-normal waveguide output couplers and 1-to-2 splitters. Surface-normal couplers can couple the optical clock signals into and out of the H-tree polyimide waveguides surface-normally, which facilitates the integration of photodetectors to convert optical signals to electrical signals. A 45-degree surface-normal coupler has been integrated at each output end. The measured output coupling efficiency is nearly 100 percent. The output profile from the 45-degree surface-normal coupler was calculated using the Fresnel approximation; the theoretical result is in good agreement with the experimental result. A total insertion loss of 7.98 dB at 850 nm was measured experimentally.

  6. A highly scalable particle tracking algorithm using partitioned global address space (PGAS) programming for extreme-scale turbulence simulations

    NASA Astrophysics Data System (ADS)

    Buaria, D.; Yeung, P. K.

    2017-12-01

    A new parallel algorithm utilizing a partitioned global address space (PGAS) programming model to achieve high scalability is reported for particle tracking in direct numerical simulations of turbulent fluid flow. The work is motivated by the desire to obtain Lagrangian information necessary for the study of turbulent dispersion at the largest problem sizes feasible on current and next-generation multi-petaflop supercomputers. A large population of fluid particles is distributed among parallel processes dynamically, based on instantaneous particle positions such that all of the interpolation information needed for each particle is available either locally on its host process or neighboring processes holding adjacent sub-domains of the velocity field. With cubic splines as the preferred interpolation method, the new algorithm is designed to minimize the need for communication, by transferring between adjacent processes only those spline coefficients determined to be necessary for specific particles. This transfer is implemented very efficiently as a one-sided communication, using Co-Array Fortran (CAF) features which facilitate small data movements between different local partitions of a large global array. The cost of monitoring transfer of particle properties between adjacent processes for particles migrating across sub-domain boundaries is found to be small. Detailed benchmarks are obtained on the Cray petascale supercomputer Blue Waters at the University of Illinois, Urbana-Champaign. For operations on the particles in an 8192^3 simulation (0.55 trillion grid points) on 262,144 Cray XE6 cores, the new algorithm is found to be orders of magnitude faster relative to a prior algorithm in which each particle is tracked by the same parallel process at all times. This large speedup reduces the additional cost of tracking of order 300 million particles to just over 50% of the cost of computing the Eulerian velocity field at this scale. Improving support of PGAS models on major compilers suggests that this algorithm will be of wider applicability on most upcoming supercomputers.
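
    The dynamic particle distribution described above assigns each particle to the process owning the sub-domain that contains it, so interpolation data is either local or one neighbor away. Below is a tiny NumPy sketch of that owner computation for a uniform 3-D domain decomposition; the box size, process grid, and rank-numbering convention are assumptions for illustration, and the actual code performs the subsequent spline-coefficient transfers with Co-Array Fortran one-sided communication.

    ```python
    # Map particle positions to the MPI-style rank that owns the enclosing sub-domain
    # of a uniform 3-D domain decomposition (illustrative ownership rule only).
    import numpy as np

    def owning_rank(positions, box, proc_grid):
        """positions in [0, box)^3; proc_grid = (px, py, pz) processes per direction."""
        px, py, pz = proc_grid
        cells = np.floor(positions / box * np.array(proc_grid)).astype(int)
        cells = np.clip(cells, 0, np.array(proc_grid) - 1)   # guard the upper boundary
        return (cells[:, 0] * py + cells[:, 1]) * pz + cells[:, 2]

    rng = np.random.default_rng(0)
    pos = rng.uniform(0.0, 2 * np.pi, size=(10, 3))   # periodic box of size 2*pi (assumed)
    print(owning_rank(pos, 2 * np.pi, (4, 4, 4)))     # owner rank for each particle
    ```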

  7. The International Conference on Vector and Parallel Computing (2nd)

    DTIC Science & Technology

    1989-01-17

    Proceedings of the Second International Conference on Vector and Parallel Computing. Recoverable fragments of the contents list include "Computation of the SVD of Bidiagonal Matrices" and "Lattice QCD - As a Large Scale Scientific Computation", the latter vectorized for the IBM 3090 Vector Facility and benchmarked on a large number of computers, including the Cray X-MP and Cray 2.

  8. A secure file manager for UNIX

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    DeVries, R.G.

    1990-12-31

    The development of a secure file management system for a UNIX-based computer facility with supercomputers and workstations is described. Specifically, UNIX in its usual form does not address: (1) Operation which would satisfy rigorous security requirements. (2) Online space management in an environment where total data demands would be many times the actual online capacity. (3) Making the file management system part of a computer network in which users of any computer in the local network could retrieve data generated on any other computer in the network. The characteristics of UNIX can be exploited to develop a portable, secure file manager which would operate on computer systems ranging from workstations to supercomputers. Implementation considerations making unusual use of UNIX features, rather than requiring extensive internal system changes, are described, and implementation using the Cray Research Inc. UNICOS operating system is outlined.

  9. Using a Cray Y-MP as an array processor for a RISC Workstation

    NASA Technical Reports Server (NTRS)

    Lamaster, Hugh; Rogallo, Sarah J.

    1992-01-01

    As microprocessors increase in power, the economics of centralized computing has changed dramatically. At the beginning of the 1980's, mainframes and supercomputers were often considered to be cost-effective machines for scalar computing. Today, microprocessor-based RISC (reduced-instruction-set computer) systems have displaced many uses of mainframes and supercomputers. Supercomputers are still cost competitive when processing jobs that require both large memory size and high memory bandwidth. One such application is array processing. Certain numerical operations are appropriate to use in a Remote Procedure Call (RPC)-based environment. Matrix multiplication is an example of an operation that can have a sufficient number of arithmetic operations to amortize the cost of an RPC call. An experiment which demonstrates that matrix multiplication can be executed remotely on a large system to speed the execution over that experienced on a workstation is described.
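
    The experiment described above offloads matrix multiplication to a remote machine through remote procedure calls, which pays off because the O(n^3) arithmetic amortizes the O(n^2) cost of shipping the operands. The sketch below shows the same pattern with Python's standard xmlrpc modules; the host, port, and nested-list marshalling are arbitrary choices for this sketch and have nothing to do with the Cray Y-MP/RISC workstation setup in the paper.

    ```python
    # Minimal illustration of offloading matrix multiplication over RPC: a "compute
    # server" registers the operation, a "workstation" client calls it remotely.
    import threading
    import numpy as np
    from xmlrpc.server import SimpleXMLRPCServer
    from xmlrpc.client import ServerProxy

    def matmul(a, b):                                   # runs on the compute server
        return (np.array(a) @ np.array(b)).tolist()     # nested lists marshal over XML-RPC

    server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False, allow_none=True)
    server.register_function(matmul, "matmul")
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # Client side: worthwhile only when the O(n^3) arithmetic dominates the O(n^2)
    # cost of moving the operands across the network.
    client = ServerProxy("http://localhost:8000")
    a = np.random.default_rng(0).standard_normal((64, 64))
    c = np.array(client.matmul(a.tolist(), a.tolist()))
    print(np.allclose(c, a @ a))                        # remote result matches local one
    server.shutdown()
    ```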

  10. Distributed Finite Element Analysis Using a Transputer Network

    NASA Technical Reports Server (NTRS)

    Watson, James; Favenesi, James; Danial, Albert; Tombrello, Joseph; Yang, Dabby; Reynolds, Brian; Turrentine, Ronald; Shephard, Mark; Baehmann, Peggy

    1989-01-01

    The principal objective of this research effort was to demonstrate the extraordinarily cost-effective acceleration of finite element structural analysis problems using a transputer-based parallel processing network. This objective was accomplished in the form of a commercially viable parallel processing workstation. The workstation is a desktop size, low-maintenance computing unit capable of supercomputer performance yet costs two orders of magnitude less. To achieve the principal research objective, a transputer-based structural analysis workstation termed XPFEM was implemented with linear static structural analysis capabilities resembling commercially available NASTRAN. Finite element model files, generated using the on-line preprocessing module or external preprocessing packages, are downloaded to a network of 32 transputers for accelerated solution. The system currently executes at about one third Cray X-MP24 speed but additional acceleration appears likely. For the NASA selected demonstration problem of a Space Shuttle main engine turbine blade model with about 1500 nodes and 4500 independent degrees of freedom, the Cray X-MP24 required 23.9 seconds to obtain a solution while the transputer network, operated from an IBM PC-AT compatible host computer, required 71.7 seconds. Consequently, the $80,000 transputer network demonstrated a cost-performance ratio about 60 times better than the $15,000,000 Cray X-MP24 system.

  11. Board-level optical clock signal distribution using Si CMOS-compatible polyimide-based 1- to 48-fanout H-tree

    NASA Astrophysics Data System (ADS)

    Wu, Linghui; Bihari, Bipin; Gan, Jianhua; Chen, Ray T.; Tang, Suning

    1998-08-01

    Si-CMOS compatible polymer-based waveguides for optoelectronic interconnects and packaging have been fabricated and characterized. A 1-to-48 fanout optoelectronic interconnection layer (OIL) structure based on Ultradel 9120/9020 for the high-speed massive clock signal distribution for a Cray T-90 supercomputer board has been constructed. The OIL employs multimode polymeric channel waveguides in conjunction with surface-normal waveguide output coupler and 1-to-2 splitter. A total insertion loss of 7.98 dB at 850 nm was measured experimentally.

  12. Development of a Navier-Stokes algorithm for parallel-processing supercomputers. Ph.D. Thesis - Colorado State Univ., Dec. 1988

    NASA Technical Reports Server (NTRS)

    Swisshelm, Julie M.

    1989-01-01

    An explicit flow solver, applicable to the hierarchy of model equations ranging from Euler to full Navier-Stokes, is combined with several techniques designed to reduce computational expense. The computational domain consists of local grid refinements embedded in a global coarse mesh, where the locations of these refinements are defined by the physics of the flow. Flow characteristics are also used to determine which set of model equations is appropriate for solution in each region, thereby reducing not only the number of grid points at which the solution must be obtained, but also the computational effort required to get that solution. Acceleration to steady-state is achieved by applying multigrid on each of the subgrids, regardless of the particular model equations being solved. Since each of these components is explicit, advantage can readily be taken of the vector- and parallel-processing capabilities of machines such as the Cray X-MP and Cray-2.

  13. Vectorization of a particle code used in the simulation of rarefied hypersonic flow

    NASA Technical Reports Server (NTRS)

    Baganoff, D.

    1990-01-01

    A limitation of the direct simulation Monte Carlo (DSMC) method is that it does not allow efficient use of vector architectures that predominate in current supercomputers. Consequently, the problems that can be handled are limited to those of one- and two-dimensional flows. This work focuses on a reformulation of the DSMC method with the objective of designing a procedure that is optimized to the vector architectures found on machines such as the Cray-2. In addition, it focuses on finding a better balance between algorithmic complexity and the total number of particles employed in a simulation so that the overall performance of a particle simulation scheme can be greatly improved. Simulations of the flow about a 3D blunt body are performed with 10^7 particles and 4 x 10^5 mesh cells. Good statistics are obtained with time averaging over 800 time steps using 4.5 h of Cray-2 single-processor CPU time.
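
    The reformulation described above replaces per-particle scalar loops with operations over whole particle arrays, which is what vector hardware (and, analogously, NumPy) rewards. A scaled-down stand-in for the idea follows; the particle count, mesh dimensions, and the simple advection step are illustrative assumptions and omit the collision and sampling phases of a real DSMC code.

    ```python
    # Vector-style particle update: one fused array operation over the whole particle
    # population, instead of a scalar per-particle loop.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000                              # scaled down from the ~10^7 particles cited
    pos = rng.uniform(0.0, 1.0, size=(n, 3))   # positions in a unit box
    vel = rng.standard_normal((n, 3))
    dt = 1e-3

    # Scalar style would loop:  for i in range(n): pos[i] += vel[i] * dt
    # Vector style: advance all particles at once and apply a periodic wrap.
    pos = (pos + vel * dt) % 1.0

    # Bin particles into a 100 x 80 x 50 mesh (~4 x 10^5 cells) in one vectorized pass.
    dims = np.array([100, 80, 50])
    ijk = np.minimum((pos * dims).astype(int), dims - 1)
    cell = (ijk[:, 0] * dims[1] + ijk[:, 1]) * dims[2] + ijk[:, 2]
    occupancy = np.bincount(cell, minlength=dims.prod())
    print(occupancy.mean())                    # average particles per cell
    ```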

  14. Parallel Navier-Stokes computations on shared and distributed memory architectures

    NASA Technical Reports Server (NTRS)

    Hayder, M. Ehtesham; Jayasimha, D. N.; Pillay, Sasi Kumar

    1995-01-01

    We study a high order finite difference scheme to solve the time accurate flow field of a jet using the compressible Navier-Stokes equations. As part of our ongoing efforts, we have implemented our numerical model on three parallel computing platforms to study the computational, communication, and scalability characteristics. The platforms chosen for this study are a cluster of workstations connected through fast networks (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray Y-MP), and a distributed memory multiprocessor (the IBM SP1). Our focus in this study is on the LACE testbed. We present some results for the Cray Y-MP and the IBM SP1 mainly for comparison purposes. On the LACE testbed, we study: (1) the communication characteristics of Ethernet, FDDI, and the ALLNODE networks and (2) the overheads induced by the PVM message passing library used for parallelizing the application. We demonstrate that clustering of workstations is effective and has the potential to be computationally competitive with supercomputers at a fraction of the cost.

  15. Compute Server Performance Results

    NASA Technical Reports Server (NTRS)

    Stockdale, I. E.; Barton, John; Woodrow, Thomas (Technical Monitor)

    1994-01-01

    Parallel-vector supercomputers have been the workhorses of high performance computing. As expectations of future computing needs have risen faster than projected vector supercomputer performance, much work has been done investigating the feasibility of using Massively Parallel Processor systems as supercomputers. An even more recent development is the availability of high performance workstations which have the potential, when clustered together, to replace parallel-vector systems. We present a systematic comparison of floating point performance and price-performance for various compute server systems. A suite of highly vectorized programs was run on systems including traditional vector systems such as the Cray C90, and RISC workstations such as the IBM RS/6000 590 and the SGI R8000. The C90 system delivers 460 million floating point operations per second (FLOPS), the highest single processor rate of any vendor. However, if the price-performance ratio (PPR) is considered to be most important, then the IBM and SGI processors are superior to the C90 processors. Even without code tuning, the IBM and SGI PPRs of 260 and 220 FLOPS per dollar exceed the C90 PPR of 160 FLOPS per dollar when running our highly vectorized suite.

  16. TOP500 Supercomputers for June 2004

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack

    2004-06-23

    23rd Edition of TOP500 List of World's Fastest Supercomputers Released: Japan's Earth Simulator Enters Third Year in Top Position. MANNHEIM, Germany; KNOXVILLE, Tenn.; and BERKELEY, Calif. In what has become a closely watched event in the world of high-performance computing, the 23rd edition of the TOP500 list of the world's fastest supercomputers was released today (June 23, 2004) at the International Supercomputer Conference in Heidelberg, Germany.

  17. ORNL Cray X1 evaluation status report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Agarwal, P.K.; Alexander, R.A.; Apra, E.

    2004-05-01

    On August 15, 2002 the Department of Energy (DOE) selected the Center for Computational Sciences (CCS) at Oak Ridge National Laboratory (ORNL) to deploy a new scalable vector supercomputer architecture for solving important scientific problems in climate, fusion, biology, nanoscale materials and astrophysics. ''This program is one of the first steps in an initiative designed to provide U.S. scientists with the computational power that is essential to 21st century scientific leadership,'' said Dr. Raymond L. Orbach, director of the department's Office of Science. In FY03, CCS procured a 256-processor Cray X1 to evaluate the processors, memory subsystem, scalability of the architecture, software environment, and to predict the expected sustained performance on key DOE applications codes. The results of the micro-benchmarks and kernel benchmarks show the architecture of the Cray X1 to be exceptionally fast for most operations. The best results are shown on large problems, where it is not possible to fit the entire problem into the cache of the processors. These large problems are exactly the types of problems that are important for the DOE and ultra-scale simulation. Application performance is found to be markedly improved by this architecture: - Large-scale simulations of high-temperature superconductors run 25 times faster than on an IBM Power4 cluster using the same number of processors. - Best performance of the parallel ocean program (POP v1.4.3) is 50 percent higher than on Japan's Earth Simulator and 5 times higher than on an IBM Power4 cluster. - A fusion application, global GYRO transport, was found to be 16 times faster on the X1 than on an IBM Power3. The increased performance allowed simulations to fully resolve questions raised by a prior study. - The transport kernel in the AGILE-BOLTZTRAN astrophysics code runs 15 times faster than on an IBM Power4 cluster using the same number of processors. - Molecular dynamics simulations related to the phenomenon of photon echo run 8 times faster than previously achieved. Even at 256 processors, the Cray X1 system is already outperforming other supercomputers with thousands of processors for a certain class of applications such as climate modeling and some fusion applications. This evaluation is the outcome of a number of meetings with both high-performance computing (HPC) system vendors and application experts over the past 9 months and has received broad-based support from the scientific community and other agencies.

  18. Research on Spectroscopy, Opacity, and Atmospheres

    NASA Technical Reports Server (NTRS)

    Kurucz, Robert L.

    1999-01-01

    To make my calculations more readily accessible I have set up a web site cfaku5.harvard.edu that can also be accessed by FTP. It has 5 9GB disks that hold all of my atomic and diatomic molecular data, my tables of distribution function opacities, my grids of model atmospheres, colors, fluxes, etc., my programs that are ready for distribution, and most of my recent papers. Atlases and computed spectra will be added as they are completed. New atomic and molecular calculations will be added as they are completed. I got my atomic programs that had been running on a Cray at the San Diego Supercomputer Center to run on my Vaxes and Alpha. I started with Ni and Co because there were new laboratory analyses that included isotopic and hyperfine splitting. Those calculations are described in the appended abstract for the 6th Atomic Spectroscopy and Oscillator Strengths meeting in Victoria last summer. A surprising finding is that quadrupole transitions have been grossly in error because mixing with higher levels has not been included. I now have enough memory in my Alpha to treat 3000 x 3000 matrices. I now include all levels up through n=9 for Fe I and II, the spectra for which the most information is available. I am finishing those calculations right now. After Fe I and Fe II, all other spectra are "easy", and I will be in mass production. ATLAS12, my opacity sampling program for computing models with arbitrary abundances, has been put on the web server. I wrote a new distribution function opacity program for workstations that replaces the one I used on the Cray at the San Diego Supercomputer Center. Each set of abundances would take 100 Cray hours costing $100,000. I ran 25 cases. Each of my opacity CDs contains three abundances. I have a new program running on the Alpha that takes about a week. I am going to have to get a faster processor or I will have to dedicate a whole workstation just to opacities.

  19. PREFACE: XXXV Symposium on Nuclear Physics

    NASA Astrophysics Data System (ADS)

    Padilla-Rodal, E.; Bijker, R.

    2012-09-01

    The XXXV Symposium on Nuclear Physics was held at Hotel Hacienda Cocoyoc, Morelos, Mexico, from January 3-6, 2012. Conceived in 1978 as a small meeting, over the years and thanks to the efforts of various organizing committees, the symposium has become a well-known international conference on nuclear physics. To the best of our knowledge, the Mexican Symposium on Nuclear Physics represents the conference series with the longest tradition in Latin America and one of the longest-running annual nuclear physics conferences in the world. The Symposium brings together leading scientists from all around the world, working in the fields of nuclear structure, nuclear reactions, physics with radioactive ion beams, hadronic physics, nuclear astrophysics, neutron physics and relativistic heavy-ion physics. Its main goal is to provide a relaxed environment where the exchange of ideas, discussion of new results and consolidation of scientific collaboration are encouraged. To celebrate the 35th edition of the symposium, 53 colleagues attended from diverse countries, including Argentina, Australia, Canada, Japan, Saudi Arabia and the USA. We were happy to have the active participation of Eli F Aguilera, Eduardo Andrade, Octavio Castaños, Alfonso Mondragón, Stuart Pittel and Andrés Sandoval, who also participated in the first edition of the Symposium back in 1978. We were joined by old friends of Cocoyoc (Stuart Pittel, Osvaldo Civitarese, Piet Van Isacker, Jerry Draayer and Alfredo Galindo-Uribarri) as well as several first-time visitors that we hope will come back to this scientific meeting in the forthcoming years. The scientific program consisted of 33 invited talks, proposed by the international advisory committee, which nicely covered the topics of the Symposium, giving a balanced perspective between the experimental and the theoretical work that is currently underway in each line of research. Fifteen posters complemented the scientific sessions, giving the opportunity for Mexican students to present their current research and interact with the visiting scientists. The present volume contains 21 research articles based on invited talks presented at the symposium. We cannot thank the authors enough for their enthusiastic contributions, or the anonymous referees for the time they devoted to the review process, which helped us maintain the high standard of the Conference Proceedings. Finally, we would like to thank the International Advisory Committee and the Sponsoring Organizations that made this event possible.
    E Padilla-Rodal and R Bijker, Editors
    International Advisory Committee: Osvaldo Civitarese, Universidad Nacional de La Plata, Argentina; Jerry P Draayer, Louisiana State University, USA; Alfredo Galindo-Uribarri, Oak Ridge National Laboratory, USA; Paulo Gomes, Universidade Federal Fluminense, Brazil; Piet Van Isacker, GANIL, France; James J Kolata, University of Notre Dame, USA; Reiner Krücken, TRIUMF, Canada; Jorge López, The University of Texas at El Paso, USA; Stuart Pittel, University of Delaware, USA; W Michael Snow, Indiana University, USA; Adam Szczepaniak, Indiana University, USA; Michael Wiescher, University of Notre Dame, USA
    Organizing Committee: Elizabeth Padilla-Rodal (Chair), Instituto de Ciencias Nucleares, UNAM, Mexico; Roelof Bijker, Instituto de Ciencias Nucleares, UNAM, Mexico
    Sponsoring Organizations: División de Física Nuclear, SMF; Dirección General de Asuntos de Personal Académico, UNAM; Centro Latino-Americano de Física; Instituto de Ciencias Nucleares, UNAM; Instituto de Física, UNAM; Instituto Nacional de Investigaciones Nucleares

  20. MODA A Framework for Memory Centric Performance Characterization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shrestha, Sunil; Su, Chun-Yi; White, Amanda M.

    2012-06-29

    In the age of massive parallelism, the focus of performance analysis has switched from the processor and related structures to the memory and I/O resources. Adapting to this new reality, a performance analysis tool has to provide a way to analyze resource usage to pinpoint existing and potential problems in a given application. This paper provides an overview of the Memory Observant Data Analysis (MODA) tool, a memory-centric tool first implemented on the Cray XMT supercomputer. Throughout the paper, MODA's capabilities have been showcased with experiments done on matrix multiply and Graph-500 application codes.

  1. User's and test case manual for FEMATS

    NASA Technical Reports Server (NTRS)

    Chatterjee, Arindam; Volakis, John; Nurnberger, Mike; Natzke, John

    1995-01-01

    The FEMATS program incorporates first-order edge-based finite elements and vector absorbing boundary conditions into the scattered field formulation for computation of the scattering from three-dimensional geometries. The code has been validated extensively for a large class of geometries containing inhomogeneities and satisfying transition conditions. For geometries that are too large for the workstation environment, the FEMATS code has been optimized to run on various supercomputers. Currently, FEMATS has been configured to run on the HP 9000 workstation, vectorized for the Cray Y-MP, and parallelized to run on the Kendall Square Research (KSR) architecture and the Intel Paragon.

  2. ATLAS computing on CSCS HPC

    NASA Astrophysics Data System (ADS)

    Filipcic, A.; Haug, S.; Hostettler, M.; Walker, R.; Weber, M.

    2015-12-01

    The Piz Daint Cray XC30 HPC system at CSCS, the Swiss National Supercomputing centre, was the highest ranked European system on TOP500 in 2014, also featuring GPU accelerators. Event generation and detector simulation for the ATLAS experiment have been enabled for this machine. We report on the technical solutions, performance, HPC policy challenges and possible future opportunities for HEP on extreme HPC systems. In particular a custom made integration to the ATLAS job submission system has been developed via the Advanced Resource Connector (ARC) middleware. Furthermore, a partial GPU acceleration of the Geant4 detector simulations has been implemented.

  3. Modeling Subsurface Reactive Flows Using Leadership-Class Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mills, Richard T; Hammond, Glenn; Lichtner, Peter

    2009-01-01

    We describe our experiences running PFLOTRAN - a code for simulation of coupled hydro-thermal-chemical processes in variably saturated, non-isothermal, porous media - on leadership-class supercomputers, including initial experiences running on the petaflop incarnation of Jaguar, the Cray XT5 at the National Center for Computational Sciences at Oak Ridge National Laboratory. PFLOTRAN utilizes fully implicit time-stepping and is built on top of the Portable, Extensible Toolkit for Scientific Computation (PETSc). We discuss some of the hurdles to 'at scale' performance with PFLOTRAN and the progress we have made in overcoming them on leadership-class computer architectures.

  4. Network issues for large mass storage requirements

    NASA Technical Reports Server (NTRS)

    Perdue, James

    1992-01-01

    File servers and supercomputing environments need high-performance networks to balance the I/O requirements seen in today's demanding computing scenarios. UltraNet is one solution which permits both high aggregate transfer rates and high task-to-task transfer rates, as demonstrated in actual tests. UltraNet provides this capability as both a server-to-server and server-to-client access network, giving the supercomputing center the following advantages: highest-performance transport-level connections (to 40 MBytes/sec effective rates); throughput that matches the emerging high-performance disk technologies, such as RAID, parallel head transfer devices, and software striping; support for standard network and file system applications using a sockets-based application program interface, such as FTP, rcp, rdump, etc.; access to the Network File System (NFS) and large aggregate bandwidth for heavy NFS usage; access to a distributed, hierarchical data server capability using the DISCOS UniTree product; and support for file server solutions available from multiple vendors, including Cray, Convex, Alliant, FPS, IBM, and others.

  5. Supercomputer description of human lung morphology for imaging analysis.

    PubMed

    Martonen, T B; Hwang, D; Guan, X; Fleming, J S

    1998-04-01

    A supercomputer code that describes the three-dimensional branching structure of the human lung has been developed. The algorithm was written for the Cray C94. In our simulations, the human lung was divided into a matrix containing discrete volumes (voxels) so as to be compatible with analyses of SPECT images. The matrix has 3840 voxels. The matrix can be segmented into transverse, sagittal and coronal layers analogous to human subject examinations. The compositions of individual voxels were identified by the type and respective number of airways present. The code provides a mapping of the spatial positions of the almost 17 million airways in human lungs and unambiguously assigns each airway to a voxel. Thus, the clinician and research scientist in the medical arena have a powerful new tool to be used in imaging analyses. The code was designed to be integrated into diverse applications, including the interpretation of SPECT images, the design of inhalation exposure experiments and the targeted delivery of inhaled pharmacologic drugs.

  6. Integrated risk/cost planning models for the US Air Traffic system

    NASA Technical Reports Server (NTRS)

    Mulvey, J. M.; Zenios, S. A.

    1985-01-01

    A prototype network planning model for the U.S. Air Traffic control system is described. The model encompasses the dual objectives of managing collision risks and transportation costs where traffic flows can be related to these objectives. The underlying structure is a network graph with nonseparable convex costs; the model is solved efficiently by capitalizing on its intrinsic characteristics. Two specialized algorithms for solving the resulting problems are described: (1) truncated Newton, and (2) simplicial decomposition. The feasibility of the approach is demonstrated using data collected from a control center in the Midwest. Computational results with different computer systems are presented, including a vector supercomputer (CRAY-XMP). The risk/cost model has two primary uses: (1) as a strategic planning tool using aggregate flight information, and (2) as an integrated operational system for forecasting congestion and monitoring (controlling) flow throughout the U.S. In the latter case, access to a supercomputer is required due to the model's enormous size.
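
    The abstract names truncated Newton as one of the two specialized solvers. As a rough illustration of the idea (each Newton step is solved only approximately, with conjugate gradients), the sketch below minimizes a toy convex cost with SciPy's Newton-CG method, which belongs to the truncated Newton family; the cost function, variable names, and problem size are invented for illustration and are not the paper's actual risk/cost network model.

      import numpy as np
      from scipy.optimize import minimize

      # Toy convex "congestion cost" over a handful of flow variables; the real
      # model couples flows through a network graph with nonseparable convex costs.
      def cost(x):
          return np.sum(x**4 + 2.0 * x**2)

      def grad(x):
          return 4.0 * x**3 + 4.0 * x

      x0 = np.array([1.0, -2.0, 0.5, 3.0])
      res = minimize(cost, x0, jac=grad, method="Newton-CG")  # truncated Newton iteration
      print(res.x, res.fun)  # flows driven toward the unconstrained minimum at zero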

  7. Modelling and scale-up of chemical flooding: First annual report for the period October 1985-September 1986

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pope, G.A.; Lake, L.W.; Sepehrnoori, K.

    1987-07-01

    This report consists of three parts. Part A describes the development of our chemical flood simulator UTCHEM during the past year, simulation studies, and physical property modelling and experiments. Part B is a report on the optimization and vectorization of UTCHEM on our Cray supercomputer to speed it up. Part C describes our use of UTCHEM to investigate the use of tracers for interwell reservoir tests. Part A of this Annual Report consists of five sections. In the first section, we give a general description of the simulator and recent changes in it along with a test case for a slightly compressible fluid. In the second section, we describe the major changes which were needed to add gel and alkaline reactions and give preliminary simulation results for these processes. In the third section, comparisons with a surfactant pilot field test are given. In the fourth section, process scaleup and design simulations are given and also our recent mesh refinement results. In the fifth section, experimental results and associated physical property modelling studies are reported. Part B gives our results on the speedup of UTCHEM on a Cray supercomputer. Depending on the size of the problem, this speedup factor was at least tenfold and resulted from a combination of a faster solver, vectorization, and code optimization. Part C describes our use of UTCHEM for field tracer studies and gives the results of a comparison with field tracer data on the same field (Big Muddy) as was simulated and compared with the surfactant pilot reported in section 3 of Part A. 120 figs., 37 tabs.

  8. The SGI/Cray T3E: Experiences and Insights

    NASA Technical Reports Server (NTRS)

    Bernard, Lisa Hamet

    1998-01-01

    The NASA Goddard Space Flight Center is home to the fifth most powerful supercomputer in the world, a 1024 processor SGI/Cray T3E-600. The original 512 processor system was placed at Goddard in March, 1997 as part of a cooperative agreement between the High Performance Computing and Communications Program's Earth and Space Sciences Project (ESS) and SGI/Cray Research. The goal of this system is to facilitate achievement of the Project milestones of 10, 50 and 100 GFLOPS sustained performance on selected Earth and space science application codes. The additional 512 processors were purchased in March, 1998 by the NASA Earth Science Enterprise for the NASA Seasonal to Interannual Prediction Project (NSIPP). These two "halves" still operate as a single system, and must satisfy the unique requirements of both aforementioned groups, as well as guest researchers from the Earth, space, microgravity, manned space flight and aeronautics communities. Few large scalable parallel systems are configured for capability computing, so models are hard to find. This unique environment has created a challenging system administration task, and has yielded some insights into the supercomputing needs of the various NASA Enterprises, as well as insights into the strengths and weaknesses of the T3E architecture and software. The T3E is a distributed memory system in which the processing elements (PE's) are connected by a low latency, high bandwidth bidirectional 3-D torus. Due to the focus on high speed communication between PE's, the T3E requires PE's to be allocated contiguously per job. Further, jobs will only execute on the user-specified number of PE's and PE timesharing is possible but impractical. With a highly varied job mix in both size and runtime of jobs, the resulting scenario is PE fragmentation and an inability to achieve near 100% utilization. SGI/Cray has provided several scheduling and configuration tools to minimize the impact of fragmentation. These tools include PScheD (the political scheduler), GRM (the global resource manager) and NQE (the Network Queuing Environment). Features and impact of these tools will be discussed, as will resulting performance and utilization data. As a distributed memory system, the T3E is designed to be programmed through explicit message passing. Consequently, certain assumptions related to code design are made by the operating system (UNICOS/mk) and its scheduling tools. With the exception of HPF, which does run on the T3E, however poorly, alternative programming styles have the potential to impact the T3E in unexpected and undesirable ways. Several examples will be presented (preceded by the disclaimer, "Don't try this at home! Violators will be prosecuted!")

  9. Scalable nuclear density functional theory with Sky3D

    NASA Astrophysics Data System (ADS)

    Afibuzzaman, Md; Schuetrumpf, Bastian; Aktulga, Hasan Metin

    2018-02-01

    In nuclear astrophysics, quantum simulations of large inhomogeneous dense systems as they appear in the crusts of neutron stars present big challenges. The number of particles in a simulation with periodic boundary conditions is strongly limited due to the immense computational cost of the quantum methods. In this paper, we describe techniques for an efficient and scalable parallel implementation of Sky3D, a nuclear density functional theory solver that operates on an equidistant grid. Presented techniques allow Sky3D to achieve good scaling and high performance on a large number of cores, as demonstrated through detailed performance analysis on a Cray XC40 supercomputer.

  10. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Reynolds, William; Weber, Marta S.; Farber, Robert M.

    Social Media provide an exciting and novel view into social phenomena. The vast amounts of data that can be gathered from the Internet coupled with massively parallel supercomputers such as the Cray XMT open new vistas for research. Conclusions drawn from such analysis must recognize that social media are distinct from the underlying social reality. Rigorous validation is essential. This paper briefly presents results obtained from computational analysis of social media - utilizing both blog and Twitter data. Validation of these results is discussed in the context of a framework of established methodologies from the social sciences. Finally, an outline for a set of supporting studies is proposed.

  11. The computation of pi to 29,360,000 decimal digits using Borweins' quartically convergent algorithm

    NASA Technical Reports Server (NTRS)

    Bailey, David H.

    1988-01-01

    The quartically convergent numerical algorithm developed by Borwein and Borwein (1987) for 1/pi is implemented via a prime-modulus-transform multiprecision technique on the NASA Ames Cray-2 supercomputer to compute the first 2.936 x 10 to the 7th digits of the decimal expansion of pi. The history of pi computations is briefly recalled; the most recent algorithms are characterized; the implementation procedures are described; and samples of the output listing are presented. Statistical analyses show that the present decimal expansion is completely random, with only acceptable numbers of long repeating strings and single-digit runs.
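
    The Borwein quartically convergent iteration for 1/pi referenced here is well documented, and its behavior is easy to check at modest precision. The sketch below uses the mpmath arbitrary-precision library as a stand-in for the paper's custom prime-modulus-transform multiprecision arithmetic; each iteration roughly quadruples the number of correct digits.

      from mpmath import mp, mpf, sqrt

      mp.dps = 60                      # working precision in decimal digits
      y = sqrt(mpf(2)) - 1             # y_0 = sqrt(2) - 1
      a = 6 - 4 * sqrt(mpf(2))         # a_0 = 6 - 4*sqrt(2); a_k converges to 1/pi
      for k in range(4):               # four iterations already exceed 60 digits
          r = (1 - y**4) ** mpf(0.25)
          y = (1 - r) / (1 + r)
          a = a * (1 + y)**4 - 2**(2*k + 3) * y * (1 + y + y*y)
      print(1 / a)                     # agrees with pi to the working precision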

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Meneses, Esteban; Ni, Xiang; Jones, Terry R

    The unprecedented computational power of current supercomputers now makes possible the exploration of complex problems in many scientific fields, from genomic analysis to computational fluid dynamics. Modern machines are powerful because they are massive: they assemble millions of cores and a huge quantity of disks, cards, routers, and other components. But it is precisely the size of these machines that clouds the future of supercomputing. A system that comprises many components has a high chance to fail, and fail often. In order to make the next generation of supercomputers usable, it is imperative to use some type of fault tolerance platform to run applications on large machines. Most fault tolerance strategies can be optimized for the peculiarities of each system and boost efficacy by keeping the system productive. In this paper, we aim to understand how failure characterization can improve resilience in several layers of the software stack: applications, runtime systems, and job schedulers. We examine the Titan supercomputer, one of the fastest systems in the world. We analyze a full year of Titan in production and distill the failure patterns of the machine. By looking into Titan's log files and using the criteria of experts, we provide a detailed description of the types of failures. In addition, we inspect the job submission files and describe how the system is used. Using those two sources, we cross-correlate failures in the machine to executing jobs and provide a picture of how failures affect the user experience. We believe such characterization is fundamental in developing appropriate fault tolerance solutions for Cray systems similar to Titan.

  13. Supercomputing '91; Proceedings of the 4th Annual Conference on High Performance Computing, Albuquerque, NM, Nov. 18-22, 1991

    NASA Technical Reports Server (NTRS)

    1991-01-01

    Various papers on supercomputing are presented. The general topics addressed include: program analysis/data dependence, memory access, distributed memory code generation, numerical algorithms, supercomputer benchmarks, latency tolerance, parallel programming, applications, processor design, networks, performance tools, mapping and scheduling, characterization affecting performance, parallelism packaging, computing climate change, combinatorial algorithms, hardware and software performance issues, system issues. (No individual items are abstracted in this volume)

  14. The use of supercomputer modelling of high-temperature failure in pipe weldments to optimize weld and heat affected zone materials property selection

    NASA Astrophysics Data System (ADS)

    Wang, Z. P.; Hayhurst, D. R.

    1994-07-01

    The creep deformation and damage evolution in a pipe weldment have been modeled by using the finite-element continuum damage mechanics (CDM) method. The finite-element CDM computer program DAMAGE XX has been adapted to run with increased speed on a Cray XMP/416 supercomputer. Run times are sufficiently short (20 min) to permit many parametric studies to be carried out on vessel lifetimes for different weld and heat affected zone (HAZ) materials. Finite-element mesh sensitivity was studied first in order to select a mesh capable of correctly predicting experimentally observed results using the least possible computer time. A study was then made of the effect on the lifetime of a butt welded vessel of each of the commonly measured material parameters for the weld and HAZ materials. Forty different ferritic steel welded vessels were analyzed for a constant internal pressure of 45.5 MPa at a temperature of 565 °C; each vessel having the same parent pipe material but different weld and HAZ materials. A lifetime improvement of 30% has been demonstrated over that obtained for the initial materials property data. A methodology for weldment design has been established which uses supercomputer-based CDM analysis techniques; it is quick to use, provides accurate results, and is a viable design tool.

  15. COOP 3D ARPA Experiment 109 National Center for Atmospheric Research

    NASA Technical Reports Server (NTRS)

    1998-01-01

    Coupled atmospheric and hydrodynamic forecast models were executed on the supercomputing resources of the National Center for Atmospheric Research (NCAR) in Boulder, Colorado and the Ohio Supercomputing Center (OSC) in Columbus, Ohio, respectively. The interoperation of the forecast models on these geographically diverse, high performance Cray platforms required the transfer of large three dimensional data sets at very high information rates. High capacity, terrestrial fiber optic transmission system technologies were integrated with those of an experimental high speed communications satellite in Geosynchronous Earth Orbit (GEO) to test the integration of the two systems. Operation over a spacecraft in GEO orbit required modification of the standard configuration of legacy data communications protocols to facilitate their ability to perform efficiently in the changing environment characteristic of a hybrid network. The success of this performance tuning enabled the use of such an architecture to facilitate high data rate, fiber optic quality data communications between high performance systems not accessible to standard terrestrial fiber transmission systems, thus obviating the performance degradation often found in contemporary earth/satellite hybrids.

  16. Magnetosphere simulations with a high-performance 3D AMR MHD Code

    NASA Astrophysics Data System (ADS)

    Gombosi, Tamas; Dezeeuw, Darren; Groth, Clinton; Powell, Kenneth; Song, Paul

    1998-11-01

    BATS-R-US is a high-performance 3D AMR MHD code for space physics applications running on massively parallel supercomputers. In BATS-R-US the electromagnetic and fluid equations are solved with a high-resolution upwind numerical scheme in a tightly coupled manner. The code is very robust and it is capable of spanning a wide range of plasma parameters (such as β, acoustic and Alfvénic Mach numbers). Our code is highly scalable: it achieved a sustained performance of 233 GFLOPS on a Cray T3E-1200 supercomputer with 1024 PEs. This talk reports results from the BATS-R-US code for the GGCM (Geospace General Circulation Model) Phase 1 Standard Model Suite. This model suite contains 10 different steady-state configurations: 5 IMF clock angles (north, south, and three equally spaced angles in-between) with 2 IMF field strengths for each angle (5 nT and 10 nT). The other parameters are: solar wind speed = 400 km/sec; solar wind number density = 5 protons/cc; Hall conductance = 0; Pedersen conductance = 5 S; parallel conductivity = ∞.

  17. Massively Multithreaded Maxflow for Image Segmentation on the Cray XMT-2

    PubMed Central

    Bokhari, Shahid H.; Çatalyürek, Ümit V.; Gurcan, Metin N.

    2014-01-01

    Image segmentation is a very important step in the computerized analysis of digital images. The maxflow mincut approach has been successfully used to obtain minimum energy segmentations of images in many fields. Classical algorithms for maxflow in networks do not directly lend themselves to efficient parallel implementations on contemporary parallel processors. We present the results of an implementation of the Goldberg-Tarjan preflow-push algorithm on the Cray XMT-2 massively multithreaded supercomputer. This machine has hardware support for 128 threads in each physical processor, a uniformly accessible shared memory of up to 4 TB and hardware synchronization for each 64 bit word. It is thus well-suited to the parallelization of graph theoretic algorithms, such as preflow-push. We describe the implementation of the preflow-push code on the XMT-2 and present the results of timing experiments on a series of synthetically generated as well as real images. Our results indicate very good performance on large images and pave the way for practical applications of this machine architecture for image analysis in a production setting. The largest images we have run are 32000² pixels in size, which are well beyond the largest previously reported in the literature. PMID:25598745
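
    The Goldberg-Tarjan preflow-push (push-relabel) algorithm mentioned above has a compact serial form. The sketch below is a minimal, unoptimized Python rendering on a dense capacity matrix, intended only to show the push and relabel operations; it is not the massively multithreaded XMT-2 implementation described in the paper.

      from collections import deque

      def max_flow_push_relabel(capacity, s, t):
          # Goldberg-Tarjan push-relabel maxflow on a dense capacity matrix.
          n = len(capacity)
          flow = [[0] * n for _ in range(n)]
          height = [0] * n
          excess = [0] * n
          height[s] = n
          for v in range(n):               # saturate every edge leaving the source
              flow[s][v] = capacity[s][v]
              flow[v][s] = -capacity[s][v]
              excess[v] = capacity[s][v]
          active = deque(v for v in range(n) if v not in (s, t) and excess[v] > 0)
          while active:
              u = active[0]
              pushed = False
              for v in range(n):
                  residual = capacity[u][v] - flow[u][v]
                  if residual > 0 and height[u] == height[v] + 1:
                      delta = min(excess[u], residual)   # push excess downhill
                      flow[u][v] += delta
                      flow[v][u] -= delta
                      excess[u] -= delta
                      excess[v] += delta
                      if v not in (s, t) and v not in active:
                          active.append(v)
                      pushed = True
                      if excess[u] == 0:
                          break
              if excess[u] == 0:
                  active.popleft()
              elif not pushed:              # no admissible edge left: relabel u
                  height[u] = 1 + min(height[v] for v in range(n)
                                      if capacity[u][v] - flow[u][v] > 0)
          return sum(flow[s])

      # Tiny 4-node example: source 0, sink 3, expected maximum flow 4.
      cap = [[0, 3, 2, 0],
             [0, 0, 1, 2],
             [0, 0, 0, 2],
             [0, 0, 0, 0]]
      print(max_flow_push_relabel(cap, 0, 3))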

  18. A Reasoning And Hypothesis-Generation Framework Based On Scalable Graph Analytics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sukumar, Sreenivas Rangan

    Finding actionable insights from data has always been difficult. As the scale and forms of data increase tremendously, the task of finding value becomes even more challenging. Data scientists at Oak Ridge National Laboratory are leveraging unique leadership infrastructure (e.g. Urika-XA and Urika-GD appliances) to develop scalable algorithms for semantic, logical and statistical reasoning with unstructured Big Data. We present the deployment of such a framework called ORiGAMI (Oak Ridge Graph Analytics for Medical Innovations) on the National Library of Medicine's SEMANTIC Medline (archive of medical knowledge since 1994). Medline contains over 70 million knowledge nuggets published in 23.5 million papers in medical literature with thousands more added daily. ORiGAMI is available as an open-science medical hypothesis generation tool - both as a web-service and an application programming interface (API) at http://hypothesis.ornl.gov . Since becoming an online service, ORiGAMI has enabled clinical subject-matter experts to: (i) discover the relationship between beta-blocker treatment and diabetic retinopathy; (ii) hypothesize that xylene is an environmental carcinogen; and (iii) aid doctors with diagnosis of challenging cases when rare diseases manifest with common symptoms. In 2015, ORiGAMI was featured in the Historical Clinical Pathological Conference in Baltimore as a demonstration of artificial intelligence in medicine, presented at IEEE/ACM Supercomputing, and recognized as a Centennial Showcase Exhibit at the Radiological Society of North America (RSNA) Conference in Chicago. The final paper will describe the workflow built for the Cray Urika-XA and Urika-GD appliances that is able to reason with the knowledge of every published medical paper every time a clinical researcher uses the tool.

  19. IBM PC enhances the world's future

    NASA Technical Reports Server (NTRS)

    Cox, Jozelle

    1988-01-01

    Although the purpose of this research is to illustrate the importance of computers to the public, particularly the IBM PC, present examinations will include computers developed before the IBM PC was brought into use. IBM, as well as other computing facilities, began serving the public years ago, and is continuing to find ways to enhance the existence of man. With new developments in supercomputers like the Cray-2, and the recent advances in artificial intelligence programming, the human race is gaining knowledge at a rapid pace. All have benefited from the development of computers in the world; not only have they brought new assets to life, but they have also made life more and more of a challenge every day.

  20. Climate Ocean Modeling on a Beowulf Class System

    NASA Technical Reports Server (NTRS)

    Cheng, B. N.; Chao, Y.; Wang, P.; Bondarenko, M.

    2000-01-01

    With the growing power and shrinking cost of personal computers, the availability of fast Ethernet interconnections, and public domain software packages, it is now possible to combine them to build desktop parallel computers (named Beowulf or PC clusters) at a fraction of what it would cost to buy systems of comparable power from supercomputer companies. This led us to build and assemble our own system, specifically for climate ocean modeling. In this article, we present our experience with such a system, discuss its network performance, and provide some performance comparison data with both HP SPP2000 and Cray T3E for an ocean model used in present-day oceanographic research.

  1. Full potential methods for analysis/design of complex aerospace configurations

    NASA Technical Reports Server (NTRS)

    Shankar, Vijaya; Szema, Kuo-Yen; Bonner, Ellwood

    1986-01-01

    The steady form of the full potential equation, in conservative form, is employed to analyze and design a wide variety of complex aerodynamic shapes. The nonlinear method is based on the theory of characteristic signal propagation coupled with novel flux biasing concepts and body-fitted mapping procedures. The resulting codes are vectorized for the CRAY XMP and the VPS-32 supercomputers. Use of the full potential nonlinear theory is demonstrated for a single-point supersonic wing design and a multipoint design for transonic maneuver/supersonic cruise/maneuver conditions. Achievement of high aerodynamic efficiency through numerical design is verified by wind tunnel tests. Other studies reported include analyses of a canard/wing/nacelle fighter geometry.

  2. The International Cytokine Conference (11th) Held in Dublin (Ireland) on September 20-24 2003 (European Cytokine Network, Volume 14, Number 3, September 2003)

    DTIC Science & Technology

    2003-09-01

    Fisiologia Celular UNAM, Mexico City/Mexico specific marker of bacterial sepsis. Despite the increasing clinical importance of PCT the knowledge about its...about their function or scription factor NF-kappaB. Likewise, Varicella-zoster virus strongly mechanism of action. activates the AP-1 components Jun and

  3. Study of the TRAC Airfoil Table Computational System

    NASA Technical Reports Server (NTRS)

    Hu, Hong

    1999-01-01

    The report documents the study of the application of the TRAC airfoil table computational package (TRACFOIL) to the prediction of 2D airfoil force and moment data over a wide range of angle of attack and Mach number. The TRACFOIL generates the standard C-81 airfoil table for input into rotorcraft comprehensive codes such as CAMRAD. The existing TRACFOIL computer package is successfully modified to run on Digital Alpha workstations and on Cray-C90 supercomputers. Step-by-step instructions for using the package on both computer platforms are provided. Application of the newer version of TRACFOIL is made for two airfoil sections. The C-81 data obtained using the TRACFOIL method are compared with wind-tunnel data and results are presented.

  4. WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code

    NASA Astrophysics Data System (ADS)

    Mendygral, P. J.; Radcliffe, N.; Kandalla, K.; Porter, D.; O'Neill, B. J.; Nolting, C.; Edmon, P.; Donnert, J. M. F.; Jones, T. W.

    2017-02-01

    We present a new code for astrophysical magnetohydrodynamics specifically designed and optimized for high performance and scaling on modern and future supercomputers. We describe a novel hybrid OpenMP/MPI programming model that emerged from a collaboration between Cray, Inc. and the University of Minnesota. This design utilizes MPI-RMA optimized for thread scaling, which allows the code to run extremely efficiently at very high thread counts ideal for the latest generation of multi-core and many-core architectures. Such performance characteristics are needed in the era of “exascale” computing. We describe and demonstrate our high-performance design in detail with the intent that it may be used as a model for other, future astrophysical codes intended for applications demanding exceptional performance.

  5. Numerical results on the transcendence of constants involving pi, e, and Euler's constant

    NASA Technical Reports Server (NTRS)

    Bailey, David H.

    1988-01-01

    The existence of simple polynomial equations (integer relations) for the constants e/pi, e + pi, log pi, gamma (Euler's constant), e exp gamma, gamma/e, gamma/pi, and log gamma is investigated by means of numerical computations. The recursive form of the Ferguson-Forcade algorithm (Ferguson and Forcade, 1979; Ferguson, 1986 and 1987) is implemented on the Cray-2 supercomputer at NASA Ames, applying multiprecision techniques similar to those described by Bailey (1988) except that FFTs are used instead of dual-prime-modulus transforms for multiplication. It is shown that none of the constants has an integer relation of degree eight or less with coefficients of Euclidean norm 10 to the 9th or less.

  6. Developing software to use parallel processing effectively. Final report, June-December 1987

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Center, J.

    1988-10-01

    This report describes the difficulties involved in writing efficient parallel programs and describes the hardware and software support currently available for generating software that utilizes parallel processing effectively. Historically, the processing rate of single-processor computers has increased by one order of magnitude every five years. However, this pace is slowing since electronic circuitry is coming up against physical barriers. Unfortunately, the complexity of engineering and research problems continues to require ever more processing power (far in excess of the maximum estimated 3 Gflops achievable by single-processor computers). For this reason, parallel-processing architectures are receiving considerable interest, since they offer high performance more cheaply than a single-processor supercomputer, such as the Cray.

  7. Tools for 3D scientific visualization in computational aerodynamics

    NASA Technical Reports Server (NTRS)

    Bancroft, Gordon; Plessel, Todd; Merritt, Fergus; Watson, Val

    1989-01-01

    The purpose is to describe the tools and techniques in use at the NASA Ames Research Center for performing visualization of computational aerodynamics, for example, visualization of flow fields from computer simulations of fluid dynamics about vehicles such as the Space Shuttle. The hardware used for visualization is a high-performance graphics workstation connected to a supercomputer with a high-speed channel. At present, the workstation is a Silicon Graphics IRIS 3130, the supercomputer is a CRAY2, and the high-speed channel is a hyperchannel. The three techniques used for visualization are post-processing, tracking, and steering. Post-processing analysis is done after the simulation. Tracking analysis is done during a simulation but is not interactive, whereas steering analysis involves modifying the simulation interactively during the simulation. Using post-processing methods, a flow simulation is executed on a supercomputer and, after the simulation is complete, the results of the simulation are processed for viewing. The software in use and under development at NASA Ames Research Center for performing these types of tasks in computational aerodynamics is described. Workstation performance issues, benchmarking, and high-performance networks for this purpose are also discussed, as well as descriptions of other hardware for digital video and film recording.

  8. Scalability study of parallel spatial direct numerical simulation code on IBM SP1 parallel supercomputer

    NASA Technical Reports Server (NTRS)

    Hanebutte, Ulf R.; Joslin, Ronald D.; Zubair, Mohammad

    1994-01-01

    The implementation and the performance of a parallel spatial direct numerical simulation (PSDNS) code are reported for the IBM SP1 supercomputer. The spatially evolving disturbances that are associated with laminar-to-turbulent transition in three-dimensional boundary-layer flows are computed with the PSDNS code. By remapping the distributed data structure during the course of the calculation, optimized serial library routines can be utilized that substantially increase the computational performance. Although the remapping incurs a high communication penalty, the parallel efficiency of the code remains above 40% for all performed calculations. By using appropriate compile options and optimized library routines, the serial code achieves 52-56 Mflops on a single node of the SP1 (45% of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a 'real world' simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP for the same simulation. The scalability information provides estimated computational costs that match the actual costs relative to changes in the number of grid points.
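
    The remapping described above is essentially the transpose step used by distributed FFT-based codes: the data are redistributed so that each task holds complete lines along the transform direction and can call an optimized serial routine. The sketch below illustrates the idea in serial form with NumPy; the actual PSDNS data layout and library calls are not given in the abstract, so this is only a schematic of the technique.

      import numpy as np

      a = np.random.rand(8, 8) + 1j * np.random.rand(8, 8)

      # Transform along axis 0 while the "pencils" in that direction are local ...
      step1 = np.fft.fft(a, axis=0)
      # ... then remap (here a simple transpose) so the other direction becomes
      # local, and apply the same optimized 1-D routine again.
      step2 = np.fft.fft(step1.T, axis=0).T

      # Two 1-D passes plus a remap reproduce the full 2-D transform.
      assert np.allclose(step2, np.fft.fft2(a))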

  9. NAS Parallel Benchmark Results 11-96. 1.0

    NASA Technical Reports Server (NTRS)

    Bailey, David H.; Bailey, David; Chancellor, Marisa K. (Technical Monitor)

    1997-01-01

    The NAS Parallel Benchmarks have been developed at NASA Ames Research Center to study the performance of parallel supercomputers. The eight benchmark problems are specified in a "pencil and paper" fashion. In other words, the complete details of the problem to be solved are given in a technical document, and except for a few restrictions, benchmarkers are free to select the language constructs and implementation techniques best suited for a particular system. These results represent the best results that have been reported to us by the vendors for the specific 3 systems listed. In this report, we present new NPB (Version 1.0) performance results for the following systems: DEC Alpha Server 8400 5/440, Fujitsu VPP Series (VX, VPP300, and VPP700), HP/Convex Exemplar SPP2000, IBM RS/6000 SP P2SC node (120 MHz), NEC SX-4/32, SGI/CRAY T3E, SGI Origin200, and SGI Origin2000. We also report High Performance Fortran (HPF) based NPB results for IBM SP2 Wide Nodes, HP/Convex Exemplar SPP2000, and SGI/CRAY T3D. These results have been submitted by Applied Parallel Research (APR) and Portland Group Inc. (PGI). We also present sustained performance per dollar for Class B LU, SP and BT benchmarks.

  10. RIP-REMOTE INTERACTIVE PARTICLE-TRACER

    NASA Technical Reports Server (NTRS)

    Rogers, S. E.

    1994-01-01

    Remote Interactive Particle-tracing (RIP) is a distributed-graphics program which computes particle traces for computational fluid dynamics (CFD) solution data sets. A particle trace is a line which shows the path a massless particle in a fluid will take; it is a visual image of where the fluid is going. The program is able to compute and display particle traces at a speed of about one trace per second because it runs on two machines concurrently. The data used by the program is contained in two files. The solution file contains data on density, momentum and energy quantities of a flow field at discrete points in three-dimensional space, while the grid file contains the physical coordinates of each of the discrete points. RIP requires two computers. A local graphics workstation interfaces with the user for program control and graphics manipulation, and a remote machine interfaces with the solution data set and performs time-intensive computations. The program utilizes two machines in a distributed mode for two reasons. First, the data to be used by the program is usually generated on the supercomputer. RIP avoids having to convert and transfer the data, eliminating any memory limitations of the local machine. Second, as computing the particle traces can be computationally expensive, RIP utilizes the power of the supercomputer for this task. Although the remote site code was developed on a CRAY, it is possible to port this to any supercomputer class machine with a UNIX-like operating system. Integration of a velocity field from a starting physical location produces the particle trace. The remote machine computes the particle traces using the particle-tracing subroutines from PLOT3D/AMES, a CFD post-processing graphics program available from COSMIC (ARC-12779). These routines use a second-order predictor-corrector method to integrate the velocity field. Then the remote program sends graphics tokens to the local machine via a remote-graphics library. The local machine interprets the graphics tokens and draws the particle traces. The program is menu driven. RIP is implemented on the silicon graphics IRIS 3000 (local workstation) with an IRIX operating system and on the CRAY2 (remote station) with a UNICOS 1.0 or 2.0 operating system. The IRIS 4D can be used in place of the IRIS 3000. The program is written in C (67%) and FORTRAN 77 (43%) and has an IRIS memory requirement of 4 MB. The remote and local stations must use the same user ID. PLOT3D/AMES unformatted data sets are required for the remote machine. The program was developed in 1988.
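
    The particle traces described above come from integrating the velocity field with a second-order predictor-corrector. A minimal sketch of such a step (Heun's method) for a steady velocity field is given below; the velocity function, step size, and function names are invented for illustration and do not reproduce the PLOT3D/AMES routines the program actually uses.

      import numpy as np

      def trace_particle(velocity, x0, dt, n_steps):
          # Integrate dx/dt = velocity(x) with a predictor-corrector (Heun) step.
          path = [np.asarray(x0, dtype=float)]
          x = path[0]
          for _ in range(n_steps):
              v0 = velocity(x)              # predictor: explicit Euler estimate
              x_pred = x + dt * v0
              v1 = velocity(x_pred)         # corrector: average the two slopes
              x = x + 0.5 * dt * (v0 + v1)
              path.append(x)
          return np.array(path)

      # Example: circular streamlines of a solid-body rotation field.
      swirl = lambda p: np.array([-p[1], p[0], 0.0])
      print(trace_particle(swirl, [1.0, 0.0, 0.0], 0.01, 5)[-1])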

  11. ERDC MSRC (Major Shared Resource Center) Resource. Spring 2008

    DTIC Science & Technology

    2008-01-01

    obtained from ADCIRC results. The alpha test was performed on the Cray XT3 machine (Sapphire) at ERDC and the IBM P575+ system (Babbage) at the...2008 20 Scotty Swillie (center) and Charles Ray (far right) were part of the team that constructed the DoD HPCMP booth for the Conference (From

  12. Scaling up ATLAS Event Service to production levels on opportunistic computing platforms

    NASA Astrophysics Data System (ADS)

    Benjamin, D.; Caballero, J.; Ernst, M.; Guan, W.; Hover, J.; Lesny, D.; Maeno, T.; Nilsson, P.; Tsulaia, V.; van Gemmeren, P.; Vaniachine, A.; Wang, F.; Wenaus, T.; ATLAS Collaboration

    2016-10-01

    Continued growth in public cloud and HPC resources is on track to exceed the dedicated resources available for ATLAS on the WLCG. Examples of such platforms are Amazon AWS EC2 Spot Instances, Edison Cray XC30 supercomputer, backfill at Tier 2 and Tier 3 sites, opportunistic resources at the Open Science Grid (OSG), and ATLAS High Level Trigger farm between the data taking periods. Because of specific aspects of opportunistic resources such as preemptive job scheduling and data I/O, their efficient usage requires workflow innovations provided by the ATLAS Event Service. Thanks to the finer granularity of the Event Service data processing workflow, the opportunistic resources are used more efficiently. We report on our progress in scaling opportunistic resource usage to double-digit levels in ATLAS production.

  13. High Performance Computing Software Applications for Space Situational Awareness

    NASA Astrophysics Data System (ADS)

    Giuliano, C.; Schumacher, P.; Matson, C.; Chun, F.; Duncan, B.; Borelli, K.; Desonia, R.; Gusciora, G.; Roe, K.

    The High Performance Computing Software Applications Institute for Space Situational Awareness (HSAI-SSA) has completed its first full year of applications development. The emphasis of our work in this first year was in improving space surveillance sensor models and image enhancement software. These applications are the Space Surveillance Network Analysis Model (SSNAM), the Air Force Space Fence simulation (SimFence), and the physically constrained iterative de-convolution (PCID) image enhancement software tool. Specifically, we have demonstrated an order-of-magnitude speed-up in those codes running on the latest Cray XD-1 Linux supercomputer (Hoku) at the Maui High Performance Computing Center. The software application improvements that HSAI-SSA has made have had a significant impact on the warfighter and have fundamentally changed the role of high performance computing in SSA.

  14. Relativistic Collisions of Highly-Charged Ions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ionescu, Dorin; Belkacem, Ali

    1998-11-19

    The physics of elementary atomic processes in relativistic collisions between highly-charged ions and atoms or other ions is briefly discussed, and some recent theoretical and experimental results in this field are summarized. They include excitation, capture, ionization, and electron-positron pair creation. The numerical solution of the two-center Dirac equation in momentum space is shown to be a powerful nonperturbative method for describing atomic processes in relativistic collisions involving heavy and highly-charged ions. By propagating negative-energy wave packets in time the evolution of the QED vacuum around heavy ions in relativistic motion is investigated. Recent results obtained from numerical calculations using massively parallel processing on the Cray-T3E supercomputer of the National Energy Research Scientific Computing Center (NERSC) at Berkeley National Laboratory are presented.

  15. Swarm Verification

    NASA Technical Reports Server (NTRS)

    Holzmann, Gerard J.; Joshi, Rajeev; Groce, Alex

    2008-01-01

    Reportedly, supercomputer designer Seymour Cray once said that he would sooner use two strong oxen to plow a field than a thousand chickens. Although this is undoubtedly wise when it comes to plowing a field, it is not so clear for other types of tasks. Model checking problems are of the proverbial "search the needle in a haystack" type. Such problems can often be parallelized easily. Alas, none of the usual divide and conquer methods can be used to parallelize the working of a model checker. Given that it has become easier than ever to gain access to large numbers of computers to perform even routine tasks it is becoming more and more attractive to find alternate ways to use these resources to speed up model checking tasks. This paper describes one such method, called swarm verification.

  16. WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mendygral, P. J.; Radcliffe, N.; Kandalla, K.

    2017-02-01

    We present a new code for astrophysical magnetohydrodynamics specifically designed and optimized for high performance and scaling on modern and future supercomputers. We describe a novel hybrid OpenMP/MPI programming model that emerged from a collaboration between Cray, Inc. and the University of Minnesota. This design utilizes MPI-RMA optimized for thread scaling, which allows the code to run extremely efficiently at very high thread counts ideal for the latest generation of multi-core and many-core architectures. Such performance characteristics are needed in the era of “exascale” computing. We describe and demonstrate our high-performance design in detail with the intent that it may be used as a model for other, future astrophysical codes intended for applications demanding exceptional performance.

  17. An interactive adaptive remeshing algorithm for the two-dimensional Euler equations

    NASA Technical Reports Server (NTRS)

    Slack, David C.; Walters, Robert W.; Lohner, R.

    1990-01-01

    An interactive adaptive remeshing algorithm utilizing a frontal grid generator and a variety of time integration schemes for the two-dimensional Euler equations on unstructured meshes is presented. Several device dependent interactive graphics interfaces have been developed along with a device independent DI-3000 interface which can be employed on any computer that has the supporting software including the Cray-2 supercomputers Voyager and Navier. The time integration methods available include: an explicit four stage Runge-Kutta and a fully implicit LU decomposition. A cell-centered finite volume upwind scheme utilizing Roe's approximate Riemann solver is developed. To obtain higher order accurate results a monotone linear reconstruction procedure proposed by Barth is utilized. Results for flow over a transonic circular arc and flow through a supersonic nozzle are examined.
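
    The explicit four-stage Runge-Kutta option mentioned above advances the semi-discrete equations du/dt = R(u) one step at a time. The exact stage coefficients used in the paper are not given in the abstract, so the sketch below shows the textbook four-stage variant applied to a generic residual function as an illustration only.

      import numpy as np

      def rk4_step(u, dt, residual):
          # One classical four-stage Runge-Kutta step for du/dt = residual(u).
          k1 = residual(u)
          k2 = residual(u + 0.5 * dt * k1)
          k3 = residual(u + 0.5 * dt * k2)
          k4 = residual(u + dt * k3)
          return u + dt * (k1 + 2*k2 + 2*k3 + k4) / 6.0

      # Example: one step of exponential decay du/dt = -u from u = 1.
      print(rk4_step(np.array([1.0]), 0.1, lambda u: -u))   # close to exp(-0.1)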

  18. High Productivity Computing Systems and Competitiveness Initiative

    DTIC Science & Technology

    2007-07-01

    planning committee for the annual, international Supercomputing Conference in 2004 and 2005. This is the leading HPC industry conference in the world. It...sector partnerships. Partnerships will form a key part of discussions at the 2nd High Performance Computing Users Conference, planned for July 13, 2005...other things an interagency roadmap for high-end computing core technologies and an accessibility improvement plan . Improving HPC Education and

  19. ALCF Data Science Program: Productive Data-centric Supercomputing

    NASA Astrophysics Data System (ADS)

    Romero, Nichols; Vishwanath, Venkatram

    The ALCF Data Science Program (ADSP) is targeted at big data science problems that require leadership computing resources. The goal of the program is to explore and improve a variety of computational methods that will enable data-driven discoveries across all scientific disciplines. The projects will focus on data science techniques covering a wide area of discovery including but not limited to uncertainty quantification, statistics, machine learning, deep learning, databases, pattern recognition, image processing, graph analytics, data mining, real-time data analysis, and complex and interactive workflows. Project teams will be among the first to access Theta, ALCF's forthcoming 8.5 petaflops Intel/Cray system. The program will transition to the 200 petaflop/s Aurora supercomputing system when it becomes available. In 2016, four projects were selected to kick off the ADSP. The selected projects span experimental and computational sciences and range from modeling the brain to discovering new materials for solar-powered windows to simulating collision events at the Large Hadron Collider (LHC). The program will have a regular call for proposals, with the next call expected in Spring 2017 (http://www.alcf.anl.gov/alcf-data-science-program). This research used resources of the ALCF, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357.

  20. μπ: A Scalable and Transparent System for Simulating MPI Programs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Perumalla, Kalyan S

    2010-01-01

    μπ is a scalable, transparent system for experimenting with the execution of parallel programs on simulated computing platforms. The level of simulated detail can be varied for application behavior as well as for machine characteristics. Unique features of μπ are repeatability of execution, scalability to millions of simulated (virtual) MPI ranks, scalability to hundreds of thousands of host (real) MPI ranks, portability of the system to a variety of host supercomputing platforms, and the ability to experiment with scientific applications whose source-code is available. The set of source-code interfaces supported by μπ is being expanded to support a wider set of applications, and MPI-based scientific computing benchmarks are being ported. In proof-of-concept experiments, μπ has been successfully exercised to spawn and sustain very large-scale executions of an MPI test program given in source code form. Low slowdowns are observed, due to its use of a purely discrete-event style of execution, and due to the scalability and efficiency of the underlying parallel discrete event simulation engine, μsik. In the largest runs, μπ has been executed on up to 216,000 cores of a Cray XT5 supercomputer, successfully simulating over 27 million virtual MPI ranks, each virtual rank containing its own thread context, and all ranks fully synchronized by virtual time.

  1. Implementation of molecular dynamics and its extensions with the coarse-grained UNRES force field on massively parallel systems; towards millisecond-scale simulations of protein structure, dynamics, and thermodynamics

    PubMed Central

    Liwo, Adam; Ołdziej, Stanisław; Czaplewski, Cezary; Kleinerman, Dana S.; Blood, Philip; Scheraga, Harold A.

    2010-01-01

    We report the implementation of our united-residue UNRES force field for simulations of protein structure and dynamics with massively parallel architectures. In addition to coarse-grained parallelism already implemented in our previous work, in which each conformation was treated by a different task, we introduce a fine-grained level in which energy and gradient evaluation are split between several tasks. The Message Passing Interface (MPI) libraries have been utilized to construct the parallel code. The parallel performance of the code has been tested on a professional Beowulf cluster (Xeon Quad Core), a Cray XT3 supercomputer, and two IBM BlueGene/P supercomputers with canonical and replica-exchange molecular dynamics. With IBM BlueGene/P, about 50 % efficiency and 120-fold speed-up of the fine-grained part was achieved for a single trajectory of a 767-residue protein with use of 256 processors/trajectory. Because of averaging over the fast degrees of freedom, UNRES provides an effective 1000-fold speed-up compared to the experimental time scale and, therefore, enables us to effectively carry out millisecond-scale simulations of proteins with 500 and more amino-acid residues in days of wall-clock time. PMID:20305729
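
    The fine-grained level described above splits the energy and gradient evaluation for a single conformation across several MPI tasks and then combines the pieces. The sketch below shows that general pattern with mpi4py; the pair list, the placeholder energy term, and all names are invented for illustration and are not the actual UNRES force-field terms.

      from mpi4py import MPI
      import numpy as np

      comm = MPI.COMM_WORLD
      rank, size = comm.Get_rank(), comm.Get_size()

      n_sites = 1000                           # coarse-grained interaction sites
      coords = np.zeros((n_sites, 3))          # every task sees the same conformation

      # Placeholder short-range pair list; each task takes a round-robin slice.
      pairs = [(i, j) for i in range(n_sites) for j in range(i + 1, min(i + 5, n_sites))]
      my_pairs = pairs[rank::size]

      def pair_energy(i, j):                   # stand-in for a real force-field term
          return 0.001 * abs(i - j)

      local_energy = sum(pair_energy(i, j) for i, j in my_pairs)

      # Fine-grained parallelism: partial energies from the tasks that share one
      # trajectory are reduced to a single total, mirroring the split in the paper.
      total_energy = comm.allreduce(local_energy, op=MPI.SUM)
      if rank == 0:
          print("total energy:", total_energy)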

  2. Super and parallel computers and their impact on civil engineering

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kamat, M.P.

    1986-01-01

    This book presents the papers given at a conference on the use of supercomputers in civil engineering. Topics considered at the conference included solving nonlinear equations on a hypercube, a custom architectured parallel processing system, distributed data processing, algorithms, computer architecture, parallel processing, vector processing, computerized simulation, and cost benefit analysis.

  3. Implementation of Helioseismic Data Reduction and Diagnostic Techniques on Massively Parallel Architectures

    NASA Technical Reports Server (NTRS)

    Korzennik, Sylvain

    1997-01-01

    Under the direction of Dr. Rhodes, and the technical supervision of Dr. Korzennik, the data assimilation of high spatial resolution solar dopplergrams has been carried out throughout the program on the Intel Delta Touchstone supercomputer. With the help of a research assistant, partially supported by this grant, and under the supervision of Dr. Korzennik, code development was carried out at SAO, using various available resources. To ensure cross-platform portability, PVM was selected as the message passing library. A parallel implementation of power spectra computation for helioseismology data reduction, using PVM was successfully completed. It was successfully ported to SMP architectures (i.e. SUN), and to some MPP architectures (i.e. the CM5). Due to limitation of the implementation of PVM on the Cray T3D, the port to that architecture was not completed at the time.

  4. Scalability of Parallel Spatial Direct Numerical Simulations on Intel Hypercube and IBM SP1 and SP2

    NASA Technical Reports Server (NTRS)

    Joslin, Ronald D.; Hanebutte, Ulf R.; Zubair, Mohammad

    1995-01-01

    The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube and IBM SP1 and SP2 parallel computers is documented. Spatially evolving disturbances associated with the laminar-to-turbulent transition in boundary-layer flows are computed with the PSDNS code. The feasibility of using the PSDNS to perform transition studies on these computers is examined. The results indicate that the PSDNS approach can effectively be parallelized on a distributed-memory parallel machine by remapping the distributed data structure during the course of the calculation. Scalability information is provided to estimate computational costs to match the actual costs relative to changes in the number of grid points. By increasing the number of processors, slower than linear speedups are achieved with optimized (machine-dependent library) routines. This slower than linear speedup results because the computational cost is dominated by the FFT routine, which yields less than ideal speedups. By using appropriate compile options and optimized library routines on the SP1, the serial code achieves 52-56 Mflops on a single node of the SP1 (45 percent of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a "real world" simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP supercomputer. For the same simulation, 32 nodes of the SP1 and SP2 are required to reach the performance of a Cray C-90. A 32-node SP1 (SP2) configuration is 2.9 (4.6) times faster than a Cray Y/MP for this simulation, while the hypercube is roughly 2 times slower than the Y/MP for this application. KEY WORDS: Spatial direct numerical simulations; incompressible viscous flows; spectral methods; finite differences; parallel computing.

  5. NASA Astrophysics Data System (ADS)

    Morisset, C.; Delgado-Inglada, G.; Torres-Peimbert, S.

    2014-04-01

    Most - if not all - planetary nebulae exhibit a complex structure, far from the spherical shape. The reasons for this dramatic change in symmetry, that occurs in early stage of the development of the nebula, remain controversial. The same physics operates in a variety of stars, from young (winds from young stars and/or high mass stars) to old (novae, symbiotic stars). The aim of the APN series of conferences has been to offer the opportunity to anyone involved in the study of asymmetric planetary nebulae (and related objects) to discuss the latest results obtained in this field. The APN VI conference was organized by the Instituto de Astronomia (UNAM) and took place on Riviera Maya, Quintana Roo, México, 4-8 Nov. 2013

  6. Gigabit ATM: another technical mistake?

    NASA Astrophysics Data System (ADS)

    Christ, Paul

    1998-09-01

    Once upon a time, or more precisely during February 1988 at the CCITT Seoul plenary, and definitely arriving as a revolution, ATM hit the hard-core B-ISDN circuit-switching gang. Initiated by the Telecoms' camp, but, surprisingly, soon to be pushed by computer-minded people, ATM's generic technological history is somewhat richer than single-sided stories. Here are two classical elements of that history: Firstly, together with X.25, ATM suffers from the connection versus datagram dichotomy, well known for more than twenty years. Secondly, and lesser known, ATM's use of cells in support of the 'I' of B-ISDN was questioned from the very beginning by the packet switching camp. Furthermore, in this context, there are two other essential elements to be considered: Firstly, the exponential growth of the Internet and later intranets, using Internet technology, sparked by the success of the Web and the WINTEL alliance, resulted in a corresponding demand for both aggregate and end-system network bandwidth. Secondly, servers, historically restricted to the exclusive club of HIPPI-equipped supercomputers, suddenly became ordinary high-end PCs with 64-bit wide PCI busses -- definitely aiming at the Gigabit. Here, if your aim is for Gigabit ATM with 5000-transactions-per-second classical supercomputers, a 65K ATM MTU -- as implemented by Cray -- might be okay. Following Clark and others, another part of the story is the adoption and redefinition, by the IETF, of the Telecoms' notion of 'Integrated Services' and QoS mechanisms. The quest for low-delay IP packet forwarding, perhaps possible over ATM cut-throughs, has resulted in the switching versus/or integrated-with-routing movement. However, a blow for ATM may be the recent results concerning fast routing table lookup algorithms. This, by making Gigabit routing possible using ordinary Pentium processors, may eventually render the much-prophesied ATM switching performance unnecessary. Recently, with the rise of Gigabit Ethernet, many of the elements mentioned above are now being presented at standard 'Gigabit Ethernet and Gigabit ATM -- friends or foes' conferences. In-depth analyses are given concerning the canonical elements of such a setting: legacy, new use requirements, manageability, security, LAN-WAN, architectures, standards, technologies and products, complexity, evolution-transition strategies, manufacturers, player organizations, etc. Often in such conferences, Fibre Channel, being one of Gigabit Ethernet's physical media, is presented as the only other Gigabit LAN technology. In an attempt to sum up: Given the present state of ATM deployment measured in terms of functionalities and sophistication, after ten years of CCITT/ITU and almost as many years of ATM Forum effort, does the question still being asked now represent the answer -- ATM is or was a mistake -- or are there some elements still missing? Here's a technical and a political example:

  7. GPU acceleration of a petascale application for turbulent mixing at high Schmidt number using OpenMP 4.5

    NASA Astrophysics Data System (ADS)

    Clay, M. P.; Buaria, D.; Yeung, P. K.; Gotoh, T.

    2018-07-01

    This paper reports on the successful implementation of a massively parallel GPU-accelerated algorithm for the direct numerical simulation of turbulent mixing at high Schmidt number. The work stems from a recent development (Comput. Phys. Commun., vol. 219, 2017, 313-328), in which a low-communication algorithm was shown to attain high degrees of scalability on the Cray XE6 architecture when overlapping communication and computation via dedicated communication threads. An even higher level of performance has now been achieved using OpenMP 4.5 on the Cray XK7 architecture, where on each node the 16 integer cores of an AMD Interlagos processor share a single Nvidia K20X GPU accelerator. In the new algorithm, data movements are minimized by performing virtually all of the intensive scalar field computations in the form of combined compact finite difference (CCD) operations on the GPUs. A memory layout in departure from usual practices is found to provide much better performance for a specific kernel required to apply the CCD scheme. Asynchronous execution enabled by adding the OpenMP 4.5 NOWAIT clause to TARGET constructs improves scalability when used to overlap computation on the GPUs with computation and communication on the CPUs. On the 27-petaflops supercomputer Titan at Oak Ridge National Laboratory, USA, a GPU-to-CPU speedup factor of approximately 5 is consistently observed at the largest problem size of 81923 grid points for the scalar field computed with 8192 XK7 nodes.

  8. Reorientation of rotating fluid in microgravity environment with and without gravity jitters

    NASA Technical Reports Server (NTRS)

    Hung, R. J.; Lee, C. C.; Shyu, K. L.

    1990-01-01

    In spacecraft design, the requirements for settled propellant differ for tank pressurization, engine restart, venting, and propellant transfer. The need to settle or position liquid fuel over the outlet end of the spacecraft propellant tank prior to main engine restart poses a microgravity fluid-behavior problem. In this paper, the dynamical behavior of liquid propellant during reorientation and resettling is simulated on a CRAY X-MP supercomputer to study fluid management in a microgravity environment. Results show that fluid resettlement is accomplished more efficiently in a rotating tank than in a nonrotating tank, and that settlement with imposed gravity jitters outperforms settlement without them, as measured by the time required between the initiation and termination of geysering.

  9. ARC-2012-ACD12-0020-005

    NASA Image and Video Library

    2012-02-10

    Then and Now: These images illustrate the dramatic improvement in NASA computing power over the last 23 years, and its effect on the number of grid points used for flow simulations. At left, an image from the first full-body Navier-Stokes simulation (1988) of an F-16 fighter jet showing pressure on the aircraft body, and fore-body streamlines at Mach 0.90. This steady-state solution took 25 hours using a single Cray X-MP processor to solve the 500,000 grid-point problem. Investigator: Neal Chaderjian, NASA Ames Research Center At right, a 2011 snapshot from a Navier-Stokes simulation of a V-22 Osprey rotorcraft in hover. The blade vortices interact with the smaller turbulent structures. This very detailed simulation used 660 million grid points, and ran on 1536 processors on the Pleiades supercomputer for 180 hours. Investigator: Neal Chaderjian, NASA Ames Research Center; Image: Tim Sandstrom, NASA Ames Research Center

  10. Parallelization of the FLAPW method

    NASA Astrophysics Data System (ADS)

    Canning, A.; Mannstadt, W.; Freeman, A. J.

    2000-08-01

    The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining structural, electronic and magnetic properties of crystals and surfaces. Until the present work, the FLAPW method has been limited to systems of less than about a hundred atoms due to the lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work, we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell, running on up to 512 processors on a CRAY T3E parallel supercomputer.
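
    A hedged sketch of the distribution scheme described above, not the published FLAPW code: the plane-wave coefficients of each state are split across MPI ranks, and a quantity such as the overlap between two states is obtained from a local partial sum followed by MPI_Allreduce. All names, sizes, and values are illustrative, and real coefficients stand in for the complex ones a real code would use.

    ```c
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const int npw_total = 1 << 16;            /* total plane waves (made up)  */
        const int npw_local = npw_total / size;   /* G-vectors owned by this rank */

        /* Local slices of two states' plane-wave coefficients. */
        double *c_m = malloc(npw_local * sizeof *c_m);
        double *c_n = malloc(npw_local * sizeof *c_n);
        for (int g = 0; g < npw_local; ++g) { c_m[g] = 1e-3; c_n[g] = 2e-3; }

        /* Local partial overlap, then global sum over all ranks. */
        double local = 0.0, overlap = 0.0;
        for (int g = 0; g < npw_local; ++g)
            local += c_m[g] * c_n[g];
        MPI_Allreduce(&local, &overlap, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

        if (rank == 0)
            printf("<psi_m|psi_n> = %g (summed over %d ranks)\n", overlap, size);

        free(c_m); free(c_n);
        MPI_Finalize();
        return 0;
    }
    ```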

  11. Parallel performance optimizations on unstructured mesh-based simulations

    DOE PAGES

    Sarje, Abhinav; Song, Sukhyun; Jacobsen, Douglas; ...

    2015-06-01

    This paper addresses two key parallelization challenges in the unstructured mesh-based ocean modeling code MPAS-Ocean, which uses a mesh based on Voronoi tessellations: (1) load imbalance across processes, and (2) unstructured data access patterns that inhibit intra- and inter-node performance. Our work analyzes the load imbalance due to naive partitioning of the mesh, and develops methods to generate mesh partitioning with better load balance and reduced communication. Furthermore, we present methods that minimize both inter- and intranode data movement and maximize data reuse. Our techniques include predictive ordering of data elements for higher cache efficiency, as well as communication reduction approaches. We present detailed performance data when running on thousands of cores using the Cray XC30 supercomputer and show that our optimization strategies can exceed the original performance by over 2×. Additionally, many of these solutions can be broadly applied to a wide variety of unstructured grid-based computations.
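
    The "predictive ordering of data elements for higher cache efficiency" can be illustrated with a small hedged sketch (not the MPAS-Ocean implementation): cells are permuted so that those belonging to the same partition or block become contiguous in memory before the main loops run. All sizes and values are invented.

    ```c
    #include <stdio.h>
    #include <stdlib.h>

    /* Build a stable permutation that groups cells of the same partition
     * together (counting sort on partition id), then gather a field through it. */
    static void reorder_by_partition(int ncells, int nparts, const int *part,
                                     const double *in, double *out, int *perm)
    {
        int *offset = calloc(nparts + 1, sizeof *offset);
        for (int c = 0; c < ncells; ++c) offset[part[c] + 1]++;
        for (int p = 0; p < nparts; ++p) offset[p + 1] += offset[p];
        for (int c = 0; c < ncells; ++c) perm[offset[part[c]]++] = c;
        for (int c = 0; c < ncells; ++c) out[c] = in[perm[c]];
        free(offset);
    }

    int main(void)
    {
        const int ncells = 8, nparts = 2;
        int    part[8] = { 0, 1, 0, 1, 0, 1, 0, 1 };   /* toy partition  */
        double temp[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };   /* toy cell field */
        double reordered[8];
        int    perm[8];

        reorder_by_partition(ncells, nparts, part, temp, reordered, perm);
        for (int c = 0; c < ncells; ++c)
            printf("new cell %d <- old cell %d (value %g)\n", c, perm[c], reordered[c]);
        return 0;
    }
    ```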

  12. Non-linear wave phenomena in Josephson elements for superconducting electronics

    NASA Astrophysics Data System (ADS)

    Christiansen, P. L.; Parmentier, R. D.; Skovgaard, O.

    1985-07-01

    The long and intermediate length Josephson tunnel junction oscillator with overlap geometry, in linear and circular configurations, is investigated by computational solution of the perturbed sine-Gordon equation model and by experimental measurements. The model predicts the experimental results very well. Line oscillators as well as ring oscillators are treated. For long junctions, soliton perturbation methods are developed and turn out to be efficient prediction tools, also providing physical understanding of the dynamics of the oscillator. For intermediate length junctions, expansions in terms of linear cavity modes reduce computational costs. The narrow linewidth of the electromagnetic radiation (typically 1 kHz for a line at 10 GHz) is demonstrated experimentally. Corresponding computer simulations, requiring a relative accuracy better than 10⁻⁷, are performed on the CRAY-1-S supercomputer. The broadening of the linewidth due to external microwave radiation and internal thermal noise is determined.

  13. Mapping to Irregular Torus Topologies and Other Techniques for Petascale Biomolecular Simulation

    PubMed Central

    Phillips, James C.; Sun, Yanhua; Jain, Nikhil; Bohm, Eric J.; Kalé, Laxmikant V.

    2014-01-01

    Currently deployed petascale supercomputers typically use toroidal network topologies in three or more dimensions. While these networks perform well for topology-agnostic codes on a few thousand nodes, leadership machines with 20,000 nodes require topology awareness to avoid network contention for communication-intensive codes. Topology adaptation is complicated by irregular node allocation shapes and holes due to dedicated input/output nodes or hardware failure. In the context of the popular molecular dynamics program NAMD, we present methods for mapping a periodic 3-D grid of fixed-size spatial decomposition domains to 3-D Cray Gemini and 5-D IBM Blue Gene/Q toroidal networks to enable hundred-million atom full machine simulations, and to similarly partition node allocations into compact domains for smaller simulations using multiple-copy algorithms. Additional enabling techniques are discussed and performance is reported for NCSA Blue Waters, ORNL Titan, ANL Mira, TACC Stampede, and NERSC Edison. PMID:25594075

  14. Thermal Properties of Mineralized and Non Mineralized Type I Collagen in Bone

    DTIC Science & Technology

    2002-04-01

    Heredia, E. Villarreal, J. Ocotlán-Flores, A. L. Gómez-Cortés, F. J. Aranda-Manteca, E. Orozco and L. Bucio. Instituto de Física, UNAM, Ciudad... Investigaciones en Materiales, UNAM, Ciudad Universitaria, Coyoacán, C.P. 04510, México D.F. Instituto de Investigaciones Antropológicas, UNAM... Universitaria, Coyoacán, C.P. 01000, México D.F. Centro de Instrumentos, UNAM, Ciudad Universitaria, Coyoacán, C.P. 04510, México D.F. Instituto de

  15. The ASCI Network for SC '99: A Step on the Path to a 100 Gigabit Per Second Supercomputing Network

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    PRATT,THOMAS J.; TARMAN,THOMAS D.; MARTINEZ,LUIS M.

    2000-07-24

    This document highlights the DISCOM² Distance Computing and Communication team's activities at the 1999 Supercomputing conference in Portland, Oregon. This conference is sponsored by the IEEE and ACM. Sandia, Lawrence Livermore and Los Alamos National Laboratories have participated in this conference for eleven years. For the last four years the three laboratories have come together at the conference under the DOE's ASCI (Accelerated Strategic Computing Initiative) rubric. Communication support for the ASCI exhibit is provided by the ASCI DISCOM² project. The DISCOM² communication team uses this forum to demonstrate and focus communication and networking developments within the community. At SC 99, DISCOM built a prototype of the next generation ASCI network, demonstrated remote clustering techniques, demonstrated the capabilities of the emerging Terabit Router products, demonstrated the latest technologies for delivering visualization data to the scientific users, and demonstrated the latest in encryption methods including IP VPN technologies and ATM encryption research. The authors also coordinated the other production networking activities within the booth and between their demonstration partners on the exhibit floor. This paper documents those accomplishments, discusses the details of their implementation, and describes how these demonstrations support Sandia's overall strategies in ASCI networking.

  16. Microstructure and Thermal Expansion Properties of Ostrich Eggshell

    DTIC Science & Technology

    2002-04-01

    A. Rodríguez-Hernández, E. Villarreal, A. Martínez, M. V. García-Garduño, V. A. Basiuk, L. Bucio and E. Orozco. Instituto de Física, UNAM, Apdo. Postal 20... Ku, Yokohama 240-8501. Instituto de Investigaciones en Materiales, UNAM, 04510 México D.F., México. ABSTRACT: Textures of calcite crystals from ostrich... 364, 01000 México D.F., México. Facultad de Ciencias, UNAM, 04510 México D.F., México. Div. de Posgr. e Invest., Facultad de Odontología, UNAM, 04510

  17. Committees and sponsors

    NASA Astrophysics Data System (ADS)

    2011-10-01

    International Advisory Committee: Richard F Casten (Yale, USA), Luiz Carlos Chamon (São Paulo, Brazil), Osvaldo Civitarese (La Plata, Argentina), Jozsef Cseh (ATOMKI, Hungary), Jerry P Draayer (LSU, USA), Alfredo Galindo-Uribarri (ORNL & UT, USA), James J Kolata (Notre Dame, USA), Jorge López (UTEP, USA), Joseph B Natowitz (Texas A & M, USA), Ma Esther Ortiz (IF-UNAM), Stuart Pittel (Delaware, USA), Andrés Sandoval (IF-UNAM), Adam Szczepaniak (Indiana, USA), Piet Van Isacker (GANIL, France), Michael Wiescher (Notre Dame, USA). Organizing Committee: Libertad Barrón-Palos (Chair, IF-UNAM), Roelof Bijker (ICN-UNAM), Ruben Fossion (ICN-UNAM), David Lizcano (ININ). Sponsors: Instituto de Ciencias Nucleares, UNAM; Instituto de Física, UNAM; Instituto Nacional de Investigaciones Nucleares; División de Física Nuclear de la SMF; Centro Latinoamericano de Física.

  18. Contention Modeling for Multithreaded Distributed Shared Memory Machines: The Cray XMT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Secchi, Simone; Tumeo, Antonino; Villa, Oreste

    Distributed Shared Memory (DSM) machines are a wide class of multi-processor computing systems where a large virtually-shared address space is mapped on a network of physically distributed memories. High memory latency and network contention are two of the main factors that limit performance scaling of such architectures. Modern high-performance computing DSM systems have evolved toward exploitation of massive hardware multi-threading and fine-grained memory hashing to tolerate irregular latencies, avoid network hot-spots and enable high scaling. In order to model the performance of such large-scale machines, parallel simulation has proved to be a promising approach for achieving good accuracy in reasonable times. One of the most critical factors in solving the simulation speed-accuracy trade-off is network modeling. The Cray XMT is a massively multi-threaded supercomputing architecture that belongs to the DSM class, since it implements a globally-shared address space abstraction on top of a physically distributed memory substrate. In this paper, we discuss the development of a contention-aware network model intended to be integrated in a full-system XMT simulator. We start by measuring the effects of network contention in a 128-processor XMT machine and then investigate the trade-off that exists between simulation accuracy and speed, by comparing three network models which operate at different levels of accuracy. The comparison and model validation is performed by executing a string-matching algorithm on the full-system simulator and on the XMT, using three datasets that generate noticeably different contention patterns.

  19. New Computer Simulations of Macular Neural Functioning

    NASA Technical Reports Server (NTRS)

    Ross, Muriel D.; Doshay, D.; Linton, S.; Parnas, B.; Montgomery, K.; Chimento, T.

    1994-01-01

    We use high performance graphics workstations and supercomputers to study the functional significance of the three-dimensional (3-D) organization of gravity sensors. These sensors have a prototypic architecture foreshadowing more complex systems. Scaled-down simulations run on a Silicon Graphics workstation and scaled-up, 3-D versions run on a Cray Y-MP supercomputer. A semi-automated method of reconstruction of neural tissue from serial sections studied in a transmission electron microscope has been developed to eliminate tedious conventional photography. The reconstructions use a mesh as a step in generating a neural surface for visualization. Two meshes are required to model calyx surfaces. The meshes are connected and the resulting prisms represent the cytoplasm and the bounding membranes. A finite volume analysis method is employed to simulate voltage changes along the calyx in response to synapse activation on the calyx or on calyceal processes. The finite volume method insures that charge is conserved at the calyx-process junction. These and other models indicate that efferent processes act as voltage followers, and that the morphology of some afferent processes affects their functioning. In a final application, morphological information is symbolically represented in three dimensions in a computer. The possible functioning of the connectivities is tested using mathematical interpretations of physiological parameters taken from the literature. Symbolic, 3-D simulations are in progress to probe the functional significance of the connectivities. This research is expected to advance computer-based studies of macular functioning and of synaptic plasticity.
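
    The charge-conserving finite-volume idea can be illustrated with a hedged toy sketch (not the authors' calyx model): compartment voltages are updated from face fluxes that are added with opposite signs to the two neighbouring compartments, so the charge exchanged across any junction sums to zero. All parameters and units are invented.

    ```c
    #include <stdio.h>

    /* Explicit finite-volume update of membrane voltage in a 1-D chain of
     * compartments; antisymmetric face fluxes conserve exchanged charge. */
    #define NC 16

    int main(void)
    {
        double v[NC] = { 0.0 };            /* voltage relative to rest (arb.) */
        const double cm   = 1.0;           /* membrane capacitance (arb.)     */
        const double g_ax = 0.5;           /* axial conductance between cells */
        const double g_lk = 0.1;           /* leak conductance                */
        const double dt   = 0.01;          /* time step (arb. units)          */
        const int    nsteps = 1000;

        for (int s = 0; s < nsteps; ++s) {
            double dv[NC] = { 0.0 };

            /* Synaptic input injected into the first compartment. */
            dv[0] += 1.0;

            /* Face fluxes: equal and opposite contributions to the neighbours. */
            for (int f = 0; f < NC - 1; ++f) {
                double flux = g_ax * (v[f] - v[f + 1]);
                dv[f]     -= flux;
                dv[f + 1] += flux;
            }

            /* Leak term and explicit Euler update. */
            for (int c = 0; c < NC; ++c)
                v[c] += dt * (dv[c] - g_lk * v[c]) / cm;
        }
        printf("voltages after %d steps: v[0]=%.3f  v[%d]=%.3f\n",
               nsteps, v[0], NC - 1, v[NC - 1]);
        return 0;
    }
    ```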

  20. Multiprocessing on supercomputers for computational aerodynamics

    NASA Technical Reports Server (NTRS)

    Yarrow, Maurice; Mehta, Unmeel B.

    1990-01-01

    Very little use is made of the multiple processors available on current supercomputers (computers with a theoretical peak performance capability of 100 MFLOPS or more) in computational aerodynamics to significantly improve turnaround time. The productivity of a computer user is directly related to this turnaround time. In a time-sharing environment, the improvement in this speed is achieved when multiple processors are used efficiently to execute an algorithm. The concept of multiple instruction, multiple data (MIMD) through multi-tasking is applied via a strategy which requires relatively minor modifications to an existing code for a single processor. Essentially, this approach maps the available memory to multiple processors, exploiting the C-FORTRAN-Unix interface. The existing single processor code is mapped without the need for developing a new algorithm. The procedure for building a code utilizing this approach is automated with the Unix stream editor. As a demonstration of this approach, a Multiple Processor Multiple Grid (MPMG) code is developed. It is capable of using nine processors, and can be easily extended to a larger number of processors. This code solves the three-dimensional, Reynolds averaged, thin-layer and slender-layer Navier-Stokes equations with an implicit, approximately factored and diagonalized method. The solver is applied to a generic oblique-wing aircraft problem on a four-processor Cray-2 computer. A tricubic interpolation scheme is developed to increase the accuracy of coupling of overlapped grids. For the oblique-wing aircraft problem, a speedup of two in elapsed (turnaround) time is observed in a saturated time-sharing environment.

  1. Public Outreach and Educational Experiences in Mexico and Latin American communities in California

    NASA Astrophysics Data System (ADS)

    Andres De Leo-Winkler, Mario; Canalizo, Gabriela; Pichardo, Barbara; Arias, Brenda

    2015-08-01

    I have created and applied diverse methods in public outreach at the National Autonomous University of Mexico (UNAM) since 2001. A student-led volunteer astronomical club has been created, the biggest in Mexico; we serve over 10,000 people per year. We have created public outreach activities for the general audience: archaeo-astronomical outings, scientific movie debates, conferences, courses, and public telescope viewings. We have also worked with juvenile delinquents to offer them scientific opportunities when released from jail. I have also created and run the social media presence of the Institute of Astronomy, UNAM, which is currently the biggest social media site on astronomy in Spanish in the world. I created and organized a mass photo exhibition (over 1 million people served) for the Institute of Astronomy, UNAM, which was citizen-funded through an online platform, the first of its kind in the country. Together with my colleagues, we created workshops on astronomy for children with funding from the Mexican government. I have participated in several radio and television programs and capsules designed to bring astronomy to the general audience; one in particular ("Astrophysics for Dummies") was very successful on nationwide Mexican radio. I am currently applying all of these experiences to develop a new public outreach project on astronomy for the University of California - Riverside and its on-campus and surrounding Latin American communities. We are offering new workshops for blind and deaf children. We want to integrate the Latino community into our outreach activities and offer science in their language in a simple and entertaining fashion. We have also successfully applied astrophotography as a course that brings social-science and arts undergraduate students into the natural sciences. Sharing experiences, success stories, and failure stories will help new and experienced educators and public outreach professionals learn and improve from past experience.

  2. New Developments Regarding the KT Event and Other Catastrophes in Earth History

    NASA Astrophysics Data System (ADS)

    This volume contains papers that have been accepted for presentation at the conference on New Developments Regarding the KT Event and Other Catastrophes in Earth History, February 9-12, 1994, in Houston, Texas. The Program Committee consisted of W. Alvarez (University of California, Berkeley), D. Black (Lunar and Planetary Institute), J. Bourgeois (National Science Foundation), K. Burke (University of Houston), R. Ginsburg (University of Miami), G. Keller (Princeton University), C. Koeberl (University of Vienna), J. Longoria (Florida International University), G. Ryder (Lunar and Planetary Institute), V. Sharpton, convener (Lunar and Planetary Institute), H. Sigurdsson (University of Rhode Island), R. Turco (University of California, Los Angeles), and P. Ward (University of Washington). The Scientific Organizing Committee consisted of W. Alvarez (University of California, Berkeley), D. Black (Lunar and Planetary Institute), K. Burke (University of Houston), R. Ginsburg (University of Miami), L. Hunt (National Academy of Sciences), G. Keller (Princeton University), L. Marin (UNAM, cd. Universitaria), D. Raup (University of Chicago), V. Sharpton (Lunar and Planetary Institute), E. Shoemaker (U.S. Geological Survey, Flagstaff), and G. Suarez (UNAM, cd. Universitaria). Logistics and administrative and publications support were provided by the Publications and Program Services Department staff at the Lunar and Planetary Institute.

  3. Multitasking and microtasking experience on the NAS Cray-2 and ACF Cray X-MP

    NASA Technical Reports Server (NTRS)

    Raiszadeh, Farhad

    1987-01-01

    The fast Fourier transform (FFT) kernel of the NAS benchmark program has been utilized to experiment with the multitasking library on the Cray-2 and Cray X-MP/48, and microtasking directives on the Cray X-MP. Some performance figures are shown, and the state of multitasking software is described.

  4. UNAM Scientific Drilling Program of Chicxulub Impact Structure-Evidence for a 300 kilometer crater diameter

    NASA Astrophysics Data System (ADS)

    Urrutia-Fucugauchi, J.; Marin, L.; Trejo-Garcia, A.

    As part of the UNAM drilling program at the Chicxulub structure, two 700 m deep continuously cored boreholes were completed between April and July, 1995. The Peto UNAM-6 and Tekax UNAM-7 drilling sites are ˜150 km and 125 km, respectively, SSE of Chicxulub Puerto, near the crater's center. Core samples from both sites show a sequence of post-crater carbonates on top of a thick impact breccia pile covering the disturbed Mesozoic platform rocks. At UNAM-7, two impact breccia units were encountered: (1) an upper breccia, whose mean magnetic susceptibility is high (˜55 × 10⁻⁶ SI units), indicating a large component of silicate basement has been incorporated into this breccia, and (2) an evaporite-rich, low susceptibility impact breccia similar in character to the evaporite-rich breccias observed at the PEMEX drill sites further out. The upper breccia was encountered at ˜226 m below the surface and is ˜125 m thick; the lower breccia is immediately subjacent and is >240 m thick. This two-breccia sequence is typical of the suevite-Bunte breccia sequence found within other well preserved impact craters. The suevitic upper unit is not present at UNAM-6. Instead, a >240 m thick evaporite-rich breccia unit, similar to the lower breccia at UNAM-7, was encountered at a depth of ˜280 m. The absence of an upper breccia equivalent at UNAM-6 suggests some portion of the breccia sequence has been removed by erosion. This is consistent with interpretations that place the high-standing crater rim at 130-150 km from the center. Consequently, the stratigraphic observations and magnetic susceptibility records on the upper and lower breccias (depth and thickness) support a ˜300 km diameter crater model.

  5. Parallel Performance Optimizations on Unstructured Mesh-based Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sarje, Abhinav; Song, Sukhyun; Jacobsen, Douglas

    2015-01-01

    This paper addresses two key parallelization challenges in the unstructured mesh-based ocean modeling code MPAS-Ocean, which uses a mesh based on Voronoi tessellations: (1) load imbalance across processes, and (2) unstructured data access patterns that inhibit intra- and inter-node performance. Our work analyzes the load imbalance due to naive partitioning of the mesh, and develops methods to generate mesh partitioning with better load balance and reduced communication. Furthermore, we present methods that minimize both inter- and intranode data movement and maximize data reuse. Our techniques include predictive ordering of data elements for higher cache efficiency, as well as communication reduction approaches. We present detailed performance data when running on thousands of cores using the Cray XC30 supercomputer and show that our optimization strategies can exceed the original performance by over 2×. Additionally, many of these solutions can be broadly applied to a wide variety of unstructured grid-based computations.

  6. Molecular orbital studies of the bonding in heavy element organometallics: Progress report

    NASA Astrophysics Data System (ADS)

    Bursten, B. E.

    1988-03-01

    Over the past two years we have made considerable progress in the understanding of the bonding in heavy element mononuclear and binuclear complexes. For mononuclear complexes, our strategy has been to study the orbital interactions between the actinide metal center and the surrounding ligands. One particular system which has been studied extensively is X₃AnL (where X = Cp, Cl, NH₂; An = actinide; and L = neutral or anionic ligand). We are interested not only in the mechanics of the An-X orbital interactions, but also how the relative donor characteristics of X may influence coordination of the fourth ligand L to the actinide. For binuclear systems, we are interested not only in homobimetallic complexes, but also in heterobimetallic complexes containing actinides and transition metals. In order to make the calculations of such large systems tractable, we have transferred the X-alpha-SW codes to the newly acquired Cray XMP24 at the Ohio Supercomputer Center. This has resulted in significant savings of money and time.

  7. Thought Leaders during Crises in Massive Social Networks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Corley, Courtney D.; Farber, Robert M.; Reynolds, William

    The vast amount of social media data that can be gathered from the internet, coupled with workflows that utilize both commodity systems and massively parallel supercomputers such as the Cray XMT, opens new vistas for research to support health, defense, and national security. Computer technology now enables the analysis of graph structures containing more than 4 billion vertices joined by 34 billion edges, along with metrics and massively parallel algorithms that exhibit near-linear scalability according to number of processors. The challenge lies in making this massive data and analysis comprehensible to an analyst and end-users that require actionable knowledge to carry out their duties. Simply stated, we have developed language and content agnostic techniques to reduce large graphs built from vast media corpora into forms people can understand. Specifically, our tools and metrics act as a survey tool to identify 'thought leaders' -- those members that lead or reflect the thoughts and opinions of an online community, independent of the source language.

  8. Exciting Quantized Vortex Rings in a Superfluid Unitary Fermi Gas

    NASA Astrophysics Data System (ADS)

    Bulgac, Aurel

    2014-03-01

    In a recent article, Yefsah et al., Nature 499, 426 (2013) report the observation of an unusual quantum excitation mode in an elongated harmonically trapped unitary Fermi gas. After phase imprinting a domain wall, they observe collective oscillations of the superfluid atomic cloud with a period almost an order of magnitude larger than that predicted by any theory of domain walls, which they interpret as a possible new quantum phenomenon dubbed ``a heavy soliton'' with an inertial mass some 50 times larger than one expected for a domain wall. We present compelling evidence that this ``heavy soliton'' is instead a quantized vortex ring by showing that the main aspects of the experiment can be naturally explained within an extension of the time-dependent density functional theory (TDDFT) to superfluid systems. The numerical simulations required the solution of some 260,000 coupled nonlinear time-dependent 3-dimensional partial differential equations and were implemented on 2048 GPUs on the Cray XK7 supercomputer Titan of the Oak Ridge Leadership Computing Facility.

  9. Efficient Parallelization of a Dynamic Unstructured Application on the Tera MTA

    NASA Technical Reports Server (NTRS)

    Oliker, Leonid; Biswas, Rupak

    1999-01-01

    The success of parallel computing in solving real-life computationally-intensive problems relies on their efficient mapping and execution on large-scale multiprocessor architectures. Many important applications are both unstructured and dynamic in nature, making their efficient parallel implementation a daunting task. This paper presents the parallelization of a dynamic unstructured mesh adaptation algorithm using three popular programming paradigms on three leading supercomputers. We examine an MPI message-passing implementation on the Cray T3E and the SGI Origin2000, a shared-memory implementation using cache coherent nonuniform memory access (CC-NUMA) of the Origin2000, and a multi-threaded version on the newly-released Tera Multi-threaded Architecture (MTA). We compare several critical factors of this parallel code development, including runtime, scalability, programmability, and memory overhead. Our overall results demonstrate that multi-threaded systems offer tremendous potential for quickly and efficiently solving some of the most challenging real-life problems on parallel computers.

  10. A portable platform for accelerated PIC codes and its application to GPUs using OpenACC

    NASA Astrophysics Data System (ADS)

    Hariri, F.; Tran, T. M.; Jocksch, A.; Lanti, E.; Progsch, J.; Messmer, P.; Brunner, S.; Gheller, C.; Villard, L.

    2016-10-01

    We present a portable platform, called PIC_ENGINE, for accelerating Particle-In-Cell (PIC) codes on heterogeneous many-core architectures such as Graphic Processing Units (GPUs). The aim of this development is efficient simulations on future exascale systems by allowing different parallelization strategies depending on the application problem and the specific architecture. To this end, this platform contains the basic steps of the PIC algorithm and has been designed as a test bed for different algorithmic options and data structures. Among the architectures that this engine can explore, particular attention is given here to systems equipped with GPUs. The study demonstrates that our portable PIC implementation based on the OpenACC programming model can achieve performance closely matching theoretical predictions. Using the Cray XC30 system, Piz Daint, at the Swiss National Supercomputing Centre (CSCS), we show that PIC_ENGINE running on an NVIDIA Kepler K20X GPU can outperform the one on an Intel Sandy Bridge 8-core CPU by a factor of 3.4.
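
    A hedged sketch of the OpenACC programming style the abstract refers to, not the PIC_ENGINE code itself: a particle "push" loop offloaded with acc parallel loop inside a data region. Particle counts, field values, and time steps are invented, and the field is taken as uniform for simplicity.

    ```c
    #include <stdio.h>
    #include <stdlib.h>

    /* Minimal OpenACC sketch of the push step of a PIC loop: advance particle
     * positions and velocities in a fixed electric field on the accelerator.
     * Offload compiler flags (e.g. -acc) are compiler-dependent and omitted. */
    int main(void)
    {
        const int np = 1 << 20;            /* number of particles (illustrative) */
        const double dt = 1.0e-3, qm = 1.0, E = 0.1;
        double *x = malloc(np * sizeof *x);
        double *v = malloc(np * sizeof *v);
        for (int p = 0; p < np; ++p) { x[p] = (double)p / np; v[p] = 0.0; }

        #pragma acc data copy(x[0:np], v[0:np])
        {
            for (int step = 0; step < 100; ++step) {
                #pragma acc parallel loop
                for (int p = 0; p < np; ++p) {
                    v[p] += qm * E * dt;   /* acceleration by the field */
                    x[p] += v[p] * dt;     /* drift                     */
                }
            }
        }
        printf("x[0]=%g v[0]=%g\n", x[0], v[0]);
        free(x); free(v);
        return 0;
    }
    ```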

  11. Parallelization of the FLAPW method and comparison with the PPW method

    NASA Astrophysics Data System (ADS)

    Canning, Andrew; Mannstadt, Wolfgang; Freeman, Arthur

    2000-03-01

    The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining electronic and magnetic properties of crystals and surfaces. In the past the FLAPW method has been limited to systems of about a hundred atoms due to the lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell running on up to 512 processors on a Cray T3E parallel supercomputer. Some results will also be presented on a comparison of the plane-wave pseudopotential method and the FLAPW method on large systems.

  12. Query optimization for graph analytics on linked data using SPARQL

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hong, Seokyong; Lee, Sangkeun; Lim, Seung -Hwan

    2015-07-01

    Triplestores that support query languages such as SPARQL are emerging as the preferred and scalable solution to represent data and meta-data as massive heterogeneous graphs using Semantic Web standards. With increasing adoption, the desire to conduct graph-theoretic mining and exploratory analysis has also increased. Addressing that desire, this paper presents a solution that is the marriage of Graph Theory and the Semantic Web. We present software that can analyze Linked Data using graph operations such as counting triangles, finding eccentricity, testing connectedness, and computing PageRank directly on triple stores via the SPARQL interface. We describe the process of optimizing performance of the SPARQL-based implementation of such popular graph algorithms by reducing the space-overhead, simplifying iterative complexity and removing redundant computations by understanding query plans. Our optimized approach shows significant performance gains on triplestores hosted on stand-alone workstations as well as hardware-optimized scalable supercomputers such as the Cray XMT.

  13. Parallelized reliability estimation of reconfigurable computer networks

    NASA Technical Reports Server (NTRS)

    Nicol, David M.; Das, Subhendu; Palumbo, Dan

    1990-01-01

    A parallelized system, ASSURE, for computing the reliability of embedded avionics flight control systems which are able to reconfigure themselves in the event of failure is described. ASSURE accepts a grammar that describes a reliability semi-Markov state-space. From this it creates a parallel program that simultaneously generates and analyzes the state-space, placing upper and lower bounds on the probability of system failure. ASSURE is implemented on a 32-node Intel iPSC/860, and has achieved high processor efficiencies on real problems. Through a combination of improved algorithms, exploitation of parallelism, and use of an advanced microprocessor architecture, ASSURE has reduced the execution time on substantial problems by a factor of one thousand over previous workstation implementations. Furthermore, ASSURE's parallel execution rate on the iPSC/860 is an order of magnitude faster than its serial execution rate on a Cray-2 supercomputer. While dynamic load balancing is necessary for ASSURE's good performance, it is needed only infrequently; the particular method of load balancing used does not substantially affect performance.

  14. The design and implementation of a parallel unstructured Euler solver using software primitives

    NASA Technical Reports Server (NTRS)

    Das, R.; Mavriplis, D. J.; Saltz, J.; Gupta, S.; Ponnusamy, R.

    1992-01-01

    This paper is concerned with the implementation of a three-dimensional unstructured grid Euler solver on massively parallel distributed-memory computer architectures. The goal is to minimize solution time by achieving high computational rates with a numerically efficient algorithm. An unstructured multigrid algorithm with an edge-based data structure has been adopted, and a number of optimizations have been devised and implemented in order to accelerate the parallel communication rates. The implementation is carried out by creating a set of software tools, which provide an interface between the parallelization issues and the sequential code, while providing a basis for future automatic run-time compilation support. Large practical unstructured grid problems are solved on the Intel iPSC/860 hypercube and Intel Touchstone Delta machine. The quantitative effects of the various optimizations are demonstrated, and we show that the combined effect of these optimizations leads to roughly a factor of three performance improvement. The overall solution efficiency is compared with that obtained on the CRAY-YMP vector supercomputer.
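
    The edge-based data structure mentioned above can be shown with a small hedged sketch (not the paper's solver): each edge is visited once and its face flux is scattered with opposite signs to its two endpoint nodes. The flux itself is a toy placeholder rather than an actual Euler flux, and the tiny mesh is invented.

    ```c
    #include <stdio.h>

    /* Hedged sketch of edge-based residual accumulation on a toy 1-D chain. */
    #define NNODES 5
    #define NEDGES 4

    int main(void)
    {
        int edge[NEDGES][2] = { {0,1}, {1,2}, {2,3}, {3,4} };
        double u[NNODES]   = { 1.0, 2.0, 3.0, 4.0, 5.0 };
        double res[NNODES] = { 0.0 };

        for (int e = 0; e < NEDGES; ++e) {
            int i = edge[e][0], j = edge[e][1];
            double flux = 0.5 * (u[i] + u[j]);   /* toy face flux */
            res[i] += flux;                       /* scatter-add to both ends */
            res[j] -= flux;
        }
        for (int n = 0; n < NNODES; ++n)
            printf("res[%d] = %g\n", n, res[n]);
        return 0;
    }
    ```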

  15. Seismic signal processing on heterogeneous supercomputers

    NASA Astrophysics Data System (ADS)

    Gokhberg, Alexey; Ermert, Laura; Fichtner, Andreas

    2015-04-01

    The processing of seismic signals - including the correlation of massive ambient noise data sets - represents an important part of a wide range of seismological applications. It is characterized by large data volumes as well as high computational input/output intensity. Development of efficient approaches towards seismic signal processing on emerging high performance computing systems is therefore essential. Heterogeneous supercomputing systems introduced in the recent years provide numerous computing nodes interconnected via high throughput networks, every node containing a mix of processing elements of different architectures, like several sequential processor cores and one or a few graphical processing units (GPU) serving as accelerators. A typical representative of such computing systems is "Piz Daint", a supercomputer of the Cray XC 30 family operated by the Swiss National Supercomputing Center (CSCS), which we used in this research. Heterogeneous supercomputers provide an opportunity for manifold application performance increase and are more energy-efficient, however they have much higher hardware complexity and are therefore much more difficult to program. The programming effort may be substantially reduced by the introduction of modular libraries of software components that can be reused for a wide class of seismology applications. The ultimate goal of this research is design of a prototype for such library suitable for implementing various seismic signal processing applications on heterogeneous systems. As a representative use case we have chosen an ambient noise correlation application. Ambient noise interferometry has developed into one of the most powerful tools to image and monitor the Earth's interior. Future applications will require the extraction of increasingly small details from noise recordings. To meet this demand, more advanced correlation techniques combined with very large data volumes are needed. This poses new computational problems that require dedicated HPC solutions. The chosen application is using a wide range of common signal processing methods, which include various IIR filter designs, amplitude and phase correlation, computing the analytic signal, and discrete Fourier transforms. Furthermore, various processing methods specific for seismology, like rotation of seismic traces, are used. Efficient implementation of all these methods on the GPU-accelerated systems represents several challenges. In particular, it requires a careful distribution of work between the sequential processors and accelerators. Furthermore, since the application is designed to process very large volumes of data, special attention had to be paid to the efficient use of the available memory and networking hardware resources in order to reduce intensity of data input and output. In our contribution we will explain the software architecture as well as principal engineering decisions used to address these challenges. We will also describe the programming model based on C++ and CUDA that we used to develop the software. Finally, we will demonstrate performance improvements achieved by using the heterogeneous computing architecture. This work was supported by a grant from the Swiss National Supercomputing Centre (CSCS) under project ID d26.
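
    As a hedged illustration of the core correlation step (not the prototype library described above), the following sketch computes a circular cross-correlation of two traces via FFTW: forward transforms, a conjugate product in the frequency domain, and an inverse transform. Signal lengths and contents are invented; link with -lfftw3.

    ```c
    #include <stdio.h>
    #include <fftw3.h>

    int main(void)
    {
        const int n = 1024, nc = n / 2 + 1;
        double *a = fftw_malloc(sizeof(double) * n);
        double *b = fftw_malloc(sizeof(double) * n);
        double *xcorr = fftw_malloc(sizeof(double) * n);
        fftw_complex *fa = fftw_malloc(sizeof(fftw_complex) * nc);
        fftw_complex *fb = fftw_malloc(sizeof(fftw_complex) * nc);

        for (int i = 0; i < n; ++i) {          /* toy signals: b is a shifted a */
            a[i] = (i == 100) ? 1.0 : 0.0;
            b[i] = (i == 130) ? 1.0 : 0.0;
        }

        fftw_plan pa = fftw_plan_dft_r2c_1d(n, a, fa, FFTW_ESTIMATE);
        fftw_plan pb = fftw_plan_dft_r2c_1d(n, b, fb, FFTW_ESTIMATE);
        fftw_execute(pa);
        fftw_execute(pb);

        /* Correlation spectrum: A(f) * conj(B(f)), folding in the 1/n scaling. */
        for (int k = 0; k < nc; ++k) {
            double re = fa[k][0] * fb[k][0] + fa[k][1] * fb[k][1];
            double im = fa[k][1] * fb[k][0] - fa[k][0] * fb[k][1];
            fa[k][0] = re / n;
            fa[k][1] = im / n;
        }

        fftw_plan pinv = fftw_plan_dft_c2r_1d(n, fa, xcorr, FFTW_ESTIMATE);
        fftw_execute(pinv);

        int argmax = 0;
        for (int i = 1; i < n; ++i)
            if (xcorr[i] > xcorr[argmax]) argmax = i;
        printf("peak lag = %d samples (expect -30 mod %d = %d)\n", argmax, n, n - 30);

        fftw_destroy_plan(pa); fftw_destroy_plan(pb); fftw_destroy_plan(pinv);
        fftw_free(a); fftw_free(b); fftw_free(xcorr); fftw_free(fa); fftw_free(fb);
        return 0;
    }
    ```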

  16. Proceedings of the Scientific Conference on Obscuration and Aerosol Research Held in Aberdeen Maryland on 27-30 June 1989

    DTIC Science & Technology

    1990-08-01

    corneal structure for both normal and swollen corneas. Other problems of future interest are the understanding of the structure of scarred and dystrophied... METHOD AND RESULTS: The system of equations is solved numerically on a Cray X-MP by a finite element method with 9-node Lagrange quadrilaterals (Becker... Appl. Math., 42, 430. Becker, E. B., G. F. Carey, and J. T. Oden, 1981. Finite Elements: An Introduction (Vol. 1), Prentice-Hall, Englewood Cliffs, New

  17. Cray Research, Inc., CRAY 1-S, CRAY FORTRAN Translator (CFT) Version 1.11 Bugfix 1. Validation summary report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    1983-09-09

    This Validation Summary Report (VSR) for the Cray Research, Inc., CRAY FORTRAN Translator (CFT) Version 1.11 Bugfix 1 running under the CRAY Operating System (COS) Version 1.12 provides a consolidated summary of the results obtained from the validation of the subject compiler against the 1978 FORTRAN Standard (X3.9-1978/FIPS PUB 69). The compiler was validated against the Full Level FORTRAN level of FIPS PUB 69. The VSR is made up of several sections showing all the discrepancies found, if any. These include an overview of the validation which lists all categories of discrepancies together with the tests which failed.

  18. A review on advances in seismology in Mexico after 30 years from the 1985 earthquake

    NASA Astrophysics Data System (ADS)

    Castro, Raúl R.; Pérez-Campos, Xyoli; Zúñiga, Ramón; Ramírez-Guzmán, Leonardo; Aguirre, Jorge; Husker, Allen; Cuéllar, Armando; Sánchez, Tomás

    2016-10-01

    The 19 September 1985 (Mw8.1) earthquake, located on the Michoacán coast, Mexico, generated great damage in Mexico City, more than 300 km away from the epicentral area. Other important cities near the coast and in central Mexico also suffered severe damage. Thirty years after this important event, the National Autonomous University of Mexico (UNAM), the Centro de Investigación Científica y de Educación Superior de Ensenada, Baja California (CICESE) and other institutions organized a conference to discuss the scientific advances, particularly in seismology, that had taken place in Mexico since then.

  19. PREFACE: XXXVI Symposium on Nuclear Physics (Cocoyoc 2013)

    NASA Astrophysics Data System (ADS)

    Barrón-Palos, Libertad; Morales-Agiss, Irving; Martínez-Quiroz, Enrique

    2014-03-01

    The XXXVI Symposium on Nuclear Physics, organized by the Division of Nuclear Physics of the Mexican Physical Society, took place from 7-10 January, 2013. As is customary, the Symposium was held at the Hotel Hacienda Cocoyoc, in the state of Morelos, Mexico. This international venue with many years of tradition was attended by outstanding physicists, some of them already regulars to this meeting and others who joined us for the first time; a total of 45 attendees from different countries (Argentina, Brazil, Canada, China, Germany, Italy, Japan, Mexico and the United States). A variety of topics related to nuclear physics (nuclear reactions, radioactive beams, nuclear structure, fundamental neutron physics, sub-nuclear physics and nuclear astrophysics, among others) were presented in 26 invited talks and 10 contributed posters. Local Organizing Committee: Libertad Barrón-Palos (IF-UNAM), Enrique Martínez-Quiroz (ININ), Irving Morales-Agiss (ICN-UNAM). International Advisory Committee: Osvaldo Civitarese (UNLP, Argentina), Jerry P Draayer (LSU, USA), Alfredo Galindo-Uribarri (ORNL, USA), Paulo Gomes (UFF, Brazil), Piet Van Isacker (GANIL, France), James J Kolata (UND, USA), Reiner Krücken (TRIUMF, Canada), Jorge López (UTEP, USA), Stuart Pittel (UD, USA), W Michael Snow (IU, USA), Adam Szczepaniak (IU, USA), Michael Wiescher (UND, USA). A list of participants is available in the PDF.

  20. PASCOS 2012 - 18th International Symposium on Particles Strings and Cosmology

    NASA Astrophysics Data System (ADS)

    2014-03-01

    The XVIII International Conference on Strings, Particles and Cosmology, PASCOS 2012, was held in the City of Mérida, Mexico, from June 3-8, 2012. The conference series is aimed at exploring the interface and interplay between particle physics, string theory and cosmology. With the advent of new data, the emphasis of the XVIII edition of PASCOS was on phenomenology and the interpretation of recent observational and experimental results. The conference followed the format of previous conferences in this series, with plenary reviews and contributed presentations in parallel sessions. The lectures covered a wide range of subjects which included: Dark matter and dark energy, flavor physics and CP violation, neutrino physics, supersymmetry, Higgs physics, baryogenesis and EDMs, supergravity, high energy cosmic rays, string and F-theory GUTs, and string phenomenology. This is the first time that PASCOS was held in Latin America. The aim to do it in Mexico was to engage the Latin American community and thus to bring the conference to a wider and different audience, a goal which was thoroughly achieved. The venue was held at the Hotel Fiesta Americana in the beautiful city of Mérida. The social events included a reception with typical local food at the Katun restaurant, conference dinner at the historical Quinta Montes Molina, and an excursion to the archeological site of Dzibilchaltún including a swim at the famous cenote. PASCOS 2012 was possible thanks to the generous support of the following sponsors: CONACyT (Consejo Nacional de Ciencia y Tecnología), UNAM (Universidad Nacional Autónoma de México: Consejo Técnico de la Investigación Científica, Instituto de Ciencias Nucleares, Instituto de Física), Cinvestav (Centro de Estudios Avanzados del IPN: U. Zacatenco, U. Mérida and Secretaría General), ICyTDF (Instituto Científico y Tecnológico del D.F.), PIFI (Programa Integral de Fortalecimiento Institucional, Universidad de Guanajuato, Campus León), SMF (Sociedad Mexicana de Física), ICTP (International Centre for Theoretical Physics), BUAP (Benemérita Universidad Autónoma de Puebla), the Government of the State of Yucatán, the University of Hamburg, and Telmex. We also want to acknowledge the invaluable help of the staff of the Mexican Physical Society, in particular Lic. Santos Zúñiga Sánchez and Ms. Claudia Velasco Marín, and of the conference secretaries, Ms. Lizette Ramírez Bermúdez (UNAM) and Ms. Mariana del Castillo Sánchez (Cinvestav), for their support before, during and after the organization of PASCOS 2012. Last but not least, we would like to thank all the PASCOS 2012 participants for their attendance and for contributing to make the conference an engaging and stimulating event. The organizers, Myriam Mondragón, Adnan Bashir, David Delepine, Francisco Larios, Oscar Loaiza, Axel de la Macorra, Lukas Nellen, Sarira Sahu, Humberto Salazar and Liliana Velasco-Sevilla.

  1. The Virtual Earth-Solar Observatory of the SCiESMEX

    NASA Astrophysics Data System (ADS)

    De la Luz, V.; Gonzalez-Esparza, A.; Cifuentes-Nava, G.

    2015-12-01

    The Mexican Space Weather Service (SCiESMEX, http://www.sciesmex.unam.mx) started operations in October 2014. The project includes the Virtual Earth-Solar Observatory (VESO, http://www.veso.unam.mx). The VESO is an improved project whose objective is to integrate the space weather instrumentation network of the National Autonomous University of Mexico (UNAM). The network includes the Mexican Array Radiotelescope (MEXART), the Callisto receiver (MEXART), a Neutron Telescope, a Cosmic Ray Telescope, the Schumann Antenna, the National Magnetic Service, and the Mexican GPS network (TlalocNet). The VESO facility is located at the Geophysics Institute campus in Michoacán (UNAM). We offer data storage, real-time data, and quasi-real-time data services. The VESO hardware includes a High Performance Computing (HPC) system dedicated especially to big data storage.

  2. A Secure Alert System

    DTIC Science & Technology

    2006-12-01

    script?size=0'+union+select+'','','',concat(uname||'- '|| passwd )+as+i_name+''+''+from+usertable+where+uname+like+' f. Mitigation Techniques

  3. Development of a CRAY 1 version of the SINDA program. [thermo-structural analyzer program

    NASA Technical Reports Server (NTRS)

    Juba, S. M.; Fogerson, P. E.

    1982-01-01

    The SINDA thermal analyzer program was transferred from the UNIVAC 1110 computer to a CYBER and then to a CRAY 1. Significant changes to the code of the program were required in order to execute efficiently on the CYBER and CRAY. The program was tested on the CRAY using a thermal math model of the shuttle which was too large to run on either the UNIVAC or the CYBER. An effort was then begun to further modify the code of SINDA in order to make effective use of the vector capabilities of the CRAY.

  4. U.S. Department of Energy awards $200 million for next-generation

    Science.gov Websites

    build a powerful new next-gen supercomputer, Aurora. The press conference took place at Chicago tech Congress and industry at the announcement of DOE's award of $200 million to Argonne in order to build a accelerate to next-generation exascale computing. DOE earlier announced a $325 million investment to build

  5. Highly parallel implementation of non-adiabatic Ehrenfest molecular dynamics

    NASA Astrophysics Data System (ADS)

    Kanai, Yosuke; Schleife, Andre; Draeger, Erik; Anisimov, Victor; Correa, Alfredo

    2014-03-01

    While the adiabatic Born-Oppenheimer approximation tremendously lowers computational effort, many questions in modern physics, chemistry, and materials science require an explicit description of coupled non-adiabatic electron-ion dynamics. Electronic stopping, i.e. the energy transfer of a fast projectile atom to the electronic system of the target material, is a notorious example. We recently implemented real-time time-dependent density functional theory based on the plane-wave pseudopotential formalism in the Qbox/qb@ll codes. We demonstrate that explicit integration using a fourth-order Runge-Kutta scheme is very suitable for modern highly parallelized supercomputers. Applying the new implementation to systems with hundreds of atoms and thousands of electrons, we achieved excellent performance and scalability on a large number of nodes both on the BlueGene based ``Sequoia'' system at LLNL as well as the Cray architecture of ``Blue Waters'' at NCSA. As an example, we discuss our work on computing the electronic stopping power of aluminum and gold for hydrogen projectiles, showing an excellent agreement with experiment. These first-principles calculations allow us to gain important insight into the fundamental physics of electronic stopping.
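
    A hedged sketch of the explicit fourth-order Runge-Kutta propagation mentioned above, reduced to a 2x2 Hermitian Hamiltonian rather than a plane-wave TDDFT Hamiltonian: it integrates i dpsi/dt = H psi and checks that the norm is approximately conserved. All matrix elements, step sizes, and step counts are invented.

    ```c
    #include <stdio.h>
    #include <complex.h>

    #define DIM 2

    /* Apply the toy Hamiltonian: out = H * psi. */
    static void apply_h(double complex h[DIM][DIM],
                        const double complex psi[DIM], double complex out[DIM])
    {
        for (int i = 0; i < DIM; ++i) {
            out[i] = 0.0;
            for (int j = 0; j < DIM; ++j)
                out[i] += h[i][j] * psi[j];
        }
    }

    int main(void)
    {
        const double dt = 1.0e-3;
        double complex h[DIM][DIM] = { { 1.0, 0.2 }, { 0.2, -1.0 } };  /* Hermitian */
        double complex psi[DIM] = { 1.0, 0.0 };
        double complex k1[DIM], k2[DIM], k3[DIM], k4[DIM], tmp[DIM];

        for (int step = 0; step < 10000; ++step) {
            apply_h(h, psi, tmp);
            for (int i = 0; i < DIM; ++i) k1[i] = -I * tmp[i];

            for (int i = 0; i < DIM; ++i) tmp[i] = psi[i] + 0.5 * dt * k1[i];
            apply_h(h, tmp, k2);
            for (int i = 0; i < DIM; ++i) k2[i] = -I * k2[i];

            for (int i = 0; i < DIM; ++i) tmp[i] = psi[i] + 0.5 * dt * k2[i];
            apply_h(h, tmp, k3);
            for (int i = 0; i < DIM; ++i) k3[i] = -I * k3[i];

            for (int i = 0; i < DIM; ++i) tmp[i] = psi[i] + dt * k3[i];
            apply_h(h, tmp, k4);
            for (int i = 0; i < DIM; ++i) k4[i] = -I * k4[i];

            /* Classic RK4 combination of the four slopes. */
            for (int i = 0; i < DIM; ++i)
                psi[i] += dt / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]);
        }

        double norm = 0.0;
        for (int i = 0; i < DIM; ++i) norm += creal(psi[i] * conj(psi[i]));
        printf("norm after propagation = %.6f (should stay close to 1)\n", norm);
        return 0;
    }
    ```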

  6. Venus Gravity Handbook

    NASA Technical Reports Server (NTRS)

    Konopliv, Alexander S.; Sjogren, William L.

    1996-01-01

    This report documents the Venus gravity methods and results to date (model MGNP90LSAAP). It is called a handbook in that it contains many useful plots (such as geometry and orbit behavior) that are useful in evaluating the tracking data. We discuss the models that are used in processing the Doppler data and the estimation method for determining the gravity field. With Pioneer Venus Orbiter and Magellan tracking data, the Venus gravity field was determined complete to degree and order 90 with the use of the JPL Cray T3D Supercomputer. The gravity field shows unprecedented high correlation with topography and resolves features down to 200 km. In the procedure for solving the gravity field, other information is gained as well, and, for example, we discuss results for the Venus ephemeris, Love number, pole orientation of Venus, and atmospheric densities. Of significance is the Love number solution which indicates a liquid core for Venus. The ephemeris of Venus is determined to an accuracy of 0.02 mm/s (tens of meters in position), and the rotation period to 243.0194 +/- 0.0002 days.

  7. Data Transfer Study HPSS Archiving

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wynne, James; Parete-Koon, Suzanne T; Mitchell, Quinn

    2015-01-01

    The movement of the large amounts of data produced by codes run in a High Performance Computing (HPC) environment can be a bottleneck for project workflows. To balance filesystem capacity and performance requirements, HPC centers enforce data management policies to purge old files to make room for new computation and analysis results. Users at Oak Ridge Leadership Computing Facility (OLCF) and many other HPC user facilities must archive data to avoid data loss during purges, therefore the time associated with data movement for archiving is something that all users must consider. This study observed the difference in transfer speed from the originating location on the Lustre filesystem to the more permanent High Performance Storage System (HPSS). The tests were done with a number of different transfer methods for files that spanned a variety of sizes and compositions that reflect OLCF user data. This data will be used to help users of Titan and other Cray supercomputers plan their workflow and data transfers so that they are most efficient for their project. We will also discuss best practice for maintaining data at shared user facilities.

  8. Accelerating Science with the NERSC Burst Buffer Early User Program

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bhimji, Wahid; Bard, Debbie; Romanus, Melissa

    NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has a diverse user base comprised of over 6500 users in 700 different projects spanning a wide variety of scientific computing applications. The use-cases of the Burst Buffer at NERSC are therefore also considerable and diverse. We describe here performance measurements and lessons learned from the Burst Buffer Early User Program at NERSC, which selected a number of research projects to gain early access to the Burst Buffer and exercise its capability to enable new scientific advancements. To the best of our knowledge this is the first time a Burst Buffer has been stressed at scale by diverse, real user workloads and therefore these lessons will be of considerable benefit to shaping the developing use of Burst Buffers at HPC centers.

  9. Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

    NASA Astrophysics Data System (ADS)

    Shao, Meiyue; Aktulga, H. Metin; Yang, Chao; Ng, Esmond G.; Maris, Pieter; Vary, James P.

    2018-01-01

    We describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of special structure of the nuclear configuration interaction problem which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation, and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. We also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos based algorithm for problems of moderate sizes on a Cray XC30 system.

  10. A leap forward with UTK's Cray XC30

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fahey, Mark R

    2014-01-01

    This paper shows a significant productivity leap for several science groups and the accomplishments they have made to date on Darter, a Cray XC30 at the University of Tennessee, Knoxville. The increased productivity is due to faster processors and a faster interconnect combined in a new generation from Cray, yet the system retains a programming environment very similar to that of previous generations of Cray machines, which makes porting easy.

  11. The " Swarm of Ants vs. Herd of Elephants" Debated Revisited: Performance Measurements of PVM-Overflow Across a Wide Spectrum of Architectures

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Jespersen, Dennis; Buning, Peter; Bailey, David (Technical Monitor)

    1996-01-01

    The Gordon Bell Prizes given out at Supercomputing every year include at least two categories: performance (highest GFLOP count) and price-performance (GFLOPs per million dollars) for real applications. In the past five years, the winners of the price-performance category have all come from networks of workstations. This reflects three important facts: 1. supercomputers are still too expensive for the masses; 2. achieving high performance for real applications takes real work; and, most importantly, 3. it is possible to obtain acceptable performance for certain real applications on networks of workstations. With the continued advance of network technology as well as the increased performance of "desktop" workstations, the "Swarm of Ants vs. Herd of Elephants" debate, which began with vector multiprocessors (VPPs) against SIMD-type multiprocessors (e.g. the CM2), is now recast as VPPs against Symmetric Multiprocessors (SMPs, e.g. the SGI Power Challenge). This paper reports on performance studies we performed solving a large-scale (2-million grid point) CFD problem involving a Boeing 747, based on a parallel version of OVERFLOW that utilizes message passing with PVM. A performance monitoring tool developed under NASA HPCC, called AIMS, was used to instrument and analyze the performance data thus obtained. We plan to compare performance data obtained across a wide spectrum of architectures including the Cray C90, IBM/SP2, and SGI/Power Challenge Cluster, down to a group of workstations connected over a simple network. The metrics of comparison include speed-up, price-performance, throughput, and turnaround time. We also plan to present a plan of attack for the various issues that will make the execution of Grand Challenge Applications across the Global Information Infrastructure a reality.

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bland, Arthur S Buddy; Hack, James J; Baker, Ann E

    Oak Ridge National Laboratory's (ORNL's) Cray XT5 supercomputer, Jaguar, kicked off the era of petascale scientific computing in 2008 with applications that sustained more than a thousand trillion floating point calculations per second - or 1 petaflop. Jaguar continues to grow even more powerful as it helps researchers broaden the boundaries of knowledge in virtually every domain of computational science, including weather and climate, nuclear energy, geosciences, combustion, bioenergy, fusion, and materials science. Their insights promise to broaden our knowledge in areas that are vitally important to the Department of Energy (DOE) and the nation as a whole, particularly energy assurance and climate change. The science of the 21st century, however, will demand further revolutions in computing, supercomputers capable of a million trillion calculations a second - 1 exaflop - and beyond. These systems will allow investigators to continue attacking global challenges through modeling and simulation and to unravel longstanding scientific questions. Creating such systems will also require new approaches to daunting challenges. High-performance systems of the future will need to be codesigned for scientific and engineering applications with best-in-class communications networks and data-management infrastructures and teams of skilled researchers able to take full advantage of these new resources. The Oak Ridge Leadership Computing Facility (OLCF) provides the nation's most powerful open resource for capability computing, with a sustainable path that will maintain and extend national leadership for DOE's Office of Science (SC). The OLCF has engaged a world-class team to support petascale science and to take a dramatic step forward, fielding new capabilities for high-end science. This report highlights the successful delivery and operation of a petascale system and shows how the OLCF fosters application development teams, developing cutting-edge tools and resources for next-generation systems.

  13. Space Radar Image of Mammoth Mountain, California

    NASA Image and Video Library

    1999-05-01

    This false-color composite radar image of the Mammoth Mountain area in the Sierra Nevada Mountains, California, was acquired by the Spaceborne Imaging Radar-C and X-band Synthetic Aperture Radar aboard the space shuttle Endeavour on its 67th orbit on October 3, 1994. The image is centered at 37.6 degrees north latitude and 119.0 degrees west longitude. The area is about 39 kilometers by 51 kilometers (24 miles by 31 miles). North is toward the bottom, about 45 degrees to the right. In this image, red was created using L-band (horizontally transmitted/vertically received) polarization data; green was created using C-band (horizontally transmitted/vertically received) polarization data; and blue was created using C-band (horizontally transmitted and received) polarization data. Crawley Lake appears dark at the center left of the image, just above or south of Long Valley. The Mammoth Mountain ski area is visible at the top right of the scene. The red areas correspond to forests, the dark blue areas are bare surfaces and the green areas are short vegetation, mainly brush. The purple areas at the higher elevations in the upper part of the scene are discontinuous patches of snow cover from a September 28 storm. New, very thin snow was falling before and during the second space shuttle pass. In parallel with the operational SIR-C data processing, an experimental effort is being conducted to test SAR data processing using the Jet Propulsion Laboratory's massively parallel supercomputing facility, centered around the Cray Research T3D. These experiments will assess the abilities of large supercomputers to produce high throughput Synthetic Aperture Radar processing in preparation for upcoming data-intensive SAR missions. The image released here was produced as part of this experimental effort. http://photojournal.jpl.nasa.gov/catalog/PIA01746

  14. PLOT3D/AMES, UNIX SUPERCOMPUTER AND SGI IRIS VERSION (WITHOUT TURB3D)

    NASA Technical Reports Server (NTRS)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into five groups: 1) Grid Functions for grids, grid-checking, etc.; 2) Scalar Functions for contour or carpet plots of density, pressure, temperature, Mach number, vorticity magnitude, helicity, etc.; 3) Vector Functions for vector plots of velocity, vorticity, momentum, and density gradient, etc.; 4) Particle Trace Functions for rake-like plots of particle flow or vortex lines; and 5) Shock locations based on pressure gradient. TURB3D is a modification of PLOT3D which is used for viewing CFD simulations of incompressible turbulent flow. Input flow data consists of pressure, velocity and vorticity. Typical quantities to plot include local fluctuations in flow quantities and turbulent production terms, plotted in physical or wall units. PLOT3D/TURB3D includes both TURB3D and PLOT3D because the operation of TURB3D is identical to PLOT3D, and there is no additional sample data or printed documentation for TURB3D. Graphical capabilities of PLOT3D version 3.6b+ vary among the implementations available through COSMIC. Customers are encouraged to purchase and carefully review the PLOT3D manual before ordering the program for a specific computer and graphics library. There is only one manual for use with all implementations of PLOT3D, and although this manual generally assumes that the Silicon Graphics Iris implementation is being used, informative comments concerning other implementations appear throughout the text. 
With all implementations, the visual representation of the object and flow field created by PLOT3D consists of points, lines, and polygons. Points can be represented with dots or symbols, color can be used to denote data values, and perspective is used to show depth. Differences among implementations impact the program's ability to use graphical features that are based on 3D polygons, the user's ability to manipulate the graphical displays, and the user's ability to obtain alternate forms of output. In addition to providing the advantages of performing complex calculations on a supercomputer, the Supercomputer/IRIS implementation of PLOT3D offers advanced 3-D, view manipulation, and animation capabilities. Shading and hidden line/surface removal can be used to enhance depth perception and other aspects of the graphical displays. A mouse can be used to translate, rotate, or zoom in on views. Files for several types of output can be produced. Two animation options are available. Simple animation sequences can be created on the IRIS, or, if an appropriately modified version of ARCGRAPH (ARC-12350) is accessible on the supercomputer, files can be created for use in GAS (Graphics Animation System, ARC-12379), an IRIS program which offers more complex rendering and animation capabilities and options for recording images to digital disk, video tape, or 16-mm film. The version 3.6b+ Supercomputer/IRIS implementations of PLOT3D (ARC-12779) and PLOT3D/TURB3D (ARC-12784) are suitable for use on CRAY 2/UNICOS, CONVEX, and ALLIANT computers with a remote Silicon Graphics IRIS 2xxx/3xxx or IRIS 4D workstation. These programs are distributed on .25 inch magnetic tape cartridges in IRIS TAR format. Customers purchasing one implementation version of PLOT3D or PLOT3D/TURB3D will be given a $200 discount on each additional implementation version ordered at the same time. Version 3.6b+ of PLOT3D and PLOT3D/TURB3D are also supported for the following computers and graphics libraries: (1) Silicon Graphics IRIS 2xxx/3xxx or IRIS 4D workstations (ARC-12783, ARC-12782); (2) VAX computers running VMS Version 5.0 and DISSPLA Version 11.0 (ARC-12777, ARC-12781); (3) generic UNIX and DISSPLA Version 11.0 (ARC-12788, ARC-12778); and (4) Apollo computers running UNIX and GMR3D Version 2.0 (ARC-12789, ARC-12785 - which have no capabilities to put text on plots). Silicon Graphics Iris, IRIS 4D, and IRIS 2xxx/3xxx are trademarks of Silicon Graphics Incorporated. VAX and VMS are trademarks of Digital Equipment Corporation. DISSPLA is a trademark of Computer Associates. CRAY 2 and UNICOS are trademarks of CRAY Research, Incorporated. CONVEX is a trademark of Convex Computer Corporation. Alliant is a trademark of Alliant. Apollo, DN10000, and GMR3D are trademarks of Hewlett-Packard, Incorporated. System V is a trademark of Bell Labs, Incorporated. BSD4.3 is a trademark of the University of California at Berkeley. UNIX is a registered trademark of AT&T.
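
    As an aside on the data model described above (and not code from PLOT3D itself), the following sketch shows how a scalar function such as pressure or Mach number can be derived from the conserved variables a PLOT3D solution file stores at each grid point: density, x/y/z momentum, and stagnation energy. It assumes a perfect gas and ignores the nondimensionalization typically used in real q files.

      # Hedged sketch: derived flow quantities from PLOT3D-style conserved variables.
      import numpy as np

      def pressure_and_mach(rho, rho_u, rho_v, rho_w, e0, gamma=1.4):
          u, v, w = rho_u / rho, rho_v / rho, rho_w / rho
          kinetic = 0.5 * rho * (u**2 + v**2 + w**2)
          p = (gamma - 1.0) * (e0 - kinetic)        # static pressure
          a = np.sqrt(gamma * p / rho)              # speed of sound
          mach = np.sqrt(u**2 + v**2 + w**2) / a
          return p, mach

      # Toy single-point example
      print(pressure_and_mach(rho=1.0, rho_u=0.8, rho_v=0.0, rho_w=0.0, e0=2.5))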

  15. PLOT3D/AMES, UNIX SUPERCOMPUTER AND SGI IRIS VERSION (WITH TURB3D)

    NASA Technical Reports Server (NTRS)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into five groups: 1) Grid Functions for grids, grid-checking, etc.; 2) Scalar Functions for contour or carpet plots of density, pressure, temperature, Mach number, vorticity magnitude, helicity, etc.; 3) Vector Functions for vector plots of velocity, vorticity, momentum, and density gradient, etc.; 4) Particle Trace Functions for rake-like plots of particle flow or vortex lines; and 5) Shock locations based on pressure gradient. TURB3D is a modification of PLOT3D which is used for viewing CFD simulations of incompressible turbulent flow. Input flow data consists of pressure, velocity and vorticity. Typical quantities to plot include local fluctuations in flow quantities and turbulent production terms, plotted in physical or wall units. PLOT3D/TURB3D includes both TURB3D and PLOT3D because the operation of TURB3D is identical to PLOT3D, and there is no additional sample data or printed documentation for TURB3D. Graphical capabilities of PLOT3D version 3.6b+ vary among the implementations available through COSMIC. Customers are encouraged to purchase and carefully review the PLOT3D manual before ordering the program for a specific computer and graphics library. There is only one manual for use with all implementations of PLOT3D, and although this manual generally assumes that the Silicon Graphics Iris implementation is being used, informative comments concerning other implementations appear throughout the text. 
With all implementations, the visual representation of the object and flow field created by PLOT3D consists of points, lines, and polygons. Points can be represented with dots or symbols, color can be used to denote data values, and perspective is used to show depth. Differences among implementations impact the program's ability to use graphical features that are based on 3D polygons, the user's ability to manipulate the graphical displays, and the user's ability to obtain alternate forms of output. In addition to providing the advantages of performing complex calculations on a supercomputer, the Supercomputer/IRIS implementation of PLOT3D offers advanced 3-D, view manipulation, and animation capabilities. Shading and hidden line/surface removal can be used to enhance depth perception and other aspects of the graphical displays. A mouse can be used to translate, rotate, or zoom in on views. Files for several types of output can be produced. Two animation options are available. Simple animation sequences can be created on the IRIS, or, if an appropriately modified version of ARCGRAPH (ARC-12350) is accessible on the supercomputer, files can be created for use in GAS (Graphics Animation System, ARC-12379), an IRIS program which offers more complex rendering and animation capabilities and options for recording images to digital disk, video tape, or 16-mm film. The version 3.6b+ Supercomputer/IRIS implementations of PLOT3D (ARC-12779) and PLOT3D/TURB3D (ARC-12784) are suitable for use on CRAY 2/UNICOS, CONVEX, and ALLIANT computers with a remote Silicon Graphics IRIS 2xxx/3xxx or IRIS 4D workstation. These programs are distributed on .25 inch magnetic tape cartridges in IRIS TAR format. Customers purchasing one implementation version of PLOT3D or PLOT3D/TURB3D will be given a $200 discount on each additional implementation version ordered at the same time. Version 3.6b+ of PLOT3D and PLOT3D/TURB3D are also supported for the following computers and graphics libraries: (1) Silicon Graphics IRIS 2xxx/3xxx or IRIS 4D workstations (ARC-12783, ARC-12782); (2) VAX computers running VMS Version 5.0 and DISSPLA Version 11.0 (ARC-12777, ARC-12781); (3) generic UNIX and DISSPLA Version 11.0 (ARC-12788, ARC-12778); and (4) Apollo computers running UNIX and GMR3D Version 2.0 (ARC-12789, ARC-12785 - which have no capabilities to put text on plots). Silicon Graphics Iris, IRIS 4D, and IRIS 2xxx/3xxx are trademarks of Silicon Graphics Incorporated. VAX and VMS are trademarks of Digital Equipment Corporation. DISSPLA is a trademark of Computer Associates. CRAY 2 and UNICOS are trademarks of CRAY Research, Incorporated. CONVEX is a trademark of Convex Computer Corporation. Alliant is a trademark of Alliant. Apollo, DN10000, and GMR3D are trademarks of Hewlett-Packard, Incorporated. System V is a trademark of Bell Labs, Incorporated. BSD4.3 is a trademark of the University of California at Berkeley. UNIX is a registered trademark of AT&T.

  16. Strategies for vectorizing the sparse matrix vector product on the CRAY XMP, CRAY 2, and CYBER 205

    NASA Technical Reports Server (NTRS)

    Bauschlicher, Charles W., Jr.; Partridge, Harry

    1987-01-01

    Large, randomly sparse matrix vector products are important in a number of applications in computational chemistry, such as matrix diagonalization and the solution of simultaneous equations. Vectorization of this process is considered for the CRAY XMP, CRAY 2, and CYBER 205, using a matrix of dimension 20,000 with 1 percent to 6 percent nonzeros. Efficient scatter/gather capabilities add coding flexibility and yield significant improvements in performance. For the CYBER 205, it is shown that minor changes in the IO can reduce the CPU time by a factor of 50. Similar changes in the CRAY codes make a far smaller improvement.
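
    For illustration only (this is not the 1987 code), a compressed-sparse-row matrix-vector product in Python makes the gather operation explicit; it is this indexed load of x values that the scatter/gather hardware discussed above accelerates.

      # Minimal CSR sparse matrix-vector product sketch with an explicit gather.
      import numpy as np

      def csr_matvec(vals, cols, rowptr, x):
          y = np.zeros(len(rowptr) - 1)
          for i in range(len(y)):
              lo, hi = rowptr[i], rowptr[i + 1]
              y[i] = np.dot(vals[lo:hi], x[cols[lo:hi]])   # gather + dot per row
          return y

      # 3x3 example: [[4,0,1],[0,3,0],[2,0,5]]
      vals   = np.array([4.0, 1.0, 3.0, 2.0, 5.0])
      cols   = np.array([0, 2, 1, 0, 2])
      rowptr = np.array([0, 2, 3, 5])
      print(csr_matvec(vals, cols, rowptr, np.array([1.0, 2.0, 3.0])))   # [7. 6. 17.]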

  17. Computers are stepping stones to improved imaging.

    PubMed

    Freiherr, G

    1991-02-01

    Never before has the radiology industry embraced the computer with such enthusiasm. Graphics supercomputers as well as UNIX- and RISC-based computing platforms are turning up in every digital imaging modality and especially in systems designed to enhance and transmit images, says author Greg Freiherr on assignment for Computers in Healthcare at the Radiological Society of North America conference in Chicago.

  18. List of Participants

    NASA Astrophysics Data System (ADS)

    2011-08-01

    Abigail Alvarez OlarteCINVESTAV Alba Leticia Carrillo MonteverdeDCI-UG Alberto CarramiñanaINAOE Aldo MorselliFERMI Alejandro CastillaDCI-UG Alejandro IbarraTechnical University of Munich Alma D Rojas PachecoFCFM-BUAP Alma Xochitl Gonzalez MoralesInstituto de Ciencias Nucleares, UNAM Andrew Walcott BeckwithAmerican Institute of Beam Energy Physics Ariadna Montiel ArenasDepartamento de Física, CINVESTAV Arnulfo ZepedaCinvestav Arturo Alvarez CruzInstituto de Fisica, UNAM Axel de la MacorraUNAM, IAC Azar MustafayevUniversity of Minnesota Benjamin JaramilloDCI-UG Vincent BertinCPPM-Marseille Carlos Alberto Vaquera-AraujoDCI-UG Carlos MuñozMadrid Autonoma U. & Madrid, IFT Carmine PagliaroneINFN, FNAL Carolina Lujan PeschardDCI-UG Christiane Frigerio MartinsUniversidade Federal do ABC-São Paulo Csaba BalazsMonash University David DelepineDCI-UG David G CerdenoUniversidad Autonoma de Madrid & Instituto de Fisica Teorica Debasish MajumdarSaha Institute of Nuclear Physics, Kolkata, India Dibyendu PanigrahiKandi Raj College, Kandi, Murshidabad, INDIA-742137 Dupret Alberto Santana BejaranoUniversidad de Sonora Departamento de Investigacion en Fisica Ernest MaRiverside U.C. Esteban Alejandro Reyes Pírez MontañezInstituto de Física, UNAM Federico Ortiz TrejoINSTITUTO DE ASTRONOMÍA - UNAM Francisco José de Anda NavarroUniversidad de Guadalajara González Alvarez Francisco JavierCINVESTAV-Depto. Física Gustavo Medina TancoICN-UNAM Hernando Efrain Caicedo OrtizInstituto Politecnico Nacional - IPN J D VergadosCERN & Ioannina U. James R BoyceJefferson Lab Jason SteffenFERMILAB Javier Montaño DomínguezDCI-UG Jeevan SolankiMandsaur Institue of Technology MP India Joe SatoSaitama University Jorge Luis Navarro EstradaUNAM-ICN and Universidad del Atlantico (B/quilla-Col.) Jose A R CembranosUniversity of Minnesota José DíazIFIC Jose Didino Garcia AguilarDepto. de Fisica. Cinvestav Keith OliveUniversity of Minnesota Konstantia BalasiUniversity of Ioannina, Greece Lilian Prado GonzalezBUAP, FCFM Lorenzo Díaz CruzBUAP Facultad de Ciencias Físico Matemáticas Luis Rey Díaz BarrónDivisión de Ciencias e Ingenierías Luis UrenaUniversidad de Guanajuato Magda LolaDept. of Physics, University of Patras, Greece Mahmoud WahbaEgyptian Center for Theoretical Physics, MTI Marcus S CohenNew Mexico State University Mario A Acero OrtegaICN - UNAM Mario E GomezUniversidad de Huelva Mark PipeUniversity of Sheffield Mauro NapsucialeDCI-UG Mirco CannoniUniversidad de Huelva Mónica Felipa Ramírez PalaciosUniversidad de Guadalajara Murli Manohar VermaLucknow university, India Nassim BozorgniaUCLA Octavio Obregón Octavio ValenzuelaIA-UNAM Oleg KamaevUniversity of Minnesota Osamu SetoHokkai-Gakuen University Pedro F González DíazIFF, CSIC, Serrano 121, 28006 Madrid, Spain Qaisar ShafiBartol Research Inst. and Delaware U. Raul Hennings-YeomansLos Alamos National Laboratory René Ángeles MartínezDepartamento de Fisica, del DCI de la Universidad de Guanajuato Reyna XoxocotziBUAP, FCFM Rishi Kumar TiwariGovt. 
Model Science College, Rewa (MP) INDIA Roberto A SussmanICN-UNAM Selim Gómez ÁvilaDCI-UG Sugai KenichiSaitama University Susana Valdez AlvaradoDCI-UG TVladimir - 2K CollaborationColorado State University Tonatiuh MatosCINVESTAV Valeriy DvoeglazovUniversidad de Zacatecas Vannia Gonzalez MaciasDCI-UG Vladimir Avila-ReeseInstituto de Astronomia, UNAM Wolfgang BietenholzINC, UNAM (Mexico) Yamanaka MasatoKyoto Sangyo University Yann MambriniLPT Orsay Yu-Feng ZhouInstitute of Theotretical Physics, Chinese Academy of Sciences, PR China Aaron HigueraDCI-UG Azarael Yebra PérezDCI-UG César Hernández AguayoDCI-UG Jaime Chagoya AlvarezDCI-UG Jonathan Rashid Rosique CampuzanoDCI-UG José Alfredo Soto ÁlvarezDCI-UG Juan Carlos De Haro SantosDCI-UG Luis Eduardo Medina MedranoDCI-UG Maria Fatima Rubio EspinozaDCI-UG Paulo Alberto Rodriguez HerreraDCI-UG Roberto Oziel Gutierrez CotaDCI-UG Rocha Moran Maria PaulinaDCI-UG Xareni Sanchez MonroyDCI-UG

  19. High Performance Programming Using Explicit Shared Memory Model on Cray T3D

    NASA Technical Reports Server (NTRS)

    Simon, Horst D.; Saini, Subhash; Grassi, Charles

    1994-01-01

    The Cray T3D system is the first-phase system in Cray Research, Inc.'s (CRI) three-phase massively parallel processing (MPP) program. This system features a heterogeneous architecture that closely couples DEC's Alpha microprocessors and CRI's parallel-vector technology, i.e., the Cray Y-MP and Cray C90. An overview of the Cray T3D hardware and available programming models is presented. Under the Cray Research Adaptive Fortran (CRAFT) model, four programming methods (data parallel, work sharing, message passing using PVM, and the explicit shared memory model) are available to users. However, at this time the data parallel and work sharing programming models are not available to the user community. The differences between standard PVM and CRI's PVM are highlighted with performance measurements such as latencies and communication bandwidths. We have found that neither standard PVM nor CRI's PVM exploits the hardware capabilities of the T3D. The reasons for the poor performance of PVM as a native message-passing library are presented. This is illustrated by the performance of the NAS Parallel Benchmarks (NPB) programmed in the explicit shared memory model on the Cray T3D. In general, the performance of standard PVM is about 4 to 5 times lower than that obtained using the explicit shared memory model. This degradation in performance is also seen on the CM-5, where the performance of applications using the native message-passing library CMMD is likewise about 4 to 5 times lower than with data parallel methods. The issues involved in programming in the explicit shared memory model (such as barriers, synchronization, and invalidating and aligning the data cache) are discussed. Comparative performance of the NPB using the explicit shared memory programming model on the Cray T3D and other highly parallel systems such as the TMC CM-5, Intel Paragon, Cray C90, and IBM SP1 is presented.
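
    As a rough modern analogue of the latency and bandwidth measurements discussed above (using MPI via mpi4py rather than PVM or the T3D's shared memory primitives), a simple ping-pong test between two processes might look like the sketch below; the file name and repetition counts are arbitrary.

      # Ping-pong latency/bandwidth sketch. Run with: mpiexec -n 2 python pingpong.py
      import time
      import numpy as np
      from mpi4py import MPI

      comm = MPI.COMM_WORLD
      rank = comm.Get_rank()
      reps = 1000

      for nbytes in (8, 1024, 1024 * 1024):
          buf = np.zeros(nbytes, dtype=np.uint8)
          comm.Barrier()
          t0 = time.perf_counter()
          for _ in range(reps):
              if rank == 0:
                  comm.Send(buf, dest=1, tag=0)
                  comm.Recv(buf, source=1, tag=1)
              elif rank == 1:
                  comm.Recv(buf, source=0, tag=0)
                  comm.Send(buf, dest=0, tag=1)
          dt = (time.perf_counter() - t0) / (2 * reps)   # one-way time per message
          if rank == 0:
              print(f"{nbytes:>8} B  latency ~{dt*1e6:.1f} us  "
                    f"bandwidth ~{nbytes/dt/1e6:.1f} MB/s")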

  20. Biochemical characterization of a phosphinate inhibitor of Escherichia coli MurC.

    PubMed

    Marmor, S; Petersen, C P; Reck, F; Yang, W; Gao, N; Fisher, S L

    2001-10-09

    The bacterial UDP-N-acetylmuramyl-L-alanine ligase (MurC) from Escherichia coli, an essential, cytoplasmic peptidoglycan biosynthetic enzyme, catalyzes the ATP-dependent ligation of L-alanine (Ala) and UDP-N-acetylmuramic acid (UNAM) to form UDP-N-acetylmuramyl-L-alanine (UNAM-Ala). The phosphinate inhibitor 1 was designed and prepared as a multisubstrate/transition state analogue. The compound exhibits mixed-type inhibition with respect to all three enzyme substrates (ATP, UNAM, Ala), suggesting that this compound forms dead-end complexes with multiple enzyme states. Results from isothermal titration calorimetry (ITC) studies supported these findings as exothermic binding was observed under conditions with free enzyme (K(d) = 1.80-2.79 microM, 95% CI), enzyme saturated with ATP (K(d) = 0.097-0.108 microM, 95% CI), and enzyme saturated with the reaction product ADP (K(d) = 0.371-0.751 microM, 95% CI). Titrations run under conditions of saturating UNAM or the product UNAM-Ala did not show heat effects consistent with competitive compound binding to the active site. The potent binding affinity observed in the presence of ATP is consistent with the inhibitor design and the proposed Ordered Ter-Ter mechanism for this enzyme; however, the additional binding pathways suggest that the inhibitor can also serve as a product analogue.
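
    For context on the magnitude of these dissociation constants (and not part of the original study), the sketch below converts representative Kd values lying roughly within the reported ranges to binding free energies via Delta G = RT ln Kd.

      # Rough Kd -> binding free energy conversion at 25 C. The example Kd values
      # are representative of the reported ranges, not the published point estimates.
      import math

      R = 8.314       # J/(mol*K)
      T = 298.15      # K

      def delta_g_kcal(kd_molar):
          return R * T * math.log(kd_molar) / 4184.0   # kcal/mol (negative = favorable)

      for label, kd_uM in [("free enzyme", 2.0), ("ATP-saturated", 0.1)]:
          print(label, round(delta_g_kcal(kd_uM * 1e-6), 1), "kcal/mol")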

  1. High Performance Programming Using Explicit Shared Memory Model on the Cray T3D

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Simon, Horst D.; Lasinski, T. A. (Technical Monitor)

    1994-01-01

    The Cray T3D is the first-phase system in Cray Research Inc.'s (CRI) three-phase massively parallel processing program. In this report we describe the architecture of the T3D, as well as the CRAFT (Cray Research Adaptive Fortran) programming model, and contrast it with PVM, which is also supported on the T3D. We present some performance data based on the NAS Parallel Benchmarks to illustrate both architectural and software features of the T3D.

  2. Benchmark tests on the digital equipment corporation Alpha AXP 21164-based AlphaServer 8400, including a comparison of optimized vector and superscalar processing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wasserman, H.J.

    1996-02-01

    The second generation of the Digital Equipment Corp. (DEC) DECchip Alpha AXP microprocessor is referred to as the 21164. From the viewpoint of numerically intensive computing, the primary difference between it and its predecessor, the 21064, is that the 21164 has twice the multiply/add throughput per clock period (CP), a maximum of two floating point operations (FLOPS) per CP vs. one for the 21064. The AlphaServer 8400 is a shared-memory multiprocessor server system that can accommodate up to 12 CPUs and up to 14 GB of memory. In this report we will compare single processor performance of the 8400 system with that of the International Business Machines Corp. (IBM) RISC System/6000 POWER-2 microprocessor running at 66 MHz, the Silicon Graphics, Inc. (SGI) MIPS R8000 microprocessor running at 75 MHz, and the Cray Research, Inc. CRAY J90. The performance comparison is based on a set of Fortran benchmark codes that represent a portion of the Los Alamos National Laboratory supercomputer workload. The advantage of using these codes is that they span a wide range of computational characteristics, such as vectorizability, problem size, and memory access pattern. The primary disadvantage of using them is that detailed, quantitative analysis of performance behavior of all codes on all machines is difficult. One important addition to the benchmark set appears for the first time in this report. Whereas the older version was written for a vector processor, the newer version is more optimized for microprocessor architectures. Therefore, we have, for the first time, an opportunity to measure performance on a single application using implementations that expose the respective strengths of vector and superscalar architecture. All results in this report are from single processors. A subsequent article will explore shared-memory multiprocessing performance of the 8400 system.

  3. NASA Exhibits

    NASA Technical Reports Server (NTRS)

    Deardorff, Glenn; Djomehri, M. Jahed; Freeman, Ken; Gambrel, Dave; Green, Bryan; Henze, Chris; Hinke, Thomas; Hood, Robert; Kiris, Cetin; Moran, Patrick

    2001-01-01

    A series of NASA presentations for the Supercomputing 2001 conference is summarized. The topics include: (1) Mars Surveyor Landing Sites "Collaboratory"; (2) Parallel and Distributed CFD for Unsteady Flows with Moving Overset Grids; (3) IP Multicast for Seamless Support of Remote Science; (4) Consolidated Supercomputing Management Office; (5) Growler: A Component-Based Framework for Distributed/Collaborative Scientific Visualization and Computational Steering; (6) Data Mining on the Information Power Grid (IPG); (7) Debugging on the IPG; (8) DeBakey Heart Assist Device; (9) Unsteady Turbopump for Reusable Launch Vehicle; (10) Exploratory Computing Environments Component Framework; (11) OVERSET Computational Fluid Dynamics Tools; (12) Control and Observation in Distributed Environments; (13) Multi-Level Parallelism Scaling on NASA's Origin 1024 CPU System; (14) Computing, Information, & Communications Technology; (15) NAS Grid Benchmarks; (16) IPG: A Large-Scale Distributed Computing and Data Management System; and (17) ILab: Parameter Study Creation and Submission on the IPG.

  4. Computing with Beowulf

    NASA Technical Reports Server (NTRS)

    Cohen, Jarrett

    1999-01-01

    Parallel computers built out of mass-market parts are cost-effectively performing data processing and simulation tasks. The Supercomputing (now known as "SC") series of conferences celebrated its 10th anniversary last November. While vendors have come and gone, the dominant paradigm for tackling big problems still is a shared-resource, commercial supercomputer. Growing numbers of users needing a cheaper or dedicated-access alternative are building their own supercomputers out of mass-market parts. Such machines are generally called Beowulf-class systems after the 11th century epic. This modern-day Beowulf story began in 1994 at NASA's Goddard Space Flight Center. A laboratory for the Earth and space sciences, computing managers there threw down a gauntlet to develop a $50,000 gigaFLOPS workstation for processing satellite data sets. Soon, Thomas Sterling and Don Becker were working on the Beowulf concept at the University Space Research Association (USRA)-run Center of Excellence in Space Data and Information Sciences (CESDIS). Beowulf clusters mix three primary ingredients: commodity personal computers or workstations, low-cost Ethernet networks, and the open-source Linux operating system. One of the larger Beowulfs is Goddard's Highly-parallel Integrated Virtual Environment, or HIVE for short.

  5. Sistemas Correctores de Campo Para EL Telescopio Ritchey-Chretien UNAM212

    NASA Astrophysics Data System (ADS)

    Cobos, F. J.; Galan, M. J.

    1987-05-01

    The UNAM212 telescope was inaugurated seven years ago and was conceived to operate at the focal ratios f/7.5, f/13.5, f/27, and f/98. The Ritchey-Chretien design corresponds to the f/7.5 focal ratio, and the prime focus (f/2.286) was not considered usable for direct photography. At the Instituto de Astronomía of UNAM, a field corrector system for the f/7.5 focal ratio was designed and built and is currently in operation. Within a collaborative program on the design and evaluation of optical systems between the Instituto de Astrofísica de Canarias and the Instituto de Astronomía of UNAM, we decided to attempt the design of a field corrector for the prime focus of the UNAM212 telescope, on the grounds that the problems its installation would entail are not insurmountable and that, in the relatively near future, we may well have a two-dimensional Mepsicron-type detector whose sensitive area makes the idea of building a direct camera for the prime focus very tempting.

  6. Study to Determine Seismic Response of Sonic Boom-Coupled Rayleigh Waves

    DTIC Science & Technology

    1990-04-26

    are compiled from the microtremor measurements carried out by Instituto de Ingenieria, UNAM and scientists from Japan (for a total of 181 sites...the accelerographs operated by Instituto de Ingenieria, UNAM. Using this new data and results from the analysis of previous accelerograms we present

  7. Towards Cloud-Resolving European-Scale Climate Simulations using a fully GPU-enabled Prototype of the COSMO Regional Model

    NASA Astrophysics Data System (ADS)

    Leutwyler, David; Fuhrer, Oliver; Cumming, Benjamin; Lapillonne, Xavier; Gysi, Tobias; Lüthi, Daniel; Osuna, Carlos; Schär, Christoph

    2014-05-01

    The representation of moist convection is a major shortcoming of current global and regional climate models. State-of-the-art global models usually operate at grid spacings of 10-300 km, and therefore cannot fully resolve the relevant upscale and downscale energy cascades. Parametrization of the relevant sub-grid scale processes is therefore required. Several studies have shown that this approach entails major uncertainties for precipitation processes, which raises concerns about the model's ability to represent precipitation statistics and associated feedback processes, as well as their sensitivities to large-scale conditions. Further refining the model resolution to the kilometer scale allows representing these processes much closer to first principles and thus should yield an improved representation of the water cycle including the drivers of extreme events. Although cloud-resolving simulations are very useful tools for climate simulations and numerical weather prediction, their high horizontal resolution and consequently the small time steps needed challenge current supercomputers to model large domains and long time scales. The recent innovations in the domain of hybrid supercomputers have led to mixed node designs with a conventional CPU and an accelerator such as a graphics processing unit (GPU). GPUs relax the necessity for cache coherency and complex memory hierarchies, but offer higher memory bandwidth. This is highly beneficial for low-compute-intensity codes such as stencil-based atmospheric models. However, to efficiently exploit these hybrid architectures, climate models need to be ported and/or redesigned. Within the framework of the Swiss High Performance High Productivity Computing initiative (HP2C), a project to port the COSMO model to hybrid architectures has recently come to an end. The product of these efforts is a version of COSMO with improved performance on traditional x86-based clusters as well as hybrid architectures with GPUs. We present our redesign and porting approach as well as our experience and lessons learned. Furthermore, we discuss relevant performance benchmarks obtained on the new hybrid Cray XC30 system "Piz Daint" installed at the Swiss National Supercomputing Centre (CSCS), in terms of both time-to-solution and energy consumption. We will demonstrate a first set of short cloud-resolving climate simulations at the European scale using the GPU-enabled COSMO prototype and elaborate on our future plans for how to exploit this new model capability.
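
    To illustrate why stencil-based atmospheric kernels are limited by memory bandwidth rather than floating-point throughput (this is not COSMO code), the sketch below applies a simple 5-point stencil and makes a back-of-envelope arithmetic-intensity estimate.

      # Illustrative low-arithmetic-intensity stencil update, plus a rough flop/byte estimate.
      import numpy as np

      def laplacian_step(u, dt=0.1):
          """One explicit diffusion step with a 5-point stencil on the interior."""
          out = u.copy()
          out[1:-1, 1:-1] += dt * (u[:-2, 1:-1] + u[2:, 1:-1] +
                                   u[1:-1, :-2] + u[1:-1, 2:] - 4.0 * u[1:-1, 1:-1])
          return out

      # Roughly 8 flops per updated point versus at least one read and one write of a
      # double (16 bytes) when neighbour values are reused from cache: well below
      # 1 flop/byte, so the kernel is bandwidth bound on most hardware.
      flops_per_point, bytes_per_point = 8.0, 16.0
      print("arithmetic intensity ~", flops_per_point / bytes_per_point, "flop/byte")

      u = np.zeros((64, 64)); u[32, 32] = 1.0
      print(laplacian_step(u)[31:34, 31:34])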

  8. Preface: SciDAC 2008

    NASA Astrophysics Data System (ADS)

    Stevens, Rick

    2008-07-01

    The fourth annual Scientific Discovery through Advanced Computing (SciDAC) Conference was held June 13-18, 2008, in Seattle, Washington. The SciDAC conference series is the premier communitywide venue for presentation of results from the DOE Office of Science's interdisciplinary computational science program. Started in 2001 and renewed in 2006, the DOE SciDAC program is the country's - and arguably the world's - most significant interdisciplinary research program supporting the development of advanced scientific computing methods and their application to fundamental and applied areas of science. SciDAC supports computational science across many disciplines, including astrophysics, biology, chemistry, fusion sciences, and nuclear physics. Moreover, the program actively encourages the creation of long-term partnerships among scientists focused on challenging problems and computer scientists and applied mathematicians developing the technology and tools needed to address those problems. The SciDAC program has played an increasingly important role in scientific research by allowing scientists to create more accurate models of complex processes, simulate problems once thought to be impossible, and analyze the growing amount of data generated by experiments. To help further the research community's ability to tap into the capabilities of current and future supercomputers, Under Secretary for Science, Raymond Orbach, launched the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program in 2003. The INCITE program was conceived specifically to seek out computationally intensive, large-scale research projects with the potential to significantly advance key areas in science and engineering. The program encourages proposals from universities, other research institutions, and industry. During the first two years of the INCITE program, 10 percent of the resources at NERSC were allocated to INCITE awardees. However, demand for supercomputing resources far exceeded available systems; and in 2003, the Office of Science identified increasing computing capability by a factor of 100 as the second priority on its Facilities of the Future list. The goal was to establish leadership-class computing resources to support open science. As a result of a peer reviewed competition, the first leadership computing facility was established at Oak Ridge National Laboratory in 2004. A second leadership computing facility was established at Argonne National Laboratory in 2006. This expansion of computational resources led to a corresponding expansion of the INCITE program. In 2008, Argonne, Lawrence Berkeley, Oak Ridge, and Pacific Northwest national laboratories all provided resources for INCITE. By awarding large blocks of computer time on the DOE leadership computing facilities, the INCITE program enables the largest-scale computations to be pursued. In 2009, INCITE will award over half a billion node-hours of time. The SciDAC conference celebrates progress in advancing science through large-scale modeling and simulation. Over 350 participants attended this year's talks, poster sessions, and tutorials, spanning the disciplines supported by DOE. While the principal focus was on SciDAC accomplishments, this year's conference also included invited presentations and posters from DOE INCITE awardees. 
Another new feature in the SciDAC conference series was an electronic theater and video poster session, which provided an opportunity for the community to see over 50 scientific visualizations in a venue equipped with many high-resolution large-format displays. To highlight the growing international interest in petascale computing, this year's SciDAC conference included a keynote presentation by Herman Lederer from the Max Planck Institut, one of the leaders of the DEISA (Distributed European Infrastructure for Supercomputing Applications) project and a member of the PRACE consortium, Europe's main petascale project. We also heard excellent talks from several European groups, including Laurent Gicquel of CERFACS, who spoke on `Large-Eddy Simulations of Turbulent Reacting Flows of Real Burners: Status and Challenges', and Jean-Francois Hamelin from EDF, who presented a talk on `Getting Ready for Petaflop Capacities and Beyond: A Utility Perspective'. Two other compelling addresses gave attendees a glimpse into the future. Tomas Diaz de la Rubia of Lawrence Livermore National Laboratory spoke on a vision for a fusion/fission hybrid reactor known as the `LIFE Engine' and discussed some of the materials and modeling challenges that need to be overcome to realize the vision for a 1000-year greenhouse-gas-free power source. Dan Reed from Microsoft gave a capstone talk on the convergence of technology, architecture, and infrastructure for cloud computing, data-intensive computing, and exascale computing (10^18 flops/sec). High-performance computing is making rapid strides. The SciDAC community's computational resources are expanding dramatically. In the summer of 2008, the first general-purpose petascale system (the IBM Cell-based Roadrunner at Los Alamos National Laboratory) was recognized in the top 500 list of fastest machines, heralding the dawn of the petascale era. The DOE's leadership computing facility at Argonne reached number three on the Top 500 and is at the moment the most capable open science machine, based on an IBM BG/P system with a peak performance of over 550 teraflops/sec. Later this year Oak Ridge is expected to deploy a 1 petaflops/sec Cray XT system. And even before the scientific community has had an opportunity to make significant use of petascale systems, the computer science research community is forging ahead with ideas and strategies for development of systems that may by the end of the next decade sustain exascale performance. Several talks addressed barriers to, and strategies for, achieving exascale capabilities. The last day of the conference was devoted to tutorials hosted by Microsoft Research at a new conference facility in Redmond, Washington. Over 90 people attended the tutorials, which covered topics ranging from an introduction to BG/P programming to advanced numerical libraries. The SciDAC and INCITE programs and the DOE Office of Advanced Scientific Computing Research core program investments in applied mathematics, computer science, and computational and networking facilities provide a nearly optimum framework for advancing computational science for DOE's Office of Science. At a broader level this framework also is benefiting the entire American scientific enterprise. As we look forward, it is clear that computational approaches will play an increasingly significant role in addressing challenging problems in basic science, energy, and environmental research.
It takes many people to organize and support the SciDAC conference, and I would like to thank as many of them as possible. The backbone of the conference is the technical program; and the task of selecting, vetting, and recruiting speakers is the job of the organizing committee. I thank the members of this committee for all the hard work and the many tens of conference calls that enabled a wonderful program to be assembled. This year the following people served on the organizing committee: Jim Ahrens, LANL; David Bader, LLNL; Bryan Barnett, Microsoft; Peter Beckman, ANL; Vincent Chan, GA; Jackie Chen, SNL; Lori Diachin, LLNL; Dan Fay, Microsoft; Ian Foster, ANL; Mark Gordon, Ames; Mohammad Khaleel, PNNL; David Keyes, Columbia University; Bob Lucas, University of Southern California; Tony Mezzacappa, ORNL; Jeff Nichols, ORNL; David Nowak, ANL; Michael Papka, ANL; Thomas Schultess, ORNL; Horst Simon, LBNL; David Skinner, LBNL; Panagiotis Spentzouris, Fermilab; Bob Sugar, UCSB; and Kathy Yelick, LBNL. I owe a special thanks to Mike Papka and Jim Ahrens for handling the electronic theater. I also thank all those who submitted videos. It was a highly successful experiment. Behind the scenes an enormous amount of work is required to make a large conference go smoothly. First I thank Cheryl Zidel for her tireless efforts as organizing committee liaison and posters chair and, in general, handling all of my end of the program and keeping me calm. I also thank Gail Pieper for her work in editing the proceedings, Beth Cerny Patino for her work on the Organizing Committee website and electronic theater, and Ken Raffenetti for his work in keeping that website working. Jon Bashor and John Hules did an excellent job in handling conference communications. I thank Caitlin Youngquist for the striking graphic design; Dan Fay for tutorials arrangements; and Lynn Dory, Suzanne Stevenson, Sarah Pebelske and Sarah Zidel for on-site registration and conference support. We all owe Yeen Mankin an extra-special thanks for choosing the hotel, handling contracts, arranging menus, securing venues, and reassuring the chair that everything was under control. We are pleased to have obtained corporate sponsorship from Cray, IBM, Intel, HP, and SiCortex. I thank all the speakers and panel presenters. I also thank the former conference chairs Tony Metzzacappa, Bill Tang, and David Keyes, who were never far away for advice and encouragement. Finally, I offer my thanks to Michael Strayer, without whose leadership, vision, and persistence the SciDAC program would not have come into being and flourished. I am honored to be part of his program and his friend. Rick Stevens Seattle, Washington July 18, 2008

  9. JESPP: Joint Experimentation on Scalable Parallel Processors Supercomputers

    DTIC Science & Technology

    2010-03-01

    were for the relatively small market of scientific and engineering applications. Contrast this with GPUs that are designed to improve the end-user...experience in mass-market arenas such as gaming. In order to get meaningful speed-up using the GPU, it was determined that the data transfer and... [flattened publications-table fragment: "Effectively using a Large GPGPU-Enhanced Linux Cluster," HPCMP UGC, 2009; "FLOPS per Watt: Heterogeneous-Computing’s Approach"]

  10. Quality use of the computer: Computational mechanics, artificial intelligence, robotics, and acoustic sensing; Proceedings of the ASME/JSME Pressure Vessels and Piping Conference, Honolulu, HI, July 23-27, 1989

    NASA Astrophysics Data System (ADS)

    Cory, J. F., Jr.; Gordon, J. L.; Miyoshi, T.; Suzuki, K.

    1989-06-01

    Papers are presented on the use of microcomputers, supercomputers, and workstations in solid and structural mechanics. Artificial intelligence technology, the development and use of expert systems, and research in the area of robotics are discussed. Attention is also given to probabilistic finite element and boundary element methods and acoustic sensing.

  11. Shooting and bouncing rays - Calculating the RCS of an arbitrarily shaped cavity

    NASA Technical Reports Server (NTRS)

    Ling, Hao; Chou, Ri-Chee; Lee, Shung-Wu

    1989-01-01

    A ray-shooting approach is presented for calculating the interior radar cross section (RCS) from a partially open cavity. In the problem considered, a dense grid of rays is launched into the cavity through the opening. The rays bounce from the cavity walls based on the laws of geometrical optics and eventually exit the cavity via the aperture. The ray-bouncing method is based on tracking a large number of rays launched into the cavity through the opening and determining the geometrical optics field associated with each ray by taking into consideration (1) the geometrical divergence factor, (2) polarization, and (3) material loading of the cavity walls. A physical optics scheme is then applied to compute the backscattered field from the exit rays. This method is so simple in concept that there is virtually no restriction on the shape or material loading of the cavity. Numerical results obtained by this method are compared with those for the modal analysis for a circular cylinder terminated by a PEC plate. RCS results for an S-bend circular cylinder generated on the Cray X-MP supercomputer show significant RCS reduction. Some of the limitations and possible extensions of this technique are discussed.
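
    A purely geometric toy sketch of the ray-bouncing idea is given below (it is not the authors' SBR implementation, which also tracks the divergence factor, polarization, wall materials, and a physical-optics integral over the exit rays): rays are launched into a 2-D rectangular cavity open on one side and reflected specularly until they escape through the aperture.

      # Toy specular ray-bouncing sketch for an open rectangular cavity.
      import numpy as np

      def bounce_count(y0, angle, width=2.0, height=1.0, step=0.01, max_steps=100000):
          """Count wall bounces before a ray exits the open side (x = 0) of the cavity."""
          x, y = 0.0, y0
          dx, dy = np.cos(angle), np.sin(angle)      # unit direction, launched inward
          bounces = 0
          for _ in range(max_steps):
              x, y = x + step * dx, y + step * dy
              if y < 0.0 or y > height:              # top/bottom walls: specular bounce
                  dy = -dy
                  y = min(max(y, 0.0), height)
                  bounces += 1
              if x > width:                          # closed back wall
                  dx = -dx
                  x = width
                  bounces += 1
              if x < 0.0:                            # back out through the aperture
                  return bounces
          return bounces

      for angle in (-0.9, -0.3, 0.0, 0.3, 0.9):      # launch angles in radians
          print(angle, bounce_count(y0=0.5, angle=angle))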

  12. PARVMEC: An Efficient, Scalable Implementation of the Variational Moments Equilibrium Code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seal, Sudip K; Hirshman, Steven Paul; Wingen, Andreas

    The ability to sustain magnetically confined plasma in a state of stable equilibrium is crucial for optimal and cost-effective operations of fusion devices like tokamaks and stellarators. The Variational Moments Equilibrium Code (VMEC) is the de facto serial application used by fusion scientists to compute magnetohydrodynamics (MHD) equilibria and study the physics of three-dimensional plasmas in confined configurations. Modern fusion energy experiments have larger system scales with more interactive experimental workflows, both demanding faster analysis turnaround times on computational workloads that are stressing the capabilities of sequential VMEC. In this paper, we present PARVMEC, an efficient, parallel version of its sequential counterpart, capable of scaling to thousands of processors on distributed memory machines. PARVMEC is a non-linear code, with multiple numerical physics modules, each with its own computational complexity. A detailed speedup analysis supported by scaling results on 1,024 cores of a Cray XC30 supercomputer is presented. Depending on the mode of PARVMEC execution, speedup improvements of one to two orders of magnitude are reported. PARVMEC equips fusion scientists for the first time with a state-of-the-art capability for rapid, high-fidelity analyses of magnetically confined plasmas at unprecedented scales.
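
    As a back-of-envelope companion to the speedup analysis mentioned above (with a purely hypothetical serial fraction, not a measured property of PARVMEC), Amdahl's law gives the strong-scaling ceiling as core counts grow, as in the sketch below.

      # Amdahl's law strong-scaling model; the serial fraction is hypothetical.
      def amdahl_speedup(p, serial_fraction):
          return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)

      for cores in (16, 64, 256, 1024):
          print(cores, round(amdahl_speedup(cores, serial_fraction=0.002), 1))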

  13. Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

    DOE PAGES

    Shao, Meiyue; Aktulga, H.  Metin; Yang, Chao; ...

    2017-09-14

    In this paper, we describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of the special structure of the nuclear configuration interaction problem, which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. Finally, we also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos-based algorithm for problems of moderate size on a Cray XC30 system.

  14. Dynamic load balancing for petascale quantum Monte Carlo applications: The Alias method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sudheer, C. D.; Krishnan, S.; Srinivasan, A.

    Diffusion Monte Carlo is the most accurate widely used Quantum Monte Carlo method for the electronic structure of materials, but it requires frequent load balancing or population redistribution steps to maintain efficiency and avoid accumulation of systematic errors on parallel machines. The load balancing step can be a significant factor affecting performance, and will become more important as the number of processing elements increases. We propose a new dynamic load balancing algorithm, the Alias Method, and evaluate it theoretically and empirically. An important feature of the new algorithm is that the load can be perfectly balanced with each process receiving at most one message. It is also optimal in the maximum size of messages received by any process. We also optimize its implementation to reduce network contention, a process facilitated by the low messaging requirement of the algorithm. Empirical results on the petaflop Cray XT Jaguar supercomputer at ORNL show up to 30% improvement in performance on 120,000 cores. The load balancing algorithm may be straightforwardly implemented in existing codes. The algorithm may also be employed by any method with many near-identical computational tasks that requires load balancing.
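
    For background (and not the authors' parallel implementation), the classical Walker alias-table construction from which the method takes its name can be sketched as follows; it pairs under-full and over-full buckets so that each bucket is completed by at most one donor, which is loosely analogous to how the load-balancing variant above lets each process receive at most one message.

      # Classical Walker alias-table construction and O(1) sampling sketch.
      import numpy as np

      def build_alias(weights):
          w = np.asarray(weights, dtype=float)
          n = len(w)
          prob = w * n / w.sum()                    # scale so the mean bucket load is 1
          alias = np.zeros(n, dtype=int)
          small = [i for i, p in enumerate(prob) if p < 1.0]
          large = [i for i, p in enumerate(prob) if p >= 1.0]
          while small and large:
              s, l = small.pop(), large.pop()
              alias[s] = l                          # bucket s is topped up by donor l
              prob[l] -= 1.0 - prob[s]              # move the surplus from l onto s
              (small if prob[l] < 1.0 else large).append(l)
          return prob, alias

      def sample(prob, alias, rng, size):
          i = rng.integers(len(prob), size=size)
          return np.where(rng.random(size) < prob[i], i, alias[i])

      rng = np.random.default_rng(0)
      prob, alias = build_alias([0.1, 0.2, 0.3, 0.4])
      draws = sample(prob, alias, rng, 100000)
      print(np.bincount(draws) / len(draws))        # ~[0.1, 0.2, 0.3, 0.4]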

  15. OpenMP Parallelization and Optimization of Graph-Based Machine Learning Algorithms

    DOE PAGES

    Meng, Zhaoyi; Koniges, Alice; He, Yun Helen; ...

    2016-09-21

    In this paper, we investigate the OpenMP parallelization and optimization of two novel data classification algorithms. The new algorithms are based on graph and PDE solution techniques and provide significant accuracy and performance advantages over traditional data classification algorithms in serial mode. The methods leverage the Nystrom extension to calculate eigenvalues/eigenvectors of the graph Laplacian; this is a self-contained module that can be used in conjunction with other graph-Laplacian-based methods such as spectral clustering. We use performance tools to identify the hotspots and memory access patterns of the serial codes and use OpenMP as the parallelization language to parallelize the most time-consuming parts. Where possible, we also use library routines. We then optimize the OpenMP implementations and detail the performance on traditional supercomputer nodes (in our case a Cray XC30), and test the optimization steps on emerging testbed systems based on Intel's Knights Corner and Landing processors. We show both performance improvement and strong scaling behavior. Finally, a large number of optimization techniques and analyses are necessary before the algorithm reaches almost ideal scaling.
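
    The Nystrom extension mentioned above can be sketched in a few lines of NumPy (illustrative only, not the paper's optimized OpenMP code): approximate eigenpairs of a large symmetric similarity matrix are obtained from a random subset of landmark columns, the same trick used to estimate graph-Laplacian eigenvectors without forming the full matrix.

      # Nystrom approximation of the leading eigenpairs of a Gaussian kernel matrix.
      import numpy as np

      def nystrom(X, m=50, sigma=1.0, seed=0):
          rng = np.random.default_rng(seed)
          n = X.shape[0]
          idx = rng.choice(n, size=m, replace=False)          # landmark points
          d2 = ((X[:, None, :] - X[None, idx, :]) ** 2).sum(-1)
          C = np.exp(-d2 / (2 * sigma**2))                    # n x m kernel block
          W = C[idx]                                          # m x m landmark block
          lam, U = np.linalg.eigh(W)
          keep = lam > 1e-10
          lam, U = lam[keep], U[:, keep]
          V = C @ U / lam                                     # extend eigenvectors to all n points
          V *= np.sqrt(m / n)                                 # rescale to ~unit norm
          return (n / m) * lam, V                             # approximate eigenpairs

      X = np.random.default_rng(1).standard_normal((2000, 5))
      vals, vecs = nystrom(X)
      print(vals[-3:], vecs.shape)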

  16. Computer aided design of monolithic microwave and millimeter wave integrated circuits and subsystems

    NASA Astrophysics Data System (ADS)

    Ku, Walter H.

    1987-08-01

    This interim technical report presents results of research on the computer aided design of monolithic microwave and millimeter wave integrated circuits and subsystems. A specific objective is to extend the state-of-the-art of the Computer Aided Design (CAD) of the monolithic microwave and millimeter wave integrated circuits (MIMIC). In this reporting period, we have derived a new model for the high electron mobility transistor (HEMT) based on a nonlinear charge control formulation which takes into consideration the variation of the 2DEG distance offset from the heterointerface as a function of bias. Pseudomorphic InGaAs/GaAs HEMT devices have been successfully fabricated at UCSD. For a 1 micron gate length, a maximum transconductance of 320 mS/mm was obtained. In cooperation with TRW, devices with 0.15 micron and 0.25 micron gate lengths have been successfully fabricated and tested. New results on the design of ultra-wideband distributed amplifiers using 0.15 micron pseudomorphic InGaAs/GaAs HEMT's have also been obtained. In addition, two-dimensional models of the submicron MESFET's, HEMT's and HBT's are currently being developed for the CRAY X-MP/48 supercomputer. Preliminary results obtained are also presented in this report.

  17. High-performance finite-difference time-domain simulations of C-Mod and ITER RF antennas

    NASA Astrophysics Data System (ADS)

    Jenkins, Thomas G.; Smithe, David N.

    2015-12-01

    Finite-difference time-domain methods have, in recent years, developed powerful capabilities for modeling realistic ICRF behavior in fusion plasmas [1, 2, 3, 4]. When coupled with the power of modern high-performance computing platforms, such techniques allow the behavior of antenna near and far fields, and the flow of RF power, to be studied in realistic experimental scenarios at previously inaccessible levels of resolution. In this talk, we present results and 3D animations from high-performance FDTD simulations on the Titan Cray XK7 supercomputer, modeling both Alcator C-Mod's field-aligned ICRF antenna and the ITER antenna module. Much of this work focuses on scans over edge density, and tailored edge density profiles, to study dispersion and the physics of slow wave excitation in the immediate vicinity of the antenna hardware and SOL. An understanding of the role of the lower-hybrid resonance in low-density scenarios is emerging, and possible implications of this for the NSTX launcher and power balance are also discussed. In addition, we discuss ongoing work centered on using these simulations to estimate sputtering and impurity production, as driven by the self-consistent sheath potentials at antenna surfaces.
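
    For readers unfamiliar with the method, a minimal one-dimensional FDTD (Yee) update loop in vacuum and normalized units looks like the sketch below; the production simulations described here are fully three-dimensional, include the plasma dielectric response and realistic antenna geometry, and run on the Titan Cray XK7.

      # Minimal 1-D FDTD (Yee) sketch in normalized units; illustrative only.
      import numpy as np

      nz, nt = 400, 800
      c, dz = 1.0, 1.0
      dt = 0.5 * dz / c                      # satisfies the 1-D CFL condition
      ez = np.zeros(nz)                      # E field at integer grid points
      hy = np.zeros(nz - 1)                  # H field at half grid points

      for n in range(nt):
          hy += dt / dz * (ez[1:] - ez[:-1])             # update H from curl E
          ez[1:-1] += dt / dz * (hy[1:] - hy[:-1])       # update E from curl H
          ez[nz // 4] += np.exp(-((n - 60) / 20.0) ** 2) # soft Gaussian-pulse source

      print("peak |Ez| =", float(np.abs(ez).max()))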

  18. High-performance finite-difference time-domain simulations of C-Mod and ITER RF antennas

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jenkins, Thomas G., E-mail: tgjenkins@txcorp.com; Smithe, David N., E-mail: smithe@txcorp.com

    Finite-difference time-domain methods have, in recent years, developed powerful capabilities for modeling realistic ICRF behavior in fusion plasmas [1, 2, 3, 4]. When coupled with the power of modern high-performance computing platforms, such techniques allow the behavior of antenna near and far fields, and the flow of RF power, to be studied in realistic experimental scenarios at previously inaccessible levels of resolution. In this talk, we present results and 3D animations from high-performance FDTD simulations on the Titan Cray XK7 supercomputer, modeling both Alcator C-Mod’s field-aligned ICRF antenna and the ITER antenna module. Much of this work focuses on scans over edge density, and tailored edge density profiles, to study dispersion and the physics of slow wave excitation in the immediate vicinity of the antenna hardware and SOL. An understanding of the role of the lower-hybrid resonance in low-density scenarios is emerging, and possible implications of this for the NSTX launcher and power balance are also discussed. In addition, we discuss ongoing work centered on using these simulations to estimate sputtering and impurity production, as driven by the self-consistent sheath potentials at antenna surfaces.

  19. A matrix-algebraic formulation of distributed-memory maximal cardinality matching algorithms in bipartite graphs

    DOE PAGES

    Azad, Ariful; Buluç, Aydın

    2016-05-16

    We describe parallel algorithms for computing maximal cardinality matching in a bipartite graph on distributed-memory systems. Unlike traditional algorithms that match one vertex at a time, our algorithms process many unmatched vertices simultaneously using a matrix-algebraic formulation of maximal matching. This generic matrix-algebraic framework is used to develop three efficient maximal matching algorithms with minimal changes. The newly developed algorithms have two benefits over existing graph-based algorithms. First, unlike existing parallel algorithms, the cardinality of the matching obtained by the new algorithms stays constant with increasing processor counts, which is important for predictable and reproducible performance. Second, relying on bulk-synchronous matrix operations, these algorithms expose a higher degree of parallelism on distributed-memory platforms than existing graph-based algorithms. We report high-performance implementations of three maximal matching algorithms using hybrid OpenMP-MPI and evaluate the performance of these algorithms using more than 35 real and randomly generated graphs. On real instances, our algorithms achieve up to 200× speedup on 2048 cores of a Cray XC30 supercomputer. Even higher speedups are obtained on larger synthetically generated graphs, where our algorithms show good scaling on up to 16,384 cores.
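
    A serial sketch of the bulk-synchronous idea behind such algorithms is given below, assuming a Boolean SciPy sparse adjacency matrix; the paper's distributed matrix-algebraic primitives are replaced here by plain loops for readability, and the proposal rule is an illustrative choice rather than the published formulation.

        import numpy as np
        import scipy.sparse as sp

        def maximal_matching_bsp(A):
            # A: sparse adjacency of a bipartite graph (rows x cols), Boolean pattern.
            # Bulk-synchronous maximal matching: in every round each unmatched row
            # proposes to one unmatched column neighbor, each column accepts one
            # proposer, and the process repeats until no proposals remain.
            A = sp.csr_matrix(A, dtype=bool)
            nr, nc = A.shape
            row_mate = np.full(nr, -1)
            col_mate = np.full(nc, -1)
            while True:
                proposals = np.full(nc, -1)
                active = np.flatnonzero(row_mate == -1)
                proposed = False
                for r in active:                       # loop kept simple for clarity
                    cols = A.indices[A.indptr[r]:A.indptr[r + 1]]
                    free = cols[col_mate[cols] == -1]
                    if free.size:
                        proposals[free[0]] = r         # propose to first free neighbor
                        proposed = True
                if not proposed:
                    break                              # no unmatched row has a free neighbor
                for c in np.flatnonzero(proposals >= 0):
                    r = proposals[c]
                    if row_mate[r] == -1:              # a row may have won several columns
                        row_mate[r] = c
                        col_mate[c] = r
            return row_mate, col_mate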

  20. Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shao, Meiyue; Aktulga, H.  Metin; Yang, Chao

    In this paper, we describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of the special structure of the nuclear configuration interaction problem, which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation, and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. Finally, we also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos based algorithm for problems of moderate sizes on a Cray XC30 system.
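
    As a point of reference, the sketch below shows a preconditioned block iterative eigensolve with SciPy's LOBPCG; the diagonal (Jacobi-style) preconditioner and the random starting block are stand-ins for the problem-specific preconditioner and physically motivated starting guesses described in the abstract, and the test matrix is an arbitrary assumption.

        import numpy as np
        import scipy.sparse as sp
        from scipy.sparse.linalg import lobpcg

        def lowest_states(H, k, seed=0):
            # Lowest k eigenpairs of a large sparse symmetric matrix H using a
            # preconditioned block iterative method (LOBPCG).
            rng = np.random.default_rng(seed)
            d = H.diagonal()
            M = sp.diags(1.0 / (d - d.min() + 1.0))    # simple diagonal preconditioner
            X0 = rng.standard_normal((H.shape[0], k))  # starting block (random here)
            vals, vecs = lobpcg(H, X0, M=M, largest=False, tol=1e-8, maxiter=500)
            return vals, vecs

        # Small sparse test matrix standing in for a CI Hamiltonian.
        H = sp.diags(np.arange(1.0, 2001.0)) + sp.random(2000, 2000, density=1e-3, random_state=1)
        H = ((H + H.T) * 0.5).tocsr()
        print(lowest_states(H, k=4)[0])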

  1. A distributed-memory approximation algorithm for maximum weight perfect bipartite matching

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Azad, Ariful; Buluc, Aydin; Li, Xiaoye S.

    We design and implement an efficient parallel approximation algorithm for the problem of maximum weight perfect matching in bipartite graphs, i.e., the problem of finding a set of non-adjacent edges that covers all vertices and has maximum weight. This problem differs from the maximum weight matching problem, for which scalable approximation algorithms are known. It is primarily motivated by finding good pivots in scalable sparse direct solvers before factorization, where sequential implementations of maximum weight perfect matching algorithms, such as those available in MC64, are widely used due to the lack of scalable alternatives. To overcome this limitation, we propose a fully parallel distributed memory algorithm that first generates a perfect matching and then searches for weight-augmenting cycles of length four in parallel and iteratively augments the matching with a vertex disjoint set of such cycles. For most practical problems the weights of the perfect matchings generated by our algorithm are very close to the optimum. An efficient implementation of the algorithm scales up to 256 nodes (17,408 cores) on a Cray XC40 supercomputer and can solve instances that are too large to be handled by a single node using the sequential algorithm.
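
    A minimal serial sketch of the cycle-augmentation step is shown below, assuming a dense NumPy weight matrix with NaN marking missing edges; the parallel search and vertex-disjoint cycle selection of the actual algorithm are reduced here to a greedy sweep for illustration.

        import numpy as np

        def augment_with_4cycles(W, mate, max_sweeps=10):
            # W:    dense weight matrix of a bipartite graph (np.nan where no edge).
            # mate: mate[u] = column matched to row u (a perfect matching).
            # Look for cycles of length four formed by two matched edges
            # (u1, mate[u1]) and (u2, mate[u2]) plus the cross edges (u1, mate[u2])
            # and (u2, mate[u1]); swap partners whenever that increases the weight.
            n = len(mate)
            for _ in range(max_sweeps):
                improved = False
                used = np.zeros(n, dtype=bool)          # keep cycles vertex-disjoint per sweep
                for u1 in range(n):
                    if used[u1]:
                        continue
                    for u2 in range(u1 + 1, n):
                        if used[u2]:
                            continue
                        v1, v2 = mate[u1], mate[u2]
                        cross = W[u1, v2] + W[u2, v1]
                        if np.isnan(cross):
                            continue                     # one of the cross edges is missing
                        if cross > W[u1, v1] + W[u2, v2] + 1e-12:
                            mate[u1], mate[u2] = v2, v1  # apply the augmenting cycle
                            used[u1] = used[u2] = True
                            improved = True
                            break
                if not improved:
                    break
            return mate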

  2. [Teaching evaluation at Medical School, UNAM].

    PubMed

    Salas-Gómez, Luz Elena; Ortiz-Montalvo, Armando; Alaminos-Sager, Isabel Luisa

    2006-01-01

    The purpose of this article is to offer a synthesis of what has been done in the Teaching Evaluation Program at the Medical School of the Autonomous National University of Mexico (UNAM). The Program involves three student-opinion questionnaires that evaluate professors in the basic and sociomedical areas, the microbiology and parasitology laboratory, and surgery. Between 1994 and 2003, 134,811 questionnaires were answered to evaluate the teaching performance of 6262 professors of undergraduate students. Although evaluating teaching by a single means is insufficient, the results obtained allow us to affirm that the Medical School at UNAM has a good teaching staff; the results are also useful for the design of programs dedicated to the recognition of excellence and to the needs of teacher training.

  3. The UNAM M. Sc. program in Medical Physics enters its teen years

    NASA Astrophysics Data System (ADS)

    Brandan, María-Ester

    2010-12-01

    The M.Sc. (Medical Physics) program at the National Autonomous University of Mexico (UNAM), created in 1997, has graduated a substantial number of medical physicists who today constitute about 30% of the medical physics clinical workforce in the country. To date (May 2010), more than 60 students have graduated; 60% of them hold clinical jobs, 20% have completed or are pursuing a Ph.D., and 15% perform activities related to this specialization. In addition to strengthening the clinical practice of medical physics, the program has served as an incentive for medical physics research at UNAM and other centers. We report the circumstances of the program's origin, the evolution of its curriculum, the main achievements, and the next challenges.

  4. Integration, Development and Performance of the 500 TFLOPS Heterogeneous Cluster (Condor)

    DTIC Science & Technology

    2012-08-01

    Record excerpt (documentation-page and reference fragments): the report is a conference paper (post print) covering June 2010 - May 2013 on the integration, development, and performance of the Condor cluster. The PlayStation 3 nodes use the IBM Cell BE processor, which adopts a multi-processor, single-instruction-multiple-data (SIMD) design suited to streaming processing. Cited references include "PlayStation 3 for High Performance Cluster Computing," LAPACK Working Note 185, 2007, and Feng, W., X. Feng, and R. Ge, "Green Supercomputing Comes of...".

  5. VisIt: An End-User Tool for Visualizing and Analyzing Very Large Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Childs, Hank; Brugger, Eric; Whitlock, Brad

    2012-11-01

    VisIt is a popular open source tool for visualizing and analyzing big data. It owes its success to its focus on increasing data understanding, supporting large data, and providing a robust and usable product, as well as to an underlying design that fits today's supercomputing landscape. This report, which draws heavily from an earlier publication at the SciDAC Conference in 2011, describes the VisIt project and its accomplishments.

  6. OPAL: An Open-Source MPI-IO Library over Cray XT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yu, Weikuan; Vetter, Jeffrey S; Canon, Richard Shane

    Parallel IO over Cray XT is supported by a vendor-supplied MPI-IO package. This package contains a proprietary ADIO implementation built on top of the sysio library. While it is reasonable to maintain a stable code base for application scientists' convenience, it is also very important for system developers and researchers to analyze and assess the effectiveness of parallel IO software, and accordingly, tune and optimize the MPI-IO implementation. A proprietary parallel IO code base relinquishes such flexibility. On the other hand, a generic UFS-based MPI-IO implementation is typically used on many Linux-based platforms. We have developed an open-source MPI-IO package over Lustre, referred to as OPAL (OPportunistic and Adaptive MPI-IO Library over Lustre). OPAL provides a single source-code base for MPI-IO over Lustre on Cray XT and Linux platforms. Compared to the Cray implementation, OPAL provides a number of good features, including arbitrary specification of striping patterns and Lustre-stripe-aligned file domain partitioning. This paper presents the performance comparisons between OPAL and Cray's proprietary implementation. Our evaluation demonstrates that OPAL achieves performance comparable to the Cray implementation. We also exemplify the benefits of an open source package in revealing the underpinnings of parallel IO performance.
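
    The idea behind stripe-aligned file domain partitioning can be illustrated with a few lines of arithmetic; the sketch below is not OPAL's code, and the stripe size, function name, and example numbers are assumptions made for the illustration.

        def stripe_aligned_domains(file_size, num_ranks, stripe_size=1 << 20):
            # Partition the byte range [0, file_size) into one contiguous file domain
            # per MPI rank, with every boundary rounded to a Lustre stripe boundary so
            # that no two ranks ever touch the same stripe (and hence the same OST lock).
            stripes = (file_size + stripe_size - 1) // stripe_size
            base, extra = divmod(stripes, num_ranks)
            domains, start = [], 0
            for rank in range(num_ranks):
                count = base + (1 if rank < extra else 0)
                end = min(start + count * stripe_size, file_size)
                domains.append((start, end))            # [start, end) handled by this rank
                start = end
            return domains

        # Example: a 10 MiB file, 4 ranks, 1 MiB stripes.
        for r, (lo, hi) in enumerate(stripe_aligned_domains(10 * (1 << 20), 4)):
            print(f"rank {r}: bytes {lo}..{hi}")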

  7. PREFACE: IUPAP C20 Conference on Computational Physics (CCP 2011)

    NASA Astrophysics Data System (ADS)

    Troparevsky, Claudia; Stocks, George Malcolm

    2012-12-01

    Increasingly, computational physics stands alongside experiment and theory as an integral part of the modern approach to solving the great scientific challenges of the day on all scales - from cosmology and astrophysics, through climate science, to materials physics, and the fundamental structure of matter. Computational physics touches aspects of science and technology with direct relevance to our everyday lives, such as communication technologies and securing a clean and efficient energy future. This volume of Journal of Physics: Conference Series contains the proceedings of the scientific contributions presented at the 23rd Conference on Computational Physics held in Gatlinburg, Tennessee, USA, in November 2011. The annual Conferences on Computational Physics (CCP) are dedicated to presenting an overview of the most recent developments and opportunities in computational physics across a broad range of topical areas and from around the world. The CCP series has been in existence for more than 20 years, serving as a lively forum for computational physicists. The topics covered by this conference were: Materials/Condensed Matter Theory and Nanoscience, Strongly Correlated Systems and Quantum Phase Transitions, Quantum Chemistry and Atomic Physics, Quantum Chromodynamics, Astrophysics, Plasma Physics, Nuclear and High Energy Physics, Complex Systems: Chaos and Statistical Physics, Macroscopic Transport and Mesoscopic Methods, Biological Physics and Soft Materials, Supercomputing and Computational Physics Teaching, Computational Physics and Sustainable Energy. We would like to take this opportunity to thank our sponsors: International Union of Pure and Applied Physics (IUPAP), IUPAP Commission on Computational Physics (C20), American Physical Society Division of Computational Physics (APS-DCOMP), Oak Ridge National Laboratory (ORNL), Center for Defect Physics (CDP), the University of Tennessee (UT)/ORNL Joint Institute for Computational Sciences (JICS) and Cray, Inc. We are grateful to the committees that helped put the conference together, especially the local organizing committee. Particular thanks are also due to a number of ORNL staff who spent long hours with the administrative details. We are pleased to express our thanks to the conference administrator Ann Strange (ORNL/CDP) for her responsive and efficient day-to-day handling of this event, Sherry Samples, Assistant Conference Administrator (ORNL), Angie Beach and the ORNL Conference Office, and Shirley Shugart (ORNL) and Fern Stooksbury (ORNL) who created and maintained the conference website. 
    Editors: G Malcolm Stocks (ORNL) and M Claudia Troparevsky (UT) http://ccp2011.ornl.gov
    Chair: Dr Malcolm Stocks (ORNL)
    Vice Chairs: Adriana Moreo (ORNL/UT), James Gubernatis (LANL)
    Local Program Committee: Don Batchelor (ORNL), Jack Dongarra (UTK/ORNL), James Hack (ORNL), Robert Harrison (ORNL), Paul Kent (ORNL), Anthony Mezzacappa (ORNL), Adriana Moreo (ORNL), Witold Nazarewicz (UT), Loukas Petridis (ORNL), David Schultz (ORNL), Bill Shelton (ORNL), Claudia Troparevsky (ORNL), Mina Yoon (ORNL)
    International Advisory Board Members: Joan Adler (Israel Institute of Technology, Israel), Constantia Alexandrou (University of Cyprus, Cyprus), Claudia Ambrosch-Draxl (University of Leoben, Austria), Amanda Barnard (CSIRO, Australia), Peter Borcherds (University of Birmingham, UK), Klaus Cappelle (UFABC, Brazil), Giovanni Ciccotti (Università degli Studi di Roma 'La Sapienza', Italy), Nithaya Chetty (University of Pretoria, South Africa), Charlotte Froese-Fischer (NIST, US), Giulia A. Galli (University of California, Davis, US), Gillian Gehring (University of Sheffield, UK), Guang-Yu Guo (National Taiwan University, Taiwan), Sharon Hammes-Schiffer (Penn State, US), Alex Hansen (Norwegian UST), Duane D. Johnson (University of Illinois at Urbana-Champaign, US), David Landau (University of Georgia, US), Joaquin Marro (University of Granada, Spain), Richard Martin (UIUC, US), Todd Martinez (Stanford University, US), Bill McCurdy (Lawrence Berkeley National Laboratory, US), Ingrid Mertig (Martin Luther University, Germany), Alejandro Muramatsu (Universitat Stuttgart, Germany), Richard Needs (Cavendish Laboratory, UK), Giuseppina Orlandini (University of Trento, Italy), Martin Savage (University of Washington, US), Thomas Schulthess (ETH, Switzerland), Dzidka Szotek (Daresbury Laboratory, UK), Hideaki Takabe (Osaka University, Japan), William M. Tang (Princeton University, US), James Vary (Iowa State, US), Enge Wang (Chinese Academy of Science, China), Jian-Guo Wang (Institute of Applied Physics and Computational Mathematics, China), Jian-Sheng Wang (National University, Singapore), Dan Wei (Tsinghua University, China), Tony Williams (University of Adelaide, Australia), Rudy Zeller (Jülich, Germany)
    Conference Administrator: Ann Strange (ORNL)

  8. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Maxwell, Don E; Ezell, Matthew A; Becklehimer, Jeff

    While sites generally have systems in place to monitor the health of Cray computers themselves, often the cooling systems are ignored until a computer failure requires investigation into the source of the failure. The Liebert XDP units used to cool the Cray XE/XK models as well as the Cray proprietary cooling system used for the Cray XC30 models provide data useful for health monitoring. Unfortunately, this valuable information is often available only to custom solutions not accessible by a center-wide monitoring system or is simply ignored entirely. In this paper, methods and tools used to harvest the monitoring data available are discussed, and the implementation needed to integrate the data into a center-wide monitoring system at the Oak Ridge National Laboratory is provided.

  9. Contract W911NF-09-1-0488 (Rush University Medical Center)

    DTIC Science & Technology

    2012-11-23

    Record excerpt (reference-list fragments): ... algorithm. In Proceedings of the 1993 ACM/IEEE Conference on Supercomputing, pages 12-21, New York, 1993. ACM. [8] R. Yokota, T. Hamada, J. P. Bardhan, M. ... computing gravity anomalies. Geophysical Journal International, 2011, to appear. [13] R. Yokota, T. Hamada, J. P. Bardhan, M. G. Knepley, and L. A. Barba ... extension of the petfmm, a fast multipole library. Presentation at WCCM 2010, Sydney, Australia, 2010. [15] J. P. Bardhan. Interpreting the Coulomb ...

  10. Information Literacy in Nursing Students of Fes Zaragoza Unam.

    PubMed

    Sánchez G, Ana; Carmona M, Beatriz; Pérez O, Edgar; Del Socorro García V, Ma

    2018-01-01

    In an exploratory quantitative study, information literacy was analyzed in first-year students of the nursing program at the Facultad de Estudios Superiores Zaragoza, UNAM. A sample of 150 students completed a validated scale consisting of 8 categories and a total of 110 items. The average score obtained was 2.11, which places them in a 'low knowledge' category of information literacy.

  11. Multitasking the three-dimensional transport code TORT on CRAY platforms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Azmy, Y.Y.; Barnett, D.A.; Burre, C.A.

    1996-04-01

    The multitasking options in the three-dimensional neutral particle transport code TORT, originally implemented for Cray's CTSS operating system, are revived and extended to run on Cray Y/MP and C90 computers using the UNICOS operating system. These include two coarse-grained domain decompositions: across octants, and across directions within an octant, termed Octant Parallel (OP) and Direction Parallel (DP), respectively. Parallel performance of the DP is significantly enhanced by increasing the task grain size and reducing load imbalance via dynamic scheduling of the discrete angles among the participating tasks. Substantial wall clock speedup factors, approaching 4.5 using 8 tasks, have been measured in a time-sharing environment, and generally depend on the test problem specifications, number of tasks, and machine loading during execution.
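
    The grain-size and dynamic-scheduling trade-off described above can be illustrated with a small, purely schematic example; the worker function, chunk size, and task counts below are assumptions for the sketch and bear no relation to TORT's actual sweep kernels or Cray multitasking primitives.

        from concurrent.futures import ProcessPoolExecutor
        import numpy as np

        def sweep_angle(angle_index):
            # Stand-in for the work attached to one discrete angle (in TORT this
            # would be a full spatial sweep of the mesh for that direction).
            rng = np.random.default_rng(angle_index)
            return rng.random(100_000).sum()

        def direction_parallel(num_angles, num_tasks, chunk=2):
            # Dynamic scheduling of the discrete angles among tasks: angles are
            # handed out in small chunks as workers become free, which balances the
            # load; larger chunks raise the task grain size.
            with ProcessPoolExecutor(max_workers=num_tasks) as pool:
                partial_sums = pool.map(sweep_angle, range(num_angles), chunksize=chunk)
                return sum(partial_sums)

        if __name__ == "__main__":
            print(direction_parallel(num_angles=48, num_tasks=8))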

  12. Crystal Structures of Active Fully Assembled Substrate- and Product-Bound Complexes of UDP-N-Acetylmuramic Acid:l-Alanine Ligase (MurC) from Haemophilus influenzae

    PubMed Central

    Mol, Clifford D.; Brooun, Alexei; Dougan, Douglas R.; Hilgers, Mark T.; Tari, Leslie W.; Wijnands, Robert A.; Knuth, Mark W.; McRee, Duncan E.; Swanson, Ronald V.

    2003-01-01

    UDP-N-acetylmuramic acid:l-alanine ligase (MurC) catalyzes the addition of the first amino acid to the cytoplasmic precursor of the bacterial cell wall peptidoglycan. The crystal structures of Haemophilus influenzae MurC in complex with its substrate UDP-N-acetylmuramic acid (UNAM) and Mg2+ and of a fully assembled MurC complex with its product UDP-N-acetylmuramoyl-l-alanine (UMA), the nonhydrolyzable ATP analogue AMPPNP, and Mn2+ have been determined to 1.85- and 1.7-Å resolution, respectively. These structures reveal a conserved, three-domain architecture with the binding sites for UNAM and ATP formed at the domain interfaces: the N-terminal domain binds the UDP portion of UNAM, and the central and C-terminal domains form the ATP-binding site, while the C-terminal domain also positions the alanine. An active enzyme structure is thus assembled at the common domain interfaces when all three substrates are bound. The MurC active site clearly shows that the γ-phosphate of AMPPNP is positioned between two bound metal ions, one of which also binds the reactive UNAM carboxylate, and that the alanine is oriented by interactions with the positively charged side chains of two MurC arginine residues and the negatively charged alanine carboxyl group. These results indicate that significant diversity exists in binding of the UDP moiety of the substrate by MurC and the subsequent ligases in the bacterial cell wall biosynthesis pathway and that alterations in the domain packing and tertiary structure allow the Mur ligases to bind sequentially larger UNAM peptide substrates. PMID:12837790

  13. Crystal structures of active fully assembled substrate- and product-bound complexes of UDP-N-acetylmuramic acid:L-alanine ligase (MurC) from Haemophilus influenzae.

    PubMed

    Mol, Clifford D; Brooun, Alexei; Dougan, Douglas R; Hilgers, Mark T; Tari, Leslie W; Wijnands, Robert A; Knuth, Mark W; McRee, Duncan E; Swanson, Ronald V

    2003-07-01

    UDP-N-acetylmuramic acid:L-alanine ligase (MurC) catalyzes the addition of the first amino acid to the cytoplasmic precursor of the bacterial cell wall peptidoglycan. The crystal structures of Haemophilus influenzae MurC in complex with its substrate UDP-N-acetylmuramic acid (UNAM) and Mg(2+) and of a fully assembled MurC complex with its product UDP-N-acetylmuramoyl-L-alanine (UMA), the nonhydrolyzable ATP analogue AMPPNP, and Mn(2+) have been determined to 1.85- and 1.7-A resolution, respectively. These structures reveal a conserved, three-domain architecture with the binding sites for UNAM and ATP formed at the domain interfaces: the N-terminal domain binds the UDP portion of UNAM, and the central and C-terminal domains form the ATP-binding site, while the C-terminal domain also positions the alanine. An active enzyme structure is thus assembled at the common domain interfaces when all three substrates are bound. The MurC active site clearly shows that the gamma-phosphate of AMPPNP is positioned between two bound metal ions, one of which also binds the reactive UNAM carboxylate, and that the alanine is oriented by interactions with the positively charged side chains of two MurC arginine residues and the negatively charged alanine carboxyl group. These results indicate that significant diversity exists in binding of the UDP moiety of the substrate by MurC and the subsequent ligases in the bacterial cell wall biosynthesis pathway and that alterations in the domain packing and tertiary structure allow the Mur ligases to bind sequentially larger UNAM peptide substrates.

  14. Final Report for Project FG02-05ER25685

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xiaosong Ma

    2009-05-07

    In this report, the PI summarizes the results and achievements obtained in the sponsored project. Overall, the project has been very successful, producing both research results in massive data-intensive computing and data management for today's large-scale supercomputers, and open-source software products. During the project period, 14 conference/journal publications, as well as two PhD students, have been produced with exclusive or shared support from this award. In addition, the PI has recently been granted tenure from NC State University.

  15. Computation Directorate 2008 Annual Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Crawford, D L

    2009-03-25

    Whether a computer is simulating the aging and performance of a nuclear weapon, the folding of a protein, or the probability of rainfall over a particular mountain range, the necessary calculations can be enormous. Our computers help researchers answer these and other complex problems, and each new generation of system hardware and software widens the realm of possibilities. Building on Livermore's historical excellence and leadership in high-performance computing, Computation added more than 331 trillion floating-point operations per second (teraFLOPS) of power to LLNL's computer room floors in 2008. In addition, Livermore's next big supercomputer, Sequoia, advanced ever closer to its 2011-2012 delivery date, as architecture plans and the procurement contract were finalized. Hyperion, an advanced technology cluster test bed that teams Livermore with 10 industry leaders, made a big splash when it was announced during Michael Dell's keynote speech at the 2008 Supercomputing Conference. The Wall Street Journal touted Hyperion as a 'bright spot amid turmoil' in the computer industry. Computation continues to measure and improve the costs of operating LLNL's high-performance computing systems by moving hardware support in-house, by measuring causes of outages to apply resources asymmetrically, and by automating most of the account and access authorization and management processes. These improvements enable more dollars to go toward fielding the best supercomputers for science, while operating them at less cost and greater responsiveness to the customers.

  16. Early Experiences Writing Performance Portable OpenMP 4 Codes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Joubert, Wayne; Hernandez, Oscar R

    In this paper, we evaluate the recently available directives in OpenMP 4 to parallelize a computational kernel using both the traditional shared memory approach and the newer accelerator targeting capabilities. In addition, we explore various transformations that attempt to increase application performance portability, and examine the expressiveness and performance implications of using these approaches. For example, we want to understand if the target map directives in OpenMP 4 improve data locality when mapped to a shared memory system, as opposed to the traditional first touch policy approach in traditional OpenMP. To that end, we use recent Cray and Intel compilers to measure the performance variations of a simple application kernel when executed on the OLCF's Titan supercomputer with NVIDIA GPUs and the Beacon system with Intel Xeon Phi accelerators attached. To better understand these trade-offs, we compare our results from traditional OpenMP shared memory implementations to the newer accelerator programming model when it is used to target both the CPU and an attached heterogeneous device. We believe the results and lessons learned as presented in this paper will be useful to the larger user community by providing guidelines that can assist programmers in the development of performance portable code.

  17. Microgravity: Molecular Dynamics Simulations at the NCCS Probe the Behavior of Liquids in Low Gravity

    NASA Technical Reports Server (NTRS)

    2002-01-01

    The life of the very small, whether in something as complicated as a human cell or as simple as a drop of water, is of fundamental scientific interest: by knowing how a tiny amount of material reacts to changes in its environment, scientists may be able to answer questions about how a bulk of material would react to comparable changes. NASA is in the forefront of computational research into a broad range of basic scientific questions about fluid dynamics and the nature of liquid boundary instability. For example, one important issue for the space program is how drops of water and other materials will behave in the low-gravity environment of space and how the low gravity will affect the transport and containment of these materials. Accurate prediction of this behavior is among the aims of a set of molecular dynamics experiments carried out on the NCCS's Cray supercomputers. In conventional computational studies of materials, matter is treated as continuous - a macroscopic whole without regard to its molecular parts - and the behavior patterns of the matter in various physical environments are studied using well-established differential equations and mathematical parameters based on physical properties such as compressibility, density, heat capacity, and vapor pressure of the bulk material.

  18. DNA dynamics in aqueous solution: opening the double helix

    NASA Technical Reports Server (NTRS)

    Pohorille, A.; Ross, W. S.; Tinoco, I. Jr; MacElroy, R. D. (Principal Investigator)

    1990-01-01

    The opening of a DNA base pair is a simple reaction that is a prerequisite for replication, transcription, and other vital biological functions. Understanding the molecular mechanisms of biological reactions is crucial for predicting and, ultimately, controlling them. Realistic computer simulations of the reactions can provide the needed understanding. To model even the simplest reaction in aqueous solution requires hundreds of hours of supercomputing time. We have used molecular dynamics techniques to simulate fraying of the ends of a six base pair double strand of DNA, [TCGCGA]2, where the four bases of DNA are denoted by T (thymine), C (cytosine), G (guanine), and A (adenine), and to estimate the free energy barrier to this process. The calculations, in which the DNA was surrounded by 2,594 water molecules, required 50 hours of CRAY-2 CPU time for every simulated 100 picoseconds. A free energy barrier to fraying, which is mainly characterized by the movement of adenine away from thymine into aqueous environment, was estimated to be 4 kcal/mol. Another fraying pathway, which leads to stacking between terminal adenine and thymine, was also observed. These detailed pictures of the motions and energetics of DNA base pair opening in water are a first step toward understanding how DNA will interact with any molecule.
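
    For readers unfamiliar with how a free energy profile relates to sampled configurations, the relation F(x) = -kB T ln P(x) can be sketched in a few lines; this is only a generic Boltzmann-inversion illustration with assumed variable names and synthetic data, not the constrained-dynamics analysis used in the study.

        import numpy as np

        KB_KCAL = 0.0019872041          # Boltzmann constant in kcal/(mol K)

        def free_energy_profile(samples, bins=50, temperature=300.0):
            # Estimate a free-energy profile along a reaction coordinate (for
            # example, a base-separation distance during fraying) from sampled
            # values, using F(x) = -kB*T*ln P(x), shifted so the minimum is zero.
            counts, edges = np.histogram(samples, bins=bins, density=True)
            centers = 0.5 * (edges[:-1] + edges[1:])
            mask = counts > 0
            F = -KB_KCAL * temperature * np.log(counts[mask])
            return centers[mask], F - F.min()

        # Example with synthetic samples drawn around a bound state.
        x = np.random.default_rng(0).normal(loc=3.0, scale=0.3, size=100_000)
        coord, F = free_energy_profile(x)
        print(F.max())      # apparent barrier height within the sampled range, kcal/mol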

  19. GPU-Accelerated Large-Scale Electronic Structure Theory on Titan with a First-Principles All-Electron Code

    NASA Astrophysics Data System (ADS)

    Huhn, William Paul; Lange, Björn; Yu, Victor; Blum, Volker; Lee, Seyong; Yoon, Mina

    Density-functional theory has been well established as the dominant quantum-mechanical computational method in the materials community. Large accurate simulations become very challenging on small to mid-scale computers and require high-performance compute platforms to succeed. GPU acceleration is one promising approach. In this talk, we present a first implementation of all-electron density-functional theory in the FHI-aims code for massively parallel GPU-based platforms. Special attention is paid to the update of the density and to the integration of the Hamiltonian and overlap matrices, realized in a domain decomposition scheme on non-uniform grids. The initial implementation scales well across nodes on ORNL's Titan Cray XK7 supercomputer (8 to 64 nodes, 16 MPI ranks/node) and shows an overall runtime speedup of 1.4x due to utilization of the K20X Tesla GPUs on each Titan node, with the charge density update showing a speedup of 2x. Further acceleration opportunities will be discussed. Work supported by the LDRD Program of ORNL managed by UT-Battelle, LLC, for the U.S. DOE and by the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725.

  20. A practical approach to portability and performance problems on massively parallel supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Beazley, D.M.; Lomdahl, P.S.

    1994-12-08

    We present an overview of the tactics we have used to achieve a high level of performance while improving portability for a large-scale molecular dynamics code, SPaSM. SPaSM was originally implemented in ANSI C with message passing for the Connection Machine 5 (CM-5). In 1993, SPaSM was selected as one of the winners in the IEEE Gordon Bell Prize competition for sustaining 50 Gflops on the 1024-node CM-5 at Los Alamos National Laboratory. Achieving this performance on the CM-5 required rewriting critical sections of code in CDPEAC assembler language. In addition, the code made extensive use of CM-5 parallel I/O and the CMMD message passing library. Given this highly specialized implementation, we describe how we have ported the code to the Cray T3D and high performance workstations. In addition, we describe how it has been possible to do this using a single version of source code that runs on all three platforms without sacrificing any performance. Sound too good to be true? We hope to demonstrate that one can realize both code performance and portability without relying on the latest and greatest prepackaged tool or parallelizing compiler.

  1. A compositional reservoir simulator on distributed memory parallel computers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rame, M.; Delshad, M.

    1995-12-31

    This paper presents the application of distributed memory parallel computers to field scale reservoir simulations using a parallel version of UTCHEM, The University of Texas Chemical Flooding Simulator. The model is a general purpose, highly vectorized chemical compositional simulator that can simulate a wide range of displacement processes at both field and laboratory scales. The original simulator was modified to run on both distributed memory parallel machines (Intel iPSC/860 and Delta, Connection Machine 5, Kendall Square 1 and 2, and CRAY T3D) and a cluster of workstations. A domain decomposition approach has been taken towards parallelization of the code. A portion of the discrete reservoir model is assigned to each processor by a set-up routine that attempts a data layout as even as possible from the load-balance standpoint. Each of these subdomains is extended so that data can be shared between adjacent processors for stencil computation. The added routines that make parallel execution possible are written in a modular fashion that makes the porting to new parallel platforms straightforward. Results of the distributed memory computing performance of the parallel simulator are presented for field scale applications such as tracer flood and polymer flood. A comparison of the wall-clock times for the same problems on a vector supercomputer is also presented.
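
    The ghost-cell exchange implied by the "extended subdomain" description above looks schematically like the following mpi4py sketch; the 1-D decomposition, array sizes, and stencil are illustrative assumptions, not UTCHEM's actual data layout or message-passing code.

        import numpy as np
        from mpi4py import MPI

        # 1-D domain decomposition with one layer of ghost cells for a stencil
        # update.  Run with, e.g., `mpiexec -n 4 python halo.py` (hypothetical name).
        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        n_global = 1000
        n_local = n_global // size + (1 if rank < n_global % size else 0)
        u = np.zeros(n_local + 2)                      # interior cells plus two ghost cells
        u[1:-1] = rank                                 # dummy initial data

        left = rank - 1 if rank > 0 else MPI.PROC_NULL
        right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

        # Exchange ghost cells with neighbours (send my boundary, receive theirs).
        comm.Sendrecv(u[1:2], dest=left, recvbuf=u[-1:], source=right)
        comm.Sendrecv(u[-2:-1], dest=right, recvbuf=u[0:1], source=left)

        # A single explicit stencil update on the interior cells.
        u[1:-1] = 0.5 * (u[0:-2] + u[2:])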

  2. A 1998 Workshop on Heterogeneous Computing

    DTIC Science & Technology

    1998-09-18

    Record excerpt (contributor biography fragments): ... of Sussex, England, in 1994. From 1988 to 1990 he was a Lecturer with the UNAM. In 1994, he joined the Laboratorio Nacional de Informatica Avanzada ... 1984) and at the UNAM (1988-1991). Since 1992, he has been a titular researcher and consultant at the Laboratorio Nacional de Informatica Avanzada (LANIA). ...

  3. Ada Compiler Validation Summary Report: Certificate Number: 901112W1. 11116 Cray Research, Inc., Cray Ada Compiler, Release 2.0, Cray X-MP/EA (Host & Target)

    DTIC Science & Technology

    1990-11-12

    This feature prevents any significant unexpected and undesired size overhead introduced by the automatic inlining of a called subprogram. Any ... PRESERVELAYOUT forces the 5.5.1 compiler to maintain the Ada source order of a given record type, thereby preventing the compiler from performing this ... Environment, Volume 2: Programming Guide ... assignments to the copied array in Ada do not affect the Fortran version of the array. The dimensions and order of

  4. Y-MP floating point and Cholesky factorization

    NASA Technical Reports Server (NTRS)

    Carter, Russell

    1991-01-01

    The floating point arithmetics implemented in the Cray 2 and Cray Y-MP computer systems are nearly identical, but large scale computations performed on the two systems have exhibited significant differences in accuracy. The difference in accuracy is analyzed for the Cholesky factorization algorithm, and it is found that the source of the difference is the subtract magnitude operation of the Cray Y-MP. Results from numerical experiments for a range of problem sizes are presented, and an efficient method for improving the accuracy of the factorization obtained on the Y-MP is presented.
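
    The kind of accuracy comparison described above amounts to measuring the backward error of the factorization; the sketch below illustrates this with NumPy, using single versus double precision as a stand-in for the two machine arithmetics (the matrix size and construction are arbitrary choices for the example).

        import numpy as np

        def cholesky_accuracy(n=500, dtype=np.float64, seed=0):
            # Backward error of a Cholesky factorization, ||A - L L^T|| / ||A||,
            # which is the kind of quantity one compares when the same algorithm
            # is run under two different floating-point arithmetics.
            rng = np.random.default_rng(seed)
            M = rng.standard_normal((n, n)).astype(dtype)
            A = M @ M.T + n * np.eye(n, dtype=dtype)     # symmetric positive definite
            L = np.linalg.cholesky(A)
            return np.linalg.norm(A - L @ L.T) / np.linalg.norm(A)

        # Double versus single precision plays the role of Cray-2 versus Y-MP arithmetic.
        print(cholesky_accuracy(dtype=np.float64), cholesky_accuracy(dtype=np.float32))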

  5. Preface

    NASA Astrophysics Data System (ADS)

    Mallamace, Francesco; Quintana, Jacqueline

    2002-03-01

    Complex systems represent one of the richest and most fascinating fields of current scientific research. The reason behind this is the important role that the properties of complex systems and materials play in a variety of different but overlapping areas in physics, chemistry, biology, mathematics, and social sciences, such as medicine and economics. Such an unusually broad research field is, therefore, of primary interest nowadays in pure science and technology. The role of statistical physics in this new field of complex systems has been present since its onset, and it has been accelerating recently. Methods developed for studying ordering phenomena in simple systems have been generalized for application to more complex forms of matter (polymers, biological macromolecules, glasses, etc) and complex processes (e.g. chaos, turbulence, economics, jamming, biological processes). In particular, many different phenomena (considered in the past to belong to separate research fields) now have a common description. Pillars of such a description are the concepts of scaling and universality. The International Conference on 'Scaling Concepts and Complex Systems' (a satellite meeting of STATPHYS21) was devoted to giving an overview of recent developments around these two concepts. The Conference took place in Merida, Yucatan, Mexico, on July 9-14, 2001. The meeting was held in the Gordon Conference style and was attended by about 100 scientists; it covered a large variety of theoretical and experimental research topics of current interest in complex systems and materials. The meeting consisted of about 40 invited and contributed talks and a poster session. The topics covered included: scaling behaviour, supra-molecular systems, aggregation, aggregation kinetics, growth mechanisms, disordered systems, soft condensed matter (polymers, biological polymers, bio-colloids, gels, colloids, membranes and interfacial phenomena), granular matter, phase separation and out-of-equilibrium dynamics, non-linear dynamics, chaos, turbulence and chaotic dynamics. The present issue contains a substantial number of the invited and contributed talks presented at the meeting. We made an effort to arrange these papers in an order similar to that of their presentation during the meeting. It is our pleasure to thank the scientific committee, all the speakers, the session chairs and all participants who contributed to the success of the conference. We are grateful to the Bonino-Pulejo Foundation (Messina, Italy), and to its President, On. Nino Calarco, for the patronage and the enthusiastic support. Our thanks go also to the University of Messina, the INFM (Istituto Nazionale per la Fisica della Materia, Italy), the Consejo Nacional de Ciencia y Tecnología (CONACyT, Mexico) and the Universidad Nacional Autonoma de Mexico (UNAM). The Conference was sponsored by the INFM-Sec.C, CONACyT, UNAM, and the Bonino-Pulejo Foundation, which contributed financial support to participants and to the publication of the present issue. We are grateful to them for their support. Last, but not least, we express our warmest gratitude to all the members of the local organizing committee for their assistance and for the work spent in organizing this meeting, and especially to Professor Alberto Robledo for his valuable advice.

  6. Interconnect Performance Evaluation of SGI Altix 3700 BX2, Cray X1, Cray Opteron Cluster, and Dell PowerEdge

    NASA Technical Reports Server (NTRS)

    Fatoohi, Rod; Saini, Subbash; Ciotti, Robert

    2006-01-01

    We study the performance of inter-process communication on four high-speed multiprocessor systems using a set of communication benchmarks. The goal is to identify certain limiting factors and bottlenecks with the interconnects of these systems as well as to compare these interconnects. We measured network bandwidth using different numbers of communicating processors and communication patterns, such as point-to-point communication, collective communication, and dense communication patterns. The four platforms are: a 512-processor SGI Altix 3700 BX2 shared-memory machine with 3.2 GB/s links; a 64-processor (single-streaming) Cray X1 shared-memory machine with 32 1.6 GB/s links; a 128-processor Cray Opteron cluster using a Myrinet network; and a 1280-node Dell PowerEdge cluster with an InfiniBand network. Our results show the impact of the network bandwidth and topology on the overall performance of each interconnect.
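
    The simplest of these measurements, point-to-point bandwidth, is often taken with a ping-pong loop like the mpi4py sketch below; the message size and repetition count are arbitrary assumptions, and this is not the benchmark suite used in the study.

        import numpy as np
        from mpi4py import MPI

        # Ping-pong bandwidth between ranks 0 and 1.  Run with `mpiexec -n 2 ...`.
        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()
        nbytes = 8 * 1024 * 1024
        buf = np.zeros(nbytes, dtype=np.uint8)
        reps = 50

        comm.Barrier()
        t0 = MPI.Wtime()
        for _ in range(reps):
            if rank == 0:
                comm.Send(buf, dest=1)
                comm.Recv(buf, source=1)
            elif rank == 1:
                comm.Recv(buf, source=0)
                comm.Send(buf, dest=0)
        elapsed = MPI.Wtime() - t0

        if rank == 0:
            # Each repetition moves the message twice (there and back).
            gb = 2 * reps * nbytes / elapsed / 1e9
            print(f"ping-pong bandwidth: {gb:.2f} GB/s")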

  7. La Seccion de Investigacion sobre Educacion Medica de la Facultad de Medicina de la Universidad National Autonoma de Mexico (The Medical Education Investigation Section of the School of Medicine of UNAM).

    ERIC Educational Resources Information Center

    Campillo Sainz, Carlos; Alvarez, Tostado, Juan

    This document is an English-language abstract (approximately 1,500 words) of a survey of Mexican medical education needs for the future. To plan for these needs, SISEM of UNAM was formed with the objectives of carrying out and promoting research in the different areas of medical education. It also aims to distribute the information…

  8. A Neural Network Based Workstation for Automated Cell Proliferation Analysis

    DTIC Science & Technology

    2001-10-25

    Record excerpt (acknowledgment and affiliation fragments): ... work was supported by the Programa de Apoyo a Proyectos de Desarrollo e Investigación en Informática REDII 2000. We thank Blanca Itzel Taboada for ... Meléndez (Centro de Instrumentos, UNAM, P.O. Box 70-186, México 04510, D.F.), G. Corkidi (Instituto de Biotecnología, UNAM, P.O. Box 510-3, 62250 ...) ... proliferation analysis of cytological microscope images. The software of the system assists the expert biotechnologist during cell proliferation and

  9. Multitasking a three-dimensional Navier-Stokes algorithm on the Cray-2

    NASA Technical Reports Server (NTRS)

    Swisshelm, Julie M.

    1989-01-01

    A three-dimensional computational aerodynamics algorithm has been multitasked for efficient parallel execution on the Cray-2. It provides a means for examining the multitasking performance of a complete CFD application code. An embedded zonal multigrid scheme is used to solve the Reynolds-averaged Navier-Stokes equations for an internal flow model problem. The explicit nature of each component of the method allows a spatial partitioning of the computational domain to achieve a well-balanced task load for MIMD computers with vector-processing capability. Experiments have been conducted with both two- and three-dimensional multitasked cases. The best speedup attained by an individual task group was 3.54 on four processors of the Cray-2, while the entire solver yielded a speedup of 2.67 on four processors for the three-dimensional case. The multiprocessing efficiency of various types of computational tasks is examined, performance on two Cray-2s with different memory access speeds is compared, and extrapolation to larger problems is discussed.

  10. Attaching IBM-compatible 3380 disks to Cray X-MP

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Engert, D.E.; Midlock, J.L.

    1989-01-01

    A method of attaching IBM-compatible 3380 disks directly to a Cray X-MP via the XIOP with a BMC is described. The IBM 3380 disks appear to the UNICOS operating system as DD-29 disks with UNICOS file systems. IBM 3380 disks provide cheap, reliable, large capacity disk storage. Combined with a small number of high-speed Cray disks, the IBM disks provide for the bulk of the storage for small files and infrequently used files. Cray Research designed the BMC and its supporting software in the XIOP to allow IBM tapes and other devices to be attached to the X-MP. No hardware changes were necessary, and we added less than 2000 lines of code to the XIOP to accomplish this project. This system has been in operation for over eight months. Future enhancements such as the use of a cache controller and attachment to a Y-MP are also described.

  11. PREFACE: The IX Mexican Workshop on Particles and Fields

    NASA Astrophysics Data System (ADS)

    Amore, Paolo; Aranda, Alfredo; Bashir, Adnan; Mondragón, Myriam; Raya, Alfredo

    2006-05-01

    The IX Mexican Workshop on Particles and Fields was held in the beautiful city of Colima, in the South-West of Mexico, from 17-22 November 2003. The proceedings of the Workshop were delayed due to problems with a previous publisher; we are very grateful that Journal of Physics: Conference Series kindly agreed to publish the proceedings rapidly at this late stage. The Workshop aimed to cover, through invited lectures delivered by internationally known experts, the most recent developments in the field. There was also a series of short seminars as well as a poster session, which allowed the whole community to participate with their most recent research results. A special session was dedicated to awarding the Division Medal to Professor Benjamin Grinstein, from the University of California, San Diego, for his outstanding contributions to the field. This volume contains the written version of the material presented at the Workshop. The Workshop was attended by more than 100 participants, including faculty members, postdocs and graduate students. It was organized by the Particles and Fields Division of the Mexican Physical Society, and generously sponsored by several institutions: Universidad de Colima, Universidad Nacional Autónoma de México, Universidad Michoacana de San Nicolás Hidalgo, Centro de Investigaciones y Estudios Avanzados del IPN (CINVESTAV), and Consejo Nacional de Ciencia y Tecnología (Conacyt). The Local Organizing Committee comprised Paolo Amore, Alfredo Aranda, Carlos Moisés Hernández Suárez (Director of the Physics Faculty), Arturo González Larios, Enrique Farías Martínez, and Myriam Cruz Calvario, all from the University of Colima. The members of the National Organizing Committee were Adnan Bashir (IFM-UMSHN), Jens Erler (IF-UNAM), Heriberto Castilla Valdés (CINVESTAV-U.Zacatenco), Gabriel López Castro (CINVESTAV-U.Zacatenco), Myriam Mondragón (IF-UNAM) and Luis Villaseñor (IFM-UMSHN). We gratefully acknowledge the help given by each one of them to the organization of the Workshop. Many thanks also to our Conference Secretaries Patricia Carranza and Soledad López for their efficiency and readiness to help, and to the many students of the University of Colima who helped us with innumerable tasks. Last but not least, we thank most warmly all the speakers for their excellent lectures.

  12. User's Guide for TOUGH2-MP - A Massively Parallel Version of the TOUGH2 Code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Earth Sciences Division; Zhang, Keni; Zhang, Keni

    TOUGH2-MP is a massively parallel (MP) version of the TOUGH2 code, designed for computationally efficient parallel simulation of isothermal and nonisothermal flows of multicomponent, multiphase fluids in one, two, and three-dimensional porous and fractured media. In recent years, computational requirements have become increasingly intensive in large or highly nonlinear problems for applications in areas such as radioactive waste disposal, CO2 geological sequestration, environmental assessment and remediation, reservoir engineering, and groundwater hydrology. The primary objective of developing the parallel-simulation capability is to significantly improve the computational performance of the TOUGH2 family of codes. The particular goal for the parallel simulator is to achieve orders-of-magnitude improvement in computational time for models with ever-increasing complexity. TOUGH2-MP is designed to perform parallel simulation on multi-CPU computational platforms. An earlier version of TOUGH2-MP (V1.0) was based on TOUGH2 Version 1.4 with EOS3, EOS9, and T2R3D modules, a software previously qualified for applications in the Yucca Mountain project, and was designed for execution on CRAY T3E and IBM SP supercomputers. The current version of TOUGH2-MP (V2.0) includes all fluid property modules of the standard version TOUGH2 V2.0. It provides computationally efficient capabilities using supercomputers, Linux clusters, or multi-core PCs, and also offers many user-friendly features. The parallel simulator inherits all process capabilities from V2.0 together with additional capabilities for handling fractured media from V1.4. This report provides a quick-start guide on how to set up and run the TOUGH2-MP program for users with a basic knowledge of running the (standard) version of the TOUGH2 code. The report also gives a brief technical description of the code, including a discussion of parallel methodology, code structure, as well as mathematical and numerical methods used. To familiarize users with the parallel code, illustrative sample problems are presented.

  13. Global Adjoint Tomography: Next-Generation Models

    NASA Astrophysics Data System (ADS)

    Bozdag, Ebru; Lefebvre, Matthieu; Lei, Wenjie; Orsvuran, Ridvan; Peter, Daniel; Ruan, Youyi; Smith, James; Komatitsch, Dimitri; Tromp, Jeroen

    2017-04-01

    The first-generation global adjoint tomography model GLAD-M15 (Bozdag et al. 2016) is the result of 15 conjugate-gradient iterations based on GPU-accelerated spectral-element simulations of 3D wave propagation and Fréchet kernels. For simplicity, GLAD-M15 was constructed as an elastic model with transverse isotropy confined to the upper mantle. However, Earth's mantle and crust show significant evidence of anisotropy as a result of their composition and deformation. There may be different sources of seismic anisotropy affecting both body and surface waves. As a first attempt, we initially tackle surface-wave anisotropy and proceed with iterations using the same 253-earthquake data set used in GLAD-M15, with an emphasis on the upper mantle. Furthermore, we explore new misfits, such as double-difference measurements (Yuan et al. 2016), to better deal with the possible artifacts of the uneven global distribution of seismic stations and to minimize source uncertainties in structural inversions. We will present our observations with the initial results of azimuthally anisotropic inversions and also discuss the next-generation global models with various parametrizations. Meanwhile, our goal is to use all available seismic data in imaging. This, however, requires a solid framework to perform iterative adjoint tomography workflows with big data on supercomputers. We will talk about developments in the adjoint tomography workflow, from the need to define new seismic and computational data formats (e.g., ASDF by Krischer et al. 2016, ADIOS by Liu et al. 2011) to developing new pre- and post-processing tools, together with experimenting with workflow management tools such as Pegasus (Deelman et al. 2015). All our simulations are performed on Oak Ridge National Laboratory's Cray XK7 "Titan" system. Our ultimate aim is to get ready to harness ORNL's next-generation supercomputer "Summit", an IBM system with Power9 CPUs and NVIDIA Volta GPU accelerators to be ready by 2018, which will enable us to reduce the shortest period in our global simulations from 17 s to 9 s; exascale systems will reduce this further to just a few seconds.
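
    For readers unfamiliar with double-difference measurements, the sketch below shows the basic quantity being minimized: the mismatch between observed and synthetic differential traveltimes for pairs of stations. The cross-correlation lag estimator and all names here are illustrative assumptions following the general idea of Yuan et al. (2016), not the authors' measurement code.

        import numpy as np

        def double_difference_misfit(obs, syn, pairs, dt):
            # obs, syn: arrays of shape (num_stations, num_samples); dt: sample interval.
            # For each station pair (i, j), compare the differential delay measured by
            # cross-correlating the two observed traces with that of the two synthetics.
            def lag(a, b):
                xcorr = np.correlate(a, b, mode="full")
                return (np.argmax(xcorr) - (len(b) - 1)) * dt
            misfit = 0.0
            for i, j in pairs:
                dd_obs = lag(obs[i], obs[j])
                dd_syn = lag(syn[i], syn[j])
                misfit += 0.5 * (dd_syn - dd_obs) ** 2
            return misfit

        # Example with two stations and a wavelet shifted by different amounts.
        t = np.arange(0, 10, 0.01)
        wavelet = lambda shift: np.exp(-((t - 5 - shift) ** 2) / 0.1)
        obs = np.vstack([wavelet(0.0), wavelet(0.4)])
        syn = np.vstack([wavelet(0.1), wavelet(0.3)])
        print(double_difference_misfit(obs, syn, pairs=[(0, 1)], dt=0.01))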

  14. Reduction and analysis of VLA maps for 281 radio-loud quasars using the UNLV Cray Y-MP supercomputer

    NASA Technical Reports Server (NTRS)

    Ding, Ailian; Hintzen, Paul; Weistrop, Donna; Owen, Frazer

    1993-01-01

    The identification of distorted radio-loud quasars provides a potentially very powerful tool for basic cosmological studies. If large morphological distortions are correlated with membership of the quasars in rich clusters of galaxies, optical observations can be used to identify rich clusters of galaxies at large redshifts. Hintzen, Ulvestad, and Owen (1983, HUO) undertook a VLA A array snapshot survey at 20 cm of 123 radio-loud quasars, and they found that among triple sources in their sample, 17 percent had radio axes which were bent more than 20 deg and 5 percent were bent more than 40 deg. Their subsequent optical observations showed that excess galaxy densities within 30 arcsec of 6 low-redshift distorted quasars were on average 3 times as great as those around undistorted quasars (Hintzen 1984). At least one of the distorted quasars observed, 3C275.1, apparently lies in the first-ranked galaxy at the center of a rich cluster of galaxies (Hintzen and Romanishin, 1986). Although their sample was small, these results indicated that observations of distorted quasars could be used to identify clusters of galaxies at large redshifts. The purpose of this project is to increase the available sample of distorted quasars to allow optical detection of a significant sample of quasar-associated clusters of galaxies at large redshifts.

  15. Massively parallel and linear-scaling algorithm for second-order Møller-Plesset perturbation theory applied to the study of supramolecular wires

    NASA Astrophysics Data System (ADS)

    Kjærgaard, Thomas; Baudin, Pablo; Bykov, Dmytro; Eriksen, Janus Juul; Ettenhuber, Patrick; Kristensen, Kasper; Larkin, Jeff; Liakh, Dmitry; Pawłowski, Filip; Vose, Aaron; Wang, Yang Min; Jørgensen, Poul

    2017-03-01

    We present a scalable cross-platform hybrid MPI/OpenMP/OpenACC implementation of the Divide-Expand-Consolidate (DEC) formalism with portable performance on heterogeneous HPC architectures. The Divide-Expand-Consolidate formalism is designed to reduce the steep computational scaling of conventional many-body methods employed in electronic structure theory to linear scaling, while providing a simple mechanism for controlling the error introduced by this approximation. Our massively parallel implementation of this general scheme has three levels of parallelism, being a hybrid of the loosely coupled task-based parallelization approach and the conventional MPI +X programming model, where X is either OpenMP or OpenACC. We demonstrate strong and weak scalability of this implementation on heterogeneous HPC systems, namely on the GPU-based Cray XK7 Titan supercomputer at the Oak Ridge National Laboratory. Using the "resolution of the identity second-order Møller-Plesset perturbation theory" (RI-MP2) as the physical model for simulating correlated electron motion, the linear-scaling DEC implementation is applied to 1-aza-adamantane-trione (AAT) supramolecular wires containing up to 40 monomers (2440 atoms, 6800 correlated electrons, 24 440 basis functions and 91 280 auxiliary functions). This represents the largest molecular system treated at the MP2 level of theory, demonstrating an efficient removal of the scaling wall pertinent to conventional quantum many-body methods.

  16. Enhancing dendritic cell immunotherapy for melanoma using a simple mathematical model.

    PubMed

    Castillo-Montiel, E; Chimal-Eguía, J C; Tello, J Ignacio; Piñon-Zaráte, G; Herrera-Enríquez, M; Castell-Rodríguez, A E

    2015-06-09

    Immunotherapy using dendritic cells (DCs) against different varieties of cancer is a previously explored approach that induces a specific immune response. This work presents a mathematical model of DC immunotherapy for melanoma in mice based on work by the Experimental Immunotherapy Laboratory of the Faculty of Medicine at the Universidad Nacional Autónoma de México (UNAM). The model is a system of five delay differential equations (DDEs) which represents a simplified view of the immunotherapy mechanisms. The mathematical model takes into account the interactions between tumor cells, dendritic cells, naive cytotoxic T lymphocytes (inactivated cytotoxic cells), effector cells (activated cytotoxic T cells) and the transforming growth factor β cytokine (TGF-β). The model is validated by comparing the computer simulation results with biological trial results of the immunotherapy developed by the research group at UNAM. The results of the growth of tumor cells obtained by the control immunotherapy simulation show a tumor cell population similar to the biological data of the control immunotherapy. Moreover, comparing the increase of tumor cells obtained from the immunotherapy simulation with the biological data of the immunotherapy applied by the UNAM researchers gives errors of approximately 10%. This allowed us to use the model as a framework to test hypothetical treatments. The numerical simulations suggest that by using more doses of DCs and changing the infusion time, the tumor growth decays compared with the current immunotherapy. In addition, a local sensitivity analysis is performed; the results show that the time delay "τ", the maximal tumor growth rate "r" and the maximal efficiency rate of cytotoxic cells against the tumor "aT" are the most sensitive model parameters. By using this mathematical model it is possible to simulate the growth of the tumor cells with or without immunotherapy using the infusion protocol of the UNAM researchers, and to obtain a good approximation of the biological trial data. It is worth mentioning that by manipulating the different parameters of the model the effectiveness of the immunotherapy may increase. This suggests that different protocols could be implemented by the Immunotherapy Laboratory of UNAM in order to improve their results.
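
    To make the delay-differential structure concrete, the sketch below integrates a deliberately reduced two-equation analogue (tumor cells plus effector cells, one delay, discrete DC infusions) with a fixed-step Euler scheme and a history array; the equations, parameter values, and dose sizes are invented for illustration and are not the five-equation UNAM model.

        import numpy as np

        def delayed_tumor_model(days=60.0, h=0.01, tau=2.0,
                                r=0.3, K=1e9, a=1e-7, dose=5e6, dose_times=()):
            # Tumor cells T grow logistically and are killed at a rate proportional
            # to the effector cells E present tau days earlier; each DC infusion
            # adds a bolus of effector cells.  Fixed-step Euler with a history array.
            steps = int(round(days / h))
            lag = int(round(tau / h))
            T = np.empty(steps + 1); E = np.empty(steps + 1)
            T[0], E[0] = 1e6, 1e4                      # initial tumor and effector cells
            dose_steps = {int(round(t_d / h)) for t_d in dose_times}
            for n in range(steps):
                E_delayed = E[max(n - lag, 0)]         # constant history before t = 0
                dT = r * T[n] * (1.0 - T[n] / K) - a * E_delayed * T[n]
                dE = 0.1 * E[n] * T[n] / (1e7 + T[n]) - 0.05 * E[n]
                T[n + 1] = max(T[n] + h * dT, 0.0)
                E[n + 1] = E[n] + h * dE + (dose if n in dose_steps else 0.0)
            return T, E

        # Tumor burden at day 60 without therapy and with three DC infusions.
        T_ctrl, _ = delayed_tumor_model()
        T_dc, _ = delayed_tumor_model(dose_times=(5.0, 10.0, 15.0))
        print(T_ctrl[-1], T_dc[-1])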

  17. LBNL Computational ResearchTheory Facility Groundbreaking - Full Press Conference. Feb 1st, 2012

    ScienceCinema

    Yelick, Kathy

    2018-01-24

    Energy Secretary Steven Chu, along with Berkeley Lab and UC leaders, broke ground on the Lab's Computational Research and Theory (CRT) facility yesterday. The CRT will be at the forefront of high-performance supercomputing research and be DOE's most efficient facility of its kind. Joining Secretary Chu as speakers were Lab Director Paul Alivisatos, UC President Mark Yudof, Office of Science Director Bill Brinkman, and UC Berkeley Chancellor Robert Birgeneau. The festivities were emceed by Associate Lab Director for Computing Sciences, Kathy Yelick, and Berkeley Mayor Tom Bates joined in the shovel ceremony.

  18. Mexican Space Weather Service (SCIESMEX)

    NASA Astrophysics Data System (ADS)

    Gonzalez-Esparza, A.; De la Luz, V.; Mejia-Ambriz, J. C.; Aguilar-Rodriguez, E.; Corona-Romero, P.; Gonzalez, L. X.

    2015-12-01

    Recent modifications of the Civil Protection Law in Mexico now include specific mentions of space hazards and space weather phenomena. During the last few years, the UN has promoted international cooperation on space weather awareness, studies and monitoring. Internal and external conditions motivated the creation of a Space Weather Service in Mexico (SCIESMEX). The SCIESMEX (www.sciesmex.unam.mx) is operated by the Geophysics Institute at the National Autonomous University of Mexico (UNAM). UNAM has experience operating several critical national services, including the National Seismological Service (SSN), and hosts a well-established scientific group with expertise in space physics and solar-terrestrial phenomena. The SCIESMEX is also related to the recent creation of the Mexican Space Agency (AEM). The project combines a network of different ground instruments covering solar, interplanetary, geomagnetic, and ionospheric observations. The SCIESMEX already has in operation the computing infrastructure running the web application, a virtual observatory and a high-performance computing server to run numerical models. SCIESMEX participates in the International Space Environment Services (ISES) and in the Inter-programme Coordination Team on Space Weather (ICTSW) of the World Meteorological Organization (WMO).

  19. Towards the measurement of the 13C(d, p)14C cross section using AMS

    NASA Astrophysics Data System (ADS)

    Murillo-Morales, S.; Barrón-Palos, L.; Chávez, E.; Araujo-Escalona, V.

    2017-07-01

    A plan to study the total cross section of the 13C(d, p)14C nuclear reaction has been developed for energies in the center-of-mass frame between 133 and 400 keV. The proposed experiment will use a deuterium beam (1-3 MeV of energy) from the Instituto de Física-UNAM 5.5 MV Van de Graaff accelerator, and the produced 14C will afterwards be measured by the AMS technique at LEMA-UNAM (HVEE 1 MV Tandetron). One of the main goals is to study the performance of the LEMA-UNAM facility in the cross section measurement, in comparison with other data reported in the literature measured by other techniques. In this work we present the current status of these studies. The relevance of the 13C(d, p)14C reaction to the study of compound nucleus formation as well as to some astrophysics scenarios, and the importance of developing the AMS technique to measure cross sections of nuclear reactions of astrophysical interest in Mexico, are also discussed.

  20. Dynamic overset grid communication on distributed memory parallel processors

    NASA Technical Reports Server (NTRS)

    Barszcz, Eric; Weeratunga, Sisira K.; Meakin, Robert L.

    1993-01-01

    A parallel distributed memory implementation of intergrid communication for dynamic overset grids is presented. Included are discussions of various options considered during development. Results are presented comparing an Intel iPSC/860 to a single processor Cray Y-MP. Results for grids in relative motion show the iPSC/860 implementation to be faster than the Cray implementation.

  1. Beyond the Face of Race: Emo-Cognitive Explorations of White Neurosis and Racial Cray-Cray

    ERIC Educational Resources Information Center

    Matias, Cheryl E.; DiAngelo, Robin

    2013-01-01

    In this article, the authors focus on the emotional and cognitive context that underlies whiteness. They employ interdisciplinary approaches of critical Whiteness studies and critical race theory to entertain how common White responses to racial material stem from the need for Whites to deny race, a traumatizing process that begins in childhood.…

  2. Parallel processing on the Livermore VAX 11/780-4 parallel processor system with compatibility to Cray Research, Inc. (CRI) multitasking. Version 1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Werner, N.E.; Van Matre, S.W.

    1985-05-01

    This manual describes the CRI Subroutine Library and Utility Package. The CRI library provides Cray multitasking functionality on the four-processor shared memory VAX 11/780-4. Additional functionality has been added for more flexibility. A discussion of the library, utilities, error messages, and example programs is provided.

  3. Implementing dense linear algebra algorithms using multitasking on the CRAY X-MP-4 (or approaching the gigaflop)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dongarra, J.J.; Hewitt, T.

    1985-08-01

    This note describes some experiments on simple, dense linear algebra algorithms. These experiments show that the CRAY X-MP is capable of small-grain multitasking arising from standard implementations of LU and Cholesky decomposition. The implementation described here provides the "fastest" execution rate for LU decomposition, 718 MFLOPS for a matrix of order 1000.
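
    As a rough check on the quoted rate, LU decomposition of a matrix of order n costs about 2n^3/3 floating-point operations, so the runtime implied by 718 MFLOPS can be estimated with the short Python sketch below; the classical operation count and the back-of-the-envelope nature of the estimate are the only assumptions.

        n = 1000
        flops = 2.0 * n**3 / 3.0      # classical LU operation count, about 6.7e8 flops
        rate = 718e6                  # reported sustained rate, in flops per second
        print(f"estimated time for LU of order {n}: {flops / rate:.2f} s")
        # roughly 0.93 s, i.e. just under a second at the reported rate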

  4. Early MIMD experience on the CRAY X-MP

    NASA Astrophysics Data System (ADS)

    Rhoades, Clifford E.; Stevens, K. G.

    1985-07-01

    This paper describes some early experience with converting four physics simulation programs to the CRAY X-MP, a current Multiple Instruction, Multiple Data (MIMD) computer consisting of two processors, each with an architecture similar to that of the CRAY-1. As a multi-processor, the CRAY X-MP together with the high speed Solid-state Storage Device (SSD) is an ideal machine upon which to study MIMD algorithms for solving the equations of mathematical physics, because it is fast enough to run real problems. The computer programs used in this study are all FORTRAN versions of original production codes. They range in sophistication from a one-dimensional numerical simulation of collisionless plasma, to a two-dimensional hydrodynamics code with heat flow, to a couple of three-dimensional fluid dynamics codes with varying degrees of viscous modeling. Early research with a dual processor configuration has shown speed-ups ranging from 1.55 to 1.98. It has been observed that a few simple extensions to FORTRAN allow a typical programmer to achieve a remarkable level of efficiency. These extensions involve the concept of memory local to a concurrent subprogram and memory common to all concurrent subprograms.

  5. Combining density functional theory calculations, supercomputing, and data-driven methods to design new materials (Conference Presentation)

    NASA Astrophysics Data System (ADS)

    Jain, Anubhav

    2017-04-01

    Density functional theory (DFT) simulations solve for the electronic structure of materials starting from the Schrödinger equation. Many case studies have now demonstrated that researchers can often use DFT to design new compounds in the computer (e.g., for batteries, catalysts, and hydrogen storage) before synthesis and characterization in the lab. In this talk, I will focus on how DFT calculations can be executed on large supercomputing resources in order to generate very large data sets on new materials for functional applications. First, I will briefly describe the Materials Project, an effort at LBNL that has virtually characterized over 60,000 materials using DFT and has shared the results with over 17,000 registered users. Next, I will talk about how such data can help discover new materials, describing how preliminary computational screening led to the identification and confirmation of a new family of bulk AMX2 thermoelectric compounds with measured zT reaching 0.8. I will outline future plans for how such data-driven methods can be used to better understand the factors that control thermoelectric behavior, e.g., for the rational design of electronic band structures, in ways that are different from conventional approaches.

  6. Experiences and results multitasking a hydrodynamics code on global and local memory machines

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mandell, D.

    1987-01-01

    A one-dimensional, time-dependent Lagrangian hydrodynamics code using a Godunov solution method has been multitasked for the Cray X-MP/48, the Intel iPSC hypercube, the Alliant FX series and the IBM RP3 computers. Actual multitasking results have been obtained for the Cray, Intel and Alliant computers and simulated results were obtained for the Cray and RP3 machines. The differences in the methods required to multitask on each of the machines are discussed. Results are presented for a sample problem involving a shock wave moving down a channel. Comparisons are made between theoretical speedups, predicted by Amdahl's law, and the actual speedups obtained. The problems of debugging on the different machines are also described.
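
    The theoretical speedups mentioned above follow Amdahl's law, S(N) = 1 / ((1 - p) + p/N) for parallel fraction p on N processors. The small Python sketch below evaluates it for a few hypothetical parallel fractions to show how even a modest serial remainder caps the multitasking gain; the fractions are illustrative and are not taken from the paper.

        def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
            """Amdahl's law: speedup is limited by the serial remainder of the code."""
            return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / processors)

        for p in (0.80, 0.95, 0.99):   # hypothetical parallel fractions
            print(f"parallel fraction {p:.2f}: "
                  f"x{amdahl_speedup(p, 4):.2f} on 4 CPUs, "
                  f"x{amdahl_speedup(p, 32):.2f} on 32 nodes")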

  7. ARC2D - EFFICIENT SOLUTION METHODS FOR THE NAVIER-STOKES EQUATIONS (CRAY VERSION)

    NASA Technical Reports Server (NTRS)

    Pulliam, T. H.

    1994-01-01

    ARC2D is a computational fluid dynamics program developed at the NASA Ames Research Center specifically for airfoil computations. The program uses implicit finite-difference techniques to solve two-dimensional Euler equations and thin layer Navier-Stokes equations. It is based on the Beam and Warming implicit approximate factorization algorithm in generalized coordinates. The methods are either time accurate or accelerated non-time accurate steady state schemes. The evolution of the solution through time is physically realistic; good solution accuracy is dependent on mesh spacing and boundary conditions. The mathematical development of ARC2D begins with the strong conservation law form of the two-dimensional Navier-Stokes equations in Cartesian coordinates, which admits shock capturing. The Navier-Stokes equations can be transformed from Cartesian coordinates to generalized curvilinear coordinates in a manner that permits one computational code to serve a wide variety of physical geometries and grid systems. ARC2D includes an algebraic mixing length model to approximate the effect of turbulence. In cases of high Reynolds number viscous flows, thin layer approximation can be applied. ARC2D allows for a variety of solutions to stability boundaries, such as those encountered in flows with shocks. The user has considerable flexibility in assigning geometry and developing grid patterns, as well as in assigning boundary conditions. However, the ARC2D model is most appropriate for attached and mildly separated boundary layers; no attempt is made to model wake regions and widely separated flows. The techniques have been successfully used for a variety of inviscid and viscous flowfield calculations. The Cray version of ARC2D is written in FORTRAN 77 for use on Cray series computers and requires approximately 5Mb memory. The program is fully vectorized. The tape includes variations for the COS and UNICOS operating systems. Also included is a sample routine for CONVEX computers to emulate Cray system time calls, which should be easy to modify for other machines as well. The standard distribution media for this version is a 9-track 1600 BPI ASCII Card Image format magnetic tape. The Cray version was developed in 1987. The IBM ES/3090 version is an IBM port of the Cray version. It is written in IBM VS FORTRAN and has the capability of executing in both vector and parallel modes on the MVS/XA operating system and in vector mode on the VM/XA operating system. Various options of the IBM VS FORTRAN compiler provide new features for the ES/3090 version, including 64-bit arithmetic and up to 2 GB of virtual addressability. The IBM ES/3090 version is available only as a 9-track, 1600 BPI IBM IEBCOPY format magnetic tape. The IBM ES/3090 version was developed in 1989. The DEC RISC ULTRIX version is a DEC port of the Cray version. It is written in FORTRAN 77 for RISC-based Digital Equipment platforms. The memory requirement is approximately 7Mb of main memory. It is available in UNIX tar format on TK50 tape cartridge. The port to DEC RISC ULTRIX was done in 1990. COS and UNICOS are trademarks and Cray is a registered trademark of Cray Research, Inc. IBM, ES/3090, VS FORTRAN, MVS/XA, and VM/XA are registered trademarks of International Business Machines. DEC and ULTRIX are registered trademarks of Digital Equipment Corporation.

  8. Comprehensive efficiency analysis of supercomputer resource usage based on system monitoring data

    NASA Astrophysics Data System (ADS)

    Mamaeva, A. A.; Shaykhislamov, D. I.; Voevodin, Vad V.; Zhumatiy, S. A.

    2018-03-01

    One of the main problems of modern supercomputers is the low efficiency of their usage, which leads to the significant idle time of computational resources, and, in turn, to the decrease in speed of scientific research. This paper presents three approaches to study the efficiency of supercomputer resource usage based on monitoring data analysis. The first approach performs an analysis of computing resource utilization statistics, which allows to identify different typical classes of programs, to explore the structure of the supercomputer job flow and to track overall trends in the supercomputer behavior. The second approach is aimed specifically at analyzing off-the-shelf software packages and libraries installed on the supercomputer, since efficiency of their usage is becoming an increasingly important factor for the efficient functioning of the entire supercomputer. Within the third approach, abnormal jobs – jobs with abnormally inefficient behavior that differs significantly from the standard behavior of the overall supercomputer job flow – are being detected. For each approach, the results obtained in practice in the Supercomputer Center of Moscow State University are demonstrated.
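
    The abnormal-job detection of the third approach can be pictured with a simple statistical outlier test on per-job monitoring data. The z-score rule, the threshold and the synthetic utilization numbers in the Python sketch below are illustrative assumptions only and do not represent the method actually used at the Moscow State University centre.

        import statistics

        # Hypothetical average CPU utilization (percent) reported by monitoring for recent jobs.
        job_util = {"job101": 78.0, "job102": 81.5, "job103": 74.2,
                    "job104": 3.1, "job105": 79.8, "job106": 83.0}

        mean = statistics.mean(job_util.values())
        stdev = statistics.stdev(job_util.values())

        # Flag jobs whose utilization deviates strongly from the overall job flow.
        for job, util in job_util.items():
            z = (util - mean) / stdev
            if abs(z) > 1.5:           # illustrative threshold
                print(f"{job}: utilization {util:.1f}% looks abnormal (z = {z:.2f})")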

  9. Supercomputer applications in molecular modeling.

    PubMed

    Gund, T M

    1988-01-01

    An overview of the functions performed by molecular modeling is given. Molecular modeling techniques benefiting from supercomputing are described, namely, conformation search, deriving bioactive conformations, pharmacophoric pattern searching, receptor mapping, and electrostatic properties. The use of supercomputers for problems that are computationally intensive, such as protein structure prediction, protein dynamics and reactivity, protein conformations, and energetics of binding, is also examined. The current status of supercomputing and supercomputer resources is discussed.

  10. FORTRAN multitasking library for use on the ELXSI 6400 and the CRAY XMP

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Montry, G.R.

    1985-07-16

    A library of FORTRAN-based multitasking routines has been written for the ELXSI 6400 and the CRAY XMP. This library is designed to make multitasking codes easily transportable between machines with different hardware configurations. The library provides enhanced error checking and diagnostics over vendor-supplied multitasking intrinsics. The library also contains multitasking control structures not normally supplied by the vendor.

  11. Keeping an Eye on the Prize

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hazi, A U

    2007-02-06

    Setting performance goals is part of the business plan for almost every company. The same is true in the world of supercomputers. Ten years ago, the Department of Energy (DOE) launched the Accelerated Strategic Computing Initiative (ASCI) to help ensure the safety and reliability of the nation's nuclear weapons stockpile without nuclear testing. ASCI, which is now called the Advanced Simulation and Computing (ASC) Program and is managed by DOE's National Nuclear Security Administration (NNSA), set an initial 10-year goal to obtain computers that could process up to 100 trillion floating-point operations per second (teraflops). Many computer experts thought the goal was overly ambitious, but the program's results have proved them wrong. Last November, a Livermore-IBM team received the 2005 Gordon Bell Prize for achieving more than 100 teraflops while modeling the pressure-induced solidification of molten metal. The prestigious prize, which is named for a founding father of supercomputing, is awarded each year at the Supercomputing Conference to innovators who advance high-performance computing. Recipients for the 2005 prize included six Livermore scientists--physicists Fred Streitz, James Glosli, and Mehul Patel and computer scientists Bor Chan, Robert Yates, and Bronis de Supinski--as well as IBM researchers James Sexton and John Gunnels. This team produced the first atomic-scale model of metal solidification from the liquid phase with results that were independent of system size. The record-setting calculation used Livermore's domain decomposition molecular-dynamics (ddcMD) code running on BlueGene/L, a supercomputer developed by IBM in partnership with the ASC Program. BlueGene/L reached 280.6 teraflops on the Linpack benchmark, the industry standard used to measure computing speed. As a result, it ranks first on the list of Top500 Supercomputer Sites released in November 2005. To evaluate the performance of nuclear weapons systems, scientists must understand how materials behave under extreme conditions. Because experiments at high pressures and temperatures are often difficult or impossible to conduct, scientists rely on computer models that have been validated with obtainable data. Of particular interest to weapons scientists is the solidification of metals. "To predict the performance of aging nuclear weapons, we need detailed information on a material's phase transitions," says Streitz, who leads the Livermore-IBM team. For example, scientists want to know what happens to a metal as it changes from molten liquid to a solid and how that transition affects the material's characteristics, such as its strength.

  12. Dynamic Load Balancing for Adaptive Computations on Distributed-Memory Machines

    NASA Technical Reports Server (NTRS)

    1999-01-01

    Dynamic load balancing is central to adaptive mesh-based computations on large-scale parallel computers. The principal investigator has investigated various issues of the dynamic load balancing problem under NASA JOVE and JAG grants. The major accomplishments of the project are two graph partitioning algorithms and a load balancing framework. The S-HARP dynamic graph partitioner is known to be the fastest among the known dynamic graph partitioners to date. It can partition a graph of over 100,000 vertices in 0.25 seconds on a 64-processor Cray T3E distributed-memory multiprocessor while maintaining scalability of over 16-fold speedup. Other known and widely used dynamic graph partitioners take a second or two while giving low scalability of a few-fold speedup on 64 processors. These results have been published in journals and peer-reviewed flagship conferences.
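
    Speedup figures such as the 16-fold gain on 64 processors translate directly into parallel efficiency (speedup divided by processor count); the short Python computation below makes that explicit. The 4-fold figure used for the slower partitioners is an assumption based only on the qualitative "few-fold" description above.

        def efficiency(speedup: float, processors: int) -> float:
            return speedup / processors

        # S-HARP: over 16-fold speedup on a 64-processor Cray T3E
        print(f"S-HARP efficiency: {efficiency(16, 64):.0%}")                  # 25%
        # A partitioner achieving only a few-fold speedup, e.g. 4x, on the same machine
        print(f"few-fold partitioner efficiency: {efficiency(4, 64):.1%}")     # about 6.3%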

  13. The role of graphics super-workstations in a supercomputing environment

    NASA Technical Reports Server (NTRS)

    Levin, E.

    1989-01-01

    A new class of very powerful workstations has recently become available which integrate near supercomputer computational performance with very powerful and high quality graphics capability. These graphics super-workstations are expected to play an increasingly important role in providing an enhanced environment for supercomputer users. Their potential uses include: off-loading the supercomputer (by serving as stand-alone processors, by post-processing of the output of supercomputer calculations, and by distributed or shared processing), scientific visualization (understanding of results, communication of results), and by real time interaction with the supercomputer (to steer an iterative computation, to abort a bad run, or to explore and develop new algorithms).

  14. 48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...

  15. 48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...

  16. 48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...

  17. 48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...

  18. 48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...

  19. Investigating the impact of the Cielo Cray XE6 architecture on scientific application codes.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rajan, Mahesh; Barrett, Richard; Pedretti, Kevin Thomas Tauke

    2010-12-01

    Cielo, a Cray XE6, is the Department of Energy NNSA Advanced Simulation and Computing (ASC) campaign's newest capability machine. Rated at 1.37 PFLOPS, it consists of 8,944 dual-socket oct-core AMD Magny-Cours compute nodes, linked using Cray's Gemini interconnect. Its primary mission objective is to enable a suite of the ASC applications implemented using MPI to scale to tens of thousands of cores. Cielo is an evolutionary improvement to a successful architecture previously available to many of our codes, thus enabling a basis for understanding the capabilities of this new architecture. Using three codes strategically important to the ASC campaign, and supplemented with some micro-benchmarks that expose the fundamental capabilities of the XE6, we report on the performance characteristics and capabilities of Cielo.

  20. Experiences with Cray multi-tasking

    NASA Technical Reports Server (NTRS)

    Miya, E. N.

    1985-01-01

    The issues involved in modifying an existing code for multitasking are explored. They include Cray extensions to FORTRAN, an examination of the application code under study, designing workable modifications, specific code modifications to the VAX and Cray versions, performance, and efficiency results. The finished product is a faster, fully synchronous, parallel version of the original program. A production program is partitioned by hand to run on two CPUs. Loop splitting multitasks three key subroutines. Simply dividing subroutine data and control structure down the middle of a subroutine is not safe. Simple division produces results that are inconsistent with uniprocessor runs. The safest way to partition the code is to transfer one block of loops at a time and check the results of each on a test case. Other issues include debugging and performance. Task startup and maintenance (e.g., synchronization) are potentially expensive.
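
    Loop splitting of the kind described, in which one block of loop iterations goes to each CPU and the multitasked answer is checked against the uniprocessor run, can be sketched as follows; the work function, the two-way split and the use of Python multiprocessing stand in for the Cray FORTRAN multitasking extensions and are assumptions for illustration only.

        from multiprocessing import Pool

        def partial_sum(bounds):
            """Work for one task: sum f(i) over a half-open index range."""
            lo, hi = bounds
            return sum(i * i for i in range(lo, hi))   # stand-in for the real loop body

        if __name__ == "__main__":
            n = 1_000_000
            # Split the loop down the middle, one block per CPU (two tasks, as in a two-CPU run).
            halves = [(0, n // 2), (n // 2, n)]
            with Pool(processes=2) as pool:
                parallel_result = sum(pool.map(partial_sum, halves))

            # Check the multitasked result against the uniprocessor run on the same test case.
            serial_result = partial_sum((0, n))
            assert parallel_result == serial_result
            print("two-task loop split matches the uniprocessor result")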

  1. Enviro-HIRLAM Applicability for Black Carbon Studies in Arctic

    NASA Astrophysics Data System (ADS)

    Nuterman, Roman; Mahura, Alexander; Baklanov, Alexander; Kurganskiy, Alexander; Amstrup, Bjarne; Kaas, Eigil

    2015-04-01

    One of the main aims of the Nordic CarboNord project ("Impact of black carbon on air quality and climate in Northern Europe and Arctic") is to provide new information on the distribution and effects of black carbon in Northern Europe and the Arctic. This can be done by assessing the robustness of model predictions of long-range black carbon distribution and its relation to climate change and forcing. In our study, the online integrated meteorology-chemistry/aerosols model Enviro-HIRLAM (Environment - HIgh Resolution Limited Area Model) is used. The study focuses first on adaptation of the Enviro-HIRLAM model (model setup, domain for the Northern Hemisphere and Arctic region, emissions, boundary conditions, refining aerosol microphysics and chemistry, cloud-aerosol interaction processes) and on selection of the most unfavorable weather and air pollution episodes for the Arctic region. Simulations of interactions between black carbon and meteorological processes in northern conditions will be performed for the selected episodes (on DMI's HPC CRAY-XT5 supercomputer), followed by long-term simulations at regional scale for selected winter and summer months. Modelling results will be compared on a diurnal-cycle and monthly basis against observations of key meteorological parameters (such as air temperature, wind speed, relative humidity, and precipitation) as well as aerosol concentration. Finally, black carbon atmospheric transport, dispersion, and deposition patterns at different spatio-temporal scales, physical-chemical processes and transformations of black carbon containing aerosols, and interactions and effects between black carbon and meteorological processes in Arctic weather conditions will be evaluated.

  2. Modeling and new equipment definition for the vibration isolation box equipment system

    NASA Technical Reports Server (NTRS)

    Sani, Robert L.

    1993-01-01

    Our MSAD-funded research project is to provide numerical modeling support for VIBES (Vibration Isolation Box Experiment System), an IML2 flight experiment being built by the Japanese research team of Dr. H. Azuma of the Japanese National Aerospace Laboratory. During this reporting period, the following have been accomplished: A semi-consistent mass finite element projection algorithm for 2D and 3D Boussinesq flows has been implemented on Sun, HP and Cray platforms. The algorithm has better phase speed accuracy than similar finite difference or lumped mass finite element algorithms, an attribute which is essential for addressing realistic g-jitter effects as well as convectively-dominated transient systems. The projection algorithm has been benchmarked against solutions generated via the commercial code FIDAP. The algorithm appears to be accurate as well as computationally efficient. Optimization and potential parallelization studies are underway. Our implementation to date has focused on execution of the basic algorithm with at most a concern for vectorization. The initial time-varying gravity Boussinesq flow simulation is being set up. The mesh is being designed and the input file is being generated. Some preliminary 'small mesh' cases will be attempted on our HP9000/735 while our request to MSAD for supercomputing resources is being addressed. The Japanese research team for VIBES was visited, the current setup and status of the physical experiment were obtained, and an ongoing e-mail communication link was established.

  3. Implementation of Parallel Computing Technology to Vortex Flow

    NASA Technical Reports Server (NTRS)

    Dacles-Mariani, Jennifer

    1999-01-01

    Mainframe supercomputers such as the Cray C90 were invaluable in obtaining large scale computations using several millions of grid points to resolve salient features of a tip vortex flow over a lifting wing. However, real flight configurations require tracking not only the flow over several lifting wings but also its growth and decay in the near- and intermediate-wake regions, not to mention the interaction of these vortices with each other. Resolving and tracking the evolution and interaction of these vortices shed from complex bodies is computationally intensive. Parallel computing technology is an attractive option in solving these flows. In planetary science, vortical flows are also important in studying how planets and protoplanets form when cosmic dust and gases become gravitationally unstable. The current paradigm for the formation of planetary systems maintains that the planets accreted from the nebula of gas and dust left over from the formation of the Sun. Traditional theory also indicates that such a preplanetary nebula took the form of a flattened disk. The coagulation of dust led to the settling of aggregates toward the midplane of the disk, where they grew further into asteroid-like planetesimals. Some of the issues still remaining in this process are the onset of gravitational instability, the role of turbulence in the damping of particles, and radial effects. In this study the focus will be on the role of turbulence and the radial effects.

  4. Massively parallel and linear-scaling algorithm for second-order Møller–Plesset perturbation theory applied to the study of supramolecular wires

    DOE PAGES

    Kjaergaard, Thomas; Baudin, Pablo; Bykov, Dmytro; ...

    2016-11-16

    Here, we present a scalable cross-platform hybrid MPI/OpenMP/OpenACC implementation of the Divide–Expand–Consolidate (DEC) formalism with portable performance on heterogeneous HPC architectures. The Divide–Expand–Consolidate formalism is designed to reduce the steep computational scaling of conventional many-body methods employed in electronic structure theory to linear scaling, while providing a simple mechanism for controlling the error introduced by this approximation. Our massively parallel implementation of this general scheme has three levels of parallelism, being a hybrid of the loosely coupled task-based parallelization approach and the conventional MPI+X programming model, where X is either OpenMP or OpenACC. We demonstrate strong and weak scalability of this implementation on heterogeneous HPC systems, namely on the GPU-based Cray XK7 Titan supercomputer at the Oak Ridge National Laboratory. Using the "resolution of the identity second-order Møller–Plesset perturbation theory" (RI-MP2) as the physical model for simulating correlated electron motion, the linear-scaling DEC implementation is applied to 1-aza-adamantane-trione (AAT) supramolecular wires containing up to 40 monomers (2440 atoms, 6800 correlated electrons, 24 440 basis functions and 91 280 auxiliary functions). This represents the largest molecular system treated at the MP2 level of theory, demonstrating an efficient removal of the scaling wall pertinent to conventional quantum many-body methods.

  5. Parallel spatial direct numerical simulations on the Intel iPSC/860 hypercube

    NASA Technical Reports Server (NTRS)

    Joslin, Ronald D.; Zubair, Mohammad

    1993-01-01

    The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube are documented. The direct numerical simulation approach is used to compute spatially evolving disturbances associated with the laminar-to-turbulent transition in boundary-layer flows. The feasibility of using the PSDNS on the hypercube to perform transition studies is examined. The results indicate that the direct numerical simulation approach can effectively be parallelized on a distributed-memory parallel machine. By increasing the number of processors, nearly ideal linear speedups are achieved with nonoptimized routines; slower than linear speedups are achieved with optimized (machine dependent library) routines. This slower-than-linear speedup results because the Fast Fourier Transform (FFT) routine dominates the computational cost and because that routine achieves less than ideal speedups. However, with the machine-dependent routines the total computational cost decreases by a factor of 4 to 5 compared with standard FORTRAN routines. The computational cost increases linearly with spanwise, wall-normal, and streamwise grid refinements. The hypercube with 32 processors was estimated to require approximately twice the single-processor Cray supercomputer time to complete a comparable simulation; however, it is estimated that a subgrid-scale model which reduces the required number of grid points and becomes a large-eddy simulation (PSLES) would reduce the computational cost and memory requirements by a factor of 10 over the PSDNS. This PSLES implementation would enable transition simulations on the hypercube at a reasonable computational cost.

  6. Portability and Cross-Platform Performance of an MPI-Based Parallel Polygon Renderer

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1999-01-01

    Visualizing the results of computations performed on large-scale parallel computers is a challenging problem, due to the size of the datasets involved. One approach is to perform the visualization and graphics operations in place, exploiting the available parallelism to obtain the necessary rendering performance. Over the past several years, we have been developing algorithms and software to support visualization applications on NASA's parallel supercomputers. Our results have been incorporated into a parallel polygon rendering system called PGL. PGL was initially developed on tightly-coupled distributed-memory message-passing systems, including Intel's iPSC/860 and Paragon, and IBM's SP2. Over the past year, we have ported it to a variety of additional platforms, including the HP Exemplar, SGI Origin2000, Cray T3E, and clusters of Sun workstations. In implementing PGL, we have had two primary goals: cross-platform portability and high performance. Portability is important because (1) our manpower resources are limited, making it difficult to develop and maintain multiple versions of the code, and (2) NASA's complement of parallel computing platforms is diverse and subject to frequent change. Performance is important in delivering adequate rendering rates for complex scenes and ensuring that parallel computing resources are used effectively. Unfortunately, these two goals are often at odds. In this paper we report on our experiences with portability and performance of the PGL polygon renderer across a range of parallel computing platforms.

  7. Field-scale multi-phase LNAPL remediation: Validating a new computational framework against sequential field pilot trials.

    PubMed

    Sookhak Lari, Kaveh; Johnston, Colin D; Rayner, John L; Davis, Greg B

    2018-03-05

    Remediation of subsurface systems, including groundwater, soil and soil gas, contaminated with light non-aqueous phase liquids (LNAPLs) is challenging. Field-scale pilot trials of multi-phase remediation were undertaken at a site to determine the effectiveness of recovery options. Sequential LNAPL skimming and vacuum-enhanced skimming, with and without water table drawdown, were trialled over 78 days, in total extracting over 5 m³ of LNAPL. For the first time, a multi-component simulation framework (including the multi-phase multi-component code TMVOC-MP and processing codes) was developed and applied to simulate the broad range of multi-phase remediation and recovery methods used in the field trials. This framework was validated against the sequential pilot trials by comparing predicted and measured LNAPL mass removal rates and compositional changes. The framework was tested on both a Cray supercomputer and a cluster. Simulations mimicked trends in LNAPL recovery rates (from 0.14 to 3 mL/s) across all remediation techniques, each operating over periods of 4-14 days during the 78-day trial. The code also approximated order-of-magnitude compositional changes of hazardous chemical concentrations in extracted gas during vacuum-enhanced recovery. The verified framework enables longer-term prediction of the effectiveness of remediation approaches, allowing better determination of remediation endpoints and long-term risks. Copyright © 2017 Commonwealth Scientific and Industrial Research Organisation. Published by Elsevier B.V. All rights reserved.

  8. Data-intensive computing on numerically-insensitive supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ahrens, James P; Fasel, Patricia K; Habib, Salman

    2010-12-03

    With the advent of the era of petascale supercomputing, via the delivery of the Roadrunner supercomputing platform at Los Alamos National Laboratory, there is a pressing need to address the problem of visualizing massive petascale-sized results. In this presentation, I discuss progress on a number of approaches including in-situ analysis, multi-resolution out-of-core streaming and interactive rendering on the supercomputing platform. These approaches are placed in context by the emerging area of data-intensive supercomputing.

  9. Computer Electromagnetics and Supercomputer Architecture

    NASA Technical Reports Server (NTRS)

    Cwik, Tom

    1993-01-01

    The dramatic increase in performance over the last decade for microprocessor computations is compared with that for supercomputer computations. This performance, the projected performance, and a number of other issues such as cost and the inherent physical limitations in current supercomputer technology have naturally led to parallel supercomputers and ensembles of interconnected microprocessors.

  10. CDC to CRAY FORTRAN conversion manual

    NASA Technical Reports Server (NTRS)

    Mcgary, C.; Diebert, D.

    1983-01-01

    Documentation describing software differences between two general purpose computers for scientific applications is presented. Descriptions of the use of the FORTRAN and FORTRAN 77 high level programming language on a CDC 7600 under SCOPE and a CRAY XMP under COS are offered. Itemized differences of the FORTRAN language sets of the two machines are also included. The material is accompanied by numerous examples of preferred programming techniques for the two machines.

  11. Force user's manual: A portable, parallel FORTRAN

    NASA Technical Reports Server (NTRS)

    Jordan, Harry F.; Benten, Muhammad S.; Arenstorf, Norbert S.; Ramanan, Aruna V.

    1990-01-01

    The use of Force, a parallel, portable FORTRAN for shared memory parallel computers, is described. Force simplifies writing code for parallel computers and, once the parallel code is written, it is easily ported to computers on which Force is installed. Although Force is nearly the same for all computers, specific details are included for the Cray-2, Cray Y-MP, Convex 220, Flex/32, Encore, Sequent, and Alliant computers on which it is installed.

  12. A Spectral Element Ocean Model on the Cray T3D: the interannual variability of the Mediterranean Sea general circulation

    NASA Astrophysics Data System (ADS)

    Molcard, A. J.; Pinardi, N.; Ansaloni, R.

    A new numerical model, SEOM (Spectral Element Ocean Model; Iskandarani et al., 1994), has been implemented in the Mediterranean Sea. Spectral element methods combine the geometric flexibility of finite element techniques with the rapid convergence rate of spectral schemes. The current version solves the shallow water equations with a fifth (or sixth) order accuracy spectral scheme and about 50,000 nodes. The domain decomposition philosophy makes it possible to exploit the power of parallel machines. The original MIMD master/slave version of SEOM, written in F90 and PVM, has been ported to the Cray T3D. When critical for performance, Cray-specific high-performance one-sided communication routines (SHMEM) have been adopted to fully exploit the Cray T3D interprocessor network. Tests performed with highly unstructured and irregular grids, on up to 128 processors, show almost linear scalability even with unoptimized domain decomposition techniques. Results from various case studies on the Mediterranean Sea are shown, involving realistic coastline geometry and monthly mean 1000 mb winds from the ECMWF atmospheric model operational analysis for the period January 1987 to December 1994. The simulation results show that variability in the wind forcing considerably affects the circulation dynamics of the Mediterranean Sea.

  13. KAMEDIN: a telemedicine system for computer supported cooperative work and remote image analysis in radiology.

    PubMed

    Handels, H; Busch, C; Encarnação, J; Hahn, C; Kühn, V; Miehe, J; Pöppl, S I; Rinast, E; Rossmanith, C; Seibert, F; Will, A

    1997-03-01

    The software system KAMEDIN (Kooperatives Arbeiten und MEdizinische Diagnostik auf Innovativen Netzen) is a multimedia telemedicine system for exchange, cooperative diagnostics, and remote analysis of digital medical image data. It provides components for visualisation, processing, and synchronised audio-visual discussion of medical images. Techniques of computer supported cooperative work (CSCW) synchronise user interactions during a teleconference. Visibility of both local and remote cursor on the conference workstations facilitates telepointing and reinforces the conference partner's telepresence. Audio communication during teleconferences is supported by an integrated audio component. Furthermore, brain tissue segmentation with artificial neural networks can be performed on an external supercomputer as a remote image analysis procedure. KAMEDIN is designed as a low cost CSCW tool for ISDN based telecommunication. However it can be used on any TCP/IP supporting network. In a field test, KAMEDIN was installed in 15 clinics and medical departments to validate the systems' usability. The telemedicine system KAMEDIN has been developed, tested, and evaluated within a research project sponsored by German Telekom.

  14. EDITORIAL: 19th International Conference on General Relativity and Gravitation (GR19), México, 4-9 July 2010 19th International Conference on General Relativity and Gravitation (GR19), México City, México, 4-9 July 2010

    NASA Astrophysics Data System (ADS)

    Marolf, Don; Sudarsky, Daniel

    2011-06-01

    The GR19 meeting was held in México City from 6-9 July 2010. The decision to have the meeting in México was taken during the GR18 meeting in Sydney, Australia in 2007, and represented a great milestone for the scientific community working in the fields related to gravitation in México. This fact was evidenced by the commitment of the most important institutions in México where the field is developed, by the support the meeting received at various governmental levels, and also by a promotional campaign dedicated to educating the public about our subject, which was undertaken by important segments of the gravitational physics community in Mexico. This campaign was named 'El Mes de Einstein' or 'The Einsteinian Month', and consisted of a series of presentations, talks and movies about topics related to General Relativity which culminated with the public talk of the GR19 meeting (now a traditional aspect of the GR events). This talk was given by George Smoot, Nobel laureate in physics 2006, on the amazing developments around the detailed studies of the Cosmic Microwave Background, and was held at the Nezahualcóyotl Hall in the main campus of the National Autonomous University of México, which was filled to capacity by enthusiastic crowds of lay people fascinated with the subject. The meeting itself was a very successful one with participants from dozens of countries spanning the five continents, with a rich, varied and informative plenary program. Highlights, featured in this issue, were perhaps the talk by Veronika Hubeny on the fluid/gravity correspondence, a subject that has grown dramatically during the last few years, the lecture by Tarun Souradeep on the enormous potential for discovery offered by the ever increasing accuracy of cosmological observations, the presentation by Jeffrey McClintock, about accreting black holes and the exciting possibility of measuring their spins, the informative review about Loop Quantum Gravity from one of the pioneers of the subject: Carlo Rovelli, and an interesting update on one of the recent developments in the AdS/CFT correspondence, 'the condensed matter-gravity connection', presented by Gary Horowitz, among other excellent contributions. The material presented by other plenary speakers, which is not represented in this issue, has already been widely reported either by themselves or by others. The parallel sessions covered a wide range of topics currently under investigation by the worldwide gravity community, and were enthusiastically attended by the conference participants. In the past, it has been traditional to ask session chairs to write a brief summary of the talks presented. This time, however, we asked the parallel session chairs to nominate a small number of individual speakers to contribute to this issue. By focusing on a smaller number of topics, we were able to allocate more space to each, which we hope will provide a more useful overview of at least some of the most exciting physics at GR19. We hope that this novel approach will become a standard feature of future proceedings of the GR meetings. Another aspect we should mention is the satellite meeting on Quantum Gravity and its preceding PASI Quantum Gravity Summer School held at the Institute for Mathematics, National Autonomous University of México (UNAM), Campus Morelia, Morelia, México, from 23 June to 3 July 2010, which was attended by many students from Mexican institutions, as well as by several participants who came from abroad to learn about this fascinating subject.
The organizers want to thank the generous support of the following institutions: the Mexican National Council for Science and Technology (CONACYT), the National Autonomous University of México (UNAM), the Autonomous Metropolitan University of México UAM, the Center for Research and Advanced Studies CINVESTAV, the Mexican Physical Society (SMF), The Mexican Academy of Sciences AMC, The Institute for Nuclear Sciences of UNAM (ICN), the International Union of Pure and Applied Physics (IUPAP), The International Society on General Relativity and Gravitation, the 'Benemerita' Autonomous University of Puebla (BAUP), the University Michoacana San Nicolas de Hidalgo, the University of Veracruz, the University of Guadalajara, the University of Guanajuato, the University Iberoamericana, the University of Sinaloa, the University of Zacatecas, the Advanced Institute of Cosmology (IAC), as well as the Government of the Federal District of Mexico City. Don Marolf and Daniel Sudarsky Guest Editors Conference photograph

  15. [Teaching surgery at the UNAM and some educational concepts].

    PubMed

    Graue-Wiechers, Enrique

    2011-01-01

    Nowadays surgery cannot be conceived as independent from medicine; consequently, surgical education cannot be far from the main principles of medical education. This review underlines the characteristics of medical training in the field of surgery. General physicians should be trained to perform surgical procedures in particular situations. A new lesson plan was implemented at the Facultad de Medicina in Mexico City (UNAM), comprising eight fundamental surgical skills. A well-structured surgical program implies clear and exact definitions of the skills to be acquired during training, as well as an appropriate follow-up, knowledge reinforcement, continuing educational skills, application of medical tests for patient care and evaluation of the learning process.

  16. Achieving High Performance on the i860 Microprocessor

    NASA Technical Reports Server (NTRS)

    Lee, King; Kutler, Paul (Technical Monitor)

    1998-01-01

    The i860 is a high performance microprocessor used in the Intel Touchstone project. This paper proposes a paradigm for programming the i860 that is modelled on the vector instructions of the Cray computers. Fortran callable assembler subroutines were written that mimic the concurrent vector instructions of the Cray. Cache takes the place of vector registers. Using this paradigm we have achieved twice the performance of compiled code on a traditional solve.
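
    The cache-as-vector-register idea can be pictured with a strip-mined AXPY loop: the data are processed in fixed-length strips small enough to stay cache-resident, much as a Cray works through 64-element vector registers. The strip length and the NumPy formulation in the Python sketch below are illustrative assumptions, not the i860 assembler routines themselves.

        import numpy as np

        STRIP = 64                     # strip length playing the role of a vector register

        def axpy_strip_mined(a: float, x: np.ndarray, y: np.ndarray) -> np.ndarray:
            """Compute a*x + y strip by strip, so each strip can remain in cache."""
            out = y.copy()
            for start in range(0, len(x), STRIP):
                stop = start + STRIP
                out[start:stop] += a * x[start:stop]   # one 'vector operation' per strip
            return out

        x = np.arange(1000, dtype=np.float64)
        y = np.ones(1000)
        result = axpy_strip_mined(2.0, x, y)
        assert np.allclose(result, 2.0 * x + y)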

  17. Production Experiences with the Cray-Enabled TORQUE Resource Manager

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ezell, Matthew A; Maxwell, Don E; Beer, David

    High performance computing resources utilize batch systems to manage the user workload. Cray systems are uniquely different from typical clusters due to Cray's Application Level Placement Scheduler (ALPS). ALPS manages binary transfer, job launch and monitoring, and error handling. Batch systems require special support to integrate with ALPS using an XML protocol called BASIL. Previous versions of Adaptive Computing's TORQUE and Moab batch suite integrated with ALPS from within Moab, using PERL scripts to interface with BASIL. This would occasionally lead to problems when all the components would become unsynchronized. Version 4.1 of the TORQUE Resource Manager introduced new features that allow it to directly integrate with ALPS using BASIL. This paper describes production experiences at Oak Ridge National Laboratory using the new TORQUE software versions, as well as ongoing and future work to improve TORQUE.

  18. NASA Langley Research Center's distributed mass storage system

    NASA Technical Reports Server (NTRS)

    Pao, Juliet Z.; Humes, D. Creig

    1993-01-01

    There is a trend in institutions with high performance computing and data management requirements to explore mass storage systems with peripherals directly attached to a high speed network. The Distributed Mass Storage System (DMSS) Project at NASA LaRC is building such a system and expects to put it into production use by the end of 1993. This paper presents the design of the DMSS, some experiences in its development and use, and a performance analysis of its capabilities. The special features of this system are: (1) workstation class file servers running UniTree software; (2) third party I/O; (3) HIPPI network; (4) HIPPI/IPI3 disk array systems; (5) Storage Technology Corporation (STK) ACS 4400 automatic cartridge system; (6) CRAY Research Incorporated (CRI) CRAY Y-MP and CRAY-2 clients; (7) file server redundancy provision; and (8) a transition mechanism from the existent mass storage system to the DMSS.

  19. Parallel algorithms for modeling flow in permeable media. Annual report, February 15, 1995 - February 14, 1996

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    G.A. Pope; K. Sephernoori; D.C. McKinney

    1996-03-15

    This report describes the application of distributed-memory parallel programming techniques to a compositional simulator called UTCHEM. The University of Texas Chemical Flooding reservoir simulator (UTCHEM) is a general-purpose vectorized chemical flooding simulator that models the transport of chemical species in three-dimensional, multiphase flow through permeable media. The parallel version of UTCHEM addresses solving large-scale problems by reducing the amount of time required to obtain the solution as well as by providing a flexible and portable programming environment. In this work, the original parallel version of UTCHEM was modified and ported to CRAY T3D and CRAY T3E distributed-memory multiprocessor computers using CRAY-PVM as the interprocessor communication library. Also, the data communication routines were modified such that the portability of the original code across different computer architectures was made possible.

  20. 48 CFR 225.7012 - Restriction on supercomputers.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 48 Federal Acquisition Regulations System 3 2014-10-01 2014-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...

  1. 48 CFR 225.7012 - Restriction on supercomputers.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 48 Federal Acquisition Regulations System 3 2010-10-01 2010-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...

  2. 48 CFR 225.7012 - Restriction on supercomputers.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 48 Federal Acquisition Regulations System 3 2013-10-01 2013-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...

  3. 48 CFR 225.7012 - Restriction on supercomputers.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 48 Federal Acquisition Regulations System 3 2011-10-01 2011-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...

  4. 48 CFR 225.7012 - Restriction on supercomputers.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 48 Federal Acquisition Regulations System 3 2012-10-01 2012-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...

  5. EDITORIAL: The 29th International Conference on Phenomena in Ionized Gases The 29th International Conference on Phenomena in Ionized Gases

    NASA Astrophysics Data System (ADS)

    de Urquijo, J.

    2010-06-01

    The 29th International Conference on Phenomena in Ionized Gases (ICPIG) was held in Cancún, Mexico, on 12-17 July, 2009, under the sponsorship of the Universidad Nacional Autónoma de Mexico, UNAM, the Universidad Autónoma Metropolitana, UAM, and the International Union of Pure and Applied Physics (IUPAP). ICPIG, founded in 1953, has since been held biennially, and nowadays it covers both fundamental and applied research in all areas of low-temperature plasmas, including those related to the cold plasma in fusion devices. ICPIG fosters interdisciplinary research and interchange between different communities. The conference was attended by scientists from 33 countries. The scientific programme of ICPIG 2009 consisted of 10 General Invited and 24 Topical Lectures, covering all major topics of ICPIG. All speakers were invited to submit peer-reviewed articles based on their lectures for this special issue of Plasma Sources Science and Technology, either as reviews or original work. This special issue contains the papers of most of these talks, covering timely and key issues on elementary processes and fundamental data, plasma wall interactions, including those related to the low-temperature plasma in fusion devices. Several interesting papers were dedicated to plasma modelling, simulation and diagnostics. Important contributions to this issue deal with natural plasmas, low- and atmospheric-pressure plasmas, microplasmas and high-frequency plasmas. Almost half of the contributed papers in this issue are dedicated to applications dealing with plasmas for nanotechnology, plasma sources of various kinds, and other uses of plasmas in particle detection and mass spectrometry. Two workshops were organized. The first reviewed the state of the art of our knowledge of electron, positron and ion interaction processes in gases, with an emphasis on charged particle transport and reactions in electric and magnetic fields, measurement and calculation of cross sections and swarm coefficients, and their applications. The second workshop was dedicated to the recent research and future challenges on non-thermal plasmas relevant to fusion, reviewing the vital role played by the physics of the edge plasma in fusion devices, bridging hot-fusion core and wall materials, which is crucial for plasma confinement and the lifetime of the first wall. The ICPIG participants contributed 219 papers, covering all ICPIG's topics, of which microplasmas, plasma diagnostics and plasma processes were the most abundant. These papers can be accessed freely at the website http://www.icpig2009.unam.mx. The Von Engel Prize, sponsored by the Hans von Engel and Gordon Francis Fund, was awarded to Professor Lev D Tsendin for his outstanding contribution to the understanding of the physical kinetics of low-pressure gas discharges, by introducing a non-local treatment. The 2009 IUPAP Young Scientist Medal and Prize in Plasma Physics was awarded to Dr Timo Gans, in recognition of his outstanding contribution, at an early stage of his career, in developing very imaginative and highly sophisticated optical diagnostics that allowed a deep understanding of the dynamics of low-temperature plasmas, widely used in microelectronics, photonics and many other emerging applications. On behalf of the Local Organising Committee (LOC) and the International Scientific Committee (ISC) of ICPIG 2009, the guest editor wishes to thank all authors for their efforts in contributing to this special issue.
Thanks are due to all members of the LOC and ISC of the 29th ICPIG, chaired by Professor Jean-Paul Booth, for their contribution to the success of this conference, and to the Editorial Board of Plasma Sources Science and Technology for the opportunity to publish most of the lectures of the 29th ICPIG. We hope that this special issue will be a useful source of information for all those scientists and engineers working in this growing and fascinating field of basic and applied science, and will remind the attendees of the 29th ICPIG of the wonderful time we had in Cancún.

  6. ARC2D - EFFICIENT SOLUTION METHODS FOR THE NAVIER-STOKES EQUATIONS (DEC RISC ULTRIX VERSION)

    NASA Technical Reports Server (NTRS)

    Biyabani, S. R.

    1994-01-01

    ARC2D is a computational fluid dynamics program developed at the NASA Ames Research Center specifically for airfoil computations. The program uses implicit finite-difference techniques to solve two-dimensional Euler equations and thin layer Navier-Stokes equations. It is based on the Beam and Warming implicit approximate factorization algorithm in generalized coordinates. The methods are either time accurate or accelerated non-time accurate steady state schemes. The evolution of the solution through time is physically realistic; good solution accuracy is dependent on mesh spacing and boundary conditions. The mathematical development of ARC2D begins with the strong conservation law form of the two-dimensional Navier-Stokes equations in Cartesian coordinates, which admits shock capturing. The Navier-Stokes equations can be transformed from Cartesian coordinates to generalized curvilinear coordinates in a manner that permits one computational code to serve a wide variety of physical geometries and grid systems. ARC2D includes an algebraic mixing length model to approximate the effect of turbulence. In cases of high Reynolds number viscous flows, thin layer approximation can be applied. ARC2D allows for a variety of solutions to stability boundaries, such as those encountered in flows with shocks. The user has considerable flexibility in assigning geometry and developing grid patterns, as well as in assigning boundary conditions. However, the ARC2D model is most appropriate for attached and mildly separated boundary layers; no attempt is made to model wake regions and widely separated flows. The techniques have been successfully used for a variety of inviscid and viscous flowfield calculations. The Cray version of ARC2D is written in FORTRAN 77 for use on Cray series computers and requires approximately 5Mb memory. The program is fully vectorized. The tape includes variations for the COS and UNICOS operating systems. Also included is a sample routine for CONVEX computers to emulate Cray system time calls, which should be easy to modify for other machines as well. The standard distribution media for this version is a 9-track 1600 BPI ASCII Card Image format magnetic tape. The Cray version was developed in 1987. The IBM ES/3090 version is an IBM port of the Cray version. It is written in IBM VS FORTRAN and has the capability of executing in both vector and parallel modes on the MVS/XA operating system and in vector mode on the VM/XA operating system. Various options of the IBM VS FORTRAN compiler provide new features for the ES/3090 version, including 64-bit arithmetic and up to 2 GB of virtual addressability. The IBM ES/3090 version is available only as a 9-track, 1600 BPI IBM IEBCOPY format magnetic tape. The IBM ES/3090 version was developed in 1989. The DEC RISC ULTRIX version is a DEC port of the Cray version. It is written in FORTRAN 77 for RISC-based Digital Equipment platforms. The memory requirement is approximately 7Mb of main memory. It is available in UNIX tar format on TK50 tape cartridge. The port to DEC RISC ULTRIX was done in 1990. COS and UNICOS are trademarks and Cray is a registered trademark of Cray Research, Inc. IBM, ES/3090, VS FORTRAN, MVS/XA, and VM/XA are registered trademarks of International Business Machines. DEC and ULTRIX are registered trademarks of Digital Equipment Corporation.
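
    The approximate-factorization idea underlying the Beam and Warming scheme, splitting a multi-dimensional implicit operator into a sequence of one-dimensional solves, can be sketched in a few lines. The following sketch applies the same factorization to the linear 2-D heat equation only; it is an illustrative analogue, not ARC2D's actual algorithm or equations.

      # Minimal approximate-factorization (ADI-style) sketch for u_t = nu*(u_xx + u_yy).
      # Illustrative analogue of the factored implicit update used in codes such as ARC2D;
      # it is NOT the Beam-Warming scheme for the Navier-Stokes equations.
      import numpy as np

      def solve_tridiagonal(a, b, c, d):
          """Thomas algorithm: sub-, main and super-diagonals a, b, c; right-hand side d."""
          n = len(d)
          cp, dp = np.empty(n), np.empty(n)
          cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
          for i in range(1, n):
              m = b[i] - a[i] * cp[i - 1]
              cp[i] = c[i] / m
              dp[i] = (d[i] - a[i] * dp[i - 1]) / m
          x = np.empty(n)
          x[-1] = dp[-1]
          for i in range(n - 2, -1, -1):
              x[i] = dp[i] - cp[i] * x[i + 1]
          return x

      def adi_step(u, nu, dt, dx):
          """One factored implicit step: (I - dt*nu*Dxx)(I - dt*nu*Dyy) u_new = u."""
          r = nu * dt / dx**2
          n = u.shape[0]
          a = np.full(n, -r); b = np.full(n, 1 + 2 * r); c = np.full(n, -r)
          a[0] = c[-1] = 0.0                  # simple Dirichlet-like closure at the edges
          half = np.empty_like(u)
          for j in range(n):                  # sweep 1: implicit in x (axis 0)
              half[:, j] = solve_tridiagonal(a, b, c, u[:, j])
          new = np.empty_like(u)
          for i in range(n):                  # sweep 2: implicit in y (axis 1)
              new[i, :] = solve_tridiagonal(a, b, c, half[i, :])
          return new

      u = np.zeros((64, 64)); u[24:40, 24:40] = 1.0   # hypothetical initial "hot spot"
      for _ in range(100):
          u = adi_step(u, nu=1.0, dt=1e-3, dx=1.0 / 63)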

  7. Automatic discovery of the communication network topology for building a supercomputer model

    NASA Astrophysics Data System (ADS)

    Sobolev, Sergey; Stefanov, Konstantin; Voevodin, Vadim

    2016-10-01

    The Research Computing Center of Lomonosov Moscow State University is developing the Octotron software suite for automatic monitoring and mitigation of emergency situations in supercomputers so as to maximize hardware reliability. The suite is based on a software model of the supercomputer. The model uses a graph to describe the computing system components and their interconnections. One of the most complex components of a supercomputer that needs to be included in the model is its communication network. This work describes the proposed approach for automatically discovering the Ethernet communication network topology in a supercomputer and its description in terms of the Octotron model. This suite automatically detects computing nodes and switches, collects information about them and identifies their interconnections. The application of this approach is demonstrated on the "Lomonosov" and "Lomonosov-2" supercomputers.
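
    As an illustration of the graph description used by the model, the sketch below assembles hypothetical discovered nodes and switches into an adjacency structure; the names, data layout and discovery source (e.g. LLDP/SNMP neighbor tables) are assumptions for the example and are not the Octotron suite's actual API.

      # Sketch: build a graph of compute nodes and Ethernet switches from discovered
      # neighbor data. All names and data here are hypothetical.
      from collections import defaultdict

      # hypothetical discovery output: switch -> list of attached devices
      discovered_neighbors = {
          "switch-a": ["node-001", "node-002", "switch-b"],
          "switch-b": ["node-003", "node-004"],
      }

      graph = defaultdict(set)      # adjacency list: component -> connected components
      kind = {}                     # component -> "switch" or "node"

      for switch, ports in discovered_neighbors.items():
          kind[switch] = "switch"
          for neighbor in ports:
              kind.setdefault(neighbor, "switch" if neighbor.startswith("switch") else "node")
              graph[switch].add(neighbor)
              graph[neighbor].add(switch)   # links are bidirectional

      for component in sorted(graph):
          print(f"{kind[component]:6s} {component}: {sorted(graph[component])}")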

  8. VizieR Online Data Catalog: NGC 6302 CO emission SHAPE model (Santander-Garcia+, 2017)

    NASA Astrophysics Data System (ADS)

    Santander-Garcia, M.; Bujarrabal, V.; Alcolea, J.; Castro-Carrizo, A.; Sanchez Contreras, C.; Quintana-Lacaci, G.; Corradi, R. L. M.; Neri, R.

    2016-08-01

    SHAPE model of the 12CO and 13CO J=3-2 emission of the nebula NGC 6302, to be matched to ALMA observations as described in the paper. The file is intended to be loaded with SHAPE v5 (http://www.astrosen.unam.mx/shape/) and makes use of the SHAPEMOL plugin to achieve the radiative transfer in CO species (i.e. the CO data tables in http://www.astrosen.unam.mx/shape/v5/Downloads/SHAPE_INSTALLERS/index.html must be downloaded and pointed at within SHAPE). For additional details on how to work with SHAPE+SHAPEMOL, see Santander-Garcia et al. (2015, Cat. J/A+A/573/A56). (1 data file).

  9. Large-scale structural analysis: The structural analyst, the CSM Testbed and the NAS System

    NASA Technical Reports Server (NTRS)

    Knight, Norman F., Jr.; Mccleary, Susan L.; Macy, Steven C.; Aminpour, Mohammad A.

    1989-01-01

    The Computational Structural Mechanics (CSM) activity is developing advanced structural analysis and computational methods that exploit high-performance computers. Methods are developed in the framework of the CSM testbed software system and applied to representative complex structural analysis problems from the aerospace industry. An overview of the CSM testbed methods development environment is presented and some numerical methods developed on a CRAY-2 are described. Selected application studies performed on the NAS CRAY-2 are also summarized.

  10. Automotive applications of superconductors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ginsberg, M.

    1987-01-01

    These proceedings compile papers on supercomputers in the automobile industry. Titles include: An automotive engineer's guide to the effective use of scalar, vector, and parallel computers; fluid mechanics, finite elements, and supercomputers; and Automotive crashworthiness performance on a supercomputer.

  11. INS3D - NUMERICAL SOLUTION OF THE INCOMPRESSIBLE NAVIER-STOKES EQUATIONS IN THREE-DIMENSIONAL GENERALIZED CURVILINEAR COORDINATES (CRAY VERSION)

    NASA Technical Reports Server (NTRS)

    Rogers, S. E.

    1994-01-01

    INS3D computes steady-state solutions to the incompressible Navier-Stokes equations. The INS3D approach utilizes pseudo-compressibility combined with an approximate factorization scheme. This computational fluid dynamics (CFD) code has been verified on problems such as flow through a channel, flow over a backward-facing step and flow over a circular cylinder. Three dimensional cases include flow over an ogive cylinder, flow through a rectangular duct, wind tunnel inlet flow, cylinder-wall juncture flow and flow through multiple posts mounted between two plates. INS3D uses a pseudo-compressibility approach in which a time derivative of pressure is added to the continuity equation, which together with the momentum equations form a set of four equations with pressure and velocity as the dependent variables. The equations' coordinates are transformed for general three dimensional applications. The equations are advanced in time by the implicit, non-iterative, approximately-factored, finite-difference scheme of Beam and Warming. The numerical stability of the scheme depends on the use of higher-order smoothing terms to damp out higher-frequency oscillations caused by second-order central differencing. The artificial compressibility introduces pressure (sound) waves of finite speed (whereas the speed of sound would be infinite in an incompressible fluid). As the solution converges, these pressure waves die out, causing the derivative of pressure with respect to time to approach zero. Thus, continuity is satisfied for the incompressible fluid in the steady state. Computational efficiency is achieved using a diagonal algorithm. A block tri-diagonal option is also available. When a steady-state solution is reached, the modified continuity equation will satisfy the divergence-free velocity field condition. INS3D is capable of handling several different types of boundaries encountered in numerical simulations, including solid-surface, inflow and outflow, and far-field boundaries. Three machine versions of INS3D are available. INS3D for the CRAY is written in CRAY FORTRAN for execution on a CRAY X-MP under COS, INS3D for the IBM is written in FORTRAN 77 for execution on an IBM 3090 under the VM or MVS operating system, and INS3D for DEC RISC-based systems is written in RISC FORTRAN for execution on a DEC workstation running RISC ULTRIX 3.1 or later. The CRAY version has a central memory requirement of 730279 words. The central memory requirement for the IBM is 150Mb. The memory requirement for the DEC RISC ULTRIX version is 3Mb of main memory. INS3D was developed in 1987. The port to the IBM was done in 1990. The port to the DECstation 3100 was done in 1991. CRAY is a registered trademark of Cray Research Inc. IBM is a registered trademark of International Business Machines. DEC, DECstation, and ULTRIX are trademarks of the Digital Equipment Corporation.
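
    The pseudo-compressibility device described above amounts to adding a pressure pseudo-time derivative to the continuity equation, p_tau + beta*div(u) = 0, and marching pressure and velocity together until the divergence dies out. The sketch below shows only that coupling, using a simple explicit pseudo-time update on a doubly periodic 2-D grid; it is a minimal illustration, not INS3D's implicit, approximately factored scheme.

      # Minimal artificial-compressibility sketch (explicit pseudo-time marching on a
      # doubly periodic grid). Illustrates the p_tau + beta*div(u) = 0 coupling only;
      # it is not INS3D's implicit Beam-Warming scheme.
      import numpy as np

      n, beta, nu, dtau = 64, 5.0, 0.01, 2e-4
      dx = 1.0 / n

      def ddx(f):   # centered difference, periodic in x (axis 0)
          return (np.roll(f, -1, 0) - np.roll(f, 1, 0)) / (2 * dx)

      def ddy(f):   # centered difference, periodic in y (axis 1)
          return (np.roll(f, -1, 1) - np.roll(f, 1, 1)) / (2 * dx)

      def lap(f):
          return (np.roll(f, -1, 0) + np.roll(f, 1, 0) +
                  np.roll(f, -1, 1) + np.roll(f, 1, 1) - 4 * f) / dx**2

      x = np.linspace(0.0, 1.0, n, endpoint=False)
      X, Y = np.meshgrid(x, x, indexing="ij")
      u = np.sin(2 * np.pi * X) * np.sin(2 * np.pi * Y)   # hypothetical (non-solenoidal) initial guess
      v = np.zeros_like(u)
      p = np.zeros_like(u)

      for _ in range(2000):                    # march in pseudo-time toward a steady state
          un, vn = u, v
          div = ddx(un) + ddy(vn)
          p = p - dtau * beta * div            # modified continuity: dp/dtau = -beta * div(u)
          u = un + dtau * (-(un * ddx(un) + vn * ddy(un)) - ddx(p) + nu * lap(un))
          v = vn + dtau * (-(un * ddx(vn) + vn * ddy(vn)) - ddy(p) + nu * lap(vn))

      print("max |div u| after pseudo-time marching:", np.abs(ddx(u) + ddy(v)).max())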

  12. Scientific Visualization in High Speed Network Environments

    NASA Technical Reports Server (NTRS)

    Vaziri, Arsi; Kutler, Paul (Technical Monitor)

    1997-01-01

    In several cases, new visualization techniques have vastly increased the researcher's ability to analyze and comprehend data. Similarly, the role of networks in providing an efficient supercomputing environment has become more critical and continues to grow at a faster rate than the increase in the processing capabilities of supercomputers. A close relationship between scientific visualization and high-speed networks in providing an important link to support efficient supercomputing is identified. The two technologies are driven by the increasing complexities and volume of supercomputer data. The interaction of scientific visualization and high-speed networks in a Computational Fluid Dynamics simulation/visualization environment is given. Current capabilities supported by high speed networks, supercomputers, and high-performance graphics workstations at the Numerical Aerodynamic Simulation Facility (NAS) at NASA Ames Research Center are described. Applied research in providing a supercomputer visualization environment to support future computational requirements is summarized.

  13. Parallel processing a three-dimensional free-lagrange code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mandell, D.A.; Trease, H.E.

    1989-01-01

    A three-dimensional, time-dependent free-Lagrange hydrodynamics code has been multitasked and autotasked on a CRAY X-MP/416. The multitasking was done by using the Los Alamos Multitasking Control Library, which is a superset of the CRAY multitasking library. Autotasking is done by using constructs which are only comment cards if the source code is not run through a preprocessor. The three-dimensional algorithm has presented a number of problems that simpler algorithms, such as those for one-dimensional hydrodynamics, did not exhibit. Problems in converting the serial code, originally written for a CRAY-1, to a multitasking code are discussed. Autotasking of a rewritten version of the code is discussed. Timing results for subroutines and hot spots in the serial code are presented and suggestions for additional tools and debugging aids are given. Theoretical speedup results obtained from Amdahl's law and actual speedup results obtained on a dedicated machine are presented. Suggestions for designing large parallel codes are given.
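
    The theoretical speedups referred to here follow from Amdahl's law: with a parallelizable fraction f of the work and n processors, the speedup is 1/((1-f) + f/n). A small sketch with illustrative fractions (not the paper's measured values):

      # Amdahl's law: theoretical speedup for a code whose parallel fraction is f,
      # run on n processors. The fractions below are illustrative, not measured values.
      def amdahl_speedup(f, n):
          return 1.0 / ((1.0 - f) + f / n)

      for f in (0.80, 0.95, 0.99):
          print(f"parallel fraction {f:.2f}:",
                [round(amdahl_speedup(f, n), 2) for n in (2, 4, 8, 16)])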

  14. Multitasking the three-dimensional shock wave code CTH on the Cray X-MP/416

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McGlaun, J.M.; Thompson, S.L.

    1988-01-01

    CTH is a software system under development at Sandia National Laboratories Albuquerque that models multidimensional, multi-material, large-deformation, strong shock wave physics. CTH was carefully designed to both vectorize and multitask on the Cray X-MP/416. All of the physics routines are vectorized except the thermodynamics and the interface tracer. All of the physics routines are multitasked except the boundary conditions. The Los Alamos National Laboratory multitasking library was used for the multitasking. The resulting code is easy to maintain, easy to understand, gives the same answers as the unitasked code, and achieves a measured speedup of approximately 3.5 on the four-cpu Cray. This document discusses the design, prototyping, development, and debugging of CTH. It also covers the architecture features of CTH that enhance multitasking, granularity of the tasks, and synchronization of tasks. The utility of system software and utilities such as simulators and interactive debuggers is also discussed. 5 refs., 7 tabs.

  15. Parallel processing a real code: A case history

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mandell, D.A.; Trease, H.E.

    1988-01-01

    A three-dimensional, time-dependent Free-Lagrange hydrodynamics code has been multitasked and autotasked on a Cray X-MP/416. The multitasking was done by using the Los Alamos Multitasking Control Library, which is a superset of the Cray multitasking library. Autotasking is done by using constructs which are only comment cards if the source code is not run through a preprocessor. The 3-D algorithm has presented a number of problems that simpler algorithms, such as 1-D hydrodynamics, did not exhibit. Problems in converting the serial code, originally written for a Cray 1, to a multitasking code are discussed. Autotasking of a rewritten version of the code is discussed. Timing results for subroutines and hot spots in the serial code are presented and suggestions for additional tools and debugging aids are given. Theoretical speedup results obtained from Amdahl's law and actual speedup results obtained on a dedicated machine are presented. Suggestions for designing large parallel codes are given. 8 refs., 13 figs.

  16. The solution of linear systems of equations with a structural analysis code on the NAS CRAY-2

    NASA Technical Reports Server (NTRS)

    Poole, Eugene L.; Overman, Andrea L.

    1988-01-01

    Two methods for solving linear systems of equations on the NAS Cray-2 are described. One is a direct method; the other is an iterative method. Both methods exploit the architecture of the Cray-2, particularly the vectorization, and are aimed at structural analysis applications. To demonstrate and evaluate the methods, they were installed in a finite element structural analysis code denoted the Computational Structural Mechanics (CSM) Testbed. A description of the techniques used to integrate the two solvers into the Testbed is given. Storage schemes, memory requirements, operation counts, and reformatting procedures are discussed. Finally, results from the new methods are compared with results from the initial Testbed sparse Choleski equation solver for three structural analysis problems. The new direct solvers described achieve the highest computational rates of the methods compared. The new iterative methods are not able to achieve as high computation rates as the vectorized direct solvers but are best for well conditioned problems which require fewer iterations to converge to the solution.
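
    For reference, the direct approach factors the system matrix once and then back-substitutes, while an iterative method repeats matrix-vector products until convergence. The sketch below shows a dense Cholesky factor-and-solve as a generic stand-in for the direct solvers; it is not the Testbed code.

      # Direct solution of a symmetric positive-definite system via Cholesky factorization,
      # as a generic stand-in for the direct solvers discussed above (not Testbed code).
      import numpy as np

      rng = np.random.default_rng(0)
      m = rng.standard_normal((200, 200))
      A = m @ m.T + 200 * np.eye(200)        # small SPD "stiffness-like" test matrix
      b = rng.standard_normal(200)

      L = np.linalg.cholesky(A)              # one-time factorization
      x = np.linalg.solve(L.T, np.linalg.solve(L, b))   # two triangular solves per right-hand side

      print("residual:", np.linalg.norm(A @ x - b))
      # An iterative method (e.g. conjugate gradients) would instead repeat A @ v products,
      # converging quickly only when A is well conditioned.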

  17. Porting the AVS/Express scientific visualization software to Cray XT4.

    PubMed

    Leaver, George W; Turner, Martin J; Perrin, James S; Mummery, Paul M; Withers, Philip J

    2011-08-28

    Remote scientific visualization, where rendering services are provided by larger scale systems than are available on the desktop, is becoming increasingly important as dataset sizes increase beyond the capabilities of desktop workstations. Uptake of such services relies on access to suitable visualization applications and the ability to view the resulting visualization in a convenient form. We consider five rules from the e-Science community to meet these goals with the porting of a commercial visualization package to a large-scale system. The application uses message-passing interface (MPI) to distribute data among data processing and rendering processes. The use of MPI in such an interactive application is not compatible with restrictions imposed by the Cray system being considered. We present details, and performance analysis, of a new MPI proxy method that allows the application to run within the Cray environment yet still support MPI communication required by the application. Example use cases from materials science are considered.

  18. A Pacific Ocean general circulation model for satellite data assimilation

    NASA Technical Reports Server (NTRS)

    Chao, Y.; Halpern, D.; Mechoso, C. R.

    1991-01-01

    A tropical Pacific Ocean General Circulation Model (OGCM) to be used in satellite data assimilation studies is described. The transfer of the OGCM from a CYBER-205 at NOAA's Geophysical Fluid Dynamics Laboratory to a CRAY-2 at NASA's Ames Research Center is documented. Two 3-year model integrations from identical initial conditions but performed on those two computers are compared. The model simulations are very similar to each other, as expected, but the simulation performed with the higher-precision CRAY-2 is smoother than that performed with the lower-precision CYBER-205. The CYBER-205 and CRAY-2 use 32- and 64-bit mantissa arithmetic, respectively. The major features of the oceanic circulation in the tropical Pacific, namely the North Equatorial Current, the North Equatorial Countercurrent, the South Equatorial Current, and the Equatorial Undercurrent, are realistically produced and their seasonal cycles are described. The OGCM provides a powerful tool for the study of tropical oceans and for the assimilation of satellite altimetry data.
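
    The sensitivity to word length noted above is easy to reproduce in miniature: in 32-bit arithmetic, small increments added to a large accumulator can be absorbed entirely, while 64-bit arithmetic retains them. The numbers below are a generic illustration, not the OGCM.

      # Illustration of 32-bit vs 64-bit floating-point arithmetic: repeatedly adding a
      # small increment to a large accumulator. Generic demo, not the OGCM itself.
      import numpy as np

      a64 = np.float64(1.0e8)
      a32 = np.float32(1.0e8)
      for _ in range(1000):
          a64 += 0.01
          a32 += np.float32(0.01)

      print(a64 - 1.0e8)                   # approximately 10.0, as expected
      print(a32 - np.float32(1.0e8))       # 0.0: each 0.01 increment is absorbed in 32-bit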

  19. Towards Efficient Supercomputing: Searching for the Right Efficiency Metric

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hsu, Chung-Hsing; Kuehn, Jeffery A; Poole, Stephen W

    2012-01-01

    The efficiency of supercomputing has traditionally been measured in terms of execution time. In the early 2000s, the concept of total cost of ownership was re-introduced, broadening the efficiency measure to include aspects such as energy and space. Yet the supercomputing community has never agreed upon a metric that can cover these aspects altogether and also provide a fair basis for comparison. This paper examines the metrics that have been proposed in the past decade, and proposes a vector-valued metric for efficient supercomputing. Using this metric, the paper presents a study of where the supercomputing industry has been and how it stands today with respect to efficient supercomputing.
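
    A vector-valued metric keeps the separate efficiency dimensions visible instead of collapsing them into a single score. The sketch below only illustrates that idea; the field names and the Pareto-style comparison are assumptions for the example, not the paper's definition.

      # Sketch of a vector-valued efficiency record: time, energy and space are kept as
      # separate components and systems are compared by Pareto dominance rather than a
      # single scalar score. Field names and numbers are illustrative, not the paper's.
      from dataclasses import dataclass

      @dataclass(frozen=True)
      class Efficiency:
          runtime_s: float      # time to solution
          energy_kwh: float     # energy to solution
          floor_m2: float       # machine-room footprint

          def dominates(self, other: "Efficiency") -> bool:
              """True if no component is worse and at least one is strictly better."""
              ours = (self.runtime_s, self.energy_kwh, self.floor_m2)
              theirs = (other.runtime_s, other.energy_kwh, other.floor_m2)
              return all(a <= b for a, b in zip(ours, theirs)) and ours != theirs

      system_a = Efficiency(runtime_s=3600, energy_kwh=500, floor_m2=40)
      system_b = Efficiency(runtime_s=4200, energy_kwh=520, floor_m2=45)
      print(system_a.dominates(system_b))   # True: better or equal on every axis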

  20. Distributed multitasking ITS with PVM

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fan, W.C.; Halbleib, J.A. Sr.

    1995-12-31

    Advances in computer hardware and communication software have made it possible to perform parallel-processing computing on a collection of desktop workstations. For many applications, multitasking on a cluster of high-performance workstations has achieved performance comparable to or better than that on a traditional supercomputer. From the point of view of cost-effectiveness, it also allows users to exploit available but unused computational resources and thus achieve a higher performance-to-cost ratio. Monte Carlo calculations are inherently parallelizable because the individual particle trajectories can be generated independently with minimum need for interprocessor communication. Furthermore, the number of particle histories that can be generated in a given amount of wall-clock time is nearly proportional to the number of processors in the cluster. This is an important fact because the inherent statistical uncertainty in any Monte Carlo result decreases as the number of histories increases. For these reasons, researchers have expended considerable effort to take advantage of different parallel architectures for a variety of Monte Carlo radiation transport codes, often with excellent results. The initial interest in this work was sparked by the multitasking capability of the MCNP code on a cluster of workstations using the Parallel Virtual Machine (PVM) software. On a 16-machine IBM RS/6000 cluster, it has been demonstrated that MCNP runs ten times as fast as on a single-processor CRAY YMP. In this paper, we summarize the implementation of a similar multitasking capability for the coupled electron-photon transport code system, the Integrated TIGER Series (ITS), and the evaluation of two load-balancing schemes for homogeneous and heterogeneous networks.
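
    Both properties relied on here, independent histories and statistical error shrinking roughly as the inverse square root of the history count, can be seen in a toy calculation split across local worker processes. In the sketch, Python's multiprocessing stands in for PVM and the "history" is a trivial placeholder, not ITS physics.

      # Toy Monte Carlo split over worker processes: each worker generates its share of
      # independent histories; the combined tally improves as histories accumulate.
      # multiprocessing is a stand-in for PVM; the "history" is a placeholder, not ITS.
      import multiprocessing as mp
      import random, statistics

      def run_histories(args):
          seed, n = args
          rng = random.Random(seed)
          # placeholder "history": estimate pi by sampling points in the unit square
          hits = sum(1 for _ in range(n) if rng.random() ** 2 + rng.random() ** 2 < 1.0)
          return 4.0 * hits / n

      if __name__ == "__main__":
          workers, histories_per_worker = 4, 250_000
          with mp.Pool(workers) as pool:
              partial = pool.map(run_histories, [(seed, histories_per_worker)
                                                 for seed in range(workers)])
          print("combined estimate:", statistics.mean(partial))
          print("spread across workers:", statistics.pstdev(partial))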

  1. Distributed multitasking ITS with PVM

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fan, W.C.; Halbleib, J.A. Sr.

    1995-02-01

    Advances in computer hardware and communication software have made it possible to perform parallel-processing computing on a collection of desktop workstations. For many applications, multitasking on a cluster of high-performance workstations has achieved performance comparable to or better than that on a traditional supercomputer. From the point of view of cost-effectiveness, it also allows users to exploit available but unused computational resources, and thus achieve a higher performance-to-cost ratio. Monte Carlo calculations are inherently parallelizable because the individual particle trajectories can be generated independently with minimum need for interprocessor communication. Furthermore, the number of particle histories that can be generated in a given amount of wall-clock time is nearly proportional to the number of processors in the cluster. This is an important fact because the inherent statistical uncertainty in any Monte Carlo result decreases as the number of histories increases. For these reasons, researchers have expended considerable effort to take advantage of different parallel architectures for a variety of Monte Carlo radiation transport codes, often with excellent results. The initial interest in this work was sparked by the multitasking capability of MCNP on a cluster of workstations using the Parallel Virtual Machine (PVM) software. On a 16-machine IBM RS/6000 cluster, it has been demonstrated that MCNP runs ten times as fast as on a single-processor CRAY YMP. In this paper, we summarize the implementation of a similar multitasking capability for the coupled electron/photon transport code system, the Integrated TIGER Series (ITS), and the evaluation of two load balancing schemes for homogeneous and heterogeneous networks.

  2. Using High Performance Computing to Understand Roles of Labile and Nonlabile U(VI) on Hanford 300 Area Plume Longevity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lichtner, Peter C.; Hammond, Glenn E.

    Evolution of a hexavalent uranium [U(VI)] plume at the Hanford 300 Area bordering the Columbia River is investigated to evaluate the roles of labile and nonlabile forms of U(VI) on the longevity of the plume. A high fidelity, three-dimensional, field-scale, reactive flow and transport model is used to represent the system. Richards equation coupled to multicomponent reactive transport equations are solved for times up to 100 years taking into account rapid fluctuations in the Columbia River stage resulting in pulse releases of U(VI) into the river. The peta-scale computer code PFLOTRAN developed under a DOE SciDAC-2 project is employed in the simulations and executed on ORNL's Cray XT5 supercomputer Jaguar. Labile U(VI) is represented in the model through surface complexation reactions and its nonlabile form through dissolution of metatorbernite used as a surrogate mineral. Initial conditions are constructed corresponding to the U(VI) plume already in place to avoid uncertainties associated with the lack of historical data for the waste stream. The cumulative U(VI) flux into the river is compared for cases of equilibrium and multirate sorption models and for no sorption. The sensitivity of the U(VI) flux into the river to the initial plume configuration is investigated. The presence of nonlabile U(VI) was found to be essential in explaining the longevity of the U(VI) plume and the prolonged high U(VI) concentrations at the site exceeding the EPA MCL for uranium.

  3. [Morphology of some tricostrongilinae (Strongylida) from the National Helminth Collection, Institute of Biology, UNAM, Mexico].

    PubMed

    Falcón-Ordaz, Jorge; García-Prieto, Luis

    2004-06-01

    The present study analyses the taxonomic status of eleven species of trichostrongylins that parasitize rodents and lagomorphs deposited in the Colección Nacional de Helmintos, Instituto de Biología, UNAM, Mexico. This analysis is based on the morphology of the synlophe, a characteristic that had not been studied for most of these nematode species and which at present has a very important taxonomic value. As a result of this study, the identity of five species is ratified (Trichostrongylus calcaratus, Obeliscoides cuniculi, Carolinensis huehuetlana, Stilestrongylus peromysci and Nippostrongylus brasiliensis), the transference suggested previously for two more (Vexillata convoluta and Vexillata vexillata) is confirmed, Trichostrongylus chiapensis is synonymized with Boehmiella willsoni, and finally Lamothiella romerolagi is re-determined as Teporingonema cerropeladoensis and Stilestrongylus atlatilpinensis as Stilestrongylus hidalguensis.

  4. Is There any Relationship Between the Santa Elena Depression and Chicxulub Impact Crater, Northwestern Yucatan Peninsula, Mexico?

    NASA Astrophysics Data System (ADS)

    Lefticariu, L.

    2005-05-01

    The Terminal Cretaceous Chicxulub Impact Crater had a strong control on the depositional and diagenetic history of the northern Yucatan Platform during most of the Cenozoic Era. The Chicxulub Sedimentary Basin (henceforth Basin), which approximately coincides with the impact crater, is circumscribed by a concentration of karstic sinkholes known as the Ring of Cenotes. Santa Elena Depression (henceforth Depression) is the name proposed for the bowl-shaped buried feature, first contoured by geophysical studies, immediately south of the Basin, in the area where the Ticul 1 and UNAM 5 wells were drilled. Lithologic, petrographic, and biostratigraphic data on PEMEX, UNAM, and ICDP cores show that: 1) Cenozoic deposits are much thicker inside the Basin than inside the Depression, 2) in general, the Cenozoic formations from inside the Depression are the thickest among those outside the Basin, 3) variably dolomitized pelagic or outer-platform wackestone or mudstone occur both inside the Basin and Depression, 4) the age of the deeper-water sedimentary carbonate rocks is Paleocene-Eocene inside the Basin and Paleocene?-Early Eocene inside the Depression, 5) the oldest formations that crop out are of Middle Eocene age at the edge of the Basin and Early-Middle Eocene age inside the Depression, 6) saline lake deposits, that consist chiefly of anhydrite, gypsum, and fine carbonate, and also contain quartz, chert, clay, zeolite, potassium feldspar, pyrite, and fragments of wood, are present in the Cenozoic section of the UNAM 5 core between 282 and 198 m below the present land surface, 7) the dolomite, subaerial exposure features (subaerial crusts, vugs, karst, dedolomite), and vug-filling cement from the Eocene formations are more abundant inside the Depression than inside the Basin. The depositional environments that are proposed for explaining the Cenozoic facies succession within the Santa Elena Depression are: 1) deeper marine water (Paleocene?-Early Eocene), 2) relatively isolated saline lake (Middle Eocene), and 3) shallow marine water (Middle-Late Eocene?). In places, the deeper-water facies are similar to those within the Chicxulub Sedimentary Basin. The shallow-water facies is similar to those occurring outside the Basin. In general, quartz and silicates are rare in the Cenozoic sedimentary carbonate of the northwestern Yucatan Peninsula. Therefore, their presence in the UNAM 5 core could be attributed to either impact breccia reworking or silicic volcanic processes. Quartz, chert, zeolite, and clay also are common in the suevite breccia of both Yax-1 and UNAM 5 cores. The fact that the Santa Elena Depression was a distinct sedimentary basin during much of the Paleogene could be explained by any or a combination of the following hypotheses: 1) In spite of being located outside the cenote ring, the Depression is a sub-basin of the larger and deeper Chicxulub Sedimentary Basin and is therefore located within the Chicxulub Impact Crater, 2) the Depression coincides with an impact crater distinct from the Chicxulub Impact Crater, 3) the Depression formed after the Chicxulub bolide impact due to slumping, crater wall failure, or larger-scale tectonic processes. 
The lack of conclusive evidence for multiple impact breccia layers in the northwestern Yucatan Peninsula, corroborated with the presence on top of the impact breccia from UNAM 5 core of deeper-water limestone similar to that of Late Paleocene-Early Eocene age from Yax-1 core, would be more consistent with either the first or third hypothesis.

  5. NASA's supercomputing experience

    NASA Technical Reports Server (NTRS)

    Bailey, F. Ron

    1990-01-01

    A brief overview of NASA's recent experience in supercomputing is presented from two perspectives: early systems development and advanced supercomputing applications. NASA's role in supercomputing systems development is illustrated by discussion of activities carried out by the Numerical Aerodynamical Simulation Program. Current capabilities in advanced technology applications are illustrated with examples in turbulence physics, aerodynamics, aerothermodynamics, chemistry, and structural mechanics. Capabilities in science applications are illustrated by examples in astrophysics and atmospheric modeling. Future directions and NASA's new High Performance Computing Program are briefly discussed.

  6. OpenMP Performance on the Columbia Supercomputer

    NASA Technical Reports Server (NTRS)

    Haoqiang, Jin; Hood, Robert

    2005-01-01

    This presentation discusses the Columbia supercomputer, one of the world's fastest supercomputers, providing 61 TFLOPs (as of 10/20/04). Columbia was conceived, designed, built, and deployed in just 120 days: a 20-node supercomputer built on proven 512-processor nodes. It is the largest SGI system in the world, with over 10,000 Intel Itanium 2 processors, and provides the largest node size incorporating commodity parts (512 processors) and the largest shared-memory environment (2048 processors); with 88% efficiency it tops the scalar systems on the Top500 list.

  7. NASA's Climate in a Box: Desktop Supercomputing for Open Scientific Model Development

    NASA Astrophysics Data System (ADS)

    Wojcik, G. S.; Seablom, M. S.; Lee, T. J.; McConaughy, G. R.; Syed, R.; Oloso, A.; Kemp, E. M.; Greenseid, J.; Smith, R.

    2009-12-01

    NASA's High Performance Computing Portfolio in cooperation with its Modeling, Analysis, and Prediction program intends to make its climate and earth science models more accessible to a larger community. A key goal of this effort is to open the model development and validation process to the scientific community at large such that a natural selection process is enabled and results in a more efficient scientific process. One obstacle to others using NASA models is the complexity of the models and the difficulty in learning how to use them. This situation applies not only to scientists who regularly use these models but also non-typical users who may want to use the models such as scientists from different domains, policy makers, and teachers. Another obstacle to the use of these models is that access to high performance computing (HPC) accounts, from which the models are implemented, can be restrictive with long wait times in job queues and delays caused by an arduous process of obtaining an account, especially for foreign nationals. This project explores the utility of using desktop supercomputers in providing a complete ready-to-use toolkit of climate research products to investigators and on demand access to an HPC system. One objective of this work is to pre-package NASA and NOAA models so that new users will not have to spend significant time porting the models. In addition, the prepackaged toolkit will include tools, such as workflow, visualization, social networking web sites, and analysis tools, to assist users in running the models and analyzing the data. The system architecture to be developed will allow for automatic code updates for each user and an effective means with which to deal with data that are generated. We plan to investigate several desktop systems, but our work to date has focused on a Cray CX1. Currently, we are investigating the potential capabilities of several non-traditional development environments. While most NASA and NOAA models are designed for Linux operating systems (OS), the arrival of the WindowsHPC 2008 OS provides the opportunity to evaluate the use of a new platform on which to develop and port climate and earth science models. In particular, we are evaluating Microsoft's Visual Studio Integrated Developer Environment to determine its appropriateness for the climate modeling community. In the initial phases of this project, we have ported GEOS-5, WRF, GISS ModelE, and GFS to Linux on a CX1 and are in the process of porting WRF and ModelE to WindowsHPC 2008. Initial tests on the CX1 Linux OS indicate favorable comparisons in terms of performance and consistency of scientific results when compared with experiments executed on NASA high end systems. As in the past, NASA's large clusters will continue to be an important part of our objectives. We envision a seamless environment in which an investigator performs model development and testing on a desktop system and can seamlessly transfer execution to supercomputer clusters for production.

  8. CSM Testbed Development and Large-Scale Structural Applications

    NASA Technical Reports Server (NTRS)

    Knight, Norman F., Jr.; Gillian, R. E.; Mccleary, Susan L.; Lotts, C. G.; Poole, E. L.; Overman, A. L.; Macy, S. C.

    1989-01-01

    A research activity called Computational Structural Mechanics (CSM) conducted at the NASA Langley Research Center is described. This activity is developing advanced structural analysis and computational methods that exploit high-performance computers. Methods are developed in the framework of the CSM Testbed software system and applied to representative complex structural analysis problems from the aerospace industry. An overview of the CSM Testbed methods development environment is presented and some new numerical methods developed on a CRAY-2 are described. Selected application studies performed on the NAS CRAY-2 are also summarized.

  9. Supercomputer networking for space science applications

    NASA Technical Reports Server (NTRS)

    Edelson, B. I.

    1992-01-01

    The initial design of a supercomputer network topology is covered, including the design of the communications nodes and the communications interface hardware and software. Several space science applications, proposed as experiments by GSFC and JPL for a supercomputer network using the NASA ACTS satellite, are also reported.

  10. Most Social Scientists Shun Free Use of Supercomputers.

    ERIC Educational Resources Information Center

    Kiernan, Vincent

    1998-01-01

    Social scientists, who frequently complain that the federal government spends too little on them, are passing up what scholars in the physical and natural sciences see as the government's best give-aways: free access to supercomputers. Some social scientists say the supercomputers are difficult to use; others find desktop computers provide…

  11. A fault tolerant spacecraft supercomputer to enable a new class of scientific discovery

    NASA Technical Reports Server (NTRS)

    Katz, D. S.; McVittie, T. I.; Silliman, A. G., Jr.

    2000-01-01

    The goal of the Remote Exploration and Experimentation (REE) Project is to move supercomputing into space in a cost-effective manner and to allow the use of inexpensive, state-of-the-art, commercial-off-the-shelf components and subsystems in these space-based supercomputers.

  12. TOP500 Supercomputers for November 2003

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack

    2003-11-16

    22nd Edition of TOP500 List of World's Fastest Supercomputers Released MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 22nd edition of the TOP500 list of the world's fastest supercomputers was released today (November 16, 2003). The Earth Simulator supercomputer retains the number one position with its Linpack benchmark performance of 35.86 Tflop/s ("teraflops" or trillions of calculations per second). It was built by NEC and installed last year at the Earth Simulator Center in Yokohama, Japan.

  13. GASNet-EX Performance Improvements Due to Specialization for the Cray Aries Network

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hargrove, Paul H.; Bonachea, Dan

    This document is a deliverable for milestone STPM17-6 of the Exascale Computing Project, delivered by WBS 2.3.1.14. It reports on the improvements in performance observed on Cray XC-series systems due to enhancements made to the GASNet-EX software. These enhancements, known as “specializations”, primarily consist of replacing network-independent implementations of several recently added features with implementations tailored to the Cray Aries network. Performance gains from specialization include (1) Negotiated-Payload Active Messages improve bandwidth of a ping-pong test by up to 14%, (2) Immediate Operations reduce running time of a synthetic benchmark by up to 93%, (3) non-bulk RMA Put bandwidth is increased by up to 32%, (4) Remote Atomic performance is 70% faster than the reference on a point-to-point test and allows a hot-spot test to scale robustly, and (5) non-contiguous RMA interfaces see up to 8.6x speedups for an intra-node benchmark and 26% for inter-node. These improvements are available in the GASNet-EX 2018.3.0 release.

  14. Modeling high-temperature superconductors and metallic alloys on the Intel IPSC/860

    NASA Astrophysics Data System (ADS)

    Geist, G. A.; Peyton, B. W.; Shelton, W. A.; Stocks, G. M.

    Oak Ridge National Laboratory has embarked on several computational Grand Challenges, which require the close cooperation of physicists, mathematicians, and computer scientists. One of these projects is the determination of the material properties of alloys from first principles and, in particular, the electronic structure of high-temperature superconductors. While the present focus of the project is on superconductivity, the approach is general enough to permit study of other properties of metallic alloys such as strength and magnetic properties. This paper describes the progress to date on this project. We include a description of a self-consistent KKR-CPA method, parallelization of the model, and the incorporation of a dynamic load balancing scheme into the algorithm. We also describe the development and performance of a consolidated KKR-CPA code capable of running on CRAYs, workstations, and several parallel computers without source code modification. Performance of this code on the Intel iPSC/860 is also compared to a CRAY 2, CRAY YMP, and several workstations. Finally, some density of state calculations of two perovskite superconductors are given.

  15. New computing systems and their impact on structural analysis and design

    NASA Technical Reports Server (NTRS)

    Noor, Ahmed K.

    1989-01-01

    A review is given of the recent advances in computer technology that are likely to impact structural analysis and design. The computational needs for future structures technology are described. The characteristics of new and projected computing systems are summarized. Advances in programming environments, numerical algorithms, and computational strategies for new computing systems are reviewed, and a novel partitioning strategy is outlined for maximizing the degree of parallelism. The strategy is designed for computers with a shared memory and a small number of powerful processors (or a small number of clusters of medium-range processors). It is based on approximating the response of the structure by a combination of symmetric and antisymmetric response vectors, each obtained using a fraction of the degrees of freedom of the original finite element model. The strategy was implemented on the CRAY X-MP/4 and the Alliant FX/8 computers. For nonlinear dynamic problems on the CRAY X-MP with four CPUs, it resulted in an order of magnitude reduction in total analysis time, compared with the direct analysis on a single-CPU CRAY X-MP machine.
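
    The heart of the partitioning strategy, approximating the response by symmetric and antisymmetric vectors, can be written compactly: if R is the operator that reflects the degrees of freedom across the symmetry plane, any response u splits into u_s = (u + Ru)/2 and u_a = (u - Ru)/2, and each part can be computed independently. The reflection used below is a made-up one-dimensional example, not the paper's finite element model.

      # Symmetric/antisymmetric splitting of a response vector about a mirror plane.
      # The reflection operator here (index reversal) is a made-up example; the two
      # parts can be solved for independently, which is the basis for parallelism.
      import numpy as np

      def split_response(u, reflect):
          ru = reflect(u)
          return 0.5 * (u + ru), 0.5 * (u - ru)

      reflect = lambda u: u[::-1]          # hypothetical mirror: reverse the DOF order
      u = np.array([1.0, 3.0, -2.0, 5.0, 0.5, 4.0])

      u_sym, u_anti = split_response(u, reflect)
      assert np.allclose(u_sym + u_anti, u)            # the parts recombine exactly
      assert np.allclose(reflect(u_sym), u_sym)        # symmetric under the reflection
      assert np.allclose(reflect(u_anti), -u_anti)     # antisymmetric under the reflection
      print(u_sym, u_anti)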

  16. Research in Parallel Algorithms and Software for Computational Aerosciences

    NASA Technical Reports Server (NTRS)

    Domel, Neal D.

    1996-01-01

    Phase I is complete for the development of a Computational Fluid Dynamics parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.

  17. A multi-platform evaluation of the randomized CX low-rank matrix factorization in Spark

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gittens, Alex; Kottalam, Jey; Yang, Jiyan

    We investigate the performance and scalability of the randomized CX low-rank matrix factorization and demonstrate its applicability through the analysis of a 1TB mass spectrometry imaging (MSI) dataset, using Apache Spark on an Amazon EC2 cluster, a Cray XC40 system, and an experimental Cray cluster. We implemented this factorization both as a parallelized C implementation with hand-tuned optimizations and in Scala using the Apache Spark high-level cluster computing framework. We obtained consistent performance across the three platforms: using Spark we were able to process the 1TB size dataset in under 30 minutes with 960 cores on all systems, with the fastest times obtained on the experimental Cray cluster. In comparison, the C implementation was 21X faster on the Amazon EC2 system, due to careful cache optimizations, bandwidth-friendly access of matrices and vector computation using SIMD units. We report these results and their implications on the hardware and software issues arising in supporting data-centric workloads in parallel and distributed environments.
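
    A CX factorization expresses A approximately as C X, where C holds actual columns of A chosen with probabilities given by (approximate) leverage scores and X = pinv(C) A. The single-node numpy sketch below illustrates the idea with made-up sizes and data; the paper's Spark and C implementations distribute these steps across a cluster.

      # Single-node sketch of a randomized CX factorization: pick k actual columns of A
      # with probability proportional to leverage scores from a rank-k SVD, then set
      # X = pinv(C) @ A so that A ~= C @ X. Sizes and data are illustrative only.
      import numpy as np

      rng = np.random.default_rng(1)
      A = rng.standard_normal((500, 30)) @ rng.standard_normal((30, 300))   # low-rank test matrix
      k = 40

      _, _, Vt = np.linalg.svd(A, full_matrices=False)
      leverage = (Vt[:k, :] ** 2).sum(axis=0) / k      # column leverage scores, sum to 1
      cols = rng.choice(A.shape[1], size=k, replace=False, p=leverage)

      C = A[:, cols]                                   # actual (interpretable) columns of A
      X = np.linalg.pinv(C) @ A
      print("relative error:", np.linalg.norm(A - C @ X) / np.linalg.norm(A))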

  18. Research in Parallel Algorithms and Software for Computational Aerosciences

    NASA Technical Reports Server (NTRS)

    Domel, Neal D.

    1996-01-01

    Phase 1 is complete for the development of a computational fluid dynamics (CFD) parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.

  19. [Fiftieth anniversary of the Medical Movement in Mexico (1964-1965)].

    PubMed

    Treviño-Becerra, Alejandro; García-Manzo, Norberto Treviño; Mota-Hernández, Felipe; Gutiérrez-Samperio, César; Cano-Valle, Fernando

    This Symposium highlights the recognition that this year reaches half a century of the Medical Movement (1964-1965), and 27 years of publishing the book titled, "Documental Memories and Reflections" ("Crónica Documental y Reflexiones") edited by the Faculty of Medicine of the UNAM, at that time directed by the prestigious Dr. Fernando Cano Valle. Our President Dr. Graue indicated that Dr. Alejandro Treviño-Becerra assumed the coordination of this session with the commitment to be published in the Medical Gazette of Mexico for current and future generations. The Academic participants were: Norberto Treviño García-Manzo, president of the Academy in 1988. Dr. Felipe Mota Hernández was the Recording Secretary of the Mexican Medical Alliance ("Alianza de Médicos Mexicanos"). Now he is the Dean of the Children's Hospital of Mexico "Federico Gómez". Dr. Cesar Gutiérrez Samperio, surgeon at IMSS and professor at Medicine School, University of Queretaro until a year ago. Dr. Fernando Cano Valle, former Head of the Medical Faculty, UNAM, presently a researcher in Medicine and Human Rights in the Institute for Juridical Research, UNAM. I quote the Academic Treviño Zapata: "I believe that it will be difficult to bring again the conditions and circumstances that made possible the vigorous realization of the Medical Movement, the enthusiastic and hopeful creation of the Mexican Medical Alliance, and the promising start and progress of the integration of the national medical union."

  20. Distributed user services for supercomputers

    NASA Technical Reports Server (NTRS)

    Sowizral, Henry A.

    1989-01-01

    User-service operations at supercomputer facilities are examined. The question is whether a single, possibly distributed, user-services organization could be shared by NASA's supercomputer sites in support of a diverse, geographically dispersed, user community. A possible structure for such an organization is identified as well as some of the technologies needed in operating such an organization.

  1. Scientific workflow and support for high resolution global climate modeling at the Oak Ridge Leadership Computing Facility

    NASA Astrophysics Data System (ADS)

    Anantharaj, V.; Mayer, B.; Wang, F.; Hack, J.; McKenna, D.; Hartman-Baker, R.

    2012-04-01

    The Oak Ridge Leadership Computing Facility (OLCF) facilitates the execution of computational experiments that require tens of millions of CPU hours (typically using thousands of processors simultaneously) while generating hundreds of terabytes of data. A set of ultra high resolution climate experiments in progress, using the Community Earth System Model (CESM), will produce over 35,000 files, ranging in sizes from 21 MB to 110 GB each. The execution of the experiments will require nearly 70 Million CPU hours on the Jaguar and Titan supercomputers at OLCF. The total volume of the output from these climate modeling experiments will be in excess of 300 TB. This model output must then be archived, analyzed, distributed to the project partners in a timely manner, and also made available more broadly. Meeting this challenge would require efficient movement of the data, staging the simulation output to a large and fast file system that provides high volume access to other computational systems used to analyze the data and synthesize results. This file system also needs to be accessible via high speed networks to an archival system that can provide long term reliable storage. Ideally this archival system is itself directly available to other systems that can be used to host services making the data and analysis available to the participants in the distributed research project and to the broader climate community. The various resources available at the OLCF now support this workflow. The available systems include the new Jaguar Cray XK6 2.63 petaflops (estimated) supercomputer, the 10 PB Spider center-wide parallel file system, the Lens/EVEREST analysis and visualization system, the HPSS archival storage system, the Earth System Grid (ESG), and the ORNL Climate Data Server (CDS). The ESG features federated services, search & discovery, extensive data handling capabilities, deep storage access, and Live Access Server (LAS) integration. The scientific workflow enabled on these systems, and developed as part of the Ultra-High Resolution Climate Modeling Project, allows users of OLCF resources to efficiently share simulated data, often multi-terabyte in volume, as well as the results from the modeling experiments and various synthesized products derived from these simulations. The final objective in the exercise is to ensure that the simulation results and the enhanced understanding will serve the needs of a diverse group of stakeholders across the world, including our research partners in U.S. Department of Energy laboratories & universities, domain scientists, students (K-12 as well as higher education), resource managers, decision makers, and the general public.

  2. Will Moore's law be sufficient?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    DeBenedictis, Erik P.

    2004-07-01

    It seems well understood that supercomputer simulation is an enabler for scientific discoveries, weapons, and other activities of value to society. It also seems widely believed that Moore's Law will make progressively more powerful supercomputers over time and thus enable more of these contributions. This paper seeks to add detail to these arguments, revealing them to be generally correct but not a smooth and effortless progression. This paper will review some key problems that can be solved with supercomputer simulation, showing that more powerful supercomputers will be useful up to a very high yet finite limit of around 10^21 FLOPS (1 Zettaflops). The review will also show the basic nature of these extreme problems. This paper will review work by others showing that the theoretical maximum supercomputer power is very high indeed, but will explain how a straightforward extrapolation of Moore's Law will lead to technological maturity in a few decades. The power of a supercomputer at the maturity of Moore's Law will be very high by today's standards at 10^16-10^19 FLOPS (100 Petaflops to 10 Exaflops), depending on architecture, but distinctly below the level required for the most ambitious applications. Having established that Moore's Law will not be the last word in supercomputing, this paper will explore the nearer term issue of what a supercomputer will look like at maturity of Moore's Law. Our approach will quantify the maximum performance as permitted by the laws of physics for extension of current technology and then find a design that approaches this limit closely. We study a 'multi-architecture' for supercomputers that combines a microprocessor with other 'advanced' concepts and find it can reach the limits as well. This approach should be quite viable in the future because the microprocessor would provide compatibility with existing codes and programming styles while the 'advanced' features would provide a boost to the limits of performance.
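
    The extrapolation behind such estimates is simple doubling arithmetic: if peak performance doubles every d years, reaching a target that is a factor F above today's systems takes roughly d*log2(F) years. The starting point and doubling times below are illustrative assumptions, not the paper's exact figures.

      # Doubling-time extrapolation: years needed to reach a target FLOPS level if peak
      # performance doubles every `doubling_years`. The starting point and doubling
      # times are illustrative assumptions, not the paper's exact figures.
      import math

      def years_to_reach(target_flops, current_flops, doubling_years):
          return doubling_years * math.log2(target_flops / current_flops)

      current = 40e12          # roughly the fastest systems circa 2004 (~40 TFLOPS)
      for target, label in [(1e18, "1 Exaflops"), (1e21, "1 Zettaflops")]:
          for d in (1.5, 2.0):
              print(f"{label}, doubling every {d} yr: "
                    f"{years_to_reach(target, current, d):.0f} years")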

  3. Qualifying for the Green500: Experience with the newest generation of supercomputers at LANL

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yilk, Todd

    The High Performance Computing Division of Los Alamos National Laboratory recently brought four new supercomputing platforms on line: Trinity with separate partitions built around the Haswell and Knights Landing CPU architectures for capability computing and Grizzly, Fire, and Ice for capacity computing applications. The power monitoring infrastructure of these machines is significantly enhanced over previous supercomputing generations at LANL and all were qualified at the highest level of the Green500 benchmark. Here, this paper discusses supercomputing at LANL, the Green500 benchmark, and notes on our experience meeting the Green500's reporting requirements.

  4. Qualifying for the Green500: Experience with the newest generation of supercomputers at LANL

    DOE PAGES

    Yilk, Todd

    2018-02-17

    The High Performance Computing Division of Los Alamos National Laboratory recently brought four new supercomputing platforms on line: Trinity with separate partitions built around the Haswell and Knights Landing CPU architectures for capability computing and Grizzly, Fire, and Ice for capacity computing applications. The power monitoring infrastructure of these machines is significantly enhanced over previous supercomputing generations at LANL and all were qualified at the highest level of the Green500 benchmark. Here, this paper discusses supercomputing at LANL, the Green500 benchmark, and notes on our experience meeting the Green500's reporting requirements.

  5. Non-preconditioned conjugate gradient on cell and FPGA based hybrid supercomputer nodes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dubois, David H; Dubois, Andrew J; Boorman, Thomas M

    2009-01-01

    This work presents a detailed implementation of a double precision, non-preconditioned, Conjugate Gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture (TM) in conjunction with x86 Opteron (TM) processors from AMD. We implement a common Conjugate Gradient algorithm, on a variety of systems, to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, SRC Computers, Inc. MAPStation SRC-6 FPGA enhanced hybrid supercomputer, and AMD Opteron only. In all hybrid implementations wall clock time is measured, including all transfer overhead and compute timings.
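
    For reference, the non-preconditioned conjugate gradient iteration being ported is the standard algorithm below, stated here as a plain numpy version; this is not the Cell or FPGA implementation discussed in the report.

      # Standard non-preconditioned Conjugate Gradient for A x = b with A symmetric
      # positive definite. A plain numpy reference version, not the Cell/FPGA port.
      import numpy as np

      def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
          x = np.zeros_like(b)
          r = b - A @ x
          p = r.copy()
          rs_old = r @ r
          for _ in range(max_iter):
              Ap = A @ p
              alpha = rs_old / (p @ Ap)
              x += alpha * p
              r -= alpha * Ap
              rs_new = r @ r
              if np.sqrt(rs_new) < tol:
                  break
              p = r + (rs_new / rs_old) * p
              rs_old = rs_new
          return x

      rng = np.random.default_rng(0)
      m = rng.standard_normal((100, 100))
      A = m @ m.T + 100 * np.eye(100)        # symmetric positive definite test matrix
      b = rng.standard_normal(100)
      x = conjugate_gradient(A, b)
      print("residual norm:", np.linalg.norm(A @ x - b))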

  6. Non-preconditioned conjugate gradient on cell and FPGA-based hybrid supercomputer nodes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dubois, David H; Dubois, Andrew J; Boorman, Thomas M

    2009-03-10

    This work presents a detailed implementation of a double-precision, non-preconditioned conjugate gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture™ in conjunction with x86 Opteron™ processors from AMD. We implement a common conjugate gradient algorithm on a variety of systems to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, the SRC Computers, Inc. MAPStation SRC-6 FPGA-enhanced hybrid supercomputer, and an AMD Opteron-only system. In all hybrid implementations, wall-clock time is measured, including all transfer overhead and compute timings.

  7. MAGNA (Materially and Geometrically Nonlinear Analysis). Part I. Finite Element Analysis Manual.

    DTIC Science & Technology

    1982-12-01

    Guidance is provided for operating the program, modifying storage capacity, preparing input data, estimating computer run times, and interpreting the output. [Table-of-contents excerpt: reserved file names and typical execution times on CDC computers; CRAY program version (job control language, modification of storage capacity, execution times on the CRAY-1 computer); VAX program version; input data.]

  8. Implementing TCP/IP and a socket interface as a server in a message-passing operating system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hipp, E.; Wiltzius, D.

    1990-03-01

    The UNICOS 4.3BSD network code and socket transport interface are the basis of an explicit network server for NLTSS, a message-passing operating system on the Cray Y-MP. A BSD socket user library provides access to the network server using an RPC mechanism. The advantages of this server methodology are its modularity and its extensibility for migrating to future protocol suites (e.g., OSI) and transport interfaces. In addition, the network server is implemented in an explicit multitasking environment to take advantage of the Cray Y-MP multiprocessor platform. 19 refs., 5 figs.

  9. Two decades of Mexican particle physics at Fermilab

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Roy Rubinstein

    2002-12-03

    This report is a view from Fermilab of Mexican particle physics at the Laboratory since about 1980; it is not intended to be a history of Mexican particle physics: that topic is outside the expertise of the writer. The period 1980 to the present coincides with the growth of Mexican experimental particle physics from essentially no activity to its current state where Mexican groups take part in experiments at several of the world's major laboratories. Soon after becoming Fermilab director in 1979, Leon Lederman initiated a program to encourage experimental physics, especially experimental particle physics, in Latin America. At the time, Mexico had significant theoretical particle physics activity, but none in experiment. Following a visit by Lederman to UNAM in 1981, a conference, "Panamerican Symposium on Particle Physics and Technology," was held in January 1982 at Cocoyoc, Mexico, with about 50 attendees from Europe, North America, and Latin America; these included Lederman, M. Moshinsky, J. Flores, S. Glashow, J. Bjorken, and G. Charpak. Among the conference outcomes were four subsequent similar symposia over the next decade, and a formal Fermilab program to aid Latin American physics (particularly particle physics); it also influenced a decision by Mexican physicist Clicerio Avilez to switch from theoretical to experimental particle physics. The first physics collaboration between Fermilab and Mexico was in particle theory. Post-docs Rodrigo Huerta and Jose Luis Lucio spent 1-2 years at Fermilab starting in 1981, and other theorists (including Augusto Garcia, Arnulfo Zepeda, Matias Moreno and Miguel Angel Perez) also spent time at the Laboratory in the 1980s.

  10. High-Performance Computing: Industry Uses of Supercomputers and High-Speed Networks. Report to Congressional Requesters.

    ERIC Educational Resources Information Center

    General Accounting Office, Washington, DC. Information Management and Technology Div.

    This report was prepared in response to a request for information on supercomputers and high-speed networks from the Senate Committee on Commerce, Science, and Transportation, and the House Committee on Science, Space, and Technology. The following information was requested: (1) examples of how various industries are using supercomputers to…

  11. Supercomputer Provides Molecular Insight into Cellulose (Fact Sheet)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    2011-02-01

    Groundbreaking research at the National Renewable Energy Laboratory (NREL) has used supercomputing simulations to calculate the work that enzymes must do to deconstruct cellulose, which is a fundamental step in biomass conversion technologies for biofuels production. NREL used the new high-performance supercomputer Red Mesa to conduct several million central processing unit (CPU) hours of simulation.

  12. GREEN SUPERCOMPUTING IN A DESKTOP BOX

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    HSU, CHUNG-HSING; FENG, WU-CHUN; CHING, AVERY

    2007-01-17

    The computer workstation, introduced by Sun Microsystems in 1982, was the tool of choice for scientists and engineers as an interactive computing environment for the development of scientific codes. However, by the mid-1990s, the performance of workstations began to lag behind high-end commodity PCs. This, coupled with the disappearance of BSD-based operating systems in workstations and the emergence of Linux as an open-source operating system for PCs, arguably led to the demise of the workstation as we knew it. Around the same time, computational scientists started to leverage PCs running Linux to create a commodity-based (Beowulf) cluster that provided dedicated computer cycles, i.e., supercomputing for the rest of us, as a cost-effective alternative to large supercomputers, i.e., supercomputing for the few. However, as the cluster movement has matured, with respect to cluster hardware and open-source software, these clusters have become much more like their large-scale supercomputing brethren - a shared (and power-hungry) datacenter resource that must reside in a machine-cooled room in order to operate properly. Consequently, the above observations, when coupled with the ever-increasing performance gap between the PC and cluster supercomputer, provide the motivation for a 'green' desktop supercomputer - a turnkey solution that provides an interactive and parallel computing environment with the approximate form factor of a Sun SPARCstation 1 'pizza box' workstation. In this paper, they present the hardware and software architecture of such a solution as well as its prowess as a developmental platform for parallel codes. In short, imagine a 12-node personal desktop supercomputer that achieves 14 Gflops on Linpack but sips only 185 watts of power at load, resulting in a performance-power ratio that is over 300% better than their reference SMP platform.
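
    The performance-power figure quoted above works out as follows; this trivial check uses only the numbers in the abstract (the reference SMP's own figures are not given, so the ">300% better" comparison is the paper's).

    linpack_gflops = 14.0            # Linpack performance quoted in the abstract
    power_watts = 185.0              # power draw at load quoted in the abstract
    mflops_per_watt = linpack_gflops * 1000.0 / power_watts
    print(f"{mflops_per_watt:.1f} Mflops/W")   # roughly 75.7 Mflops/W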

  13. The ASCI Network for SC 2000: Gigabyte Per Second Networking

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    PRATT, THOMAS J.; NAEGLE, JOHN H.; MARTINEZ JR., LUIS G.

    2001-11-01

    This document highlights the DISCOM distance computing and communication team's activities at the SC 2000 supercomputing conference in Dallas, Texas. This conference is sponsored by the IEEE and ACM. Sandia's participation in the conference has now spanned a decade; for the last five years, Sandia National Laboratories, Los Alamos National Laboratory, and Lawrence Livermore National Laboratory have come together at the conference under the DOE's ASCI (Accelerated Strategic Computing Initiative) program rubric to demonstrate ASCI's emerging capabilities in computational science and our combined expertise in high-performance computer science and communication networking within the program. At SC 2000, DISCOM2 used this forum to demonstrate an infrastructure including a pre-standard implementation of 10 Gigabit Ethernet, the first gigabyte-per-second IP network data transfer application, and VPN technology that enabled a remote Distributed Resource Management tools demonstration. Additionally, a national OC48 POS network was constructed to support applications running between the show floor and home facilities. This network created the opportunity to test PSE's Parallel File Transfer Protocol (PFTP) across a network with speeds and distances similar to the then-proposed DISCOM WAN. SCinet at SC 2000 showcased wireless networking, and the networking team had the opportunity to explore this emerging technology while on the booth. We also supported the production networking needs of the exhibit. This paper documents those accomplishments, discusses the details of the convention exhibit floor implementation, and describes how these demonstrations support DISCOM's overall strategies in high performance computing and networking.

  14. Vectorized and multitasked solution of the few-group neutron diffusion equations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zee, S.K.; Turinsky, P.J.; Shayer, Z.

    1989-03-01

    A numerical algorithm with parallelism was used to solve the two-group, multidimensional neutron diffusion equations on computers characterized by shared memory, vector pipeline, and multi-CPU architecture features. Specifically, solutions were obtained on the Cray X-MP/48, the IBM-3090 with vector facilities, and the FPS-164. The material-centered mesh finite difference method approximation and outer-inner iteration method were employed. Parallelism was introduced in the inner iterations using the cyclic line successive overrelaxation iterative method and solving in parallel across lines. The outer iterations were completed using the Chebyshev semi-iterative method that allows parallelism to be introduced in both space and energy groups. For the three-dimensional model, power, soluble boron, and transient fission product feedbacks were included. Concentrating on the pressurized water reactor (PWR), the thermal-hydraulic calculation of moderator density assumed single-phase flow and a closed flow channel, allowing parallelism to be introduced in the solution across the radial plane. Using a pinwise detail, quarter-core model of a typical PWR in cycle 1, for the two-dimensional model without feedback the measured million floating point operations per second (MFLOPS)/vector speedups were 83/11.7, 18/2.2, and 2.4/5.6 on the Cray, IBM, and FPS without multitasking, respectively. Lower performance was observed with a coarser mesh, i.e., shorter vector length, due to vector pipeline start-up. For an 18 x 18 x 30 (x-y-z) three-dimensional model with feedback of the same core, MFLOPS/vector speedups of approximately 61/6.7 and an execution time of 0.8 CPU seconds on the Cray without multitasking were measured. Finally, using two CPUs and the vector pipelines of the Cray, a multitasking efficiency of 81% was noted for the three-dimensional model.
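
    The iteration being vectorized and multitasked is a successive over-relaxation (SOR) scheme. The sketch below shows only plain point SOR on a model diffusion problem with assumed boundary conditions and relaxation factor; the record itself uses cyclic line SOR (tridiagonal solves along lines, solved in parallel across lines) with Chebyshev-accelerated outer iterations, which is not reproduced here.

    import numpy as np

    def point_sor(n=16, omega=1.8, sweeps=100):
        """Point SOR for a 2-D Poisson-type model problem with a unit source and
        zero Dirichlet boundaries (an illustrative stand-in for one inner
        iteration of a diffusion solve)."""
        h = 1.0 / (n + 1)
        phi = np.zeros((n + 2, n + 2))
        for _ in range(sweeps):
            for i in range(1, n + 1):
                for j in range(1, n + 1):
                    gauss_seidel = 0.25 * (phi[i - 1, j] + phi[i + 1, j] +
                                           phi[i, j - 1] + phi[i, j + 1] + h * h)
                    phi[i, j] += omega * (gauss_seidel - phi[i, j])
        return phi

    print(point_sor()[8, 8])   # interior value after relaxation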

  15. Finite element analysis in fluids; Proceedings of the Seventh International Conference on Finite Element Methods in Flow Problems, University of Alabama, Huntsville, Apr. 3-7, 1989

    NASA Technical Reports Server (NTRS)

    Chung, T. J. (Editor); Karr, Gerald R. (Editor)

    1989-01-01

    Recent advances in computational fluid dynamics are examined in reviews and reports, with an emphasis on finite-element methods. Sections are devoted to adaptive meshes, atmospheric dynamics, combustion, compressible flows, control-volume finite elements, crystal growth, domain decomposition, EM-field problems, FDM/FEM, and fluid-structure interactions. Consideration is given to free-boundary problems with heat transfer, free surface flow, geophysical flow problems, heat and mass transfer, high-speed flow, incompressible flow, inverse design methods, MHD problems, the mathematics of finite elements, and mesh generation. Also discussed are mixed finite elements, multigrid methods, non-Newtonian fluids, numerical dissipation, parallel vector processing, reservoir simulation, seepage, shallow-water problems, spectral methods, supercomputer architectures, three-dimensional problems, and turbulent flows.

  16. Large-Scale Simulations of Plastic Neural Networks on Neuromorphic Hardware

    PubMed Central

    Knight, James C.; Tully, Philip J.; Kaplan, Bernhard A.; Lansner, Anders; Furber, Steve B.

    2016-01-01

    SpiNNaker is a digital, neuromorphic architecture designed for simulating large-scale spiking neural networks at speeds close to biological real-time. Rather than using bespoke analog or digital hardware, the basic computational unit of a SpiNNaker system is a general-purpose ARM processor, allowing it to be programmed to simulate a wide variety of neuron and synapse models. This flexibility is particularly valuable in the study of biological plasticity phenomena. A recently proposed learning rule based on the Bayesian Confidence Propagation Neural Network (BCPNN) paradigm offers a generic framework for modeling the interaction of different plasticity mechanisms using spiking neurons. However, it can be computationally expensive to simulate large networks with BCPNN learning since it requires multiple state variables for each synapse, each of which needs to be updated every simulation time-step. We discuss the trade-offs in efficiency and accuracy involved in developing an event-based BCPNN implementation for SpiNNaker based on an analytical solution to the BCPNN equations, and detail the steps taken to fit this within the limited computational and memory resources of the SpiNNaker architecture. We demonstrate this learning rule by learning temporal sequences of neural activity within a recurrent attractor network which we simulate at scales of up to 2.0 × 10^4 neurons and 5.1 × 10^7 plastic synapses: the largest plastic neural network ever to be simulated on neuromorphic hardware. We also run a comparable simulation on a Cray XC30 supercomputer system and find that, if it is to match the run-time of our SpiNNaker simulation, the supercomputer system uses approximately 45× more power. This suggests that cheaper, more power-efficient neuromorphic systems are becoming useful discovery tools in the study of plasticity in large-scale brain models. PMID:27092061

  17. Scaling Properties of Algorithms in Nanotechnology

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Bailey, David H.; Chancellor, Marisa K. (Technical Monitor)

    1996-01-01

    At the present time, several technologies are pressing the limits of microminiature manufacturing. In semiconductor technology, for example, the Intel Pentium Pro (which is used in the Department of Energy's ASCI 'red' parallel supercomputer system) and the DEC Alpha 21164 (which is used in the CRAY T3E) both are fabricated using 0.35 micron process technology. Recently Texas Instruments (TI) announced the availability of 0.25 micron technology chips by the end of 1996 and plans to have 0.18 micron devices in production within two years. However, some significant challenges lie down the road. These include the skyrocketing cost of manufacturing plants, the 0.1 micron foreseeable limit of the photolithography process, quantum effects, data communication bandwidth limitations, heat dissipation, and others. Some related microminiature technologies include micro-electromechanical systems (MEMS), opto-electronic devices, quantum computing, biological computing, and others. All of these technologies require the fabrication of devices whose sizes are approaching the nanometer level. As such they are often collectively referred to with the name 'nanotechnology'. Clearly nanotechnology in this general sense is destined to be a very important technology of the 21st century. The ultimate dream in this arena is 'molecular nanotechnology', in other words the fabrication of devices and materials with most or all atoms and molecules in a pre-programmed position, possibly placed there by 'nano-robots'. This futuristic capability will probably not be achieved for at least two decades. However, it appears that somewhat less ambitious variations of molecular nanotechnology, such as devices and materials based on 'buckyballs' and 'nanotubes' may be realized significantly sooner, possibly within ten years or so. Even at the present time, semiconductor devices are approaching the regime where quantum chemical effects must be considered in design.

  18. Computational Investigations of Noise Suppression in Subsonic Round Jets

    NASA Technical Reports Server (NTRS)

    Pruett, C. David

    1997-01-01

    NASA Grant NAG1-1802, originally submitted in June 1996 as a two-year proposal, was awarded one year's funding by NASA LaRC for the period 5 Oct. 1996 through 4 Oct. 1997. Because of the unavailability (from IT at NASA ARC) of sufficient supercomputer time in fiscal 1998 to complete the computational goals of the second year of the original proposal (estimated to be at least 400 Cray C-90 CPU hours), those goals have been appropriately amended, and a new proposal has been submitted to LaRC as a follow-on to NAG1-1802. The current report documents the activities and accomplishments on NAG1-1802 during the one-year period from 5 Oct. 1996 through 4 Oct. 1997. NASA Grant NAG1-1802, and its predecessor, NAG1-1772, have been directed toward adapting the numerical tool of Large-Eddy Simulation (LES) to aeroacoustic applications, with particular focus on noise suppression in subsonic round jets. In LES, the filtered Navier-Stokes equations are solved numerically on a relatively coarse computational grid. Residual stresses, generated by scales of motion too small to be resolved on the coarse grid, are modeled. Although most LES incorporate spatial filtering, time-domain filtering affords certain conceptual and computational advantages, particularly for aeroacoustic applications. Consequently, this work has focused on the development of SubGrid-Scale (SGS) models that incorporate time-domain filters. The author is unaware of any previous attempt at purely time-filtered LES; however, Aldama, and Dakhoul and Bedford, have considered approaches that combine both spatial and temporal filtering. In our view, filtering in both space and time is redundant, because removal of high frequencies effects the removal of small spatial scales and vice versa.

  19. Enabling parallel simulation of large-scale HPC network systems

    DOE PAGES

    Mubarak, Misbah; Carothers, Christopher D.; Ross, Robert B.; ...

    2016-04-07

    Here, with the increasing complexity of today’s high-performance computing (HPC) architectures, simulation has become an indispensable tool for exploring the design space of HPC systems—in particular, networks. In order to make effective design decisions, simulations of these systems must possess the following properties: (1) have high accuracy and fidelity, (2) produce results in a timely manner, and (3) be able to analyze a broad range of network workloads. Most state-of-the-art HPC network simulation frameworks, however, are constrained in one or more of these areas. In this work, we present a simulation framework for modeling two important classes of networks used in today’s IBM and Cray supercomputers: torus and dragonfly networks. We use the Co-Design of Multi-layer Exascale Storage Architecture (CODES) simulation framework to simulate these network topologies at flit-level detail using the Rensselaer Optimistic Simulation System (ROSS) for parallel discrete-event simulation. Our simulation framework meets all the requirements of a practical network simulation and can assist network designers in design space exploration. First, it uses validated and detailed flit-level network models to provide an accurate and high-fidelity network simulation. Second, instead of relying on serial time-stepped or traditional conservative discrete-event simulations that limit simulation scalability and efficiency, we use the optimistic event-scheduling capability of ROSS to achieve efficient and scalable HPC network simulations on today’s high-performance cluster systems. Third, our models give network designers a choice in simulating a broad range of network workloads, including HPC application workloads using detailed network traces, an ability that is rarely offered in parallel with high-fidelity network simulations.
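
    For context, a discrete-event simulation advances by processing timestamped events in order rather than stepping time uniformly. The minimal serial loop below illustrates only that idea; the optimistic, rollback-based parallel scheduling that ROSS provides is not shown, and the handler and link latency are illustrative assumptions.

    import heapq

    def simulate(initial_events, handlers, end_time):
        """Serial discrete-event loop: pop the earliest (time, kind, payload) event,
        call its handler, and push whatever new events the handler returns."""
        queue = list(initial_events)
        heapq.heapify(queue)
        now = 0.0
        while queue and now <= end_time:
            now, kind, payload = heapq.heappop(queue)
            for event in handlers[kind](now, payload):
                heapq.heappush(queue, event)
        return now

    def hop(now, hops_left):
        # Illustrative packet hop: forward with an assumed 0.5 us link latency.
        return [(now + 0.5, "hop", hops_left - 1)] if hops_left > 0 else []

    print(simulate([(0.0, "hop", 3)], {"hop": hop}, end_time=10.0))   # 1.5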

  20. Enabling parallel simulation of large-scale HPC network systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mubarak, Misbah; Carothers, Christopher D.; Ross, Robert B.

    Here, with the increasing complexity of today’s high-performance computing (HPC) architectures, simulation has become an indispensable tool for exploring the design space of HPC systems—in particular, networks. In order to make effective design decisions, simulations of these systems must possess the following properties: (1) have high accuracy and fidelity, (2) produce results in a timely manner, and (3) be able to analyze a broad range of network workloads. Most state-of-the-art HPC network simulation frameworks, however, are constrained in one or more of these areas. In this work, we present a simulation framework for modeling two important classes of networks used in today’s IBM and Cray supercomputers: torus and dragonfly networks. We use the Co-Design of Multi-layer Exascale Storage Architecture (CODES) simulation framework to simulate these network topologies at flit-level detail using the Rensselaer Optimistic Simulation System (ROSS) for parallel discrete-event simulation. Our simulation framework meets all the requirements of a practical network simulation and can assist network designers in design space exploration. First, it uses validated and detailed flit-level network models to provide an accurate and high-fidelity network simulation. Second, instead of relying on serial time-stepped or traditional conservative discrete-event simulations that limit simulation scalability and efficiency, we use the optimistic event-scheduling capability of ROSS to achieve efficient and scalable HPC network simulations on today’s high-performance cluster systems. Third, our models give network designers a choice in simulating a broad range of network workloads, including HPC application workloads using detailed network traces, an ability that is rarely offered in parallel with high-fidelity network simulations.

  1. Geophysics of Small Planetary Bodies

    NASA Technical Reports Server (NTRS)

    Asphaug, Erik I.

    1998-01-01

    As a SETI Institute PI from 1996-1998, Erik Asphaug studied impact and tidal physics and other geophysical processes associated with small (low-gravity) planetary bodies. This work included: a numerical impact simulation linking basaltic achondrite meteorites to asteroid 4 Vesta (Asphaug 1997), which laid the groundwork for an ongoing study of Martian meteorite ejection; cratering and catastrophic evolution of small bodies (with implications for their internal structure; Asphaug et al. 1996); genesis of grooved and degraded terrains in response to impact; maturation of regolith (Asphaug et al. 1997a); and the variation of crater outcome with impact angle, speed, and target structure. Research of impacts into porous, layered and prefractured targets (Asphaug et al. 1997b, 1998a) showed how shape, rheology and structure dramatically affect sizes and velocities of ejecta, and the survivability and impact-modification of comets and asteroids (Asphaug et al. 1998a). As an affiliate of the Galileo SSI Team, the PI studied problems related to cratering, tectonics, and regolith evolution, including an estimate of the impactor flux around Jupiter and the effect of impact on local and regional tectonics (Asphaug et al. 1998b). Other research included tidal breakup modeling (Asphaug and Benz 1996; Schenk et al. 1996), which is leading to a general understanding of the role of tides in planetesimal evolution. As a Guest Computational Investigator for NASA's BPCC/ESS supercomputer testbed, the PI helped graft SPH3D onto an existing tree code tuned for the massively parallel Cray T3E (Olson and Asphaug, in preparation), obtaining a factor of ~1000 speedup in code execution time (on 512 CPUs). Runs which once took months are now completed in hours.

  2. Prospects for Boiling of Subcooled Dielectric Liquids for Supercomputer Cooling

    NASA Astrophysics Data System (ADS)

    Zeigarnik, Yu. A.; Vasil'ev, N. V.; Druzhinin, E. A.; Kalmykov, I. V.; Kosoi, A. S.; Khodakov, K. A.

    2018-02-01

    It is shown experimentally that forced-convection boiling of the dielectric coolant Novec 649 refrigerant, subcooled relative to the saturation temperature, makes it possible to remove heat fluxes of up to 100 W/cm2 from the chip interfaces of modern supercomputers. This creates prerequisites for the application of dielectric liquids in the cooling systems of modern supercomputers with increased requirements for operating reliability.

  3. Parallel performance of TORT on the CRAY J90: Model and measurement

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barnett, A.; Azmy, Y.Y.

    1997-10-01

    A limitation on the parallel performance of TORT on the CRAY J90 is the amount of extra work introduced by the multitasking algorithm itself. The extra work beyond that of the serial version of the code, called overhead, arises from the synchronization of the parallel tasks and the accumulation of results by the master task. The goal of recent updates to TORT was to reduce the time consumed by these activities. To help understand which components of the multitasking algorithm contribute significantly to the overhead, a parallel performance model was constructed and compared to measurements of actual timings of the code.
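
    A simple way to write down such a model is sketched below. The exact terms of the TORT/CRAY J90 model are not given in the abstract, so the synchronization and accumulation terms here are illustrative assumptions in the spirit of the description.

    def predicted_time(t_serial, t_parallel, p, t_sync_per_task, t_accumulate):
        """Predicted wall-clock time on p tasks: unparallelized work, plus divided
        parallel work, plus per-task synchronization, plus the master task's
        accumulation of results (the last two being the 'overhead')."""
        return t_serial + t_parallel / p + t_sync_per_task * p + t_accumulate

    t1 = predicted_time(10.0, 90.0, 1, 0.2, 1.0)
    t8 = predicted_time(10.0, 90.0, 8, 0.2, 1.0)
    print(f"predicted speedup on 8 tasks: {t1 / t8:.2f}")   # about 4.2 with these numbers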

  4. Performance of a plasma fluid code on the Intel parallel computers

    NASA Technical Reports Server (NTRS)

    Lynch, V. E.; Carreras, B. A.; Drake, J. B.; Leboeuf, J. N.; Liewer, P.

    1992-01-01

    One approach to improving the real-time efficiency of plasma turbulence calculations is to use a parallel algorithm. A parallel algorithm for plasma turbulence calculations was tested on the Intel iPSC/860 hypercube and the Touchstone Delta machine. Using the 128 processors of the Intel iPSC/860 hypercube, a factor of 5 improvement over a single-processor CRAY-2 is obtained. For the Touchstone Delta machine, the corresponding improvement factor is 16. For plasma edge turbulence calculations, an extrapolation of the present results to the Intel (sigma) machine gives an improvement factor close to 64 over the single-processor CRAY-2.

  5. Basic JCL for the CRAY-1 operating system (COS) with emphasis on making the transition from CDC 7600/SCOPE

    NASA Technical Reports Server (NTRS)

    Howe, G.; Saunders, D.

    1983-01-01

    Users of the CDC 7600 at Ames are assisted in making the transition to the CRAY-1. Similarities and differences in the basic JCL are summarized, and a dozen or so examples of typical batch jobs for the two systems are shown in parallel. Some changes to look for in FORTRAN programs and in the use of UPDATE are also indicated. No attempt is made to cover magnetic tape handling. The material here should not be considered a substitute for reading the more conventional manuals or the User's Guide for the Advanced Computational Facility, available from the Computer Information Center.

  6. Theoretical research program to study chemical reactions in AOTV bow shock tubes

    NASA Technical Reports Server (NTRS)

    Taylor, P.

    1986-01-01

    Progress in the development of computational methods for the characterization of chemical reactions in aerobraking orbit transfer vehicle (AOTV) propulsive flows is reported. Two main areas of code development were undertaken: (1) the implementation of CASSCF (complete active space self-consistent field) and SCF (self-consistent field) analytical first derivatives on the CRAY X-MP; and (2) the installation of the complete set of electronic structure codes on the CRAY 2. In the area of application calculations the main effort was devoted to performing full configuration-interaction calculations and using these results to benchmark other methods. Preprints describing some of the systems studied are included.

  7. Comparing the Performance of Blue Gene/Q with Leading Cray XE6 and InfiniBand Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kerbyson, Darren J.; Barker, Kevin J.; Vishnu, Abhinav

    2013-01-21

    Three types of systems dominate the current High Performance Computing landscape: the Cray XE6, the IBM Blue Gene, and commodity clusters using InfiniBand. These systems have quite different characteristics, making the choice for a particular deployment difficult. The XE6 uses Cray’s proprietary Gemini 3-D torus interconnect with two nodes at each network endpoint. The latest IBM Blue Gene/Q uses a single socket integrating processor and communication in a 5-D torus network. InfiniBand provides the flexibility of using nodes from many vendors connected in many possible topologies. The performance characteristics of each vary vastly, along with their utilization models. In this work we compare the performance of these three systems using a combination of micro-benchmarks and a set of production applications. In particular, we discuss the causes of variability in performance across the systems and also quantify where performance is lost using a combination of measurements and models. Our results show that significant performance can be lost in normal production operation of the Cray XE6 and InfiniBand clusters in comparison to Blue Gene/Q.

  8. A comparison of the Cray-2 performance before and after the installation of memory pseudo-banking

    NASA Technical Reports Server (NTRS)

    Schmickley, Ronald D.; Bailey, David H.

    1987-01-01

    A suite of 13 large Fortran benchmark codes was run on a Cray-2 configured with memory pseudo-banking circuits, and floating point operation rates were measured for each under a variety of system load configurations. These were compared with similar flop measurements taken on the same system before installation of the pseudo-banking. A useful memory access efficiency parameter was defined and calculated for both sets of performance rates, allowing a crude quantitative measure of the improvement in efficiency due to pseudo-banking. Programs were categorized as either highly scalar (S) or highly vectorized (V) and either memory-intensive or register-intensive, giving four categories: S-memory, S-register, V-memory, and V-register. Using flop rates as a simple quantifier of these four categories, a scatter plot of efficiency gain vs. Mflops roughly illustrates the improvement in floating point processing speed due to pseudo-banking. On the Cray-2 system tested, this improvement ranged from 1 percent for S-memory codes to about 12 percent for V-memory codes. No significant gains were made for V-register codes, which was to be expected.
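
    A minimal sketch of the kind of before/after comparison described follows. The paper's actual memory access efficiency parameter is defined in the full text rather than in this abstract, so the simple percentage gain and the sample rates below are assumptions used only to illustrate the comparison.

    def percent_gain(mflops_before, mflops_after):
        """Relative improvement in measured floating-point rate after the
        pseudo-banking installation."""
        return 100.0 * (mflops_after - mflops_before) / mflops_before

    # Hypothetical V-memory code: rates are illustrative, not measured values.
    print(f"{percent_gain(250.0, 280.0):.0f}% gain")   # 12% gain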

  9. National Test Facility civilian agency use of supercomputers not feasible

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    NONE

    1994-12-01

    Based on interviews with civilian agencies cited in the House report (DOE, DoEd, HHS, FEMA, NOAA), none would be able to make effective use of NTF's excess supercomputing capabilities. These agencies stated they could not use the resources primarily because (1) NTF's supercomputers are older machines whose performance and costs cannot match those of more advanced computers available from other sources and (2) some agencies have not yet developed applications requiring supercomputer capabilities or do not have funding to support such activities. In addition, future support for the hardware and software at NTF is uncertain, making any investment by an outside user risky.

  10. Kriging for Spatial-Temporal Data on the Bridges Supercomputer

    NASA Astrophysics Data System (ADS)

    Hodgess, E. M.

    2017-12-01

    Currently, kriging of spatial-temporal data is slow and limited to relatively small vector sizes. We have developed a method on the Bridges supercomputer at the Pittsburgh Supercomputing Center which uses a combination of R, Fortran, the Message Passing Interface (MPI), OpenACC, and special R packages for big data. This combination of tools now permits us to complete tasks that previously could not be completed, or that took literally hours to complete. We ran simulation studies comparing a laptop against the supercomputer, and we also examined "real world" data sets, such as the Irish wind data and some weather data, comparing the timings. We note that the timings are surprisingly good.
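
    As a reminder of what the kriging predictor computes, a minimal serial ordinary-kriging sketch follows. The covariance model, its parameters, and the sample data are illustrative assumptions, and none of the R/Fortran/MPI/OpenACC machinery from the record is shown.

    import numpy as np

    def ordinary_kriging(coords, values, query, length_scale=1.0, sill=1.0):
        """Ordinary kriging at a single query point with an assumed exponential
        covariance: solve [C 1; 1^T 0][w; mu] = [c0; 1] and return w . values."""
        cov = lambda d: sill * np.exp(-d / length_scale)
        n = len(values)
        dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
        K = np.zeros((n + 1, n + 1))
        K[:n, :n] = cov(dists)
        K[:n, n] = K[n, :n] = 1.0
        rhs = np.append(cov(np.linalg.norm(coords - query, axis=1)), 1.0)
        weights = np.linalg.solve(K, rhs)[:n]
        return weights @ values

    coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # assumed sample locations
    values = np.array([1.0, 2.0, 3.0])                        # assumed observations
    print(ordinary_kriging(coords, values, np.array([0.5, 0.5])))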

  11. Multiple DNA and protein sequence alignment on a workstation and a supercomputer.

    PubMed

    Tajima, K

    1988-11-01

    This paper describes a multiple alignment method using a workstation and a supercomputer. The method is based on aligning a set of already-aligned sequences with a new sequence and applies this alignment step recursively. The alignment executes in reasonable computation time on platforms ranging from a workstation to a supercomputer, judged both by the quality of the alignment results and by the computational speed gained from parallel processing. The application of the algorithm is illustrated by several examples of multiple alignment of 12 amino acid and DNA sequences of HIV (human immunodeficiency virus) env genes. Colour graphics programs on a workstation and parallel processing on a supercomputer are discussed.
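
    The recursive procedure described builds on pairwise global alignment. The sketch below shows only that underlying dynamic-programming step (Needleman-Wunsch) with illustrative scoring parameters; the paper's recursion of a new sequence against an already-aligned set, and its parallel decomposition, are not reproduced.

    def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
        """Global pairwise alignment by dynamic programming; scoring values are
        illustrative assumptions."""
        n, m = len(a), len(b)
        score = [[0] * (m + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            score[i][0] = i * gap
        for j in range(1, m + 1):
            score[0][j] = j * gap
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                sub = match if a[i - 1] == b[j - 1] else mismatch
                score[i][j] = max(score[i - 1][j - 1] + sub,
                                  score[i - 1][j] + gap,
                                  score[i][j - 1] + gap)
        # Trace back one optimal alignment.
        out_a, out_b, i, j = [], [], n, m
        while i > 0 or j > 0:
            sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
            if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
            elif i > 0 and score[i][j] == score[i - 1][j] + gap:
                out_a.append(a[i - 1]); out_b.append('-'); i -= 1
            else:
                out_a.append('-'); out_b.append(b[j - 1]); j -= 1
        return ''.join(reversed(out_a)), ''.join(reversed(out_b))

    print(needleman_wunsch("GATTACA", "GCATGCU"))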

  12. Supercomputing in Aerospace

    NASA Technical Reports Server (NTRS)

    Kutler, Paul; Yee, Helen

    1987-01-01

    Topics addressed include: numerical aerodynamic simulation; computational mechanics; supercomputers; aerospace propulsion systems; computational modeling in ballistics; turbulence modeling; computational chemistry; computational fluid dynamics; and computational astrophysics.

  13. INS3D - NUMERICAL SOLUTION OF THE INCOMPRESSIBLE NAVIER-STOKES EQUATIONS IN THREE-DIMENSIONAL GENERALIZED CURVILINEAR COORDINATES (DEC RISC ULTRIX VERSION)

    NASA Technical Reports Server (NTRS)

    Biyabani, S. R.

    1994-01-01

    INS3D computes steady-state solutions to the incompressible Navier-Stokes equations. The INS3D approach utilizes pseudo-compressibility combined with an approximate factorization scheme. This computational fluid dynamics (CFD) code has been verified on problems such as flow through a channel, flow over a backward-facing step and flow over a circular cylinder. Three-dimensional cases include flow over an ogive cylinder, flow through a rectangular duct, wind tunnel inlet flow, cylinder-wall juncture flow and flow through multiple posts mounted between two plates. INS3D uses a pseudo-compressibility approach in which a time derivative of pressure is added to the continuity equation, which together with the momentum equations form a set of four equations with pressure and velocity as the dependent variables. The equations' coordinates are transformed for general three dimensional applications. The equations are advanced in time by the implicit, non-iterative, approximately-factored, finite-difference scheme of Beam and Warming. The numerical stability of the scheme depends on the use of higher-order smoothing terms to damp out higher-frequency oscillations caused by second-order central differencing. The artificial compressibility introduces pressure (sound) waves of finite speed (whereas the speed of sound would be infinite in an incompressible fluid). As the solution converges, these pressure waves die out, causing the derivative of pressure with respect to time to approach zero. Thus, continuity is satisfied for the incompressible fluid in the steady state. Computational efficiency is achieved using a diagonal algorithm. A block tri-diagonal option is also available. When a steady-state solution is reached, the modified continuity equation will satisfy the divergence-free velocity field condition. INS3D is capable of handling several different types of boundaries encountered in numerical simulations, including solid-surface, inflow and outflow, and far-field boundaries. Three machine versions of INS3D are available. INS3D for the CRAY is written in CRAY FORTRAN for execution on a CRAY X-MP under COS, INS3D for the IBM is written in FORTRAN 77 for execution on an IBM 3090 under the VM or MVS operating system, and INS3D for DEC RISC-based systems is written in RISC FORTRAN for execution on a DEC workstation running RISC ULTRIX 3.1 or later. The CRAY version has a central memory requirement of 730279 words. The central memory requirement for the IBM is 150 Mb. The memory requirement for the DEC RISC ULTRIX version is 3 Mb of main memory. INS3D was developed in 1987. The port to the IBM was done in 1990. The port to the DECstation 3100 was done in 1991. CRAY is a registered trademark of Cray Research Inc. IBM is a registered trademark of International Business Machines. DEC, DECstation, and ULTRIX are trademarks of the Digital Equipment Corporation.
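
    The pseudo-compressibility idea described above is conventionally written as follows; this is the standard artificial-compressibility form, and the exact notation used inside INS3D is assumed rather than quoted.

    % Continuity gains a pseudo-time derivative of pressure, scaled by the
    % artificial-compressibility parameter beta; momentum is unchanged in form.
    \begin{aligned}
      \frac{\partial p}{\partial \tau} + \beta \, \frac{\partial u_j}{\partial x_j} &= 0,\\
      \frac{\partial u_i}{\partial \tau} + \frac{\partial (u_i u_j)}{\partial x_j}
        &= -\frac{\partial p}{\partial x_i}
           + \nu \, \frac{\partial^2 u_i}{\partial x_j \, \partial x_j}.
    \end{aligned}
    % As the solution converges, \partial p / \partial \tau -> 0 and the
    % divergence-free condition on the velocity field is recovered.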

  14. PLOT3D/AMES, APOLLO UNIX VERSION USING GMR3D (WITHOUT TURB3D)

    NASA Technical Reports Server (NTRS)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into five groups: 1) Grid Functions for grids, grid-checking, etc.; 2) Scalar Functions for contour or carpet plots of density, pressure, temperature, Mach number, vorticity magnitude, helicity, etc.; 3) Vector Functions for vector plots of velocity, vorticity, momentum, and density gradient, etc.; 4) Particle Trace Functions for rake-like plots of particle flow or vortex lines; and 5) Shock locations based on pressure gradient. TURB3D is a modification of PLOT3D which is used for viewing CFD simulations of incompressible turbulent flow. Input flow data consists of pressure, velocity and vorticity. Typical quantities to plot include local fluctuations in flow quantities and turbulent production terms, plotted in physical or wall units. PLOT3D/TURB3D includes both TURB3D and PLOT3D because the operation of TURB3D is identical to PLOT3D, and there is no additional sample data or printed documentation for TURB3D. Graphical capabilities of PLOT3D version 3.6b+ vary among the implementations available through COSMIC. Customers are encouraged to purchase and carefully review the PLOT3D manual before ordering the program for a specific computer and graphics library. There is only one manual for use with all implementations of PLOT3D, and although this manual generally assumes that the Silicon Graphics Iris implementation is being used, informative comments concerning other implementations appear throughout the text. 
With all implementations, the visual representation of the object and flow field created by PLOT3D consists of points, lines, and polygons. Points can be represented with dots or symbols, color can be used to denote data values, and perspective is used to show depth. Differences among implementations impact the program's ability to use graphical features that are based on 3D polygons, the user's ability to manipulate the graphical displays, and the user's ability to obtain alternate forms of output. The Apollo implementation of PLOT3D uses some of the capabilities of Apollo's 3-dimensional graphics hardware, but does not take advantage of the shading and hidden line/surface removal capabilities of the Apollo DN10000. Although this implementation does not offer a capability for putting text on plots, it does support the use of a mouse to translate, rotate, or zoom in on views. The version 3.6b+ Apollo implementations of PLOT3D (ARC-12789) and PLOT3D/TURB3D (ARC-12785) were developed for use on Apollo computers running UNIX System V with BSD 4.3 extensions and the graphics library GMR3D Version 2.0. The standard distribution media for each of these programs is a 9-track, 6250 bpi magnetic tape in TAR format. Customers purchasing one implementation version of PLOT3D or PLOT3D/TURB3D will be given a $200 discount on each additional implementation version ordered at the same time. Version 3.6b+ of PLOT3D and PLOT3D/TURB3D are also supported for the following computers and graphics libraries: 1) generic UNIX Supercomputer and IRIS, suitable for CRAY 2/UNICOS, CONVEX, and Alliant with remote IRIS 2xxx/3xxx or IRIS 4D (ARC-12779, ARC-12784); 2) VAX computers running VMS Version 5.0 and DISSPLA Version 11.0 (ARC-12777, ARC-12781); 3) generic UNIX and DISSPLA Version 11.0 (ARC-12788, ARC-12778); and (4) Silicon Graphics IRIS 2xxx/3xxx or IRIS 4D workstations (ARC-12783, ARC-12782). Silicon Graphics Iris, IRIS 4D, and IRIS 2xxx/3xxx are trademarks of Silicon Graphics Incorporated. VAX and VMS are trademarks of Digital Electronics Corporation. DISSPLA is a trademark of Computer Associates. CRAY 2 and UNICOS are trademarks of CRAY Research, Incorporated. CONVEX is a trademark of Convex Computer Corporation. Alliant is a trademark of Alliant. Apollo and GMR3D are trademarks of Hewlett-Packard, Incorporated. UNIX is a registered trademark of AT&T.

  15. PLOT3D/AMES, SGI IRIS VERSION (WITHOUT TURB3D)

    NASA Technical Reports Server (NTRS)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into five groups: 1) Grid Functions for grids, grid-checking, etc.; 2) Scalar Functions for contour or carpet plots of density, pressure, temperature, Mach number, vorticity magnitude, helicity, etc.; 3) Vector Functions for vector plots of velocity, vorticity, momentum, and density gradient, etc.; 4) Particle Trace Functions for rake-like plots of particle flow or vortex lines; and 5) Shock locations based on pressure gradient. TURB3D is a modification of PLOT3D which is used for viewing CFD simulations of incompressible turbulent flow. Input flow data consists of pressure, velocity and vorticity. Typical quantities to plot include local fluctuations in flow quantities and turbulent production terms, plotted in physical or wall units. PLOT3D/TURB3D includes both TURB3D and PLOT3D because the operation of TURB3D is identical to PLOT3D, and there is no additional sample data or printed documentation for TURB3D. Graphical capabilities of PLOT3D version 3.6b+ vary among the implementations available through COSMIC. Customers are encouraged to purchase and carefully review the PLOT3D manual before ordering the program for a specific computer and graphics library. There is only one manual for use with all implementations of PLOT3D, and although this manual generally assumes that the Silicon Graphics Iris implementation is being used, informative comments concerning other implementations appear throughout the text. 
With all implementations, the visual representation of the object and flow field created by PLOT3D consists of points, lines, and polygons. Points can be represented with dots or symbols, color can be used to denote data values, and perspective is used to show depth. Differences among implementations impact the program's ability to use graphical features that are based on 3D polygons, the user's ability to manipulate the graphical displays, and the user's ability to obtain alternate forms of output. In each of these areas, the IRIS implementation of PLOT3D offers advanced features which aid visualization efforts. Shading and hidden line/surface removal can be used to enhance depth perception and other aspects of the graphical displays. A mouse can be used to translate, rotate, or zoom in on views. Files for several types of output can be produced. Two animation options are even offered: creation of simple animation sequences without the need for other software; and, creation of files for use in GAS (Graphics Animation System, ARC-12379), an IRIS program which offers more complex rendering and animation capabilities and can record images to digital disk, video tape, or 16-mm film. The version 3.6b+ SGI implementations of PLOT3D (ARC-12783) and PLOT3D/TURB3D (ARC-12782) were developed for use on Silicon Graphics IRIS 2xxx/3xxx or IRIS 4D workstations. These programs are each distributed on one .25 inch magnetic tape cartridge in IRIS TAR format. Customers purchasing one implementation version of PLOT3D or PLOT3D/TURB3D will be given a $200 discount on each additional implementation version ordered at the same time. Version 3.6b+ of PLOT3D and PLOT3D/TURB3D are also supported for the following computers and graphics libraries: (1) generic UNIX Supercomputer and IRIS, suitable for CRAY 2/UNICOS, CONVEX, and Alliant with remote IRIS 2xxx/3xxx or IRIS 4D (ARC-12779, ARC-12784); (2) VAX computers running VMS Version 5.0 and DISSPLA Version 11.0 (ARC-12777,ARC-12781); (3) generic UNIX and DISSPLA Version 11.0 (ARC-12788, ARC-12778); and (4) Apollo computers running UNIX and GMR3D Version 2.0 (ARC-12789, ARC-12785 which have no capabilities to put text on plots). Silicon Graphics Iris, IRIS 4D, and IRIS 2xxx/3xxx are trademarks of Silicon Graphics Incorporated. VAX and VMS are trademarks of Digital Electronics Corporation. DISSPLA is a trademark of Computer Associates. CRAY 2 and UNICOS are trademarks of CRAY Research, Incorporated. CONVEX is a trademark of Convex Computer Corporation. Alliant is a trademark of Alliant. Apollo and GMR3D are trademarks of Hewlett-Packard, Incorporated. UNIX is a registered trademark of AT&T.

  16. PLOT3D/AMES, GENERIC UNIX VERSION USING DISSPLA (WITH TURB3D)

    NASA Technical Reports Server (NTRS)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into five groups: 1) Grid Functions for grids, grid-checking, etc.; 2) Scalar Functions for contour or carpet plots of density, pressure, temperature, Mach number, vorticity magnitude, helicity, etc.; 3) Vector Functions for vector plots of velocity, vorticity, momentum, and density gradient, etc.; 4) Particle Trace Functions for rake-like plots of particle flow or vortex lines; and 5) Shock locations based on pressure gradient. TURB3D is a modification of PLOT3D which is used for viewing CFD simulations of incompressible turbulent flow. Input flow data consists of pressure, velocity and vorticity. Typical quantities to plot include local fluctuations in flow quantities and turbulent production terms, plotted in physical or wall units. PLOT3D/TURB3D includes both TURB3D and PLOT3D because the operation of TURB3D is identical to PLOT3D, and there is no additional sample data or printed documentation for TURB3D. Graphical capabilities of PLOT3D version 3.6b+ vary among the implementations available through COSMIC. Customers are encouraged to purchase and carefully review the PLOT3D manual before ordering the program for a specific computer and graphics library. There is only one manual for use with all implementations of PLOT3D, and although this manual generally assumes that the Silicon Graphics Iris implementation is being used, informative comments concerning other implementations appear throughout the text. 
With all implementations, the visual representation of the object and flow field created by PLOT3D consists of points, lines, and polygons. Points can be represented with dots or symbols, color can be used to denote data values, and perspective is used to show depth. Differences among implementations impact the program's ability to use graphical features that are based on 3D polygons, the user's ability to manipulate the graphical displays, and the user's ability to obtain alternate forms of output. The UNIX/DISSPLA implementation of PLOT3D supports 2-D polygons as well as 2-D and 3-D lines, but does not support graphics features requiring 3-D polygons (shading and hidden line removal, for example). Views can be manipulated using keyboard commands. This version of PLOT3D is potentially able to produce files for a variety of output devices; however, site-specific capabilities will vary depending on the device drivers supplied with the user's DISSPLA library. The version 3.6b+ UNIX/DISSPLA implementations of PLOT3D (ARC-12788) and PLOT3D/TURB3D (ARC-12778) were developed for use on computers running UNIX SYSTEM 5 with BSD 4.3 extensions. The standard distribution media for each of these programs is a 9-track, 6250 bpi magnetic tape in TAR format. Customers purchasing one implementation version of PLOT3D or PLOT3D/TURB3D will be given a $200 discount on each additional implementation version ordered at the same time. Version 3.6b+ of PLOT3D and PLOT3D/TURB3D are also supported for the following computers and graphics libraries: (1) generic UNIX Supercomputer and IRIS, suitable for CRAY 2/UNICOS, CONVEX, Alliant with remote IRIS 2xxx/3xxx or IRIS 4D (ARC-12779, ARC-12784); (2) Silicon Graphics IRIS 2xxx/3xxx or IRIS 4D (ARC-12783, ARC-12782); (3) VAX computers running VMS Version 5.0 and DISSPLA Version 11.0 (ARC-12777, ARC-12781); and (4) Apollo computers running UNIX and GMR3D Version 2.0 (ARC-12789, ARC-12785 which have no capabilities to put text on plots). Silicon Graphics Iris, IRIS 4D, and IRIS 2xxx/3xxx are trademarks of Silicon Graphics Incorporated. VAX and VMS are trademarks of Digital Electronics Corporation. DISSPLA is a trademark of Computer Associates. CRAY 2 and UNICOS are trademarks of CRAY Research, Incorporated. CONVEX is a trademark of Convex Computer Corporation. Alliant is a trademark of Alliant. Apollo and GMR3D are trademarks of Hewlett-Packard, Incorporated. System 5 is a trademark of Bell Labs, Incorporated. BSD4.3 is a trademark of the University of California at Berkeley. UNIX is a registered trademark of AT&T.

  17. PLOT3D/AMES, APOLLO UNIX VERSION USING GMR3D (WITH TURB3D)

    NASA Technical Reports Server (NTRS)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into five groups: 1) Grid Functions for grids, grid-checking, etc.; 2) Scalar Functions for contour or carpet plots of density, pressure, temperature, Mach number, vorticity magnitude, helicity, etc.; 3) Vector Functions for vector plots of velocity, vorticity, momentum, and density gradient, etc.; 4) Particle Trace Functions for rake-like plots of particle flow or vortex lines; and 5) Shock locations based on pressure gradient. TURB3D is a modification of PLOT3D which is used for viewing CFD simulations of incompressible turbulent flow. Input flow data consists of pressure, velocity and vorticity. Typical quantities to plot include local fluctuations in flow quantities and turbulent production terms, plotted in physical or wall units. PLOT3D/TURB3D includes both TURB3D and PLOT3D because the operation of TURB3D is identical to PLOT3D, and there is no additional sample data or printed documentation for TURB3D. Graphical capabilities of PLOT3D version 3.6b+ vary among the implementations available through COSMIC. Customers are encouraged to purchase and carefully review the PLOT3D manual before ordering the program for a specific computer and graphics library. There is only one manual for use with all implementations of PLOT3D, and although this manual generally assumes that the Silicon Graphics Iris implementation is being used, informative comments concerning other implementations appear throughout the text. 
With all implementations, the visual representation of the object and flow field created by PLOT3D consists of points, lines, and polygons. Points can be represented with dots or symbols, color can be used to denote data values, and perspective is used to show depth. Differences among implementations impact the program's ability to use graphical features that are based on 3D polygons, the user's ability to manipulate the graphical displays, and the user's ability to obtain alternate forms of output. The Apollo implementation of PLOT3D uses some of the capabilities of Apollo's 3-dimensional graphics hardware, but does not take advantage of the shading and hidden line/surface removal capabilities of the Apollo DN10000. Although this implementation does not offer a capability for putting text on plots, it does support the use of a mouse to translate, rotate, or zoom in on views. The version 3.6b+ Apollo implementations of PLOT3D (ARC-12789) and PLOT3D/TURB3D (ARC-12785) were developed for use on Apollo computers running UNIX System V with BSD 4.3 extensions and the graphics library GMR3D Version 2.0. The standard distribution media for each of these programs is a 9-track, 6250 bpi magnetic tape in TAR format. Customers purchasing one implementation version of PLOT3D or PLOT3D/TURB3D will be given a $200 discount on each additional implementation version ordered at the same time. Version 3.6b+ of PLOT3D and PLOT3D/TURB3D are also supported for the following computers and graphics libraries: 1) generic UNIX Supercomputer and IRIS, suitable for CRAY 2/UNICOS, CONVEX, and Alliant with remote IRIS 2xxx/3xxx or IRIS 4D (ARC-12779, ARC-12784); 2) VAX computers running VMS Version 5.0 and DISSPLA Version 11.0 (ARC-12777, ARC-12781); 3) generic UNIX and DISSPLA Version 11.0 (ARC-12788, ARC-12778); and (4) Silicon Graphics IRIS 2xxx/3xxx or IRIS 4D workstations (ARC-12783, ARC-12782). Silicon Graphics Iris, IRIS 4D, and IRIS 2xxx/3xxx are trademarks of Silicon Graphics Incorporated. VAX and VMS are trademarks of Digital Electronics Corporation. DISSPLA is a trademark of Computer Associates. CRAY 2 and UNICOS are trademarks of CRAY Research, Incorporated. CONVEX is a trademark of Convex Computer Corporation. Alliant is a trademark of Alliant. Apollo and GMR3D are trademarks of Hewlett-Packard, Incorporated. UNIX is a registered trademark of AT&T.
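    As a concrete illustration of the solution-file layout described above, the following Python sketch (not part of the PLOT3D distribution) reads a multi-grid PLOT3D solution ("Q") file in its plain ASCII variant, assuming the common whole-format layout: the number of grids, the (ni, nj, nk) dimensions of each grid, and then, per grid, four free-stream reference values followed by the five conserved variables (density, x-, y- and z-momentum, and stagnation energy) at every point. Binary and single-grid variants differ, so treat this as a hedged sketch rather than a definitive reader.

        import numpy as np

        def read_plot3d_q_ascii(path):
            # PLOT3D ASCII files are free-format; read whitespace-separated tokens.
            tokens = iter(open(path).read().split())
            nxt = lambda: next(tokens)
            ngrids = int(nxt())
            dims = [(int(nxt()), int(nxt()), int(nxt())) for _ in range(ngrids)]
            grids = []
            for ni, nj, nk in dims:
                refs = [float(nxt()) for _ in range(4)]       # Mach, alpha, Reynolds, time
                flat = np.array([float(nxt()) for _ in range(5 * ni * nj * nk)])
                # Values are written i-fastest, variable-slowest (Fortran ordering).
                q = flat.reshape((ni, nj, nk, 5), order='F')
                grids.append({'dims': (ni, nj, nk), 'refs': refs, 'q': q})
            return grids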

  18. PLOT3D/AMES, SGI IRIS VERSION (WITH TURB3D)

    NASA Technical Reports Server (NTRS)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into five groups: 1) Grid Functions for grids, grid-checking, etc.; 2) Scalar Functions for contour or carpet plots of density, pressure, temperature, Mach number, vorticity magnitude, helicity, etc.; 3) Vector Functions for vector plots of velocity, vorticity, momentum, and density gradient, etc.; 4) Particle Trace Functions for rake-like plots of particle flow or vortex lines; and 5) Shock locations based on pressure gradient. TURB3D is a modification of PLOT3D which is used for viewing CFD simulations of incompressible turbulent flow. Input flow data consists of pressure, velocity and vorticity. Typical quantities to plot include local fluctuations in flow quantities and turbulent production terms, plotted in physical or wall units. PLOT3D/TURB3D includes both TURB3D and PLOT3D because the operation of TURB3D is identical to PLOT3D, and there is no additional sample data or printed documentation for TURB3D. Graphical capabilities of PLOT3D version 3.6b+ vary among the implementations available through COSMIC. Customers are encouraged to purchase and carefully review the PLOT3D manual before ordering the program for a specific computer and graphics library. There is only one manual for use with all implementations of PLOT3D, and although this manual generally assumes that the Silicon Graphics Iris implementation is being used, informative comments concerning other implementations appear throughout the text. 
With all implementations, the visual representation of the object and flow field created by PLOT3D consists of points, lines, and polygons. Points can be represented with dots or symbols, color can be used to denote data values, and perspective is used to show depth. Differences among implementations impact the program's ability to use graphical features that are based on 3D polygons, the user's ability to manipulate the graphical displays, and the user's ability to obtain alternate forms of output. In each of these areas, the IRIS implementation of PLOT3D offers advanced features which aid visualization efforts. Shading and hidden line/surface removal can be used to enhance depth perception and other aspects of the graphical displays. A mouse can be used to translate, rotate, or zoom in on views. Files for several types of output can be produced. Two animation options are even offered: creation of simple animation sequences without the need for other software; and, creation of files for use in GAS (Graphics Animation System, ARC-12379), an IRIS program which offers more complex rendering and animation capabilities and can record images to digital disk, video tape, or 16-mm film. The version 3.6b+ SGI implementations of PLOT3D (ARC-12783) and PLOT3D/TURB3D (ARC-12782) were developed for use on Silicon Graphics IRIS 2xxx/3xxx or IRIS 4D workstations. These programs are each distributed on one .25 inch magnetic tape cartridge in IRIS TAR format. Customers purchasing one implementation version of PLOT3D or PLOT3D/TURB3D will be given a $200 discount on each additional implementation version ordered at the same time. Version 3.6b+ of PLOT3D and PLOT3D/TURB3D are also supported for the following computers and graphics libraries: (1) generic UNIX Supercomputer and IRIS, suitable for CRAY 2/UNICOS, CONVEX, and Alliant with remote IRIS 2xxx/3xxx or IRIS 4D (ARC-12779, ARC-12784); (2) VAX computers running VMS Version 5.0 and DISSPLA Version 11.0 (ARC-12777,ARC-12781); (3) generic UNIX and DISSPLA Version 11.0 (ARC-12788, ARC-12778); and (4) Apollo computers running UNIX and GMR3D Version 2.0 (ARC-12789, ARC-12785 which have no capabilities to put text on plots). Silicon Graphics Iris, IRIS 4D, and IRIS 2xxx/3xxx are trademarks of Silicon Graphics Incorporated. VAX and VMS are trademarks of Digital Electronics Corporation. DISSPLA is a trademark of Computer Associates. CRAY 2 and UNICOS are trademarks of CRAY Research, Incorporated. CONVEX is a trademark of Convex Computer Corporation. Alliant is a trademark of Alliant. Apollo and GMR3D are trademarks of Hewlett-Packard, Incorporated. UNIX is a registered trademark of AT&T.

  19. PLOT3D/AMES, GENERIC UNIX VERSION USING DISSPLA (WITHOUT TURB3D)

    NASA Technical Reports Server (NTRS)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into five groups: 1) Grid Functions for grids, grid-checking, etc.; 2) Scalar Functions for contour or carpet plots of density, pressure, temperature, Mach number, vorticity magnitude, helicity, etc.; 3) Vector Functions for vector plots of velocity, vorticity, momentum, and density gradient, etc.; 4) Particle Trace Functions for rake-like plots of particle flow or vortex lines; and 5) Shock locations based on pressure gradient. TURB3D is a modification of PLOT3D which is used for viewing CFD simulations of incompressible turbulent flow. Input flow data consists of pressure, velocity and vorticity. Typical quantities to plot include local fluctuations in flow quantities and turbulent production terms, plotted in physical or wall units. PLOT3D/TURB3D includes both TURB3D and PLOT3D because the operation of TURB3D is identical to PLOT3D, and there is no additional sample data or printed documentation for TURB3D. Graphical capabilities of PLOT3D version 3.6b+ vary among the implementations available through COSMIC. Customers are encouraged to purchase and carefully review the PLOT3D manual before ordering the program for a specific computer and graphics library. There is only one manual for use with all implementations of PLOT3D, and although this manual generally assumes that the Silicon Graphics Iris implementation is being used, informative comments concerning other implementations appear throughout the text. 
With all implementations, the visual representation of the object and flow field created by PLOT3D consists of points, lines, and polygons. Points can be represented with dots or symbols, color can be used to denote data values, and perspective is used to show depth. Differences among implementations impact the program's ability to use graphical features that are based on 3D polygons, the user's ability to manipulate the graphical displays, and the user's ability to obtain alternate forms of output. The UNIX/DISSPLA implementation of PLOT3D supports 2-D polygons as well as 2-D and 3-D lines, but does not support graphics features requiring 3-D polygons (shading and hidden line removal, for example). Views can be manipulated using keyboard commands. This version of PLOT3D is potentially able to produce files for a variety of output devices; however, site-specific capabilities will vary depending on the device drivers supplied with the user's DISSPLA library. The version 3.6b+ UNIX/DISSPLA implementations of PLOT3D (ARC-12788) and PLOT3D/TURB3D (ARC-12778) were developed for use on computers running UNIX System V with BSD 4.3 extensions. The standard distribution media for each of these programs is a 9-track, 6250 bpi magnetic tape in TAR format. Customers purchasing one implementation version of PLOT3D or PLOT3D/TURB3D will be given a $200 discount on each additional implementation version ordered at the same time. Version 3.6b+ of PLOT3D and PLOT3D/TURB3D are also supported for the following computers and graphics libraries: (1) generic UNIX Supercomputer and IRIS, suitable for CRAY 2/UNICOS, CONVEX, Alliant with remote IRIS 2xxx/3xxx or IRIS 4D (ARC-12779, ARC-12784); (2) Silicon Graphics IRIS 2xxx/3xxx or IRIS 4D (ARC-12783, ARC-12782); (3) VAX computers running VMS Version 5.0 and DISSPLA Version 11.0 (ARC-12777, ARC-12781); and (4) Apollo computers running UNIX and GMR3D Version 2.0 (ARC-12789, ARC-12785, which have no capabilities to put text on plots). Silicon Graphics Iris, IRIS 4D, and IRIS 2xxx/3xxx are trademarks of Silicon Graphics Incorporated. VAX and VMS are trademarks of Digital Electronics Corporation. DISSPLA is a trademark of Computer Associates. CRAY 2 and UNICOS are trademarks of CRAY Research, Incorporated. CONVEX is a trademark of Convex Computer Corporation. Alliant is a trademark of Alliant. Apollo and GMR3D are trademarks of Hewlett-Packard, Incorporated. System 5 is a trademark of Bell Labs, Incorporated. BSD4.3 is a trademark of the University of California at Berkeley. UNIX is a registered trademark of AT&T.

  20. Desktop supercomputer: what can it do?

    NASA Astrophysics Data System (ADS)

    Bogdanov, A.; Degtyarev, A.; Korkhov, V.

    2017-12-01

    The paper addresses the issues of solving complex problems that require supercomputers or multiprocessor clusters, which are available to most researchers nowadays. Efficient distribution of high-performance computing resources according to actual application needs has been a major research topic since high-performance computing (HPC) technologies became widely introduced. At the same time, comfortable and transparent access to these resources has been a key user requirement. In this paper we discuss approaches to build a virtual private supercomputer available at the user's desktop: a virtual computing environment tailored specifically for a target user with a particular target application. We describe and evaluate possibilities to create the virtual supercomputer based on light-weight virtualization technologies, and analyze the efficiency of our approach compared to traditional methods of HPC resource management.

  1. Demonstration of NICT Space Weather Cloud --Integration of Supercomputer into Analysis and Visualization Environment--

    NASA Astrophysics Data System (ADS)

    Watari, S.; Morikawa, Y.; Yamamoto, K.; Inoue, S.; Tsubouchi, K.; Fukazawa, K.; Kimura, E.; Tatebe, O.; Kato, H.; Shimojo, S.; Murata, K. T.

    2010-12-01

    In the Solar-Terrestrial Physics (STP) field, the spatio-temporal resolution of computer simulations is getting higher and higher because of the tremendous advancement of supercomputers. A further advance is Grid computing, which integrates distributed computational resources to provide scalable computing power. In simulation research it is effective for researchers themselves to design the physical model, perform the calculations on a supercomputer, and analyze and visualize the results with familiar methods. A supercomputer, however, is usually far from the analysis and visualization environment. In general, a researcher analyzes and visualizes on a locally managed workstation (WS), because installing and operating software on a WS is easy, so data must be copied manually from the supercomputer to the WS. The time required for data transfer over a long-delay network is a real obstacle to high-accuracy simulations. In terms of usefulness, it is therefore important to integrate a supercomputer and an analysis and visualization environment seamlessly, using methods the researcher is familiar with. NICT has been developing a cloud computing environment (the NICT Space Weather Cloud). In the NICT Space Weather Cloud, disk servers are located near its supercomputer and the WSs used for data analysis and visualization. They are connected to JGN2plus, a high-speed network for research and development. Distributed virtual high-capacity storage is also constructed with Grid Datafarm (Gfarm v2). Large data sets output from the supercomputer are transferred to the virtual storage through JGN2plus. A researcher can thus concentrate on the research, using familiar methods, regardless of the distance between the supercomputer and the analysis and visualization environment. At present, a total of 16 disk servers are set up at NICT headquarters (Koganei, Tokyo), the JGN2plus NOC (Otemachi, Tokyo), the Okinawa Subtropical Environment Remote-Sensing Center, and the Cybermedia Center, Osaka University. They are connected over JGN2plus and constitute 1 PB (physical size) of virtual storage under Gfarm v2. These disk servers are connected with the supercomputers of NICT and Osaka University. A system has been built in which data output from the supercomputers is automatically transferred to the virtual storage. The measured transfer rate is about 50 GB/hour, which is estimated to be adequate for a representative simulation and analysis task, the reconstruction of the coronal magnetic field. This work also serves as an experiment on the system itself, and verification of its practicality proceeds in parallel. Herein we introduce an overview of the space weather cloud system we have developed so far. We also demonstrate several scientific results obtained using the space weather cloud system, and introduce several web applications offered as a service of the cloud, named "e-SpaceWeather" (e-SW). e-SW provides a variety of online space weather services.

  2. Integration Of PanDA Workload Management System With Supercomputers for ATLAS and Data Intensive Science

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    De, K; Jha, S; Klimentov, A

    2016-01-01

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3 petaFLOPS, LHC data taking runs require more resources than Grid computing can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in the United States, Europe and Russia (in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), the MIRA supercomputer at the Argonne Leadership Computing Facility (ALCF), the supercomputer at the National Research Center Kurchatov Institute, IT4 in Ostrava, and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on the LCFs' multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms for the ALICE and ATLAS experiments, and it has been in full production for the ATLAS experiment since September 2015. We will present our current accomplishments with running PanDA WMS at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities' infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.
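    As a rough illustration of the light-weight MPI wrapper idea mentioned above (this is not the PanDA pilot code, and the payload script name is invented), the sketch below starts one MPI rank per core, lets each rank run an independent single-threaded payload, and gathers the exit codes so that a pilot-like supervisor could report per-task status.

        from mpi4py import MPI
        import subprocess
        import sys

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()

        # Hypothetical per-rank payload: each rank runs its own serial job on its own input.
        payload = ["./run_serial_payload.sh", f"input_{rank:04d}.txt"]
        ret = subprocess.call(payload)

        # Gather exit codes on rank 0 so a pilot-like supervisor could report per-task status.
        codes = comm.gather(ret, root=0)
        if rank == 0:
            failed = [i for i, c in enumerate(codes) if c != 0]
            print(f"{len(codes) - len(failed)} payloads succeeded, {len(failed)} failed",
                  file=sys.stderr)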

  3. New tools using the hardware performance monitor to help users tune programs on the Cray X-MP

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Engert, D.E.; Rudsinski, L.; Doak, J.

    1991-09-25

    The performance of a Cray system is highly dependent on the tuning techniques used by individuals on their codes. Many of our users were not taking advantage of the tuning tools that allow them to monitor their own programs by using the Hardware Performance Monitor (HPM). We therefore modified UNICOS to collect HPM data for all processes and to report Mflop ratings based on users, programs, and time used. Our tuning efforts are now being focused on the users and programs that have the best potential for performance improvements. These modifications and some of the more striking performance improvements are described.
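    A back-of-the-envelope sketch of the kind of accounting described above (the field names and aggregation scheme are assumptions, not the actual UNICOS modification): a Mflop rating is simply the hardware-counted floating-point operation total divided by CPU time, here grouped per user and program.

        from collections import defaultdict

        def mflop_report(records):
            """records: iterable of dicts with 'user', 'program', 'flops', 'cpu_seconds'."""
            totals = defaultdict(lambda: [0.0, 0.0])      # (flops, seconds) per (user, program)
            for r in records:
                key = (r['user'], r['program'])
                totals[key][0] += r['flops']
                totals[key][1] += r['cpu_seconds']
            # Mflop rating = floating-point operations / (CPU seconds * 1e6)
            return {k: (f / s) / 1e6 for k, (f, s) in totals.items() if s > 0}

        sample = [{'user': 'alice', 'program': 'cfd3d', 'flops': 4.2e12, 'cpu_seconds': 3600.0}]
        print(mflop_report(sample))   # {('alice', 'cfd3d'): ~1167 Mflops}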

  4. Color graphics, interactive processing, and the supercomputer

    NASA Technical Reports Server (NTRS)

    Smith-Taylor, Rudeen

    1987-01-01

    The development of a common graphics environment for the NASA Langley Research Center user community and the integration of a supercomputer into this environment is examined. The initial computer hardware, the software graphics packages, and their configurations are described. The addition of improved computer graphics capability to the supercomputer, and the utilization of the graphic software and hardware are discussed. Consideration is given to the interactive processing system which supports the computer in an interactive debugging, processing, and graphics environment.

  5. Automated Help System For A Supercomputer

    NASA Technical Reports Server (NTRS)

    Callas, George P.; Schulbach, Catherine H.; Younkin, Michael

    1994-01-01

    Expert-system software developed to provide automated system of user-helping displays in supercomputer system at Ames Research Center Advanced Computer Facility. Users located at remote computer terminals connected to supercomputer and each other via gateway computers, local-area networks, telephone lines, and satellite links. Automated help system answers routine user inquiries about how to use services of computer system. Available 24 hours per day and reduces burden on human experts, freeing them to concentrate on helping users with complicated problems.

  6. "Curso de Vulcanología General": Web-education efforts on volcanic hazards for the Latin American region from Mexico.

    NASA Astrophysics Data System (ADS)

    Delgado, Hugo

    2016-04-01

    Education about volcanic hazards is a never-ending task in countries where volcanoes erupt very frequently, as they do in the Latin American region (LAR). Eleven countries in the LAR have active volcanoes within their territories, and some volcanoes are located between countries, so the volcanic hazards associated with their eruptions affect more than one country. In addition, countries without volcanoes within their territory (e.g., Belize, Honduras or Brazil) can be impacted as well. Personnel working at several volcano observatories in the LAR need training in volcanology and, more importantly, in volcanic hazards. Unfortunately, volcanology is a discipline that is not taught at universities in some countries. Even worse, Earth sciences are not taught at all at higher-education institutions in some countries of the LAR. Thus, personnel working at volcano observatories have an important need to acquire volcanological knowledge, but they have no opportunity to study in their own countries and are often unable to travel abroad for training. The international course "Curso de Vulcanología General", taught from Mexico City at the Universidad Nacional Autónoma de México (UNAM), has been successfully implemented and has been active over the last five years. Nearly 700 students have participated in this course, although only ~150 have been awarded the certificate that UNAM grants to students who conclude the course successfully. This course has been sponsored by UNAM, ALVO (Latin American Volcanological Association) and IAVCEI (International Association of Volcanology and Chemistry of the Earth's Interior). More than 50 lecturers from the LAR, Europe and the US have been involved in these courses. Here, reflections on the course, the opportunities it has sparked, the educational tools, benefits, statistics and virtues of the course are presented.

  7. NASA Advanced Supercomputing (NAS) User Services Group

    NASA Technical Reports Server (NTRS)

    Pandori, John; Hamilton, Chris; Niggley, C. E.; Parks, John W. (Technical Monitor)

    2002-01-01

    This viewgraph presentation provides an overview of NAS (NASA Advanced Supercomputing), its goals, and its mainframe computer assets. Also covered are its functions, including systems monitoring and technical support.

  8. A performance comparison of current HPC systems: Blue Gene/Q, Cray XE6 and InfiniBand systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kerbyson, Darren J.; Barker, Kevin J.; Vishnu, Abhinav

    2014-01-01

    We present here a performance analysis of three current architectures that have become commonplace in the High Performance Computing world. Blue Gene/Q is the third generation of systems from IBM that use modestly performing cores at large scale in order to achieve high performance. The XE6 is the latest in a long line of Cray systems that use a 3-D topology, but the first to use its Gemini interconnection network. InfiniBand provides the flexibility of using compute nodes from many vendors that can be connected in many possible topologies. The performance characteristics of each vary vastly, and the way in which nodes are allocated in each type of system can significantly impact achieved performance. In this work we compare these three systems using a combination of micro-benchmarks and a set of production applications. In addition we also examine the differences in performance variability observed on each system and quantify the lost performance using a combination of both empirical measurements and performance models. Our results show that significant performance can be lost in normal production operation of the Cray XE6 and InfiniBand clusters in comparison to Blue Gene/Q.

  9. Large-Scale Parallel Viscous Flow Computations using an Unstructured Multigrid Algorithm

    NASA Technical Reports Server (NTRS)

    Mavriplis, Dimitri J.

    1999-01-01

    The development and testing of a parallel unstructured agglomeration multigrid algorithm for steady-state aerodynamic flows is discussed. The agglomeration multigrid strategy uses a graph algorithm to construct the coarse multigrid levels from the given fine grid, similar to an algebraic multigrid approach, but operates directly on the non-linear system using the FAS (Full Approximation Scheme) approach. The scalability and convergence rate of the multigrid algorithm are examined on the SGI Origin 2000 and the Cray T3E. An argument is given which indicates that the asymptotic scalability of the multigrid algorithm should be similar to that of its underlying single grid smoothing scheme. For medium size problems involving several million grid points, near perfect scalability is obtained for the single grid algorithm, while only a slight drop-off in parallel efficiency is observed for the multigrid V- and W-cycles, using up to 128 processors on the SGI Origin 2000, and up to 512 processors on the Cray T3E. For a large problem using 25 million grid points, good scalability is observed for the multigrid algorithm using up to 1450 processors on a Cray T3E, even when the coarsest grid level contains fewer points than the total number of processors.
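    For readers unfamiliar with agglomeration coarsening, the toy Python sketch below conveys the general idea of building a coarse level from a fine-grid graph by greedily grouping each unassigned vertex with its not-yet-assigned neighbours; the seed ordering and aggregate-size control of the paper's actual algorithm are not reproduced here.

        def agglomerate(adjacency):
            """adjacency: dict mapping each vertex to an iterable of its neighbours."""
            coarse_of = {}          # fine vertex -> coarse aggregate id
            n_coarse = 0
            for v in adjacency:
                if v in coarse_of:
                    continue
                # Seed a new aggregate at v and absorb any not-yet-assigned neighbours.
                coarse_of[v] = n_coarse
                for w in adjacency[v]:
                    if w not in coarse_of:
                        coarse_of[w] = n_coarse
                n_coarse += 1
            return coarse_of, n_coarse

        # Example: a 1-D chain of 6 vertices collapses into 3 aggregates of 2 vertices each.
        adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}
        print(agglomerate(adj))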

  10. NSF Commits to Supercomputers.

    ERIC Educational Resources Information Center

    Waldrop, M. Mitchell

    1985-01-01

    The National Science Foundation (NSF) has allocated at least $200 million over the next five years to support four new supercomputer centers. Issues and trends related to this NSF initiative are examined. (JN)

  11. Mira: Argonne's 10-petaflops supercomputer

    ScienceCinema

    Papka, Michael; Coghlan, Susan; Isaacs, Eric; Peters, Mark; Messina, Paul

    2018-02-13

    Mira, Argonne's petascale IBM Blue Gene/Q system, ushers in a new era of scientific supercomputing at the Argonne Leadership Computing Facility. An engineering marvel, the 10-petaflops supercomputer is capable of carrying out 10 quadrillion calculations per second. As a machine for open science, any researcher with a question that requires large-scale computing resources can submit a proposal for time on Mira, typically in allocations of millions of core-hours, to run programs for their experiments. This adds up to billions of hours of computing time per year.

  12. Adventures in Computational Grids

    NASA Technical Reports Server (NTRS)

    Walatka, Pamela P.; Biegel, Bryan A. (Technical Monitor)

    2002-01-01

    Sometimes one supercomputer is not enough. Or your local supercomputers are busy, or not configured for your job. Or you don't have any supercomputers. You might be trying to simulate worldwide weather changes in real time, requiring more compute power than you could get from any one machine. Or you might be collecting microbiological samples on an island, and need to examine them with a special microscope located on the other side of the continent. These are the times when you need a computational grid.

  13. Mira: Argonne's 10-petaflops supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Papka, Michael; Coghlan, Susan; Isaacs, Eric

    2013-07-03

    Mira, Argonne's petascale IBM Blue Gene/Q system, ushers in a new era of scientific supercomputing at the Argonne Leadership Computing Facility. An engineering marvel, the 10-petaflops supercomputer is capable of carrying out 10 quadrillion calculations per second. As a machine for open science, any researcher with a question that requires large-scale computing resources can submit a proposal for time on Mira, typically in allocations of millions of core-hours, to run programs for their experiments. This adds up to billions of hours of computing time per year.

  14. Breakthrough: NETL's Simulation-Based Engineering User Center (SBEUC)

    ScienceCinema

    Guenther, Chris

    2018-05-23

    The National Energy Technology Laboratory relies on supercomputers to develop many novel ideas that become tomorrow's energy solutions. Supercomputers provide a cost-effective, efficient platform for research and usher technologies into widespread use faster to bring benefits to the nation. In 2013, Secretary of Energy Dr. Ernest Moniz dedicated NETL's new supercomputer, the Simulation Based Engineering User Center, or SBEUC. The SBEUC is dedicated to fossil energy research and is a collaborative tool for all of NETL and our regional university partners.

  15. A high level language for a high performance computer

    NASA Technical Reports Server (NTRS)

    Perrott, R. H.

    1978-01-01

    The proposed computational aerodynamic facility will join the ranks of the supercomputers due to its architecture and increased execution speed. At present, the languages used to program these supercomputers have been modifications of programming languages which were designed many years ago for sequential machines. A new programming language should be developed based on the techniques which have proved valuable for sequential programming languages and incorporating the algorithmic techniques required for these supercomputers. The design objectives for such a language are outlined.

  16. Technology advances and market forces: Their impact on high performance architectures

    NASA Technical Reports Server (NTRS)

    Best, D. R.

    1978-01-01

    Reasonable projections into future supercomputer architectures and technology require an analysis of the computer industry market environment, the current capabilities and trends within the component industry, and the research activities on computer architecture in the industrial and academic communities. Management, programmer, architect, and user must cooperate to increase the efficiency of supercomputer development efforts. Care must be taken to match the funding, compiler, architecture and application with greater attention to testability, maintainability, reliability, and usability than supercomputer development programs of the past.

  17. Floating point arithmetic in future supercomputers

    NASA Technical Reports Server (NTRS)

    Bailey, David H.; Barton, John T.; Simon, Horst D.; Fouts, Martin J.

    1989-01-01

    Considerations in the floating-point design of a supercomputer are discussed. Particular attention is given to word size, hardware support for extended precision, format, and accuracy characteristics. These issues are discussed from the perspective of the Numerical Aerodynamic Simulation Systems Division at NASA Ames. The features believed to be most important for a future supercomputer floating-point design include: (1) a 64-bit IEEE floating-point format with 11 exponent bits, 52 mantissa bits, and one sign bit and (2) hardware support for reasonably fast double-precision arithmetic.
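    The recommended 64-bit layout can be made concrete with a short snippet; the Python sketch below unpacks a double into the sign bit, the 11-bit biased exponent, and the 52-bit mantissa (fraction) field described above.

        import struct

        def ieee754_fields(x):
            # Reinterpret the 8 bytes of a double as a 64-bit unsigned integer.
            bits = struct.unpack('>Q', struct.pack('>d', x))[0]
            sign = bits >> 63
            exponent = (bits >> 52) & 0x7FF          # 11 bits, biased by 1023
            fraction = bits & ((1 << 52) - 1)        # 52 mantissa (fraction) bits
            return sign, exponent, fraction

        print(ieee754_fields(1.0))    # (0, 1023, 0)
        print(ieee754_fields(-2.5))   # (1, 1024, 1 << 50), i.e. -1.25 * 2**1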

  18. Breakthrough: NETL's Simulation-Based Engineering User Center (SBEUC)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Guenther, Chris

    The National Energy Technology Laboratory relies on supercomputers to develop many novel ideas that become tomorrow's energy solutions. Supercomputers provide a cost-effective, efficient platform for research and usher technologies into widespread use faster to bring benefits to the nation. In 2013, Secretary of Energy Dr. Ernest Moniz dedicated NETL's new supercomputer, the Simulation Based Engineering User Center, or SBEUC. The SBEUC is dedicated to fossil energy research and is a collaborative tool for all of NETL and our regional university partners.

  19. Optimizing Blocking and Nonblocking Reduction Operations for Multicore Systems: Hierarchical Design and Implementation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gorentla Venkata, Manjunath; Shamis, Pavel; Graham, Richard L

    2013-01-01

    Many scientific simulations, using the Message Passing Interface (MPI) programming model, are sensitive to the performance and scalability of reduction collective operations such as MPI Allreduce and MPI Reduce. These operations are the most widely used abstractions to perform mathematical operations over all processes that are part of the simulation. In this work, we propose a hierarchical design to implement the reduction operations on multicore systems. This design aims to improve the efficiency of reductions by 1) tailoring the algorithms and customizing the implementations for various communication mechanisms in the system, 2) providing the ability to configure the depth of the hierarchy to match the system architecture, and 3) providing the ability to progress each level of this hierarchy independently. Using this design, we implement the MPI Allreduce and MPI Reduce operations (and their nonblocking variants MPI Iallreduce and MPI Ireduce) for all message sizes, and evaluate them on multiple architectures including InfiniBand and the Cray XT5. We leverage and enhance our existing infrastructure, Cheetah, a framework for implementing hierarchical collective operations, to implement these reductions. The experimental results show that the Cheetah reduction operations outperform production-grade MPI implementations such as the Open MPI default, Cray MPI, and MVAPICH2, demonstrating their efficiency, flexibility and portability. On InfiniBand systems, with a microbenchmark, a 512-process Cheetah nonblocking Allreduce and Reduce achieve speedups of 23x and 10x, respectively, compared to the default Open MPI reductions. The blocking variants of the reduction operations also show similar performance benefits. A 512-process nonblocking Cheetah Allreduce achieves a speedup of 3x, compared to the default MVAPICH2 Allreduce implementation. On a Cray XT5 system, a 6144-process Cheetah Allreduce outperforms the Cray MPI by 145%. The evaluation with an application kernel, a Conjugate Gradient solver, shows that the Cheetah reductions speed up the total time to solution by 195%, demonstrating the potential benefits for scientific simulations.
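    The hierarchical idea can be sketched in a few lines of mpi4py (an illustration of the general technique, not the Cheetah implementation): reduce within each node, combine across one leader rank per node, then broadcast the result back within each node.

        from mpi4py import MPI
        import numpy as np

        world = MPI.COMM_WORLD
        # Build an intra-node communicator and a communicator of one leader rank per node.
        node = world.Split_type(MPI.COMM_TYPE_SHARED)
        color = 0 if node.Get_rank() == 0 else MPI.UNDEFINED
        leaders = world.Split(color, world.Get_rank())

        def hierarchical_allreduce(sendbuf):
            local = np.empty_like(sendbuf)
            node.Reduce(sendbuf, local, op=MPI.SUM, root=0)      # 1) reduce within the node
            if leaders != MPI.COMM_NULL:
                total = np.empty_like(sendbuf)
                leaders.Allreduce(local, total, op=MPI.SUM)      # 2) combine across node leaders
                local = total
            node.Bcast(local, root=0)                            # 3) fan the result back out
            return local

        data = np.full(4, world.Get_rank(), dtype='d')
        print(world.Get_rank(), hierarchical_allreduce(data))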

  20. PAN AIR: A computer program for predicting subsonic or supersonic linear potential flows about arbitrary configurations using a higher order panel method. Volume 4: Maintenance document (version 3.0)

    NASA Technical Reports Server (NTRS)

    Purdon, David J.; Baruah, Pranab K.; Bussoletti, John E.; Epton, Michael A.; Massena, William A.; Nelson, Franklin D.; Tsurusaki, Kiyoharu

    1990-01-01

    The Maintenance Document Version 3.0 is a guide to the PAN AIR software system, a system which computes the subsonic or supersonic linear potential flow about a body of nearly arbitrary shape, using a higher order panel method. The document describes the overall system and each program module of the system. Sufficient detail is given for program maintenance, updating, and modification. It is assumed that the reader is familiar with programming and CRAY computer systems. The PAN AIR system was written in FORTRAN 4 language except for a few CAL language subroutines which exist in the PAN AIR library. Structured programming techniques were used to provide code documentation and maintainability. The operating systems accommodated are COS 1.11, COS 1.12, COS 1.13, and COS 1.14 on the CRAY 1S, 1M, and X-MP computing systems. The system is comprised of a data base management system, a program library, an execution control module, and nine separate FORTRAN technical modules. Each module calculates part of the posed PAN AIR problem. The data base manager is used to communicate between modules and within modules. The technical modules must be run in a prescribed fashion for each PAN AIR problem. In order to ease the problem of supplying the many JCL cards required to execute the modules, a set of CRAY procedures (PAPROCS) was created to automatically supply most of the JCL cards. Most of this document has not changed for Version 3.0. It now, however, strictly applies only to PAN AIR version 3.0. The major changes are: (1) additional sections covering the new FDP module (which calculates streamlines and offbody points); (2) a complete rewrite of the section on the MAG module; and (3) strict applicability to CRAY computing systems.

  1. Juan Carlos D'Olivo: A portrait

    NASA Astrophysics Data System (ADS)

    Aguilar-Arévalo, Alexis A.

    2013-06-01

    This report attempts to give a brief biographical sketch of the academic life of Juan Carlos D'Olivo, researcher and teacher at the Instituto de Ciencias Nucleares of UNAM, devoted to advancing the fields of High Energy Physics and Astroparticle Physics in Mexico and Latin America.

  2. Meteorological Factors Affecting Evaporation Duct Height Climatologies.

    DTIC Science & Technology

    1980-07-01

    Italy; Maritime Meteorology Division, Japan Meteorological Agency, Ote-Machi 1-3-4, Chiyoda-Ku, Tokyo, Japan; Instituto De Geofisica, U.N.A.M., Biblioteca, ...Torre De Ciencias, 3ER Piso, Ciudad Universitaria, Mexico 20, D.F.; Koninklijk Nederlands Meteorologisch Instituut, Postbus 201, 3730 AE De Bilt, Netherlands

  3. Effect of Seeding Particles on the Shock Structure of a Supersonic Jet

    NASA Astrophysics Data System (ADS)

    Porta, David; Echeverría, Carlos; Stern, Catalina

    2012-11-01

    The original goal of our work was to measure, with PIV, the velocity field of a supersonic flow produced by the discharge of air through a 4 mm cylindrical nozzle. The results were superposed on a shadowgraph and combined with previous density measurements made with a Rayleigh scattering technique. The idea was to see if there were any changes in the flow field close to the high-density areas near the shocks. Shadowgraphs were made with and without seeding particles (spheres of titanium dioxide). Surprisingly, it was observed that the flow structure with particles was shifted in the direction opposite to the flow with respect to the flow structure obtained without seeds. This result might contradict the belief that the seeding particles do not affect the flow and that the speed of the seeds corresponds to the local speed of the flow. We acknowledge support from DGAPA UNAM through project IN117712 and from Facultad de Ciencias UNAM.

  4. CO2 variability from in situ and vertical column measurements in Mexico City

    NASA Astrophysics Data System (ADS)

    Baylon, J. L.; Grutter, M.; Stremme, W.; Bezanilla, A.; Plaza, E.

    2014-12-01

    UNAM started a program to measure, among many other atmospheric parameters, greenhouse gas concentrations at six stations in the Mexican territory as part of the "Red Universitaria de Observatorios Atmosfericos", RUOA (www.ruoa.unam.mx). In this work we present recent time series of CO2 measured at the station located on the university campus in Mexico City, and compare them to total vertical columns of this gas measured at the same location. In situ measurements have been carried out continuously with a cavity ring-down spectrometer (Picarro Inc., G2401) since July 2014, and the columns are retrieved from solar absorption measurements taken with a Fourier transform infrared spectrometer (Bruker, Vertex 80) when conditions allow. The retrieval method is described, and results of the comparison of both techniques and a detailed analysis of the variability of this important greenhouse gas are presented. Simultaneous surface and column CO2 data are useful to constrain models and estimate emissions.

  5. TRACTOR_DB: a database of regulatory networks in gamma-proteobacterial genomes

    PubMed Central

    González, Abel D.; Espinosa, Vladimir; Vasconcelos, Ana T.; Pérez-Rueda, Ernesto; Collado-Vides, Julio

    2005-01-01

    Experimental data on the Escherichia coli transcriptional regulatory system has been used in the past years to predict new regulatory elements (promoters, transcription factors (TFs), TFs' binding sites and operons) within its genome. As more genomes of gamma-proteobacteria are being sequenced, the prediction of these elements in a growing number of organisms has become more feasible, as a step towards the study of how different bacteria respond to environmental changes at the level of transcriptional regulation. In this work, we present TRACTOR_DB (TRAnscription FaCTORs' predicted binding sites in prokaryotic genomes), a relational database that contains computational predictions of new members of 74 regulons in 17 gamma-proteobacterial genomes. For these predictions we used a comparative genomics approach regarding which several proof-of-principle articles for large regulons have been published. TRACTOR_DB may be currently accessed at http://www.bioinfo.cu/Tractor_DB, http://www.tractor.lncc.br/ or at http://www.cifn.unam.mx/Computational_Genomics/tractorDB. Contact Email id is tractor@cifn.unam.mx. PMID:15608293

  6. An Integral, Multidisciplinary and Global Geophysical Field Experience for Undergraduates

    NASA Astrophysics Data System (ADS)

    Vázquez, O.; Carrillo, D. J.; Pérez-Campos, X.

    2007-05-01

    The undergraduate program in Geophysical Engineering at the School of Engineering of the Universidad Nacional Autónoma de México (UNAM) went through an update process that concluded in 2006. As part of the program, the student takes three geophysical prospecting courses (gravity and magnetics, electric and electromagnetics, and seismic methods). The older program required a three-week field experience for each course in order to graduate. The new program considers only one extended field experience. This work stresses the importance of international academic exchange in which undergraduate students could participate, such as the Summer of Applied Geophysical Experience (SAGE), and of interaction with research programs, such as the MesoAmerican Subduction Experiment (MASE). Also, we propose a scheme for this activity based on those examples; both of them have in common real geophysical problems from which students could benefit. Our proposal covers academic and logistic aspects to be taken into account, enhancing the relevance of interaction between other academic institutions, industry, and UNAM, in order to obtain a broader view of geophysics.

  7. Integration of Panda Workload Management System with supercomputers

    NASA Astrophysics Data System (ADS)

    De, K.; Jha, S.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Nilsson, P.; Novikov, A.; Oleynik, D.; Panitkin, S.; Poyda, A.; Read, K. F.; Ryabinkin, E.; Teslyuk, A.; Velikhov, V.; Wells, J. C.; Wenaus, T.

    2016-09-01

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 140 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3+ petaFLOPS, the next LHC data taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in the United States, Europe and Russia (in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), the supercomputer at the National Research Center "Kurchatov Institute", IT4 in Ostrava, and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on Titan's multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms. We will present our current accomplishments in running PanDA WMS at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facility's infrastructure for High Energy and Nuclear Physics, as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.

  8. Tracing Scientific Facilities through the Research Literature Using Persistent Identifiers

    NASA Astrophysics Data System (ADS)

    Mayernik, M. S.; Maull, K. E.

    2016-12-01

    Tracing persistent identifiers to their source publications is an easy task when authors use them, since it is a simple matter of matching the persistent identifier to the specific text string of the identifier. However, trying to understand whether a publication uses the resource behind an identifier when that identifier is not referenced explicitly is a harder task. In this research, we explore the effectiveness of alternative strategies for associating publications with uses of the resource referenced by an identifier when it may not be explicit. This project is explored within the context of the NCAR supercomputer, where we are broadly interested in the science that can be traced to the usage of the NCAR supercomputing facility, by way of the peer-reviewed research publications that utilize and reference it. In this project we explore several ways of drawing linkages between publications and the NCAR supercomputing resources. Identifying and compiling peer-reviewed publications related to NCAR supercomputer usage is explored via three sources: 1) user-supplied publications gathered through a community survey, 2) publications that were identified via manual searching of the Google Scholar search index, and 3) publications associated with National Science Foundation (NSF) grants extracted from a public NSF database. These three sources represent three styles of collecting information about publications that likely imply usage of the NCAR supercomputing facilities. Each source has strengths and weaknesses, thus our discussion will explore how our publication identification and analysis methods vary in terms of accuracy, reliability, and effort. We will also discuss strategies for enabling more efficient tracing of research impacts of supercomputing facilities going forward through the assignment of a persistent web identifier to the NCAR supercomputer. While this solution has the potential to greatly enhance our ability to trace the use of the facility through publications, authors must cite the facility consistently. It is therefore necessary to provide recommendations for citation and attribution behavior, and we will conclude our discussion with how such recommendations have improved tracing of the supercomputer facility, allowing for more consistent and widespread measurement of its impact.
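    For the easy case mentioned at the start of this abstract, matching an explicitly cited persistent identifier is essentially a text-search problem; the sketch below, using an invented identifier string, shows the idea.

        import re

        # Purely illustrative identifier string; not a real facility DOI.
        FACILITY_ID = "doi:10.0000/EXAMPLE-SUPERCOMPUTER"
        pattern = re.compile(re.escape(FACILITY_ID), re.IGNORECASE)

        def cites_facility(fulltext):
            """True if the article's full text explicitly contains the facility identifier."""
            return bool(pattern.search(fulltext))

        print(cites_facility("Simulations ran on the facility (doi:10.0000/EXAMPLE-SUPERCOMPUTER)."))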

  9. Energy Efficient Supercomputing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Anypas, Katie

    2014-10-17

    Katie Anypas, Head of NERSC's Services Department discusses the Lab's research into developing increasingly powerful and energy efficient supercomputers at our '8 Big Ideas' Science at the Theater event on October 8th, 2014, in Oakland, California.

  10. Energy Efficient Supercomputing

    ScienceCinema

    Anypas, Katie

    2018-05-07

    Katie Anypas, Head of NERSC's Services Department discusses the Lab's research into developing increasingly powerful and energy efficient supercomputers at our '8 Big Ideas' Science at the Theater event on October 8th, 2014, in Oakland, California.

  11. Space Radar Image of Mammoth Mountain, California

    NASA Image and Video Library

    1999-05-01

    These two false-color composite images of the Mammoth Mountain area in the Sierra Nevada Mountains, Calif., show significant seasonal changes in snow cover. The image at left was acquired by the Spaceborne Imaging Radar-C and X-band Synthetic Aperture Radar aboard the space shuttle Endeavour on its 67th orbit on April 13, 1994. The image is centered at 37.6 degrees north latitude and 119 degrees west longitude. The area is about 36 kilometers by 48 kilometers (22 miles by 29 miles). In this image, red is L-band (horizontally transmitted and vertically received) polarization data; green is C-band (horizontally transmitted and vertically received) polarization data; and blue is C-band (horizontally transmitted and received) polarization data. The image at right was acquired on October 3, 1994, on the space shuttle Endeavour's 67th orbit of the second radar mission. Crowley Lake appears dark at the center left of the image, just above or south of Long Valley. The Mammoth Mountain ski area is visible at the top right of the scene. The red areas correspond to forests, the dark blue areas are bare surfaces and the green areas are short vegetation, mainly brush. The changes in color tone at the higher elevations (e.g. the Mammoth Mountain ski area) from green-blue in April to purple in September reflect changes in snow cover between the two missions. The April mission occurred immediately following a moderate snow storm. During the mission the snow evolved from a dry, fine-grained snowpack with few distinct layers to a wet, coarse-grained pack with multiple ice inclusions. Since that mission, all snow in the area has melted except for small glaciers and permanent snowfields on the Silver Divide and near the headwaters of Rock Creek. On October 3, 1994, only discontinuous patches of snow cover were present at very high elevations following the first snow storm of the season on September 28, 1994. For investigations in hydrology and land-surface climatology, seasonal snow cover and alpine glaciers are critical to the radiation and water balances. SIR-C/X-SAR is a powerful tool because it is sensitive to most snowpack conditions and is less influenced by weather conditions than other remote sensing instruments, such as Landsat. In parallel with the operational SIR-C data processing, an experimental effort is being conducted to test SAR data processing using the Jet Propulsion Laboratory's massively parallel supercomputing facility, centered around the Cray Research T3D. These experiments will assess the abilities of large supercomputers to produce high throughput SAR processing in preparation for upcoming data-intensive SAR missions. The images released here were produced as part of this experimental effort. http://photojournal.jpl.nasa.gov/catalog/PIA01753

  12. A Portable Regional Weather and Climate Downscaling System Using GEOS-5, LIS-6, WRF, and the NASA Workflow Tool

    NASA Astrophysics Data System (ADS)

    Kemp, E. M.; Putman, W. M.; Gurganus, J.; Burns, R. W.; Damon, M. R.; McConaughy, G. R.; Seablom, M. S.; Wojcik, G. S.

    2009-12-01

    We present a regional downscaling system (RDS) suitable for high-resolution weather and climate simulations in multiple supercomputing environments. The RDS is built on the NASA Workflow Tool, a software framework for configuring, running, and managing computer models on multiple platforms with a graphical user interface. The Workflow Tool is used to run the NASA Goddard Earth Observing System Model Version 5 (GEOS-5), a global atmospheric-ocean model for weather and climate simulations down to 1/4 degree resolution; the NASA Land Information System Version 6 (LIS-6), a land surface modeling system that can simulate soil temperature and moisture profiles; and the Weather Research and Forecasting (WRF) community model, a limited-area atmospheric model for weather and climate simulations down to 1-km resolution. The Workflow Tool allows users to customize model settings to user needs; saves and organizes simulation experiments; distributes model runs across different computer clusters (e.g., the DISCOVER cluster at Goddard Space Flight Center, the Cray CX-1 Desktop Supercomputer, etc.); and handles all file transfers and network communications (e.g., scp connections). Together, the RDS is intended to aid researchers by making simulations as easy as possible to generate on the computer resources available. Initial conditions for LIS-6 and GEOS-5 are provided by Modern Era Retrospective-Analysis for Research and Applications (MERRA) reanalysis data stored on DISCOVER. The LIS-6 is first run for 2-4 years forced by MERRA atmospheric analyses, generating initial conditions for the WRF soil physics. GEOS-5 is then initialized from MERRA data and run for the period of interest. Large-scale atmospheric data, sea-surface temperatures, and sea ice coverage from GEOS-5 are used as boundary conditions for WRF, which is run for the same period of interest. Multiply nested grids are used for both LIS-6 and WRF, with the innermost grid run at a resolution sufficient for typical local weather features (terrain, convection, etc.). All model runs, restarts, and file transfers are coordinated by the Workflow Tool. Two use cases are being pursued. First, the RDS generates regional climate simulations down to 4-km for the Chesapeake Bay region, with WRF output provided as input to more specialized models (e.g., ocean/lake, hydrological, marine biology, and air pollution). This will allow assessment of climate impact on local interests (e.g., changes in Bay water levels and temperatures, inundation, fish kills, etc.). Second, the RDS generates high-resolution hurricane simulations in the tropical North Atlantic. This use case will support Observing System Simulation Experiments (OSSEs) of dynamically-targeted lidar observations as part of the NASA Sensor Web Simulator project. Sample results will be presented at the AGU Fall Meeting.

  13. Job Management Requirements for NAS Parallel Systems and Clusters

    NASA Technical Reports Server (NTRS)

    Saphir, William; Tanner, Leigh Ann; Traversat, Bernard

    1995-01-01

    A job management system is a critical component of a production supercomputing environment, permitting oversubscribed resources to be shared fairly and efficiently. Job management systems that were originally designed for traditional vector supercomputers are not appropriate for the distributed-memory parallel supercomputers that are becoming increasingly important in the high performance computing industry. Newer job management systems offer new functionality but do not solve fundamental problems. We address some of the main issues in resource allocation and job scheduling we have encountered on two parallel computers - a 160-node IBM SP2 and a cluster of 20 high performance workstations located at the Numerical Aerodynamic Simulation facility. We describe the requirements for resource allocation and job management that are necessary to provide a production supercomputing environment on these machines, prioritizing according to difficulty and importance, and advocating a return to fundamental issues.

  14. Parallel Visualization of Large-Scale Aerodynamics Calculations: A Case Study on the Cray T3E

    NASA Technical Reports Server (NTRS)

    Ma, Kwan-Liu; Crockett, Thomas W.

    1999-01-01

    This paper reports the performance of a parallel volume rendering algorithm for visualizing a large-scale, unstructured-grid dataset produced by a three-dimensional aerodynamics simulation. This dataset, containing over 18 million tetrahedra, allows us to extend our performance results to a problem which is more than 30 times larger than the one we examined previously. This high resolution dataset also allows us to see fine, three-dimensional features in the flow field. All our tests were performed on the Silicon Graphics Inc. (SGI)/Cray T3E operated by NASA's Goddard Space Flight Center. Using 511 processors, a rendering rate of almost 9 million tetrahedra/second was achieved with a parallel overhead of 26%.
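
    As a rough check on the figures quoted above, the arithmetic below relates the aggregate rendering rate, the processor count, and the reported parallel overhead (taking a 26% overhead to mean that roughly 74% of aggregate processor time did useful rendering work); it is an illustration, not part of the study.

      # Back-of-the-envelope check on the quoted figures.
      processors = 511
      rate_tets_per_s = 9.0e6          # aggregate rendering rate
      dataset_tets = 18.0e6            # tetrahedra in the dataset
      overhead = 0.26

      per_proc_rate = rate_tets_per_s / processors
      frame_time_s = dataset_tets / rate_tets_per_s
      ideal_rate = rate_tets_per_s / (1.0 - overhead)   # hypothetical zero-overhead rate

      print(f"per-processor rate : {per_proc_rate:,.0f} tets/s")
      print(f"time per frame     : {frame_time_s:.1f} s")
      print(f"zero-overhead rate : {ideal_rate / 1e6:.1f} M tets/s")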

  15. Integrating Grid Services into the Cray XT4 Environment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    NERSC; Cholia, Shreyas; Lin, Hwa-Chun Wendy

    2009-05-01

    The 38,640-core Cray XT4 "Franklin" system at the National Energy Research Scientific Computing Center (NERSC) is a massively parallel resource available to Department of Energy researchers that also provides on-demand grid computing to the Open Science Grid. The integration of grid services on Franklin presented various challenges, including fundamental differences between the interactive and compute nodes, a stripped-down compute-node operating system without dynamic library support, a shared-root environment, and idiosyncratic application launching. In our work, we describe how we resolved these challenges on a running, general-purpose production system to provide on-demand compute, storage, accounting, and monitoring services through generic grid interfaces that mask the underlying system-specific details for the end user.

  16. Approaching the exa-scale: a real-world evaluation of rendering extremely large data sets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Patchett, John M; Ahrens, James P; Lo, Li - Ta

    2010-10-15

    Extremely large scale analysis is becoming increasingly important as supercomputers and their simulations move from petascale to exascale. The lack of dedicated hardware acceleration for rendering on today's supercomputing platforms motivates our detailed evaluation of the possibility of interactive rendering on the supercomputer. In order to facilitate our understanding of rendering on the supercomputing platform, we focus on scalability of rendering algorithms and architecture envisioned for exascale datasets. To understand tradeoffs for dealing with extremely large datasets, we compare three different rendering algorithms for large polygonal data: software-based ray tracing, software-based rasterization, and hardware-accelerated rasterization. We present a case study of strong and weak scaling of rendering extremely large data on both GPU- and CPU-based parallel supercomputers using ParaView, a parallel visualization tool. We use three different data sets: two synthetic and one from a scientific application. At an extreme scale, algorithmic rendering choices make a difference and should be considered while approaching exascale computing, visualization, and analysis. We find software-based ray tracing offers a viable approach for scalable rendering of the projected future massive data sizes.
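
    The two scaling metrics compared in studies like this one can be summarized in a few lines; the sketch below uses made-up timings purely to illustrate the definitions of strong- and weak-scaling efficiency, and does not reproduce results from the paper.

      # Strong scaling: fixed total problem size, ideal time is t1 / p.
      # Weak scaling: problem size grows with p, ideal time stays equal to t1.

      def strong_scaling_efficiency(t1, tp, p):
          return t1 / (p * tp)

      def weak_scaling_efficiency(t1, tp):
          return t1 / tp

      # Example with placeholder render times (seconds) for 1 and 256 processes.
      print(strong_scaling_efficiency(t1=120.0, tp=0.75, p=256))  # ~0.62
      print(weak_scaling_efficiency(t1=1.8, tp=2.4))              # 0.75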

  17. Supercomputing Drives Innovation - Continuum Magazine | NREL

    Science.gov Websites

    Over the years, NREL scientists have used supercomputers to simulate 3D models of primary enzymes, and to model wind plant aerodynamics, showing low-velocity wakes.

  18. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mubarak, Misbah; Ross, Robert B.

    This technical report describes the experiments performed to validate the MPI performance measurements reported by the CODES dragonfly network simulation with the Theta Cray XC system at the Argonne Leadership Computing Facility (ALCF).

  19. A mass storage system for supercomputers based on Unix

    NASA Technical Reports Server (NTRS)

    Richards, J.; Kummell, T.; Zarlengo, D. G.

    1988-01-01

    The authors present the design, implementation, and utilization of a large mass storage subsystem (MSS) for the numerical aerodynamics simulation. The MSS supports a large networked, multivendor Unix-based supercomputing facility. The MSS at Ames Research Center provides all processors on the numerical aerodynamics system processing network, from workstations to supercomputers, the ability to store large amounts of data in a highly accessible, long-term repository. The MSS uses Unix System V and is capable of storing hundreds of thousands of files ranging from a few bytes to 2 Gb in size.

  20. Intelligent supercomputers: the Japanese computer sputnik

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Walter, G.

    1983-11-01

    Japan's government-supported fifth-generation computer project has had a pronounced effect on the American computer and information systems industry. The US firms are intensifying their research on and production of intelligent supercomputers, a combination of computer architecture and artificial intelligence software programs. While the present generation of computers is built for the processing of numbers, the new supercomputers will be designed specifically for the solution of symbolic problems and the use of artificial intelligence software. This article discusses new and exciting developments that will increase computer capabilities in the 1990s. 4 references.

  1. Using High Performance Computing to Examine the Processes of Neurogenesis Underlying Pattern Separation and Completion of Episodic Information.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aimone, James Bradley; Bernard, Michael Lewis; Vineyard, Craig Michael

    2014-10-01

    Adult neurogenesis in the hippocampus region of the brain is a neurobiological process that is believed to contribute to the brain's advanced abilities in complex pattern recognition and cognition. Here, we describe how realistic scale simulations of the neurogenesis process can offer both a unique perspective on the biological relevance of this process and confer computational insights that are suggestive of novel machine learning techniques. First, supercomputer-based scaling studies of the neurogenesis process demonstrate how a small fraction of adult-born neurons have a uniquely larger impact in biologically realistic scaled networks. Second, we describe a novel technical approach by which the information content of ensembles of neurons can be estimated. Finally, we illustrate several examples of broader algorithmic impact of neurogenesis, including both extending existing machine learning approaches and novel approaches for intelligent sensing.

  2. Rendezvous of the "Third Kind": Triple Helix Origins and Future Possibilities

    ERIC Educational Resources Information Center

    Etzkowitz, Henry

    2015-01-01

    The Triple Helix, representing university-industry-government interactions, was rooted in a 1993 International Workshop on University-Industry Relations at UNAM's Centro Para la Innovacion Technologica in Mexico City. Impelled by Mexican reality, where university-industry interactions and the institutions themselves operated within a governmental…

  3. Psychology in Mexico

    ERIC Educational Resources Information Center

    Ruiz, Eleonora Rubio

    2011-01-01

    The first formal psychology course taught in Mexico was in 1896 at Mexico's National University, today the National Autonomous University of Mexico (UNAM in Spanish). Modern psychology from Europe and the US in the late 19th century was the primary influence on Mexican psychology, along with psychoanalysis and both clinical and experimental…

  4. Interdisciplinary Education and Research in Mexico

    ERIC Educational Resources Information Center

    Villa-Soto, Juan Carlos

    2016-01-01

    In this article we discuss interdisciplinary teaching and research in Latin America through the lens of Mexican perspectives, in particular the experiences at the National Autonomous University of Mexico (UNAM). The history of these experiences goes back to the creation of the first interdisciplinary education programs in Mexico in the 1970s and…

  5. Introducing Mira, Argonne's Next-Generation Supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    2013-03-19

    Mira, the new petascale IBM Blue Gene/Q system installed at the ALCF, will usher in a new era of scientific supercomputing. An engineering marvel, the 10-petaflops machine is capable of carrying out 10 quadrillion calculations per second.

  6. Green Supercomputing at Argonne

    ScienceCinema

    Pete Beckman

    2017-12-09

    Pete Beckman, head of Argonne's Leadership Computing Facility (ALCF) talks about Argonne National Laboratory's green supercomputing—everything from designing algorithms to use fewer kilowatts per operation to using cold Chicago winter air to cool the machine more efficiently.

  7. TOP500 Supercomputers for June 2003

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack

    2003-06-23

    21st Edition of TOP500 List of World's Fastest Supercomputers Released. MANNHEIM, Germany; KNOXVILLE, Tenn.; and BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 21st edition of the TOP500 list of the world's fastest supercomputers was released today (June 23, 2003). The Earth Simulator supercomputer built by NEC and installed last year at the Earth Simulator Center in Yokohama, Japan, with its Linpack benchmark performance of 35.86 Tflop/s (teraflops or trillions of calculations per second), retains the number one position. The number 2 position is held by the re-measured ASCI Q system at Los Alamos National Laboratory. With 13.88 Tflop/s, it is the second system ever to exceed the 10 Tflop/s mark. ASCI Q was built by Hewlett-Packard and is based on the AlphaServer SC computer system.

  8. Characterizing output bottlenecks in a supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xie, Bing; Chase, Jeffrey; Dillow, David A

    2012-01-01

    Supercomputer I/O loads are often dominated by writes. HPC (High Performance Computing) file systems are designed to absorb these bursty outputs at high bandwidth through massive parallelism. However, the delivered write bandwidth often falls well below the peak. This paper characterizes the data absorption behavior of a center-wide shared Lustre parallel file system on the Jaguar supercomputer. We use a statistical methodology to address the challenges of accurately measuring a shared machine under production load and to obtain the distribution of bandwidth across samples of compute nodes, storage targets, and time intervals. We observe and quantify limitations from competing traffic, contention on storage servers and I/O routers, concurrency limitations in the client compute node operating systems, and the impact of variance (stragglers) on coupled output such as striping. We then examine the implications of our results for application performance and the design of I/O middleware systems on shared supercomputers.
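
    The sketch below illustrates, with synthetic numbers, the kind of distributional summary such a methodology produces and why a single slow participant (straggler) bounds coupled, striped output; it is not the paper's actual analysis code.

      import random
      import statistics

      random.seed(1)
      # Synthetic per-storage-target bandwidths (MB/s) observed in one time interval.
      targets = [random.gauss(350, 60) for _ in range(48)]
      ordered = sorted(targets)

      print("median target bandwidth :", round(statistics.median(targets), 1))
      print("roughly p10 / p90       :", round(ordered[4], 1), "/", round(ordered[43], 1))

      # For a file striped across k targets, the write completes only when the
      # slowest stripe finishes, so effective bandwidth ~ k * min(selected targets).
      k = 8
      stripe = random.sample(targets, k)
      print("coupled 8-way stripe    :", round(k * min(stripe), 1), "MB/s",
            "vs ideal", round(sum(stripe), 1), "MB/s")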

  9. Advanced Computing for Manufacturing.

    ERIC Educational Resources Information Center

    Erisman, Albert M.; Neves, Kenneth W.

    1987-01-01

    Discusses ways that supercomputers are being used in the manufacturing industry, including the design and production of airplanes and automobiles. Describes problems that need to be solved in the next few years for supercomputers to assume a major role in industry. (TW)

  10. INTEGRATION OF PANDA WORKLOAD MANAGEMENT SYSTEM WITH SUPERCOMPUTERS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    De, K; Jha, S; Maeno, T

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 140 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3+ petaFLOPS, the next LHC data taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in the United States, Europe, and Russia (in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), the supercomputer at the National Research Center Kurchatov Institute, IT4 in Ostrava, and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on Titan's multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms. We will present our current accomplishments in running PanDA WMS at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facility's infrastructure for High Energy and Nuclear Physics, as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.
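
    The "light-weight MPI wrapper" idea can be sketched as follows: one MPI job occupies many cores, and each rank independently launches a single-threaded payload. The payload command below is a hypothetical placeholder, not the actual PanDA pilot, and the sketch assumes mpi4py is available on the system.

      from mpi4py import MPI
      import subprocess

      comm = MPI.COMM_WORLD
      rank = comm.Get_rank()

      # Each rank picks its own work unit by rank number (placeholder command).
      cmd = ["./run_event_generation.sh", f"--seed={rank}", f"--outdir=work_{rank:05d}"]
      ret = subprocess.call(cmd)

      # Gather exit codes on rank 0 so the batch job can report overall success.
      codes = comm.gather(ret, root=0)
      if rank == 0:
          failed = [i for i, c in enumerate(codes) if c != 0]
          print(f"{len(codes) - len(failed)} payloads succeeded, {len(failed)} failed")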

  11. Supercomputers Join the Fight against Cancer – U.S. Department of Energy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    The Department of Energy has some of the best supercomputers in the world. Now, they’re joining the fight against cancer. Learn about our new partnership with the National Cancer Institute and GlaxoSmithKline Pharmaceuticals.

  12. NAS-current status and future plans

    NASA Technical Reports Server (NTRS)

    Bailey, F. R.

    1987-01-01

    The Numerical Aerodynamic Simulation (NAS) Program has met its first major milestone, the NAS Processing System Network (NPSN) Initial Operating Configuration (IOC). The program has met its goal of providing a national supercomputer facility capable of greatly enhancing the Nation's research and development efforts. Furthermore, the program is fulfilling its pathfinder role by defining and implementing a paradigm for supercomputing system environments. The IOC is only the beginning, and the NAS Program will aggressively continue to develop and implement emerging supercomputer, communications, storage, and software technologies to strengthen computations as a critical element in supporting the Nation's leadership role in aeronautics.

  13. PLOT3D/AMES, DEC VAX VMS VERSION USING DISSPLA (WITHOUT TURB3D)

    NASA Technical Reports Server (NTRS)

    Buning, P. G.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into five groups: 1) Grid Functions for grids, grid-checking, etc.; 2) Scalar Functions for contour or carpet plots of density, pressure, temperature, Mach number, vorticity magnitude, helicity, etc.; 3) Vector Functions for vector plots of velocity, vorticity, momentum, and density gradient, etc.; 4) Particle Trace Functions for rake-like plots of particle flow or vortex lines; and 5) Shock locations based on pressure gradient. TURB3D is a modification of PLOT3D which is used for viewing CFD simulations of incompressible turbulent flow. Input flow data consists of pressure, velocity and vorticity. Typical quantities to plot include local fluctuations in flow quantities and turbulent production terms, plotted in physical or wall units. PLOT3D/TURB3D includes both TURB3D and PLOT3D because the operation of TURB3D is identical to PLOT3D, and there is no additional sample data or printed documentation for TURB3D. Graphical capabilities of PLOT3D version 3.6b+ vary among the implementations available through COSMIC. Customers are encouraged to purchase and carefully review the PLOT3D manual before ordering the program for a specific computer and graphics library. There is only one manual for use with all implementations of PLOT3D, and although this manual generally assumes that the Silicon Graphics Iris implementation is being used, informative comments concerning other implementations appear throughout the text. 
With all implementations, the visual representation of the object and flow field created by PLOT3D consists of points, lines, and polygons. Points can be represented with dots or symbols, color can be used to denote data values, and perspective is used to show depth. Differences among implementations impact the program's ability to use graphical features that are based on 3D polygons, the user's ability to manipulate the graphical displays, and the user's ability to obtain alternate forms of output. The VAX/VMS/DISSPLA implementation of PLOT3D supports 2-D polygons as well as 2-D and 3-D lines, but does not support graphics features requiring 3-D polygons (shading and hidden line removal, for example). Views can be manipulated using keyboard commands. This version of PLOT3D is potentially able to produce files for a variety of output devices; however, site-specific capabilities will vary depending on the device drivers supplied with the user's DISSPLA library. If ARCGRAPH (ARC-12350) is installed on the user's VAX, the VMS/DISSPLA version of PLOT3D can also be used to create files for use in GAS (Graphics Animation System, ARC-12379), an IRIS program capable of animating and recording images on film. The version 3.6b+ VMS/DISSPLA implementations of PLOT3D (ARC-12777) and PLOT3D/TURB3D (ARC-12781) were developed for use on VAX computers running VMS Version 5.0 and DISSPLA Version 11.0. The standard distribution medium for each of these programs is a 9-track, 6250 bpi magnetic tape in DEC VAX BACKUP format. Customers purchasing one implementation version of PLOT3D or PLOT3D/TURB3D will be given a $200 discount on each additional implementation version ordered at the same time. Version 3.6b+ of PLOT3D and PLOT3D/TURB3D are also supported for the following computers and graphics libraries: (1) generic UNIX Supercomputer and IRIS, suitable for CRAY 2/UNICOS, CONVEX, and Alliant with remote IRIS 2xxx/3xxx or IRIS 4D (ARC-12779, ARC-12784); (2) Silicon Graphics IRIS 2xxx/3xxx or IRIS 4D (ARC-12783, ARC-12782); (3) generic UNIX and DISSPLA Version 11.0 (ARC-12788, ARC-12778); and (4) Apollo computers running UNIX and GMR3D Version 2.0 (ARC-12789, ARC-12785, which have no capability to put text on plots). Silicon Graphics Iris, IRIS 4D, and IRIS 2xxx/3xxx are trademarks of Silicon Graphics Incorporated. VAX and VMS are trademarks of Digital Equipment Corporation. DISSPLA is a trademark of Computer Associates. CRAY 2 and UNICOS are trademarks of CRAY Research, Incorporated. CONVEX is a trademark of Convex Computer Corporation. Alliant is a trademark of Alliant. Apollo and GMR3D are trademarks of Hewlett-Packard, Incorporated. UNIX is a registered trademark of AT&T.
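
    For illustration, derived quantities of the kind PLOT3D plots can be computed from the solution-file variables listed above (density, the three momentum components, and stagnation energy per grid point); the sketch below assumes an ideal gas with gamma = 1.4 and nondimensional values, and is not PLOT3D's own file reader.

      import numpy as np

      GAMMA = 1.4

      def derived_quantities(rho, mx, my, mz, e_total):
          # Inputs are NumPy arrays over the grid points.
          u, v, w = mx / rho, my / rho, mz / rho
          kinetic = 0.5 * rho * (u**2 + v**2 + w**2)
          pressure = (GAMMA - 1.0) * (e_total - kinetic)
          speed_of_sound = np.sqrt(GAMMA * pressure / rho)
          mach = np.sqrt(u**2 + v**2 + w**2) / speed_of_sound
          return pressure, mach

      # Tiny example on two grid points (nondimensional placeholder values).
      rho = np.array([1.0, 0.9])
      p, m = derived_quantities(rho, rho * 0.5, rho * 0.0, rho * 0.0,
                                np.array([2.0, 1.8]))
      print(p, m)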

  14. PLOT3D/AMES, DEC VAX VMS VERSION USING DISSPLA (WITH TURB3D)

    NASA Technical Reports Server (NTRS)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into five groups: 1) Grid Functions for grids, grid-checking, etc.; 2) Scalar Functions for contour or carpet plots of density, pressure, temperature, Mach number, vorticity magnitude, helicity, etc.; 3) Vector Functions for vector plots of velocity, vorticity, momentum, and density gradient, etc.; 4) Particle Trace Functions for rake-like plots of particle flow or vortex lines; and 5) Shock locations based on pressure gradient. TURB3D is a modification of PLOT3D which is used for viewing CFD simulations of incompressible turbulent flow. Input flow data consists of pressure, velocity and vorticity. Typical quantities to plot include local fluctuations in flow quantities and turbulent production terms, plotted in physical or wall units. PLOT3D/TURB3D includes both TURB3D and PLOT3D because the operation of TURB3D is identical to PLOT3D, and there is no additional sample data or printed documentation for TURB3D. Graphical capabilities of PLOT3D version 3.6b+ vary among the implementations available through COSMIC. Customers are encouraged to purchase and carefully review the PLOT3D manual before ordering the program for a specific computer and graphics library. There is only one manual for use with all implementations of PLOT3D, and although this manual generally assumes that the Silicon Graphics Iris implementation is being used, informative comments concerning other implementations appear throughout the text. 
With all implementations, the visual representation of the object and flow field created by PLOT3D consists of points, lines, and polygons. Points can be represented with dots or symbols, color can be used to denote data values, and perspective is used to show depth. Differences among implementations impact the program's ability to use graphical features that are based on 3D polygons, the user's ability to manipulate the graphical displays, and the user's ability to obtain alternate forms of output. The VAX/VMS/DISSPLA implementation of PLOT3D supports 2-D polygons as well as 2-D and 3-D lines, but does not support graphics features requiring 3-D polygons (shading and hidden line removal, for example). Views can be manipulated using keyboard commands. This version of PLOT3D is potentially able to produce files for a variety of output devices; however, site-specific capabilities will vary depending on the device drivers supplied with the user's DISSPLA library. If ARCGRAPH (ARC-12350) is installed on the user's VAX, the VMS/DISSPLA version of PLOT3D can also be used to create files for use in GAS (Graphics Animation System, ARC-12379), an IRIS program capable of animating and recording images on film. The version 3.6b+ VMS/DISSPLA implementations of PLOT3D (ARC-12777) and PLOT3D/TURB3D (ARC-12781) were developed for use on VAX computers running VMS Version 5.0 and DISSPLA Version 11.0. The standard distribution medium for each of these programs is a 9-track, 6250 bpi magnetic tape in DEC VAX BACKUP format. Customers purchasing one implementation version of PLOT3D or PLOT3D/TURB3D will be given a $200 discount on each additional implementation version ordered at the same time. Version 3.6b+ of PLOT3D and PLOT3D/TURB3D are also supported for the following computers and graphics libraries: (1) generic UNIX Supercomputer and IRIS, suitable for CRAY 2/UNICOS, CONVEX, and Alliant with remote IRIS 2xxx/3xxx or IRIS 4D (ARC-12779, ARC-12784); (2) Silicon Graphics IRIS 2xxx/3xxx or IRIS 4D (ARC-12783, ARC-12782); (3) generic UNIX and DISSPLA Version 11.0 (ARC-12788, ARC-12778); and (4) Apollo computers running UNIX and GMR3D Version 2.0 (ARC-12789, ARC-12785, which have no capability to put text on plots). Silicon Graphics Iris, IRIS 4D, and IRIS 2xxx/3xxx are trademarks of Silicon Graphics Incorporated. VAX and VMS are trademarks of Digital Equipment Corporation. DISSPLA is a trademark of Computer Associates. CRAY 2 and UNICOS are trademarks of CRAY Research, Incorporated. CONVEX is a trademark of Convex Computer Corporation. Alliant is a trademark of Alliant. Apollo and GMR3D are trademarks of Hewlett-Packard, Incorporated. UNIX is a registered trademark of AT&T.

  15. Roadrunner Supercomputer Breaks the Petaflop Barrier

    ScienceCinema

    Los Alamos National Lab - Brian Albright, Charlie McMillan, Lin Yin

    2017-12-09

    At 3:30 a.m. on May 26, 2008, Memorial Day, the "Roadrunner" supercomputer exceeded a sustained speed of 1 petaflop/s, or 1 million billion calculations per second. The sustained performance makes Roadrunner more than twice as fast as the current number 1

  16. QCD on the BlueGene/L Supercomputer

    NASA Astrophysics Data System (ADS)

    Bhanot, G.; Chen, D.; Gara, A.; Sexton, J.; Vranas, P.

    2005-03-01

    In June 2004 QCD was simulated for the first time at sustained speed exceeding 1 TeraFlops in the BlueGene/L supercomputer at the IBM T.J. Watson Research Lab. The implementation and performance of QCD in the BlueGene/L is presented.

  17. Supercomputer Issues from a University Perspective.

    ERIC Educational Resources Information Center

    Beering, Steven C.

    1984-01-01

    Discusses issues related to the access of and training of university researchers in using supercomputers, considering National Science Foundation's (NSF) role in this area, microcomputers on campuses, and the limited use of existing telecommunication networks. Includes examples of potential scientific projects (by subject area) utilizing…

  18. Tropical Cyclone Wind Probability Forecasting for the Eastern North Pacific (EPWINDP).

    DTIC Science & Technology

    1982-04-01

    Distribution list excerpt (addresses interleaved in extraction): Instituto de Geofísica, U.N.A.M., Torre de Ciencias, Ciudad Universitaria, Mexico 20, D.F.; National Hurricane Center, NOAA, Gables One Tower, 1320 S. Dixie Hwy., Coral Gables, FL 33146; Chairman, Meteorology Dept., University of Wisconsin, 1225 W. Dayton Street; Director, Honolulu, HI 96822.

  19. The Student Strike at the National Autonomous University of Mexico: A Political Analysis.

    ERIC Educational Resources Information Center

    Rhoads, Robert A.; Mina, Liliana

    2001-01-01

    Analyzes political tensions related to student strikes at the Universidad Nacional Autonoma de Mexico (UNAM), 1998-2000, which were sparked by proposed tuition fees. Discusses conflict between social justice sentiments focused on free and egalitarian access to higher education versus market-driven views promoting selective, competitive higher…

  20. PISMA: A Visual Representation of Motif Distribution in DNA Sequences.

    PubMed

    Alcántara-Silva, Rogelio; Alvarado-Hermida, Moisés; Díaz-Contreras, Gibrán; Sánchez-Barrios, Martha; Carrera, Samantha; Galván, Silvia Carolina

    2017-01-01

    Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypotheses, PISMA aims at identifying motifs in DNA sequences, counting them, and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code-like scheme, a gene-map-like scheme, and a transcript scheme. We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another of motifs associated with exosome RNA secretion. PISMA is developed in Java; it is executable on any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf.
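
    The core operation PISMA performs, locating every (possibly overlapping) occurrence of a short motif along a DNA sequence, can be sketched as follows; PISMA itself is written in Java, so this Python fragment is purely illustrative.

      def motif_positions(sequence, motif):
          # Return the start index of every occurrence, including overlapping ones.
          sequence, motif = sequence.upper(), motif.upper()
          positions, start = [], 0
          while True:
              i = sequence.find(motif, start)
              if i == -1:
                  return positions
              positions.append(i)
              start = i + 1          # step by 1 so overlapping hits are kept

      seq = "ACGCGTTACGGCGCG"
      hits = motif_positions(seq, "CG")       # the CpG dinucleotide
      print(len(hits), "occurrences at positions", hits)   # 5 occurrences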

  1. The Observatorio Astronómico Nacional - Tonantzintla: Site Evaluation

    NASA Astrophysics Data System (ADS)

    Hernández-Toledo, H. M.; Martínez-Vázquez, L. A.; Pani-Cielo, A.

    2011-06-01

    The objective of this contribution is to present some results of an evaluation of the local conditions at the site that were considered in order to propose that the Observatorio Astronómico Nacional, Tonantzintla (OAN-Tonantzintla) become a National Facility for Astronomy Education. The evaluation included a quantitative diagnostic (CCD photometry) of the quality of the local sky. The attributes of the 1-m telescope, the current instrumentation, and a well planned upgrade that includes new instrumentation are considered the basis for a successful transition that maintains the attractiveness of the site for astronomy education. A 3-year upgrading program currently in progress at UNAM is providing funding for that purpose. Physics and astronomy programs at the undergraduate and graduate levels at UNAM will benefit from this, yielding clear connections among astronomy researchers, educators, and students at various levels. Although OAN-Tonantzintla faces the danger of deteriorating sky conditions, we are maintaining awareness of the night sky characteristics through long-term monitoring campaigns and encouraging the local authorities to find alternative solutions to this problem.

  2. PISMA: A Visual Representation of Motif Distribution in DNA Sequences

    PubMed Central

    Alcántara-Silva, Rogelio; Alvarado-Hermida, Moisés; Díaz-Contreras, Gibrán; Sánchez-Barrios, Martha; Carrera, Samantha; Galván, Silvia Carolina

    2017-01-01

    Background: Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypotheses, PISMA aims at identifying motifs in DNA sequences, counting them, and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code-like scheme, a gene-map-like scheme, and a transcript scheme. Results: We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another of motifs associated with exosome RNA secretion. Availability and Implementation: PISMA is developed in Java; it is executable on any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf. PMID:28469418

  3. Macroseismic Intensity and Instrumental Ground Motion Parameter Correlations for Central Mexico

    NASA Astrophysics Data System (ADS)

    Sandoval Gómez, H.; Ramirez Guzman, L.; Espindola, V.

    2012-12-01

    We present instrumental intensity prediction equations for earthquakes in Central Mexico based on the correlation of observed Instrumental Ground-Motion Parameters (IGMP) and Modified Mercalli Intensity (MMI) scale reports. The goal of this study is to provide a model that can be used by the near real-time earthquake response system operated by the Institutes of Engineering and Geophysics at the National Autonomous University of Mexico (UNAM), which delivers estimates of key information associated with the societal impact of earthquakes that is not available in the immediate aftermath of the event. Correlations of MMI and IGMP have been derived in other countries with different tectonic settings and built environments, but this is the first study devoted to the development of equations for the central region of Mexico. The IGMP are obtained from records of several stations for earthquakes with Mw 5.0-8.0 from the seismic networks operated by UNAM and other institutions in Mexico. The MMI observations were primarily obtained from the Did You Feel It (DYFI) report service of the U.S. Geological Survey and re-interpreted MMI reports from UNAM earthquake bulletin archives. For each instrumental observation we assigned a mean MMI intensity based on the proximity of the sites where reported values are available, constrained by geological conditions and a visual inspection to guarantee that the intensity would be within one unit of the assigned value, following the procedure of Atkinson and Kaka (2006). We derived correlations for peak ground velocity (pgv) and acceleration, and three spectral acceleration periods (T = 1, 2, and 3 s). In addition, we analyzed the Mw and distance dependence. We concluded that pgv and the spectral accelerations are the most useful IGMP predictors for MMI in the region of interest and that the correlations differ significantly from those obtained in regions with other tectonic settings and infrastructure vulnerabilities (e.g. Wald et al., 1999; Atkinson and Kaka, 2006; Cramer and Dangkua, 2011).
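
    A correlation of this general form, MMI as a linear function of log10(pgv), can be fitted by least squares as sketched below; the sample values are synthetic placeholders, not the Central Mexico observations or the paper's coefficients.

      import numpy as np

      pgv_cm_s = np.array([0.5, 1.2, 3.0, 8.0, 20.0, 45.0])   # peak ground velocity
      mmi_obs = np.array([3.0, 4.0, 4.5, 5.5, 6.5, 7.5])       # assigned intensities

      # Fit MMI ~ a + b * log10(pgv) by ordinary least squares.
      X = np.column_stack([np.ones_like(pgv_cm_s), np.log10(pgv_cm_s)])
      coeffs, *_ = np.linalg.lstsq(X, mmi_obs, rcond=None)
      a, b = coeffs
      print(f"MMI ~ {a:.2f} + {b:.2f} * log10(pgv)")

      # Predict intensity for a new pgv value.
      print("predicted MMI at pgv = 10 cm/s:", round(a + b * np.log10(10.0), 1))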

  4. Advances and issues from the simulation of planetary magnetospheres with recent supercomputer systems

    NASA Astrophysics Data System (ADS)

    Fukazawa, K.; Walker, R. J.; Kimura, T.; Tsuchiya, F.; Murakami, G.; Kita, H.; Tao, C.; Murata, K. T.

    2016-12-01

    Planetary magnetospheres are very large, while phenomena within them occur on meso- and micro-scales, ranging from tens of planetary radii to kilometers. To understand dynamics in these multi-scale systems, numerical simulations have been performed using supercomputer systems. We have long studied the magnetospheres of Earth, Jupiter, and Saturn using 3-dimensional magnetohydrodynamic (MHD) simulations; however, we have not captured phenomena near the limits of the MHD approximation. In particular, we have not studied meso-scale phenomena that can be addressed by using MHD. Recently we performed an MHD simulation of Earth's magnetosphere using the K computer, the first 10-PFlops supercomputer, and obtained multi-scale flow vorticity for both northward and southward IMF. Furthermore, we have access to supercomputer systems with Xeon, SPARC64, and vector-type CPUs, and can compare simulation results across the different systems. Finally, we have compared the results of our parameter survey of the magnetosphere with observations from the HISAKI spacecraft. We have encountered a number of difficulties in effectively using the latest supercomputer systems. First, the size of the simulation output increases greatly: a simulation group now produces over 1 PB of output, and storing and analyzing this much data is difficult. The traditional way to analyze simulation results is to move the results to the investigator's home computer; this takes over three months using an end-to-end 10-Gbps network, and in reality, problems at some nodes such as firewalls can increase the transfer time to over one year. Another issue is post-processing: it is hard to handle a few TB of simulation output due to the memory limitations of a post-processing computer. To overcome these issues, we have developed and introduced parallel network storage, a highly efficient network protocol, and CUI-based visualization tools. In this study, we will show the latest simulation results using the petascale supercomputer and problems arising from the use of these supercomputer systems.

  5. Integration Of PanDA Workload Management System With Supercomputers for ATLAS and Data Intensive Science

    NASA Astrophysics Data System (ADS)

    Klimentov, A.; De, K.; Jha, S.; Maeno, T.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Wells, J.; Wenaus, T.

    2016-10-01

    The LHC, operating at CERN, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3 petaFLOPS, LHC data taking runs require more resources than the grid can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in the United States, in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility. The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on the LCFs' multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms for the ALICE and ATLAS experiments, and it has been in full production for ATLAS since September 2015. We will present our current accomplishments with running PanDA at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facility's infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.

  6. RAID/C90 Technology Integration

    NASA Technical Reports Server (NTRS)

    Ciotti, Bob; Cooper, D. M. (Technical Monitor)

    1994-01-01

    In March 1993, NAS was the first to connect a Maximum Strategy RAID disk to the C90 using standard Cray provided software. This paper discusses the problems encountered, lessons learned, and performance achieved.

  7. Finite element methods on supercomputers - The scatter-problem

    NASA Technical Reports Server (NTRS)

    Loehner, R.; Morgan, K.

    1985-01-01

    Certain problems arise in connection with the use of supercomputers for the implementation of finite-element methods. These problems are related to the desirability of utilizing the power of the supercomputer as fully as possible for the rapid execution of the required computations, taking into account the gain in speed possible with the aid of pipelined (vector) operations. For the finite-element method, the time-consuming operations may be divided into three categories. The first two present no problems, while the third, the scatter operation, can cause inefficient performance of finite-element programs. Two possibilities for overcoming these difficulties are proposed, with attention given to the scatter process.
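
    The scatter step in question can be illustrated on a toy mesh: element contributions are accumulated into a global vector through an index list, and elements sharing a node would collide if processed simultaneously, which is what inhibits straightforward vectorization. Grouping elements into "colors" with no shared nodes is one classic remedy; the 1-D mesh below is a made-up example, not taken from the paper.

      import numpy as np

      # Four line elements over five nodes: element e connects nodes conn[e].
      conn = np.array([[0, 1], [1, 2], [2, 3], [3, 4]])
      elem_contrib = np.ones((4, 2))      # each element adds 1.0 to its two nodes

      global_vec = np.zeros(5)

      # Elements 0 and 2 share no nodes, nor do 1 and 3: two "colors", each of
      # which can be scattered safely in a single vectorized pass.
      for color in ([0, 2], [1, 3]):
          idx = conn[color].ravel()
          np.add.at(global_vec, idx, elem_contrib[color].ravel())

      print(global_vec)   # [1. 2. 2. 2. 1.]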

  8. Code IN Exhibits - Supercomputing 2000

    NASA Technical Reports Server (NTRS)

    Yarrow, Maurice; McCann, Karen M.; Biswas, Rupak; VanderWijngaart, Rob F.; Kwak, Dochan (Technical Monitor)

    2000-01-01

    The creation of parameter study suites has recently become a more challenging problem as the parameter studies have become multi-tiered and the computational environment has become a supercomputer grid. The parameter spaces are vast, the individual problem sizes are getting larger, and researchers are seeking to combine several successive stages of parameterization and computation. Simultaneously, grid-based computing offers immense resource opportunities but at the expense of great difficulty of use. We present ILab, an advanced graphical user interface approach to this problem. Our novel strategy stresses intuitive visual design tools for parameter study creation and complex process specification, and also offers programming-free access to grid-based supercomputer resources and process automation.
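
    The core task ILab automates, expanding a multi-tiered parameter space into individual runs, reduces to a Cartesian product; the parameter names and values below are placeholders, and the real tool also handles grid submission, staging, and process automation.

      from itertools import product

      # Two tiers of parameters: a flow-condition sweep feeding a grid-level sweep.
      stage1 = {"mach": [0.7, 0.8, 0.9], "alpha_deg": [0.0, 2.0, 4.0]}
      stage2 = {"grid_level": ["coarse", "fine"]}

      runs = []
      for mach, alpha in product(stage1["mach"], stage1["alpha_deg"]):
          for grid in stage2["grid_level"]:
              runs.append({"mach": mach, "alpha_deg": alpha, "grid_level": grid})

      print(len(runs), "runs, e.g.", runs[0])   # 18 runs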

  9. NSF Establishes First Four National Supercomputer Centers.

    ERIC Educational Resources Information Center

    Lepkowski, Wil

    1985-01-01

    The National Science Foundation (NSF) has awarded support for supercomputer centers at Cornell University, Princeton University, University of California (San Diego), and University of Illinois. These centers are to be the nucleus of a national academic network for use by scientists and engineers throughout the United States. (DH)

  10. Library Services in a Supercomputer Center.

    ERIC Educational Resources Information Center

    Layman, Mary

    1991-01-01

    Describes library services that are offered at the San Diego Supercomputer Center (SDSC), which is located at the University of California at San Diego. Topics discussed include the user population; online searching; microcomputer use; electronic networks; current awareness programs; library catalogs; and the slide collection. A sidebar outlines…

  11. Probing the cosmic causes of errors in supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    Cosmic rays from outer space are causing errors in supercomputers. The neutrons that pass through the CPU may be causing binary data to flip leading to incorrect calculations. Los Alamos National Laboratory has developed detectors to determine how much data is being corrupted by these cosmic particles.

  12. Navier-Stokes Aerodynamic Simulation of the V-22 Osprey on the Intel Paragon MPP

    NASA Technical Reports Server (NTRS)

    Vadyak, Joseph; Shrewsbury, George E.; Narramore, Jim C.; Montry, Gary; Holst, Terry; Kwak, Dochan (Technical Monitor)

    1995-01-01

    The paper will describe the development of a general three-dimensional multiple grid zone Navier-Stokes flowfield simulation program (ENS3D-MPP) designed for efficient execution on the Intel Paragon Massively Parallel Processor (MPP) supercomputer, and the subsequent application of this method to the prediction of the viscous flowfield about the V-22 Osprey tiltrotor vehicle. The flowfield simulation code solves the thin-layer or full Navier-Stokes equations for viscous flow modeling, or the Euler equations for inviscid flow modeling, on a structured multi-zone mesh. In the present paper only viscous simulations will be shown. The governing difference equations are solved using a time marching implicit approximate factorization method with either TVD upwind or central differencing used for the convective terms and central differencing used for the viscous diffusion terms. Steady state or time accurate solutions can be calculated. The present paper will focus on steady state applications, although time accurate solution analysis is the ultimate goal of this effort. Laminar viscosity is calculated using Sutherland's law, and the Baldwin-Lomax two layer algebraic turbulence model is used to compute the eddy viscosity. The simulation method uses an arbitrary block, curvilinear grid topology. An automatic grid adaption scheme is incorporated which concentrates grid points in high density gradient regions. A variety of user-specified boundary conditions are available. This paper will present the application of the scalable and superscalable versions to the steady state viscous flow analysis of the V-22 Osprey using a multiple zone global mesh. The mesh consists of a series of sheared cartesian grid blocks with polar grids embedded within to better simulate the wing tip mounted nacelle. MPP solutions will be shown in comparison to equivalent Cray C-90 results and also in comparison to experimental data. Discussions on meshing considerations, wall clock execution time, load balancing, and scalability will be provided.
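
    Sutherland's law, named above as the laminar-viscosity model, can be written out as a short function; the reference constants below are commonly quoted values for air, and the solver's own (possibly nondimensional) constants may differ.

      # mu(T) = mu_ref * (T / T_ref)^(3/2) * (T_ref + S) / (T + S)
      def sutherland_viscosity(T_kelvin,
                               mu_ref=1.716e-5,   # Pa*s at T_ref (common value for air)
                               T_ref=273.15,      # K
                               S=110.4):          # Sutherland constant for air, K
          return mu_ref * (T_kelvin / T_ref) ** 1.5 * (T_ref + S) / (T_kelvin + S)

      for T in (250.0, 300.0, 400.0):
          print(f"T = {T:5.1f} K  ->  mu = {sutherland_viscosity(T):.3e} Pa*s")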

  13. The application of CFD to rotary wing flow problems

    NASA Technical Reports Server (NTRS)

    Caradonna, F. X.

    1990-01-01

    Rotorcraft aerodynamics is especially rich in unsolved problems, and for this reason the need for independent computational and experimental studies is great. Three-dimensional unsteady, nonlinear potential methods are becoming fast enough to enable their use in parametric design studies. At present, combined CAMRAD/FPR analyses for a complete trimmed rotor solution can be performed in about an hour on a CRAY Y-MP (or ten minutes, with multiple processors). These computational speeds indicate that in the near future many of the large CFD problems will no longer require a supercomputer. The ability to convect circulation is routine for integral methods, but only recently was it discovered how to do the same with differential methods. It is clear that the differential CFD rotor analyses are poised to enter the engineering workplace. Integral methods already constitute a mainstay. Ultimately, it is the users who will integrate CFD into the entire engineering process and provide a new measure of confidence in design and analysis. It should be recognized that the above classes of analyses do not include several major limiting phenomena which will continue to require empirical treatment because of computational time constraints and limited physical understanding. Such empirical treatment should be incorporated, however, into the developing CFD, engineering level analyses. It is likely that properly constructed flow models containing corrections from physical testing will be able to fill in unavoidable gaps in the experimental data base, both for basic studies and for specific configuration testing. For these kinds of applications, computational cost is not an issue. Finally, it should be recognized that although rotorcraft are probably the most complex of aircraft, the rotorcraft engineering community is very small compared to the fixed-wing community. Likewise, rotorcraft CFD resources can never achieve fixed-wing proportions and must be used wisely. Therefore the fixed-wing work must be gleaned for many of the basic methods.

  14. Flux-Level Transit Injection Experiments with NASA Pleiades Supercomputer

    NASA Astrophysics Data System (ADS)

    Li, Jie; Burke, Christopher J.; Catanzarite, Joseph; Seader, Shawn; Haas, Michael R.; Batalha, Natalie; Henze, Christopher; Christiansen, Jessie; Kepler Project, NASA Advanced Supercomputing Division

    2016-06-01

    Flux-Level Transit Injection (FLTI) experiments are executed with NASA's Pleiades supercomputer for the Kepler Mission. The latest release (9.3, January 2016) of the Kepler Science Operations Center Pipeline is used in the FLTI experiments. Their purpose is to validate the Analytic Completeness Model (ACM), which can be computed for all Kepler target stars, thereby enabling exoplanet occurrence rate studies. Pleiades, a facility of NASA's Advanced Supercomputing Division, is one of the world's most powerful supercomputers and represents NASA's state-of-the-art technology. We discuss the details of implementing the FLTI experiments on the Pleiades supercomputer. For example, taking into account that ~16 injections are generated by one core of the Pleiades processors in an hour, the “shallow” FLTI experiment, in which ~2000 injections are required per target star, can be done for 16% of all Kepler target stars in about 200 hours. Stripping down the transit search to bare bones, i.e. only searching adjacent high/low periods at high/low pulse durations, makes the computationally intensive FLTI experiments affordable. The design of the FLTI experiments and the analysis of the resulting data are presented in “Validating an Analytic Completeness Model for Kepler Target Stars Based on Flux-level Transit Injection Experiments” by Catanzarite et al. (#2494058).Kepler was selected as the 10th mission of the Discovery Program. Funding for the Kepler Mission has been provided by the NASA Science Mission Directorate.
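
    The throughput arithmetic quoted above can be worked through explicitly; the total number of Kepler target stars is a rough assumed figure (on the order of 200,000) used only for illustration, not a value from the abstract.

      injections_per_core_hour = 16
      injections_per_star = 2000
      fraction_of_targets = 0.16
      assumed_kepler_targets = 200_000      # assumption for illustration only
      wall_hours = 200

      core_hours_per_star = injections_per_star / injections_per_core_hour   # 125
      stars = fraction_of_targets * assumed_kepler_targets                    # 32,000
      total_core_hours = core_hours_per_star * stars
      cores_needed = total_core_hours / wall_hours

      print(f"{core_hours_per_star:.0f} core-hours per star")
      print(f"{total_core_hours:,.0f} core-hours total")
      print(f"~{cores_needed:,.0f} cores busy for {wall_hours} wall-clock hours")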

  15. Input-independent, Scalable and Fast String Matching on the Cray XMT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Villa, Oreste; Chavarría-Miranda, Daniel; Maschhoff, Kristyn J

    2009-05-25

    String searching is at the core of many security and network applications like search engines, intrusion detection systems, virus scanners and spam filters. The growing size of on-line content and the increasing wire speeds push the need for fast, and often real-time, string searching solutions. For these conditions, many software implementations (if not all) targeting conventional cache-based microprocessors do not perform well. They either exhibit overall low performance or exhibit highly variable performance depending on the types of inputs. For this reason, real-time state of the art solutions rely on the use of either custom hardware or Field-Programmable Gate Arrays (FPGAs) at the expense of overall system flexibility and programmability. This paper presents a software based implementation of the Aho-Corasick string searching algorithm on the Cray XMT multithreaded shared memory machine. Our solution relies on the particular features of the XMT architecture and on several algorithmic strategies: it is fast, scalable and its performance is virtually content-independent. On a 128-processor Cray XMT, it reaches a scanning speed of ≈ 28 Gbps with a performance variability below 10%. In the 10 Gbps performance range, variability is below 2.5%. By comparison, an Intel dual-socket, 8-core system running at 2.66 GHz achieves a peak performance which varies from 500 Mbps to 10 Gbps depending on the type of input and dictionary size.
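
    Aho-Corasick builds a trie over the dictionary plus failure links, so the text is scanned once regardless of how many patterns are searched; the compact sketch below illustrates the algorithm only, since the paper's contribution is its multithreaded mapping onto the Cray XMT.

      from collections import deque

      def build_automaton(patterns):
          goto, fail, out = [{}], [0], [set()]
          for pat in patterns:                      # build the trie
              state = 0
              for ch in pat:
                  nxt = goto[state].get(ch)
                  if nxt is None:
                      goto.append({}); fail.append(0); out.append(set())
                      nxt = len(goto) - 1
                      goto[state][ch] = nxt
                  state = nxt
              out[state].add(pat)
          queue = deque(goto[0].values())           # BFS to set failure links
          while queue:
              s = queue.popleft()
              for ch, t in goto[s].items():
                  queue.append(t)
                  f = fail[s]
                  while f and ch not in goto[f]:
                      f = fail[f]
                  fail[t] = goto[f].get(ch, 0)
                  out[t] |= out[fail[t]]            # inherit matches from suffix state
          return goto, fail, out

      def search(text, goto, fail, out):
          state, hits = 0, []
          for i, ch in enumerate(text):
              while state and ch not in goto[state]:
                  state = fail[state]
              state = goto[state].get(ch, 0)
              for pat in out[state]:
                  hits.append((i - len(pat) + 1, pat))
          return hits

      automaton = build_automaton(["he", "she", "his", "hers"])
      # Expected matches in "ushers": "she" at 1, "he" at 2, "hers" at 2
      # (the order of simultaneous matches may vary).
      print(search("ushers", *automaton))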

  16. Introduction and Committees

    NASA Astrophysics Data System (ADS)

    Angelova, Maia; Zakrzewski, Wojciech; Hussin, Véronique; Piette, Bernard

    2011-03-01

    This volume contains contributions to the XXVIIIth International Colloquium on Group-Theoretical Methods in Physics, the GROUP 28 conference, which took place in Newcastle upon Tyne from 26-30 July 2010. All plenary and contributed papers have undergone an independent review; as a result of this review and the decisions of the Editorial Board, most but not all of the contributions were accepted. The volume is organised as follows: it starts with notes in memory of Marcos Moshinsky, followed by contributions related to the Wigner Medal and Hermann Weyl prize. Then the invited talks at the plenary sessions and the public lecture are published, followed by contributions in the parallel and poster sessions in alphabetical order. The Editors: Maia Angelova, Wojciech Zakrzewski, Véronique Hussin and Bernard Piette. International Advisory Committee: Michael Baake, University of Bielefeld, Germany; Gerald Dunne, University of Connecticut, USA; J F (Frank) Gomes, UNESP, Sao Paolo, Brazil; Peter Hanggi, University of Augsburg, Germany; Jeffrey C Lagarias, University of Michigan, USA; Michael Mackey, McGill University, Canada; Nicholas Manton, Cambridge University, UK; Alexei Morozov, ITEP, Moscow, Russia; Valery Rubakov, INR, Moscow, Russia; Barry Sanders, University of Calgary, Canada; Allan Solomon, Open University, Milton Keynes, UK; Christoph Schweigert, University of Hamburg, Germany. Standing Committee: Twareque Ali, Concordia University, Canada; Luis Boya, Salamanca University, Spain; Enrico Celeghini, Firenze University, Italy; Vladimir Dobrev, Bulgarian Academy of Sciences, Bulgaria; Heinz-Dietrich Doebner, Honorary Member, Clausthal University, Germany; Jean-Pierre Gazeau, Chairman, Paris Diderot University, France; Mo-Lin Ge, Nankai University, China; Gerald Goldin, Rutgers University, USA; Francesco Iachello, Yale University, USA; Joris Van der Jeugt, Ghent University, Belgium; Richard Kerner, Pierre et Marie Curie University, France; Piotr Kielanowski, CINVESTAV, Mexico; Alan Kostelecky, Indiana University, USA; Mariano del Olmo, Valladolid University, Spain; George Pogosyan, UNAM, Mexico, and JINR, Dubna, Russia; Christoph Schweigert, University of Hamburg, Germany; Reidun Twarock, York University, UK; Luc Vinet, Montréal University, Canada; Apostolos Vourdas, Bradford University, UK; Kurt Wolf, UNAM, Mexico. Local Organising Committee: Maia Angelova (Chair), Northumbria University, Newcastle; Wojtek Zakrzewski (Chair), Durham University, Durham; Sarah Howells (Secretary), Northumbria University, Newcastle; Jeremy Ellman (Web), Northumbria University, Newcastle; Véronique Hussin, Northumbria, Durham and University of Montréal; Safwat Mansi, Northumbria University, Newcastle; James McLaughlin, Northumbria University, Newcastle; Bernard Piette, Durham University, Durham; Ghanim Putrus, Northumbria University, Newcastle; Sarah Rees, Newcastle University, Newcastle; Petia Sice, Northumbria University, Newcastle; Anne Taormina, Durham University, Durham; Rosemary Zakrzewski, accompanying persons programme. Lighthouse photograph by Bernard Piette: Souter Lighthouse, Marsden, Tyne and Wear, England.

  17. The Sky's the Limit When Super Students Meet Supercomputers.

    ERIC Educational Resources Information Center

    Trotter, Andrew

    1991-01-01

    In a few select high schools in the U.S., supercomputers are allowing talented students to attempt sophisticated research projects using simultaneous simulations of nature, culture, and technology not achievable by ordinary microcomputers. Schools can get their students online by entering contests and seeking grants and partnerships with…

  18. NSF Says It Will Support Supercomputer Centers in California and Illinois.

    ERIC Educational Resources Information Center

    Strosnider, Kim; Young, Jeffrey R.

    1997-01-01

    The National Science Foundation will increase support for supercomputer centers at the University of California, San Diego and the University of Illinois, Urbana-Champaign, while leaving unclear the status of the program at Cornell University (New York) and a cooperative Carnegie-Mellon University (Pennsylvania) and University of Pittsburgh…

  19. Access to Supercomputers. Higher Education Panel Report 69.

    ERIC Educational Resources Information Center

    Holmstrom, Engin Inel

    This survey was conducted to provide the National Science Foundation with baseline information on current computer use in the nation's major research universities, including the actual and potential use of supercomputers. Questionnaires were sent to 207 doctorate-granting institutions; after follow-ups, 167 institutions (91% of the institutions…

  20. NOAA announces significant investment in next generation of supercomputers

    Science.gov Websites

    Today, NOAA announced the next phase in the agency's efforts to increase supercomputing capacity, which in turn will lead to more timely, accurate, and reliable weather forecasts.

  1. Developments in the simulation of compressible inviscid and viscous flow on supercomputers

    NASA Technical Reports Server (NTRS)

    Steger, J. L.; Buning, P. G.

    1985-01-01

    In anticipation of future supercomputers, finite difference codes are rapidly being extended to simulate three-dimensional compressible flow about complex configurations. Some of these developments are reviewed. The importance of computational flow visualization and diagnostic methods to three-dimensional flow simulation is also briefly discussed.

  2. Computing and data processing

    NASA Technical Reports Server (NTRS)

    Smarr, Larry; Press, William; Arnett, David W.; Cameron, Alastair G. W.; Crutcher, Richard M.; Helfand, David J.; Horowitz, Paul; Kleinmann, Susan G.; Linsky, Jeffrey L.; Madore, Barry F.

    1991-01-01

    The applications of computers and data processing to astronomy are discussed. Among the topics covered are the emerging national information infrastructure, workstations and supercomputers, supertelescopes, digital astronomy, astrophysics in a numerical laboratory, community software, archiving of ground-based observations, dynamical simulations of complex systems, plasma astrophysics, and the remote control of fourth dimension supercomputers.

  3. University and Government in Mexico: Autonomy in an Authoritarian System.

    ERIC Educational Resources Information Center

    Levy, Daniel C.

    The relationship between the authoritarian state and higher education in Mexico is examined in this case study. Focus is on the National Autonomous University of Mexico (UNAM) since it receives 40 percent of the federal budget for higher education, which makes it a prime example of autonomy within an authoritarian political system. Using three…

  4. 15. BUILDING 239. SECTIONS AND DETAILS OF DRYING ROOMS AND ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    15. BUILDING 239. SECTIONS AND DETAILS OF DRYING ROOMS AND MIXING ROOMS. March 6, 1941. - Frankford Arsenal, Building Nos. 239-239A, Southeast corner of Clay Street & Cray Road, Philadelphia, Philadelphia County, PA

  5. Supercomputer use in orthopaedic biomechanics research: focus on functional adaptation of bone.

    PubMed

    Hart, R T; Thongpreda, N; Van Buskirk, W C

    1988-01-01

    The authors describe two biomechanical analyses carried out using numerical methods. One is an analysis of the stress and strain in a human mandible, and the other analysis involves modeling the adaptive response of a sheep bone to mechanical loading. The computing environment required for the two types of analyses is discussed. It is shown that a simple stress analysis of a geometrically complex mandible can be accomplished using a minicomputer. However, more sophisticated analyses of the same model with dynamic loading or nonlinear materials would require supercomputer capabilities. A supercomputer is also required for modeling the adaptive response of living bone, even when simple geometric and material models are used.

  6. NREL's Building-Integrated Supercomputer Provides Heating and Efficient Computing (Fact Sheet)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    2014-09-01

    NREL's Energy Systems Integration Facility (ESIF) is meant to investigate new ways to integrate energy sources so they work together efficiently, and one of the key tools to that investigation, a new supercomputer, is itself a prime example of energy systems integration. NREL teamed with Hewlett-Packard (HP) and Intel to develop the innovative warm-water, liquid-cooled Peregrine supercomputer, which not only operates efficiently but also serves as the primary source of building heat for ESIF offices and laboratories. This innovative high-performance computer (HPC) can perform more than a quadrillion calculations per second as part of the world's most energy-efficient HPC data center.

  7. Supercomputer optimizations for stochastic optimal control applications

    NASA Technical Reports Server (NTRS)

    Chung, Siu-Leung; Hanson, Floyd B.; Xu, Huihuang

    1991-01-01

    Supercomputer optimizations for a computational method of solving stochastic, multibody, dynamic programming problems are presented. The computational method is valid for a general class of optimal control problems that are nonlinear, multibody dynamical systems, perturbed by general Markov noise in continuous time, i.e., nonsmooth Gaussian as well as jump Poisson random white noise. Optimization techniques for vector multiprocessors or vectorizing supercomputers include advanced data structures, loop restructuring, loop collapsing, blocking, and compiler directives. These advanced computing techniques and supercomputing hardware help alleviate Bellman's curse of dimensionality in dynamic programming computations, by permitting the solution of large multibody problems. Possible applications include lumped flight dynamics models for uncertain environments, such as large scale and background random aerospace fluctuations.

  8. NAS Technical Summaries, March 1993 - February 1994

    NASA Technical Reports Server (NTRS)

    1995-01-01

    NASA created the Numerical Aerodynamic Simulation (NAS) Program in 1987 to focus resources on solving critical problems in aeroscience and related disciplines by utilizing the power of the most advanced supercomputers available. The NAS Program provides scientists with the necessary computing power to solve today's most demanding computational fluid dynamics problems and serves as a pathfinder in integrating leading-edge supercomputing technologies, thus benefitting other supercomputer centers in government and industry. The 1993-94 operational year concluded with 448 high-speed processor projects and 95 parallel projects representing NASA, the Department of Defense, other government agencies, private industry, and universities. This document provides a glimpse at some of the significant scientific results for the year.

  9. NAS technical summaries. Numerical aerodynamic simulation program, March 1992 - February 1993

    NASA Technical Reports Server (NTRS)

    1994-01-01

    NASA created the Numerical Aerodynamic Simulation (NAS) Program in 1987 to focus resources on solving critical problems in aeroscience and related disciplines by utilizing the power of the most advanced supercomputers available. The NAS Program provides scientists with the necessary computing power to solve today's most demanding computational fluid dynamics problems and serves as a pathfinder in integrating leading-edge supercomputing technologies, thus benefitting other supercomputer centers in government and industry. The 1992-93 operational year concluded with 399 high-speed processor projects and 91 parallel projects representing NASA, the Department of Defense, other government agencies, private industry, and universities. This document provides a glimpse at some of the significant scientific results for the year.

  10. TRASYS - THERMAL RADIATION ANALYZER SYSTEM (CRAY VERSION WITH NASADIG)

    NASA Technical Reports Server (NTRS)

    Anderson, G. E.

    1994-01-01

    The Thermal Radiation Analyzer System, TRASYS, is a computer software system with generalized capability to solve the radiation related aspects of thermal analysis problems. TRASYS computes the total thermal radiation environment for a spacecraft in orbit. The software calculates internode radiation interchange data as well as incident and absorbed heat rate data originating from environmental radiant heat sources. TRASYS provides data of both types in a format directly usable by such thermal analyzer programs as SINDA/FLUINT (available from COSMIC, program number MSC-21528). One primary feature of TRASYS is that it allows users to write their own driver programs to organize and direct the preprocessor and processor library routines in solving specific thermal radiation problems. The preprocessor first reads and converts the user's geometry input data into the form used by the processor library routines. Then, the preprocessor accepts the user's driving logic, written in the TRASYS modified FORTRAN language. In many cases, the user has a choice of routines to solve a given problem. Users may also provide their own routines where desirable. In particular, the user may write output routines to provide for an interface between TRASYS and any thermal analyzer program using the R-C network concept. Input to the TRASYS program consists of Options and Edit data, Model data, and Logic Flow and Operations data. Options and Edit data provide for basic program control and user edit capability. The Model data describe the problem in terms of geometry and other properties. This information includes surface geometry data, documentation data, nodal data, block coordinate system data, form factor data, and flux data. Logic Flow and Operations data house the user's driver logic, including the sequence of subroutine calls and the subroutine library. Output from TRASYS consists of two basic types of data: internode radiation interchange data, and incident and absorbed heat rate data. The flexible structure of TRASYS allows considerable freedom in the definition and choice of solution method for a thermal radiation problem. The program's flexible structure has also allowed TRASYS to retain the same basic input structure as the authors update it in order to keep up with changing requirements. Among its other important features are the following: 1) up to 3200 node problem size capability with shadowing by intervening opaque or semi-transparent surfaces; 2) choice of diffuse, specular, or diffuse/specular radiant interchange solutions; 3) a restart capability that minimizes recomputing; 4) macroinstructions that automatically provide the executive logic for orbit generation that optimizes the use of previously completed computations; 5) a time variable geometry package that provides automatic pointing of the various parts of an articulated spacecraft and an automatic look-back feature that eliminates redundant form factor calculations; 6) capability to specify submodel names to identify sets of surfaces or components as an entity; and 7) subroutines to perform functions which save and recall the internodal and/or space form factors in subsequent steps for nodes with fixed geometry during a variable geometry run. There are two machine versions of TRASYS v27: a DEC VAX version and a Cray UNICOS version. Both versions require installation of the NASADIG library (MSC-21801 for DEC VAX or COS-10049 for CRAY), which is available from COSMIC either separately or bundled with TRASYS. 
The NASADIG (NASA Device Independent Graphics Library) plot package provides a pictorial representation of input geometry, orbital/orientation parameters, and heating rate output as a function of time. NASADIG supports Tektronix terminals. The CRAY version of TRASYS v27 is written in FORTRAN 77 for batch or interactive execution and has been implemented on CRAY X-MP and CRAY Y-MP series computers running UNICOS. The standard distribution medium for MSC-21959 (CRAY version without NASADIG) is a 1600 BPI 9-track magnetic tape in UNIX tar format. The standard distribution medium for COS-10040 (CRAY version with NASADIG) is a set of two 6250 BPI 9-track magnetic tapes in UNIX tar format. Alternate distribution media and formats are available upon request. The DEC VAX version of TRASYS v27 is written in FORTRAN 77 for batch execution (only the plotting driver program is interactive) and has been implemented on a DEC VAX 8650 computer under VMS. Since the source codes for MSC-21030 and COS-10026 are in VAX/VMS text library files and DEC Command Language files, COSMIC will only provide these programs in the following formats: MSC-21030, TRASYS (DEC VAX version without NASADIG) is available on a 1600 BPI 9-track magnetic tape in VAX BACKUP format (standard distribution medium) or in VAX BACKUP format on a TK50 tape cartridge; COS-10026, TRASYS (DEC VAX version with NASADIG), is available in VAX BACKUP format on a set of three 6250 BPI 9-track magnetic tapes (standard distribution medium) or a set of three TK50 tape cartridges in VAX BACKUP format. TRASYS was last updated in 1993.

  11. The ASC Sequoia Programming Model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seager, M

    2008-08-06

    In the late 1980s and early 1990s, Lawrence Livermore National Laboratory was deeply engrossed in determining the next-generation programming model for the Integrated Design Codes (IDC) beyond vectorization for the Cray 1s series of computers. The vector model, developed in the mid-1970s first for the CDC 7600 and later extended from stack-based vector operations to memory-to-memory operations for the Cray 1s, lasted approximately 20 years (see Slide 5). The Cray vector era was deemed an extremely long-lived era, as it allowed vector codes to be developed over time (the Cray 1s were faster in scalar mode than the CDC 7600) with vector unit utilization increasing incrementally. The other attributes of the Cray vector era at LLNL were that we developed, supported and maintained the operating system (LTSS and later NLTSS), communications protocols (LINCS), compilers (Civic Fortran77 and Model), operating system tools (e.g., batch system, job control scripting, loaders, debuggers, editors, graphics utilities, you name it) and math and highly machine-optimized libraries (e.g., SLATEC and STACKLIB). Although LTSS was adopted by Cray for early system generations, they later developed the COS and UNICOS operating systems and environments on their own. In the late 1970s and early 1980s, two trends appeared that made the Cray vector programming model (described above, including both the hardware and system software aspects) seem potentially dated and slated for major revision. These trends were the appearance of low-cost CMOS microprocessors and their attendant departmental and mini-computers, and later workstations and personal computers. With the widespread adoption of Unix in the early 1980s, it appeared that LLNL (and the other DOE labs) would be left out of the mainstream of computing without a rapid transition to these 'Killer Micros' and modern OS and tools environments. The other interesting advance in the period was that systems were being developed with multiple 'cores' in them, called Symmetric Multi-Processor or Shared Memory Processor (SMP) systems. The parallel revolution had begun. The Laboratory started a small 'parallel processing project' in 1983 to study the new technology and its application to scientific computing with four people: Tim Axelrod, Pete Eltgroth, Paul Dubois and Mark Seager. Two years later, Eugene Brooks joined the team. This team focused on Unix and 'killer micro' SMPs; indeed, Eugene Brooks was credited with coining the 'Killer Micro' term. After several generations of SMP platforms (e.g., the Sequent Balance 8000 with 8 33MHz MC32032s, the Alliant FX8 with 8 MC68020s and FPGA-based vector units, and finally the BBN Butterfly with 128 cores), it became apparent to us that the killer micro revolution would indeed overtake Crays and that we definitely needed a new programming and systems model. The model developed by Mark Seager and Dale Nielsen focused on both the system aspects (Slide 3) and the code development aspects (Slide 4). Although now succinctly captured in two attached slides, at the time there was tremendous ferment in the research community as to which parallel programming model would emerge, dominate and survive. In addition, we wanted a model that would provide portability between platforms of a single generation but also longevity over multiple--and hopefully many--generations. Only after we developed the 'Livermore Model' and worked it out in considerable detail did it become obvious that what we came up with was the right approach.
    In a nutshell, the applications programming model of the Livermore Model posited that SMP parallelism would ultimately not scale indefinitely and one would have to bite the bullet and implement MPI parallelism within the Integrated Design Code (IDC). We also had a major emphasis on doing everything in a completely standards-based, portable methodology with POSIX/Unix as the target environment. We decided against specialized libraries like STACKLIB for performance, but kept as many general-purpose, portable math libraries as were needed by the codes. Third, we assumed that the SMPs in clusters would evolve in time to become more powerful, feature-rich and, in particular, offer more cores. Thus, we focused on OpenMP and POSIX Pthreads for programming SMP parallelism. These code porting efforts were led by Dale Nielsen, A-Division code group leader, and Randy Christensen, B-Division code group leader. Most of the porting effort revolved around removing 'Crayisms' in the codes: artifacts of LTSS/NLTSS, Civic compiler extensions beyond Fortran77, I/O libraries, and dealing with new code control languages (we switched to Perl and later to Python). Adding MPI to the codes was initially problematic and error-prone because the programmers used MPI directly and sprinkled the calls throughout the code.

  12. Congressional Panel Seeks To Curb Access of Foreign Students to U.S. Supercomputers.

    ERIC Educational Resources Information Center

    Kiernan, Vincent

    1999-01-01

    Fearing security problems, a congressional committee on Chinese espionage recommends that foreign students and other foreign nationals be barred from using supercomputers at national laboratories unless they first obtain export licenses from the federal government. University officials dispute the data on which the report is based and find the…

  13. The Age of the Supercomputer Gives Way to the Age of the Super Infrastructure.

    ERIC Educational Resources Information Center

    Young, Jeffrey R.

    1997-01-01

    In October 1997, the National Science Foundation will discontinue financial support for two university-based supercomputer facilities to concentrate resources on partnerships led by facilities at the University of California, San Diego and the University of Illinois, Urbana-Champaign. The reconfigured program will develop more user-friendly and…

  14. The ChemViz Project: Using a Supercomputer To Illustrate Abstract Concepts in Chemistry.

    ERIC Educational Resources Information Center

    Beckwith, E. Kenneth; Nelson, Christopher

    1998-01-01

    Describes the Chemistry Visualization (ChemViz) Project, a Web venture maintained by the University of Illinois National Center for Supercomputing Applications (NCSA) that enables high school students to use computational chemistry as a technique for understanding abstract concepts. Discusses the evolution of computational chemistry and provides a…

  15. Intricacies of modern supercomputing illustrated with recent advances in simulations of strongly correlated electron systems

    NASA Astrophysics Data System (ADS)

    Schulthess, Thomas C.

    2013-03-01

    The continued thousand-fold improvement in sustained application performance per decade on modern supercomputers keeps opening new opportunities for scientific simulations. But supercomputers have become very complex machines, built with thousands or tens of thousands of complex nodes consisting of multiple CPU cores or, most recently, a combination of CPU and GPU processors. Efficient simulations on such high-end computing systems require tailored algorithms that optimally map numerical methods to particular architectures. These intricacies will be illustrated with simulations of strongly correlated electron systems, where the development of quantum cluster methods, Monte Carlo techniques, as well as their optimal implementation by means of algorithms with improved data locality and high arithmetic density have gone hand in hand with evolving computer architectures. The present work would not have been possible without continued access to computing resources at the National Center for Computational Science of Oak Ridge National Laboratory, which is funded by the Facilities Division of the Office of Advanced Scientific Computing Research, and the Swiss National Supercomputing Center (CSCS) that is funded by ETH Zurich.

  16. Extracting the Textual and Temporal Structure of Supercomputing Logs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jain, S; Singh, I; Chandra, A

    2009-05-26

    Supercomputers are prone to frequent faults that adversely affect their performance, reliability and functionality. System logs collected on these systems are a valuable resource of information about their operational status and health. However, their massive size, complexity, and lack of standard format make it difficult to automatically extract information that can be used to improve system management. In this work we propose a novel method to succinctly represent the contents of supercomputing logs, by using textual clustering to automatically find the syntactic structures of log messages. This information is used to automatically classify messages into semantic groups via an online clustering algorithm. Further, we describe a methodology for using the temporal proximity between groups of log messages to identify correlated events in the system. We apply our proposed methods to two large, publicly available supercomputing logs and show that our technique features nearly perfect accuracy for online log-classification and extracts meaningful structural and temporal message patterns that can be used to improve the accuracy of other log analysis techniques.
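
    The record describes clustering log messages by their syntactic structure. The sketch below is a much-simplified stand-in for that idea, masking variable tokens (numbers, hex addresses) so that lines sharing a skeleton fall into the same group; the token classes and the grouping rule are illustrative assumptions, not the authors' algorithm.

```python
import re
from collections import defaultdict

def template(message):
    # Reduce a log line to its syntactic skeleton by masking variable fields.
    msg = re.sub(r'0x[0-9a-fA-F]+', '<HEX>', message)   # addresses / hex codes
    msg = re.sub(r'\d+', '<NUM>', msg)                   # counters, node ids, rates
    return msg

def group_messages(log_lines):
    # Bucket lines whose skeletons match; each bucket approximates one message type.
    groups = defaultdict(list)
    for line in log_lines:
        groups[template(line)].append(line)
    return groups

logs = [
    "node 17 memory error at 0x3f2a",
    "node 204 memory error at 0x91bc",
    "fan speed 3200 rpm on cabinet 5",
]
for skeleton, members in group_messages(logs).items():
    print(f"{len(members):3d}  {skeleton}")
```

    A real log-analysis pipeline would add an online (streaming) clustering step and track the timestamps of each group to find temporally correlated events, as the record describes.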

  17. Supercomputations and big-data analysis in strong-field ultrafast optical physics: filamentation of high-peak-power ultrashort laser pulses

    NASA Astrophysics Data System (ADS)

    Voronin, A. A.; Panchenko, V. Ya; Zheltikov, A. M.

    2016-06-01

    High-intensity ultrashort laser pulses propagating in gas media or in condensed matter undergo complex nonlinear spatiotemporal evolution where temporal transformations of optical field waveforms are strongly coupled to an intricate beam dynamics and ultrafast field-induced ionization processes. At the level of laser peak powers orders of magnitude above the critical power of self-focusing, the beam exhibits modulation instabilities, producing random field hot spots and breaking up into multiple noise-seeded filaments. This problem is described by a (3+1)-dimensional nonlinear field evolution equation, which needs to be solved jointly with the equation for ultrafast ionization of a medium. Analysis of this problem, which is equivalent to solving a billion-dimensional evolution problem, is only possible by means of supercomputer simulations augmented with coordinated big-data processing of large volumes of information acquired through theory-guiding experiments and supercomputations. Here, we review the main challenges of supercomputations and big-data processing encountered in strong-field ultrafast optical physics and discuss strategies to confront these challenges.

  18. Toward a Proof of Concept Cloud Framework for Physics Applications on Blue Gene Supercomputers

    NASA Astrophysics Data System (ADS)

    Dreher, Patrick; Scullin, William; Vouk, Mladen

    2015-09-01

    Traditional high performance supercomputers are capable of delivering large sustained state-of-the-art computational resources to physics applications over extended periods of time using batch processing mode operating environments. However, today there is an increasing demand for more complex workflows that involve large fluctuations in the levels of HPC physics computational requirements during the simulations. Some of the workflow components may also require a richer set of operating system features and schedulers than normally found in a batch oriented HPC environment. This paper reports on progress toward a proof of concept design that implements a cloud framework onto BG/P and BG/Q platforms at the Argonne Leadership Computing Facility. The BG/P implementation utilizes the Kittyhawk utility and the BG/Q platform uses an experimental heterogeneous FusedOS operating system environment. Both platforms use the Virtual Computing Laboratory as the cloud computing system embedded within the supercomputer. This proof of concept design allows a cloud to be configured so that it can capitalize on the specialized infrastructure capabilities of a supercomputer and the flexible cloud configurations without resorting to virtualization. Initial testing of the proof of concept system is done using the lattice QCD MILC code. These types of user reconfigurable environments have the potential to deliver experimental schedulers and operating systems within a working HPC environment for physics computations that may be different from the native OS and schedulers on production HPC supercomputers.

  19. Teaching solar astronomy on March 21st in a multicultural village during IYA 2009, Mexico

    NASA Astrophysics Data System (ADS)

    Zueck, S.; Lara, A.

    2009-12-01

    We describe the activities and resources at a science popularization event organized in a multicultural, mystical small village, and the response of the audience that attended it. On March 21st, 2009 (the spring equinox) we conducted a social experiment in science outreach. Scientists, educators and graduate students interacted with the general public at Tepoztlan, State of Morelos, Mexico, a former farming town in the process of urbanization that depends to a large degree on the thousands of tourists who frequent the place, above all on the day of the equinox. A team of scientists and graduate students from the solar physics program of the Instituto de Geofísica (UNAM) organized a solar observation, setting up in the garden of an old Hispanic Dominican convent (XVI century) ten telescopes with solar filters to show our principal star, the Sun, to the general audience in real time. We also prepared a free resource guide to help answer questions about basic properties of our star, such as its structure, sunspots, age, diameter and evolution, and two researchers offered talks to the local elementary school children. The main audience came from the local people, such as bakers, open-market workers and homemakers, who after finishing their working day went to the museum to observe the Sun through the telescopes or to attend the talks with their children. They had several questions about scientific and pseudo-scientific topics related not just to the equinox, but also to the Earth's magnetic field, the planets, and so on. We also discuss our experiences communicating science face to face to an audience that came to a town famous for its mystical legends about solar energy or vibrations that humans can supposedly harness for luck or health, especially on this date.

  20. Finite Element Flow Code Optimization on the Cray T3D,

    DTIC Science & Technology

    1997-04-01

    At the present time, the system is configured with 512 processing elements and 32.8 Gigabytes of memory. Through a gift of time from MSCI and other arrangements, the AHPCRC has limited access to this system.

  1. Implementation of an ADI method on parallel computers

    NASA Technical Reports Server (NTRS)

    Fatoohi, Raad A.; Grosch, Chester E.

    1987-01-01

    The implementation of an ADI method for solving the diffusion equation on three parallel/vector computers is discussed. The computers were chosen so as to encompass a variety of architectures. They are: the MPP, an SIMD machine with 16K bit serial processors; FLEX/32, an MIMD machine with 20 processors; and CRAY/2, an MIMD machine with four vector processors. The Gaussian elimination algorithm is used to solve a set of tridiagonal systems on the FLEX/32 and CRAY/2 while the cyclic elimination algorithm is used to solve these systems on the MPP. The implementation of the method is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally, conclusions are presented.
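
    For reference, the tridiagonal Gaussian elimination mentioned in the record is commonly written as the Thomas algorithm; a minimal Python sketch follows. The forward and backward recurrences are inherently serial along each line, which is why the cyclic elimination algorithm is used instead on SIMD machines such as the MPP.

```python
def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system by Gaussian elimination (Thomas algorithm).

    a: sub-diagonal (a[0] unused), b: main diagonal,
    c: super-diagonal (c[-1] unused), d: right-hand side.
    """
    n = len(b)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):                       # forward elimination
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):              # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# Example: solves [[2,1,0],[1,2,1],[0,1,2]] x = [4,8,8], giving x = [1, 2, 3].
print(thomas_solve([0, 1, 1], [2, 2, 2], [1, 1, 0], [4, 8, 8]))
```

    In an ADI sweep one such system arises along every grid line of each coordinate direction, so the independent lines themselves can be processed in parallel even when each solve is serial.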

  2. Implementation of an ADI method on parallel computers

    NASA Technical Reports Server (NTRS)

    Fatoohi, Raad A.; Grosch, Chester E.

    1987-01-01

    In this paper the implementation of an ADI method for solving the diffusion equation on three parallel/vector computers is discussed. The computers were chosen so as to encompass a variety of architectures. They are the MPP, an SIMD machine with 16K bit-serial processors; Flex/32, an MIMD machine with 20 processors; and Cray/2, an MIMD machine with four vector processors. The Gaussian elimination algorithm is used to solve a set of tridiagonal systems on the Flex/32 and Cray/2 while the cyclic elimination algorithm is used to solve these systems on the MPP. The implementation of the method is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally conclusions are presented.

  3. Discrete sensitivity derivatives of the Navier-Stokes equations with a parallel Krylov solver

    NASA Technical Reports Server (NTRS)

    Ajmani, Kumud; Taylor, Arthur C., III

    1994-01-01

    This paper solves an 'incremental' form of the sensitivity equations derived by differentiating the discretized thin-layer Navier Stokes equations with respect to certain design variables of interest. The equations are solved with a parallel, preconditioned Generalized Minimal RESidual (GMRES) solver on a distributed-memory architecture. The 'serial' sensitivity analysis code is parallelized by using the Single Program Multiple Data (SPMD) programming model, domain decomposition techniques, and message-passing tools. Sensitivity derivatives are computed for low and high Reynolds number flows over a NACA 1406 airfoil on a 32-processor Intel Hypercube, and found to be identical to those computed on a single-processor Cray Y-MP. It is estimated that the parallel sensitivity analysis code has to be run on 40-50 processors of the Intel Hypercube in order to match the single-processor processing time of a Cray Y-MP.
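
    The record describes solving the linearized sensitivity equations with a preconditioned GMRES solver. As a rough illustration of that linear-algebra step only (not the paper's parallel SPMD implementation), the sketch below solves a generic system with SciPy's GMRES; the matrix A stands in for the flow Jacobian dR/dQ and the right-hand side for the negative explicit derivative with respect to a design variable, and all sizes and values are invented for the example.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

# Hypothetical stand-ins: A plays the role of the flow Jacobian dR/dQ and
# rhs the role of -(dR/d beta) for one design variable; values are synthetic.
n = 500
rng = np.random.default_rng(0)
A = 4.0 * np.eye(n) + rng.normal(scale=0.1, size=(n, n))   # diagonally dominant, so GMRES converges
rhs = rng.normal(size=n)

# Matrix-free operator: only the action v -> A @ v is required, which is how
# Jacobian-vector products are typically supplied in CFD sensitivity codes.
op = LinearOperator((n, n), matvec=lambda v: A @ v)
sensitivity, info = gmres(op, rhs, restart=30)
print("converged" if info == 0 else f"gmres returned info={info}")
```

    The matrix-free form also hints at why the method parallelizes well: each processor only needs to apply its share of the Jacobian-vector product on its subdomain and exchange halo data.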

  4. Parallelization of ARC3D with Computer-Aided Tools

    NASA Technical Reports Server (NTRS)

    Jin, Haoqiang; Hribar, Michelle; Yan, Jerry; Saini, Subhash (Technical Monitor)

    1998-01-01

    A series of efforts has been devoted to investigating methods of porting and parallelizing applications quickly and efficiently for new architectures, such as the SGI Origin 2000 and Cray T3E. This report presents the parallelization of a CFD application, ARC3D, using the computer-aided parallelization tools CAPTools. Steps in parallelizing this code and the requirements for achieving better performance are discussed. The generated parallel version has achieved reasonably good performance, for example a speedup of 30 on 36 Cray T3E processors. However, this performance could not be obtained without modification of the original serial code. It is suggested that in many cases improving the serial code and performing necessary code transformations are important parts of the automated parallelization process, although user intervention in many of these parts is still necessary. Nevertheless, the development and improvement of useful software tools, such as CAPTools, can help trim down many tedious parallelization details and improve the processing efficiency.

  5. Three-Dimensional High-Lift Analysis Using a Parallel Unstructured Multigrid Solver

    NASA Technical Reports Server (NTRS)

    Mavriplis, Dimitri J.

    1998-01-01

    A directional implicit unstructured agglomeration multigrid solver is ported to shared and distributed memory massively parallel machines using the explicit domain-decomposition and message-passing approach. Because the algorithm operates on local implicit lines in the unstructured mesh, special care is required in partitioning the problem for parallel computing. A weighted partitioning strategy is described which avoids breaking the implicit lines across processor boundaries, while incurring minimal additional communication overhead. Good scalability is demonstrated on a 128 processor SGI Origin 2000 machine and on a 512 processor CRAY T3E machine for reasonably fine grids. The feasibility of performing large-scale unstructured grid calculations with the parallel multigrid algorithm is demonstrated by computing the flow over a partial-span flap wing high-lift geometry on a highly resolved grid of 13.5 million points in approximately 4 hours of wall clock time on the CRAY T3E.

  6. An Evaluation of Architectural Platforms for Parallel Navier-Stokes Computations

    NASA Technical Reports Server (NTRS)

    Jayasimha, D. N.; Hayder, M. E.; Pillay, S. K.

    1996-01-01

    We study the computational, communication, and scalability characteristics of a computational fluid dynamics application, which solves the time accurate flow field of a jet using the compressible Navier-Stokes equations, on a variety of parallel architecture platforms. The platforms chosen for this study are a cluster of workstations (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and distributed memory multiprocessors with different topologies - the IBM SP and the Cray T3D. We investigate the impact of various networks connecting the cluster of workstations on the performance of the application and the overheads induced by popular message passing libraries used for parallelization. The work also highlights the importance of matching the memory bandwidth to the processor speed for good single processor performance. By studying the performance of an application on a variety of architectures, we are able to point out the strengths and weaknesses of each of the example computing platforms.

  7. Parallelizing Navier-Stokes Computations on a Variety of Architectural Platforms

    NASA Technical Reports Server (NTRS)

    Jayasimha, D. N.; Hayder, M. E.; Pillay, S. K.

    1997-01-01

    We study the computational, communication, and scalability characteristics of a Computational Fluid Dynamics application, which solves the time accurate flow field of a jet using the compressible Navier-Stokes equations, on a variety of parallel architectural platforms. The platforms chosen for this study are a cluster of workstations (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), distributed memory multiprocessors with different topologies-the IBM SP and the Cray T3D. We investigate the impact of various networks, connecting the cluster of workstations, on the performance of the application and the overheads induced by popular message passing libraries used for parallelization. The work also highlights the importance of matching the memory bandwidth to the processor speed for good single processor performance. By studying the performance of an application on a variety of architectures, we are able to point out the strengths and weaknesses of each of the example computing platforms.

  8. Observations of Interplanetary Scintillation (IPS) Using the Mexican Array Radio Telescope (MEXART)

    NASA Astrophysics Data System (ADS)

    Mejia-Ambriz, J. C.; Villanueva-Hernandez, P.; Gonzalez-Esparza, J. A.; Aguilar-Rodriguez, E.; Jeyakumar, S.

    2010-08-01

    The Mexican Array Radio Telescope (MEXART) consists of a 64×64 (4096) full-wavelength dipole antenna array, operating at 140 MHz with a bandwidth of 2 MHz and occupying about 9660 square meters (69 m × 140 m) ( http://www.mexart.unam.mx ). This is a dedicated radio array for Interplanetary Scintillation (IPS) observations located at latitude 19°48'N, longitude 101°41'W. We characterize the performance of the system. We report the first IPS observations with the instrument, employing a Butler Matrix (BM) of 16×16 ports fed by 16 east-west lines of 64 dipoles (1/4 of the total array). The BM displays a radiation pattern of 16 beams at different declinations (from -48 to +88 degrees). We present a list of 19 strong IPS radio sources (having at least 3 σ in power gain) detected by the instrument. We report the power spectral analysis procedure for the intensity fluctuations. The operation of MEXART will allow us better coverage of solar wind disturbances, complementing the data provided by other, previously built instruments.
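
    The record mentions a power spectral analysis of the IPS intensity fluctuations but does not spell out the processing chain. Purely as an illustrative sketch (the detrending and windowing choices are assumptions, not MEXART's actual pipeline), a one-sided power spectrum of an intensity time series can be estimated as follows.

```python
import numpy as np

def ips_power_spectrum(intensity, sample_rate_hz):
    """Estimate the one-sided power spectrum of an IPS intensity time series."""
    x = np.asarray(intensity, dtype=float)
    x = x - x.mean()                            # remove the mean level (detrend)
    x = x * np.hanning(len(x))                  # taper to reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(x)) ** 2      # power at each positive frequency
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate_hz)
    return freqs, spectrum

# Example with a synthetic fluctuation record sampled at 50 Hz.
t = np.arange(0, 60, 1 / 50.0)
signal = 1.0 + 0.05 * np.random.default_rng(1).normal(size=t.size)
freqs, power = ips_power_spectrum(signal, sample_rate_hz=50.0)
```

    In IPS work the shape and cut-off of such spectra are what carry the solar-wind velocity and turbulence information.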

  9. Mexican Space Weather Service (SCiESMEX)

    NASA Astrophysics Data System (ADS)

    Gonzalez-Esparza, J. A.; De la Luz, V.; Corona-Romero, P.; Mejia-Ambriz, J. C.; Gonzalez, L. X.; Sergeeva, M. A.; Romero-Hernandez, E.; Aguilar-Rodriguez, E.

    2017-01-01

    Legislative modifications of the General Civil Protection Law in Mexico in 2014 included specific references to space hazards and space weather phenomena. The legislation is consistent with United Nations promotion of international engagement and cooperation on space weather awareness, studies, and monitoring. These internal and external conditions motivated the creation of a space weather service in Mexico. The Mexican Space Weather Service (SCiESMEX in Spanish) (www.sciesmex.unam.mx) was initiated in October 2014 and is operated by the Institute of Geophysics at the Universidad Nacional Autonoma de Mexico (UNAM). SCiESMEX became a Regional Warning Center of the International Space Environment Services (ISES) in June 2015. We present the characteristics of the service, some products, and the initial actions for developing a space weather strategy in Mexico. The service operates a computing infrastructure including a web application, data repository, and a high-performance computing server to run numerical models. SCiESMEX uses data of the ground-based instrumental network of the National Space Weather Laboratory (LANCE), covering solar radio burst emissions, solar wind and interplanetary disturbances (by interplanetary scintillation observations), geomagnetic measurements, and analysis of the total electron content (TEC) of the ionosphere (by employing data from local networks of GPS receiver stations).

  10. A stable 1D multigroup high-order low-order method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yee, Ben Chung; Wollaber, Allan Benton; Haut, Terry Scot

    The high-order low-order (HOLO) method is a recently developed moment-based acceleration scheme for solving time-dependent thermal radiative transfer problems, and has been shown to exhibit orders-of-magnitude speedups over traditional time-stepping schemes. However, a linear stability analysis by Haut et al. (2015; Haut, T. S., Lowrie, R. B., Park, H., Rauenzahn, R. M., Wollaber, A. B. A linear stability analysis of the multigroup High-Order Low-Order (HOLO) method. In Proceedings of the Joint International Conference on Mathematics and Computation (M&C), Supercomputing in Nuclear Applications (SNA) and the Monte Carlo (MC) Method; Nashville, TN, April 19–23, 2015. American Nuclear Society.) revealed that the current formulation of the multigroup HOLO method was unstable in certain parameter regions. Since then, we have replaced the intensity-weighted opacity in the first angular moment equation of the low-order (LO) system with the Rosseland opacity. This results in a modified HOLO method (HOLO-R) that is significantly more stable.

  11. A stable 1D multigroup high-order low-order method

    DOE PAGES

    Yee, Ben Chung; Wollaber, Allan Benton; Haut, Terry Scot; ...

    2016-07-13

    The high-order low-order (HOLO) method is a recently developed moment-based acceleration scheme for solving time-dependent thermal radiative transfer problems, and has been shown to exhibit orders-of-magnitude speedups over traditional time-stepping schemes. However, a linear stability analysis by Haut et al. (2015; Haut, T. S., Lowrie, R. B., Park, H., Rauenzahn, R. M., Wollaber, A. B. A linear stability analysis of the multigroup High-Order Low-Order (HOLO) method. In Proceedings of the Joint International Conference on Mathematics and Computation (M&C), Supercomputing in Nuclear Applications (SNA) and the Monte Carlo (MC) Method; Nashville, TN, April 19–23, 2015. American Nuclear Society.) revealed that the current formulation of the multigroup HOLO method was unstable in certain parameter regions. Since then, we have replaced the intensity-weighted opacity in the first angular moment equation of the low-order (LO) system with the Rosseland opacity. This results in a modified HOLO method (HOLO-R) that is significantly more stable.

  12. 'Towers in the Tempest' Computer Animation Submission

    NASA Technical Reports Server (NTRS)

    Shirah, Greg

    2008-01-01

    The following describes a computer animation that has been submitted to the ACM/SIGGRAPH 2008 computer graphics conference: 'Towers in the Tempest' clearly communicates recent scientific research into how hurricanes intensify. This intensification can be caused by a phenomenon called a 'hot tower.' For the first time, research meteorologists have run complex atmospheric simulations at a very fine temporal resolution of 3 minutes. Combining this simulation data with satellite observations enables detailed study of 'hot towers.' The science of 'hot towers' is described using satellite observation data, conceptual illustrations, and volumetric atmospheric simulation data. The movie starts by showing a 'hot tower' in Hurricane Bonnie, as observed in three-dimensional precipitation radar data from NASA's Tropical Rainfall Measuring Mission (TRMM) spacecraft. Next, the dynamics of a hurricane and the formation of 'hot towers' are briefly explained using conceptual illustrations. Finally, volumetric cloud, wind, and vorticity data from a supercomputer simulation of Hurricane Bonnie are shown using volume-rendering techniques such as ray marching.

  13. The impact of the U.S. supercomputing initiative will be global

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Crawford, Dona

    2016-01-15

    Last July, President Obama issued an executive order that created a coordinated federal strategy for HPC research, development, and deployment called the U.S. National Strategic Computing Initiative (NSCI). This bold, necessary step toward building the next generation of supercomputers has inaugurated a new era for U.S. high-performance computing (HPC).

  14. Parallel-vector solution of large-scale structural analysis problems on supercomputers

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.

    1989-01-01

    A direct linear equation solution method based on the Choleski factorization procedure is presented which exploits both parallel and vector features of supercomputers. The new equation solver is described, and its performance is evaluated by solving structural analysis problems on three high-performance computers. The method has been implemented using Force, a generic parallel FORTRAN language.
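
    As context for the record above, the following is a plain, serial NumPy sketch of a Choleski (L L^T) factorization followed by forward and backward substitution; the paper's contribution, distributing and vectorizing these loops with Force on Cray-class machines, is not reproduced here.

```python
import numpy as np

def cholesky_solve(K, f):
    """Solve K u = f for a symmetric positive-definite stiffness matrix K via K = L L^T."""
    n = K.shape[0]
    L = np.zeros_like(K, dtype=float)
    for j in range(n):                                   # column-by-column factorization
        L[j, j] = np.sqrt(K[j, j] - np.dot(L[j, :j], L[j, :j]))
        L[j + 1:, j] = (K[j + 1:, j] - L[j + 1:, :j] @ L[j, :j]) / L[j, j]
    y = np.zeros(n)
    for i in range(n):                                   # forward substitution: L y = f
        y[i] = (f[i] - np.dot(L[i, :i], y[:i])) / L[i, i]
    u = np.zeros(n)
    for i in range(n - 1, -1, -1):                       # backward substitution: L^T u = y
        u[i] = (y[i] - np.dot(L[i + 1:, i], u[i + 1:])) / L[i, i]
    return u

K = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
f = np.array([1.0, 2.0, 3.0])
print(np.allclose(K @ cholesky_solve(K, f), f))   # True
```

    The column-oriented form above is the shape that lends itself to the parallel-vector treatment described in the record: each column update is a long vector operation, and independent columns can be assigned to different processors.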

  15. Predicting Hurricanes with Supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    2010-01-01

    Hurricane Emily, formed in the Atlantic Ocean on July 10, 2005, was the strongest hurricane ever to form before August. By checking computer models against the actual path of the storm, researchers can improve hurricane prediction. In 2010, NOAA researchers were awarded 25 million processor-hours on Argonne's BlueGene/P supercomputer for the project. Read more at http://go.usa.gov/OLh

  16. Supercomputers Of The Future

    NASA Technical Reports Server (NTRS)

    Peterson, Victor L.; Kim, John; Holst, Terry L.; Deiwert, George S.; Cooper, David M.; Watson, Andrew B.; Bailey, F. Ron

    1992-01-01

    Report evaluates supercomputer needs of five key disciplines: turbulence physics, aerodynamics, aerothermodynamics, chemistry, and mathematical modeling of human vision. Predicts these fields will require computer speeds greater than 10^18 floating-point operations per second (FLOPS) and memory capacities greater than 10^15 words. Also, new parallel computer architectures and new structured numerical methods will make the necessary speed and capacity available.

  17. Advances in petascale kinetic plasma simulation with VPIC and Roadrunner

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bowers, Kevin J; Albright, Brian J; Yin, Lin

    2009-01-01

    VPIC, a first-principles 3D electromagnetic charge-conserving relativistic kinetic particle-in-cell (PIC) code, was recently adapted to run on Los Alamos's Roadrunner, the first supercomputer to break a petaflop (10^15 floating point operations per second) in the TOP500 supercomputer performance rankings. They give a brief overview of the modeling capabilities and optimization techniques used in VPIC and the computational characteristics of petascale supercomputers like Roadrunner. They then discuss three applications enabled by VPIC's unprecedented performance on Roadrunner: modeling laser plasma interaction in upcoming inertial confinement fusion experiments at the National Ignition Facility (NIF), modeling short pulse laser GeV ion acceleration, and modeling reconnection in magnetic confinement fusion experiments.
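
    For readers unfamiliar with particle-in-cell codes, the toy sketch below shows the leapfrog particle push that sits at the heart of any PIC method; it is a deliberately minimal 1D electrostatic illustration and reproduces none of VPIC's 3D electromagnetic, relativistic, charge-conserving machinery. The field function and parameters are invented for the example.

```python
import numpy as np

def push_particles(x, v, E_at, qm, dt, steps):
    """Advance positions x and velocities v with a leapfrog scheme.

    E_at(x) returns the electric field sampled at the particle positions;
    qm is the charge-to-mass ratio.
    """
    v = v + 0.5 * dt * qm * E_at(x)            # half kick to stagger v and x in time
    for _ in range(steps):
        x = x + dt * v                         # drift
        v = v + dt * qm * E_at(x)              # full kick
    return x, v - 0.5 * dt * qm * E_at(x)      # un-stagger the velocity for output

# Eight particles in a made-up sinusoidal field.
x0 = np.linspace(0.0, 1.0, 8)
v0 = np.zeros_like(x0)
xf, vf = push_particles(x0, v0, lambda x: -np.sin(2 * np.pi * x), qm=1.0, dt=1e-3, steps=100)
```

    In a production code such as VPIC, the same per-particle update is applied to billions of particles per step, which is why the particle push dominates the flop count and is the focus of the architecture-specific optimizations the record alludes to.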

  18. Supercomputing Sheds Light on the Dark Universe

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Habib, Salman; Heitmann, Katrin

    2012-11-15

    At Argonne National Laboratory, scientists are using supercomputers to shed light on one of the great mysteries in science today, the Dark Universe. With Mira, a petascale supercomputer at the Argonne Leadership Computing Facility, a team led by physicists Salman Habib and Katrin Heitmann will run the largest, most complex simulation of the universe ever attempted. By contrasting the results from Mira with state-of-the-art telescope surveys, the scientists hope to gain new insights into the distribution of matter in the universe, advancing future investigations of dark energy and dark matter into a new realm. The team's research was named a finalist for the 2012 Gordon Bell Prize, an award recognizing outstanding achievement in high-performance computing.

  19. Surprise

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Curran, L.

    1988-03-03

    Interest has been building in recent months over the imminent arrival of a new class of supercomputer, called the 'supercomputer on a desk' or the single-user model. Most observers expected the first such product to come from either of two startups, Ardent Computer Corp. or Stellar Computer Inc. But a surprise entry has shown up. Apollo Computer Inc. is launching a new workstation this week that racks up an impressive list of industry firsts as it puts supercomputer power at the disposal of a single user. The new Series 10000 from the Chelmsford, Mass., company is built around a reduced-instruction-set architecture that the company calls Prism, for parallel reduced-instruction-set multiprocessor. This article describes the 10000 and Prism.

  20. Progress and supercomputing in computational fluid dynamics; Proceedings of U.S.-Israel Workshop, Jerusalem, Israel, December 1984

    NASA Technical Reports Server (NTRS)

    Murman, E. M. (Editor); Abarbanel, S. S. (Editor)

    1985-01-01

    Current developments and future trends in the application of supercomputers to computational fluid dynamics are discussed in reviews and reports. Topics examined include algorithm development for personal-size supercomputers, a multiblock three-dimensional Euler code for out-of-core and multiprocessor calculations, simulation of compressible inviscid and viscous flow, high-resolution solutions of the Euler equations for vortex flows, algorithms for the Navier-Stokes equations, and viscous-flow simulation by FEM and related techniques. Consideration is given to marching iterative methods for the parabolized and thin-layer Navier-Stokes equations, multigrid solutions to quasi-elliptic schemes, secondary instability of free shear flows, simulation of turbulent flow, and problems connected with weather prediction.
