supercomputing advanced methods: Topics by Science.gov

Sample records for supercomputing advanced methods

NASA Advanced Supercomputing Facility Expansion

NASA Technical Reports Server (NTRS)

Thigpen, William W.

2017-01-01

The NASA Advanced Supercomputing (NAS) Division enables advances in high-end computing technologies and in modeling and simulation methods to tackle some of the toughest science and engineering challenges facing NASA today. The name "NAS" has long been associated with leadership and innovation throughout the high-end computing (HEC) community. We play a significant role in shaping HEC standards and paradigms, and provide leadership in the areas of large-scale InfiniBand fabrics, Lustre open-source filesystems, and hyperwall technologies. We provide an integrated high-end computing environment to accelerate NASA missions and make revolutionary advances in science. Pleiades, a petaflop-scale supercomputer, is used by scientists throughout the U.S. to support NASA missions, and is ranked among the most powerful systems in the world. One of our key focus areas is in modeling and simulation to support NASA's real-world engineering applications and make fundamental advances in modeling and simulation methods.
Supercomputer optimizations for stochastic optimal control applications

NASA Technical Reports Server (NTRS)

Chung, Siu-Leung; Hanson, Floyd B.; Xu, Huihuang

1991-01-01

Supercomputer optimizations for a computational method of solving stochastic, multibody, dynamic programming problems are presented. The computational method is valid for a general class of optimal control problems that are nonlinear, multibody dynamical systems, perturbed by general Markov noise in continuous time, i.e., nonsmooth Gaussian as well as jump Poisson random white noise. Optimization techniques for vector multiprocessors or vectorizing supercomputers include advanced data structures, loop restructuring, loop collapsing, blocking, and compiler directives. These advanced computing techniques and superconducting hardware help alleviate Bellman's curse of dimensionality in dynamic programming computations, by permitting the solution of large multibody problems. Possible applications include lumped flight dynamics models for uncertain environments, such as large scale and background random aerospace fluctuations.
Intricacies of modern supercomputing illustrated with recent advances in simulations of strongly correlated electron systems

NASA Astrophysics Data System (ADS)

Schulthess, Thomas C.

2013-03-01

The continued thousand-fold improvement in sustained application performance per decade on modern supercomputers keeps opening new opportunities for scientific simulations. But supercomputers have become very complex machines, built with thousands or tens of thousands of complex nodes consisting of multiple CPU cores or, most recently, a combination of CPU and GPU processors. Efficient simulations on such high-end computing systems require tailored algorithms that optimally map numerical methods to particular architectures. These intricacies will be illustrated with simulations of strongly correlated electron systems, where the development of quantum cluster methods, Monte Carlo techniques, as well as their optimal implementation by means of algorithms with improved data locality and high arithmetic density have gone hand in hand with evolving computer architectures. The present work would not have been possible without continued access to computing resources at the National Center for Computational Science of Oak Ridge National Laboratory, which is funded by the Facilities Division of the Office of Advanced Scientific Computing Research, and the Swiss National Supercomputing Center (CSCS) that is funded by ETH Zurich.
Demonstration of NICT Space Weather Cloud --Integration of Supercomputer into Analysis and Visualization Environment--

NASA Astrophysics Data System (ADS)

Watari, S.; Morikawa, Y.; Yamamoto, K.; Inoue, S.; Tsubouchi, K.; Fukazawa, K.; Kimura, E.; Tatebe, O.; Kato, H.; Shimojo, S.; Murata, K. T.

2010-12-01

In the Solar-Terrestrial Physics (STP) field, spatio-temporal resolution of computer simulations is getting higher and higher because of tremendous advancement of supercomputers. A more advanced technology is Grid Computing that integrates distributed computational resources to provide scalable computing resources. In the simulation research, it is effective that a researcher oneself designs his physical model, performs calculations with a supercomputer, and analyzes and visualizes for consideration by a familiar method. A supercomputer is far from an analysis and visualization environment. In general, a researcher analyzes and visualizes in the workstation (WS) managed at hand because the installation and the operation of software in the WS are easy. Therefore, it is necessary to copy the data from the supercomputer to WS manually. Time necessary for the data transfer through long delay network disturbs high-accuracy simulations actually. In terms of usefulness, integrating a supercomputer and an analysis and visualization environment seamlessly with a researcher's familiar method is important. NICT has been developing a cloud computing environment (NICT Space Weather Cloud). In the NICT Space Weather Cloud, disk servers are located near its supercomputer and WSs for data analysis and visualization. They are connected to JGN2plus that is high-speed network for research and development. Distributed virtual high-capacity storage is also constructed by Grid Datafarm (Gfarm v2). Huge-size data output from the supercomputer is transferred to the virtual storage through JGN2plus. A researcher can concentrate on the research by a familiar method without regard to distance between a supercomputer and an analysis and visualization environment. Now, total 16 disk servers are setup in NICT headquarters (at Koganei, Tokyo), JGN2plus NOC (at Otemachi, Tokyo), Okinawa Subtropical Environment Remote-Sensing Center, and Cybermedia Center, Osaka University. They are connected on JGN2plus, and they constitute 1PB (physical size) virtual storage by Gfarm v2. These disk servers are connected with supercomputers of NICT and Osaka University. A system that data output from the supercomputers are automatically transferred to the virtual storage had been built up. Transfer rate is about 50 GB/hrs by actual measurement. It is estimated that the performance is reasonable for a certain simulation and analysis for reconstruction of coronal magnetic field. This research is assumed an experiment of the system, and the verification of practicality is advanced at the same time. Herein we introduce an overview of the space weather cloud system so far we have developed. We also demonstrate several scientific results using the space weather cloud system. We also introduce several web applications of the cloud as a service of the space weather cloud, which is named as "e-SpaceWeather" (e-SW). The e-SW provides with a variety of space weather online services from many aspects.
NASA's supercomputing experience

NASA Technical Reports Server (NTRS)

Bailey, F. Ron

1990-01-01

A brief overview of NASA's recent experience in supercomputing is presented from two perspectives: early systems development and advanced supercomputing applications. NASA's role in supercomputing systems development is illustrated by discussion of activities carried out by the Numerical Aerodynamical Simulation Program. Current capabilities in advanced technology applications are illustrated with examples in turbulence physics, aerodynamics, aerothermodynamics, chemistry, and structural mechanics. Capabilities in science applications are illustrated by examples in astrophysics and atmospheric modeling. Future directions and NASA's new High Performance Computing Program are briefly discussed.
NASA Advanced Supercomputing (NAS) User Services Group

NASA Technical Reports Server (NTRS)

Pandori, John; Hamilton, Chris; Niggley, C. E.; Parks, John W. (Technical Monitor)

2002-01-01

This viewgraph presentation provides an overview of NAS (NASA Advanced Supercomputing), its goals, and its mainframe computer assets. Also covered are its functions, including systems monitoring and technical support.
Edison - A New Cray Supercomputer Advances Discovery at NERSC

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dosanjh, Sudip; Parkinson, Dula; Yelick, Kathy

2014-02-06

When a supercomputing center installs a new system, users are invited to make heavy use of the computer as part of the rigorous testing. In this video, find out what top scientists have discovered using Edison, a Cray XC30 supercomputer, and how NERSC's newest supercomputer will accelerate their future research.
Edison - A New Cray Supercomputer Advances Discovery at NERSC

ScienceCinema

Dosanjh, Sudip; Parkinson, Dula; Yelick, Kathy; Trebotich, David; Broughton, Jeff; Antypas, Katie; Lukic, Zarija, Borrill, Julian; Draney, Brent; Chen, Jackie

2018-01-16

When a supercomputing center installs a new system, users are invited to make heavy use of the computer as part of the rigorous testing. In this video, find out what top scientists have discovered using Edison, a Cray XC30 supercomputer, and how NERSC's newest supercomputer will accelerate their future research.
Scaling of data communications for an advanced supercomputer network

NASA Technical Reports Server (NTRS)

Levin, E.; Eaton, C. K.; Young, Bruce

1986-01-01

The goal of NASA's Numerical Aerodynamic Simulation (NAS) Program is to provide a powerful computational environment for advanced research and development in aeronautics and related disciplines. The present NAS system consists of a Cray 2 supercomputer connected by a data network to a large mass storage system, to sophisticated local graphics workstations and by remote communication to researchers throughout the United States. The program plan is to continue acquiring the most powerful supercomputers as they become available. The implications of a projected 20-fold increase in processing power on the data communications requirements are described.
Advanced Computing for Manufacturing.

ERIC Educational Resources Information Center

Erisman, Albert M.; Neves, Kenneth W.

1987-01-01

Discusses ways that supercomputers are being used in the manufacturing industry, including the design and production of airplanes and automobiles. Describes problems that need to be solved in the next few years for supercomputers to assume a major role in industry. (TW)
Implementation, capabilities, and benchmarking of Shift, a massively parallel Monte Carlo radiation transport code

DOE PAGES

Pandya, Tara M.; Johnson, Seth R.; Evans, Thomas M.; ...

2015-12-21

This paper discusses the implementation, capabilities, and validation of Shift, a massively parallel Monte Carlo radiation transport package developed and maintained at Oak Ridge National Laboratory. It has been developed to scale well from laptop to small computing clusters to advanced supercomputers. Special features of Shift include hybrid capabilities for variance reduction such as CADIS and FW-CADIS, and advanced parallel decomposition and tally methods optimized for scalability on supercomputing architectures. Shift has been validated and verified against various reactor physics benchmarks and compares well to other state-of-the-art Monte Carlo radiation transport codes such as MCNP5, CE KENO-VI, and OpenMC. Somemore » specific benchmarks used for verification and validation include the CASL VERA criticality test suite and several Westinghouse AP1000 ® problems. These benchmark and scaling studies show promising results.« less
Automated Help System For A Supercomputer

NASA Technical Reports Server (NTRS)

Callas, George P.; Schulbach, Catherine H.; Younkin, Michael

1994-01-01

Expert-system software developed to provide automated system of user-helping displays in supercomputer system at Ames Research Center Advanced Computer Facility. Users located at remote computer terminals connected to supercomputer and each other via gateway computers, local-area networks, telephone lines, and satellite links. Automated help system answers routine user inquiries about how to use services of computer system. Available 24 hours per day and reduces burden on human experts, freeing them to concentrate on helping users with complicated problems.
Will Moores law be sufficient?

DOE Office of Scientific and Technical Information (OSTI.GOV)

DeBenedictis, Erik P.

2004-07-01

It seems well understood that supercomputer simulation is an enabler for scientific discoveries, weapons, and other activities of value to society. It also seems widely believed that Moore's Law will make progressively more powerful supercomputers over time and thus enable more of these contributions. This paper seeks to add detail to these arguments, revealing them to be generally correct but not a smooth and effortless progression. This paper will review some key problems that can be solved with supercomputer simulation, showing that more powerful supercomputers will be useful up to a very high yet finite limit of around 1021 FLOPSmore » (1 Zettaflops) . The review will also show the basic nature of these extreme problems. This paper will review work by others showing that the theoretical maximum supercomputer power is very high indeed, but will explain how a straightforward extrapolation of Moore's Law will lead to technological maturity in a few decades. The power of a supercomputer at the maturity of Moore's Law will be very high by today's standards at 1016-1019 FLOPS (100 Petaflops to 10 Exaflops), depending on architecture, but distinctly below the level required for the most ambitious applications. Having established that Moore's Law will not be that last word in supercomputing, this paper will explore the nearer term issue of what a supercomputer will look like at maturity of Moore's Law. Our approach will quantify the maximum performance as permitted by the laws of physics for extension of current technology and then find a design that approaches this limit closely. We study a 'multi-architecture' for supercomputers that combines a microprocessor with other 'advanced' concepts and find it can reach the limits as well. This approach should be quite viable in the future because the microprocessor would provide compatibility with existing codes and programming styles while the 'advanced' features would provide a boost to the limits of performance.« less
Technology advances and market forces: Their impact on high performance architectures

NASA Technical Reports Server (NTRS)

Best, D. R.

1978-01-01

Reasonable projections into future supercomputer architectures and technology require an analysis of the computer industry market environment, the current capabilities and trends within the component industry, and the research activities on computer architecture in the industrial and academic communities. Management, programmer, architect, and user must cooperate to increase the efficiency of supercomputer development efforts. Care must be taken to match the funding, compiler, architecture and application with greater attention to testability, maintainability, reliability, and usability than supercomputer development programs of the past.
Role of HPC in Advancing Computational Aeroelasticity

NASA Technical Reports Server (NTRS)

Guruswamy, Guru P.

2004-01-01

On behalf of the High Performance Computing and Modernization Program (HPCMP) and NASA Advanced Supercomputing Division (NAS) a study is conducted to assess the role of supercomputers on computational aeroelasticity of aerospace vehicles. The study is mostly based on the responses to a web based questionnaire that was designed to capture the nuances of high performance computational aeroelasticity, particularly on parallel computers. A procedure is presented to assign a fidelity-complexity index to each application. Case studies based on major applications using HPCMP resources are presented.
National Test Facility civilian agency use of supercomputers not feasible

DOE Office of Scientific and Technical Information (OSTI.GOV)

NONE

1994-12-01

Based on interviews with civilian agencies cited in the House report (DOE, DoEd, HHS, FEMA, NOAA), none would be able to make effective use of NTF`s excess supercomputing capabilities. These agencies stated they could not use the resources primarily because (1) NTF`s supercomputers are older machines whose performance and costs cannot match those of more advanced computers available from other sources and (2) some agencies have not yet developed applications requiring supercomputer capabilities or do not have funding to support such activities. In addition, future support for the hardware and software at NTF is uncertain, making any investment by anmore » outside user risky.« less
Japanese supercomputer technology.

PubMed

Buzbee, B L; Ewald, R H; Worlton, W J

1982-12-17

Under the auspices of the Ministry for International Trade and Industry the Japanese have launched a National Superspeed Computer Project intended to produce high-performance computers for scientific computation and a Fifth-Generation Computer Project intended to incorporate and exploit concepts of artificial intelligence. If these projects are successful, which appears likely, advanced economic and military research in the United States may become dependent on access to supercomputers of foreign manufacture.
Code IN Exhibits - Supercomputing 2000

NASA Technical Reports Server (NTRS)

Yarrow, Maurice; McCann, Karen M.; Biswas, Rupak; VanderWijngaart, Rob F.; Kwak, Dochan (Technical Monitor)

2000-01-01

The creation of parameter study suites has recently become a more challenging problem as the parameter studies have become multi-tiered and the computational environment has become a supercomputer grid. The parameter spaces are vast, the individual problem sizes are getting larger, and researchers are seeking to combine several successive stages of parameterization and computation. Simultaneously, grid-based computing offers immense resource opportunities but at the expense of great difficulty of use. We present ILab, an advanced graphical user interface approach to this problem. Our novel strategy stresses intuitive visual design tools for parameter study creation and complex process specification, and also offers programming-free access to grid-based supercomputer resources and process automation.
Advanced Numerical Techniques of Performance Evaluation. Volume 1

DTIC Science & Technology

1990-06-01

system scheduling3thread. The scheduling thread then runs any other ready thread that can be found. A thread can only sleep or switch out on itself...Polychronopoulos and D.J. Kuck. Guided Self- Scheduling : A Practical Scheduling Scheme for Parallel Supercomputers. IEEE Transactions on Computers C...Kuck 1987] C.D. Polychronopoulos and D.J. Kuck. Guided Self- Scheduling : A Practical Scheduling Scheme for Parallel Supercomputers. IEEE Trans. on Comp
Data communication requirements for the advanced NAS network

NASA Technical Reports Server (NTRS)

Levin, Eugene; Eaton, C. K.; Young, Bruce

1986-01-01

The goal of the Numerical Aerodynamic Simulation (NAS) Program is to provide a powerful computational environment for advanced research and development in aeronautics and related disciplines. The present NAS system consists of a Cray 2 supercomputer connected by a data network to a large mass storage system, to sophisticated local graphics workstations, and by remote communications to researchers throughout the United States. The program plan is to continue acquiring the most powerful supercomputers as they become available. In the 1987/1988 time period it is anticipated that a computer with 4 times the processing speed of a Cray 2 will be obtained and by 1990 an additional supercomputer with 16 times the speed of the Cray 2. The implications of this 20-fold increase in processing power on the data communications requirements are described. The analysis was based on models of the projected workload and system architecture. The results are presented together with the estimates of their sensitivity to assumptions inherent in the models.

NAS Technical Summaries, March 1993 - February 1994

NASA Technical Reports Server (NTRS)

1995-01-01

NASA created the Numerical Aerodynamic Simulation (NAS) Program in 1987 to focus resources on solving critical problems in aeroscience and related disciplines by utilizing the power of the most advanced supercomputers available. The NAS Program provides scientists with the necessary computing power to solve today's most demanding computational fluid dynamics problems and serves as a pathfinder in integrating leading-edge supercomputing technologies, thus benefitting other supercomputer centers in government and industry. The 1993-94 operational year concluded with 448 high-speed processor projects and 95 parallel projects representing NASA, the Department of Defense, other government agencies, private industry, and universities. This document provides a glimpse at some of the significant scientific results for the year.
NAS technical summaries. Numerical aerodynamic simulation program, March 1992 - February 1993

NASA Technical Reports Server (NTRS)

1994-01-01

NASA created the Numerical Aerodynamic Simulation (NAS) Program in 1987 to focus resources on solving critical problems in aeroscience and related disciplines by utilizing the power of the most advanced supercomputers available. The NAS Program provides scientists with the necessary computing power to solve today's most demanding computational fluid dynamics problems and serves as a pathfinder in integrating leading-edge supercomputing technologies, thus benefitting other supercomputer centers in government and industry. The 1992-93 operational year concluded with 399 high-speed processor projects and 91 parallel projects representing NASA, the Department of Defense, other government agencies, private industry, and universities. This document provides a glimpse at some of the significant scientific results for the year.
NAS technical summaries: Numerical aerodynamic simulation program, March 1991 - February 1992

NASA Technical Reports Server (NTRS)

1992-01-01

NASA created the Numerical Aerodynamic Simulation (NAS) Program in 1987 to focus resources on solving critical problems in aeroscience and related disciplines by utilizing the power of the most advanced supercomputers available. The NAS Program provides scientists with the necessary computing power to solve today's most demanding computational fluid dynamics problems and serves as a pathfinder in integrating leading-edge supercomputing technologies, thus benefiting other supercomputer centers in Government and industry. This report contains selected scientific results from the 1991-92 NAS Operational Year, March 4, 1991 to March 3, 1992, which is the fifth year of operation. During this year, the scientific community was given access to a Cray-2 and a Cray Y-MP. The Cray-2, the first generation supercomputer, has four processors, 256 megawords of central memory, and a total sustained speed of 250 million floating point operations per second. The Cray Y-MP, the second generation supercomputer, has eight processors and a total sustained speed of one billion floating point operations per second. Additional memory was installed this year, doubling capacity from 128 to 256 megawords of solid-state storage-device memory. Because of its higher performance, the Cray Y-MP delivered approximately 77 percent of the total number of supercomputer hours used during this year.
NASA's Pleiades Supercomputer Crunches Data For Groundbreaking Analysis and Visualizations

NASA Image and Video Library

2016-11-23

The Pleiades supercomputer at NASA's Ames Research Center, recently named the 13th fastest computer in the world, provides scientists and researchers high-fidelity numerical modeling of complex systems and processes. By using detailed analyses and visualizations of large-scale data, Pleiades is helping to advance human knowledge and technology, from designing the next generation of aircraft and spacecraft to understanding the Earth's climate and the mysteries of our galaxy.
US Department of Energy High School Student Supercomputing Honors Program: A follow-up assessment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1987-01-01

The US DOE High School Student Supercomputing Honors Program was designed to recognize high school students with superior skills in mathematics and computer science and to provide them with formal training and experience with advanced computer equipment. This document reports on the participants who attended the first such program, which was held at the National Magnetic Fusion Energy Computer Center at the Lawrence Livermore National Laboratory (LLNL) during August 1985.
Advances in petascale kinetic plasma simulation with VPIC and Roadrunner

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bowers, Kevin J; Albright, Brian J; Yin, Lin

2009-01-01

VPIC, a first-principles 3d electromagnetic charge-conserving relativistic kinetic particle-in-cell (PIC) code, was recently adapted to run on Los Alamos's Roadrunner, the first supercomputer to break a petaflop (10{sup 15} floating point operations per second) in the TOP500 supercomputer performance rankings. They give a brief overview of the modeling capabilities and optimization techniques used in VPIC and the computational characteristics of petascale supercomputers like Roadrunner. They then discuss three applications enabled by VPIC's unprecedented performance on Roadrunner: modeling laser plasma interaction in upcoming inertial confinement fusion experiments at the National Ignition Facility (NIF), modeling short pulse laser GeV ion acceleration andmore » modeling reconnection in magnetic confinement fusion experiments.« less
Supercomputing Sheds Light on the Dark Universe

DOE Office of Scientific and Technical Information (OSTI.GOV)

Habib, Salman; Heitmann, Katrin

2012-11-15

At Argonne National Laboratory, scientists are using supercomputers to shed light on one of the great mysteries in science today, the Dark Universe. With Mira, a petascale supercomputer at the Argonne Leadership Computing Facility, a team led by physicists Salman Habib and Katrin Heitmann will run the largest, most complex simulation of the universe ever attempted. By contrasting the results from Mira with state-of-the-art telescope surveys, the scientists hope to gain new insights into the distribution of matter in the universe, advancing future investigations of dark energy and dark matter into a new realm. The team's research was named amore » finalist for the 2012 Gordon Bell Prize, an award recognizing outstanding achievement in high-performance computing.« less
Multiple DNA and protein sequence alignment on a workstation and a supercomputer.

PubMed

Tajima, K

1988-11-01

This paper describes a multiple alignment method using a workstation and supercomputer. The method is based on the alignment of a set of aligned sequences with the new sequence, and uses a recursive procedure of such alignment. The alignment is executed in a reasonable computation time on diverse levels from a workstation to a supercomputer, from the viewpoint of alignment results and computational speed by parallel processing. The application of the algorithm is illustrated by several examples of multiple alignment of 12 amino acid and DNA sequences of HIV (human immunodeficiency virus) env genes. Colour graphic programs on a workstation and parallel processing on a supercomputer are discussed.
Science and Technology Review June 2000

DOE Office of Scientific and Technical Information (OSTI.GOV)

de Pruneda, J.H.

2000-06-01

This issue contains the following articles: (1) ''Accelerating on the ASCI Challenge''. (2) ''New Day Daws in Supercomputing'' When the ASCI White supercomputer comes online this summer, DOE's Stockpile Stewardship Program will make another significant advanced toward helping to ensure the safety, reliability, and performance of the nation's nuclear weapons. (3) ''Uncovering the Secrets of Actinides'' Researchers are obtaining fundamental information about the actinides, a group of elements with a key role in nuclear weapons and fuels. (4) ''A Predictable Structure for Aerogels''. (5) ''Tibet--Where Continents Collide''.
Interfaces for Advanced Computing.

ERIC Educational Resources Information Center

Foley, James D.

1987-01-01

Discusses the coming generation of supercomputers that will have the power to make elaborate "artificial realities" that facilitate user-computer communication. Illustrates these technological advancements with examples of the use of head-mounted monitors which are connected to position and orientation sensors, and gloves that track finger and…
Wakefield Computations for the CLIC PETS using the Parallel Finite Element Time-Domain Code T3P

DOE Office of Scientific and Technical Information (OSTI.GOV)

Candel, A; Kabel, A.; Lee, L.

In recent years, SLAC's Advanced Computations Department (ACD) has developed the high-performance parallel 3D electromagnetic time-domain code, T3P, for simulations of wakefields and transients in complex accelerator structures. T3P is based on advanced higher-order Finite Element methods on unstructured grids with quadratic surface approximation. Optimized for large-scale parallel processing on leadership supercomputing facilities, T3P allows simulations of realistic 3D structures with unprecedented accuracy, aiding the design of the next generation of accelerator facilities. Applications to the Compact Linear Collider (CLIC) Power Extraction and Transfer Structure (PETS) are presented.
2014 Annual Report - Argonne Leadership Computing Facility

DOE Office of Scientific and Technical Information (OSTI.GOV)

Collins, James R.; Papka, Michael E.; Cerny, Beth A.

The Argonne Leadership Computing Facility provides supercomputing capabilities to the scientific and engineering community to advance fundamental discovery and understanding in a broad range of disciplines.
2015 Annual Report - Argonne Leadership Computing Facility

DOE Office of Scientific and Technical Information (OSTI.GOV)

Collins, James R.; Papka, Michael E.; Cerny, Beth A.

The Argonne Leadership Computing Facility provides supercomputing capabilities to the scientific and engineering community to advance fundamental discovery and understanding in a broad range of disciplines.
CFD in the 1980's from one point of view

NASA Technical Reports Server (NTRS)

Lomax, Harvard

1991-01-01

The present interpretive treatment of the development history of CFD in the 1980s gives attention to advancements in such algorithmic techniques as flux Jacobian-based upwind differencing, total variation-diminishing and essentially nonoscillatory schemes, multigrid methods, unstructured grids, and nonrectangular structured grids. At the same time, computational turbulence research gave attention to turbulence modeling on the bases of increasingly powerful supercomputers and meticulously constructed databases. The major future developments in CFD will encompass such capabilities as structured and unstructured three-dimensional grids.
Finite element methods on supercomputers - The scatter-problem

NASA Technical Reports Server (NTRS)

Loehner, R.; Morgan, K.

1985-01-01

Certain problems arise in connection with the use of supercomputers for the implementation of finite-element methods. These problems are related to the desirability of utilizing the power of the supercomputer as fully as possible for the rapid execution of the required computations, taking into account the gain in speed possible with the aid of pipelining operations. For the finite-element method, the time-consuming operations may be divided into three categories. The first two present no problems, while the third type of operation can be a reason for the inefficient performance of finite-element programs. Two possibilities for overcoming certain difficulties are proposed, giving attention to a scatter-process.
Next Generation Security for the 10,240 Processor Columbia System

NASA Technical Reports Server (NTRS)

Hinke, Thomas; Kolano, Paul; Shaw, Derek; Keller, Chris; Tweton, Dave; Welch, Todd; Liu, Wen (Betty)

2005-01-01

This presentation includes a discussion of the Columbia 10,240-processor system located at the NASA Advanced Supercomputing (NAS) division at the NASA Ames Research Center which supports each of NASA's four missions: science, exploration systems, aeronautics, and space operations. It is comprised of 20 Silicon Graphics nodes, each consisting of 512 Itanium II processors. A 64 processor Columbia front-end system supports users as they prepare their jobs and then submits them to the PBS system. Columbia nodes and front-end systems use the Linux OS. Prior to SC04, the Columbia system was used to attain a processing speed of 51.87 TeraFlops, which made it number two on the Top 500 list of the world's supercomputers and the world's fastest "operational" supercomputer since it was fully engaged in supporting NASA users.
NASA's Participation in the National Computational Grid

NASA Technical Reports Server (NTRS)

Feiereisen, William J.; Zornetzer, Steve F. (Technical Monitor)

1998-01-01

Over the last several years it has become evident that the character of NASA's supercomputing needs has changed. One of the major missions of the agency is to support the design and manufacture of aero- and space-vehicles with technologies that will significantly reduce their cost. It is becoming clear that improvements in the process of aerospace design and manufacturing will require a high performance information infrastructure that allows geographically dispersed teams to draw upon resources that are broader than traditional supercomputing. A computational grid draws together our information resources into one system. We can foresee the time when a Grid will allow engineers and scientists to use the tools of supercomputers, databases and on line experimental devices in a virtual environment to collaborate with distant colleagues. The concept of a computational grid has been spoken of for many years, but several events in recent times are conspiring to allow us to actually build one. In late 1997 the National Science Foundation initiated the Partnerships for Advanced Computational Infrastructure (PACI) which is built around the idea of distributed high performance computing. The Alliance lead, by the National Computational Science Alliance (NCSA), and the National Partnership for Advanced Computational Infrastructure (NPACI), lead by the San Diego Supercomputing Center, have been instrumental in drawing together the "Grid Community" to identify the technology bottlenecks and propose a research agenda to address them. During the same period NASA has begun to reformulate parts of two major high performance computing research programs to concentrate on distributed high performance computing and has banded together with the PACI centers to address the research agenda in common.
Flux-Level Transit Injection Experiments with NASA Pleiades Supercomputer

NASA Astrophysics Data System (ADS)

Li, Jie; Burke, Christopher J.; Catanzarite, Joseph; Seader, Shawn; Haas, Michael R.; Batalha, Natalie; Henze, Christopher; Christiansen, Jessie; Kepler Project, NASA Advanced Supercomputing Division

2016-06-01

Flux-Level Transit Injection (FLTI) experiments are executed with NASA's Pleiades supercomputer for the Kepler Mission. The latest release (9.3, January 2016) of the Kepler Science Operations Center Pipeline is used in the FLTI experiments. Their purpose is to validate the Analytic Completeness Model (ACM), which can be computed for all Kepler target stars, thereby enabling exoplanet occurrence rate studies. Pleiades, a facility of NASA's Advanced Supercomputing Division, is one of the world's most powerful supercomputers and represents NASA's state-of-the-art technology. We discuss the details of implementing the FLTI experiments on the Pleiades supercomputer. For example, taking into account that ~16 injections are generated by one core of the Pleiades processors in an hour, the “shallow” FLTI experiment, in which ~2000 injections are required per target star, can be done for 16% of all Kepler target stars in about 200 hours. Stripping down the transit search to bare bones, i.e. only searching adjacent high/low periods at high/low pulse durations, makes the computationally intensive FLTI experiments affordable. The design of the FLTI experiments and the analysis of the resulting data are presented in “Validating an Analytic Completeness Model for Kepler Target Stars Based on Flux-level Transit Injection Experiments” by Catanzarite et al. (#2494058).Kepler was selected as the 10th mission of the Discovery Program. Funding for the Kepler Mission has been provided by the NASA Science Mission Directorate.
Role of High-End Computing in Meeting NASA's Science and Engineering Challenges

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Tu, Eugene L.; Van Dalsem, William R.

2006-01-01

Two years ago, NASA was on the verge of dramatically increasing its HEC capability and capacity. With the 10,240-processor supercomputer, Columbia, now in production for 18 months, HEC has an even greater impact within the Agency and extending to partner institutions. Advanced science and engineering simulations in space exploration, shuttle operations, Earth sciences, and fundamental aeronautics research are occurring on Columbia, demonstrating its ability to accelerate NASA s exploration vision. This talk describes how the integrated production environment fostered at the NASA Advanced Supercomputing (NAS) facility at Ames Research Center is accelerating scientific discovery, achieving parametric analyses of multiple scenarios, and enhancing safety for NASA missions. We focus on Columbia s impact on two key engineering and science disciplines: Aerospace, and Climate. We also discuss future mission challenges and plans for NASA s next-generation HEC environment.
Recent advances and future prospects for Monte Carlo

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, Forrest B

2010-01-01

The history of Monte Carlo methods is closely linked to that of computers: The first known Monte Carlo program was written in 1947 for the ENIAC; a pre-release of the first Fortran compiler was used for Monte Carlo In 1957; Monte Carlo codes were adapted to vector computers in the 1980s, clusters and parallel computers in the 1990s, and teraflop systems in the 2000s. Recent advances include hierarchical parallelism, combining threaded calculations on multicore processors with message-passing among different nodes. With the advances In computmg, Monte Carlo codes have evolved with new capabilities and new ways of use. Production codesmore » such as MCNP, MVP, MONK, TRIPOLI and SCALE are now 20-30 years old (or more) and are very rich in advanced featUres. The former 'method of last resort' has now become the first choice for many applications. Calculations are now routinely performed on office computers, not just on supercomputers. Current research and development efforts are investigating the use of Monte Carlo methods on FPGAs. GPUs, and many-core processors. Other far-reaching research is exploring ways to adapt Monte Carlo methods to future exaflop systems that may have 1M or more concurrent computational processes.« less

Seismic signal processing on heterogeneous supercomputers

NASA Astrophysics Data System (ADS)

Gokhberg, Alexey; Ermert, Laura; Fichtner, Andreas

2015-04-01

The processing of seismic signals - including the correlation of massive ambient noise data sets - represents an important part of a wide range of seismological applications. It is characterized by large data volumes as well as high computational input/output intensity. Development of efficient approaches towards seismic signal processing on emerging high performance computing systems is therefore essential. Heterogeneous supercomputing systems introduced in the recent years provide numerous computing nodes interconnected via high throughput networks, every node containing a mix of processing elements of different architectures, like several sequential processor cores and one or a few graphical processing units (GPU) serving as accelerators. A typical representative of such computing systems is "Piz Daint", a supercomputer of the Cray XC 30 family operated by the Swiss National Supercomputing Center (CSCS), which we used in this research. Heterogeneous supercomputers provide an opportunity for manifold application performance increase and are more energy-efficient, however they have much higher hardware complexity and are therefore much more difficult to program. The programming effort may be substantially reduced by the introduction of modular libraries of software components that can be reused for a wide class of seismology applications. The ultimate goal of this research is design of a prototype for such library suitable for implementing various seismic signal processing applications on heterogeneous systems. As a representative use case we have chosen an ambient noise correlation application. Ambient noise interferometry has developed into one of the most powerful tools to image and monitor the Earth's interior. Future applications will require the extraction of increasingly small details from noise recordings. To meet this demand, more advanced correlation techniques combined with very large data volumes are needed. This poses new computational problems that require dedicated HPC solutions. The chosen application is using a wide range of common signal processing methods, which include various IIR filter designs, amplitude and phase correlation, computing the analytic signal, and discrete Fourier transforms. Furthermore, various processing methods specific for seismology, like rotation of seismic traces, are used. Efficient implementation of all these methods on the GPU-accelerated systems represents several challenges. In particular, it requires a careful distribution of work between the sequential processors and accelerators. Furthermore, since the application is designed to process very large volumes of data, special attention had to be paid to the efficient use of the available memory and networking hardware resources in order to reduce intensity of data input and output. In our contribution we will explain the software architecture as well as principal engineering decisions used to address these challenges. We will also describe the programming model based on C++ and CUDA that we used to develop the software. Finally, we will demonstrate performance improvements achieved by using the heterogeneous computing architecture. This work was supported by a grant from the Swiss National Supercomputing Centre (CSCS) under project ID d26.
Parallel-vector solution of large-scale structural analysis problems on supercomputers

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.

1989-01-01

A direct linear equation solution method based on the Choleski factorization procedure is presented which exploits both parallel and vector features of supercomputers. The new equation solver is described, and its performance is evaluated by solving structural analysis problems on three high-performance computers. The method has been implemented using Force, a generic parallel FORTRAN language.
Porting Ordinary Applications to Blue Gene/Q Supercomputers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Maheshwari, Ketan C.; Wozniak, Justin M.; Armstrong, Timothy

2015-08-31

Efficiently porting ordinary applications to Blue Gene/Q supercomputers is a significant challenge. Codes are often originally developed without considering advanced architectures and related tool chains. Science needs frequently lead users to want to run large numbers of relatively small jobs (often called many-task computing, an ensemble, or a workflow), which can conflict with supercomputer configurations. In this paper, we discuss techniques developed to execute ordinary applications over leadership class supercomputers. We use the high-performance Swift parallel scripting framework and build two workflow execution techniques-sub-jobs and main-wrap. The sub-jobs technique, built on top of the IBM Blue Gene/Q resource manager Cobalt'smore » sub-block jobs, lets users submit multiple, independent, repeated smaller jobs within a single larger resource block. The main-wrap technique is a scheme that enables C/C++ programs to be defined as functions that are wrapped by a high-performance Swift wrapper and that are invoked as a Swift script. We discuss the needs, benefits, technicalities, and current limitations of these techniques. We further discuss the real-world science enabled by these techniques and the results obtained.« less
Unstructured Adaptive Meshes: Bad for Your Memory?

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Feng, Hui-Yu; VanderWijngaart, Rob

2003-01-01

This viewgraph presentation explores the need for a NASA Advanced Supercomputing (NAS) parallel benchmark for problems with irregular dynamical memory access. This benchmark is important and necessary because: 1) Problems with localized error source benefit from adaptive nonuniform meshes; 2) Certain machines perform poorly on such problems; 3) Parallel implementation may provide further performance improvement but is difficult. Some examples of problems which use irregular dynamical memory access include: 1) Heat transfer problem; 2) Heat source term; 3) Spectral element method; 4) Base functions; 5) Elemental discrete equations; 6) Global discrete equations. Nonconforming Mesh and Mortar Element Method are covered in greater detail in this presentation.
Kriging for Spatial-Temporal Data on the Bridges Supercomputer

NASA Astrophysics Data System (ADS)

Hodgess, E. M.

2017-12-01

Currently, kriging of spatial-temporal data is slow and limited to relatively small vector sizes. We have developed a method on the Bridges supercomputer, at the Pittsburgh supercomputer center, which uses a combination of the tools R, Fortran, the Message Passage Interface (MPI), OpenACC, and special R packages for big data. This combination of tools now permits us to complete tasks which could previously not be completed, or takes literally hours to complete. We ran simulation studies from a laptop against the supercomputer. We also look at "real world" data sets, such as the Irish wind data, and some weather data. We compare the timings. We note that the timings are suprising good.
Real World Uses For Nagios APIs

NASA Technical Reports Server (NTRS)

Singh, Janice

2014-01-01

This presentation describes the Nagios 4 APIs and how the NASA Advanced Supercomputing at Ames Research Center is employing them to upgrade its graphical status display (the HUD) and explain why it's worth trying to use them yourselves.
Science and Technology at Oak Ridge National Laboratory

ScienceCinema

Mason, Thomas

2017-12-22

ORNL Director Thom Mason explains the groundbreaking work in neutron sciences, supercomputing, clean energy, advanced materials, nuclear research, and global security taking place at the Department of Energy's Office of Science laboratory in Oak Ridge, TN.
ARC-2012-ACD12-0022-003

NASA Image and Video Library

2012-02-02

Kepler Program VIP's from left Jon Jenkins, Natalie Batalha, and Bill Borucki pointing at the NASA Ames Hyperwall in the NAS (NASA Advanced Supercomputing) facility filled with exo-planets discovered during Kepler Mission. Moffett Field, CA (for aviation week)
DVS-SOFTWARE: An Effective Tool for Applying Highly Parallelized Hardware To Computational Geophysics

NASA Astrophysics Data System (ADS)

Herrera, I.; Herrera, G. S.

2015-12-01

Most geophysical systems are macroscopic physical systems. The behavior prediction of such systems is carried out by means of computational models whose basic models are partial differential equations (PDEs) [1]. Due to the enormous size of the discretized version of such PDEs it is necessary to apply highly parallelized super-computers. For them, at present, the most efficient software is based on non-overlapping domain decomposition methods (DDM). However, a limiting feature of the present state-of-the-art techniques is due to the kind of discretizations used in them. Recently, I. Herrera and co-workers using 'non-overlapping discretizations' have produced the DVS-Software which overcomes this limitation [2]. The DVS-software can be applied to a great variety of geophysical problems and achieves very high parallel efficiencies (90%, or so [3]). It is therefore very suitable for effectively applying the most advanced parallel supercomputers available at present. In a parallel talk, in this AGU Fall Meeting, Graciela Herrera Z. will present how this software is being applied to advance MOD-FLOW. Key Words: Parallel Software for Geophysics, High Performance Computing, HPC, Parallel Computing, Domain Decomposition Methods (DDM)REFERENCES [1]. Herrera Ismael and George F. Pinder, Mathematical Modelling in Science and Engineering: An axiomatic approach", John Wiley, 243p., 2012. [2]. Herrera, I., de la Cruz L.M. and Rosas-Medina A. "Non Overlapping Discretization Methods for Partial, Differential Equations". NUMER METH PART D E, 30: 1427-1454, 2014, DOI 10.1002/num 21852. (Open source) [3]. Herrera, I., & Contreras Iván "An Innovative Tool for Effectively Applying Highly Parallelized Software To Problems of Elasticity". Geofísica Internacional, 2015 (In press)
Desktop supercomputer: what can it do?

NASA Astrophysics Data System (ADS)

Bogdanov, A.; Degtyarev, A.; Korkhov, V.

2017-12-01

The paper addresses the issues of solving complex problems that require using supercomputers or multiprocessor clusters available for most researchers nowadays. Efficient distribution of high performance computing resources according to actual application needs has been a major research topic since high-performance computing (HPC) technologies became widely introduced. At the same time, comfortable and transparent access to these resources was a key user requirement. In this paper we discuss approaches to build a virtual private supercomputer available at user's desktop: a virtual computing environment tailored specifically for a target user with a particular target application. We describe and evaluate possibilities to create the virtual supercomputer based on light-weight virtualization technologies, and analyze the efficiency of our approach compared to traditional methods of HPC resource management.
ARC-2012-ACD12-0022-007

NASA Image and Video Library

2012-02-02

Kepler Program VIP's from left Natalie Batalha, Bill Borucki and Jon Jenkins in front of a NASA Ames Hyperwall display of newly discovered planet K-22B art at the NAS (NASA Advanced Supercomputing) Facility, Moffett Field, CA (for aviation week)
Finite element analysis in fluids; Proceedings of the Seventh International Conference on Finite Element Methods in Flow Problems, University of Alabama, Huntsville, Apr. 3-7, 1989

NASA Technical Reports Server (NTRS)

Chung, T. J. (Editor); Karr, Gerald R. (Editor)

1989-01-01

Recent advances in computational fluid dynamics are examined in reviews and reports, with an emphasis on finite-element methods. Sections are devoted to adaptive meshes, atmospheric dynamics, combustion, compressible flows, control-volume finite elements, crystal growth, domain decomposition, EM-field problems, FDM/FEM, and fluid-structure interactions. Consideration is given to free-boundary problems with heat transfer, free surface flow, geophysical flow problems, heat and mass transfer, high-speed flow, incompressible flow, inverse design methods, MHD problems, the mathematics of finite elements, and mesh generation. Also discussed are mixed finite elements, multigrid methods, non-Newtonian fluids, numerical dissipation, parallel vector processing, reservoir simulation, seepage, shallow-water problems, spectral methods, supercomputer architectures, three-dimensional problems, and turbulent flows.
Extreme Scale Plasma Turbulence Simulations on Top Supercomputers Worldwide

DOE PAGES

Tang, William; Wang, Bei; Ethier, Stephane; ...

2016-11-01

The goal of the extreme scale plasma turbulence studies described in this paper is to expedite the delivery of reliable predictions on confinement physics in large magnetic fusion systems by using world-class supercomputers to carry out simulations with unprecedented resolution and temporal duration. This has involved architecture-dependent optimizations of performance scaling and addressing code portability and energy issues, with the metrics for multi-platform comparisons being 'time-to-solution' and 'energy-to-solution'. Realistic results addressing how confinement losses caused by plasma turbulence scale from present-day devices to the much larger $25 billion international ITER fusion facility have been enabled by innovative advances in themore » GTC-P code including (i) implementation of one-sided communication from MPI 3.0 standard; (ii) creative optimization techniques on Xeon Phi processors; and (iii) development of a novel performance model for the key kernels of the PIC code. Our results show that modeling data movement is sufficient to predict performance on modern supercomputer platforms.« less
Multi-petascale highly efficient parallel supercomputer

DOEpatents

Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; Blumrich, Matthias A.; Boyle, Peter; Brunheroto, Jose R.; Chen, Dong; Cher, Chen -Yong; Chiu, George L.; Christ, Norman; Coteus, Paul W.; Davis, Kristan D.; Dozsa, Gabor J.; Eichenberger, Alexandre E.; Eisley, Noel A.; Ellavsky, Matthew R.; Evans, Kahn C.; Fleischer, Bruce M.; Fox, Thomas W.; Gara, Alan; Giampapa, Mark E.; Gooding, Thomas M.; Gschwind, Michael K.; Gunnels, John A.; Hall, Shawn A.; Haring, Rudolf A.; Heidelberger, Philip; Inglett, Todd A.; Knudson, Brant L.; Kopcsay, Gerard V.; Kumar, Sameer; Mamidala, Amith R.; Marcella, James A.; Megerian, Mark G.; Miller, Douglas R.; Miller, Samuel J.; Muff, Adam J.; Mundy, Michael B.; O'Brien, John K.; O'Brien, Kathryn M.; Ohmacht, Martin; Parker, Jeffrey J.; Poole, Ruth J.; Ratterman, Joseph D.; Salapura, Valentina; Satterfield, David L.; Senger, Robert M.; Smith, Brian; Steinmacher-Burow, Burkhard; Stockdell, William M.; Stunkel, Craig B.; Sugavanam, Krishnan; Sugawara, Yutaka; Takken, Todd E.; Trager, Barry M.; Van Oosten, James L.; Wait, Charles D.; Walkup, Robert E.; Watson, Alfred T.; Wisniewski, Robert W.; Wu, Peng

2015-07-14

A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enables a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, with each having full access to all system resources and enabling adaptive partitioning of the processors to functions such as compute or messaging I/O on an application by application basis, and preferably, enable adaptive partitioning of functions in accordance with various algorithmic phases within an application, or if I/O or other processors are underutilized, then can participate in computation or communication nodes are interconnected by a five dimensional torus network with DMA that optimally maximize the throughput of packet communications between nodes and minimize latency.
Computation Directorate 2008 Annual Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Crawford, D L

2009-03-25

Whether a computer is simulating the aging and performance of a nuclear weapon, the folding of a protein, or the probability of rainfall over a particular mountain range, the necessary calculations can be enormous. Our computers help researchers answer these and other complex problems, and each new generation of system hardware and software widens the realm of possibilities. Building on Livermore's historical excellence and leadership in high-performance computing, Computation added more than 331 trillion floating-point operations per second (teraFLOPS) of power to LLNL's computer room floors in 2008. In addition, Livermore's next big supercomputer, Sequoia, advanced ever closer to itsmore » 2011-2012 delivery date, as architecture plans and the procurement contract were finalized. Hyperion, an advanced technology cluster test bed that teams Livermore with 10 industry leaders, made a big splash when it was announced during Michael Dell's keynote speech at the 2008 Supercomputing Conference. The Wall Street Journal touted Hyperion as a 'bright spot amid turmoil' in the computer industry. Computation continues to measure and improve the costs of operating LLNL's high-performance computing systems by moving hardware support in-house, by measuring causes of outages to apply resources asymmetrically, and by automating most of the account and access authorization and management processes. These improvements enable more dollars to go toward fielding the best supercomputers for science, while operating them at less cost and greater responsiveness to the customers.« less
High performance computing for advanced modeling and simulation of materials

NASA Astrophysics Data System (ADS)

Wang, Jue; Gao, Fei; Vazquez-Poletti, Jose Luis; Li, Jianjiang

2017-02-01

The First International Workshop on High Performance Computing for Advanced Modeling and Simulation of Materials (HPCMS2015) was held in Austin, Texas, USA, Nov. 18, 2015. HPCMS 2015 was organized by Computer Network Information Center (Chinese Academy of Sciences), University of Michigan, Universidad Complutense de Madrid, University of Science and Technology Beijing, Pittsburgh Supercomputing Center, China Institute of Atomic Energy, and Ames Laboratory.
Developments in the simulation of compressible inviscid and viscous flow on supercomputers

NASA Technical Reports Server (NTRS)

Steger, J. L.; Buning, P. G.

1985-01-01

In anticipation of future supercomputers, finite difference codes are rapidly being extended to simulate three-dimensional compressible flow about complex configurations. Some of these developments are reviewed. The importance of computational flow visualization and diagnostic methods to three-dimensional flow simulation is also briefly discussed.
Advances and issues from the simulation of planetary magnetospheres with recent supercomputer systems

NASA Astrophysics Data System (ADS)

Fukazawa, K.; Walker, R. J.; Kimura, T.; Tsuchiya, F.; Murakami, G.; Kita, H.; Tao, C.; Murata, K. T.

2016-12-01

Planetary magnetospheres are very large, while phenomena within them occur on meso- and micro-scales. These scales range from 10s of planetary radii to kilometers. To understand dynamics in these multi-scale systems, numerical simulations have been performed by using the supercomputer systems. We have studied the magnetospheres of Earth, Jupiter and Saturn by using 3-dimensional magnetohydrodynamic (MHD) simulations for a long time, however, we have not obtained the phenomena near the limits of the MHD approximation. In particular, we have not studied meso-scale phenomena that can be addressed by using MHD.Recently we performed our MHD simulation of Earth's magnetosphere by using the K-computer which is the first 10PFlops supercomputer and obtained multi-scale flow vorticity for the both northward and southward IMF. Furthermore, we have access to supercomputer systems which have Xeon, SPARC64, and vector-type CPUs and can compare simulation results between the different systems. Finally, we have compared the results of our parameter survey of the magnetosphere with observations from the HISAKI spacecraft.We have encountered a number of difficulties effectively using the latest supercomputer systems. First the size of simulation output increases greatly. Now a simulation group produces over 1PB of output. Storage and analysis of this much data is difficult. The traditional way to analyze simulation results is to move the results to the investigator's home computer. This takes over three months using an end-to-end 10Gbps network. In reality, there are problems at some nodes such as firewalls that can increase the transfer time to over one year. Another issue is post-processing. It is hard to treat a few TB of simulation output due to the memory limitations of a post-processing computer. To overcome these issues, we have developed and introduced the parallel network storage, the highly efficient network protocol and the CUI based visualization tools.In this study, we will show the latest simulation results using the petascale supercomputer and problems from the use of these supercomputer systems.
NETL - Supercomputing: NETL Simulation Based Engineering User Center (SBEUC)

ScienceCinema

None

2018-02-07

NETL's Simulation-Based Engineering User Center, or SBEUC, integrates one of the world's largest high-performance computers with an advanced visualization center. The SBEUC offers a collaborative environment among researchers at NETL sites and those working through the NETL-Regional University Alliance.
NETL - Supercomputing: NETL Simulation Based Engineering User Center (SBEUC)

DOE Office of Scientific and Technical Information (OSTI.GOV)

None

2013-09-30

NETL's Simulation-Based Engineering User Center, or SBEUC, integrates one of the world's largest high-performance computers with an advanced visualization center. The SBEUC offers a collaborative environment among researchers at NETL sites and those working through the NETL-Regional University Alliance.

The TESS science processing operations center

NASA Astrophysics Data System (ADS)

Jenkins, Jon M.; Twicken, Joseph D.; McCauliff, Sean; Campbell, Jennifer; Sanderfer, Dwight; Lung, David; Mansouri-Samani, Masoud; Girouard, Forrest; Tenenbaum, Peter; Klaus, Todd; Smith, Jeffrey C.; Caldwell, Douglas A.; Chacon, A. D.; Henze, Christopher; Heiges, Cory; Latham, David W.; Morgan, Edward; Swade, Daryl; Rinehart, Stephen; Vanderspek, Roland

2016-08-01

The Transiting Exoplanet Survey Satellite (TESS) will conduct a search for Earth's closest cousins starting in early 2018 and is expected to discover 1,000 small planets with Rp < 4 R⊕ and measure the masses of at least 50 of these small worlds. The Science Processing Operations Center (SPOC) is being developed at NASA Ames Research Center based on the Kepler science pipeline and will generate calibrated pixels and light curves on the NASA Advanced Supercomputing Division's Pleiades supercomputer. The SPOC will also search for periodic transit events and generate validation products for the transit-like features in the light curves. All TESS SPOC data products will be archived to the Mikulski Archive for Space Telescopes (MAST).
An Advanced User Interface Approach for Complex Parameter Study Process Specification in the Information Power Grid

NASA Technical Reports Server (NTRS)

Yarrow, Maurice; McCann, Karen M.; Biswas, Rupak; VanderWijngaart, Rob; Yan, Jerry C. (Technical Monitor)

2000-01-01

The creation of parameter study suites has recently become a more challenging problem as the parameter studies have now become multi-tiered and the computational environment has become a supercomputer grid. The parameter spaces are vast, the individual problem sizes are getting larger, and researchers are now seeking to combine several successive stages of parameterization and computation. Simultaneously, grid-based computing offers great resource opportunity but at the expense of great difficulty of use. We present an approach to this problem which stresses intuitive visual design tools for parameter study creation and complex process specification, and also offers programming-free access to grid-based supercomputer resources and process automation.
Extracting the Textual and Temporal Structure of Supercomputing Logs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jain, S; Singh, I; Chandra, A

2009-05-26

Supercomputers are prone to frequent faults that adversely affect their performance, reliability and functionality. System logs collected on these systems are a valuable resource of information about their operational status and health. However, their massive size, complexity, and lack of standard format makes it difficult to automatically extract information that can be used to improve system management. In this work we propose a novel method to succinctly represent the contents of supercomputing logs, by using textual clustering to automatically find the syntactic structures of log messages. This information is used to automatically classify messages into semantic groups via an onlinemore » clustering algorithm. Further, we describe a methodology for using the temporal proximity between groups of log messages to identify correlated events in the system. We apply our proposed methods to two large, publicly available supercomputing logs and show that our technique features nearly perfect accuracy for online log-classification and extracts meaningful structural and temporal message patterns that can be used to improve the accuracy of other log analysis techniques.« less
Supercomputers Of The Future

NASA Technical Reports Server (NTRS)

Peterson, Victor L.; Kim, John; Holst, Terry L.; Deiwert, George S.; Cooper, David M.; Watson, Andrew B.; Bailey, F. Ron

1992-01-01

Report evaluates supercomputer needs of five key disciplines: turbulence physics, aerodynamics, aerothermodynamics, chemistry, and mathematical modeling of human vision. Predicts these fields will require computer speed greater than 10(Sup 18) floating-point operations per second (FLOP's) and memory capacity greater than 10(Sup 15) words. Also, new parallel computer architectures and new structured numerical methods will make necessary speed and capacity available.
Aviation Research and the Internet

NASA Technical Reports Server (NTRS)

Scott, Antoinette M.

1995-01-01

The Internet is a network of networks. It was originally funded by the Defense Advanced Research Projects Agency or DOD/DARPA and evolved in part from the connection of supercomputer sites across the United States. The National Science Foundation (NSF) made the most of their supercomputers by connecting the sites to each other. This made the supercomputers more efficient and now allows scientists, engineers and researchers to access the supercomputers from their own labs and offices. The high speed networks that connect the NSF supercomputers form the backbone of the Internet. The World Wide Web (WWW) is a menu system. It gathers Internet resources from all over the world into a series of screens that appear on your computer. The WWW is also a distributed. The distributed system stores data information on many computers (servers). These servers can go out and get data when you ask for it. Hypermedia is the base of the WWW. One can 'click' on a section and visit other hypermedia (pages). Our approach to demonstrating the importance of aviation research through the Internet began with learning how to put pages on the Internet (on-line) ourselves. We were assigned two aviation companies; Vision Micro Systems Inc. and Innovative Aerodynamic Technologies (IAT). We developed home pages for these SBIR companies. The equipment used to create the pages were the UNIX and Macintosh machines. HTML Supertext software was used to write the pages and the Sharp JX600S scanner to scan the images. As a result, with the use of the UNIX, Macintosh, Sun, PC, and AXIL machines, we were able to present our home pages to over 800,000 visitors.
Enabling Diverse Software Stacks on Supercomputers using High Performance Virtual Clusters.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Younge, Andrew J.; Pedretti, Kevin; Grant, Ryan

While large-scale simulations have been the hallmark of the High Performance Computing (HPC) community for decades, Large Scale Data Analytics (LSDA) workloads are gaining attention within the scientific community not only as a processing component to large HPC simulations, but also as standalone scientific tools for knowledge discovery. With the path towards Exascale, new HPC runtime systems are also emerging in a way that differs from classical distributed com- puting models. However, system software for such capabilities on the latest extreme-scale DOE supercomputing needs to be enhanced to more appropriately support these types of emerging soft- ware ecosystems. In thismore » paper, we propose the use of Virtual Clusters on advanced supercomputing resources to enable systems to support not only HPC workloads, but also emerging big data stacks. Specifi- cally, we have deployed the KVM hypervisor within Cray's Compute Node Linux on a XC-series supercomputer testbed. We also use libvirt and QEMU to manage and provision VMs directly on compute nodes, leveraging Ethernet-over-Aries network emulation. To our knowledge, this is the first known use of KVM on a true MPP supercomputer. We investigate the overhead our solution using HPC benchmarks, both evaluating single-node performance as well as weak scaling of a 32-node virtual cluster. Overall, we find single node performance of our solution using KVM on a Cray is very efficient with near-native performance. However overhead increases by up to 20% as virtual cluster size increases, due to limitations of the Ethernet-over-Aries bridged network. Furthermore, we deploy Apache Spark with large data analysis workloads in a Virtual Cluster, ef- fectively demonstrating how diverse software ecosystems can be supported by High Performance Virtual Clusters.« less
Scalability Test of Multiscale Fluid-Platelet Model for Three Top Supercomputers

PubMed Central

Zhang, Peng; Zhang, Na; Gao, Chao; Zhang, Li; Gao, Yuxiang; Deng, Yuefan; Bluestein, Danny

2016-01-01

We have tested the scalability of three supercomputers: the Tianhe-2, Stampede and CS-Storm with multiscale fluid-platelet simulations, in which a highly-resolved and efficient numerical model for nanoscale biophysics of platelets in microscale viscous biofluids is considered. Three experiments involving varying problem sizes were performed: Exp-S: 680,718-particle single-platelet; Exp-M: 2,722,872-particle 4-platelet; and Exp-L: 10,891,488-particle 16-platelet. Our implementations of multiple time-stepping (MTS) algorithm improved the performance of single time-stepping (STS) in all experiments. Using MTS, our model achieved the following simulation rates: 12.5, 25.0, 35.5 μs/day for Exp-S and 9.09, 6.25, 14.29 μs/day for Exp-M on Tianhe-2, CS-Storm 16-K80 and Stampede K20. The best rate for Exp-L was 6.25 μs/day for Stampede. Utilizing current advanced HPC resources, the simulation rates achieved by our algorithms bring within reach performing complex multiscale simulations for solving vexing problems at the interface of biology and engineering, such as thrombosis in blood flow which combines millisecond-scale hematology with microscale blood flow at resolutions of micro-to-nanoscale cellular components of platelets. This study of testing the performance characteristics of supercomputers with advanced computational algorithms that offer optimal trade-off to achieve enhanced computational performance serves to demonstrate that such simulations are feasible with currently available HPC resources. PMID:27570250
Replica Exchange Molecular Dynamics in the Age of Heterogeneous Architectures

NASA Astrophysics Data System (ADS)

Roitberg, Adrian

2014-03-01

The rise of GPU-based codes has allowed MD to reach timescales only dreamed of only 5 years ago. Even within this new paradigm there is still need for advanced sampling techniques. Modern supercomputers (e.g. Blue Waters, Titan, Keeneland) have made available to users a significant number of GPUS and CPUS, which in turn translate into amazing opportunities for dream calculations. Replica-exchange based methods can optimally use tis combination of codes and architectures to explore conformational variabilities in large systems. I will show our recent work in porting the program Amber to GPUS, and the support for replica exchange methods, where the replicated dimension could be Temperature, pH, Hamiltonian, Umbrella windows and combinations of those schemes.
Prediction of contact path and load sharing in spiral bevel gears

NASA Technical Reports Server (NTRS)

Bibel, George D.; Tiku, Karuna; Kumar, Ashok

1994-01-01

A procedure is presented to perform a contact analysis of spiral bevel gears in order to predict the contact path and the load sharing as the gears roll through mesh. The approach utilizes recent advances in automated contact methods for nonlinear finite element analysis. A sector of the pinion and gear is modeled consisting of three pinion teeth and four gear teeth in mesh. Calculation of the contact force and stresses through the gear meshing cycle are demonstrated. Summary of the results are presented using three dimensional plots and tables. Issues relating to solution convergence and requirements for running large finite element analysis on a supercomputer are discussed.
The TESS Science Processing Operations Center

NASA Technical Reports Server (NTRS)

Jenkins, Jon M.; Twicken, Joseph D.; McCauliff, Sean; Campbell, Jennifer; Sanderfer, Dwight; Lung, David; Mansouri-Samani, Masoud; Girouard, Forrest; Tenenbaum, Peter; Klaus, Todd;

2016-01-01

The Transiting Exoplanet Survey Satellite (TESS) will conduct a search for Earth's closest cousins starting in early 2018 and is expected to discover approximately 1,000 small planets with R(sub p) less than 4 (solar radius) and measure the masses of at least 50 of these small worlds. The Science Processing Operations Center (SPOC) is being developed at NASA Ames Research Center based on the Kepler science pipeline and will generate calibrated pixels and light curves on the NASA Advanced Supercomputing Division's Pleiades supercomputer. The SPOC will also search for periodic transit events and generate validation products for the transit-like features in the light curves. All TESS SPOC data products will be archived to the Mikulski Archive for Space Telescopes (MAST).

A One-of-a-Kind Technology Expansion.

ERIC Educational Resources Information Center

Wiens, Janet

2002-01-01

Describes the design of the expansion of the National Center for Supercomputing Applications (NCSA) Advanced Computation Building at the University of Illinois, Champaign. Discusses how the design incorporated column-free space for flexibility, cooling capacity, a freight elevator, and a 6-foot raised access floor to neatly house airflow, wiring,…
New Computer Simulations of Macular Neural Functioning

NASA Technical Reports Server (NTRS)

Ross, Muriel D.; Doshay, D.; Linton, S.; Parnas, B.; Montgomery, K.; Chimento, T.

1994-01-01

We use high performance graphics workstations and supercomputers to study the functional significance of the three-dimensional (3-D) organization of gravity sensors. These sensors have a prototypic architecture foreshadowing more complex systems. Scaled-down simulations run on a Silicon Graphics workstation and scaled-up, 3-D versions run on a Cray Y-MP supercomputer. A semi-automated method of reconstruction of neural tissue from serial sections studied in a transmission electron microscope has been developed to eliminate tedious conventional photography. The reconstructions use a mesh as a step in generating a neural surface for visualization. Two meshes are required to model calyx surfaces. The meshes are connected and the resulting prisms represent the cytoplasm and the bounding membranes. A finite volume analysis method is employed to simulate voltage changes along the calyx in response to synapse activation on the calyx or on calyceal processes. The finite volume method insures that charge is conserved at the calyx-process junction. These and other models indicate that efferent processes act as voltage followers, and that the morphology of some afferent processes affects their functioning. In a final application, morphological information is symbolically represented in three dimensions in a computer. The possible functioning of the connectivities is tested using mathematical interpretations of physiological parameters taken from the literature. Symbolic, 3-D simulations are in progress to probe the functional significance of the connectivities. This research is expected to advance computer-based studies of macular functioning and of synaptic plasticity.
Supercomputer use in orthopaedic biomechanics research: focus on functional adaptation of bone.

PubMed

Hart, R T; Thongpreda, N; Van Buskirk, W C

1988-01-01

The authors describe two biomechanical analyses carried out using numerical methods. One is an analysis of the stress and strain in a human mandible, and the other analysis involves modeling the adaptive response of a sheep bone to mechanical loading. The computing environment required for the two types of analyses is discussed. It is shown that a simple stress analysis of a geometrically complex mandible can be accomplished using a minicomputer. However, more sophisticated analyses of the same model with dynamic loading or nonlinear materials would require supercomputer capabilities. A supercomputer is also required for modeling the adaptive response of living bone, even when simple geometric and material models are use.
A History of High-Performance Computing

NASA Technical Reports Server (NTRS)

2006-01-01

Faster than most speedy computers. More powerful than its NASA data-processing predecessors. Able to leap large, mission-related computational problems in a single bound. Clearly, it s neither a bird nor a plane, nor does it need to don a red cape, because it s super in its own way. It's Columbia, NASA s newest supercomputer and one of the world s most powerful production/processing units. Named Columbia to honor the STS-107 Space Shuttle Columbia crewmembers, the new supercomputer is making it possible for NASA to achieve breakthroughs in science and engineering, fulfilling the Agency s missions, and, ultimately, the Vision for Space Exploration. Shortly after being built in 2004, Columbia achieved a benchmark rating of 51.9 teraflop/s on 10,240 processors, making it the world s fastest operational computer at the time of completion. Putting this speed into perspective, 20 years ago, the most powerful computer at NASA s Ames Research Center, home of the NASA Advanced Supercomputing Division (NAS), ran at a speed of about 1 gigaflop (one billion calculations per second). The Columbia supercomputer is 50,000 times faster than this computer and offers a tenfold increase in capacity over the prior system housed at Ames. What s more, Columbia is considered the world s largest Linux-based, shared-memory system. The system is offering immeasurable benefits to society and is the zenith of years of NASA/private industry collaboration that has spawned new generations of commercial, high-speed computing systems.
Some Hail 'Computational Science' as Biggest Advance Since Newton, Galileo.

ERIC Educational Resources Information Center

Turner, Judith Axler

1987-01-01

Computational science is defined as science done on a computer. A computer can serve as a laboratory for researchers who cannot experiment with their subjects, and as a calculator for those who otherwise might need centuries to solve some problems mathematically. The National Science Foundation's support of supercomputers is discussed. (MLW)
NAS Applications and Advanced Algorithms

NASA Technical Reports Server (NTRS)

Bailey, David H.; Biswas, Rupak; VanDerWijngaart, Rob; Kutler, Paul (Technical Monitor)

1997-01-01

This paper examines the applications most commonly run on the supercomputers at the Numerical Aerospace Simulation (NAS) facility. It analyzes the extent to which such applications are fundamentally oriented to vector computers, and whether or not they can be efficiently implemented on hierarchical memory machines, such as systems with cache memories and highly parallel, distributed memory systems.
Development of the general interpolants method for the CYBER 200 series of supercomputers

NASA Technical Reports Server (NTRS)

Stalnaker, J. F.; Robinson, M. A.; Spradley, L. W.; Kurzius, S. C.; Thoenes, J.

1988-01-01

The General Interpolants Method (GIM) is a 3-D, time-dependent, hybrid procedure for generating numerical analogs of the conservation laws. This study is directed toward the development and application of the GIM computer code for fluid dynamic research applications as implemented for the Cyber 200 series of supercomputers. An elliptic and quasi-parabolic version of the GIM code are discussed. Turbulence models, algebraic and differential equations, were added to the basic viscous code. An equilibrium reacting chemistry model and an implicit finite difference scheme are also included.
An immersed boundary method for modeling a dirty geometry data

NASA Astrophysics Data System (ADS)

Onishi, Keiji; Tsubokura, Makoto

2017-11-01

We present a robust, fast, and low preparation cost immersed boundary method (IBM) for simulating an incompressible high Re flow around highly complex geometries. The method is achieved by the dispersion of the momentum by the axial linear projection and the approximate domain assumption satisfying the mass conservation around the wall including cells. This methodology has been verified against an analytical theory and wind tunnel experiment data. Next, we simulate the problem of flow around a rotating object and demonstrate the ability of this methodology to the moving geometry problem. This methodology provides the possibility as a method for obtaining a quick solution at a next large scale supercomputer. This research was supported by MEXT as ``Priority Issue on Post-K computer'' (Development of innovative design and production processes) and used computational resources of the K computer provided by the RIKEN Advanced Institute for Computational Science.
A Look at the Impact of High-End Computing Technologies on NASA Missions

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Dunbar, Jill; Hardman, John; Bailey, F. Ron; Wheeler, Lorien; Rogers, Stuart

2012-01-01

From its bold start nearly 30 years ago and continuing today, the NASA Advanced Supercomputing (NAS) facility at Ames Research Center has enabled remarkable breakthroughs in the space agency s science and engineering missions. Throughout this time, NAS experts have influenced the state-of-the-art in high-performance computing (HPC) and related technologies such as scientific visualization, system benchmarking, batch scheduling, and grid environments. We highlight the pioneering achievements and innovations originating from and made possible by NAS resources and know-how, from early supercomputing environment design and software development, to long-term simulation and analyses critical to design safe Space Shuttle operations and associated spinoff technologies, to the highly successful Kepler Mission s discovery of new planets now capturing the world s imagination.
Cancer's Big Data Problem

DOE PAGES

Breaux, Justin H. S.

2017-03-15

The US Department of Energy (DOE) has partnered with the National Cancer Institute (NCI) to use DOE supercomputers to aid in the fight against cancer by building sophisticated models based on data available at the population, patient, and molecular levels. Here, through a three-year pilot project called the Joint Design of Advanced Computing Solutions for Cancer (JDACSC), four participating national laboratories--Argonne, Lawrence Livermore, Los Alamos, and Oak Ridge--will focus on three problems singled out by the NCI as the biggest bottlenecks to advancing cancer research.

Cancer's Big Data Problem

DOE Office of Scientific and Technical Information (OSTI.GOV)

Breaux, Justin H. S.

The US Department of Energy (DOE) has partnered with the National Cancer Institute (NCI) to use DOE supercomputers to aid in the fight against cancer by building sophisticated models based on data available at the population, patient, and molecular levels. Here, through a three-year pilot project called the Joint Design of Advanced Computing Solutions for Cancer (JDACSC), four participating national laboratories--Argonne, Lawrence Livermore, Los Alamos, and Oak Ridge--will focus on three problems singled out by the NCI as the biggest bottlenecks to advancing cancer research.
Cots Correlator Platform

NASA Astrophysics Data System (ADS)

Schaaf, Kjeld; Overeem, Ruud

2004-06-01

Moore’s law is best exploited by using consumer market hardware. In particular, the gaming industry pushes the limit of processor performance thus reducing the cost per raw flop even faster than Moore’s law predicts. Next to the cost benefits of Common-Of-The-Shelf (COTS) processing resources, there is a rapidly growing experience pool in cluster based processing. The typical Beowulf cluster of PC’s supercomputers are well known. Multiple examples exists of specialised cluster computers based on more advanced server nodes or even gaming stations. All these cluster machines build upon the same knowledge about cluster software management, scheduling, middleware libraries and mathematical libraries. In this study, we have integrated COTS processing resources and cluster nodes into a very high performance processing platform suitable for streaming data applications, in particular to implement a correlator. The required processing power for the correlator in modern radio telescopes is in the range of the larger supercomputers, which motivates the usage of supercomputer technology. Raw processing power is provided by graphical processors and is combined with an Infiniband host bus adapter with integrated data stream handling logic. With this processing platform a scalable correlator can be built with continuously growing processing power at consumer market prices.
Trinity to Trinity 1945-2015

ScienceCinema

Moniz, Ernest; Carr, Alan; Bethe, Hans; Morrison, Phillip; Ramsay, Norman; Teller, Edward; Brixner, Berlyn; Archer, Bill; Agnew, Harold; Morrison, John

2018-01-16

The Trinity Test of July 16, 1945 was the first full-scale, real-world test of a nuclear weapon; with the new Trinity supercomputer Los Alamos National Laboratory's goal is to do this virtually, in 3D. Trinity was the culmination of a fantastic effort of groundbreaking science and engineering by hundreds of men and women at Los Alamos and other Manhattan Project sites. It took them less than two years to change the world. The Laboratory is marking the 70th anniversary of the Trinity Test because it not only ushered in the Nuclear Age, but with it the origin of todayâs advanced supercomputing. We live in the Age of Supercomputers due in large part to nuclear weapons science here at Los Alamos. National security science, and nuclear weapons science in particular, at Los Alamos National Laboratory have provided a key motivation for the evolution of large-scale scientific computing. Beginning with the Manhattan Project there has been a constant stream of increasingly significant, complex problems in nuclear weapons science whose timely solutions demand larger and faster computers. The relationship between national security science at Los Alamos and the evolution of computing is one of interdependence.
Application of computational physics within Northrop

NASA Technical Reports Server (NTRS)

George, M. W.; Ling, R. T.; Mangus, J. F.; Thompkins, W. T.

1987-01-01

An overview of Northrop programs in computational physics is presented. These programs depend on access to today's supercomputers, such as the Numerical Aerodynamical Simulator (NAS), and future growth on the continuing evolution of computational engines. Descriptions here are concentrated on the following areas: computational fluid dynamics (CFD), computational electromagnetics (CEM), computer architectures, and expert systems. Current efforts and future directions in these areas are presented. The impact of advances in the CFD area is described, and parallels are drawn to analagous developments in CEM. The relationship between advances in these areas and the development of advances (parallel) architectures and expert systems is also presented.
Computational mechanics - Advances and trends; Proceedings of the Session - Future directions of Computational Mechanics of the ASME Winter Annual Meeting, Anaheim, CA, Dec. 7-12, 1986

NASA Technical Reports Server (NTRS)

Noor, Ahmed K. (Editor)

1986-01-01

The papers contained in this volume provide an overview of the advances made in a number of aspects of computational mechanics, identify some of the anticipated industry needs in this area, discuss the opportunities provided by new hardware and parallel algorithms, and outline some of the current government programs in computational mechanics. Papers are included on advances and trends in parallel algorithms, supercomputers for engineering analysis, material modeling in nonlinear finite-element analysis, the Navier-Stokes computer, and future finite-element software systems.
Progress and supercomputing in computational fluid dynamics; Proceedings of U.S.-Israel Workshop, Jerusalem, Israel, December 1984

NASA Technical Reports Server (NTRS)

Murman, E. M. (Editor); Abarbanel, S. S. (Editor)

1985-01-01

Current developments and future trends in the application of supercomputers to computational fluid dynamics are discussed in reviews and reports. Topics examined include algorithm development for personal-size supercomputers, a multiblock three-dimensional Euler code for out-of-core and multiprocessor calculations, simulation of compressible inviscid and viscous flow, high-resolution solutions of the Euler equations for vortex flows, algorithms for the Navier-Stokes equations, and viscous-flow simulation by FEM and related techniques. Consideration is given to marching iterative methods for the parabolized and thin-layer Navier-Stokes equations, multigrid solutions to quasi-elliptic schemes, secondary instability of free shear flows, simulation of turbulent flow, and problems connected with weather prediction.
Keeping an Eye on the Prize

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hazi, A U

2007-02-06

Setting performance goals is part of the business plan for almost every company. The same is true in the world of supercomputers. Ten years ago, the Department of Energy (DOE) launched the Accelerated Strategic Computing Initiative (ASCI) to help ensure the safety and reliability of the nation's nuclear weapons stockpile without nuclear testing. ASCI, which is now called the Advanced Simulation and Computing (ASC) Program and is managed by DOE's National Nuclear Security Administration (NNSA), set an initial 10-year goal to obtain computers that could process up to 100 trillion floating-point operations per second (teraflops). Many computer experts thought themore » goal was overly ambitious, but the program's results have proved them wrong. Last November, a Livermore-IBM team received the 2005 Gordon Bell Prize for achieving more than 100 teraflops while modeling the pressure-induced solidification of molten metal. The prestigious prize, which is named for a founding father of supercomputing, is awarded each year at the Supercomputing Conference to innovators who advance high-performance computing. Recipients for the 2005 prize included six Livermore scientists--physicists Fred Streitz, James Glosli, and Mehul Patel and computer scientists Bor Chan, Robert Yates, and Bronis de Supinski--as well as IBM researchers James Sexton and John Gunnels. This team produced the first atomic-scale model of metal solidification from the liquid phase with results that were independent of system size. The record-setting calculation used Livermore's domain decomposition molecular-dynamics (ddcMD) code running on BlueGene/L, a supercomputer developed by IBM in partnership with the ASC Program. BlueGene/L reached 280.6 teraflops on the Linpack benchmark, the industry standard used to measure computing speed. As a result, it ranks first on the list of Top500 Supercomputer Sites released in November 2005. To evaluate the performance of nuclear weapons systems, scientists must understand how materials behave under extreme conditions. Because experiments at high pressures and temperatures are often difficult or impossible to conduct, scientists rely on computer models that have been validated with obtainable data. Of particular interest to weapons scientists is the solidification of metals. ''To predict the performance of aging nuclear weapons, we need detailed information on a material's phase transitions'', says Streitz, who leads the Livermore-IBM team. For example, scientists want to know what happens to a metal as it changes from molten liquid to a solid and how that transition affects the material's characteristics, such as its strength.« less
Tracing Scientific Facilities through the Research Literature Using Persistent Identifiers

NASA Astrophysics Data System (ADS)

Mayernik, M. S.; Maull, K. E.

2016-12-01

Tracing persistent identifiers to their source publications is an easy task when authors use them, since it is a simple matter of matching the persistent identifier to the specific text string of the identifier. However, trying to understand if a publication uses the resource behind an identifier when such identifier is not referenced explicitly is a harder task. In this research, we explore the effectiveness of alternative strategies of associating publications with uses of the resource referenced by an identifier when it may not be explicit. This project is explored within the context of the NCAR supercomputer, where we are broadly interesting in the science that can be traced to the usage of the NCAR supercomputing facility, by way of the peer-reviewed research publications that utilize and reference it. In this project we explore several ways of drawing linkages between publications and the NCAR supercomputing resources. Identifying and compiling peer-reviewed publications related to NCAR supercomputer usage are explored via three sources: 1) User-supplied publications gathered through a community survey, 2) publications that were identified via manual searching of the Google scholar search index, and 3) publications associated with National Science Foundation (NSF) grants extracted from a public NSF database. These three sources represent three styles of collecting information about publications that likely imply usage of the NCAR supercomputing facilities. Each source has strengths and weaknesses, thus our discussion will explore how our publication identification and analysis methods vary in terms of accuracy, reliability, and effort. We will also discuss strategies for enabling more efficient tracing of research impacts of supercomputing facilities going forward through the assignment of a persistent web identifier to the NCAR supercomputer. While this solution has potential to greatly enhance our ability to trace the use of the facility through publications, authors must cite the facility consistently. It is therefore necessary to provide recommendations for citation and attribution behavior, and we will conclude our discussion with how such recommendations have improved tracing the supercomputer facility allowing for more consistent and widespread measurement of its impact.
High Performance Computing of Meshless Time Domain Method on Multi-GPU Cluster

NASA Astrophysics Data System (ADS)

Ikuno, Soichiro; Nakata, Susumu; Hirokawa, Yuta; Itoh, Taku

2015-01-01

High performance computing of Meshless Time Domain Method (MTDM) on multi-GPU using the supercomputer HA-PACS (Highly Accelerated Parallel Advanced system for Computational Sciences) at University of Tsukuba is investigated. Generally, the finite difference time domain (FDTD) method is adopted for the numerical simulation of the electromagnetic wave propagation phenomena. However, the numerical domain must be divided into rectangle meshes, and it is difficult to adopt the problem in a complexed domain to the method. On the other hand, MTDM can be easily adept to the problem because MTDM does not requires meshes. In the present study, we implement MTDM on multi-GPU cluster to speedup the method, and numerically investigate the performance of the method on multi-GPU cluster. To reduce the computation time, the communication time between the decomposed domain is hided below the perfect matched layer (PML) calculation procedure. The results of computation show that speedup of MTDM on 128 GPUs is 173 times faster than that of single CPU calculation.
A Computational framework for telemedicine.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Foster, I.; von Laszewski, G.; Thiruvathukal, G. K.

1998-07-01

Emerging telemedicine applications require the ability to exploit diverse and geographically distributed resources. Highspeed networks are used to integrate advanced visualization devices, sophisticated instruments, large databases, archival storage devices, PCs, workstations, and supercomputers. This form of telemedical environment is similar to networked virtual supercomputers, also known as metacomputers. Metacomputers are already being used in many scientific application areas. In this article, we analyze requirements necessary for a telemedical computing infrastructure and compare them with requirements found in a typical metacomputing environment. We will show that metacomputing environments can be used to enable a more powerful and unified computational infrastructure formore » telemedicine. The Globus metacomputing toolkit can provide the necessary low level mechanisms to enable a large scale telemedical infrastructure. The Globus toolkit components are designed in a modular fashion and can be extended to support the specific requirements for telemedicine.« less
Supercomputing in the Age of Discovering Superearths, Earths and Exoplanet Systems

NASA Technical Reports Server (NTRS)

Jenkins, Jon M.

2015-01-01

NASA's Kepler Mission was launched in March 2009 as NASA's first mission capable of finding Earth-size planets orbiting in the habitable zone of Sun-like stars, that range of distances for which liquid water would pool on the surface of a rocky planet. Kepler has discovered over 1000 planets and over 4600 candidates, many of them as small as the Earth. Today, Kepler's amazing success seems to be a fait accompli to those unfamiliar with her history. But twenty years ago, there were no planets known outside our solar system, and few people believed it was possible to detect tiny Earth-size planets orbiting other stars. Motivating NASA to select Kepler for launch required a confluence of the right detector technology, advances in signal processing and algorithms, and the power of supercomputing.
Solving large sparse eigenvalue problems on supercomputers

NASA Technical Reports Server (NTRS)

Philippe, Bernard; Saad, Youcef

1988-01-01

An important problem in scientific computing consists in finding a few eigenvalues and corresponding eigenvectors of a very large and sparse matrix. The most popular methods to solve these problems are based on projection techniques on appropriate subspaces. The main attraction of these methods is that they only require the use of the matrix in the form of matrix by vector multiplications. The implementations on supercomputers of two such methods for symmetric matrices, namely Lanczos' method and Davidson's method are compared. Since one of the most important operations in these two methods is the multiplication of vectors by the sparse matrix, methods of performing this operation efficiently are discussed. The advantages and the disadvantages of each method are compared and implementation aspects are discussed. Numerical experiments on a one processor CRAY 2 and CRAY X-MP are reported. Possible parallel implementations are also discussed.
Comprehensive efficiency analysis of supercomputer resource usage based on system monitoring data

NASA Astrophysics Data System (ADS)

Mamaeva, A. A.; Shaykhislamov, D. I.; Voevodin, Vad V.; Zhumatiy, S. A.

2018-03-01

One of the main problems of modern supercomputers is the low efficiency of their usage, which leads to the significant idle time of computational resources, and, in turn, to the decrease in speed of scientific research. This paper presents three approaches to study the efficiency of supercomputer resource usage based on monitoring data analysis. The first approach performs an analysis of computing resource utilization statistics, which allows to identify different typical classes of programs, to explore the structure of the supercomputer job flow and to track overall trends in the supercomputer behavior. The second approach is aimed specifically at analyzing off-the-shelf software packages and libraries installed on the supercomputer, since efficiency of their usage is becoming an increasingly important factor for the efficient functioning of the entire supercomputer. Within the third approach, abnormal jobs – jobs with abnormally inefficient behavior that differs significantly from the standard behavior of the overall supercomputer job flow – are being detected. For each approach, the results obtained in practice in the Supercomputer Center of Moscow State University are demonstrated.
Optimization of Supercomputer Use on EADS II System

NASA Technical Reports Server (NTRS)

Ahmed, Ardsher

1998-01-01

The main objective of this research was to optimize supercomputer use to achieve better throughput and utilization of supercomputers and to help facilitate the movement of non-supercomputing (inappropriate for supercomputer) codes to mid-range systems for better use of Government resources at Marshall Space Flight Center (MSFC). This work involved the survey of architectures available on EADS II and monitoring customer (user) applications running on a CRAY T90 system.
Supercomputer applications in molecular modeling.

PubMed

Gund, T M

1988-01-01

An overview of the functions performed by molecular modeling is given. Molecular modeling techniques benefiting from supercomputing are described, namely, conformation, search, deriving bioactive conformations, pharmacophoric pattern searching, receptor mapping, and electrostatic properties. The use of supercomputers for problems that are computationally intensive, such as protein structure prediction, protein dynamics and reactivity, protein conformations, and energetics of binding is also examined. The current status of supercomputing and supercomputer resources are discussed.
Requirements and Usage of NVM in Advanced Onboard Data Processing Systems

NASA Technical Reports Server (NTRS)

Some, R.

2001-01-01

This viewgraph presentation gives an overview of the requirements and uses of non-volatile memory (NVM) in advanced onboard data processing systems. Supercomputing in space presents the only viable approach to the bandwidth problem (can't get data down to Earth), controlling constellations of cooperating satellites, reducing mission operating costs, and real-time intelligent decision making and science data gathering. Details are given on the REE vision and impact on NASA and Department of Defense missions, objectives of REE, baseline architecture, and issues. NVM uses and requirements are listed.
DDBJ read annotation pipeline: a cloud computing-based pipeline for high-throughput analysis of next-generation sequencing data.

PubMed

Nagasaki, Hideki; Mochizuki, Takako; Kodama, Yuichi; Saruhashi, Satoshi; Morizaki, Shota; Sugawara, Hideaki; Ohyanagi, Hajime; Kurata, Nori; Okubo, Kousaku; Takagi, Toshihisa; Kaminuma, Eli; Nakamura, Yasukazu

2013-08-01

High-performance next-generation sequencing (NGS) technologies are advancing genomics and molecular biological research. However, the immense amount of sequence data requires computational skills and suitable hardware resources that are a challenge to molecular biologists. The DNA Data Bank of Japan (DDBJ) of the National Institute of Genetics (NIG) has initiated a cloud computing-based analytical pipeline, the DDBJ Read Annotation Pipeline (DDBJ Pipeline), for a high-throughput annotation of NGS reads. The DDBJ Pipeline offers a user-friendly graphical web interface and processes massive NGS datasets using decentralized processing by NIG supercomputers currently free of charge. The proposed pipeline consists of two analysis components: basic analysis for reference genome mapping and de novo assembly and subsequent high-level analysis of structural and functional annotations. Users may smoothly switch between the two components in the pipeline, facilitating web-based operations on a supercomputer for high-throughput data analysis. Moreover, public NGS reads of the DDBJ Sequence Read Archive located on the same supercomputer can be imported into the pipeline through the input of only an accession number. This proposed pipeline will facilitate research by utilizing unified analytical workflows applied to the NGS data. The DDBJ Pipeline is accessible at http://p.ddbj.nig.ac.jp/.
DDBJ Read Annotation Pipeline: A Cloud Computing-Based Pipeline for High-Throughput Analysis of Next-Generation Sequencing Data

PubMed Central

Nagasaki, Hideki; Mochizuki, Takako; Kodama, Yuichi; Saruhashi, Satoshi; Morizaki, Shota; Sugawara, Hideaki; Ohyanagi, Hajime; Kurata, Nori; Okubo, Kousaku; Takagi, Toshihisa; Kaminuma, Eli; Nakamura, Yasukazu

2013-01-01

High-performance next-generation sequencing (NGS) technologies are advancing genomics and molecular biological research. However, the immense amount of sequence data requires computational skills and suitable hardware resources that are a challenge to molecular biologists. The DNA Data Bank of Japan (DDBJ) of the National Institute of Genetics (NIG) has initiated a cloud computing-based analytical pipeline, the DDBJ Read Annotation Pipeline (DDBJ Pipeline), for a high-throughput annotation of NGS reads. The DDBJ Pipeline offers a user-friendly graphical web interface and processes massive NGS datasets using decentralized processing by NIG supercomputers currently free of charge. The proposed pipeline consists of two analysis components: basic analysis for reference genome mapping and de novo assembly and subsequent high-level analysis of structural and functional annotations. Users may smoothly switch between the two components in the pipeline, facilitating web-based operations on a supercomputer for high-throughput data analysis. Moreover, public NGS reads of the DDBJ Sequence Read Archive located on the same supercomputer can be imported into the pipeline through the input of only an accession number. This proposed pipeline will facilitate research by utilizing unified analytical workflows applied to the NGS data. The DDBJ Pipeline is accessible at http://p.ddbj.nig.ac.jp/. PMID:23657089
Chemistry Modeling for Aerothermodynamics and TPS

NASA Technical Reports Server (NTRS)

Wang, Dunyou; Stallcop, James R.; Dateo, Christopher e.; Schwenke, David W.; Halicioglu, Timur; Huo, winifred M.

2005-01-01

Recent advances in supercomputers and highly scalable quantum chemistry software render computational chemistry methods a viable means of providing chemistry data for aerothermal analysis at a specific level of confidence. Four examples of first principles quantum chemistry calculations will be presented. Study of the highly nonequilibrium rotational distribution of a nitrogen molecule from the exchange reaction N + N2 illustrates how chemical reactions can influence rotational distribution. The reaction C2H + H2 is one example of a radical reaction that occurs during hypersonic entry into an atmosphere containing methane. A study of the etching of a Si surface illustrates our approach to surface reactions. A recently developed web accessible database and software tool (DDD) that provides the radiation profile of diatomic molecules is also described.
Chemistry Modeling for Aerothermodynamics and TPS

NASA Technical Reports Server (NTRS)

Wang, Dun-You; Stallcop, James R.; Dateo, Christopher E.; Schwenke, David W.; Haliciogiu, Timur; Huo, Winifred

2004-01-01

Recent advances in supercomputers and highly scalable quantum chemistry software render computational chemistry methods a viable means of providing chemistry data for aerothermal analysis at a specific level of confidence. Four examples of first principles quantum chemistry calculations will be presented. The study of the highly nonequilibrium rotational distribution of nitrogen molecule from the exchange reaction N + N2 illustrates how chemical reactions can influence the rotational distribution. The reaction C2H + H2 is one example of a radical reaction that occurs during hypersonic entry into a methane containing atmosphere. A study of the etching of Si surface illustrates our approach to surface reactions. A recently developed web accessible database and software tool (DDD) that provides the radiation profile of diatomic molecules is also described.

Research in Computational Astrobiology

NASA Technical Reports Server (NTRS)

Chaban, Galina; Jaffe, Richard; Liang, Shoudan; New, Michael H.; Pohorille, Andrew; Wilson, Michael A.

2002-01-01

We present results from several projects in the new field of computational astrobiology, which is devoted to advancing our understanding of the origin, evolution and distribution of life in the Universe using theoretical and computational tools. We have developed a procedure for calculating long-range effects in molecular dynamics using a plane wave expansion of the electrostatic potential. This method is expected to be highly efficient for simulating biological systems on massively parallel supercomputers. We have perform genomics analysis on a family of actin binding proteins. We have performed quantum mechanical calculations on carbon nanotubes and nucleic acids, which simulations will allow us to investigate possible sources of organic material on the early earth. Finally, we have developed a model of protobiological chemistry using neural networks.
The role of graphics super-workstations in a supercomputing environment

NASA Technical Reports Server (NTRS)

Levin, E.

1989-01-01

A new class of very powerful workstations has recently become available which integrate near supercomputer computational performance with very powerful and high quality graphics capability. These graphics super-workstations are expected to play an increasingly important role in providing an enhanced environment for supercomputer users. Their potential uses include: off-loading the supercomputer (by serving as stand-alone processors, by post-processing of the output of supercomputer calculations, and by distributed or shared processing), scientific visualization (understanding of results, communication of results), and by real time interaction with the supercomputer (to steer an iterative computation, to abort a bad run, or to explore and develop new algorithms).
Heart Fibrillation and Parallel Supercomputers

NASA Technical Reports Server (NTRS)

Kogan, B. Y.; Karplus, W. J.; Chudin, E. E.

1997-01-01

The Luo and Rudy 3 cardiac cell mathematical model is implemented on the parallel supercomputer CRAY - T3D. The splitting algorithm combined with variable time step and an explicit method of integration provide reasonable solution times and almost perfect scaling for rectilinear wave propagation. The computer simulation makes it possible to observe new phenomena: the break-up of spiral waves caused by intracellular calcium and dynamics and the non-uniformity of the calcium distribution in space during the onset of the spiral wave.
48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.

Code of Federal Regulations, 2010 CFR

2010-10-01

... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...
48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.

Code of Federal Regulations, 2014 CFR

2014-10-01

... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...
48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.

Code of Federal Regulations, 2012 CFR

2012-10-01

... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...
48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.

Code of Federal Regulations, 2013 CFR

2013-10-01

... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...
48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.

Code of Federal Regulations, 2011 CFR

2011-10-01

... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...
Data-intensive computing on numerically-insensitive supercomputers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ahrens, James P; Fasel, Patricia K; Habib, Salman

2010-12-03

With the advent of the era of petascale supercomputing, via the delivery of the Roadrunner supercomputing platform at Los Alamos National Laboratory, there is a pressing need to address the problem of visualizing massive petascale-sized results. In this presentation, I discuss progress on a number of approaches including in-situ analysis, multi-resolution out-of-core streaming and interactive rendering on the supercomputing platform. These approaches are placed in context by the emerging area of data-intensive supercomputing.
Computer Electromagnetics and Supercomputer Architecture

NASA Technical Reports Server (NTRS)

Cwik, Tom

1993-01-01

The dramatic increase in performance over the last decade for microporcessor computations is compared with that for the supercomputer computations. This performance, the projected performance, and a number of other issues such as cost and the inherent pysical limitations in curent supercomputer technology have naturally led to parallel supercomputers and ensemble of interconnected microprocessors.
History of the numerical aerodynamic simulation program

NASA Technical Reports Server (NTRS)

Peterson, Victor L.; Ballhaus, William F., Jr.

1987-01-01

The Numerical Aerodynamic Simulation (NAS) program has reached a milestone with the completion of the initial operating configuration of the NAS Processing System Network. This achievement is the first major milestone in the continuing effort to provide a state-of-the-art supercomputer facility for the national aerospace community and to serve as a pathfinder for the development and use of future supercomputer systems. The underlying factors that motivated the initiation of the program are first identified and then discussed. These include the emergence and evolution of computational aerodynamics as a powerful new capability in aerodynamics research and development, the computer power required for advances in the discipline, the complementary nature of computation and wind tunnel testing, and the need for the government to play a pathfinding role in the development and use of large-scale scientific computing systems. Finally, the history of the NAS program is traced from its inception in 1975 to the present time.
Impact of the Columbia Supercomputer on NASA Space and Exploration Mission

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Kwak, Dochan; Kiris, Cetin; Lawrence, Scott

2006-01-01

NASA's 10,240-processor Columbia supercomputer gained worldwide recognition in 2004 for increasing the space agency's computing capability ten-fold, and enabling U.S. scientists and engineers to perform significant, breakthrough simulations. Columbia has amply demonstrated its capability to accelerate NASA's key missions, including space operations, exploration systems, science, and aeronautics. Columbia is part of an integrated high-end computing (HEC) environment comprised of massive storage and archive systems, high-speed networking, high-fidelity modeling and simulation tools, application performance optimization, and advanced data analysis and visualization. In this paper, we illustrate the impact Columbia is having on NASA's numerous space and exploration applications, such as the development of the Crew Exploration and Launch Vehicles (CEV/CLV), effects of long-duration human presence in space, and damage assessment and repair recommendations for remaining shuttle flights. We conclude by discussing HEC challenges that must be overcome to solve space-related science problems in the future.
The advanced role of computational mechanics and visualization in science and technology: analysis of the Germanwings Flight 9525 crash

NASA Astrophysics Data System (ADS)

Chen, Goong; Wang, Yi-Ching; Perronnet, Alain; Gu, Cong; Yao, Pengfei; Bin-Mohsin, Bandar; Hajaiej, Hichem; Scully, Marlan O.

2017-03-01

Computational mathematics, physics and engineering form a major constituent of modern computational science, which now stands on an equal footing with the established branches of theoretical and experimental sciences. Computational mechanics solves problems in science and engineering based upon mathematical modeling and computing, bypassing the need for expensive and time-consuming laboratory setups and experimental measurements. Furthermore, it allows the numerical simulations of large scale systems, such as the formation of galaxies that could not be done in any earth bound laboratories. This article is written as part of the 21st Century Frontiers Series to illustrate some state-of-the-art computational science. We emphasize how to do numerical modeling and visualization in the study of a contemporary event, the pulverizing crash of the Germanwings Flight 9525 on March 24, 2015, as a showcase. Such numerical modeling and the ensuing simulation of aircraft crashes into land or mountain are complex tasks as they involve both theoretical study and supercomputing of a complex physical system. The most tragic type of crash involves ‘pulverization’ such as the one suffered by this Germanwings flight. Here, we show pulverizing airliner crashes by visualization through video animations from supercomputer applications of the numerical modeling tool LS-DYNA. A sound validation process is challenging but essential for any sophisticated calculations. We achieve this by validation against the experimental data from a crash test done in 1993 of an F4 Phantom II fighter jet into a wall. We have developed a method by hybridizing two primary methods: finite element analysis and smoothed particle hydrodynamics. This hybrid method also enhances visualization by showing a ‘debris cloud’. Based on our supercomputer simulations and the visualization, we point out that prior works on this topic based on ‘hollow interior’ modeling can be quite problematic and, thus, not likely to be correct. We discuss the effects of terrain on pulverization using the information from the recovered flight-data-recorder and show our forensics and assessments of what may have happened during the final moments of the crash. Finally, we point out that our study has potential for being made into real-time flight crash simulators to help the study of crashworthiness and survivability for future aviation safety. Some forward-looking statements are also made.
A report documenting the completion of the Los Alamos National Laboratory portion of the ASC level II milestone ""Visualization on the supercomputing platform

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ahrens, James P; Patchett, John M; Lo, Li - Ta

2011-01-24

This report provides documentation for the completion of the Los Alamos portion of the ASC Level II 'Visualization on the Supercomputing Platform' milestone. This ASC Level II milestone is a joint milestone between Sandia National Laboratory and Los Alamos National Laboratory. The milestone text is shown in Figure 1 with the Los Alamos portions highlighted in boldfaced text. Visualization and analysis of petascale data is limited by several factors which must be addressed as ACES delivers the Cielo platform. Two primary difficulties are: (1) Performance of interactive rendering, which is the most computationally intensive portion of the visualization process. Formore » terascale platforms, commodity clusters with graphics processors (GPUs) have been used for interactive rendering. For petascale platforms, visualization and rendering may be able to run efficiently on the supercomputer platform itself. (2) I/O bandwidth, which limits how much information can be written to disk. If we simply analyze the sparse information that is saved to disk we miss the opportunity to analyze the rich information produced every timestep by the simulation. For the first issue, we are pursuing in-situ analysis, in which simulations are coupled directly with analysis libraries at runtime. This milestone will evaluate the visualization and rendering performance of current and next generation supercomputers in contrast to GPU-based visualization clusters, and evaluate the perfromance of common analysis libraries coupled with the simulation that analyze and write data to disk during a running simulation. This milestone will explore, evaluate and advance the maturity level of these technologies and their applicability to problems of interest to the ASC program. In conclusion, we improved CPU-based rendering performance by a a factor of 2-10 times on our tests. In addition, we evaluated CPU and CPU-based rendering performance. We encourage production visualization experts to consider using CPU-based rendering solutions when it is appropriate. For example, on remote supercomputers CPU-based rendering can offer a means of viewing data without having to offload the data or geometry onto a CPU-based visualization system. In terms of comparative performance of the CPU and CPU we believe that further optimizations of the performance of both CPU or CPU-based rendering are possible. The simulation community is currently confronting this reality as they work to port their simulations to different hardware architectures. What is interesting about CPU rendering of massive datasets is that for part two decades CPU performance has significantly outperformed CPU-based systems. Based on our advancements, evaluations and explorations we believe that CPU-based rendering has returned as one viable option for the visualization of massive datasets.« less
GPU acceleration of particle-in-cell methods

NASA Astrophysics Data System (ADS)

Cowan, Benjamin; Cary, John; Meiser, Dominic

2015-11-01

Graphics processing units (GPUs) have become key components in many supercomputing systems, as they can provide more computations relative to their cost and power consumption than conventional processors. However, to take full advantage of this capability, they require a strict programming model which involves single-instruction multiple-data execution as well as significant constraints on memory accesses. To bring the full power of GPUs to bear on plasma physics problems, we must adapt the computational methods to this new programming model. We have developed a GPU implementation of the particle-in-cell (PIC) method, one of the mainstays of plasma physics simulation. This framework is highly general and enables advanced PIC features such as high order particles and absorbing boundary conditions. The main elements of the PIC loop, including field interpolation and particle deposition, are designed to optimize memory access. We describe the performance of these algorithms and discuss some of the methods used. Work supported by DARPA contract W31P4Q-15-C-0061 (SBIR).
Efficient development of memory bounded geo-applications to scale on modern supercomputers

NASA Astrophysics Data System (ADS)

Räss, Ludovic; Omlin, Samuel; Licul, Aleksandar; Podladchikov, Yuri; Herman, Frédéric

2016-04-01

Numerical modeling is an actual key tool in the area of geosciences. The current challenge is to solve problems that are multi-physics and for which the length scale and the place of occurrence might not be known in advance. Also, the spatial extend of the investigated domain might strongly vary in size, ranging from millimeters for reactive transport to kilometers for glacier erosion dynamics. An efficient way to proceed is to develop simple but robust algorithms that perform well and scale on modern supercomputers and permit therefore very high-resolution simulations. We propose an efficient approach to solve memory bounded real-world applications on modern supercomputers architectures. We optimize the software to run on our newly acquired state-of-the-art GPU cluster "octopus". Our approach shows promising preliminary results on important geodynamical and geomechanical problematics: we have developed a Stokes solver for glacier flow and a poromechanical solver including complex rheologies for nonlinear waves in stressed rocks porous rocks. We solve the system of partial differential equations on a regular Cartesian grid and use an iterative finite difference scheme with preconditioning of the residuals. The MPI communication happens only locally (point-to-point); this method is known to scale linearly by construction. The "octopus" GPU cluster, which we use for the computations, has been designed to achieve maximal data transfer throughput at minimal hardware cost. It is composed of twenty compute nodes, each hosting four Nvidia Titan X GPU accelerators. These high-density nodes are interconnected with a parallel (dual-rail) FDR InfiniBand network. Our efforts show promising preliminary results for the different physics investigated. The glacier flow solver achieves good accuracy in the relevant benchmarks and the coupled poromechanical solver permits to explain previously unresolvable focused fluid flow as a natural outcome of the porosity setup. In both cases, near peak memory bandwidth transfer is achieved. Our approach allows us to get the best out of the current hardware.
ParaBTM: A Parallel Processing Framework for Biomedical Text Mining on Supercomputers.

PubMed

Xing, Yuting; Wu, Chengkun; Yang, Xi; Wang, Wei; Zhu, En; Yin, Jianping

2018-04-27

A prevailing way of extracting valuable information from biomedical literature is to apply text mining methods on unstructured texts. However, the massive amount of literature that needs to be analyzed poses a big data challenge to the processing efficiency of text mining. In this paper, we address this challenge by introducing parallel processing on a supercomputer. We developed paraBTM, a runnable framework that enables parallel text mining on the Tianhe-2 supercomputer. It employs a low-cost yet effective load balancing strategy to maximize the efficiency of parallel processing. We evaluated the performance of paraBTM on several datasets, utilizing three types of named entity recognition tasks as demonstration. Results show that, in most cases, the processing efficiency can be greatly improved with parallel processing, and the proposed load balancing strategy is simple and effective. In addition, our framework can be readily applied to other tasks of biomedical text mining besides NER.
NAS: The first year

NASA Technical Reports Server (NTRS)

Bailey, F. R.; Kutler, Paul

1988-01-01

Discussed are the capabilities of NASA's Numerical Aerodynamic Simulation (NAS) Program and its application as an advanced supercomputing system for computational fluid dynamics (CFD) research. First, the paper describes the NAS computational system, called the NAS Processing System Network, and the advanced computational capabilities it offers as a consequence of carrying out the NAS pathfinder objective. Second, it presents examples of pioneering CFD research accomplished during NAS's first operational year. Examples are included which illustrate CFD applications for predicting fluid phenomena, complementing and supplementing experimentation, and aiding in design. Finally, pacing elements and future directions for CFD and NAS are discussed.
Multi-GPU accelerated three-dimensional FDTD method for electromagnetic simulation.

PubMed

Nagaoka, Tomoaki; Watanabe, Soichi

2011-01-01

Numerical simulation with a numerical human model using the finite-difference time domain (FDTD) method has recently been performed in a number of fields in biomedical engineering. To improve the method's calculation speed and realize large-scale computing with the numerical human model, we adapt three-dimensional FDTD code to a multi-GPU environment using Compute Unified Device Architecture (CUDA). In this study, we used NVIDIA Tesla C2070 as GPGPU boards. The performance of multi-GPU is evaluated in comparison with that of a single GPU and vector supercomputer. The calculation speed with four GPUs was approximately 3.5 times faster than with a single GPU, and was slightly (approx. 1.3 times) slower than with the supercomputer. Calculation speed of the three-dimensional FDTD method using GPUs can significantly improve with an expanding number of GPUs.
48 CFR 225.7012 - Restriction on supercomputers.

Code of Federal Regulations, 2014 CFR

2014-10-01

... 48 Federal Acquisition Regulations System 3 2014-10-01 2014-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...

48 CFR 225.7012 - Restriction on supercomputers.

Code of Federal Regulations, 2010 CFR

2010-10-01

... 48 Federal Acquisition Regulations System 3 2010-10-01 2010-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...
48 CFR 225.7012 - Restriction on supercomputers.

Code of Federal Regulations, 2013 CFR

2013-10-01

... 48 Federal Acquisition Regulations System 3 2013-10-01 2013-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...
48 CFR 225.7012 - Restriction on supercomputers.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 48 Federal Acquisition Regulations System 3 2011-10-01 2011-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...
48 CFR 225.7012 - Restriction on supercomputers.

Code of Federal Regulations, 2012 CFR

2012-10-01

... 48 Federal Acquisition Regulations System 3 2012-10-01 2012-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...
Implementation of the NAS Parallel Benchmarks in Java

NASA Technical Reports Server (NTRS)

Frumkin, Michael A.; Schultz, Matthew; Jin, Haoqiang; Yan, Jerry; Biegel, Bryan (Technical Monitor)

2002-01-01

Several features make Java an attractive choice for High Performance Computing (HPC). In order to gauge the applicability of Java to Computational Fluid Dynamics (CFD), we have implemented the NAS (NASA Advanced Supercomputing) Parallel Benchmarks in Java. The performance and scalability of the benchmarks point out the areas where improvement in Java compiler technology and in Java thread implementation would position Java closer to Fortran in the competition for CFD applications.
Performance and Scalability of the NAS Parallel Benchmarks in Java

NASA Technical Reports Server (NTRS)

Frumkin, Michael A.; Schultz, Matthew; Jin, Haoqiang; Yan, Jerry; Biegel, Bryan A. (Technical Monitor)

2002-01-01

Several features make Java an attractive choice for scientific applications. In order to gauge the applicability of Java to Computational Fluid Dynamics (CFD), we have implemented the NAS (NASA Advanced Supercomputing) Parallel Benchmarks in Java. The performance and scalability of the benchmarks point out the areas where improvement in Java compiler technology and in Java thread implementation would position Java closer to Fortran in the competition for scientific applications.
Science & Technology Review September 2006

DOE Office of Scientific and Technical Information (OSTI.GOV)

Radousky, H B

2006-07-18

This month's article has the following articles: (1) Simulations Help Plan for Large Earthquakes--Commentary by Jane C. S. Long; (2) Re-creating the 1906 San Francisco Earthquake--Supercomputer simulations of Bay Area earthquakes are providing insight into the great 1906 quake and future temblors along several faults; (3) Decoding the Origin of a Bioagent--The microstructure of a bacterial organism can be linked to the methods used to formulate the pathogen; (4) A New Look at How Aging Bones Fracture--Livermore scientists find that the increased risk of fracture from osteoporosis may be due to a change in the physical structure of trabecular bone;more » and (5) Fusion Targets on the Double--Advances in precision manufacturing allow the production of double-shell fusion targets with submicrometer tolerances.« less
Automatic discovery of the communication network topology for building a supercomputer model

NASA Astrophysics Data System (ADS)

Sobolev, Sergey; Stefanov, Konstantin; Voevodin, Vadim

2016-10-01

The Research Computing Center of Lomonosov Moscow State University is developing the Octotron software suite for automatic monitoring and mitigation of emergency situations in supercomputers so as to maximize hardware reliability. The suite is based on a software model of the supercomputer. The model uses a graph to describe the computing system components and their interconnections. One of the most complex components of a supercomputer that needs to be included in the model is its communication network. This work describes the proposed approach for automatically discovering the Ethernet communication network topology in a supercomputer and its description in terms of the Octotron model. This suite automatically detects computing nodes and switches, collects information about them and identifies their interconnections. The application of this approach is demonstrated on the "Lomonosov" and "Lomonosov-2" supercomputers.
TOP500 Supercomputers for June 2004

DOE Office of Scientific and Technical Information (OSTI.GOV)

Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack

2004-06-23

23rd Edition of TOP500 List of World's Fastest Supercomputers Released: Japan's Earth Simulator Enters Third Year in Top Position MANNHEIM, Germany; KNOXVILLE, Tenn.;&BERKELEY, Calif. In what has become a closely watched event in the world of high-performance computing, the 23rd edition of the TOP500 list of the world's fastest supercomputers was released today (June 23, 2004) at the International Supercomputer Conference in Heidelberg, Germany.
Automotive applications of superconductors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ginsberg, M.

1987-01-01

These proceedings compile papers on supercomputers in the automobile industry. Titles include: An automotive engineer's guide to the effective use of scalar, vector, and parallel computers; fluid mechanics, finite elements, and supercomputers; and Automotive crashworthiness performance on a supercomputer.
Affordable and accurate large-scale hybrid-functional calculations on GPU-accelerated supercomputers

NASA Astrophysics Data System (ADS)

Ratcliff, Laura E.; Degomme, A.; Flores-Livas, José A.; Goedecker, Stefan; Genovese, Luigi

2018-03-01

Performing high accuracy hybrid functional calculations for condensed matter systems containing a large number of atoms is at present computationally very demanding or even out of reach if high quality basis sets are used. We present a highly optimized multiple graphics processing unit implementation of the exact exchange operator which allows one to perform fast hybrid functional density-functional theory (DFT) calculations with systematic basis sets without additional approximations for up to a thousand atoms. With this method hybrid DFT calculations of high quality become accessible on state-of-the-art supercomputers within a time-to-solution that is of the same order of magnitude as traditional semilocal-GGA functionals. The method is implemented in a portable open-source library.
Improved Access to Supercomputers Boosts Chemical Applications.

ERIC Educational Resources Information Center

Borman, Stu

1989-01-01

Supercomputing is described in terms of computing power and abilities. The increase in availability of supercomputers for use in chemical calculations and modeling are reported. Efforts of the National Science Foundation and Cray Research are highlighted. (CW)
Scientific Visualization in High Speed Network Environments

NASA Technical Reports Server (NTRS)

Vaziri, Arsi; Kutler, Paul (Technical Monitor)

1997-01-01

In several cases, new visualization techniques have vastly increased the researcher's ability to analyze and comprehend data. Similarly, the role of networks in providing an efficient supercomputing environment have become more critical and continue to grow at a faster rate than the increase in the processing capabilities of supercomputers. A close relationship between scientific visualization and high-speed networks in providing an important link to support efficient supercomputing is identified. The two technologies are driven by the increasing complexities and volume of supercomputer data. The interaction of scientific visualization and high-speed networks in a Computational Fluid Dynamics simulation/visualization environment are given. Current capabilities supported by high speed networks, supercomputers, and high-performance graphics workstations at the Numerical Aerodynamic Simulation Facility (NAS) at NASA Ames Research Center are described. Applied research in providing a supercomputer visualization environment to support future computational requirements are summarized.
Automation of a Wave-Optics Simulation and Image Post-Processing Package on Riptide

NASA Astrophysics Data System (ADS)

Werth, M.; Lucas, J.; Thompson, D.; Abercrombie, M.; Holmes, R.; Roggemann, M.

Detailed wave-optics simulations and image post-processing algorithms are computationally expensive and benefit from the massively parallel hardware available at supercomputing facilities. We created an automated system that interfaces with the Maui High Performance Computing Center (MHPCC) Distributed MATLAB® Portal interface to submit massively parallel waveoptics simulations to the IBM iDataPlex (Riptide) supercomputer. This system subsequently postprocesses the output images with an improved version of physically constrained iterative deconvolution (PCID) and analyzes the results using a series of modular algorithms written in Python. With this architecture, a single person can simulate thousands of unique scenarios and produce analyzed, archived, and briefing-compatible output products with very little effort. This research was developed with funding from the Defense Advanced Research Projects Agency (DARPA). The views, opinions, and/or findings expressed are those of the author(s) and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.
Space Transportation and the Computer Industry: Learning from the Past

NASA Technical Reports Server (NTRS)

Merriam, M. L.; Rasky, D.

2002-01-01

Since the space shuttle began flying in 1981, NASA has made a number of attempts to advance the state of the art in space transportation. In spite of billions of dollars invested, and several concerted attempts, no replacement for the shuttle is expected before 2010. Furthermore, the cost of access to space has dropped very slowly over the last two decades. On the other hand, the same two decades have seen dramatic progress in the computer industry. Computational speeds have increased by about a factor of 1000 and available memory, disk space, and network bandwidth has seen similar increases. At the same time, the cost of computing has dropped by about a factor of 10000. Is the space transportation problem simply harder? Or is there something to be learned from the computer industry? In looking for the answers, this paper reviews the early history of NASA's experience with supercomputers and NASA's visionary course change in supercomputer procurement strategy.
Benchmarking and tuning the MILC code on clusters and supercomputers

NASA Astrophysics Data System (ADS)

Gottlieb, Steven

2002-03-01

Recently, we have benchmarked and tuned the MILC code on a number of architectures including Intel Itanium and Pentium IV (PIV), dual-CPU Athlon, and the latest Compaq Alpha nodes. Results will be presented for many of these, and we shall discuss some simple code changes that can result in a very dramatic speedup of the KS conjugate gradient on processors with more advanced memory systems such as PIV, IBM SP and Alpha.
Benchmarking and tuning the MILC code on clusters and supercomputers

NASA Astrophysics Data System (ADS)

Gottlieb, Steven

Recently, we have benchmarked and tuned the MILC code on a number of architectures including Intel Itanium and Pentium IV (PIV), dual-CPU Athlon, and the latest Compaq Alpha nodes. Results will be presented for many of these, and we shall discuss some simple code changes that can result in a very dramatic speedup of the KS conjugate gradient on processors with more advanced memory systems such as PIV, IBM SP and Alpha.
Towards Efficient Supercomputing: Searching for the Right Efficiency Metric

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hsu, Chung-Hsing; Kuehn, Jeffery A; Poole, Stephen W

2012-01-01

The efficiency of supercomputing has traditionally been in the execution time. In early 2000 s, the concept of total cost of ownership was re-introduced, with the introduction of efficiency measure to include aspects such as energy and space. Yet the supercomputing community has never agreed upon a metric that can cover these aspects altogether and also provide a fair basis for comparison. This paper exam- ines the metrics that have been proposed in the past decade, and proposes a vector-valued metric for efficient supercom- puting. Using this metric, the paper presents a study of where the supercomputing industry has beenmore » and how it stands today with respect to efficient supercomputing.« less
Ultrahigh-order Maxwell solver with extreme scalability for electromagnetic PIC simulations of plasmas

NASA Astrophysics Data System (ADS)

Vincenti, Henri; Vay, Jean-Luc

2018-07-01

The advent of massively parallel supercomputers, with their distributed-memory technology using many processing units, has favored the development of highly-scalable local low-order solvers at the expense of harder-to-scale global very high-order spectral methods. Indeed, FFT-based methods, which were very popular on shared memory computers, have been largely replaced by finite-difference (FD) methods for the solution of many problems, including plasmas simulations with electromagnetic Particle-In-Cell methods. For some problems, such as the modeling of so-called "plasma mirrors" for the generation of high-energy particles and ultra-short radiations, we have shown that the inaccuracies of standard FD-based PIC methods prevent the modeling on present supercomputers at sufficient accuracy. We demonstrate here that a new method, based on the use of local FFTs, enables ultrahigh-order accuracy with unprecedented scalability, and thus for the first time the accurate modeling of plasma mirrors in 3D.
OpenMP Performance on the Columbia Supercomputer

NASA Technical Reports Server (NTRS)

Haoqiang, Jin; Hood, Robert

2005-01-01

This presentation discusses Columbia World Class Supercomputer which is one of the world's fastest supercomputers providing 61 TFLOPs (10/20/04). Conceived, designed, built, and deployed in just 120 days. A 20-node supercomputer built on proven 512-processor nodes. The largest SGI system in the world with over 10,000 Intel Itanium 2 processors and provides the largest node size incorporating commodity parts (512) and the largest shared-memory environment (2048) with 88% efficiency tops the scalar systems on the Top500 list.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bailey, David H.

The NAS Parallel Benchmarks (NPB) are a suite of parallel computer performance benchmarks. They were originally developed at the NASA Ames Research Center in 1991 to assess high-end parallel supercomputers. Although they are no longer used as widely as they once were for comparing high-end system performance, they continue to be studied and analyzed a great deal in the high-performance computing community. The acronym 'NAS' originally stood for the Numerical Aeronautical Simulation Program at NASA Ames. The name of this organization was subsequently changed to the Numerical Aerospace Simulation Program, and more recently to the NASA Advanced Supercomputing Center, althoughmore » the acronym remains 'NAS.' The developers of the original NPB suite were David H. Bailey, Eric Barszcz, John Barton, David Browning, Russell Carter, LeoDagum, Rod Fatoohi, Samuel Fineberg, Paul Frederickson, Thomas Lasinski, Rob Schreiber, Horst Simon, V. Venkatakrishnan and Sisira Weeratunga. The original NAS Parallel Benchmarks consisted of eight individual benchmark problems, each of which focused on some aspect of scientific computing. The principal focus was in computational aerophysics, although most of these benchmarks have much broader relevance, since in a much larger sense they are typical of many real-world scientific computing applications. The NPB suite grew out of the need for a more rational procedure to select new supercomputers for acquisition by NASA. The emergence of commercially available highly parallel computer systems in the late 1980s offered an attractive alternative to parallel vector supercomputers that had been the mainstay of high-end scientific computing. However, the introduction of highly parallel systems was accompanied by a regrettable level of hype, not only on the part of the commercial vendors but even, in some cases, by scientists using the systems. As a result, it was difficult to discern whether the new systems offered any fundamental performance advantage over vector supercomputers, and, if so, which of the parallel offerings would be most useful in real-world scientific computation. In part to draw attention to some of the performance reporting abuses prevalent at the time, the present author wrote a humorous essay 'Twelve Ways to Fool the Masses,' which described in a light-hearted way a number of the questionable ways in which both vendor marketing people and scientists were inflating and distorting their performance results. All of this underscored the need for an objective and scientifically defensible measure to compare performance on these systems.« less
Group-based variant calling leveraging next-generation supercomputing for large-scale whole-genome sequencing studies.

PubMed

Standish, Kristopher A; Carland, Tristan M; Lockwood, Glenn K; Pfeiffer, Wayne; Tatineni, Mahidhar; Huang, C Chris; Lamberth, Sarah; Cherkas, Yauheniya; Brodmerkel, Carrie; Jaeger, Ed; Smith, Lance; Rajagopal, Gunaretnam; Curran, Mark E; Schork, Nicholas J

2015-09-22

Next-generation sequencing (NGS) technologies have become much more efficient, allowing whole human genomes to be sequenced faster and cheaper than ever before. However, processing the raw sequence reads associated with NGS technologies requires care and sophistication in order to draw compelling inferences about phenotypic consequences of variation in human genomes. It has been shown that different approaches to variant calling from NGS data can lead to different conclusions. Ensuring appropriate accuracy and quality in variant calling can come at a computational cost. We describe our experience implementing and evaluating a group-based approach to calling variants on large numbers of whole human genomes. We explore the influence of many factors that may impact the accuracy and efficiency of group-based variant calling, including group size, the biogeographical backgrounds of the individuals who have been sequenced, and the computing environment used. We make efficient use of the Gordon supercomputer cluster at the San Diego Supercomputer Center by incorporating job-packing and parallelization considerations into our workflow while calling variants on 437 whole human genomes generated as part of large association study. We ultimately find that our workflow resulted in high-quality variant calls in a computationally efficient manner. We argue that studies like ours should motivate further investigations combining hardware-oriented advances in computing systems with algorithmic developments to tackle emerging 'big data' problems in biomedical research brought on by the expansion of NGS technologies.
Quantum lattice model solver HΦ

NASA Astrophysics Data System (ADS)

Kawamura, Mitsuaki; Yoshimi, Kazuyoshi; Misawa, Takahiro; Yamaji, Youhei; Todo, Synge; Kawashima, Naoki

2017-08-01

HΦ [aitch-phi ] is a program package based on the Lanczos-type eigenvalue solution applicable to a broad range of quantum lattice models, i.e., arbitrary quantum lattice models with two-body interactions, including the Heisenberg model, the Kitaev model, the Hubbard model and the Kondo-lattice model. While it works well on PCs and PC-clusters, HΦ also runs efficiently on massively parallel computers, which considerably extends the tractable range of the system size. In addition, unlike most existing packages, HΦ supports finite-temperature calculations through the method of thermal pure quantum (TPQ) states. In this paper, we explain theoretical background and user-interface of HΦ. We also show the benchmark results of HΦ on supercomputers such as the K computer at RIKEN Advanced Institute for Computational Science (AICS) and SGI ICE XA (Sekirei) at the Institute for the Solid State Physics (ISSP).
Multi-processing on supercomputers for computational aerodynamics

NASA Technical Reports Server (NTRS)

Yarrow, Maurice; Mehta, Unmeel B.

1990-01-01

The MIMD concept is applied, through multitasking, with relatively minor modifications to an existing code for a single processor. This approach maps the available memory to multiple processors, exploiting the C-FORTRAN-Unix interface. An existing single processor algorithm is mapped without the need for developing a new algorithm. The procedure of designing a code utilizing this approach is automated with the Unix stream editor. A Multiple Processor Multiple Grid (MPMG) code is developed as a demonstration of this approach. This code solves the three-dimensional, Reynolds-averaged, thin-layer and slender-layer Navier-Stokes equations with an implicit, approximately factored and diagonalized method. This solver is applied to a generic, oblique-wing aircraft problem on a four-processor computer using one process for data management and nonparallel computations and three processes for pseudotime advance on three different grid systems.
The design and implementation of cost-effective algorithms for direct solution of banded linear systems on the vector processor system 32 supercomputer

NASA Technical Reports Server (NTRS)

Samba, A. S.

1985-01-01

The problem of solving banded linear systems by direct (non-iterative) techniques on the Vector Processor System (VPS) 32 supercomputer is considered. Two efficient direct methods for solving banded linear systems on the VPS 32 are described. The vector cyclic reduction (VCR) algorithm is discussed in detail. The performance of the VCR on a three parameter model problem is also illustrated. The VCR is an adaptation of the conventional point cyclic reduction algorithm. The second direct method is the Customized Reduction of Augmented Triangles' (CRAT). CRAT has the dominant characteristics of an efficient VPS 32 algorithm. CRAT is tailored to the pipeline architecture of the VPS 32 and as a consequence the algorithm is implicitly vectorizable.
Earth System Grid II, Turning Climate Datasets into Community Resources

DOE Office of Scientific and Technical Information (OSTI.GOV)

Middleton, Don

2006-08-01

The Earth System Grid (ESG) II project, funded by the Department of Energy’s Scientific Discovery through Advanced Computing program, has transformed climate data into community resources. ESG II has accomplished this goal by creating a virtual collaborative environment that links climate centers and users around the world to models and data via a computing Grid, which is based on the Department of Energy’s supercomputing resources and the Internet. Our project’s success stems from partnerships between climate researchers and computer scientists to advance basic and applied research in the terrestrial, atmospheric, and oceanic sciences. By interfacing with other climate science projects,more » we have learned that commonly used methods to manage and remotely distribute data among related groups lack infrastructure and under-utilize existing technologies. Knowledge and expertise gained from ESG II have helped the climate community plan strategies to manage a rapidly growing data environment more effectively. Moreover, approaches and technologies developed under the ESG project have impacted datasimulation integration in other disciplines, such as astrophysics, molecular biology and materials science.« less
Krylov subspace methods on supercomputers

NASA Technical Reports Server (NTRS)

Saad, Youcef

1988-01-01

A short survey of recent research on Krylov subspace methods with emphasis on implementation on vector and parallel computers is presented. Conjugate gradient methods have proven very useful on traditional scalar computers, and their popularity is likely to increase as three-dimensional models gain importance. A conservative approach to derive effective iterative techniques for supercomputers has been to find efficient parallel/vector implementations of the standard algorithms. The main source of difficulty in the incomplete factorization preconditionings is in the solution of the triangular systems at each step. A few approaches consisting of implementing efficient forward and backward triangular solutions are described in detail. Polynomial preconditioning as an alternative to standard incomplete factorization techniques is also discussed. Another efficient approach is to reorder the equations so as to improve the structure of the matrix to achieve better parallelism or vectorization. An overview of these and other ideas and their effectiveness or potential for different types of architectures is given.
Supercomputer networking for space science applications

NASA Technical Reports Server (NTRS)

Edelson, B. I.

1992-01-01

The initial design of a supercomputer network topology including the design of the communications nodes along with the communications interface hardware and software is covered. Several space science applications that are proposed experiments by GSFC and JPL for a supercomputer network using the NASA ACTS satellite are also reported.
Computed Flow Through An Artificial Heart And Valve

NASA Technical Reports Server (NTRS)

Rogers, Stuart E.; Kwak, Dochan; Kiris, Cetin; Chang, I-Dee

1994-01-01

NASA technical memorandum discusses computations of flow of blood through artificial heart and through tilting-disk artificial heart valve. Represents further progress in research described in "Numerical Simulation of Flow Through an Artificial Heart" (ARC-12478). One purpose of research to exploit advanced techniques of computational fluid dynamics and capabilities of supercomputers to gain understanding of complicated internal flows of viscous, essentially incompressible fluids like blood. Another to use understanding to design better artificial hearts and valves.
High temporal resolution mapping of seismic noise sources using heterogeneous supercomputers

NASA Astrophysics Data System (ADS)

Gokhberg, Alexey; Ermert, Laura; Paitz, Patrick; Fichtner, Andreas

2017-04-01

Time- and space-dependent distribution of seismic noise sources is becoming a key ingredient of modern real-time monitoring of various geo-systems. Significant interest in seismic noise source maps with high temporal resolution (days) is expected to come from a number of domains, including natural resources exploration, analysis of active earthquake fault zones and volcanoes, as well as geothermal and hydrocarbon reservoir monitoring. Currently, knowledge of noise sources is insufficient for high-resolution subsurface monitoring applications. Near-real-time seismic data, as well as advanced imaging methods to constrain seismic noise sources have recently become available. These methods are based on the massive cross-correlation of seismic noise records from all available seismic stations in the region of interest and are therefore very computationally intensive. Heterogeneous massively parallel supercomputing systems introduced in the recent years combine conventional multi-core CPU with GPU accelerators and provide an opportunity for manifold increase and computing performance. Therefore, these systems represent an efficient platform for implementation of a noise source mapping solution. We present the first results of an ongoing research project conducted in collaboration with the Swiss National Supercomputing Centre (CSCS). The project aims at building a service that provides seismic noise source maps for Central Europe with high temporal resolution (days to few weeks depending on frequency and data availability). The service is hosted on the CSCS computing infrastructure; all computationally intensive processing is performed on the massively parallel heterogeneous supercomputer "Piz Daint". The solution architecture is based on the Application-as-a-Service concept in order to provide the interested external researchers the regular access to the noise source maps. The solution architecture includes the following sub-systems: (1) data acquisition responsible for collecting, on a periodic basis, raw seismic records from the European seismic networks, (2) high-performance noise source mapping application responsible for generation of source maps using cross-correlation of seismic records, (3) back-end infrastructure for the coordination of various tasks and computations, (4) front-end Web interface providing the service to the end-users and (5) data repository. The noise mapping application is composed of four principal modules: (1) pre-processing of raw data, (2) massive cross-correlation, (3) post-processing of correlation data based on computation of logarithmic energy ratio and (4) generation of source maps from post-processed data. Implementation of the solution posed various challenges, in particular, selection of data sources and transfer protocols, automation and monitoring of daily data downloads, ensuring the required data processing performance, design of a general service oriented architecture for coordination of various sub-systems, and engineering an appropriate data storage solution. The present pilot version of the service implements noise source maps for Switzerland. Extension of the solution to Central Europe is planned for the next project phase.
Advanced Aerospace Materials by Design

NASA Technical Reports Server (NTRS)

Srivastava, Deepak; Djomehri, Jahed; Wei, Chen-Yu

2004-01-01

The advances in the emerging field of nanophase thermal and structural composite materials; materials with embedded sensors and actuators for morphing structures; light-weight composite materials for energy and power storage; and large surface area materials for in-situ resource generation and waste recycling, are expected to :revolutionize the capabilities of virtually every system comprising of future robotic and :human moon and mars exploration missions. A high-performance multiscale simulation platform, including the computational capabilities and resources of Columbia - the new supercomputer, is being developed to discover, validate, and prototype next generation (of such advanced materials. This exhibit will describe the porting and scaling of multiscale 'physics based core computer simulation codes for discovering and designing carbon nanotube-polymer composite materials for light-weight load bearing structural and 'thermal protection applications.
Most Social Scientists Shun Free Use of Supercomputers.

ERIC Educational Resources Information Center

Kiernan, Vincent

1998-01-01

Social scientists, who frequently complain that the federal government spends too little on them, are passing up what scholars in the physical and natural sciences see as the government's best give-aways: free access to supercomputers. Some social scientists say the supercomputers are difficult to use; others find desktop computers provide…
A fault tolerant spacecraft supercomputer to enable a new class of scientific discovery

NASA Technical Reports Server (NTRS)

Katz, D. S.; McVittie, T. I.; Silliman, A. G., Jr.

2000-01-01

The goal of the Remote Exploration and Experimentation (REE) Project is to move supercomputeing into space in a coste effective manner and to allow the use of inexpensive, state of the art, commercial-off-the-shelf components and subsystems in these space-based supercomputers.
TOP500 Supercomputers for November 2003

DOE Office of Scientific and Technical Information (OSTI.GOV)

Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack

2003-11-16

22nd Edition of TOP500 List of World s Fastest Supercomputers Released MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 22nd edition of the TOP500 list of the worlds fastest supercomputers was released today (November 16, 2003). The Earth Simulator supercomputer retains the number one position with its Linpack benchmark performance of 35.86 Tflop/s (''teraflops'' or trillions of calculations per second). It was built by NEC and installed last year at the Earth Simulator Center in Yokohama, Japan.
Establishing confidence in complex physics codes: Art or science?

DOE Office of Scientific and Technical Information (OSTI.GOV)

Trucano, T.

1997-12-31

The ALEGRA shock wave physics code, currently under development at Sandia National Laboratories and partially supported by the US Advanced Strategic Computing Initiative (ASCI), is generic to a certain class of physics codes: large, multi-application, intended to support a broad user community on the latest generation of massively parallel supercomputer, and in a continual state of formal development. To say that the author has ``confidence`` in the results of ALEGRA is to say something different than that he believes that ALEGRA is ``predictive.`` It is the purpose of this talk to illustrate the distinction between these two concepts. The authormore » elects to perform this task in a somewhat historical manner. He will summarize certain older approaches to code validation. He views these methods as aiming to establish the predictive behavior of the code. These methods are distinguished by their emphasis on local information. He will conclude that these approaches are more art than science.« less
Monitoring Object Library Usage and Changes

NASA Technical Reports Server (NTRS)

Owen, R. K.; Craw, James M. (Technical Monitor)

1995-01-01

The NASA Ames Numerical Aerodynamic Simulation program Aeronautics Consolidated Supercomputing Facility (NAS/ACSF) supercomputing center services over 1600 users, and has numerous analysts with root access. Several tools have been developed to monitor object library usage and changes. Some of the tools do "noninvasive" monitoring and other tools implement run-time logging even for object-only libraries. The run-time logging identifies who, when, and what is being used. The benefits are that real usage can be measured, unused libraries can be discontinued, training and optimization efforts can be focused at those numerical methods that are actually used. An overview of the tools will be given and the results will be discussed.
A high performance linear equation solver on the VPP500 parallel supercomputer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nakanishi, Makoto; Ina, Hiroshi; Miura, Kenichi

1994-12-31

This paper describes the implementation of two high performance linear equation solvers developed for the Fujitsu VPP500, a distributed memory parallel supercomputer system. The solvers take advantage of the key architectural features of VPP500--(1) scalability for an arbitrary number of processors up to 222 processors, (2) flexible data transfer among processors provided by a crossbar interconnection network, (3) vector processing capability on each processor, and (4) overlapped computation and transfer. The general linear equation solver based on the blocked LU decomposition method achieves 120.0 GFLOPS performance with 100 processors in the LIN-PACK Highly Parallel Computing benchmark.
Assessment techniques for a learning-centered curriculum: evaluation design for adventures in supercomputing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Helland, B.; Summers, B.G.

1996-09-01

As the classroom paradigm shifts from being teacher-centered to being learner-centered, student assessments are evolving from typical paper and pencil testing to other methods of evaluation. Students should be probed for understanding, reasoning, and critical thinking abilities rather than their ability to return memorized facts. The assessment of the Department of Energy`s pilot program, Adventures in Supercomputing (AiS), offers one example of assessment techniques developed for learner-centered curricula. This assessment has employed a variety of methods to collect student data. Methods of assessment used were traditional testing, performance testing, interviews, short questionnaires via email, and student presentations of projects. Themore » data obtained from these sources have been analyzed by a professional assessment team at the Center for Children and Technology. The results have been used to improve the AiS curriculum and establish the quality of the overall AiS program. This paper will discuss the various methods of assessment used and the results.« less
Distributed user services for supercomputers

NASA Technical Reports Server (NTRS)

Sowizral, Henry A.

1989-01-01

User-service operations at supercomputer facilities are examined. The question is whether a single, possibly distributed, user-services organization could be shared by NASA's supercomputer sites in support of a diverse, geographically dispersed, user community. A possible structure for such an organization is identified as well as some of the technologies needed in operating such an organization.
Demonstration of Cost-Effective, High-Performance Computing at Performance and Reliability Levels Equivalent to a 1994 Vector Supercomputer

NASA Technical Reports Server (NTRS)

Babrauckas, Theresa

2000-01-01

The Affordable High Performance Computing (AHPC) project demonstrated that high-performance computing based on a distributed network of computer workstations is a cost-effective alternative to vector supercomputers for running CPU and memory intensive design and analysis tools. The AHPC project created an integrated system called a Network Supercomputer. By connecting computer work-stations through a network and utilizing the workstations when they are idle, the resulting distributed-workstation environment has the same performance and reliability levels as the Cray C90 vector Supercomputer at less than 25 percent of the C90 cost. In fact, the cost comparison between a Cray C90 Supercomputer and Sun workstations showed that the number of distributed networked workstations equivalent to a C90 costs approximately 8 percent of the C90.

Full speed ahead for software

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wolfe, A.

1986-03-10

Supercomputing software is moving into high gear, spurred by the rapid spread of supercomputers into new applications. The critical challenge is how to develop tools that will make it easier for programmers to write applications that take advantage of vectorizing in the classical supercomputer and the parallelism that is emerging in supercomputers and minisupercomputers. Writing parallel software is a challenge that every programmer must face because parallel architectures are springing up across the range of computing. Cray is developing a host of tools for programmers. Tools to support multitasking (in supercomputer parlance, multitasking means dividing up a single program tomore » run on multiple processors) are high on Cray's agenda. On tap for multitasking is Premult, dubbed a microtasking tool. As a preprocessor for Cray's CFT77 FORTRAN compiler, Premult will provide fine-grain multitasking.« less
Qualifying for the Green500: Experience with the newest generation of supercomputers at LANL

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yilk, Todd

The High Performance Computing Division of Los Alamos National Laboratory recently brought four new supercomputing platforms on line: Trinity with separate partitions built around the Haswell and Knights Landing CPU architectures for capability computing and Grizzly, Fire, and Ice for capacity computing applications. The power monitoring infrastructure of these machines is significantly enhanced over previous supercomputing generations at LANL and all were qualified at the highest level of the Green500 benchmark. Here, this paper discusses supercomputing at LANL, the Green500 benchmark, and notes on our experience meeting the Green500's reporting requirements.
Qualifying for the Green500: Experience with the newest generation of supercomputers at LANL

DOE PAGES

Yilk, Todd

2018-02-17

The High Performance Computing Division of Los Alamos National Laboratory recently brought four new supercomputing platforms on line: Trinity with separate partitions built around the Haswell and Knights Landing CPU architectures for capability computing and Grizzly, Fire, and Ice for capacity computing applications. The power monitoring infrastructure of these machines is significantly enhanced over previous supercomputing generations at LANL and all were qualified at the highest level of the Green500 benchmark. Here, this paper discusses supercomputing at LANL, the Green500 benchmark, and notes on our experience meeting the Green500's reporting requirements.
Non-preconditioned conjugate gradient on cell and FPGA based hybrid supercomputer nodes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dubois, David H; Dubois, Andrew J; Boorman, Thomas M

2009-01-01

This work presents a detailed implementation of a double precision, non-preconditioned, Conjugate Gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture{sup TM} in conjunction with x86 Opteron{sup TM} processors from AMD. We implement a common Conjugate Gradient algorithm, on a variety of systems, to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, SRC Computers, Inc. MAPStation SRC-6 FPGA enhanced hybrid supercomputer, and AMD Opteron only. In all hybrid implementations wall clock time is measured, including all transfer overhead and compute timings.
Non-preconditioned conjugate gradient on cell and FPCA-based hybrid supercomputer nodes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dubois, David H; Dubois, Andrew J; Boorman, Thomas M

2009-03-10

This work presents a detailed implementation of a double precision, Non-Preconditioned, Conjugate Gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture{trademark} in conjunction with x86 Opteron{trademark} processors from AMD. We implement a common Conjugate Gradient algorithm, on a variety of systems, to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, SRC Computers, Inc. MAPStation SRC-6 FPGA enhanced hybrid supercomputer, and AMD Opteron only. In all hybrid implementations wall clock time is measured, including all transfer overhead and compute timings.
NASA High Performance Computing and Communications program

NASA Technical Reports Server (NTRS)

Holcomb, Lee; Smith, Paul; Hunter, Paul

1994-01-01

The National Aeronautics and Space Administration's HPCC program is part of a new Presidential initiative aimed at producing a 1000-fold increase in supercomputing speed and a 1(X)-fold improvement in available communications capability by 1997. As more advanced technologies are developed under the HPCC program, they will be used to solve NASA's 'Grand Challenge' problems, which include improving the design and simulation of advanced aerospace vehicles, allowing people at remote locations to communicate more effectively and share information, increasing scientists' abilities to model the Earth's climate and forecast global environmental trends, and improving the development of advanced spacecraft. NASA's HPCC program is organized into three projects which are unique to the agency's mission: the Computational Aerosciences (CAS) project, the Earth and Space Sciences (ESS) project, and the Remote Exploration and Experimentation (REE) project. An additional project, the Basic Research and Human Resources (BRHR) project, exists to promote long term research in computer science and engineering and to increase the pool of trained personnel in a variety of scientific disciplines. This document presents an overview of the objectives and organization of these projects, as well as summaries of early accomplishments and the significance, status, and plans for individual research and development programs within each project. Areas of emphasis include benchmarking, testbeds, software and simulation methods.
Using a multifrontal sparse solver in a high performance, finite element code

NASA Technical Reports Server (NTRS)

King, Scott D.; Lucas, Robert; Raefsky, Arthur

1990-01-01

We consider the performance of the finite element method on a vector supercomputer. The computationally intensive parts of the finite element method are typically the individual element forms and the solution of the global stiffness matrix both of which are vectorized in high performance codes. To further increase throughput, new algorithms are needed. We compare a multifrontal sparse solver to a traditional skyline solver in a finite element code on a vector supercomputer. The multifrontal solver uses the Multiple-Minimum Degree reordering heuristic to reduce the number of operations required to factor a sparse matrix and full matrix computational kernels (e.g., BLAS3) to enhance vector performance. The net result in an order-of-magnitude reduction in run time for a finite element application on one processor of a Cray X-MP.
Blizzard 2016

NASA Image and Video Library

2017-12-08

A NASA Center for Climate Simulation supercomputer model that shows the flow of ‪#‎Blizzard2016‬ thru Sunday. Learn more here: go.nasa.gov/1WBm547 NASA image use policy. NASA Goddard Space Flight Center enables NASA’s mission through four scientific endeavors: Earth Science, Heliophysics, Solar System Exploration, and Astrophysics. Goddard plays a leading role in NASA’s accomplishments by contributing compelling scientific knowledge to advance the Agency’s mission. Follow us on Twitter Like us on Facebook Find us on Instagram
Argonne Discovery Yields Self-Healing Diamond-Like Carbon

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cunningham, Greg; Jones, Katie Elyce

We report that large-scale reactive molecular dynamics simulations carried out on the US Department of Energy’s IBM Blue Gene/Q Mira supercomputer at the Argonne Leadership Computing Facility, along with experiments conducted by researchers in Argonne’s Energy Systems Division, enabled the design of a “self-healing” anti-wear coating that drastically reduces friction and related degradation in engines and moving machinery. Now, the computational work advanced for this purpose is being used to identify the friction-fighting potential of other catalysts.
Argonne Discovery Yields Self-Healing Diamond-Like Carbon

DOE PAGES

Cunningham, Greg; Jones, Katie Elyce

2016-10-27

We report that large-scale reactive molecular dynamics simulations carried out on the US Department of Energy’s IBM Blue Gene/Q Mira supercomputer at the Argonne Leadership Computing Facility, along with experiments conducted by researchers in Argonne’s Energy Systems Division, enabled the design of a “self-healing” anti-wear coating that drastically reduces friction and related degradation in engines and moving machinery. Now, the computational work advanced for this purpose is being used to identify the friction-fighting potential of other catalysts.
Experiences Using OpenMP Based on Compiler Directed Software DSM on a PC Cluster

NASA Technical Reports Server (NTRS)

Hess, Matthias; Jost, Gabriele; Mueller, Matthias; Ruehle, Roland; Biegel, Bryan (Technical Monitor)

2002-01-01

In this work we report on our experiences running OpenMP (message passing) programs on a commodity cluster of PCs (personal computers) running a software distributed shared memory (DSM) system. We describe our test environment and report on the performance of a subset of the NAS (NASA Advanced Supercomputing) Parallel Benchmarks that have been automatically parallelized for OpenMP. We compare the performance of the OpenMP implementations with that of their message passing counterparts and discuss performance differences.
Modeling Materials: Design for Planetary Entry, Electric Aircraft, and Beyond

NASA Technical Reports Server (NTRS)

Thompson, Alexander; Lawson, John W.

2014-01-01

NASA missions push the limits of what is possible. The development of high-performance materials must keep pace with the agency's demanding, cutting-edge applications. Researchers at NASA's Ames Research Center are performing multiscale computational modeling to accelerate development times and further the design of next-generation aerospace materials. Multiscale modeling combines several computationally intensive techniques ranging from the atomic level to the macroscale, passing output from one level as input to the next level. These methods are applicable to a wide variety of materials systems. For example: (a) Ultra-high-temperature ceramics for hypersonic aircraft-we utilized the full range of multiscale modeling to characterize thermal protection materials for faster, safer air- and spacecraft, (b) Planetary entry heat shields for space vehicles-we computed thermal and mechanical properties of ablative composites by combining several methods, from atomistic simulations to macroscale computations, (c) Advanced batteries for electric aircraft-we performed large-scale molecular dynamics simulations of advanced electrolytes for ultra-high-energy capacity batteries to enable long-distance electric aircraft service; and (d) Shape-memory alloys for high-efficiency aircraft-we used high-fidelity electronic structure calculations to determine phase diagrams in shape-memory transformations. Advances in high-performance computing have been critical to the development of multiscale materials modeling. We used nearly one million processor hours on NASA's Pleiades supercomputer to characterize electrolytes with a fidelity that would be otherwise impossible. For this and other projects, Pleiades enables us to push the physics and accuracy of our calculations to new levels.
High-Performance Computing: Industry Uses of Supercomputers and High-Speed Networks. Report to Congressional Requesters.

ERIC Educational Resources Information Center

General Accounting Office, Washington, DC. Information Management and Technology Div.

This report was prepared in response to a request for information on supercomputers and high-speed networks from the Senate Committee on Commerce, Science, and Transportation, and the House Committee on Science, Space, and Technology. The following information was requested: (1) examples of how various industries are using supercomputers to…
Supercomputer Provides Molecular Insight into Cellulose (Fact Sheet)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

2011-02-01

Groundbreaking research at the National Renewable Energy Laboratory (NREL) has used supercomputing simulations to calculate the work that enzymes must do to deconstruct cellulose, which is a fundamental step in biomass conversion technologies for biofuels production. NREL used the new high-performance supercomputer Red Mesa to conduct several million central processing unit (CPU) hours of simulation.
Summary Report of Working Group 2: Computation

NASA Astrophysics Data System (ADS)

Stoltz, P. H.; Tsung, R. S.

2009-01-01

The working group on computation addressed three physics areas: (i) plasma-based accelerators (laser-driven and beam-driven), (ii) high gradient structure-based accelerators, and (iii) electron beam sources and transport [1]. Highlights of the talks in these areas included new models of breakdown on the microscopic scale, new three-dimensional multipacting calculations with both finite difference and finite element codes, and detailed comparisons of new electron gun models with standard models such as PARMELA. The group also addressed two areas of advances in computation: (i) new algorithms, including simulation in a Lorentz-boosted frame that can reduce computation time orders of magnitude, and (ii) new hardware architectures, like graphics processing units and Cell processors that promise dramatic increases in computing power. Highlights of the talks in these areas included results from the first large-scale parallel finite element particle-in-cell code (PIC), many order-of-magnitude speedup of, and details of porting the VPIC code to the Roadrunner supercomputer. The working group featured two plenary talks, one by Brian Albright of Los Alamos National Laboratory on the performance of the VPIC code on the Roadrunner supercomputer, and one by David Bruhwiler of Tech-X Corporation on recent advances in computation for advanced accelerators. Highlights of the talk by Albright included the first one trillion particle simulations, a sustained performance of 0.3 petaflops, and an eight times speedup of science calculations, including back-scatter in laser-plasma interaction. Highlights of the talk by Bruhwiler included simulations of 10 GeV accelerator laser wakefield stages including external injection, new developments in electromagnetic simulations of electron guns using finite difference and finite element approaches.
Summary Report of Working Group 2: Computation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stoltz, P. H.; Tsung, R. S.

2009-01-22

The working group on computation addressed three physics areas: (i) plasma-based accelerators (laser-driven and beam-driven), (ii) high gradient structure-based accelerators, and (iii) electron beam sources and transport [1]. Highlights of the talks in these areas included new models of breakdown on the microscopic scale, new three-dimensional multipacting calculations with both finite difference and finite element codes, and detailed comparisons of new electron gun models with standard models such as PARMELA. The group also addressed two areas of advances in computation: (i) new algorithms, including simulation in a Lorentz-boosted frame that can reduce computation time orders of magnitude, and (ii) newmore » hardware architectures, like graphics processing units and Cell processors that promise dramatic increases in computing power. Highlights of the talks in these areas included results from the first large-scale parallel finite element particle-in-cell code (PIC), many order-of-magnitude speedup of, and details of porting the VPIC code to the Roadrunner supercomputer. The working group featured two plenary talks, one by Brian Albright of Los Alamos National Laboratory on the performance of the VPIC code on the Roadrunner supercomputer, and one by David Bruhwiler of Tech-X Corporation on recent advances in computation for advanced accelerators. Highlights of the talk by Albright included the first one trillion particle simulations, a sustained performance of 0.3 petaflops, and an eight times speedup of science calculations, including back-scatter in laser-plasma interaction. Highlights of the talk by Bruhwiler included simulations of 10 GeV accelerator laser wakefield stages including external injection, new developments in electromagnetic simulations of electron guns using finite difference and finite element approaches.« less
GREEN SUPERCOMPUTING IN A DESKTOP BOX

DOE Office of Scientific and Technical Information (OSTI.GOV)

HSU, CHUNG-HSING; FENG, WU-CHUN; CHING, AVERY

2007-01-17

The computer workstation, introduced by Sun Microsystems in 1982, was the tool of choice for scientists and engineers as an interactive computing environment for the development of scientific codes. However, by the mid-1990s, the performance of workstations began to lag behind high-end commodity PCs. This, coupled with the disappearance of BSD-based operating systems in workstations and the emergence of Linux as an open-source operating system for PCs, arguably led to the demise of the workstation as we knew it. Around the same time, computational scientists started to leverage PCs running Linux to create a commodity-based (Beowulf) cluster that provided dedicatedmore » computer cycles, i.e., supercomputing for the rest of us, as a cost-effective alternative to large supercomputers, i.e., supercomputing for the few. However, as the cluster movement has matured, with respect to cluster hardware and open-source software, these clusters have become much more like their large-scale supercomputing brethren - a shared (and power-hungry) datacenter resource that must reside in a machine-cooled room in order to operate properly. Consequently, the above observations, when coupled with the ever-increasing performance gap between the PC and cluster supercomputer, provide the motivation for a 'green' desktop supercomputer - a turnkey solution that provides an interactive and parallel computing environment with the approximate form factor of a Sun SPARCstation 1 'pizza box' workstation. In this paper, they present the hardware and software architecture of such a solution as well as its prowess as a developmental platform for parallel codes. In short, imagine a 12-node personal desktop supercomputer that achieves 14 Gflops on Linpack but sips only 185 watts of power at load, resulting in a performance-power ratio that is over 300% better than their reference SMP platform.« less
Large-Scale NASA Science Applications on the Columbia Supercluster

NASA Technical Reports Server (NTRS)

Brooks, Walter

2005-01-01

Columbia, NASA's newest 61 teraflops supercomputer that became operational late last year, is a highly integrated Altix cluster of 10,240 processors, and was named to honor the crew of the Space Shuttle lost in early 2003. Constructed in just four months, Columbia increased NASA's computing capability ten-fold, and revitalized the Agency's high-end computing efforts. Significant cutting-edge science and engineering simulations in the areas of space and Earth sciences, as well as aeronautics and space operations, are already occurring on this largest operational Linux supercomputer, demonstrating its capacity and capability to accelerate NASA's space exploration vision. The presentation will describe how an integrated environment consisting not only of next-generation systems, but also modeling and simulation, high-speed networking, parallel performance optimization, and advanced data analysis and visualization, is being used to reduce design cycle time, accelerate scientific discovery, conduct parametric analysis of multiple scenarios, and enhance safety during the life cycle of NASA missions. The talk will conclude by discussing how NAS partnered with various NASA centers, other government agencies, computer industry, and academia, to create a national resource in large-scale modeling and simulation.
GPU Particle Tracking and MHD Simulations with Greatly Enhanced Computational Speed

NASA Astrophysics Data System (ADS)

Ziemba, T.; O'Donnell, D.; Carscadden, J.; Cash, M.; Winglee, R.; Harnett, E.

2008-12-01

GPUs are intrinsically highly parallelized systems that provide more than an order of magnitude computing speed over a CPU based systems, for less cost than a high end-workstation. Recent advancements in GPU technologies allow for full IEEE float specifications with performance up to several hundred GFLOPs per GPU, and new software architectures have recently become available to ease the transition from graphics based to scientific applications. This allows for a cheap alternative to standard supercomputing methods and should increase the time to discovery. 3-D particle tracking and MHD codes have been developed using NVIDIA's CUDA and have demonstrated speed up of nearly a factor of 20 over equivalent CPU versions of the codes. Such a speed up enables new applications to develop, including real time running of radiation belt simulations and real time running of global magnetospheric simulations, both of which could provide important space weather prediction tools.
State of the art in electromagnetic modeling for the Compact Linear Collider

DOE Office of Scientific and Technical Information (OSTI.GOV)

Candel, Arno; Kabel, Andreas; Lee, Lie-Quan

SLAC's Advanced Computations Department (ACD) has developed the parallel 3D electromagnetic time-domain code T3P for simulations of wakefields and transients in complex accelerator structures. T3P is based on state-of-the-art Finite Element methods on unstructured grids and features unconditional stability, quadratic surface approximation and up to 6th-order vector basis functions for unprecedented simulation accuracy. Optimized for large-scale parallel processing on leadership supercomputing facilities, T3P allows simulations of realistic 3D structures with fast turn-around times, aiding the design of the next generation of accelerator facilities. Applications include simulations of the proposed two-beam accelerator structures for the Compact Linear Collider (CLIC) - wakefieldmore » damping in the Power Extraction and Transfer Structure (PETS) and power transfer to the main beam accelerating structures are investigated.« less

Review of FD-TD numerical modeling of electromagnetic wave scattering and radar cross section

NASA Technical Reports Server (NTRS)

Taflove, Allen; Umashankar, Korada R.

1989-01-01

Applications of the finite-difference time-domain (FD-TD) method for numerical modeling of electromagnetic wave interactions with structures are reviewed, concentrating on scattering and radar cross section (RCS). A number of two- and three-dimensional examples of FD-TD modeling of scattering and penetration are provided. The objects modeled range in nature from simple geometric shapes to extremely complex aerospace and biological systems. Rigorous analytical or experimental validatons are provided for the canonical shapes, and it is shown that FD-TD predictive data for near fields and RCS are in excellent agreement with the benchmark data. It is concluded that with continuing advances in FD-TD modeling theory for target features relevant to the RCS problems and in vector and concurrent supercomputer technology, it is likely that FD-TD numerical modeling will occupy an important place in RCS technology in the 1990s and beyond.
Computational Nanotechnology at NASA Ames Research Center, 1996

NASA Technical Reports Server (NTRS)

Globus, Al; Bailey, David; Langhoff, Steve; Pohorille, Andrew; Levit, Creon; Chancellor, Marisa K. (Technical Monitor)

1996-01-01

Some forms of nanotechnology appear to have enormous potential to improve aerospace and computer systems; computational nanotechnology, the design and simulation of programmable molecular machines, is crucial to progress. NASA Ames Research Center has begun a computational nanotechnology program including in-house work, external research grants, and grants of supercomputer time. Four goals have been established: (1) Simulate a hypothetical programmable molecular machine replicating itself and building other products. (2) Develop molecular manufacturing CAD (computer aided design) software and use it to design molecular manufacturing systems and products of aerospace interest, including computer components. (3) Characterize nanotechnologically accessible materials of aerospace interest. Such materials may have excellent strength and thermal properties. (4) Collaborate with experimentalists. Current in-house activities include: (1) Development of NanoDesign, software to design and simulate a nanotechnology based on functionalized fullerenes. Early work focuses on gears. (2) A design for high density atomically precise memory. (3) Design of nanotechnology systems based on biology. (4) Characterization of diamonoid mechanosynthetic pathways. (5) Studies of the laplacian of the electronic charge density to understand molecular structure and reactivity. (6) Studies of entropic effects during self-assembly. Characterization of properties of matter for clusters up to sizes exhibiting bulk properties. In addition, the NAS (NASA Advanced Supercomputing) supercomputer division sponsored a workshop on computational molecular nanotechnology on March 4-5, 1996 held at NASA Ames Research Center. Finally, collaborations with Bill Goddard at CalTech, Ralph Merkle at Xerox Parc, Don Brenner at NCSU (North Carolina State University), Tom McKendree at Hughes, and Todd Wipke at UCSC are underway.
Pre-Hardware Optimization and Implementation Of Fast Optics Closed Control Loop Algorithms

NASA Technical Reports Server (NTRS)

Kizhner, Semion; Lyon, Richard G.; Herman, Jay R.; Abuhassan, Nader

2004-01-01

One of the main heritage tools used in scientific and engineering data spectrum analysis is the Fourier Integral Transform and its high performance digital equivalent - the Fast Fourier Transform (FFT). The FFT is particularly useful in two-dimensional (2-D) image processing (FFT2) within optical systems control. However, timing constraints of a fast optics closed control loop would require a supercomputer to run the software implementation of the FFT2 and its inverse, as well as other image processing representative algorithm, such as numerical image folding and fringe feature extraction. A laboratory supercomputer is not always available even for ground operations and is not feasible for a night project. However, the computationally intensive algorithms still warrant alternative implementation using reconfigurable computing technologies (RC) such as Digital Signal Processors (DSP) and Field Programmable Gate Arrays (FPGA), which provide low cost compact super-computing capabilities. We present a new RC hardware implementation and utilization architecture that significantly reduces the computational complexity of a few basic image-processing algorithm, such as FFT2, image folding and phase diversity for the NASA Solar Viewing Interferometer Prototype (SVIP) using a cluster of DSPs and FPGAs. The DSP cluster utilization architecture also assures avoidance of a single point of failure, while using commercially available hardware. This, combined with the control algorithms pre-hardware optimization, or the first time allows construction of image-based 800 Hertz (Hz) optics closed control loops on-board a spacecraft, based on the SVIP ground instrument. That spacecraft is the proposed Earth Atmosphere Solar Occultation Imager (EASI) to study greenhouse gases CO2, C2H, H2O, O3, O2, N2O from Lagrange-2 point in space. This paper provides an advanced insight into a new type of science capabilities for future space exploration missions based on on-board image processing for control and for robotics missions using vision sensors. It presents a top-level description of technologies required for the design and construction of SVIP and EASI and to advance the spatial-spectral imaging and large-scale space interferometry science and engineering.
Input/output behavior of supercomputing applications

NASA Technical Reports Server (NTRS)

Miller, Ethan L.

1991-01-01

The collection and analysis of supercomputer I/O traces and their use in a collection of buffering and caching simulations are described. This serves two purposes. First, it gives a model of how individual applications running on supercomputers request file system I/O, allowing system designer to optimize I/O hardware and file system algorithms to that model. Second, the buffering simulations show what resources are needed to maximize the CPU utilization of a supercomputer given a very bursty I/O request rate. By using read-ahead and write-behind in a large solid stated disk, one or two applications were sufficient to fully utilize a Cray Y-MP CPU.
CFD code evaluation for internal flow modeling

NASA Technical Reports Server (NTRS)

Chung, T. J.

1990-01-01

Research on the computational fluid dynamics (CFD) code evaluation with emphasis on supercomputing in reacting flows is discussed. Advantages of unstructured grids, multigrids, adaptive methods, improved flow solvers, vector processing, parallel processing, and reduction of memory requirements are discussed. As examples, researchers include applications of supercomputing to reacting flow Navier-Stokes equations including shock waves and turbulence and combustion instability problems associated with solid and liquid propellants. Evaluation of codes developed by other organizations are not included. Instead, the basic criteria for accuracy and efficiency have been established, and some applications on rocket combustion have been made. Research toward an ultimate goal, the most accurate and efficient CFD code, is in progress and will continue for years to come.
Scaling of Multimillion-Atom Biological Molecular Dynamics Simulation on a Petascale Supercomputer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schulz, Roland; Lindner, Benjamin; Petridis, Loukas

2009-01-01

A strategy is described for a fast all-atom molecular dynamics simulation of multimillion-atom biological systems on massively parallel supercomputers. The strategy is developed using benchmark systems of particular interest to bioenergy research, comprising models of cellulose and lignocellulosic biomass in an aqueous solution. The approach involves using the reaction field (RF) method for the computation of long-range electrostatic interactions, which permits efficient scaling on many thousands of cores. Although the range of applicability of the RF method for biomolecular systems remains to be demonstrated, for the benchmark systems the use of the RF produces molecular dipole moments, Kirkwood G factors,more » other structural properties, and mean-square fluctuations in excellent agreement with those obtained with the commonly used Particle Mesh Ewald method. With RF, three million- and five million atom biological systems scale well up to 30k cores, producing 30 ns/day. Atomistic simulations of very large systems for time scales approaching the microsecond would, therefore, appear now to be within reach.« less
Scaling of Multimillion-Atom Biological Molecular Dynamics Simulation on a Petascale Supercomputer.

PubMed

Schulz, Roland; Lindner, Benjamin; Petridis, Loukas; Smith, Jeremy C

2009-10-13

A strategy is described for a fast all-atom molecular dynamics simulation of multimillion-atom biological systems on massively parallel supercomputers. The strategy is developed using benchmark systems of particular interest to bioenergy research, comprising models of cellulose and lignocellulosic biomass in an aqueous solution. The approach involves using the reaction field (RF) method for the computation of long-range electrostatic interactions, which permits efficient scaling on many thousands of cores. Although the range of applicability of the RF method for biomolecular systems remains to be demonstrated, for the benchmark systems the use of the RF produces molecular dipole moments, Kirkwood G factors, other structural properties, and mean-square fluctuations in excellent agreement with those obtained with the commonly used Particle Mesh Ewald method. With RF, three million- and five million-atom biological systems scale well up to ∼30k cores, producing ∼30 ns/day. Atomistic simulations of very large systems for time scales approaching the microsecond would, therefore, appear now to be within reach.
Next Generation Seismic Imaging; High Fidelity Algorithms and High-End Computing

NASA Astrophysics Data System (ADS)

Bevc, D.; Ortigosa, F.; Guitton, A.; Kaelin, B.

2007-05-01

The rich oil reserves of the Gulf of Mexico are buried in deep and ultra-deep waters up to 30,000 feet from the surface. Minerals Management Service (MMS), the federal agency in the U.S. Department of the Interior that manages the nation's oil, natural gas and other mineral resources on the outer continental shelf in federal offshore waters, estimates that the Gulf of Mexico holds 37 billion barrels of "undiscovered, conventionally recoverable" oil, which, at 50/barrel, would be worth approximately 1.85 trillion. These reserves are very difficult to find and reach due to the extreme depths. Technological advances in seismic imaging represent an opportunity to overcome this obstacle by providing more accurate models of the subsurface. Among these technological advances, Reverse Time Migration (RTM) yields the best possible images. RTM is based on the solution of the two-way acoustic wave-equation. This technique relies on the velocity model to image turning waves. These turning waves are particularly important to unravel subsalt reservoirs and delineate salt-flanks, a natural trap for oil and gas. Because it relies on an accurate velocity model, RTM opens new frontier in designing better velocity estimation algorithms. RTM has been widely recognized as the next chapter in seismic exploration, as it can overcome the limitations of current migration methods in imaging complex geologic structures that exist in the Gulf of Mexico. The chief impediment to the large-scale, routine deployment of RTM has been a lack of sufficient computer power. RTM needs thirty times the computing power used in exploration today to be commercially viable and widely usable. Therefore, advancing seismic imaging to the next level of precision poses a multi-disciplinary challenge. To overcome these challenges, the Kaleidoscope project, a partnership between Repsol YPF, Barcelona Supercomputing Center, 3DGeo Inc., and IBM brings together the necessary components of modeling, algorithms and the uniquely powerful computing power of the MareNostrum supercomputer in Barcelona to realize the promise of RTM, incorporate it into daily processing flows, and to help solve exploration problems in a highly cost-effective way. Uniquely, the Kaleidoscope Project is simultaneously integrating software (algorithms) and hardware (Cell BE), steps that are traditionally taken sequentially. This unique integration of software and hardware will accelerate seismic imaging by several orders of magnitude compared to conventional solutions running on standard Linux Clusters.
Prospects for Boiling of Subcooled Dielectric Liquids for Supercomputer Cooling

NASA Astrophysics Data System (ADS)

Zeigarnik, Yu. A.; Vasil'ev, N. V.; Druzhinin, E. A.; Kalmykov, I. V.; Kosoi, A. S.; Khodakov, K. A.

2018-02-01

It is shown experimentally that using forced-convection boiling of dielectric coolants of the Novec 649 Refrigerant subcooled relative to the saturation temperature makes possible removing heat flow rates up to 100 W/cm2 from modern supercomputer chip interface. This fact creates prerequisites for the application of dielectric liquids in cooling systems of modern supercomputers with increased requirements for their operating reliability.
Very large scale wavefunction orthogonalization in Density Functional Theory electronic structure calculations

NASA Astrophysics Data System (ADS)

Bekas, C.; Curioni, A.

2010-06-01

Enforcing the orthogonality of approximate wavefunctions becomes one of the dominant computational kernels in planewave based Density Functional Theory electronic structure calculations that involve thousands of atoms. In this context, algorithms that enjoy both excellent scalability and single processor performance properties are much needed. In this paper we present block versions of the Gram-Schmidt method and we show that they are excellent candidates for our purposes. We compare the new approach with the state of the art practice in planewave based calculations and find that it has much to offer, especially when applied on massively parallel supercomputers such as the IBM Blue Gene/P Supercomputer. The new method achieves excellent sustained performance that surpasses 73 TFLOPS (67% of peak) on 8 Blue Gene/P racks (32 768 compute cores), while it enables more than a two fold decrease in run time when compared with the best competing methodology.
Direct numerical simulation of the laminar-turbulent transition at hypersonic flow speeds on a supercomputer

NASA Astrophysics Data System (ADS)

Egorov, I. V.; Novikov, A. V.; Fedorov, A. V.

2017-08-01

A method for direct numerical simulation of three-dimensional unsteady disturbances leading to a laminar-turbulent transition at hypersonic flow speeds is proposed. The simulation relies on solving the full three-dimensional unsteady Navier-Stokes equations. The computational technique is intended for multiprocessor supercomputers and is based on a fully implicit monotone approximation scheme and the Newton-Raphson method for solving systems of nonlinear difference equations. This approach is used to study the development of three-dimensional unstable disturbances in a flat-plate and compression-corner boundary layers in early laminar-turbulent transition stages at the free-stream Mach number M = 5.37. The three-dimensional disturbance field is visualized in order to reveal and discuss features of the instability development at the linear and nonlinear stages. The distribution of the skin friction coefficient is used to detect laminar and transient flow regimes and determine the onset of the laminar-turbulent transition.
Supercomputing in Aerospace

NASA Technical Reports Server (NTRS)

Kutler, Paul; Yee, Helen

1987-01-01

Topics addressed include: numerical aerodynamic simulation; computational mechanics; supercomputers; aerospace propulsion systems; computational modeling in ballistics; turbulence modeling; computational chemistry; computational fluid dynamics; and computational astrophysics.
High Performance Distributed Computing in a Supercomputer Environment: Computational Services and Applications Issues

NASA Technical Reports Server (NTRS)

Kramer, Williams T. C.; Simon, Horst D.

1994-01-01

This tutorial proposes to be a practical guide for the uninitiated to the main topics and themes of high-performance computing (HPC), with particular emphasis to distributed computing. The intent is first to provide some guidance and directions in the rapidly increasing field of scientific computing using both massively parallel and traditional supercomputers. Because of their considerable potential computational power, loosely or tightly coupled clusters of workstations are increasingly considered as a third alternative to both the more conventional supercomputers based on a small number of powerful vector processors, as well as high massively parallel processors. Even though many research issues concerning the effective use of workstation clusters and their integration into a large scale production facility are still unresolved, such clusters are already used for production computing. In this tutorial we will utilize the unique experience made at the NAS facility at NASA Ames Research Center. Over the last five years at NAS massively parallel supercomputers such as the Connection Machines CM-2 and CM-5 from Thinking Machines Corporation and the iPSC/860 (Touchstone Gamma Machine) and Paragon Machines from Intel were used in a production supercomputer center alongside with traditional vector supercomputers such as the Cray Y-MP and C90.
Calculation of Free Energy Landscape in Multi-Dimensions with Hamiltonian-Exchange Umbrella Sampling on Petascale Supercomputer.

PubMed

Jiang, Wei; Luo, Yun; Maragliano, Luca; Roux, Benoît

2012-11-13

An extremely scalable computational strategy is described for calculations of the potential of mean force (PMF) in multidimensions on massively distributed supercomputers. The approach involves coupling thousands of umbrella sampling (US) simulation windows distributed to cover the space of order parameters with a Hamiltonian molecular dynamics replica-exchange (H-REMD) algorithm to enhance the sampling of each simulation. In the present application, US/H-REMD is carried out in a two-dimensional (2D) space and exchanges are attempted alternatively along the two axes corresponding to the two order parameters. The US/H-REMD strategy is implemented on the basis of parallel/parallel multiple copy protocol at the MPI level, and therefore can fully exploit computing power of large-scale supercomputers. Here the novel technique is illustrated using the leadership supercomputer IBM Blue Gene/P with an application to a typical biomolecular calculation of general interest, namely the binding of calcium ions to the small protein Calbindin D9k. The free energy landscape associated with two order parameters, the distance between the ion and its binding pocket and the root-mean-square deviation (rmsd) of the binding pocket relative the crystal structure, was calculated using the US/H-REMD method. The results are then used to estimate the absolute binding free energy of calcium ion to Calbindin D9k. The tests demonstrate that the 2D US/H-REMD scheme greatly accelerates the configurational sampling of the binding pocket, thereby improving the convergence of the potential of mean force calculation.
Supercomputing '91; Proceedings of the 4th Annual Conference on High Performance Computing, Albuquerque, NM, Nov. 18-22, 1991

NASA Technical Reports Server (NTRS)

1991-01-01

Various papers on supercomputing are presented. The general topics addressed include: program analysis/data dependence, memory access, distributed memory code generation, numerical algorithms, supercomputer benchmarks, latency tolerance, parallel programming, applications, processor design, networks, performance tools, mapping and scheduling, characterization affecting performance, parallelism packaging, computing climate change, combinatorial algorithms, hardware and software performance issues, system issues. (No individual items are abstracted in this volume)
DOE Office of Scientific and Technical Information (OSTI.GOV)

Chinn, D J

This month's issue has the following articles: (1) The Edward Teller Centennial--Commentary by George H. Miller; (2) Edward Teller's Century: Celebrating the Man and His Vision--Colleagues at the Laboratory remember Edward Teller, cofounder of Lawrence Livermore, adviser to U.S. presidents, and physicist extraordinaire, on the 100th anniversary of his birth; (3) Quark Theory and Today's Supercomputers: It's a Match--Thanks to the power of BlueGene/L, Livermore has become an epicenter for theoretical advances in particle physics; and (4) The Role of Dentin in Tooth Fracture--Studies on tooth dentin show that its mechanical properties degrade with age.
Code Optimization and Parallelization on the Origins: Looking from Users' Perspective

NASA Technical Reports Server (NTRS)

Chang, Yan-Tyng Sherry; Thigpen, William W. (Technical Monitor)

2002-01-01

Parallel machines are becoming the main compute engines for high performance computing. Despite their increasing popularity, it is still a challenge for most users to learn the basic techniques to optimize/parallelize their codes on such platforms. In this paper, we present some experiences on learning these techniques for the Origin systems at the NASA Advanced Supercomputing Division. Emphasis of this paper will be on a few essential issues (with examples) that general users should master when they work with the Origins as well as other parallel systems.
Application of technology developed for flight simulation at NASA. Langley Research Center

NASA Technical Reports Server (NTRS)

Cleveland, Jeff I., II

1991-01-01

In order to meet the stringent time-critical requirements for real-time man-in-the-loop flight simulation, computer processing operations including mathematical model computation and data input/output to the simulators must be deterministic and be completed in as short a time as possible. Personnel at NASA's Langley Research Center are currently developing the use of supercomputers for simulation mathematical model computation for real-time simulation. This, coupled with the use of an open systems software architecture, will advance the state-of-the-art in real-time flight simulation.
NAS (Numerical Aerodynamic Simulation Program) technical summaries, March 1989 - February 1990

NASA Technical Reports Server (NTRS)

1990-01-01

Given here are selected scientific results from the Numerical Aerodynamic Simulation (NAS) Program's third year of operation. During this year, the scientific community was given access to a Cray-2 and a Cray Y-MP supercomputer. Topics covered include flow field analysis of fighter wing configurations, large-scale ocean modeling, the Space Shuttle flow field, advanced computational fluid dynamics (CFD) codes for rotary-wing airloads and performance prediction, turbulence modeling of separated flows, airloads and acoustics of rotorcraft, vortex-induced nonlinearities on submarines, and standing oblique detonation waves.
Federal Market Information Technology in the Post Flash Crash Era: Roles for Supercomputing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bethel, E. Wes; Leinweber, David; Ruebel, Oliver

2011-09-16

This paper describes collaborative work between active traders, regulators, economists, and supercomputing researchers to replicate and extend investigations of the Flash Crash and other market anomalies in a National Laboratory HPC environment. Our work suggests that supercomputing tools and methods will be valuable to market regulators in achieving the goal of market safety, stability, and security. Research results using high frequency data and analytics are described, and directions for future development are discussed. Currently the key mechanism for preventing catastrophic market action are “circuit breakers.” We believe a more graduated approach, similar to the “yellow light” approach in motorsports tomore » slow down traffic, might be a better way to achieve the same goal. To enable this objective, we study a number of indicators that could foresee hazards in market conditions and explore options to confirm such predictions. Our tests confirm that Volume Synchronized Probability of Informed Trading (VPIN) and a version of volume Herfindahl-Hirschman Index (HHI) for measuring market fragmentation can indeed give strong signals ahead of the Flash Crash event on May 6 2010. This is a preliminary step toward a full-fledged early-warning system for unusual market conditions.« less

Improving Memory Error Handling Using Linux

DOE Office of Scientific and Technical Information (OSTI.GOV)

Carlton, Michael Andrew; Blanchard, Sean P.; Debardeleben, Nathan A.

As supercomputers continue to get faster and more powerful in the future, they will also have more nodes. If nothing is done, then the amount of memory in supercomputer clusters will soon grow large enough that memory failures will be unmanageable to deal with by manually replacing memory DIMMs. "Improving Memory Error Handling Using Linux" is a process oriented method to solve this problem by using the Linux kernel to disable (offline) faulty memory pages containing bad addresses, preventing them from being used again by a process. The process of offlining memory pages simplifies error handling and results in reducingmore » both hardware and manpower costs required to run Los Alamos National Laboratory (LANL) clusters. This process will be necessary for the future of supercomputing to allow the development of exascale computers. It will not be feasible without memory error handling to manually replace the number of DIMMs that will fail daily on a machine consisting of 32-128 petabytes of memory. Testing reveals the process of offlining memory pages works and is relatively simple to use. As more and more testing is conducted, the entire process will be automated within the high-performance computing (HPC) monitoring software, Zenoss, at LANL.« less
Development of a Cloud Resolving Model for Heterogeneous Supercomputers

NASA Astrophysics Data System (ADS)

Sreepathi, S.; Norman, M. R.; Pal, A.; Hannah, W.; Ponder, C.

2017-12-01

A cloud resolving climate model is needed to reduce major systematic errors in climate simulations due to structural uncertainty in numerical treatments of convection - such as convective storm systems. This research describes the porting effort to enable SAM (System for Atmosphere Modeling) cloud resolving model on heterogeneous supercomputers using GPUs (Graphical Processing Units). We have isolated a standalone configuration of SAM that is targeted to be integrated into the DOE ACME (Accelerated Climate Modeling for Energy) Earth System model. We have identified key computational kernels from the model and offloaded them to a GPU using the OpenACC programming model. Furthermore, we are investigating various optimization strategies intended to enhance GPU utilization including loop fusion/fission, coalesced data access and loop refactoring to a higher abstraction level. We will present early performance results, lessons learned as well as optimization strategies. The computational platform used in this study is the Summitdev system, an early testbed that is one generation removed from Summit, the next leadership class supercomputer at Oak Ridge National Laboratory. The system contains 54 nodes wherein each node has 2 IBM POWER8 CPUs and 4 NVIDIA Tesla P100 GPUs. This work is part of a larger project, ACME-MMF component of the U.S. Department of Energy(DOE) Exascale Computing Project. The ACME-MMF approach addresses structural uncertainty in cloud processes by replacing traditional parameterizations with cloud resolving "superparameterization" within each grid cell of global climate model. Super-parameterization dramatically increases arithmetic intensity, making the MMF approach an ideal strategy to achieve good performance on emerging exascale computing architectures. The goal of the project is to integrate superparameterization into ACME, and explore its full potential to scientifically and computationally advance climate simulation and prediction.
Integration Of PanDA Workload Management System With Supercomputers for ATLAS and Data Intensive Science

DOE Office of Scientific and Technical Information (OSTI.GOV)

De, K; Jha, S; Klimentov, A

2016-01-01

The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Managementmore » System for managing the workflow for all data processing on over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3 petaFLOPS, LHC data taking runs require more resources than Grid computing can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in United States, Europe and Russia (in particular with Titan supercomputer at Oak Ridge Leadership Computing Facility (OLCF), MIRA supercomputer at Argonne Leadership Computing Facilities (ALCF), Supercomputer at the National Research Center Kurchatov Institute , IT4 in Ostrava and others). Current approach utilizes modified PanDA pilot framework for job submission to the supercomputers batch queues and local data management, with light-weight MPI wrappers to run single threaded workloads in parallel on LCFs multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms for ALICE and ATLAS experiments and it is in full production for the ATLAS experiment since September 2015. We will present our current accomplishments with running PanDA WMS at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.« less
Color graphics, interactive processing, and the supercomputer

NASA Technical Reports Server (NTRS)

Smith-Taylor, Rudeen

1987-01-01

The development of a common graphics environment for the NASA Langley Research Center user community and the integration of a supercomputer into this environment is examined. The initial computer hardware, the software graphics packages, and their configurations are described. The addition of improved computer graphics capability to the supercomputer, and the utilization of the graphic software and hardware are discussed. Consideration is given to the interactive processing system which supports the computer in an interactive debugging, processing, and graphics environment.
Computational chemistry

NASA Technical Reports Server (NTRS)

Arnold, J. O.

1987-01-01

With the advent of supercomputers, modern computational chemistry algorithms and codes, a powerful tool was created to help fill NASA's continuing need for information on the properties of matter in hostile or unusual environments. Computational resources provided under the National Aerodynamics Simulator (NAS) program were a cornerstone for recent advancements in this field. Properties of gases, materials, and their interactions can be determined from solutions of the governing equations. In the case of gases, for example, radiative transition probabilites per particle, bond-dissociation energies, and rates of simple chemical reactions can be determined computationally as reliably as from experiment. The data are proving to be quite valuable in providing inputs to real-gas flow simulation codes used to compute aerothermodynamic loads on NASA's aeroassist orbital transfer vehicles and a host of problems related to the National Aerospace Plane Program. Although more approximate, similar solutions can be obtained for ensembles of atoms simulating small particles of materials with and without the presence of gases. Computational chemistry has application in studying catalysis, properties of polymers, all of interest to various NASA missions, including those previously mentioned. In addition to discussing these applications of computational chemistry within NASA, the governing equations and the need for supercomputers for their solution is outlined.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Boris, J.P.; Picone, J.M.; Lambrakos, S.G.

The Surveillance, Correlation, and Tracking (SCAT) problem is the computation-limited kernel of future battle-management systems currently being developed, for example, under the Strategic Defense Initiative (SDI). This report shows how high-performance SCAT can be performed in this decade. Estimates suggest that an increase by a factor of at least one thousand in computational capacity will be necessary to track 10/sup 5/ SDI objects in real time. This large improvement is needed because standard algorithms for data organization in important segments of the SCAT problem scale as N/sup 2/ and N/sup 3/, where N is the number of perceived objects. Itmore » is shown that the required speed-up factor can now be achieved because of two new developments: 1) a heterogeneous element supercomputer system based on available parallel-processing technology can account for over one order of magnitude performance improvement today over existing supercomputers; and 2) algorithmic innovations development recently by the NRL Laboratory for Computational Physics will account for another two orders of magnitude improvement. Based on these advances, a comprehensive, high-performance kernel for a simulator/system to perform the SCAT portion of SDI battle management is described.« less
Computing at the speed limit (supercomputers)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bernhard, R.

1982-07-01

The author discusses how unheralded efforts in the United States, mainly in universities, have removed major stumbling blocks to building cost-effective superfast computers for scientific and engineering applications within five years. These computers would have sustained speeds of billions of floating-point operations per second (flops), whereas with the fastest machines today the top sustained speed is only 25 million flops, with bursts to 160 megaflops. Cost-effective superfast machines can be built because of advances in very large-scale integration and the special software needed to program the new machines. VLSI greatly reduces the cost per unit of computing power. The developmentmore » of such computers would come at an opportune time. Although the US leads the world in large-scale computer technology, its supremacy is now threatened, not surprisingly, by the Japanese. Publicized reports indicate that the Japanese government is funding a cooperative effort by commercial computer manufacturers to develop superfast computers-about 1000 times faster than modern supercomputers. The US computer industry, by contrast, has balked at attempting to boost computer power so sharply because of the uncertain market for the machines and the failure of similar projects in the past to show significant results.« less
NSF Commits to Supercomputers.

ERIC Educational Resources Information Center

Waldrop, M. Mitchell

1985-01-01

The National Science Foundation (NSF) has allocated at least $200 million over the next five years to support four new supercomputer centers. Issues and trends related to this NSF initiative are examined. (JN)
Mira: Argonne's 10-petaflops supercomputer

ScienceCinema

Papka, Michael; Coghlan, Susan; Isaacs, Eric; Peters, Mark; Messina, Paul

2018-02-13

Mira, Argonne's petascale IBM Blue Gene/Q system, ushers in a new era of scientific supercomputing at the Argonne Leadership Computing Facility. An engineering marvel, the 10-petaflops supercomputer is capable of carrying out 10 quadrillion calculations per second. As a machine for open science, any researcher with a question that requires large-scale computing resources can submit a proposal for time on Mira, typically in allocations of millions of core-hours, to run programs for their experiments. This adds up to billions of hours of computing time per year.
Adventures in Computational Grids

NASA Technical Reports Server (NTRS)

Walatka, Pamela P.; Biegel, Bryan A. (Technical Monitor)

2002-01-01

Sometimes one supercomputer is not enough. Or your local supercomputers are busy, or not configured for your job. Or you don't have any supercomputers. You might be trying to simulate worldwide weather changes in real time, requiring more compute power than you could get from any one machine. Or you might be collecting microbiological samples on an island, and need to examine them with a special microscope located on the other side of the continent. These are the times when you need a computational grid.
Mira: Argonne's 10-petaflops supercomputer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Papka, Michael; Coghlan, Susan; Isaacs, Eric

2013-07-03

Mira, Argonne's petascale IBM Blue Gene/Q system, ushers in a new era of scientific supercomputing at the Argonne Leadership Computing Facility. An engineering marvel, the 10-petaflops supercomputer is capable of carrying out 10 quadrillion calculations per second. As a machine for open science, any researcher with a question that requires large-scale computing resources can submit a proposal for time on Mira, typically in allocations of millions of core-hours, to run programs for their experiments. This adds up to billions of hours of computing time per year.
Breakthrough: NETL's Simulation-Based Engineering User Center (SBEUC)

ScienceCinema

Guenther, Chris

2018-05-23

The National Energy Technology Laboratory relies on supercomputers to develop many novel ideas that become tomorrow's energy solutions. Supercomputers provide a cost-effective, efficient platform for research and usher technologies into widespread use faster to bring benefits to the nation. In 2013, Secretary of Energy Dr. Ernest Moniz dedicated NETL's new supercomputer, the Simulation Based Engineering User Center, or SBEUC. The SBEUC is dedicated to fossil energy research and is a collaborative tool for all of NETL and our regional university partners.
A high level language for a high performance computer

NASA Technical Reports Server (NTRS)

Perrott, R. H.

1978-01-01

The proposed computational aerodynamic facility will join the ranks of the supercomputers due to its architecture and increased execution speed. At present, the languages used to program these supercomputers have been modifications of programming languages which were designed many years ago for sequential machines. A new programming language should be developed based on the techniques which have proved valuable for sequential programming languages and incorporating the algorithmic techniques required for these supercomputers. The design objectives for such a language are outlined.
Floating point arithmetic in future supercomputers

NASA Technical Reports Server (NTRS)

Bailey, David H.; Barton, John T.; Simon, Horst D.; Fouts, Martin J.

1989-01-01

Considerations in the floating-point design of a supercomputer are discussed. Particular attention is given to word size, hardware support for extended precision, format, and accuracy characteristics. These issues are discussed from the perspective of the Numerical Aerodynamic Simulation Systems Division at NASA Ames. The features believed to be most important for a future supercomputer floating-point design include: (1) a 64-bit IEEE floating-point format with 11 exponent bits, 52 mantissa bits, and one sign bit and (2) hardware support for reasonably fast double-precision arithmetic.
Breakthrough: NETL's Simulation-Based Engineering User Center (SBEUC)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Guenther, Chris

The National Energy Technology Laboratory relies on supercomputers to develop many novel ideas that become tomorrow's energy solutions. Supercomputers provide a cost-effective, efficient platform for research and usher technologies into widespread use faster to bring benefits to the nation. In 2013, Secretary of Energy Dr. Ernest Moniz dedicated NETL's new supercomputer, the Simulation Based Engineering User Center, or SBEUC. The SBEUC is dedicated to fossil energy research and is a collaborative tool for all of NETL and our regional university partners.
Constructing Neuronal Network Models in Massively Parallel Environments.

PubMed

Ippen, Tammo; Eppler, Jochen M; Plesser, Hans E; Diesmann, Markus

2017-01-01

Recent advances in the development of data structures to represent spiking neuron network models enable us to exploit the complete memory of petascale computers for a single brain-scale network simulation. In this work, we investigate how well we can exploit the computing power of such supercomputers for the creation of neuronal networks. Using an established benchmark, we divide the runtime of simulation code into the phase of network construction and the phase during which the dynamical state is advanced in time. We find that on multi-core compute nodes network creation scales well with process-parallel code but exhibits a prohibitively large memory consumption. Thread-parallel network creation, in contrast, exhibits speedup only up to a small number of threads but has little overhead in terms of memory. We further observe that the algorithms creating instances of model neurons and their connections scale well for networks of ten thousand neurons, but do not show the same speedup for networks of millions of neurons. Our work uncovers that the lack of scaling of thread-parallel network creation is due to inadequate memory allocation strategies and demonstrates that thread-optimized memory allocators recover excellent scaling. An analysis of the loop order used for network construction reveals that more complex tests on the locality of operations significantly improve scaling and reduce runtime by allowing construction algorithms to step through large networks more efficiently than in existing code. The combination of these techniques increases performance by an order of magnitude and harnesses the increasingly parallel compute power of the compute nodes in high-performance clusters and supercomputers.
Constructing Neuronal Network Models in Massively Parallel Environments

PubMed Central

Ippen, Tammo; Eppler, Jochen M.; Plesser, Hans E.; Diesmann, Markus

2017-01-01

Recent advances in the development of data structures to represent spiking neuron network models enable us to exploit the complete memory of petascale computers for a single brain-scale network simulation. In this work, we investigate how well we can exploit the computing power of such supercomputers for the creation of neuronal networks. Using an established benchmark, we divide the runtime of simulation code into the phase of network construction and the phase during which the dynamical state is advanced in time. We find that on multi-core compute nodes network creation scales well with process-parallel code but exhibits a prohibitively large memory consumption. Thread-parallel network creation, in contrast, exhibits speedup only up to a small number of threads but has little overhead in terms of memory. We further observe that the algorithms creating instances of model neurons and their connections scale well for networks of ten thousand neurons, but do not show the same speedup for networks of millions of neurons. Our work uncovers that the lack of scaling of thread-parallel network creation is due to inadequate memory allocation strategies and demonstrates that thread-optimized memory allocators recover excellent scaling. An analysis of the loop order used for network construction reveals that more complex tests on the locality of operations significantly improve scaling and reduce runtime by allowing construction algorithms to step through large networks more efficiently than in existing code. The combination of these techniques increases performance by an order of magnitude and harnesses the increasingly parallel compute power of the compute nodes in high-performance clusters and supercomputers. PMID:28559808
Integration of Panda Workload Management System with supercomputers

NASA Astrophysics Data System (ADS)

De, K.; Jha, S.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Nilsson, P.; Novikov, A.; Oleynik, D.; Panitkin, S.; Poyda, A.; Read, K. F.; Ryabinkin, E.; Teslyuk, A.; Velikhov, V.; Wells, J. C.; Wenaus, T.

2016-09-01

The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 140 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250000 cores with a peak performance of 0.3+ petaFLOPS, next LHC data taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in United States, Europe and Russia (in particular with Titan supercomputer at Oak Ridge Leadership Computing Facility (OLCF), Supercomputer at the National Research Center "Kurchatov Institute", IT4 in Ostrava, and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers batch queues and local data management, with light-weight MPI wrappers to run singlethreaded workloads in parallel on Titan's multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms. We will present our current accomplishments in running PanDA WMS at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facility's infrastructure for High Energy and Nuclear Physics, as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.
Freud: a software suite for high-throughput simulation analysis

NASA Astrophysics Data System (ADS)

Harper, Eric; Spellings, Matthew; Anderson, Joshua; Glotzer, Sharon

Computer simulation is an indispensable tool for the study of a wide variety of systems. As simulations scale to fill petascale and exascale supercomputing clusters, so too does the size of the data produced, as well as the difficulty in analyzing these data. We present Freud, an analysis software suite for efficient analysis of simulation data. Freud makes no assumptions about the system being analyzed, allowing for general analysis methods to be applied to nearly any type of simulation. Freud includes standard analysis methods such as the radial distribution function, as well as new methods including the potential of mean force and torque and local crystal environment analysis. Freud combines a Python interface with fast, parallel C + + analysis routines to run efficiently on laptops, workstations, and supercomputing clusters. Data analysis on clusters reduces data transfer requirements, a prohibitive cost for petascale computing. Used in conjunction with simulation software, Freud allows for smart simulations that adapt to the current state of the system, enabling the study of phenomena such as nucleation and growth, intelligent investigation of phases and phase transitions, and determination of effective pair potentials.
Climate@Home: Crowdsourcing Climate Change Research

NASA Astrophysics Data System (ADS)

Xu, C.; Yang, C.; Li, J.; Sun, M.; Bambacus, M.

2011-12-01

Climate change deeply impacts human wellbeing. Significant amounts of resources have been invested in building super-computers that are capable of running advanced climate models, which help scientists understand climate change mechanisms, and predict its trend. Although climate change influences all human beings, the general public is largely excluded from the research. On the other hand, scientists are eagerly seeking communication mediums for effectively enlightening the public on climate change and its consequences. The Climate@Home project is devoted to connect the two ends with an innovative solution: crowdsourcing climate computing to the general public by harvesting volunteered computing resources from the participants. A distributed web-based computing platform will be built to support climate computing, and the general public can 'plug-in' their personal computers to participate in the research. People contribute the spare computing power of their computers to run a computer model, which is used by scientists to predict climate change. Traditionally, only super-computers could handle such a large computing processing load. By orchestrating massive amounts of personal computers to perform atomized data processing tasks, investments on new super-computers, energy consumed by super-computers, and carbon release from super-computers are reduced. Meanwhile, the platform forms a social network of climate researchers and the general public, which may be leveraged to raise climate awareness among the participants. A portal is to be built as the gateway to the climate@home project. Three types of roles and the corresponding functionalities are designed and supported. The end users include the citizen participants, climate scientists, and project managers. Citizen participants connect their computing resources to the platform by downloading and installing a computing engine on their personal computers. Computer climate models are defined at the server side. Climate scientists configure computer model parameters through the portal user interface. After model configuration, scientists then launch the computing task. Next, data is atomized and distributed to computing engines that are running on citizen participants' computers. Scientists will receive notifications on the completion of computing tasks, and examine modeling results via visualization modules of the portal. Computing tasks, computing resources, and participants are managed by project managers via portal tools. A portal prototype has been built for proof of concept. Three forums have been setup for different groups of users to share information on science aspect, technology aspect, and educational outreach aspect. A facebook account has been setup to distribute messages via the most popular social networking platform. New treads are synchronized from the forums to facebook. A mapping tool displays geographic locations of the participants and the status of tasks on each client node. A group of users have been invited to test functions such as forums, blogs, and computing resource monitoring.

Survey of new vector computers: The CRAY 1S from CRAY research; the CYBER 205 from CDC and the parallel computer from ICL - architecture and programming

NASA Technical Reports Server (NTRS)

Gentzsch, W.

1982-01-01

Problems which can arise with vector and parallel computers are discussed in a user oriented context. Emphasis is placed on the algorithms used and the programming techniques adopted. Three recently developed supercomputers are examined and typical application examples are given in CRAY FORTRAN, CYBER 205 FORTRAN and DAP (distributed array processor) FORTRAN. The systems performance is compared. The addition of parts of two N x N arrays is considered. The influence of the architecture on the algorithms and programming language is demonstrated. Numerical analysis of magnetohydrodynamic differential equations by an explicit difference method is illustrated, showing very good results for all three systems. The prognosis for supercomputer development is assessed.
FAST: A multi-processed environment for visualization of computational fluid dynamics

NASA Technical Reports Server (NTRS)

Bancroft, Gordon V.; Merritt, Fergus J.; Plessel, Todd C.; Kelaita, Paul G.; Mccabe, R. Kevin

1991-01-01

Three-dimensional, unsteady, multi-zoned fluid dynamics simulations over full scale aircraft are typical of the problems being investigated at NASA Ames' Numerical Aerodynamic Simulation (NAS) facility on CRAY2 and CRAY-YMP supercomputers. With multiple processor workstations available in the 10-30 Mflop range, we feel that these new developments in scientific computing warrant a new approach to the design and implementation of analysis tools. These larger, more complex problems create a need for new visualization techniques not possible with the existing software or systems available as of this writing. The visualization techniques will change as the supercomputing environment, and hence the scientific methods employed, evolves even further. The Flow Analysis Software Toolkit (FAST), an implementation of a software system for fluid mechanics analysis, is discussed.
CFD Research, Parallel Computation and Aerodynamic Optimization

NASA Technical Reports Server (NTRS)

Ryan, James S.

1995-01-01

During the last five years, CFD has matured substantially. Pure CFD research remains to be done, but much of the focus has shifted to integration of CFD into the design process. The work under these cooperative agreements reflects this trend. The recent work, and work which is planned, is designed to enhance the competitiveness of the US aerospace industry. CFD and optimization approaches are being developed and tested, so that the industry can better choose which methods to adopt in their design processes. The range of computer architectures has been dramatically broadened, as the assumption that only huge vector supercomputers could be useful has faded. Today, researchers and industry can trade off time, cost, and availability, choosing vector supercomputers, scalable parallel architectures, networked workstations, or heterogenous combinations of these to complete required computations efficiently.
The Computer Simulation of Liquids by Molecular Dynamics.

ERIC Educational Resources Information Center

Smith, W.

1987-01-01

Proposes a mathematical computer model for the behavior of liquids using the classical dynamic principles of Sir Isaac Newton and the molecular dynamics method invented by other scientists. Concludes that other applications will be successful using supercomputers to go beyond simple Newtonian physics. (CW)
Energy Efficient Supercomputing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Anypas, Katie

2014-10-17

Katie Anypas, Head of NERSC's Services Department discusses the Lab's research into developing increasingly powerful and energy efficient supercomputers at our '8 Big Ideas' Science at the Theater event on October 8th, 2014, in Oakland, California.
Energy Efficient Supercomputing

ScienceCinema

Anypas, Katie

2018-05-07

Katie Anypas, Head of NERSC's Services Department discusses the Lab's research into developing increasingly powerful and energy efficient supercomputers at our '8 Big Ideas' Science at the Theater event on October 8th, 2014, in Oakland, California.
Job Management Requirements for NAS Parallel Systems and Clusters

NASA Technical Reports Server (NTRS)

Saphir, William; Tanner, Leigh Ann; Traversat, Bernard

1995-01-01

A job management system is a critical component of a production supercomputing environment, permitting oversubscribed resources to be shared fairly and efficiently. Job management systems that were originally designed for traditional vector supercomputers are not appropriate for the distributed-memory parallel supercomputers that are becoming increasingly important in the high performance computing industry. Newer job management systems offer new functionality but do not solve fundamental problems. We address some of the main issues in resource allocation and job scheduling we have encountered on two parallel computers - a 160-node IBM SP2 and a cluster of 20 high performance workstations located at the Numerical Aerodynamic Simulation facility. We describe the requirements for resource allocation and job management that are necessary to provide a production supercomputing environment on these machines, prioritizing according to difficulty and importance, and advocating a return to fundamental issues.
Approaching the exa-scale: a real-world evaluation of rendering extremely large data sets

DOE Office of Scientific and Technical Information (OSTI.GOV)

Patchett, John M; Ahrens, James P; Lo, Li - Ta

2010-10-15

Extremely large scale analysis is becoming increasingly important as supercomputers and their simulations move from petascale to exascale. The lack of dedicated hardware acceleration for rendering on today's supercomputing platforms motivates our detailed evaluation of the possibility of interactive rendering on the supercomputer. In order to facilitate our understanding of rendering on the supercomputing platform, we focus on scalability of rendering algorithms and architecture envisioned for exascale datasets. To understand tradeoffs for dealing with extremely large datasets, we compare three different rendering algorithms for large polygonal data: software based ray tracing, software based rasterization and hardware accelerated rasterization. We presentmore » a case study of strong and weak scaling of rendering extremely large data on both GPU and CPU based parallel supercomputers using Para View, a parallel visualization tool. Wc use three different data sets: two synthetic and one from a scientific application. At an extreme scale, algorithmic rendering choices make a difference and should be considered while approaching exascale computing, visualization, and analysis. We find software based ray-tracing offers a viable approach for scalable rendering of the projected future massive data sizes.« less
Supercomputing Drives Innovation - Continuum Magazine | NREL

Science.gov Websites

years, NREL scientists have used supercomputers to simulate 3D models of the primary enzymes and Scientist, discuss a 3D model of wind plant aerodynamics, showing low velocity wakes and impact on
Parallel iterative methods for sparse linear and nonlinear equations

NASA Technical Reports Server (NTRS)

Saad, Youcef

1989-01-01

As three-dimensional models are gaining importance, iterative methods will become almost mandatory. Among these, preconditioned Krylov subspace methods have been viewed as the most efficient and reliable, when solving linear as well as nonlinear systems of equations. There has been several different approaches taken to adapt iterative methods for supercomputers. Some of these approaches are discussed and the methods that deal more specifically with general unstructured sparse matrices, such as those arising from finite element methods, are emphasized.
Communication Characterization and Optimization of Applications Using Topology-Aware Task Mapping on Large Supercomputers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sreepathi, Sarat; D'Azevedo, Eduardo; Philip, Bobby

On large supercomputers, the job scheduling systems may assign a non-contiguous node allocation for user applications depending on available resources. With parallel applications using MPI (Message Passing Interface), the default process ordering does not take into account the actual physical node layout available to the application. This contributes to non-locality in terms of physical network topology and impacts communication performance of the application. In order to mitigate such performance penalties, this work describes techniques to identify suitable task mapping that takes the layout of the allocated nodes as well as the application's communication behavior into account. During the first phasemore » of this research, we instrumented and collected performance data to characterize communication behavior of critical US DOE (United States - Department of Energy) applications using an augmented version of the mpiP tool. Subsequently, we developed several reordering methods (spectral bisection, neighbor join tree etc.) to combine node layout and application communication data for optimized task placement. We developed a tool called mpiAproxy to facilitate detailed evaluation of the various reordering algorithms without requiring full application executions. This work presents a comprehensive performance evaluation (14,000 experiments) of the various task mapping techniques in lowering communication costs on Titan, the leadership class supercomputer at Oak Ridge National Laboratory.« less
Wavefunction Properties and Electronic Band Structures of High-Mobility Semiconductor Nanosheet MoS2

NASA Astrophysics Data System (ADS)

Baik, Seung Su; Lee, Hee Sung; Im, Seongil; Choi, Hyoung Joon; Ccsaemp Team; Edl Team

2014-03-01

Molybdenum disulfide (MoS2) nanosheet is regarded as one of the most promising alternatives to the current semiconductors due to its significant band-gap and electron-mobility enhancement upon exfoliating. To elucidate such thickness-dependent properties, we have studied the electronic band structures of bulk and monolayer MoS2 by using the first-principles density-functional method as implemented in the SIESTA code. Based on the wavefunction analyses at the conduction band minimum (CBM) points, we have investigated possible origins of mobility difference between bulk and monolayer MoS2. We provide formation energies of substitutional impurities at the Mo and S sites, and discuss feasible electron sources which may induce a significant difference in the carrier lifetime. This work was supported by NRF of Korea (Grant Nos. 2009-0079462 and 2011-0018306), Nano-Material Technology Development Program (2012M3a7B4034985), and KISTI supercomputing center (Project No. KSC-2013-C3-008). Center for Computational Studies of Advanced Electronic Material Properties.
Scalable real space pseudopotential density functional codes for materials in the exascale regime

NASA Astrophysics Data System (ADS)

Lena, Charles; Chelikowsky, James; Schofield, Grady; Biller, Ariel; Kronik, Leeor; Saad, Yousef; Deslippe, Jack

Real-space pseudopotential density functional theory has proven to be an efficient method for computing the properties of matter in many different states and geometries, including liquids, wires, slabs, and clusters with and without spin polarization. Fully self-consistent solutions using this approach have been routinely obtained for systems with thousands of atoms. Yet, there are many systems of notable larger sizes where quantum mechanical accuracy is desired, but scalability proves to be a hindrance. Such systems include large biological molecules, complex nanostructures, or mismatched interfaces. We will present an overview of our new massively parallel algorithms, which offer improved scalability in preparation for exascale supercomputing. We will illustrate these algorithms by considering the electronic structure of a Si nanocrystal exceeding 104 atoms. Support provided by the SciDAC program, Department of Energy, Office of Science, Advanced Scientific Computing Research and Basic Energy Sciences. Grant Numbers DE-SC0008877 (Austin) and DE-FG02-12ER4 (Berkeley).
Parallelized reliability estimation of reconfigurable computer networks

NASA Technical Reports Server (NTRS)

Nicol, David M.; Das, Subhendu; Palumbo, Dan

1990-01-01

A parallelized system, ASSURE, for computing the reliability of embedded avionics flight control systems which are able to reconfigure themselves in the event of failure is described. ASSURE accepts a grammar that describes a reliability semi-Markov state-space. From this it creates a parallel program that simultaneously generates and analyzes the state-space, placing upper and lower bounds on the probability of system failure. ASSURE is implemented on a 32-node Intel iPSC/860, and has achieved high processor efficiencies on real problems. Through a combination of improved algorithms, exploitation of parallelism, and use of an advanced microprocessor architecture, ASSURE has reduced the execution time on substantial problems by a factor of one thousand over previous workstation implementations. Furthermore, ASSURE's parallel execution rate on the iPSC/860 is an order of magnitude faster than its serial execution rate on a Cray-2 supercomputer. While dynamic load balancing is necessary for ASSURE's good performance, it is needed only infrequently; the particular method of load balancing used does not substantially affect performance.
Exploiting Thread Parallelism for Ocean Modeling on Cray XC Supercomputers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sarje, Abhinav; Jacobsen, Douglas W.; Williams, Samuel W.

The incorporation of increasing core counts in modern processors used to build state-of-the-art supercomputers is driving application development towards exploitation of thread parallelism, in addition to distributed memory parallelism, with the goal of delivering efficient high-performance codes. In this work we describe the exploitation of threading and our experiences with it with respect to a real-world ocean modeling application code, MPAS-Ocean. We present detailed performance analysis and comparisons of various approaches and configurations for threading on the Cray XC series supercomputers.
A mass storage system for supercomputers based on Unix

NASA Technical Reports Server (NTRS)

Richards, J.; Kummell, T.; Zarlengo, D. G.

1988-01-01

The authors present the design, implementation, and utilization of a large mass storage subsystem (MSS) for the numerical aerodynamics simulation. The MSS supports a large networked, multivendor Unix-based supercomputing facility. The MSS at Ames Research Center provides all processors on the numerical aerodynamics system processing network, from workstations to supercomputers, the ability to store large amounts of data in a highly accessible, long-term repository. The MSS uses Unix System V and is capable of storing hundreds of thousands of files ranging from a few bytes to 2 Gb in size.
Supercomputer algorithms for efficient linear octree encoding of three-dimensional brain images.

PubMed

Berger, S B; Reis, D J

1995-02-01

We designed and implemented algorithms for three-dimensional (3-D) reconstruction of brain images from serial sections using two important supercomputer architectures, vector and parallel. These architectures were represented by the Cray YMP and Connection Machine CM-2, respectively. The programs operated on linear octree representations of the brain data sets, and achieved 500-800 times acceleration when compared with a conventional laboratory workstation. As the need for higher resolution data sets increases, supercomputer algorithms may offer a means of performing 3-D reconstruction well above current experimental limits.
Intelligent supercomputers: the Japanese computer sputnik

DOE Office of Scientific and Technical Information (OSTI.GOV)

Walter, G.

1983-11-01

Japan's government-supported fifth-generation computer project has had a pronounced effect on the American computer and information systems industry. The US firms are intensifying their research on and production of intelligent supercomputers, a combination of computer architecture and artificial intelligence software programs. While the present generation of computers is built for the processing of numbers, the new supercomputers will be designed specifically for the solution of symbolic problems and the use of artificial intelligence software. This article discusses new and exciting developments that will increase computer capabilities in the 1990s. 4 references.
Los Alamos on Radio Café: Ludmil Alexandrov

DOE Office of Scientific and Technical Information (OSTI.GOV)

Domandi, Mary-Charlotte; Alexandrov, Ludmil

In a creative breakthrough in cancer research, Ludmil Alexandrov, the J. Robert Oppenheimer Distinguished Postdoctoral Fellow at Los Alamos National Laboratory, combines Big Data, supercomputing and machine-learning to identify the telltale mutations of cancer. Knowing these mutational signatures can help researchers develop new methods of prevention.
A numerical code for the simulation of non-equilibrium chemically reacting flows on hybrid CPU-GPU clusters

NASA Astrophysics Data System (ADS)

Kudryavtsev, Alexey N.; Kashkovsky, Alexander V.; Borisov, Semyon P.; Shershnev, Anton A.

2017-10-01

In the present work a computer code RCFS for numerical simulation of chemically reacting compressible flows on hybrid CPU/GPU supercomputers is developed. It solves 3D unsteady Euler equations for multispecies chemically reacting flows in general curvilinear coordinates using shock-capturing TVD schemes. Time advancement is carried out using the explicit Runge-Kutta TVD schemes. Program implementation uses CUDA application programming interface to perform GPU computations. Data between GPUs is distributed via domain decomposition technique. The developed code is verified on the number of test cases including supersonic flow over a cylinder.

ATLAS computing on CSCS HPC

NASA Astrophysics Data System (ADS)

Filipcic, A.; Haug, S.; Hostettler, M.; Walker, R.; Weber, M.

2015-12-01

The Piz Daint Cray XC30 HPC system at CSCS, the Swiss National Supercomputing centre, was the highest ranked European system on TOP500 in 2014, also featuring GPU accelerators. Event generation and detector simulation for the ATLAS experiment have been enabled for this machine. We report on the technical solutions, performance, HPC policy challenges and possible future opportunities for HEP on extreme HPC systems. In particular a custom made integration to the ATLAS job submission system has been developed via the Advanced Resource Connector (ARC) middleware. Furthermore, a partial GPU acceleration of the Geant4 detector simulations has been implemented.
Advanced flight computers for planetary exploration

NASA Technical Reports Server (NTRS)

Stephenson, R. Rhoads

1988-01-01

Research concerning flight computers for use on interplanetary probes is reviewed. The history of these computers from the Viking mission to the present is outlined. The differences between ground commercial computers and computers for planetary exploration are listed. The development of a computer for the Mariner Mark II comet rendezvous asteroid flyby mission is described. Various aspects of recently developed computer systems are examined, including the Max real time, embedded computer, a hypercube distributed supercomputer, a SAR data processor, a processor for the High Resolution IR Imaging Spectrometer, and a robotic vision multiresolution pyramid machine for processsing images obtained by a Mars Rover.
Introducing Mira, Argonne's Next-Generation Supercomputer

DOE Office of Scientific and Technical Information (OSTI.GOV)

None

2013-03-19

Mira, the new petascale IBM Blue Gene/Q system installed at the ALCF, will usher in a new era of scientific supercomputing. An engineering marvel, the 10-petaflops machine is capable of carrying out 10 quadrillion calculations per second.
Green Supercomputing at Argonne

ScienceCinema

Pete Beckman

2017-12-09

Pete Beckman, head of Argonne's Leadership Computing Facility (ALCF) talks about Argonne National Laboratory's green supercomputingâeverything from designing algorithms to use fewer kilowatts per operation to using cold Chicago winter air to cool the machine more efficiently.
TOP500 Supercomputers for June 2003

DOE Office of Scientific and Technical Information (OSTI.GOV)

Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack

2003-06-23

21st Edition of TOP500 List of World's Fastest Supercomputers Released MANNHEIM, Germany; KNOXVILLE, Tenn.;&BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 21st edition of the TOP500 list of the world's fastest supercomputers was released today (June 23, 2003). The Earth Simulator supercomputer built by NEC and installed last year at the Earth Simulator Center in Yokohama, Japan, with its Linpack benchmark performance of 35.86 Tflop/s (teraflops or trillions of calculations per second), retains the number one position. The number 2 position is held by the re-measured ASCI Q system at Los Alamosmore » National Laboratory. With 13.88 Tflop/s, it is the second system ever to exceed the 10 Tflop/smark. ASCIQ was built by Hewlett-Packard and is based on the AlphaServerSC computer system.« less
Characterizing output bottlenecks in a supercomputer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xie, Bing; Chase, Jeffrey; Dillow, David A

2012-01-01

Supercomputer I/O loads are often dominated by writes. HPC (High Performance Computing) file systems are designed to absorb these bursty outputs at high bandwidth through massive parallelism. However, the delivered write bandwidth often falls well below the peak. This paper characterizes the data absorption behavior of a center-wide shared Lustre parallel file system on the Jaguar supercomputer. We use a statistical methodology to address the challenges of accurately measuring a shared machine under production load and to obtain the distribution of bandwidth across samples of compute nodes, storage targets, and time intervals. We observe and quantify limitations from competing traffic,more » contention on storage servers and I/O routers, concurrency limitations in the client compute node operating systems, and the impact of variance (stragglers) on coupled output such as striping. We then examine the implications of our results for application performance and the design of I/O middleware systems on shared supercomputers.« less
Dynamic rupture simulations on complex fault zone structures with off-fault plasticity using the ADER-DG method

NASA Astrophysics Data System (ADS)

Wollherr, Stephanie; Gabriel, Alice-Agnes; Igel, Heiner

2015-04-01

In dynamic rupture models, high stress concentrations at rupture fronts have to to be accommodated by off-fault inelastic processes such as plastic deformation. As presented in (Roten et al., 2014), incorporating plastic yielding can significantly reduce earlier predictions of ground motions in the Los Angeles Basin. Further, an inelastic response of materials surrounding a fault potentially has a strong impact on surface displacement and is therefore a key aspect in understanding the triggering of tsunamis through floor uplifting. We present an implementation of off-fault-plasticity and its verification for the software package SeisSol, an arbitrary high-order derivative discontinuous Galerkin (ADER-DG) method. The software recently reached multi-petaflop/s performance on some of the largest supercomputers worldwide and was a Gordon Bell prize finalist application in 2014 (Heinecke et al., 2014). For the nonelastic calculations we impose a Drucker-Prager yield criterion in shear stress with a viscous regularization following (Andrews, 2005). It permits the smooth relaxation of high stress concentrations induced in the dynamic rupture process. We verify the implementation by comparison to the SCEC/USGS Spontaneous Rupture Code Verification Benchmarks. The results of test problem TPV13 with a 60-degree dipping normal fault show that SeisSol is in good accordance with other codes. Additionally we aim to explore the numerical characteristics of the off-fault plasticity implementation by performing convergence tests for the 2D code. The ADER-DG method is especially suited for complex geometries by using unstructured tetrahedral meshes. Local adaptation of the mesh resolution enables a fine sampling of the cohesive zone on the fault while simultaneously satisfying the dispersion requirements of wave propagation away from the fault. In this context we will investigate the influence of off-fault-plasticity on geometrically complex fault zone structures like subduction zones or branched faults. Studying the interplay of stress conditions and angle dependence of neighbouring branches including inelastic material behaviour and its effects on rupture jumps and seismic activation helps to advance our understanding of earthquake source processes. An application is the simulation of a real large-scale subduction zone scenario including plasticity to validate the coupling of our dynamic rupture calculations to a tsunami model in the framework of the ASCETE project (http://www.ascete.de/). Andrews, D. J. (2005): Rupture dynamics with energy loss outside the slip zone, J. Geophys. Res., 110, B01307. Heinecke, A. (2014), A. Breuer, S. Rettenberger, M. Bader, A.-A. Gabriel, C. Pelties, A. Bode, W. Barth, K. Vaidyanathan, M. Smelyanskiy and P. Dubey: Petascale High Order Dynamic Rupture Earthquake Simulations on Heterogeneous Supercomputers. In Supercomputing 2014, The International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, New Orleans, LA, USA, November 2014. Roten, D. (2014), K. B. Olsen, S.M. Day, Y. Cui, and D. Fäh: Expected seismic shaking in Los Angeles reduced by San Andreas fault zone plasticity, Geophys. Res. Lett., 41, 2769-2777.
Vorticity dynamics in an intracranial aneurysm

NASA Astrophysics Data System (ADS)

Le, Trung; Borazjani, Iman; Sotiropoulos, Fotis

2008-11-01

Direct Numerical Simulation is carried out to investigate the vortex dynamics of physiologic pulsatile flow in an intracranial aneurysm. The numerical solver is based on the CURVIB (curvilinear grid/immersed boundary method) approach developed by Ge and Sotiropoulos, J. Comp. Physics, 225 (2007) and is applied to simulate the blood flow in a grid with 8 million grid nodes. The aneurysm geometry is extracted from MRI images from common carotid artery (CCA) of a rabbit (courtesy Dr.Kallmes, Mayo Clinic). The simulation reveals the formation of a strong vortex ring at the proximal end during accelerated flow phase. The vortical structure advances toward the aneurysm dome forming a distinct inclined circular ring that connects with the proximal wall via two long streamwise vortical structures. During the reverse flow phase, the back flow results to the formation of another ring at the distal end that advances in the opposite direction toward the proximal end and interacts with the vortical structures that were created during the accelerated phase. The basic vortex formation mechanism is similar to that observed by Webster and Longmire (1998) for pulsed flow through inclined nozzles. The similarities between the two flows will be discussed and the vorticity dynamics of an aneurysm and inclined nozzle flows will be analyzed.This work was supported in part by the University of Minnesota Supercomputing Institute.
INTEGRATION OF PANDA WORKLOAD MANAGEMENT SYSTEM WITH SUPERCOMPUTERS

DOE Office of Scientific and Technical Information (OSTI.GOV)

De, K; Jha, S; Maeno, T

Abstract The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the funda- mental nature of matter and the basic forces that shape our universe, and were recently credited for the dis- covery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Datamore » Analysis) Workload Management System for managing the workflow for all data processing on over 140 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data cen- ters are physically scattered all over the world. While PanDA currently uses more than 250000 cores with a peak performance of 0.3+ petaFLOPS, next LHC data taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in United States, Europe and Russia (in particular with Titan supercomputer at Oak Ridge Leadership Com- puting Facility (OLCF), Supercomputer at the National Research Center Kurchatov Institute , IT4 in Ostrava, and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers batch queues and local data management, with light-weight MPI wrappers to run single- threaded workloads in parallel on Titan s multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms. We will present our current accom- plishments in running PanDA WMS at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facility s infrastructure for High Energy and Nuclear Physics, as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.« less
Tools for 3D scientific visualization in computational aerodynamics at NASA Ames Research Center

NASA Technical Reports Server (NTRS)

Bancroft, Gordon; Plessel, Todd; Merritt, Fergus; Watson, Val

1989-01-01

Hardware, software, and techniques used by the Fluid Dynamics Division (NASA) for performing visualization of computational aerodynamics, which can be applied to the visualization of flow fields from computer simulations of fluid dynamics about the Space Shuttle, are discussed. Three visualization techniques applied, post-processing, tracking, and steering, are described, as well as the post-processing software packages used, PLOT3D, SURF (Surface Modeller), GAS (Graphical Animation System), and FAST (Flow Analysis software Toolkit). Using post-processing methods a flow simulation was executed on a supercomputer and, after the simulation was complete, the results were processed for viewing. It is shown that the high-resolution, high-performance three-dimensional workstation combined with specially developed display and animation software provides a good tool for analyzing flow field solutions obtained from supercomputers.
A multidimensional finite element method for CFD

NASA Technical Reports Server (NTRS)

Pepper, Darrell W.; Humphrey, Joseph W.

1991-01-01

A finite element method is used to solve the equations of motion for 2- and 3-D fluid flow. The time-dependent equations are solved explicitly using quadrilateral (2-D) and hexahedral (3-D) elements, mass lumping, and reduced integration. A Petrov-Galerkin technique is applied to the advection terms. The method requires a minimum of computational storage, executes quickly, and is scalable for execution on computer systems ranging from PCs to supercomputers.
Current state and future direction of computer systems at NASA Langley Research Center

NASA Technical Reports Server (NTRS)

Rogers, James L. (Editor); Tucker, Jerry H. (Editor)

1992-01-01

Computer systems have advanced at a rate unmatched by any other area of technology. As performance has dramatically increased there has been an equally dramatic reduction in cost. This constant cost performance improvement has precipitated the pervasiveness of computer systems into virtually all areas of technology. This improvement is due primarily to advances in microelectronics. Most people are now convinced that the new generation of supercomputers will be built using a large number (possibly thousands) of high performance microprocessors. Although the spectacular improvements in computer systems have come about because of these hardware advances, there has also been a steady improvement in software techniques. In an effort to understand how these hardware and software advances will effect research at NASA LaRC, the Computer Systems Technical Committee drafted this white paper to examine the current state and possible future directions of computer systems at the Center. This paper discusses selected important areas of computer systems including real-time systems, embedded systems, high performance computing, distributed computing networks, data acquisition systems, artificial intelligence, and visualization.
The new landscape of parallel computer architecture

NASA Astrophysics Data System (ADS)

Shalf, John

2007-07-01

The past few years has seen a sea change in computer architecture that will impact every facet of our society as every electronic device from cell phone to supercomputer will need to confront parallelism of unprecedented scale. Whereas the conventional multicore approach (2, 4, and even 8 cores) adopted by the computing industry will eventually hit a performance plateau, the highest performance per watt and per chip area is achieved using manycore technology (hundreds or even thousands of cores). However, fully unleashing the potential of the manycore approach to ensure future advances in sustained computational performance will require fundamental advances in computer architecture and programming models that are nothing short of reinventing computing. In this paper we examine the reasons behind the movement to exponentially increasing parallelism, and its ramifications for system design, applications and programming models.
Supercomputers Join the Fight against Cancer – U.S. Department of Energy

DOE Office of Scientific and Technical Information (OSTI.GOV)

None

The Department of Energy has some of the best supercomputers in the world. Now, they’re joining the fight against cancer. Learn about our new partnership with the National Cancer Institute and GlaxoSmithKline Pharmaceuticals.
NAS-current status and future plans

NASA Technical Reports Server (NTRS)

Bailey, F. R.

1987-01-01

The Numerical Aerodynamic Simulation (NAS) has met its first major milestone, the NAS Processing System Network (NPSN) Initial Operating Configuration (IOC). The program has met its goal of providing a national supercomputer facility capable of greatly enhancing the Nation's research and development efforts. Furthermore, the program is fulfilling its pathfinder role by defining and implementing a paradigm for supercomputing system environments. The IOC is only the begining and the NAS Program will aggressively continue to develop and implement emerging supercomputer, communications, storage, and software technologies to strengthen computations as a critical element in supporting the Nation's leadership role in aeronautics.
CRAY mini manual. Revision D

NASA Technical Reports Server (NTRS)

Tennille, Geoffrey M.; Howser, Lona M.

1993-01-01

This document briefly describes the use of the CRAY supercomputers that are an integral part of the Supercomputing Network Subsystem of the Central Scientific Computing Complex at LaRC. Features of the CRAY supercomputers are covered, including: FORTRAN, C, PASCAL, architectures of the CRAY-2 and CRAY Y-MP, the CRAY UNICOS environment, batch job submittal, debugging, performance analysis, parallel processing, utilities unique to CRAY, and documentation. The document is intended for all CRAY users as a ready reference to frequently asked questions and to more detailed information contained in the vendor manuals. It is appropriate for both the novice and the experienced user.
Data distribution satellite

NASA Technical Reports Server (NTRS)

Stevens, Grady H.

1992-01-01

The Data Distribution Satellite (DDS), operating in conjunction with the planned space network, the National Research and Education Network and its commercial derivatives, would play a key role in networking the emerging supercomputing facilities, national archives, academic, industrial, and government institutions. Centrally located over the United States in geostationary orbit, DDS would carry sophisticated on-board switching and make use of advanced antennas to provide an array of special services. Institutions needing continuous high data rate service would be networked together by use of a microwave switching matrix and electronically steered hopping beams. Simultaneously, DDS would use other beams and on board processing to interconnect other institutions with lesser, low rate, intermittent needs. Dedicated links to White Sands and other facilities would enable direct access to space payloads and sensor data. Intersatellite links to a second generation ATDRS, called Advanced Space Data Acquisition and Communications System (ASDACS), would eliminate one satellite hop and enhance controllability of experimental payloads by reducing path delay. Similarly, direct access would be available to the supercomputing facilities and national data archives. Economies with DDS would be derived from its ability to switch high rate facilities amongst users needed. At the same time, having a CONUS view, DDS would interconnect with any institution regardless of how remote. Whether one needed high rate service or low rate service would be immaterial. With the capability to assign resources on demand, DDS will need only carry a portion of the resources needed if dedicated facilities were used. Efficiently switching resources to users as needed, DDS would become a very feasible spacecraft, even though it would tie together the space network, the terrestrial network, remote sites, 1000's of small users, and those few who need very large data links intermittently.
A parallel orbital-updating based plane-wave basis method for electronic structure calculations

NASA Astrophysics Data System (ADS)

Pan, Yan; Dai, Xiaoying; de Gironcoli, Stefano; Gong, Xin-Gao; Rignanese, Gian-Marco; Zhou, Aihui

2017-11-01

Motivated by the recently proposed parallel orbital-updating approach in real space method [1], we propose a parallel orbital-updating based plane-wave basis method for electronic structure calculations, for solving the corresponding eigenvalue problems. In addition, we propose two new modified parallel orbital-updating methods. Compared to the traditional plane-wave methods, our methods allow for two-level parallelization, which is particularly interesting for large scale parallelization. Numerical experiments show that these new methods are more reliable and efficient for large scale calculations on modern supercomputers.
Roadrunner Supercomputer Breaks the Petaflop Barrier

ScienceCinema

Los Alamos National Lab - Brian Albright, Charlie McMillan, Lin Yin

2017-12-09

At 3:30 a.m. on May 26, 2008, Memorial Day, the "Roadrunner" supercomputer exceeded a sustained speed of 1 petaflop/s, or 1 million billion calculations per second. The sustained performance makes Roadrunner more than twice as fast as the current number 1
QCD on the BlueGene/L Supercomputer

NASA Astrophysics Data System (ADS)

Bhanot, G.; Chen, D.; Gara, A.; Sexton, J.; Vranas, P.

2005-03-01

In June 2004 QCD was simulated for the first time at sustained speed exceeding 1 TeraFlops in the BlueGene/L supercomputer at the IBM T.J. Watson Research Lab. The implementation and performance of QCD in the BlueGene/L is presented.

Supercomputer Issues from a University Perspective.

ERIC Educational Resources Information Center

Beering, Steven C.

1984-01-01

Discusses issues related to the access of and training of university researchers in using supercomputers, considering National Science Foundation's (NSF) role in this area, microcomputers on campuses, and the limited use of existing telecommunication networks. Includes examples of potential scientific projects (by subject area) utilizing…
Real science at the petascale.

PubMed

Saksena, Radhika S; Boghosian, Bruce; Fazendeiro, Luis; Kenway, Owain A; Manos, Steven; Mazzeo, Marco D; Sadiq, S Kashif; Suter, James L; Wright, David; Coveney, Peter V

2009-06-28

We describe computational science research that uses petascale resources to achieve scientific results at unprecedented scales and resolution. The applications span a wide range of domains, from investigation of fundamental problems in turbulence through computational materials science research to biomedical applications at the forefront of HIV/AIDS research and cerebrovascular haemodynamics. This work was mainly performed on the US TeraGrid 'petascale' resource, Ranger, at Texas Advanced Computing Center, in the first half of 2008 when it was the largest computing system in the world available for open scientific research. We have sought to use this petascale supercomputer optimally across application domains and scales, exploiting the excellent parallel scaling performance found on up to at least 32 768 cores for certain of our codes in the so-called 'capability computing' category as well as high-throughput intermediate-scale jobs for ensemble simulations in the 32-512 core range. Furthermore, this activity provides evidence that conventional parallel programming with MPI should be successful at the petascale in the short to medium term. We also report on the parallel performance of some of our codes on up to 65 636 cores on the IBM Blue Gene/P system at the Argonne Leadership Computing Facility, which has recently been named the fastest supercomputer in the world for open science.
The impact of CFD on development test facilities - A National Research Council projection. [computational fluid dynamics

NASA Technical Reports Server (NTRS)

Korkegi, R. H.

1983-01-01

The results of a National Research Council study on the effect that advances in computational fluid dynamics (CFD) will have on conventional aeronautical ground testing are reported. Current CFD capabilities include the depiction of linearized inviscid flows and a boundary layer, initial use of Euler coordinates using supercomputers to automatically generate a grid, research and development on Reynolds-averaged Navier-Stokes (N-S) equations, and preliminary research on solutions to the full N-S equations. Improvements in the range of CFD usage is dependent on the development of more powerful supercomputers, exceeding even the projected abilities of the NASA Numerical Aerodynamic Simulator (1 BFLOP/sec). Full representation of the Re-averaged N-S equations will require over one million grid points, a computing level predicted to be available in 15 yr. Present capabilities allow identification of data anomalies, confirmation of data accuracy, and adequateness of model design in wind tunnel trials. Account can be taken of the wall effects and the Re in any flight regime during simulation. CFD can actually be more accurate than instrumented tests, since all points in a flow can be modeled with CFD, while they cannot all be monitored with instrumentation in a wind tunnel.
HEP Computing Tools, Grid and Supercomputers for Genome Sequencing Studies

NASA Astrophysics Data System (ADS)

De, K.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Novikov, A.; Poyda, A.; Tertychnyy, I.; Wenaus, T.

2017-10-01

PanDA - Production and Distributed Analysis Workload Management System has been developed to address ATLAS experiment at LHC data processing and analysis challenges. Recently PanDA has been extended to run HEP scientific applications on Leadership Class Facilities and supercomputers. The success of the projects to use PanDA beyond HEP and Grid has drawn attention from other compute intensive sciences such as bioinformatics. Recent advances of Next Generation Genome Sequencing (NGS) technology led to increasing streams of sequencing data that need to be processed, analysed and made available for bioinformaticians worldwide. Analysis of genomes sequencing data using popular software pipeline PALEOMIX can take a month even running it on the powerful computer resource. In this paper we will describe the adaptation the PALEOMIX pipeline to run it on a distributed computing environment powered by PanDA. To run pipeline we split input files into chunks which are run separately on different nodes as separate inputs for PALEOMIX and finally merge output file, it is very similar to what it done by ATLAS to process and to simulate data. We dramatically decreased the total walltime because of jobs (re)submission automation and brokering within PanDA. Using software tools developed initially for HEP and Grid can reduce payload execution time for Mammoths DNA samples from weeks to days.
New Tools for "New" History: Computers and the Teaching of Quantitative Historical Methods.

ERIC Educational Resources Information Center

Burton, Orville Vernon; Finnegan, Terence

1989-01-01

Explains the development of an instructional software package and accompanying workbook which teaches students to apply computerized statistical analysis to historical data, improving the study of social history. Concludes that the use of microcomputers and supercomputers to manipulate historical data enhances critical thinking skills and the use…
Waveform inversion for 3-D earth structure using the Direct Solution Method implemented on vector-parallel supercomputer

NASA Astrophysics Data System (ADS)

Hara, Tatsuhiko

2004-08-01

We implement the Direct Solution Method (DSM) on a vector-parallel supercomputer and show that it is possible to significantly improve its computational efficiency through parallel computing. We apply the parallel DSM calculation to waveform inversion of long period (250-500 s) surface wave data for three-dimensional (3-D) S-wave velocity structure in the upper and uppermost lower mantle. We use a spherical harmonic expansion to represent lateral variation with the maximum angular degree 16. We find significant low velocities under south Pacific hot spots in the transition zone. This is consistent with other seismological studies conducted in the Superplume project, which suggests deep roots of these hot spots. We also perform simultaneous waveform inversion for 3-D S-wave velocity and Q structure. Since resolution for Q is not good, we develop a new technique in which power spectra are used as data for inversion. We find good correlation between long wavelength patterns of Vs and Q in the transition zone such as high Vs and high Q under the western Pacific.
Integration Of PanDA Workload Management System With Supercomputers for ATLAS and Data Intensive Science

NASA Astrophysics Data System (ADS)

Klimentov, A.; De, K.; Jha, S.; Maeno, T.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Wells, J.; Wenaus, T.

2016-10-01

The.LHC, operating at CERN, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3 petaFLOPS, LHC data taking runs require more resources than grid can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in United States, in particular with Titan supercomputer at Oak Ridge Leadership Computing Facility. Current approach utilizes modified PanDA pilot framework for job submission to the supercomputers batch queues and local data management, with light-weight MPI wrappers to run single threaded workloads in parallel on LCFs multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms for ALICE and ATLAS experiments and it is in full pro duction for the ATLAS since September 2015. We will present our current accomplishments with running PanDA at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.
The use of supercomputer modelling of high-temperature failure in pipe weldments to optimize weld and heat affected zone materials property selection

NASA Astrophysics Data System (ADS)

Wang, Z. P.; Hayhurst, D. R.

1994-07-01

The creep deformation and damage evolution in a pipe weldment has been modeled by using the finite-element continuum damage mechanics (CDM) method. The finite-element CDM computer program DAMAGE XX has been adapted to run with increased speed on a Cray XMP/416 supercomputer. Run times are sufficiently short (20 min) to permit many parametric studies to be carried out on vessel lifetimes for different weld and heat affected zone (HAZ) materials. Finite-element mesh sensitivity was studied first in order to select a mesh capable of correctly predicting experimentally observed results using at least possible computer time. A study was then made of the effect on the lifetime of a butt welded vessel of each of the commomly measured material parameters for the weld and HAZ materials. Forty different ferritic steel welded vessels were analyzed for a constant internal pressure of 45.5 MPa at a temperature of 565 C; each vessel having the same parent pipe material but different weld and HAZ materials. A lifetime improvement has been demonstrated of 30% over that obtained for the initial materials property data. A methodology for weldment design has been established which uses supercomputer-based CDM analysis techniques; it is quick to use, provides accurate results, and is a viable design tool.
Parallel workflow manager for non-parallel bioinformatic applications to solve large-scale biological problems on a supercomputer.

PubMed

Suplatov, Dmitry; Popova, Nina; Zhumatiy, Sergey; Voevodin, Vladimir; Švedas, Vytas

2016-04-01

Rapid expansion of online resources providing access to genomic, structural, and functional information associated with biological macromolecules opens an opportunity to gain a deeper understanding of the mechanisms of biological processes due to systematic analysis of large datasets. This, however, requires novel strategies to optimally utilize computer processing power. Some methods in bioinformatics and molecular modeling require extensive computational resources. Other algorithms have fast implementations which take at most several hours to analyze a common input on a modern desktop station, however, due to multiple invocations for a large number of subtasks the full task requires a significant computing power. Therefore, an efficient computational solution to large-scale biological problems requires both a wise parallel implementation of resource-hungry methods as well as a smart workflow to manage multiple invocations of relatively fast algorithms. In this work, a new computer software mpiWrapper has been developed to accommodate non-parallel implementations of scientific algorithms within the parallel supercomputing environment. The Message Passing Interface has been implemented to exchange information between nodes. Two specialized threads - one for task management and communication, and another for subtask execution - are invoked on each processing unit to avoid deadlock while using blocking calls to MPI. The mpiWrapper can be used to launch all conventional Linux applications without the need to modify their original source codes and supports resubmission of subtasks on node failure. We show that this approach can be used to process huge amounts of biological data efficiently by running non-parallel programs in parallel mode on a supercomputer. The C++ source code and documentation are available from http://biokinet.belozersky.msu.ru/mpiWrapper .
Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers

DOE PAGES

Wang, Bei; Ethier, Stephane; Tang, William; ...

2017-06-29

The Gyrokinetic Toroidal Code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5D Vlasov-Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size with sufficient resolution, new thread-level optimizations have been introduced as well as a key additional domain decomposition. GTC-P's multiple levels of parallelism, including inter-node 2D domain decomposition and particle decomposition, as well as intra-node shared memory partition and vectorization have enabled pushing the scalability ofmore » the PIC method to extreme computational scales. In this paper, we describe the methods developed to build a highly parallelized PIC code across a broad range of supercomputer designs. This particularly includes implementations on heterogeneous systems using NVIDIA GPU accelerators and Intel Xeon Phi (MIC) co-processors and performance comparisons with state-of-the-art homogeneous HPC systems such as Blue Gene/Q. New discovery science capabilities in the magnetic fusion energy application domain are enabled, including investigations of Ion-Temperature-Gradient (ITG) driven turbulence simulations with unprecedented spatial resolution and long temporal duration. Performance studies with realistic fusion experimental parameters are carried out on multiple supercomputing systems spanning a wide range of cache capacities, cache-sharing configurations, memory bandwidth, interconnects and network topologies. These performance comparisons using a realistic discovery-science-capable domain application code provide valuable insights on optimization techniques across one of the broadest sets of current high-end computing platforms worldwide.« less
Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Bei; Ethier, Stephane; Tang, William

The Gyrokinetic Toroidal Code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5D Vlasov-Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size with sufficient resolution, new thread-level optimizations have been introduced as well as a key additional domain decomposition. GTC-P's multiple levels of parallelism, including inter-node 2D domain decomposition and particle decomposition, as well as intra-node shared memory partition and vectorization have enabled pushing the scalability ofmore » the PIC method to extreme computational scales. In this paper, we describe the methods developed to build a highly parallelized PIC code across a broad range of supercomputer designs. This particularly includes implementations on heterogeneous systems using NVIDIA GPU accelerators and Intel Xeon Phi (MIC) co-processors and performance comparisons with state-of-the-art homogeneous HPC systems such as Blue Gene/Q. New discovery science capabilities in the magnetic fusion energy application domain are enabled, including investigations of Ion-Temperature-Gradient (ITG) driven turbulence simulations with unprecedented spatial resolution and long temporal duration. Performance studies with realistic fusion experimental parameters are carried out on multiple supercomputing systems spanning a wide range of cache capacities, cache-sharing configurations, memory bandwidth, interconnects and network topologies. These performance comparisons using a realistic discovery-science-capable domain application code provide valuable insights on optimization techniques across one of the broadest sets of current high-end computing platforms worldwide.« less
Science & Technology Review June 2012

DOE Office of Scientific and Technical Information (OSTI.GOV)

Poyneer, L A

2012-04-20

This month's issue has the following articles: (1) A New Era in Climate System Analysis - Commentary by William H. Goldstein; (2) Seeking Clues to Climate Change - By comparing past climate records with results from computer simulations, Livermore scientists can better understand why Earth's climate has changed and how it might change in the future; (3) Finding and Fixing a Supercomputer's Faults - Livermore experts have developed innovative methods to detect hardware faults in supercomputers and help applications recover from errors that do occur; (4) Targeting Ignition - Enhancements to the cryogenic targets for National Ignition Facility experiments aremore » furthering work to achieve fusion ignition with energy gain; (5) Neural Implants Come of Age - A new generation of fully implantable, biocompatible neural prosthetics offers hope to patients with neurological impairment; and (6) Incubator Busy Growing Energy Technologies - Six collaborations with industrial partners are using the Laboratory's high-performance computing resources to find solutions to urgent energy-related problems.« less
Hybrid petacomputing meets cosmology: The Roadrunner Universe project

NASA Astrophysics Data System (ADS)

Habib, Salman; Pope, Adrian; Lukić, Zarija; Daniel, David; Fasel, Patricia; Desai, Nehal; Heitmann, Katrin; Hsu, Chung-Hsing; Ankeny, Lee; Mark, Graham; Bhattacharya, Suman; Ahrens, James

2009-07-01

The target of the Roadrunner Universe project at Los Alamos National Laboratory is a set of very large cosmological N-body simulation runs on the hybrid supercomputer Roadrunner, the world's first petaflop platform. Roadrunner's architecture presents opportunities and difficulties characteristic of next-generation supercomputing. We describe a new code designed to optimize performance and scalability by explicitly matching the underlying algorithms to the machine architecture, and by using the physics of the problem as an essential aid in this process. While applications will differ in specific exploits, we believe that such a design process will become increasingly important in the future. The Roadrunner Universe project code, MC3 (Mesh-based Cosmology Code on the Cell), uses grid and direct particle methods to balance the capabilities of Roadrunner's conventional (Opteron) and accelerator (Cell BE) layers. Mirrored particle caches and spectral techniques are used to overcome communication bandwidth limitations and possible difficulties with complicated particle-grid interaction templates.
NSF Establishes First Four National Supercomputer Centers.

ERIC Educational Resources Information Center

Lepkowski, Wil

1985-01-01

The National Science Foundation (NSF) has awarded support for supercomputer centers at Cornell University, Princeton University, University of California (San Diego), and University of Illinois. These centers are to be the nucleus of a national academic network for use by scientists and engineers throughout the United States. (DH)
Library Services in a Supercomputer Center.

ERIC Educational Resources Information Center

Layman, Mary

1991-01-01

Describes library services that are offered at the San Diego Supercomputer Center (SDSC), which is located at the University of California at San Diego. Topics discussed include the user population; online searching; microcomputer use; electronic networks; current awareness programs; library catalogs; and the slide collection. A sidebar outlines…
Probing the cosmic causes of errors in supercomputers

DOE Office of Scientific and Technical Information (OSTI.GOV)

None

Cosmic rays from outer space are causing errors in supercomputers. The neutrons that pass through the CPU may be causing binary data to flip leading to incorrect calculations. Los Alamos National Laboratory has developed detectors to determine how much data is being corrupted by these cosmic particles.
Networking Technologies Enable Advances in Earth Science

NASA Technical Reports Server (NTRS)

Johnson, Marjory; Freeman, Kenneth; Gilstrap, Raymond; Beck, Richard

2004-01-01

This paper describes an experiment to prototype a new way of conducting science by applying networking and distributed computing technologies to an Earth Science application. A combination of satellite, wireless, and terrestrial networking provided geologists at a remote field site with interactive access to supercomputer facilities at two NASA centers, thus enabling them to validate and calibrate remotely sensed geological data in near-real time. This represents a fundamental shift in the way that Earth scientists analyze remotely sensed data. In this paper we describe the experiment and the network infrastructure that enabled it, analyze the data flow during the experiment, and discuss the scientific impact of the results.
Computational fluid dynamics in a marine environment

NASA Technical Reports Server (NTRS)

Carlson, Arthur D.

1987-01-01

The introduction of the supercomputer and recent advances in both Reynolds averaged, and large eddy simulation fluid flow approximation techniques to the Navier-Stokes equations, have created a robust environment for the exploration of problems of interest to the Navy in general, and the Naval Underwater Systems Center in particular. The nature of problems that are of interest, and the type of resources needed for their solution are addressed. The goal is to achieve a good engineering solution to the fluid-structure interaction problem. It is appropriate to indicate that a paper by D. Champman played a major role in developing the interest in the approach discussed.
Computational chemistry research

NASA Technical Reports Server (NTRS)

Levin, Eugene

1987-01-01

Task 41 is composed of two parts: (1) analysis and design studies related to the Numerical Aerodynamic Simulation (NAS) Extended Operating Configuration (EOC) and (2) computational chemistry. During the first half of 1987, Dr. Levin served as a member of an advanced system planning team to establish the requirements, goals, and principal technical characteristics of the NAS EOC. A paper entitled 'Scaling of Data Communications for an Advanced Supercomputer Network' is included. The high temperature transport properties (such as viscosity, thermal conductivity, etc.) of the major constituents of air (oxygen and nitrogen) were correctly determined. The results of prior ab initio computer solutions of the Schroedinger equation were combined with the best available experimental data to obtain complete interaction potentials for both neutral and ion-atom collision partners. These potentials were then used in a computer program to evaluate the collision cross-sections from which the transport properties could be determined. A paper entitled 'High Temperature Transport Properties of Air' is included.
Advanced Architectures for Astrophysical Supercomputing

NASA Astrophysics Data System (ADS)

Barsdell, B. R.; Barnes, D. G.; Fluke, C. J.

2010-12-01

Astronomers have come to rely on the increasing performance of computers to reduce, analyze, simulate and visualize their data. In this environment, faster computation can mean more science outcomes or the opening up of new parameter spaces for investigation. If we are to avoid major issues when implementing codes on advanced architectures, it is important that we have a solid understanding of our algorithms. A recent addition to the high-performance computing scene that highlights this point is the graphics processing unit (GPU). The hardware originally designed for speeding-up graphics rendering in video games is now achieving speed-ups of O(100×) in general-purpose computation - performance that cannot be ignored. We are using a generalized approach, based on the analysis of astronomy algorithms, to identify the optimal problem-types and techniques for taking advantage of both current GPU hardware and future developments in computing architectures.

The Sky's the Limit When Super Students Meet Supercomputers.

ERIC Educational Resources Information Center

Trotter, Andrew

1991-01-01

In a few select high schools in the U.S., supercomputers are allowing talented students to attempt sophisticated research projects using simultaneous simulations of nature, culture, and technology not achievable by ordinary microcomputers. Schools can get their students online by entering contests and seeking grants and partnerships with…
NSF Says It Will Support Supercomputer Centers in California and Illinois.

ERIC Educational Resources Information Center

Strosnider, Kim; Young, Jeffrey R.

1997-01-01

The National Science Foundation will increase support for supercomputer centers at the University of California, San Diego and the University of Illinois, Urbana-Champaign, while leaving unclear the status of the program at Cornell University (New York) and a cooperative Carnegie-Mellon University (Pennsylvania) and University of Pittsburgh…
Access to Supercomputers. Higher Education Panel Report 69.

ERIC Educational Resources Information Center

Holmstrom, Engin Inel

This survey was conducted to provide the National Science Foundation with baseline information on current computer use in the nation's major research universities, including the actual and potential use of supercomputers. Questionnaires were sent to 207 doctorate-granting institutions; after follow-ups, 167 institutions (91% of the institutions…
NOAA announces significant investment in next generation of supercomputers

Science.gov Websites

provide more timely, accurate weather forecasts. (Credit: istockphoto.com) Today, NOAA announced the next phase in the agency's efforts to increase supercomputing capacity to provide more timely, accurate turn will lead to more timely, accurate, and reliable forecasts." Ahead of this upgrade, each of
Computing and data processing

NASA Technical Reports Server (NTRS)

Smarr, Larry; Press, William; Arnett, David W.; Cameron, Alastair G. W.; Crutcher, Richard M.; Helfand, David J.; Horowitz, Paul; Kleinmann, Susan G.; Linsky, Jeffrey L.; Madore, Barry F.

1991-01-01

The applications of computers and data processing to astronomy are discussed. Among the topics covered are the emerging national information infrastructure, workstations and supercomputers, supertelescopes, digital astronomy, astrophysics in a numerical laboratory, community software, archiving of ground-based observations, dynamical simulations of complex systems, plasma astrophysics, and the remote control of fourth dimension supercomputers.
NREL's Building-Integrated Supercomputer Provides Heating and Efficient Computing (Fact Sheet)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

2014-09-01

NREL's Energy Systems Integration Facility (ESIF) is meant to investigate new ways to integrate energy sources so they work together efficiently, and one of the key tools to that investigation, a new supercomputer, is itself a prime example of energy systems integration. NREL teamed with Hewlett-Packard (HP) and Intel to develop the innovative warm-water, liquid-cooled Peregrine supercomputer, which not only operates efficiently but also serves as the primary source of building heat for ESIF offices and laboratories. This innovative high-performance computer (HPC) can perform more than a quadrillion calculations per second as part of the world's most energy-efficient HPC datamore » center.« less
Optimization of large matrix calculations for execution on the Cray X-MP vector supercomputer

NASA Technical Reports Server (NTRS)

Hornfeck, William A.

1988-01-01

A considerable volume of large computational computer codes were developed for NASA over the past twenty-five years. This code represents algorithms developed for machines of earlier generation. With the emergence of the vector supercomputer as a viable, commercially available machine, an opportunity exists to evaluate optimization strategies to improve the efficiency of existing software. This result is primarily due to architectural differences in the latest generation of large-scale machines and the earlier, mostly uniprocessor, machines. A sofware package being used by NASA to perform computations on large matrices is described, and a strategy for conversion to the Cray X-MP vector supercomputer is also described.
Congressional Panel Seeks To Curb Access of Foreign Students to U.S. Supercomputers.

ERIC Educational Resources Information Center

Kiernan, Vincent

1999-01-01

Fearing security problems, a congressional committee on Chinese espionage recommends that foreign students and other foreign nationals be barred from using supercomputers at national laboratories unless they first obtain export licenses from the federal government. University officials dispute the data on which the report is based and find the…
The Age of the Supercomputer Gives Way to the Age of the Super Infrastructure.

ERIC Educational Resources Information Center

Young, Jeffrey R.

1997-01-01

In October 1997, the National Science Foundation will discontinue financial support for two university-based supercomputer facilities to concentrate resources on partnerships led by facilities at the University of California, San Diego and the University of Illinois, Urbana-Champaign. The reconfigured program will develop more user-friendly and…
The ChemViz Project: Using a Supercomputer To Illustrate Abstract Concepts in Chemistry.

ERIC Educational Resources Information Center

Beckwith, E. Kenneth; Nelson, Christopher

1998-01-01

Describes the Chemistry Visualization (ChemViz) Project, a Web venture maintained by the University of Illinois National Center for Supercomputing Applications (NCSA) that enables high school students to use computational chemistry as a technique for understanding abstract concepts. Discusses the evolution of computational chemistry and provides a…
Supercomputations and big-data analysis in strong-field ultrafast optical physics: filamentation of high-peak-power ultrashort laser pulses

NASA Astrophysics Data System (ADS)

Voronin, A. A.; Panchenko, V. Ya; Zheltikov, A. M.

2016-06-01

High-intensity ultrashort laser pulses propagating in gas media or in condensed matter undergo complex nonlinear spatiotemporal evolution where temporal transformations of optical field waveforms are strongly coupled to an intricate beam dynamics and ultrafast field-induced ionization processes. At the level of laser peak powers orders of magnitude above the critical power of self-focusing, the beam exhibits modulation instabilities, producing random field hot spots and breaking up into multiple noise-seeded filaments. This problem is described by a (3 + 1)-dimensional nonlinear field evolution equation, which needs to be solved jointly with the equation for ultrafast ionization of a medium. Analysis of this problem, which is equivalent to solving a billion-dimensional evolution problem, is only possible by means of supercomputer simulations augmented with coordinated big-data processing of large volumes of information acquired through theory-guiding experiments and supercomputations. Here, we review the main challenges of supercomputations and big-data processing encountered in strong-field ultrafast optical physics and discuss strategies to confront these challenges.
Addressing model uncertainty through stochastic parameter perturbations within the High Resolution Rapid Refresh (HRRR) ensemble

NASA Astrophysics Data System (ADS)

Wolff, J.; Jankov, I.; Beck, J.; Carson, L.; Frimel, J.; Harrold, M.; Jiang, H.

2016-12-01

It is well known that global and regional numerical weather prediction ensemble systems are under-dispersive, producing unreliable and overconfident ensemble forecasts. Typical approaches to alleviate this problem include the use of multiple dynamic cores, multiple physics suite configurations, or a combination of the two. While these approaches may produce desirable results, they have practical and theoretical deficiencies and are more difficult and costly to maintain. An active area of research that promotes a more unified and sustainable system for addressing the deficiencies in ensemble modeling is the use of stochastic physics to represent model-related uncertainty. Stochastic approaches include Stochastic Parameter Perturbations (SPP), Stochastic Kinetic Energy Backscatter (SKEB), Stochastic Perturbation of Physics Tendencies (SPPT), or some combination of all three. The focus of this study is to assess the model performance within a convection-permitting ensemble at 3-km grid spacing across the Contiguous United States (CONUS) when using stochastic approaches. For this purpose, the test utilized a single physics suite configuration based on the operational High-Resolution Rapid Refresh (HRRR) model, with ensemble members produced by employing stochastic methods. Parameter perturbations were employed in the Rapid Update Cycle (RUC) land surface model and Mellor-Yamada-Nakanishi-Niino (MYNN) planetary boundary layer scheme. Results will be presented in terms of bias, error, spread, skill, accuracy, reliability, and sharpness using the Model Evaluation Tools (MET) verification package. Due to the high level of complexity of running a frequently updating (hourly), high spatial resolution (3 km), large domain (CONUS) ensemble system, extensive high performance computing (HPC) resources were needed to meet this objective. Supercomputing resources were provided through the National Center for Atmospheric Research (NCAR) Strategic Capability (NSC) project support, allowing for a more extensive set of tests over multiple seasons, consequently leading to more robust results. Through the use of these stochastic innovations and powerful supercomputing at NCAR, further insights and advancements in ensemble forecasting at convection-permitting scales will be possible.
Toward a Proof of Concept Cloud Framework for Physics Applications on Blue Gene Supercomputers

NASA Astrophysics Data System (ADS)

Dreher, Patrick; Scullin, William; Vouk, Mladen

2015-09-01

Traditional high performance supercomputers are capable of delivering large sustained state-of-the-art computational resources to physics applications over extended periods of time using batch processing mode operating environments. However, today there is an increasing demand for more complex workflows that involve large fluctuations in the levels of HPC physics computational requirements during the simulations. Some of the workflow components may also require a richer set of operating system features and schedulers than normally found in a batch oriented HPC environment. This paper reports on progress toward a proof of concept design that implements a cloud framework onto BG/P and BG/Q platforms at the Argonne Leadership Computing Facility. The BG/P implementation utilizes the Kittyhawk utility and the BG/Q platform uses an experimental heterogeneous FusedOS operating system environment. Both platforms use the Virtual Computing Laboratory as the cloud computing system embedded within the supercomputer. This proof of concept design allows a cloud to be configured so that it can capitalize on the specialized infrastructure capabilities of a supercomputer and the flexible cloud configurations without resorting to virtualization. Initial testing of the proof of concept system is done using the lattice QCD MILC code. These types of user reconfigurable environments have the potential to deliver experimental schedulers and operating systems within a working HPC environment for physics computations that may be different from the native OS and schedulers on production HPC supercomputers.
NBodyLab: A Testbed for Undergraduates Utilizing a Web Interface to NEMO and MD-GRAPE2 Hardware

NASA Astrophysics Data System (ADS)

Johnson, V. L.; Teuben, P. J.; Penprase, B. E.

An N-body simulation testbed called NBodyLab was developed at Pomona College as a teaching tool for undergraduates. The testbed runs under Linux and provides a web interface to selected back-end NEMO modeling and analysis tools, and several integration methods which can optionally use an MD-GRAPE2 supercomputer card in the server to accelerate calculation of particle-particle forces. The testbed provides a framework for using and experimenting with the main components of N-body simulations: data models and transformations, numerical integration of the equations of motion, analysis and visualization products, and acceleration techniques (in this case, special purpose hardware). The testbed can be used by students with no knowledge of programming or Unix, freeing such students and their instructor to spend more time on scientific experimentation. The advanced student can extend the testbed software and/or more quickly transition to the use of more advanced Unix-based toolsets such as NEMO, Starlab and model builders such as GalactICS. Cosmology students at Pomona College used the testbed to study collisions of galaxies with different speeds, masses, densities, collision angles, angular momentum, etc., attempting to simulate, for example, the Tadpole Galaxy and the Antenna Galaxies. The testbed framework is available as open-source to assist other researchers and educators. Recommendations are made for testbed enhancements.
A mathematical model for simulating noise suppression of lined ejectors

NASA Technical Reports Server (NTRS)

Watson, Willie R.

1994-01-01

A mathematical model containing the essential features embodied in the noise suppression of lined ejectors is presented. Although some simplification of the physics is necessary to render the model mathematically tractable, the current model is the most versatile and technologically advanced at the current time. A system of linearized equations and the boundary conditions governing the sound field are derived starting from the equations of fluid dynamics. A nonreflecting boundary condition is developed. In view of the complex nature of the equations, a parametric study requires the use of numerical techniques and modern computers. A finite element algorithm that solves the differential equations coupled with the boundary condition is then introduced. The numerical method results in a matrix equation with several hundred thousand degrees of freedom that is solved efficiently on a supercomputer. The model is validated by comparing results either with exact solutions or with approximate solutions from other works. In each case, excellent correlations are obtained. The usefulness of the model as an optimization tool and the importance of variable impedance liners as a mechanism for achieving broadband suppression within a lined ejector are demonstrated.
A domain-specific compiler for a parallel multiresolution adaptive numerical simulation environment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rajbhandari, Samyam; Kim, Jinsung; Krishnamoorthy, Sriram

This paper describes the design and implementation of a layered domain-specific compiler to support MADNESS---Multiresolution ADaptive Numerical Environment for Scientific Simulation. MADNESS is a high-level software environment for the solution of integral and differential equations in many dimensions, using adaptive and fast harmonic analysis methods with guaranteed precision. MADNESS uses k-d trees to represent spatial functions and implements operators like addition, multiplication, differentiation, and integration on the numerical representation of functions. The MADNESS runtime system provides global namespace support and a task-based execution model including futures. MADNESS is currently deployed on massively parallel supercomputers and has enabled many science advances.more » Due to the highly irregular and statically unpredictable structure of the k-d trees representing the spatial functions encountered in MADNESS applications, only purely runtime approaches to optimization have previously been implemented in the MADNESS framework. This paper describes a layered domain-specific compiler developed to address some performance bottlenecks in MADNESS. The newly developed static compile-time optimizations, in conjunction with the MADNESS runtime support, enable significant performance improvement for the MADNESS framework.« less
Big Data Processing for a Central Texas Groundwater Case Study

NASA Astrophysics Data System (ADS)

Cantu, A.; Rivera, O.; Martínez, A.; Lewis, D. H.; Gentle, J. N., Jr.; Fuentes, G.; Pierce, S. A.

2016-12-01

As computational methods improve, scientists are able to expand the level and scale of experimental simulation and testing that is completed for case studies. This study presents a comparative analysis of multiple models for the Barton Springs segment of the Edwards aquifer. Several numerical simulations using state-mandated MODFLOW models ran on Stampede, a High Performance Computing system housed at the Texas Advanced Computing Center, were performed for multiple scenario testing. One goal of this multidisciplinary project aims to visualize and compare the output data of the groundwater model using the statistical programming language R to find revealing data patterns produced by different pumping scenarios. Presenting data in a friendly post-processing format is covered in this paper. Visualization of the data and creating workflows applicable to the management of the data are tasks performed after data extraction. Resulting analyses provide an example of how supercomputing can be used to accelerate evaluation of scientific uncertainty and geological knowledge in relation to policy and management decisions. Understanding the aquifer behavior helps policy makers avoid negative impact on the endangered species, environmental services and aids in maximizing the aquifer yield.
Virtualizing Super-Computation On-Board Uas

NASA Astrophysics Data System (ADS)

Salami, E.; Soler, J. A.; Cuadrado, R.; Barrado, C.; Pastor, E.

2015-04-01

Unmanned aerial systems (UAS, also known as UAV, RPAS or drones) have a great potential to support a wide variety of aerial remote sensing applications. Most UAS work by acquiring data using on-board sensors for later post-processing. Some require the data gathered to be downlinked to the ground in real-time. However, depending on the volume of data and the cost of the communications, this later option is not sustainable in the long term. This paper develops the concept of virtualizing super-computation on-board UAS, as a method to ease the operation by facilitating the downlink of high-level information products instead of raw data. Exploiting recent developments in miniaturized multi-core devices is the way to speed-up on-board computation. This hardware shall satisfy size, power and weight constraints. Several technologies are appearing with promising results for high performance computing on unmanned platforms, such as the 36 cores of the TILE-Gx36 by Tilera (now EZchip) or the 64 cores of the Epiphany-IV by Adapteva. The strategy for virtualizing super-computation on-board includes the benchmarking for hardware selection, the software architecture and the communications aware design. A parallelization strategy is given for the 36-core TILE-Gx36 for a UAS in a fire mission or in similar target-detection applications. The results are obtained for payload image processing algorithms and determine in real-time the data snapshot to gather and transfer to ground according to the needs of the mission, the processing time, and consumed watts.
Air Force Maui Optical and Supercomputing Site (AMOS) Application Briefs 2004

DTIC Science & Technology

2004-01-01

Respondents should be aware that notwithstanding any other provision of law, no person shall be subject to a penalty for failing to comply with a collection...17. LIMITATION OF ABSTRACT Same as Report (SAR) 18. NUMBER OF PAGES 60 19a. NAME OF RESPONSIBLE PERSON a. REPORT unclassified b. ABSTRACT...SOR methods, the work in parallel matrix iterative methods is very relevant to the parallelization of a CA. The Moore neighborhood used in this work
A unified framework for spiking and gap-junction interactions in distributed neuronal network simulations.

PubMed

Hahne, Jan; Helias, Moritz; Kunkel, Susanne; Igarashi, Jun; Bolten, Matthias; Frommer, Andreas; Diesmann, Markus

2015-01-01

Contemporary simulators for networks of point and few-compartment model neurons come with a plethora of ready-to-use neuron and synapse models and support complex network topologies. Recent technological advancements have broadened the spectrum of application further to the efficient simulation of brain-scale networks on supercomputers. In distributed network simulations the amount of spike data that accrues per millisecond and process is typically low, such that a common optimization strategy is to communicate spikes at relatively long intervals, where the upper limit is given by the shortest synaptic transmission delay in the network. This approach is well-suited for simulations that employ only chemical synapses but it has so far impeded the incorporation of gap-junction models, which require instantaneous neuronal interactions. Here, we present a numerical algorithm based on a waveform-relaxation technique which allows for network simulations with gap junctions in a way that is compatible with the delayed communication strategy. Using a reference implementation in the NEST simulator, we demonstrate that the algorithm and the required data structures can be smoothly integrated with existing code such that they complement the infrastructure for spiking connections. To show that the unified framework for gap-junction and spiking interactions achieves high performance and delivers high accuracy in the presence of gap junctions, we present benchmarks for workstations, clusters, and supercomputers. Finally, we discuss limitations of the novel technology.

A unified framework for spiking and gap-junction interactions in distributed neuronal network simulations

PubMed Central

Hahne, Jan; Helias, Moritz; Kunkel, Susanne; Igarashi, Jun; Bolten, Matthias; Frommer, Andreas; Diesmann, Markus

2015-01-01

Contemporary simulators for networks of point and few-compartment model neurons come with a plethora of ready-to-use neuron and synapse models and support complex network topologies. Recent technological advancements have broadened the spectrum of application further to the efficient simulation of brain-scale networks on supercomputers. In distributed network simulations the amount of spike data that accrues per millisecond and process is typically low, such that a common optimization strategy is to communicate spikes at relatively long intervals, where the upper limit is given by the shortest synaptic transmission delay in the network. This approach is well-suited for simulations that employ only chemical synapses but it has so far impeded the incorporation of gap-junction models, which require instantaneous neuronal interactions. Here, we present a numerical algorithm based on a waveform-relaxation technique which allows for network simulations with gap junctions in a way that is compatible with the delayed communication strategy. Using a reference implementation in the NEST simulator, we demonstrate that the algorithm and the required data structures can be smoothly integrated with existing code such that they complement the infrastructure for spiking connections. To show that the unified framework for gap-junction and spiking interactions achieves high performance and delivers high accuracy in the presence of gap junctions, we present benchmarks for workstations, clusters, and supercomputers. Finally, we discuss limitations of the novel technology. PMID:26441628
The impact of the U.S. supercomputing initiative will be global

DOE Office of Scientific and Technical Information (OSTI.GOV)

Crawford, Dona

2016-01-15

Last July, President Obama issued an executive order that created a coordinated federal strategy for HPC research, development, and deployment called the U.S. National Strategic Computing Initiative (NSCI). However, this bold, necessary step toward building the next generation of supercomputers has inaugurated a new era for U.S. high performance computing (HPC).
Predicting Hurricanes with Supercomputers

DOE Office of Scientific and Technical Information (OSTI.GOV)

None

2010-01-01

Hurricane Emily, formed in the Atlantic Ocean on July 10, 2005, was the strongest hurricane ever to form before August. By checking computer models against the actual path of the storm, researchers can improve hurricane prediction. In 2010, NOAA researchers were awarded 25 million processor-hours on Argonne's BlueGene/P supercomputer for the project. Read more at http://go.usa.gov/OLh
Surprise

DOE Office of Scientific and Technical Information (OSTI.GOV)

Curran, L.

1988-03-03

Interest has been building in recent months over the imminent arrival of a new class of supercomputer, called the ''supercomputer on a desk'' or the single-user model. Most observers expected the first such product to come from either of two startups, Ardent Computer Corp. or Stellar Computer Inc. But a surprise entry has shown up. Apollo Computer Inc. is launching a new work station this week that racks up an impressive list of industry first as it puts supercomputer power at the disposal of a single user. The new series 10000 from the Chelmsford, Mass., a company is built aroundmore » a reduced-instruction-set architecture that the company calls Prism, for parallel reduced-instruction-set multiprocessor. This article describes the 10000 and Prism.« less
Computational Solutions for Today’s Navy: New Methods are Being Employed to Meet the Navy’s Changing Software-Development Environment

DTIC Science & Technology

2008-03-01

software- development environment. ▶ Frank W. Bentrem, Ph.D., John T. Sample, Ph.D., and Michael M. Harris he Naval Research Labor - atory (NRL) is the...sonars (Through-the-Sensor technology), supercomputer generated numer- ical models, and historical/ clima - tological databases. It uses a vari- ety of
Energy consumption optimization of the total-FETI solver by changing the CPU frequency

NASA Astrophysics Data System (ADS)

Horak, David; Riha, Lubomir; Sojka, Radim; Kruzik, Jakub; Beseda, Martin; Cermak, Martin; Schuchart, Joseph

2017-07-01

The energy consumption of supercomputers is one of the critical problems for the upcoming Exascale supercomputing era. The awareness of power and energy consumption is required on both software and hardware side. This paper deals with the energy consumption evaluation of the Finite Element Tearing and Interconnect (FETI) based solvers of linear systems, which is an established method for solving real-world engineering problems. We have evaluated the effect of the CPU frequency on the energy consumption of the FETI solver using a linear elasticity 3D cube synthetic benchmark. In this problem, we have evaluated the effect of frequency tuning on the energy consumption of the essential processing kernels of the FETI method. The paper provides results for two types of frequency tuning: (1) static tuning and (2) dynamic tuning. For static tuning experiments, the frequency is set before execution and kept constant during the runtime. For dynamic tuning, the frequency is changed during the program execution to adapt the system to the actual needs of the application. The paper shows that static tuning brings up 12% energy savings when compared to default CPU settings (the highest clock rate). The dynamic tuning improves this further by up to 3%.
Combining density functional theory calculations, supercomputing, and data-driven methods to design new materials (Conference Presentation)

NASA Astrophysics Data System (ADS)

Jain, Anubhav

2017-04-01

Density functional theory (DFT) simulations solve for the electronic structure of materials starting from the Schrödinger equation. Many case studies have now demonstrated that researchers can often use DFT to design new compounds in the computer (e.g., for batteries, catalysts, and hydrogen storage) before synthesis and characterization in the lab. In this talk, I will focus on how DFT calculations can be executed on large supercomputing resources in order to generate very large data sets on new materials for functional applications. First, I will briefly describe the Materials Project, an effort at LBNL that has virtually characterized over 60,000 materials using DFT and has shared the results with over 17,000 registered users. Next, I will talk about how such data can help discover new materials, describing how preliminary computational screening led to the identification and confirmation of a new family of bulk AMX2 thermoelectric compounds with measured zT reaching 0.8. I will outline future plans for how such data-driven methods can be used to better understand the factors that control thermoelectric behavior, e.g., for the rational design of electronic band structures, in ways that are different from conventional approaches.
IBM PC enhances the world's future

NASA Technical Reports Server (NTRS)

Cox, Jozelle

1988-01-01

Although the purpose of this research is to illustrate the importance of computers to the public, particularly the IBM PC, present examinations will include computers developed before the IBM PC was brought into use. IBM, as well as other computing facilities, began serving the public years ago, and is continuing to find ways to enhance the existence of man. With new developments in supercomputers like the Cray-2, and the recent advances in artificial intelligence programming, the human race is gaining knowledge at a rapid pace. All have benefited from the development of computers in the world; not only have they brought new assets to life, but have made life more and more of a challenge everyday.
GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers

DOE PAGES

Abraham, Mark James; Murtola, Teemu; Schulz, Roland; ...

2015-07-15

GROMACS is one of the most widely used open-source and free software codes in chemistry, used primarily for dynamical simulations of biomolecules. It provides a rich set of calculation types, preparation and analysis tools. Several advanced techniques for free-energy calculations are supported. In version 5, it reaches new performance heights, through several new and enhanced parallelization algorithms. This work on every level; SIMD registers inside cores, multithreading, heterogeneous CPU–GPU acceleration, state-of-the-art 3D domain decomposition, and ensemble-level parallelization through built-in replica exchange and the separate Copernicus framework. Finally, the latest best-in-class compressed trajectory storage format is supported.
GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Abraham, Mark James; Murtola, Teemu; Schulz, Roland

GROMACS is one of the most widely used open-source and free software codes in chemistry, used primarily for dynamical simulations of biomolecules. It provides a rich set of calculation types, preparation and analysis tools. Several advanced techniques for free-energy calculations are supported. In version 5, it reaches new performance heights, through several new and enhanced parallelization algorithms. This work on every level; SIMD registers inside cores, multithreading, heterogeneous CPU–GPU acceleration, state-of-the-art 3D domain decomposition, and ensemble-level parallelization through built-in replica exchange and the separate Copernicus framework. Finally, the latest best-in-class compressed trajectory storage format is supported.
Ice Storm Supercomputer

ScienceCinema

None

2018-05-01

A new Idaho National Laboratory supercomputer is helping scientists create more realistic simulations of nuclear fuel. Dubbed "Ice Storm" this 2048-processor machine allows researchers to model and predict the complex physics behind nuclear reactor behavior. And with a new visualization lab, the team can see the results of its simulations on the big screen. For more information about INL research, visit http://www.facebook.com/idahonationallaboratory.
Open Skies Project Computational Fluid Dynamic Analysis

DTIC Science & Technology

1994-03-01

109 -. -_ _ 9 . CONCLUSIONSI1 f 10. LIST OF REFERENCES _________ ___________112 APPENDIX A: Transition Prediction __________________116 B...Behind the Open Skies Plate 20 8. VSAERO Results on the Alternate Fairing 21 9 . Centerline Cp Comparisons 22 10. VSAERO Wing Effects Study Centerline C...problems. The assistance Mrs. Mary Ann Mages, at Kirtland Supercomputer Center ( PL /SCPR) gave by setting a precedent for supercomputer account
HACC: Extreme Scaling and Performance Across Diverse Architectures

NASA Astrophysics Data System (ADS)

Habib, Salman; Morozov, Vitali; Frontiere, Nicholas; Finkel, Hal; Pope, Adrian; Heitmann, Katrin

2013-11-01

Supercomputing is evolving towards hybrid and accelerator-based architectures with millions of cores. The HACC (Hardware/Hybrid Accelerated Cosmology Code) framework exploits this diverse landscape at the largest scales of problem size, obtaining high scalability and sustained performance. Developed to satisfy the science requirements of cosmological surveys, HACC melds particle and grid methods using a novel algorithmic structure that flexibly maps across architectures, including CPU/GPU, multi/many-core, and Blue Gene systems. We demonstrate the success of HACC on two very different machines, the CPU/GPU system Titan and the BG/Q systems Sequoia and Mira, attaining unprecedented levels of scalable performance. We demonstrate strong and weak scaling on Titan, obtaining up to 99.2% parallel efficiency, evolving 1.1 trillion particles. On Sequoia, we reach 13.94 PFlops (69.2% of peak) and 90% parallel efficiency on 1,572,864 cores, with 3.6 trillion particles, the largest cosmological benchmark yet performed. HACC design concepts are applicable to several other supercomputer applications.
Web-based system for surgical planning and simulation

NASA Astrophysics Data System (ADS)

Eldeib, Ayman M.; Ahmed, Mohamed N.; Farag, Aly A.; Sites, C. B.

1998-10-01

The growing scientific knowledge and rapid progress in medical imaging techniques has led to an increasing demand for better and more efficient methods of remote access to high-performance computer facilities. This paper introduces a web-based telemedicine project that provides interactive tools for surgical simulation and planning. The presented approach makes use of client-server architecture based on new internet technology where clients use an ordinary web browser to view, send, receive and manipulate patients' medical records while the server uses the supercomputer facility to generate online semi-automatic segmentation, 3D visualization, surgical simulation/planning and neuroendoscopic procedures navigation. The supercomputer (SGI ONYX 1000) is located at the Computer Vision and Image Processing Lab, University of Louisville, Kentucky. This system is under development in cooperation with the Department of Neurological Surgery, Alliant Health Systems, Louisville, Kentucky. The server is connected via a network to the Picture Archiving and Communication System at Alliant Health Systems through a DICOM standard interface that enables authorized clients to access patients' images from different medical modalities.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack

20th Edition of TOP500 List of World's Fastest Supercomputers Released MANNHEIM, Germany; KNOXVILLE, Tenn.;&BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 20th edition of the TOP500 list of the world's fastest supercomputers was released today (November 15, 2002). The Earth Simulator supercomputer installed earlier this year at the Earth Simulator Center in Yokohama, Japan, is with its Linpack benchmark performance of 35.86 Tflop/s (trillions of calculations per second) retains the number one position. The No.2 and No.3 positions are held by two new, identical ASCI Q systems at Los Alamos National Laboratorymore » (7.73Tflop/s each). These systems are built by Hewlett-Packard and based on the Alpha Server SC computer system.« less
STAMPS: Software Tool for Automated MRI Post-processing on a supercomputer.

PubMed

Bigler, Don C; Aksu, Yaman; Miller, David J; Yang, Qing X

2009-08-01

This paper describes a Software Tool for Automated MRI Post-processing (STAMP) of multiple types of brain MRIs on a workstation and for parallel processing on a supercomputer (STAMPS). This software tool enables the automation of nonlinear registration for a large image set and for multiple MR image types. The tool uses standard brain MRI post-processing tools (such as SPM, FSL, and HAMMER) for multiple MR image types in a pipeline fashion. It also contains novel MRI post-processing features. The STAMP image outputs can be used to perform brain analysis using Statistical Parametric Mapping (SPM) or single-/multi-image modality brain analysis using Support Vector Machines (SVMs). Since STAMPS is PBS-based, the supercomputer may be a multi-node computer cluster or one of the latest multi-core computers.
Japanese project aims at supercomputer that executes 10 gflops

DOE Office of Scientific and Technical Information (OSTI.GOV)

Burskey, D.

1984-05-03

Dubbed supercom by its multicompany design team, the decade-long project's goal is an engineering supercomputer that can execute 10 billion floating-point operations/s-about 20 times faster than today's supercomputers. The project, guided by Japan's Ministry of International Trade and Industry (MITI) and the Agency of Industrial Science and Technology encompasses three parallel research programs, all aimed at some angle of the superconductor. One program should lead to superfast logic and memory circuits, another to a system architecture that will afford the best performance, and the last to the software that will ultimately control the computer. The work on logic and memorymore » chips is based on: GAAS circuit; Josephson junction devices; and high electron mobility transistor structures. The architecture will involve parallel processing.« less
OlyMPUS - The Ontology-based Metadata Portal for Unified Semantics

NASA Astrophysics Data System (ADS)

Huffer, E.; Gleason, J. L.

2015-12-01

The Ontology-based Metadata Portal for Unified Semantics (OlyMPUS), funded by the NASA Earth Science Technology Office Advanced Information Systems Technology program, is an end-to-end system designed to support data consumers and data providers, enabling the latter to register their data sets and provision them with the semantically rich metadata that drives the Ontology-Driven Interactive Search Environment for Earth Sciences (ODISEES). OlyMPUS leverages the semantics and reasoning capabilities of ODISEES to provide data producers with a semi-automated interface for producing the semantically rich metadata needed to support ODISEES' data discovery and access services. It integrates the ODISEES metadata search system with multiple NASA data delivery tools to enable data consumers to create customized data sets for download to their computers, or for NASA Advanced Supercomputing (NAS) facility registered users, directly to NAS storage resources for access by applications running on NAS supercomputers. A core function of NASA's Earth Science Division is research and analysis that uses the full spectrum of data products available in NASA archives. Scientists need to perform complex analyses that identify correlations and non-obvious relationships across all types of Earth System phenomena. Comprehensive analytics are hindered, however, by the fact that many Earth science data products are disparate and hard to synthesize. Variations in how data are collected, processed, gridded, and stored, create challenges for data interoperability and synthesis, which are exacerbated by the sheer volume of available data. Robust, semantically rich metadata can support tools for data discovery and facilitate machine-to-machine transactions with services such as data subsetting, regridding, and reformatting. Such capabilities are critical to enabling the research activities integral to NASA's strategic plans. However, as metadata requirements increase and competing standards emerge, metadata provisioning becomes increasingly burdensome to data producers. The OlyMPUS system helps data providers produce semantically rich metadata, making their data more accessible to data consumers, and helps data consumers quickly discover and download the right data for their research.
Determination of the vapor-liquid transition of square-well particles using a novel generalized-canonical-ensemble-based method

NASA Astrophysics Data System (ADS)

Zhao, Liang; Xu, Shun; Tu, Yu-Song; Zhou, Xin

2017-06-01

Not Available Project supported by the National Natural Science Foundation for Outstanding Young Scholars, China (Grant No. 11422542), the National Natural Science Foundation of China (Grant Nos. 11605151 and 11675138), and the Shanghai Supercomputer Center of China and Special Program for Applied Research on Super Computation of the NSFC-Guangdong Joint Fund (the second phase).
Supercomputer Simulations Help Develop New Approach to Fight Antibiotic Resistance

ScienceCinema

Zgurskaya, Helen; Smith, Jeremy

2018-06-13

ORNL leveraged powerful supercomputing to support research led by University of Oklahoma scientists to identify chemicals that seek out and disrupt bacterial proteins called efflux pumps, known to be a major cause of antibiotic resistance. By running simulations on Titan, the team selected molecules most likely to target and potentially disable the assembly of efflux pumps found in E. coli bacteria cells.

Infinite possibilities: Computational structures technology

NASA Astrophysics Data System (ADS)

Beam, Sherilee F.

1994-12-01

Computational Fluid Dynamics (or CFD) methods are very familiar to the research community. Even the general public has had some exposure to CFD images, primarily through the news media. However, very little attention has been paid to CST--Computational Structures Technology. Yet, no important design can be completed without it. During the first half of this century, researchers only dreamed of designing and building structures on a computer. Today their dreams have become practical realities as computational methods are used in all phases of design, fabrication and testing of engineering systems. Increasingly complex structures can now be built in even shorter periods of time. Over the past four decades, computer technology has been developing, and early finite element methods have grown from small in-house programs to numerous commercial software programs. When coupled with advanced computing systems, they help engineers make dramatic leaps in designing and testing concepts. The goals of CST include: predicting how a structure will behave under actual operating conditions; designing and complementing other experiments conducted on a structure; investigating microstructural damage or chaotic, unpredictable behavior; helping material developers in improving material systems; and being a useful tool in design systems optimization and sensitivity techniques. Applying CST to a structure problem requires five steps: (1) observe the specific problem; (2) develop a computational model for numerical simulation; (3) develop and assemble software and hardware for running the codes; (4) post-process and interpret the results; and (5) use the model to analyze and design the actual structure. Researchers in both industry and academia continue to make significant contributions to advance this technology with improvements in software, collaborative computing environments and supercomputing systems. As these environments and systems evolve, computational structures technology will evolve. By using CST in the design and operation of future structures systems, engineers will have a better understanding of how a system responds and lasts, more cost-effective methods of designing and testing models, and improved productivity. For informational and educational purposes, a videotape is being produced using both static and dynamic images from research institutions, software and hardware companies, private individuals, and historical photographs and drawings. The extensive number of CST resources indicates its widespread use. Applications run the gamut from simpler university-simulated problems to those requiring solutions on supercomputers. In some cases, an image or an animation will be mapped onto the actual structure to show the relevance of the computer model to the structure.
Infinite possibilities: Computational structures technology

NASA Technical Reports Server (NTRS)

Beam, Sherilee F.

1994-01-01

Computational Fluid Dynamics (or CFD) methods are very familiar to the research community. Even the general public has had some exposure to CFD images, primarily through the news media. However, very little attention has been paid to CST--Computational Structures Technology. Yet, no important design can be completed without it. During the first half of this century, researchers only dreamed of designing and building structures on a computer. Today their dreams have become practical realities as computational methods are used in all phases of design, fabrication and testing of engineering systems. Increasingly complex structures can now be built in even shorter periods of time. Over the past four decades, computer technology has been developing, and early finite element methods have grown from small in-house programs to numerous commercial software programs. When coupled with advanced computing systems, they help engineers make dramatic leaps in designing and testing concepts. The goals of CST include: predicting how a structure will behave under actual operating conditions; designing and complementing other experiments conducted on a structure; investigating microstructural damage or chaotic, unpredictable behavior; helping material developers in improving material systems; and being a useful tool in design systems optimization and sensitivity techniques. Applying CST to a structure problem requires five steps: (1) observe the specific problem; (2) develop a computational model for numerical simulation; (3) develop and assemble software and hardware for running the codes; (4) post-process and interpret the results; and (5) use the model to analyze and design the actual structure. Researchers in both industry and academia continue to make significant contributions to advance this technology with improvements in software, collaborative computing environments and supercomputing systems. As these environments and systems evolve, computational structures technology will evolve. By using CST in the design and operation of future structures systems, engineers will have a better understanding of how a system responds and lasts, more cost-effective methods of designing and testing models, and improved productivity. For informational and educational purposes, a videotape is being produced using both static and dynamic images from research institutions, software and hardware companies, private individuals, and historical photographs and drawings. The extensive number of CST resources indicates its widespread use. Applications run the gamut from simpler university-simulated problems to those requiring solutions on supercomputers. In some cases, an image or an animation will be mapped onto the actual structure to show the relevance of the computer model to the structure. Transferring the digital files to videotape presents a number of problems related to maintaining the quality of the original image, while still producing a broadcast quality videotape. Since researchers normally do not create a computer image using traditional composition theories or video production requirements, often the image loses some of its original digital quality and impact when transferred to videotape. Although many CST images are currently available, those that are edited into the final project must meet two important criteria: they must complement the narration, and they must be broadcast quality when recorded on videotape.
The Pawsey Supercomputer geothermal cooling project

NASA Astrophysics Data System (ADS)

Regenauer-Lieb, K.; Horowitz, F.; Western Australian Geothermal Centre Of Excellence, T.

2010-12-01

The Australian Government has funded the Pawsey supercomputer in Perth, Western Australia, providing computational infrastructure intended to support the future operations of the Australian Square Kilometre Array radiotelescope and to boost next-generation computational geosciences in Australia. Supplementary funds have been directed to the development of a geothermal exploration well to research the potential for direct heat use applications at the Pawsey Centre site. Cooling the Pawsey supercomputer may be achieved by geothermal heat exchange rather than by conventional electrical power cooling, thus reducing the carbon footprint of the Pawsey Centre and demonstrating an innovative green technology that is widely applicable in industry and urban centres across the world. The exploration well is scheduled to be completed in 2013, with drilling due to commence in the third quarter of 2011. One year is allocated to finalizing the design of the exploration, monitoring and research well. Success in the geothermal exploration and research program will result in an industrial-scale geothermal cooling facility at the Pawsey Centre, and will provide a world-class student training environment in geothermal energy systems. A similar system is partially funded and in advanced planning to provide base-load air-conditioning for the main campus of the University of Western Australia. Both systems are expected to draw ~80-95 degrees C water from aquifers lying between 2000 and 3000 meters depth from naturally permeable rocks of the Perth sedimentary basin. The geothermal water will be run through absorption chilling devices, which only require heat (as opposed to mechanical work) to power a chilled water stream adequate to meet the cooling requirements. Once the heat has been removed from the geothermal water, licensing issues require the water to be re-injected back into the aquifer system. These systems are intended to demonstrate the feasibility of powering large-scale air-conditioning systems from the direct use of geothermal power from Hot Sedimentary Aquifer (HSA) systems. HSA systems underlie many of the world's population centers, and thus have the potential to offset a significant fraction of the world's consumption of electrical power for air-conditioning.
The feasibility of an efficient drug design method with high-performance computers.

PubMed

Yamashita, Takefumi; Ueda, Akihiko; Mitsui, Takashi; Tomonaga, Atsushi; Matsumoto, Shunji; Kodama, Tatsuhiko; Fujitani, Hideaki

2015-01-01

In this study, we propose a supercomputer-assisted drug design approach involving all-atom molecular dynamics (MD)-based binding free energy prediction after the traditional design/selection step. Because this prediction is more accurate than the empirical binding affinity scoring of the traditional approach, the compounds selected by the MD-based prediction should be better drug candidates. In this study, we discuss the applicability of the new approach using two examples. Although the MD-based binding free energy prediction has a huge computational cost, it is feasible with the latest 10 petaflop-scale computer. The supercomputer-assisted drug design approach also involves two important feedback procedures: The first feedback is generated from the MD-based binding free energy prediction step to the drug design step. While the experimental feedback usually provides binding affinities of tens of compounds at one time, the supercomputer allows us to simultaneously obtain the binding free energies of hundreds of compounds. Because the number of calculated binding free energies is sufficiently large, the compounds can be classified into different categories whose properties will aid in the design of the next generation of drug candidates. The second feedback, which occurs from the experiments to the MD simulations, is important to validate the simulation parameters. To demonstrate this, we compare the binding free energies calculated with various force fields to the experimental ones. The results indicate that the prediction will not be very successful, if we use an inaccurate force field. By improving/validating such simulation parameters, the next prediction can be made more accurate.
CFD applications: The Lockheed perspective

NASA Technical Reports Server (NTRS)

Miranda, Luis R.

1987-01-01

The Numerical Aerodynamic Simulator (NAS) epitomizes the coming of age of supercomputing and opens exciting horizons in the world of numerical simulation. An overview of supercomputing at Lockheed Corporation in the area of Computational Fluid Dynamics (CFD) is presented. This overview will focus on developments and applications of CFD as an aircraft design tool and will attempt to present an assessment, withing this context, of the state-of-the-art in CFD methodology.
Computational mechanics analysis tools for parallel-vector supercomputers

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.; Nguyen, Duc T.; Baddourah, Majdi; Qin, Jiangning

1993-01-01

Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigensolution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization search analysis and domain decomposition. The source code for many of these algorithms is available.
A Layered Solution for Supercomputing Storage

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grider, Gary

To solve the supercomputing challenge of memory keeping up with processing speed, a team at Los Alamos National Laboratory developed two innovative memory management and storage technologies. Burst buffers peel off data onto flash memory to support the checkpoint/restart paradigm of large simulations. MarFS adds a thin software layer enabling a new tier for campaign storage—based on inexpensive, failure-prone disk drives—between disk drives and tape archives.
Achieving supercomputer performance for neural net simulation with an array of digital signal processors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Muller, U.A.; Baumle, B.; Kohler, P.

1992-10-01

Music, a DSP-based system with a parallel distributed-memory architecture, provides enormous computing power yet retains the flexibility of a general-purpose computer. Reaching a peak performance of 2.7 Gflops at a significantly lower cost, power consumption, and space requirement than conventional supercomputers, Music is well suited to computationally intensive applications such as neural network simulation. 12 refs., 9 figs., 2 tabs.
A Heterogeneous High-Performance System for Computational and Computer Science

DTIC Science & Technology

2016-11-15

Patents Submitted Patents Awarded Awards Graduate Students Names of Post Doctorates Names of Faculty Supported Names of Under Graduate students supported...team of research faculty from the departments of computer science and natural science at Bowie State University. The supercomputer is not only to...accelerated HPC systems. The supercomputer is also ideal for the research conducted in the Department of Natural Science, as research faculty work on
LLMapReduce: Multi-Lingual Map-Reduce for Supercomputing Environments

DTIC Science & Technology

2015-11-20

1990s. Popularized by Google [36] and Apache Hadoop [37], map-reduce has become a staple technology of the ever- growing big data community...Lexington, MA, U.S.A Abstract— The map-reduce parallel programming model has become extremely popular in the big data community. Many big data ...to big data users running on a supercomputer. LLMapReduce dramatically simplifies map-reduce programming by providing simple parallel programming
2011 Computation Directorate Annual Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Crawford, D L

2012-04-11

From its founding in 1952 until today, Lawrence Livermore National Laboratory (LLNL) has made significant strategic investments to develop high performance computing (HPC) and its application to national security and basic science. Now, 60 years later, the Computation Directorate and its myriad resources and capabilities have become a key enabler for LLNL programs and an integral part of the effort to support our nation's nuclear deterrent and, more broadly, national security. In addition, the technological innovation HPC makes possible is seen as vital to the nation's economic vitality. LLNL, along with other national laboratories, is working to make supercomputing capabilitiesmore » and expertise available to industry to boost the nation's global competitiveness. LLNL is on the brink of an exciting milestone with the 2012 deployment of Sequoia, the National Nuclear Security Administration's (NNSA's) 20-petaFLOP/s resource that will apply uncertainty quantification to weapons science. Sequoia will bring LLNL's total computing power to more than 23 petaFLOP/s-all brought to bear on basic science and national security needs. The computing systems at LLNL provide game-changing capabilities. Sequoia and other next-generation platforms will enable predictive simulation in the coming decade and leverage industry trends, such as massively parallel and multicore processors, to run petascale applications. Efficient petascale computing necessitates refining accuracy in materials property data, improving models for known physical processes, identifying and then modeling for missing physics, quantifying uncertainty, and enhancing the performance of complex models and algorithms in macroscale simulation codes. Nearly 15 years ago, NNSA's Accelerated Strategic Computing Initiative (ASCI), now called the Advanced Simulation and Computing (ASC) Program, was the critical element needed to shift from test-based confidence to science-based confidence. Specifically, ASCI/ASC accelerated the development of simulation capabilities necessary to ensure confidence in the nuclear stockpile-far exceeding what might have been achieved in the absence of a focused initiative. While stockpile stewardship research pushed LLNL scientists to develop new computer codes, better simulation methods, and improved visualization technologies, this work also stimulated the exploration of HPC applications beyond the standard sponsor base. As LLNL advances to a petascale platform and pursues exascale computing (1,000 times faster than Sequoia), ASC will be paramount to achieving predictive simulation and uncertainty quantification. Predictive simulation and quantifying the uncertainty of numerical predictions where little-to-no data exists demands exascale computing and represents an expanding area of scientific research important not only to nuclear weapons, but to nuclear attribution, nuclear reactor design, and understanding global climate issues, among other fields. Aside from these lofty goals and challenges, computing at LLNL is anything but 'business as usual.' International competition in supercomputing is nothing new, but the HPC community is now operating in an expanded, more aggressive climate of global competitiveness. More countries understand how science and technology research and development are inextricably linked to economic prosperity, and they are aggressively pursuing ways to integrate HPC technologies into their native industrial and consumer products. In the interest of the nation's economic security and the science and technology that underpins it, LLNL is expanding its portfolio and forging new collaborations. We must ensure that HPC remains an asymmetric engine of innovation for the Laboratory and for the U.S. and, in doing so, protect our research and development dynamism and the prosperity it makes possible. One untapped area of opportunity LLNL is pursuing is to help U.S. industry understand how supercomputing can benefit their business. Industrial investment in HPC applications has historically been limited by the prohibitive cost of entry, the inaccessibility of software to run the powerful systems, and the years it takes to grow the expertise to develop codes and run them in an optimal way. LLNL is helping industry better compete in the global market place by providing access to some of the world's most powerful computing systems, the tools to run them, and the experts who are adept at using them. Our scientists are collaborating side by side with industrial partners to develop solutions to some of industry's toughest problems. The goal of the Livermore Valley Open Campus High Performance Computing Innovation Center is to allow American industry the opportunity to harness the power of supercomputing by leveraging the scientific and computational expertise at LLNL in order to gain a competitive advantage in the global economy.« less
Accelerating cardiac bidomain simulations using graphics processing units.

PubMed

Neic, A; Liebmann, M; Hoetzl, E; Mitchell, L; Vigmond, E J; Haase, G; Plank, G

2012-08-01

Anatomically realistic and biophysically detailed multiscale computer models of the heart are playing an increasingly important role in advancing our understanding of integrated cardiac function in health and disease. Such detailed simulations, however, are computationally vastly demanding, which is a limiting factor for a wider adoption of in-silico modeling. While current trends in high-performance computing (HPC) hardware promise to alleviate this problem, exploiting the potential of such architectures remains challenging since strongly scalable algorithms are necessitated to reduce execution times. Alternatively, acceleration technologies such as graphics processing units (GPUs) are being considered. While the potential of GPUs has been demonstrated in various applications, benefits in the context of bidomain simulations where large sparse linear systems have to be solved in parallel with advanced numerical techniques are less clear. In this study, the feasibility of multi-GPU bidomain simulations is demonstrated by running strong scalability benchmarks using a state-of-the-art model of rabbit ventricles. The model is spatially discretized using the finite element methods (FEM) on fully unstructured grids. The GPU code is directly derived from a large pre-existing code, the Cardiac Arrhythmia Research Package (CARP), with very minor perturbation of the code base. Overall, bidomain simulations were sped up by a factor of 11.8 to 16.3 in benchmarks running on 6-20 GPUs compared to the same number of CPU cores. To match the fastest GPU simulation which engaged 20 GPUs, 476 CPU cores were required on a national supercomputing facility.
Accelerating Cardiac Bidomain Simulations Using Graphics Processing Units

PubMed Central

Neic, Aurel; Liebmann, Manfred; Hoetzl, Elena; Mitchell, Lawrence; Vigmond, Edward J.; Haase, Gundolf

2013-01-01

Anatomically realistic and biophysically detailed multiscale computer models of the heart are playing an increasingly important role in advancing our understanding of integrated cardiac function in health and disease. Such detailed simulations, however, are computationally vastly demanding, which is a limiting factor for a wider adoption of in-silico modeling. While current trends in high-performance computing (HPC) hardware promise to alleviate this problem, exploiting the potential of such architectures remains challenging since strongly scalable algorithms are necessitated to reduce execution times. Alternatively, acceleration technologies such as graphics processing units (GPUs) are being considered. While the potential of GPUs has been demonstrated in various applications, benefits in the context of bidomain simulations where large sparse linear systems have to be solved in parallel with advanced numerical techniques are less clear. In this study, the feasibility of multi-GPU bidomain simulations is demonstrated by running strong scalability benchmarks using a state-of-the-art model of rabbit ventricles. The model is spatially discretized using the finite element methods (FEM) on fully unstructured grids. The GPU code is directly derived from a large pre-existing code, the Cardiac Arrhythmia Research Package (CARP), with very minor perturbation of the code base. Overall, bidomain simulations were sped up by a factor of 11.8 to 16.3 in benchmarks running on 6–20 GPUs compared to the same number of CPU cores. To match the fastest GPU simulation which engaged 20GPUs, 476 CPU cores were required on a national supercomputing facility. PMID:22692867
The Revolutionary Vertical Lift Technology (RVLT) Project

NASA Technical Reports Server (NTRS)

Yamauchi, Gloria K.

2018-01-01

The Revolutionary Vertical Lift Technology (RVLT) Project is one of six projects in the Advanced Air Vehicles Program (AAVP) of the NASA Aeronautics Research Mission Directorate. The overarching goal of the RVLT Project is to develop and validate tools, technologies, and concepts to overcome key barriers for vertical lift vehicles. The project vision is to enable the next generation of vertical lift vehicles with aggressive goals for efficiency, noise, and emissions, to expand current capabilities and develop new commercial markets. The RVLT Project invests in technologies that support conventional, non-conventional, and emerging vertical-lift aircraft in the very light to heavy vehicle classes. Research areas include acoustic, aeromechanics, drive systems, engines, icing, hybrid-electric systems, impact dynamics, experimental techniques, computational methods, and conceptual design. The project research is executed at NASA Ames, Glenn, and Langley Research Centers; the research extensively leverages partnerships with the US Army, the Federal Aviation Administration, industry, and academia. The primary facilities used by the project for testing of vertical-lift technologies include the 14- by 22-Ft Wind Tunnel, Icing Research Tunnel, National Full-Scale Aerodynamics Complex, 7- by 10-Ft Wind Tunnel, Rotor Test Cell, Landing and Impact Research facility, Compressor Test Facility, Drive System Test Facilities, Transonic Turbine Blade Cascade Facility, Vertical Motion Simulator, Mobile Acoustic Facility, Exterior Effects Synthesis and Simulation Lab, and the NASA Advanced Supercomputing Complex. To learn more about the RVLT Project, please stop by booth #1004 or visit their website at https://www.nasa.gov/aeroresearch/programs/aavp/rvlt.
Operational uses of ACTS technology

NASA Astrophysics Data System (ADS)

Gedney, Richard T.; Wright, David L.; Balombin, Joseph L.; Sohn, Philip Y.; Cashman, William F.; Stern, Alan L.; Golding, Len; Palmer, Larry

1992-03-01

The NASA Advanced Communications Technology Satellite (ACTS) provides the technologies for very high gain hopping spot beam antennas, on-board baseband routing and processing, and wideband (1 GHz) Ka-band transponders. A number of studies have recently been completed using the experience gained in developing the actual ACTS system hardware to quantify how well the ACTS technology can be used in future operational systems. This paper provides a summary of these study results including the spacecraft (S/C) weight per unit circuit for providing services by ACTS technologies as compared to present-day satellites. The uses of the ACTS technology discussed are for providing T1 VSAT mesh networks, aeronautical mobile communications, supervisory control and data acquisition (SCADA) services, and high data rate networks for supercomputer and other applications.
NAS Grid Benchmarks. 1.0

NASA Technical Reports Server (NTRS)

VanderWijngaart, Rob; Frumkin, Michael; Biegel, Bryan A. (Technical Monitor)

2002-01-01

We provide a paper-and-pencil specification of a benchmark suite for computational grids. It is based on the NAS (NASA Advanced Supercomputing) Parallel Benchmarks (NPB) and is called the NAS Grid Benchmarks (NGB). NGB problems are presented as data flow graphs encapsulating an instance of a slightly modified NPB task in each graph node, which communicates with other nodes by sending/receiving initialization data. Like NPB, NGB specifies several different classes (problem sizes). In this report we describe classes S, W, and A, and provide verification values for each. The implementor has the freedom to choose any language, grid environment, security model, fault tolerance/error correction mechanism, etc., as long as the resulting implementation passes the verification test and reports the turnaround time of the benchmark.
A site oriented supercomputer for theoretical physics: The Fermilab Advanced Computer Program Multi Array Processor System (ACMAPS)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nash, T.; Atac, R.; Cook, A.

1989-03-06

The ACPMAPS multipocessor is a highly cost effective, local memory parallel computer with a hypercube or compound hypercube architecture. Communication requires the attention of only the two communicating nodes. The design is aimed at floating point intensive, grid like problems, particularly those with extreme computing requirements. The processing nodes of the system are single board array processors, each with a peak power of 20 Mflops, supported by 8 Mbytes of data and 2 Mbytes of instruction memory. The system currently being assembled has a peak power of 5 Gflops. The nodes are based on the Weitek XL Chip set. Themore » system delivers performance at approximately $300/Mflop. 8 refs., 4 figs.« less
New NAS Parallel Benchmarks Results

NASA Technical Reports Server (NTRS)

Yarrow, Maurice; Saphir, William; VanderWijngaart, Rob; Woo, Alex; Kutler, Paul (Technical Monitor)

1997-01-01

NPB2 (NAS (NASA Advanced Supercomputing) Parallel Benchmarks 2) is an implementation, based on Fortran and the MPI (message passing interface) message passing standard, of the original NAS Parallel Benchmark specifications. NPB2 programs are run with little or no tuning, in contrast to NPB vendor implementations, which are highly optimized for specific architectures. NPB2 results complement, rather than replace, NPB results. Because they have not been optimized by vendors, NPB2 implementations approximate the performance a typical user can expect for a portable parallel program on distributed memory parallel computers. Together these results provide an insightful comparison of the real-world performance of high-performance computers. New NPB2 features: New implementation (CG), new workstation class problem sizes, new serial sample versions, more performance statistics.
Chasing Exoplanets

NASA Technical Reports Server (NTRS)

Jenkins, Jon M.

2017-01-01

NASA's Kepler Mission was launched in March 2009 as NASA's first mission capable of finding Earth-size planets orbiting in the habitable zone of Sun-like stars, that range of distances for which liquid water would pool on the surface of a rocky planet. Kepler has discovered over 1000 planets and over 4600 candidates, many of them as small as the Earth. Today, Kepler's amazing success seems to be a fait accompli to those unfamiliar with her history. But twenty years ago, there were no planets known outside our solar system, and few people believed it was possible to detect tiny Earth-size planets orbiting other stars. Motivating NASA to select Kepler for launch required a confluence of the right detector technology, advances in signal processing and algorithms, and the power of supercomputing.
Deterministic and stochastic methods of calculation of polarization characteristics of radiation in natural environment

NASA Astrophysics Data System (ADS)

Strelkov, S. A.; Sushkevich, T. A.; Maksakova, S. V.

2017-11-01

We are talking about russian achievements of the world level in the theory of radiation transfer, taking into account its polarization in natural media and the current scientific potential developing in Russia, which adequately provides the methodological basis for theoretically-calculated research of radiation processes and radiation fields in natural media using supercomputers and mass parallelism. A new version of the matrix transfer operator is proposed for solving problems of polarized radiation transfer in heterogeneous media by the method of influence functions, when deterministic and stochastic methods can be combined.

NASA high performance computing and communications program

NASA Technical Reports Server (NTRS)

Holcomb, Lee; Smith, Paul; Hunter, Paul

1993-01-01

The National Aeronautics and Space Administration's HPCC program is part of a new Presidential initiative aimed at producing a 1000-fold increase in supercomputing speed and a 100-fold improvement in available communications capability by 1997. As more advanced technologies are developed under the HPCC program, they will be used to solve NASA's 'Grand Challenge' problems, which include improving the design and simulation of advanced aerospace vehicles, allowing people at remote locations to communicate more effectively and share information, increasing scientist's abilities to model the Earth's climate and forecast global environmental trends, and improving the development of advanced spacecraft. NASA's HPCC program is organized into three projects which are unique to the agency's mission: the Computational Aerosciences (CAS) project, the Earth and Space Sciences (ESS) project, and the Remote Exploration and Experimentation (REE) project. An additional project, the Basic Research and Human Resources (BRHR) project exists to promote long term research in computer science and engineering and to increase the pool of trained personnel in a variety of scientific disciplines. This document presents an overview of the objectives and organization of these projects as well as summaries of individual research and development programs within each project.
Applications of Massive Mathematical Computations

DTIC Science & Technology

1990-04-01

particles from the first principles of QCD . This problem is under intensive numerical study 11-6 using special purpose parallel supercomputers in...several places around the world. The method used here is the Monte Carlo integration for a fixed 3-D plus time lattices . Reliable results are still years...mathematical and theoretical physics, but its most promising applications are in the numerical realization of QCD computations. Our programs for the solution
Quality use of the computer: Computational mechanics, artificial intelligence, robotics, and acoustic sensing; Proceedings of the ASME/JSME Pressure Vessels and Piping Conference, Honolulu, HI, July 23-27, 1989

NASA Astrophysics Data System (ADS)

Cory, J. F., Jr.; Gordon, J. L.; Miyoshi, T.; Suzuki, K.

1989-06-01

Papers are presented on the use of microcomputers, supercomputers, and workstations in solid and structural mechanics. Artificial intelligence technology, the development and use of expert systems, and research in the area of robotics are discussed. Attention is also given to probabilistic finite element and boundary element methods and acoustic sensing.
Adapting iterative algorithms for solving large sparse linear systems for efficient use on the CDC CYBER 205

NASA Technical Reports Server (NTRS)

Kincaid, D. R.; Young, D. M.

1984-01-01

Adapting and designing mathematical software to achieve optimum performance on the CYBER 205 is discussed. Comments and observations are made in light of recent work done on modifying the ITPACK software package and on writing new software for vector supercomputers. The goal was to develop very efficient vector algorithms and software for solving large sparse linear systems using iterative methods.
Computational mechanics analysis tools for parallel-vector supercomputers

NASA Technical Reports Server (NTRS)

Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.; Qin, J.

1993-01-01

Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigen-solution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization algorithm and domain decomposition. The source code for many of these algorithms is available from NASA Langley.
A Layered Solution for Supercomputing Storage

ScienceCinema

Grider, Gary

2018-06-13

To solve the supercomputing challenge of memory keeping up with processing speed, a team at Los Alamos National Laboratory developed two innovative memory management and storage technologies. Burst buffers peel off data onto flash memory to support the checkpoint/restart paradigm of large simulations. MarFS adds a thin software layer enabling a new tier for campaign storageâbased on inexpensive, failure-prone disk drivesâbetween disk drives and tape archives.
A Long History of Supercomputing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grider, Gary

As part of its national security science mission, Los Alamos National Laboratory and HPC have a long, entwined history dating back to the earliest days of computing. From bringing the first problem to the nation’s first computer to building the first machine to break the petaflop barrier, Los Alamos holds many “firsts” in HPC breakthroughs. Today, supercomputers are integral to stockpile stewardship and the Laboratory continues to work with vendors in developing the future of HPC.
Introducing Argonne’s Theta Supercomputer

DOE Office of Scientific and Technical Information (OSTI.GOV)

None

Theta, the Argonne Leadership Computing Facility’s (ALCF) new Intel-Cray supercomputer, is officially open to the research community. Theta’s massively parallel, many-core architecture puts the ALCF on the path to Aurora, the facility’s future Intel-Cray system. Capable of nearly 10 quadrillion calculations per second, Theta enables researchers to break new ground in scientific investigations that range from modeling the inner workings of the brain to developing new materials for renewable energy applications.
Graphics supercomputer for computational fluid dynamics research

NASA Astrophysics Data System (ADS)

Liaw, Goang S.

1994-11-01

The objective of this project is to purchase a state-of-the-art graphics supercomputer to improve the Computational Fluid Dynamics (CFD) research capability at Alabama A & M University (AAMU) and to support the Air Force research projects. A cutting-edge graphics supercomputer system, Onyx VTX, from Silicon Graphics Computer Systems (SGI), was purchased and installed. Other equipment including a desktop personal computer, PC-486 DX2 with a built-in 10-BaseT Ethernet card, a 10-BaseT hub, an Apple Laser Printer Select 360, and a notebook computer from Zenith were also purchased. A reading room has been converted to a research computer lab by adding some furniture and an air conditioning unit in order to provide an appropriate working environments for researchers and the purchase equipment. All the purchased equipment were successfully installed and are fully functional. Several research projects, including two existing Air Force projects, are being performed using these facilities.
Modelling sodium cobaltate by mapping onto magnetic Ising model

NASA Astrophysics Data System (ADS)

Gemperline, Patrick; Morris, David Jonathan Pryce

Fast Ion conductors are a class of crystals that are frequently used as battery materials, especially in smart phones, laptops, and other portable devices. Sodium Cobalt Oxide, NaxCoO2, falls into this class of crystals, but is unique because it possesses the ability to act as a thermoelectric material and a superconductor at different concentrations of Na+. The crystal lattice is mapped onto an Ising Magnetic Spin model and a Monte-Carol Simulation is used to find the most energetically favorable configuration of spins. This spin configuration is mapped back to the crystal lattice resulting in the most stable crystal structure of Sodium Cobalt Oxide at various concentrations. Knowing the atomic structures of the crystals will aid in the research of the materials capabilities and the possible uses of the material commercially. Ohio Supercomputer Center. 1987. Ohio Supercomputer Center. Columbus OH: Ohio Supercomputer Center. and the John Hauck Foundation.
Final Scientific Report: A Scalable Development Environment for Peta-Scale Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Karbach, Carsten; Frings, Wolfgang

2013-02-22

This document is the final scientific report of the project DE-SC000120 (A scalable Development Environment for Peta-Scale Computing). The objective of this project is the extension of the Parallel Tools Platform (PTP) for applying it to peta-scale systems. PTP is an integrated development environment for parallel applications. It comprises code analysis, performance tuning, parallel debugging and system monitoring. The contribution of the Juelich Supercomputing Centre (JSC) aims to provide a scalable solution for system monitoring of supercomputers. This includes the development of a new communication protocol for exchanging status data between the target remote system and the client running PTP.more » The communication has to work for high latency. PTP needs to be implemented robustly and should hide the complexity of the supercomputer's architecture in order to provide a transparent access to various remote systems via a uniform user interface. This simplifies the porting of applications to different systems, because PTP functions as abstraction layer between parallel application developer and compute resources. The common requirement for all PTP components is that they have to interact with the remote supercomputer. E.g. applications are built remotely and performance tools are attached to job submissions and their output data resides on the remote system. Status data has to be collected by evaluating outputs of the remote job scheduler and the parallel debugger needs to control an application executed on the supercomputer. The challenge is to provide this functionality for peta-scale systems in real-time. The client server architecture of the established monitoring application LLview, developed by the JSC, can be applied to PTP's system monitoring. LLview provides a well-arranged overview of the supercomputer's current status. A set of statistics, a list of running and queued jobs as well as a node display mapping running jobs to their compute resources form the user display of LLview. These monitoring features have to be integrated into the development environment. Besides showing the current status PTP's monitoring also needs to allow for submitting and canceling user jobs. Monitoring peta-scale systems especially deals with presenting the large amount of status data in a useful manner. Users require to select arbitrary levels of detail. The monitoring views have to provide a quick overview of the system state, but also need to allow for zooming into specific parts of the system, into which the user is interested in. At present, the major batch systems running on supercomputers are PBS, TORQUE, ALPS and LoadLeveler, which have to be supported by both the monitoring and the job controlling component. Finally, PTP needs to be designed as generic as possible, so that it can be extended for future batch systems.« less
Software Engineering Support of the Third Round of Scientific Grand Challenge Investigations: Earth System Modeling Software Framework Survey

NASA Technical Reports Server (NTRS)

Talbot, Bryan; Zhou, Shu-Jia; Higgins, Glenn; Zukor, Dorothy (Technical Monitor)

2002-01-01

One of the most significant challenges in large-scale climate modeling, as well as in high-performance computing in other scientific fields, is that of effectively integrating many software models from multiple contributors. A software framework facilitates the integration task, both in the development and runtime stages of the simulation. Effective software frameworks reduce the programming burden for the investigators, freeing them to focus more on the science and less on the parallel communication implementation. while maintaining high performance across numerous supercomputer and workstation architectures. This document surveys numerous software frameworks for potential use in Earth science modeling. Several frameworks are evaluated in depth, including Parallel Object-Oriented Methods and Applications (POOMA), Cactus (from (he relativistic physics community), Overture, Goddard Earth Modeling System (GEMS), the National Center for Atmospheric Research Flux Coupler, and UCLA/UCB Distributed Data Broker (DDB). Frameworks evaluated in less detail include ROOT, Parallel Application Workspace (PAWS), and Advanced Large-Scale Integrated Computational Environment (ALICE). A host of other frameworks and related tools are referenced in this context. The frameworks are evaluated individually and also compared with each other.
Multibillion-atom Molecular Dynamics Simulations of Plasticity, Spall, and Ejecta

NASA Astrophysics Data System (ADS)

Germann, Timothy C.

2007-06-01

Modern supercomputing platforms, such as the IBM BlueGene/L at Lawrence Livermore National Laboratory and the Roadrunner hybrid supercomputer being built at Los Alamos National Laboratory, are enabling large-scale classical molecular dynamics simulations of phenomena that were unthinkable just a few years ago. Using either the embedded atom method (EAM) description of simple (close-packed) metals, or modified EAM (MEAM) models of more complex solids and alloys with mixed covalent and metallic character, simulations containing billions to trillions of atoms are now practical, reaching volumes in excess of a cubic micron. In order to obtain any new physical insights, however, it is equally important that the analysis of such systems be tractable. This is in fact possible, in large part due to our highly efficient parallel visualization code, which enables the rendering of atomic spheres, Eulerian cells, and other geometric objects in a matter of minutes, even for tens of thousands of processors and billions of atoms. After briefly describing the BlueGene/L and Roadrunner architectures, and the code optimization strategies that were employed, results obtained thus far on BlueGene/L will be reviewed, including: (1) shock compression and release of a defective EAM Cu sample, illustrating the plastic deformation accompanying void collapse as well as the subsequent void growth and linkup upon release; (2) solid-solid martensitic phase transition in shock-compressed MEAM Ga; and (3) Rayleigh-Taylor fluid instability modeled using large-scale direct simulation Monte Carlo (DSMC) simulations. I will also describe our initial experiences utilizing Cell Broadband Engine processors (developed for the Sony PlayStation 3), and planned simulation studies of ejecta and spall failure in polycrystalline metals that will be carried out when the full Petaflop Opteron/Cell Roadrunner supercomputer is assembled in mid-2008.
Green Supercomputing at Argonne

ScienceCinema

Beckman, Pete

2018-02-07

Pete Beckman, head of Argonne's Leadership Computing Facility (ALCF) talks about Argonne National Laboratory's green supercomputingâeverything from designing algorithms to use fewer kilowatts per operation to using cold Chicago winter air to cool the machine more efficiently. Argonne was recognized for green computing in the 2009 HPCwire Readers Choice Awards. More at http://www.anl.gov/Media_Center/News/2009/news091117.html Read more about the Argonne Leadership Computing Facility at http://www.alcf.anl.gov/
Unified, Cross-Platform, Open-Source Library Package for High-Performance Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kozacik, Stephen

Compute power is continually increasing, but this increased performance is largely found in sophisticated computing devices and supercomputer resources that are difficult to use, resulting in under-utilization. We developed a unified set of programming tools that will allow users to take full advantage of the new technology by allowing them to work at a level abstracted away from the platform specifics, encouraging the use of modern computing systems, including government-funded supercomputer facilities.
A performance comparison of scalar, vector, and concurrent vector computers including supercomputers for modeling transport of reactive contaminants in groundwater

NASA Astrophysics Data System (ADS)

Tripathi, Vijay S.; Yeh, G. T.

1993-06-01

Sophisticated and highly computation-intensive models of transport of reactive contaminants in groundwater have been developed in recent years. Application of such models to real-world contaminant transport problems, e.g., simulation of groundwater transport of 10-15 chemically reactive elements (e.g., toxic metals) and relevant complexes and minerals in two and three dimensions over a distance of several hundred meters, requires high-performance computers including supercomputers. Although not widely recognized as such, the computational complexity and demand of these models compare with well-known computation-intensive applications including weather forecasting and quantum chemical calculations. A survey of the performance of a variety of available hardware, as measured by the run times for a reactive transport model HYDROGEOCHEM, showed that while supercomputers provide the fastest execution times for such problems, relatively low-cost reduced instruction set computer (RISC) based scalar computers provide the best performance-to-price ratio. Because supercomputers like the Cray X-MP are inherently multiuser resources, often the RISC computers also provide much better turnaround times. Furthermore, RISC-based workstations provide the best platforms for "visualization" of groundwater flow and contaminant plumes. The most notable result, however, is that current workstations costing less than $10,000 provide performance within a factor of 5 of a Cray X-MP.
Collaborating CPU and GPU for large-scale high-order CFD simulations with complex grids on the TianHe-1A supercomputer

NASA Astrophysics Data System (ADS)

Xu, Chuanfu; Deng, Xiaogang; Zhang, Lilun; Fang, Jianbin; Wang, Guangxue; Jiang, Yi; Cao, Wei; Che, Yonggang; Wang, Yongxian; Wang, Zhenghua; Liu, Wei; Cheng, Xinghua

2014-12-01

Programming and optimizing complex, real-world CFD codes on current many-core accelerated HPC systems is very challenging, especially when collaborating CPUs and accelerators to fully tap the potential of heterogeneous systems. In this paper, with a tri-level hybrid and heterogeneous programming model using MPI + OpenMP + CUDA, we port and optimize our high-order multi-block structured CFD software HOSTA on the GPU-accelerated TianHe-1A supercomputer. HOSTA adopts two self-developed high-order compact definite difference schemes WCNS and HDCS that can simulate flows with complex geometries. We present a dual-level parallelization scheme for efficient multi-block computation on GPUs and perform particular kernel optimizations for high-order CFD schemes. The GPU-only approach achieves a speedup of about 1.3 when comparing one Tesla M2050 GPU with two Xeon X5670 CPUs. To achieve a greater speedup, we collaborate CPU and GPU for HOSTA instead of using a naive GPU-only approach. We present a novel scheme to balance the loads between the store-poor GPU and the store-rich CPU. Taking CPU and GPU load balance into account, we improve the maximum simulation problem size per TianHe-1A node for HOSTA by 2.3×, meanwhile the collaborative approach can improve the performance by around 45% compared to the GPU-only approach. Further, to scale HOSTA on TianHe-1A, we propose a gather/scatter optimization to minimize PCI-e data transfer times for ghost and singularity data of 3D grid blocks, and overlap the collaborative computation and communication as far as possible using some advanced CUDA and MPI features. Scalability tests show that HOSTA can achieve a parallel efficiency of above 60% on 1024 TianHe-1A nodes. With our method, we have successfully simulated an EET high-lift airfoil configuration containing 800M cells and China's large civil airplane configuration containing 150M cells. To our best knowledge, those are the largest-scale CPU-GPU collaborative simulations that solve realistic CFD problems with both complex configurations and high-order schemes.
Real-time tsunami inundation forecasting and damage mapping towards enhancing tsunami disaster resiliency

NASA Astrophysics Data System (ADS)

Koshimura, S.; Hino, R.; Ohta, Y.; Kobayashi, H.; Musa, A.; Murashima, Y.

2014-12-01

With use of modern computing power and advanced sensor networks, a project is underway to establish a new system of real-time tsunami inundation forecasting, damage estimation and mapping to enhance society's resilience in the aftermath of major tsunami disaster. The system consists of fusion of real-time crustal deformation monitoring/fault model estimation by Ohta et al. (2012), high-performance real-time tsunami propagation/inundation modeling with NEC's vector supercomputer SX-ACE, damage/loss estimation models (Koshimura et al., 2013), and geo-informatics. After a major (near field) earthquake is triggered, the first response of the system is to identify the tsunami source model by applying RAPiD Algorithm (Ohta et al., 2012) to observed RTK-GPS time series at GEONET sites in Japan. As performed in the data obtained during the 2011 Tohoku event, we assume less than 10 minutes as the acquisition time of the source model. Given the tsunami source, the system moves on to running tsunami propagation and inundation model which was optimized on the vector supercomputer SX-ACE to acquire the estimation of time series of tsunami at offshore/coastal tide gauges to determine tsunami travel and arrival time, extent of inundation zone, maximum flow depth distribution. The implemented tsunami numerical model is based on the non-linear shallow-water equations discretized by finite difference method. The merged bathymetry and topography grids are prepared with 10 m resolution to better estimate the tsunami inland penetration. Given the maximum flow depth distribution, the system performs GIS analysis to determine the numbers of exposed population and structures using census data, then estimates the numbers of potential death and damaged structures by applying tsunami fragility curve (Koshimura et al., 2013). Since the tsunami source model is determined, the model is supposed to complete the estimation within 10 minutes. The results are disseminated as mapping products to responders and stakeholders, e.g. national and regional municipalities, to be utilized for their emergency/response activities. In 2014, the system is verified through the case studies of 2011 Tohoku event and potential earthquake scenarios along Nankai Trough with regard to its capability and robustness.
Collaborating CPU and GPU for large-scale high-order CFD simulations with complex grids on the TianHe-1A supercomputer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xu, Chuanfu, E-mail: xuchuanfu@nudt.edu.cn; Deng, Xiaogang; Zhang, Lilun

Programming and optimizing complex, real-world CFD codes on current many-core accelerated HPC systems is very challenging, especially when collaborating CPUs and accelerators to fully tap the potential of heterogeneous systems. In this paper, with a tri-level hybrid and heterogeneous programming model using MPI + OpenMP + CUDA, we port and optimize our high-order multi-block structured CFD software HOSTA on the GPU-accelerated TianHe-1A supercomputer. HOSTA adopts two self-developed high-order compact definite difference schemes WCNS and HDCS that can simulate flows with complex geometries. We present a dual-level parallelization scheme for efficient multi-block computation on GPUs and perform particular kernel optimizations formore » high-order CFD schemes. The GPU-only approach achieves a speedup of about 1.3 when comparing one Tesla M2050 GPU with two Xeon X5670 CPUs. To achieve a greater speedup, we collaborate CPU and GPU for HOSTA instead of using a naive GPU-only approach. We present a novel scheme to balance the loads between the store-poor GPU and the store-rich CPU. Taking CPU and GPU load balance into account, we improve the maximum simulation problem size per TianHe-1A node for HOSTA by 2.3×, meanwhile the collaborative approach can improve the performance by around 45% compared to the GPU-only approach. Further, to scale HOSTA on TianHe-1A, we propose a gather/scatter optimization to minimize PCI-e data transfer times for ghost and singularity data of 3D grid blocks, and overlap the collaborative computation and communication as far as possible using some advanced CUDA and MPI features. Scalability tests show that HOSTA can achieve a parallel efficiency of above 60% on 1024 TianHe-1A nodes. With our method, we have successfully simulated an EET high-lift airfoil configuration containing 800M cells and China's large civil airplane configuration containing 150M cells. To our best knowledge, those are the largest-scale CPU–GPU collaborative simulations that solve realistic CFD problems with both complex configurations and high-order schemes.« less
ExM:System Support for Extreme-Scale, Many-Task Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Katz, Daniel S

The ever-increasing power of supercomputer systems is both driving and enabling the emergence of new problem-solving methods that require the effi cient execution of many concurrent and interacting tasks. Methodologies such as rational design (e.g., in materials science), uncertainty quanti fication (e.g., in engineering), parameter estimation (e.g., for chemical and nuclear potential functions, and in economic energy systems modeling), massive dynamic graph pruning (e.g., in phylogenetic searches), Monte-Carlo- based iterative fi xing (e.g., in protein structure prediction), and inverse modeling (e.g., in reservoir simulation) all have these requirements. These many-task applications frequently have aggregate computing needs that demand the fastestmore » computers. For example, proposed next-generation climate model ensemble studies will involve 1,000 or more runs, each requiring 10,000 cores for a week, to characterize model sensitivity to initial condition and parameter uncertainty. The goal of the ExM project is to achieve the technical advances required to execute such many-task applications efficiently, reliably, and easily on petascale and exascale computers. In this way, we will open up extreme-scale computing to new problem solving methods and application classes. In this document, we report on combined technical progress of the collaborative ExM project, and the institutional financial status of the portion of the project at University of Chicago, over the rst 8 months (through April 30, 2011)« less

Development of seismic tomography software for hybrid supercomputers

NASA Astrophysics Data System (ADS)

Nikitin, Alexandr; Serdyukov, Alexandr; Duchkov, Anton

2015-04-01

Seismic tomography is a technique used for computing velocity model of geologic structure from first arrival travel times of seismic waves. The technique is used in processing of regional and global seismic data, in seismic exploration for prospecting and exploration of mineral and hydrocarbon deposits, and in seismic engineering for monitoring the condition of engineering structures and the surrounding host medium. As a consequence of development of seismic monitoring systems and increasing volume of seismic data, there is a growing need for new, more effective computational algorithms for use in seismic tomography applications with improved performance, accuracy and resolution. To achieve this goal, it is necessary to use modern high performance computing systems, such as supercomputers with hybrid architecture that use not only CPUs, but also accelerators and co-processors for computation. The goal of this research is the development of parallel seismic tomography algorithms and software package for such systems, to be used in processing of large volumes of seismic data (hundreds of gigabytes and more). These algorithms and software package will be optimized for the most common computing devices used in modern hybrid supercomputers, such as Intel Xeon CPUs, NVIDIA Tesla accelerators and Intel Xeon Phi co-processors. In this work, the following general scheme of seismic tomography is utilized. Using the eikonal equation solver, arrival times of seismic waves are computed based on assumed velocity model of geologic structure being analyzed. In order to solve the linearized inverse problem, tomographic matrix is computed that connects model adjustments with travel time residuals, and the resulting system of linear equations is regularized and solved to adjust the model. The effectiveness of parallel implementations of existing algorithms on target architectures is considered. During the first stage of this work, algorithms were developed for execution on supercomputers using multicore CPUs only, with preliminary performance tests showing good parallel efficiency on large numerical grids. Porting of the algorithms to hybrid supercomputers is currently ongoing.
Supercomputing with TOUGH2 family codes for coupled multi-physics simulations of geologic carbon sequestration

NASA Astrophysics Data System (ADS)

Yamamoto, H.; Nakajima, K.; Zhang, K.; Nanai, S.

2015-12-01

Powerful numerical codes that are capable of modeling complex coupled processes of physics and chemistry have been developed for predicting the fate of CO2 in reservoirs as well as its potential impacts on groundwater and subsurface environments. However, they are often computationally demanding for solving highly non-linear models in sufficient spatial and temporal resolutions. Geological heterogeneity and uncertainties further increase the challenges in modeling works. Two-phase flow simulations in heterogeneous media usually require much longer computational time than that in homogeneous media. Uncertainties in reservoir properties may necessitate stochastic simulations with multiple realizations. Recently, massively parallel supercomputers with more than thousands of processors become available in scientific and engineering communities. Such supercomputers may attract attentions from geoscientist and reservoir engineers for solving the large and non-linear models in higher resolutions within a reasonable time. However, for making it a useful tool, it is essential to tackle several practical obstacles to utilize large number of processors effectively for general-purpose reservoir simulators. We have implemented massively-parallel versions of two TOUGH2 family codes (a multi-phase flow simulator TOUGH2 and a chemically reactive transport simulator TOUGHREACT) on two different types (vector- and scalar-type) of supercomputers with a thousand to tens of thousands of processors. After completing implementation and extensive tune-up on the supercomputers, the computational performance was measured for three simulations with multi-million grid models, including a simulation of the dissolution-diffusion-convection process that requires high spatial and temporal resolutions to simulate the growth of small convective fingers of CO2-dissolved water to larger ones in a reservoir scale. The performance measurement confirmed that the both simulators exhibit excellent scalabilities showing almost linear speedup against number of processors up to over ten thousand cores. Generally this allows us to perform coupled multi-physics (THC) simulations on high resolution geologic models with multi-million grid in a practical time (e.g., less than a second per time step).
Fusion of real-time simulation, sensing, and geo-informatics in assessing tsunami impact

NASA Astrophysics Data System (ADS)

Koshimura, S.; Inoue, T.; Hino, R.; Ohta, Y.; Kobayashi, H.; Musa, A.; Murashima, Y.; Gokon, H.

2015-12-01

Bringing together state-of-the-art high-performance computing, remote sensing and spatial information sciences, we establish a method of real-time tsunami inundation forecasting, damage estimation and mapping to enhance disaster response. Right after a major (near field) earthquake is triggered, we perform a real-time tsunami inundation forecasting with use of high-performance computing platform (Koshimura et al., 2014). Using Tohoku University's vector supercomputer, we accomplished "10-10-10 challenge", to complete tsunami source determination in 10 minutes, tsunami inundation modeling in 10 minutes with 10 m grid resolution. Given the maximum flow depth distribution, we perform quantitative estimation of exposed population using census data and mobile phone data, and the numbers of potential death and damaged structures by applying tsunami fragility curve. After the potential tsunami-affected areas are estimated, the analysis gets focused and moves on to the "detection" phase using remote sensing. Recent advances of remote sensing technologies expand capabilities of detecting spatial extent of tsunami affected area and structural damage. Especially, a semi-automated method to estimate building damage in tsunami affected areas is developed using pre- and post-event high-resolution SAR (Synthetic Aperture Radar) data. The method is verified through the case studies in the 2011 Tohoku and other potential tsunami scenarios, and the prototype system development is now underway in Kochi prefecture, one of at-risk coastal city against Nankai trough earthquake. In the trial operation, we verify the capability of the method as a new tsunami early warning and response system for stakeholders and responders.
Accurate quantum chemical calculations

NASA Technical Reports Server (NTRS)

Bauschlicher, Charles W., Jr.; Langhoff, Stephen R.; Taylor, Peter R.

1989-01-01

An important goal of quantum chemical calculations is to provide an understanding of chemical bonding and molecular electronic structure. A second goal, the prediction of energy differences to chemical accuracy, has been much harder to attain. First, the computational resources required to achieve such accuracy are very large, and second, it is not straightforward to demonstrate that an apparently accurate result, in terms of agreement with experiment, does not result from a cancellation of errors. Recent advances in electronic structure methodology, coupled with the power of vector supercomputers, have made it possible to solve a number of electronic structure problems exactly using the full configuration interaction (FCI) method within a subspace of the complete Hilbert space. These exact results can be used to benchmark approximate techniques that are applicable to a wider range of chemical and physical problems. The methodology of many-electron quantum chemistry is reviewed. Methods are considered in detail for performing FCI calculations. The application of FCI methods to several three-electron problems in molecular physics are discussed. A number of benchmark applications of FCI wave functions are described. Atomic basis sets and the development of improved methods for handling very large basis sets are discussed: these are then applied to a number of chemical and spectroscopic problems; to transition metals; and to problems involving potential energy surfaces. Although the experiences described give considerable grounds for optimism about the general ability to perform accurate calculations, there are several problems that have proved less tractable, at least with current computer resources, and these and possible solutions are discussed.
A workflow for the automatic segmentation of organelles in electron microscopy image stacks

PubMed Central

Perez, Alex J.; Seyedhosseini, Mojtaba; Deerinck, Thomas J.; Bushong, Eric A.; Panda, Satchidananda; Tasdizen, Tolga; Ellisman, Mark H.

2014-01-01

Electron microscopy (EM) facilitates analysis of the form, distribution, and functional status of key organelle systems in various pathological processes, including those associated with neurodegenerative disease. Such EM data often provide important new insights into the underlying disease mechanisms. The development of more accurate and efficient methods to quantify changes in subcellular microanatomy has already proven key to understanding the pathogenesis of Parkinson's and Alzheimer's diseases, as well as glaucoma. While our ability to acquire large volumes of 3D EM data is progressing rapidly, more advanced analysis tools are needed to assist in measuring precise three-dimensional morphologies of organelles within data sets that can include hundreds to thousands of whole cells. Although new imaging instrument throughputs can exceed teravoxels of data per day, image segmentation and analysis remain significant bottlenecks to achieving quantitative descriptions of whole cell structural organellomes. Here, we present a novel method for the automatic segmentation of organelles in 3D EM image stacks. Segmentations are generated using only 2D image information, making the method suitable for anisotropic imaging techniques such as serial block-face scanning electron microscopy (SBEM). Additionally, no assumptions about 3D organelle morphology are made, ensuring the method can be easily expanded to any number of structurally and functionally diverse organelles. Following the presentation of our algorithm, we validate its performance by assessing the segmentation accuracy of different organelle targets in an example SBEM dataset and demonstrate that it can be efficiently parallelized on supercomputing resources, resulting in a dramatic reduction in runtime. PMID:25426032
High Resolution DNS of Turbulent Flows using an Adaptive, Finite Volume Method

NASA Astrophysics Data System (ADS)

Trebotich, David

2014-11-01

We present a new computational capability for high resolution simulation of incompressible viscous flows. Our approach is based on cut cell methods where an irregular geometry such as a bluff body is intersected with a rectangular Cartesian grid resulting in cut cells near the boundary. In the cut cells we use a conservative discretization based on a discrete form of the divergence theorem to approximate fluxes for elliptic and hyperbolic terms in the Navier-Stokes equations. Away from the boundary the method reduces to a finite difference method. The algorithm is implemented in the Chombo software framework which supports adaptive mesh refinement and massively parallel computations. The code is scalable to 200,000 + processor cores on DOE supercomputers, resulting in DNS studies at unprecedented scale and resolution. For flow past a cylinder in transition (Re = 300) we observe a number of secondary structures in the far wake in 2D where the wake is over 120 cylinder diameters in length. These are compared with the more regularized wake structures in 3D at the same scale. For flow past a sphere (Re = 600) we resolve an arrowhead structure in the velocity in the near wake. The effectiveness of AMR is further highlighted in a simulation of turbulent flow (Re = 6000) in the contraction of an oil well blowout preventer. This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under Contract Number DE-AC02-05-CH11231.
Earth and environmental science in the 1980's: Part 1: Environmental data systems, supercomputer facilities and networks

NASA Technical Reports Server (NTRS)

1986-01-01

Overview descriptions of on-line environmental data systems, supercomputer facilities, and networks are presented. Each description addresses the concepts of content, capability, and user access relevant to the point of view of potential utilization by the Earth and environmental science community. The information on similar systems or facilities is presented in parallel fashion to encourage and facilitate intercomparison. In addition, summary sheets are given for each description, and a summary table precedes each section.
A Long History of Supercomputing

ScienceCinema

Grider, Gary

2018-06-13

As part of its national security science mission, Los Alamos National Laboratory and HPC have a long, entwined history dating back to the earliest days of computing. From bringing the first problem to the nationâs first computer to building the first machine to break the petaflop barrier, Los Alamos holds many âfirstsâ in HPC breakthroughs. Today, supercomputers are integral to stockpile stewardship and the Laboratory continues to work with vendors in developing the future of HPC.
LightForce Photon-Pressure Collision Avoidance: Updated Efficiency Analysis Utilizing a Highly Parallel Simulation Approach

DTIC Science & Technology

2014-09-01

simulation time frame from 30 days to one year. This was enabled by porting the simulation to the Pleiades supercomputer at NASA Ames Research Center, a...including the motivation for changes to our past approach. We then present the software implementation (3) on the NASA Ames Pleiades supercomputer...significantly updated since last year’s paper [25]. The main incentive for that was the shift to a highly parallel approach in order to utilize the Pleiades
Parallel-Vector Algorithm For Rapid Structural Anlysis

NASA Technical Reports Server (NTRS)

Agarwal, Tarun R.; Nguyen, Duc T.; Storaasli, Olaf O.

1993-01-01

New algorithm developed to overcome deficiency of skyline storage scheme by use of variable-band storage scheme. Exploits both parallel and vector capabilities of modern high-performance computers. Gives engineers and designers opportunity to include more design variables and constraints during optimization of structures. Enables use of more refined finite-element meshes to obtain improved understanding of complex behaviors of aerospace structures leading to better, safer designs. Not only attractive for current supercomputers but also for next generation of shared-memory supercomputers.
Development and Applications of a Modular Parallel Process for Large Scale Fluid/Structures Problems

NASA Technical Reports Server (NTRS)

Guruswamy, Guru P.; Kwak, Dochan (Technical Monitor)

2002-01-01

A modular process that can efficiently solve large scale multidisciplinary problems using massively parallel supercomputers is presented. The process integrates disciplines with diverse physical characteristics by retaining the efficiency of individual disciplines. Computational domain independence of individual disciplines is maintained using a meta programming approach. The process integrates disciplines without affecting the combined performance. Results are demonstrated for large scale aerospace problems on several supercomputers. The super scalability and portability of the approach is demonstrated on several parallel computers.
PerSEUS: Ultra-Low-Power High Performance Computing for Plasma Simulations

NASA Astrophysics Data System (ADS)

Doxas, I.; Andreou, A.; Lyon, J.; Angelopoulos, V.; Lu, S.; Pritchett, P. L.

2017-12-01

Peta-op SupErcomputing Unconventional System (PerSEUS) aims to explore the use for High Performance Scientific Computing (HPC) of ultra-low-power mixed signal unconventional computational elements developed by Johns Hopkins University (JHU), and demonstrate that capability on both fluid and particle Plasma codes. We will describe the JHU Mixed-signal Unconventional Supercomputing Elements (MUSE), and report initial results for the Lyon-Fedder-Mobarry (LFM) global magnetospheric MHD code, and a UCLA general purpose relativistic Particle-In-Cell (PIC) code.
AIC Computations Using Navier-Stokes Equations on Single Image Supercomputers For Design Optimization

NASA Technical Reports Server (NTRS)

Guruswamy, Guru

2004-01-01

A procedure to accurately generate AIC using the Navier-Stokes solver including grid deformation is presented. Preliminary results show good comparisons between experiment and computed flutter boundaries for a rectangular wing. A full wing body configuration of an orbital space plane is selected for demonstration on a large number of processors. In the final paper the AIC of full wing body configuration will be computed. The scalability of the procedure on supercomputer will be demonstrated.
Discover Supercomputer 5

NASA Image and Video Library

2017-12-08

Two rows of the “Discover” supercomputer at the NASA Center for Climate Simulation (NCCS) contain more than 4,000 computer processors. Discover has a total of nearly 15,000 processors. Credit: NASA/Pat Izzo To learn more about NCCS go to: www.nasa.gov/topics/earth/features/climate-sim-center.html NASA Goddard Space Flight Center is home to the nation's largest organization of combined scientists, engineers and technologists that build spacecraft, instruments and new technology to study the Earth, the sun, our solar system, and the universe.
Discover Supercomputer 4

NASA Image and Video Library

2017-12-08

This close-up view highlights one row—approximately 2,000 computer processors—of the “Discover” supercomputer at the NASA Center for Climate Simulation (NCCS). Discover has a total of nearly 15,000 processors. Credit: NASA/Pat Izzo To learn more about NCCS go to: www.nasa.gov/topics/earth/features/climate-sim-center.html NASA Goddard Space Flight Center is home to the nation's largest organization of combined scientists, engineers and technologists that build spacecraft, instruments and new technology to study the Earth, the sun, our solar system, and the universe.
Predicting Cost/Performance Trade-Offs for Whitney: A Commodity Computing Cluster

NASA Technical Reports Server (NTRS)

Becker, Jeffrey C.; Nitzberg, Bill; VanderWijngaart, Rob F.; Kutler, Paul (Technical Monitor)

1997-01-01

Recent advances in low-end processor and network technology have made it possible to build a "supercomputer" out of commodity components. We develop simple models of the NAS Parallel Benchmarks version 2 (NPB 2) to explore the cost/performance trade-offs involved in building a balanced parallel computer supporting a scientific workload. We develop closed form expressions detailing the number and size of messages sent by each benchmark. Coupling these with measured single processor performance, network latency, and network bandwidth, our models predict benchmark performance to within 30%. A comparison based on total system cost reveals that current commodity technology (200 MHz Pentium Pros with 100baseT Ethernet) is well balanced for the NPBs up to a total system cost of around $1,000,000.
An Automated, High-Throughput System for GISAXS and GIWAXS Measurements of Thin Films

NASA Astrophysics Data System (ADS)

Schaible, Eric; Jimenez, Jessica; Church, Matthew; Lim, Eunhee; Stewart, Polite; Hexemer, Alexander

Grazing incidence small-angle X-ray scattering (GISAXS) and grazing incidence wide-angle X-ray scattering (GIWAXS) are important techniques for characterizing thin films. In order to meet rapidly increasing demand, the SAXSWAXS beamline at the Advanced Light Source (beamline 7.3.3) has implemented a fully automated, high-throughput system to conduct SAXS, GISAXS and GIWAXS measurements. An automated robot arm transfers samples from a holding tray to a measurement stage. Intelligent software aligns each sample in turn, and measures each according to user-defined specifications. Users mail in trays of samples on individually barcoded pucks, and can download and view their data remotely. Data will be pipelined to the NERSC supercomputing facility, and will be available to users via a web portal that facilitates highly parallelized analysis.
FFTs in external or hierarchical memory

NASA Technical Reports Server (NTRS)

Bailey, David H.

1989-01-01

A description is given of advanced techniques for computing an ordered FFT on a computer with external or hierarchical memory. These algorithms (1) require as few as two passes through the external data set, (2) use strictly unit stride, long vector transfers between main memory and external storage, (3) require only a modest amount of scratch space in main memory, and (4) are well suited for vector and parallel computation. Performance figures are included for implementations of some of these algorithms on Cray supercomputers. Of interest is the fact that a main memory version outperforms the current Cray library FFT routines on the Cray-2, the Cray X-MP, and the Cray Y-MP systems. Using all eight processors on the Cray Y-MP, this main memory routine runs at nearly 2 Gflops.
Volumetric visualization of 3D data

NASA Technical Reports Server (NTRS)

Russell, Gregory; Miles, Richard

1989-01-01

In recent years, there has been a rapid growth in the ability to obtain detailed data on large complex structures in three dimensions. This development occurred first in the medical field, with CAT (computer aided tomography) scans and now magnetic resonance imaging, and in seismological exploration. With the advances in supercomputing and computational fluid dynamics, and in experimental techniques in fluid dynamics, there is now the ability to produce similar large data fields representing 3D structures and phenomena in these disciplines. These developments have produced a situation in which currently there is access to data which is too complex to be understood using the tools available for data reduction and presentation. Researchers in these areas are becoming limited by their ability to visualize and comprehend the 3D systems they are measuring and simulating.
Science & Technology Review November 2007

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chinn, D J

2007-10-16

This month's issue has the following articles: (1) Simulating the Electromagnetic World--Commentary by Steven R. Patterson; (2) A Code to Model Electromagnetic Phenomena--EMSolve, a Livermore supercomputer code that simulates electromagnetic fields, is helping advance a wide range of research efforts; (3) Characterizing Virulent Pathogens--Livermore researchers are developing multiplexed assays for rapid detection of pathogens; (4) Imaging at the Atomic Level--A powerful new electron microscope at the Laboratory is resolving materials at the atomic level for the first time; (5) Scientists without Borders--Livermore scientists lend their expertise on peaceful nuclear applications to their counterparts in other countries; and (6) Probing Deepmore » into the Nucleus--Edward Teller's contributions to the fast-growing fields of nuclear and particle physics were part of a physics golden age.« less

Programmable lithography engine (ProLE) grid-type supercomputer and its applications

NASA Astrophysics Data System (ADS)

Petersen, John S.; Maslow, Mark J.; Gerold, David J.; Greenway, Robert T.

2003-06-01

There are many variables that can affect lithographic dependent device yield. Because of this, it is not enough to make optical proximity corrections (OPC) based on the mask type, wavelength, lens, illumination-type and coherence. Resist chemistry and physics along with substrate, exposure, and all post-exposure processing must be considered too. Only a holistic approach to finding imaging solutions will accelerate yield and maximize performance. Since experiments are too costly in both time and money, accomplishing this takes massive amounts of accurate simulation capability. Our solution is to create a workbench that has a set of advanced user applications that utilize best-in-class simulator engines for solving litho-related DFM problems using distributive computing. Our product, ProLE (Programmable Lithography Engine), is an integrated system that combines Petersen Advanced Lithography Inc."s (PAL"s) proprietary applications and cluster management software wrapped around commercial software engines, along with optional commercial hardware and software. It uses the most rigorous lithography simulation engines to solve deep sub-wavelength imaging problems accurately and at speeds that are several orders of magnitude faster than current methods. Specifically, ProLE uses full vector thin-mask aerial image models or when needed, full across source 3D electromagnetic field simulation to make accurate aerial image predictions along with calibrated resist models;. The ProLE workstation from Petersen Advanced Lithography, Inc., is the first commercial product that makes it possible to do these intensive calculations at a fraction of a time previously available thus significantly reducing time to market for advance technology devices. In this work, ProLE is introduced, through model comparison to show why vector imaging and rigorous resist models work better than other less rigorous models, then some applications of that use our distributive computing solution are shown. Topics covered describe why ProLE solutions are needed from an economic and technical aspect, a high level discussion of how the distributive system works, speed benchmarking, and finally, a brief survey of applications including advanced aberrations for lens sensitivity and flare studies, optical-proximity-correction for a bitcell and an application that will allow evaluation of the potential of a design to have systematic failures during fabrication.
DART: A Community Facility Providing State-of-the-Art, Efficient Ensemble Data Assimilation for Large (Coupled) Geophysical Models

NASA Astrophysics Data System (ADS)

Hoar, T. J.; Anderson, J. L.; Collins, N.; Kershaw, H.; Hendricks, J.; Raeder, K.; Mizzi, A. P.; Barré, J.; Gaubert, B.; Madaus, L. E.; Aydogdu, A.; Raeder, J.; Arango, H.; Moore, A. M.; Edwards, C. A.; Curchitser, E. N.; Escudier, R.; Dussin, R.; Bitz, C. M.; Zhang, Y. F.; Shrestha, P.; Rosolem, R.; Rahman, M.

2016-12-01

Strongly-coupled ensemble data assimilation with multiple high-resolution model components requires massive state vectors which need to be efficiently stored and accessed throughout the assimilation process. Supercomputer architectures are tending towards increasing the number of cores per node but have the same or less memory per node. Recent advances in the Data Assimilation Research Testbed (DART), a freely-available community ensemble data assimilation facility that works with dozens of large geophysical models, have addressed the need to run with a smaller memory footprint on a higher node count by utilizing MPI-2 one-sided communication to do non-blocking asynchronous access of distributed data. DART runs efficiently on many computational platforms ranging from laptops through thousands of cores on the newest supercomputers. Benefits of the new DART implementation will be shown. In addition, overviews of the most recently supported models will be presented: CAM-CHEM, WRF-CHEM, CM1, OpenGGCM, FESOM, ROMS, CICE5, TerrSysMP (COSMO, CLM, ParFlow), JULES, and CABLE. DART provides a comprehensive suite of software, documentation, and tutorials that can be used for ensemble data assimilation research, operations, and education. Scientists and software engineers at NCAR are available to support DART users who want to use existing DART products or develop their own applications. Current DART users range from university professors teaching data assimilation, to individual graduate students working with simple models, through national laboratories and state agencies doing operational prediction with large state-of-the-art models.
Computational fluid dynamics research at the United Technologies Research Center requiring supercomputers

NASA Astrophysics Data System (ADS)

Landgrebe, Anton J.

1987-03-01

An overview of research activities at the United Technologies Research Center (UTRC) in the area of Computational Fluid Dynamics (CFD) is presented. The requirement and use of various levels of computers, including supercomputers, for the CFD activities is described. Examples of CFD directed toward applications to helicopters, turbomachinery, heat exchangers, and the National Aerospace Plane are included. Helicopter rotor codes for the prediction of rotor and fuselage flow fields and airloads were developed with emphasis on rotor wake modeling. Airflow and airload predictions and comparisons with experimental data are presented. Examples are presented of recent parabolized Navier-Stokes and full Navier-Stokes solutions for hypersonic shock-wave/boundary layer interaction, and hydrogen/air supersonic combustion. In addition, other examples of CFD efforts in turbomachinery Navier-Stokes methodology and separated flow modeling are presented. A brief discussion of the 3-tier scientific computing environment is also presented, in which the researcher has access to workstations, mid-size computers, and supercomputers.
Computational fluid dynamics research at the United Technologies Research Center requiring supercomputers

NASA Technical Reports Server (NTRS)

Landgrebe, Anton J.

1987-01-01

An overview of research activities at the United Technologies Research Center (UTRC) in the area of Computational Fluid Dynamics (CFD) is presented. The requirement and use of various levels of computers, including supercomputers, for the CFD activities is described. Examples of CFD directed toward applications to helicopters, turbomachinery, heat exchangers, and the National Aerospace Plane are included. Helicopter rotor codes for the prediction of rotor and fuselage flow fields and airloads were developed with emphasis on rotor wake modeling. Airflow and airload predictions and comparisons with experimental data are presented. Examples are presented of recent parabolized Navier-Stokes and full Navier-Stokes solutions for hypersonic shock-wave/boundary layer interaction, and hydrogen/air supersonic combustion. In addition, other examples of CFD efforts in turbomachinery Navier-Stokes methodology and separated flow modeling are presented. A brief discussion of the 3-tier scientific computing environment is also presented, in which the researcher has access to workstations, mid-size computers, and supercomputers.
Antenna pattern control using impedance surfaces

NASA Technical Reports Server (NTRS)

Balanis, Constantine A.; Liu, Kefeng

1992-01-01

During this research period, we have effectively transferred existing computer codes from CRAY supercomputer to work station based systems. The work station based version of our code preserved the accuracy of the numerical computations while giving a much better turn-around time than the CRAY supercomputer. Such a task relieved us of the heavy dependence of the supercomputer account budget and made codes developed in this research project more feasible for applications. The analysis of pyramidal horns with impedance surfaces was our major focus during this research period. Three different modeling algorithms in analyzing lossy impedance surfaces were investigated and compared with measured data. Through this investigation, we discovered that a hybrid Fourier transform technique, which uses the eigen mode in the stepped waveguide section and the Fourier transformed field distributions across the stepped discontinuities for lossy impedances coating, gives a better accuracy in analyzing lossy coatings. After a further refinement of the present technique, we will perform an accurate radiation pattern synthesis in the coming reporting period.
Scheduling for Parallel Supercomputing: A Historical Perspective of Achievable Utilization

NASA Technical Reports Server (NTRS)

Jones, James Patton; Nitzberg, Bill

1999-01-01

The NAS facility has operated parallel supercomputers for the past 11 years, including the Intel iPSC/860, Intel Paragon, Thinking Machines CM-5, IBM SP-2, and Cray Origin 2000. Across this wide variety of machine architectures, across a span of 10 years, across a large number of different users, and through thousands of minor configuration and policy changes, the utilization of these machines shows three general trends: (1) scheduling using a naive FIFO first-fit policy results in 40-60% utilization, (2) switching to the more sophisticated dynamic backfilling scheduling algorithm improves utilization by about 15 percentage points (yielding about 70% utilization), and (3) reducing the maximum allowable job size further increases utilization. Most surprising is the consistency of these trends. Over the lifetime of the NAS parallel systems, we made hundreds, perhaps thousands, of small changes to hardware, software, and policy, yet, utilization was affected little. In particular these results show that the goal of achieving near 100% utilization while supporting a real parallel supercomputing workload is unrealistic.
Generating unstructured nuclear reactor core meshes in parallel

DOE PAGES

Jain, Rajeev; Tautges, Timothy J.

2014-10-24

Recent advances in supercomputers and parallel solver techniques have enabled users to run large simulations problems using millions of processors. Techniques for multiphysics nuclear reactor core simulations are under active development in several countries. Most of these techniques require large unstructured meshes that can be hard to generate in a standalone desktop computers because of high memory requirements, limited processing power, and other complexities. We have previously reported on a hierarchical lattice-based approach for generating reactor core meshes. Here, we describe efforts to exploit coarse-grained parallelism during reactor assembly and reactor core mesh generation processes. We highlight several reactor coremore » examples including a very high temperature reactor, a full-core model of the Korean MONJU reactor, a ¼ pressurized water reactor core, the fast reactor Experimental Breeder Reactor-II core with a XX09 assembly, and an advanced breeder test reactor core. The times required to generate large mesh models, along with speedups obtained from running these problems in parallel, are reported. A graphical user interface to the tools described here has also been developed.« less
Design of a neural network simulator on a transputer array

NASA Technical Reports Server (NTRS)

Mcintire, Gary; Villarreal, James; Baffes, Paul; Rua, Monica

1987-01-01

A brief summary of neural networks is presented which concentrates on the design constraints imposed. Major design issues are discussed together with analysis methods and the chosen solutions. Although the system will be capable of running on most transputer architectures, it currently is being implemented on a 40-transputer system connected to a toroidal architecture. Predictions show a performance level equivalent to that of a highly optimized simulator running on the SX-2 supercomputer.
Algorithms for the Euler and Navier-Stokes equations for supercomputers

NASA Technical Reports Server (NTRS)

Turkel, E.

1985-01-01

The steady state Euler and Navier-Stokes equations are considered for both compressible and incompressible flow. Methods are found for accelerating the convergence to a steady state. This acceleration is based on preconditioning the system so that it is no longer time consistent. In order that the acceleration technique be scheme-independent, this preconditioning is done at the differential equation level. Applications are presented for very slow flows and also for the incompressible equations.
Chemical calculations on Cray computers

NASA Technical Reports Server (NTRS)

Taylor, Peter R.; Bauschlicher, Charles W., Jr.; Schwenke, David W.

1989-01-01

The influence of recent developments in supercomputing on computational chemistry is discussed with particular reference to Cray computers and their pipelined vector/limited parallel architectures. After reviewing Cray hardware and software the performance of different elementary program structures are examined, and effective methods for improving program performance are outlined. The computational strategies appropriate for obtaining optimum performance in applications to quantum chemistry and dynamics are discussed. Finally, some discussion is given of new developments and future hardware and software improvements.
High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies.

PubMed

Goudey, Benjamin; Abedini, Mani; Hopper, John L; Inouye, Michael; Makalic, Enes; Schmidt, Daniel F; Wagner, John; Zhou, Zeyu; Zobel, Justin; Reumann, Matthias

2015-01-01

Genome-wide association studies (GWAS) are a common approach for systematic discovery of single nucleotide polymorphisms (SNPs) which are associated with a given disease. Univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analysis in complex diseases. However, multivariate SNP analysis is currently limited by its inherent computational complexity. In this work, we present a computational framework that harnesses supercomputers. Based on our results, we estimate a three-way interaction analysis on 1.1 million SNP GWAS data requiring over 5.8 years on the full "Avoca" IBM Blue Gene/Q installation at the Victorian Life Sciences Computation Initiative. This is hundreds of times faster than estimates for other CPU based methods and four times faster than runtimes estimated for GPU methods, indicating how the improvement in the level of hardware applied to interaction analysis may alter the types of analysis that can be performed. Furthermore, the same analysis would take under 3 months on the currently largest IBM Blue Gene/Q supercomputer "Sequoia" at the Lawrence Livermore National Laboratory assuming linear scaling is maintained as our results suggest. Given that the implementation used in this study can be further optimised, this runtime means it is becoming feasible to carry out exhaustive analysis of higher order interaction studies on large modern GWAS.
Parallel computing of a climate model on the dawn 1000 by domain decomposition method

NASA Astrophysics Data System (ADS)

Bi, Xunqiang

1997-12-01

In this paper the parallel computing of a grid-point nine-level atmospheric general circulation model on the Dawn 1000 is introduced. The model was developed by the Institute of Atmospheric Physics (IAP), Chinese Academy of Sciences (CAS). The Dawn 1000 is a MIMD massive parallel computer made by National Research Center for Intelligent Computer (NCIC), CAS. A two-dimensional domain decomposition method is adopted to perform the parallel computing. The potential ways to increase the speed-up ratio and exploit more resources of future massively parallel supercomputation are also discussed.
The CSM testbed software system: A development environment for structural analysis methods on the NAS CRAY-2

NASA Technical Reports Server (NTRS)

Gillian, Ronnie E.; Lotts, Christine G.

1988-01-01

The Computational Structural Mechanics (CSM) Activity at Langley Research Center is developing methods for structural analysis on modern computers. To facilitate that research effort, an applications development environment has been constructed to insulate the researcher from the many computer operating systems of a widely distributed computer network. The CSM Testbed development system was ported to the Numerical Aerodynamic Simulator (NAS) Cray-2, at the Ames Research Center, to provide a high end computational capability. This paper describes the implementation experiences, the resulting capability, and the future directions for the Testbed on supercomputers.
Building more powerful less expensive supercomputers using Processing-In-Memory (PIM) LDRD final report.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Murphy, Richard C.

2009-09-01

This report details the accomplishments of the 'Building More Powerful Less Expensive Supercomputers Using Processing-In-Memory (PIM)' LDRD ('PIM LDRD', number 105809) for FY07-FY09. Latency dominates all levels of supercomputer design. Within a node, increasing memory latency, relative to processor cycle time, limits CPU performance. Between nodes, the same increase in relative latency impacts scalability. Processing-In-Memory (PIM) is an architecture that directly addresses this problem using enhanced chip fabrication technology and machine organization. PIMs combine high-speed logic and dense, low-latency, high-bandwidth DRAM, and lightweight threads that tolerate latency by performing useful work during memory transactions. This work examines the potential ofmore » PIM-based architectures to support mission critical Sandia applications and an emerging class of more data intensive informatics applications. This work has resulted in a stronger architecture/implementation collaboration between 1400 and 1700. Additionally, key technology components have impacted vendor roadmaps, and we are in the process of pursuing these new collaborations. This work has the potential to impact future supercomputer design and construction, reducing power and increasing performance. This final report is organized as follow: this summary chapter discusses the impact of the project (Section 1), provides an enumeration of publications and other public discussion of the work (Section 1), and concludes with a discussion of future work and impact from the project (Section 1). The appendix contains reprints of the refereed publications resulting from this work.« less
Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

PubMed

Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

2011-01-01

The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
Asynchronous communication in spectral-element and discontinuous Galerkin methods for atmospheric dynamics - a case study using the High-Order Methods Modeling Environment (HOMME-homme_dg_branch)

NASA Astrophysics Data System (ADS)

Jamroz, Benjamin F.; Klöfkorn, Robert

2016-08-01

The scalability of computational applications on current and next-generation supercomputers is increasingly limited by the cost of inter-process communication. We implement non-blocking asynchronous communication in the High-Order Methods Modeling Environment for the time integration of the hydrostatic fluid equations using both the spectral-element and discontinuous Galerkin methods. This allows the overlap of computation with communication, effectively hiding some of the costs of communication. A novel detail about our approach is that it provides some data movement to be performed during the asynchronous communication even in the absence of other computations. This method produces significant performance and scalability gains in large-scale simulations.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Keefer, Donald A.; Shaffer, Eric G.; Storsved, Brynne

A free software application, RVA, has been developed as a plugin to the US DOE-funded ParaView visualization package, to provide support in the visualization and analysis of complex reservoirs being managed using multi-fluid EOR techniques. RVA, for Reservoir Visualization and Analysis, was developed as an open-source plugin to the 64 bit Windows version of ParaView 3.14. RVA was developed at the University of Illinois at Urbana-Champaign, with contributions from the Illinois State Geological Survey, Department of Computer Science and National Center for Supercomputing Applications. RVA was designed to utilize and enhance the state-of-the-art visualization capabilities within ParaView, readily allowing jointmore » visualization of geologic framework and reservoir fluid simulation model results. Particular emphasis was placed on enabling visualization and analysis of simulation results highlighting multiple fluid phases, multiple properties for each fluid phase (including flow lines), multiple geologic models and multiple time steps. Additional advanced functionality was provided through the development of custom code to implement data mining capabilities. The built-in functionality of ParaView provides the capacity to process and visualize data sets ranging from small models on local desktop systems to extremely large models created and stored on remote supercomputers. The RVA plugin that we developed and the associated User Manual provide improved functionality through new software tools, and instruction in the use of ParaView-RVA, targeted to petroleum engineers and geologists in industry and research. The RVA web site (http://rva.cs.illinois.edu) provides an overview of functions, and the development web site (https://github.com/shaffer1/RVA) provides ready access to the source code, compiled binaries, user manual, and a suite of demonstration data sets. Key functionality has been included to support a range of reservoirs visualization and analysis needs, including: sophisticated connectivity analysis, cross sections through simulation results between selected wells, simplified volumetric calculations, global vertical exaggeration adjustments, ingestion of UTChem simulation results, ingestion of Isatis geostatistical framework models, interrogation of joint geologic and reservoir modeling results, joint visualization and analysis of well history files, location-targeted visualization, advanced correlation analysis, visualization of flow paths, and creation of static images and animations highlighting targeted reservoir features.« less
RVA: A Plugin for ParaView 3.14

DOE Office of Scientific and Technical Information (OSTI.GOV)

2015-09-04

RVA is a plugin developed for the 64-bit Windows version of the ParaView 3.14 visualization package. RVA is designed to provide support in the visualization and analysis of complex reservoirs being managed using multi-fluid EOR techniques. RVA, for Reservoir Visualization and Analysis, was developed at the University of Illinois at Urbana-Champaign, with contributions from the Illinois State Geological Survey, Department of Computer Science and National Center for Supercomputing Applications. RVA was designed to utilize and enhance the state-of-the-art visualization capabilities within ParaView, readily allowing joint visualization of geologic framework and reservoir fluid simulation model results. Particular emphasis was placed onmore » enabling visualization and analysis of simulation results highlighting multiple fluid phases, multiple properties for each fluid phase (including flow lines), multiple geologic models and multiple time steps. Additional advanced functionality was provided through the development of custom code to implement data mining capabilities. The built-in functionality of ParaView provides the capacity to process and visualize data sets ranging from small models on local desktop systems to extremely large models created and stored on remote supercomputers. The RVA plugin that we developed and the associated User Manual provide improved functionality through new software tools, and instruction in the use of ParaView-RVA, targeted to petroleum engineers and geologists in industry and research. The RVA web site (http://rva.cs.illinois.edu) provides an overview of functions, and the development web site (https://github.com/shaffer1/RVA) provides ready access to the source code, compiled binaries, user manual, and a suite of demonstration data sets. Key functionality has been included to support a range of reservoirs visualization and analysis needs, including: sophisticated connectivity analysis, cross sections through simulation results between selected wells, simplified volumetric calculations, global vertical exaggeration adjustments, ingestion of UTChem simulation results, ingestion of Isatis geostatistical framework models, interrogation of joint geologic and reservoir modeling results, joint visualization and analysis of well history files, location-targeted visualization, advanced correlation analysis, visualization of flow paths, and creation of static images and animations highlighting targeted reservoir features.« less
Multiscale Processes of Hurricane Sandy (2012) as Revealed by the CAMVis-MAP

NASA Astrophysics Data System (ADS)

Shen, B.; Li, J. F.; Cheung, S.

2013-12-01

In late October 2012, Storm Sandy made landfall near Brigantine, New Jersey, devastating surrounding areas and causing tremendous economic loss and hundreds of fatalities (Blake et al., 2013). An estimated damage of $50 billion made Sandy become the second costliest tropical cyclone (TC) in US history, surpassed only by Hurricane Katrina (2005). Central questions to be addressed include (1) to what extent the lead time of severe storm prediction such as Sandy can be extended (e.g., Emanuel 2012); and (2) whether and how advanced global model, supercomputing technology and numerical algorithm can help effectively illustrate the complicated physical processes that are associated with the evolution of the storms. In this study, the predictability of Sandy is addressed with a focus on short-term (or extended-range) genesis prediction as the first step toward the goal of understanding the relationship between extreme events, such as Sandy, and the current climate. The newly deployed Coupled Advanced global mesoscale Modeling (GMM) and concurrent Visualization (CAMVis) system is used for this study. We will show remarkable simulations of Hurricane Sandy with the GMM, including realistic 7-day track and intensity forecast and genesis predictions with a lead time of up to 6 days (e.g., Shen et al., 2013, GRL, submitted). We then discuss the enabling role of the high-resolution 4-D (time-X-Y-Z) visualizations in illustrating TC's transient dynamics and its interaction with tropical waves. In addition, we have finished the parallel implementation of the ensemble empirical mode decomposition (PEEMD, Cheung et al., 2013, AGU13, submitted) method that will be soon integrated into the multiscale analysis package (MAP) for the analysis of tropical weather systems such as TCs and tropical waves. While the original EEMD has previously shown superior performance in decomposition of nonlinear (local) and non-stationary data into different intrinsic modes which stay within the natural filter period windows, the PEEMD achieves a speedup of over 100 times as compared to the original EEMD. The advanced GMM, 4D visualizations and PEEMD method are being used to examine the multiscale processes of Sandy and its environmental flows that may contribute to the extended lead-time predictability of Hurricane Sandy. Figure 1: Evolution of Hurricane Sandy (2012) as revealed by the advanced visualization.
Delivering Science on Day One

DOE Office of Scientific and Technical Information (OSTI.GOV)

Williams, Timothy J.

2016-03-01

While benchmarking software is useful for testing the performance limits and stability of Argonne National Laboratory’s new Theta supercomputer, there is no substitute for running real applications to explore the system’s potential. The Argonne Leadership Computing Facility’s Theta Early Science Program, modeled after its highly successful code migration program for the Mira supercomputer, has one primary aim: to deliver science on day one. Here is a closer look at the type of science problems that will be getting early access to Theta, a next-generation machine being rolled out this year.

Supercomputer analysis of sedimentary basins.

PubMed

Bethke, C M; Altaner, S P; Harrison, W J; Upson, C

1988-01-15

Geological processes of fluid transport and chemical reaction in sedimentary basins have formed many of the earth's energy and mineral resources. These processes can be analyzed on natural time and distance scales with the use of supercomputers. Numerical experiments are presented that give insights to the factors controlling subsurface pressures, temperatures, and reactions; the origin of ores; and the distribution and quality of hydrocarbon reservoirs. The results show that numerical analysis combined with stratigraphic, sea level, and plate tectonic histories provides a powerful tool for studying the evolution of sedimentary basins over geologic time.
Discover Supercomputer 3

NASA Image and Video Library

2017-12-08

The heart of the NASA Center for Climate Simulation (NCCS) is the “Discover” supercomputer. In 2009, NCCS added more than 8,000 computer processors to Discover, for a total of nearly 15,000 processors. Credit: NASA/Pat Izzo To learn more about NCCS go to: www.nasa.gov/topics/earth/features/climate-sim-center.html NASA Goddard Space Flight Center is home to the nation's largest organization of combined scientists, engineers and technologists that build spacecraft, instruments and new technology to study the Earth, the sun, our solar system, and the universe.
Discover Supercomputer 2

NASA Image and Video Library

2017-12-08

The heart of the NASA Center for Climate Simulation (NCCS) is the “Discover” supercomputer. In 2009, NCCS added more than 8,000 computer processors to Discover, for a total of nearly 15,000 processors. Credit: NASA/Pat Izzo To learn more about NCCS go to: www.nasa.gov/topics/earth/features/climate-sim-center.html NASA Goddard Space Flight Center is home to the nation's largest organization of combined scientists, engineers and technologists that build spacecraft, instruments and new technology to study the Earth, the sun, our solar system, and the universe.
Discover Supercomputer 1

NASA Image and Video Library

2017-12-08

The heart of the NASA Center for Climate Simulation (NCCS) is the “Discover” supercomputer. In 2009, NCCS added more than 8,000 computer processors to Discover, for a total of nearly 15,000 processors. Credit: NASA/Pat Izzo To learn more about NCCS go to: www.nasa.gov/topics/earth/features/climate-sim-center.html NASA Goddard Space Flight Center is home to the nation's largest organization of combined scientists, engineers and technologists that build spacecraft, instruments and new technology to study the Earth, the sun, our solar system, and the universe.
The Navier-Stokes computer

NASA Technical Reports Server (NTRS)

Nosenchuck, D. M.; Littman, M. G.

1986-01-01

The Navier-Stokes computer (NSC) has been developed for solving problems in fluid mechanics involving complex flow simulations that require more speed and capacity than provided by current and proposed Class VI supercomputers. The machine is a parallel processing supercomputer with several new architectural elements which can be programmed to address a wide range of problems meeting the following criteria: (1) the problem is numerically intensive, and (2) the code makes use of long vectors. A simulation of two-dimensional nonsteady viscous flows is presented to illustrate the architecture, programming, and some of the capabilities of the NSC.
Merging the Machines of Modern Science

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wolf, Laura; Collins, Jim

Two recent projects have harnessed supercomputing resources at the US Department of Energy’s Argonne National Laboratory in a novel way to support major fusion science and particle collider experiments. Using leadership computing resources, one team ran fine-grid analysis of real-time data to make near-real-time adjustments to an ongoing experiment, while a second team is working to integrate Argonne’s supercomputers into the Large Hadron Collider/ATLAS workflow. Together these efforts represent a new paradigm of the high-performance computing center as a partner in experimental science.
Time-dependent density-functional theory in massively parallel computer architectures: the octopus project

NASA Astrophysics Data System (ADS)

Andrade, Xavier; Alberdi-Rodriguez, Joseba; Strubbe, David A.; Oliveira, Micael J. T.; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Louie, Steven G.; Aspuru-Guzik, Alán; Rubio, Angel; Marques, Miguel A. L.

2012-06-01

Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.
Tools for 3D scientific visualization in computational aerodynamics

NASA Technical Reports Server (NTRS)

Bancroft, Gordon; Plessel, Todd; Merritt, Fergus; Watson, Val

1989-01-01

The purpose is to describe the tools and techniques in use at the NASA Ames Research Center for performing visualization of computational aerodynamics, for example visualization of flow fields from computer simulations of fluid dynamics about vehicles such as the Space Shuttle. The hardware used for visualization is a high-performance graphics workstation connected to a super computer with a high speed channel. At present, the workstation is a Silicon Graphics IRIS 3130, the supercomputer is a CRAY2, and the high speed channel is a hyperchannel. The three techniques used for visualization are post-processing, tracking, and steering. Post-processing analysis is done after the simulation. Tracking analysis is done during a simulation but is not interactive, whereas steering analysis involves modifying the simulation interactively during the simulation. Using post-processing methods, a flow simulation is executed on a supercomputer and, after the simulation is complete, the results of the simulation are processed for viewing. The software in use and under development at NASA Ames Research Center for performing these types of tasks in computational aerodynamics is described. Workstation performance issues, benchmarking, and high-performance networks for this purpose are also discussed as well as descriptions of other hardware for digital video and film recording.
Time-dependent density-functional theory in massively parallel computer architectures: the OCTOPUS project.

PubMed

Andrade, Xavier; Alberdi-Rodriguez, Joseba; Strubbe, David A; Oliveira, Micael J T; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Louie, Steven G; Aspuru-Guzik, Alán; Rubio, Angel; Marques, Miguel A L

2012-06-13

Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.
Sharing lattice QCD data over a widely distributed file system

NASA Astrophysics Data System (ADS)

Amagasa, T.; Aoki, S.; Aoki, Y.; Aoyama, T.; Doi, T.; Fukumura, K.; Ishii, N.; Ishikawa, K.-I.; Jitsumoto, H.; Kamano, H.; Konno, Y.; Matsufuru, H.; Mikami, Y.; Miura, K.; Sato, M.; Takeda, S.; Tatebe, O.; Togawa, H.; Ukawa, A.; Ukita, N.; Watanabe, Y.; Yamazaki, T.; Yoshie, T.

2015-12-01

JLDG is a data-grid for the lattice QCD (LQCD) community in Japan. Several large research groups in Japan have been working on lattice QCD simulations using supercomputers distributed over distant sites. The JLDG provides such collaborations with an efficient method of data management and sharing. File servers installed on 9 sites are connected to the NII SINET VPN and are bound into a single file system with the GFarm. The file system looks the same from any sites, so that users can do analyses on a supercomputer on a site, using data generated and stored in the JLDG at a different site. We present a brief description of hardware and software of the JLDG, including a recently developed subsystem for cooperating with the HPCI shared storage, and report performance and statistics of the JLDG. As of April 2015, 15 research groups (61 users) store their daily research data of 4.7PB including replica and 68 million files in total. Number of publications for works which used the JLDG is 98. The large number of publications and recent rapid increase of disk usage convince us that the JLDG has grown up into a useful infrastructure for LQCD community in Japan.
A Spectral Algorithm for Solving the Relativistic Vlasov-Maxwell Equations

NASA Technical Reports Server (NTRS)

Shebalin, John V.

2001-01-01

A spectral method algorithm is developed for the numerical solution of the full six-dimensional Vlasov-Maxwell system of equations. Here, the focus is on the electron distribution function, with positive ions providing a constant background. The algorithm consists of a Jacobi polynomial-spherical harmonic formulation in velocity space and a trigonometric formulation in position space. A transform procedure is used to evaluate nonlinear terms. The algorithm is suitable for performing moderate resolution simulations on currently available supercomputers for both scientific and engineering applications.
Parallel FEM Simulation of Electromechanics in the Heart

NASA Astrophysics Data System (ADS)

Xia, Henian; Wong, Kwai; Zhao, Xiaopeng

2011-11-01

Cardiovascular disease is the leading cause of death in America. Computer simulation of complicated dynamics of the heart could provide valuable quantitative guidance for diagnosis and treatment of heart problems. In this paper, we present an integrated numerical model which encompasses the interaction of cardiac electrophysiology, electromechanics, and mechanoelectrical feedback. The model is solved by finite element method on a Linux cluster and the Cray XT5 supercomputer, kraken. Dynamical influences between the effects of electromechanics coupling and mechanic-electric feedback are shown.
The p- and h-p Versions of the Finite Element Method: An Overview

DTIC Science & Technology

1989-05-01

and h-p versions was studied in detail for 1 dimension in [48] and for 2 dimensions in [491. Let us mention some one dimensional results. We consider...Supercomputing in the Automotive Industry, Oct. 25-28 1988, Seville Spain, Report of Noetic Tech., St. Louis, NO 63117. 70. H. Vogelius, An analysis of the p...collaboration with govern- ment agencies such as the National Bureau of Standards. " To be an international center of study and research for foreign
Iterative-method performance evaluation for multiple vectors associated with a large-scale sparse matrix

NASA Astrophysics Data System (ADS)

Imamura, Seigo; Ono, Kenji; Yokokawa, Mitsuo

2016-07-01

Ensemble computing, which is an instance of capacity computing, is an effective computing scenario for exascale parallel supercomputers. In ensemble computing, there are multiple linear systems associated with a common coefficient matrix. We improve the performance of iterative solvers for multiple vectors by solving them at the same time, that is, by solving for the product of the matrices. We implemented several iterative methods and compared their performance. The maximum performance on Sparc VIIIfx was 7.6 times higher than that of a naïve implementation. Finally, to deal with the different convergence processes of linear systems, we introduced a control method to eliminate the calculation of already converged vectors.
A vectorized Lanczos eigensolver for high-performance computers

NASA Technical Reports Server (NTRS)

Bostic, Susan W.

1990-01-01

The computational strategies used to implement a Lanczos-based-method eigensolver on the latest generation of supercomputers are described. Several examples of structural vibration and buckling problems are presented that show the effects of using optimization techniques to increase the vectorization of the computational steps. The data storage and access schemes and the tools and strategies that best exploit the computer resources are presented. The method is implemented on the Convex C220, the Cray 2, and the Cray Y-MP computers. Results show that very good computation rates are achieved for the most computationally intensive steps of the Lanczos algorithm and that the Lanczos algorithm is many times faster than other methods extensively used in the past.
RIACS

NASA Technical Reports Server (NTRS)

Oliger, Joseph

1997-01-01

Topics considered include: high-performance computing; cognitive and perceptual prostheses (computational aids designed to leverage human abilities); autonomous systems. Also included: development of a 3D unstructured grid code based on a finite volume formulation and applied to the Navier-stokes equations; Cartesian grid methods for complex geometry; multigrid methods for solving elliptic problems on unstructured grids; algebraic non-overlapping domain decomposition methods for compressible fluid flow problems on unstructured meshes; numerical methods for the compressible navier-stokes equations with application to aerodynamic flows; research in aerodynamic shape optimization; S-HARP: a parallel dynamic spectral partitioner; numerical schemes for the Hamilton-Jacobi and level set equations on triangulated domains; application of high-order shock capturing schemes to direct simulation of turbulence; multicast technology; network testbeds; supercomputer consolidation project.
On the Convenience of Using the Complete Linearization Method in Modelling the BLR of AGN

NASA Astrophysics Data System (ADS)

Patriarchi, P.; Perinotto, M.

The Complete Linearization Method (Mihalas, 1978) consists in the determination of the radiation field (at a set of frequency points), atomic level populations, temperature, electron density etc., by resolving the system of radiative transfer, thermal equilibrium, statistical equilibrium equations simultaneously and self-consistently. Since the system is not linear, it must be solved by iteration after linearization, using a perturbative method, starting from an initial guess solution. Of course the Complete Linearization Method is more time consuming than the previous one. But how great can this disadvantage be in the age of supercomputers? It is possible to approximately evaluate the CPU time needed to run a model by computing the number of multiplications necessary to solve the system.
Internal fluid mechanics research on supercomputers for aerospace propulsion systems

NASA Technical Reports Server (NTRS)

Miller, Brent A.; Anderson, Bernhard H.; Szuch, John R.

1988-01-01

The Internal Fluid Mechanics Division of the NASA Lewis Research Center is combining the key elements of computational fluid dynamics, aerothermodynamic experiments, and advanced computational technology to bring internal computational fluid mechanics (ICFM) to a state of practical application for aerospace propulsion systems. The strategies used to achieve this goal are to: (1) pursue an understanding of flow physics, surface heat transfer, and combustion via analysis and fundamental experiments, (2) incorporate improved understanding of these phenomena into verified 3-D CFD codes, and (3) utilize state-of-the-art computational technology to enhance experimental and CFD research. Presented is an overview of the ICFM program in high-speed propulsion, including work in inlets, turbomachinery, and chemical reacting flows. Ongoing efforts to integrate new computer technologies, such as parallel computing and artificial intelligence, into high-speed aeropropulsion research are described.
Deploying Darter A Cray XC30 System

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fahey, Mark R; Budiardja, Reuben D; Crosby, Lonnie D

TheUniversityofTennessee,KnoxvilleacquiredaCrayXC30 supercomputer, called Darter, with a peak performance of 248.9 Ter- aflops. Darter was deployed in late March of 2013 with a very aggressive production timeline - the system was deployed, accepted, and placed into production in only 2 weeks. The Spring Experiment for the Center for Analysis and Prediction of Storms (CAPS) largely drove the accelerated timeline, as the experiment was scheduled to start in mid-April. The Consortium for Advanced Simulation of Light Water Reactors (CASL) project also needed access and was able to meet their tight deadlines on the newly acquired XC30. Darter s accelerated deployment and op-more » erations schedule resulted in substantial scientific impacts within the re- search community as well as immediate real-world impacts such as early severe tornado warnings« less
Plasma physics. Stochastic electron acceleration during spontaneous turbulent reconnection in a strong shock wave.

PubMed

Matsumoto, Y; Amano, T; Kato, T N; Hoshino, M

2015-02-27

Explosive phenomena such as supernova remnant shocks and solar flares have demonstrated evidence for the production of relativistic particles. Interest has therefore been renewed in collisionless shock waves and magnetic reconnection as a means to achieve such energies. Although ions can be energized during such phenomena, the relativistic energy of the electrons remains a puzzle for theory. We present supercomputer simulations showing that efficient electron energization can occur during turbulent magnetic reconnection arising from a strong collisionless shock. Upstream electrons undergo first-order Fermi acceleration by colliding with reconnection jets and magnetic islands, giving rise to a nonthermal relativistic population downstream. These results shed new light on magnetic reconnection as an agent of energy dissipation and particle acceleration in strong shock waves. Copyright © 2015, American Association for the Advancement of Science.

Elucidating reaction mechanisms on quantum computers.

PubMed

Reiher, Markus; Wiebe, Nathan; Svore, Krysta M; Wecker, Dave; Troyer, Matthias

2017-07-18

With rapid recent advances in quantum technology, we are close to the threshold of quantum devices whose computational powers can exceed those of classical supercomputers. Here, we show that a quantum computer can be used to elucidate reaction mechanisms in complex chemical systems, using the open problem of biological nitrogen fixation in nitrogenase as an example. We discuss how quantum computers can augment classical computer simulations used to probe these reaction mechanisms, to significantly increase their accuracy and enable hitherto intractable simulations. Our resource estimates show that, even when taking into account the substantial overhead of quantum error correction, and the need to compile into discrete gate sets, the necessary computations can be performed in reasonable time on small quantum computers. Our results demonstrate that quantum computers will be able to tackle important problems in chemistry without requiring exorbitant resources.
Elucidating reaction mechanisms on quantum computers

PubMed Central

Reiher, Markus; Wiebe, Nathan; Svore, Krysta M.; Wecker, Dave; Troyer, Matthias

2017-01-01

With rapid recent advances in quantum technology, we are close to the threshold of quantum devices whose computational powers can exceed those of classical supercomputers. Here, we show that a quantum computer can be used to elucidate reaction mechanisms in complex chemical systems, using the open problem of biological nitrogen fixation in nitrogenase as an example. We discuss how quantum computers can augment classical computer simulations used to probe these reaction mechanisms, to significantly increase their accuracy and enable hitherto intractable simulations. Our resource estimates show that, even when taking into account the substantial overhead of quantum error correction, and the need to compile into discrete gate sets, the necessary computations can be performed in reasonable time on small quantum computers. Our results demonstrate that quantum computers will be able to tackle important problems in chemistry without requiring exorbitant resources. PMID:28674011
Machine learning for micro-tomography

NASA Astrophysics Data System (ADS)

Parkinson, Dilworth Y.; Pelt, Daniël. M.; Perciano, Talita; Ushizima, Daniela; Krishnan, Harinarayan; Barnard, Harold S.; MacDowell, Alastair A.; Sethian, James

2017-09-01

Machine learning has revolutionized a number of fields, but many micro-tomography users have never used it for their work. The micro-tomography beamline at the Advanced Light Source (ALS), in collaboration with the Center for Applied Mathematics for Energy Research Applications (CAMERA) at Lawrence Berkeley National Laboratory, has now deployed a series of tools to automate data processing for ALS users using machine learning. This includes new reconstruction algorithms, feature extraction tools, and image classification and recommen- dation systems for scientific image. Some of these tools are either in automated pipelines that operate on data as it is collected or as stand-alone software. Others are deployed on computing resources at Berkeley Lab-from workstations to supercomputers-and made accessible to users through either scripting or easy-to-use graphical interfaces. This paper presents a progress report on this work.
Forecasting techno-social systems: how physics and computing help to fight off global pandemics

NASA Astrophysics Data System (ADS)

Vespignani, Alessandro

2010-03-01

The crucial issue when planning for adequate public health interventions to mitigate the spread and impact of epidemics is risk evaluation and forecast. This amount to the anticipation of where, when and how strong the epidemic will strike. In the last decade advances in performance in computer technology, data acquisition, statistical physics and complex networks theory allow the generation of sophisticated simulations on supercomputer infrastructures to anticipate the spreading pattern of a pandemic. For the first time we are in the position of generating real time forecast of epidemic spreading. I will review the history of the current H1N1 pandemic, the major road-blocks the community has faced in its containment and mitigation and how physics and computing provide predictive tools that help us to battle epidemics.
Elucidating reaction mechanisms on quantum computers

NASA Astrophysics Data System (ADS)

Reiher, Markus; Wiebe, Nathan; Svore, Krysta M.; Wecker, Dave; Troyer, Matthias

2017-07-01

With rapid recent advances in quantum technology, we are close to the threshold of quantum devices whose computational powers can exceed those of classical supercomputers. Here, we show that a quantum computer can be used to elucidate reaction mechanisms in complex chemical systems, using the open problem of biological nitrogen fixation in nitrogenase as an example. We discuss how quantum computers can augment classical computer simulations used to probe these reaction mechanisms, to significantly increase their accuracy and enable hitherto intractable simulations. Our resource estimates show that, even when taking into account the substantial overhead of quantum error correction, and the need to compile into discrete gate sets, the necessary computations can be performed in reasonable time on small quantum computers. Our results demonstrate that quantum computers will be able to tackle important problems in chemistry without requiring exorbitant resources.
Accessing Wind Tunnels From NASA's Information Power Grid

NASA Technical Reports Server (NTRS)

Becker, Jeff; Biegel, Bryan (Technical Monitor)

2002-01-01

The NASA Ames wind tunnel customers are one of the first users of the Information Power Grid (IPG) storage system at the NASA Advanced Supercomputing Division. We wanted to be able to store their data on the IPG so that it could be accessed remotely in a secure but timely fashion. In addition, incorporation into the IPG allows future use of grid computational resources, e.g., for post-processing of data, or to do side-by-side CFD validation. In this paper, we describe the integration of grid data access mechanisms with the existing DARWIN web-based system that is used to access wind tunnel test data. We also show that the combined system has reasonable performance: wind tunnel data may be retrieved at 50Mbits/s over a 100 base T network connected to the IPG storage server.
Towards energy-efficient nonoscillatory forward-in-time integrations on lat-lon grids

NASA Astrophysics Data System (ADS)

Polkowski, Marcin; Piotrowski, Zbigniew; Ryczkowski, Adam

2017-04-01

The design of the next-generation weather prediction models calls for new algorithmic approaches allowing for robust integrations of atmospheric flow over complex orography at sub-km resolutions. These need to be accompanied by efficient implementations exposing multi-level parallelism, capable to run on modern supercomputing architectures. Here we present the recent advances in the energy-efficient implementation of the consistent soundproof/implicit compressible EULAG dynamical core of the COSMO weather prediction framework. Based on the experiences of the atmospheric dwarfs developed within H2020 ESCAPE project, we develop efficient, architecture agnostic implementations of fully three-dimensional MPDATA advection schemes and generalized diffusion operator in curvilinear coordinates and spherical geometry. We compare optimized Fortran implementation with preliminary C++ implementation employing the Gridtools library, allowing for integrations on CPU and GPU while maintaining single source code.
Scientific Discovery through Advanced Computing in Plasma Science

NASA Astrophysics Data System (ADS)

Tang, William

2005-03-01

Advanced computing is generally recognized to be an increasingly vital tool for accelerating progress in scientific research during the 21st Century. For example, the Department of Energy's ``Scientific Discovery through Advanced Computing'' (SciDAC) Program was motivated in large measure by the fact that formidable scientific challenges in its research portfolio could best be addressed by utilizing the combination of the rapid advances in super-computing technology together with the emergence of effective new algorithms and computational methodologies. The imperative is to translate such progress into corresponding increases in the performance of the scientific codes used to model complex physical systems such as those encountered in high temperature plasma research. If properly validated against experimental measurements and analytic benchmarks, these codes can provide reliable predictive capability for the behavior of a broad range of complex natural and engineered systems. This talk reviews recent progress and future directions for advanced simulations with some illustrative examples taken from the plasma science applications area. Significant recent progress has been made in both particle and fluid simulations of fine-scale turbulence and large-scale dynamics, giving increasingly good agreement between experimental observations and computational modeling. This was made possible by the combination of access to powerful new computational resources together with innovative advances in analytic and computational methods for developing reduced descriptions of physics phenomena spanning a huge range in time and space scales. In particular, the plasma science community has made excellent progress in developing advanced codes for which computer run-time and problem size scale well with the number of processors on massively parallel machines (MPP's). A good example is the effective usage of the full power of multi-teraflop (multi-trillion floating point computations per second) MPP's to produce three-dimensional, general geometry, nonlinear particle simulations which have accelerated progress in understanding the nature of plasma turbulence in magnetically-confined high temperature plasmas. These calculations, which typically utilized billions of particles for thousands of time-steps, would not have been possible without access to powerful present generation MPP computers and the associated diagnostic and visualization capabilities. In general, results from advanced simulations provide great encouragement for being able to include increasingly realistic dynamics to enable deeper physics insights into plasmas in both natural and laboratory environments. The associated scientific excitement should serve to stimulate improved cross-cutting collaborations with other fields and also to help attract bright young talent to the computational science area.
Binary Black Hole Mergers, Gravitational Waves, and LISA

NASA Astrophysics Data System (ADS)

Centrella, Joan; Baker, J.; Boggs, W.; Kelly, B.; McWilliams, S.; van Meter, J.

2007-12-01

The final merger of comparable mass binary black holes is expected to be the strongest source of gravitational waves for LISA. Since these mergers take place in regions of extreme gravity, we need to solve Einstein's equations of general relativity on a computer in order to calculate these waveforms. For more than 30 years, scientists have tried to compute black hole mergers using the methods of numerical relativity. The resulting computer codes have been plagued by instabilities, causing them to crash well before the black holes in the binary could complete even a single orbit. Within the past few years, however, this situation has changed dramatically, with a series of remarkable breakthroughs. We will present the results of new simulations of black hole mergers with unequal masses and spins, focusing on the gravitational waves emitted and the accompanying astrophysical "kicks.” The magnitude of these kicks has bearing on the production and growth of supermassive blackholes during the epoch of structure formation, and on the retention of black holes in stellar clusters. This work was supported by NASA grant 06-BEFS06-19, and the simulations were carried out using Project Columbia at the NASA Advanced Supercomputing Division (Ames Research Center) and at the NASA Center for Computational Sciences (Goddard Space Flight Center).
High-speed classification of coherent X-ray diffraction patterns on the K computer for high-resolution single biomolecule imaging.

PubMed

Tokuhisa, Atsushi; Arai, Junya; Joti, Yasumasa; Ohno, Yoshiyuki; Kameyama, Toyohisa; Yamamoto, Keiji; Hatanaka, Masayuki; Gerofi, Balazs; Shimada, Akio; Kurokawa, Motoyoshi; Shoji, Fumiyoshi; Okada, Kensuke; Sugimoto, Takashi; Yamaga, Mitsuhiro; Tanaka, Ryotaro; Yokokawa, Mitsuo; Hori, Atsushi; Ishikawa, Yutaka; Hatsui, Takaki; Go, Nobuhiro

2013-11-01

Single-particle coherent X-ray diffraction imaging using an X-ray free-electron laser has the potential to reveal the three-dimensional structure of a biological supra-molecule at sub-nanometer resolution. In order to realise this method, it is necessary to analyze as many as 1 × 10(6) noisy X-ray diffraction patterns, each for an unknown random target orientation. To cope with the severe quantum noise, patterns need to be classified according to their similarities and average similar patterns to improve the signal-to-noise ratio. A high-speed scalable scheme has been developed to carry out classification on the K computer, a 10PFLOPS supercomputer at RIKEN Advanced Institute for Computational Science. It is designed to work on the real-time basis with the experimental diffraction pattern collection at the X-ray free-electron laser facility SACLA so that the result of classification can be feedback for optimizing experimental parameters during the experiment. The present status of our effort developing the system and also a result of application to a set of simulated diffraction patterns is reported. About 1 × 10(6) diffraction patterns were successfully classificatied by running 255 separate 1 h jobs in 385-node mode.
High-speed classification of coherent X-ray diffraction patterns on the K computer for high-resolution single biomolecule imaging

PubMed Central

Tokuhisa, Atsushi; Arai, Junya; Joti, Yasumasa; Ohno, Yoshiyuki; Kameyama, Toyohisa; Yamamoto, Keiji; Hatanaka, Masayuki; Gerofi, Balazs; Shimada, Akio; Kurokawa, Motoyoshi; Shoji, Fumiyoshi; Okada, Kensuke; Sugimoto, Takashi; Yamaga, Mitsuhiro; Tanaka, Ryotaro; Yokokawa, Mitsuo; Hori, Atsushi; Ishikawa, Yutaka; Hatsui, Takaki; Go, Nobuhiro

2013-01-01

Single-particle coherent X-ray diffraction imaging using an X-ray free-electron laser has the potential to reveal the three-dimensional structure of a biological supra-molecule at sub-nanometer resolution. In order to realise this method, it is necessary to analyze as many as 1 × 106 noisy X-ray diffraction patterns, each for an unknown random target orientation. To cope with the severe quantum noise, patterns need to be classified according to their similarities and average similar patterns to improve the signal-to-noise ratio. A high-speed scalable scheme has been developed to carry out classification on the K computer, a 10PFLOPS supercomputer at RIKEN Advanced Institute for Computational Science. It is designed to work on the real-time basis with the experimental diffraction pattern collection at the X-ray free-electron laser facility SACLA so that the result of classification can be feedback for optimizing experimental parameters during the experiment. The present status of our effort developing the system and also a result of application to a set of simulated diffraction patterns is reported. About 1 × 106 diffraction patterns were successfully classificatied by running 255 separate 1 h jobs in 385-node mode. PMID:24121336
HACC: Simulating sky surveys on state-of-the-art supercomputing architectures

NASA Astrophysics Data System (ADS)

Habib, Salman; Pope, Adrian; Finkel, Hal; Frontiere, Nicholas; Heitmann, Katrin; Daniel, David; Fasel, Patricia; Morozov, Vitali; Zagaris, George; Peterka, Tom; Vishwanath, Venkatram; Lukić, Zarija; Sehrish, Saba; Liao, Wei-keng

2016-01-01

Current and future surveys of large-scale cosmic structure are associated with a massive and complex datastream to study, characterize, and ultimately understand the physics behind the two major components of the 'Dark Universe', dark energy and dark matter. In addition, the surveys also probe primordial perturbations and carry out fundamental measurements, such as determining the sum of neutrino masses. Large-scale simulations of structure formation in the Universe play a critical role in the interpretation of the data and extraction of the physics of interest. Just as survey instruments continue to grow in size and complexity, so do the supercomputers that enable these simulations. Here we report on HACC (Hardware/Hybrid Accelerated Cosmology Code), a recently developed and evolving cosmology N-body code framework, designed to run efficiently on diverse computing architectures and to scale to millions of cores and beyond. HACC can run on all current supercomputer architectures and supports a variety of programming models and algorithms. It has been demonstrated at scale on Cell- and GPU-accelerated systems, standard multi-core node clusters, and Blue Gene systems. HACC's design allows for ease of portability, and at the same time, high levels of sustained performance on the fastest supercomputers available. We present a description of the design philosophy of HACC, the underlying algorithms and code structure, and outline implementation details for several specific architectures. We show selected accuracy and performance results from some of the largest high resolution cosmological simulations so far performed, including benchmarks evolving more than 3.6 trillion particles.
HACC: Simulating sky surveys on state-of-the-art supercomputing architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Habib, Salman; Pope, Adrian; Finkel, Hal

2016-01-01

Current and future surveys of large-scale cosmic structure are associated with a massive and complex datastream to study, characterize, and ultimately understand the physics behind the two major components of the ‘Dark Universe’, dark energy and dark matter. In addition, the surveys also probe primordial perturbations and carry out fundamental measurements, such as determining the sum of neutrino masses. Large-scale simulations of structure formation in the Universe play a critical role in the interpretation of the data and extraction of the physics of interest. Just as survey instruments continue to grow in size and complexity, so do the supercomputers thatmore » enable these simulations. Here we report on HACC (Hardware/Hybrid Accelerated Cosmology Code), a recently developed and evolving cosmology N-body code framework, designed to run efficiently on diverse computing architectures and to scale to millions of cores and beyond. HACC can run on all current supercomputer architectures and supports a variety of programming models and algorithms. It has been demonstrated at scale on Cell- and GPU-accelerated systems, standard multi-core node clusters, and Blue Gene systems. HACC’s design allows for ease of portability, and at the same time, high levels of sustained performance on the fastest supercomputers available. We present a description of the design philosophy of HACC, the underlying algorithms and code structure, and outline implementation details for several specific architectures. We show selected accuracy and performance results from some of the largest high resolution cosmological simulations so far performed, including benchmarks evolving more than 3.6 trillion particles.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)

Meneses, Esteban; Ni, Xiang; Jones, Terry R

The unprecedented computational power of cur- rent supercomputers now makes possible the exploration of complex problems in many scientific fields, from genomic analysis to computational fluid dynamics. Modern machines are powerful because they are massive: they assemble millions of cores and a huge quantity of disks, cards, routers, and other components. But it is precisely the size of these machines that glooms the future of supercomputing. A system that comprises many components has a high chance to fail, and fail often. In order to make the next generation of supercomputers usable, it is imperative to use some type of faultmore » tolerance platform to run applications on large machines. Most fault tolerance strategies can be optimized for the peculiarities of each system and boost efficacy by keeping the system productive. In this paper, we aim to understand how failure characterization can improve resilience in several layers of the software stack: applications, runtime systems, and job schedulers. We examine the Titan supercomputer, one of the fastest systems in the world. We analyze a full year of Titan in production and distill the failure patterns of the machine. By looking into Titan s log files and using the criteria of experts, we provide a detailed description of the types of failures. In addition, we inspect the job submission files and describe how the system is used. Using those two sources, we cross correlate failures in the machine to executing jobs and provide a picture of how failures affect the user experience. We believe such characterization is fundamental in developing appropriate fault tolerance solutions for Cray systems similar to Titan.« less
Energy Innovation Hubs: A Home for Scientific Collaboration

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chu, Steven

Secretary Chu will host a live, streaming Q&A session with the directors of the Energy Innovation Hubs on Tuesday, March 6, at 2:15 p.m. EST. The directors will be available for questions regarding their teams' work and the future of American energy. Ask your questions in the comments below, or submit them on Facebook, Twitter (@energy), or send an e-mail to newmedia@hq.doe.gov, prior or during the live event. Dr. Hank Foley is the director of the Greater Philadelphia Innovation Cluster for Energy-Efficient Buildings, which is pioneering new data intensive techniques for designing and operating energy efficient buildings, including advanced computermore » modeling. Dr. Douglas Kothe is the director of the Consortium for Advanced Simulation of Light Water Reactors, which uses powerful supercomputers to create "virtual" reactors that will help improve the safety and performance of both existing and new nuclear reactors. Dr. Nathan Lewis is the director of the Joint Center for Artificial Photosynthesis, which focuses on how to produce fuels from sunlight, water, and carbon dioxide. The Energy Innovation Hubs are major integrated research centers, with researchers from many different institutions and technical backgrounds. Each hub is focused on a specific high priority goal, rapidly accelerating scientific discoveries and shortening the path from laboratory innovation to technological development and commercial deployment of critical energy technologies. Ask your questions in the comments below, or submit them on Facebook, Twitter (@energy), or send an e-mail to newmedia@energy.gov, prior or during the live event. The Energy Innovation Hubs are major integrated research centers, with researchers from many different institutions and technical backgrounds. Each Hub is focused on a specific high priority goal, rapidly accelerating scientific discoveries and shortening the path from laboratory innovation to technological development and commercial deployment of critical energy technologies. Dr. Hank Holey is the director of the Greater Philadelphia Innovation Cluster for Energy-Efficient Buildings, which is pioneering new data intensive techniques for designing and operating energy efficient buildings, including advanced computer modeling. Dr. Douglas Kothe is the director of the Modeling and Simulation for Nuclear Reactors Hub, which uses powerful supercomputers to create "virtual" reactors that will help improve the safety and performance of both existing and new nuclear reactors. Dr. Nathan Lewis is the director of the Joint Center for Artificial Photosynthesis Hub, which focuses on how to produce biofuels from sunlight, water, and carbon dioxide.« less
Computation Directorate Annual Report 2003

DOE Office of Scientific and Technical Information (OSTI.GOV)

Crawford, D L; McGraw, J R; Ashby, S F

Big computers are icons: symbols of the culture, and of the larger computing infrastructure that exists at Lawrence Livermore. Through the collective effort of Laboratory personnel, they enable scientific discovery and engineering development on an unprecedented scale. For more than three decades, the Computation Directorate has supplied the big computers that enable the science necessary for Laboratory missions and programs. Livermore supercomputing is uniquely mission driven. The high-fidelity weapon simulation capabilities essential to the Stockpile Stewardship Program compel major advances in weapons codes and science, compute power, and computational infrastructure. Computation's activities align with this vital mission of the Departmentmore » of Energy. Increasingly, non-weapons Laboratory programs also rely on computer simulation. World-class achievements have been accomplished by LLNL specialists working in multi-disciplinary research and development teams. In these teams, Computation personnel employ a wide array of skills, from desktop support expertise, to complex applications development, to advanced research. Computation's skilled professionals make the Directorate the success that it has become. These individuals know the importance of the work they do and the many ways it contributes to Laboratory missions. They make appropriate and timely decisions that move the entire organization forward. They make Computation a leader in helping LLNL achieve its programmatic milestones. I dedicate this inaugural Annual Report to the people of Computation in recognition of their continuing contributions. I am proud that we perform our work securely and safely. Despite increased cyber attacks on our computing infrastructure from the Internet, advanced cyber security practices ensure that our computing environment remains secure. Through Integrated Safety Management (ISM) and diligent oversight, we address safety issues promptly and aggressively. The safety of our employees, whether at work or at home, is a paramount concern. Even as the Directorate meets today's supercomputing requirements, we are preparing for the future. We are investigating open-source cluster technology, the basis of our highly successful Mulitprogrammatic Capability Resource (MCR). Several breakthrough discoveries have resulted from MCR calculations coupled with theory and experiment, prompting Laboratory scientists to demand ever-greater capacity and capability. This demand is being met by a new 23-TF system, Thunder, with architecture modeled on MCR. In preparation for the ''after-next'' computer, we are researching technology even farther out on the horizon--cell-based computers. Assuming that the funding and the technology hold, we will acquire the cell-based machine BlueGene/L within the next 12 months.« less
Energy Innovation Hubs: A Home for Scientific Collaboration

ScienceCinema

Chu, Steven

2017-12-11

Secretary Chu will host a live, streaming Q&A session with the directors of the Energy Innovation Hubs on Tuesday, March 6, at 2:15 p.m. EST. The directors will be available for questions regarding their teams' work and the future of American energy. Ask your questions in the comments below, or submit them on Facebook, Twitter (@energy), or send an e-mail to newmedia@hq.doe.gov, prior or during the live event. Dr. Hank Foley is the director of the Greater Philadelphia Innovation Cluster for Energy-Efficient Buildings, which is pioneering new data intensive techniques for designing and operating energy efficient buildings, including advanced computer modeling. Dr. Douglas Kothe is the director of the Consortium for Advanced Simulation of Light Water Reactors, which uses powerful supercomputers to create "virtual" reactors that will help improve the safety and performance of both existing and new nuclear reactors. Dr. Nathan Lewis is the director of the Joint Center for Artificial Photosynthesis, which focuses on how to produce fuels from sunlight, water, and carbon dioxide. The Energy Innovation Hubs are major integrated research centers, with researchers from many different institutions and technical backgrounds. Each hub is focused on a specific high priority goal, rapidly accelerating scientific discoveries and shortening the path from laboratory innovation to technological development and commercial deployment of critical energy technologies. Ask your questions in the comments below, or submit them on Facebook, Twitter (@energy), or send an e-mail to newmedia@energy.gov, prior or during the live event. The Energy Innovation Hubs are major integrated research centers, with researchers from many different institutions and technical backgrounds. Each Hub is focused on a specific high priority goal, rapidly accelerating scientific discoveries and shortening the path from laboratory innovation to technological development and commercial deployment of critical energy technologies. Dr. Hank Holey is the director of the Greater Philadelphia Innovation Cluster for Energy-Efficient Buildings, which is pioneering new data intensive techniques for designing and operating energy efficient buildings, including advanced computer modeling. Dr. Douglas Kothe is the director of the Modeling and Simulation for Nuclear Reactors Hub, which uses powerful supercomputers to create "virtual" reactors that will help improve the safety and performance of both existing and new nuclear reactors. Dr. Nathan Lewis is the director of the Joint Center for Artificial Photosynthesis Hub, which focuses on how to produce biofuels from sunlight, water, and carbon dioxide.
Calibrating Building Energy Models Using Supercomputer Trained Machine Learning Agents

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sanyal, Jibonananda; New, Joshua Ryan; Edwards, Richard

2014-01-01

Building Energy Modeling (BEM) is an approach to model the energy usage in buildings for design and retrofit purposes. EnergyPlus is the flagship Department of Energy software that performs BEM for different types of buildings. The input to EnergyPlus can often extend in the order of a few thousand parameters which have to be calibrated manually by an expert for realistic energy modeling. This makes it challenging and expensive thereby making building energy modeling unfeasible for smaller projects. In this paper, we describe the Autotune research which employs machine learning algorithms to generate agents for the different kinds of standardmore » reference buildings in the U.S. building stock. The parametric space and the variety of building locations and types make this a challenging computational problem necessitating the use of supercomputers. Millions of EnergyPlus simulations are run on supercomputers which are subsequently used to train machine learning algorithms to generate agents. These agents, once created, can then run in a fraction of the time thereby allowing cost-effective calibration of building models.« less
Challenges in scaling NLO generators to leadership computers

NASA Astrophysics Data System (ADS)

Benjamin, D.; Childers, JT; Hoeche, S.; LeCompte, T.; Uram, T.

2017-10-01

Exascale computing resources are roughly a decade away and will be capable of 100 times more computing than current supercomputers. In the last year, Energy Frontier experiments crossed a milestone of 100 million core-hours used at the Argonne Leadership Computing Facility, Oak Ridge Leadership Computing Facility, and NERSC. The Fortran-based leading-order parton generator called Alpgen was successfully scaled to millions of threads to achieve this level of usage on Mira. Sherpa and MadGraph are next-to-leading order generators used heavily by LHC experiments for simulation. Integration times for high-multiplicity or rare processes can take a week or more on standard Grid machines, even using all 16-cores. We will describe our ongoing work to scale the Sherpa generator to thousands of threads on leadership-class machines and reduce run-times to less than a day. This work allows the experiments to leverage large-scale parallel supercomputers for event generation today, freeing tens of millions of grid hours for other work, and paving the way for future applications (simulation, reconstruction) on these and future supercomputers.
Sign: large-scale gene network estimation environment for high performance computing.

PubMed

Tamada, Yoshinori; Shimamura, Teppei; Yamaguchi, Rui; Imoto, Seiya; Nagasaki, Masao; Miyano, Satoru

2011-01-01

Our research group is currently developing software for estimating large-scale gene networks from gene expression data. The software, called SiGN, is specifically designed for the Japanese flagship supercomputer "K computer" which is planned to achieve 10 petaflops in 2012, and other high performance computing environments including Human Genome Center (HGC) supercomputer system. SiGN is a collection of gene network estimation software with three different sub-programs: SiGN-BN, SiGN-SSM and SiGN-L1. In these three programs, five different models are available: static and dynamic nonparametric Bayesian networks, state space models, graphical Gaussian models, and vector autoregressive models. All these models require a huge amount of computational resources for estimating large-scale gene networks and therefore are designed to be able to exploit the speed of 10 petaflops. The software will be available freely for "K computer" and HGC supercomputer system users. The estimated networks can be viewed and analyzed by Cell Illustrator Online and SBiP (Systems Biology integrative Pipeline). The software project web site is available at http://sign.hgc.jp/ .

Massively parallel algorithm and implementation of RI-MP2 energy calculation for peta-scale many-core supercomputers.

PubMed

Katouda, Michio; Naruse, Akira; Hirano, Yukihiko; Nakajima, Takahito

2016-11-15

A new parallel algorithm and its implementation for the RI-MP2 energy calculation utilizing peta-flop-class many-core supercomputers are presented. Some improvements from the previous algorithm (J. Chem. Theory Comput. 2013, 9, 5373) have been performed: (1) a dual-level hierarchical parallelization scheme that enables the use of more than 10,000 Message Passing Interface (MPI) processes and (2) a new data communication scheme that reduces network communication overhead. A multi-node and multi-GPU implementation of the present algorithm is presented for calculations on a central processing unit (CPU)/graphics processing unit (GPU) hybrid supercomputer. Benchmark results of the new algorithm and its implementation using the K computer (CPU clustering system) and TSUBAME 2.5 (CPU/GPU hybrid system) demonstrate high efficiency. The peak performance of 3.1 PFLOPS is attained using 80,199 nodes of the K computer. The peak performance of the multi-node and multi-GPU implementation is 514 TFLOPS using 1349 nodes and 4047 GPUs of TSUBAME 2.5. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
Optical clock distribution in supercomputers using polyimide-based waveguides

NASA Astrophysics Data System (ADS)

Bihari, Bipin; Gan, Jianhua; Wu, Linghui; Liu, Yujie; Tang, Suning; Chen, Ray T.

1999-04-01

Guided-wave optics is a promising way to deliver high-speed clock-signal in supercomputer with minimized clock-skew. Si- CMOS compatible polymer-based waveguides for optoelectronic interconnects and packaging have been fabricated and characterized. A 1-to-48 fanout optoelectronic interconnection layer (OIL) structure based on Ultradel 9120/9020 for the high-speed massive clock signal distribution for a Cray T-90 supercomputer board has been constructed. The OIL employs multimode polymeric channel waveguides in conjunction with surface-normal waveguide output coupler and 1-to-2 splitters. Surface-normal couplers can couple the optical clock signals into and out from the H-tree polyimide waveguides surface-normally, which facilitates the integration of photodetectors to convert optical-signal to electrical-signal. A 45-degree surface- normal couplers has been integrated at each output end. The measured output coupling efficiency is nearly 100 percent. The output profile from 45-degree surface-normal coupler were calculated using Fresnel approximation. the theoretical result is in good agreement with experimental result. A total insertion loss of 7.98 dB at 850 nm was measured experimentally.
Flow visualization of CFD using graphics workstations

NASA Technical Reports Server (NTRS)

Lasinski, Thomas; Buning, Pieter; Choi, Diana; Rogers, Stuart; Bancroft, Gordon

1987-01-01

High performance graphics workstations are used to visualize the fluid flow dynamics obtained from supercomputer solutions of computational fluid dynamic programs. The visualizations can be done independently on the workstation or while the workstation is connected to the supercomputer in a distributed computing mode. In the distributed mode, the supercomputer interactively performs the computationally intensive graphics rendering tasks while the workstation performs the viewing tasks. A major advantage of the workstations is that the viewers can interactively change their viewing position while watching the dynamics of the flow fields. An overview of the computer hardware and software required to create these displays is presented. For complex scenes the workstation cannot create the displays fast enough for good motion analysis. For these cases, the animation sequences are recorded on video tape or 16 mm film a frame at a time and played back at the desired speed. The additional software and hardware required to create these video tapes or 16 mm movies are also described. Photographs illustrating current visualization techniques are discussed. Examples of the use of the workstations for flow visualization through animation are available on video tape.
Two-dimensional nonsteady viscous flow simulation on the Navier-Stokes computer miniNode

NASA Technical Reports Server (NTRS)

Nosenchuck, Daniel M.; Littman, Michael G.; Flannery, William

1986-01-01

The needs of large-scale scientific computation are outpacing the growth in performance of mainframe supercomputers. In particular, problems in fluid mechanics involving complex flow simulations require far more speed and capacity than that provided by current and proposed Class VI supercomputers. To address this concern, the Navier-Stokes Computer (NSC) was developed. The NSC is a parallel-processing machine, comprised of individual Nodes, each comparable in performance to current supercomputers. The global architecture is that of a hypercube, and a 128-Node NSC has been designed. New architectural features, such as a reconfigurable many-function ALU pipeline and a multifunction memory-ALU switch, have provided the capability to efficiently implement a wide range of algorithms. Efficient algorithms typically involve numerically intensive tasks, which often include conditional operations. These operations may be efficiently implemented on the NSC without, in general, sacrificing vector-processing speed. To illustrate the architecture, programming, and several of the capabilities of the NSC, the simulation of two-dimensional, nonsteady viscous flows on a prototype Node, called the miniNode, is presented.
Long-Term file activity patterns in a UNIX workstation environment

NASA Technical Reports Server (NTRS)

Gibson, Timothy J.; Miller, Ethan L.

1998-01-01

As mass storage technology becomes more affordable for sites smaller than supercomputer centers, understanding their file access patterns becomes crucial for developing systems to store rarely used data on tertiary storage devices such as tapes and optical disks. This paper presents a new way to collect and analyze file system statistics for UNIX-based file systems. The collection system runs in user-space and requires no modification of the operating system kernel. The statistics package provides details about file system operations at the file level: creations, deletions, modifications, etc. The paper analyzes four months of file system activity on a university file system. The results confirm previously published results gathered from supercomputer file systems, but differ in several important areas. Files in this study were considerably smaller than those at supercomputer centers, and they were accessed less frequently. Additionally, the long-term creation rate on workstation file systems is sufficiently low so that all data more than a day old could be cheaply saved on a mass storage device, allowing the integration of time travel into every file system.
Asynchronous communication in spectral-element and discontinuous Galerkin methods for atmospheric dynamics – a case study using the High-Order Methods Modeling Environment (HOMME-homme_dg_branch)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jamroz, Benjamin F.; Klofkorn, Robert

The scalability of computational applications on current and next-generation supercomputers is increasingly limited by the cost of inter-process communication. We implement non-blocking asynchronous communication in the High-Order Methods Modeling Environment for the time integration of the hydrostatic fluid equations using both the spectral-element and discontinuous Galerkin methods. This allows the overlap of computation with communication, effectively hiding some of the costs of communication. A novel detail about our approach is that it provides some data movement to be performed during the asynchronous communication even in the absence of other computations. This method produces significant performance and scalability gains in large-scalemore » simulations.« less
Asynchronous communication in spectral-element and discontinuous Galerkin methods for atmospheric dynamics – a case study using the High-Order Methods Modeling Environment (HOMME-homme_dg_branch)

DOE PAGES

Jamroz, Benjamin F.; Klofkorn, Robert

2016-08-26

The scalability of computational applications on current and next-generation supercomputers is increasingly limited by the cost of inter-process communication. We implement non-blocking asynchronous communication in the High-Order Methods Modeling Environment for the time integration of the hydrostatic fluid equations using both the spectral-element and discontinuous Galerkin methods. This allows the overlap of computation with communication, effectively hiding some of the costs of communication. A novel detail about our approach is that it provides some data movement to be performed during the asynchronous communication even in the absence of other computations. This method produces significant performance and scalability gains in large-scalemore » simulations.« less
Adaptive methods for nonlinear structural dynamics and crashworthiness analysis

NASA Technical Reports Server (NTRS)

Belytschko, Ted

1993-01-01

The objective is to describe three research thrusts in crashworthiness analysis: adaptivity; mixed time integration, or subcycling, in which different timesteps are used for different parts of the mesh in explicit methods; and methods for contact-impact which are highly vectorizable. The techniques are being developed to improve the accuracy of calculations, ease-of-use of crashworthiness programs, and the speed of calculations. The latter is still of importance because crashworthiness calculations are often made with models of 20,000 to 50,000 elements using explicit time integration and require on the order of 20 to 100 hours on current supercomputers. The methodologies are briefly reviewed and then some example calculations employing these methods are described. The methods are also of value to other nonlinear transient computations.
Towards real-time photon Monte Carlo dose calculation in the cloud

NASA Astrophysics Data System (ADS)

Ziegenhein, Peter; Kozin, Igor N.; Kamerling, Cornelis Ph; Oelfke, Uwe

2017-06-01

Near real-time application of Monte Carlo (MC) dose calculation in clinic and research is hindered by the long computational runtimes of established software. Currently, fast MC software solutions are available utilising accelerators such as graphical processing units (GPUs) or clusters based on central processing units (CPUs). Both platforms are expensive in terms of purchase costs and maintenance and, in case of the GPU, provide only limited scalability. In this work we propose a cloud-based MC solution, which offers high scalability of accurate photon dose calculations. The MC simulations run on a private virtual supercomputer that is formed in the cloud. Computational resources can be provisioned dynamically at low cost without upfront investment in expensive hardware. A client-server software solution has been developed which controls the simulations and transports data to and from the cloud efficiently and securely. The client application integrates seamlessly into a treatment planning system. It runs the MC simulation workflow automatically and securely exchanges simulation data with the server side application that controls the virtual supercomputer. Advanced encryption standards were used to add an additional security layer, which encrypts and decrypts patient data on-the-fly at the processor register level. We could show that our cloud-based MC framework enables near real-time dose computation. It delivers excellent linear scaling for high-resolution datasets with absolute runtimes of 1.1 seconds to 10.9 seconds for simulating a clinical prostate and liver case up to 1% statistical uncertainty. The computation runtimes include the transportation of data to and from the cloud as well as process scheduling and synchronisation overhead. Cloud-based MC simulations offer a fast, affordable and easily accessible alternative for near real-time accurate dose calculations to currently used GPU or cluster solutions.
The " Swarm of Ants vs. Herd of Elephants" Debated Revisited: Performance Measurements of PVM-Overflow Across a Wide Spectrum of Architectures

NASA Technical Reports Server (NTRS)

Yan, Jerry C.; Jespersen, Dennis; Buning, Peter; Bailey, David (Technical Monitor)

1996-01-01

The Gorden Bell Prizes given out at Supercomputing every year includes at least two catergories: performance (highest GFLOP count) and price-performance (GFLOP/million $$) for real applications. In the past five years, the winners of the price-performance categories all came from networks of work-stations. This reflects three important facts: 1. supercomputers are still too expensive for the masses; 2. achieving high performance for real applications takes real work; and, most importantly; 3. it is possible to obtain acceptable performance for certain real applications on network of work stations. With the continued advance of network technology as well as increased performance of "desktop" workstation, the "Swarm of Ants vs. Herd of Elephants" debate, which began with vector multiprocessors (VPPs) against SIMD type multiprocessors (e.g. CM2), is now recast as VPPs against Symetric Multiprocessors (SMPs, e.g. SGI PowerChallenge). This paper reports on performance studies we performed solving a large scale (2-million grid pt.s) CFD problem involving a Boeing 747 based on a parallel version of OVERFLOW that utilizes message passing on PVM. A performance monitoring tool developed under NASA HPCC, called AIMS, was used to instrument and analyze the the performance data thus obtained. We plan to compare its performance data obtained across a wide spectrum of architectures including: the Cray C90, IBM/SP2, SGI/Power Challenge Cluster, to a group of workstations connected over a simple network. The metrics of comparison includes speed-up, price-performance, throughput, and turn-around time. We also plan to present a plan of attack for various issues that will make the execution of Grand Challenge Applications across the Global Information Infrastructure a reality.
Core-collapse supernovae as supercomputing science: A status report toward six-dimensional simulations with exact Boltzmann neutrino transport in full general relativity

NASA Astrophysics Data System (ADS)

Kotake, Kei; Sumiyoshi, Kohsuke; Yamada, Shoichi; Takiwaki, Tomoya; Kuroda, Takami; Suwa, Yudai; Nagakura, Hiroki

2012-08-01

This is a status report on our endeavor to reveal the mechanism of core-collapse supernovae (CCSNe) by large-scale numerical simulations. Multi-dimensionality of the supernova engine, general relativistic magnetohydrodynamics, energy and lepton number transport by neutrinos emitted from the forming neutron star, as well as nuclear interactions there, are all believed to play crucial roles in repelling infalling matter and producing energetic explosions. These ingredients are non-linearly coupled with one another in the dynamics of core collapse, bounce, and shock expansion. Serious quantitative studies of CCSNe hence make extensive numerical computations mandatory. Since neutrinos are neither in thermal nor in chemical equilibrium in general, their distributions in the phase space should be computed. This is a six-dimensional (6D) neutrino transport problem and quite a challenge, even for those with access to the most advanced numerical resources such as the "K computer". To tackle this problem, we have embarked on efforts on multiple fronts. In particular, we report in this paper our recent progresses in the treatment of multidimensional (multi-D) radiation hydrodynamics. We are currently proceeding on two different paths to the ultimate goal. In one approach, we employ an approximate but highly efficient scheme for neutrino transport and treat 3D hydrodynamics and/or general relativity rigorously; some neutrino-driven explosions will be presented and quantitative comparisons will be made between 2D and 3D models. In the second approach, on the other hand, exact, but so far Newtonian, Boltzmann equations are solved in two and three spatial dimensions; we will show some example test simulations. We will also address the perspectives of exascale computations on the next generation supercomputers.
Towards real-time photon Monte Carlo dose calculation in the cloud.

PubMed

Ziegenhein, Peter; Kozin, Igor N; Kamerling, Cornelis Ph; Oelfke, Uwe

2017-06-07

Near real-time application of Monte Carlo (MC) dose calculation in clinic and research is hindered by the long computational runtimes of established software. Currently, fast MC software solutions are available utilising accelerators such as graphical processing units (GPUs) or clusters based on central processing units (CPUs). Both platforms are expensive in terms of purchase costs and maintenance and, in case of the GPU, provide only limited scalability. In this work we propose a cloud-based MC solution, which offers high scalability of accurate photon dose calculations. The MC simulations run on a private virtual supercomputer that is formed in the cloud. Computational resources can be provisioned dynamically at low cost without upfront investment in expensive hardware. A client-server software solution has been developed which controls the simulations and transports data to and from the cloud efficiently and securely. The client application integrates seamlessly into a treatment planning system. It runs the MC simulation workflow automatically and securely exchanges simulation data with the server side application that controls the virtual supercomputer. Advanced encryption standards were used to add an additional security layer, which encrypts and decrypts patient data on-the-fly at the processor register level. We could show that our cloud-based MC framework enables near real-time dose computation. It delivers excellent linear scaling for high-resolution datasets with absolute runtimes of 1.1 seconds to 10.9 seconds for simulating a clinical prostate and liver case up to 1% statistical uncertainty. The computation runtimes include the transportation of data to and from the cloud as well as process scheduling and synchronisation overhead. Cloud-based MC simulations offer a fast, affordable and easily accessible alternative for near real-time accurate dose calculations to currently used GPU or cluster solutions.
Opportunities for leveraging OS virtualization in high-end supercomputing.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bridges, Patrick G.; Pedretti, Kevin Thomas Tauke

2010-11-01

This paper examines potential motivations for incorporating virtualization support in the system software stacks of high-end capability supercomputers. We advocate that this will increase the flexibility of these platforms significantly and enable new capabilities that are not possible with current fixed software stacks. Our results indicate that compute, virtual memory, and I/O virtualization overheads are low and can be further mitigated by utilizing well-known techniques such as large paging and VMM bypass. Furthermore, since the addition of virtualization support does not affect the performance of applications using the traditional native environment, there is essentially no disadvantage to its addition.
Designing a connectionist network supercomputer.

PubMed

Asanović, K; Beck, J; Feldman, J; Morgan, N; Wawrzynek, J

1993-12-01

This paper describes an effort at UC Berkeley and the International Computer Science Institute to develop a supercomputer for artificial neural network applications. Our perspective has been strongly influenced by earlier experiences with the construction and use of a simpler machine. In particular, we have observed Amdahl's Law in action in our designs and those of others. These observations inspire attention to many factors beyond fast multiply-accumulate arithmetic. We describe a number of these factors along with rough expressions for their influence and then give the applications targets, machine goals and the system architecture for the machine we are currently designing.
Building black holes: supercomputer cinema.

PubMed

Shapiro, S L; Teukolsky, S A

1988-07-22

A new computer code can solve Einstein's equations of general relativity for the dynamical evolution of a relativistic star cluster. The cluster may contain a large number of stars that move in a strong gravitational field at speeds approaching the speed of light. Unstable star clusters undergo catastrophic collapse to black holes. The collapse of an unstable cluster to a supermassive black hole at the center of a galaxy may explain the origin of quasars and active galactic nuclei. By means of a supercomputer simulation and color graphics, the whole process can be viewed in real time on a movie screen.
Supercomputer analysis of purine and pyrimidine metabolism leading to DNA synthesis.

PubMed

Heinmets, F

1989-06-01

A model-system is established to analyze purine and pyrimidine metabolism leading to DNA synthesis. The principal aim is to explore the flow and regulation of terminal deoxynucleoside triophosphates (dNTPs) in various input and parametric conditions. A series of flow equations are established, which are subsequently converted to differential equations. These are programmed (Fortran) and analyzed on a Cray chi-MP/48 supercomputer. The pool concentrations are presented as a function of time in conditions in which various pertinent parameters of the system are modified. The system is formulated by 100 differential equations.
Performance of the Widely-Used CFD Code OVERFLOW on the Pleides Supercomputer

NASA Technical Reports Server (NTRS)

Guruswamy, Guru P.

2017-01-01

Computational performance studies were made for NASA's widely used Computational Fluid Dynamics code OVERFLOW on the Pleiades Supercomputer. Two test cases were considered: a full launch vehicle with a grid of 286 million points and a full rotorcraft model with a grid of 614 million points. Computations using up to 8000 cores were run on Sandy Bridge and Ivy Bridge nodes. Performance was monitored using times reported in the day files from the Portable Batch System utility. Results for two grid topologies are presented and compared in detail. Observations and suggestions for future work are made.
Performance Evaluation of Supercomputers using HPCC and IMB Benchmarks

NASA Technical Reports Server (NTRS)

Saini, Subhash; Ciotti, Robert; Gunney, Brian T. N.; Spelce, Thomas E.; Koniges, Alice; Dossa, Don; Adamidis, Panagiotis; Rabenseifner, Rolf; Tiyyagura, Sunil R.; Mueller, Matthias;

2006-01-01

The HPC Challenge (HPCC) benchmark suite and the Intel MPI Benchmark (IMB) are used to compare and evaluate the combined performance of processor, memory subsystem and interconnect fabric of five leading supercomputers - SGI Altix BX2, Cray XI, Cray Opteron Cluster, Dell Xeon cluster, and NEC SX-8. These five systems use five different networks (SGI NUMALINK4, Cray network, Myrinet, InfiniBand, and NEC IXS). The complete set of HPCC benchmarks are run on each of these systems. Additionally, we present Intel MPI Benchmarks (IMB) results to study the performance of 11 MPI communication functions on these systems.

A highly scalable particle tracking algorithm using partitioned global address space (PGAS) programming for extreme-scale turbulence simulations

NASA Astrophysics Data System (ADS)

Buaria, D.; Yeung, P. K.

2017-12-01

A new parallel algorithm utilizing a partitioned global address space (PGAS) programming model to achieve high scalability is reported for particle tracking in direct numerical simulations of turbulent fluid flow. The work is motivated by the desire to obtain Lagrangian information necessary for the study of turbulent dispersion at the largest problem sizes feasible on current and next-generation multi-petaflop supercomputers. A large population of fluid particles is distributed among parallel processes dynamically, based on instantaneous particle positions such that all of the interpolation information needed for each particle is available either locally on its host process or neighboring processes holding adjacent sub-domains of the velocity field. With cubic splines as the preferred interpolation method, the new algorithm is designed to minimize the need for communication, by transferring between adjacent processes only those spline coefficients determined to be necessary for specific particles. This transfer is implemented very efficiently as a one-sided communication, using Co-Array Fortran (CAF) features which facilitate small data movements between different local partitions of a large global array. The cost of monitoring transfer of particle properties between adjacent processes for particles migrating across sub-domain boundaries is found to be small. Detailed benchmarks are obtained on the Cray petascale supercomputer Blue Waters at the University of Illinois, Urbana-Champaign. For operations on the particles in a 81923 simulation (0.55 trillion grid points) on 262,144 Cray XE6 cores, the new algorithm is found to be orders of magnitude faster relative to a prior algorithm in which each particle is tracked by the same parallel process at all times. This large speedup reduces the additional cost of tracking of order 300 million particles to just over 50% of the cost of computing the Eulerian velocity field at this scale. Improving support of PGAS models on major compilers suggests that this algorithm will be of wider applicability on most upcoming supercomputers.
High Temporal Resolution Mapping of Seismic Noise Sources Using Heterogeneous Supercomputers

NASA Astrophysics Data System (ADS)

Paitz, P.; Gokhberg, A.; Ermert, L. A.; Fichtner, A.

2017-12-01

The time- and space-dependent distribution of seismic noise sources is becoming a key ingredient of modern real-time monitoring of various geo-systems like earthquake fault zones, volcanoes, geothermal and hydrocarbon reservoirs. We present results of an ongoing research project conducted in collaboration with the Swiss National Supercomputing Centre (CSCS). The project aims at building a service providing seismic noise source maps for Central Europe with high temporal resolution. We use source imaging methods based on the cross-correlation of seismic noise records from all seismic stations available in the region of interest. The service is hosted on the CSCS computing infrastructure; all computationally intensive processing is performed on the massively parallel heterogeneous supercomputer "Piz Daint". The solution architecture is based on the Application-as-a-Service concept to provide the interested researchers worldwide with regular access to the noise source maps. The solution architecture includes the following sub-systems: (1) data acquisition responsible for collecting, on a periodic basis, raw seismic records from the European seismic networks, (2) high-performance noise source mapping application responsible for the generation of source maps using cross-correlation of seismic records, (3) back-end infrastructure for the coordination of various tasks and computations, (4) front-end Web interface providing the service to the end-users and (5) data repository. The noise source mapping itself rests on the measurement of logarithmic amplitude ratios in suitably pre-processed noise correlations, and the use of simplified sensitivity kernels. During the implementation we addressed various challenges, in particular, selection of data sources and transfer protocols, automation and monitoring of daily data downloads, ensuring the required data processing performance, design of a general service-oriented architecture for coordination of various sub-systems, and engineering an appropriate data storage solution. The present pilot version of the service implements noise source maps for Switzerland. Extension of the solution to Central Europe is planned for the next project phase.

A parallel-vector algorithm for rapid structural analysis on high-performance computers

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.

1990-01-01

A fast, accurate Choleski method for the solution of symmetric systems of linear equations is presented. This direct method is based on a variable-band storage scheme and takes advantage of column heights to reduce the number of operations in the Choleski factorization. The method employs parallel computation in the outermost DO-loop and vector computation via the 'loop unrolling' technique in the innermost DO-loop. The method avoids computations with zeros outside the column heights, and as an option, zeros inside the band. The close relationship between Choleski and Gauss elimination methods is examined. The minor changes required to convert the Choleski code to a Gauss code to solve non-positive-definite symmetric systems of equations are identified. The results for two large-scale structural analyses performed on supercomputers, demonstrate the accuracy and speed of the method.
A parallel-vector algorithm for rapid structural analysis on high-performance computers

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.

1990-01-01

A fast, accurate Choleski method for the solution of symmetric systems of linear equations is presented. This direct method is based on a variable-band storage scheme and takes advantage of column heights to reduce the number of operations in the Choleski factorization. The method employs parallel computation in the outermost DO-loop and vector computation via the loop unrolling technique in the innermost DO-loop. The method avoids computations with zeros outside the column heights, and as an option, zeros inside the band. The close relationship between Choleski and Gauss elimination methods is examined. The minor changes required to convert the Choleski code to a Gauss code to solve non-positive-definite symmetric systems of equations are identified. The results for two large scale structural analyses performed on supercomputers, demonstrate the accuracy and speed of the method.
Spatiotemporal modeling of node temperatures in supercomputers

DOE PAGES

Storlie, Curtis Byron; Reich, Brian James; Rust, William Newton; ...

2016-06-10

Los Alamos National Laboratory (LANL) is home to many large supercomputing clusters. These clusters require an enormous amount of power (~500-2000 kW each), and most of this energy is converted into heat. Thus, cooling the components of the supercomputer becomes a critical and expensive endeavor. Recently a project was initiated to investigate the effect that changes to the cooling system in a machine room had on three large machines that were housed there. Coupled with this goal was the aim to develop a general good-practice for characterizing the effect of cooling changes and monitoring machine node temperatures in this andmore » other machine rooms. This paper focuses on the statistical approach used to quantify the effect that several cooling changes to the room had on the temperatures of the individual nodes of the computers. The largest cluster in the room has 1,600 nodes that run a variety of jobs during general use. Since extremes temperatures are important, a Normal distribution plus generalized Pareto distribution for the upper tail is used to model the marginal distribution, along with a Gaussian process copula to account for spatio-temporal dependence. A Gaussian Markov random field (GMRF) model is used to model the spatial effects on the node temperatures as the cooling changes take place. This model is then used to assess the condition of the node temperatures after each change to the room. The analysis approach was used to uncover the cause of a problematic episode of overheating nodes on one of the supercomputing clusters. Lastly, this same approach can easily be applied to monitor and investigate cooling systems at other data centers, as well.« less
Integration of PanDA workload management system with Titan supercomputer at OLCF

NASA Astrophysics Data System (ADS)

De, K.; Klimentov, A.; Oleynik, D.; Panitkin, S.; Petrosyan, A.; Schovancova, J.; Vaniachine, A.; Wenaus, T.

2015-12-01

The PanDA (Production and Distributed Analysis) workload management system (WMS) was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. While PanDA currently distributes jobs to more than 100,000 cores at well over 100 Grid sites, the future LHC data taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with Titan supercomputer at Oak Ridge Leadership Computing Facility (OLCF). The current approach utilizes a modified PanDA pilot framework for job submission to Titan's batch queues and local data management, with light-weight MPI wrappers to run single threaded workloads in parallel on Titan's multicore worker nodes. It also gives PanDA new capability to collect, in real time, information about unused worker nodes on Titan, which allows precise definition of the size and duration of jobs submitted to Titan according to available free resources. This capability significantly reduces PanDA job wait time while improving Titan's utilization efficiency. This implementation was tested with a variety of Monte-Carlo workloads on Titan and is being tested on several other supercomputing platforms. Notice: This manuscript has been authored, by employees of Brookhaven Science Associates, LLC under Contract No. DE-AC02-98CH10886 with the U.S. Department of Energy. The publisher by accepting the manuscript for publication acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes.
Mapping the structure of the world economy.

PubMed

Lenzen, Manfred; Kanemoto, Keiichiro; Moran, Daniel; Geschke, Arne

2012-08-07

We have developed a new series of environmentally extended multi-region input-output (MRIO) tables with applications in carbon, water, and ecological footprinting, and Life-Cycle Assessment, as well as trend and key driver analyses. Such applications have recently been at the forefront of global policy debates, such as about assigning responsibility for emissions embodied in internationally traded products. The new time series was constructed using advanced parallelized supercomputing resources, and significantly advances the previous state of art because of four innovations. First, it is available as a continuous 20-year time series of MRIO tables. Second, it distinguishes 187 individual countries comprising more than 15,000 industry sectors, and hence offers unsurpassed detail. Third, it provides information just 1-3 years delayed therefore significantly improving timeliness. Fourth, it presents MRIO elements with accompanying standard deviations in order to allow users to understand the reliability of data. These advances will lead to material improvements in the capability of applications that rely on input-output tables. The timeliness of information means that analyses are more relevant to current policy questions. The continuity of the time series enables the robust identification of key trends and drivers of global environmental change. The high country and sector detail drastically improves the resolution of Life-Cycle Assessments. Finally, the availability of information on uncertainty allows policy-makers to quantitatively judge the level of confidence that can be placed in the results of analyses.
National Storage Laboratory: a collaborative research project

NASA Astrophysics Data System (ADS)

Coyne, Robert A.; Hulen, Harry; Watson, Richard W.

1993-01-01

The grand challenges of science and industry that are driving computing and communications have created corresponding challenges in information storage and retrieval. An industry-led collaborative project has been organized to investigate technology for storage systems that will be the future repositories of national information assets. Industry participants are IBM Federal Systems Company, Ampex Recording Systems Corporation, General Atomics DISCOS Division, IBM ADSTAR, Maximum Strategy Corporation, Network Systems Corporation, and Zitel Corporation. Industry members of the collaborative project are funding their own participation. Lawrence Livermore National Laboratory through its National Energy Research Supercomputer Center (NERSC) will participate in the project as the operational site and provider of applications. The expected result is the creation of a National Storage Laboratory to serve as a prototype and demonstration facility. It is expected that this prototype will represent a significant advance in the technology for distributed storage systems capable of handling gigabyte-class files at gigabit-per-second data rates. Specifically, the collaboration expects to make significant advances in hardware, software, and systems technology in four areas of need, (1) network-attached high performance storage; (2) multiple, dynamic, distributed storage hierarchies; (3) layered access to storage system services; and (4) storage system management.
Collaborative visual analytics of radio surveys in the Big Data era

NASA Astrophysics Data System (ADS)

Vohl, Dany; Fluke, Christopher J.; Hassan, Amr H.; Barnes, David G.; Kilborn, Virginia A.

2017-06-01

Radio survey datasets comprise an increasing number of individual observations stored as sets of multidimensional data. In large survey projects, astronomers commonly face limitations regarding: 1) interactive visual analytics of sufficiently large subsets of data; 2) synchronous and asynchronous collaboration; and 3) documentation of the discovery workflow. To support collaborative data inquiry, we present encube, a large-scale comparative visual analytics framework. encube can utilise advanced visualization environments such as the CAVE2 (a hybrid 2D and 3D virtual reality environment powered with a 100 Tflop/s GPU-based supercomputer and 84 million pixels) for collaborative analysis of large subsets of data from radio surveys. It can also run on standard desktops, providing a capable visual analytics experience across the display ecology. encube is composed of four primary units enabling compute-intensive processing, advanced visualisation, dynamic interaction, parallel data query, along with data management. Its modularity will make it simple to incorporate astronomical analysis packages and Virtual Observatory capabilities developed within our community. We discuss how encube builds a bridge between high-end display systems (such as CAVE2) and the classical desktop, preserving all traces of the work completed on either platform - allowing the research process to continue wherever you are.
Quantum.Ligand.Dock: protein-ligand docking with quantum entanglement refinement on a GPU system.

PubMed

Kantardjiev, Alexander A

2012-07-01

Quantum.Ligand.Dock (protein-ligand docking with graphic processing unit (GPU) quantum entanglement refinement on a GPU system) is an original modern method for in silico prediction of protein-ligand interactions via high-performance docking code. The main flavour of our approach is a combination of fast search with a special account for overlooked physical interactions. On the one hand, we take care of self-consistency and proton equilibria mutual effects of docking partners. On the other hand, Quantum.Ligand.Dock is the the only docking server offering such a subtle supplement to protein docking algorithms as quantum entanglement contributions. The motivation for development and proposition of the method to the community hinges upon two arguments-the fundamental importance of quantum entanglement contribution in molecular interaction and the realistic possibility to implement it by the availability of supercomputing power. The implementation of sophisticated quantum methods is made possible by parallelization at several bottlenecks on a GPU supercomputer. The high-performance implementation will be of use for large-scale virtual screening projects, structural bioinformatics, systems biology and fundamental research in understanding protein-ligand recognition. The design of the interface is focused on feasibility and ease of use. Protein and ligand molecule structures are supposed to be submitted as atomic coordinate files in PDB format. A customization section is offered for addition of user-specified charges, extra ionogenic groups with intrinsic pK(a) values or fixed ions. Final predicted complexes are ranked according to obtained scores and provided in PDB format as well as interactive visualization in a molecular viewer. Quantum.Ligand.Dock server can be accessed at http://87.116.85.141/LigandDock.html.
Statistical correlations and risk analyses techniques for a diving dual phase bubble model and data bank using massively parallel supercomputers.

PubMed

Wienke, B R; O'Leary, T R

2008-05-01

Linking model and data, we detail the LANL diving reduced gradient bubble model (RGBM), dynamical principles, and correlation with data in the LANL Data Bank. Table, profile, and meter risks are obtained from likelihood analysis and quoted for air, nitrox, helitrox no-decompression time limits, repetitive dive tables, and selected mixed gas and repetitive profiles. Application analyses include the EXPLORER decompression meter algorithm, NAUI tables, University of Wisconsin Seafood Diver tables, comparative NAUI, PADI, Oceanic NDLs and repetitive dives, comparative nitrogen and helium mixed gas risks, USS Perry deep rebreather (RB) exploration dive,world record open circuit (OC) dive, and Woodville Karst Plain Project (WKPP) extreme cave exploration profiles. The algorithm has seen extensive and utilitarian application in mixed gas diving, both in recreational and technical sectors, and forms the bases forreleased tables and decompression meters used by scientific, commercial, and research divers. The LANL Data Bank is described, and the methods used to deduce risk are detailed. Risk functions for dissolved gas and bubbles are summarized. Parameters that can be used to estimate profile risk are tallied. To fit data, a modified Levenberg-Marquardt routine is employed with L2 error norm. Appendices sketch the numerical methods, and list reports from field testing for (real) mixed gas diving. A Monte Carlo-like sampling scheme for fast numerical analysis of the data is also detailed, as a coupled variance reduction technique and additional check on the canonical approach to estimating diving risk. The method suggests alternatives to the canonical approach. This work represents a first time correlation effort linking a dynamical bubble model with deep stop data. Supercomputing resources are requisite to connect model and data in application.
Hot Technology, Cool Science (LBNL Science at the Theater)

ScienceCinema

Fowler, John

2018-06-08

Great innovations start with bold ideas. Learn how Berkeley Lab scientists are devising practical solutions to everything from global warming to how you get to work. On May 11, 2009, five Berkeley Lab scientists participated in a roundtable dicussion moderated by KTVU's John Fowler on their leading-edge research. This "Science at the Theater" event, held at the Berkeley Repertory Theatre, featured technologies such as cool roofs, battery-driven transportation, a pocket-sized DNA probe, green supercomputing, and a noncontact method for restoring damaged and fragile mechanical recordings.
Supercomputer modeling of flow past hypersonic flight vehicles

NASA Astrophysics Data System (ADS)

Ermakov, M. K.; Kryukov, I. A.

2017-02-01

A software platform for MPI-based parallel solution of the Navier-Stokes (Euler) equations for viscous heat-conductive compressible perfect gas on 3-D unstructured meshes is developed. The discretization and solution of the Navier-Stokes equations are constructed on generalized S.K. Godunov’s method and the second order approximation in space and time. Developed software platform allows to carry out effectively flow past hypersonic flight vehicles simulations for the Mach numbers 6 and higher, and numerical meshes with up to 1 billion numerical cells and with up to 128 processors.
Steady viscous flow past a circular cylinder

NASA Technical Reports Server (NTRS)

Fornberg, B.

1984-01-01

Viscous flow past a circular cylinder becomes unstable around Reynolds number Re = 40. With a numerical technique based on Newton's method and made possible by the use of a supercomputer, steady (but unstable) solutions have been calculated up to Re = 400. It is found that the wake continues to grow in length approximately linearly with Re. However, in conflict with available asymptotic predictions, the width starts to increase very rapidly around Re = 300. All numerical calculations have been performed on the CDC CYBER 205 at the CDC Service Center in Arden Hills, Minnesota.
CFD lends the government a hand

NASA Technical Reports Server (NTRS)

Lekoudis, Spiro; Singleton, Robert E.; Mehta, Unmeel B.

1992-01-01

The present survey of important and novel CFD applications being developed and implemented by U.S. Government contractors gives attention to naval vessel flow-modeling, Army ballistic and rotary wing aerodynamics, and NASA hypersonic vehicle related applications of CFD. CFD-generated knowledge of numerical algorithms, fluid motion, and supercomputer use is being incorporated into such additional areas as computational electromagnetics and acoustics. Attention is presently given to CFD methods' development status in such fields as submarine boundary layers, hypersonic kinetic energy projectile shock structures, helicopter main rotor tip flows, and National Aerospace Plane aerothermodynamics.
The center for causal discovery of biomedical knowledge from big data

PubMed Central

Bahar, Ivet; Becich, Michael J; Benos, Panayiotis V; Berg, Jeremy; Espino, Jeremy U; Glymour, Clark; Jacobson, Rebecca Crowley; Kienholz, Michelle; Lee, Adrian V; Lu, Xinghua; Scheines, Richard

2015-01-01

The Big Data to Knowledge (BD2K) Center for Causal Discovery is developing and disseminating an integrated set of open source tools that support causal modeling and discovery of biomedical knowledge from large and complex biomedical datasets. The Center integrates teams of biomedical and data scientists focused on the refinement of existing and the development of new constraint-based and Bayesian algorithms based on causal Bayesian networks, the optimization of software for efficient operation in a supercomputing environment, and the testing of algorithms and software developed using real data from 3 representative driving biomedical projects: cancer driver mutations, lung disease, and the functional connectome of the human brain. Associated training activities provide both biomedical and data scientists with the knowledge and skills needed to apply and extend these tools. Collaborative activities with the BD2K Consortium further advance causal discovery tools and integrate tools and resources developed by other centers. PMID:26138794
High End Computing Technologies for Earth Science Applications: Trends, Challenges, and Innovations

NASA Technical Reports Server (NTRS)

Parks, John (Technical Monitor); Biswas, Rupak; Yan, Jerry C.; Brooks, Walter F.; Sterling, Thomas L.

2003-01-01

Earth science applications of the future will stress the capabilities of even the highest performance supercomputers in the areas of raw compute power, mass storage management, and software environments. These NASA mission critical problems demand usable multi-petaflops and exabyte-scale systems to fully realize their science goals. With an exciting vision of the technologies needed, NASA has established a comprehensive program of advanced research in computer architecture, software tools, and device technology to ensure that, in partnership with US industry, it can meet these demanding requirements with reliable, cost effective, and usable ultra-scale systems. NASA will exploit, explore, and influence emerging high end computing architectures and technologies to accelerate the next generation of engineering, operations, and discovery processes for NASA Enterprises. This article captures this vision and describes the concepts, accomplishments, and the potential payoff of the key thrusts that will help meet the computational challenges in Earth science applications.
DOE unveils climate model in advance of global test

NASA Astrophysics Data System (ADS)

Popkin, Gabriel

2018-05-01

The world's growing collection of climate models has a high-profile new entry. Last week, after nearly 4 years of work, the U.S. Department of Energy (DOE) released computer code and initial results from an ambitious effort to simulate the Earth system. The new model is tailored to run on future supercomputers and designed to forecast not just how climate will change, but also how those changes might stress energy infrastructure. Results from an upcoming comparison of global models may show how well the new entrant works. But so far it is getting a mixed reception, with some questioning the need for another model and others saying the $80 million effort has yet to improve predictions of the future climate. Even the project's chief scientist, Ruby Leung of the Pacific Northwest National Laboratory in Richland, Washington, acknowledges that the model is not yet a leader.
A personal perspective on modelling the climate system.

PubMed

Palmer, T N

2016-04-01

Given their increasing relevance for society, I suggest that the climate science community itself does not treat the development of error-free ab initio models of the climate system with sufficient urgency. With increasing levels of difficulty, I discuss a number of proposals for speeding up such development. Firstly, I believe that climate science should make better use of the pool of post-PhD talent in mathematics and physics, for developing next-generation climate models. Secondly, I believe there is more scope for the development of modelling systems which link weather and climate prediction more seamlessly. Finally, here in Europe, I call for a new European Programme on Extreme Computing and Climate to advance our ability to simulate climate extremes, and understand the drivers of such extremes. A key goal for such a programme is the development of a 1 km global climate system model to run on the first exascale supercomputers in the early 2020s.
Using High Performance Computing to Examine the Processes of Neurogenesis Underlying Pattern Separation and Completion of Episodic Information.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aimone, James Bradley; Bernard, Michael Lewis; Vineyard, Craig Michael

2014-10-01

Adult neurogenesis in the hippocampus region of the brain is a neurobiological process that is believed to contribute to the brain's advanced abilities in complex pattern recognition and cognition. Here, we describe how realistic scale simulations of the neurogenesis process can offer both a unique perspective on the biological relevance of this process and confer computational insights that are suggestive of novel machine learning techniques. First, supercomputer based scaling studies of the neurogenesis process demonstrate how a small fraction of adult-born neurons have a uniquely larger impact in biologically realistic scaled networks. Second, we describe a novel technical approach bymore » which the information content of ensembles of neurons can be estimated. Finally, we illustrate several examples of broader algorithmic impact of neurogenesis, including both extending existing machine learning approaches and novel approaches for intelligent sensing.« less
Progress on the Development of the hPIC Particle-in-Cell Code

NASA Astrophysics Data System (ADS)

Dart, Cameron; Hayes, Alyssa; Khaziev, Rinat; Marcinko, Stephen; Curreli, Davide; Laboratory of Computational Plasma Physics Team

2017-10-01

Advancements were made in the development of the kinetic-kinetic electrostatic Particle-in-Cell code, hPIC, designed for large-scale simulation of the Plasma-Material Interface. hPIC achieved a weak scaling efficiency of 87% using the Algebraic Multigrid Solver BoomerAMG from the PETSc library on more than 64,000 cores of the Blue Waters supercomputer at the University of Illinois at Urbana-Champaign. The code successfully simulates two-stream instability and a volume of plasma over several square centimeters of surface extending out to the presheath in kinetic-kinetic mode. Results from a parametric study of the plasma sheath in strongly magnetized conditions will be presented, as well as a detailed analysis of the plasma sheath structure at grazing magnetic angles. The distribution function and its moments will be reported for plasma species in the simulation domain and at the material surface for plasma sheath simulations. Membership Pending.
Quantum information processing with superconducting circuits: a review.

PubMed

Wendin, G

2017-10-01

During the last ten years, superconducting circuits have passed from being interesting physical devices to becoming contenders for near-future useful and scalable quantum information processing (QIP). Advanced quantum simulation experiments have been shown with up to nine qubits, while a demonstration of quantum supremacy with fifty qubits is anticipated in just a few years. Quantum supremacy means that the quantum system can no longer be simulated by the most powerful classical supercomputers. Integrated classical-quantum computing systems are already emerging that can be used for software development and experimentation, even via web interfaces. Therefore, the time is ripe for describing some of the recent development of superconducting devices, systems and applications. As such, the discussion of superconducting qubits and circuits is limited to devices that are proven useful for current or near future applications. Consequently, the centre of interest is the practical applications of QIP, such as computation and simulation in Physics and Chemistry.

Quantum information processing with superconducting circuits: a review

NASA Astrophysics Data System (ADS)

Wendin, G.

2017-10-01

During the last ten years, superconducting circuits have passed from being interesting physical devices to becoming contenders for near-future useful and scalable quantum information processing (QIP). Advanced quantum simulation experiments have been shown with up to nine qubits, while a demonstration of quantum supremacy with fifty qubits is anticipated in just a few years. Quantum supremacy means that the quantum system can no longer be simulated by the most powerful classical supercomputers. Integrated classical-quantum computing systems are already emerging that can be used for software development and experimentation, even via web interfaces. Therefore, the time is ripe for describing some of the recent development of superconducting devices, systems and applications. As such, the discussion of superconducting qubits and circuits is limited to devices that are proven useful for current or near future applications. Consequently, the centre of interest is the practical applications of QIP, such as computation and simulation in Physics and Chemistry.
Application of supercomputers to computational aerodynamics

NASA Technical Reports Server (NTRS)

Peterson, V. L.

1984-01-01

Computers are playing an increasingly important role in the field of aerodynamics such that they now serve as a major complement to wind tunnels in aerospace research and development. Factors pacing advances in computational aerodynamics are identified, including the amount of computational power required to take the next major step in the discipline. Example results obtained from the successively refined forms of the governing equations are discussed, both in the context of levels of computer power required and the degree to which they either further the frontiers of research or apply to problems of practical importance. Finally, the Numerical Aerodynamic Simulation (NAS) Program - with its 1988 target of achieving a sustained computational rate of 1 billion floating point operations per second and operating with a memory of 240 million words - is discussed in terms of its goals and its projected effect on the future of computational aerodynamics.
Automation of Data Traffic Control on DSM Architecture

NASA Technical Reports Server (NTRS)

Frumkin, Michael; Jin, Hao-Qiang; Yan, Jerry

2001-01-01

The design of distributed shared memory (DSM) computers liberates users from the duty to distribute data across processors and allows for the incremental development of parallel programs using, for example, OpenMP or Java threads. DSM architecture greatly simplifies the development of parallel programs having good performance on a few processors. However, to achieve a good program scalability on DSM computers requires that the user understand data flow in the application and use various techniques to avoid data traffic congestions. In this paper we discuss a number of such techniques, including data blocking, data placement, data transposition and page size control and evaluate their efficiency on the NAS (NASA Advanced Supercomputing) Parallel Benchmarks. We also present a tool which automates the detection of constructs causing data congestions in Fortran array oriented codes and advises the user on code transformations for improving data traffic in the application.
ALCF Data Science Program: Productive Data-centric Supercomputing

NASA Astrophysics Data System (ADS)

Romero, Nichols; Vishwanath, Venkatram

The ALCF Data Science Program (ADSP) is targeted at big data science problems that require leadership computing resources. The goal of the program is to explore and improve a variety of computational methods that will enable data-driven discoveries across all scientific disciplines. The projects will focus on data science techniques covering a wide area of discovery including but not limited to uncertainty quantification, statistics, machine learning, deep learning, databases, pattern recognition, image processing, graph analytics, data mining, real-time data analysis, and complex and interactive workflows. Project teams will be among the first to access Theta, ALCFs forthcoming 8.5 petaflops Intel/Cray system. The program will transition to the 200 petaflop/s Aurora supercomputing system when it becomes available. In 2016, four projects have been selected to kick off the ADSP. The selected projects span experimental and computational sciences and range from modeling the brain to discovering new materials for solar-powered windows to simulating collision events at the Large Hadron Collider (LHC). The program will have a regular call for proposals with the next call expected in Spring 2017.http://www.alcf.anl.gov/alcf-data-science-program This research used resources of the ALCF, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357.
PENTACLE: Parallelized particle-particle particle-tree code for planet formation

NASA Astrophysics Data System (ADS)

Iwasawa, Masaki; Oshino, Shoichi; Fujii, Michiko S.; Hori, Yasunori

2017-10-01

We have newly developed a parallelized particle-particle particle-tree code for planet formation, PENTACLE, which is a parallelized hybrid N-body integrator executed on a CPU-based (super)computer. PENTACLE uses a fourth-order Hermite algorithm to calculate gravitational interactions between particles within a cut-off radius and a Barnes-Hut tree method for gravity from particles beyond. It also implements an open-source library designed for full automatic parallelization of particle simulations, FDPS (Framework for Developing Particle Simulator), to parallelize a Barnes-Hut tree algorithm for a memory-distributed supercomputer. These allow us to handle 1-10 million particles in a high-resolution N-body simulation on CPU clusters for collisional dynamics, including physical collisions in a planetesimal disc. In this paper, we show the performance and the accuracy of PENTACLE in terms of \\tilde{R}_cut and a time-step Δt. It turns out that the accuracy of a hybrid N-body simulation is controlled through Δ t / \\tilde{R}_cut and Δ t / \\tilde{R}_cut ˜ 0.1 is necessary to simulate accurately the accretion process of a planet for ≥106 yr. For all those interested in large-scale particle simulations, PENTACLE, customized for planet formation, will be freely available from https://github.com/PENTACLE-Team/PENTACLE under the MIT licence.
The ASCI Network for SC '99: A Step on the Path to a 100 Gigabit Per Second Supercomputing Network

DOE Office of Scientific and Technical Information (OSTI.GOV)

PRATT,THOMAS J.; TARMAN,THOMAS D.; MARTINEZ,LUIS M.

2000-07-24

This document highlights the Discom{sup 2}'s Distance computing and communication team activities at the 1999 Supercomputing conference in Portland, Oregon. This conference is sponsored by the IEEE and ACM. Sandia, Lawrence Livermore and Los Alamos National laboratories have participated in this conference for eleven years. For the last four years the three laboratories have come together at the conference under the DOE's ASCI, Accelerated Strategic Computing Initiatives rubric. Communication support for the ASCI exhibit is provided by the ASCI DISCOM{sup 2} project. The DISCOM{sup 2} communication team uses this forum to demonstrate and focus communication and networking developments within themore » community. At SC 99, DISCOM built a prototype of the next generation ASCI network demonstrated remote clustering techniques, demonstrated the capabilities of the emerging Terabit Routers products, demonstrated the latest technologies for delivering visualization data to the scientific users, and demonstrated the latest in encryption methods including IP VPN technologies and ATM encryption research. The authors also coordinated the other production networking activities within the booth and between their demonstration partners on the exhibit floor. This paper documents those accomplishments, discusses the details of their implementation, and describes how these demonstrations support Sandia's overall strategies in ASCI networking.« less
Watson will see you now: a supercomputer to help clinicians make informed treatment decisions.

PubMed

Doyle-Lindrud, Susan

2015-02-01

IBM has collaborated with several cancer care providers to develop and train the IBM supercomputer Watson to help clinicians make informed treatment decisions. When a patient is seen in clinic, the oncologist can input all of the clinical information into the computer system. Watson will then review all of the data and recommend treatment options based on the latest evidence and guidelines. Once the oncologist makes the treatment decision, this information can be sent directly to the insurance company for approval. Watson has the ability to standardize care and accelerate the approval process, a benefit to the healthcare provider and the patient.
Particle simulation on heterogeneous distributed supercomputers

NASA Technical Reports Server (NTRS)

Becker, Jeffrey C.; Dagum, Leonardo

1993-01-01

We describe the implementation and performance of a three dimensional particle simulation distributed between a Thinking Machines CM-2 and a Cray Y-MP. These are connected by a combination of two high-speed networks: a high-performance parallel interface (HIPPI) and an optical network (UltraNet). This is the first application to use this configuration at NASA Ames Research Center. We describe our experience implementing and using the application and report the results of several timing measurements. We show that the distribution of applications across disparate supercomputing platforms is feasible and has reasonable performance. In addition, several practical aspects of the computing environment are discussed.
The transition of a real-time single-rotor helicopter simulation program to a supercomputer

NASA Technical Reports Server (NTRS)

Martinez, Debbie

1995-01-01

This report presents the conversion effort and results of a real-time flight simulation application transition to a CONVEX supercomputer. Enclosed is a detailed description of the conversion process and a brief description of the Langley Research Center's (LaRC) flight simulation application program structure. Currently, this simulation program may be configured to represent Sikorsky S-61 helicopter (a five-blade, single-rotor, commercial passenger-type helicopter) or an Army Cobra helicopter (either the AH-1 G or AH-1 S model). This report refers to the Sikorsky S-61 simulation program since it is the most frequently used configuration.
Accelerating Virtual High-Throughput Ligand Docking: current technology and case study on a petascale supercomputer.

PubMed

Ellingson, Sally R; Dakshanamurthy, Sivanesan; Brown, Milton; Smith, Jeremy C; Baudry, Jerome

2014-04-25

In this paper we give the current state of high-throughput virtual screening. We describe a case study of using a task-parallel MPI (Message Passing Interface) version of Autodock4 [1], [2] to run a virtual high-throughput screen of one-million compounds on the Jaguar Cray XK6 Supercomputer at Oak Ridge National Laboratory. We include a description of scripts developed to increase the efficiency of the predocking file preparation and postdocking analysis. A detailed tutorial, scripts, and source code for this MPI version of Autodock4 are available online at http://www.bio.utk.edu/baudrylab/autodockmpi.htm.
Sequence search on a supercomputer.

PubMed

Gotoh, O; Tagashira, Y

1986-01-10

A set of programs was developed for searching nucleic acid and protein sequence data bases for sequences similar to a given sequence. The programs, written in FORTRAN 77, were optimized for vector processing on a Hitachi S810-20 supercomputer. A search of a 500-residue protein sequence against the entire PIR data base Ver. 1.0 (1) (0.5 M residues) is carried out in a CPU time of 45 sec. About 4 min is required for an exhaustive search of a 1500-base nucleotide sequence against all mammalian sequences (1.2M bases) in Genbank Ver. 29.0. The CPU time is reduced to about a quarter with a faster version.
Science & Technology Review November 2006

DOE Office of Scientific and Technical Information (OSTI.GOV)

Radousky, H

This months issue has the following articles: (1) Expanded Supercomputing Maximizes Scientific Discovery--Commentary by Dona Crawford; (2) Thunder's Power Delivers Breakthrough Science--Livermore's Thunder supercomputer allows researchers to model systems at scales never before possible. (3) Extracting Key Content from Images--A new system called the Image Content Engine is helping analysts find significant but hard-to-recognize details in overhead images. (4) Got Oxygen?--Oxygen, especially oxygen metabolism, was key to evolution, and a Livermore project helps find out why. (5) A Shocking New Form of Laserlike Light--According to research at Livermore, smashing a crystal with a shock wave can result in coherent light.
Optimal Full Information Synthesis for Flexible Structures Implemented on Cray Supercomputers

NASA Technical Reports Server (NTRS)

Lind, Rick; Balas, Gary J.

1995-01-01

This paper considers an algorithm for synthesis of optimal controllers for full information feedback. The synthesis procedure reduces to a single linear matrix inequality which may be solved via established convex optimization algorithms. The computational cost of the optimization is investigated. It is demonstrated the problem dimension and corresponding matrices can become large for practical engineering problems. This algorithm represents a process that is impractical for standard workstations for large order systems. A flexible structure is presented as a design example. Control synthesis requires several days on a workstation but may be solved in a reasonable amount of time using a Cray supercomputer.
SiGN-SSM: open source parallel software for estimating gene networks with state space models.

PubMed

Tamada, Yoshinori; Yamaguchi, Rui; Imoto, Seiya; Hirose, Osamu; Yoshida, Ryo; Nagasaki, Masao; Miyano, Satoru

2011-04-15

SiGN-SSM is an open-source gene network estimation software able to run in parallel on PCs and massively parallel supercomputers. The software estimates a state space model (SSM), that is a statistical dynamic model suitable for analyzing short time and/or replicated time series gene expression profiles. SiGN-SSM implements a novel parameter constraint effective to stabilize the estimated models. Also, by using a supercomputer, it is able to determine the gene network structure by a statistical permutation test in a practical time. SiGN-SSM is applicable not only to analyzing temporal regulatory dependencies between genes, but also to extracting the differentially regulated genes from time series expression profiles. SiGN-SSM is distributed under GNU Affero General Public Licence (GNU AGPL) version 3 and can be downloaded at http://sign.hgc.jp/signssm/. The pre-compiled binaries for some architectures are available in addition to the source code. The pre-installed binaries are also available on the Human Genome Center supercomputer system. The online manual and the supplementary information of SiGN-SSM is available on our web site. tamada@ims.u-tokyo.ac.jp.
Transferring ecosystem simulation codes to supercomputers

NASA Technical Reports Server (NTRS)

Skiles, J. W.; Schulbach, C. H.

1995-01-01

Many ecosystem simulation computer codes have been developed in the last twenty-five years. This development took place initially on main-frame computers, then mini-computers, and more recently, on micro-computers and workstations. Supercomputing platforms (both parallel and distributed systems) have been largely unused, however, because of the perceived difficulty in accessing and using the machines. Also, significant differences in the system architectures of sequential, scalar computers and parallel and/or vector supercomputers must be considered. We have transferred a grassland simulation model (developed on a VAX) to a Cray Y-MP/C90. We describe porting the model to the Cray and the changes we made to exploit the parallelism in the application and improve code execution. The Cray executed the model 30 times faster than the VAX and 10 times faster than a Unix workstation. We achieved an additional speedup of 30 percent by using the compiler's vectoring and 'in-line' capabilities. The code runs at only about 5 percent of the Cray's peak speed because it ineffectively uses the vector and parallel processing capabilities of the Cray. We expect that by restructuring the code, it could execute an additional six to ten times faster.
Computing with Beowulf

NASA Technical Reports Server (NTRS)

Cohen, Jarrett

1999-01-01

Parallel computers built out of mass-market parts are cost-effectively performing data processing and simulation tasks. The Supercomputing (now known as "SC") series of conferences celebrated its 10th anniversary last November. While vendors have come and gone, the dominant paradigm for tackling big problems still is a shared-resource, commercial supercomputer. Growing numbers of users needing a cheaper or dedicated-access alternative are building their own supercomputers out of mass-market parts. Such machines are generally called Beowulf-class systems after the 11th century epic. This modern-day Beowulf story began in 1994 at NASA's Goddard Space Flight Center. A laboratory for the Earth and space sciences, computing managers there threw down a gauntlet to develop a $50,000 gigaFLOPS workstation for processing satellite data sets. Soon, Thomas Sterling and Don Becker were working on the Beowulf concept at the University Space Research Association (USRA)-run Center of Excellence in Space Data and Information Sciences (CESDIS). Beowulf clusters mix three primary ingredients: commodity personal computers or workstations, low-cost Ethernet networks, and the open-source Linux operating system. One of the larger Beowulfs is Goddard's Highly-parallel Integrated Virtual Environment, or HIVE for short.
Compute Server Performance Results

NASA Technical Reports Server (NTRS)

Stockdale, I. E.; Barton, John; Woodrow, Thomas (Technical Monitor)

1994-01-01

Parallel-vector supercomputers have been the workhorses of high performance computing. As expectations of future computing needs have risen faster than projected vector supercomputer performance, much work has been done investigating the feasibility of using Massively Parallel Processor systems as supercomputers. An even more recent development is the availability of high performance workstations which have the potential, when clustered together, to replace parallel-vector systems. We present a systematic comparison of floating point performance and price-performance for various compute server systems. A suite of highly vectorized programs was run on systems including traditional vector systems such as the Cray C90, and RISC workstations such as the IBM RS/6000 590 and the SGI R8000. The C90 system delivers 460 million floating point operations per second (FLOPS), the highest single processor rate of any vendor. However, if the price-performance ration (PPR) is considered to be most important, then the IBM and SGI processors are superior to the C90 processors. Even without code tuning, the IBM and SGI PPR's of 260 and 220 FLOPS per dollar exceed the C90 PPR of 160 FLOPS per dollar when running our highly vectorized suite,
1993 Gordon Bell Prize Winners

NASA Technical Reports Server (NTRS)

Karp, Alan H.; Simon, Horst; Heller, Don; Cooper, D. M. (Technical Monitor)

1994-01-01

The Gordon Bell Prize recognizes significant achievements in the application of supercomputers to scientific and engineering problems. In 1993, finalists were named for work in three categories: (1) Performance, which recognizes those who solved a real problem in the quickest elapsed time. (2) Price/performance, which encourages the development of cost-effective supercomputing. (3) Compiler-generated speedup, which measures how well compiler writers are facilitating the programming of parallel processors. The winners were announced November 17 at the Supercomputing 93 conference in Portland, Oregon. Gordon Bell, an independent consultant in Los Altos, California, is sponsoring $2,000 in prizes each year for 10 years to promote practical parallel processing research. This is the sixth year of the prize, which Computer administers. Something unprecedented in Gordon Bell Prize competition occurred this year: A computer manufacturer was singled out for recognition. Nine entries reporting results obtained on the Cray C90 were received, seven of the submissions orchestrated by Cray Research. Although none of these entries showed sufficiently high performance to win outright, the judges were impressed by the breadth of applications that ran well on this machine, all nine running at more than a third of the peak performance of the machine.
Trinity to Trinity 1945-2015

DOE Office of Scientific and Technical Information (OSTI.GOV)

Moniz, Ernest; Carr, Alan; Bethe, Hans

The Trinity Test of July 16, 1945 was the first full-scale, real-world test of a nuclear weapon; with the new Trinity supercomputer Los Alamos National Laboratory's goal is to do this virtually, in 3D. Trinity was the culmination of a fantastic effort of groundbreaking science and engineering by hundreds of men and women at Los Alamos and other Manhattan Project sites. It took them less than two years to change the world. The Laboratory is marking the 70th anniversary of the Trinity Test because it not only ushered in the Nuclear Age, but with it the origin of today’s advancedmore » supercomputing. We live in the Age of Supercomputers due in large part to nuclear weapons science here at Los Alamos. National security science, and nuclear weapons science in particular, at Los Alamos National Laboratory have provided a key motivation for the evolution of large-scale scientific computing. Beginning with the Manhattan Project there has been a constant stream of increasingly significant, complex problems in nuclear weapons science whose timely solutions demand larger and faster computers. The relationship between national security science at Los Alamos and the evolution of computing is one of interdependence.« less
KNBD: A Remote Kernel Block Server for Linux

NASA Technical Reports Server (NTRS)

Becker, Jeff

1999-01-01

I am developing a prototype of a Linux remote disk block server whose purpose is to serve as a lower level component of a parallel file system. Parallel file systems are an important component of high performance supercomputers and clusters. Although supercomputer vendors such as SGI and IBM have their own custom solutions, there has been a void and hence a demand for such a system on Beowulf-type PC Clusters. Recently, the Parallel Virtual File System (PVFS) project at Clemson University has begun to address this need (1). Although their system provides much of the functionality of (and indeed was inspired by) the equivalent file systems in the commercial supercomputer market, their system is all in user-space. Migrating their 10 services to the kernel could provide a performance boost, by obviating the need for expensive system calls. Thanks to Pavel Machek, the Linux kernel has provided the network block device (2) with kernels 2.1.101 and later. You can configure this block device to redirect reads and writes to a remote machine's disk. This can be used as a building block for constructing a striped file system across several nodes.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.