Scientific Visualization in High Speed Network Environments
NASA Technical Reports Server (NTRS)
Vaziri, Arsi; Kutler, Paul (Technical Monitor)
1997-01-01
In several cases, new visualization techniques have vastly increased the researcher's ability to analyze and comprehend data. Similarly, the role of networks in providing an efficient supercomputing environment has become more critical and continues to grow at a faster rate than the increase in the processing capabilities of supercomputers. A close relationship between scientific visualization and high-speed networks in providing an important link to support efficient supercomputing is identified. The two technologies are driven by the increasing complexity and volume of supercomputer data. The interaction of scientific visualization and high-speed networks in a Computational Fluid Dynamics simulation/visualization environment is presented. Current capabilities supported by high-speed networks, supercomputers, and high-performance graphics workstations at the Numerical Aerodynamic Simulation Facility (NAS) at NASA Ames Research Center are described. Applied research in providing a supercomputer visualization environment to support future computational requirements is summarized.
Supercomputer networking for space science applications
NASA Technical Reports Server (NTRS)
Edelson, B. I.
1992-01-01
The initial design of a supercomputer network topology including the design of the communications nodes along with the communications interface hardware and software is covered. Several space science applications that are proposed experiments by GSFC and JPL for a supercomputer network using the NASA ACTS satellite are also reported.
Automatic discovery of the communication network topology for building a supercomputer model
NASA Astrophysics Data System (ADS)
Sobolev, Sergey; Stefanov, Konstantin; Voevodin, Vadim
2016-10-01
The Research Computing Center of Lomonosov Moscow State University is developing the Octotron software suite for automatic monitoring and mitigation of emergency situations in supercomputers so as to maximize hardware reliability. The suite is based on a software model of the supercomputer. The model uses a graph to describe the computing system components and their interconnections. One of the most complex components of a supercomputer that needs to be included in the model is its communication network. This work describes the proposed approach for automatically discovering the Ethernet communication network topology in a supercomputer and its description in terms of the Octotron model. This suite automatically detects computing nodes and switches, collects information about them and identifies their interconnections. The application of this approach is demonstrated on the "Lomonosov" and "Lomonosov-2" supercomputers.
NASA Astrophysics Data System (ADS)
Fukazawa, K.; Walker, R. J.; Kimura, T.; Tsuchiya, F.; Murakami, G.; Kita, H.; Tao, C.; Murata, K. T.
2016-12-01
Planetary magnetospheres are very large, while phenomena within them occur on meso- and micro-scales. These scales range from 10s of planetary radii to kilometers. To understand dynamics in these multi-scale systems, numerical simulations have been performed by using supercomputer systems. We have studied the magnetospheres of Earth, Jupiter and Saturn by using 3-dimensional magnetohydrodynamic (MHD) simulations for a long time; however, we have not obtained the phenomena near the limits of the MHD approximation. In particular, we have not studied meso-scale phenomena that can be addressed by using MHD. Recently we performed our MHD simulation of Earth's magnetosphere by using the K-computer, which is the first 10PFlops supercomputer, and obtained multi-scale flow vorticity for both northward and southward IMF. Furthermore, we have access to supercomputer systems which have Xeon, SPARC64, and vector-type CPUs and can compare simulation results between the different systems. Finally, we have compared the results of our parameter survey of the magnetosphere with observations from the HISAKI spacecraft. We have encountered a number of difficulties in effectively using the latest supercomputer systems. First, the size of simulation output increases greatly. Now a simulation group produces over 1PB of output. Storage and analysis of this much data is difficult. The traditional way to analyze simulation results is to move the results to the investigator's home computer. This takes over three months using an end-to-end 10Gbps network. In reality, there are problems at some nodes, such as firewalls, that can increase the transfer time to over one year. Another issue is post-processing. It is hard to treat a few TB of simulation output due to the memory limitations of a post-processing computer.
To overcome these issues, we have developed and introduced parallel network storage, a highly efficient network protocol and CUI-based visualization tools. In this study, we will show the latest simulation results using the petascale supercomputer and problems from the use of these supercomputer systems.
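The transfer-time claim in the abstract above can be checked with simple arithmetic. The sketch below is only an illustration: it assumes a decimal petabyte and a 90-day reading of "three months", neither of which the abstract specifies, and the function name is hypothetical. At the full 10 Gbps line rate, 1 PB would move in roughly nine days, so a three-month transfer suggests a sustained rate nearer 1 Gbps; that interpretation is an assumption, not a statement from the authors.

```python
def transfer_days(size_bytes: float, rate_bits_per_s: float) -> float:
    """Days needed to move size_bytes at a sustained rate in bits per second."""
    seconds = size_bytes * 8 / rate_bits_per_s
    return seconds / 86_400


# Assumption: decimal petabyte; the abstract does not say PB vs PiB.
PB = 10**15

# Time at the full nominal line rate of 10 Gbps.
ideal_days = transfer_days(PB, 10e9)

# Sustained rate that would stretch the same transfer to 90 days
# (assumption: "three months" is read as 90 days).
effective_bps = PB * 8 / (90 * 86_400)

print(f"1 PB at 10 Gbps line rate: {ideal_days:.1f} days")
print(f"Rate implied by a 90-day transfer: {effective_bps / 1e9:.2f} Gbps")
```

The gap between the two figures is one way to see why the authors turn to parallel storage and a more efficient protocol rather than relying on nominal link speed.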
NASA Technical Reports Server (NTRS)
Babrauckas, Theresa
2000-01-01
The Affordable High Performance Computing (AHPC) project demonstrated that high-performance computing based on a distributed network of computer workstations is a cost-effective alternative to vector supercomputers for running CPU and memory intensive design and analysis tools. The AHPC project created an integrated system called a Network Supercomputer. By connecting computer workstations through a network and utilizing the workstations when they are idle, the resulting distributed-workstation environment has the same performance and reliability levels as the Cray C90 vector Supercomputer at less than 25 percent of the C90 cost. In fact, the cost comparison between a Cray C90 Supercomputer and Sun workstations showed that the number of distributed networked workstations equivalent to a C90 costs approximately 8 percent of the C90.
ERIC Educational Resources Information Center
General Accounting Office, Washington, DC. Information Management and Technology Div.
This report was prepared in response to a request for information on supercomputers and high-speed networks from the Senate Committee on Commerce, Science, and Transportation, and the House Committee on Science, Space, and Technology. The following information was requested: (1) examples of how various industries are using supercomputers to…
Requirements for a network storage service
NASA Technical Reports Server (NTRS)
Kelly, Suzanne M.; Haynes, Rena A.
1991-01-01
Sandia National Laboratories provides a high performance classified computer network as a core capability in support of its mission of nuclear weapons design and engineering, physical sciences research, and energy research and development. The network, locally known as the Internal Secure Network (ISN), comprises multiple distributed local area networks (LAN's) residing in New Mexico and California. The TCP/IP protocol suite is used for inter-node communications. Scientific workstations and mid-range computers, running UNIX-based operating systems, compose most LAN's. One LAN, operated by the Sandia Corporate Computing Directorate, is a general purpose resource providing a supercomputer and a file server to the entire ISN. The current file server on the supercomputer LAN is an implementation of the Common File System (CFS). Subsequent to the design of the ISN, Sandia reviewed its mass storage requirements and chose to enter into a competitive procurement to replace the existing file server with one more adaptable to a UNIX/TCP/IP environment. The requirements study for the network was the starting point for the requirements study for the new file server. The file server is called the Network Storage Service (NSS) and its requirements are described. An application or functional description of the NSS is given. The final section adds performance, capacity, and access constraints to the requirements.
Requirements for a network storage service
NASA Technical Reports Server (NTRS)
Kelly, Suzanne M.; Haynes, Rena A.
1992-01-01
Sandia National Laboratories provides a high performance classified computer network as a core capability in support of its mission of nuclear weapons design and engineering, physical sciences research, and energy research and development. The network, locally known as the Internal Secure Network (ISN), was designed in 1989 and comprises multiple distributed local area networks (LAN's) residing in Albuquerque, New Mexico and Livermore, California. The TCP/IP protocol suite is used for inter-node communications. Scientific workstations and mid-range computers, running UNIX-based operating systems, compose most LAN's. One LAN, operated by the Sandia Corporate Computing Directorate, is a general purpose resource providing a supercomputer and a file server to the entire ISN. The current file server on the supercomputer LAN is an implementation of the Common File System (CFS) developed by Los Alamos National Laboratory. Subsequent to the design of the ISN, Sandia reviewed its mass storage requirements and chose to enter into a competitive procurement to replace the existing file server with one more adaptable to a UNIX/TCP/IP environment. The requirements study for the network was the starting point for the requirements study for the new file server. The file server is called the Network Storage Services (NSS) and its requirements are described in this paper. The next section gives an application or functional description of the NSS. The final section adds performance, capacity, and access constraints to the requirements.
Hahne, Jan; Helias, Moritz; Kunkel, Susanne; Igarashi, Jun; Bolten, Matthias; Frommer, Andreas; Diesmann, Markus
2015-01-01
Contemporary simulators for networks of point and few-compartment model neurons come with a plethora of ready-to-use neuron and synapse models and support complex network topologies. Recent technological advancements have broadened the spectrum of application further to the efficient simulation of brain-scale networks on supercomputers. In distributed network simulations the amount of spike data that accrues per millisecond and process is typically low, such that a common optimization strategy is to communicate spikes at relatively long intervals, where the upper limit is given by the shortest synaptic transmission delay in the network. This approach is well-suited for simulations that employ only chemical synapses but it has so far impeded the incorporation of gap-junction models, which require instantaneous neuronal interactions. Here, we present a numerical algorithm based on a waveform-relaxation technique which allows for network simulations with gap junctions in a way that is compatible with the delayed communication strategy. Using a reference implementation in the NEST simulator, we demonstrate that the algorithm and the required data structures can be smoothly integrated with existing code such that they complement the infrastructure for spiking connections. To show that the unified framework for gap-junction and spiking interactions achieves high performance and delivers high accuracy in the presence of gap junctions, we present benchmarks for workstations, clusters, and supercomputers. Finally, we discuss limitations of the novel technology. PMID:26441628
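To make the waveform-relaxation idea in the abstract above concrete, here is a minimal toy sketch. It is not the NEST implementation and does not reproduce the authors' algorithm: the two-unit leaky model, the forward-Euler integrator, the Jacobi-style update, and all names and parameter values are assumptions chosen for illustration. Each unit integrates a whole communication interval using only the other unit's waveform from the previous iteration, and the iteration repeats until the waveforms stop changing.

```python
def step(v, v_other, h, tau, g, i_ext):
    """One forward-Euler step of a leaky unit with a gap-junction current."""
    return v + h * (-v / tau + g * (v_other - v) + i_ext)


def coupled_reference(v0, n, h, tau, g, i_ext):
    """Integrate both units together, exchanging values at every step."""
    a, b = [v0[0]], [v0[1]]
    for _ in range(n):
        a_next = step(a[-1], b[-1], h, tau, g, i_ext[0])
        b_next = step(b[-1], a[-1], h, tau, g, i_ext[1])
        a.append(a_next)
        b.append(b_next)
    return a, b


def waveform_relaxation(v0, n, h, tau, g, i_ext, tol=1e-12, max_iter=100):
    """Jacobi waveform relaxation over one communication interval of n steps."""
    # Initial guess: each unit assumes the other stays at its starting value.
    a = [v0[0]] * (n + 1)
    b = [v0[1]] * (n + 1)
    for iteration in range(1, max_iter + 1):
        new_a, new_b = [v0[0]], [v0[1]]
        for k in range(n):
            # Each unit sees only the other's waveform from the previous iteration.
            new_a.append(step(new_a[k], b[k], h, tau, g, i_ext[0]))
            new_b.append(step(new_b[k], a[k], h, tau, g, i_ext[1]))
        change = max(abs(x - y) for x, y in zip(new_a + new_b, a + b))
        a, b = new_a, new_b
        if change < tol:
            return a, b, iteration
    return a, b, max_iter


# Hypothetical parameters: 10 substeps per communication interval.
params = dict(n=10, h=0.1, tau=10.0, g=0.5, i_ext=(1.0, 0.0))
ref_a, ref_b = coupled_reference((0.0, 0.0), **params)
wr_a, wr_b, iters = waveform_relaxation((0.0, 0.0), **params)
max_error = max(abs(x - y) for x, y in zip(wr_a + wr_b, ref_a + ref_b))
print(f"iterations: {iters}, max deviation from coupled run: {max_error:.2e}")
```

In this toy setting the iterated result agrees with the step-by-step coupled integration even though the units exchange data only once per iteration, which is the property that makes the approach compatible with communicating at long intervals.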
A mass storage system for supercomputers based on Unix
NASA Technical Reports Server (NTRS)
Richards, J.; Kummell, T.; Zarlengo, D. G.
1988-01-01
The authors present the design, implementation, and utilization of a large mass storage subsystem (MSS) for the numerical aerodynamics simulation. The MSS supports a large networked, multivendor Unix-based supercomputing facility. The MSS at Ames Research Center provides all processors on the numerical aerodynamics system processing network, from workstations to supercomputers, the ability to store large amounts of data in a highly accessible, long-term repository. The MSS uses Unix System V and is capable of storing hundreds of thousands of files ranging from a few bytes to 2 Gb in size.
Scaling of data communications for an advanced supercomputer network
NASA Technical Reports Server (NTRS)
Levin, E.; Eaton, C. K.; Young, Bruce
1986-01-01
The goal of NASA's Numerical Aerodynamic Simulation (NAS) Program is to provide a powerful computational environment for advanced research and development in aeronautics and related disciplines. The present NAS system consists of a Cray 2 supercomputer connected by a data network to a large mass storage system, to sophisticated local graphics workstations and by remote communication to researchers throughout the United States. The program plan is to continue acquiring the most powerful supercomputers as they become available. The implications of a projected 20-fold increase in processing power on the data communications requirements are described.
SiGN: large-scale gene network estimation environment for high performance computing.
Tamada, Yoshinori; Shimamura, Teppei; Yamaguchi, Rui; Imoto, Seiya; Nagasaki, Masao; Miyano, Satoru
2011-01-01
Our research group is currently developing software for estimating large-scale gene networks from gene expression data. The software, called SiGN, is specifically designed for the Japanese flagship supercomputer "K computer", which is planned to achieve 10 petaflops in 2012, and other high performance computing environments including the Human Genome Center (HGC) supercomputer system. SiGN is a collection of gene network estimation software with three different sub-programs: SiGN-BN, SiGN-SSM and SiGN-L1. In these three programs, five different models are available: static and dynamic nonparametric Bayesian networks, state space models, graphical Gaussian models, and vector autoregressive models. All these models require a huge amount of computational resources for estimating large-scale gene networks and therefore are designed to be able to exploit the speed of 10 petaflops. The software will be available freely for "K computer" and HGC supercomputer system users. The estimated networks can be viewed and analyzed by Cell Illustrator Online and SBiP (Systems Biology integrative Pipeline). The software project web site is available at http://sign.hgc.jp/.
History of the numerical aerodynamic simulation program
NASA Technical Reports Server (NTRS)
Peterson, Victor L.; Ballhaus, William F., Jr.
1987-01-01
The Numerical Aerodynamic Simulation (NAS) program has reached a milestone with the completion of the initial operating configuration of the NAS Processing System Network. This achievement is the first major milestone in the continuing effort to provide a state-of-the-art supercomputer facility for the national aerospace community and to serve as a pathfinder for the development and use of future supercomputer systems. The underlying factors that motivated the initiation of the program are first identified and then discussed. These include the emergence and evolution of computational aerodynamics as a powerful new capability in aerodynamics research and development, the computer power required for advances in the discipline, the complementary nature of computation and wind tunnel testing, and the need for the government to play a pathfinding role in the development and use of large-scale scientific computing systems. Finally, the history of the NAS program is traced from its inception in 1975 to the present time.
NASA Technical Reports Server (NTRS)
1991-01-01
Various papers on supercomputing are presented. The general topics addressed include: program analysis/data dependence, memory access, distributed memory code generation, numerical algorithms, supercomputer benchmarks, latency tolerance, parallel programming, applications, processor design, networks, performance tools, mapping and scheduling, characterization affecting performance, parallelism packaging, computing climate change, combinatorial algorithms, hardware and software performance issues, system issues. (No individual items are abstracted in this volume)
Supercomputer Issues from a University Perspective.
ERIC Educational Resources Information Center
Beering, Steven C.
1984-01-01
Discusses issues related to the access of and training of university researchers in using supercomputers, considering National Science Foundation's (NSF) role in this area, microcomputers on campuses, and the limited use of existing telecommunication networks. Includes examples of potential scientific projects (by subject area) utilizing…
Automated Help System For A Supercomputer
NASA Technical Reports Server (NTRS)
Callas, George P.; Schulbach, Catherine H.; Younkin, Michael
1994-01-01
Expert-system software developed to provide automated system of user-helping displays in supercomputer system at Ames Research Center Advanced Computer Facility. Users located at remote computer terminals connected to supercomputer and each other via gateway computers, local-area networks, telephone lines, and satellite links. Automated help system answers routine user inquiries about how to use services of computer system. Available 24 hours per day and reduces burden on human experts, freeing them to concentrate on helping users with complicated problems.
NSF Establishes First Four National Supercomputer Centers.
ERIC Educational Resources Information Center
Lepkowski, Wil
1985-01-01
The National Science Foundation (NSF) has awarded support for supercomputer centers at Cornell University, Princeton University, University of California (San Diego), and University of Illinois. These centers are to be the nucleus of a national academic network for use by scientists and engineers throughout the United States. (DH)
Library Services in a Supercomputer Center.
ERIC Educational Resources Information Center
Layman, Mary
1991-01-01
Describes library services that are offered at the San Diego Supercomputer Center (SDSC), which is located at the University of California at San Diego. Topics discussed include the user population; online searching; microcomputer use; electronic networks; current awareness programs; library catalogs; and the slide collection. A sidebar outlines…
Ultrascalable petaflop parallel supercomputer
Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Chiu, George [Cross River, NY; Cipolla, Thomas M [Katonah, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Hall, Shawn [Pleasantville, NY; Haring, Rudolf A [Cortlandt Manor, NY; Heidelberger, Philip [Cortlandt Manor, NY; Kopcsay, Gerard V [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Salapura, Valentina [Chappaqua, NY; Sugavanam, Krishnan [Mahopac, NY; Takken, Todd [Brewster, NY
2010-07-20
A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.
NASA Technical Reports Server (NTRS)
1986-01-01
Overview descriptions of on-line environmental data systems, supercomputer facilities, and networks are presented. Each description addresses the concepts of content, capability, and user access relevant to the point of view of potential utilization by the Earth and environmental science community. The information on similar systems or facilities is presented in parallel fashion to encourage and facilitate intercomparison. In addition, summary sheets are given for each description, and a summary table precedes each section.
Multi-petascale highly efficient parallel supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.
A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five dimensional torus network that optimally maximizes the throughput of packet communications between nodes and minimizes latency. The network implements a collective network and a global asynchronous network that provides global barrier and notification functions. Integrated in the node design is a list-based prefetcher. The memory system implements transaction memory, thread level speculation, and multiversioning cache that improves soft error rate at the same time and supports DMA functionality allowing for parallel processing message-passing.
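As an illustration of what a five-dimensional torus interconnect implies for addressing, the sketch below computes a node's nearest neighbors and the minimal hop count between two nodes. The dimension sizes and function names are hypothetical; the abstract does not give the machine's actual geometry or routing scheme, so this shows only the generic wraparound arithmetic.

```python
def torus_neighbors(coord, dims):
    """Nodes one hop away along each axis, with wraparound at the ring ends.

    Yields two distinct neighbors per axis when every ring has length >= 3.
    """
    neighbors = []
    for axis, size in enumerate(dims):
        for delta in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + delta) % size
            neighbors.append(tuple(n))
    return neighbors


def torus_hops(a, b, dims):
    """Minimal hop count from a to b, taking the shorter way around each ring."""
    return sum(min((x - y) % s, (y - x) % s) for x, y, s in zip(a, b, dims))


# Hypothetical 5-D geometry, chosen only for the example.
dims = (4, 4, 4, 4, 3)
origin = (0, 0, 0, 0, 0)

near = torus_neighbors(origin, dims)
hops = torus_hops(origin, (3, 2, 0, 0, 1), dims)
print(len(near), "neighbors;", hops, "hops to (3, 2, 0, 0, 1)")
```

The wraparound links are what keep the worst-case distance along each axis to half the ring length, which is one reason torus topologies are associated with low latency between nodes.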
Aviation Research and the Internet
NASA Technical Reports Server (NTRS)
Scott, Antoinette M.
1995-01-01
The Internet is a network of networks. It was originally funded by the Defense Advanced Research Projects Agency or DOD/DARPA and evolved in part from the connection of supercomputer sites across the United States. The National Science Foundation (NSF) made the most of their supercomputers by connecting the sites to each other. This made the supercomputers more efficient and now allows scientists, engineers and researchers to access the supercomputers from their own labs and offices. The high speed networks that connect the NSF supercomputers form the backbone of the Internet. The World Wide Web (WWW) is a menu system. It gathers Internet resources from all over the world into a series of screens that appear on your computer. The WWW is also a distributed system. The distributed system stores data on many computers (servers). These servers can go out and get data when you ask for it. Hypermedia is the base of the WWW. One can 'click' on a section and visit other hypermedia (pages). Our approach to demonstrating the importance of aviation research through the Internet began with learning how to put pages on the Internet (on-line) ourselves. We were assigned two aviation companies: Vision Micro Systems Inc. and Innovative Aerodynamic Technologies (IAT). We developed home pages for these SBIR companies. The pages were created on UNIX and Macintosh machines. HTML Supertext software was used to write the pages and the Sharp JX600S scanner to scan the images. As a result, with the use of the UNIX, Macintosh, Sun, PC, and AXIL machines, we were able to present our home pages to over 800,000 visitors.
NAS-current status and future plans
NASA Technical Reports Server (NTRS)
Bailey, F. R.
1987-01-01
The Numerical Aerodynamic Simulation (NAS) has met its first major milestone, the NAS Processing System Network (NPSN) Initial Operating Configuration (IOC). The program has met its goal of providing a national supercomputer facility capable of greatly enhancing the Nation's research and development efforts. Furthermore, the program is fulfilling its pathfinder role by defining and implementing a paradigm for supercomputing system environments. The IOC is only the beginning and the NAS Program will aggressively continue to develop and implement emerging supercomputer, communications, storage, and software technologies to strengthen computations as a critical element in supporting the Nation's leadership role in aeronautics.
NASA Technical Reports Server (NTRS)
Tennille, Geoffrey M.; Howser, Lona M.
1993-01-01
This document briefly describes the use of the CRAY supercomputers that are an integral part of the Supercomputing Network Subsystem of the Central Scientific Computing Complex at LaRC. Features of the CRAY supercomputers are covered, including: FORTRAN, C, PASCAL, architectures of the CRAY-2 and CRAY Y-MP, the CRAY UNICOS environment, batch job submittal, debugging, performance analysis, parallel processing, utilities unique to CRAY, and documentation. The document is intended for all CRAY users as a ready reference to frequently asked questions and to more detailed information contained in the vendor manuals. It is appropriate for both the novice and the experienced user.
Designing a connectionist network supercomputer.
Asanović, K; Beck, J; Feldman, J; Morgan, N; Wawrzynek, J
1993-12-01
This paper describes an effort at UC Berkeley and the International Computer Science Institute to develop a supercomputer for artificial neural network applications. Our perspective has been strongly influenced by earlier experiences with the construction and use of a simpler machine. In particular, we have observed Amdahl's Law in action in our designs and those of others. These observations inspire attention to many factors beyond fast multiply-accumulate arithmetic. We describe a number of these factors along with rough expressions for their influence and then give the applications targets, machine goals and the system architecture for the machine we are currently designing.
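Amdahl's Law, which the abstract above cites as a design lesson, can be stated in a few lines of code. The sketch is illustrative only: the function name and the example figures are assumptions, not measurements from the paper. It shows why speeding up multiply-accumulate arithmetic alone gives limited overall gain when other parts of the workload remain unchanged.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when a fraction of the runtime is sped up by `factor`."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must lie in [0, 1]")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)


# Hypothetical workload: 90% of the time is arithmetic that runs 100x faster.
overall = amdahl_speedup(0.90, 100.0)

# Upper bound as the arithmetic speedup grows without limit.
ceiling = 1.0 / (1.0 - 0.90)

print(f"overall speedup: {overall:.2f}x (ceiling {ceiling:.0f}x)")
```

Under these assumed numbers the remaining 10 percent of the work caps the overall gain at about tenfold, however fast the arithmetic units become.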
Performance Evaluation of Supercomputers using HPCC and IMB Benchmarks
NASA Technical Reports Server (NTRS)
Saini, Subhash; Ciotti, Robert; Gunney, Brian T. N.; Spelce, Thomas E.; Koniges, Alice; Dossa, Don; Adamidis, Panagiotis; Rabenseifner, Rolf; Tiyyagura, Sunil R.; Mueller, Matthias;
2006-01-01
The HPC Challenge (HPCC) benchmark suite and the Intel MPI Benchmark (IMB) are used to compare and evaluate the combined performance of processor, memory subsystem and interconnect fabric of five leading supercomputers - SGI Altix BX2, Cray X1, Cray Opteron Cluster, Dell Xeon cluster, and NEC SX-8. These five systems use five different networks (SGI NUMALINK4, Cray network, Myrinet, InfiniBand, and NEC IXS). The complete set of HPCC benchmarks is run on each of these systems. Additionally, we present Intel MPI Benchmarks (IMB) results to study the performance of 11 MPI communication functions on these systems.
NASA Technical Reports Server (NTRS)
Bailey, F. R.; Kutler, Paul
1988-01-01
Discussed are the capabilities of NASA's Numerical Aerodynamic Simulation (NAS) Program and its application as an advanced supercomputing system for computational fluid dynamics (CFD) research. First, the paper describes the NAS computational system, called the NAS Processing System Network, and the advanced computational capabilities it offers as a consequence of carrying out the NAS pathfinder objective. Second, it presents examples of pioneering CFD research accomplished during NAS's first operational year. Examples are included which illustrate CFD applications for predicting fluid phenomena, complementing and supplementing experimentation, and aiding in design. Finally, pacing elements and future directions for CFD and NAS are discussed.
The NSFNET: Beginnings of a National Research Internet.
ERIC Educational Resources Information Center
Catlett, Charles E.
1989-01-01
Describes the development, current status, and possible future of NSFNET, which is a backbone network designed to connect five national supercomputer centers established by the National Science Foundation. The discussion covers the implications of this network for research and national networking needs. (CLB)
Particle simulation on heterogeneous distributed supercomputers
NASA Technical Reports Server (NTRS)
Becker, Jeffrey C.; Dagum, Leonardo
1993-01-01
We describe the implementation and performance of a three dimensional particle simulation distributed between a Thinking Machines CM-2 and a Cray Y-MP. These are connected by a combination of two high-speed networks: a high-performance parallel interface (HIPPI) and an optical network (UltraNet). This is the first application to use this configuration at NASA Ames Research Center. We describe our experience implementing and using the application and report the results of several timing measurements. We show that the distribution of applications across disparate supercomputing platforms is feasible and has reasonable performance. In addition, several practical aspects of the computing environment are discussed.
SiGN-SSM: open source parallel software for estimating gene networks with state space models.
Tamada, Yoshinori; Yamaguchi, Rui; Imoto, Seiya; Hirose, Osamu; Yoshida, Ryo; Nagasaki, Masao; Miyano, Satoru
2011-04-15
SiGN-SSM is an open-source gene network estimation software able to run in parallel on PCs and massively parallel supercomputers. The software estimates a state space model (SSM), that is, a statistical dynamic model suitable for analyzing short time and/or replicated time series gene expression profiles. SiGN-SSM implements a novel parameter constraint effective to stabilize the estimated models. Also, by using a supercomputer, it is able to determine the gene network structure by a statistical permutation test in a practical time. SiGN-SSM is applicable not only to analyzing temporal regulatory dependencies between genes, but also to extracting the differentially regulated genes from time series expression profiles. SiGN-SSM is distributed under GNU Affero General Public Licence (GNU AGPL) version 3 and can be downloaded at http://sign.hgc.jp/signssm/. The pre-compiled binaries for some architectures are available in addition to the source code. The pre-installed binaries are also available on the Human Genome Center supercomputer system. The online manual and the supplementary information of SiGN-SSM are available on our web site. Contact: tamada@ims.u-tokyo.ac.jp.
Data communication requirements for the advanced NAS network
NASA Technical Reports Server (NTRS)
Levin, Eugene; Eaton, C. K.; Young, Bruce
1986-01-01
The goal of the Numerical Aerodynamic Simulation (NAS) Program is to provide a powerful computational environment for advanced research and development in aeronautics and related disciplines. The present NAS system consists of a Cray 2 supercomputer connected by a data network to a large mass storage system, to sophisticated local graphics workstations, and by remote communications to researchers throughout the United States. The program plan is to continue acquiring the most powerful supercomputers as they become available. In the 1987/1988 time period it is anticipated that a computer with 4 times the processing speed of a Cray 2 will be obtained and by 1990 an additional supercomputer with 16 times the speed of the Cray 2. The implications of this 20-fold increase in processing power on the data communications requirements are described. The analysis was based on models of the projected workload and system architecture. The results are presented together with the estimates of their sensitivity to assumptions inherent in the models.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Muller, U.A.; Baumle, B.; Kohler, P.
1992-10-01
Music, a DSP-based system with a parallel distributed-memory architecture, provides enormous computing power yet retains the flexibility of a general-purpose computer. Reaching a peak performance of 2.7 Gflops at a significantly lower cost, power consumption, and space requirement than conventional supercomputers, Music is well suited to computationally intensive applications such as neural network simulation. 12 refs., 9 figs., 2 tabs.
Seismic signal processing on heterogeneous supercomputers
NASA Astrophysics Data System (ADS)
Gokhberg, Alexey; Ermert, Laura; Fichtner, Andreas
2015-04-01
The processing of seismic signals - including the correlation of massive ambient noise data sets - represents an important part of a wide range of seismological applications. It is characterized by large data volumes as well as high computational input/output intensity. Development of efficient approaches towards seismic signal processing on emerging high performance computing systems is therefore essential. Heterogeneous supercomputing systems introduced in the recent years provide numerous computing nodes interconnected via high throughput networks, every node containing a mix of processing elements of different architectures, like several sequential processor cores and one or a few graphical processing units (GPU) serving as accelerators. A typical representative of such computing systems is "Piz Daint", a supercomputer of the Cray XC 30 family operated by the Swiss National Supercomputing Center (CSCS), which we used in this research. Heterogeneous supercomputers provide an opportunity for manifold application performance increase and are more energy-efficient, however they have much higher hardware complexity and are therefore much more difficult to program. The programming effort may be substantially reduced by the introduction of modular libraries of software components that can be reused for a wide class of seismology applications. The ultimate goal of this research is design of a prototype for such library suitable for implementing various seismic signal processing applications on heterogeneous systems. As a representative use case we have chosen an ambient noise correlation application. Ambient noise interferometry has developed into one of the most powerful tools to image and monitor the Earth's interior. Future applications will require the extraction of increasingly small details from noise recordings. To meet this demand, more advanced correlation techniques combined with very large data volumes are needed. 
This poses new computational problems that require dedicated HPC solutions. The chosen application is using a wide range of common signal processing methods, which include various IIR filter designs, amplitude and phase correlation, computing the analytic signal, and discrete Fourier transforms. Furthermore, various processing methods specific for seismology, like rotation of seismic traces, are used. Efficient implementation of all these methods on the GPU-accelerated systems represents several challenges. In particular, it requires a careful distribution of work between the sequential processors and accelerators. Furthermore, since the application is designed to process very large volumes of data, special attention had to be paid to the efficient use of the available memory and networking hardware resources in order to reduce intensity of data input and output. In our contribution we will explain the software architecture as well as principal engineering decisions used to address these challenges. We will also describe the programming model based on C++ and CUDA that we used to develop the software. Finally, we will demonstrate performance improvements achieved by using the heterogeneous computing architecture. This work was supported by a grant from the Swiss National Supercomputing Centre (CSCS) under project ID d26.
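The ambient noise correlation use case described above rests on cross-correlating pairs of recordings. The following is a minimal time-domain sketch in plain Python, not the authors' C++/CUDA implementation; the function names `cross_correlation` and `best_lag` and the toy signals are assumptions made for illustration only.

```python
def cross_correlation(x, y, max_lag):
    """Return {lag: sum over t of x[t] * y[t + lag]} for lags in [-max_lag, max_lag]."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for t in range(len(x)):
            u = t + lag
            # Samples outside the recorded window contribute nothing.
            if 0 <= u < len(y):
                total += x[t] * y[u]
        result[lag] = total
    return result


def best_lag(x, y, max_lag):
    """Lag at which the cross-correlation of x and y is largest."""
    cc = cross_correlation(x, y, max_lag)
    return max(cc, key=cc.get)


if __name__ == "__main__":
    # Toy example: station_b records the same pulse as station_a, three samples later.
    station_a = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
    station_b = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]
    print(best_lag(station_a, station_b, max_lag=5))
```

A production code would typically compute the correlation in the frequency domain with Fourier transforms, which the abstract lists among the methods used; the direct sum here is chosen only because it is easy to read.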
Network issues for large mass storage requirements
NASA Technical Reports Server (NTRS)
Perdue, James
1992-01-01
File servers and supercomputing environments need high-performance networks to balance the I/O requirements seen in today's demanding computing scenarios. UltraNet is one solution that permits both high aggregate transfer rates and high task-to-task transfer rates, as demonstrated in actual tests. UltraNet provides this capability as both a server-to-server and server-to-client access network, giving the supercomputing center the following advantages: highest-performance Transport Level connections (up to 40 MBytes/sec effective rates); throughput that matches the emerging high-performance disk technologies, such as RAID, parallel head transfer devices, and software striping; support for standard network and file system applications that use a sockets-based application program interface, such as FTP, rcp, rdump, etc.; support for access to the Network File System (NFS) and large aggregate bandwidth for heavy NFS usage; access to a distributed, hierarchical data server capability using the DISCOS UniTree product; and support for file server solutions available from multiple vendors, including Cray, Convex, Alliant, FPS, IBM, and others.
NASA Astrophysics Data System (ADS)
Watari, S.; Morikawa, Y.; Yamamoto, K.; Inoue, S.; Tsubouchi, K.; Fukazawa, K.; Kimura, E.; Tatebe, O.; Kato, H.; Shimojo, S.; Murata, K. T.
2010-12-01
In the Solar-Terrestrial Physics (STP) field, the spatio-temporal resolution of computer simulations keeps increasing because of the tremendous advancement of supercomputers. A more advanced technology is Grid Computing, which integrates distributed computational resources to provide scalable computing resources. In simulation research, it is effective for researchers to design their own physical models, perform calculations on a supercomputer, and analyze and visualize the results with methods familiar to them. A supercomputer, however, is far from an analysis and visualization environment. In general, researchers analyze and visualize data on a workstation (WS) managed at hand, because installing and operating software on the WS is easy. It is therefore necessary to copy the data from the supercomputer to the WS manually. In practice, the time needed to transfer data over a long-delay network hinders high-accuracy simulations. In terms of usefulness, it is important to integrate a supercomputer and an analysis and visualization environment seamlessly, using methods familiar to the researcher. NICT has been developing a cloud computing environment (the NICT Space Weather Cloud). In the NICT Space Weather Cloud, disk servers are located near its supercomputer and the WSs for data analysis and visualization. They are connected to JGN2plus, a high-speed network for research and development. Distributed virtual high-capacity storage is also constructed with Grid Datafarm (Gfarm v2). Huge data output from the supercomputer is transferred to the virtual storage through JGN2plus. A researcher can thus concentrate on research using familiar methods, regardless of the distance between the supercomputer and the analysis and visualization environment. At present, a total of 16 disk servers are set up at NICT headquarters (Koganei, Tokyo), the JGN2plus NOC (Otemachi, Tokyo), the Okinawa Subtropical Environment Remote-Sensing Center, and the Cybermedia Center, Osaka University. 
They are connected via JGN2plus and constitute 1 PB (physical size) of virtual storage using Gfarm v2. These disk servers are connected to the supercomputers of NICT and Osaka University. A system has been built in which data output from the supercomputers is automatically transferred to the virtual storage. The measured transfer rate is about 50 GB/h. This performance is estimated to be reasonable for a certain simulation and analysis for reconstruction of the coronal magnetic field. This research serves as an experiment with the system, and verification of its practicality is proceeding at the same time. Here we give an overview of the space weather cloud system we have developed so far. We also demonstrate several scientific results obtained using the space weather cloud system, and we introduce several web applications offered as a service of the cloud, named "e-SpaceWeather" (e-SW). The e-SW provides a variety of online space weather services covering many aspects.
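To put the measured transfer rate of about 50 GB per hour in perspective, a back-of-the-envelope estimate can be sketched as follows. The 1000 GB output size is a hypothetical figure chosen for illustration and does not come from the abstract; the helper name `transfer_hours` is likewise an assumption.

```python
def transfer_hours(size_gb, rate_gb_per_hour=50.0):
    """Hours needed to move size_gb of data at a sustained rate."""
    if rate_gb_per_hour <= 0:
        raise ValueError("rate must be positive")
    return size_gb / rate_gb_per_hour


if __name__ == "__main__":
    # Hypothetical simulation output of 1000 GB (an assumed size, for illustration).
    print(transfer_hours(1000.0), "hours")
```

The estimate assumes the rate is sustained for the whole transfer, which a shared research network may not guarantee.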
Enabling Diverse Software Stacks on Supercomputers using High Performance Virtual Clusters.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Younge, Andrew J.; Pedretti, Kevin; Grant, Ryan
While large-scale simulations have been the hallmark of the High Performance Computing (HPC) community for decades, Large Scale Data Analytics (LSDA) workloads are gaining attention within the scientific community not only as a processing component to large HPC simulations, but also as standalone scientific tools for knowledge discovery. With the path towards Exascale, new HPC runtime systems are also emerging in a way that differs from classical distributed computing models. However, system software for such capabilities on the latest extreme-scale DOE supercomputing needs to be enhanced to more appropriately support these types of emerging software ecosystems. In this paper, we propose the use of Virtual Clusters on advanced supercomputing resources to enable systems to support not only HPC workloads, but also emerging big data stacks. Specifically, we have deployed the KVM hypervisor within Cray's Compute Node Linux on an XC-series supercomputer testbed. We also use libvirt and QEMU to manage and provision VMs directly on compute nodes, leveraging Ethernet-over-Aries network emulation. To our knowledge, this is the first known use of KVM on a true MPP supercomputer. We investigate the overhead of our solution using HPC benchmarks, evaluating both single-node performance and weak scaling of a 32-node virtual cluster. Overall, we find single-node performance of our solution using KVM on a Cray is very efficient, with near-native performance. However, overhead increases by up to 20% as virtual cluster size increases, due to limitations of the Ethernet-over-Aries bridged network. Furthermore, we deploy Apache Spark with large data analysis workloads in a Virtual Cluster, effectively demonstrating how diverse software ecosystems can be supported by High Performance Virtual Clusters.
A secure file manager for UNIX
DOE Office of Scientific and Technical Information (OSTI.GOV)
DeVries, R.G.
1990-12-31
The development of a secure file management system for a UNIX-based computer facility with supercomputers and workstations is described. Specifically, UNIX in its usual form does not address: (1) Operation which would satisfy rigorous security requirements. (2) Online space management in an environment where total data demands would be many times the actual online capacity. (3) Making the file management system part of a computer network in which users of any computer in the local network could retrieve data generated on any other computer in the network. The characteristics of UNIX can be exploited to develop a portable, secure file manager which would operate on computer systems ranging from workstations to supercomputers. Implementation considerations making unusual use of UNIX features, rather than requiring extensive internal system changes, are described, and implementation using the Cray Research Inc. UNICOS operating system is outlined.
NASA Technical Reports Server (NTRS)
Rogers, David
1988-01-01
The advent of the Connection Machine profoundly changes the world of supercomputers. The highly nontraditional architecture makes possible the exploration of algorithms that were impractical for standard Von Neumann architectures. Sparse distributed memory (SDM) is an example of such an algorithm. Sparse distributed memory is a particularly simple and elegant formulation for an associative memory. The foundations for sparse distributed memory are described, and some simple examples of using the memory are presented. The relationship of sparse distributed memory to three important computational systems is shown: random-access memory, neural networks, and the cerebellum of the brain. Finally, the implementation of the algorithm for sparse distributed memory on the Connection Machine is discussed.
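The associative-memory idea summarized above can be illustrated with a toy model in the spirit of Kanerva's sparse distributed memory. This is a serial sketch under stated assumptions, not the Connection Machine implementation the abstract refers to: the class name, the word length of 64 bits, the 200 hard locations, and the activation radius of 28 are all illustrative choices.

```python
import random


class SparseDistributedMemory:
    """Toy sparse distributed memory over fixed-width binary words."""

    def __init__(self, n_bits=64, n_locations=200, radius=28, seed=0):
        rng = random.Random(seed)
        self.n_bits = n_bits
        self.radius = radius
        # Fixed random addresses of the "hard" storage locations.
        self.addresses = [rng.getrandbits(n_bits) for _ in range(n_locations)]
        # One signed counter per bit for every hard location.
        self.counters = [[0] * n_bits for _ in range(n_locations)]

    def _active(self, address):
        """Indices of hard locations within the Hamming radius of address."""
        return [i for i, a in enumerate(self.addresses)
                if bin(a ^ address).count("1") <= self.radius]

    def write(self, address, word):
        """Add the word (as +1/-1 per bit) to every active location."""
        for i in self._active(address):
            row = self.counters[i]
            for b in range(self.n_bits):
                row[b] += 1 if (word >> b) & 1 else -1

    def read(self, address):
        """Sum counters over the active locations and threshold at zero."""
        active = self._active(address)
        word = 0
        for b in range(self.n_bits):
            if sum(self.counters[i][b] for i in active) > 0:
                word |= 1 << b
        return word


if __name__ == "__main__":
    memory = SparseDistributedMemory()
    pattern = 0x0123456789ABCDEF
    memory.write(pattern, pattern)  # store the pattern at its own address
    print(memory.read(pattern) == pattern)
```

Each write touches many hard locations and each read pools many counters, which is why the scheme maps naturally onto a massively parallel machine: every location can test its own distance and update its own counters independently.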
Access control and privacy in large distributed systems
NASA Technical Reports Server (NTRS)
Leiner, B. M.; Bishop, M.
1986-01-01
Large-scale distributed systems consist of workstations, mainframe computers, supercomputers and other types of servers, all connected by a computer network. These systems are being used in a variety of applications, including the support of collaborative scientific research. In such an environment, issues of access control and privacy arise. Access control is required for several reasons, including the protection of sensitive resources and cost control. Privacy is also required for similar reasons, including the protection of a researcher's proprietary results. A possible architecture for integrating available computer and communications security technologies into a system that meets these requirements is described. This architecture is meant as a starting point for discussion, rather than the final answer.
Towards Scalable Deep Learning via I/O Analysis and Optimization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pumma, Sarunya; Si, Min; Feng, Wu-Chun
Deep learning systems have been growing in prominence as a way to automatically characterize objects, trends, and anomalies. Given the importance of deep learning systems, researchers have been investigating techniques to optimize such systems. An area of particular interest has been using large supercomputing systems to quickly generate effective deep learning networks: a phase often referred to as “training” of the deep learning neural network. As we scale existing deep learning frameworks, such as Caffe, on these large supercomputing systems, we notice that the parallelism can help improve the computation tremendously, leaving data I/O as the major bottleneck limiting the overall system scalability. In this paper, we first present a detailed analysis of the performance bottlenecks of Caffe on large supercomputing systems. Our analysis shows that the I/O subsystem of Caffe, LMDB, relies on memory-mapped I/O to access its database, which can be highly inefficient on large-scale systems because of its interaction with the process scheduling system and the network-based parallel filesystem. Based on this analysis, we then present LMDBIO, our optimized I/O plugin for Caffe that takes into account the data access pattern of Caffe in order to vastly improve I/O performance. Our experimental results show that LMDBIO can improve the overall execution time of Caffe by nearly 20-fold in some cases.
High Performance Computing and Networking for Science--Background Paper.
ERIC Educational Resources Information Center
Congress of the U.S., Washington, DC. Office of Technology Assessment.
The Office of Technology Assessment is conducting an assessment of the effects of new information technologies--including high performance computing, data networking, and mass data archiving--on research and development. This paper offers a view of the issues and their implications for current discussions about Federal supercomputer initiatives…
Performance Evaluation in Network-Based Parallel Computing
NASA Technical Reports Server (NTRS)
Dezhgosha, Kamyar
1996-01-01
Network-based parallel computing is emerging as a cost-effective alternative for solving many problems which require use of supercomputers or massively parallel computers. The primary objective of this project has been to conduct experimental research on performance evaluation for clustered parallel computing. First, a testbed was established by augmenting our existing SUNSPARCs' network with PVM (Parallel Virtual Machine) which is a software system for linking clusters of machines. Second, a set of three basic applications were selected. The applications consist of a parallel search, a parallel sort, a parallel matrix multiplication. These application programs were implemented in C programming language under PVM. Third, we conducted performance evaluation under various configurations and problem sizes. Alternative parallel computing models and workload allocations for application programs were explored. The performance metric was limited to elapsed time or response time which in the context of parallel computing can be expressed in terms of speedup. The results reveal that the overhead of communication latency between processes in many cases is the restricting factor to performance. That is, coarse-grain parallelism which requires less frequent communication between processes will result in higher performance in network-based computing. Finally, we are in the final stages of installing an Asynchronous Transfer Mode (ATM) switch and four ATM interfaces (each 155 Mbps) which will allow us to extend our study to newer applications, performance metrics, and configurations.
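The speedup metric and the granularity argument above can be made concrete with a small sketch. The cost model below (an ideal split of the work plus a fixed latency per message) is an assumption introduced for illustration, not the model used in the project, and all timings are hypothetical.

```python
def speedup(serial_seconds, parallel_seconds):
    """Classic speedup: serial elapsed time divided by parallel elapsed time."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds, parallel_seconds, workers):
    """Speedup per worker; 1.0 would be ideal linear scaling."""
    return speedup(serial_seconds, parallel_seconds) / workers


def modeled_parallel_time(serial_seconds, workers, messages, latency_seconds):
    """Assumed model: perfectly divided compute plus a latency cost per message."""
    return serial_seconds / workers + messages * latency_seconds


if __name__ == "__main__":
    serial = 100.0  # hypothetical serial run time in seconds
    coarse = modeled_parallel_time(serial, workers=4, messages=10, latency_seconds=0.01)
    fine = modeled_parallel_time(serial, workers=4, messages=1000, latency_seconds=0.01)
    print("coarse-grain speedup:", speedup(serial, coarse))
    print("fine-grain speedup:", speedup(serial, fine))
```

Under this model the same work split into fewer, larger messages yields the higher speedup, which is the qualitative behavior the abstract reports for coarse-grain parallelism.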
NASA Astrophysics Data System (ADS)
Bhanota, Gyan; Chen, Dong; Gara, Alan; Vranas, Pavlos
2003-05-01
The architecture of the BlueGene/L massively parallel supercomputer is described. Each computing node consists of a single compute ASIC plus 256 MB of external memory. The compute ASIC integrates two 700 MHz PowerPC 440 integer CPU cores, two 2.8 Gflops floating point units, 4 MB of embedded DRAM as cache, a memory controller for external memory, six 1.4 Gbit/s bi-directional ports for a 3-dimensional torus network connection, three 2.8 Gbit/s bi-directional ports for connecting to a global tree network and a Gigabit Ethernet for I/O. 65,536 of such nodes are connected into a 3-d torus with a geometry of 32×32×64. The total peak performance of the system is 360 Teraflops and the total amount of memory is 16 TeraBytes.
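The system-level figures quoted above follow from the per-node figures by simple multiplication, as the sketch below shows. It assumes that both floating point units on a node contribute their full 2.8 Gflops to the peak and that memory is counted in binary units; the function names are illustrative.

```python
TORUS_DIMENSIONS = (32, 32, 64)   # nodes along each axis of the 3-d torus
FPUS_PER_NODE = 2
GFLOPS_PER_FPU = 2.8
MEMORY_MB_PER_NODE = 256


def node_count():
    """Total nodes in the torus: the product of the three dimensions."""
    x, y, z = TORUS_DIMENSIONS
    return x * y * z


def peak_teraflops():
    """Aggregate peak, assuming every FPU runs at its quoted rate."""
    return node_count() * FPUS_PER_NODE * GFLOPS_PER_FPU / 1000.0


def total_memory_terabytes():
    """Aggregate external memory, using 1024 MB per GB and 1024 GB per TB."""
    return node_count() * MEMORY_MB_PER_NODE / 1024 / 1024


if __name__ == "__main__":
    print(node_count(), "nodes")
    print(round(peak_teraflops(), 1), "Tflops peak")
    print(total_memory_terabytes(), "TB of memory")
```

The node count and memory total match the abstract exactly (65,536 nodes and 16 TB), while the computed peak of roughly 367 Tflops is slightly above the rounded 360 Teraflops quoted there.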
Modeling a Million-Node Slim Fly Network Using Parallel Discrete-Event Simulation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wolfe, Noah; Carothers, Christopher; Mubarak, Misbah
As supercomputers close in on exascale performance, the increased number of processors and processing power translates to an increased demand on the underlying network interconnect. The Slim Fly network topology, a new low-diameter and low-latency interconnection network, is gaining interest as one possible solution for next-generation supercomputing interconnect systems. In this paper, we present a high-fidelity Slim Fly flit-level model leveraging the Rensselaer Optimistic Simulation System (ROSS) and Co-Design of Exascale Storage (CODES) frameworks. We validate our Slim Fly model with the Kathareios et al. Slim Fly model results provided at moderately sized network scales. We further scale the model size up to an unprecedented 1 million compute nodes; and through visualization of network simulation metrics such as link bandwidth, packet latency, and port occupancy, we get an insight into the network behavior at the million-node scale. We also show linear strong scaling of the Slim Fly model on an Intel cluster achieving a peak event rate of 36 million events per second using 128 MPI tasks to process 7 billion events. Detailed analysis of the underlying discrete-event simulation performance shows that a million-node Slim Fly model simulation can execute in 198 seconds on the Intel cluster.
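As a rough consistency check on the figures above, the time needed to process a given number of events at a given rate can be computed directly. The abstract does not state that the 7 billion events, the 36 million events per second, and the 198 seconds all belong to the same run, so the sketch below is an assumption-laden estimate, and the helper name `seconds_to_process` is hypothetical.

```python
def seconds_to_process(total_events, events_per_second):
    """Wall-clock seconds to process total_events at a constant event rate."""
    if events_per_second <= 0:
        raise ValueError("event rate must be positive")
    return total_events / events_per_second


if __name__ == "__main__":
    # 7 billion events at the quoted peak rate of 36 million events per second.
    print(round(seconds_to_process(7e9, 36e6), 1), "seconds at peak rate")
```

At the peak rate the estimate comes to a little over 194 seconds, the same order as the 198 seconds quoted, which would be plausible if the average rate sat slightly below the peak.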
The NEST Dry-Run Mode: Efficient Dynamic Analysis of Neuronal Network Simulation Code
Kunkel, Susanne; Schenck, Wolfram
2017-01-01
NEST is a simulator for spiking neuronal networks that commits to a general purpose approach: It allows for high flexibility in the design of network models, and its applications range from small-scale simulations on laptops to brain-scale simulations on supercomputers. Hence, developers need to test their code for various use cases and ensure that changes to code do not impair scalability. However, running a full set of benchmarks on a supercomputer takes up precious compute-time resources and can entail long queuing times. Here, we present the NEST dry-run mode, which enables comprehensive dynamic code analysis without requiring access to high-performance computing facilities. A dry-run simulation is carried out by a single process, which performs all simulation steps except communication as if it was part of a parallel environment with many processes. We show that measurements of memory usage and runtime of neuronal network simulations closely match the corresponding dry-run data. Furthermore, we demonstrate the successful application of the dry-run mode in the areas of profiling and performance modeling. PMID:28701946
Supercomputers Ready for Use as Discovery Machines for Neuroscience
Helias, Moritz; Kunkel, Susanne; Masumoto, Gen; Igarashi, Jun; Eppler, Jochen Martin; Ishii, Shin; Fukai, Tomoki; Morrison, Abigail; Diesmann, Markus
2012-01-01
NEST is a widely used tool to simulate biological spiking neural networks. Here we explain the improvements, guided by a mathematical model of memory consumption, that enable us to exploit for the first time the computational power of the K supercomputer for neuroscience. Multi-threaded components for wiring and simulation combine 8 cores per MPI process to achieve excellent scaling. K is capable of simulating networks corresponding to a brain area with 10^8 neurons and 10^12 synapses in the worst case scenario of random connectivity; for larger networks of the brain its hierarchical organization can be exploited to constrain the number of communicating computer nodes. We discuss the limits of the software technology, comparing maximum filling scaling plots for K and the JUGENE BG/P system. The usability of these machines for network simulations has become comparable to running simulations on a single PC. Turn-around times in the range of minutes even for the largest systems enable a quasi interactive working style and render simulations on this scale a practical tool for computational neuroscience. PMID:23129998
DOE Office of Scientific and Technical Information (OSTI.GOV)
Doerfler, Douglas; Austin, Brian; Cook, Brandon
There are many potential issues associated with deploying the Intel Xeon Phi™ (code named Knights Landing [KNL]) manycore processor in a large-scale supercomputer. One in particular is the ability to fully utilize the high-speed communications network, given that the serial performance of a Xeon Phi™ core is a fraction of a Xeon® core. In this paper, we take a look at the trade-offs associated with allocating enough cores to fully utilize the Aries high-speed network versus cores dedicated to computation, e.g., the trade-off between MPI and OpenMP. In addition, we evaluate new features of Cray MPI in support of KNL, such as internode optimizations. We also evaluate one-sided programming models such as Unified Parallel C. We quantify the impact of the above trade-offs and features using a suite of National Energy Research Scientific Computing Center applications.
Final Scientific Report: A Scalable Development Environment for Peta-Scale Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karbach, Carsten; Frings, Wolfgang
2013-02-22
This document is the final scientific report of the project DE-SC000120 (A scalable Development Environment for Peta-Scale Computing). The objective of this project is the extension of the Parallel Tools Platform (PTP) for applying it to peta-scale systems. PTP is an integrated development environment for parallel applications. It comprises code analysis, performance tuning, parallel debugging and system monitoring. The contribution of the Juelich Supercomputing Centre (JSC) aims to provide a scalable solution for system monitoring of supercomputers. This includes the development of a new communication protocol for exchanging status data between the target remote system and the client running PTP. The communication has to work for high latency. PTP needs to be implemented robustly and should hide the complexity of the supercomputer's architecture in order to provide transparent access to various remote systems via a uniform user interface. This simplifies the porting of applications to different systems, because PTP functions as an abstraction layer between the parallel application developer and the compute resources. The common requirement for all PTP components is that they have to interact with the remote supercomputer. For example, applications are built remotely, and performance tools are attached to job submissions while their output data resides on the remote system. Status data has to be collected by evaluating outputs of the remote job scheduler, and the parallel debugger needs to control an application executed on the supercomputer. The challenge is to provide this functionality for peta-scale systems in real time. The client-server architecture of the established monitoring application LLview, developed by the JSC, can be applied to PTP's system monitoring. LLview provides a well-arranged overview of the supercomputer's current status. 
A set of statistics, a list of running and queued jobs, and a node display mapping running jobs to their compute resources form the user display of LLview. These monitoring features have to be integrated into the development environment. Besides showing the current status, PTP's monitoring also needs to allow for submitting and canceling user jobs. Monitoring peta-scale systems is especially concerned with presenting the large amount of status data in a useful manner. Users need to be able to select arbitrary levels of detail. The monitoring views have to provide a quick overview of the system state, but also need to allow for zooming into the specific parts of the system in which the user is interested. At present, the major batch systems running on supercomputers are PBS, TORQUE, ALPS and LoadLeveler, which have to be supported by both the monitoring and the job controlling component. Finally, PTP needs to be designed to be as generic as possible, so that it can be extended for future batch systems.
NASA Technical Reports Server (NTRS)
Guruswamy, Guru
2004-01-01
A procedure to accurately generate AIC using the Navier-Stokes solver including grid deformation is presented. Preliminary results show good agreement between experimental and computed flutter boundaries for a rectangular wing. A full wing-body configuration of an orbital space plane is selected for demonstration on a large number of processors. In the final paper the AIC of the full wing-body configuration will be computed. The scalability of the procedure on a supercomputer will be demonstrated.
Research in Computational Astrobiology
NASA Technical Reports Server (NTRS)
Chaban, Galina; Jaffe, Richard; Liang, Shoudan; New, Michael H.; Pohorille, Andrew; Wilson, Michael A.
2002-01-01
We present results from several projects in the new field of computational astrobiology, which is devoted to advancing our understanding of the origin, evolution and distribution of life in the Universe using theoretical and computational tools. We have developed a procedure for calculating long-range effects in molecular dynamics using a plane wave expansion of the electrostatic potential. This method is expected to be highly efficient for simulating biological systems on massively parallel supercomputers. We have performed genomics analysis on a family of actin binding proteins. We have performed quantum mechanical calculations on carbon nanotubes and nucleic acids; these simulations will allow us to investigate possible sources of organic material on the early Earth. Finally, we have developed a model of protobiological chemistry using neural networks.
Final Report for File System Support for Burst Buffers on HPC Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yu, W.; Mohror, K.
Distributed burst buffers are a promising storage architecture for handling I/O workloads for exascale computing. As they are being deployed on more supercomputers, a file system that efficiently manages these burst buffers for fast I/O operations carries great consequence. Over the past year, the FSU team has undertaken several efforts to design, prototype and evaluate distributed file systems for burst buffers on HPC systems. These include MetaKV: a Key-Value Store for Metadata Management of Distributed Burst Buffers, a user-level file system with multiple backends, and a specialized file system for large datasets of deep neural networks. Our progress on these respective efforts is elaborated further in this report.
A Computational framework for telemedicine.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Foster, I.; von Laszewski, G.; Thiruvathukal, G. K.
1998-07-01
Emerging telemedicine applications require the ability to exploit diverse and geographically distributed resources. High-speed networks are used to integrate advanced visualization devices, sophisticated instruments, large databases, archival storage devices, PCs, workstations, and supercomputers. This form of telemedical environment is similar to networked virtual supercomputers, also known as metacomputers. Metacomputers are already being used in many scientific application areas. In this article, we analyze requirements necessary for a telemedical computing infrastructure and compare them with requirements found in a typical metacomputing environment. We will show that metacomputing environments can be used to enable a more powerful and unified computational infrastructure for telemedicine. The Globus metacomputing toolkit can provide the necessary low level mechanisms to enable a large scale telemedical infrastructure. The Globus toolkit components are designed in a modular fashion and can be extended to support the specific requirements for telemedicine.
Research and development targeted at identifying and mitigating Internet security threats require current network data. To fulfill this need... researchers working for the Center for Applied Internet Data Analysis (CAIDA), a program at the San Diego Supercomputer Center (SDSC) which is based at the...vetted network and security researchers using the PREDICT/IMPACT portal and legal framework. We have also contributed to community building efforts that
The ASCI Network for SC '99: A Step on the Path to a 100 Gigabit Per Second Supercomputing Network
DOE Office of Scientific and Technical Information (OSTI.GOV)
PRATT,THOMAS J.; TARMAN,THOMAS D.; MARTINEZ,LUIS M.
2000-07-24
This document highlights the activities of the DISCOM² (Distance Computing and Communication) team at the 1999 Supercomputing conference in Portland, Oregon. This conference is sponsored by the IEEE and ACM. Sandia, Lawrence Livermore and Los Alamos National Laboratories have participated in this conference for eleven years. For the last four years the three laboratories have come together at the conference under the DOE's Accelerated Strategic Computing Initiative (ASCI) rubric. Communication support for the ASCI exhibit is provided by the ASCI DISCOM² project. The DISCOM² communication team uses this forum to demonstrate and focus communication and networking developments within the community. At SC 99, DISCOM² built a prototype of the next generation ASCI network, demonstrated remote clustering techniques, demonstrated the capabilities of the emerging terabit router products, demonstrated the latest technologies for delivering visualization data to the scientific users, and demonstrated the latest in encryption methods including IP VPN technologies and ATM encryption research. The authors also coordinated the other production networking activities within the booth and between their demonstration partners on the exhibit floor. This paper documents those accomplishments, discusses the details of their implementation, and describes how these demonstrations support Sandia's overall strategies in ASCI networking.
Implementing Journaling in a Linux Shared Disk File System
NASA Technical Reports Server (NTRS)
Preslan, Kenneth W.; Barry, Andrew; Brassow, Jonathan; Cattelan, Russell; Manthei, Adam; Nygaard, Erling; VanOort, Seth; Teigland, David; Tilstra, Mike; O'Keefe, Matthew;
2000-01-01
In computer systems today, speed and responsiveness are often determined by network and storage subsystem performance. Faster, more scalable networking interfaces like Fibre Channel and Gigabit Ethernet provide the scaffolding from which higher performance computer systems implementations may be constructed, but new thinking is required about how machines interact with network-enabled storage devices. In this paper we describe how we implemented journaling in the Global File System (GFS), a shared-disk, cluster file system for Linux. Our previous three papers on GFS at the Mass Storage Symposium discussed our first three GFS implementations, their performance, and the lessons learned. Our fourth paper describes, appropriately enough, the evolution of GFS version 3 to version 4, which supports journaling and recovery from client failures. In addition, GFS scalability tests extending to 8 machines accessing 8 4-disk enclosures were conducted; these tests showed good scaling. We describe the GFS cluster infrastructure, which is necessary for proper recovery from machine and disk failures in a collection of machines sharing disks using GFS. Finally, we discuss the suitability of Linux for handling the big data requirements of supercomputing centers.
Use of high performance networks and supercomputers for real-time flight simulation
NASA Technical Reports Server (NTRS)
Cleveland, Jeff I., II
1993-01-01
In order to meet the stringent time-critical requirements for real-time man-in-the-loop flight simulation, computer processing operations must be consistent in processing time and be completed in as short a time as possible. These operations include simulation mathematical model computation and data input/output to the simulators. In 1986, in response to increased demands for flight simulation performance, NASA's Langley Research Center (LaRC), working with the contractor, developed extensions to the Computer Automated Measurement and Control (CAMAC) technology which resulted in a factor of ten increase in the effective bandwidth and reduced latency of modules necessary for simulator communication. This technology extension is being used by more than 80 leading technological developers in the United States, Canada, and Europe. Included among the commercial applications are nuclear process control, power grid analysis, process monitoring, real-time simulation, and radar data acquisition. Personnel at LaRC are completing the development of the use of supercomputers for mathematical model computation to support real-time flight simulation. This includes the development of a real-time operating system and development of specialized software and hardware for the simulator network. This paper describes the data acquisition technology and the development of supercomputing for flight simulation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Murphy, Richard C.
2009-09-01
This report details the accomplishments of the 'Building More Powerful Less Expensive Supercomputers Using Processing-In-Memory (PIM)' LDRD ('PIM LDRD', number 105809) for FY07-FY09. Latency dominates all levels of supercomputer design. Within a node, increasing memory latency, relative to processor cycle time, limits CPU performance. Between nodes, the same increase in relative latency impacts scalability. Processing-In-Memory (PIM) is an architecture that directly addresses this problem using enhanced chip fabrication technology and machine organization. PIMs combine high-speed logic and dense, low-latency, high-bandwidth DRAM, and lightweight threads that tolerate latency by performing useful work during memory transactions. This work examines the potential of PIM-based architectures to support mission critical Sandia applications and an emerging class of more data intensive informatics applications. This work has resulted in a stronger architecture/implementation collaboration between 1400 and 1700. Additionally, key technology components have impacted vendor roadmaps, and we are in the process of pursuing these new collaborations. This work has the potential to impact future supercomputer design and construction, reducing power and increasing performance. This final report is organized as follows: this summary chapter discusses the impact of the project (Section 1), provides an enumeration of publications and other public discussion of the work (Section 1), and concludes with a discussion of future work and impact from the project (Section 1). The appendix contains reprints of the refereed publications resulting from this work.
Constructing Neuronal Network Models in Massively Parallel Environments
Ippen, Tammo; Eppler, Jochen M.; Plesser, Hans E.; Diesmann, Markus
2017-01-01
Recent advances in the development of data structures to represent spiking neuron network models enable us to exploit the complete memory of petascale computers for a single brain-scale network simulation. In this work, we investigate how well we can exploit the computing power of such supercomputers for the creation of neuronal networks. Using an established benchmark, we divide the runtime of simulation code into the phase of network construction and the phase during which the dynamical state is advanced in time. We find that on multi-core compute nodes network creation scales well with process-parallel code but exhibits a prohibitively large memory consumption. Thread-parallel network creation, in contrast, exhibits speedup only up to a small number of threads but has little overhead in terms of memory. We further observe that the algorithms creating instances of model neurons and their connections scale well for networks of ten thousand neurons, but do not show the same speedup for networks of millions of neurons. Our work uncovers that the lack of scaling of thread-parallel network creation is due to inadequate memory allocation strategies and demonstrates that thread-optimized memory allocators recover excellent scaling. An analysis of the loop order used for network construction reveals that more complex tests on the locality of operations significantly improve scaling and reduce runtime by allowing construction algorithms to step through large networks more efficiently than in existing code. The combination of these techniques increases performance by an order of magnitude and harnesses the increasingly parallel compute power of the compute nodes in high-performance clusters and supercomputers. PMID:28559808
DOE Network 2025: Network Research Problems and Challenges for DOE Scientists. Workshop Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
None, None
2016-02-01
The growing investments in large science instruments and supercomputers by the US Department of Energy (DOE) hold enormous promise for accelerating the scientific discovery process. They facilitate unprecedented collaborations of geographically dispersed teams of scientists that use these resources. These collaborations critically depend on the production, sharing, moving, and management of, as well as interactive access to, large, complex data sets at sites dispersed across the country and around the globe. In particular, they call for significant enhancements in network capacities to sustain large data volumes and, equally important, the capabilities to collaboratively access the data across computing, storage, and instrument facilities by science users and automated scripts and systems. Improvements in network backbone capacities of several orders of magnitude are essential to meet these challenges, in particular, to support exascale initiatives. Yet, raw network speed represents only a part of the solution. Indeed, the speed must be matched by network and transport layer protocols and higher layer tools that scale in ways that aggregate, compose, and integrate the disparate subsystems into a complete science ecosystem. Just as important, agile monitoring and management services need to be developed to operate the network at peak performance levels. Finally, these solutions must be made an integral part of the production facilities by using sound approaches to develop, deploy, diagnose, operate, and maintain them over the science infrastructure.
A high performance linear equation solver on the VPP500 parallel supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nakanishi, Makoto; Ina, Hiroshi; Miura, Kenichi
1994-12-31
This paper describes the implementation of two high performance linear equation solvers developed for the Fujitsu VPP500, a distributed memory parallel supercomputer system. The solvers take advantage of the key architectural features of the VPP500: (1) scalability for an arbitrary number of processors up to 222 processors, (2) flexible data transfer among processors provided by a crossbar interconnection network, (3) vector processing capability on each processor, and (4) overlapped computation and transfer. The general linear equation solver based on the blocked LU decomposition method achieves 120.0 GFLOPS performance with 100 processors in the LINPACK Highly Parallel Computing benchmark.
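As an illustration of the blocked LU decomposition method named in this abstract, the following is a minimal serial sketch in Python. It is not the VPP500 implementation: it omits pivoting, vectorization and any distribution across processors, and the function name `blocked_lu` and the `block` parameter are assumptions made for this example only. It shows the three steps a right-looking blocked factorization repeats for each panel: factor the panel, solve for the block row of U, and update the trailing submatrix.

```python
def blocked_lu(a, block=2):
    """Right-looking blocked LU without pivoting (illustrative only).

    Returns (L, U) with L unit lower triangular and U upper triangular
    such that L * U equals the input matrix. Assumes all pivots are
    nonzero, e.g. a diagonally dominant matrix.
    """
    n = len(a)
    a = [row[:] for row in a]  # work on a copy, factors overwrite it
    for k0 in range(0, n, block):
        k1 = min(k0 + block, n)
        # Step 1: factor the panel (columns k0..k1-1) by plain elimination.
        for k in range(k0, k1):
            for i in range(k + 1, n):
                a[i][k] /= a[k][k]  # multiplier, stored in place of L
                for j in range(k + 1, k1):
                    a[i][j] -= a[i][k] * a[k][j]
        # Step 2: block row of U via forward substitution with the panel's L11.
        for k in range(k0, k1):
            for i in range(k + 1, k1):
                for j in range(k1, n):
                    a[i][j] -= a[i][k] * a[k][j]
        # Step 3: trailing update A22 -= L21 * U12 (the matrix-multiply part
        # that a parallel machine would spread over its processors).
        for i in range(k1, n):
            for j in range(k1, n):
                s = 0.0
                for k in range(k0, k1):
                    s += a[i][k] * a[k][j]
                a[i][j] -= s
    lower = [[a[i][j] if j < i else (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    upper = [[a[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return lower, upper
```

Blocking matters because step 3 dominates the work and has the form of a matrix product, which is the part that benefits most from vector units and from overlapping computation with data transfer.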
Coherent Ising machines—optical neural networks operating at the quantum limit
NASA Astrophysics Data System (ADS)
Yamamoto, Yoshihisa; Aihara, Kazuyuki; Leleu, Timothee; Kawarabayashi, Ken-ichi; Kako, Satoshi; Fejer, Martin; Inoue, Kyo; Takesue, Hiroki
2017-12-01
In this article, we introduce the basic concept and the quantum features of a novel computing system, coherent Ising machines, and describe their theoretical and experimental performance. We start with a discussion of how to construct such physical devices as the quantum analog of the classical neuron and synapse, and end with a performance comparison against various classical neural networks implemented on CPUs and supercomputers.
Multi-petascale highly efficient parallel supercomputer
Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; Blumrich, Matthias A.; Boyle, Peter; Brunheroto, Jose R.; Chen, Dong; Cher, Chen-Yong; Chiu, George L.; Christ, Norman; Coteus, Paul W.; Davis, Kristan D.; Dozsa, Gabor J.; Eichenberger, Alexandre E.; Eisley, Noel A.; Ellavsky, Matthew R.; Evans, Kahn C.; Fleischer, Bruce M.; Fox, Thomas W.; Gara, Alan; Giampapa, Mark E.; Gooding, Thomas M.; Gschwind, Michael K.; Gunnels, John A.; Hall, Shawn A.; Haring, Rudolf A.; Heidelberger, Philip; Inglett, Todd A.; Knudson, Brant L.; Kopcsay, Gerard V.; Kumar, Sameer; Mamidala, Amith R.; Marcella, James A.; Megerian, Mark G.; Miller, Douglas R.; Miller, Samuel J.; Muff, Adam J.; Mundy, Michael B.; O'Brien, John K.; O'Brien, Kathryn M.; Ohmacht, Martin; Parker, Jeffrey J.; Poole, Ruth J.; Ratterman, Joseph D.; Salapura, Valentina; Satterfield, David L.; Senger, Robert M.; Smith, Brian; Steinmacher-Burow, Burkhard; Stockdell, William M.; Stunkel, Craig B.; Sugavanam, Krishnan; Sugawara, Yutaka; Takken, Todd E.; Trager, Barry M.; Van Oosten, James L.; Wait, Charles D.; Walkup, Robert E.; Watson, Alfred T.; Wisniewski, Robert W.; Wu, Peng
2015-07-14
A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, with each having full access to all system resources. This enables adaptive partitioning of the processors to functions such as compute or messaging I/O on an application-by-application basis and, preferably, adaptive partitioning of functions in accordance with various algorithmic phases within an application; if I/O or other processors are underutilized, they can participate in computation or communication. Nodes are interconnected by a five-dimensional torus network with DMA that maximizes the throughput of packet communications between nodes and minimizes latency.
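To make the five-dimensional torus interconnect mentioned in this abstract concrete, here is a small Python sketch of torus addressing. The helper names `torus_neighbors` and `torus_hops` and the example dimensions are hypothetical, chosen only for illustration; the patent's actual routing and DMA logic are not described in the abstract and are not modeled here. The sketch shows that each node has two neighbors per dimension, with wraparound links at the edges, and that the hop distance takes the shorter way around each ring.

```python
def torus_neighbors(coord, dims):
    """Return the nearest neighbors of `coord` on a torus of shape `dims`.

    Each axis contributes a -1 and a +1 neighbor; the modulo implements
    the wraparound link that distinguishes a torus from a mesh.
    """
    neighbors = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            nxt = list(coord)
            nxt[axis] = (nxt[axis] + step) % size
            neighbors.append(tuple(nxt))
    return neighbors


def torus_hops(a, b, dims):
    """Minimal hop count between nodes a and b, going the short way per axis."""
    return sum(min((x - y) % s, (y - x) % s) for x, y, s in zip(a, b, dims))
```

For example, with the assumed shape `(4, 4, 4, 4, 3)`, the node `(0, 0, 0, 0, 0)` has ten neighbors, including `(3, 0, 0, 0, 0)` reached over a wraparound link.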
An Implementation Plan for NFS at NASA's NAS Facility
NASA Technical Reports Server (NTRS)
Lam, Terance L.; Kutler, Paul (Technical Monitor)
1998-01-01
This document discusses how NASA's NAS can benefit from Sun Microsystems' Network File System (NFS). A case study is presented to demonstrate the effects of NFS on the NAS supercomputing environment. Potential problems are addressed and an implementation strategy is proposed.
NASA Technical Reports Server (NTRS)
Tennille, Geoffrey M.; Howser, Lona M.
1993-01-01
The use of the CONVEX computers that are an integral part of the Supercomputing Network Subsystems (SNS) of the Central Scientific Computing Complex of LaRC is briefly described. Features of the CONVEX computers that are significantly different from the CRAY supercomputers are covered, including: FORTRAN, C, architecture of the CONVEX computers, the CONVEX environment, batch job submittal, debugging, performance analysis, utilities unique to CONVEX, and documentation. This revision reflects the addition of the Applications Compiler and X-based debugger, CXdb. The document is intended for all CONVEX users as a ready reference to frequently asked questions and to more detailed information contained in the vendor manuals. It is appropriate for both the novice and the experienced user.
Cyberinfrastructure for high energy physics in Korea
NASA Astrophysics Data System (ADS)
Cho, Kihyeon; Kim, Hyunwoo; Jeung, Minho; High Energy Physics Team
2010-04-01
We introduce the hierarchy of cyberinfrastructure, which consists of infrastructure (supercomputing and networks), Grid, e-Science, community and physics from bottom layer to top layer. KISTI is the national headquarters of supercomputing, networks, Grid and e-Science in Korea. Therefore, KISTI is the best place for high energy physicists to use cyberinfrastructure. We explain this concept using the CDF and ALICE experiments. The goal of e-Science is to study high energy physics anytime and anywhere, even when we are not on site at accelerator laboratories. The components are data production, data processing and data analysis. Data production means taking both on-line and off-line shifts remotely. Data processing means running jobs anytime, anywhere using Grid farms. Data analysis means working together to publish papers using a collaborative environment such as the EVO (Enabling Virtual Organization) system. We also present the global community activities of FKPPL (France-Korea Particle Physics Laboratory) and physics as the top layer.
Integrated risk/cost planning models for the US Air Traffic system
NASA Technical Reports Server (NTRS)
Mulvey, J. M.; Zenios, S. A.
1985-01-01
A prototype network planning model for the U.S. Air Traffic control system is described. The model encompasses the dual objectives of managing collision risks and transportation costs where traffic flows can be related to these objectives. The underlying structure is a network graph with nonseparable convex costs; the model is solved efficiently by capitalizing on its intrinsic characteristics. Two specialized algorithms for solving the resulting problems are described: (1) truncated Newton, and (2) simplicial decomposition. The feasibility of the approach is demonstrated using data collected from a control center in the Midwest. Computational results with different computer systems are presented, including a vector supercomputer (CRAY-XMP). The risk/cost model has two primary uses: (1) as a strategic planning tool using aggregate flight information, and (2) as an integrated operational system for forecasting congestion and monitoring (controlling) flow throughout the U.S. In the latter case, access to a supercomputer is required due to the model's enormous size.
Towards the distribution network of time and frequency
NASA Astrophysics Data System (ADS)
Lipiński, M.; Krehlik, P.; Śliwczyński, Ł.; Buczek, Ł.; Kołodziej, J.; Nawrocki, J.; Nogaś, P.; Dunst, P.; Lemański, D.; Czubla, A.; Pieczerak, J.; Adamowicz, W.; Pawszak, T.; Igalson, J.; Binczewski, A.; Bogacki, W.; Ostapowicz, P.; Stroiński, M.; Turza, K.
2014-05-01
In the paper, the genesis, current stage and perspectives of the OPTIME project are described. The main goal of the project is to demonstrate that the technology for fiber-optic transfer of atomic clock reference signals, newly developed at AGH, is ready to be used in building a domestic time and frequency distribution network. In the first part we summarize the two-year continuous operation of the 420 km long link connecting the Laboratory of Time and Frequency at the Central Office of Measures (GUM) in Warsaw and the Time Service Laboratory at the Astrogeodynamic Observatory (AOS) in Borowiec near Poznan. For the first time, we report the two-year comparison of the UTC(PL) and UTC(AOS) atomic timescales over this link, and we relate it to the results of comparisons performed by GPS-based methods. We also address some practical aspects of maintaining time and frequency dissemination over a fiber-optic network. In the second part of the paper, the concept of the general architecture of the distribution network with two Reference Time and Frequency Laboratories and local repositories is proposed. Moreover, a brief outline is given of the second branch, connecting repositories at the Poznan Supercomputing and Networking Center and at Nicolaus Copernicus University in Torun with the first end-users in Torun, such as the National Laboratory of Atomic, Molecular and Optical Physics and the Nicolaus Copernicus Astronomical Center. In the final part, the prospects for developing the network, both domestically and through possible international connections, are presented.
Katouda, Michio; Naruse, Akira; Hirano, Yukihiko; Nakajima, Takahito
2016-11-15
A new parallel algorithm and its implementation for the RI-MP2 energy calculation utilizing peta-flop-class many-core supercomputers are presented. Two improvements over the previous algorithm (J. Chem. Theory Comput. 2013, 9, 5373) have been made: (1) a dual-level hierarchical parallelization scheme that enables the use of more than 10,000 Message Passing Interface (MPI) processes and (2) a new data communication scheme that reduces network communication overhead. A multi-node and multi-GPU implementation of the present algorithm is presented for calculations on a central processing unit (CPU)/graphics processing unit (GPU) hybrid supercomputer. Benchmark results of the new algorithm and its implementation using the K computer (CPU clustering system) and TSUBAME 2.5 (CPU/GPU hybrid system) demonstrate high efficiency. The peak performance of 3.1 PFLOPS is attained using 80,199 nodes of the K computer. The peak performance of the multi-node and multi-GPU implementation is 514 TFLOPS using 1349 nodes and 4047 GPUs of TSUBAME 2.5. © 2016 Wiley Periodicals, Inc.
Collective network for computer structures
Blumrich, Matthias A; Coteus, Paul W; Chen, Dong; Gara, Alan; Giampapa, Mark E; Heidelberger, Philip; Hoenicke, Dirk; Takken, Todd E; Steinmacher-Burow, Burkhard D; Vranas, Pavlos M
2014-01-07
A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network. The global collective network may be configured to provide global barrier and interrupt functionality in asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to the needs of a processing algorithm.
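The collective reduction operations described in this abstract can be pictured with a small simulation. The sketch below assumes a binary tree laid out heap-style (node `i` has parent `(i - 1) // 2`); the abstract does not specify the topology or the router logic, so the tree shape and the function name `tree_allreduce` are assumptions for illustration only. Partial results are combined on the way up to the root, then the final value is handed back to every node.

```python
def tree_allreduce(values, op):
    """Simulate a reduce-then-broadcast over a binary tree of nodes.

    values: one contribution per node, indexed by node id.
    op: associative and commutative binary operation, e.g. addition or max.
    Returns the list of results seen by each node after the broadcast.
    """
    if not values:
        raise ValueError("at least one node is required")
    n = len(values)
    partial = list(values)
    # Upward phase: walk from the highest node id to the lowest so every
    # child has finished combining its subtree before its parent reads it.
    for node in range(n - 1, 0, -1):
        parent = (node - 1) // 2
        partial[parent] = op(partial[parent], partial[node])
    # Downward phase: the root's value is the global result for all nodes.
    return [partial[0]] * n
```

In a tree of this kind the number of combining stages grows with the depth of the tree rather than with the number of nodes, which is why such networks are attractive for low-latency global operations.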
Collective network for computer structures
Blumrich, Matthias A [Ridgefield, CT; Coteus, Paul W [Yorktown Heights, NY; Chen, Dong [Croton On Hudson, NY; Gara, Alan [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Hoenicke, Dirk [Ossining, NY; Takken, Todd E [Brewster, NY; Steinmacher-Burow, Burkhard D [Wernau, DE; Vranas, Pavlos M [Bedford Hills, NY
2011-08-16
A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network and class structures. The global collective network may be configured to provide global barrier and interrupt functionality in asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to needs of a processing algorithm.
Promoting High-Performance Computing and Communications. A CBO Study.
ERIC Educational Resources Information Center
Webre, Philip
In 1991 the Federal Government initiated the multiagency High Performance Computing and Communications program (HPCC) to further the development of U.S. supercomputer technology and high-speed computer network technology. This overview by the Congressional Budget Office (CBO) concentrates on obstacles that might prevent the growth of the…
Design of a neural network simulator on a transputer array
NASA Technical Reports Server (NTRS)
Mcintire, Gary; Villarreal, James; Baffes, Paul; Rua, Monica
1987-01-01
A brief summary of neural networks is presented which concentrates on the design constraints imposed. Major design issues are discussed together with analysis methods and the chosen solutions. Although the system will be capable of running on most transputer architectures, it currently is being implemented on a 40-transputer system connected to a toroidal architecture. Predictions show a performance level equivalent to that of a highly optimized simulator running on the SX-2 supercomputer.
SNS programming environment user's guide
NASA Technical Reports Server (NTRS)
Tennille, Geoffrey M.; Howser, Lona M.; Humes, D. Creig; Cronin, Catherine K.; Bowen, John T.; Drozdowski, Joseph M.; Utley, Judith A.; Flynn, Theresa M.; Austin, Brenda A.
1992-01-01
The computing environment is briefly described for the Supercomputing Network Subsystem (SNS) of the Central Scientific Computing Complex of NASA Langley. The major SNS computers are a CRAY-2, a CRAY Y-MP, a CONVEX C-210, and a CONVEX C-220. The software is described that is common to all of these computers, including: the UNIX operating system, computer graphics, networking utilities, mass storage, and mathematical libraries. Also described are file management, validation, SNS configuration, documentation, and customer services.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aimone, James Bradley; Bernard, Michael Lewis; Vineyard, Craig Michael
2014-10-01
Adult neurogenesis in the hippocampus region of the brain is a neurobiological process that is believed to contribute to the brain's advanced abilities in complex pattern recognition and cognition. Here, we describe how realistic scale simulations of the neurogenesis process can offer both a unique perspective on the biological relevance of this process and confer computational insights that are suggestive of novel machine learning techniques. First, supercomputer based scaling studies of the neurogenesis process demonstrate how a small fraction of adult-born neurons have a uniquely larger impact in biologically realistic scaled networks. Second, we describe a novel technical approach by which the information content of ensembles of neurons can be estimated. Finally, we illustrate several examples of broader algorithmic impact of neurogenesis, including both extending existing machine learning approaches and novel approaches for intelligent sensing.
Networking Technologies Enable Advances in Earth Science
NASA Technical Reports Server (NTRS)
Johnson, Marjory; Freeman, Kenneth; Gilstrap, Raymond; Beck, Richard
2004-01-01
This paper describes an experiment to prototype a new way of conducting science by applying networking and distributed computing technologies to an Earth Science application. A combination of satellite, wireless, and terrestrial networking provided geologists at a remote field site with interactive access to supercomputer facilities at two NASA centers, thus enabling them to validate and calibrate remotely sensed geological data in near-real time. This represents a fundamental shift in the way that Earth scientists analyze remotely sensed data. In this paper we describe the experiment and the network infrastructure that enabled it, analyze the data flow during the experiment, and discuss the scientific impact of the results.
Radio Synthesis Imaging - A High Performance Computing and Communications Project
NASA Astrophysics Data System (ADS)
Crutcher, Richard M.
The National Science Foundation has funded a five-year High Performance Computing and Communications project at the National Center for Supercomputing Applications (NCSA) for the direct implementation of several of the computing recommendations of the Astronomy and Astrophysics Survey Committee (the "Bahcall report"). This paper is a summary of the project goals and a progress report. The project will implement a prototype of the next generation of astronomical telescope systems - remotely located telescopes connected by high-speed networks to very high performance, scalable architecture computers and on-line data archives, which are accessed by astronomers over Gbit/sec networks. Specifically, a data link has been installed between the BIMA millimeter-wave synthesis array at Hat Creek, California and NCSA at Urbana, Illinois for real-time transmission of data to NCSA. Data are automatically archived, and may be browsed and retrieved by astronomers using the NCSA Mosaic software. In addition, an on-line digital library of processed images will be established. BIMA data will be processed on a very high performance distributed computing system, with I/O, user interface, and most of the software system running on the NCSA Convex C3880 supercomputer or Silicon Graphics Onyx workstations connected by HiPPI to the high performance, massively parallel Thinking Machines Corporation CM-5. The very computationally intensive algorithms for calibration and imaging of radio synthesis array observations will be optimized for the CM-5 and new algorithms which utilize the massively parallel architecture will be developed. Code running simultaneously on the distributed computers will communicate using the Data Transport Mechanism developed by NCSA. 
The project will also use the BLANCA Gbit/s testbed network between Urbana and Madison, Wisconsin to connect an Onyx workstation in the University of Wisconsin Astronomy Department to the NCSA CM-5, for development of long-distance distributed computing. Finally, the project is developing 2D and 3D visualization software as part of the international AIPS++ project. This research and development project is being carried out by a team of experts in radio astronomy, algorithm development for massively parallel architectures, high-speed networking, database management, and Thinking Machines Corporation personnel. The development of this complete software, distributed computing, and data archive and library solution to the radio astronomy computing problem will advance our expertise in high performance computing and communications technology and the application of these techniques to astronomical data processing.
The Genie Is Out of the Bottle
ERIC Educational Resources Information Center
Katz, Richard N.
2004-01-01
Starting in the late 1960s, data networks were created to connect supercomputers and, later, other intelligent devices. A revolution in communications was in the making. By the late 1970s, computing and communications technologies were leading us from a world of local markets trading in capital goods to one of global markets trading in capital…
Spiking network simulation code for petascale computers
Kunkel, Susanne; Schmidt, Maximilian; Eppler, Jochen M.; Plesser, Hans E.; Masumoto, Gen; Igarashi, Jun; Ishii, Shin; Fukai, Tomoki; Morrison, Abigail; Diesmann, Markus; Helias, Moritz
2014-01-01
Brain-scale networks exhibit a breathtaking heterogeneity in the dynamical properties and parameters of their constituents. At cellular resolution, the entities of theory are neurons and synapses and over the past decade researchers have learned to manage the heterogeneity of neurons and synapses with efficient data structures. Already early parallel simulation codes stored synapses in a distributed fashion such that a synapse solely consumes memory on the compute node harboring the target neuron. As petaflop computers with some 100,000 nodes become increasingly available for neuroscience, new challenges arise for neuronal network simulation software: Each neuron contacts on the order of 10,000 other neurons and thus has targets only on a fraction of all compute nodes; furthermore, for any given source neuron, at most a single synapse is typically created on any compute node. From the viewpoint of an individual compute node, the heterogeneity in the synaptic target lists thus collapses along two dimensions: the dimension of the types of synapses and the dimension of the number of synapses of a given type. Here we present a data structure taking advantage of this double collapse using metaprogramming techniques. After introducing the relevant scaling scenario for brain-scale simulations, we quantitatively discuss the performance on two supercomputers. We show that the novel architecture scales to the largest petascale supercomputers available today. PMID:25346682
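The collapse in the number of synapses per source neuron described above can be illustrated with a small sketch. This is only a Python analogy under stated assumptions: the abstract refers to metaprogramming techniques that are not detailed here, and the class name `SynapseStore` and the `(target, weight)` tuple layout are hypothetical. The idea shown is that a lone synapse is stored directly, and a list container is allocated only when a second synapse from the same source arrives on this node.

```python
class SynapseStore:
    """Per-node synapse storage that avoids a container for the common single-synapse case."""

    def __init__(self):
        # Maps source neuron id -> one synapse tuple, or a list of synapse tuples.
        self._by_source = {}

    def add(self, source, target, weight):
        synapse = (target, weight)  # hypothetical synapse representation
        current = self._by_source.get(source)
        if current is None:
            # First synapse from this source: store it directly, no list overhead.
            self._by_source[source] = synapse
        elif isinstance(current, list):
            current.append(synapse)
        else:
            # Second synapse: promote the single entry to a list.
            self._by_source[source] = [current, synapse]

    def targets(self, source):
        """Return all local synapses of a source neuron as a list."""
        current = self._by_source.get(source)
        if current is None:
            return []
        return list(current) if isinstance(current, list) else [current]

    def uses_container(self, source):
        """True if this source needed a list, i.e. has more than one local synapse."""
        return isinstance(self._by_source.get(source), list)
```

With some 100,000 compute nodes and on the order of 10,000 targets per neuron, most sources would take the direct-storage path in such a scheme.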
CFD Research, Parallel Computation and Aerodynamic Optimization
NASA Technical Reports Server (NTRS)
Ryan, James S.
1995-01-01
During the last five years, CFD has matured substantially. Pure CFD research remains to be done, but much of the focus has shifted to integration of CFD into the design process. The work under these cooperative agreements reflects this trend. The recent work, and work which is planned, is designed to enhance the competitiveness of the US aerospace industry. CFD and optimization approaches are being developed and tested, so that the industry can better choose which methods to adopt in its design processes. The range of computer architectures has been dramatically broadened, as the assumption that only huge vector supercomputers could be useful has faded. Today, researchers and industry can trade off time, cost, and availability, choosing vector supercomputers, scalable parallel architectures, networked workstations, or heterogeneous combinations of these to complete required computations efficiently.
NASA Technical Reports Server (NTRS)
Cohen, Jarrett
1999-01-01
Parallel computers built out of mass-market parts are cost-effectively performing data processing and simulation tasks. The Supercomputing (now known as "SC") series of conferences celebrated its 10th anniversary last November. While vendors have come and gone, the dominant paradigm for tackling big problems still is a shared-resource, commercial supercomputer. Growing numbers of users needing a cheaper or dedicated-access alternative are building their own supercomputers out of mass-market parts. Such machines are generally called Beowulf-class systems after the 11th century epic. This modern-day Beowulf story began in 1994 at NASA's Goddard Space Flight Center, a laboratory for the Earth and space sciences. Computing managers there threw down a gauntlet to develop a $50,000 gigaFLOPS workstation for processing satellite data sets. Soon, Thomas Sterling and Don Becker were working on the Beowulf concept at the Universities Space Research Association (USRA)-run Center of Excellence in Space Data and Information Sciences (CESDIS). Beowulf clusters mix three primary ingredients: commodity personal computers or workstations, low-cost Ethernet networks, and the open-source Linux operating system. One of the larger Beowulfs is Goddard's Highly-parallel Integrated Virtual Environment, or HIVE for short.
KNBD: A Remote Kernel Block Server for Linux
NASA Technical Reports Server (NTRS)
Becker, Jeff
1999-01-01
I am developing a prototype of a Linux remote disk block server whose purpose is to serve as a lower level component of a parallel file system. Parallel file systems are an important component of high performance supercomputers and clusters. Although supercomputer vendors such as SGI and IBM have their own custom solutions, there has been a void and hence a demand for such a system on Beowulf-type PC clusters. Recently, the Parallel Virtual File System (PVFS) project at Clemson University has begun to address this need (1). Although their system provides much of the functionality of (and indeed was inspired by) the equivalent file systems in the commercial supercomputer market, their system is all in user-space. Migrating their I/O services to the kernel could provide a performance boost, by obviating the need for expensive system calls. Thanks to Pavel Machek, the Linux kernel has provided the network block device (2) with kernels 2.1.101 and later. You can configure this block device to redirect reads and writes to a remote machine's disk. This can be used as a building block for constructing a striped file system across several nodes.
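The striping idea mentioned at the end of this abstract can be sketched as a simple address calculation. The following is a minimal illustration assuming round-robin striping; the function name `locate_block`, the `stripe_blocks` parameter, and the layout itself are assumptions made for this example and are not taken from KNBD or PVFS.

```python
from typing import NamedTuple


class BlockLocation(NamedTuple):
    node: int         # index of the server node that holds the block
    local_block: int  # block number on that node's exported device


def locate_block(logical_block: int, num_nodes: int, stripe_blocks: int = 1) -> BlockLocation:
    """Map a logical block number to (node, local block) under round-robin striping."""
    if num_nodes <= 0 or stripe_blocks <= 0:
        raise ValueError("num_nodes and stripe_blocks must be positive")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    # Which stripe unit the block falls in, and its offset inside that unit.
    stripe_index, offset = divmod(logical_block, stripe_blocks)
    node = stripe_index % num_nodes          # stripe units rotate over the nodes
    local_stripe = stripe_index // num_nodes  # units already placed on that node
    return BlockLocation(node, local_stripe * stripe_blocks + offset)
```

For example, with two nodes and two blocks per stripe unit, logical blocks 0 and 1 land on node 0, blocks 2 and 3 on node 1, and blocks 4 and 5 return to node 0 as its local blocks 2 and 3.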
NASA Technical Reports Server (NTRS)
Stevens, Grady H.
1992-01-01
The Data Distribution Satellite (DDS), operating in conjunction with the planned space network, the National Research and Education Network and its commercial derivatives, would play a key role in networking the emerging supercomputing facilities, national archives, academic, industrial, and government institutions. Centrally located over the United States in geostationary orbit, DDS would carry sophisticated on-board switching and make use of advanced antennas to provide an array of special services. Institutions needing continuous high data rate service would be networked together by use of a microwave switching matrix and electronically steered hopping beams. Simultaneously, DDS would use other beams and on-board processing to interconnect other institutions with lesser, low rate, intermittent needs. Dedicated links to White Sands and other facilities would enable direct access to space payloads and sensor data. Intersatellite links to a second generation ATDRS, called Advanced Space Data Acquisition and Communications System (ASDACS), would eliminate one satellite hop and enhance controllability of experimental payloads by reducing path delay. Similarly, direct access would be available to the supercomputing facilities and national data archives. Economies with DDS would be derived from its ability to switch high rate facilities among users as needed. At the same time, having a CONUS view, DDS would interconnect with any institution regardless of how remote. Whether one needed high rate service or low rate service would be immaterial. With the capability to assign resources on demand, DDS would need to carry only a portion of the resources that dedicated facilities would require. Efficiently switching resources to users as needed, DDS would become a very feasible spacecraft, even though it would tie together the space network, the terrestrial network, remote sites, thousands of small users, and those few who need very large data links intermittently.
Measurements over distributed high performance computing and storage systems
NASA Technical Reports Server (NTRS)
Williams, Elizabeth; Myers, Tom
1993-01-01
A strawman proposal is given for a framework for presenting a common set of metrics for supercomputers, workstations, file servers, mass storage systems, and the networks that interconnect them. Production control and database systems are also included. Though other applications and third-party software systems are not addressed, it is important to measure them as well.
Bhanot, Gyan [Princeton, NJ; Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Takken, Todd E [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY
2009-09-08
Class network routing is implemented in a network such as a computer network comprising a plurality of parallel compute processors at nodes thereof. Class network routing allows a compute processor to broadcast a message to a range (one or more) of other compute processors in the computer network, such as processors in a column or a row. Normally this type of operation requires a separate message to be sent to each processor. With class network routing pursuant to the invention, a single message is sufficient, which generally reduces the total number of messages in the network as well as the latency to do a broadcast. Class network routing is also applied to dense matrix inversion algorithms on distributed memory parallel supercomputers with hardware class function (multicast) capability. This is achieved by exploiting the fact that the communication patterns of dense matrix inversion can be served by hardware class functions, which results in faster execution times.
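The message-count saving claimed in this abstract can be illustrated with a toy model. The sketch below assumes a rectangular grid of processors addressed as (row, column) and a "row" class that reaches every other processor in the sender's row; the function names and the grid model are assumptions for illustration, not the patented mechanism.

```python
def row_members(source, num_cols):
    """All processors in the same row as `source`, excluding `source` itself."""
    row, col = source
    return [(row, c) for c in range(num_cols) if c != col]


def messages_without_class_routing(source, num_cols):
    """One separate point-to-point message is injected per destination."""
    return len(row_members(source, num_cols))


def messages_with_class_routing(source, num_cols):
    """A single class message is injected; the network fans it out along the row."""
    return 1 if row_members(source, num_cols) else 0
```

In this model a row broadcast on a grid with 32 columns needs 31 injected messages without class routing and one with it.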
NASA Technical Reports Server (NTRS)
Parker, Jay W.; Cwik, Tom; Ferraro, Robert D.; Liewer, Paulett C.; Patterson, Jean E.
1991-01-01
The JPL designed MARKIII hypercube supercomputer has been in application service since June 1988 and has had successful application to a broad problem set including electromagnetic scattering, discrete event simulation, plasma transport, matrix algorithms, neural network simulation, image processing, and graphics. Currently, problems that are not homogeneous are being attempted, and, through this involvement with real world applications, the software is evolving to handle the heterogeneous class problems efficiently.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sreepathi, Sarat; D'Azevedo, Eduardo; Philip, Bobby
On large supercomputers, the job scheduling systems may assign a non-contiguous node allocation for user applications depending on available resources. With parallel applications using MPI (Message Passing Interface), the default process ordering does not take into account the actual physical node layout available to the application. This contributes to non-locality in terms of physical network topology and impacts communication performance of the application. In order to mitigate such performance penalties, this work describes techniques to identify suitable task mappings that take the layout of the allocated nodes as well as the application's communication behavior into account. During the first phase of this research, we instrumented and collected performance data to characterize communication behavior of critical US DOE (United States - Department of Energy) applications using an augmented version of the mpiP tool. Subsequently, we developed several reordering methods (spectral bisection, neighbor join tree etc.) to combine node layout and application communication data for optimized task placement. We developed a tool called mpiAproxy to facilitate detailed evaluation of the various reordering algorithms without requiring full application executions. This work presents a comprehensive performance evaluation (14,000 experiments) of the various task mapping techniques in lowering communication costs on Titan, the leadership class supercomputer at Oak Ridge National Laboratory.
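Why task placement matters on a non-contiguous allocation can be shown with a small cost model. The sketch below is not one of the reordering methods named in the abstract: it uses an exhaustive search, which is only practical for a handful of tasks, and a "bytes times hop distance" cost that is an assumption chosen for illustration. The names `hop_bytes` and `best_mapping` are hypothetical.

```python
from itertools import permutations


def hop_bytes(comm, dist, mapping):
    """Sum over task pairs of bytes exchanged times hop distance between their nodes.

    comm[i][j]  -- bytes exchanged between tasks i and j (symmetric)
    dist[a][b]  -- hop distance between allocated nodes a and b
    mapping[i]  -- index of the node that task i is placed on
    """
    n = len(comm)
    return sum(
        comm[i][j] * dist[mapping[i]][mapping[j]]
        for i in range(n)
        for j in range(i + 1, n)
    )


def best_mapping(comm, dist):
    """Exhaustively search all placements (small inputs only) for the lowest cost."""
    n = len(comm)
    return min(permutations(range(n)), key=lambda m: hop_bytes(comm, dist, m))


# Four allocated nodes in two distant pairs (positions along one network dimension).
positions = [0, 1, 8, 9]
dist = [[abs(a - b) for b in positions] for a in positions]

# Tasks 0 and 2 talk heavily, as do tasks 1 and 3.
comm = [
    [0, 0, 100, 0],
    [0, 0, 0, 100],
    [100, 0, 0, 0],
    [0, 100, 0, 0],
]

default_cost = hop_bytes(comm, dist, (0, 1, 2, 3))
tuned_cost = hop_bytes(comm, dist, best_mapping(comm, dist))
```

Here the default rank order separates each communicating pair by eight hops, while the reordered placement puts each pair on adjacent nodes.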
Associative Memories for Supercomputers
1992-12-01
the Si/PLZT technology. Finally, the associative memory system design is presented. 14. SUBJECT TERMS IS NUMBER OF PAGES 60 Memory, Associative Memory...Hybrid lens design ...................................................................... 3 3. ASSOCIATIVE MEMORY STUDY...of California, San Diego 1. OBJECTIVES Our objective during the funding period, July 14 1989 to January 13 1991, was to design and study the
Comprehensive efficiency analysis of supercomputer resource usage based on system monitoring data
NASA Astrophysics Data System (ADS)
Mamaeva, A. A.; Shaykhislamov, D. I.; Voevodin, Vad V.; Zhumatiy, S. A.
2018-03-01
One of the main problems of modern supercomputers is the low efficiency of their usage, which leads to significant idle time of computational resources and, in turn, slows the pace of scientific research. This paper presents three approaches to studying the efficiency of supercomputer resource usage based on monitoring data analysis. The first approach analyzes computing resource utilization statistics, which makes it possible to identify typical classes of programs, to explore the structure of the supercomputer job flow, and to track overall trends in supercomputer behavior. The second approach is aimed specifically at analyzing off-the-shelf software packages and libraries installed on the supercomputer, since the efficiency of their usage is becoming an increasingly important factor for the efficient functioning of the entire supercomputer. The third approach detects abnormal jobs (jobs with abnormally inefficient behavior that differs significantly from the standard behavior of the overall supercomputer job flow). For each approach, the results obtained in practice at the Supercomputer Center of Moscow State University are demonstrated.
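The third approach, flagging jobs whose behavior differs strongly from the overall job flow, can be sketched with a basic outlier test. The abstract does not say which detection method is used, so the z-score rule, the threshold of 2.0, the function name, and the sample job data below are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def abnormally_inefficient_jobs(cpu_load_by_job, threshold=2.0):
    """Return ids of jobs whose average CPU load lies far below the job-flow mean.

    cpu_load_by_job -- dict mapping job id to average CPU load in [0, 1]
    threshold       -- how many standard deviations below the mean counts as abnormal
    """
    loads = list(cpu_load_by_job.values())
    if len(loads) < 2:
        return []
    mu = mean(loads)
    sigma = pstdev(loads)
    if sigma == 0:
        return []  # all jobs behave identically, nothing stands out
    return [
        job_id
        for job_id, load in cpu_load_by_job.items()
        if (load - mu) / sigma < -threshold
    ]


# Hypothetical monitoring data: one job is almost idle.
sample = {
    "job-1": 0.82,
    "job-2": 0.78,
    "job-3": 0.85,
    "job-4": 0.80,
    "job-5": 0.05,
    "job-6": 0.79,
}
flagged = abnormally_inefficient_jobs(sample)
```

A production detector would likely combine several monitored metrics rather than CPU load alone; a single metric is used here to keep the rule visible.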
Impact of the Columbia Supercomputer on NASA Space and Exploration Mission
NASA Technical Reports Server (NTRS)
Biswas, Rupak; Kwak, Dochan; Kiris, Cetin; Lawrence, Scott
2006-01-01
NASA's 10,240-processor Columbia supercomputer gained worldwide recognition in 2004 for increasing the space agency's computing capability ten-fold, and enabling U.S. scientists and engineers to perform significant, breakthrough simulations. Columbia has amply demonstrated its capability to accelerate NASA's key missions, including space operations, exploration systems, science, and aeronautics. Columbia is part of an integrated high-end computing (HEC) environment comprised of massive storage and archive systems, high-speed networking, high-fidelity modeling and simulation tools, application performance optimization, and advanced data analysis and visualization. In this paper, we illustrate the impact Columbia is having on NASA's numerous space and exploration applications, such as the development of the Crew Exploration and Launch Vehicles (CEV/CLV), effects of long-duration human presence in space, and damage assessment and repair recommendations for remaining shuttle flights. We conclude by discussing HEC challenges that must be overcome to solve space-related science problems in the future.
Large-scale functional models of visual cortex for remote sensing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brumby, Steven P; Kenyon, Garrett; Rasmussen, Craig E
Neuroscience has revealed many properties of neurons and of the functional organization of visual cortex that are believed to be essential to human vision, but are missing in standard artificial neural networks. Equally important may be the sheer scale of visual cortex, requiring ~1 petaflop of computation. In a year, the retina delivers ~1 petapixel to the brain, leading to massively large opportunities for learning at many levels of the cortical system. We describe work at Los Alamos National Laboratory (LANL) to develop large-scale functional models of visual cortex on LANL's Roadrunner petaflop supercomputer. An initial run of a simple region V1 code achieved 1.144 petaflops during trials at the IBM facility in Poughkeepsie, NY (June 2008). Here, we present criteria for assessing when a set of learned local representations is 'complete' along with general criteria for assessing computer vision models based on their projected scaling behavior. Finally, we extend one class of biologically-inspired learning models to problems of remote sensing imagery.
Optimization of Supercomputer Use on EADS II System
NASA Technical Reports Server (NTRS)
Ahmed, Ardsher
1998-01-01
The main objective of this research was to optimize supercomputer use to achieve better throughput and utilization of supercomputers and to help facilitate the movement of non-supercomputing (inappropriate for supercomputer) codes to mid-range systems for better use of Government resources at Marshall Space Flight Center (MSFC). This work involved the survey of architectures available on EADS II and monitoring customer (user) applications running on a CRAY T90 system.
Supercomputer applications in molecular modeling.
Gund, T M
1988-01-01
An overview of the functions performed by molecular modeling is given. Molecular modeling techniques benefiting from supercomputing are described, namely, conformation search, deriving bioactive conformations, pharmacophoric pattern searching, receptor mapping, and electrostatic properties. The use of supercomputers for problems that are computationally intensive, such as protein structure prediction, protein dynamics and reactivity, protein conformations, and energetics of binding is also examined. The current status of supercomputing and supercomputer resources is discussed.
Climate@Home: Crowdsourcing Climate Change Research
NASA Astrophysics Data System (ADS)
Xu, C.; Yang, C.; Li, J.; Sun, M.; Bambacus, M.
2011-12-01
Climate change deeply impacts human wellbeing. Significant resources have been invested in building super-computers that are capable of running advanced climate models, which help scientists understand climate change mechanisms and predict its trend. Although climate change influences all human beings, the general public is largely excluded from the research. On the other hand, scientists are eagerly seeking communication media for effectively enlightening the public on climate change and its consequences. The Climate@Home project is devoted to connecting the two ends with an innovative solution: crowdsourcing climate computing to the general public by harvesting volunteered computing resources from the participants. A distributed web-based computing platform will be built to support climate computing, and the general public can 'plug in' their personal computers to participate in the research. People contribute the spare computing power of their computers to run a computer model, which is used by scientists to predict climate change. Traditionally, only super-computers could handle such a large processing load. By orchestrating large numbers of personal computers to perform atomized data processing tasks, investments in new super-computers, energy consumed by super-computers, and carbon release from super-computers are reduced. Meanwhile, the platform forms a social network of climate researchers and the general public, which may be leveraged to raise climate awareness among the participants. A portal is to be built as the gateway to the Climate@Home project. Three types of roles and the corresponding functionalities are designed and supported. The end users include the citizen participants, climate scientists, and project managers. Citizen participants connect their computing resources to the platform by downloading and installing a computing engine on their personal computers. Computer climate models are defined at the server side.
Climate scientists configure computer model parameters through the portal user interface. After model configuration, scientists launch the computing task. Next, data is atomized and distributed to computing engines that are running on citizen participants' computers. Scientists receive notifications on the completion of computing tasks and examine modeling results via visualization modules of the portal. Computing tasks, computing resources, and participants are managed by project managers via portal tools. A portal prototype has been built for proof of concept. Three forums have been set up for different groups of users to share information on the science, technology, and educational outreach aspects. A Facebook account has been set up to distribute messages via the most popular social networking platform. New threads are synchronized from the forums to Facebook. A mapping tool displays geographic locations of the participants and the status of tasks on each client node. A group of users has been invited to test functions such as forums, blogs, and computing resource monitoring.
Compiler and Runtime Support for Programming in Adaptive Parallel Environments
1998-10-15
no other job is waiting for resources, and use a smaller number of processors when other jobs need resources. Setia et al. [15, 20] have shown that such... [15] Vijay K. Naik, Sanjeev Setia, and Mark Squillante. Performance analysis of job scheduling policies in parallel supercomputing environments. In... on networks of heterogeneous workstations. Technical Report CSE-94-012, Oregon Graduate Institute of Science and Technology, 1994. [20] Sanjeev Setia
High performance computing for advanced modeling and simulation of materials
NASA Astrophysics Data System (ADS)
Wang, Jue; Gao, Fei; Vazquez-Poletti, Jose Luis; Li, Jianjiang
2017-02-01
The First International Workshop on High Performance Computing for Advanced Modeling and Simulation of Materials (HPCMS2015) was held in Austin, Texas, USA, Nov. 18, 2015. HPCMS 2015 was organized by Computer Network Information Center (Chinese Academy of Sciences), University of Michigan, Universidad Complutense de Madrid, University of Science and Technology Beijing, Pittsburgh Supercomputing Center, China Institute of Atomic Energy, and Ames Laboratory.
The role of graphics super-workstations in a supercomputing environment
NASA Technical Reports Server (NTRS)
Levin, E.
1989-01-01
A new class of very powerful workstations has recently become available which integrate near-supercomputer computational performance with very powerful and high quality graphics capability. These graphics super-workstations are expected to play an increasingly important role in providing an enhanced environment for supercomputer users. Their potential uses include: off-loading the supercomputer (by serving as stand-alone processors, by post-processing of the output of supercomputer calculations, and by distributed or shared processing), scientific visualization (understanding of results, communication of results), and real-time interaction with the supercomputer (to steer an iterative computation, to abort a bad run, or to explore and develop new algorithms).
48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.
Code of Federal Regulations, 2010 CFR
2010-10-01
... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...
48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.
Code of Federal Regulations, 2014 CFR
2014-10-01
... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...
48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.
Code of Federal Regulations, 2012 CFR
2012-10-01
... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...
48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.
Code of Federal Regulations, 2013 CFR
2013-10-01
... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...
48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.
Code of Federal Regulations, 2011 CFR
2011-10-01
... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...
NASA Technical Reports Server (NTRS)
Salmon, Ellen
1996-01-01
The data storage and retrieval demands of space and Earth sciences researchers have made the NASA Center for Computational Sciences (NCCS) Mass Data Storage and Delivery System (MDSDS) one of the world's most active Convex UniTree systems. Science researchers formed the NCCS's Computer Environments and Research Requirements Committee (CERRC) to relate their projected supercomputing and mass storage requirements through the year 2000. Using the CERRC guidelines and observations of current usage, some detailed projections of requirements for MDSDS network bandwidth and mass storage capacity and performance are presented.
High Efficiency Photonic Switch for Data Centers
DOE Office of Scientific and Technical Information (OSTI.GOV)
LaComb, Lloyd J.; Bablumyan, Arkady; Ordyan, Armen
2016-12-06
The worldwide demand for instant access to information is driving internet growth rates above 50% annually. This rapid growth is straining the resources and architectures of existing data centers, metro networks and high performance computer centers. If the current business-as-usual model continues, data centers alone will require 400 TWh of electricity by 2020. In order to meet the challenges of faster and more cost-effective data centers, metro networks and supercomputing facilities, we have demonstrated a new type of optical switch that will support transmission speeds up to 1 Tb/s, and requires significantly less energy per bit than
Simulation of unsteady flow and solute transport in a tidal river network
Zhan, X.
2003-01-01
A mathematical model and numerical method for water flow and solute transport in a tidal river network are presented. The tidal river network is defined as a system of open channels of rivers with junctions and cross sections. As an example, the Pearl River in China is represented by a network of 104 channels, 62 nodes, and a total of 330 cross sections with 11 boundary sections for one of the applications. The simulations are performed with a supercomputer for seven scenarios of water flow and/or solute transport in the Pearl River, China, with different hydrological and weather conditions. Comparisons with available data are shown. The intention of this study is to summarize previous works and to provide a useful tool for water environmental management in a tidal river network, particularly for the Pearl River, China.
Data-intensive computing on numerically-insensitive supercomputers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahrens, James P; Fasel, Patricia K; Habib, Salman
2010-12-03
With the advent of the era of petascale supercomputing, via the delivery of the Roadrunner supercomputing platform at Los Alamos National Laboratory, there is a pressing need to address the problem of visualizing massive petascale-sized results. In this presentation, I discuss progress on a number of approaches including in-situ analysis, multi-resolution out-of-core streaming and interactive rendering on the supercomputing platform. These approaches are placed in context by the emerging area of data-intensive supercomputing.
The Science DMZ: A Network Design Pattern for Data-Intensive Science
Dart, Eli; Rotman, Lauren; Tierney, Brian; ...
2014-01-01
The ever-increasing scale of scientific data has become a significant challenge for researchers who rely on networks to interact with remote computing systems and transfer results to collaborators worldwide. Despite the availability of high-capacity connections, scientists struggle with inadequate cyberinfrastructure that cripples data transfer performance and impedes scientific progress. The Science DMZ paradigm comprises a proven set of network design patterns that collectively address these problems for scientists. We explain the Science DMZ model, including network architecture, system configuration, cybersecurity, and performance tools, that creates an optimized network environment for science. We describe use cases from universities, supercomputing centers and research laboratories, highlighting the effectiveness of the Science DMZ model in diverse operational settings. In all, the Science DMZ model is a solid platform that supports any science workflow, and flexibly accommodates emerging network technologies. As a result, the Science DMZ vastly improves collaboration, accelerating scientific discovery.
Computer Electromagnetics and Supercomputer Architecture
NASA Technical Reports Server (NTRS)
Cwik, Tom
1993-01-01
The dramatic increase in performance over the last decade for microprocessor computations is compared with that for supercomputer computations. This performance, the projected performance, and a number of other issues such as cost and the inherent physical limitations in current supercomputer technology have naturally led to parallel supercomputers and ensembles of interconnected microprocessors.
Edison - A New Cray Supercomputer Advances Discovery at NERSC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dosanjh, Sudip; Parkinson, Dula; Yelick, Kathy
2014-02-06
When a supercomputing center installs a new system, users are invited to make heavy use of the computer as part of the rigorous testing. In this video, find out what top scientists have discovered using Edison, a Cray XC30 supercomputer, and how NERSC's newest supercomputer will accelerate their future research.
Edison - A New Cray Supercomputer Advances Discovery at NERSC
Dosanjh, Sudip; Parkinson, Dula; Yelick, Kathy; Trebotich, David; Broughton, Jeff; Antypas, Katie; Lukic, Zarija; Borrill, Julian; Draney, Brent; Chen, Jackie
2018-01-16
When a supercomputing center installs a new system, users are invited to make heavy use of the computer as part of the rigorous testing. In this video, find out what top scientists have discovered using Edison, a Cray XC30 supercomputer, and how NERSC's newest supercomputer will accelerate their future research.
NASA Technical Reports Server (NTRS)
Yan, Jerry C.; Jespersen, Dennis; Buning, Peter; Bailey, David (Technical Monitor)
1996-01-01
The Gordon Bell Prizes given out at Supercomputing every year include at least two categories: performance (highest GFLOP count) and price-performance (GFLOP/million $) for real applications. In the past five years, the winners of the price-performance category all came from networks of workstations. This reflects three important facts: 1. supercomputers are still too expensive for the masses; 2. achieving high performance for real applications takes real work; and, most importantly, 3. it is possible to obtain acceptable performance for certain real applications on networks of workstations. With the continued advance of network technology as well as increased performance of "desktop" workstations, the "Swarm of Ants vs. Herd of Elephants" debate, which began with vector multiprocessors (VPPs) against SIMD-type multiprocessors (e.g. CM2), is now recast as VPPs against Symmetric Multiprocessors (SMPs, e.g. SGI PowerChallenge). This paper reports on performance studies we performed solving a large-scale (2-million grid points) CFD problem involving a Boeing 747 based on a parallel version of OVERFLOW that utilizes message passing on PVM. A performance monitoring tool developed under NASA HPCC, called AIMS, was used to instrument and analyze the performance data thus obtained. We plan to compare its performance data across a wide spectrum of architectures, ranging from the Cray C90, IBM/SP2, and SGI/Power Challenge Cluster to a group of workstations connected over a simple network. The metrics of comparison include speed-up, price-performance, throughput, and turn-around time. We also plan to present a plan of attack for various issues that will make the execution of Grand Challenge Applications across the Global Information Infrastructure a reality.
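The comparison metrics named in this abstract can be written down in a few lines. The following is a minimal sketch, not the authors' tooling: the function names are hypothetical, and price-performance is taken to mean sustained GFLOP/s divided by system cost in millions of dollars, which is how the abstract's "GFLOP/million $" is read here.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-processor run time to parallel run time."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial, t_parallel, n_procs):
    """Speed-up divided by the number of processors used (1.0 is ideal)."""
    return speedup(t_serial, t_parallel) / n_procs


def price_performance(sustained_gflops, system_cost_dollars):
    """Sustained GFLOP/s per million dollars of system cost (assumed definition)."""
    return sustained_gflops / (system_cost_dollars / 1e6)


# Hypothetical numbers, for illustration only.
print(speedup(100.0, 25.0))                 # run time drops from 100 s to 25 s
print(parallel_efficiency(100.0, 25.0, 8))  # on 8 processors
print(price_performance(2.0, 500_000))      # 2 GFLOP/s on a $500,000 system
```

Throughput and turn-around time depend on the job mix and queueing policy of each site, so they are left out of the sketch.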
48 CFR 225.7012 - Restriction on supercomputers.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 48 Federal Acquisition Regulations System 3 2014-10-01 2014-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...
48 CFR 225.7012 - Restriction on supercomputers.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 48 Federal Acquisition Regulations System 3 2010-10-01 2010-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...
48 CFR 225.7012 - Restriction on supercomputers.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 48 Federal Acquisition Regulations System 3 2013-10-01 2013-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...
48 CFR 225.7012 - Restriction on supercomputers.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 48 Federal Acquisition Regulations System 3 2011-10-01 2011-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...
48 CFR 225.7012 - Restriction on supercomputers.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 48 Federal Acquisition Regulations System 3 2012-10-01 2012-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...
Effects of the oceans on polar motion: Extended investigations
NASA Technical Reports Server (NTRS)
Dickman, Steven R.
1987-01-01
Matrix formulation of the tide equations (pole tide in nonglobal oceans); matrix formulation of the associated boundary conditions (constraints on the tide velocity at coastlines); and FORTRAN encoding of the tide equations excluding boundary conditions were completed. The need for supercomputer facilities was evident. Large versions of the programs were successfully run on the CYBER, submitting the jobs from SUNY through the BITNET network. The code was also restructured to include boundary constraints.
Mapping Flows onto Networks to Optimize Organizational Processes
2005-01-01
And G. Porter, “Assessments of Simulated Performance of Alternative Architectures for Command and Control: The Role of Coordination”, Proceedings of...the 1999 Command & Control Research & Technology Symposium, NWC, Newport, RI, June 1999, pp. 123-143. [Iverson95] M. Iverson, F. Ozguner, G. Follen...Technology Symposium, NPS, Monterey, CA, June, 2002. [Wu88] Min-You Wu, D. Gajski. “A Programming Aid for Hypercube Architectures.” The Journal of Supercomputing, 2(1988), pp. 349-372.
Web-based system for surgical planning and simulation
NASA Astrophysics Data System (ADS)
Eldeib, Ayman M.; Ahmed, Mohamed N.; Farag, Aly A.; Sites, C. B.
1998-10-01
Growing scientific knowledge and rapid progress in medical imaging techniques have led to an increasing demand for better and more efficient methods of remote access to high-performance computer facilities. This paper introduces a web-based telemedicine project that provides interactive tools for surgical simulation and planning. The presented approach makes use of a client-server architecture based on new internet technology, where clients use an ordinary web browser to view, send, receive and manipulate patients' medical records while the server uses the supercomputer facility to generate online semi-automatic segmentation, 3D visualization, surgical simulation/planning and neuroendoscopic procedures navigation. The supercomputer (SGI ONYX 1000) is located at the Computer Vision and Image Processing Lab, University of Louisville, Kentucky. This system is under development in cooperation with the Department of Neurological Surgery, Alliant Health Systems, Louisville, Kentucky. The server is connected via a network to the Picture Archiving and Communication System at Alliant Health Systems through a DICOM standard interface that enables authorized clients to access patients' images from different medical modalities.
Collaborative Supercomputing for Global Change Science
NASA Astrophysics Data System (ADS)
Nemani, R.; Votava, P.; Michaelis, A.; Melton, F.; Milesi, C.
2011-03-01
There is increasing pressure on the science community not only to understand how recent and projected changes in climate will affect Earth's global environment and the natural resources on which society depends but also to design solutions to mitigate or cope with the likely impacts. Responding to this multidimensional challenge requires new tools and research frameworks that assist scientists in collaborating to rapidly investigate complex interdisciplinary science questions of critical societal importance. One such collaborative research framework, within the NASA Earth sciences program, is the NASA Earth Exchange (NEX). NEX combines state-of-the-art supercomputing, Earth system modeling, remote sensing data from NASA and other agencies, and a scientific social networking platform to deliver a complete work environment. In this platform, users can explore and analyze large Earth science data sets, run modeling codes, collaborate on new or existing projects, and share results within or among communities (see Figure S1 in the online supplement to this Eos issue (http://www.agu.org/eos_elec)).
Space Transportation and the Computer Industry: Learning from the Past
NASA Technical Reports Server (NTRS)
Merriam, M. L.; Rasky, D.
2002-01-01
Since the space shuttle began flying in 1981, NASA has made a number of attempts to advance the state of the art in space transportation. In spite of billions of dollars invested, and several concerted attempts, no replacement for the shuttle is expected before 2010. Furthermore, the cost of access to space has dropped very slowly over the last two decades. On the other hand, the same two decades have seen dramatic progress in the computer industry. Computational speeds have increased by about a factor of 1000, and available memory, disk space, and network bandwidth have seen similar increases. At the same time, the cost of computing has dropped by about a factor of 10000. Is the space transportation problem simply harder? Or is there something to be learned from the computer industry? In looking for the answers, this paper reviews the early history of NASA's experience with supercomputers and NASA's visionary course change in supercomputer procurement strategy.
TOP500 Supercomputers for June 2004
DOE Office of Scientific and Technical Information (OSTI.GOV)
Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack
2004-06-23
23rd Edition of TOP500 List of World's Fastest Supercomputers Released: Japan's Earth Simulator Enters Third Year in Top Position. MANNHEIM, Germany; KNOXVILLE, Tenn.; & BERKELEY, Calif. In what has become a closely watched event in the world of high-performance computing, the 23rd edition of the TOP500 list of the world's fastest supercomputers was released today (June 23, 2004) at the International Supercomputer Conference in Heidelberg, Germany.
Automotive applications of superconductors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ginsberg, M.
1987-01-01
These proceedings compile papers on supercomputers in the automobile industry. Titles include: An automotive engineer's guide to the effective use of scalar, vector, and parallel computers; fluid mechanics, finite elements, and supercomputers; and Automotive crashworthiness performance on a supercomputer.
Improved Access to Supercomputers Boosts Chemical Applications.
ERIC Educational Resources Information Center
Borman, Stu
1989-01-01
Supercomputing is described in terms of computing power and abilities. The increase in availability of supercomputers for use in chemical calculations and modeling is reported. Efforts of the National Science Foundation and Cray Research are highlighted. (CW)
Predicting Cost/Performance Trade-Offs for Whitney: A Commodity Computing Cluster
NASA Technical Reports Server (NTRS)
Becker, Jeffrey C.; Nitzberg, Bill; VanderWijngaart, Rob F.; Kutler, Paul (Technical Monitor)
1997-01-01
Recent advances in low-end processor and network technology have made it possible to build a "supercomputer" out of commodity components. We develop simple models of the NAS Parallel Benchmarks version 2 (NPB 2) to explore the cost/performance trade-offs involved in building a balanced parallel computer supporting a scientific workload. We develop closed form expressions detailing the number and size of messages sent by each benchmark. Coupling these with measured single processor performance, network latency, and network bandwidth, our models predict benchmark performance to within 30%. A comparison based on total system cost reveals that current commodity technology (200 MHz Pentium Pros with 100baseT Ethernet) is well balanced for the NPBs up to a total system cost of around $1,000,000.
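The kind of model described here, which couples message counts and sizes with measured processor speed, latency, and bandwidth, can be illustrated with a simple latency-bandwidth formula. This is a minimal sketch under stated assumptions, not the authors' closed-form expressions: it assumes all messages have the same size, that communication does not overlap with computation, and that the function name and every number in the usage lines are hypothetical.

```python
def predicted_run_time(work_flops, flops_per_sec,
                       n_messages, message_bytes,
                       latency_sec, bandwidth_bytes_per_sec):
    """Estimate per-processor run time as compute time plus communication time.

    Each message is charged a fixed latency plus its size divided by the
    network bandwidth. No overlap of computation and communication is assumed.
    """
    compute_time = work_flops / flops_per_sec
    comm_time = n_messages * (latency_sec + message_bytes / bandwidth_bytes_per_sec)
    return compute_time + comm_time


# Hypothetical inputs: 1e9 floating-point operations at 1e8 flop/s,
# 1000 messages of 10 kB each, 100 microseconds latency, 10 MB/s bandwidth.
t = predicted_run_time(1e9, 1e8, 1000, 1e4, 1e-4, 1e7)
print(f"predicted time: {t:.2f} s")
```

A model of this shape makes the balance question concrete: if the communication term dominates, spending more on processors buys little, and if the compute term dominates, a faster network is wasted money.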
NASA Astrophysics Data System (ADS)
Davis, G. A.; Battistuz, B.; Foley, S.; Vernon, F. L.; Eakins, J. A.
2009-12-01
Since April 2004 the Earthscope USArray Transportable Array (TA) network has grown to over 400 broadband seismic stations that stream multi-channel data in near real-time to the Array Network Facility in San Diego. In total, over 1.7 terabytes per year of 24-bit, 40 samples-per-second seismic and state-of-health data is recorded from the stations. The ANF provides analysts access to real-time and archived data, as well as state-of-health data, metadata, and interactive tools for station engineers and the public via a website. Additional processing and recovery of missing data from on-site recorders (balers) at the stations is performed before the final data is transmitted to the IRIS Data Management Center (DMC). Assembly of the final data set requires additional storage and processing capabilities to combine the real-time data with baler data. The infrastructure supporting these diverse computational and storage needs currently consists of twelve virtualized Sun Solaris Zones executing on nine physical server systems. The servers are protected against failure by redundant power, storage, and networking connections. Storage needs are met by a hybrid iSCSI and Fiber Channel Storage Area Network (SAN) with access to over 40 terabytes of RAID 5 and 6 storage. Processing tasks are assigned to systems based on parallelization and floating-point calculation needs. On-site buffering at the data-loggers provides protection in case of short-term network or hardware problems, while backup acquisition systems at the San Diego Supercomputer Center and the DMC protect against catastrophic failure of the primary site. Configuration management and monitoring of these systems is accomplished with open-source (Cfengine, Nagios, Solaris Community Software) and commercial tools (Intermapper). In the evolution from a single server to multiple virtualized server instances, Sun Cluster software was evaluated and found to be unstable in our environment. Shared filesystem architectures using PxFS and QFS were found to be incompatible with our software architecture, so sharing of data between systems is accomplished via traditional NFS. Linux was found to be limited in terms of deployment flexibility and consistency between versions. Despite the experimentation with various technologies, our current virtualized architecture is stable to the point of an average daily real-time data return rate of 92.34% over the entire lifetime of the project to date.
2016-01-01
supportive of this work from the start. This research would not have been possible without the contributions made by a number of individuals throughout...and funding structures. We started with these questions in particular based on the primary concerns at AMOS identified in the results of Phase I...from the start. Keck maintains connections with a series of other sites within a remote observing network. Remote observing from the mainland
Towards Efficient Supercomputing: Searching for the Right Efficiency Metric
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hsu, Chung-Hsing; Kuehn, Jeffery A; Poole, Stephen W
2012-01-01
The efficiency of supercomputing has traditionally been measured in execution time. In the early 2000s, the concept of total cost of ownership was re-introduced, with efficiency measures extended to include aspects such as energy and space. Yet the supercomputing community has never agreed upon a metric that can cover these aspects altogether and also provide a fair basis for comparison. This paper examines the metrics that have been proposed in the past decade, and proposes a vector-valued metric for efficient supercomputing. Using this metric, the paper presents a study of where the supercomputing industry has been and how it stands today with respect to efficient supercomputing.
NASA Astrophysics Data System (ADS)
Noumaru, Junichi; Kawai, Jun A.; Schubert, Kiaina; Yagi, Masafumi; Takata, Tadafumi; Winegar, Tom; Scanlon, Tim; Nishida, Takuhiro; Fox, Camron; Hayasaka, James; Forester, Jason; Uchida, Kenji; Nakamura, Isamu; Tom, Richard; Koura, Norikazu; Yamamoto, Tadahiro; Tanoue, Toshiya; Yamada, Toru
2008-07-01
Subaru Telescope has recently replaced most of the equipment of Subaru Telescope Network II with new equipment, which includes a 124 TB RAID system for the data archive. Switching the data storage from tape to RAID enables users to access the data faster. STN-III dropped some important components of STN-II, such as supercomputers, the development & testing subsystem for the Subaru Observation Control System, and the data processing subsystem. On the other hand, we invested in more computers for the remote operation system. Thanks to IT innovations, our LAN as well as the network between Hilo and the summit were upgraded to gigabit networks at a similar or even reduced cost compared with the previous system. As a result of redesigning the computer system to focus more on observatory operation, we greatly reduced the total cost for computer rental, purchase and maintenance.
Mapping to Irregular Torus Topologies and Other Techniques for Petascale Biomolecular Simulation
Phillips, James C.; Sun, Yanhua; Jain, Nikhil; Bohm, Eric J.; Kalé, Laxmikant V.
2014-01-01
Currently deployed petascale supercomputers typically use toroidal network topologies in three or more dimensions. While these networks perform well for topology-agnostic codes on a few thousand nodes, leadership machines with 20,000 nodes require topology awareness to avoid network contention for communication-intensive codes. Topology adaptation is complicated by irregular node allocation shapes and holes due to dedicated input/output nodes or hardware failure. In the context of the popular molecular dynamics program NAMD, we present methods for mapping a periodic 3-D grid of fixed-size spatial decomposition domains to 3-D Cray Gemini and 5-D IBM Blue Gene/Q toroidal networks to enable hundred-million atom full machine simulations, and to similarly partition node allocations into compact domains for smaller simulations using multiple-copy algorithms. Additional enabling techniques are discussed and performance is reported for NCSA Blue Waters, ORNL Titan, ANL Mira, TACC Stampede, and NERSC Edison. PMID:25594075
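Topology awareness on a torus comes down to counting network hops with wraparound links. The following is a minimal sketch of that distance calculation and of scoring a candidate placement; it is not NAMD's mapping algorithm, and the function names, coordinates, and torus dimensions are hypothetical.

```python
def torus_hops(a, b, dims):
    """Minimum hop count between nodes a and b on a torus with the given dimensions.

    In each dimension a message may travel either way around the ring,
    so the per-dimension cost is the shorter of the two directions.
    """
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def total_hops(pairs, placement, dims):
    """Sum of hop counts over communicating pairs for a given task placement."""
    return sum(torus_hops(placement[s], placement[t], dims) for s, t in pairs)


dims = (8, 8, 8)                      # hypothetical 3-D torus
placement = {"p0": (0, 0, 0), "p1": (7, 0, 0), "p2": (4, 4, 4)}
pairs = [("p0", "p1"), ("p0", "p2")]  # tasks that exchange messages

print(torus_hops(placement["p0"], placement["p1"], dims))  # neighbors via wraparound
print(total_hops(pairs, placement, dims))
```

A mapper can compare candidate placements by a score like `total_hops`; the same function works for a 5-D torus because the dimension count is taken from the tuples.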
Computation Directorate 2008 Annual Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crawford, D L
2009-03-25
Whether a computer is simulating the aging and performance of a nuclear weapon, the folding of a protein, or the probability of rainfall over a particular mountain range, the necessary calculations can be enormous. Our computers help researchers answer these and other complex problems, and each new generation of system hardware and software widens the realm of possibilities. Building on Livermore's historical excellence and leadership in high-performance computing, Computation added more than 331 trillion floating-point operations per second (teraFLOPS) of power to LLNL's computer room floors in 2008. In addition, Livermore's next big supercomputer, Sequoia, advanced ever closer to its 2011-2012 delivery date, as architecture plans and the procurement contract were finalized. Hyperion, an advanced technology cluster test bed that teams Livermore with 10 industry leaders, made a big splash when it was announced during Michael Dell's keynote speech at the 2008 Supercomputing Conference. The Wall Street Journal touted Hyperion as a 'bright spot amid turmoil' in the computer industry. Computation continues to measure and improve the costs of operating LLNL's high-performance computing systems by moving hardware support in-house, by measuring causes of outages to apply resources asymmetrically, and by automating most of the account and access authorization and management processes. These improvements enable more dollars to go toward fielding the best supercomputers for science, while operating them at less cost and greater responsiveness to the customers.
NASA's supercomputing experience
NASA Technical Reports Server (NTRS)
Bailey, F. Ron
1990-01-01
A brief overview of NASA's recent experience in supercomputing is presented from two perspectives: early systems development and advanced supercomputing applications. NASA's role in supercomputing systems development is illustrated by discussion of activities carried out by the Numerical Aerodynamic Simulation Program. Current capabilities in advanced technology applications are illustrated with examples in turbulence physics, aerodynamics, aerothermodynamics, chemistry, and structural mechanics. Capabilities in science applications are illustrated by examples in astrophysics and atmospheric modeling. Future directions and NASA's new High Performance Computing Program are briefly discussed.
OpenMP Performance on the Columbia Supercomputer
NASA Technical Reports Server (NTRS)
Haoqiang, Jin; Hood, Robert
2005-01-01
This presentation discusses the Columbia World Class Supercomputer, one of the world's fastest supercomputers, providing 61 TFLOPs (10/20/04). It was conceived, designed, built, and deployed in just 120 days. Columbia is a 20-node supercomputer built on proven 512-processor nodes. It is the largest SGI system in the world, with over 10,000 Intel Itanium 2 processors, and provides the largest node size incorporating commodity parts (512) and the largest shared-memory environment (2048). With 88% efficiency, it tops the scalar systems on the Top500 list.
Global Observation Information Networking: Using the Distributed Image Spreadsheet (DISS)
NASA Technical Reports Server (NTRS)
Hasler, Fritz
1999-01-01
The DISS and many other tools will be used to present visualizations which span the period from the original Suomi/Hasler animations of the first ATS-1 GEO weather satellite images in 1966 to the latest 1999 NASA Earth Science Vision for the next 25 years. Hot off the SGI Onyx Graphics-Supercomputers are NASA's visualizations of Hurricanes Mitch, Georges, Fran and Linda. These storms have been recently featured on the covers of National Geographic, Time, Newsweek and Popular Science and used repeatedly this season on National and International network TV. Results will be presented from a new paper on automatic wind measurements in Hurricane Luis from 1-min GOES images that appeared in the November BAMS.
Site in a box: Improving the Tier 3 experience
NASA Astrophysics Data System (ADS)
Dost, J. M.; Fajardo, E. M.; Jones, T. R.; Martin, T.; Tadel, A.; Tadel, M.; Würthwein, F.
2017-10-01
The Pacific Research Platform is an initiative to interconnect Science DMZs between campuses across the West Coast of the United States over a 100 Gbps network. The LHC @ UC is a proof-of-concept pilot project that focuses on interconnecting 6 University of California campuses. It is spearheaded by computing specialists from the UCSD Tier 2 Center in collaboration with the San Diego Supercomputer Center. A machine has been shipped to each campus extending the concept of the Data Transfer Node to a cluster in a box that is fully integrated into the local compute, storage, and networking infrastructure. The node contains a full HTCondor batch system, and also an XRootD proxy cache. User jobs routed to the DTN can run on 40 additional slots provided by the machine, and can also flock to a common GlideinWMS pilot pool, which sends jobs out to any of the participating UCs, as well as to Comet, the new supercomputer at SDSC. In addition, a common XRootD federation has been created to interconnect the UCs and give the ability to arbitrarily export data from the home university, to make it available wherever the jobs run. The UC-level federation also statically redirects to either the ATLAS FAX or the CMS AAA federation, depending on end-user VO membership credentials, to make globally published datasets available. XRootD read operations from the federation transfer through the nearest DTN proxy cache located at the site where the jobs run. This reduces wide area network overhead for subsequent accesses, and improves overall read performance. Details on the technical implementation, challenges faced and overcome in setting up the infrastructure, and an analysis of usage patterns and system scalability will be presented.
Most Social Scientists Shun Free Use of Supercomputers.
ERIC Educational Resources Information Center
Kiernan, Vincent
1998-01-01
Social scientists, who frequently complain that the federal government spends too little on them, are passing up what scholars in the physical and natural sciences see as the government's best give-aways: free access to supercomputers. Some social scientists say the supercomputers are difficult to use; others find desktop computers provide…
A fault tolerant spacecraft supercomputer to enable a new class of scientific discovery
NASA Technical Reports Server (NTRS)
Katz, D. S.; McVittie, T. I.; Silliman, A. G., Jr.
2000-01-01
The goal of the Remote Exploration and Experimentation (REE) Project is to move supercomputing into space in a cost-effective manner and to allow the use of inexpensive, state-of-the-art, commercial-off-the-shelf components and subsystems in these space-based supercomputers.
TOP500 Supercomputers for November 2003
DOE Office of Scientific and Technical Information (OSTI.GOV)
Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack
2003-11-16
22nd Edition of TOP500 List of World's Fastest Supercomputers Released. MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 22nd edition of the TOP500 list of the world's fastest supercomputers was released today (November 16, 2003). The Earth Simulator supercomputer retains the number one position with its Linpack benchmark performance of 35.86 Tflop/s ("teraflops" or trillions of calculations per second). It was built by NEC and installed last year at the Earth Simulator Center in Yokohama, Japan.
Preparing for in situ processing on upcoming leading-edge supercomputers
Kress, James; Churchill, Randy Michael; Klasky, Scott; ...
2016-10-01
High performance computing applications are producing increasingly large amounts of data and placing enormous stress on current capabilities for traditional post-hoc visualization techniques. Because of the growing compute and I/O imbalance, data reductions, including in situ visualization, are required. These reduced data are used for analysis and visualization in a variety of different ways. Many of the visualization and analysis requirements are known a priori, but when they are not, scientists are dependent on the reduced data to accurately represent the simulation in post hoc analysis. The contribution of this paper is a description of the directions we are pursuing to help a large-scale fusion simulation code succeed on the next generation of supercomputers. These directions include the role of in situ processing for performing data reductions, as well as the tradeoffs between data size and data integrity within the context of complex operations in a typical scientific workflow.
Remote visual analysis of large turbulence databases at multiple scales
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pulido, Jesus; Livescu, Daniel; Kanov, Kalin
The remote analysis and visualization of raw large turbulence datasets is challenging. Current accurate direct numerical simulations (DNS) of turbulent flows generate datasets with billions of points per time-step and several thousand time-steps per simulation. Until recently, the analysis and visualization of such datasets was restricted to scientists with access to large supercomputers. The public Johns Hopkins Turbulence database simplifies access to multi-terabyte turbulence datasets and facilitates the computation of statistics and extraction of features through the use of commodity hardware. In this paper, we present a framework designed around wavelet-based compression for high-speed visualization of large datasets and methods supporting multi-resolution analysis of turbulence. By integrating common technologies, this framework enables remote access to tools available on supercomputers and over 230 terabytes of DNS data over the Web. Finally, the database toolset is expanded by providing access to exploratory data analysis tools, such as wavelet decomposition capabilities and coherent feature extraction.
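Wavelet-based compression and multi-resolution analysis rest on splitting a signal into a coarse approximation plus detail coefficients, most of which can be dropped when they are small. The following is a minimal sketch using a one-level Haar transform on a 1-D signal. The abstract does not say which wavelet family or thresholding rule the framework uses, so the Haar choice, the function names, and the sample values are all assumptions made for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """One level of the orthonormal Haar transform; len(signal) must be even.

    Returns (approx, detail): scaled pairwise sums and pairwise differences.
    """
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the signal from one level of Haar coefficients."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def hard_threshold(detail, eps):
    """Zero out detail coefficients smaller than eps in magnitude (lossy step)."""
    return [d if abs(d) >= eps else 0.0 for d in detail]


signal = [4.0, 4.0, 2.0, 6.0]           # hypothetical samples
approx, detail = haar_forward(signal)
restored = haar_inverse(approx, detail)  # lossless without thresholding
coarse = haar_inverse(approx, hard_threshold(detail, 10.0))  # all detail dropped
print(restored)
print(coarse)
```

Applying `haar_forward` again to `approx` gives the next coarser level, which is what lets a viewer fetch a low-resolution version first and refine it on demand.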
Remote visual analysis of large turbulence databases at multiple scales
Pulido, Jesus; Livescu, Daniel; Kanov, Kalin; ...
2018-06-15
The remote analysis and visualization of raw large turbulence datasets is challenging. Current accurate direct numerical simulations (DNS) of turbulent flows generate datasets with billions of points per time-step and several thousand time-steps per simulation. Until recently, the analysis and visualization of such datasets was restricted to scientists with access to large supercomputers. The public Johns Hopkins Turbulence database simplifies access to multi-terabyte turbulence datasets and facilitates the computation of statistics and extraction of features through the use of commodity hardware. In this paper, we present a framework designed around wavelet-based compression for high-speed visualization of large datasets and methods supporting multi-resolution analysis of turbulence. By integrating common technologies, this framework enables remote access to tools available on supercomputers and over 230 terabytes of DNS data over the Web. Finally, the database toolset is expanded by providing access to exploratory data analysis tools, such as wavelet decomposition capabilities and coherent feature extraction.
Global interrupt and barrier networks
Blumrich, Matthias A.; Chen, Dong; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E; Heidelberger, Philip; Kopcsay, Gerard V.; Steinmacher-Burow, Burkhard D.; Takken, Todd E.
2008-10-28
A system and method for generating global asynchronous signals in a computing structure. Particularly, a global interrupt and barrier network is implemented that implements logic for generating global interrupt and barrier signals for controlling global asynchronous operations performed by processing elements at selected processing nodes of a computing structure in accordance with a processing algorithm; and includes the physical interconnecting of the processing nodes for communicating the global interrupt and barrier signals to the elements via low-latency paths. The global asynchronous signals respectively initiate interrupt and barrier operations at the processing nodes at times selected for optimizing performance of the processing algorithms. In one embodiment, the global interrupt and barrier network is implemented in a scalable, massively parallel supercomputing device structure comprising a plurality of processing nodes interconnected by multiple independent networks, with each node including one or more processing elements for performing computation or communication activity as required when performing parallel algorithm operations. One multiple independent network includes a global tree network for enabling high-speed global tree communications among global tree network nodes or sub-trees thereof. The global interrupt and barrier network may operate in parallel with the global tree network for providing global asynchronous sideband signals.
COOP 3D ARPA Experiment 109 National Center for Atmospheric Research
NASA Technical Reports Server (NTRS)
1998-01-01
Coupled atmospheric and hydrodynamic forecast models were executed on the supercomputing resources of the National Center for Atmospheric Research (NCAR) in Boulder, Colorado and the Ohio Supercomputing Center (OSC) in Columbus, Ohio, respectively. The interoperation of the forecast models on these geographically diverse, high performance Cray platforms required the transfer of large three dimensional data sets at very high information rates. High capacity, terrestrial fiber optic transmission system technologies were integrated with those of an experimental high speed communications satellite in Geosynchronous Earth Orbit (GEO) to test the integration of the two systems. Operation over a spacecraft in GEO required modification of the standard configuration of legacy data communications protocols to facilitate their ability to perform efficiently in the changing environment characteristic of a hybrid network. The success of this performance tuning enabled the use of such an architecture to facilitate high data rate, fiber optic quality data communications between high performance systems not accessible to standard terrestrial fiber transmission systems, thus obviating the performance degradation often found in contemporary earth/satellite hybrids.
Integration of the Chinese HPC Grid in ATLAS Distributed Computing
NASA Astrophysics Data System (ADS)
Filipčič, A.;
2017-10-01
Fifteen Chinese High-Performance Computing sites, many of them on the TOP500 list of most powerful supercomputers, are integrated into a common infrastructure providing coherent access to a user through a RESTful interface called SCEAPI. These resources have been integrated into the ATLAS Grid production system using a bridge between ATLAS and SCEAPI which translates the authorization and job submission protocols between the two environments. The ARC Computing Element (ARC-CE) forms the bridge using an extended batch system interface to allow job submission to SCEAPI. The ARC-CE was set up at the Institute for High Energy Physics, Beijing, in order to be as close as possible to the SCEAPI front-end interface at the Computing Network Information Center, also in Beijing. This paper describes the technical details of the integration between ARC-CE and SCEAPI and presents results so far with two supercomputer centers, Tianhe-IA and ERA. These two centers have been the pilots for ATLAS Monte Carlo Simulation in SCEAPI and have been providing CPU power since fall 2015.
Large-Scale NASA Science Applications on the Columbia Supercluster
NASA Technical Reports Server (NTRS)
Brooks, Walter
2005-01-01
Columbia, NASA's newest 61 teraflops supercomputer that became operational late last year, is a highly integrated Altix cluster of 10,240 processors, and was named to honor the crew of the Space Shuttle lost in early 2003. Constructed in just four months, Columbia increased NASA's computing capability ten-fold, and revitalized the Agency's high-end computing efforts. Significant cutting-edge science and engineering simulations in the areas of space and Earth sciences, as well as aeronautics and space operations, are already occurring on this largest operational Linux supercomputer, demonstrating its capacity and capability to accelerate NASA's space exploration vision. The presentation will describe how an integrated environment consisting not only of next-generation systems, but also modeling and simulation, high-speed networking, parallel performance optimization, and advanced data analysis and visualization, is being used to reduce design cycle time, accelerate scientific discovery, conduct parametric analysis of multiple scenarios, and enhance safety during the life cycle of NASA missions. The talk will conclude by discussing how NAS partnered with various NASA centers, other government agencies, computer industry, and academia, to create a national resource in large-scale modeling and simulation.
Distributed user services for supercomputers
NASA Technical Reports Server (NTRS)
Sowizral, Henry A.
1989-01-01
User-service operations at supercomputer facilities are examined. The question is whether a single, possibly distributed, user-services organization could be shared by NASA's supercomputer sites in support of a diverse, geographically dispersed, user community. A possible structure for such an organization is identified as well as some of the technologies needed in operating such an organization.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wolfe, A.
1986-03-10
Supercomputing software is moving into high gear, spurred by the rapid spread of supercomputers into new applications. The critical challenge is how to develop tools that will make it easier for programmers to write applications that take advantage of vectorizing in the classical supercomputer and the parallelism that is emerging in supercomputers and minisupercomputers. Writing parallel software is a challenge that every programmer must face because parallel architectures are springing up across the range of computing. Cray is developing a host of tools for programmers. Tools to support multitasking (in supercomputer parlance, multitasking means dividing up a single program to run on multiple processors) are high on Cray's agenda. On tap for multitasking is Premult, dubbed a microtasking tool. As a preprocessor for Cray's CFT77 FORTRAN compiler, Premult will provide fine-grain multitasking.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sreepathi, Sarat; Kumar, Jitendra; Mills, Richard T.
A proliferation of data from vast networks of remote sensing platforms (satellites, unmanned aircraft systems (UAS), airborne etc.), observational facilities (meteorological, eddy covariance etc.), state-of-the-art sensors, and simulation models offers unprecedented opportunities for scientific discovery. Unsupervised classification is a widely applied data mining approach to derive insights from such data. However, classification of very large data sets is a complex computational problem that requires efficient numerical algorithms and implementations on high performance computing (HPC) platforms. Additionally, increasing power, space, cooling and efficiency requirements have led to the deployment of hybrid supercomputing platforms with complex architectures and memory hierarchies like the Titan system at Oak Ridge National Laboratory. The advent of such accelerated computing architectures offers new challenges and opportunities for big data analytics in general and specifically, large scale cluster analysis in our case. Although there is an existing body of work on parallel cluster analysis, those approaches do not fully meet the needs imposed by the nature and size of our large data sets. Moreover, they had scaling limitations and were mostly limited to traditional distributed memory computing platforms. We present a parallel Multivariate Spatio-Temporal Clustering (MSTC) technique based on k-means cluster analysis that can target hybrid supercomputers like Titan. We developed a hybrid MPI, CUDA and OpenACC implementation that can utilize both CPU and GPU resources on computational nodes. We describe performance results on Titan that demonstrate the scalability and efficacy of our approach in processing large ecological data sets.
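For readers unfamiliar with the k-means step that the MSTC technique above builds on, here is a minimal serial sketch of the standard Lloyd iteration. It is not the authors' MPI/CUDA/OpenACC implementation; the function names, the caller-supplied initial centroids, and the stopping rule are assumptions chosen to keep the example short and deterministic.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd iteration: assign each point to its nearest centroid, then recompute means."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
                )
            else:
                updated.append(old)  # keep an empty cluster's centroid where it was
        if updated == centroids:  # converged: assignments can no longer change
            break
        centroids = updated
    return centroids, labels

if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(kmeans(data, centroids=[(0.0, 0.0), (10.0, 10.0)]))
```

The assignment step is independent for every point, which is why it is the natural part to spread across nodes and accelerators in a parallel implementation; the update step then reduces per-cluster sums across workers.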
Will Moore's Law be sufficient?
DOE Office of Scientific and Technical Information (OSTI.GOV)
DeBenedictis, Erik P.
2004-07-01
It seems well understood that supercomputer simulation is an enabler for scientific discoveries, weapons, and other activities of value to society. It also seems widely believed that Moore's Law will make progressively more powerful supercomputers over time and thus enable more of these contributions. This paper seeks to add detail to these arguments, revealing them to be generally correct but not a smooth and effortless progression. This paper will review some key problems that can be solved with supercomputer simulation, showing that more powerful supercomputers will be useful up to a very high yet finite limit of around 10^21 FLOPS (1 Zettaflops). The review will also show the basic nature of these extreme problems. This paper will review work by others showing that the theoretical maximum supercomputer power is very high indeed, but will explain how a straightforward extrapolation of Moore's Law will lead to technological maturity in a few decades. The power of a supercomputer at the maturity of Moore's Law will be very high by today's standards at 10^16-10^19 FLOPS (100 Petaflops to 10 Exaflops), depending on architecture, but distinctly below the level required for the most ambitious applications. Having established that Moore's Law will not be the last word in supercomputing, this paper will explore the nearer term issue of what a supercomputer will look like at maturity of Moore's Law. Our approach will quantify the maximum performance as permitted by the laws of physics for extension of current technology and then find a design that approaches this limit closely. We study a 'multi-architecture' for supercomputers that combines a microprocessor with other 'advanced' concepts and find it can reach the limits as well. This approach should be quite viable in the future because the microprocessor would provide compatibility with existing codes and programming styles while the 'advanced' features would provide a boost to the limits of performance.
Qualifying for the Green500: Experience with the newest generation of supercomputers at LANL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yilk, Todd
2018-02-17
The High Performance Computing Division of Los Alamos National Laboratory recently brought four new supercomputing platforms on line: Trinity with separate partitions built around the Haswell and Knights Landing CPU architectures for capability computing and Grizzly, Fire, and Ice for capacity computing applications. The power monitoring infrastructure of these machines is significantly enhanced over previous supercomputing generations at LANL and all were qualified at the highest level of the Green500 benchmark. Here, this paper discusses supercomputing at LANL, the Green500 benchmark, and notes on our experience meeting the Green500's reporting requirements.
Non-preconditioned conjugate gradient on cell and FPGA based hybrid supercomputer nodes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dubois, David H; Dubois, Andrew J; Boorman, Thomas M
2009-01-01
This work presents a detailed implementation of a double precision, non-preconditioned, Conjugate Gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture(TM) in conjunction with x86 Opteron(TM) processors from AMD. We implement a common Conjugate Gradient algorithm, on a variety of systems, to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, SRC Computers, Inc. MAPStation SRC-6 FPGA enhanced hybrid supercomputer, and AMD Opteron only. In all hybrid implementations wall clock time is measured, including all transfer overhead and compute timings.
Non-preconditioned conjugate gradient on cell and FPGA-based hybrid supercomputer nodes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dubois, David H; Dubois, Andrew J; Boorman, Thomas M
2009-03-10
This work presents a detailed implementation of a double precision, Non-Preconditioned, Conjugate Gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture(TM) in conjunction with x86 Opteron(TM) processors from AMD. We implement a common Conjugate Gradient algorithm, on a variety of systems, to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, SRC Computers, Inc. MAPStation SRC-6 FPGA enhanced hybrid supercomputer, and AMD Opteron only. In all hybrid implementations wall clock time is measured, including all transfer overhead and compute timings.
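The non-preconditioned Conjugate Gradient method benchmarked in the two records above can be sketched in a few lines. This is the textbook algorithm for a symmetric positive-definite system, written in plain Python with dense lists; it is not the Cell or FPGA implementation from the report, and the function names, tolerance, and test matrix are assumptions for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    """Dense matrix-vector product; A is a list of rows."""
    return [dot(row, v) for row in A]

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A, without a preconditioner."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x, with x = 0 initially
    p = list(r)  # first search direction is the residual
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:  # stop once the residual norm is small enough
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old  # makes the next direction A-conjugate to p
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(conjugate_gradient(A, b))  # close to [1/11, 7/11]
```

Each iteration is dominated by one matrix-vector product plus a few vector updates and dot products, which is why the method is a common kernel for comparing accelerator hardware: its cost is mostly memory traffic and data transfer rather than arithmetic.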
Supercomputer Provides Molecular Insight into Cellulose (Fact Sheet)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
2011-02-01
Groundbreaking research at the National Renewable Energy Laboratory (NREL) has used supercomputing simulations to calculate the work that enzymes must do to deconstruct cellulose, which is a fundamental step in biomass conversion technologies for biofuels production. NREL used the new high-performance supercomputer Red Mesa to conduct several million central processing unit (CPU) hours of simulation.
Operational uses of ACTS technology
NASA Astrophysics Data System (ADS)
Gedney, Richard T.; Wright, David L.; Balombin, Joseph L.; Sohn, Philip Y.; Cashman, William F.; Stern, Alan L.; Golding, Len; Palmer, Larry
1992-03-01
The NASA Advanced Communications Technology Satellite (ACTS) provides the technologies for very high gain hopping spot beam antennas, on-board baseband routing and processing, and wideband (1 GHz) Ka-band transponders. A number of studies have recently been completed using the experience gained in developing the actual ACTS system hardware to quantify how well the ACTS technology can be used in future operational systems. This paper provides a summary of these study results including the spacecraft (S/C) weight per unit circuit for providing services by ACTS technologies as compared to present-day satellites. The uses of the ACTS technology discussed are for providing T1 VSAT mesh networks, aeronautical mobile communications, supervisory control and data acquisition (SCADA) services, and high data rate networks for supercomputer and other applications.
NASA Technical Reports Server (NTRS)
Sorenson, Reese L.; Mccann, Karen
1992-01-01
A proven 3-D multiple-block elliptic grid generator, designed to run in 'batch mode' on a supercomputer, is improved by the creation of a modern graphical user interface (GUI) running on a workstation. The two parts are connected in real time by a network. The resultant system offers a significant speedup in the process of preparing and formatting input data and the ability to watch the grid solution converge by replotting the grid at each iteration step. The result is a reduction in user time and CPU time required to generate the grid and an enhanced understanding of the elliptic solution process. This software system, called GRAPEVINE, is described, and certain observations are made concerning the creation of such software.
GREEN SUPERCOMPUTING IN A DESKTOP BOX
DOE Office of Scientific and Technical Information (OSTI.GOV)
HSU, CHUNG-HSING; FENG, WU-CHUN; CHING, AVERY
2007-01-17
The computer workstation, introduced by Sun Microsystems in 1982, was the tool of choice for scientists and engineers as an interactive computing environment for the development of scientific codes. However, by the mid-1990s, the performance of workstations began to lag behind high-end commodity PCs. This, coupled with the disappearance of BSD-based operating systems in workstations and the emergence of Linux as an open-source operating system for PCs, arguably led to the demise of the workstation as we knew it. Around the same time, computational scientists started to leverage PCs running Linux to create a commodity-based (Beowulf) cluster that provided dedicated computer cycles, i.e., supercomputing for the rest of us, as a cost-effective alternative to large supercomputers, i.e., supercomputing for the few. However, as the cluster movement has matured, with respect to cluster hardware and open-source software, these clusters have become much more like their large-scale supercomputing brethren - a shared (and power-hungry) datacenter resource that must reside in a machine-cooled room in order to operate properly. Consequently, the above observations, when coupled with the ever-increasing performance gap between the PC and cluster supercomputer, provide the motivation for a 'green' desktop supercomputer - a turnkey solution that provides an interactive and parallel computing environment with the approximate form factor of a Sun SPARCstation 1 'pizza box' workstation. In this paper, the authors present the hardware and software architecture of such a solution as well as its prowess as a developmental platform for parallel codes. In short, imagine a 12-node personal desktop supercomputer that achieves 14 Gflops on Linpack but sips only 185 watts of power at load, resulting in a performance-power ratio that is over 300% better than their reference SMP platform.
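The performance-power ratio quoted at the end of the abstract above can be worked out directly from the two figures it gives (14 Gflops on Linpack, 185 watts at load). The helper name below is hypothetical, and the reference SMP platform's figures are not given in the abstract, so only the desktop cluster's own ratio is computed.

```python
def mflops_per_watt(gflops, watts):
    """Performance-power ratio in Mflops per watt (1 Gflops = 1000 Mflops)."""
    return gflops * 1000.0 / watts

if __name__ == "__main__":
    ratio = mflops_per_watt(14.0, 185.0)
    print(f"{ratio:.1f} Mflops/W")  # prints "75.7 Mflops/W"
```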
Input/output behavior of supercomputing applications
NASA Technical Reports Server (NTRS)
Miller, Ethan L.
1991-01-01
The collection and analysis of supercomputer I/O traces and their use in a collection of buffering and caching simulations are described. This serves two purposes. First, it gives a model of how individual applications running on supercomputers request file system I/O, allowing system designers to optimize I/O hardware and file system algorithms to that model. Second, the buffering simulations show what resources are needed to maximize the CPU utilization of a supercomputer given a very bursty I/O request rate. By using read-ahead and write-behind in a large solid-state disk, one or two applications were sufficient to fully utilize a Cray Y-MP CPU.
Chemical calculations on Cray computers
NASA Technical Reports Server (NTRS)
Taylor, Peter R.; Bauschlicher, Charles W., Jr.; Schwenke, David W.
1989-01-01
The influence of recent developments in supercomputing on computational chemistry is discussed with particular reference to Cray computers and their pipelined vector/limited parallel architectures. After reviewing Cray hardware and software, the performance of different elementary program structures is examined, and effective methods for improving program performance are outlined. The computational strategies appropriate for obtaining optimum performance in applications to quantum chemistry and dynamics are discussed. Finally, some discussion is given of new developments and future hardware and software improvements.
HeNCE: A Heterogeneous Network Computing Environment
Beguelin, Adam; Dongarra, Jack J.; Geist, George Al; ...
1994-01-01
Network computing seeks to utilize the aggregate resources of many networked computers to solve a single problem. In so doing it is often possible to obtain supercomputer performance from an inexpensive local area network. The drawback is that network computing is complicated and error prone when done by hand, especially if the computers have different operating systems and data formats and are thus heterogeneous. The heterogeneous network computing environment (HeNCE) is an integrated graphical environment for creating and running parallel programs over a heterogeneous collection of computers. It is built on a lower level package called parallel virtual machine (PVM). The HeNCE philosophy of parallel programming is to have the programmer graphically specify the parallelism of a computation and to automate, as much as possible, the tasks of writing, compiling, executing, debugging, and tracing the network computation. Key to HeNCE is a graphical language based on directed graphs that describe the parallelism and data dependencies of an application. Nodes in the graphs represent conventional Fortran or C subroutines and the arcs represent data and control flow. This article describes the present state of HeNCE, its capabilities, limitations, and areas of future research.
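The directed-graph model described in the abstract above, where nodes are subroutines and arcs are dependencies, can be illustrated with a small scheduling sketch. This does not use HeNCE or PVM; it relies only on Python's standard `graphlib` module (Python 3.9 or later), and the node names and the `parallel_stages` helper are hypothetical.

```python
from graphlib import TopologicalSorter

def parallel_stages(dependencies):
    """Group graph nodes into stages; nodes within one stage have no unmet dependencies.

    `dependencies` maps each node to the set of nodes it must wait for.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        stages.append(ready)
        sorter.done(*ready)  # mark the stage finished, unlocking its successors
    return stages

if __name__ == "__main__":
    # Hypothetical application: one input routine, two independent solvers, one merge.
    graph = {
        "solve_left": {"read_input"},
        "solve_right": {"read_input"},
        "merge": {"solve_left", "solve_right"},
    }
    for number, stage in enumerate(parallel_stages(graph), start=1):
        print(number, stage)
```

Nodes that land in the same stage could be dispatched to different machines at once, which is the kind of parallelism a graphical dependency description exposes without the programmer writing explicit message-passing code.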
NASA Technical Reports Server (NTRS)
Jacob, Joseph; Katz, Daniel; Prince, Thomas; Berriman, Graham; Good, John; Laity, Anastasia
2006-01-01
The final version (3.0) of the Montage software has been released. To recapitulate from previous NASA Tech Briefs articles about Montage: This software generates custom, science-grade mosaics of astronomical images on demand from input files that comply with the Flexible Image Transport System (FITS) standard and contain image data registered on projections that comply with the World Coordinate System (WCS) standards. This software can be executed on single-processor computers, multi-processor computers, and such networks of geographically dispersed computers as the National Science Foundation's TeraGrid or NASA's Information Power Grid. The primary advantage of running Montage in a grid environment is that computations can be done on a remote supercomputer for efficiency. Multiple computers at different sites can be used for different parts of a computation, a significant advantage in cases of computations for large mosaics that demand more processor time than is available at any one site. Version 3.0 incorporates several improvements over prior versions. The most significant improvement is that this version is accessible to scientists located anywhere, through operational Web services that provide access to data from several large astronomical surveys and construct mosaics on either local workstations or remote computational grids as needed.
Prospects for Boiling of Subcooled Dielectric Liquids for Supercomputer Cooling
NASA Astrophysics Data System (ADS)
Zeigarnik, Yu. A.; Vasil'ev, N. V.; Druzhinin, E. A.; Kalmykov, I. V.; Kosoi, A. S.; Khodakov, K. A.
2018-02-01
It is shown experimentally that forced-convection boiling of a dielectric coolant, Novec 649 refrigerant, subcooled relative to the saturation temperature makes it possible to remove heat fluxes of up to 100 W/cm^2 from the chip interface of a modern supercomputer. This fact creates prerequisites for the application of dielectric liquids in cooling systems of modern supercomputers with increased requirements for their operating reliability.
National Test Facility civilian agency use of supercomputers not feasible
DOE Office of Scientific and Technical Information (OSTI.GOV)
NONE
1994-12-01
Based on interviews with civilian agencies cited in the House report (DOE, DoEd, HHS, FEMA, NOAA), none would be able to make effective use of NTF's excess supercomputing capabilities. These agencies stated they could not use the resources primarily because (1) NTF's supercomputers are older machines whose performance and costs cannot match those of more advanced computers available from other sources and (2) some agencies have not yet developed applications requiring supercomputer capabilities or do not have funding to support such activities. In addition, future support for the hardware and software at NTF is uncertain, making any investment by an outside user risky.
Kriging for Spatial-Temporal Data on the Bridges Supercomputer
NASA Astrophysics Data System (ADS)
Hodgess, E. M.
2017-12-01
Currently, kriging of spatial-temporal data is slow and limited to relatively small vector sizes. We have developed a method on the Bridges supercomputer, at the Pittsburgh Supercomputing Center, which uses a combination of the tools R, Fortran, the Message Passing Interface (MPI), OpenACC, and special R packages for big data. This combination of tools now permits us to complete tasks which previously could not be completed, or took hours to complete. We ran simulation studies from a laptop against the supercomputer. We also look at "real world" data sets, such as the Irish wind data, and some weather data. We compare the timings. We note that the timings are surprisingly good.
Multiple DNA and protein sequence alignment on a workstation and a supercomputer.
Tajima, K
1988-11-01
This paper describes a multiple alignment method using a workstation and supercomputer. The method is based on the alignment of a set of aligned sequences with the new sequence, and uses a recursive procedure of such alignment. The alignment is executed in a reasonable computation time on diverse levels from a workstation to a supercomputer, from the viewpoint of alignment results and computational speed by parallel processing. The application of the algorithm is illustrated by several examples of multiple alignment of 12 amino acid and DNA sequences of HIV (human immunodeficiency virus) env genes. Colour graphic programs on a workstation and parallel processing on a supercomputer are discussed.
NASA Technical Reports Server (NTRS)
Kutler, Paul; Yee, Helen
1987-01-01
Topics addressed include: numerical aerodynamic simulation; computational mechanics; supercomputers; aerospace propulsion systems; computational modeling in ballistics; turbulence modeling; computational chemistry; computational fluid dynamics; and computational astrophysics.
NASA Technical Reports Server (NTRS)
Kramer, Williams T. C.; Simon, Horst D.
1994-01-01
This tutorial proposes to be a practical guide for the uninitiated to the main topics and themes of high-performance computing (HPC), with particular emphasis on distributed computing. The intent is first to provide some guidance and directions in the rapidly growing field of scientific computing using both massively parallel and traditional supercomputers. Because of their considerable potential computational power, loosely or tightly coupled clusters of workstations are increasingly considered as a third alternative to both the more conventional supercomputers based on a small number of powerful vector processors and highly massively parallel processors. Even though many research issues concerning the effective use of workstation clusters and their integration into a large scale production facility are still unresolved, such clusters are already used for production computing. In this tutorial we will utilize the unique experience gained at the NAS facility at NASA Ames Research Center. Over the last five years at NAS, massively parallel supercomputers such as the Connection Machines CM-2 and CM-5 from Thinking Machines Corporation and the iPSC/860 (Touchstone Gamma Machine) and Paragon Machines from Intel were used in a production supercomputer center alongside traditional vector supercomputers such as the Cray Y-MP and C90.
NAS technical summaries: Numerical aerodynamic simulation program, March 1991 - February 1992
NASA Technical Reports Server (NTRS)
1992-01-01
NASA created the Numerical Aerodynamic Simulation (NAS) Program in 1987 to focus resources on solving critical problems in aeroscience and related disciplines by utilizing the power of the most advanced supercomputers available. The NAS Program provides scientists with the necessary computing power to solve today's most demanding computational fluid dynamics problems and serves as a pathfinder in integrating leading-edge supercomputing technologies, thus benefiting other supercomputer centers in Government and industry. This report contains selected scientific results from the 1991-92 NAS Operational Year, March 4, 1991 to March 3, 1992, which is the fifth year of operation. During this year, the scientific community was given access to a Cray-2 and a Cray Y-MP. The Cray-2, the first generation supercomputer, has four processors, 256 megawords of central memory, and a total sustained speed of 250 million floating point operations per second. The Cray Y-MP, the second generation supercomputer, has eight processors and a total sustained speed of one billion floating point operations per second. Additional memory was installed this year, doubling capacity from 128 to 256 megawords of solid-state storage-device memory. Because of its higher performance, the Cray Y-MP delivered approximately 77 percent of the total number of supercomputer hours used during this year.
Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Bei; Ethier, Stephane; Tang, William; ...
2017-06-29
The Gyrokinetic Toroidal Code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5D Vlasov-Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size with sufficient resolution, new thread-level optimizations have been introduced as well as a key additional domain decomposition. GTC-P's multiple levels of parallelism, including inter-node 2D domain decomposition and particle decomposition, as well as intra-node shared memory partition and vectorization have enabled pushing the scalability of the PIC method to extreme computational scales. In this paper, we describe the methods developed to build a highly parallelized PIC code across a broad range of supercomputer designs. This particularly includes implementations on heterogeneous systems using NVIDIA GPU accelerators and Intel Xeon Phi (MIC) co-processors and performance comparisons with state-of-the-art homogeneous HPC systems such as Blue Gene/Q. New discovery science capabilities in the magnetic fusion energy application domain are enabled, including investigations of Ion-Temperature-Gradient (ITG) driven turbulence simulations with unprecedented spatial resolution and long temporal duration. Performance studies with realistic fusion experimental parameters are carried out on multiple supercomputing systems spanning a wide range of cache capacities, cache-sharing configurations, memory bandwidth, interconnects and network topologies. These performance comparisons using a realistic discovery-science-capable domain application code provide valuable insights on optimization techniques across one of the broadest sets of current high-end computing platforms worldwide.
Computational Nanotechnology at NASA Ames Research Center, 1996
NASA Technical Reports Server (NTRS)
Globus, Al; Bailey, David; Langhoff, Steve; Pohorille, Andrew; Levit, Creon; Chancellor, Marisa K. (Technical Monitor)
1996-01-01
Some forms of nanotechnology appear to have enormous potential to improve aerospace and computer systems; computational nanotechnology, the design and simulation of programmable molecular machines, is crucial to progress. NASA Ames Research Center has begun a computational nanotechnology program including in-house work, external research grants, and grants of supercomputer time. Four goals have been established: (1) Simulate a hypothetical programmable molecular machine replicating itself and building other products. (2) Develop molecular manufacturing CAD (computer aided design) software and use it to design molecular manufacturing systems and products of aerospace interest, including computer components. (3) Characterize nanotechnologically accessible materials of aerospace interest. Such materials may have excellent strength and thermal properties. (4) Collaborate with experimentalists. Current in-house activities include: (1) Development of NanoDesign, software to design and simulate a nanotechnology based on functionalized fullerenes. Early work focuses on gears. (2) A design for high density atomically precise memory. (3) Design of nanotechnology systems based on biology. (4) Characterization of diamondoid mechanosynthetic pathways. (5) Studies of the Laplacian of the electronic charge density to understand molecular structure and reactivity. (6) Studies of entropic effects during self-assembly. (7) Characterization of properties of matter for clusters up to sizes exhibiting bulk properties. In addition, the NAS (NASA Advanced Supercomputing) supercomputer division sponsored a workshop on computational molecular nanotechnology on March 4-5, 1996 held at NASA Ames Research Center. Finally, collaborations with Bill Goddard at Caltech, Ralph Merkle at Xerox PARC, Don Brenner at NCSU (North Carolina State University), Tom McKendree at Hughes, and Todd Wipke at UCSC are underway.
Workstations take over conceptual design
NASA Technical Reports Server (NTRS)
Kidwell, George H.
1987-01-01
Workstations provide sufficient computing memory and speed for early evaluations of aircraft design alternatives to identify those worthy of further study. It is recommended that the programming of such machines permit integrated calculations of the configuration and performance analysis of new concepts, along with the capability of changing up to 100 variables at a time and swiftly viewing the results. Computations can be augmented through links to mainframes and supercomputers. Programming, particularly debugging, is enhanced by the capability of working with one program line at a time and having on-screen error indices available. Workstation networks permit on-line communication among users and with persons and computers outside the facility. Application of the capabilities is illustrated through a description of NASA-Ames design efforts for an oblique wing for a jet performed on a MicroVAX network.
Merlin - Massively parallel heterogeneous computing
NASA Technical Reports Server (NTRS)
Wittie, Larry; Maples, Creve
1989-01-01
Hardware and software for Merlin, a new kind of massively parallel computing system, are described. Eight computers are linked as a 300-MIPS prototype to develop system software for a larger Merlin network with 16 to 64 nodes, totaling 600 to 3000 MIPS. These working prototypes help refine a mapped reflective memory technique that offers a new, very general way of linking many types of computer to form supercomputers. Processors share data selectively and rapidly on a word-by-word basis. Fast firmware virtual circuits are reconfigured to match topological needs of individual application programs. Merlin's low-latency memory-sharing interfaces solve many problems in the design of high-performance computing systems. The Merlin prototypes are intended to run parallel programs for scientific applications and to determine hardware and software needs for a future Teraflops Merlin network.
Final Report for Project FG02-05ER25685
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xiaosong Ma
2009-05-07
In this report, the PI summarizes the results and achievements obtained in the sponsored project. Overall, the project has been very successful, producing research results in massive data-intensive computing and data management for today's large-scale supercomputers as well as open-source software products. During the project period, exclusive or shared support from this award contributed to 14 conference/journal publications and the training of two PhD students. In addition, the PI has recently been granted tenure at NC State University.
Fast and Accurate Simulation of the Cray XMT Multithreaded Supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Villa, Oreste; Tumeo, Antonino; Secchi, Simone
Irregular applications, such as data mining and analysis or graph-based computations, show unpredictable memory/network access patterns and control structures. Highly multithreaded architectures with large processor counts, like the Cray MTA-1, MTA-2 and XMT, appear to address their requirements better than commodity clusters. However, the research on highly multithreaded systems is currently limited by the lack of adequate architectural simulation infrastructures due to issues such as size of the machines, memory footprint, simulation speed, accuracy and customization. At the same time, Shared-memory MultiProcessors (SMPs) with multi-core processors have become an attractive platform to simulate large scale machines. In this paper, we introduce a cycle-level simulator of the highly multithreaded Cray XMT supercomputer. The simulator runs unmodified XMT applications. We discuss how we tackled the challenges posed by its development, detailing the techniques introduced to make the simulation as fast as possible while maintaining a high accuracy. By mapping XMT processors (ThreadStorm with 128 hardware threads) to host computing cores, the simulation speed remains constant as the number of simulated processors increases, up to the number of available host cores. The simulator supports zero-overhead switching among different accuracy levels at run-time and includes a network model that takes into account contention. On a modern 48-core SMP host, our infrastructure simulates a large set of irregular applications 500 to 2000 times slower than real time when compared to a 128-processor XMT, while remaining within 10% of accuracy. Emulation is only from 25 to 200 times slower than real time.
Pomona College Dreamers and Achievers Profile
DOE Office of Scientific and Technical Information (OSTI.GOV)
Meyers, C. A.
As an undergraduate at Pomona, Carol Meyers ’00 enjoyed proofs and pure math, but she also had a hankering to solve real-world problems. This eventually led her to discover the field of operations research, in which she earned a PhD from MIT, and from there onto a career at a national laboratory. “National labs are great because they serve a role between academia and industry,” she says, “solving problems too large or too applied for academia, and insufficiently profit-driven for industry.” Her workplace, Lawrence Livermore National Laboratory, is home to the world’s largest laser, frequent home to the world’s fastest supercomputer, and the namesake to element 116 (Livermorium) on the periodic table. During her 10+ years at the laboratory Carol has worked in the areas of energy grid modernization, nuclear counterterrorism, cyber security, stockpile stewardship, and supercomputing, often as a consultant providing mathematical modeling and optimization expertise. Carol has two young kids and serves as co-chair of the New Moms’ Group at her workplace. “Being in a supportive community of moms has really helped me at work,” she says, “because let’s face it – moms know how to get things done!” “People talk a lot about the importance of networking, but this doesn’t have to mean starting technical discussions with strangers. The ability to bond with and relate to other people is far more important, and many women do this very well. Some of my best work networking connections have come through our employer-affiliated daycare.”
Tools for 3D scientific visualization in computational aerodynamics
NASA Technical Reports Server (NTRS)
Bancroft, Gordon; Plessel, Todd; Merritt, Fergus; Watson, Val
1989-01-01
The purpose is to describe the tools and techniques in use at the NASA Ames Research Center for performing visualization of computational aerodynamics, for example visualization of flow fields from computer simulations of fluid dynamics about vehicles such as the Space Shuttle. The hardware used for visualization is a high-performance graphics workstation connected to a supercomputer with a high speed channel. At present, the workstation is a Silicon Graphics IRIS 3130, the supercomputer is a CRAY2, and the high speed channel is a hyperchannel. The three techniques used for visualization are post-processing, tracking, and steering. Post-processing analysis is done after the simulation. Tracking analysis is done during a simulation but is not interactive, whereas steering analysis involves modifying the simulation interactively during the simulation. Using post-processing methods, a flow simulation is executed on a supercomputer and, after the simulation is complete, the results of the simulation are processed for viewing. The software in use and under development at NASA Ames Research Center for performing these types of tasks in computational aerodynamics is described. Workstation performance issues, benchmarking, and high-performance networks for this purpose are also discussed, as are other hardware options for digital video and film recording.
UNIX security in a supercomputing environment
NASA Technical Reports Server (NTRS)
Bishop, Matt
1989-01-01
The author critiques some security mechanisms in most versions of the Unix operating system and suggests more effective tools that either have working prototypes or have been implemented, for example in secure Unix systems. Although no computer (not even a secure one) is impenetrable, breaking into systems with these alternate mechanisms will cost more, require more skill, and be more easily detected than penetrations of systems without these mechanisms. The mechanisms described fall into four classes (with considerable overlap). User authentication at the local host affirms the identity of the person using the computer. The principle of least privilege dictates that properly authenticated users should have rights precisely sufficient to perform their tasks, and system administration functions should be compartmentalized; to this end, access control lists or capabilities should either replace or augment the default Unix protection system, and mandatory access controls implementing multilevel security models and integrity mechanisms should be available. Since most users access supercomputing environments using networks, the third class of mechanisms augments authentication (where feasible). As no security is perfect, the fourth class of mechanism logs events that may indicate possible security violations; this will allow the reconstruction of a successful penetration (if discovered), or possibly the detection of an attempted penetration.
An efficient framework for Java data processing systems in HPC environments
NASA Astrophysics Data System (ADS)
Fries, Aidan; Castañeda, Javier; Isasi, Yago; Taboada, Guillermo L.; Portell de Mora, Jordi; Sirvent, Raül
2011-11-01
Java is a commonly used programming language, although its use in High Performance Computing (HPC) remains relatively low. One of the reasons is a lack of libraries offering specific HPC functions to Java applications. In this paper we present a Java-based framework, called DpcbTools, designed to provide a set of functions that fill this gap. It includes a set of efficient data communication functions based on message-passing, thus providing, when a low latency network such as Myrinet is available, higher throughputs and lower latencies than standard solutions used by Java. DpcbTools also includes routines for the launching, monitoring and management of Java applications on several computing nodes by making use of JMX to communicate with remote Java VMs. The Gaia Data Processing and Analysis Consortium (DPAC) is a real case where scientific data from the ESA Gaia astrometric satellite will be entirely processed using Java. In this paper we describe the main elements of DPAC and its usage of the DpcbTools framework. We also assess the usefulness and performance of DpcbTools through its performance evaluation and the analysis of its impact on some DPAC systems deployed in the MareNostrum supercomputer (Barcelona Supercomputing Center).
Desktop supercomputer: what can it do?
NASA Astrophysics Data System (ADS)
Bogdanov, A.; Degtyarev, A.; Korkhov, V.
2017-12-01
The paper addresses the issues of solving complex problems that require using supercomputers or multiprocessor clusters available to most researchers nowadays. Efficient distribution of high performance computing resources according to actual application needs has been a major research topic since high-performance computing (HPC) technologies were widely introduced. At the same time, comfortable and transparent access to these resources has been a key user requirement. In this paper we discuss approaches to building a virtual private supercomputer available at the user's desktop: a virtual computing environment tailored specifically for a target user with a particular target application. We describe and evaluate possibilities for creating the virtual supercomputer based on light-weight virtualization technologies, and analyze the efficiency of our approach compared to traditional methods of HPC resource management.
DOE Office of Scientific and Technical Information (OSTI.GOV)
De, K; Jha, S; Klimentov, A
2016-01-01
The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3 petaFLOPS, LHC data taking runs require more resources than Grid computing can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in the United States, Europe and Russia (in particular with the Titan supercomputer at Oak Ridge Leadership Computing Facility (OLCF), the MIRA supercomputer at Argonne Leadership Computing Facility (ALCF), the supercomputer at the National Research Center Kurchatov Institute, IT4 in Ostrava and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on the LCFs' multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms for the ALICE and ATLAS experiments and has been in full production for the ATLAS experiment since September 2015. We will present our current accomplishments with running PanDA WMS at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.
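As a rough illustration of the "light-weight MPI wrapper" idea in the abstract above, the sketch below shows one way independent single-threaded payloads could be divided among MPI ranks. It is not PanDA's code: the function name `jobs_for_rank`, the round-robin policy, and the payload names are all assumptions made for illustration, and the rank and size values are passed in as plain integers rather than read from an MPI runtime.

```python
from typing import List, Sequence


def jobs_for_rank(jobs: Sequence[str], rank: int, size: int) -> List[str]:
    """Return the round-robin share of `jobs` for one rank out of `size` ranks."""
    if not 0 <= rank < size:
        raise ValueError("rank must satisfy 0 <= rank < size")
    # Rank r takes jobs r, r + size, r + 2*size, ...
    return list(jobs[rank::size])


# Hypothetical payload names. In a real wrapper each rank would obtain its own
# rank and size from the MPI runtime and run its share as ordinary
# single-threaded processes, with no communication between ranks.
jobs = [f"event_batch_{i:03d}" for i in range(10)]
size = 4
plan = {rank: jobs_for_rank(jobs, rank, size) for rank in range(size)}

for rank, share in plan.items():
    print(rank, share)
```

Because the payloads never exchange data, MPI serves here only as a launcher that occupies a whole multi-core allocation with one batch submission; any static or dynamic assignment policy could replace the round-robin slice.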
Color graphics, interactive processing, and the supercomputer
NASA Technical Reports Server (NTRS)
Smith-Taylor, Rudeen
1987-01-01
The development of a common graphics environment for the NASA Langley Research Center user community and the integration of a supercomputer into this environment is examined. The initial computer hardware, the software graphics packages, and their configurations are described. The addition of improved computer graphics capability to the supercomputer, and the utilization of the graphic software and hardware are discussed. Consideration is given to the interactive processing system which supports the computer in an interactive debugging, processing, and graphics environment.
NASA Advanced Supercomputing (NAS) User Services Group
NASA Technical Reports Server (NTRS)
Pandori, John; Hamilton, Chris; Niggley, C. E.; Parks, John W. (Technical Monitor)
2002-01-01
This viewgraph presentation provides an overview of NAS (NASA Advanced Supercomputing), its goals, and its mainframe computer assets. Also covered are its functions, including systems monitoring and technical support.
Collisional breakup in a quantum system of three charged particles
Rescigno; Baertschy; Isaacs; McCurdy
1999-12-24
Since the invention of quantum mechanics, even the simplest example of the collisional breakup of a system of charged particles, e(-) + H --> H(+) + e(-) + e(-) (where e(-) is an electron and H is hydrogen), has resisted solution and is now one of the last unsolved fundamental problems in atomic physics. A complete solution requires calculation of the energies and directions for a final state in which all three particles are moving away from each other. Even with supercomputers, the correct mathematical description of this state has proved difficult to apply. A framework for solving ionization problems in many areas of chemistry and physics is finally provided by a mathematical transformation of the Schrödinger equation that makes the final state tractable, providing the key to a numerical solution of this problem that reveals its full dynamics.
NASA Astrophysics Data System (ADS)
Green, H. D.; Contractor, N. S.; Yao, Y.
2006-12-01
A knowledge network is a multi-dimensional network created from the interactions and interconnections among the scientists, documents, data, analytic tools, and interactive collaboration spaces (like forums and wikis) associated with a collaborative environment. CI-KNOW is a suite of software tools that leverages automated data collection, social network theories, analysis techniques and algorithms to infer an individual's interests and expertise based on their interactions and activities within a knowledge network. The CI-KNOW recommender system mines the knowledge network associated with a scientific community's use of cyberinfrastructure tools and uses relational metadata to record connections among entities in the knowledge network. Recent developments in social network theories and methods provide the backbone for a modular system that creates recommendations from relational metadata. A network navigation portlet allows users to locate colleagues, documents, data or analytic tools in the knowledge network and to explore their networks through a visual, step-wise process. An internal auditing portlet offers administrators diagnostics to assess the growth and health of the entire knowledge network. The first instantiation of the prototype CI-KNOW system is part of the Environmental Cyberinfrastructure Demonstration project at the National Center for Supercomputing Applications, which supports the activities of hydrologic and environmental science communities (CLEANER and CUAHSI) under the umbrella of the WATERS network environmental observatory planning activities (http://cleaner.ncsa.uiuc.edu). This poster summarizes the key aspects of the CI-KNOW system, highlighting the key inputs, calculation mechanisms, and output modalities.
NSF Commits to Supercomputers.
ERIC Educational Resources Information Center
Waldrop, M. Mitchell
1985-01-01
The National Science Foundation (NSF) has allocated at least $200 million over the next five years to support four new supercomputer centers. Issues and trends related to this NSF initiative are examined. (JN)
Optimization of the computational load of a hypercube supercomputer onboard a mobile robot.
Barhen, J; Toomarian, N; Protopopescu, V
1987-12-01
A combinatorial optimization methodology is developed, which enables the efficient use of hypercube multiprocessors onboard mobile intelligent robots dedicated to time-critical missions. The methodology is implemented in terms of large-scale concurrent algorithms based either on fast simulated annealing, or on nonlinear asynchronous neural networks. In particular, analytic expressions are given for the effect of single-neuron perturbations on the systems' configuration energy. Compact neuromorphic data structures are used to model effects such as precedence constraints, processor idling times, and task-schedule overlaps. Results for a typical robot-dynamics benchmark are presented.
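To make the load-balancing idea concrete, here is a minimal sketch of annealing-based task assignment. It deliberately swaps in classical Metropolis simulated annealing with geometric cooling rather than the "fast simulated annealing" variant named in the abstract, and it ignores precedence constraints and communication costs. The cost function (makespan), the task durations, and all parameter values are assumptions for illustration only.

```python
import math
import random
from typing import List, Sequence, Tuple


def makespan(assignment: Sequence[int], durations: Sequence[float], n_procs: int) -> float:
    """Finish time of the busiest processor for a task-to-processor assignment."""
    loads = [0.0] * n_procs
    for task, proc in enumerate(assignment):
        loads[proc] += durations[task]
    return max(loads)


def anneal_schedule(durations: Sequence[float], n_procs: int, steps: int = 5000,
                    t0: float = 10.0, cooling: float = 0.999,
                    seed: int = 0) -> Tuple[List[int], float]:
    """Search for a low-makespan assignment with classical simulated annealing."""
    rng = random.Random(seed)
    current = [rng.randrange(n_procs) for _ in durations]
    cur_cost = makespan(current, durations, n_procs)
    best, best_cost = current[:], cur_cost
    temp = t0
    for _ in range(steps):
        # Propose moving one randomly chosen task to a random processor.
        task = rng.randrange(len(durations))
        old_proc = current[task]
        current[task] = rng.randrange(n_procs)
        new_cost = makespan(current, durations, n_procs)
        delta = new_cost - cur_cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cur_cost = new_cost
            if cur_cost < best_cost:
                best, best_cost = current[:], cur_cost
        else:
            current[task] = old_proc  # reject the move
        temp *= cooling  # geometric cooling schedule
    return best, best_cost


durations = [4, 3, 3, 2, 2, 2, 1, 1]  # hypothetical task run times
assignment, cost = anneal_schedule(durations, n_procs=3)
print(assignment, cost)
```

The occasional acceptance of worse moves is what lets the search escape local minima; a real scheduler for the problem in the abstract would also need penalty terms for precedence violations and idle time.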
LLNL Partners with IBM on Brain-Like Computing Chip
DOE Office of Scientific and Technical Information (OSTI.GOV)
Van Essen, Brian
Lawrence Livermore National Laboratory (LLNL) will receive a first-of-a-kind brain-inspired supercomputing platform for deep learning developed by IBM Research. Based on a breakthrough neurosynaptic computer chip called IBM TrueNorth, the scalable platform will process the equivalent of 16 million neurons and 4 billion synapses and consume the energy equivalent of a hearing aid battery – a mere 2.5 watts of power. The brain-like, neural network design of the IBM Neuromorphic System is able to infer complex cognitive tasks such as pattern recognition and integrated sensory processing far more efficiently than conventional chips.
NASA Technical Reports Server (NTRS)
Gillian, Ronnie E.; Lotts, Christine G.
1988-01-01
The Computational Structural Mechanics (CSM) Activity at Langley Research Center is developing methods for structural analysis on modern computers. To facilitate that research effort, an applications development environment has been constructed to insulate the researcher from the many computer operating systems of a widely distributed computer network. The CSM Testbed development system was ported to the Numerical Aerodynamic Simulator (NAS) Cray-2, at the Ames Research Center, to provide a high end computational capability. This paper describes the implementation experiences, the resulting capability, and the future directions for the Testbed on supercomputers.
LLNL Partners with IBM on Brain-Like Computing Chip
Van Essen, Brian
2018-06-25
Lawrence Livermore National Laboratory (LLNL) will receive a first-of-a-kind brain-inspired supercomputing platform for deep learning developed by IBM Research. Based on a breakthrough neurosynaptic computer chip called IBM TrueNorth, the scalable platform will process the equivalent of 16 million neurons and 4 billion synapses and consume the energy equivalent of a hearing aid battery – a mere 2.5 watts of power. The brain-like, neural network design of the IBM Neuromorphic System is able to infer complex cognitive tasks such as pattern recognition and integrated sensory processing far more efficiently than conventional chips.
Algorithm implementation on the Navier-Stokes computer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Krist, S.E.; Zang, T.A.
1987-03-01
The Navier-Stokes Computer is a multi-purpose parallel-processing supercomputer which is currently under development at Princeton University. It consists of multiple local memory parallel processors, called Nodes, which are interconnected in a hypercube network. Details of the procedures involved in implementing an algorithm on the Navier-Stokes computer are presented. The particular finite difference algorithm considered in this analysis was developed for simulation of laminar-turbulent transition in wall bounded shear flows. Projected timing results for implementing this algorithm indicate that operation rates in excess of 42 GFLOPS are feasible on a 128 Node machine.
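For readers unfamiliar with hypercube interconnects, the sketch below works through what a 128-node hypercube implies. It assumes the common convention of labelling nodes with binary IDs so that neighbours differ in exactly one bit; the abstract does not state how the Navier-Stokes Computer actually numbers or routes between its Nodes, so the function names and the labelling are illustrative only.

```python
from typing import List


def hypercube_neighbors(node: int, dim: int) -> List[int]:
    """Neighbours of `node` in a `dim`-dimensional hypercube: flip one bit each."""
    return [node ^ (1 << k) for k in range(dim)]


def hop_distance(a: int, b: int) -> int:
    """Minimum number of links between two nodes: the Hamming distance of their IDs."""
    return bin(a ^ b).count("1")


n_nodes = 128
dim = n_nodes.bit_length() - 1       # 128 = 2**7, so a 7-dimensional hypercube
per_node_mflops = 42_000 / n_nodes   # 42 GFLOPS spread evenly over 128 nodes

print(dim)                           # links per node
print(hypercube_neighbors(0, dim))   # direct neighbours of node 0
print(hop_distance(0, n_nodes - 1))  # longest shortest path in the network
print(per_node_mflops)               # average MFLOPS each node must sustain
```

Under these assumptions each node has 7 links, no two nodes are more than 7 hops apart, and the projected aggregate rate corresponds to roughly 328 MFLOPS sustained per node.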
Algorithm implementation on the Navier-Stokes computer
NASA Technical Reports Server (NTRS)
Krist, Steven E.; Zang, Thomas A.
1987-01-01
The Navier-Stokes Computer is a multi-purpose parallel-processing supercomputer which is currently under development at Princeton University. It consists of multiple local memory parallel processors, called Nodes, which are interconnected in a hypercube network. Details of the procedures involved in implementing an algorithm on the Navier-Stokes computer are presented. The particular finite difference algorithm considered in this analysis was developed for simulation of laminar-turbulent transition in wall bounded shear flows. Projected timing results for implementing this algorithm indicate that operation rates in excess of 42 GFLOPS are feasible on a 128 Node machine.
Optimization of the computational load of a hypercube supercomputer onboard a mobile robot
NASA Technical Reports Server (NTRS)
Barhen, Jacob; Toomarian, N.; Protopopescu, V.
1987-01-01
A combinatorial optimization methodology is developed, which enables the efficient use of hypercube multiprocessors onboard mobile intelligent robots dedicated to time-critical missions. The methodology is implemented in terms of large-scale concurrent algorithms based either on fast simulated annealing, or on nonlinear asynchronous neural networks. In particular, analytic expressions are given for the effect of single-neuron perturbations on the systems' configuration energy. Compact neuromorphic data structures are used to model effects such as precedence constraints, processor idling times, and task-schedule overlaps. Results for a typical robot-dynamics benchmark are presented.
Airport Simulations Using Distributed Computational Resources
NASA Technical Reports Server (NTRS)
McDermott, William J.; Maluf, David A.; Gawdiak, Yuri; Tran, Peter; Clancy, Daniel (Technical Monitor)
2002-01-01
The Virtual National Airspace Simulation (VNAS) will improve the safety of Air Transportation. In 2001, using simulation and information management software running over a distributed network of super-computers, researchers at NASA Ames, Glenn, and Langley Research Centers developed a working prototype of a virtual airspace. This VNAS prototype modeled daily operations of the Atlanta airport by integrating measured operational data and simulation data on up to 2,000 flights a day. The concepts and architecture developed by NASA for this prototype are integral to the National Airspace Simulation to support the development of strategies improving aviation safety, identifying precursors to component failure.
Mira: Argonne's 10-petaflops supercomputer
Papka, Michael; Coghlan, Susan; Isaacs, Eric; Peters, Mark; Messina, Paul
2018-02-13
Mira, Argonne's petascale IBM Blue Gene/Q system, ushers in a new era of scientific supercomputing at the Argonne Leadership Computing Facility. An engineering marvel, the 10-petaflops supercomputer is capable of carrying out 10 quadrillion calculations per second. As a machine for open science, any researcher with a question that requires large-scale computing resources can submit a proposal for time on Mira, typically in allocations of millions of core-hours, to run programs for their experiments. This adds up to billions of hours of computing time per year.
Adventures in Computational Grids
NASA Technical Reports Server (NTRS)
Walatka, Pamela P.; Biegel, Bryan A. (Technical Monitor)
2002-01-01
Sometimes one supercomputer is not enough. Or your local supercomputers are busy, or not configured for your job. Or you don't have any supercomputers. You might be trying to simulate worldwide weather changes in real time, requiring more compute power than you could get from any one machine. Or you might be collecting microbiological samples on an island, and need to examine them with a special microscope located on the other side of the continent. These are the times when you need a computational grid.
Mira: Argonne's 10-petaflops supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Papka, Michael; Coghlan, Susan; Isaacs, Eric
2013-07-03
Mira, Argonne's petascale IBM Blue Gene/Q system, ushers in a new era of scientific supercomputing at the Argonne Leadership Computing Facility. An engineering marvel, the 10-petaflops supercomputer is capable of carrying out 10 quadrillion calculations per second. As a machine for open science, any researcher with a question that requires large-scale computing resources can submit a proposal for time on Mira, typically in allocations of millions of core-hours, to run programs for their experiments. This adds up to billions of hours of computing time per year.
Breakthrough: NETL's Simulation-Based Engineering User Center (SBEUC)
Guenther, Chris
2018-05-23
The National Energy Technology Laboratory relies on supercomputers to develop many novel ideas that become tomorrow's energy solutions. Supercomputers provide a cost-effective, efficient platform for research and usher technologies into widespread use faster to bring benefits to the nation. In 2013, Secretary of Energy Dr. Ernest Moniz dedicated NETL's new supercomputer, the Simulation Based Engineering User Center, or SBEUC. The SBEUC is dedicated to fossil energy research and is a collaborative tool for all of NETL and our regional university partners.
A high level language for a high performance computer
NASA Technical Reports Server (NTRS)
Perrott, R. H.
1978-01-01
The proposed computational aerodynamic facility will join the ranks of the supercomputers due to its architecture and increased execution speed. At present, the languages used to program these supercomputers have been modifications of programming languages which were designed many years ago for sequential machines. A new programming language should be developed based on the techniques which have proved valuable for sequential programming languages and incorporating the algorithmic techniques required for these supercomputers. The design objectives for such a language are outlined.
Technology advances and market forces: Their impact on high performance architectures
NASA Technical Reports Server (NTRS)
Best, D. R.
1978-01-01
Reasonable projections into future supercomputer architectures and technology require an analysis of the computer industry market environment, the current capabilities and trends within the component industry, and the research activities on computer architecture in the industrial and academic communities. Management, programmer, architect, and user must cooperate to increase the efficiency of supercomputer development efforts. Care must be taken to match the funding, compiler, architecture and application with greater attention to testability, maintainability, reliability, and usability than supercomputer development programs of the past.
Floating point arithmetic in future supercomputers
NASA Technical Reports Server (NTRS)
Bailey, David H.; Barton, John T.; Simon, Horst D.; Fouts, Martin J.
1989-01-01
Considerations in the floating-point design of a supercomputer are discussed. Particular attention is given to word size, hardware support for extended precision, format, and accuracy characteristics. These issues are discussed from the perspective of the Numerical Aerodynamic Simulation Systems Division at NASA Ames. The features believed to be most important for a future supercomputer floating-point design include: (1) a 64-bit IEEE floating-point format with 11 exponent bits, 52 mantissa bits, and one sign bit and (2) hardware support for reasonably fast double-precision arithmetic.
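The 64-bit layout recommended above (one sign bit, 11 exponent bits, 52 mantissa bits) is the IEEE 754 double-precision format, which is also what Python's `float` uses on essentially all current platforms. The sketch below, with the hypothetical helper name `decode_double`, splits a value into those three fields using only the standard `struct` module.

```python
import struct
from typing import Tuple


def decode_double(x: float) -> Tuple[int, int, int]:
    """Split an IEEE 754 binary64 value into (sign, biased exponent, fraction)."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                    # 1 bit
    exponent = (bits >> 52) & 0x7FF      # 11 bits, stored with a bias of 1023
    fraction = bits & ((1 << 52) - 1)    # 52 explicitly stored mantissa bits
    return sign, exponent, fraction


for value in (1.0, -2.0, 0.5, 1.5):
    print(value, decode_double(value))
```

For normal numbers the represented value is (-1)^sign × (1 + fraction / 2^52) × 2^(exponent − 1023), so 1.0 decodes to a biased exponent of 1023 with a zero fraction.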
Breakthrough: NETL's Simulation-Based Engineering User Center (SBEUC)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guenther, Chris
The National Energy Technology Laboratory relies on supercomputers to develop many novel ideas that become tomorrow's energy solutions. Supercomputers provide a cost-effective, efficient platform for research and usher technologies into widespread use faster to bring benefits to the nation. In 2013, Secretary of Energy Dr. Ernest Moniz dedicated NETL's new supercomputer, the Simulation Based Engineering User Center, or SBEUC. The SBEUC is dedicated to fossil energy research and is a collaborative tool for all of NETL and our regional university partners.
Integration of Panda Workload Management System with supercomputers
NASA Astrophysics Data System (ADS)
De, K.; Jha, S.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Nilsson, P.; Novikov, A.; Oleynik, D.; Panitkin, S.; Poyda, A.; Read, K. F.; Ryabinkin, E.; Teslyuk, A.; Velikhov, V.; Wells, J. C.; Wenaus, T.
2016-09-01
The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited with the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing at over 140 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250000 cores with a peak performance of 0.3+ petaFLOPS, the next LHC data-taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in the United States, Europe and Russia (in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), the supercomputer at the National Research Center "Kurchatov Institute", IT4 in Ostrava, and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on Titan's multi-core worker nodes. 
This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms. We will present our current accomplishments in running PanDA WMS on supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facility's infrastructure for High Energy and Nuclear Physics, as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.
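The idea of a light-weight MPI wrapper for single-threaded workloads can be sketched as follows. This is not the PanDA pilot code: the function share_for_rank, the round-robin split, and the toy payload commands are all assumptions made for illustration. The sketch uses mpi4py when it is installed and falls back to a single rank otherwise.

```python
import subprocess
import sys


def share_for_rank(payloads, rank, size):
    """Round-robin split: rank r takes payloads r, r + size, r + 2*size, ...

    Hypothetical helper; a real wrapper might balance by expected run time.
    """
    return payloads[rank::size]


def main():
    try:
        from mpi4py import MPI  # optional dependency
        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()
    except ImportError:
        comm, rank, size = None, 0, 1  # serial fallback, assumption for the sketch

    # Toy single-threaded payloads standing in for independent simulation jobs.
    payloads = [[sys.executable, "-c", f"print('payload {i} done')"]
                for i in range(8)]

    # Each rank runs only its own share, one process per payload.
    codes = [subprocess.run(cmd).returncode
             for cmd in share_for_rank(payloads, rank, size)]

    # Collect exit codes on rank 0 so one batch job reports one overall status.
    gathered = comm.gather(codes, root=0) if comm is not None else [codes]
    if rank == 0:
        failed = sum(code != 0 for part in gathered for code in part)
        print(f"{len(payloads) - failed} of {len(payloads)} payloads succeeded")


if __name__ == "__main__":
    main()
```

Under these assumptions the whole set of payloads occupies a single batch-queue slot, and the scheduler sees one MPI job rather than many small serial ones.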
Tracing Scientific Facilities through the Research Literature Using Persistent Identifiers
NASA Astrophysics Data System (ADS)
Mayernik, M. S.; Maull, K. E.
2016-12-01
Tracing persistent identifiers to their source publications is an easy task when authors use them, since it is a simple matter of matching the persistent identifier to the specific text string of the identifier. However, trying to understand if a publication uses the resource behind an identifier when such identifier is not referenced explicitly is a harder task. In this research, we explore the effectiveness of alternative strategies of associating publications with uses of the resource referenced by an identifier when it may not be explicit. This project is explored within the context of the NCAR supercomputer, where we are broadly interested in the science that can be traced to the usage of the NCAR supercomputing facility, by way of the peer-reviewed research publications that utilize and reference it. In this project we explore several ways of drawing linkages between publications and the NCAR supercomputing resources. Identifying and compiling peer-reviewed publications related to NCAR supercomputer usage are explored via three sources: 1) user-supplied publications gathered through a community survey, 2) publications that were identified via manual searching of the Google Scholar search index, and 3) publications associated with National Science Foundation (NSF) grants extracted from a public NSF database. These three sources represent three styles of collecting information about publications that likely imply usage of the NCAR supercomputing facilities. Each source has strengths and weaknesses, thus our discussion will explore how our publication identification and analysis methods vary in terms of accuracy, reliability, and effort. We will also discuss strategies for enabling more efficient tracing of research impacts of supercomputing facilities going forward through the assignment of a persistent web identifier to the NCAR supercomputer. 
While this solution has potential to greatly enhance our ability to trace the use of the facility through publications, authors must cite the facility consistently. It is therefore necessary to provide recommendations for citation and attribution behavior, and we will conclude our discussion with how such recommendations have improved tracing the supercomputer facility allowing for more consistent and widespread measurement of its impact.
Neuropeptide Signaling Networks and Brain Circuit Plasticity.
McClard, Cynthia K; Arenkiel, Benjamin R
2018-01-01
The brain is a remarkable network of circuits dedicated to sensory integration, perception, and response. The computational power of the brain is estimated to dwarf that of most modern supercomputers, but perhaps its most fascinating capability is to structurally refine itself in response to experience. In the language of computers, the brain is loaded with programs that encode when and how to alter its own hardware. This programmed "plasticity" is a critical mechanism by which the brain shapes behavior to adapt to changing environments. The expansive array of molecular commands that help execute this programming is beginning to emerge. Notably, several neuropeptide transmitters, previously best characterized for their roles in hypothalamic endocrine regulation, have increasingly been recognized for mediating activity-dependent refinement of local brain circuits. Here, we discuss recent discoveries that reveal how local signaling by corticotropin-releasing hormone reshapes mouse olfactory bulb circuits in response to activity and further explore how other local neuropeptide networks may function toward similar ends.
Seeing the forest for the trees: Networked workstations as a parallel processing computer
NASA Technical Reports Server (NTRS)
Breen, J. O.; Meleedy, D. M.
1992-01-01
Unlike traditional 'serial' processing computers, in which one central processing unit performs one instruction at a time, parallel processing computers contain several processing units and thereby perform several instructions at once. Many of today's fastest supercomputers achieve their speed by employing thousands of processing elements working in parallel. Few institutions can afford these state-of-the-art parallel processors, but many already have the makings of a modest parallel processing system. Workstations on existing high-speed networks can be harnessed as nodes in a parallel processing environment, bringing the benefits of parallel processing to many. While such a system cannot rival the industry's latest machines, many common tasks can be accelerated greatly by spreading the processing burden and exploiting idle network resources. We study several aspects of this approach, from algorithms to select nodes to speed gains in specific tasks. With ever-increasing volumes of astronomical data, it becomes all the more necessary to utilize our computing resources fully.
A Percolation Model for Fracking
NASA Astrophysics Data System (ADS)
Norris, J. Q.; Turcotte, D. L.; Rundle, J. B.
2014-12-01
Developments in fracking technology have enabled the recovery of vast reserves of oil and gas; yet, there is very little publicly available scientific research on fracking. Traditional reservoir simulator models for fracking are computationally expensive, and require many hours on a supercomputer to simulate a single fracking treatment. We have developed a computationally inexpensive percolation model for fracking that can be used to understand the processes and risks associated with fracking. In our model, a fluid is injected from a single site and a network of fractures grows from the single site. The fracture network grows in bursts: the failure of a relatively strong bond is followed by the failure of a series of relatively weak bonds. These bursts display similarities to microseismic events observed during a fracking treatment. The bursts follow a power-law (Gutenberg-Richter) frequency-size distribution and have growth rates similar to observed earthquake moment rates. These are quantifiable features that can be compared to observed microseismicity to help understand the relationship between observed microseismicity and the underlying fracture network.
Hyperswitch communication network
NASA Technical Reports Server (NTRS)
Peterson, J.; Pniel, M.; Upchurch, E.
1991-01-01
The Hyperswitch Communication Network (HCN) is a large scale parallel computer prototype being developed at JPL. Commercial versions of the HCN computer are planned. The HCN computer being designed is a message passing multiple instruction multiple data (MIMD) computer, and offers many advantages in price-performance ratio, reliability and availability, and manufacturing over traditional uniprocessors and bus based multiprocessors. The design of the HCN operating system is a uniquely flexible environment that combines both parallel processing and distributed processing. This programming paradigm can achieve a balance among the following competing factors: performance in processing and communications, user friendliness, and fault tolerance. The prototype is being designed to accommodate a maximum of 64 state of the art microprocessors. The HCN is classified as a distributed supercomputer. The HCN system is described, and the performance/cost analysis and other competing factors within the system design are reviewed.
NASA Astrophysics Data System (ADS)
Cervone, G.; Clemente-Harding, L.; Alessandrini, S.; Delle Monache, L.
2016-12-01
A methodology based on Artificial Neural Networks (ANN) and an Analog Ensemble (AnEn) is presented to generate 72-hour deterministic and probabilistic forecasts of power generated by photovoltaic (PV) power plants using input from a numerical weather prediction model and computed astronomical variables. ANN and AnEn are used individually and in combination to generate forecasts for three solar power plants located in Italy. The computational scalability of the proposed solution is tested using synthetic data simulating 4,450 PV power stations. The NCAR Yellowstone supercomputer is employed to test the parallel implementation of the proposed solution, ranging from 1 node (32 cores) to 4,450 nodes (141,140 cores). Results show that a combined AnEn + ANN solution yields the best results, and that the proposed solution is well suited for massive scale computation.
Energy Efficient Supercomputing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anypas, Katie
2014-10-17
Katie Anypas, Head of NERSC's Services Department discusses the Lab's research into developing increasingly powerful and energy efficient supercomputers at our '8 Big Ideas' Science at the Theater event on October 8th, 2014, in Oakland, California.
Job Management Requirements for NAS Parallel Systems and Clusters
NASA Technical Reports Server (NTRS)
Saphir, William; Tanner, Leigh Ann; Traversat, Bernard
1995-01-01
A job management system is a critical component of a production supercomputing environment, permitting oversubscribed resources to be shared fairly and efficiently. Job management systems that were originally designed for traditional vector supercomputers are not appropriate for the distributed-memory parallel supercomputers that are becoming increasingly important in the high performance computing industry. Newer job management systems offer new functionality but do not solve fundamental problems. We address some of the main issues in resource allocation and job scheduling we have encountered on two parallel computers - a 160-node IBM SP2 and a cluster of 20 high performance workstations located at the Numerical Aerodynamic Simulation facility. We describe the requirements for resource allocation and job management that are necessary to provide a production supercomputing environment on these machines, prioritizing according to difficulty and importance, and advocating a return to fundamental issues.
Approaching the exa-scale: a real-world evaluation of rendering extremely large data sets
DOE Office of Scientific and Technical Information (OSTI.GOV)
Patchett, John M; Ahrens, James P; Lo, Li - Ta
2010-10-15
Extremely large scale analysis is becoming increasingly important as supercomputers and their simulations move from petascale to exascale. The lack of dedicated hardware acceleration for rendering on today's supercomputing platforms motivates our detailed evaluation of the possibility of interactive rendering on the supercomputer. In order to facilitate our understanding of rendering on the supercomputing platform, we focus on scalability of rendering algorithms and architecture envisioned for exascale datasets. To understand tradeoffs for dealing with extremely large datasets, we compare three different rendering algorithms for large polygonal data: software based ray tracing, software based rasterization and hardware accelerated rasterization. We present a case study of strong and weak scaling of rendering extremely large data on both GPU and CPU based parallel supercomputers using ParaView, a parallel visualization tool. We use three different data sets: two synthetic and one from a scientific application. At an extreme scale, algorithmic rendering choices make a difference and should be considered while approaching exascale computing, visualization, and analysis. We find software based ray-tracing offers a viable approach for scalable rendering of the projected future massive data sizes.
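The strong and weak scaling mentioned in this abstract are usually summarized as efficiencies. The sketch below uses the standard textbook definitions; the function names and the timing values are hypothetical and are not measurements from the report.

```python
def strong_scaling_efficiency(t_one_worker, workers, t_many_workers):
    """Strong scaling: total problem size is fixed while workers increase.

    Ideal run time on `workers` workers is t_one_worker / workers, so the
    efficiency is the ideal time divided by the measured time.
    """
    return t_one_worker / (workers * t_many_workers)


def weak_scaling_efficiency(t_one_worker, t_many_workers):
    """Weak scaling: problem size per worker is fixed while workers increase.

    Ideal run time stays constant, so efficiency is the ratio of the
    single-worker time to the measured multi-worker time.
    """
    return t_one_worker / t_many_workers


if __name__ == "__main__":
    # Hypothetical render times in seconds, for illustration only.
    print(f"strong: {strong_scaling_efficiency(100.0, 4, 30.0):.2f}")
    print(f"weak:   {weak_scaling_efficiency(10.0, 12.5):.2f}")
```

An efficiency of 1.0 means perfect scaling in either sense; values below 1.0 indicate overheads such as communication or compositing cost.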
Supercomputing Drives Innovation - Continuum Magazine | NREL
years, NREL scientists have used supercomputers to simulate 3D models of the primary enzymes and Scientist, discuss a 3D model of wind plant aerodynamics, showing low velocity wakes and impact on
Iowa State University – Final Report for SciDAC3/NUCLEI
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vary, James P
The Iowa State University (ISU) contributions to the NUCLEI project are focused on developing, implementing and running an efficient and scalable configuration interaction code (Many-Fermion Dynamics – nuclear or MFDn) for leadership class supercomputers addressing forefront research problems in low-energy nuclear physics. We investigate nuclear structure and reactions with realistic nucleon-nucleon (NN) and three-nucleon (3N) interactions. We select a few highlights from our work that has produced a total of more than 82 refereed publications and more than 109 invited talks under SciDAC3/NUCLEI.
Ensemble-based docking: From hit discovery to metabolism and toxicity predictions
Evangelista, Wilfredo; Weir, Rebecca; Ellingson, Sally; ...
2016-07-29
The use of ensemble-based docking for the exploration of biochemical pathways and toxicity prediction of drug candidates is described. We describe the computational engineering work necessary to enable large ensemble docking campaigns on supercomputers. We show examples where ensemble-based docking has significantly increased the number and the diversity of validated drug candidates. Finally, we illustrate how ensemble-based docking can be extended beyond hit discovery and toward providing a structural basis for the prediction of metabolism and off-target binding relevant to pre-clinical and clinical trials.
Exploiting Thread Parallelism for Ocean Modeling on Cray XC Supercomputers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sarje, Abhinav; Jacobsen, Douglas W.; Williams, Samuel W.
The incorporation of increasing core counts in modern processors used to build state-of-the-art supercomputers is driving application development towards exploitation of thread parallelism, in addition to distributed memory parallelism, with the goal of delivering efficient high-performance codes. In this work we describe the exploitation of threading and our experiences with it with respect to a real-world ocean modeling application code, MPAS-Ocean. We present detailed performance analysis and comparisons of various approaches and configurations for threading on the Cray XC series supercomputers.
Supercomputer algorithms for efficient linear octree encoding of three-dimensional brain images.
Berger, S B; Reis, D J
1995-02-01
We designed and implemented algorithms for three-dimensional (3-D) reconstruction of brain images from serial sections using two important supercomputer architectures, vector and parallel. These architectures were represented by the Cray YMP and Connection Machine CM-2, respectively. The programs operated on linear octree representations of the brain data sets, and achieved 500-800 times acceleration when compared with a conventional laboratory workstation. As the need for higher resolution data sets increases, supercomputer algorithms may offer a means of performing 3-D reconstruction well above current experimental limits.
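A linear octree stores only the occupied leaf cubes, each identified by a location code, instead of a pointer-based tree. The sketch below is one common way to build such a representation, using Morton (bit-interleaved) codes; it is an assumption for illustration and not the vector or parallel algorithm of the cited paper. The names morton3 and linear_octree are hypothetical.

```python
def morton3(x, y, z, bits):
    """Interleave the low `bits` bits of x, y, z into one location code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def linear_octree(occupied, n):
    """Encode a binary n x n x n volume as sorted (location code, edge) leaves.

    occupied: set of (x, y, z) voxel coordinates that are 'on'.
    n: edge length of the volume, assumed to be a power of two.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    bits = n.bit_length() - 1
    leaves = []

    def visit(x, y, z, size):
        # Count occupied voxels inside this cube.
        count = sum((x + i, y + j, z + k) in occupied
                    for i in range(size)
                    for j in range(size)
                    for k in range(size))
        if count == 0:
            return                      # empty cube: store nothing
        if count == size ** 3:
            leaves.append((morton3(x, y, z, bits), size))  # full cube: one leaf
            return
        half = size // 2                # mixed cube: recurse into 8 octants
        for dx in (0, half):
            for dy in (0, half):
                for dz in (0, half):
                    visit(x + dx, y + dy, z + dz, half)

    visit(0, 0, 0, n)
    return sorted(leaves)


if __name__ == "__main__":
    block = {(i, j, k) for i in range(2) for j in range(2) for k in range(2)}
    print(linear_octree(block | {(2, 0, 0)}, 4))
```

Because homogeneous regions collapse into single leaves and the leaves form a flat sorted list, this kind of structure lends itself to vectorized or data-parallel processing, which is consistent with the architectures the abstract mentions.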
Intelligent supercomputers: the Japanese computer sputnik
DOE Office of Scientific and Technical Information (OSTI.GOV)
Walter, G.
1983-11-01
Japan's government-supported fifth-generation computer project has had a pronounced effect on the American computer and information systems industry. The US firms are intensifying their research on and production of intelligent supercomputers, a combination of computer architecture and artificial intelligence software programs. While the present generation of computers is built for the processing of numbers, the new supercomputers will be designed specifically for the solution of symbolic problems and the use of artificial intelligence software. This article discusses new and exciting developments that will increase computer capabilities in the 1990s. 4 references.
Activities of NICT space weather project
NASA Astrophysics Data System (ADS)
Murata, Ken T.; Nagatsuma, Tsutomu; Watari, Shinichi; Shinagawa, Hiroyuki; Ishii, Mamoru
NICT (National Institute of Information and Communications Technology) has been in charge of the space weather forecast service in Japan for more than 20 years. The main target region of space weather is the geo-space in the vicinity of the Earth, where human activities are dominant. In the geo-space, serious damage to satellites, international space stations and astronauts is caused by energetic particles or electromagnetic disturbances, which originate in dynamically changing solar activity. Positioning systems using GPS satellites have also become important recently. Since the most significant contribution to positioning error comes from disturbances of the ionosphere, it is crucial to estimate the time-dependent modulation of the electron density profiles in the ionosphere. NICT is one of the 13 members of the ISES (International Space Environment Service), which is an international assembly of space weather forecast centers under UNESCO. With the help of geo-space environment data exchanged among the member nations, NICT operates a daily space weather forecast service to provide information on forecasts of solar flares, geomagnetic disturbances, solar proton events, and radio-wave propagation conditions in the ionosphere. The space weather forecast at NICT is based on three methodologies: observations, simulations and informatics (OSI model). For real-time or quasi real-time reporting of space weather, we conduct our own observations: Hiraiso solar observatory to monitor the solar activity (solar flare, coronal mass ejection, and so on), domestic ionosonde network, magnetometer HF radar observations in far-east Siberia, and south-east Asia low-latitude ionosonde network (SEALION). Real-time observation data to monitor solar and solar-wind activities are obtained through antennae at NICT from the ACE and STEREO satellites. 
We have a middle-class super-computer (NEC SX-8R) to maintain real-time computer simulations of the solar and solar-wind, magnetosphere and ionosphere regions. The three simulations are directly or indirectly connected to each other based on real-time observation data to reproduce a virtual geo-space region on the super-computer. Informatics is a new methodology for making precise forecasts of space weather. Based on new information and communication technologies (ICT), it provides more information in both quality and quantity. At NICT, we have been developing a cloud-computing system named "space weather cloud" based on a high-speed network system (JGN2+). Huge-scale distributed storage (1PB), cluster computers, visualization systems and other resources are expected to yield new findings and services in space weather forecasting. The final goal of the NICT space weather service is to predict near-future space weather conditions and disturbances that cause satellite malfunctions, telecommunication problems, and GPS navigation errors. In the present talk, we introduce our recent activities on the space weather services and discuss how we are going to develop the services from the viewpoints of space science and practical use.
Introducing Mira, Argonne's Next-Generation Supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
2013-03-19
Mira, the new petascale IBM Blue Gene/Q system installed at the ALCF, will usher in a new era of scientific supercomputing. An engineering marvel, the 10-petaflops machine is capable of carrying out 10 quadrillion calculations per second.
Green Supercomputing at Argonne
Pete Beckman
2017-12-09
Pete Beckman, head of Argonne's Leadership Computing Facility (ALCF), talks about Argonne National Laboratory's green supercomputing: everything from designing algorithms to use fewer kilowatts per operation to using cold Chicago winter air to cool the machine more efficiently.
Grand challenges in mass storage: A systems integrators perspective
NASA Technical Reports Server (NTRS)
Lee, Richard R.; Mintz, Daniel G.
1993-01-01
Within today's much ballyhooed supercomputing environment, with its GFLOPS of CPU power and Gigabit networks, there exists a major roadblock to computing success: that of Mass Storage. The solution to this mass storage problem is considered to be one of the 'Grand Challenges' facing the computer industry today, as well as long into the future. It has become obvious to us, as well as many others in the industry, that there is no clear single solution in sight. The Systems Integrator today is faced with a myriad of quandaries in approaching this challenge. He must first be innovative in approach, and second choose hardware solutions that are volumetrically efficient, high in signal bandwidth, available from multiple sources, competitively priced, and have forward growth extendibility. In addition he must also comply with a variety of mandated, and often conflicting, software standards (GOSIP, POSIX, IEEE, MSRM 4.0, and others), and finally he must deliver a systems solution with the 'most bang for the buck' in terms of cost vs. performance factors. These quandaries challenge the Systems Integrator to 'push the envelope' in terms of his or her ingenuity and innovation on an almost daily basis. This dynamic is explored further, and an attempt to acquaint the audience with rational approaches to this 'Grand Challenge' is made.
TOP500 Supercomputers for June 2003
DOE Office of Scientific and Technical Information (OSTI.GOV)
Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack
2003-06-23
21st Edition of TOP500 List of World's Fastest Supercomputers Released. MANNHEIM, Germany; KNOXVILLE, Tenn.; and BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 21st edition of the TOP500 list of the world's fastest supercomputers was released today (June 23, 2003). The Earth Simulator supercomputer built by NEC and installed last year at the Earth Simulator Center in Yokohama, Japan, with its Linpack benchmark performance of 35.86 Tflop/s (teraflops or trillions of calculations per second), retains the number one position. The number 2 position is held by the re-measured ASCI Q system at Los Alamos National Laboratory. With 13.88 Tflop/s, it is the second system ever to exceed the 10 Tflop/s mark. ASCI Q was built by Hewlett-Packard and is based on the AlphaServer SC computer system.
Characterizing output bottlenecks in a supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xie, Bing; Chase, Jeffrey; Dillow, David A
2012-01-01
Supercomputer I/O loads are often dominated by writes. HPC (High Performance Computing) file systems are designed to absorb these bursty outputs at high bandwidth through massive parallelism. However, the delivered write bandwidth often falls well below the peak. This paper characterizes the data absorption behavior of a center-wide shared Lustre parallel file system on the Jaguar supercomputer. We use a statistical methodology to address the challenges of accurately measuring a shared machine under production load and to obtain the distribution of bandwidth across samples of compute nodes, storage targets, and time intervals. We observe and quantify limitations from competing traffic, contention on storage servers and I/O routers, concurrency limitations in the client compute node operating systems, and the impact of variance (stragglers) on coupled output such as striping. We then examine the implications of our results for application performance and the design of I/O middleware systems on shared supercomputers.
NASA Astrophysics Data System (ADS)
Morikawa, Y.; Murata, K. T.; Watari, S.; Kato, H.; Yamamoto, K.; Inoue, S.; Tsubouchi, K.; Fukazawa, K.; Kimura, E.; Tatebe, O.; Shimojo, S.
2010-12-01
Main methodologies of Solar-Terrestrial Physics (STP) so far are theoretical, experimental and observational, and computer simulation approaches. Recently "informatics" is expected as a new (fourth) approach to the STP studies. Informatics is a methodology to analyze large-scale data (observation data and computer simulation data) to obtain new findings using a variety of data processing techniques. At NICT (National Institute of Information and Communications Technology, Japan) we are now developing a new research environment named "OneSpaceNet". The OneSpaceNet is a cloud-computing environment specialized for science works, which connects many researchers with high-speed network (JGN: Japan Gigabit Network). The JGN is a wide-area back-born network operated by NICT; it provides 10G network and many access points (AP) over Japan. The OneSpaceNet also provides with rich computer resources for research studies, such as super-computers, large-scale data storage area, licensed applications, visualization devices (like tiled display wall: TDW), database/DBMS, cluster computers (4-8 nodes) for data processing and communication devices. What is amazing in use of the science cloud is that a user simply prepares a terminal (low-cost PC). Once connecting the PC to JGN2plus, the user can make full use of the rich resources of the science cloud. Using communication devices, such as video-conference system, streaming and reflector servers, and media-players, the users on the OneSpaceNet can make research communications as if they belong to a same (one) laboratory: they are members of a virtual laboratory. The specification of the computer resources on the OneSpaceNet is as follows: The size of data storage we have developed so far is almost 1PB. The number of the data files managed on the cloud storage is getting larger and now more than 40,000,000. 
What is notable is that the disks forming the large-scale storage are distributed across 5 data centers in Japan (but the storage system performs as one disk). Three supercomputers are allocated on the cloud: one from Tokyo, one from Osaka and one from Nagoya. A user's simulation job data from any of the supercomputers are saved on the cloud data storage (in the same directory); it is a kind of virtual computing environment. The tiled display wall has 36 panels acting as one display; its resolution is as large as 18000x4300 pixels. This size is enough to preview or analyze large-scale computer simulation data. It also allows us to look at multiple pictures (e.g., 100) on one screen together with many researchers. In our talk we also present a brief report of the initial results of using OneSpaceNet for Global MHD simulations as an example of successful use of our science cloud: (i) ultra-high time resolution visualization of Global MHD simulations on the large-scale storage and parallel processing system on the cloud, (ii) a database of real-time Global MHD simulations and statistical analyses of the data, and (iii) a 3D Web service for Global MHD simulations.
P2P Technology for High-Performance Computing: An Overview
NASA Technical Reports Server (NTRS)
Follen, Gregory J. (Technical Monitor); Berry, Jason
2003-01-01
The transition from cluster computing to peer-to-peer (P2P) high-performance computing has recently attracted the attention of the computer science community. It has been recognized that existing local networks and dedicated clusters of headless workstations can serve as inexpensive yet powerful virtual supercomputers. It has also been recognized that the vast number of lower-end computers connected to the Internet stay idle for as long as 90% of the time. The growing speed of Internet connections and the high availability of free CPU time encourage exploration of the possibility of using the whole Internet, rather than local clusters, as a massively parallel yet almost freely available P2P supercomputer. As a part of a larger project on P2P high-performance computing, it has been my goal to compile an overview of the P2P paradigm. I have studied various P2P platforms and I have compiled systematic brief descriptions of their most important characteristics. I have also experimented and obtained hands-on experience with selected P2P platforms, focusing on those that seem promising with respect to P2P high-performance computing. I have also compiled relevant literature and web references. I have prepared a draft technical report and I have summarized my findings in a poster paper.
NASA Astrophysics Data System (ADS)
Sim, Jae-Hoon; Kim, Heung-Sik; Han, Myung Joon
2015-03-01
Using first-principles density functional theory (DFT) calculations, we investigated the electronic structure of the Rh-doped iridate Sr2Ir1-xRhxO4, for which a doping (x) dependent metal-insulator transition (MIT) has been reported experimentally and a controversial discussion has developed regarding the origin of this transition. Our DFT+U calculation shows that the value of <L·S> remains largely intact over the entire doping range considered here (x = 0.0, 0.125, 0.25, 0.50, 0.75, and 1.0), in good agreement with the branching ratio measured by x-ray absorption spectroscopy. Also, contrary to a previous picture that explains the MIT based on charge transfer between the transition-metal sites, our calculation clearly shows that those sites remain basically isoelectronic while impurity bands of predominantly rhodium character are introduced near the Fermi level. As the doping increases, this impurity band overlaps with the lower Hubbard band of iridium, leading to the metal-insulator transition. The results will be discussed in comparison with the case of Ru doping. Computational resources were supported by the National Institute of Supercomputing and Networking/Korea Institute of Science and Technology Information with supercomputing resources including technical support (Grant No. KSC-2013-C2-23).
Advanced Computing for Manufacturing.
ERIC Educational Resources Information Center
Erisman, Albert M.; Neves, Kenneth W.
1987-01-01
Discusses ways that supercomputers are being used in the manufacturing industry, including the design and production of airplanes and automobiles. Describes problems that need to be solved in the next few years for supercomputers to assume a major role in industry. (TW)
INTEGRATION OF PANDA WORKLOAD MANAGEMENT SYSTEM WITH SUPERCOMPUTERS
DOE Office of Scientific and Technical Information (OSTI.GOV)
De, K; Jha, S; Maeno, T
The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 140 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250000 cores with a peak performance of 0.3+ petaFLOPS, the next LHC data taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in the United States, Europe and Russia (in particular with the Titan supercomputer at Oak Ridge Leadership Computing Facility (OLCF), the supercomputer at the National Research Center Kurchatov Institute, IT4 in Ostrava, and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on Titan's multi-core worker nodes.
This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms. We will present our current accomplishments in running PanDA WMS at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facility's infrastructure for High Energy and Nuclear Physics, as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.
Supercomputers Join the Fight against Cancer – U.S. Department of Energy
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
The Department of Energy has some of the best supercomputers in the world. Now, they’re joining the fight against cancer. Learn about our new partnership with the National Cancer Institute and GlaxoSmithKline Pharmaceuticals.
Secure Large-Scale Airport Simulations Using Distributed Computational Resources
NASA Technical Reports Server (NTRS)
McDermott, William J.; Maluf, David A.; Gawdiak, Yuri; Tran, Peter; Clancy, Dan (Technical Monitor)
2001-01-01
To fully conduct research that will support the far-term concepts, technologies and methods required to improve the safety of air transportation, a simulation environment of the requisite degree of fidelity must first be in place. The Virtual National Airspace Simulation (VNAS) will provide the underlying infrastructure necessary for such a simulation system. Aerospace-specific knowledge management services such as intelligent data-integration middleware will support the management of information associated with this complex and critically important operational environment. This simulation environment, in conjunction with a distributed network of supercomputers and high-speed network connections to aircraft and to Federal Aviation Administration (FAA), airline and other data sources, will provide the capability to continuously monitor and measure operational performance against expected performance. The VNAS will also provide the tools to use this performance baseline to obtain a perspective of what is happening today and of the potential impact of proposed changes before they are introduced into the system.
Parallel Evolutionary Optimization for Neuromorphic Network Training
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schuman, Catherine D; Disney, Adam; Singh, Susheela
One of the key impediments to the success of current neuromorphic computing architectures is the issue of how best to program them. Evolutionary optimization (EO) is one promising programming technique; in particular, its wide applicability makes it especially attractive for neuromorphic architectures, which can have many different characteristics. In this paper, we explore different facets of EO on a spiking neuromorphic computing model called DANNA. We focus on the performance of EO in the design of our DANNA simulator, and on how to structure EO on both multicore and massively parallel computing systems. We evaluate how our parallel methods impact the performance of EO on Titan, the U.S.'s largest open science supercomputer, and BOB, a Beowulf-style cluster of Raspberry Pi's. We also focus on how to improve the EO by evaluating commonality in higher performing neural networks, and present the result of a study that evaluates the EO performed by Titan.
PATHFINDER: Probing Atmospheric Flows in an Integrated and Distributed Environment
NASA Technical Reports Server (NTRS)
Wilhelmson, R. B.; Wojtowicz, D. P.; Shaw, C.; Hagedorn, J.; Koch, S.
1995-01-01
PATHFINDER is a software effort to create a flexible, modular, collaborative, and distributed environment for studying atmospheric, astrophysical, and other fluid flows in the evolving networked metacomputer environment of the 1990s. It uses existing software, such as HDF (Hierarchical Data Format), DTM (Data Transfer Mechanism), GEMPAK (General Meteorological Package), AVS, SGI Explorer, and Inventor to provide the researcher with the ability to harness the latest in desktop to teraflop computing. Software modules developed during the project are available in the public domain via anonymous FTP from the National Center for Supercomputing Applications (NCSA). The address is ftp.ncsa.uiuc.edu, and the directory is /SGI/PATHFINDER.
Efficient development of memory bounded geo-applications to scale on modern supercomputers
NASA Astrophysics Data System (ADS)
Räss, Ludovic; Omlin, Samuel; Licul, Aleksandar; Podladchikov, Yuri; Herman, Frédéric
2016-04-01
Numerical modeling is currently a key tool in the geosciences. The current challenge is to solve problems that are multi-physics and for which the length scale and the place of occurrence might not be known in advance. Also, the spatial extent of the investigated domain might vary strongly in size, ranging from millimeters for reactive transport to kilometers for glacier erosion dynamics. An efficient way to proceed is to develop simple but robust algorithms that perform well and scale on modern supercomputers and therefore permit very high-resolution simulations. We propose an efficient approach to solve memory-bounded real-world applications on modern supercomputer architectures. We optimize the software to run on our newly acquired state-of-the-art GPU cluster "octopus". Our approach shows promising preliminary results on important geodynamical and geomechanical problems: we have developed a Stokes solver for glacier flow and a poromechanical solver including complex rheologies for nonlinear waves in stressed porous rocks. We solve the system of partial differential equations on a regular Cartesian grid and use an iterative finite difference scheme with preconditioning of the residuals. The MPI communication happens only locally (point-to-point); this method is known to scale linearly by construction. The "octopus" GPU cluster, which we use for the computations, has been designed to achieve maximal data transfer throughput at minimal hardware cost. It is composed of twenty compute nodes, each hosting four Nvidia Titan X GPU accelerators. These high-density nodes are interconnected with a parallel (dual-rail) FDR InfiniBand network. Our efforts show promising preliminary results for the different physics investigated. The glacier flow solver achieves good accuracy in the relevant benchmarks, and the coupled poromechanical solver makes it possible to explain previously unresolvable focused fluid flow as a natural outcome of the porosity setup.
In both cases, near peak memory bandwidth transfer is achieved. Our approach allows us to get the best out of the current hardware.
High Temporal Resolution Mapping of Seismic Noise Sources Using Heterogeneous Supercomputers
NASA Astrophysics Data System (ADS)
Paitz, P.; Gokhberg, A.; Ermert, L. A.; Fichtner, A.
2017-12-01
The time- and space-dependent distribution of seismic noise sources is becoming a key ingredient of modern real-time monitoring of various geo-systems like earthquake fault zones, volcanoes, geothermal and hydrocarbon reservoirs. We present results of an ongoing research project conducted in collaboration with the Swiss National Supercomputing Centre (CSCS). The project aims at building a service providing seismic noise source maps for Central Europe with high temporal resolution. We use source imaging methods based on the cross-correlation of seismic noise records from all seismic stations available in the region of interest. The service is hosted on the CSCS computing infrastructure; all computationally intensive processing is performed on the massively parallel heterogeneous supercomputer "Piz Daint". The solution architecture is based on the Application-as-a-Service concept to provide the interested researchers worldwide with regular access to the noise source maps. The solution architecture includes the following sub-systems: (1) data acquisition responsible for collecting, on a periodic basis, raw seismic records from the European seismic networks, (2) high-performance noise source mapping application responsible for the generation of source maps using cross-correlation of seismic records, (3) back-end infrastructure for the coordination of various tasks and computations, (4) front-end Web interface providing the service to the end-users and (5) data repository. The noise source mapping itself rests on the measurement of logarithmic amplitude ratios in suitably pre-processed noise correlations, and the use of simplified sensitivity kernels. 
During the implementation we addressed various challenges, in particular, selection of data sources and transfer protocols, automation and monitoring of daily data downloads, ensuring the required data processing performance, design of a general service-oriented architecture for coordination of various sub-systems, and engineering an appropriate data storage solution. The present pilot version of the service implements noise source maps for Switzerland. Extension of the solution to Central Europe is planned for the next project phase.
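The cross-correlation measurement mentioned in this abstract can be illustrated with a small, self-contained sketch. It is not the project's code: the synthetic signals, the 5-sample delay, the 0.3 weighting and the choice to compare the energy of the positive-lag (causal) and negative-lag (acausal) branches are all assumptions made for illustration, as one plausible reading of a "logarithmic amplitude ratio". The names `cross_correlation` and `log_amplitude_ratio` are hypothetical.

```python
import math
import random


def cross_correlation(a, b, max_lag):
    """Return {lag: sum_t a[t] * b[t + lag]} for lags in [-max_lag, max_lag]."""
    n = len(a)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for t in range(n):
            if 0 <= t + lag < n:
                total += a[t] * b[t + lag]
        result[lag] = total
    return result


def log_amplitude_ratio(correlation):
    """Natural log of causal (lag > 0) over acausal (lag < 0) branch energy.

    Assumed measurement for illustration only; a positive value means the
    positive-lag branch carries more energy.
    """
    causal = sum(v * v for lag, v in correlation.items() if lag > 0)
    acausal = sum(v * v for lag, v in correlation.items() if lag < 0)
    return math.log(causal / acausal)


# Synthetic noise: station_b records station_a's signal delayed by `delay`
# samples (strong) plus the same signal advanced by `delay` samples (weak).
rng = random.Random(1)
n, delay = 400, 5
source = [rng.gauss(0.0, 1.0) for _ in range(n + 2 * delay)]
station_a = source[delay:delay + n]
station_b = [source[t] + 0.3 * source[t + 2 * delay] for t in range(n)]

ratio_ab = log_amplitude_ratio(cross_correlation(station_a, station_b, 20))
ratio_ba = log_amplitude_ratio(cross_correlation(station_b, station_a, 20))
print(f"log ratio a->b: {ratio_ab:.3f}, b->a: {ratio_ba:.3f}")
```

Swapping the two stations mirrors the correlation about zero lag, so the two printed values have opposite signs; the sign of the ratio is what indicates the dominant direction of noise propagation in this toy setup.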
NASA Astrophysics Data System (ADS)
Chen, Goong; Wang, Yi-Ching; Perronnet, Alain; Gu, Cong; Yao, Pengfei; Bin-Mohsin, Bandar; Hajaiej, Hichem; Scully, Marlan O.
2017-03-01
Computational mathematics, physics and engineering form a major constituent of modern computational science, which now stands on an equal footing with the established branches of theoretical and experimental sciences. Computational mechanics solves problems in science and engineering based upon mathematical modeling and computing, bypassing the need for expensive and time-consuming laboratory setups and experimental measurements. Furthermore, it allows the numerical simulations of large scale systems, such as the formation of galaxies that could not be done in any earth bound laboratories. This article is written as part of the 21st Century Frontiers Series to illustrate some state-of-the-art computational science. We emphasize how to do numerical modeling and visualization in the study of a contemporary event, the pulverizing crash of the Germanwings Flight 9525 on March 24, 2015, as a showcase. Such numerical modeling and the ensuing simulation of aircraft crashes into land or mountain are complex tasks as they involve both theoretical study and supercomputing of a complex physical system. The most tragic type of crash involves ‘pulverization’ such as the one suffered by this Germanwings flight. Here, we show pulverizing airliner crashes by visualization through video animations from supercomputer applications of the numerical modeling tool LS-DYNA. A sound validation process is challenging but essential for any sophisticated calculations. We achieve this by validation against the experimental data from a crash test done in 1993 of an F4 Phantom II fighter jet into a wall. We have developed a method by hybridizing two primary methods: finite element analysis and smoothed particle hydrodynamics. This hybrid method also enhances visualization by showing a ‘debris cloud’. 
Based on our supercomputer simulations and the visualization, we point out that prior works on this topic based on ‘hollow interior’ modeling can be quite problematic and, thus, not likely to be correct. We discuss the effects of terrain on pulverization using the information from the recovered flight-data-recorder and show our forensics and assessments of what may have happened during the final moments of the crash. Finally, we point out that our study has potential for being made into real-time flight crash simulators to help the study of crashworthiness and survivability for future aviation safety. Some forward-looking statements are also made.
Roadrunner Supercomputer Breaks the Petaflop Barrier
Los Alamos National Lab - Brian Albright, Charlie McMillan, Lin Yin
2017-12-09
At 3:30 a.m. on May 26, 2008, Memorial Day, the "Roadrunner" supercomputer exceeded a sustained speed of 1 petaflop/s, or 1 million billion calculations per second. The sustained performance makes Roadrunner more than twice as fast as the current number 1
QCD on the BlueGene/L Supercomputer
NASA Astrophysics Data System (ADS)
Bhanot, G.; Chen, D.; Gara, A.; Sexton, J.; Vranas, P.
2005-03-01
In June 2004 QCD was simulated for the first time at a sustained speed exceeding 1 TeraFlops on the BlueGene/L supercomputer at the IBM T.J. Watson Research Lab. The implementation and performance of QCD on the BlueGene/L are presented.
Enabling parallel simulation of large-scale HPC network systems
Mubarak, Misbah; Carothers, Christopher D.; Ross, Robert B.; ...
2016-04-07
With the increasing complexity of today’s high-performance computing (HPC) architectures, simulation has become an indispensable tool for exploring the design space of HPC systems, in particular networks. In order to make effective design decisions, simulations of these systems must possess the following properties: (1) have high accuracy and fidelity, (2) produce results in a timely manner, and (3) be able to analyze a broad range of network workloads. Most state-of-the-art HPC network simulation frameworks, however, are constrained in one or more of these areas. In this work, we present a simulation framework for modeling two important classes of networks used in today’s IBM and Cray supercomputers: torus and dragonfly networks. We use the Co-Design of Multi-layer Exascale Storage Architecture (CODES) simulation framework to simulate these network topologies at a flit-level detail using the Rensselaer Optimistic Simulation System (ROSS) for parallel discrete-event simulation. Our simulation framework meets all the requirements of a practical network simulation and can assist network designers in design space exploration. First, it uses validated and detailed flit-level network models to provide an accurate and high-fidelity network simulation. Second, instead of relying on serial time-stepped or traditional conservative discrete-event simulations that limit simulation scalability and efficiency, we use the optimistic event-scheduling capability of ROSS to achieve efficient and scalable HPC network simulations on today’s high-performance cluster systems. Third, our models give network designers a choice in simulating a broad range of network workloads, including HPC application workloads using detailed network traces, an ability that is rarely offered in parallel with high-fidelity network simulations.
Enabling parallel simulation of large-scale HPC network systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mubarak, Misbah; Carothers, Christopher D.; Ross, Robert B.
With the increasing complexity of today’s high-performance computing (HPC) architectures, simulation has become an indispensable tool for exploring the design space of HPC systems, in particular networks. In order to make effective design decisions, simulations of these systems must possess the following properties: (1) have high accuracy and fidelity, (2) produce results in a timely manner, and (3) be able to analyze a broad range of network workloads. Most state-of-the-art HPC network simulation frameworks, however, are constrained in one or more of these areas. In this work, we present a simulation framework for modeling two important classes of networks used in today’s IBM and Cray supercomputers: torus and dragonfly networks. We use the Co-Design of Multi-layer Exascale Storage Architecture (CODES) simulation framework to simulate these network topologies at a flit-level detail using the Rensselaer Optimistic Simulation System (ROSS) for parallel discrete-event simulation. Our simulation framework meets all the requirements of a practical network simulation and can assist network designers in design space exploration. First, it uses validated and detailed flit-level network models to provide an accurate and high-fidelity network simulation. Second, instead of relying on serial time-stepped or traditional conservative discrete-event simulations that limit simulation scalability and efficiency, we use the optimistic event-scheduling capability of ROSS to achieve efficient and scalable HPC network simulations on today’s high-performance cluster systems. Third, our models give network designers a choice in simulating a broad range of network workloads, including HPC application workloads using detailed network traces, an ability that is rarely offered in parallel with high-fidelity network simulations.
Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY
2012-01-10
The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY
2008-01-01
The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
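The sequence of steps in this abstract (one-dimensional transform, all-to-all re-distribution, second one-dimensional transform) can be sketched on a single machine. The following is only an illustration under stated assumptions, not the patented implementation: each row of a small matrix stands in for the data held by one node, a naive O(n^2) DFT replaces a real FFT, and the "random order" is modelled merely by shuffling the order of the individual element transfers. It shows that the result does not depend on that order; it does not model the network-utilization benefit. All function names are hypothetical.

```python
import cmath
import random


def dft_1d(values):
    """Naive O(n^2) discrete Fourier transform of a 1-D sequence."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def all_to_all_exchange(matrix, rng):
    """Transpose `matrix`, moving the elements one by one in shuffled order."""
    rows, cols = len(matrix), len(matrix[0])
    exchanged = [[None] * rows for _ in range(cols)]
    transfers = [(r, c) for r in range(rows) for c in range(cols)]
    rng.shuffle(transfers)  # the order of the transfers is randomized
    for r, c in transfers:
        exchanged[c][r] = matrix[r][c]
    return exchanged


def dft_2d_by_stages(matrix, rng):
    """2-D DFT as: row transforms, exchange, row transforms, exchange back."""
    stage1 = [dft_1d(row) for row in matrix]      # first dimension
    exchanged = all_to_all_exchange(stage1, rng)  # re-distribution
    stage2 = [dft_1d(row) for row in exchanged]   # second dimension
    return all_to_all_exchange(stage2, rng)       # restore input layout


def dft_2d_direct(matrix):
    """Reference 2-D DFT computed straight from the definition."""
    rows, cols = len(matrix), len(matrix[0])
    return [
        [
            sum(
                matrix[r][c]
                * cmath.exp(-2j * cmath.pi * (u * r / rows + v * c / cols))
                for r in range(rows)
                for c in range(cols)
            )
            for v in range(cols)
        ]
        for u in range(rows)
    ]


sample = [[1, 2, 3], [4, 5, 6]]
staged = dft_2d_by_stages(sample, random.Random(0))
direct = dft_2d_direct(sample)
max_error = max(
    abs(staged[u][v] - direct[u][v]) for u in range(2) for v in range(3)
)
print(f"largest difference between staged and direct result: {max_error:.2e}")
```

On a distributed-memory machine the two exchange calls would be collective communication steps; here they are plain in-memory copies, which is why the sketch can only demonstrate correctness of the decomposition.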
Streaming parallel GPU acceleration of large-scale filter-based spiking neural networks.
Slażyński, Leszek; Bohte, Sander
2012-01-01
The arrival of graphics processing unit (GPU) cards suitable for massively parallel computing promises affordable large-scale neural network simulation previously only available at supercomputing facilities. While the raw numbers suggest that GPUs may outperform CPUs by at least an order of magnitude, the challenge is to develop fine-grained parallel algorithms to fully exploit the particulars of GPUs. Computation in a neural network is inherently parallel and thus a natural match for GPU architectures: given inputs, the internal state for each neuron can be updated in parallel. We show that for filter-based spiking neurons, like the Spike Response Model, the additive nature of membrane potential dynamics enables additional update parallelism. This also reduces the accumulation of numerical errors when using single precision computation, the native precision of GPUs. We further show that optimizing simulation algorithms and data structures to the GPU's architecture has a large pay-off: for example, matching iterative neural updating to the memory architecture of the GPU speeds up this simulation step by a factor of three to five. With such optimizations, we can simulate, faster than real time, plausible spiking neural networks of up to 50 000 neurons, processing over 35 million spiking events per second.
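The "additive nature of membrane potential dynamics" can be made concrete with a toy example. The sketch below is an assumption-laden illustration, not the authors' algorithm: it assumes a single exponential response kernel with time constant `tau`, spikes given as hypothetical `(step, weight)` pairs, and plain Python instead of GPU code. It shows that the potential is a sum of independent per-spike contributions, so groups of inputs can be processed separately and their results added.

```python
import math


def potential_direct(spikes, step, dt, tau):
    """Sum the exponential kernel responses of all spikes up to `step`."""
    return sum(
        w * math.exp(-(step - k) * dt / tau) for k, w in spikes if k <= step
    )


def potential_trace(spikes, steps, dt, tau):
    """Step the potential forward: decay the running value, add new spikes."""
    decay = math.exp(-dt / tau)
    arrivals = {}
    for k, w in spikes:
        arrivals[k] = arrivals.get(k, 0.0) + w
    u, trace = 0.0, []
    for step in range(steps):
        u = u * decay + arrivals.get(step, 0.0)
        trace.append(u)
    return trace


# Hypothetical input spikes as (time step, synaptic weight).
excitatory = [(2, 1.0), (5, 0.5), (5, 0.25)]
inhibitory = [(3, -0.4), (8, -0.2)]
dt, tau, steps = 1.0, 4.0, 12

combined = potential_trace(excitatory + inhibitory, steps, dt, tau)
separate = [
    e + i
    for e, i in zip(
        potential_trace(excitatory, steps, dt, tau),
        potential_trace(inhibitory, steps, dt, tau),
    )
]
largest_gap = max(abs(c - s) for c, s in zip(combined, separate))
print(f"largest gap between combined and separately summed traces: {largest_gap:.2e}")
```

Because the two input groups never interact until the final addition, each could be handled by a different thread block in a parallel implementation; how the paper actually partitions the work is not stated in the abstract.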
NASA Astrophysics Data System (ADS)
Klimentov, A.; De, K.; Jha, S.; Maeno, T.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Wells, J.; Wenaus, T.
2016-10-01
The LHC, operating at CERN, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3 petaFLOPS, LHC data taking runs require more resources than the grid can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in the United States, in particular with the Titan supercomputer at Oak Ridge Leadership Computing Facility. The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on the LCF's multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms for the ALICE and ATLAS experiments, and it has been in full production for ATLAS since September 2015.
We will present our current accomplishments with running PanDA at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facility's infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.
Pesce, Lorenzo L.; Lee, Hyong C.; Hereld, Mark; ...
2013-01-01
Our limited understanding of the relationship between the behavior of individual neurons and large neuronal networks is an important limitation in current epilepsy research and may be one of the main causes of our inadequate ability to treat it. Addressing this problem directly via experiments is impossibly complex; thus, we have been developing and studying medium-large-scale simulations of detailed neuronal networks to guide us. Flexibility in the connection schemas and a complete description of the cortical tissue seem necessary for this purpose. In this paper we examine some of the basic issues encountered in these multiscale simulations. We have determined the detailed behavior of two such simulators on parallel computer systems. The observed memory and computation-time scaling behavior for a distributed memory implementation were very good over the range studied, both in terms of network sizes (2,000 to 400,000 neurons) and processor pool sizes (1 to 256 processors). Our simulations required between a few megabytes and about 150 gigabytes of RAM and lasted between a few minutes and about a week, well within the capability of most multinode clusters. Therefore, simulations of epileptic seizures on networks with millions of cells should be feasible on current supercomputers.
Distributed Finite Element Analysis Using a Transputer Network
NASA Technical Reports Server (NTRS)
Watson, James; Favenesi, James; Danial, Albert; Tombrello, Joseph; Yang, Dabby; Reynolds, Brian; Turrentine, Ronald; Shephard, Mark; Baehmann, Peggy
1989-01-01
The principal objective of this research effort was to demonstrate the extraordinarily cost-effective acceleration of finite element structural analysis problems using a transputer-based parallel processing network. This objective was accomplished in the form of a commercially viable parallel processing workstation. The workstation is a desktop-size, low-maintenance computing unit capable of supercomputer performance yet costing two orders of magnitude less. To achieve the principal research objective, a transputer-based structural analysis workstation termed XPFEM was implemented with linear static structural analysis capabilities resembling those of commercially available NASTRAN. Finite element model files, generated using the on-line preprocessing module or external preprocessing packages, are downloaded to a network of 32 transputers for accelerated solution. The system currently executes at about one third of Cray X-MP24 speed, but additional acceleration appears likely. For the NASA-selected demonstration problem of a Space Shuttle main engine turbine blade model with about 1500 nodes and 4500 independent degrees of freedom, the Cray X-MP24 required 23.9 seconds to obtain a solution while the transputer network, operated from an IBM PC-AT compatible host computer, required 71.7 seconds. Consequently, the $80,000 transputer network demonstrated a cost-performance ratio about 60 times better than the $15,000,000 Cray X-MP24 system.
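The "about 60 times" figure can be checked from the numbers quoted in this abstract. The short calculation below assumes that cost-performance is measured as system cost multiplied by solution time (lower is better); the abstract does not state the metric explicitly, so that definition and the helper name `cost_time_product` are assumptions.

```python
# Figures quoted in the abstract.
cray_cost_usd = 15_000_000
cray_time_s = 23.9
transputer_cost_usd = 80_000
transputer_time_s = 71.7


def cost_time_product(cost_usd, time_s):
    """Cost multiplied by solution time; lower is better (assumed metric)."""
    return cost_usd * time_s


# Relative speed of the transputer network (about one third of the Cray).
speed_fraction = cray_time_s / transputer_time_s

# How many times better the transputer network is under the assumed metric.
advantage = cost_time_product(cray_cost_usd, cray_time_s) / cost_time_product(
    transputer_cost_usd, transputer_time_s
)

print(f"relative speed: {speed_fraction:.3f}")
print(f"cost-performance advantage: {advantage:.1f}x")
```

Under this metric the cost ratio of 187.5 is reduced by the threefold slowdown to 62.5, which is consistent with the abstract's rounded "about 60 times".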
Distributing French seismologic data through the RESIF green IT datacentre
NASA Astrophysics Data System (ADS)
Volcke, P.; Gueguen, P.; Pequegnat, C.; Le Tanou, J.; Enderle, G.; Berthoud, F.
2012-12-01
RESIF is a nationwide French project aimed at building a high-quality system to observe and understand the inner earth. The ultimate goal is to create a network throughout mainland France comprising 750 seismometers and geodetic measurement instruments, 250 of which will be mobile to enable the observation network to be focused on specific investigation subjects and geographic locations. This project includes the implementation of a data distribution centre hosting seismologic and geodetic data. This datacentre is operated by the Université Joseph Fourier, Grenoble, France. In the context of building the necessary computing infrastructure, the Université Joseph Fourier became the first French university to earn the status of "Participant" in the European Union "Code of Conduct for Data Centres". The University commits to energy reporting and to implementing best practices for energy efficiency, in a cost-effective manner, without hampering mission-critical functions. In this context, data currently hosted at the RESIF datacentre include data from the French broadband permanent network, the strong motion permanent network, and the mobile seismological network. These data are freely accessible as realtime streams and continuous validated data, along with instrumental metadata, delivered using widely known formats. Future developments include tight integration with local supercomputing resources, and setting up modern distribution systems such as web services.
NASA Astrophysics Data System (ADS)
Maloff, Joel H.
1990-01-01
"The nation which most completely assimilates high performance computing into its economy will very likely emerge as the dominant intellectual, economic, and technological force in the next century", Senator Albert Gore, Jr., May 18, 1989, while introducing Senate Bill 1067, "The National High Performance Computer Technology Act of 1989". A national network designed to link supercomputers, particle accelerators, researchers, educators, government, and industry is beginning to emerge. The degree to which the United States can mobilize the resources inherent within our academic, industrial and government sectors towards the establishment of such a network infrastructure will have direct bearing on the economic and political stature of this country in the next century. This program will have significant impact on all forms of information transfer, and peripheral benefits to all walks of life similar to those experienced from the moon landing program of the 1960's. The key to our success is the involvement of scientists, librarians, network designers, and bureaucrats in the planning stages. Collectively, the resources resident within the United States are awesome; individually, their impact is somewhat more limited. The engineers, technicians, business people, and educators participating in this conference have a vital role to play in the success of the National Research and Education Network (NREN).
Finite element methods on supercomputers - The scatter-problem
NASA Technical Reports Server (NTRS)
Loehner, R.; Morgan, K.
1985-01-01
Certain problems arise in connection with the use of supercomputers for the implementation of finite-element methods. These problems are related to the desirability of utilizing the power of the supercomputer as fully as possible for the rapid execution of the required computations, taking into account the gain in speed possible with the aid of pipelining operations. For the finite-element method, the time-consuming operations may be divided into three categories. The first two present no problems, while the third type of operation can be a reason for the inefficient performance of finite-element programs. Two possibilities for overcoming certain difficulties are proposed, giving attention to a scatter-process.
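The abstract above does not spell out its two proposals, so the following is only an illustrative sketch of one widely used way to make the finite-element scatter-add step safe for pipelined or vector execution: grouping ("colouring") elements so that no two elements in a group share a node. The function names and the tiny mesh are hypothetical, and this is not presented as the authors' method.

```python
def color_elements(elements):
    """Greedily split elements into groups in which no node appears twice."""
    groups = []       # element indices belonging to each group
    group_nodes = []  # nodes already touched by each group
    for e, nodes in enumerate(elements):
        for g, used in enumerate(group_nodes):
            if used.isdisjoint(nodes):
                groups[g].append(e)
                used.update(nodes)
                break
        else:
            # No existing group can take this element: open a new one.
            groups.append([e])
            group_nodes.append(set(nodes))
    return groups


def scatter_add(elements, element_values, n_nodes):
    """Accumulate per-element nodal contributions into one global nodal array."""
    rhs = [0.0] * n_nodes
    for group in color_elements(elements):
        # Inside a group every node index is unique, so these updates are
        # independent of one another and could be issued as one vector operation.
        for e in group:
            for local, node in enumerate(elements[e]):
                rhs[node] += element_values[e][local]
    return rhs


# Hypothetical mesh: four triangles over six nodes, unit contribution per corner.
elements = [(0, 1, 2), (1, 3, 2), (3, 4, 5), (2, 3, 5)]
element_values = [(1.0, 1.0, 1.0)] * 4
groups = color_elements(elements)
rhs = scatter_add(elements, element_values, n_nodes=6)
print(groups, rhs)
```

The result equals a plain element-by-element accumulation; only the order of the updates changes, which is what removes the write conflicts inside each group.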
Code IN Exhibits - Supercomputing 2000
NASA Technical Reports Server (NTRS)
Yarrow, Maurice; McCann, Karen M.; Biswas, Rupak; VanderWijngaart, Rob F.; Kwak, Dochan (Technical Monitor)
2000-01-01
The creation of parameter study suites has recently become a more challenging problem as the parameter studies have become multi-tiered and the computational environment has become a supercomputer grid. The parameter spaces are vast, the individual problem sizes are getting larger, and researchers are seeking to combine several successive stages of parameterization and computation. Simultaneously, grid-based computing offers immense resource opportunities but at the expense of great difficulty of use. We present ILab, an advanced graphical user interface approach to this problem. Our novel strategy stresses intuitive visual design tools for parameter study creation and complex process specification, and also offers programming-free access to grid-based supercomputer resources and process automation.
The ASCI Network for SC 2000: Gigabyte Per Second Networking
DOE Office of Scientific and Technical Information (OSTI.GOV)
PRATT, THOMAS J.; NAEGLE, JOHN H.; MARTINEZ JR., LUIS G.
2001-11-01
This document highlights the activities of the DISCOM (Distance Computing and Communication) team at the 2000 Supercomputing conference in Dallas, Texas. This conference is sponsored by the IEEE and ACM. Sandia's participation in the conference has now spanned a decade; for the last five years Sandia National Laboratories, Los Alamos National Lab and Lawrence Livermore National Lab have come together at the conference under the rubric of the DOE's ASCI (Accelerated Strategic Computing Initiative) program to demonstrate ASCI's emerging capabilities in computational science and our combined expertise in high performance computer science and communication networking developments within the program. At SC 2000, DISCOM demonstrated an infrastructure. DISCOM2 uses this forum to demonstrate and focus communication and pre-standard implementation of 10 Gigabit Ethernet, the first gigabyte per second data IP network transfer application, and VPN technology that enabled a remote Distributed Resource Management tools demonstration. Additionally, a national OC48 POS network was constructed to support applications running between the show floor and home facilities. This network created the opportunity to test PSE's Parallel File Transfer Protocol (PFTP) across a network that had similar speed and distances as the then proposed DISCOM WAN. The SCINET SC2000 showcased wireless networking, and the networking team had the opportunity to explore this emerging technology while on the booth. We also supported the production networking needs of the convention exhibit floor. This paper documents those accomplishments, discusses the details of their implementation, and describes how these demonstrations support DISCOM's overall strategies in high performance computing networking.
Probing the cosmic causes of errors in supercomputers
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
Cosmic rays from outer space are causing errors in supercomputers. The neutrons that pass through the CPU may be causing binary data to flip leading to incorrect calculations. Los Alamos National Laboratory has developed detectors to determine how much data is being corrupted by these cosmic particles.
Flux-Level Transit Injection Experiments with NASA Pleiades Supercomputer
NASA Astrophysics Data System (ADS)
Li, Jie; Burke, Christopher J.; Catanzarite, Joseph; Seader, Shawn; Haas, Michael R.; Batalha, Natalie; Henze, Christopher; Christiansen, Jessie; Kepler Project, NASA Advanced Supercomputing Division
2016-06-01
Flux-Level Transit Injection (FLTI) experiments are executed with NASA's Pleiades supercomputer for the Kepler Mission. The latest release (9.3, January 2016) of the Kepler Science Operations Center Pipeline is used in the FLTI experiments. Their purpose is to validate the Analytic Completeness Model (ACM), which can be computed for all Kepler target stars, thereby enabling exoplanet occurrence rate studies. Pleiades, a facility of NASA's Advanced Supercomputing Division, is one of the world's most powerful supercomputers and represents NASA's state-of-the-art technology. We discuss the details of implementing the FLTI experiments on the Pleiades supercomputer. For example, taking into account that ~16 injections are generated by one core of the Pleiades processors in an hour, the “shallow” FLTI experiment, in which ~2000 injections are required per target star, can be done for 16% of all Kepler target stars in about 200 hours. Stripping down the transit search to bare bones, i.e. only searching adjacent high/low periods at high/low pulse durations, makes the computationally intensive FLTI experiments affordable. The design of the FLTI experiments and the analysis of the resulting data are presented in “Validating an Analytic Completeness Model for Kepler Target Stars Based on Flux-level Transit Injection Experiments” by Catanzarite et al. (#2494058). Kepler was selected as the 10th mission of the Discovery Program. Funding for the Kepler Mission has been provided by the NASA Science Mission Directorate.
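The throughput figures quoted in the abstract above translate into a per-star compute cost as follows. This is a back-of-the-envelope sketch: the two rates come from the abstract, while the star count passed to `cores_needed` is a hypothetical input, since the abstract gives neither the number of target stars nor the number of cores used.

```python
INJECTIONS_PER_CORE_HOUR = 16   # "~16 injections ... by one core ... in an hour"
INJECTIONS_PER_STAR = 2000      # "shallow" experiment, per target star


def core_hours_per_star(injections_per_star, injections_per_core_hour):
    """Core-hours needed to generate all injections for one target star."""
    return injections_per_star / injections_per_core_hour


def cores_needed(n_stars, wall_hours):
    """Cores required to finish n_stars within wall_hours, assuming perfect scaling."""
    total = n_stars * core_hours_per_star(INJECTIONS_PER_STAR,
                                          INJECTIONS_PER_CORE_HOUR)
    return total / wall_hours


per_star = core_hours_per_star(INJECTIONS_PER_STAR, INJECTIONS_PER_CORE_HOUR)
# Hypothetical example: 1000 stars in the ~200 hours mentioned in the abstract.
example_cores = cores_needed(n_stars=1000, wall_hours=200)
print(per_star, example_cores)
```

Each star costs about 125 core-hours, so the wall-clock time is set almost entirely by how many cores can be dedicated to the run.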
The Sky's the Limit When Super Students Meet Supercomputers.
ERIC Educational Resources Information Center
Trotter, Andrew
1991-01-01
In a few select high schools in the U.S., supercomputers are allowing talented students to attempt sophisticated research projects using simultaneous simulations of nature, culture, and technology not achievable by ordinary microcomputers. Schools can get their students online by entering contests and seeking grants and partnerships with…
NSF Says It Will Support Supercomputer Centers in California and Illinois.
ERIC Educational Resources Information Center
Strosnider, Kim; Young, Jeffrey R.
1997-01-01
The National Science Foundation will increase support for supercomputer centers at the University of California, San Diego and the University of Illinois, Urbana-Champaign, while leaving unclear the status of the program at Cornell University (New York) and a cooperative Carnegie-Mellon University (Pennsylvania) and University of Pittsburgh…
Access to Supercomputers. Higher Education Panel Report 69.
ERIC Educational Resources Information Center
Holmstrom, Engin Inel
This survey was conducted to provide the National Science Foundation with baseline information on current computer use in the nation's major research universities, including the actual and potential use of supercomputers. Questionnaires were sent to 207 doctorate-granting institutions; after follow-ups, 167 institutions (91% of the institutions…
NOAA announces significant investment in next generation of supercomputers
provide more timely, accurate weather forecasts. (Credit: istockphoto.com) Today, NOAA announced the next phase in the agency's efforts to increase supercomputing capacity to provide more timely, accurate turn will lead to more timely, accurate, and reliable forecasts." Ahead of this upgrade, each of
Developments in the simulation of compressible inviscid and viscous flow on supercomputers
NASA Technical Reports Server (NTRS)
Steger, J. L.; Buning, P. G.
1985-01-01
In anticipation of future supercomputers, finite difference codes are rapidly being extended to simulate three-dimensional compressible flow about complex configurations. Some of these developments are reviewed. The importance of computational flow visualization and diagnostic methods to three-dimensional flow simulation is also briefly discussed.
NASA Technical Reports Server (NTRS)
Smarr, Larry; Press, William; Arnett, David W.; Cameron, Alastair G. W.; Crutcher, Richard M.; Helfand, David J.; Horowitz, Paul; Kleinmann, Susan G.; Linsky, Jeffrey L.; Madore, Barry F.
1991-01-01
The applications of computers and data processing to astronomy are discussed. Among the topics covered are the emerging national information infrastructure, workstations and supercomputers, supertelescopes, digital astronomy, astrophysics in a numerical laboratory, community software, archiving of ground-based observations, dynamical simulations of complex systems, plasma astrophysics, and the remote control of fourth dimension supercomputers.
HEP Computing Tools, Grid and Supercomputers for Genome Sequencing Studies
NASA Astrophysics Data System (ADS)
De, K.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Novikov, A.; Poyda, A.; Tertychnyy, I.; Wenaus, T.
2017-10-01
PanDA, the Production and Distributed Analysis Workload Management System, has been developed to address the data processing and analysis challenges of the ATLAS experiment at the LHC. Recently PanDA has been extended to run HEP scientific applications on Leadership Class Facilities and supercomputers. The success of the projects to use PanDA beyond HEP and Grid has drawn attention from other compute-intensive sciences such as bioinformatics. Recent advances in Next Generation Genome Sequencing (NGS) technology have led to increasing streams of sequencing data that need to be processed, analysed and made available to bioinformaticians worldwide. Analysis of genome sequencing data using the popular software pipeline PALEOMIX can take a month, even when running on a powerful computing resource. In this paper we describe the adaptation of the PALEOMIX pipeline to run in a distributed computing environment powered by PanDA. To run the pipeline we split the input files into chunks, which are run separately on different nodes as separate inputs for PALEOMIX, and finally merge the output files; this is very similar to what is done by ATLAS to process and simulate data. We dramatically decreased the total walltime because of job (re)submission automation and brokering within PanDA. Using software tools developed initially for HEP and Grid can reduce payload execution time for mammoth DNA samples from weeks to days.
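The split, process and merge pattern described in the abstract above can be sketched in a few lines. This is a generic illustration only: it does not use PanDA or PALEOMIX, `process_chunk` is a hypothetical stand-in for one pipeline run on one node, and a local thread pool stands in for the distributed nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, chunk_size):
    """Cut the input into consecutive chunks of at most chunk_size records."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def process_chunk(chunk):
    """Hypothetical stand-in for one independent pipeline run on one chunk."""
    return [record.upper() for record in chunk]


def run_split_merge(records, chunk_size, max_workers=4):
    """Process chunks independently, then merge the outputs in input order."""
    chunks = split_into_chunks(records, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the order the chunks were submitted.
        partial_outputs = list(pool.map(process_chunk, chunks))
    return [item for part in partial_outputs for item in part]


reads = ["acgt", "ttga", "ccat", "gggt", "atat"]
merged = run_split_merge(reads, chunk_size=2)
print(merged)
```

The pattern only works when each chunk can be processed without knowledge of the others, which is why the merged result matches a single run over the whole input.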
Supercomputer use in orthopaedic biomechanics research: focus on functional adaptation of bone.
Hart, R T; Thongpreda, N; Van Buskirk, W C
1988-01-01
The authors describe two biomechanical analyses carried out using numerical methods. One is an analysis of the stress and strain in a human mandible, and the other analysis involves modeling the adaptive response of a sheep bone to mechanical loading. The computing environment required for the two types of analyses is discussed. It is shown that a simple stress analysis of a geometrically complex mandible can be accomplished using a minicomputer. However, more sophisticated analyses of the same model with dynamic loading or nonlinear materials would require supercomputer capabilities. A supercomputer is also required for modeling the adaptive response of living bone, even when simple geometric and material models are used.
NREL's Building-Integrated Supercomputer Provides Heating and Efficient Computing (Fact Sheet)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
2014-09-01
NREL's Energy Systems Integration Facility (ESIF) is meant to investigate new ways to integrate energy sources so they work together efficiently, and one of the key tools to that investigation, a new supercomputer, is itself a prime example of energy systems integration. NREL teamed with Hewlett-Packard (HP) and Intel to develop the innovative warm-water, liquid-cooled Peregrine supercomputer, which not only operates efficiently but also serves as the primary source of building heat for ESIF offices and laboratories. This innovative high-performance computer (HPC) can perform more than a quadrillion calculations per second as part of the world's most energy-efficient HPC data center.
Supercomputer optimizations for stochastic optimal control applications
NASA Technical Reports Server (NTRS)
Chung, Siu-Leung; Hanson, Floyd B.; Xu, Huihuang
1991-01-01
Supercomputer optimizations for a computational method of solving stochastic, multibody, dynamic programming problems are presented. The computational method is valid for a general class of optimal control problems that are nonlinear, multibody dynamical systems, perturbed by general Markov noise in continuous time, i.e., nonsmooth Gaussian as well as jump Poisson random white noise. Optimization techniques for vector multiprocessors or vectorizing supercomputers include advanced data structures, loop restructuring, loop collapsing, blocking, and compiler directives. These advanced computing techniques and supercomputing hardware help alleviate Bellman's curse of dimensionality in dynamic programming computations, by permitting the solution of large multibody problems. Possible applications include lumped flight dynamics models for uncertain environments, such as large scale and background random aerospace fluctuations.
Optimization of large matrix calculations for execution on the Cray X-MP vector supercomputer
NASA Technical Reports Server (NTRS)
Hornfeck, William A.
1988-01-01
A considerable volume of large computational codes was developed for NASA over the past twenty-five years. This code represents algorithms developed for machines of an earlier generation. With the emergence of the vector supercomputer as a viable, commercially available machine, an opportunity exists to evaluate optimization strategies to improve the efficiency of existing software. This opportunity is primarily due to architectural differences between the latest generation of large-scale machines and the earlier, mostly uniprocessor, machines. A software package being used by NASA to perform computations on large matrices is described, and a strategy for conversion to the Cray X-MP vector supercomputer is also described.
NAS Technical Summaries, March 1993 - February 1994
NASA Technical Reports Server (NTRS)
1995-01-01
NASA created the Numerical Aerodynamic Simulation (NAS) Program in 1987 to focus resources on solving critical problems in aeroscience and related disciplines by utilizing the power of the most advanced supercomputers available. The NAS Program provides scientists with the necessary computing power to solve today's most demanding computational fluid dynamics problems and serves as a pathfinder in integrating leading-edge supercomputing technologies, thus benefitting other supercomputer centers in government and industry. The 1993-94 operational year concluded with 448 high-speed processor projects and 95 parallel projects representing NASA, the Department of Defense, other government agencies, private industry, and universities. This document provides a glimpse at some of the significant scientific results for the year.
NAS technical summaries. Numerical aerodynamic simulation program, March 1992 - February 1993
NASA Technical Reports Server (NTRS)
1994-01-01
NASA created the Numerical Aerodynamic Simulation (NAS) Program in 1987 to focus resources on solving critical problems in aeroscience and related disciplines by utilizing the power of the most advanced supercomputers available. The NAS Program provides scientists with the necessary computing power to solve today's most demanding computational fluid dynamics problems and serves as a pathfinder in integrating leading-edge supercomputing technologies, thus benefitting other supercomputer centers in government and industry. The 1992-93 operational year concluded with 399 high-speed processor projects and 91 parallel projects representing NASA, the Department of Defense, other government agencies, private industry, and universities. This document provides a glimpse at some of the significant scientific results for the year.
Climate Ocean Modeling on a Beowulf Class System
NASA Technical Reports Server (NTRS)
Cheng, B. N.; Chao, Y.; Wang, P.; Bondarenko, M.
2000-01-01
With the growing power and shrinking cost of personal computers, the availability of fast Ethernet interconnections, and public domain software packages, it is now possible to combine them to build desktop parallel computers (named Beowulf or PC clusters) at a fraction of what it would cost to buy systems of comparable power from supercomputer companies. This led us to build and assemble our own system, specifically for climate ocean modeling. In this article, we present our experience with such a system, discuss its network performance, and provide some performance comparison data with both HP SPP2000 and Cray T3E for an ocean model used in present-day oceanographic research.
Congressional Panel Seeks To Curb Access of Foreign Students to U.S. Supercomputers.
ERIC Educational Resources Information Center
Kiernan, Vincent
1999-01-01
Fearing security problems, a congressional committee on Chinese espionage recommends that foreign students and other foreign nationals be barred from using supercomputers at national laboratories unless they first obtain export licenses from the federal government. University officials dispute the data on which the report is based and find the…
The Age of the Supercomputer Gives Way to the Age of the Super Infrastructure.
ERIC Educational Resources Information Center
Young, Jeffrey R.
1997-01-01
In October 1997, the National Science Foundation will discontinue financial support for two university-based supercomputer facilities to concentrate resources on partnerships led by facilities at the University of California, San Diego and the University of Illinois, Urbana-Champaign. The reconfigured program will develop more user-friendly and…
The ChemViz Project: Using a Supercomputer To Illustrate Abstract Concepts in Chemistry.
ERIC Educational Resources Information Center
Beckwith, E. Kenneth; Nelson, Christopher
1998-01-01
Describes the Chemistry Visualization (ChemViz) Project, a Web venture maintained by the University of Illinois National Center for Supercomputing Applications (NCSA) that enables high school students to use computational chemistry as a technique for understanding abstract concepts. Discusses the evolution of computational chemistry and provides a…
NASA Astrophysics Data System (ADS)
Schulthess, Thomas C.
2013-03-01
The continued thousand-fold improvement in sustained application performance per decade on modern supercomputers keeps opening new opportunities for scientific simulations. But supercomputers have become very complex machines, built with thousands or tens of thousands of complex nodes consisting of multiple CPU cores or, most recently, a combination of CPU and GPU processors. Efficient simulations on such high-end computing systems require tailored algorithms that optimally map numerical methods to particular architectures. These intricacies will be illustrated with simulations of strongly correlated electron systems, where the development of quantum cluster methods, Monte Carlo techniques, as well as their optimal implementation by means of algorithms with improved data locality and high arithmetic density have gone hand in hand with evolving computer architectures. The present work would not have been possible without continued access to computing resources at the National Center for Computational Science of Oak Ridge National Laboratory, which is funded by the Facilities Division of the Office of Advanced Scientific Computing Research, and the Swiss National Supercomputing Center (CSCS) that is funded by ETH Zurich.
Extracting the Textual and Temporal Structure of Supercomputing Logs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jain, S; Singh, I; Chandra, A
2009-05-26
Supercomputers are prone to frequent faults that adversely affect their performance, reliability and functionality. System logs collected on these systems are a valuable resource of information about their operational status and health. However, their massive size, complexity, and lack of standard format make it difficult to automatically extract information that can be used to improve system management. In this work we propose a novel method to succinctly represent the contents of supercomputing logs, by using textual clustering to automatically find the syntactic structures of log messages. This information is used to automatically classify messages into semantic groups via an online clustering algorithm. Further, we describe a methodology for using the temporal proximity between groups of log messages to identify correlated events in the system. We apply our proposed methods to two large, publicly available supercomputing logs and show that our technique features nearly perfect accuracy for online log-classification and extracts meaningful structural and temporal message patterns that can be used to improve the accuracy of other log analysis techniques.
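To make the idea of finding the "syntactic structure" of log messages concrete, here is a deliberately simple sketch. It is not the clustering algorithm of the paper above: it just masks tokens that look like numbers or hexadecimal addresses and groups messages by the resulting template. The regular expression and the sample log lines are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Tokens treated as variable fields: hex addresses and (dotted/colon) numbers.
_VARIABLE = re.compile(r"^(0x[0-9a-fA-F]+|\d+([.:]\d+)*)$")


def template_of(message):
    """Replace variable-looking tokens with a wildcard to expose the template."""
    return " ".join("<*>" if _VARIABLE.match(token) else token
                    for token in message.split())


def group_by_template(messages):
    """Group messages that share the same masked template."""
    groups = defaultdict(list)
    for message in messages:
        groups[template_of(message)].append(message)
    return dict(groups)


# Hypothetical log lines, not taken from any real system.
logs = [
    "node 12 ECC error at 0x7ffe01",
    "node 47 ECC error at 0x1a2b3c",
    "link 3 down",
    "link 9 down",
    "job 1001 started on 64 nodes",
]
groups = group_by_template(logs)
for template, members in groups.items():
    print(len(members), template)
```

Because each message is handled independently as it arrives, this kind of grouping can run online; a real system would need a far more robust notion of similarity than exact template equality.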
NASA Astrophysics Data System (ADS)
Voronin, A. A.; Panchenko, V. Ya; Zheltikov, A. M.
2016-06-01
High-intensity ultrashort laser pulses propagating in gas media or in condensed matter undergo complex nonlinear spatiotemporal evolution where temporal transformations of optical field waveforms are strongly coupled to an intricate beam dynamics and ultrafast field-induced ionization processes. At the level of laser peak powers orders of magnitude above the critical power of self-focusing, the beam exhibits modulation instabilities, producing random field hot spots and breaking up into multiple noise-seeded filaments. This problem is described by a (3 + 1)-dimensional nonlinear field evolution equation, which needs to be solved jointly with the equation for ultrafast ionization of a medium. Analysis of this problem, which is equivalent to solving a billion-dimensional evolution problem, is only possible by means of supercomputer simulations augmented with coordinated big-data processing of large volumes of information acquired through theory-guiding experiments and supercomputations. Here, we review the main challenges of supercomputations and big-data processing encountered in strong-field ultrafast optical physics and discuss strategies to confront these challenges.
Toward a Proof of Concept Cloud Framework for Physics Applications on Blue Gene Supercomputers
NASA Astrophysics Data System (ADS)
Dreher, Patrick; Scullin, William; Vouk, Mladen
2015-09-01
Traditional high performance supercomputers are capable of delivering large sustained state-of-the-art computational resources to physics applications over extended periods of time using batch processing mode operating environments. However, today there is an increasing demand for more complex workflows that involve large fluctuations in the levels of HPC physics computational requirements during the simulations. Some of the workflow components may also require a richer set of operating system features and schedulers than normally found in a batch oriented HPC environment. This paper reports on progress toward a proof of concept design that implements a cloud framework onto BG/P and BG/Q platforms at the Argonne Leadership Computing Facility. The BG/P implementation utilizes the Kittyhawk utility and the BG/Q platform uses an experimental heterogeneous FusedOS operating system environment. Both platforms use the Virtual Computing Laboratory as the cloud computing system embedded within the supercomputer. This proof of concept design allows a cloud to be configured so that it can capitalize on the specialized infrastructure capabilities of a supercomputer and the flexible cloud configurations without resorting to virtualization. Initial testing of the proof of concept system is done using the lattice QCD MILC code. These types of user reconfigurable environments have the potential to deliver experimental schedulers and operating systems within a working HPC environment for physics computations that may be different from the native OS and schedulers on production HPC supercomputers.
Space and Earth Sciences, Computer Systems, and Scientific Data Analysis Support, Volume 1
NASA Technical Reports Server (NTRS)
Estes, Ronald H. (Editor)
1993-01-01
This Final Progress Report covers the specific technical activities of Hughes STX Corporation for the last contract triannual period of 1 June through 30 Sep. 1993, in support of assigned task activities at Goddard Space Flight Center (GSFC). It also provides a brief summary of work throughout the contract period of performance on each active task. Technical activity is presented in Volume 1, while financial and level-of-effort data are presented in Volume 2. Technical support was provided to all Divisions and Laboratories of Goddard's Space Sciences and Earth Sciences Directorates. Types of support include: scientific programming, systems programming, computer management, mission planning, scientific investigation, data analysis, data processing, data base creation and maintenance, instrumentation development, and management services. Missions and instruments supported include: ROSAT, Astro-D, BBXRT, XTE, AXAF, GRO, COBE, WIND, UIT, SMM, STIS, HEIDI, DE, URAP, CRRES, Voyagers, ISEE, San Marco, LAGEOS, TOPEX/Poseidon, Pioneer-Venus, Galileo, Cassini, Nimbus-7/TOMS, Meteor-3/TOMS, FIFE, BOREAS, TRMM, AVHRR, and Landsat. Accomplishments include: development of computing programs for mission science and data analysis, supercomputer applications support, computer network support, computational upgrades for data archival and analysis centers, end-to-end management for mission data flow, scientific modeling and results in the fields of space and Earth physics, planning and design of GSFC VO DAAC and VO IMS, fabrication, assembly, and testing of mission instrumentation, and design of mission operations center.
Addressing the Tension Between Strong Perimeter Control and Usability
NASA Technical Reports Server (NTRS)
Hinke, Thomas H.; Kolano, Paul Z.; Keller, Chris
2006-01-01
This paper describes a strong perimeter control system for a general purpose processing system, with the perimeter control system taking significant steps to address usability issues, thus mitigating the tension between strong perimeter protection and usability. A secure front end enforces two-factor authentication for all interactive access to an enclave that contains a large supercomputer and various associated systems, with each requiring their own authentication. Usability is addressed through a design in which the user has to perform two-factor authentication at the secure front end in order to gain access to the enclave, while an agent transparently performs public key authentication as needed to authenticate to specific systems within the enclave. The paper then describes a proxy system that allows users to transfer files into the enclave under script control, when the user is not present to perform two-factor authentication. This uses a pre-authorization approach based on public key technology, which is still strongly tied to both two-factor authentication and strict control over where files can be transferred on the target system. Finally the paper describes an approach to support network applications and systems such as grids or parallel file transfer protocols that require the use of many ports through the perimeter. The paper describes a least privilege approach that dynamically opens ports on a host-specific, if-authorized, as-needed, just-in-time basis.
Automatic Energy Schemes for High Performance Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sundriyal, Vaibhav
Although high-performance computing traditionally focuses on the efficient execution of large-scale applications, both energy and power have become critical concerns when approaching exascale. Drastic increases in the power consumption of supercomputers significantly affect their operating costs and failure rates. In modern microprocessor architectures, equipped with dynamic voltage and frequency scaling (DVFS) and CPU clock modulation (throttling), the power consumption may be controlled in software. Additionally, network interconnects, such as Infiniband, may be exploited to maximize energy savings, while the application performance loss and frequency switching overheads must be carefully balanced. This work first studies two important collective communication operations, all-to-all and allgather, and proposes energy saving strategies on a per-call basis. Next, it targets point-to-point communications to group them into phases and apply frequency scaling to them to save energy by exploiting the architectural and communication stalls. Finally, it proposes an automatic runtime system which combines both collective and point-to-point communications into phases, and applies throttling to them in addition to DVFS to maximize energy savings. The experimental results are presented for NAS parallel benchmark problems as well as for the realistic parallel electronic structure calculations performed by the widely used quantum chemistry package GAMESS. Close to the maximum energy savings were obtained with a substantially low performance loss on the given platform.
The impact of the U.S. supercomputing initiative will be global
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crawford, Dona
2016-01-15
Last July, President Obama issued an executive order that created a coordinated federal strategy for HPC research, development, and deployment called the U.S. National Strategic Computing Initiative (NSCI). This bold, necessary step toward building the next generation of supercomputers has inaugurated a new era for U.S. high performance computing (HPC).
Parallel-vector solution of large-scale structural analysis problems on supercomputers
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.
1989-01-01
A direct linear equation solution method based on the Choleski factorization procedure is presented which exploits both parallel and vector features of supercomputers. The new equation solver is described, and its performance is evaluated by solving structural analysis problems on three high-performance computers. The method has been implemented using Force, a generic parallel FORTRAN language.
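The serial core of a Choleski-based direct solver can be sketched as follows. This is a minimal illustration in plain Python, assuming a small dense symmetric positive definite matrix; it is not the parallel-vector Force/FORTRAN implementation the abstract describes, and the function names `cholesky` and `solve` are chosen here for illustration only.

```python
import math

def cholesky(a):
    """Return lower-triangular L with L * L^T == a.

    Assumes `a` is a symmetric positive definite matrix given as a list of rows.
    """
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: remove the contribution of columns already factored.
        d = a[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(d)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = s / L[j][j]
    return L

def solve(a, b):
    """Solve a x = b by factoring a = L L^T, then two triangular solves."""
    L = cholesky(a)
    n = len(b)
    # Forward substitution: L y = b.
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x
```

In a parallel-vector setting, the inner sums over `k` are the natural candidates for vectorization and the column updates for distribution across processors, though how the original solver partitions the work is not stated in the abstract.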
Predicting Hurricanes with Supercomputers
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
2010-01-01
Hurricane Emily, formed in the Atlantic Ocean on July 10, 2005, was the strongest hurricane ever to form before August. By checking computer models against the actual path of the storm, researchers can improve hurricane prediction. In 2010, NOAA researchers were awarded 25 million processor-hours on Argonne's BlueGene/P supercomputer for the project. Read more at http://go.usa.gov/OLh
NASA Technical Reports Server (NTRS)
Peterson, Victor L.; Kim, John; Holst, Terry L.; Deiwert, George S.; Cooper, David M.; Watson, Andrew B.; Bailey, F. Ron
1992-01-01
Report evaluates supercomputer needs of five key disciplines: turbulence physics, aerodynamics, aerothermodynamics, chemistry, and mathematical modeling of human vision. Predicts these fields will require computer speed greater than 10^18 floating-point operations per second (FLOPS) and memory capacity greater than 10^15 words. Also, new parallel computer architectures and new structured numerical methods will make necessary speed and capacity available.
Advances in petascale kinetic plasma simulation with VPIC and Roadrunner
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bowers, Kevin J; Albright, Brian J; Yin, Lin
2009-01-01
VPIC, a first-principles 3D electromagnetic charge-conserving relativistic kinetic particle-in-cell (PIC) code, was recently adapted to run on Los Alamos's Roadrunner, the first supercomputer to break a petaflop (10^15 floating point operations per second) in the TOP500 supercomputer performance rankings. They give a brief overview of the modeling capabilities and optimization techniques used in VPIC and the computational characteristics of petascale supercomputers like Roadrunner. They then discuss three applications enabled by VPIC's unprecedented performance on Roadrunner: modeling laser plasma interaction in upcoming inertial confinement fusion experiments at the National Ignition Facility (NIF), modeling short pulse laser GeV ion acceleration, and modeling reconnection in magnetic confinement fusion experiments.
Supercomputing Sheds Light on the Dark Universe
DOE Office of Scientific and Technical Information (OSTI.GOV)
Habib, Salman; Heitmann, Katrin
2012-11-15
At Argonne National Laboratory, scientists are using supercomputers to shed light on one of the great mysteries in science today, the Dark Universe. With Mira, a petascale supercomputer at the Argonne Leadership Computing Facility, a team led by physicists Salman Habib and Katrin Heitmann will run the largest, most complex simulation of the universe ever attempted. By contrasting the results from Mira with state-of-the-art telescope surveys, the scientists hope to gain new insights into the distribution of matter in the universe, advancing future investigations of dark energy and dark matter into a new realm. The team's research was named a finalist for the 2012 Gordon Bell Prize, an award recognizing outstanding achievement in high-performance computing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Curran, L.
1988-03-03
Interest has been building in recent months over the imminent arrival of a new class of supercomputer, called the "supercomputer on a desk" or the single-user model. Most observers expected the first such product to come from either of two startups, Ardent Computer Corp. or Stellar Computer Inc. But a surprise entry has shown up. Apollo Computer Inc. is launching a new work station this week that racks up an impressive list of industry firsts as it puts supercomputer power at the disposal of a single user. The new series 10000 from the Chelmsford, Mass., company is built around a reduced-instruction-set architecture that the company calls Prism, for parallel reduced-instruction-set multiprocessor. This article describes the 10000 and Prism.
NASA Technical Reports Server (NTRS)
Murman, E. M. (Editor); Abarbanel, S. S. (Editor)
1985-01-01
Current developments and future trends in the application of supercomputers to computational fluid dynamics are discussed in reviews and reports. Topics examined include algorithm development for personal-size supercomputers, a multiblock three-dimensional Euler code for out-of-core and multiprocessor calculations, simulation of compressible inviscid and viscous flow, high-resolution solutions of the Euler equations for vortex flows, algorithms for the Navier-Stokes equations, and viscous-flow simulation by FEM and related techniques. Consideration is given to marching iterative methods for the parabolized and thin-layer Navier-Stokes equations, multigrid solutions to quasi-elliptic schemes, secondary instability of free shear flows, simulation of turbulent flow, and problems connected with weather prediction.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Reynolds, William; Weber, Marta S.; Farber, Robert M.
Social media provide an exciting and novel view into social phenomena. The vast amounts of data that can be gathered from the Internet, coupled with massively parallel supercomputers such as the Cray XMT, open new vistas for research. Conclusions drawn from such analysis must recognize that social media are distinct from the underlying social reality. Rigorous validation is essential. This paper briefly presents results obtained from computational analysis of social media, utilizing both blog and Twitter data. Validation of these results is discussed in the context of a framework of established methodologies from the social sciences. Finally, an outline for a set of supporting studies is proposed.
A Laboratory Facility for Research in Parallel Computation: Project Final Report.
1987-07-01
..."A software tool for Building Supercomputer Applications"...processors may display different behaviors. For example, assume we have a processor g with a "good" local structure and a processor b with a "bad" local
DOE Office of Scientific and Technical Information (OSTI.GOV)
Settlemyer, Bradley; Kettimuthu, R.; Boley, Josh
High-performance scientific workflows utilize supercomputers, scientific instruments, and large storage systems. Their executions require fast setup of a small number of dedicated network connections across the geographically distributed facility sites. We present Software-Defined Network (SDN) solutions consisting of site daemons that use dpctl, Floodlight, ONOS, or OpenDaylight controllers to set up these connections. The development of these SDN solutions could be quite disruptive to the infrastructure, while requiring close coordination among multiple sites; in addition, the large number of possible controller and device combinations to investigate could make the infrastructure unavailable to regular users for extended periods of time. In response, we develop a Virtual Science Network Environment (VSNE) using virtual machines, Mininet, and custom scripts that support the development, testing, and evaluation of SDN solutions, without the constraints and expenses of multi-site physical infrastructures; furthermore, the chosen solutions can be directly transferred to production deployments. By complementing VSNE with a physical testbed, we conduct targeted performance tests of various SDN solutions to help choose the best candidates. In addition, we propose a switching response method to assess the setup times and throughput performances of different SDN solutions, and present experimental results that show their advantages and limitations.
Handels, H; Busch, C; Encarnação, J; Hahn, C; Kühn, V; Miehe, J; Pöppl, S I; Rinast, E; Rossmanith, C; Seibert, F; Will, A
1997-03-01
The software system KAMEDIN (Kooperatives Arbeiten und MEdizinische Diagnostik auf Innovativen Netzen) is a multimedia telemedicine system for exchange, cooperative diagnostics, and remote analysis of digital medical image data. It provides components for visualisation, processing, and synchronised audio-visual discussion of medical images. Techniques of computer supported cooperative work (CSCW) synchronise user interactions during a teleconference. Visibility of both local and remote cursor on the conference workstations facilitates telepointing and reinforces the conference partner's telepresence. Audio communication during teleconferences is supported by an integrated audio component. Furthermore, brain tissue segmentation with artificial neural networks can be performed on an external supercomputer as a remote image analysis procedure. KAMEDIN is designed as a low cost CSCW tool for ISDN based telecommunication. However, it can be used on any TCP/IP supporting network. In a field test, KAMEDIN was installed in 15 clinics and medical departments to validate the system's usability. The telemedicine system KAMEDIN has been developed, tested, and evaluated within a research project sponsored by German Telekom.
10 Gigabit Ethernet Performance on SGI Altix and Origin Systems
NASA Technical Reports Server (NTRS)
Meyer, Andy
2005-01-01
As the state of high performance computing continues to advance, the size of datasets continues to grow, driving a need for high bandwidth data networks. 10 Gigabit Ethernet is the latest step in the popular Ethernet family of networks. We have evaluated the S2io Xframe 10 Gigabit Ethernet adapter on 512p SGI Altix systems running ProPack 3, and Origin systems running Irix 6.5.24 and 6.5.26 in our production supercomputing environment. We encountered a number of performance and stability issues, which were promptly dealt with by SGI and S2io. Using nttcp we tested TCP performance for single and multiple streams, and we tested file transfer using NFS and bbftp. We will present the results of our testing, including the effects of various tuning options on throughput and CPU utilization, and offer suggestions for configuring and tuning S2io 10 Gigabit Ethernet cards in an Altix/Linux or Origin/Irix environment.
NASA Astrophysics Data System (ADS)
Hartmann, Alfred; Redfield, Steve
1989-04-01
This paper discusses design of large-scale (1000 x 1000) optical crossbar switching networks for use in parallel processing supercomputers. Alternative design sketches for an optical crossbar switching network are presented using free-space optical transmission with either a beam spreading/masking model or a beam steering model for internodal communications. The performances of alternative multiple access channel communications protocols (unslotted and slotted ALOHA and carrier sense multiple access, CSMA) are compared with the performance of the classic arbitrated bus crossbar of conventional electronic parallel computing. These comparisons indicate an almost inverse relationship between ease of implementation and speed of operation. Practical issues of optical system design are addressed, and an optically addressed, composite spatial light modulator design is presented for fabrication to arbitrarily large scale. The wide range of switch architecture, communications protocol, optical systems design, device fabrication, and system performance problems presented by these design sketches poses a serious challenge to practical exploitation of highly parallel optical interconnects in advanced computer designs.
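For orientation, the difference between unslotted and slotted ALOHA can be illustrated with the classic textbook throughput formulas, S = G e^(-2G) and S = G e^(-G), where G is the offered load in frames per frame time. The sketch below assumes the standard Poisson-arrival model; it is not taken from the paper, whose own analysis of the optical crossbar may use different assumptions.

```python
import math

def pure_aloha_throughput(g):
    """Unslotted ALOHA: a frame succeeds if no other frame starts
    within a vulnerable window of two frame times."""
    return g * math.exp(-2.0 * g)

def slotted_aloha_throughput(g):
    """Slotted ALOHA: the vulnerable window shrinks to one slot."""
    return g * math.exp(-g)

# Peak throughput occurs at G = 0.5 (unslotted) and G = 1.0 (slotted).
peak_pure = pure_aloha_throughput(0.5)        # 1 / (2e), about 0.184
peak_slotted = slotted_aloha_throughput(1.0)  # 1 / e, about 0.368
```

Slotting doubles the achievable peak throughput at the cost of requiring a common clock, which is one instance of the trade-off between ease of implementation and speed that the abstract mentions.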
Parallel Navier-Stokes computations on shared and distributed memory architectures
NASA Technical Reports Server (NTRS)
Hayder, M. Ehtesham; Jayasimha, D. N.; Pillay, Sasi Kumar
1995-01-01
We study a high order finite difference scheme to solve the time accurate flow field of a jet using the compressible Navier-Stokes equations. As part of our ongoing efforts, we have implemented our numerical model on three parallel computing platforms to study the computational, communication, and scalability characteristics. The platforms chosen for this study are a cluster of workstations connected through fast networks (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and a distributed memory multiprocessor (the IBM SP1). Our focus in this study is on the LACE testbed. We present some results for the Cray YMP and the IBM SP1 mainly for comparison purposes. On the LACE testbed, we study: (1) the communication characteristics of Ethernet, FDDI, and the ALLNODE networks and (2) the overheads induced by the PVM message passing library used for parallelizing the application. We demonstrate that clustering of workstations is effective and has the potential to be computationally competitive with supercomputers at a fraction of the cost.
None
2018-05-01
A new Idaho National Laboratory supercomputer is helping scientists create more realistic simulations of nuclear fuel. Dubbed "Ice Storm," this 2048-processor machine allows researchers to model and predict the complex physics behind nuclear reactor behavior. And with a new visualization lab, the team can see the results of its simulations on the big screen. For more information about INL research, visit http://www.facebook.com/idahonationallaboratory.
Open Skies Project Computational Fluid Dynamic Analysis
1994-03-01
...Conclusions...List of References...Appendix A: Transition Prediction...Behind the Open Skies Plate...VSAERO Results on the Alternate Fairing...Centerline Cp Comparisons...VSAERO Wing Effects Study Centerline C...problems. The assistance Mrs. Mary Ann Mages, at Kirtland Supercomputer Center (PL/SCPR) gave by setting a precedent for supercomputer account
Porting Ordinary Applications to Blue Gene/Q Supercomputers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Maheshwari, Ketan C.; Wozniak, Justin M.; Armstrong, Timothy
2015-08-31
Efficiently porting ordinary applications to Blue Gene/Q supercomputers is a significant challenge. Codes are often originally developed without considering advanced architectures and related tool chains. Science needs frequently lead users to want to run large numbers of relatively small jobs (often called many-task computing, an ensemble, or a workflow), which can conflict with supercomputer configurations. In this paper, we discuss techniques developed to execute ordinary applications over leadership class supercomputers. We use the high-performance Swift parallel scripting framework and build two workflow execution techniques: sub-jobs and main-wrap. The sub-jobs technique, built on top of the IBM Blue Gene/Q resource manager Cobalt's sub-block jobs, lets users submit multiple, independent, repeated smaller jobs within a single larger resource block. The main-wrap technique is a scheme that enables C/C++ programs to be defined as functions that are wrapped by a high-performance Swift wrapper and that are invoked as a Swift script. We discuss the needs, benefits, technicalities, and current limitations of these techniques. We further discuss the real-world science enabled by these techniques and the results obtained.
Choosing experiments to accelerate collective discovery
Rzhetsky, Andrey; Foster, Jacob G.; Foster, Ian T.
2015-01-01
A scientist’s choice of research problem affects his or her personal career trajectory. Scientists’ combined choices affect the direction and efficiency of scientific discovery as a whole. In this paper, we infer preferences that shape problem selection from patterns of published findings and then quantify their efficiency. We represent research problems as links between scientific entities in a knowledge network. We then build a generative model of discovery informed by qualitative research on scientific problem selection. We map salient features from this literature to key network properties: an entity’s importance corresponds to its degree centrality, and a problem’s difficulty corresponds to the network distance it spans. Drawing on millions of papers and patents published over 30 years, we use this model to infer the typical research strategy used to explore chemical relationships in biomedicine. This strategy generates conservative research choices focused on building up knowledge around important molecules. These choices become more conservative over time. The observed strategy is efficient for initial exploration of the network and supports scientific careers that require steady output, but is inefficient for science as a whole. Through supercomputer experiments on a sample of the network, we study thousands of alternatives and identify strategies much more efficient at exploring mature knowledge networks. We find that increased risk-taking and the publication of experimental failures would substantially improve the speed of discovery. We consider institutional shifts in grant making, evaluation, and publication that would help realize these efficiencies. PMID:26554009
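The two network properties named in the abstract, degree centrality as a proxy for an entity's importance and network distance as a proxy for a problem's difficulty, can be sketched on a toy graph. The graph, entity labels, and function names below are hypothetical and only illustrate the definitions; the study itself works on a far larger knowledge network and a generative model not reproduced here.

```python
from collections import deque

def degree(graph, node):
    """Number of neighbours: a simple stand-in for an entity's importance."""
    return len(graph[node])

def distance(graph, source, target):
    """Shortest-path length between two entities (breadth-first search).

    Returns None when the two entities are not connected at all.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph[node]:
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None

# Hypothetical knowledge network: nodes are entities, edges are known links.
toy = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B", "D"},
    "D": {"B", "C"},
    "E": set(),
}
```

Under this reading, proposing the link A-D is a problem spanning distance 2 around the high-degree entity B, whereas any link to E bridges previously unconnected parts of the network.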
PNNL streamlines energy-guzzling computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beckman, Mary T.; Marquez, Andres
In a room the size of a garage, two rows of six-foot-tall racks holding supercomputer hard drives sit back-to-back. Thin tubes and wires snake off the hard drives, slithering into the corners. Stepping between the rows, a rush of heat whips around you -- the air from fans blowing off processing heat. But walk farther in, between the next racks of hard drives, and the temperature drops noticeably. These drives are being cooled by a non-conducting liquid that runs right over the hardworking processors. The liquid carries the heat away in tubes, saving the air a few degrees. This is the Energy Smart Data Center at Pacific Northwest National Laboratory. The bigger, faster, and meatier supercomputers get, the more energy they consume. PNNL's Andres Marquez has developed this test bed to learn how to train the behemoths in energy efficiency. The work will help supercomputers perform better as well. Processors have to keep cool or suffer from "thermal throttling," says Marquez. "That's the performance threshold where the computer is too hot to run well. That threshold is an industry secret." The center at EMSL, DOE's national scientific user facility at PNNL, harbors several ways of experimenting with energy usage. For example, the room's air conditioning is isolated from the rest of EMSL -- pipes running beneath the floor carry temperature-controlled water through heat exchangers to cooling towers outside. "We can test whether it's more energy efficient to cool directly on the processing chips or out in the water tower," says Marquez. The hard drives feed energy and temperature data to a network server running specially designed software that controls and monitors the data center. To test the center's limits, the team runs the processors flat out -- not only on carefully controlled test programs in the Energy Smart computers, but also on real world software from other EMSL research, such as regional weather forecasting models.
Marquez's group is also developing "power aware computing," where the computer programs themselves perform calculations more energy efficiently. Maybe once computers get smart about energy, they'll have tips for their users.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Chase Qishi; Zhu, Michelle Mengxia
The advent of large-scale collaborative scientific applications has demonstrated the potential for broad scientific communities to pool globally distributed resources to produce unprecedented data acquisition, movement, and analysis. System resources including supercomputers, data repositories, computing facilities, network infrastructures, storage systems, and display devices have been increasingly deployed at national laboratories and academic institutes. These resources are typically shared by large communities of users over the Internet or dedicated networks and hence exhibit an inherent dynamic nature in their availability, accessibility, capacity, and stability. Scientific applications using either experimental facilities or computation-based simulations with various physical, chemical, climatic, and biological models feature diverse scientific workflows as simple as linear pipelines or as complex as directed acyclic graphs, which must be executed and supported over wide-area networks with massively distributed resources. Application users oftentimes need to manually configure their computing tasks over networks in an ad hoc manner, hence significantly limiting the productivity of scientists and constraining the utilization of resources. The success of these large-scale distributed applications requires a highly adaptive and massively scalable workflow platform that provides automated and optimized computing and networking services. This project is to design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which contains a web-based user interface specially tailored for a target application, a set of user libraries, and several easy-to-use computing and networking toolkits for application scientists to conveniently assemble, execute, monitor, and control complex computing workflows in heterogeneous high-performance network environments.
SWAMP will enable the automation and management of the entire process of scientific workflows with the convenience of a few mouse clicks while hiding the implementation and technical details from end users. Particularly, we will consider two types of applications with distinct performance requirements: data-centric and service-centric applications. For data-centric applications, the main workflow task involves large-volume data generation, cataloging, storage, and movement, typically from supercomputers or experimental facilities to a team of geographically distributed users; while for service-centric applications, the main focus of workflow is on data archiving, preprocessing, filtering, synthesis, visualization, and other application-specific analysis. We will conduct a comprehensive comparison of existing workflow systems and choose the best suited one with open-source code, a flexible system structure, and a large user base as the starting point for our development. Based on the chosen system, we will develop and integrate new components including a black box design of computing modules, performance monitoring and prediction, and workflow optimization and reconfiguration, which are missing from existing workflow systems. A modular design for separating specification, execution, and monitoring aspects will be adopted to establish a common generic infrastructure suited for a wide spectrum of science applications. We will further design and develop efficient workflow mapping and scheduling algorithms to optimize the workflow performance in terms of minimum end-to-end delay, maximum frame rate, and highest reliability. We will develop and demonstrate the SWAMP system in a local environment, the grid network, and the 100 Gbps Advanced Network Initiative (ANI) testbed. The demonstration will target scientific applications in climate modeling and high energy physics and the functions to be demonstrated include workflow deployment, execution, steering, and reconfiguration.
Throughout the project period, we will work closely with the science communities in the fields of climate modeling and high energy physics including Spallation Neutron Source (SNS) and Large Hadron Collider (LHC) projects to mature the system for production use.
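The notion of minimum end-to-end delay for a workflow expressed as a directed acyclic graph can be illustrated with a small sketch. It assumes each task has a fixed duration and unlimited resources, so the delay equals the length of the longest dependency chain; the task names and the function `end_to_end_delay` are hypothetical and do not come from SWAMP, whose mapping and scheduling algorithms are not described in the abstract.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def end_to_end_delay(durations, deps):
    """Earliest completion time of a workflow DAG.

    durations: task -> run time
    deps:      task -> iterable of tasks that must finish first
    Assumes every task can start as soon as its predecessors finish.
    """
    graph = {task: tuple(deps.get(task, ())) for task in durations}
    finish = {}
    for task in TopologicalSorter(graph).static_order():
        start = max((finish[p] for p in graph[task]), default=0.0)
        finish[task] = start + durations[task]
    return max(finish.values(), default=0.0)

# Hypothetical four-stage workflow: one producer, two parallel stages, one sink.
durations = {"generate": 5.0, "transfer": 3.0, "filter": 2.0, "visualize": 4.0}
deps = {
    "transfer": ["generate"],
    "filter": ["generate"],
    "visualize": ["transfer", "filter"],
}
delay = end_to_end_delay(durations, deps)  # generate -> transfer -> visualize
```

A linear pipeline is the special case in which every task has exactly one predecessor, so the delay reduces to the sum of all durations.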
DOE Office of Scientific and Technical Information (OSTI.GOV)
Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack
20th Edition of TOP500 List of World's Fastest Supercomputers Released. MANNHEIM, Germany; KNOXVILLE, Tenn.; & BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 20th edition of the TOP500 list of the world's fastest supercomputers was released today (November 15, 2002). The Earth Simulator supercomputer, installed earlier this year at the Earth Simulator Center in Yokohama, Japan, retains the number one position with its Linpack benchmark performance of 35.86 Tflop/s (trillions of calculations per second). The No. 2 and No. 3 positions are held by two new, identical ASCI Q systems at Los Alamos National Laboratory (7.73 Tflop/s each). These systems are built by Hewlett-Packard and based on the Alpha Server SC computer system.
STAMPS: Software Tool for Automated MRI Post-processing on a supercomputer.
Bigler, Don C; Aksu, Yaman; Miller, David J; Yang, Qing X
2009-08-01
This paper describes a Software Tool for Automated MRI Post-processing (STAMP) of multiple types of brain MRIs on a workstation and for parallel processing on a supercomputer (STAMPS). This software tool enables the automation of nonlinear registration for a large image set and for multiple MR image types. The tool uses standard brain MRI post-processing tools (such as SPM, FSL, and HAMMER) for multiple MR image types in a pipeline fashion. It also contains novel MRI post-processing features. The STAMP image outputs can be used to perform brain analysis using Statistical Parametric Mapping (SPM) or single-/multi-image modality brain analysis using Support Vector Machines (SVMs). Since STAMPS is PBS-based, the supercomputer may be a multi-node computer cluster or one of the latest multi-core computers.
Japanese project aims at supercomputer that executes 10 gflops
DOE Office of Scientific and Technical Information (OSTI.GOV)
Burskey, D.
1984-05-03
Dubbed supercom by its multicompany design team, the decade-long project's goal is an engineering supercomputer that can execute 10 billion floating-point operations/s, about 20 times faster than today's supercomputers. The project, guided by Japan's Ministry of International Trade and Industry (MITI) and the Agency of Industrial Science and Technology, encompasses three parallel research programs, all aimed at some angle of the supercomputer. One program should lead to superfast logic and memory circuits, another to a system architecture that will afford the best performance, and the last to the software that will ultimately control the computer. The work on logic and memory chips is based on GaAs circuits, Josephson junction devices, and high electron mobility transistor structures. The architecture will involve parallel processing.
Community Detection on the GPU
DOE Office of Scientific and Technical Information (OSTI.GOV)
Naim, Md; Manne, Fredrik; Halappanavar, Mahantesh
We present and evaluate a new GPU algorithm based on the Louvain method for community detection. Our algorithm is the first for this problem that parallelizes the access to individual edges. In this way we can fine tune the load balance when processing networks with nodes of highly varying degrees. This is achieved by scaling the number of threads assigned to each node according to its degree. Extensive experiments show that we obtain speedups up to a factor of 270 compared to the sequential algorithm. The algorithm consistently outperforms other recent shared memory implementations and is only one order of magnitude slower than the current fastest parallel Louvain method running on a Blue Gene/Q supercomputer using more than 500K threads.
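The idea of scaling the number of threads per node with its degree can be sketched on the host side as a simple assignment rule. The power-of-two rounding and the cap of 128 threads below are assumptions made for illustration; the abstract does not state the actual thresholds or grouping the authors use, and the function names are hypothetical.

```python
def threads_for_degree(degree, max_threads=128):
    """Smallest power of two that covers `degree`, clamped to [1, max_threads].

    Hypothetical rule: low-degree nodes get few threads, hubs get many.
    """
    threads = 1
    while threads < degree and threads < max_threads:
        threads *= 2
    return min(threads, max_threads)

def edges_per_thread(degree, max_threads=128):
    """Ceiling of degree / threads: how many edges each thread scans."""
    threads = threads_for_degree(degree, max_threads)
    return (degree + threads - 1) // threads

# Degrees of a few hypothetical nodes and the resulting assignment.
assignment = {d: threads_for_degree(d) for d in (0, 1, 3, 100, 1000)}
```

With such a rule a node of degree 3 occupies 4 threads while a hub of degree 1000 occupies 128 threads that each scan 8 edges, which keeps per-thread work within a narrow range despite highly varying degrees.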
NASA Technical Reports Server (NTRS)
Oliger, Joseph
1997-01-01
Topics considered include: high-performance computing; cognitive and perceptual prostheses (computational aids designed to leverage human abilities); autonomous systems. Also included: development of a 3D unstructured grid code based on a finite volume formulation and applied to the Navier-Stokes equations; Cartesian grid methods for complex geometry; multigrid methods for solving elliptic problems on unstructured grids; algebraic non-overlapping domain decomposition methods for compressible fluid flow problems on unstructured meshes; numerical methods for the compressible Navier-Stokes equations with application to aerodynamic flows; research in aerodynamic shape optimization; S-HARP: a parallel dynamic spectral partitioner; numerical schemes for the Hamilton-Jacobi and level set equations on triangulated domains; application of high-order shock capturing schemes to direct simulation of turbulence; multicast technology; network testbeds; supercomputer consolidation project.
Earth Sciences Electronic Theater ''999
NASA Technical Reports Server (NTRS)
Hasler, Fritz; Manyin, Mike
1999-01-01
The Etheater presents visualizations which span the period from the original Suomi/Hasler animations of the first ATS-1 GEO weather satellite images in 1966 to the latest 1999 NASA Earth Science Vision for the next 25 years. Hot off the SGI-Onyx Graphics-Supercomputer are NASA's visualizations of Hurricanes Mitch, Georges, Fran and Linda. These storms have been recently featured on the covers of National Geographic, Time, Newsweek and Popular Science. Highlights will be shown from the NASA hurricane visualization resource video tape that has been used repeatedly this season on National and International network TV. Results will be presented from a new paper on automatic wind measurements in Hurricane Luis from 1-min GOES images that appeared in the November BAMS.
Information technologies for astrophysics circa 2001
NASA Technical Reports Server (NTRS)
Denning, Peter J.
1990-01-01
It is easy to extrapolate current trends to see where technologies relating to information systems in astrophysics and other disciplines will be by the end of the decade. These technologies include miniaturization, multiprocessing, software technology, networking, databases, graphics, pattern computation, and interdisciplinary studies. It is less easy to see what limits our current paradigms place on our thinking about technologies that will allow us to understand the laws governing very large systems about which we have large datasets. Three limiting paradigms are saving all the bits collected by instruments or generated by supercomputers; obtaining technology for information compression, storage and retrieval off the shelf; and the linear mode of innovation. We must extend these paradigms to meet our goals for information technology at the end of the decade.
Methodologies and systems for heterogeneous concurrent computing
NASA Technical Reports Server (NTRS)
Sunderam, V. S.
1994-01-01
Heterogeneous concurrent computing is gaining increasing acceptance as an alternative or complementary paradigm to multiprocessor-based parallel processing as well as to conventional supercomputing. While algorithmic and programming aspects of heterogeneous concurrent computing are similar to their parallel processing counterparts, system issues, partitioning and scheduling, and performance aspects are significantly different. In this paper, we discuss critical design and implementation issues in heterogeneous concurrent computing, and describe techniques for enhancing its effectiveness. In particular, we highlight the system level infrastructures that are required, aspects of parallel algorithm development that most affect performance, system capabilities and limitations, and tools and methodologies for effective computing in heterogeneous networked environments. We also present recent developments and experiences in the context of the PVM system and comment on ongoing and future work.
A Communication-Optimal Framework for Contracting Distributed Tensors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rajbhandari, Samyam; Nikam, Akshay; Lai, Pai-Wei
Tensor contractions are extremely compute-intensive generalized matrix multiplication operations encountered in many computational science fields, such as quantum chemistry and nuclear physics. Unlike distributed matrix multiplication, which has been extensively studied, limited work has been done in understanding distributed tensor contractions. In this paper, we characterize distributed tensor contraction algorithms on torus networks. We develop a framework with three fundamental communication operators to generate communication-efficient contraction algorithms for arbitrary tensor contractions. We show that for a given amount of memory per processor, our framework is communication optimal for all tensor contractions. We demonstrate performance and scalability of our framework on up to 262,144 cores of a BG/Q supercomputer using five tensor contraction examples.
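The description of tensor contractions as generalized matrix multiplications can be illustrated with a small single-node sketch. It is not the distributed framework of the paper; the function name `contract_via_matmul`, the index pattern, and the tensor shapes are assumptions chosen for illustration. It shows that fusing the free indices and the contracted indices turns a contraction into one ordinary matrix product.

```python
import numpy as np

def contract_via_matmul(a, b):
    """Compute C[i,j,m,n] = sum over k,l of A[i,j,k,l] * B[k,l,m,n] as one matrix product."""
    i, j, k, l = a.shape
    k2, l2, m, n = b.shape
    assert (k, l) == (k2, l2), "contracted index extents must match"
    # Fuse free indices (i,j) into rows and contracted indices (k,l) into columns.
    a_mat = a.reshape(i * j, k * l)
    # Fuse contracted indices (k,l) into rows and free indices (m,n) into columns.
    b_mat = b.reshape(k * l, m * n)
    # One matrix multiplication, then unfuse the free indices.
    return (a_mat @ b_mat).reshape(i, j, m, n)

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3, 4, 5))
B = rng.standard_normal((4, 5, 6, 7))
C_ref = np.einsum("ijkl,klmn->ijmn", A, B)   # reference contraction
C_mat = contract_via_matmul(A, B)            # same result via reshape + matmul
```

In a distributed setting the cost is dominated by how the fused matrices are laid out and moved between processors, which is the part the paper's communication operators address and this sketch leaves out.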
Contention Modeling for Multithreaded Distributed Shared Memory Machines: The Cray XMT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Secchi, Simone; Tumeo, Antonino; Villa, Oreste
Distributed Shared Memory (DSM) machines are a wide class of multi-processor computing systems where a large virtually-shared address space is mapped on a network of physically distributed memories. High memory latency and network contention are two of the main factors that limit performance scaling of such architectures. Modern high-performance computing DSM systems have evolved toward exploitation of massive hardware multi-threading and fine-grained memory hashing to tolerate irregular latencies, avoid network hot-spots and enable high scaling. In order to model the performance of such large-scale machines, parallel simulation has proved to be a promising approach to achieve good accuracy in reasonable times. One of the most critical factors in solving the simulation speed-accuracy trade-off is network modeling. The Cray XMT is a massively multi-threaded supercomputing architecture that belongs to the DSM class, since it implements a globally-shared address space abstraction on top of a physically distributed memory substrate. In this paper, we discuss the development of a contention-aware network model intended to be integrated in a full-system XMT simulator. We start by measuring the effects of network contention in a 128-processor XMT machine and then investigate the trade-off that exists between simulation accuracy and speed, by comparing three network models which operate at different levels of accuracy. The comparison and model validation are performed by executing a string-matching algorithm on the full-system simulator and on the XMT, using three datasets that generate noticeably different contention patterns.
NASA Astrophysics Data System (ADS)
Kuvich, Gary
2003-08-01
Vision is a part of a larger information system that converts visual information into knowledge structures. These structures drive the vision process, resolve ambiguity and uncertainty via feedback projections, and provide image understanding, that is, an interpretation of visual information in terms of such knowledge models. The human brain is found to be able to emulate knowledge structures in the form of network-symbolic models, which means an important paradigm shift in our knowledge about the brain, from neural networks to "cortical software". Symbols, predicates and grammars naturally emerge in such active multilevel hierarchical networks, and logic is simply a way of restructuring such models. The brain analyzes an image as a graph-type decision structure created via multilevel hierarchical compression of visual information. Mid-level vision processes such as clustering, perceptual grouping, and separation of figure from ground are special kinds of graph/network transformations. They convert low-level image structure into a set of more abstract ones, which represent objects and the visual scene, making them easy for analysis by higher-level knowledge structures. Higher-level vision phenomena are results of such analysis. Composition of network-symbolic models works similarly to frames and agents, and combines learning, classification, and analogy together with higher-level model-based reasoning into a single framework. Such models do not require supercomputers. Based on such principles, and using methods of computational intelligence, an image understanding system can convert images into network-symbolic knowledge models and effectively resolve uncertainty and ambiguity, providing a unifying representation for perception and cognition. That allows creating new intelligent computer vision systems for robotic and defense industries.
Japanese supercomputer technology.
Buzbee, B L; Ewald, R H; Worlton, W J
1982-12-17
Under the auspices of the Ministry for International Trade and Industry the Japanese have launched a National Superspeed Computer Project intended to produce high-performance computers for scientific computation and a Fifth-Generation Computer Project intended to incorporate and exploit concepts of artificial intelligence. If these projects are successful, which appears likely, advanced economic and military research in the United States may become dependent on access to supercomputers of foreign manufacture.
Supercomputer Simulations Help Develop New Approach to Fight Antibiotic Resistance
Zgurskaya, Helen; Smith, Jeremy
2018-06-13
ORNL leveraged powerful supercomputing to support research led by University of Oklahoma scientists to identify chemicals that seek out and disrupt bacterial proteins called efflux pumps, known to be a major cause of antibiotic resistance. By running simulations on Titan, the team selected molecules most likely to target and potentially disable the assembly of efflux pumps found in E. coli bacteria cells.
Challenges and opportunities of cloud computing for atmospheric sciences
NASA Astrophysics Data System (ADS)
Pérez Montes, Diego A.; Añel, Juan A.; Pena, Tomás F.; Wallom, David C. H.
2016-04-01
Cloud computing is an emerging technological solution widely used in many fields. Initially developed as a flexible way of managing peak demand, it has begun to make its way into scientific research. One of the greatest advantages of cloud computing for scientific research is that funding or performing a research project no longer depends on having access to a large cyberinfrastructure. Cloud computing can avoid maintenance expenses for large supercomputers and has the potential to 'democratize' the access to high-performance computing, giving flexibility to funding bodies for allocating budgets for the computational costs associated with a project. Two of the most challenging problems in atmospheric sciences are computational cost and uncertainty in meteorological forecasting and climate projections. Both problems are closely related. Usually uncertainty can be reduced with the availability of computational resources to better reproduce a phenomenon or to perform a larger number of experiments. Here we present results of the application of cloud computing resources for climate modeling using cloud computing infrastructures of three major vendors and two climate models. We show how the cloud infrastructure compares in performance to traditional supercomputers and how it provides the capability to complete experiments in shorter periods of time. The associated monetary cost is also analyzed. Finally we discuss the future potential of this technology for meteorological and climatological applications, both from the point of view of operational use and research.
Evolution of the Virtualized HPC Infrastructure of Novosibirsk Scientific Center
NASA Astrophysics Data System (ADS)
Adakin, A.; Anisenkov, A.; Belov, S.; Chubarov, D.; Kalyuzhny, V.; Kaplin, V.; Korol, A.; Kuchin, N.; Lomakin, S.; Nikultsev, V.; Skovpen, K.; Sukharev, A.; Zaytsev, A.
2012-12-01
Novosibirsk Scientific Center (NSC), also known worldwide as Akademgorodok, is one of the largest Russian scientific centers, hosting Novosibirsk State University (NSU) and more than 35 research organizations of the Siberian Branch of the Russian Academy of Sciences, including Budker Institute of Nuclear Physics (BINP), Institute of Computational Technologies, and Institute of Computational Mathematics and Mathematical Geophysics (ICM&MG). Since each institute has specific requirements on the architecture of computing farms involved in its research field, there are currently several computing facilities hosted by NSC institutes, each optimized for a particular set of tasks, of which the largest are the NSU Supercomputer Center, Siberian Supercomputer Center (ICM&MG), and a Grid Computing Facility of BINP. A dedicated optical network with an initial bandwidth of 10 Gb/s connecting these three facilities was built in order to make it possible to share the computing resources among the research communities, thus increasing the efficiency of operating the existing computing facilities and offering a common platform for building the computing infrastructure for future scientific projects. Unification of the computing infrastructure is achieved by extensive use of virtualization technology based on XEN and KVM platforms. This contribution gives a thorough review of the present status and future development prospects for the NSC virtualized computing infrastructure and the experience gained while using it for running production data analysis jobs related to HEP experiments being carried out at BINP, especially the KEDR detector experiment at the VEPP-4M electron-positron collider.
Next Generation Security for the 10,240 Processor Columbia System
NASA Technical Reports Server (NTRS)
Hinke, Thomas; Kolano, Paul; Shaw, Derek; Keller, Chris; Tweton, Dave; Welch, Todd; Liu, Wen (Betty)
2005-01-01
This presentation includes a discussion of the Columbia 10,240-processor system located at the NASA Advanced Supercomputing (NAS) division at the NASA Ames Research Center which supports each of NASA's four missions: science, exploration systems, aeronautics, and space operations. It is comprised of 20 Silicon Graphics nodes, each consisting of 512 Itanium II processors. A 64 processor Columbia front-end system supports users as they prepare their jobs and then submits them to the PBS system. Columbia nodes and front-end systems use the Linux OS. Prior to SC04, the Columbia system was used to attain a processing speed of 51.87 TeraFlops, which made it number two on the Top 500 list of the world's supercomputers and the world's fastest "operational" supercomputer since it was fully engaged in supporting NASA users.
CFD applications: The Lockheed perspective
NASA Technical Reports Server (NTRS)
Miranda, Luis R.
1987-01-01
The Numerical Aerodynamic Simulator (NAS) epitomizes the coming of age of supercomputing and opens exciting horizons in the world of numerical simulation. An overview of supercomputing at Lockheed Corporation in the area of Computational Fluid Dynamics (CFD) is presented. This overview will focus on developments and applications of CFD as an aircraft design tool and will attempt to present an assessment, within this context, of the state-of-the-art in CFD methodology.
Computational mechanics analysis tools for parallel-vector supercomputers
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.; Nguyen, Duc T.; Baddourah, Majdi; Qin, Jiangning
1993-01-01
Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigensolution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization search analysis and domain decomposition. The source code for many of these algorithms is available.
A Layered Solution for Supercomputing Storage
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grider, Gary
To solve the supercomputing challenge of memory keeping up with processing speed, a team at Los Alamos National Laboratory developed two innovative memory management and storage technologies. Burst buffers peel off data onto flash memory to support the checkpoint/restart paradigm of large simulations. MarFS adds a thin software layer enabling a new tier for campaign storage—based on inexpensive, failure-prone disk drives—between disk drives and tape archives.
A Heterogeneous High-Performance System for Computational and Computer Science
2016-11-15
...team of research faculty from the departments of computer science and natural science at Bowie State University. The supercomputer is not only to...accelerated HPC systems. The supercomputer is also ideal for the research conducted in the Department of Natural Science, as research faculty work on
LLMapReduce: Multi-Lingual Map-Reduce for Supercomputing Environments
2015-11-20
1990s. Popularized by Google [36] and Apache Hadoop [37], map-reduce has become a staple technology of the ever-growing big data community...Lexington, MA, U.S.A. Abstract: The map-reduce parallel programming model has become extremely popular in the big data community. Many big data...to big data users running on a supercomputer. LLMapReduce dramatically simplifies map-reduce programming by providing simple parallel programming
Advanced Numerical Techniques of Performance Evaluation. Volume 1
1990-06-01
system scheduling thread. The scheduling thread then runs any other ready thread that can be found. A thread can only sleep or switch out on itself...Polychronopoulos and D.J. Kuck. Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers. IEEE Transactions on Computers C...Kuck 1987] C.D. Polychronopoulos and D.J. Kuck. Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers. IEEE Trans. on Comp
NASA Astrophysics Data System (ADS)
Imamura, Seigo; Ono, Kenji; Yokokawa, Mitsuo
2016-07-01
Ensemble computing, which is an instance of capacity computing, is an effective computing scenario for exascale parallel supercomputers. In ensemble computing, there are multiple linear systems associated with a common coefficient matrix. We improve the performance of iterative solvers for multiple vectors by solving them at the same time, that is, by solving for the product of the matrices. We implemented several iterative methods and compared their performance. The maximum performance on Sparc VIIIfx was 7.6 times higher than that of a naïve implementation. Finally, to deal with the different convergence processes of linear systems, we introduced a control method to eliminate the calculation of already converged vectors.
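The idea of solving several linear systems that share one coefficient matrix at the same time, and of skipping vectors that have already converged, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes a symmetric positive definite matrix, uses plain conjugate gradient per column, and the function name `multi_rhs_cg` and all parameters are hypothetical. Batching the columns turns the matrix-vector products into matrix-matrix products, and the `active` mask plays the role of the control method that drops converged vectors.

```python
import numpy as np

def multi_rhs_cg(A, B, tol=1e-10, max_iter=1000):
    """Run conjugate gradient on every column of B at once (A assumed SPD)."""
    n, m = B.shape
    X = np.zeros((n, m))
    R = B.copy()                         # one residual per column
    P = R.copy()                         # one search direction per column
    rs = np.sum(R * R, axis=0)           # squared residual norms
    b_norm = np.linalg.norm(B, axis=0)
    b_norm[b_norm == 0] = 1.0            # avoid dividing a zero column by zero
    active = np.sqrt(rs) > tol * b_norm  # columns that still need work
    for _ in range(max_iter):
        if not active.any():
            break
        idx = np.flatnonzero(active)
        AP = A @ P[:, idx]               # matrix-matrix product over active columns only
        alpha = rs[idx] / np.sum(P[:, idx] * AP, axis=0)
        X[:, idx] += alpha * P[:, idx]
        R[:, idx] -= alpha * AP
        rs_new = np.sum(R[:, idx] * R[:, idx], axis=0)
        P[:, idx] = R[:, idx] + (rs_new / rs[idx]) * P[:, idx]
        rs[idx] = rs_new
        active[idx] = np.sqrt(rs_new) > tol * b_norm[idx]  # retire converged columns
    return X

rng = np.random.default_rng(1)
M = rng.standard_normal((20, 20))
A = M @ M.T + 20.0 * np.eye(20)          # well-conditioned SPD test matrix
B = rng.standard_normal((20, 4))         # four right-hand sides
X = multi_rhs_cg(A, B)
```

Each column follows exactly the iteration it would follow if solved alone; the gain in a real code comes from the better memory reuse of the batched product.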
Ensemble-based docking: From hit discovery to metabolism and toxicity predictions.
Evangelista, Wilfredo; Weir, Rebecca L; Ellingson, Sally R; Harris, Jason B; Kapoor, Karan; Smith, Jeremy C; Baudry, Jerome
2016-10-15
This paper describes and illustrates the use of ensemble-based docking, i.e., using a collection of protein structures in docking calculations for hit discovery, the exploration of biochemical pathways and toxicity prediction of drug candidates. We describe the computational engineering work necessary to enable large ensemble docking campaigns on supercomputers. We show examples where ensemble-based docking has significantly increased the number and the diversity of validated drug candidates. Finally, we illustrate how ensemble-based docking can be extended beyond hit discovery and toward providing a structural basis for the prediction of metabolism and off-target binding relevant to pre-clinical and clinical trials. Copyright © 2016 Elsevier Ltd. All rights reserved.
Abraham, Mark James; Murtola, Teemu; Schulz, Roland; ...
2015-07-15
GROMACS is one of the most widely used open-source and free software codes in chemistry, used primarily for dynamical simulations of biomolecules. It provides a rich set of calculation types, preparation and analysis tools. Several advanced techniques for free-energy calculations are supported. In version 5, it reaches new performance heights, through several new and enhanced parallelization algorithms. These work on every level: SIMD registers inside cores, multithreading, heterogeneous CPU–GPU acceleration, state-of-the-art 3D domain decomposition, and ensemble-level parallelization through built-in replica exchange and the separate Copernicus framework. Finally, the latest best-in-class compressed trajectory storage format is supported.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Abraham, Mark James; Murtola, Teemu; Schulz, Roland
GROMACS is one of the most widely used open-source and free software codes in chemistry, used primarily for dynamical simulations of biomolecules. It provides a rich set of calculation types, preparation and analysis tools. Several advanced techniques for free-energy calculations are supported. In version 5, it reaches new performance heights, through several new and enhanced parallelization algorithms. These work on every level: SIMD registers inside cores, multithreading, heterogeneous CPU–GPU acceleration, state-of-the-art 3D domain decomposition, and ensemble-level parallelization through built-in replica exchange and the separate Copernicus framework. Finally, the latest best-in-class compressed trajectory storage format is supported.
Dual-scale phase-field simulation of Mg-Al alloy solidification
NASA Astrophysics Data System (ADS)
Monas, A.; Shchyglo, O.; Höche, D.; Tegeler, M.; Steinbach, I.
2015-06-01
Phase-field simulations of the nucleation and growth of the primary α-Mg phase as well as the secondary β-phase of a Mg-Al alloy are presented. The nucleation model for α- and β-Mg phases is based on the “free growth model” by Greer et al. After the α-Mg phase solidification we study a divorced eutectic growth of α- and β-Mg phases in a zoomed-in melt channel between α-phase dendrites. The simulated cooling curves and final microstructures of α-grains are compared with experiments. In order to further enhance the resolution of the interdendritic region, a high-performance computing approach has been used, allowing a significant simulation speed gain when using supercomputing facilities.
The Pawsey Supercomputer geothermal cooling project
NASA Astrophysics Data System (ADS)
Regenauer-Lieb, K.; Horowitz, F.; Western Australian Geothermal Centre Of Excellence, T.
2010-12-01
The Australian Government has funded the Pawsey supercomputer in Perth, Western Australia, providing computational infrastructure intended to support the future operations of the Australian Square Kilometre Array radiotelescope and to boost next-generation computational geosciences in Australia. Supplementary funds have been directed to the development of a geothermal exploration well to research the potential for direct heat use applications at the Pawsey Centre site. Cooling the Pawsey supercomputer may be achieved by geothermal heat exchange rather than by conventional electrical power cooling, thus reducing the carbon footprint of the Pawsey Centre and demonstrating an innovative green technology that is widely applicable in industry and urban centres across the world. The exploration well is scheduled to be completed in 2013, with drilling due to commence in the third quarter of 2011. One year is allocated to finalizing the design of the exploration, monitoring and research well. Success in the geothermal exploration and research program will result in an industrial-scale geothermal cooling facility at the Pawsey Centre, and will provide a world-class student training environment in geothermal energy systems. A similar system is partially funded and in advanced planning to provide base-load air-conditioning for the main campus of the University of Western Australia. Both systems are expected to draw ~80-95 degrees C water from aquifers lying between 2000 and 3000 meters depth from naturally permeable rocks of the Perth sedimentary basin. The geothermal water will be run through absorption chilling devices, which only require heat (as opposed to mechanical work) to power a chilled water stream adequate to meet the cooling requirements. Once the heat has been removed from the geothermal water, licensing issues require the water to be re-injected back into the aquifer system. 
These systems are intended to demonstrate the feasibility of powering large-scale air-conditioning systems from the direct use of geothermal power from Hot Sedimentary Aquifer (HSA) systems. HSA systems underlie many of the world's population centers, and thus have the potential to offset a significant fraction of the world's consumption of electrical power for air-conditioning.
Computational mechanics analysis tools for parallel-vector supercomputers
NASA Technical Reports Server (NTRS)
Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.; Qin, J.
1993-01-01
Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigen-solution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization algorithm and domain decomposition. The source code for many of these algorithms is available from NASA Langley.
NASA's Pleiades Supercomputer Crunches Data For Groundbreaking Analysis and Visualizations
2016-11-23
The Pleiades supercomputer at NASA's Ames Research Center, recently named the 13th fastest computer in the world, provides scientists and researchers high-fidelity numerical modeling of complex systems and processes. By using detailed analyses and visualizations of large-scale data, Pleiades is helping to advance human knowledge and technology, from designing the next generation of aircraft and spacecraft to understanding the Earth's climate and the mysteries of our galaxy.
A Layered Solution for Supercomputing Storage
Grider, Gary
2018-06-13
To solve the supercomputing challenge of memory keeping up with processing speed, a team at Los Alamos National Laboratory developed two innovative memory management and storage technologies. Burst buffers peel off data onto flash memory to support the checkpoint/restart paradigm of large simulations. MarFS adds a thin software layer enabling a new tier for campaign storage (based on inexpensive, failure-prone disk drives) between disk drives and tape archives.
A Long History of Supercomputing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grider, Gary
As part of its national security science mission, Los Alamos National Laboratory and HPC have a long, entwined history dating back to the earliest days of computing. From bringing the first problem to the nation’s first computer to building the first machine to break the petaflop barrier, Los Alamos holds many “firsts” in HPC breakthroughs. Today, supercomputers are integral to stockpile stewardship and the Laboratory continues to work with vendors in developing the future of HPC.
Introducing Argonne’s Theta Supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
Theta, the Argonne Leadership Computing Facility’s (ALCF) new Intel-Cray supercomputer, is officially open to the research community. Theta’s massively parallel, many-core architecture puts the ALCF on the path to Aurora, the facility’s future Intel-Cray system. Capable of nearly 10 quadrillion calculations per second, Theta enables researchers to break new ground in scientific investigations that range from modeling the inner workings of the brain to developing new materials for renewable energy applications.
NASA Advanced Supercomputing Facility Expansion
NASA Technical Reports Server (NTRS)
Thigpen, William W.
2017-01-01
The NASA Advanced Supercomputing (NAS) Division enables advances in high-end computing technologies and in modeling and simulation methods to tackle some of the toughest science and engineering challenges facing NASA today. The name "NAS" has long been associated with leadership and innovation throughout the high-end computing (HEC) community. We play a significant role in shaping HEC standards and paradigms, and provide leadership in the areas of large-scale InfiniBand fabrics, Lustre open-source filesystems, and hyperwall technologies. We provide an integrated high-end computing environment to accelerate NASA missions and make revolutionary advances in science. Pleiades, a petaflop-scale supercomputer, is used by scientists throughout the U.S. to support NASA missions, and is ranked among the most powerful systems in the world. One of our key focus areas is in modeling and simulation to support NASA's real-world engineering applications and make fundamental advances in modeling and simulation methods.
ParaBTM: A Parallel Processing Framework for Biomedical Text Mining on Supercomputers.
Xing, Yuting; Wu, Chengkun; Yang, Xi; Wang, Wei; Zhu, En; Yin, Jianping
2018-04-27
A prevailing way of extracting valuable information from biomedical literature is to apply text mining methods on unstructured texts. However, the massive amount of literature that needs to be analyzed poses a big data challenge to the processing efficiency of text mining. In this paper, we address this challenge by introducing parallel processing on a supercomputer. We developed paraBTM, a runnable framework that enables parallel text mining on the Tianhe-2 supercomputer. It employs a low-cost yet effective load balancing strategy to maximize the efficiency of parallel processing. We evaluated the performance of paraBTM on several datasets, utilizing three types of named entity recognition tasks as demonstration. Results show that, in most cases, the processing efficiency can be greatly improved with parallel processing, and the proposed load balancing strategy is simple and effective. In addition, our framework can be readily applied to other tasks of biomedical text mining besides NER.
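The abstract does not say which load balancing strategy paraBTM uses, so the following is only a generic sketch of one low-cost approach: a greedy longest-first assignment that always hands the next-largest document to the currently lightest worker. The function name `balance`, the document sizes, and the worker count are all hypothetical.

```python
import heapq

def balance(sizes, n_workers):
    """Assign items to workers, largest first, always picking the lightest worker."""
    heap = [(0, w) for w in range(n_workers)]   # entries are (total load, worker id)
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_workers)]
    # Visit items from largest to smallest so big items are spread out first.
    for item in sorted(range(len(sizes)), key=lambda i: sizes[i], reverse=True):
        load, w = heapq.heappop(heap)           # lightest worker so far
        assignment[w].append(item)
        heapq.heappush(heap, (load + sizes[item], w))
    return assignment

doc_sizes = [90, 10, 40, 40, 30, 20, 70]        # e.g. document lengths in kilobytes
parts = balance(doc_sizes, 3)
loads = [sum(doc_sizes[i] for i in p) for p in parts]
```

The heuristic costs one sort plus one heap operation per item, which is negligible next to the text mining itself; it assumes that processing time grows roughly with document size.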
Graphics supercomputer for computational fluid dynamics research
NASA Astrophysics Data System (ADS)
Liaw, Goang S.
1994-11-01
The objective of this project is to purchase a state-of-the-art graphics supercomputer to improve the Computational Fluid Dynamics (CFD) research capability at Alabama A & M University (AAMU) and to support the Air Force research projects. A cutting-edge graphics supercomputer system, Onyx VTX, from Silicon Graphics Computer Systems (SGI), was purchased and installed. Other equipment, including a desktop personal computer (PC-486 DX2 with a built-in 10-BaseT Ethernet card), a 10-BaseT hub, an Apple Laser Printer Select 360, and a notebook computer from Zenith, was also purchased. A reading room has been converted to a research computer lab by adding some furniture and an air conditioning unit in order to provide an appropriate working environment for researchers and the purchased equipment. All the purchased equipment was successfully installed and is fully functional. Several research projects, including two existing Air Force projects, are being performed using these facilities.
Modelling sodium cobaltate by mapping onto magnetic Ising model
NASA Astrophysics Data System (ADS)
Gemperline, Patrick; Morris, David Jonathan Pryce
Fast ion conductors are a class of crystals that are frequently used as battery materials, especially in smart phones, laptops, and other portable devices. Sodium Cobalt Oxide, NaxCoO2, falls into this class of crystals, but is unique because it possesses the ability to act as a thermoelectric material and a superconductor at different concentrations of Na+. The crystal lattice is mapped onto an Ising magnetic spin model and a Monte Carlo simulation is used to find the most energetically favorable configuration of spins. This spin configuration is mapped back to the crystal lattice, resulting in the most stable crystal structure of Sodium Cobalt Oxide at various concentrations. Knowing the atomic structures of the crystals will aid in the research of the material's capabilities and the possible commercial uses of the material. Ohio Supercomputer Center. 1987. Ohio Supercomputer Center. Columbus OH: Ohio Supercomputer Center. and the John Hauck Foundation.
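The mapping described above (occupied site as spin up, vacancy as spin down, then a Monte Carlo search for the lowest-energy arrangement) can be sketched with a toy lattice gas. Everything specific here is an assumption made for illustration and not taken from the abstract: a periodic square lattice instead of the real sodium layer geometry, a nearest-neighbour repulsion `J` only, swap moves that keep the concentration fixed, and a geometric cooling schedule. The names `energy` and `anneal` are hypothetical.

```python
import numpy as np

def energy(occ, J=1.0):
    """Nearest-neighbour repulsion on a periodic square lattice, each bond counted once."""
    vertical = np.sum(occ * np.roll(occ, 1, axis=0))
    horizontal = np.sum(occ * np.roll(occ, 1, axis=1))
    return J * float(vertical + horizontal)

def anneal(L=8, filling=0.5, steps=20000, T0=2.0, T1=0.05, J=1.0, seed=0):
    """Metropolis annealing with swap moves; returns the best configuration seen."""
    rng = np.random.default_rng(seed)
    occ = np.zeros(L * L, dtype=int)
    occ[: int(round(filling * L * L))] = 1      # fixed number of occupied sites
    rng.shuffle(occ)
    occ = occ.reshape(L, L)
    e = energy(occ, J)
    e_start = e
    best, e_best = occ.copy(), e
    for step in range(steps):
        T = T0 * (T1 / T0) ** (step / (steps - 1))   # geometric cooling from T0 to T1
        (x1, y1), (x2, y2) = rng.integers(0, L, size=(2, 2))
        if occ[x1, y1] == occ[x2, y2]:
            continue                                 # swapping equal sites changes nothing
        occ[x1, y1], occ[x2, y2] = occ[x2, y2], occ[x1, y1]
        e_new = energy(occ, J)
        if e_new <= e or rng.random() < np.exp(-(e_new - e) / T):
            e = e_new                                # accept the swap
            if e < e_best:
                best, e_best = occ.copy(), e
        else:
            occ[x1, y1], occ[x2, y2] = occ[x2, y2], occ[x1, y1]  # reject: undo the swap
    return best, e_best, e_start

best, e_best, e_start = anneal()
```

The spin variable of the Ising picture is s = 2n - 1 for occupancy n, so the swap moves conserve total magnetization in the same way that they conserve sodium concentration. A realistic study would replace the lattice and the interaction with ones fitted to the material.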
The architecture of tomorrow's massively parallel computer
NASA Technical Reports Server (NTRS)
Batcher, Ken
1987-01-01
Goodyear Aerospace delivered the Massively Parallel Processor (MPP) to NASA/Goddard in May 1983, over three years ago. Ever since then, Goodyear has tried to look in a forward direction. There is always some debate as to which way is forward when it comes to supercomputer architecture. Improvements to the MPP's massively parallel architecture are discussed in the areas of data I/O, memory capacity, connectivity, and indirect (or local) addressing. In I/O, transfer rates up to 640 megabytes per second can be achieved. There are devices that can supply the data and accept it at this rate. The memory capacity can be increased up to 128 megabytes in the ARU and over a gigabyte in the staging memory. For connectivity, there are several different kinds of multistage networks that should be considered.
NASA Technical Reports Server (NTRS)
Hasler, A. F.
1999-01-01
The Etheater presents visualizations which span the period from the original Suomi/Hasler animations of the first ATS-1 GEO weather satellite images in 1966 ... to the latest 1999 NASA Earth Science Vision for the next 25 years. Hot off the SGI-Onyx Graphics-Supercomputer are NASA's visualizations of Hurricanes Mitch, Georges, Fran and Linda. These storms have been recently featured on the covers of National Geographic, Time, Newsweek and Popular Science. Highlights will be shown from the NASA hurricane visualization resource video tape that has been used repeatedly this season on National and International network TV. Results will be presented from a new paper on automatic wind measurements in Hurricane Luis from 1-min GOES images that appeared in the November BAMS.
NASA Technical Reports Server (NTRS)
Hasler, A. Fritz; Allen, Jesse
1999-01-01
The Etheater presents visualizations which span the period from the original Suomi/Hasler animations of the first ATS-1 GEO weather satellite images in 1966 ... to the latest 1999 NASA Earth Science Vision for the next 25 years. Hot off the SGI-Onyx Graphics-Supercomputer are NASA's visualizations of Hurricanes Mitch, Georges, Fran and Linda. These storms have been recently featured on the covers of National Geographic, Time, Newsweek and Popular Science. Highlights will be shown from the NASA hurricane visualization resource video tape in standard and HDTV that has been used repeatedly this season on National and International network TV. Results will be presented from a new paper on automatic wind measurements in Hurricane Luis from 1-min GOES images that appeared in the November BAMS.
Information technologies for astrophysics circa 2001
NASA Technical Reports Server (NTRS)
Denning, Peter J.
1991-01-01
It is easy to extrapolate current trends to see where technologies relating to information systems in astrophysics and other disciplines will be by the end of the decade. These technologies include miniaturization, multiprocessing, software technology, networking, databases, graphics, pattern computation, and interdisciplinary studies. It is less easy to see what limits our current paradigms place on our thinking about technologies that will allow us to understand the laws governing very large systems about which we have large data sets. Three limiting paradigms are as follows: saving all the bits collected by instruments or generated by supercomputers; obtaining technology for information compression, storage, and retrieval off the shelf; and the linear model of innovation. We must extend these paradigms to meet our goals for information technology at the end of the decade.
NASA Technical Reports Server (NTRS)
Sanz, J.; Pischel, K.; Hubler, D.
1992-01-01
An application for parallel computation on a combined cluster of powerful workstations and supercomputers was developed. Parallel Virtual Machine (PVM) is used as the message-passing layer for a macro-tasking parallelization of the Aerodynamic Inverse Design and Analysis for a Full Engine computer code. The heterogeneous nature of the cluster is perfectly handled by the controlling host machine. Communication is established via Ethernet with the TCP/IP protocol over an open network. Internode communication imposes a reasonable overhead, allowing efficient utilization of the engaged processors. Perhaps one of the most interesting features of the system is its versatility, which permits the use of whichever available computational resources are experiencing less use at a given point in time.
High Performance Computing Software Applications for Space Situational Awareness
NASA Astrophysics Data System (ADS)
Giuliano, C.; Schumacher, P.; Matson, C.; Chun, F.; Duncan, B.; Borelli, K.; Desonia, R.; Gusciora, G.; Roe, K.
The High Performance Computing Software Applications Institute for Space Situational Awareness (HSAI-SSA) has completed its first full year of applications development. The emphasis of our work in this first year was on improving space surveillance sensor models and image enhancement software. These applications are the Space Surveillance Network Analysis Model (SSNAM), the Air Force Space Fence simulation (SimFence), and the physically constrained iterative de-convolution (PCID) image enhancement software tool. Specifically, we have demonstrated order-of-magnitude speed-up in those codes running on the latest Cray XD-1 Linux supercomputer (Hoku) at the Maui High Performance Computing Center. The software application improvements that HSAI-SSA has made have had a significant impact for the warfighter and have fundamentally changed the role of high performance computing in SSA.
On multigrid methods for the Navier-Stokes Computer
NASA Technical Reports Server (NTRS)
Nosenchuck, D. M.; Krist, S. E.; Zang, T. A.
1988-01-01
The overall architecture of the multipurpose parallel-processing Navier-Stokes Computer (NSC) being developed by Princeton and NASA Langley (Nosenchuck et al., 1986) is described and illustrated with extensive diagrams, and the NSC implementation of an elementary multigrid algorithm for simulating isotropic turbulence (based on solution of the incompressible time-dependent Navier-Stokes equations with constant viscosity) is characterized in detail. The present NSC design concept calls for 64 nodes, each with the performance of a class VI supercomputer, linked together by a fiber-optic hypercube network and joined to a front-end computer by a global bus. In this configuration, the NSC would have a storage capacity of over 32 Gword and a peak speed of over 40 Gflops. The multigrid Navier-Stokes code discussed would give sustained operation rates of about 25 Gflops.
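The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes a standard binary hypercube, in which 64 nodes form a 6-dimensional cube and neighbouring nodes differ in exactly one address bit; the abstract does not describe the NSC's actual addressing or routing, so the function names and the addressing scheme are illustrative only.

```python
# Figures taken from the abstract; everything else is an illustrative assumption.
NODES = 64               # nodes in the NSC design concept
PEAK_GFLOPS = 40.0       # "over 40 Gflops" peak, used here as a lower bound
SUSTAINED_GFLOPS = 25.0  # "about 25 Gflops" sustained for the multigrid code


def hypercube_dimension(nodes: int) -> int:
    """Return d such that nodes == 2**d, or raise if nodes is not a power of two."""
    dim = nodes.bit_length() - 1
    if nodes < 1 or (1 << dim) != nodes:
        raise ValueError("a binary hypercube needs a power-of-two node count")
    return dim


def neighbors(node: int, dim: int) -> list[int]:
    """Neighbours of a node in a binary hypercube: flip one address bit at a time."""
    return [node ^ (1 << bit) for bit in range(dim)]


dim = hypercube_dimension(NODES)              # links per node
total_links = NODES * dim // 2                # each link is shared by two nodes
per_node_peak = PEAK_GFLOPS / NODES           # Gflops per node, lower bound
sustained_fraction = SUSTAINED_GFLOPS / PEAK_GFLOPS  # upper bound, since peak is "over 40"

print(dim, total_links, per_node_peak, sustained_fraction)
```

Under these assumptions each node has 6 hypercube links, and the quoted sustained rate corresponds to at most about 62.5% of the quoted peak.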
NASA Technical Reports Server (NTRS)
Phillips, Jennifer K.
1995-01-01
Two of the current and most popular implementations of the Message-Passing Standard, Message Passing Interface (MPI), were contrasted: MPICH by Argonne National Laboratory, and LAM by the Ohio Supercomputer Center at Ohio State University. A parallel skyline matrix solver was adapted to be run in a heterogeneous environment using MPI. The Message-Passing Interface Forum was held in May 1994, which led to a specification of library functions that implement the message-passing model of parallel communication. LAM, which creates its own environment, is more robust in a highly heterogeneous network. MPICH uses the environment native to the machine architecture. While neither of these free-ware implementations provides the performance of native message-passing or vendor's implementations, MPICH begins to approach that performance on the SP-2. The machines used in this study were: IBM RS6000, 3 Sun4, SGI, and the IBM SP-2. Each machine is unique and a few machines required specific modifications during the installation. When installed correctly, both implementations worked well with only minor problems.
US Department of Energy High School Student Supercomputing Honors Program: A follow-up assessment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1987-01-01
The US DOE High School Student Supercomputing Honors Program was designed to recognize high school students with superior skills in mathematics and computer science and to provide them with formal training and experience with advanced computer equipment. This document reports on the participants who attended the first such program, which was held at the National Magnetic Fusion Energy Computer Center at the Lawrence Livermore National Laboratory (LLNL) during August 1985.
Green Supercomputing at Argonne
Beckman, Pete
2018-02-07
Pete Beckman, head of Argonne's Leadership Computing Facility (ALCF), talks about Argonne National Laboratory's green supercomputing: everything from designing algorithms to use fewer kilowatts per operation to using cold Chicago winter air to cool the machine more efficiently. Argonne was recognized for green computing in the 2009 HPCwire Readers Choice Awards. More at http://www.anl.gov/Media_Center/News/2009/news091117.html Read more about the Argonne Leadership Computing Facility at http://www.alcf.anl.gov/
Unified, Cross-Platform, Open-Source Library Package for High-Performance Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kozacik, Stephen
Compute power is continually increasing, but this increased performance is largely found in sophisticated computing devices and supercomputer resources that are difficult to use, resulting in under-utilization. We developed a unified set of programming tools that will allow users to take full advantage of the new technology by allowing them to work at a level abstracted away from the platform specifics, encouraging the use of modern computing systems, including government-funded supercomputer facilities.
NASA Astrophysics Data System (ADS)
Tripathi, Vijay S.; Yeh, G. T.
1993-06-01
Sophisticated and highly computation-intensive models of transport of reactive contaminants in groundwater have been developed in recent years. Application of such models to real-world contaminant transport problems, e.g., simulation of groundwater transport of 10-15 chemically reactive elements (e.g., toxic metals) and relevant complexes and minerals in two and three dimensions over a distance of several hundred meters, requires high-performance computers including supercomputers. Although not widely recognized as such, the computational complexity and demand of these models compare with well-known computation-intensive applications including weather forecasting and quantum chemical calculations. A survey of the performance of a variety of available hardware, as measured by the run times for a reactive transport model HYDROGEOCHEM, showed that while supercomputers provide the fastest execution times for such problems, relatively low-cost reduced instruction set computer (RISC) based scalar computers provide the best performance-to-price ratio. Because supercomputers like the Cray X-MP are inherently multiuser resources, often the RISC computers also provide much better turnaround times. Furthermore, RISC-based workstations provide the best platforms for "visualization" of groundwater flow and contaminant plumes. The most notable result, however, is that current workstations costing less than $10,000 provide performance within a factor of 5 of a Cray X-MP.
NASA Astrophysics Data System (ADS)
Gur, M.; Zomot, E.; Bahar, I.
2013-09-01
The Anton supercomputing technology recently developed for efficient molecular dynamics simulations permits us to examine micro- to milli-second events at full atomic resolution for proteins in explicit water and lipid bilayer. It also permits us to investigate to what extent the collective motions predicted by network models (that have found broad use in molecular biophysics) agree with those exhibited by full-atomic long simulations. The present study focuses on Anton trajectories generated for two systems: the bovine pancreatic trypsin inhibitor, and an archaeal aspartate transporter, GltPh. The former, a thoroughly studied system, helps benchmark the method of comparative analysis, and the latter provides new insights into the mechanism of function of glutamate transporters. The principal modes of motion derived from both simulations closely overlap with those predicted for each system by the anisotropic network model (ANM). Notably, the ANM modes define the collective mechanisms, or the pathways on conformational energy landscape, that underlie the passage between the crystal structure and substates visited in simulations. In particular, the lowest frequency ANM modes facilitate the conversion between the most probable substates, lending support to the view that easy access to functional substates is a robust determinant of evolutionarily selected native contact topology.
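For readers unfamiliar with the anisotropic network model, the following is a minimal sketch of the standard ANM construction: nodes within a cutoff distance are joined by identical springs, and the eigenvectors of the resulting Hessian are the predicted modes. It also shows the usual normalized overlap between a mode and a displacement vector. The coordinates, cutoff and spring constant are arbitrary placeholders, not values from the study, and the abstract does not state which overlap measure the authors used.

```python
import numpy as np


def anm_hessian(coords: np.ndarray, cutoff: float, gamma: float = 1.0) -> np.ndarray:
    """Build the 3N x 3N ANM Hessian for N nodes joined by springs within `cutoff`."""
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist2 = d @ d
            if dist2 > cutoff ** 2:
                continue  # no spring between distant nodes
            block = -gamma * np.outer(d, d) / dist2
            hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            hessian[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            # Diagonal super-elements are minus the sum of the off-diagonal ones.
            hessian[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
            hessian[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return hessian


def overlap(mode: np.ndarray, displacement: np.ndarray) -> float:
    """Normalized overlap |cos(angle)| between a mode and a displacement vector."""
    return abs(mode @ displacement) / (
        np.linalg.norm(mode) * np.linalg.norm(displacement)
    )


# Toy input: 8 random points, all within the cutoff (placeholder values).
rng = np.random.default_rng(0)
coords = rng.random((8, 3)) * 10.0
H = anm_hessian(coords, cutoff=100.0)
evals, evecs = np.linalg.eigh(H)

# A connected, rigid network has six zero modes (rigid translations and rotations).
n_zero = int(np.sum(np.abs(evals) < 1e-8))
lowest_internal_mode = evecs[:, n_zero]
print(n_zero, overlap(lowest_internal_mode, lowest_internal_mode))
```

In a comparison like the one described, `displacement` would be the difference vector between two conformations (for example a crystal structure and a simulated substate) after superposition.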
VRML and Collaborative Environments: New Tools for Networked Visualization
NASA Astrophysics Data System (ADS)
Crutcher, R. M.; Plante, R. L.; Rajlich, P.
We present two new applications that engage the network as a tool for astronomical research and/or education. The first is a VRML server which allows users over the Web to interactively create three-dimensional visualizations of FITS images contained in the NCSA Astronomy Digital Image Library (ADIL). The server's Web interface allows users to select images from the ADIL, fill in processing parameters, and create renderings featuring isosurfaces, slices, contours, and annotations; the often extensive computations are carried out on an NCSA SGI supercomputer server without the user having an individual account on the system. The user can then download the 3D visualizations as VRML files, which may be rotated and manipulated locally on virtually any class of computer. The second application is the ADILBrowser, a part of the NCSA Horizon Image Data Browser Java package. ADILBrowser allows a group of participants to browse images from the ADIL within a collaborative session. The collaborative environment is provided by the NCSA Habanero package which includes text and audio chat tools and a white board. The ADILBrowser is just an example of a collaborative tool that can be built with the Horizon and Habanero packages. The classes provided by these packages can be assembled to create custom collaborative applications that visualize data either from local disk or from anywhere on the network.
High temporal resolution mapping of seismic noise sources using heterogeneous supercomputers
NASA Astrophysics Data System (ADS)
Gokhberg, Alexey; Ermert, Laura; Paitz, Patrick; Fichtner, Andreas
2017-04-01
Time- and space-dependent distribution of seismic noise sources is becoming a key ingredient of modern real-time monitoring of various geo-systems. Significant interest in seismic noise source maps with high temporal resolution (days) is expected to come from a number of domains, including natural resources exploration, analysis of active earthquake fault zones and volcanoes, as well as geothermal and hydrocarbon reservoir monitoring. Currently, knowledge of noise sources is insufficient for high-resolution subsurface monitoring applications. Near-real-time seismic data, as well as advanced imaging methods to constrain seismic noise sources, have recently become available. These methods are based on the massive cross-correlation of seismic noise records from all available seismic stations in the region of interest and are therefore very computationally intensive. Heterogeneous massively parallel supercomputing systems introduced in recent years combine conventional multi-core CPUs with GPU accelerators and provide an opportunity for a manifold increase in computing performance. Therefore, these systems represent an efficient platform for implementation of a noise source mapping solution. We present the first results of an ongoing research project conducted in collaboration with the Swiss National Supercomputing Centre (CSCS). The project aims at building a service that provides seismic noise source maps for Central Europe with high temporal resolution (days to a few weeks depending on frequency and data availability). The service is hosted on the CSCS computing infrastructure; all computationally intensive processing is performed on the massively parallel heterogeneous supercomputer "Piz Daint". The solution architecture is based on the Application-as-a-Service concept in order to provide interested external researchers with regular access to the noise source maps.
The solution architecture includes the following sub-systems: (1) data acquisition responsible for collecting, on a periodic basis, raw seismic records from the European seismic networks, (2) high-performance noise source mapping application responsible for generation of source maps using cross-correlation of seismic records, (3) back-end infrastructure for the coordination of various tasks and computations, (4) front-end Web interface providing the service to the end-users and (5) data repository. The noise mapping application is composed of four principal modules: (1) pre-processing of raw data, (2) massive cross-correlation, (3) post-processing of correlation data based on computation of logarithmic energy ratio and (4) generation of source maps from post-processed data. Implementation of the solution posed various challenges, in particular, selection of data sources and transfer protocols, automation and monitoring of daily data downloads, ensuring the required data processing performance, design of a general service oriented architecture for coordination of various sub-systems, and engineering an appropriate data storage solution. The present pilot version of the service implements noise source maps for Switzerland. Extension of the solution to Central Europe is planned for the next project phase.
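The cross-correlation and post-processing modules listed above can be illustrated on a single station pair. The sketch below cross-correlates two synthetic records with an FFT and then computes a logarithmic energy ratio, taken here to be the natural log of the energy on the positive-lag branch over the energy on the negative-lag branch. That definition is an assumption, since the abstract does not spell it out, and the synthetic data, function names and CPU-only NumPy implementation are illustrative rather than the project's GPU code.

```python
import numpy as np


def cross_correlate(a: np.ndarray, b: np.ndarray):
    """Linear cross-correlation of two equal-length records via FFT.

    Returns (lags, cc) with cc[k] = sum_n a[n + lag_k] * b[n].
    """
    n = len(a)
    nfft = 2 * n  # zero-pad so the circular correlation does not wrap around
    spectrum = np.fft.rfft(a, nfft) * np.conj(np.fft.rfft(b, nfft))
    cc = np.fft.irfft(spectrum, nfft)
    # Reorder so that lags run from -(n - 1) to +(n - 1).
    cc = np.concatenate((cc[-(n - 1):], cc[:n]))
    lags = np.arange(-(n - 1), n)
    return lags, cc


def log_energy_ratio(lags: np.ndarray, cc: np.ndarray) -> float:
    """ln(E_positive / E_negative); the sign hints at the dominant propagation direction."""
    positive = np.sum(cc[lags > 0] ** 2)
    negative = np.sum(cc[lags < 0] ** 2)
    return float(np.log(positive / negative))


# Synthetic example: record `a` is record `b` delayed by 5 samples.
rng = np.random.default_rng(0)
delay = 5
b = rng.standard_normal(1024)
a = np.concatenate((np.zeros(delay), b[:-delay]))

lags, cc = cross_correlate(a, b)
peak_lag = int(lags[np.argmax(cc)])
ratio = log_energy_ratio(lags, cc)
print(peak_lag, ratio)
```

For N stations this step is repeated for every one of the N(N-1)/2 station pairs and for every time window, which is why the abstract calls the method very computationally intensive.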
Continuum-based DFN-consistent simulations of oxygen ingress in fractured crystalline rocks
NASA Astrophysics Data System (ADS)
Trinchero, P.; Puigdomenech, I.; Molinero, J.; Ebrahimi, H.; Gylling, B.; Svensson, U.; Bosbach, D.; Deissmann, G.
2016-12-01
The potential transient infiltration of oxygenated glacial meltwater into initially anoxic and reducing fractured crystalline rocks during glaciation events is an issue of concern for some of the prospected deep geological repositories for spent nuclear fuel. Here, this problem is assessed using reactive transport calculations. First, a novel parameterisation procedure is presented, where flow, transport and geochemical parameters (i.e. hydraulic conductivity, effective/kinetic porosity, and mineral specific surface and abundance) are defined on a finite volume numerical grid based on the (spatially varying) properties of an underlying Discrete Fracture Network (DFN). Second, using this approach, a realistic reactive transport model of Forsmark, i.e. the selected site for the proposed Swedish spent nuclear fuel repository, is implemented. The model consists of more than 70 million geochemical transport degrees of freedom and simulates the ingress of oxygen-rich water from the recharge area of the domain and its depletion due to reactions with the Fe(II) mineral chlorite. Third, the calculations are solved in the supercomputer JUQUEEN of the Jülich Supercomputing Centre. The results of the simulations show that oxygen infiltrates relatively quickly along fractures and deformation zones until a steady state profile is reached, where geochemical reactions counterbalance advective transport processes. Interestingly, most of the iron-bearing minerals are consumed in the highly conductive zones, where larger mineral surfaces are available for reactions. An analysis based on mineral mass balance shows that the considered rock medium has enough capacity to buffer oxygen infiltration for a long period of time (i.e. some thousand years).
New Computer Simulations of Macular Neural Functioning
NASA Technical Reports Server (NTRS)
Ross, Muriel D.; Doshay, D.; Linton, S.; Parnas, B.; Montgomery, K.; Chimento, T.
1994-01-01
We use high performance graphics workstations and supercomputers to study the functional significance of the three-dimensional (3-D) organization of gravity sensors. These sensors have a prototypic architecture foreshadowing more complex systems. Scaled-down simulations run on a Silicon Graphics workstation and scaled-up, 3-D versions run on a Cray Y-MP supercomputer. A semi-automated method of reconstruction of neural tissue from serial sections studied in a transmission electron microscope has been developed to eliminate tedious conventional photography. The reconstructions use a mesh as a step in generating a neural surface for visualization. Two meshes are required to model calyx surfaces. The meshes are connected and the resulting prisms represent the cytoplasm and the bounding membranes. A finite volume analysis method is employed to simulate voltage changes along the calyx in response to synapse activation on the calyx or on calyceal processes. The finite volume method ensures that charge is conserved at the calyx-process junction. These and other models indicate that efferent processes act as voltage followers, and that the morphology of some afferent processes affects their functioning. In a final application, morphological information is symbolically represented in three dimensions in a computer. The possible functioning of the connectivities is tested using mathematical interpretations of physiological parameters taken from the literature. Symbolic, 3-D simulations are in progress to probe the functional significance of the connectivities. This research is expected to advance computer-based studies of macular functioning and of synaptic plasticity.
Development of seismic tomography software for hybrid supercomputers
NASA Astrophysics Data System (ADS)
Nikitin, Alexandr; Serdyukov, Alexandr; Duchkov, Anton
2015-04-01
Seismic tomography is a technique used for computing a velocity model of a geologic structure from first-arrival travel times of seismic waves. The technique is used in processing of regional and global seismic data, in seismic exploration for prospecting and exploration of mineral and hydrocarbon deposits, and in seismic engineering for monitoring the condition of engineering structures and the surrounding host medium. As a consequence of the development of seismic monitoring systems and the increasing volume of seismic data, there is a growing need for new, more effective computational algorithms for use in seismic tomography applications with improved performance, accuracy and resolution. To achieve this goal, it is necessary to use modern high performance computing systems, such as supercomputers with hybrid architecture that use not only CPUs, but also accelerators and co-processors for computation. The goal of this research is the development of parallel seismic tomography algorithms and a software package for such systems, to be used in processing of large volumes of seismic data (hundreds of gigabytes and more). These algorithms and the software package will be optimized for the most common computing devices used in modern hybrid supercomputers, such as Intel Xeon CPUs, NVIDIA Tesla accelerators and Intel Xeon Phi co-processors. In this work, the following general scheme of seismic tomography is utilized. Using the eikonal equation solver, arrival times of seismic waves are computed based on an assumed velocity model of the geologic structure being analyzed. In order to solve the linearized inverse problem, a tomographic matrix is computed that connects model adjustments with travel time residuals, and the resulting system of linear equations is regularized and solved to adjust the model. The effectiveness of parallel implementations of existing algorithms on target architectures is considered.
During the first stage of this work, algorithms were developed for execution on supercomputers using multicore CPUs only, with preliminary performance tests showing good parallel efficiency on large numerical grids. Porting of the algorithms to hybrid supercomputers is currently ongoing.
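The linearized inversion step in the scheme above (a tomographic matrix linking model adjustments to travel-time residuals, then a regularized solve) can be sketched on a toy problem. The abstract does not say which regularization the authors use; the example below swaps in plain Tikhonov damping as one common choice, and the matrix, damping values and variable names are made up for illustration. The eikonal solver that would supply the ray paths and residuals is not shown.

```python
import numpy as np


def damped_least_squares(G: np.ndarray, residuals: np.ndarray, damping: float) -> np.ndarray:
    """Solve min ||G dm - r||^2 + damping^2 ||dm||^2 via an augmented system."""
    n_cells = G.shape[1]
    augmented = np.vstack((G, damping * np.eye(n_cells)))
    rhs = np.concatenate((residuals, np.zeros(n_cells)))
    dm, *_ = np.linalg.lstsq(augmented, rhs, rcond=None)
    return dm


# Toy tomographic matrix: 4 rays crossing 3 cells; entries are path lengths (km).
G = np.array([
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
])
true_dm = np.array([0.02, -0.01, 0.03])  # slowness perturbations (s/km), hypothetical
residuals = G @ true_dm                  # noise-free travel-time residuals (s)

dm_weak = damped_least_squares(G, residuals, damping=1e-6)
dm_strong = damped_least_squares(G, residuals, damping=1.0)
print(dm_weak, dm_strong)
```

With negligible damping the toy model is recovered almost exactly; stronger damping pulls the update toward zero, which is the stabilizing trade-off that regularization buys when the real system is large, sparse and poorly conditioned.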
NASA Astrophysics Data System (ADS)
Yamamoto, H.; Nakajima, K.; Zhang, K.; Nanai, S.
2015-12-01
Powerful numerical codes that are capable of modeling complex coupled processes of physics and chemistry have been developed for predicting the fate of CO2 in reservoirs as well as its potential impacts on groundwater and subsurface environments. However, they are often computationally demanding when solving highly non-linear models at sufficient spatial and temporal resolutions. Geological heterogeneity and uncertainties further increase the challenges in modeling work. Two-phase flow simulations in heterogeneous media usually require much longer computational time than those in homogeneous media. Uncertainties in reservoir properties may necessitate stochastic simulations with multiple realizations. Recently, massively parallel supercomputers with thousands of processors or more have become available to the scientific and engineering communities. Such supercomputers may attract attention from geoscientists and reservoir engineers for solving large, non-linear models at higher resolutions within a reasonable time. However, to make them a useful tool, it is essential to tackle several practical obstacles to using a large number of processors effectively for general-purpose reservoir simulators. We have implemented massively parallel versions of two TOUGH2 family codes (a multi-phase flow simulator TOUGH2 and a chemically reactive transport simulator TOUGHREACT) on two different types (vector- and scalar-type) of supercomputers with a thousand to tens of thousands of processors. After completing implementation and extensive tune-up on the supercomputers, the computational performance was measured for three simulations with multi-million grid models, including a simulation of the dissolution-diffusion-convection process that requires high spatial and temporal resolutions to simulate the growth of small convective fingers of CO2-dissolved water to larger ones at reservoir scale.
The performance measurements confirmed that both simulators exhibit excellent scalability, showing almost linear speedup with the number of processors up to over ten thousand cores. Generally this allows us to perform coupled multi-physics (THC) simulations on high-resolution geologic models with multi-million grids in a practical time (e.g., less than a second per time step).
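To give a feel for what "almost linear speedup" up to ten thousand cores demands of a code, the sketch below uses Amdahl's law, a simple textbook model that the abstract itself does not invoke. The 90% parallel-efficiency threshold is an arbitrary assumption chosen for illustration; it is not a figure reported by the authors.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Amdahl's law: speedup when a fixed fraction of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def max_serial_fraction(efficiency: float, processors: int) -> float:
    """Largest serial fraction that still reaches the given parallel efficiency.

    From E = S / p = 1 / (f * (p - 1) + 1), so f = (1 / E - 1) / (p - 1).
    """
    return (1.0 / efficiency - 1.0) / (processors - 1)


cores = 10_000           # order of magnitude quoted in the abstract
target_efficiency = 0.9  # assumed meaning of "almost linear"

f_max = max_serial_fraction(target_efficiency, cores)
achieved = amdahl_speedup(f_max, cores) / cores
print(f"serial fraction must stay below about {f_max:.2e}")
```

Under this model, holding 90% efficiency on 10,000 cores leaves room for a serial (or otherwise non-scaling) share of roughly one part in 100,000, which is why the extensive tune-up mentioned above matters.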
Comparison of neuronal spike exchange methods on a Blue Gene/P supercomputer.
Hines, Michael; Kumar, Sameer; Schürmann, Felix
2011-01-01
For neural network simulations on parallel machines, interprocessor spike communication can be a significant portion of the total simulation time. The performance of several spike exchange methods using a Blue Gene/P (BG/P) supercomputer has been tested with 8-128 K cores using randomly connected networks of up to 32 M cells with 1 k connections per cell and 4 M cells with 10 k connections per cell, i.e., on the order of 4·10^10 connections (K is 1024, M is 1024^2, and k is 1000). The spike exchange methods used are the standard Message Passing Interface (MPI) collective, MPI_Allgather, and several variants of the non-blocking Multisend method either implemented via non-blocking MPI_Isend, or exploiting the possibility of very low overhead direct memory access (DMA) communication available on the BG/P. In all cases, the worst performing method was that using MPI_Isend due to the high overhead of initiating a spike communication. The two best performing methods (the persistent Multisend method using the Record-Replay feature of the Deep Computing Messaging Framework DCMF_Multicast; and a two-phase multisend in which a DCMF_Multicast is used to first send to a subset of phase one destination cores, which then pass it on to their subset of phase two destination cores) had similar performance with very low overhead for the initiation of spike communication. Departure from ideal scaling for the Multisend methods is almost completely due to load imbalance caused by the large variation in number of cells that fire on each processor in the interval between synchronization. Spike exchange time itself is negligible since transmission overlaps with computation and is handled by a DMA controller. We conclude that ideal performance scaling will be ultimately limited by imbalance between incoming processor spikes between synchronization intervals.
Thus, counterintuitively, maximization of load balance requires that the distribution of cells on processors should not reflect neural net architecture but be randomly distributed so that sets of cells which are burst firing together should be on different processors with their targets on as large a set of processors as possible.
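Two points in this abstract lend themselves to a quick check: the connection counts implied by the stated units, and the load-balance argument in the final sentence. The sketch below works out both on a toy scale. The 1024-cell network, the 16 processors and the 64-cell burst group are hypothetical values chosen for illustration and have nothing to do with the BG/P runs.

```python
import numpy as np

# Units as defined in the abstract.
K, M, k = 1024, 1024 ** 2, 1000

connections_large = 32 * M * 1 * k   # 32 M cells, 1 k connections per cell
connections_dense = 4 * M * 10 * k   # 4 M cells, 10 k connections per cell


def max_spikes_per_processor(owner: np.ndarray, firing_cells: np.ndarray, n_procs: int) -> int:
    """Largest number of firing cells hosted by any single processor."""
    counts = np.bincount(owner[firing_cells], minlength=n_procs)
    return int(counts.max())


# Toy setup (hypothetical sizes): cells 0..63 burst together.
n_cells, n_procs = 1024, 16
burst = np.arange(64)

# Layout that mirrors network structure: neighbouring cells share a processor.
block_owner = np.arange(n_cells) // (n_cells // n_procs)

# Random layout with the same number of cells per processor.
rng = np.random.default_rng(0)
random_owner = rng.permutation(block_owner)

block_max = max_spikes_per_processor(block_owner, burst, n_procs)
random_max = max_spikes_per_processor(random_owner, burst, n_procs)
print(connections_large, connections_dense, block_max, random_max)
```

In the block layout the whole burst lands on one processor, while the random layout spreads it across many, which is the effect the authors describe as counterintuitive.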
NASA Astrophysics Data System (ADS)
Anantharaj, V.; Mayer, B.; Wang, F.; Hack, J.; McKenna, D.; Hartman-Baker, R.
2012-04-01
The Oak Ridge Leadership Computing Facility (OLCF) facilitates the execution of computational experiments that require tens of millions of CPU hours (typically using thousands of processors simultaneously) while generating hundreds of terabytes of data. A set of ultra high resolution climate experiments in progress, using the Community Earth System Model (CESM), will produce over 35,000 files, ranging in sizes from 21 MB to 110 GB each. The execution of the experiments will require nearly 70 Million CPU hours on the Jaguar and Titan supercomputers at OLCF. The total volume of the output from these climate modeling experiments will be in excess of 300 TB. This model output must then be archived, analyzed, distributed to the project partners in a timely manner, and also made available more broadly. Meeting this challenge would require efficient movement of the data, staging the simulation output to a large and fast file system that provides high volume access to other computational systems used to analyze the data and synthesize results. This file system also needs to be accessible via high speed networks to an archival system that can provide long term reliable storage. Ideally this archival system is itself directly available to other systems that can be used to host services making the data and analysis available to the participants in the distributed research project and to the broader climate community. The various resources available at the OLCF now support this workflow. The available systems include the new Jaguar Cray XK6 2.63 petaflops (estimated) supercomputer, the 10 PB Spider center-wide parallel file system, the Lens/EVEREST analysis and visualization system, the HPSS archival storage system, the Earth System Grid (ESG), and the ORNL Climate Data Server (CDS). The ESG features federated services, search & discovery, extensive data handling capabilities, deep storage access, and Live Access Server (LAS) integration. 
The scientific workflow enabled on these systems, and developed as part of the Ultra-High Resolution Climate Modeling Project, allows users of OLCF resources to efficiently share simulated data, often multi-terabyte in volume, as well as the results from the modeling experiments and various synthesized products derived from these simulations. The final objective in the exercise is to ensure that the simulation results and the enhanced understanding will serve the needs of a diverse group of stakeholders across the world, including our research partners in U.S. Department of Energy laboratories & universities, domain scientists, students (K-12 as well as higher education), resource managers, decision makers, and the general public.
Global and local waveform simulations using the VERCE platform
NASA Astrophysics Data System (ADS)
Garth, Thomas; Saleh, Rafiq; Spinuso, Alessandro; Gemund, Andre; Casarotti, Emanuele; Magnoni, Federica; Krischner, Lion; Igel, Heiner; Schlichtweg, Horst; Frank, Anton; Michelini, Alberto; Vilotte, Jean-Pierre; Rietbrock, Andreas
2017-04-01
In recent years the potential to increase resolution of seismic imaging by full waveform inversion has been demonstrated on a range of scales from basin to continental scales. These techniques rely on harnessing the computational power of large supercomputers, and running large parallel codes to simulate the seismic wave field in a three-dimensional geological setting. The VERCE platform is designed to make these full waveform techniques accessible to a far wider spectrum of the seismological community. The platform supports the two widely used spectral element simulation programs SPECFEM3D Cartesian, and SPECFEM3D globe, allowing users to run a wide range of simulations. In the SPECFEM3D Cartesian implementation the user can run waveform simulations on a range of pre-loaded meshes and velocity models for specific areas, or upload their own velocity model and mesh. In the new SPECFEM3D globe implementation, the user will be able to select from a number of continent scale model regions, or perform waveform simulations for the whole earth. Earthquake focal mechanisms can be downloaded within the platform, for example from the GCMT catalogue, or users can upload their own focal mechanism catalogue through the platform. The simulations can be run on a range of European supercomputers in the PRACE network. Once a job has been submitted and run through the platform, the simulated waveforms can be manipulated or downloaded for further analysis. The misfit between the simulated and recorded waveforms can then be calculated within the platform using three interoperable workflows: raw-data access (FDSN) and caching, pre-processing, and finally misfit calculation. The last workflow makes use of the Pyflex analysis software. In addition, the VERCE platform can be used to produce animations of waveform propagation through the velocity model, and synthetic shakemaps. All these data-products are made discoverable and re-usable thanks to the VERCE data and metadata management layer.
We demonstrate the functionality of the VERCE platform with two use cases, one using the pre-loaded velocity model and mesh for the Maule area of Chile using the SPECFEM3D Cartesian workflow, and one showing the output of a global simulation using the SPECFEM3D globe workflow. It is envisioned that this tool will allow a much greater range of seismologists to access these full waveform inversion tools, and aid full waveform tomographic and source inversion, synthetic shakemap production and other full waveform applications, in a wide range of tectonic settings.
Childers, J. T.; Uram, T. D.; LeCompte, T. J.; ...
2016-09-29
As the LHC moves to higher energies and luminosity, the demand for computing resources increases accordingly and will soon outpace the growth of the Worldwide LHC Computing Grid. To meet this greater demand, event generation Monte Carlo was targeted for adaptation to run on Mira, the supercomputer at the Argonne Leadership Computing Facility. Alpgen is a Monte Carlo event generation application that is used by LHC experiments in the simulation of collisions that take place in the Large Hadron Collider. Finally, this paper details the process by which Alpgen was adapted from a single-processor serial-application to a large-scale parallel-application and the performance that was achieved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Childers, J. T.; Uram, T. D.; LeCompte, T. J.
As the LHC moves to higher energies and luminosity, the demand for computing resources increases accordingly and will soon outpace the growth of the Worldwide LHC Computing Grid. To meet this greater demand, event generation Monte Carlo was targeted for adaptation to run on Mira, the supercomputer at the Argonne Leadership Computing Facility. Alpgen is a Monte Carlo event generation application that is used by LHC experiments in the simulation of collisions that take place in the Large Hadron Collider. Finally, this paper details the process by which Alpgen was adapted from a single-processor serial-application to a large-scale parallel-application and the performance that was achieved.
Finite-volume Atmospheric Model of the IAP/LASG (FAMIL)
NASA Astrophysics Data System (ADS)
Bao, Q.
2015-12-01
The Finite-volume Atmospheric Model of the IAP/LASG (FAMIL) is introduced in this work. FAMIL has flexible horizontal and vertical resolutions of up to 25 km and 1 Pa, respectively, and currently runs on the "Tianhe 1A&2" supercomputers. FAMIL is the atmospheric component of the third-generation Flexible Global Ocean-Atmosphere-Land climate System model (FGOALS3), which will participate in the Coupled Model Intercomparison Project Phase 6 (CMIP6). In addition to describing the dynamical core and physical parameterizations of FAMIL, this talk describes the simulated characteristics of energy and water balances, precipitation, Asian Summer Monsoon and stratospheric circulation, and compares them with observational/reanalysis data. Finally, the model biases as well as possible solutions are discussed.
Using Supercomputers to Probe the Early Universe
DOE Office of Scientific and Technical Information (OSTI.GOV)
Giorgi, Elena Edi
For decades physicists have been trying to decipher the first moments after the Big Bang. Using very large telescopes, for example, scientists scan the skies and look at how fast galaxies move. Satellites study the relic radiation left from the Big Bang, called the cosmic microwave background radiation. And finally, particle colliders, like the Large Hadron Collider at CERN, allow researchers to smash protons together and analyze the debris left behind by such collisions. Physicists at Los Alamos National Laboratory, however, are taking a different approach: they are using computers. In collaboration with colleagues at University of California San Diego, the Los Alamos researchers developed a computer code, called BURST, that can simulate conditions during the first few minutes of cosmological evolution.
Developing software to use parallel processing effectively. Final report, June-December 1987
DOE Office of Scientific and Technical Information (OSTI.GOV)
Center, J.
1988-10-01
This report describes the difficulties involved in writing efficient parallel programs and describes the hardware and software support currently available for generating software that utilizes processing effectively. Historically, the processing rate of single-processor computers has increased by one order of magnitude every five years. However, this pace is slowing since electronic circuitry is coming up against physical barriers. Unfortunately, the complexity of engineering and research problems continues to require ever more processing power (far in excess of the maximum estimated 3 Gflops achievable by single-processor computers). For this reason, parallel-processing architectures are receiving considerable interest, since they offer high performance more cheaply than a single-processor supercomputer, such as the Cray.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rajbhandari, Samyam; NIkam, Akshay; Lai, Pai-Wei
Tensor contractions represent the most compute-intensive core kernels in ab initio computational quantum chemistry and nuclear physics. Symmetries in these tensor contractions make them difficult to load balance and scale to large distributed systems. In this paper, we develop an efficient and scalable algorithm to contract symmetric tensors. We introduce a novel approach that avoids data redistribution in contracting symmetric tensors while also avoiding redundant storage and maintaining load balance. We present experimental results on two parallel supercomputers for several symmetric contractions that appear in the CCSD quantum chemistry method. We also present a novel approach to tensor redistribution that can take advantage of parallel hyperplanes when the initial distribution has replicated dimensions, and use collective broadcast when the final distribution has replicated dimensions, making the algorithm very efficient.
A Long History of Supercomputing
Grider, Gary
2018-06-13
As part of its national security science mission, Los Alamos National Laboratory and HPC have a long, entwined history dating back to the earliest days of computing. From bringing the first problem to the nation’s first computer to building the first machine to break the petaflop barrier, Los Alamos holds many “firsts” in HPC breakthroughs. Today, supercomputers are integral to stockpile stewardship and the Laboratory continues to work with vendors in developing the future of HPC.
2014-09-01
simulation time frame from 30 days to one year. This was enabled by porting the simulation to the Pleiades supercomputer at NASA Ames Research Center, a...including the motivation for changes to our past approach. We then present the software implementation (3) on the NASA Ames Pleiades supercomputer...significantly updated since last year’s paper [25]. The main incentive for that was the shift to a highly parallel approach in order to utilize the Pleiades
Parallel-Vector Algorithm For Rapid Structural Analysis
NASA Technical Reports Server (NTRS)
Agarwal, Tarun R.; Nguyen, Duc T.; Storaasli, Olaf O.
1993-01-01
New algorithm developed to overcome deficiency of skyline storage scheme by use of variable-band storage scheme. Exploits both parallel and vector capabilities of modern high-performance computers. Gives engineers and designers opportunity to include more design variables and constraints during optimization of structures. Enables use of more refined finite-element meshes to obtain improved understanding of complex behaviors of aerospace structures leading to better, safer designs. Not only attractive for current supercomputers but also for next generation of shared-memory supercomputers.
Development and Applications of a Modular Parallel Process for Large Scale Fluid/Structures Problems
NASA Technical Reports Server (NTRS)
Guruswamy, Guru P.; Kwak, Dochan (Technical Monitor)
2002-01-01
A modular process that can efficiently solve large scale multidisciplinary problems using massively parallel supercomputers is presented. The process integrates disciplines with diverse physical characteristics by retaining the efficiency of individual disciplines. Computational domain independence of individual disciplines is maintained using a meta programming approach. The process integrates disciplines without affecting the combined performance. Results are demonstrated for large scale aerospace problems on several supercomputers. The super scalability and portability of the approach are demonstrated on several parallel computers.
Science and Technology Review June 2000
DOE Office of Scientific and Technical Information (OSTI.GOV)
de Pruneda, J.H.
2000-06-01
This issue contains the following articles: (1) ''Accelerating on the ASCI Challenge''. (2) ''New Day Dawns in Supercomputing'' When the ASCI White supercomputer comes online this summer, DOE's Stockpile Stewardship Program will make another significant advance toward helping to ensure the safety, reliability, and performance of the nation's nuclear weapons. (3) ''Uncovering the Secrets of Actinides'' Researchers are obtaining fundamental information about the actinides, a group of elements with a key role in nuclear weapons and fuels. (4) ''A Predictable Structure for Aerogels''. (5) ''Tibet--Where Continents Collide''.
Role of HPC in Advancing Computational Aeroelasticity
NASA Technical Reports Server (NTRS)
Guruswamy, Guru P.
2004-01-01
On behalf of the High Performance Computing and Modernization Program (HPCMP) and NASA Advanced Supercomputing Division (NAS) a study is conducted to assess the role of supercomputers on computational aeroelasticity of aerospace vehicles. The study is mostly based on the responses to a web based questionnaire that was designed to capture the nuances of high performance computational aeroelasticity, particularly on parallel computers. A procedure is presented to assign a fidelity-complexity index to each application. Case studies based on major applications using HPCMP resources are presented.
PerSEUS: Ultra-Low-Power High Performance Computing for Plasma Simulations
NASA Astrophysics Data System (ADS)
Doxas, I.; Andreou, A.; Lyon, J.; Angelopoulos, V.; Lu, S.; Pritchett, P. L.
2017-12-01
Peta-op SupErcomputing Unconventional System (PerSEUS) aims to explore the use for High Performance Scientific Computing (HPC) of ultra-low-power mixed signal unconventional computational elements developed by Johns Hopkins University (JHU), and demonstrate that capability on both fluid and particle Plasma codes. We will describe the JHU Mixed-signal Unconventional Supercomputing Elements (MUSE), and report initial results for the Lyon-Fedder-Mobarry (LFM) global magnetospheric MHD code, and a UCLA general purpose relativistic Particle-In-Cell (PIC) code.
Heart Fibrillation and Parallel Supercomputers
NASA Technical Reports Server (NTRS)
Kogan, B. Y.; Karplus, W. J.; Chudin, E. E.
1997-01-01
The Luo and Rudy 3 cardiac cell mathematical model is implemented on the parallel supercomputer CRAY T3D. The splitting algorithm combined with variable time step and an explicit method of integration provide reasonable solution times and almost perfect scaling for rectilinear wave propagation. The computer simulation makes it possible to observe new phenomena: the break-up of spiral waves caused by intracellular calcium and dynamics and the non-uniformity of the calcium distribution in space during the onset of the spiral wave.
2017-12-08
Two rows of the “Discover” supercomputer at the NASA Center for Climate Simulation (NCCS) contain more than 4,000 computer processors. Discover has a total of nearly 15,000 processors. Credit: NASA/Pat Izzo To learn more about NCCS go to: www.nasa.gov/topics/earth/features/climate-sim-center.html NASA Goddard Space Flight Center is home to the nation's largest organization of combined scientists, engineers and technologists that build spacecraft, instruments and new technology to study the Earth, the sun, our solar system, and the universe.
2017-12-08
This close-up view highlights one row—approximately 2,000 computer processors—of the “Discover” supercomputer at the NASA Center for Climate Simulation (NCCS). Discover has a total of nearly 15,000 processors. Credit: NASA/Pat Izzo To learn more about NCCS go to: www.nasa.gov/topics/earth/features/climate-sim-center.html NASA Goddard Space Flight Center is home to the nation's largest organization of combined scientists, engineers and technologists that build spacecraft, instruments and new technology to study the Earth, the sun, our solar system, and the universe.
Extreme Scale Plasma Turbulence Simulations on Top Supercomputers Worldwide
Tang, William; Wang, Bei; Ethier, Stephane; ...
2016-11-01
The goal of the extreme scale plasma turbulence studies described in this paper is to expedite the delivery of reliable predictions on confinement physics in large magnetic fusion systems by using world-class supercomputers to carry out simulations with unprecedented resolution and temporal duration. This has involved architecture-dependent optimizations of performance scaling and addressing code portability and energy issues, with the metrics for multi-platform comparisons being 'time-to-solution' and 'energy-to-solution'. Realistic results addressing how confinement losses caused by plasma turbulence scale from present-day devices to the much larger $25 billion international ITER fusion facility have been enabled by innovative advances in the GTC-P code including (i) implementation of one-sided communication from MPI 3.0 standard; (ii) creative optimization techniques on Xeon Phi processors; and (iii) development of a novel performance model for the key kernels of the PIC code. Our results show that modeling data movement is sufficient to predict performance on modern supercomputer platforms.
NASA Astrophysics Data System (ADS)
Landgrebe, Anton J.
1987-03-01
An overview of research activities at the United Technologies Research Center (UTRC) in the area of Computational Fluid Dynamics (CFD) is presented. The requirement and use of various levels of computers, including supercomputers, for the CFD activities is described. Examples of CFD directed toward applications to helicopters, turbomachinery, heat exchangers, and the National Aerospace Plane are included. Helicopter rotor codes for the prediction of rotor and fuselage flow fields and airloads were developed with emphasis on rotor wake modeling. Airflow and airload predictions and comparisons with experimental data are presented. Examples are presented of recent parabolized Navier-Stokes and full Navier-Stokes solutions for hypersonic shock-wave/boundary layer interaction, and hydrogen/air supersonic combustion. In addition, other examples of CFD efforts in turbomachinery Navier-Stokes methodology and separated flow modeling are presented. A brief discussion of the 3-tier scientific computing environment is also presented, in which the researcher has access to workstations, mid-size computers, and supercomputers.
NASA Technical Reports Server (NTRS)
Landgrebe, Anton J.
1987-01-01
An overview of research activities at the United Technologies Research Center (UTRC) in the area of Computational Fluid Dynamics (CFD) is presented. The requirement and use of various levels of computers, including supercomputers, for the CFD activities is described. Examples of CFD directed toward applications to helicopters, turbomachinery, heat exchangers, and the National Aerospace Plane are included. Helicopter rotor codes for the prediction of rotor and fuselage flow fields and airloads were developed with emphasis on rotor wake modeling. Airflow and airload predictions and comparisons with experimental data are presented. Examples are presented of recent parabolized Navier-Stokes and full Navier-Stokes solutions for hypersonic shock-wave/boundary layer interaction, and hydrogen/air supersonic combustion. In addition, other examples of CFD efforts in turbomachinery Navier-Stokes methodology and separated flow modeling are presented. A brief discussion of the 3-tier scientific computing environment is also presented, in which the researcher has access to workstations, mid-size computers, and supercomputers.
Antenna pattern control using impedance surfaces
NASA Technical Reports Server (NTRS)
Balanis, Constantine A.; Liu, Kefeng
1992-01-01
During this research period, we have effectively transferred existing computer codes from the CRAY supercomputer to workstation-based systems. The workstation-based version of our code preserved the accuracy of the numerical computations while giving a much better turn-around time than the CRAY supercomputer. Such a task relieved us of the heavy dependence on the supercomputer account budget and made codes developed in this research project more feasible for applications. The analysis of pyramidal horns with impedance surfaces was our major focus during this research period. Three different modeling algorithms for analyzing lossy impedance surfaces were investigated and compared with measured data. Through this investigation, we discovered that a hybrid Fourier transform technique, which uses the eigenmodes in the stepped waveguide section and the Fourier transformed field distributions across the stepped discontinuities for lossy impedance coatings, gives better accuracy in analyzing lossy coatings. After a further refinement of the present technique, we will perform an accurate radiation pattern synthesis in the coming reporting period.
Scheduling for Parallel Supercomputing: A Historical Perspective of Achievable Utilization
NASA Technical Reports Server (NTRS)
Jones, James Patton; Nitzberg, Bill
1999-01-01
The NAS facility has operated parallel supercomputers for the past 11 years, including the Intel iPSC/860, Intel Paragon, Thinking Machines CM-5, IBM SP-2, and Cray Origin 2000. Across this wide variety of machine architectures, across a span of 10 years, across a large number of different users, and through thousands of minor configuration and policy changes, the utilization of these machines shows three general trends: (1) scheduling using a naive FIFO first-fit policy results in 40-60% utilization, (2) switching to the more sophisticated dynamic backfilling scheduling algorithm improves utilization by about 15 percentage points (yielding about 70% utilization), and (3) reducing the maximum allowable job size further increases utilization. Most surprising is the consistency of these trends. Over the lifetime of the NAS parallel systems, we made hundreds, perhaps thousands, of small changes to hardware, software, and policy, yet, utilization was affected little. In particular these results show that the goal of achieving near 100% utilization while supporting a real parallel supercomputing workload is unrealistic.
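The gap between strict FIFO scheduling and backfilling described above can be illustrated with a toy simulation. This is a minimal sketch, not the NAS scheduler: the function name `simulate`, the job list, and the 10-node machine are hypothetical, all jobs are assumed to be queued at time 0, and the backfill rule shown is a simple variant in which a later job may jump ahead only if it fits in the idle nodes and finishes before the blocked head job could start.

```python
import heapq


def simulate(jobs, total_nodes, backfill=False):
    """Return (makespan, utilization) for jobs given as (nodes, runtime) tuples.

    Illustrative assumptions: every job is queued at time 0 in list order,
    and runtimes are known exactly in advance.
    """
    if any(nodes > total_nodes for nodes, _ in jobs):
        raise ValueError("a job requests more nodes than the machine has")

    queue = list(jobs)
    running = []  # min-heap of (end_time, nodes)
    free = total_nodes
    now = 0.0
    node_time_used = sum(nodes * runtime for nodes, runtime in jobs)

    while queue or running:
        # Start jobs from the head of the queue for as long as they fit.
        while queue and queue[0][0] <= free:
            nodes, runtime = queue.pop(0)
            heapq.heappush(running, (now + runtime, nodes))
            free -= nodes

        if backfill and queue:
            # Earliest time at which the blocked head job could start.
            head_nodes = queue[0][0]
            avail, shadow = free, now
            for end, nodes in sorted(running):
                avail += nodes
                shadow = end
                if avail >= head_nodes:
                    break
            # Let later jobs run now if they fit and finish by that time.
            i = 1
            while i < len(queue):
                nodes, runtime = queue[i]
                if nodes <= free and now + runtime <= shadow:
                    queue.pop(i)
                    heapq.heappush(running, (now + runtime, nodes))
                    free -= nodes
                else:
                    i += 1

        # Advance the clock to the next job completion.
        end, nodes = heapq.heappop(running)
        now = end
        free += nodes

    return now, node_time_used / (total_nodes * now)


# Hypothetical workload on a hypothetical 10-node machine.
jobs = [(6, 10), (8, 10), (4, 5), (2, 10)]
fifo_makespan, fifo_util = simulate(jobs, total_nodes=10)
bf_makespan, bf_util = simulate(jobs, total_nodes=10, backfill=True)
print(f"FIFO:     makespan={fifo_makespan}, utilization={fifo_util:.2f}")
print(f"Backfill: makespan={bf_makespan}, utilization={bf_util:.2f}")
```

In this toy workload the small third job fills nodes that would otherwise sit idle behind the blocked 8-node job, which is the effect the abstract credits for the utilization gain. The numbers it produces are specific to the made-up job list and say nothing about the measured NAS figures.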
Large-Scale Simulations of Plastic Neural Networks on Neuromorphic Hardware
Knight, James C.; Tully, Philip J.; Kaplan, Bernhard A.; Lansner, Anders; Furber, Steve B.
2016-01-01
SpiNNaker is a digital, neuromorphic architecture designed for simulating large-scale spiking neural networks at speeds close to biological real-time. Rather than using bespoke analog or digital hardware, the basic computational unit of a SpiNNaker system is a general-purpose ARM processor, allowing it to be programmed to simulate a wide variety of neuron and synapse models. This flexibility is particularly valuable in the study of biological plasticity phenomena. A recently proposed learning rule based on the Bayesian Confidence Propagation Neural Network (BCPNN) paradigm offers a generic framework for modeling the interaction of different plasticity mechanisms using spiking neurons. However, it can be computationally expensive to simulate large networks with BCPNN learning since it requires multiple state variables for each synapse, each of which needs to be updated every simulation time-step. We discuss the trade-offs in efficiency and accuracy involved in developing an event-based BCPNN implementation for SpiNNaker based on an analytical solution to the BCPNN equations, and detail the steps taken to fit this within the limited computational and memory resources of the SpiNNaker architecture. We demonstrate this learning rule by learning temporal sequences of neural activity within a recurrent attractor network which we simulate at scales of up to 2.0 × 10^4 neurons and 5.1 × 10^7 plastic synapses: the largest plastic neural network ever to be simulated on neuromorphic hardware. We also run a comparable simulation on a Cray XC-30 supercomputer system and find that, if it is to match the run-time of our SpiNNaker simulation, the supercomputer system uses approximately 45× more power. This suggests that cheaper, more power efficient neuromorphic systems are becoming useful discovery tools in the study of plasticity in large-scale brain models. PMID:27092061
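The idea of replacing per-time-step updates with an event-based update built on an analytical solution can be sketched on a much simpler system than BCPNN. The example below is an assumption-laden illustration, not the paper's equations: it uses a single hypothetical trace z obeying dz/dt = -z/tau that is incremented by 1 at each spike, and the names `EventTrace` and `stepped_trace` are invented here.

```python
import math


class EventTrace:
    """Exponentially decaying trace that is only updated when an event occurs."""

    def __init__(self, tau):
        self.tau = tau
        self.value = 0.0
        self.last_t = 0.0

    def decay_to(self, t):
        # Closed-form solution of dz/dt = -z/tau over the elapsed interval.
        self.value *= math.exp(-(t - self.last_t) / self.tau)
        self.last_t = t
        return self.value

    def spike(self, t):
        # Bring the trace up to date, then apply the spike increment.
        self.decay_to(t)
        self.value += 1.0
        return self.value


def stepped_trace(spike_times, tau, t_end, dt=1.0):
    """Reference version that touches the state at every time step."""
    decay = math.exp(-dt / tau)
    spikes = set(spike_times)
    z = 0.0
    for n in range(1, round(t_end / dt) + 1):
        z *= decay
        if n * dt in spikes:
            z += 1.0
    return z


# Hypothetical spike train (times in ms) and time constant.
trace = EventTrace(tau=20.0)
for t in (10.0, 30.0):
    trace.spike(t)
event_value = trace.decay_to(50.0)        # 3 updates in total
step_value = stepped_trace([10.0, 30.0], tau=20.0, t_end=50.0)  # 50 updates
print(event_value, step_value)
```

Both versions arrive at the same value up to floating-point rounding, but the event-based one does work proportional to the number of spikes rather than the number of time steps, which is the kind of saving the abstract attributes to the event-based formulation.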
A History of High-Performance Computing
NASA Technical Reports Server (NTRS)
2006-01-01
Faster than most speedy computers. More powerful than its NASA data-processing predecessors. Able to leap large, mission-related computational problems in a single bound. Clearly, it's neither a bird nor a plane, nor does it need to don a red cape, because it's super in its own way. It's Columbia, NASA's newest supercomputer and one of the world's most powerful production/processing units. Named Columbia to honor the STS-107 Space Shuttle Columbia crewmembers, the new supercomputer is making it possible for NASA to achieve breakthroughs in science and engineering, fulfilling the Agency's missions, and, ultimately, the Vision for Space Exploration. Shortly after being built in 2004, Columbia achieved a benchmark rating of 51.9 teraflop/s on 10,240 processors, making it the world's fastest operational computer at the time of completion. Putting this speed into perspective, 20 years ago, the most powerful computer at NASA's Ames Research Center, home of the NASA Advanced Supercomputing Division (NAS), ran at a speed of about 1 gigaflop (one billion calculations per second). The Columbia supercomputer is 50,000 times faster than this computer and offers a tenfold increase in capacity over the prior system housed at Ames. What's more, Columbia is considered the world's largest Linux-based, shared-memory system. The system is offering immeasurable benefits to society and is the zenith of years of NASA/private industry collaboration that has spawned new generations of commercial, high-speed computing systems.
Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.
Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias
2011-01-01
The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
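The sorted k-mer lists mentioned above can be illustrated with a small sketch. This is a simplified stand-in rather than the progressiveMauve or Blue Gene/P data structures: the function names `sorted_kmers` and `shared_kmers` and the two short sequences are hypothetical, and repeated k-mers are simply paired off one-to-one.

```python
def sorted_kmers(seq, k):
    """Return a sorted list of (kmer, position) pairs for one sequence."""
    return sorted((seq[i:i + k], i) for i in range(len(seq) - k + 1))


def shared_kmers(a, b):
    """Merge two sorted k-mer lists and return the k-mers present in both.

    Each result is (kmer, position_in_a, position_in_b). Because both inputs
    are sorted, one linear pass is enough to find the matches.
    """
    i = j = 0
    matches = []
    while i < len(a) and j < len(b):
        if a[i][0] == b[j][0]:
            matches.append((a[i][0], a[i][1], b[j][1]))
            i += 1
            j += 1
        elif a[i][0] < b[j][0]:
            i += 1
        else:
            j += 1
    return matches


# Two made-up sequences; shared 3-mers act as candidate alignment anchors.
list_a = sorted_kmers("ACGTAC", 3)
list_b = sorted_kmers("TTACGT", 3)
print(shared_kmers(list_a, list_b))
```

Sorting once per genome and then merging keeps the comparison step linear in the list lengths, which is one reason sorted lists are a natural fit when memory per compute node is tight.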
NASA Technical Reports Server (NTRS)
Hasler, Fritz
1999-01-01
The Etheater presents visualizations which span the period from the original Suomi/Hasler animations of the first ATS-1 GEO weather satellite images in 1966 to the latest 1999 NASA Earth Science Vision for the next 25 years. Hot off the SGI-Onyx Graphics-Supercomputer are NASA's visualizations of Hurricanes Mitch, Georges, Fran and Linda. These storms have been recently featured on the covers of National Geographic, Time, Newsweek and Popular Science. Highlights will be shown from the NASA hurricane visualization resource video tape in standard and HDTV that has been used repeatedly this season on National and International network TV. Results will be presented from a new paper on automatic wind measurements in Hurricane Luis from 1-min GOES images that appeared in the November BAMS.
Forecasting techno-social systems: how physics and computing help to fight off global pandemics
NASA Astrophysics Data System (ADS)
Vespignani, Alessandro
2010-03-01
The crucial issue when planning for adequate public health interventions to mitigate the spread and impact of epidemics is risk evaluation and forecast. This amounts to the anticipation of where, when and how strongly the epidemic will strike. In the last decade, advances in computer performance, data acquisition, statistical physics and complex networks theory have allowed the generation of sophisticated simulations on supercomputer infrastructures to anticipate the spreading pattern of a pandemic. For the first time we are in the position of generating real-time forecasts of epidemic spreading. I will review the history of the current H1N1 pandemic, the major road-blocks the community has faced in its containment and mitigation and how physics and computing provide predictive tools that help us to battle epidemics.
NASA Technical Reports Server (NTRS)
1984-01-01
NASA has planned a supercomputer for computational fluid dynamics research since the mid-1970s. With the approval of the Numerical Aerodynamic Simulation Program as a FY 1984 new start, Congress requested an assessment of the program's objectives, projected short- and long-term uses, program design, computer architecture, user needs, and handling of proprietary and classified information. Specifically requested was an examination of the merits of proceeding with multiple high speed processor (HSP) systems contrasted with a single high speed processor system. The panel found NASA's objectives and projected uses sound and the projected distribution of users as realistic as possible at this stage. The multiple-HSP approach, whereby new, more powerful state-of-the-art HSPs would be integrated into a flexible network, was judged to present major advantages over any single HSP system.
Accessing Wind Tunnels From NASA's Information Power Grid
NASA Technical Reports Server (NTRS)
Becker, Jeff; Biegel, Bryan (Technical Monitor)
2002-01-01
The NASA Ames wind tunnel customers are among the first users of the Information Power Grid (IPG) storage system at the NASA Advanced Supercomputing Division. We wanted to be able to store their data on the IPG so that it could be accessed remotely in a secure but timely fashion. In addition, incorporation into the IPG allows future use of grid computational resources, e.g., for post-processing of data, or to do side-by-side CFD validation. In this paper, we describe the integration of grid data access mechanisms with the existing DARWIN web-based system that is used to access wind tunnel test data. We also show that the combined system has reasonable performance: wind tunnel data may be retrieved at 50 Mbit/s over a 100BASE-T network connected to the IPG storage server.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Timothy J.
2016-03-01
While benchmarking software is useful for testing the performance limits and stability of Argonne National Laboratory’s new Theta supercomputer, there is no substitute for running real applications to explore the system’s potential. The Argonne Leadership Computing Facility’s Theta Early Science Program, modeled after its highly successful code migration program for the Mira supercomputer, has one primary aim: to deliver science on day one. Here is a closer look at the type of science problems that will be getting early access to Theta, a next-generation machine being rolled out this year.
Supercomputer analysis of sedimentary basins.
Bethke, C M; Altaner, S P; Harrison, W J; Upson, C
1988-01-15
Geological processes of fluid transport and chemical reaction in sedimentary basins have formed many of the earth's energy and mineral resources. These processes can be analyzed on natural time and distance scales with the use of supercomputers. Numerical experiments are presented that give insights to the factors controlling subsurface pressures, temperatures, and reactions; the origin of ores; and the distribution and quality of hydrocarbon reservoirs. The results show that numerical analysis combined with stratigraphic, sea level, and plate tectonic histories provides a powerful tool for studying the evolution of sedimentary basins over geologic time.
2017-12-08
The heart of the NASA Center for Climate Simulation (NCCS) is the “Discover” supercomputer. In 2009, NCCS added more than 8,000 computer processors to Discover, for a total of nearly 15,000 processors. Credit: NASA/Pat Izzo To learn more about NCCS go to: www.nasa.gov/topics/earth/features/climate-sim-center.html NASA Goddard Space Flight Center is home to the nation's largest organization of combined scientists, engineers and technologists that build spacecraft, instruments and new technology to study the Earth, the sun, our solar system, and the universe.
Development of the general interpolants method for the CYBER 200 series of supercomputers
NASA Technical Reports Server (NTRS)
Stalnaker, J. F.; Robinson, M. A.; Spradley, L. W.; Kurzius, S. C.; Thoenes, J.
1988-01-01
The General Interpolants Method (GIM) is a 3-D, time-dependent, hybrid procedure for generating numerical analogs of the conservation laws. This study is directed toward the development and application of the GIM computer code for fluid dynamic research applications as implemented for the Cyber 200 series of supercomputers. Elliptic and quasi-parabolic versions of the GIM code are discussed. Turbulence models, both algebraic and differential-equation based, were added to the basic viscous code. An equilibrium reacting chemistry model and an implicit finite difference scheme are also included.
NASA Technical Reports Server (NTRS)
Nosenchuck, D. M.; Littman, M. G.
1986-01-01
The Navier-Stokes computer (NSC) has been developed for solving problems in fluid mechanics involving complex flow simulations that require more speed and capacity than provided by current and proposed Class VI supercomputers. The machine is a parallel processing supercomputer with several new architectural elements which can be programmed to address a wide range of problems meeting the following criteria: (1) the problem is numerically intensive, and (2) the code makes use of long vectors. A simulation of two-dimensional nonsteady viscous flows is presented to illustrate the architecture, programming, and some of the capabilities of the NSC.
Merging the Machines of Modern Science
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wolf, Laura; Collins, Jim
Two recent projects have harnessed supercomputing resources at the US Department of Energy’s Argonne National Laboratory in a novel way to support major fusion science and particle collider experiments. Using leadership computing resources, one team ran fine-grid analysis of real-time data to make near-real-time adjustments to an ongoing experiment, while a second team is working to integrate Argonne’s supercomputers into the Large Hadron Collider/ATLAS workflow. Together these efforts represent a new paradigm of the high-performance computing center as a partner in experimental science.
NGI and Internet2: accelerating the creation of tomorrow's internet.
Kratz, M; Ackerman, M; Hanss, T; Corbato, S
2001-01-01
Internet2 is a consortium of leading U.S. universities working in partnership with industry and the U.S. government's Next Generation Internet (NGI) initiative to develop a faster, more reliable Internet for research and education including enhanced, high-performance networking services and the advanced applications that are enabled by those services [1]. By facilitating and coordinating the development, deployment, operation, and technology transfer of advanced, network-based applications and network services, Internet2 and NGI are working together to fundamentally change the way scientists, engineers, clinicians, and others work together. [http://www.internet2.edu] The NGI Program has three tracks: research, network testbeds, and applications. The aim of the research track is to promote experimentation with the next generation of network technologies. The network testbed track aims to develop next generation network testbeds to connect universities and federal research institutions at speeds that are sufficient to demonstrate new technologies and support future research. The aim of the applications track is to demonstrate new applications, enabled by the NGI networks, to meet important national goals and missions [2]. [http://www.ngi.gov/] The Internet2/NGI backbone networks, Abilene and vBNS (very high performance Backbone Network Service), provide the basis of collaboration and development for a new breed of advanced medical applications. Academic medical centers leverage the resources available throughout the Internet2 high-performance networking community for high-capacity broadband and selectable quality of service to make effective use of national repositories. The Internet2 Health Sciences Initiative enables a new generation of emerging medical applications whose architecture and development have been restricted by or are beyond the constraints of traditional Internet environments. 
These initiatives facilitate a variety of activities to foster the development and deployment of emerging applications that meet the requirements of clinical practice, medical and related biological research, education, and medical awareness throughout the public sector. Medical applications that work with high performance networks and supercomputing capabilities offer exciting new solutions for the medical industry. Internet2 and NGI strive to combine the expertise of their constituents to establish a distributed knowledge system for achieving innovation in research, teaching, learning, and clinical care.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kostuk, M.; Uram, T. D.; Evans, T.; ...
2018-02-01
For the first time, an automatically triggered, between-pulse fusion science analysis code was run on-demand at a remotely located supercomputer at Argonne Leadership Computing Facility (ALCF, Lemont, IL) in support of in-process experiments being performed at DIII-D (San Diego, CA). This represents a new paradigm for combining geographically distant experimental and high performance computing (HPC) facilities to provide enhanced data analysis that is quickly available to researchers. Enhanced analysis improves the understanding of the current pulse, translating into more efficient use of experimental resources and improving the quality of the resultant science. The analysis code used here, called SURFMN, calculates the magnetic structure of the plasma using a Fourier transform. Increasing the number of Fourier components provides a more accurate determination of the stochastic boundary layer near the plasma edge by better resolving magnetic islands, but requires 26 minutes to complete using local DIII-D resources, putting it well outside the useful time range for between-pulse analysis. These islands relate to confinement and edge localized mode (ELM) suppression, and may be controlled by adjusting coil currents for the next pulse. Argonne has ensured on-demand execution of SURFMN by providing a reserved queue, a specialized service that launches the code after receiving an automatic trigger, and network access from the worker nodes for data transfer. Runs are executed on 252 cores of ALCF’s Cooley cluster and the data is available locally at DIII-D within three minutes of triggering. The original SURFMN design limits additional improvements with more cores; however, our work shows a path forward where codes that benefit from thousands of processors can run between pulses.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bland, Arthur S Buddy; Hack, James J; Baker, Ann E
Oak Ridge National Laboratory's (ORNL's) Cray XT5 supercomputer, Jaguar, kicked off the era of petascale scientific computing in 2008 with applications that sustained more than a thousand trillion floating point calculations per second - or 1 petaflop. Jaguar continues to grow even more powerful as it helps researchers broaden the boundaries of knowledge in virtually every domain of computational science, including weather and climate, nuclear energy, geosciences, combustion, bioenergy, fusion, and materials science. Their insights promise to broaden our knowledge in areas that are vitally important to the Department of Energy (DOE) and the nation as a whole, particularly energy assurance and climate change. The science of the 21st century, however, will demand further revolutions in computing, supercomputers capable of a million trillion calculations a second - 1 exaflop - and beyond. These systems will allow investigators to continue attacking global challenges through modeling and simulation and to unravel longstanding scientific questions. Creating such systems will also require new approaches to daunting challenges. High-performance systems of the future will need to be codesigned for scientific and engineering applications with best-in-class communications networks and data-management infrastructures and teams of skilled researchers able to take full advantage of these new resources. The Oak Ridge Leadership Computing Facility (OLCF) provides the nation's most powerful open resource for capability computing, with a sustainable path that will maintain and extend national leadership for DOE's Office of Science (SC). The OLCF has engaged a world-class team to support petascale science and to take a dramatic step forward, fielding new capabilities for high-end science.
This report highlights the successful delivery and operation of a petascale system and shows how the OLCF fosters application development teams, developing cutting-edge tools and resources for next-generation systems.
NASA Technical Reports Server (NTRS)
Himer, J. T.
1992-01-01
Fortran has largely enjoyed prominence for the past few decades as the computer programming language of choice for numerically intensive scientific, engineering, and process control applications. Fortran's well understood static language syntax has allowed resulting parsers and compiler optimizing technologies to often generate among the most efficient and fastest run-time executables, particularly on high-end scalar and vector supercomputers. Computing architectures and paradigms have changed considerably since the last ANSI/ISO Fortran release in 1978, and while FORTRAN 77 has more than survived, its aged features provide only partial functionality for today's demanding computing environments. The simple block procedural languages have been necessarily evolving, or giving way, to specialized supercomputing, network resource, and object-oriented paradigms. To address these new computing demands, ANSI has worked for the last 12 years with three international public reviews to deliver Fortran 90. Fortran 90 has superseded and replaced ISO FORTRAN 77 internationally as the sole Fortran standard; while in the US, Fortran 90 is expected to be adopted as the ANSI standard this summer, coexisting with ANSI FORTRAN 77 until at least 1996. The development path and current state of Fortran will be briefly described highlighting the many new Fortran 90 syntactic and semantic additions which support (among others): free form source; array syntax; new control structures; modules and interfaces; pointers; derived data types; dynamic memory; enhanced I/O; operator overloading; data abstraction; user optional arguments; new intrinsics for array, bit manipulation, and system inquiry; and enhanced portability through better generic control of underlying system arithmetic models.
Examples from dynamical astronomy, signal and image processing will attempt to illustrate Fortran 90's applicability to today's general scalar, vector, and parallel scientific and engineering requirements and object oriented programming paradigms. Time permitting, current work proceeding on the future development of Fortran 2000 and collateral standards will be introduced.
Report of the theory panel. [space physics
NASA Technical Reports Server (NTRS)
Ashourabdalla, Maha; Rosner, Robert; Antiochos, Spiro; Curtis, Steven; Fejer, B.; Goertz, Christoph K.; Goldstein, Melvyn L.; Holzer, Thomas E.; Jokipii, J. R.; Lee, Lou-Chuang
1991-01-01
The ultimate goal of this research is to develop an understanding which is sufficiently comprehensive to allow realistic predictions of the behavior of the physical systems. Theory has a central role to play in the quest for this understanding. The level of theoretical description is dependent on three constraints: (1) the available computer hardware may limit both the number and the size of physical processes the model system can describe; (2) the fact that some natural systems may only be described in a statistical manner; and (3) the fact that some natural systems may be observable only through remote sensing which is intrinsically limited by spatial resolution and line of sight integration. From this the report discusses present accomplishments and future goals of theoretical space physics. Finally, the development and use of new supercomputers is examined.
Final Report for DOE Award ER25756
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kesselman, Carl
2014-11-17
The SciDAC-funded Center for Enabling Distributed Petascale Science (CEDPS) was established to address technical challenges that arise due to the frequent geographic distribution of data producers (in particular, supercomputers and scientific instruments) and data consumers (people and computers) within the DOE laboratory system. Its goal is to produce technical innovations that meet DOE end-user needs for (a) rapid and dependable placement of large quantities of data within a distributed high-performance environment, and (b) the convenient construction of scalable science services that provide for the reliable and high-performance processing of computation and data analysis requests from many remote clients. The Center is also addressing (c) the important problem of troubleshooting these and other related ultra-high-performance distributed activities from the perspective of both performance and functionality.
Black-hole Merger Simulations for LISA Science
NASA Technical Reports Server (NTRS)
Kelly, Bernard J.; Baker, John G.; vanMeter, James R.; Boggs, William D.; Centrella, Joan M.; McWilliams, Sean T.
2009-01-01
The strongest expected sources of gravitational waves in the LISA band are the mergers of massive black holes. LISA may observe these systems to high redshift, z>10, to uncover details of the origin of massive black holes, and of the relationship between black holes and their host structures, and structure formation itself. These signals arise from the final stage in the development of a massive black-hole binary emitting strong gravitational radiation that accelerates the system's inspiral toward merger. The strongest part of the signal, at the point of merger, carries much information about the system and provides a probe of extreme gravitational physics. Theoretical predictions for these merger signals rely on supercomputer simulations to solve Einstein's equations. We discuss recent numerical results and their impact on LISA science expectations.
NASA Astrophysics Data System (ADS)
Romano, Paul Kollath
Monte Carlo particle transport methods are being considered as a viable option for high-fidelity simulation of nuclear reactors. While Monte Carlo methods offer several potential advantages over deterministic methods, there are a number of algorithmic shortcomings that would prevent their immediate adoption for full-core analyses. In this thesis, algorithms are proposed both to ameliorate the degradation in parallel efficiency typically observed for large numbers of processors and to offer a means of decomposing large tally data that will be needed for reactor analysis. A nearest-neighbor fission bank algorithm was proposed and subsequently implemented in the OpenMC Monte Carlo code. A theoretical analysis of the communication pattern shows that the expected cost is O(√N) whereas traditional fission bank algorithms are O(N) at best. The algorithm was tested on two supercomputers, the Intrepid Blue Gene/P and the Titan Cray XK7, and demonstrated nearly linear parallel scaling up to 163,840 processor cores on a full-core benchmark problem. An algorithm for reducing network communication arising from tally reduction was analyzed and implemented in OpenMC. The proposed algorithm groups only particle histories on a single processor into batches for tally purposes; in doing so it prevents all network communication for tallies until the very end of the simulation. The algorithm was tested, again on a full-core benchmark, and shown to reduce network communication substantially. A model was developed to predict the impact of load imbalances on the performance of domain decomposed simulations. The analysis demonstrated that load imbalances in domain decomposed simulations arise from two distinct phenomena: non-uniform particle densities and non-uniform spatial leakage. The dominant performance penalty for domain decomposition was shown to come from these physical effects rather than insufficient network bandwidth or high latency.
The model predictions were verified with measured data from simulations in OpenMC on a full-core benchmark problem. Finally, a novel algorithm for decomposing large tally data was proposed, analyzed, and implemented/tested in OpenMC. The algorithm relies on disjoint sets of compute processes and tally servers. The analysis showed that for a range of parameters relevant to LWR analysis, the tally server algorithm should perform with minimal overhead. Tests were performed on Intrepid and Titan and demonstrated that the algorithm did indeed perform well over a wide range of parameters. (Copies available exclusively from MIT Libraries, libraries.mit.edu/docs - docs mit.edu)
HACC: Simulating sky surveys on state-of-the-art supercomputing architectures
NASA Astrophysics Data System (ADS)
Habib, Salman; Pope, Adrian; Finkel, Hal; Frontiere, Nicholas; Heitmann, Katrin; Daniel, David; Fasel, Patricia; Morozov, Vitali; Zagaris, George; Peterka, Tom; Vishwanath, Venkatram; Lukić, Zarija; Sehrish, Saba; Liao, Wei-keng
2016-01-01
Current and future surveys of large-scale cosmic structure are associated with a massive and complex datastream to study, characterize, and ultimately understand the physics behind the two major components of the 'Dark Universe', dark energy and dark matter. In addition, the surveys also probe primordial perturbations and carry out fundamental measurements, such as determining the sum of neutrino masses. Large-scale simulations of structure formation in the Universe play a critical role in the interpretation of the data and extraction of the physics of interest. Just as survey instruments continue to grow in size and complexity, so do the supercomputers that enable these simulations. Here we report on HACC (Hardware/Hybrid Accelerated Cosmology Code), a recently developed and evolving cosmology N-body code framework, designed to run efficiently on diverse computing architectures and to scale to millions of cores and beyond. HACC can run on all current supercomputer architectures and supports a variety of programming models and algorithms. It has been demonstrated at scale on Cell- and GPU-accelerated systems, standard multi-core node clusters, and Blue Gene systems. HACC's design allows for ease of portability, and at the same time, high levels of sustained performance on the fastest supercomputers available. We present a description of the design philosophy of HACC, the underlying algorithms and code structure, and outline implementation details for several specific architectures. We show selected accuracy and performance results from some of the largest high resolution cosmological simulations so far performed, including benchmarks evolving more than 3.6 trillion particles.
NASA's Participation in the National Computational Grid
NASA Technical Reports Server (NTRS)
Feiereisen, William J.; Zornetzer, Steve F. (Technical Monitor)
1998-01-01
Over the last several years it has become evident that the character of NASA's supercomputing needs has changed. One of the major missions of the agency is to support the design and manufacture of aero- and space-vehicles with technologies that will significantly reduce their cost. It is becoming clear that improvements in the process of aerospace design and manufacturing will require a high performance information infrastructure that allows geographically dispersed teams to draw upon resources that are broader than traditional supercomputing. A computational grid draws together our information resources into one system. We can foresee the time when a Grid will allow engineers and scientists to use the tools of supercomputers, databases and on line experimental devices in a virtual environment to collaborate with distant colleagues. The concept of a computational grid has been spoken of for many years, but several events in recent times are conspiring to allow us to actually build one. In late 1997 the National Science Foundation initiated the Partnerships for Advanced Computational Infrastructure (PACI) which is built around the idea of distributed high performance computing. The Alliance, led by the National Computational Science Alliance (NCSA), and the National Partnership for Advanced Computational Infrastructure (NPACI), led by the San Diego Supercomputing Center, have been instrumental in drawing together the "Grid Community" to identify the technology bottlenecks and propose a research agenda to address them. During the same period NASA has begun to reformulate parts of two major high performance computing research programs to concentrate on distributed high performance computing and has banded together with the PACI centers to address the research agenda in common.
Jiang, Wei; Luo, Yun; Maragliano, Luca; Roux, Benoît
2012-11-13
An extremely scalable computational strategy is described for calculations of the potential of mean force (PMF) in multidimensions on massively distributed supercomputers. The approach involves coupling thousands of umbrella sampling (US) simulation windows distributed to cover the space of order parameters with a Hamiltonian molecular dynamics replica-exchange (H-REMD) algorithm to enhance the sampling of each simulation. In the present application, US/H-REMD is carried out in a two-dimensional (2D) space and exchanges are attempted alternately along the two axes corresponding to the two order parameters. The US/H-REMD strategy is implemented on the basis of parallel/parallel multiple copy protocol at the MPI level, and therefore can fully exploit computing power of large-scale supercomputers. Here the novel technique is illustrated using the leadership supercomputer IBM Blue Gene/P with an application to a typical biomolecular calculation of general interest, namely the binding of calcium ions to the small protein Calbindin D9k. The free energy landscape associated with two order parameters, the distance between the ion and its binding pocket and the root-mean-square deviation (rmsd) of the binding pocket relative to the crystal structure, was calculated using the US/H-REMD method. The results are then used to estimate the absolute binding free energy of calcium ion to Calbindin D9k. The tests demonstrate that the 2D US/H-REMD scheme greatly accelerates the configurational sampling of the binding pocket, thereby improving the convergence of the potential of mean force calculation.
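The exchange step that couples neighboring umbrella windows can be illustrated with a minimal sketch. It assumes harmonic biases along a single order parameter and the standard Metropolis criterion for Hamiltonian replica exchange; the function names, the force constant `k`, and the one-dimensional setting are illustrative assumptions, not details taken from the paper.

```python
import math
import random


def umbrella_bias(x, center, k):
    """Harmonic umbrella bias 0.5 * k * (x - center)**2 (assumed form)."""
    return 0.5 * k * (x - center) ** 2


def exchange_probability(x_i, x_j, center_i, center_j, k, beta):
    """Metropolis acceptance probability for swapping the configurations
    of two windows that share the same unbiased potential, so only the
    bias terms contribute to the energy difference."""
    delta = (umbrella_bias(x_j, center_i, k) + umbrella_bias(x_i, center_j, k)
             - umbrella_bias(x_i, center_i, k) - umbrella_bias(x_j, center_j, k))
    if delta <= 0.0:
        return 1.0  # swap lowers the total bias energy: always accept
    return math.exp(-beta * delta)


def attempt_exchange(x_i, x_j, center_i, center_j, k, beta, rng=random):
    """Return the (possibly swapped) pair and whether the swap was accepted."""
    p = exchange_probability(x_i, x_j, center_i, center_j, k, beta)
    if rng.random() < p:
        return x_j, x_i, True
    return x_i, x_j, False
```

In a 2D scheme such as the one described, the same test would be applied to window pairs that are neighbors along one axis on one attempt and along the other axis on the next.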
DOE Office of Scientific and Technical Information (OSTI.GOV)
Meneses, Esteban; Ni, Xiang; Jones, Terry R
The unprecedented computational power of current supercomputers now makes possible the exploration of complex problems in many scientific fields, from genomic analysis to computational fluid dynamics. Modern machines are powerful because they are massive: they assemble millions of cores and a huge quantity of disks, cards, routers, and other components. But it is precisely the size of these machines that clouds the future of supercomputing. A system that comprises many components has a high chance to fail, and fail often. In order to make the next generation of supercomputers usable, it is imperative to use some type of fault tolerance platform to run applications on large machines. Most fault tolerance strategies can be optimized for the peculiarities of each system and boost efficacy by keeping the system productive. In this paper, we aim to understand how failure characterization can improve resilience in several layers of the software stack: applications, runtime systems, and job schedulers. We examine the Titan supercomputer, one of the fastest systems in the world. We analyze a full year of Titan in production and distill the failure patterns of the machine. By looking into Titan's log files and using the criteria of experts, we provide a detailed description of the types of failures. In addition, we inspect the job submission files and describe how the system is used. Using those two sources, we cross correlate failures in the machine to executing jobs and provide a picture of how failures affect the user experience. We believe such characterization is fundamental in developing appropriate fault tolerance solutions for Cray systems similar to Titan.
National Storage Laboratory: a collaborative research project
NASA Astrophysics Data System (ADS)
Coyne, Robert A.; Hulen, Harry; Watson, Richard W.
1993-01-01
The grand challenges of science and industry that are driving computing and communications have created corresponding challenges in information storage and retrieval. An industry-led collaborative project has been organized to investigate technology for storage systems that will be the future repositories of national information assets. Industry participants are IBM Federal Systems Company, Ampex Recording Systems Corporation, General Atomics DISCOS Division, IBM ADSTAR, Maximum Strategy Corporation, Network Systems Corporation, and Zitel Corporation. Industry members of the collaborative project are funding their own participation. Lawrence Livermore National Laboratory through its National Energy Research Supercomputer Center (NERSC) will participate in the project as the operational site and provider of applications. The expected result is the creation of a National Storage Laboratory to serve as a prototype and demonstration facility. It is expected that this prototype will represent a significant advance in the technology for distributed storage systems capable of handling gigabyte-class files at gigabit-per-second data rates. Specifically, the collaboration expects to make significant advances in hardware, software, and systems technology in four areas of need: (1) network-attached high performance storage; (2) multiple, dynamic, distributed storage hierarchies; (3) layered access to storage system services; and (4) storage system management.
Calibrating Building Energy Models Using Supercomputer Trained Machine Learning Agents
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sanyal, Jibonananda; New, Joshua Ryan; Edwards, Richard
2014-01-01
Building Energy Modeling (BEM) is an approach to model the energy usage in buildings for design and retrofit purposes. EnergyPlus is the flagship Department of Energy software that performs BEM for different types of buildings. The input to EnergyPlus can often extend to the order of a few thousand parameters, which have to be calibrated manually by an expert for realistic energy modeling. This makes calibration challenging and expensive, thereby making building energy modeling unfeasible for smaller projects. In this paper, we describe the Autotune research which employs machine learning algorithms to generate agents for the different kinds of standard reference buildings in the U.S. building stock. The parametric space and the variety of building locations and types make this a challenging computational problem necessitating the use of supercomputers. Millions of EnergyPlus simulations are run on supercomputers which are subsequently used to train machine learning algorithms to generate agents. These agents, once created, can then run in a fraction of the time thereby allowing cost-effective calibration of building models.
Challenges in scaling NLO generators to leadership computers
NASA Astrophysics Data System (ADS)
Benjamin, D.; Childers, JT; Hoeche, S.; LeCompte, T.; Uram, T.
2017-10-01
Exascale computing resources are roughly a decade away and will be capable of 100 times more computing than current supercomputers. In the last year, Energy Frontier experiments crossed a milestone of 100 million core-hours used at the Argonne Leadership Computing Facility, Oak Ridge Leadership Computing Facility, and NERSC. The Fortran-based leading-order parton generator called Alpgen was successfully scaled to millions of threads to achieve this level of usage on Mira. Sherpa and MadGraph are next-to-leading order generators used heavily by LHC experiments for simulation. Integration times for high-multiplicity or rare processes can take a week or more on standard Grid machines, even using all 16 cores. We will describe our ongoing work to scale the Sherpa generator to thousands of threads on leadership-class machines and reduce run-times to less than a day. This work allows the experiments to leverage large-scale parallel supercomputers for event generation today, freeing tens of millions of grid hours for other work, and paving the way for future applications (simulation, reconstruction) on these and future supercomputers.
Optical clock distribution in supercomputers using polyimide-based waveguides
NASA Astrophysics Data System (ADS)
Bihari, Bipin; Gan, Jianhua; Wu, Linghui; Liu, Yujie; Tang, Suning; Chen, Ray T.
1999-04-01
Guided-wave optics is a promising way to deliver high-speed clock signals in supercomputers with minimized clock skew. Si-CMOS compatible polymer-based waveguides for optoelectronic interconnects and packaging have been fabricated and characterized. A 1-to-48 fanout optoelectronic interconnection layer (OIL) structure based on Ultradel 9120/9020 for the high-speed massive clock signal distribution for a Cray T-90 supercomputer board has been constructed. The OIL employs multimode polymeric channel waveguides in conjunction with surface-normal waveguide output couplers and 1-to-2 splitters. Surface-normal couplers can couple the optical clock signals into and out of the H-tree polyimide waveguides surface-normally, which facilitates the integration of photodetectors to convert the optical signal to an electrical signal. A 45-degree surface-normal coupler has been integrated at each output end. The measured output coupling efficiency is nearly 100 percent. The output profile from the 45-degree surface-normal coupler was calculated using the Fresnel approximation. The theoretical result is in good agreement with the experimental result. A total insertion loss of 7.98 dB at 850 nm was measured experimentally.
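For readers less used to decibel figures, the quoted 7.98 dB can be converted to a linear power ratio. The sketch below applies only the generic definition of loss in dB; the helper name is hypothetical, and the abstract does not say how the loss budget is divided among splitters, bends, and couplers.

```python
def db_loss_to_power_fraction(loss_db):
    """Fraction of optical power remaining after a loss of `loss_db` decibels."""
    return 10.0 ** (-loss_db / 10.0)


# Loss value quoted in the abstract (measured at 850 nm).
fraction = db_loss_to_power_fraction(7.98)
print(f"{fraction:.3f}")  # roughly 16 percent of the input power remains
```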
Flow visualization of CFD using graphics workstations
NASA Technical Reports Server (NTRS)
Lasinski, Thomas; Buning, Pieter; Choi, Diana; Rogers, Stuart; Bancroft, Gordon
1987-01-01
High performance graphics workstations are used to visualize the fluid flow dynamics obtained from supercomputer solutions of computational fluid dynamic programs. The visualizations can be done independently on the workstation or while the workstation is connected to the supercomputer in a distributed computing mode. In the distributed mode, the supercomputer interactively performs the computationally intensive graphics rendering tasks while the workstation performs the viewing tasks. A major advantage of the workstations is that the viewers can interactively change their viewing position while watching the dynamics of the flow fields. An overview of the computer hardware and software required to create these displays is presented. For complex scenes the workstation cannot create the displays fast enough for good motion analysis. For these cases, the animation sequences are recorded on video tape or 16 mm film a frame at a time and played back at the desired speed. The additional software and hardware required to create these video tapes or 16 mm movies are also described. Photographs illustrating current visualization techniques are discussed. Examples of the use of the workstations for flow visualization through animation are available on video tape.
Two-dimensional nonsteady viscous flow simulation on the Navier-Stokes computer miniNode
NASA Technical Reports Server (NTRS)
Nosenchuck, Daniel M.; Littman, Michael G.; Flannery, William
1986-01-01
The needs of large-scale scientific computation are outpacing the growth in performance of mainframe supercomputers. In particular, problems in fluid mechanics involving complex flow simulations require far more speed and capacity than that provided by current and proposed Class VI supercomputers. To address this concern, the Navier-Stokes Computer (NSC) was developed. The NSC is a parallel-processing machine, comprised of individual Nodes, each comparable in performance to current supercomputers. The global architecture is that of a hypercube, and a 128-Node NSC has been designed. New architectural features, such as a reconfigurable many-function ALU pipeline and a multifunction memory-ALU switch, have provided the capability to efficiently implement a wide range of algorithms. Efficient algorithms typically involve numerically intensive tasks, which often include conditional operations. These operations may be efficiently implemented on the NSC without, in general, sacrificing vector-processing speed. To illustrate the architecture, programming, and several of the capabilities of the NSC, the simulation of two-dimensional, nonsteady viscous flows on a prototype Node, called the miniNode, is presented.
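The hypercube topology mentioned above can be made concrete with a short sketch. It assumes the conventional binary labeling, in which two nodes are directly linked when their labels differ in exactly one bit, so a 128-node machine corresponds to a 7-dimensional hypercube. The function names are illustrative, and the NSC's actual node numbering is not given in the abstract.

```python
def hypercube_neighbors(node, dimension):
    """Labels of the nodes directly linked to `node` in a hypercube
    with 2**dimension nodes (one neighbor per flipped bit)."""
    if not 0 <= node < (1 << dimension):
        raise ValueError("node label out of range for this dimension")
    return [node ^ (1 << bit) for bit in range(dimension)]


def hop_distance(a, b):
    """Minimum number of links between two nodes: the Hamming distance
    between their binary labels."""
    return bin(a ^ b).count("1")


# 128 nodes -> 7 dimensions, so every node has 7 direct neighbors
# and no two nodes are more than 7 hops apart.
print(hypercube_neighbors(0, 7))
print(hop_distance(0, 127))
```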
Long-Term file activity patterns in a UNIX workstation environment
NASA Technical Reports Server (NTRS)
Gibson, Timothy J.; Miller, Ethan L.
1998-01-01
As mass storage technology becomes more affordable for sites smaller than supercomputer centers, understanding their file access patterns becomes crucial for developing systems to store rarely used data on tertiary storage devices such as tapes and optical disks. This paper presents a new way to collect and analyze file system statistics for UNIX-based file systems. The collection system runs in user-space and requires no modification of the operating system kernel. The statistics package provides details about file system operations at the file level: creations, deletions, modifications, etc. The paper analyzes four months of file system activity on a university file system. The results confirm previously published results gathered from supercomputer file systems, but differ in several important areas. Files in this study were considerably smaller than those at supercomputer centers, and they were accessed less frequently. Additionally, the long-term creation rate on workstation file systems is sufficiently low so that all data more than a day old could be cheaply saved on a mass storage device, allowing the integration of time travel into every file system.
NASA Astrophysics Data System (ADS)
Day, B. H.; Bland, P.
2016-12-01
Fireballs in the Sky is an innovative Australian citizen science program that connects the public with the research of the Desert Fireball Network (DFN). This research aims to understand the early workings of the solar system, and Fireballs in the Sky invites people around the world to learn about this science, contributing fireball sightings via a user-friendly app. To date, more than 23,000 people have downloaded the app world-wide and participated in planetary science. The Fireballs in the Sky app allows users to get involved with the Desert Fireball Network research, supplementing DFN observations and providing enhanced coverage by reporting their own meteor sightings to DFN scientists. Fireballs in the Sky reports are used to track the trajectories of meteors, from their orbit in space to where they might have landed on Earth. Led by Phil Bland at Curtin University in Australia, the Desert Fireball Network (DFN) uses automated observatories across Australia to triangulate trajectories of meteorites entering the atmosphere, determine pre-entry orbits, and pinpoint their fall positions. Each observatory is an autonomous intelligent imaging system, taking 1000 × 36 megapixel all-sky images throughout the night, using neural network algorithms to recognize events. They are capable of operating for 12 months in a harsh environment, and store all imagery collected. We developed a completely automated software pipeline for data reduction, and built a supercomputer database for storage, allowing us to process our entire archive. The DFN currently stands at 50 stations distributed across the Australian continent, covering an area of 2.5 million km^2. Working with DFN's partners at NASA's Solar System Exploration Research Virtual Institute, the team is expanding the network beyond Australia to locations around the world. Fireballs in the Sky allows a growing public base to learn about and participate in this exciting research.
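The abstract above mentions triangulating trajectories from several observatories but does not describe the method. As a minimal sketch of the general idea only, the Python below finds the point closest to two lines of sight, one from each of two stations. The function names, the Cartesian coordinates, and the two-station restriction are all assumptions made for illustration; this is not the DFN pipeline.

```python
def dot(u, v):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(u, v))


def triangulate(p1, d1, p2, d2, eps=1e-12):
    """Midpoint of the shortest segment between two sight lines.

    p1, p2 are hypothetical station positions; d1, d2 are the directions in
    which each station saw the event. Returns None for parallel sight lines.
    """
    w = tuple(x - y for x, y in zip(p1, p2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        return None  # parallel lines: no unique closest point
    s = (b * e - c * d) / denom  # parameter along the first sight line
    t = (a * e - b * d) / denom  # parameter along the second sight line
    q1 = tuple(p + s * v for p, v in zip(p1, d1))
    q2 = tuple(p + t * v for p, v in zip(p2, d2))
    return tuple((x + y) / 2.0 for x, y in zip(q1, q2))


# Two made-up stations 100 units apart, both looking at the same point.
print(triangulate((0.0, 0.0, 0.0), (10.0, 20.0, 30.0),
                  (100.0, 0.0, 0.0), (-90.0, 20.0, 30.0)))
```

With noisy real observations the two lines rarely intersect, which is why the sketch returns the midpoint of their closest approach rather than an exact intersection.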
Ubiquitous Green Computing Techniques for High Demand Applications in Smart Environments
Zapater, Marina; Sanchez, Cesar; Ayala, Jose L.; Moya, Jose M.; Risco-Martín, José L.
2012-01-01
Ubiquitous sensor network deployments, such as the ones found in Smart cities and Ambient intelligence applications, impose constantly increasing computational demands in order to process data and offer services to users. The nature of these applications implies the usage of data centers. Research has paid much attention to the energy consumption of the sensor nodes in WSN infrastructures. However, supercomputing facilities are the ones presenting a higher economic and environmental impact due to their very high power consumption. The latter problem, however, has been disregarded in the field of smart environment services. This paper proposes an energy-minimization workload assignment technique, based on heterogeneity and application-awareness, that redistributes low-demand computational tasks from high-performance facilities to idle nodes with low and medium resources in the WSN infrastructure. These non-optimal allocation policies reduce the energy consumed by the whole infrastructure and the total execution time. PMID:23112621
The center for causal discovery of biomedical knowledge from big data
Bahar, Ivet; Becich, Michael J; Benos, Panayiotis V; Berg, Jeremy; Espino, Jeremy U; Glymour, Clark; Jacobson, Rebecca Crowley; Kienholz, Michelle; Lee, Adrian V; Lu, Xinghua; Scheines, Richard
2015-01-01
The Big Data to Knowledge (BD2K) Center for Causal Discovery is developing and disseminating an integrated set of open source tools that support causal modeling and discovery of biomedical knowledge from large and complex biomedical datasets. The Center integrates teams of biomedical and data scientists focused on the refinement of existing and the development of new constraint-based and Bayesian algorithms based on causal Bayesian networks, the optimization of software for efficient operation in a supercomputing environment, and the testing of algorithms and software developed using real data from 3 representative driving biomedical projects: cancer driver mutations, lung disease, and the functional connectome of the human brain. Associated training activities provide both biomedical and data scientists with the knowledge and skills needed to apply and extend these tools. Collaborative activities with the BD2K Consortium further advance causal discovery tools and integrate tools and resources developed by other centers. PMID:26138794
Computational chemistry research
NASA Technical Reports Server (NTRS)
Levin, Eugene
1987-01-01
Task 41 is composed of two parts: (1) analysis and design studies related to the Numerical Aerodynamic Simulation (NAS) Extended Operating Configuration (EOC) and (2) computational chemistry. During the first half of 1987, Dr. Levin served as a member of an advanced system planning team to establish the requirements, goals, and principal technical characteristics of the NAS EOC. A paper entitled 'Scaling of Data Communications for an Advanced Supercomputer Network' is included. The high temperature transport properties (such as viscosity, thermal conductivity, etc.) of the major constituents of air (oxygen and nitrogen) were correctly determined. The results of prior ab initio computer solutions of the Schroedinger equation were combined with the best available experimental data to obtain complete interaction potentials for both neutral and ion-atom collision partners. These potentials were then used in a computer program to evaluate the collision cross-sections from which the transport properties could be determined. A paper entitled 'High Temperature Transport Properties of Air' is included.
Leveraging the national cyberinfrastructure for biomedical research
LeDuc, Richard; Vaughn, Matthew; Fonner, John M; Sullivan, Michael; Williams, James G; Blood, Philip D; Taylor, James; Barnett, William
2014-01-01
In the USA, the national cyberinfrastructure refers to a system of research supercomputer and other IT facilities and the high speed networks that connect them. These resources have been heavily leveraged by scientists in disciplines such as high energy physics, astronomy, and climatology, but until recently they have been little used by biomedical researchers. We suggest that many of the ‘Big Data’ challenges facing the medical informatics community can be efficiently handled using national-scale cyberinfrastructure. Resources such as the Extreme Science and Engineering Discovery Environment, the Open Science Grid, and Internet2 provide economical and proven infrastructures for Big Data challenges, but these resources can be difficult to approach. Specialized web portals, support centers, and virtual organizations can be constructed on these resources to meet defined computational challenges, specifically for genomics. We provide examples of how this has been done in basic biology as an illustration for the biomedical informatics community. PMID:23964072
Machine Learning Toolkit for Extreme Scale
DOE Office of Scientific and Technical Information (OSTI.GOV)
2014-03-31
Support Vector Machines (SVMs) are a popular machine learning technique that has been applied to a wide range of domains such as science, finance, and social networks for supervised learning. MaTEx undertakes the challenge of designing a scalable parallel SVM training algorithm for large scale systems, which include commodity multi-core machines, tightly connected supercomputers and cloud computing systems. Several techniques are proposed for improved speed and memory space usage, including adaptive and aggressive elimination of samples for faster convergence, and sparse format representation of data samples. Several heuristics, ranging from earliest-possible to lazy elimination of non-contributing samples, are considered in MaTEx. In many cases, where an early sample elimination might result in a false positive, low overhead mechanisms for reconstruction of key data structures are proposed. The proposed algorithm and heuristics are implemented and evaluated on various publicly available datasets.
Roadmap of optical communications
NASA Astrophysics Data System (ADS)
Agrell, Erik; Karlsson, Magnus; Chraplyvy, A. R.; Richardson, David J.; Krummrich, Peter M.; Winzer, Peter; Roberts, Kim; Fischer, Johannes Karl; Savory, Seb J.; Eggleton, Benjamin J.; Secondini, Marco; Kschischang, Frank R.; Lord, Andrew; Prat, Josep; Tomkos, Ioannis; Bowers, John E.; Srinivasan, Sudha; Brandt-Pearce, Maïté; Gisin, Nicolas
2016-06-01
Lightwave communications is a necessity for the information age. Optical links provide enormous bandwidth, and the optical fiber is the only medium that can meet the modern society's needs for transporting massive amounts of data over long distances. Applications range from global high-capacity networks, which constitute the backbone of the internet, to the massively parallel interconnects that provide data connectivity inside datacenters and supercomputers. Optical communications is a diverse and rapidly changing field, where experts in photonics, communications, electronics, and signal processing work side by side to meet the ever-increasing demands for higher capacity, lower cost, and lower energy consumption, while adapting the system design to novel services and technologies. Due to the interdisciplinary nature of this rich research field, Journal of Optics has invited 16 researchers, each a world-leading expert in their respective subfields, to contribute a section to this invited review article, summarizing their views on state-of-the-art and future developments in optical communications.
Opportunities for leveraging OS virtualization in high-end supercomputing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bridges, Patrick G.; Pedretti, Kevin Thomas Tauke
2010-11-01
This paper examines potential motivations for incorporating virtualization support in the system software stacks of high-end capability supercomputers. We advocate that this will increase the flexibility of these platforms significantly and enable new capabilities that are not possible with current fixed software stacks. Our results indicate that compute, virtual memory, and I/O virtualization overheads are low and can be further mitigated by utilizing well-known techniques such as large paging and VMM bypass. Furthermore, since the addition of virtualization support does not affect the performance of applications using the traditional native environment, there is essentially no disadvantage to its addition.
Building black holes: supercomputer cinema.
Shapiro, S L; Teukolsky, S A
1988-07-22
A new computer code can solve Einstein's equations of general relativity for the dynamical evolution of a relativistic star cluster. The cluster may contain a large number of stars that move in a strong gravitational field at speeds approaching the speed of light. Unstable star clusters undergo catastrophic collapse to black holes. The collapse of an unstable cluster to a supermassive black hole at the center of a galaxy may explain the origin of quasars and active galactic nuclei. By means of a supercomputer simulation and color graphics, the whole process can be viewed in real time on a movie screen.
Supercomputer analysis of purine and pyrimidine metabolism leading to DNA synthesis.
Heinmets, F
1989-06-01
A model-system is established to analyze purine and pyrimidine metabolism leading to DNA synthesis. The principal aim is to explore the flow and regulation of terminal deoxynucleoside triphosphates (dNTPs) in various input and parametric conditions. A series of flow equations are established, which are subsequently converted to differential equations. These are programmed (Fortran) and analyzed on a Cray X-MP/48 supercomputer. The pool concentrations are presented as a function of time in conditions in which various pertinent parameters of the system are modified. The system is formulated by 100 differential equations.
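To illustrate the step from flow equations to differential equations, the sketch below integrates a hypothetical two-pool system with forward Euler. The pool names, rate constants, and step size are assumptions chosen for illustration; they are not taken from the 100-equation Fortran model described in the abstract.

```python
def derivatives(pools, k_forward, k_back):
    """Rates of change for a hypothetical (precursor, dNTP) pair of pools."""
    precursor, dntp = pools
    # Net flow from the precursor pool into the dNTP pool.
    flow = k_forward * precursor - k_back * dntp
    return (-flow, flow)


def integrate(pools, k_forward, k_back, dt, steps):
    """Advance the pools with forward Euler; return the full concentration history."""
    history = [pools]
    for _ in range(steps):
        rates = derivatives(pools, k_forward, k_back)
        pools = tuple(p + dt * r for p, r in zip(pools, rates))
        history.append(pools)
    return history


# Pool concentrations as a function of time for one made-up parameter set.
trajectory = integrate((1.0, 0.0), k_forward=0.5, k_back=0.1, dt=0.01, steps=2000)
print(trajectory[-1])
```

Re-running `integrate` with different rate constants mirrors, in miniature, the parametric studies the abstract describes. A production model would normally use a stiff-capable ODE solver rather than forward Euler.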
Performance of the Widely-Used CFD Code OVERFLOW on the Pleiades Supercomputer
NASA Technical Reports Server (NTRS)
Guruswamy, Guru P.
2017-01-01
Computational performance studies were made for NASA's widely used Computational Fluid Dynamics code OVERFLOW on the Pleiades Supercomputer. Two test cases were considered: a full launch vehicle with a grid of 286 million points and a full rotorcraft model with a grid of 614 million points. Computations using up to 8000 cores were run on Sandy Bridge and Ivy Bridge nodes. Performance was monitored using times reported in the day files from the Portable Batch System utility. Results for two grid topologies are presented and compared in detail. Observations and suggestions for future work are made.
NASA Astrophysics Data System (ADS)
Hu, X.; Zou, Z.
2017-12-01
For the next decades, a comprehensive big data application environment is the dominant direction of cyberinfrastructure development in space science. To make the concept of such a BIG cyberinfrastructure (e.g. Digital Space) a reality, several capabilities should be developed and integrated: the science data system, the digital space engine, big data applications (tools and models), and the IT infrastructure. In the past few years, the CAS Chinese Space Science Data Center (CSSDC) has made a useful attempt in this direction. A cloud-enabled virtual research platform for space science, called the Solar-Terrestrial and Astronomical Research Network (STAR-Network), has been developed to serve the full lifecycle of space science missions and research activities. It integrates a wide range of disciplinary and interdisciplinary resources to provide science-problem-oriented data retrieval and query services, collaborative mission demonstration services, mission operation support services, space weather computing and analysis services, and other self-help services. The platform is supported by persistent infrastructure, including cloud storage, cloud computing, and supercomputing. Different kinds of resources are interconnected: for example, science data can be displayed in the browser by visualization tools, data analysis tools and physical models can be driven by the applicable science data, and computing results can be saved to the cloud. So far, STAR-Network has served a series of space science missions in China, including the Strategic Pioneer Program on Space Science (which has invested in space science satellites such as DAMPE, HXMT, and QUESS, with more satellites to be launched around 2020) and the Meridian Space Weather Monitor Project. Scientists have obtained new findings by using the science data from these missions with STAR-Network's contribution.
We are confident that STAR-Network is an exciting example of a new cyberinfrastructure architecture for space science.
HPCC and the National Information Infrastructure: an overview.
Lindberg, D A
1995-01-01
The National Information Infrastructure (NII) or "information superhighway" is a high-priority federal initiative to combine communications networks, computers, databases, and consumer electronics to deliver information services to all U.S. citizens. The NII will be used to improve government and social services while cutting administrative costs. Operated by the private sector, the NII will rely on advanced technologies developed under the direction of the federal High Performance Computing and Communications (HPCC) Program. These include computing systems capable of performing trillions of operations (teraops) per second and networks capable of transmitting billions of bits (gigabits) per second. Among other activities, the HPCC Program supports the national supercomputer research centers, the federal portion of the Internet, and the development of interface software, such as Mosaic, that facilitates access to network information services. Health care has been identified as a critical demonstration area for HPCC technology and an important application area for the NII. As an HPCC participant, the National Library of Medicine (NLM) assists hospitals and medical centers to connect to the Internet through projects directed by the Regional Medical Libraries and through an Internet Connections Program cosponsored by the National Science Foundation. In addition to using the Internet to provide enhanced access to its own information services, NLM sponsors health-related applications of HPCC technology. Examples include the "Visible Human" project and recently awarded contracts for test-bed networks to share patient data and medical images, telemedicine projects to provide consultation and medical care to patients in rural areas, and advanced computer simulations of human anatomy for training in "virtual surgery." PMID:7703935
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, Maxine D.; Leigh, Jason
2014-02-17
The Blaze high-performance visual computing system serves the high-performance computing research and education needs of University of Illinois at Chicago (UIC). Blaze consists of a state-of-the-art, networked, computer cluster and ultra-high-resolution visualization system called CAVE2(TM) that is currently not available anywhere in Illinois. This system is connected via a high-speed 100-Gigabit network to the State of Illinois' I-WIRE optical network, as well as to national and international high speed networks, such as the Internet2, and the Global Lambda Integrated Facility. This enables Blaze to serve as an on-ramp to national cyberinfrastructure, such as the National Science Foundation’s Blue Waters petascale computer at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign and the Department of Energy’s Argonne Leadership Computing Facility (ALCF) at Argonne National Laboratory. DOE award # DE-SC005067, leveraged with NSF award #CNS-0959053 for “Development of the Next-Generation CAVE Virtual Environment (NG-CAVE),” enabled us to create a first-of-its-kind high-performance visual computing system. The UIC Electronic Visualization Laboratory (EVL) worked with two U.S. companies to advance their commercial products and maintain U.S. leadership in the global information technology economy. New applications are being enabled with the CAVE2/Blaze visual computing system that is advancing scientific research and education in the U.S. and globally, and helps train the next-generation workforce.
Spatiotemporal modeling of node temperatures in supercomputers
Storlie, Curtis Byron; Reich, Brian James; Rust, William Newton; ...
2016-06-10
Los Alamos National Laboratory (LANL) is home to many large supercomputing clusters. These clusters require an enormous amount of power (~500-2000 kW each), and most of this energy is converted into heat. Thus, cooling the components of the supercomputer becomes a critical and expensive endeavor. Recently a project was initiated to investigate the effect that changes to the cooling system in a machine room had on three large machines that were housed there. Coupled with this goal was the aim to develop a general good practice for characterizing the effect of cooling changes and monitoring machine node temperatures in this and other machine rooms. This paper focuses on the statistical approach used to quantify the effect that several cooling changes to the room had on the temperatures of the individual nodes of the computers. The largest cluster in the room has 1,600 nodes that run a variety of jobs during general use. Since extreme temperatures are important, a Normal distribution plus generalized Pareto distribution for the upper tail is used to model the marginal distribution, along with a Gaussian process copula to account for spatio-temporal dependence. A Gaussian Markov random field (GMRF) model is used to model the spatial effects on the node temperatures as the cooling changes take place. This model is then used to assess the condition of the node temperatures after each change to the room. The analysis approach was used to uncover the cause of a problematic episode of overheating nodes on one of the supercomputing clusters. Lastly, this same approach can easily be applied to monitor and investigate cooling systems at other data centers, as well.
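The abstract names a Normal distribution with a generalized Pareto upper tail as the marginal model but gives no parameterization. The sketch below shows one common way to splice the two pieces into a single cumulative distribution function. The threshold, the parameter names, and the splicing rule are assumptions for illustration and may differ from the paper's formulation.

```python
import math


def normal_cdf(x, mean, sd):
    """CDF of a Normal(mean, sd) distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def gpd_cdf(excess, scale, shape):
    """CDF of the generalized Pareto distribution for an excess over the threshold."""
    if excess <= 0.0:
        return 0.0
    if abs(shape) < 1e-12:
        return 1.0 - math.exp(-excess / scale)  # exponential limit as shape -> 0
    base = 1.0 + shape * excess / scale
    if base <= 0.0:
        return 1.0  # beyond the finite upper endpoint when shape < 0
    return 1.0 - base ** (-1.0 / shape)


def spliced_cdf(x, mean, sd, threshold, scale, shape):
    """Normal body below the threshold, generalized Pareto tail above it."""
    if x <= threshold:
        return normal_cdf(x, mean, sd)
    p_u = normal_cdf(threshold, mean, sd)  # probability mass carried by the body
    return p_u + (1.0 - p_u) * gpd_cdf(x - threshold, scale, shape)


# Hypothetical node temperatures in degrees Celsius.
print(spliced_cdf(75.0, mean=60.0, sd=5.0, threshold=70.0, scale=3.0, shape=0.1))
```

Scaling the tail by the mass remaining above the threshold keeps the function continuous at the splice point and ensures it still rises to 1.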
Integration of PanDA workload management system with Titan supercomputer at OLCF
NASA Astrophysics Data System (ADS)
De, K.; Klimentov, A.; Oleynik, D.; Panitkin, S.; Petrosyan, A.; Schovancova, J.; Vaniachine, A.; Wenaus, T.
2015-12-01
The PanDA (Production and Distributed Analysis) workload management system (WMS) was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. While PanDA currently distributes jobs to more than 100,000 cores at well over 100 Grid sites, the future LHC data taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with Titan supercomputer at Oak Ridge Leadership Computing Facility (OLCF). The current approach utilizes a modified PanDA pilot framework for job submission to Titan's batch queues and local data management, with light-weight MPI wrappers to run single threaded workloads in parallel on Titan's multicore worker nodes. It also gives PanDA new capability to collect, in real time, information about unused worker nodes on Titan, which allows precise definition of the size and duration of jobs submitted to Titan according to available free resources. This capability significantly reduces PanDA job wait time while improving Titan's utilization efficiency. This implementation was tested with a variety of Monte-Carlo workloads on Titan and is being tested on several other supercomputing platforms. Notice: This manuscript has been authored, by employees of Brookhaven Science Associates, LLC under Contract No. DE-AC02-98CH10886 with the U.S. Department of Energy. The publisher by accepting the manuscript for publication acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes.
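The abstract describes sizing jobs according to the worker nodes that are currently unused, without giving the rule that is applied. The sketch below shows one simple way such a rule could look. The function name, the limits, the safety margin, and the default of 16 cores per node are all assumptions for illustration, not the PanDA pilot's actual logic or interface.

```python
def size_backfill_job(free_nodes, free_minutes, cores_per_node=16,
                      max_nodes=300, min_minutes=30, safety_minutes=5):
    """Fit a job to idle resources.

    Returns (nodes, walltime_minutes, ranks), or None if the idle slot is too
    small to be worth using. All limits are hypothetical defaults.
    """
    # Leave a margin so the job ends before the idle window closes.
    walltime = free_minutes - safety_minutes
    nodes = min(free_nodes, max_nodes)
    if nodes <= 0 or walltime < min_minutes:
        return None
    # One single-threaded payload per core, launched through an MPI wrapper.
    return nodes, walltime, nodes * cores_per_node


# A made-up snapshot: 500 idle nodes for the next two hours.
print(size_backfill_job(free_nodes=500, free_minutes=120))
```

Requesting no more than what is idle is what lets such a job start immediately instead of waiting in the batch queue.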
High End Computer Network Testbedding at NASA Goddard Space Flight Center
NASA Technical Reports Server (NTRS)
Gary, James Patrick
1998-01-01
The Earth & Space Data Computing (ESDC) Division at the Goddard Space Flight Center is involved in developing and demonstrating various high end computer networking capabilities. The ESDC has several high end supercomputers. These are used to: (1) run computer simulations of the climate system; (2) support the Earth and Space Sciences (ESS) project; and (3) support the Grand Challenge (GC) Science, which is aimed at understanding turbulent convection and dynamos in stars. GC research occurs at many sites throughout the country, and this research is enabled, in part, by multiple high performance network interconnections. The application drivers for High End Computer Networking use distributed supercomputing to support virtual reality applications, such as TerraVision (a three-dimensional browser of remotely accessed data) and Cave Automatic Virtual Environments (CAVE). Workstations can access and display data from multiple CAVEs with video servers, which allows for group/project collaborations using a combination of video, data, voice and shared white boarding. The ESDC is also developing and demonstrating a high degree of interoperability between satellite and terrestrial-based networks. To this end, the ESDC is conducting research and evaluations of new computer networking protocols and related technologies which improve the interoperability of satellite and terrestrial networks. The ESDC is also involved in the Security Proof of Concept Keystone (SPOCK) program sponsored by the National Security Agency (NSA). The SPOCK activity provides a forum for government users and security technology providers to share information on security requirements, emerging technologies and new product developments.
Also, the ESDC is involved in the Trans-Pacific Digital Library Experiment, which aims to demonstrate and evaluate the use of high performance satellite communications and advanced data communications protocols to enable interactive digital library data access between the U.S. Library of Congress, the National Library of Japan and other digital library sites at 155 MegaBytes Per Second. The ESDC's participation in this program is the Trans-Pacific access to GLOBE visualizations in real time. The ESDC is also participating in the Department of Defense's ATDNet with the Multiwavelength Optical Network (MONET), a fully switched Wavelength Division Networking testbed. This presentation is in viewgraph format.
The impact of supercomputers on experimentation: A view from a national laboratory
NASA Technical Reports Server (NTRS)
Peterson, V. L.; Arnold, J. O.
1985-01-01
The relative roles of large scale scientific computers and physical experiments in several science and engineering disciplines are discussed. Increasing dependence on computers is shown to be motivated both by the rapid growth in computer speed and memory, which permits accurate numerical simulation of complex physical phenomena, and by the rapid reduction in the cost of performing a calculation, which makes computation an increasingly attractive complement to experimentation. Computer speed and memory requirements are presented for selected areas of such disciplines as fluid dynamics, aerodynamics, aerothermodynamics, chemistry, atmospheric sciences, astronomy, and astrophysics, together with some examples of the complementary nature of computation and experiment. Finally, the impact of the emerging role of computers in the technical disciplines is discussed in terms of both the requirements for experimentation and the attainment of previously inaccessible information on physical processes.
Enabling opportunistic resources for CMS Computing Operations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hufnagel, Dirk
With the increased pressure on computing brought by the higher energy and luminosity from the LHC in Run 2, CMS Computing Operations expects to require the ability to utilize opportunistic resources (resources not owned by, or a priori configured for, CMS) to meet peak demands. In addition to our dedicated resources we look to add computing resources from non-CMS grids, cloud resources, and national supercomputing centers. CMS uses the HTCondor/glideinWMS job submission infrastructure for all its batch processing, so such resources will need to be transparently integrated into its glideinWMS pool. Bosco and parrot wrappers are used to enable access and bring the CMS environment into these non-CMS resources. Finally, we describe our strategy to supplement our native capabilities with opportunistic resources and our experience so far using them.
Enabling opportunistic resources for CMS Computing Operations
Hufnagel, Dirk
2015-12-23
With the increased pressure on computing brought by the higher energy and luminosity from the LHC in Run 2, CMS Computing Operations expects to require the ability to utilize opportunistic resources (resources not owned by, or a priori configured for, CMS) to meet peak demands. In addition to our dedicated resources we look to add computing resources from non-CMS grids, cloud resources, and national supercomputing centers. CMS uses the HTCondor/glideinWMS job submission infrastructure for all its batch processing, so such resources will need to be transparently integrated into its glideinWMS pool. Bosco and parrot wrappers are used to enable access and bring the CMS environment into these non-CMS resources. Finally, we describe our strategy to supplement our native capabilities with opportunistic resources and our experience so far using them.
Monitoring Object Library Usage and Changes
NASA Technical Reports Server (NTRS)
Owen, R. K.; Craw, James M. (Technical Monitor)
1995-01-01
The NASA Ames Numerical Aerodynamic Simulation program Aeronautics Consolidated Supercomputing Facility (NAS/ACSF) supercomputing center services over 1600 users, and has numerous analysts with root access. Several tools have been developed to monitor object library usage and changes. Some of the tools do "noninvasive" monitoring and other tools implement run-time logging even for object-only libraries. The run-time logging identifies who, when, and what is being used. The benefits are that real usage can be measured, unused libraries can be discontinued, and training and optimization efforts can be focused on those numerical methods that are actually used. An overview of the tools will be given and the results will be discussed.
Watson will see you now: a supercomputer to help clinicians make informed treatment decisions.
Doyle-Lindrud, Susan
2015-02-01
IBM has collaborated with several cancer care providers to develop and train the IBM supercomputer Watson to help clinicians make informed treatment decisions. When a patient is seen in clinic, the oncologist can input all of the clinical information into the computer system. Watson will then review all of the data and recommend treatment options based on the latest evidence and guidelines. Once the oncologist makes the treatment decision, this information can be sent directly to the insurance company for approval. Watson has the ability to standardize care and accelerate the approval process, a benefit to the healthcare provider and the patient.
The transition of a real-time single-rotor helicopter simulation program to a supercomputer
NASA Technical Reports Server (NTRS)
Martinez, Debbie
1995-01-01
This report presents the conversion effort and results of a real-time flight simulation application transition to a CONVEX supercomputer. Enclosed is a detailed description of the conversion process and a brief description of the Langley Research Center's (LaRC) flight simulation application program structure. Currently, this simulation program may be configured to represent a Sikorsky S-61 helicopter (a five-blade, single-rotor, commercial passenger-type helicopter) or an Army Cobra helicopter (either the AH-1G or AH-1S model). This report refers to the Sikorsky S-61 simulation program since it is the most frequently used configuration.
Ellingson, Sally R; Dakshanamurthy, Sivanesan; Brown, Milton; Smith, Jeremy C; Baudry, Jerome
2014-04-25
In this paper we give the current state of high-throughput virtual screening. We describe a case study of using a task-parallel MPI (Message Passing Interface) version of Autodock4 [1], [2] to run a virtual high-throughput screen of one-million compounds on the Jaguar Cray XK6 Supercomputer at Oak Ridge National Laboratory. We include a description of scripts developed to increase the efficiency of the predocking file preparation and postdocking analysis. A detailed tutorial, scripts, and source code for this MPI version of Autodock4 are available online at http://www.bio.utk.edu/baudrylab/autodockmpi.htm.
Sequence search on a supercomputer.
Gotoh, O; Tagashira, Y
1986-01-10
A set of programs was developed for searching nucleic acid and protein sequence data bases for sequences similar to a given sequence. The programs, written in FORTRAN 77, were optimized for vector processing on a Hitachi S810-20 supercomputer. A search of a 500-residue protein sequence against the entire PIR data base Ver. 1.0 (1) (0.5 M residues) is carried out in a CPU time of 45 sec. About 4 min is required for an exhaustive search of a 1500-base nucleotide sequence against all mammalian sequences (1.2M bases) in Genbank Ver. 29.0. The CPU time is reduced to about a quarter with a faster version.
Science & Technology Review November 2006
DOE Office of Scientific and Technical Information (OSTI.GOV)
Radousky, H
This month's issue has the following articles: (1) Expanded Supercomputing Maximizes Scientific Discovery--Commentary by Dona Crawford; (2) Thunder's Power Delivers Breakthrough Science--Livermore's Thunder supercomputer allows researchers to model systems at scales never before possible; (3) Extracting Key Content from Images--A new system called the Image Content Engine is helping analysts find significant but hard-to-recognize details in overhead images; (4) Got Oxygen?--Oxygen, especially oxygen metabolism, was key to evolution, and a Livermore project helps find out why; (5) A Shocking New Form of Laserlike Light--According to research at Livermore, smashing a crystal with a shock wave can result in coherent light.
Optimal Full Information Synthesis for Flexible Structures Implemented on Cray Supercomputers
NASA Technical Reports Server (NTRS)
Lind, Rick; Balas, Gary J.
1995-01-01
This paper considers an algorithm for synthesis of optimal controllers for full information feedback. The synthesis procedure reduces to a single linear matrix inequality which may be solved via established convex optimization algorithms. The computational cost of the optimization is investigated. It is demonstrated that the problem dimension and corresponding matrices can become large for practical engineering problems. For large-order systems, the algorithm is impractical on standard workstations. A flexible structure is presented as a design example. Control synthesis requires several days on a workstation but may be solved in a reasonable amount of time using a Cray supercomputer.
Transferring ecosystem simulation codes to supercomputers
NASA Technical Reports Server (NTRS)
Skiles, J. W.; Schulbach, C. H.
1995-01-01
Many ecosystem simulation computer codes have been developed in the last twenty-five years. This development took place initially on main-frame computers, then mini-computers, and more recently, on micro-computers and workstations. Supercomputing platforms (both parallel and distributed systems) have been largely unused, however, because of the perceived difficulty in accessing and using the machines. Also, significant differences in the system architectures of sequential, scalar computers and parallel and/or vector supercomputers must be considered. We have transferred a grassland simulation model (developed on a VAX) to a Cray Y-MP/C90. We describe porting the model to the Cray and the changes we made to exploit the parallelism in the application and improve code execution. The Cray executed the model 30 times faster than the VAX and 10 times faster than a Unix workstation. We achieved an additional speedup of 30 percent by using the compiler's vectoring and 'in-line' capabilities. The code runs at only about 5 percent of the Cray's peak speed because it ineffectively uses the vector and parallel processing capabilities of the Cray. We expect that by restructuring the code, it could execute an additional six to ten times faster.
Federal Market Information Technology in the Post Flash Crash Era: Roles for Supercomputing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bethel, E. Wes; Leinweber, David; Ruebel, Oliver
2011-09-16
This paper describes collaborative work between active traders, regulators, economists, and supercomputing researchers to replicate and extend investigations of the Flash Crash and other market anomalies in a National Laboratory HPC environment. Our work suggests that supercomputing tools and methods will be valuable to market regulators in achieving the goal of market safety, stability, and security. Research results using high frequency data and analytics are described, and directions for future development are discussed. Currently the key mechanism for preventing catastrophic market action is “circuit breakers.” We believe a more graduated approach, similar to the “yellow light” approach in motorsports to slow down traffic, might be a better way to achieve the same goal. To enable this objective, we study a number of indicators that could foresee hazards in market conditions and explore options to confirm such predictions. Our tests confirm that Volume Synchronized Probability of Informed Trading (VPIN) and a version of volume Herfindahl-Hirschman Index (HHI) for measuring market fragmentation can indeed give strong signals ahead of the Flash Crash event on May 6, 2010. This is a preliminary step toward a full-fledged early-warning system for unusual market conditions.
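The abstract above does not say which variant of the volume HHI the authors used. As a minimal sketch, assuming the textbook definition (the sum of squared shares) applied to hypothetical per-venue traded volumes, the index can be computed as follows. The function name and the sample volumes are illustrative only and do not come from the paper.

```python
def herfindahl_hirschman_index(volumes):
    """Textbook HHI: sum of squared shares, here shares of traded volume.

    Ranges from 1/n (volume spread evenly over n venues, i.e. fragmented)
    up to 1.0 (all volume on a single venue, i.e. concentrated).
    """
    total = sum(volumes)
    if total <= 0:
        raise ValueError("total volume must be positive")
    return sum((v / total) ** 2 for v in volumes)


# Hypothetical per-venue volumes, for illustration only.
concentrated = herfindahl_hirschman_index([900, 50, 50])
fragmented = herfindahl_hirschman_index([250, 250, 250, 250])
print(f"concentrated={concentrated:.3f} fragmented={fragmented:.3f}")
```

A lower value indicates that volume is spread across more venues. How such a value would be windowed over time or thresholded to produce an early-warning signal is not described in the abstract and is left open here.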
Compute Server Performance Results
NASA Technical Reports Server (NTRS)
Stockdale, I. E.; Barton, John; Woodrow, Thomas (Technical Monitor)
1994-01-01
Parallel-vector supercomputers have been the workhorses of high performance computing. As expectations of future computing needs have risen faster than projected vector supercomputer performance, much work has been done investigating the feasibility of using Massively Parallel Processor systems as supercomputers. An even more recent development is the availability of high performance workstations which have the potential, when clustered together, to replace parallel-vector systems. We present a systematic comparison of floating point performance and price-performance for various compute server systems. A suite of highly vectorized programs was run on systems including traditional vector systems such as the Cray C90, and RISC workstations such as the IBM RS/6000 590 and the SGI R8000. The C90 system delivers 460 million floating point operations per second (FLOPS), the highest single processor rate of any vendor. However, if the price-performance ratio (PPR) is considered to be most important, then the IBM and SGI processors are superior to the C90 processors. Even without code tuning, the IBM and SGI PPRs of 260 and 220 FLOPS per dollar exceed the C90 PPR of 160 FLOPS per dollar when running our highly vectorized suite,
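The price-performance comparison quoted in the abstract above can be restated as a short calculation. This is only a sketch: it uses the three FLOPS-per-dollar figures from the abstract and nothing else, and the helper names are illustrative.

```python
# FLOPS per dollar on the vectorized suite, as quoted in the abstract.
ppr_flops_per_dollar = {
    "Cray C90": 160.0,
    "IBM RS/6000 590": 260.0,
    "SGI R8000": 220.0,
}


def rank_by_ppr(ppr):
    """Return system names ordered from best to worst price-performance."""
    return sorted(ppr, key=ppr.get, reverse=True)


def advantage_over(ppr, baseline):
    """Return each system's PPR as a multiple of the baseline system's PPR."""
    base = ppr[baseline]
    return {name: value / base for name, value in ppr.items()}


ranking = rank_by_ppr(ppr_flops_per_dollar)
relative = advantage_over(ppr_flops_per_dollar, "Cray C90")
for name in ranking:
    print(f"{name}: {relative[name]:.3f}x the C90 price-performance")
```

On these figures the IBM and SGI processors deliver roughly 1.6 and 1.4 times the C90's floating point work per dollar, even though the C90 has the highest single-processor rate.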
1993 Gordon Bell Prize Winners
NASA Technical Reports Server (NTRS)
Karp, Alan H.; Simon, Horst; Heller, Don; Cooper, D. M. (Technical Monitor)
1994-01-01
The Gordon Bell Prize recognizes significant achievements in the application of supercomputers to scientific and engineering problems. In 1993, finalists were named for work in three categories: (1) Performance, which recognizes those who solved a real problem in the quickest elapsed time. (2) Price/performance, which encourages the development of cost-effective supercomputing. (3) Compiler-generated speedup, which measures how well compiler writers are facilitating the programming of parallel processors. The winners were announced November 17 at the Supercomputing 93 conference in Portland, Oregon. Gordon Bell, an independent consultant in Los Altos, California, is sponsoring $2,000 in prizes each year for 10 years to promote practical parallel processing research. This is the sixth year of the prize, which Computer administers. Something unprecedented in Gordon Bell Prize competition occurred this year: A computer manufacturer was singled out for recognition. Nine entries reporting results obtained on the Cray C90 were received, seven of the submissions orchestrated by Cray Research. Although none of these entries showed sufficiently high performance to win outright, the judges were impressed by the breadth of applications that ran well on this machine, all nine running at more than a third of the peak performance of the machine.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moniz, Ernest; Carr, Alan; Bethe, Hans
The Trinity Test of July 16, 1945 was the first full-scale, real-world test of a nuclear weapon; with the new Trinity supercomputer Los Alamos National Laboratory's goal is to do this virtually, in 3D. Trinity was the culmination of a fantastic effort of groundbreaking science and engineering by hundreds of men and women at Los Alamos and other Manhattan Project sites. It took them less than two years to change the world. The Laboratory is marking the 70th anniversary of the Trinity Test because it not only ushered in the Nuclear Age, but with it the origin of today’s advanced supercomputing. We live in the Age of Supercomputers due in large part to nuclear weapons science here at Los Alamos. National security science, and nuclear weapons science in particular, at Los Alamos National Laboratory have provided a key motivation for the evolution of large-scale scientific computing. Beginning with the Manhattan Project there has been a constant stream of increasingly significant, complex problems in nuclear weapons science whose timely solutions demand larger and faster computers. The relationship between national security science at Los Alamos and the evolution of computing is one of interdependence.
Improving Memory Error Handling Using Linux
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carlton, Michael Andrew; Blanchard, Sean P.; Debardeleben, Nathan A.
As supercomputers continue to get faster and more powerful in the future, they will also have more nodes. If nothing is done, then the amount of memory in supercomputer clusters will soon grow large enough that memory failures will be unmanageable to deal with by manually replacing memory DIMMs. "Improving Memory Error Handling Using Linux" is a process-oriented method to solve this problem by using the Linux kernel to disable (offline) faulty memory pages containing bad addresses, preventing them from being used again by a process. The process of offlining memory pages simplifies error handling and results in reducing both hardware and manpower costs required to run Los Alamos National Laboratory (LANL) clusters. This process will be necessary for the future of supercomputing to allow the development of exascale computers. It will not be feasible without memory error handling to manually replace the number of DIMMs that will fail daily on a machine consisting of 32-128 petabytes of memory. Testing reveals the process of offlining memory pages works and is relatively simple to use. As more and more testing is conducted, the entire process will be automated within the high-performance computing (HPC) monitoring software, Zenoss, at LANL.
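The abstract above does not name the kernel interface used for offlining. As a hedged sketch, one way to request that Linux retire a page is the sysfs soft-offline file; it is an assumption that the authors used this mechanism. Also assumed here are a 4 KiB page size, a kernel built with memory-failure support, and root privileges. The helper names are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages; query the real value on the target system
SOFT_OFFLINE_PATH = "/sys/devices/system/memory/soft_offline_page"


def page_base(phys_addr, page_size=PAGE_SIZE):
    """Round a physical address down to the start of the page containing it."""
    return phys_addr & ~(page_size - 1)


def soft_offline_request(phys_addr):
    """Return (sysfs path, value to write) for retiring the page at phys_addr."""
    return SOFT_OFFLINE_PATH, hex(page_base(phys_addr))


def soft_offline(phys_addr):
    """Ask the kernel to migrate data off the page and stop using it.

    Requires root and kernel support; raises OSError otherwise.
    """
    path, value = soft_offline_request(phys_addr)
    with open(path, "w") as sysfs_file:
        sysfs_file.write(value)


# Dry run: show what would be written for a hypothetical faulty address.
print(soft_offline_request(0x12345678))
```

Only the dry-run helper is exercised here; calling `soft_offline` on a production node permanently removes that page from use until reboot, so the faulty address should come from a trusted error log.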
NASA Astrophysics Data System (ADS)
Schaaf, Kjeld; Overeem, Ruud
2004-06-01
Moore’s law is best exploited by using consumer market hardware. In particular, the gaming industry pushes the limit of processor performance thus reducing the cost per raw flop even faster than Moore’s law predicts. Next to the cost benefits of Commercial-Off-The-Shelf (COTS) processing resources, there is a rapidly growing experience pool in cluster based processing. The typical Beowulf cluster-of-PCs supercomputers are well known. Multiple examples exist of specialised cluster computers based on more advanced server nodes or even gaming stations. All these cluster machines build upon the same knowledge about cluster software management, scheduling, middleware libraries and mathematical libraries. In this study, we have integrated COTS processing resources and cluster nodes into a very high performance processing platform suitable for streaming data applications, in particular to implement a correlator. The required processing power for the correlator in modern radio telescopes is in the range of the larger supercomputers, which motivates the usage of supercomputer technology. Raw processing power is provided by graphical processors and is combined with an Infiniband host bus adapter with integrated data stream handling logic. With this processing platform a scalable correlator can be built with continuously growing processing power at consumer market prices.
Moniz, Ernest; Carr, Alan; Bethe, Hans; Morrison, Phillip; Ramsay, Norman; Teller, Edward; Brixner, Berlyn; Archer, Bill; Agnew, Harold; Morrison, John
2018-01-16
The Trinity Test of July 16, 1945 was the first full-scale, real-world test of a nuclear weapon; with the new Trinity supercomputer Los Alamos National Laboratory's goal is to do this virtually, in 3D. Trinity was the culmination of a fantastic effort of groundbreaking science and engineering by hundreds of men and women at Los Alamos and other Manhattan Project sites. It took them less than two years to change the world. The Laboratory is marking the 70th anniversary of the Trinity Test because it not only ushered in the Nuclear Age, but with it the origin of today's advanced supercomputing. We live in the Age of Supercomputers due in large part to nuclear weapons science here at Los Alamos. National security science, and nuclear weapons science in particular, at Los Alamos National Laboratory have provided a key motivation for the evolution of large-scale scientific computing. Beginning with the Manhattan Project there has been a constant stream of increasingly significant, complex problems in nuclear weapons science whose timely solutions demand larger and faster computers. The relationship between national security science at Los Alamos and the evolution of computing is one of interdependence.
The Q continuum simulation: Harnessing the power of GPU accelerated supercomputers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Heitmann, Katrin; Frontiere, Nicholas; Sewell, Chris
2015-08-01
Modeling large-scale sky survey observations is a key driver for the continuing development of high-resolution, large-volume, cosmological simulations. We report the first results from the "Q Continuum" cosmological N-body simulation run carried out on the GPU-accelerated supercomputer Titan. The simulation encompasses a volume of $(1300\,\mathrm{Mpc})^3$ and evolves more than half a trillion particles, leading to a particle mass resolution of $m_p \simeq 1.5 \times 10^{8}\,M_\odot$. At this mass resolution, the Q Continuum run is currently the largest cosmology simulation available. It enables the construction of detailed synthetic sky catalogs, encompassing different modeling methodologies, including semi-analytic modeling and sub-halo abundance matching in a large, cosmological volume. Here we describe the simulation and outputs in detail and present first results for a range of cosmological statistics, such as mass power spectra, halo mass functions, and halo mass-concentration relations for different epochs. We also provide details on challenges connected to running a simulation on almost 90% of Titan, one of the fastest supercomputers in the world, including our usage of Titan's GPU accelerators.
An Interface for Biomedical Big Data Processing on the Tianhe-2 Supercomputer.
Yang, Xi; Wu, Chengkun; Lu, Kai; Fang, Lin; Zhang, Yong; Li, Shengkang; Guo, Guixin; Du, YunFei
2017-12-01
Big data, cloud computing, and high-performance computing (HPC) are on the verge of convergence. Cloud computing is already playing an active part in big data processing with the help of big data frameworks like Hadoop and Spark. The recent upsurge of high-performance computing in China provides extra possibilities and capacity to address the challenges associated with big data. In this paper, we propose Orion, a big data interface on the Tianhe-2 supercomputer, to enable big data applications to run on Tianhe-2 via a single command or a shell script. Orion supports multiple users, and each user can launch multiple tasks. It minimizes the effort needed to initiate big data applications on the Tianhe-2 supercomputer via automated configuration. Orion follows the "allocate-when-needed" paradigm, and it avoids the idle occupation of computational resources. We tested the utility and performance of Orion using a big genomic dataset and achieved a satisfactory performance on Tianhe-2 with very few modifications to existing applications that were implemented in Hadoop/Spark. In summary, Orion provides a practical and economical interface for big data processing on Tianhe-2.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bailey, David H.
The NAS Parallel Benchmarks (NPB) are a suite of parallel computer performance benchmarks. They were originally developed at the NASA Ames Research Center in 1991 to assess high-end parallel supercomputers. Although they are no longer used as widely as they once were for comparing high-end system performance, they continue to be studied and analyzed a great deal in the high-performance computing community. The acronym 'NAS' originally stood for the Numerical Aeronautical Simulation Program at NASA Ames. The name of this organization was subsequently changed to the Numerical Aerospace Simulation Program, and more recently to the NASA Advanced Supercomputing Center, although the acronym remains 'NAS.' The developers of the original NPB suite were David H. Bailey, Eric Barszcz, John Barton, David Browning, Russell Carter, Leo Dagum, Rod Fatoohi, Samuel Fineberg, Paul Frederickson, Thomas Lasinski, Rob Schreiber, Horst Simon, V. Venkatakrishnan and Sisira Weeratunga. The original NAS Parallel Benchmarks consisted of eight individual benchmark problems, each of which focused on some aspect of scientific computing. The principal focus was in computational aerophysics, although most of these benchmarks have much broader relevance, since in a much larger sense they are typical of many real-world scientific computing applications. The NPB suite grew out of the need for a more rational procedure to select new supercomputers for acquisition by NASA. The emergence of commercially available highly parallel computer systems in the late 1980s offered an attractive alternative to parallel vector supercomputers that had been the mainstay of high-end scientific computing. However, the introduction of highly parallel systems was accompanied by a regrettable level of hype, not only on the part of the commercial vendors but even, in some cases, by scientists using the systems.
As a result, it was difficult to discern whether the new systems offered any fundamental performance advantage over vector supercomputers, and, if so, which of the parallel offerings would be most useful in real-world scientific computation. In part to draw attention to some of the performance reporting abuses prevalent at the time, the present author wrote a humorous essay 'Twelve Ways to Fool the Masses,' which described in a light-hearted way a number of the questionable ways in which both vendor marketing people and scientists were inflating and distorting their performance results. All of this underscored the need for an objective and scientifically defensible measure to compare performance on these systems.
Quantum.Ligand.Dock: protein-ligand docking with quantum entanglement refinement on a GPU system.
Kantardjiev, Alexander A
2012-07-01
Quantum.Ligand.Dock (protein-ligand docking with quantum entanglement refinement on a graphics processing unit (GPU) system) is an original modern method for in silico prediction of protein-ligand interactions via high-performance docking code. The main flavour of our approach is a combination of fast search with a special account for overlooked physical interactions. On the one hand, we take care of self-consistency and proton equilibria mutual effects of docking partners. On the other hand, Quantum.Ligand.Dock is the only docking server offering such a subtle supplement to protein docking algorithms as quantum entanglement contributions. The motivation for development and proposition of the method to the community hinges upon two arguments: the fundamental importance of quantum entanglement contribution in molecular interaction and the realistic possibility to implement it by the availability of supercomputing power. The implementation of sophisticated quantum methods is made possible by parallelization at several bottlenecks on a GPU supercomputer. The high-performance implementation will be of use for large-scale virtual screening projects, structural bioinformatics, systems biology and fundamental research in understanding protein-ligand recognition. The design of the interface is focused on feasibility and ease of use. Protein and ligand molecule structures are supposed to be submitted as atomic coordinate files in PDB format. A customization section is offered for addition of user-specified charges, extra ionogenic groups with intrinsic pK(a) values or fixed ions. Final predicted complexes are ranked according to obtained scores and provided in PDB format as well as interactive visualization in a molecular viewer. Quantum.Ligand.Dock server can be accessed at http://87.116.85.141/LigandDock.html.
Final report and recommendations of the ESnet Authentication Pilot Project
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, G.R.; Moore, J.P.; Athey, C.L.
1995-01-01
To conduct their work, U.S. Department of Energy (DOE) researchers require access to a wide range of computing systems and information resources outside of their respective laboratories. Electronically communicating with peers using the global Internet has become a necessity for effective collaboration with university, industrial, and other government partners. DOE's Energy Sciences Network (ESnet) needs to be engineered to facilitate this "collaboratory" while ensuring the protection of government computing resources from unauthorized use. Sensitive information and intellectual properties must be protected from unauthorized disclosure, modification, or destruction. In August 1993, DOE funded four ESnet sites (Argonne National Laboratory, Lawrence Livermore National Laboratory, the National Energy Research Supercomputer Center, and Pacific Northwest Laboratory) to begin implementing and evaluating authenticated ESnet services using the advanced Kerberos Version 5. The purpose of this project was to identify, understand, and resolve the technical, procedural, cultural, and policy issues surrounding peer-to-peer authentication in an inter-organization internet. The investigators have concluded that, with certain conditions, Kerberos Version 5 is a suitable technology to enable ESnet users to freely share resources and information without compromising the integrity of their systems and data. The pilot project has demonstrated that Kerberos Version 5 is capable of supporting trusted third-party authentication across an inter-organization internet and that Kerberos Version 5 would be practical to implement across the ESnet community within the U.S. The investigators made several modifications to the Kerberos Version 5 system that are necessary for operation in the current Internet environment and have documented other technical shortcomings that must be addressed before large-scale deployment is attempted.
Exploring Solar System Origins With The Desert Fireball Network
NASA Astrophysics Data System (ADS)
Day, B. H.; Bland, P.
2016-12-01
Fireball camera networks are designed to recover meteorites with orbits. A geological context is a prerequisite for understanding terrestrial rocks. An improved dynamical context would benefit our understanding of extraterrestrial geology. A dozen projects - professional and amateur - have pursued this goal over the years. The effort has yielded 10 meteorites with orbits. Why so few? All these projects were in the temperate zone of the northern hemisphere: areas where meteorite recovery is marginal. Deserts are one of the few places on Earth where field searches for meteorites can be mounted with a realistic chance of success. This was the driver behind the Desert Fireball Network. The Desert Fireball Network (DFN) uses automated observatories across Australia to triangulate trajectories of meteorites entering the atmosphere, determine pre-entry orbits, and pinpoint their fall positions. Each observatory is an autonomous intelligent imaging system, taking 1000 × 36-megapixel all-sky images throughout the night, using neural network algorithms to recognise events. They are capable of operating for 12 months in a harsh environment, and store all imagery collected. We developed a completely automated software pipeline for data reduction, and built a supercomputer database for storage, allowing us to process our entire archive. We successfully recovered a meteorite from Lake Eyre on 31st December 2015, using this pipeline. By February 2016 we had reduced our complete fireball dataset, deriving precise orbits for >350 events: a dataset that provides a unique window on the dynamics of material in the inner solar system. The DFN currently stands at 50 stations distributed across the Australian continent, covering an area of 2.5 million km^2. The fireball and meteorite orbital data that it can provide will deliver a new dynamical window on the inner solar system, and new insights into solar system origins.
Working with DFN's partners at NASA's Solar System Exploration Research Virtual Institute, the team is now working to expand the network beyond Australia to locations around the world.
Enabling Discoveries in Earth Sciences Through the Geosciences Network (GEON)
NASA Astrophysics Data System (ADS)
Seber, D.; Baru, C.; Memon, A.; Lin, K.; Youn, C.
2005-12-01
Taking advantage of state-of-the-art information technology resources, GEON researchers are building a cyberinfrastructure designed to enable data sharing, semantic data integration, high-end computations and 4D visualization in easy-to-use web-based environments. The GEON Network currently allows users to search and register Earth science resources such as data sets (GIS layers, GMT files, geoTIFF images, ASCII files, relational databases etc), software applications or ontologies. Portal based access mechanisms enable developers to build dynamic user interfaces to conduct advanced processing and modeling efforts across distributed computers and supercomputers. Researchers and educators can access the networked resources through the GEON portal and its portlets that were developed to conduct better and more comprehensive science and educational studies. For example, the SYNSEIS portlet in GEON enables users to access in near-real time seismic waveforms from the IRIS Data Management Center, easily build a 3D geologic model within the area of the seismic station(s) and the epicenter and perform a 3D synthetic seismogram analysis to understand the lithospheric structure and earthquake source parameters for any given earthquake in the US. Similarly, GEON's workbench area enables users to create their own work environment and copy, visualize and analyze any data sets within the network, and create subsets of the data sets for their own purposes. Since all these resources are built as part of a Service-oriented Architecture (SOA), they are also used in other development platforms. One such platform is the Kepler workflow system, which can access web service based resources and provides users with graphical programming interfaces to build a model to conduct computations and/or visualization efforts using the networked resources.
Developments in the area of semantic integration of the networked datasets continue to advance and prototype studies can be accessed via the GEON portal at www.geongrid.org
HPC enabled real-time remote processing of laparoscopic surgery
NASA Astrophysics Data System (ADS)
Ronaghi, Zahra; Sapra, Karan; Izard, Ryan; Duffy, Edward; Smith, Melissa C.; Wang, Kuang-Ching; Kwartowitz, David M.
2016-03-01
Laparoscopic surgery is a minimally invasive surgical technique. The benefit of small incisions has a disadvantage of limited visualization of subsurface tissues. Image-guided surgery (IGS) uses pre-operative and intra-operative images to map subsurface structures. One particular laparoscopic system is the daVinci-si robotic surgical system. The video streams generate approximately 360 megabytes of data per second. Real-time processing of this large stream of data on a bedside PC, single or dual node setup, has become challenging and a high-performance computing (HPC) environment may not always be available at the point of care. To process this data on remote HPC clusters at the typical 30 frames per second rate, it is required that each 11.9 MB video frame be processed by a server and returned within 1/30th of a second. We have implemented and compared performance of compression, segmentation and registration algorithms on Clemson's Palmetto supercomputer using dual NVIDIA K40 GPUs per node. Our computing framework will also enable reliability using replication of computation. We will securely transfer the files to remote HPC clusters utilizing an OpenFlow-based network service, Steroid OpenFlow Service (SOS) that can increase performance of large data transfers over long-distance and high bandwidth networks. As a result, utilizing high-speed OpenFlow-based network to access computing clusters with GPUs will improve surgical procedures by providing real-time medical image processing and laparoscopic data.
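The timing constraint in the abstract above can be checked with a little arithmetic. The sketch below uses the frame size and frame rate quoted in the abstract. Everything else is an assumption made for illustration: that the returned frame is as large as the one sent, that transfer and compute are not overlapped, and the example link speed of 1250 MB/s (a 10 Gbps link with no protocol overhead).

```python
FRAME_MB = 11.9  # per-frame size quoted in the abstract
FPS = 30         # frame rate quoted in the abstract


def stream_rate_mb_per_s(frame_mb=FRAME_MB, fps=FPS):
    """Sustained data rate of the video stream in MB/s."""
    return frame_mb * fps


def frame_deadline_ms(fps=FPS):
    """Round-trip deadline per frame in milliseconds (1/fps seconds)."""
    return 1000.0 / fps


def compute_budget_ms(link_mb_per_s, frame_mb=FRAME_MB, fps=FPS):
    """Time left for server-side processing after moving one frame each way.

    Assumes an equal-sized result frame and no overlap of transfer and compute.
    A negative result means the link alone already misses the deadline.
    """
    transfer_ms = 2 * frame_mb / link_mb_per_s * 1000.0
    return frame_deadline_ms(fps) - transfer_ms


print(f"stream rate: {stream_rate_mb_per_s():.0f} MB/s")
print(f"deadline per frame: {frame_deadline_ms():.1f} ms")
print(f"compute budget at 1250 MB/s: {compute_budget_ms(1250.0):.1f} ms")
```

The product of 11.9 MB and 30 frames per second is 357 MB/s, consistent with the "approximately 360 megabytes of data per second" stated in the abstract. Under the assumptions above, transfer time consumes more than half of the 33 ms deadline on the example link, which illustrates why compression and fast long-distance transfers matter for this workload.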
Kukkonen, C A
1995-06-01
High-speed information processing technologies being developed and applied by the Jet Propulsion Laboratory for NASA and Department of Defense mission needs have potential dual uses in telemedicine and other medical applications. Fiber optic ground networks connected with microwave satellite links allow NASA to communicate with its astronauts in Earth orbit or on the moon, and with its deep space probes billions of miles away. These networks monitor the health of astronauts and/or robotic spacecraft. Similar communications technology will also allow patients to communicate with doctors anywhere on Earth. NASA space missions have science as a major objective. Science sensors have become so sophisticated that they can take more data than our scientists can analyze by hand. High performance computers--workstations, supercomputers and massively parallel computers--are being used to transform this data into knowledge. This is done using image processing, data visualization and other techniques to present the data--ones and zeros--in forms that a human analyst can readily relate to and understand. Medical sensors have also exploded in data output--witness CT scans, MRI, and ultrasound. This data must be presented in visual form, and computers will allow routine combination of many two-dimensional MRI images into three-dimensional reconstructions of organs that can then be fully examined by physicians. Emerging technologies such as neural networks that are being "trained" to detect craters on planets or incoming missiles amongst decoys can be used to identify microcalcification in mammograms.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tierney, Brian; Dart, Eli
The Energy Sciences Network (ESnet) is the primary provider of network connectivity for the U.S. Department of Energy Office of Science, the single largest supporter of basic research in the physical sciences in the United States of America. In support of the Office of Science programs, ESnet regularly updates and refreshes its understanding of the networking requirements of the instruments, facilities, scientists, and science programs that it serves. This focus has helped ESnet to be a highly successful enabler of scientific discovery for over 20 years. In March 2008, ESnet and the Fusion Energy Sciences (FES) Program Office of the DOE Office of Science organized a workshop to characterize the networking requirements of the science programs funded by the FES Program Office. Most sites that conduct data-intensive activities (the Tokamaks at GA and MIT, the supercomputer centers at NERSC and ORNL) show a need for on the order of 10 Gbps of network bandwidth for FES-related work within 5 years. PPPL reported a need for 8 times that (80 Gbps) in that time frame. Estimates for the 5-10 year time period are up to 160 Mbps for large simulations. Bandwidth requirements for ITER range from 10 to 80 Gbps. In terms of science process and collaboration structure, it is clear that the proposed Fusion Simulation Project (FSP) has the potential to significantly impact the data movement patterns and therefore the network requirements for U.S. fusion science. As the FSP is defined over the next two years, these changes will become clearer. Also, there is a clear and present unmet need for better network connectivity between U.S. FES sites and two Asian fusion experiments--the EAST Tokamak in China and the KSTAR Tokamak in South Korea.
In addition to achieving its goal of collecting and characterizing the network requirements of the science endeavors funded by the FES Program Office, the workshop emphasized that there is a need for research into better ways of conducting remote collaboration with the control room of a Tokamak running an experiment. This is especially important since the current plans for ITER assume that this problem will be solved.
NASA Technical Reports Server (NTRS)
Ganguly, Sangram; Kalia, Subodh; Li, Shuang; Michaelis, Andrew; Nemani, Ramakrishna R.; Saatchi, Sassan A
2017-01-01
Uncertainties in input land cover estimates contribute to a significant bias in modeled above ground biomass (AGB) and carbon estimates from satellite-derived data. The resolution of most currently used passive remote sensing products is not sufficient to capture tree canopy cover of less than ca. 10-20 percent, limiting their utility to estimate canopy cover and AGB for trees outside of forest land. In our study, we created a first-of-its-kind Continental United States (CONUS) tree cover map at a spatial resolution of 1-m for the 2010-2012 epoch using the USDA NAIP imagery to address the present uncertainties in AGB estimates. The process involves different tasks, from data acquisition/ingestion and pre-processing to running a state-of-the-art encoder-decoder based deep convolutional neural network (CNN) algorithm for automatically generating a tree/non-tree map for almost a quarter million scenes. The entire processing chain, including generation of the largest open source existing aerial/satellite image training database, was performed at the NEX supercomputing and storage facility. We believe the resulting forest cover product will substantially contribute to filling the gaps in ongoing carbon and ecological monitoring research and help quantify the errors and uncertainties in derived products.
Intelligent distributed medical image management
NASA Astrophysics Data System (ADS)
Garcia, Hong-Mei C.; Yun, David Y.
1995-05-01
The rapid advancements in high performance global communication have accelerated cooperative image-based medical services to a new frontier. Traditional image-based medical services such as radiology and diagnostic consultation can now fully utilize multimedia technologies in order to provide novel services, including remote cooperative medical triage, distributed virtual simulation of operations, as well as cross-country collaborative medical research and training. Fast (efficient) and easy (flexible) retrieval of relevant images remains a critical requirement for the provision of remote medical services. This paper describes the database system requirements, identifies technological building blocks for meeting the requirements, and presents a system architecture for our target image database system, MISSION-DBS, which has been designed to fulfill the goals of Project MISSION (medical imaging support via satellite integrated optical network) -- an experimental high performance gigabit satellite communication network with access to remote supercomputing power, medical image databases, and 3D visualization capabilities in addition to medical expertise anywhere and anytime around the country. The MISSION-DBS design employs a synergistic fusion of techniques in distributed databases (DDB) and artificial intelligence (AI) for storing, migrating, accessing, and exploring images. The efficient storage and retrieval of voluminous image information is achieved by integrating DDB modeling and AI techniques for image processing while the flexible retrieval mechanisms are accomplished by combining attribute-based and content-based retrievals.
DeepSAT's CloudCNN: A Deep Neural Network for Rapid Cloud Detection from Geostationary Satellites
NASA Astrophysics Data System (ADS)
Kalia, S.; Li, S.; Ganguly, S.; Nemani, R. R.
2017-12-01
Cloud and cloud shadow detection has important applications in weather and climate studies. It is even more crucial when we introduce geostationary satellites into the field of terrestrial remote sensing. With the challenges associated with data acquired at very high frequency (10-15 mins per scan), the ability to derive an accurate cloud/shadow mask from geostationary satellite data is critical. The success of most existing algorithms depends on spatially and temporally varying thresholds, which better capture local atmospheric and surface effects. However, the selection of a proper threshold is difficult and may lead to erroneous results. In this work, we propose a deep neural network based approach called CloudCNN to classify cloud/shadow from Himawari-8 AHI and GOES-16 ABI multispectral data. DeepSAT's CloudCNN consists of an encoder-decoder based architecture for binary-class pixel-wise segmentation. We train CloudCNN on a multi-GPU Nvidia Devbox cluster, and deploy the prediction pipeline on the NASA Earth Exchange (NEX) Pleiades supercomputer. We achieved an overall accuracy of 93.29% on test samples. Since the predictions take only a few seconds to segment a full multi-spectral GOES-16 or Himawari-8 Full Disk image, the developed framework can be used for real-time cloud detection, cyclone detection, or extreme weather event predictions.
NASA Astrophysics Data System (ADS)
Ganguly, S.; Kalia, S.; Li, S.; Michaelis, A.; Nemani, R. R.; Saatchi, S.
2017-12-01
Uncertainties in input land cover estimates contribute to a significant bias in modeled above ground biomass (AGB) and carbon estimates from satellite-derived data. The resolution of most currently used passive remote sensing products is not sufficient to capture tree canopy cover of less than ca. 10-20 percent, limiting their utility to estimate canopy cover and AGB for trees outside of forest land. In our study, we created a first-of-its-kind Continental United States (CONUS) tree cover map at a spatial resolution of 1-m for the 2010-2012 epoch using the USDA NAIP imagery to address the present uncertainties in AGB estimates. The process involves different tasks, from data acquisition/ingestion and pre-processing to running a state-of-the-art encoder-decoder based deep convolutional neural network (CNN) algorithm for automatically generating a tree/non-tree map for almost a quarter million scenes. The entire processing chain, including generation of the largest open source existing aerial/satellite image training database, was performed at the NEX supercomputing and storage facility. We believe the resulting forest cover product will substantially contribute to filling the gaps in ongoing carbon and ecological monitoring research and help quantify the errors and uncertainties in derived products.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-08-16
...-BI51 Information Reporting for Payments Made in Settlement of Payment Card and Third Party Network..., and backup withholding requirements for payment card and third party network transactions. The final... third party network transactions for each calendar year. The final regulations in this document will...
ICPP: Approach for Understanding Complexity of Plasma
NASA Astrophysics Data System (ADS)
Sato, Tetsuya
2000-10-01
In this talk I wish to present an IT system that could promote Science of Complexity. In order to deal with a seemingly `complex' phenomenon, which means `beyond analytical manipulation', computer simulation is a viable powerful tool. However, complexity implies a concept beyond the horizon of reductionism. Therefore, rather than simply solving a complex phenomenon for a given boundary condition, one must establish an intelligent way of attacking mutual evolution of a system and its environment. NIFS-TCSC has been developing a prototype system that consists of supercomputers, virtual reality devices and high-speed network system. Let us explain this by picking up a global atmospheric circulation group, global oceanic circulation group and local weather prediction group. Local weather prediction group predicts the local change of the weather such as the creation of cloud and rain in the near future under the global conditions obtained by the global atmospheric and ocean groups. The global groups run simulations by modifying the local heat source/sink evaluated by the local weather prediction and then obtain the global conditions in the next time step. By repeating such a feedback performance one can predict the mutual evolution of the local system and its environment. Mutual information exchanges among multiple groups are carried out instantaneously by the networked common virtual reality space in which 3-D global and local images of the atmospheric and oceanic circulation and the cloud and rain maps are arbitrarily manipulated by any of the groups and commonly viewed. The present networking system has a great advantage that any simulation groups can freely and arbitrarily change their alignment, so that mutual evolution of any stratum system can become tractable by utilizing this network system.
High performance computing applications in neurobiological research
NASA Technical Reports Server (NTRS)
Ross, Muriel D.; Cheng, Rei; Doshay, David G.; Linton, Samuel W.; Montgomery, Kevin; Parnas, Bruce R.
1994-01-01
The human nervous system is a massively parallel processor of information. The vast number of neurons, synapses and circuits is daunting to those seeking to understand the neural basis of consciousness and intellect. Pervading obstacles are the lack of knowledge of the detailed, three-dimensional (3-D) organization of even a simple neural system and the paucity of large scale, biologically relevant computer simulations. We use high performance graphics workstations and supercomputers to study the 3-D organization of gravity sensors as a prototype architecture foreshadowing more complex systems. Scaled-down simulations run on a Silicon Graphics workstation and scaled-up, three-dimensional versions run on the Cray Y-MP and CM5 supercomputers.
The TESS science processing operations center
NASA Astrophysics Data System (ADS)
Jenkins, Jon M.; Twicken, Joseph D.; McCauliff, Sean; Campbell, Jennifer; Sanderfer, Dwight; Lung, David; Mansouri-Samani, Masoud; Girouard, Forrest; Tenenbaum, Peter; Klaus, Todd; Smith, Jeffrey C.; Caldwell, Douglas A.; Chacon, A. D.; Henze, Christopher; Heiges, Cory; Latham, David W.; Morgan, Edward; Swade, Daryl; Rinehart, Stephen; Vanderspek, Roland
2016-08-01
The Transiting Exoplanet Survey Satellite (TESS) will conduct a search for Earth's closest cousins starting in early 2018 and is expected to discover 1,000 small planets with Rp < 4 R⊕ and measure the masses of at least 50 of these small worlds. The Science Processing Operations Center (SPOC) is being developed at NASA Ames Research Center based on the Kepler science pipeline and will generate calibrated pixels and light curves on the NASA Advanced Supercomputing Division's Pleiades supercomputer. The SPOC will also search for periodic transit events and generate validation products for the transit-like features in the light curves. All TESS SPOC data products will be archived to the Mikulski Archive for Space Telescopes (MAST).
CFD code evaluation for internal flow modeling
NASA Technical Reports Server (NTRS)
Chung, T. J.
1990-01-01
Research on computational fluid dynamics (CFD) code evaluation with emphasis on supercomputing in reacting flows is discussed. Advantages of unstructured grids, multigrids, adaptive methods, improved flow solvers, vector processing, parallel processing, and reduction of memory requirements are discussed. As examples, researchers include applications of supercomputing to reacting flow Navier-Stokes equations including shock waves and turbulence and combustion instability problems associated with solid and liquid propellants. Evaluation of codes developed by other organizations is not included. Instead, the basic criteria for accuracy and efficiency have been established, and some applications on rocket combustion have been made. Research toward an ultimate goal, the most accurate and efficient CFD code, is in progress and will continue for years to come.
Internal computational fluid mechanics on supercomputers for aerospace propulsion systems
NASA Technical Reports Server (NTRS)
Andersen, Bernhard H.; Benson, Thomas J.
1987-01-01
The accurate calculation of three-dimensional internal flowfields for application towards aerospace propulsion systems requires computational resources available only on supercomputers. A survey is presented of three-dimensional calculations of hypersonic, transonic, and subsonic internal flowfields conducted at the Lewis Research Center. A steady state Parabolized Navier-Stokes (PNS) solution of flow in a Mach 5.0, mixed compression inlet, a Navier-Stokes solution of flow in the vicinity of a terminal shock, and a PNS solution of flow in a diffusing S-bend with vortex generators are presented and discussed. All of these calculations were performed on either the NAS Cray-2 or the Lewis Research Center Cray XMP.
Supercomputer modeling of hydrogen combustion in rocket engines
NASA Astrophysics Data System (ADS)
Betelin, V. B.; Nikitin, V. F.; Altukhov, D. I.; Dushin, V. R.; Koo, Jaye
2013-08-01
Hydrogen, being an ecological fuel, is now very attractive to rocket engine designers. However, peculiarities of hydrogen combustion kinetics, such as the presence of zones of inverse dependence of reaction rate on pressure, prevent the use of hydrogen engines in all stages without support from other types of engines, which often cancels the ecological gains from using hydrogen. Computer aided design of new effective and clean hydrogen engines needs mathematical tools for supercomputer modeling of hydrogen-oxygen components mixing and combustion in rocket engines. The paper presents the results of developing, verifying and validating a mathematical model that makes it possible to simulate unsteady processes of ignition and combustion in rocket engines.
Close to real life. [solving for transonic flow about lifting airfoils using supercomputers
NASA Technical Reports Server (NTRS)
Peterson, Victor L.; Bailey, F. Ron
1988-01-01
NASA's Numerical Aerodynamic Simulation (NAS) facility for CFD modeling of highly complex aerodynamic flows employs as its basic hardware two Cray-2s, an ETA-10 Model Q, an Amdahl 5880 mainframe computer that furnishes both support processing and access to 300 Gbytes of disk storage, several minicomputers and superminicomputers, and a Thinking Machines 16,000-device 'connection machine' processor. NAS, which was the first supercomputer facility to standardize operating-system and communication software on all processors, has done important Space Shuttle aerodynamics simulations and will be critical to the configurational refinement of the National Aerospace Plane and its integrated powerplant, which will involve complex, high temperature reactive gasdynamic computations.
Ohue, Masahito; Shimoda, Takehiro; Suzuki, Shuji; Matsuzaki, Yuri; Ishida, Takashi; Akiyama, Yutaka
2014-11-15
The application of protein-protein docking in large-scale interactome analysis is a major challenge in structural bioinformatics and requires huge computing resources. In this work, we present MEGADOCK 4.0, an FFT-based docking software that makes extensive use of recent heterogeneous supercomputers and shows powerful, scalable performance of >97% strong scaling. MEGADOCK 4.0 is written in C++ with OpenMPI and NVIDIA CUDA 5.0 (or later) and is freely available to all academic and non-profit users at: http://www.bi.cs.titech.ac.jp/megadock. Contact: akiyama@cs.titech.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
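The ">97% strong scaling" figure refers to how close the speedup stays to ideal when the problem size is fixed and the node count grows. A minimal sketch of the usual definition follows; the timings in the usage loop are hypothetical numbers chosen for illustration and are not MEGADOCK measurements.

```python
# Strong-scaling efficiency: fixed total problem size, growing resources.
# efficiency = (t_base / t_scaled) / (n_scaled / n_base)
# A value of 1.0 means perfectly linear speedup.


def strong_scaling_efficiency(t_base, n_base, t_scaled, n_scaled):
    """Return measured speedup divided by ideal speedup."""
    speedup = t_base / t_scaled
    ideal = n_scaled / n_base
    return speedup / ideal


# Hypothetical wall-clock times (seconds) for the same workload.
runs = [(1, 1000.0), (10, 101.0), (100, 10.3)]
base_nodes, base_time = runs[0]
for nodes, seconds in runs:
    eff = strong_scaling_efficiency(base_time, base_nodes, seconds, nodes)
    print(f"{nodes:4d} nodes: efficiency {eff:.3f}")
```

Reporting efficiency relative to a stated baseline run matters: the same timings give a different percentage if the baseline is a larger node count rather than a single node.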
Optimal wavelength-space crossbar switches for supercomputer optical interconnects.
Roudas, Ioannis; Hemenway, B Roe; Grzybowski, Richard R; Karinou, Fotini
2012-08-27
We propose a most economical design of the Optical Shared MemOry Supercomputer Interconnect System (OSMOSIS) all-optical, wavelength-space crossbar switch fabric. It is shown, by analysis and simulation, that the total number of on-off gates required for the proposed N × N switch fabric can scale asymptotically as N ln N if the number of input/output ports N can be factored into a product of small primes. This is of the same order of magnitude as Shannon's lower bound for switch complexity, according to which the minimum number of two-state switches required for the construction of an N × N permutation switch is log2(N!).
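The claim that N ln N is of the same order as log2(N!) follows from Stirling's approximation, log2(N!) ≈ (N ln N − N) / ln 2. The sketch below simply tabulates both quantities; it compares the two expressions from the abstract and does not model the OSMOSIS gate count itself, whose constant factor is not given here.

```python
import math

# Compare N ln N with Shannon's lower bound log2(N!) for permutation switches.
# log2(N!) is evaluated via lgamma(N + 1) = ln(N!) to avoid huge integers.


def shannon_lower_bound(n):
    """Minimum number of two-state switches, log2(n!)."""
    return math.lgamma(n + 1) / math.log(2)


def n_ln_n(n):
    """The asymptotic scaling term quoted in the abstract."""
    return n * math.log(n)


for n in (8, 64, 512, 4096):
    bound = shannon_lower_bound(n)
    term = n_ln_n(n)
    print(f"N={n:5d}  N ln N={term:10.1f}  log2(N!)={bound:10.1f}  "
          f"ratio={term / bound:.3f}")
```

The ratio of the two terms approaches ln 2 (about 0.693) slowly as N grows, so the two expressions differ only by a bounded constant factor, which is what "same order of magnitude" means in this context.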
Role of High-End Computing in Meeting NASA's Science and Engineering Challenges
NASA Technical Reports Server (NTRS)
Biswas, Rupak
2006-01-01
High-End Computing (HEC) has always played a major role in meeting the modeling and simulation needs of various NASA missions. With NASA's newest 62 teraflops Columbia supercomputer, HEC is having an even greater impact within the Agency and beyond. Significant cutting-edge science and engineering simulations in the areas of space exploration, Shuttle operations, Earth sciences, and aeronautics research are already occurring on Columbia, demonstrating its ability to accelerate NASA's exploration vision. The talk will describe how the integrated supercomputing production environment is being used to reduce design cycle time, accelerate scientific discovery, conduct parametric analysis of multiple scenarios, and enhance safety during the life cycle of NASA missions.
MILC Code Performance on High End CPU and GPU Supercomputer Clusters
NASA Astrophysics Data System (ADS)
DeTar, Carleton; Gottlieb, Steven; Li, Ruizi; Toussaint, Doug
2018-03-01
With recent developments in parallel supercomputing architecture, many-core, multi-core, and GPU processors are now commonplace, resulting in more levels of parallelism, memory hierarchy, and programming complexity. It has been necessary to adapt the MILC code to these new processors starting with NVIDIA GPUs, and more recently, the Intel Xeon Phi processors. We report on our efforts to port and optimize our code for the Intel Knights Landing architecture. We consider performance of the MILC code with MPI and OpenMP, and optimizations with QOPQDP and QPhiX. For the latter approach, we concentrate on the staggered conjugate gradient and gauge force. We also consider performance on recent NVIDIA GPUs using the QUDA library.
NASA Technical Reports Server (NTRS)
Yarrow, Maurice; McCann, Karen M.; Biswas, Rupak; VanderWijngaart, Rob; Yan, Jerry C. (Technical Monitor)
2000-01-01
The creation of parameter study suites has recently become a more challenging problem as the parameter studies have now become multi-tiered and the computational environment has become a supercomputer grid. The parameter spaces are vast, the individual problem sizes are getting larger, and researchers are now seeking to combine several successive stages of parameterization and computation. Simultaneously, grid-based computing offers great resource opportunity but at the expense of great difficulty of use. We present an approach to this problem which stresses intuitive visual design tools for parameter study creation and complex process specification, and also offers programming-free access to grid-based supercomputer resources and process automation.
A personal perspective on modelling the climate system.
Palmer, T N
2016-04-01
Given their increasing relevance for society, I suggest that the climate science community itself does not treat the development of error-free ab initio models of the climate system with sufficient urgency. With increasing levels of difficulty, I discuss a number of proposals for speeding up such development. Firstly, I believe that climate science should make better use of the pool of post-PhD talent in mathematics and physics, for developing next-generation climate models. Secondly, I believe there is more scope for the development of modelling systems which link weather and climate prediction more seamlessly. Finally, here in Europe, I call for a new European Programme on Extreme Computing and Climate to advance our ability to simulate climate extremes, and understand the drivers of such extremes. A key goal for such a programme is the development of a 1 km global climate system model to run on the first exascale supercomputers in the early 2020s.
NASA Technical Reports Server (NTRS)
1997-01-01
Coryphaeus Software, founded in 1989 by former NASA electronic engineer Steve Lakowske, creates real-time 3D software. Designer's Workbench, the company's flagship product, is a modeling and simulation tool for the development of both static and dynamic 3D databases. Other products soon followed. Activation, specifically designed for game developers, allows developers to play and test 3D games before they commit to a target platform. Game publishers can shorten development time and prove the "playability" of the title, maximizing their chances of introducing a smash hit. Another product, EasyT, lets users create massive, realistic representations of Earth terrains that can be viewed and traversed in real time. Finally, EasyScene software controls the actions among interactive objects within a virtual world. Coryphaeus products are used on Silicon Graphics workstations and supercomputers to simulate real-world performance in synthetic environments. Customers include aerospace, aviation, architectural and engineering firms, game developers, and the entertainment industry.
Application of supercomputers to computational aerodynamics
NASA Technical Reports Server (NTRS)
Peterson, V. L.
1984-01-01
Computers are playing an increasingly important role in the field of aerodynamics such that they now serve as a major complement to wind tunnels in aerospace research and development. Factors pacing advances in computational aerodynamics are identified, including the amount of computational power required to take the next major step in the discipline. Example results obtained from the successively refined forms of the governing equations are discussed, both in the context of levels of computer power required and the degree to which they either further the frontiers of research or apply to problems of practical importance. Finally, the Numerical Aerodynamic Simulation (NAS) Program - with its 1988 target of achieving a sustained computational rate of 1 billion floating point operations per second and operating with a memory of 240 million words - is discussed in terms of its goals and its projected effect on the future of computational aerodynamics.
Barrier-breaking performance for industrial problems on the CRAY C916
DOE Office of Scientific and Technical Information (OSTI.GOV)
Graffunder, S.K.
1993-12-31
Nine applications, including third-party codes, were submitted to the Gordon Bell Prize committee showing the CRAY C916 supercomputer providing record-breaking time to solution for industrial problems in several disciplines. Performance was obtained by balancing raw hardware speed; effective use of large, real, shared memory; compiler vectorization and autotasking; hand optimization; asynchronous I/O techniques; and new algorithms. The highest GFLOPS performance for the submissions was 11.1 GFLOPS out of a peak advertised performance of 16 GFLOPS for the CRAY C916 system. One program achieved a 15.45 speedup from the compiler with just two hand-inserted directives to scope variables properly for the mathematical library. New I/O techniques hide tens of gigabytes of I/O behind parallel computations. Finally, new iterative solver algorithms have demonstrated times to solution on 1 CPU as high as 70 times faster than the best direct solvers.
Direct Solve of Electrically Large Integral Equations for Problem Sizes to 1M Unknowns
NASA Technical Reports Server (NTRS)
Shaeffer, John
2008-01-01
Matrix methods for solving integral equations via direct solve LU factorization are presently limited to weeks to months of very expensive supercomputer time for problem sizes of several hundred thousand unknowns. This report presents matrix LU factor solutions for electromagnetic scattering problems for problem sizes to one million unknowns with thousands of right hand sides that run in mere days on PC level hardware. This EM solution is accomplished by utilizing the numerical low rank nature of spatially blocked unknowns using the Adaptive Cross Approximation for compressing the rank deficient blocks of the system Z matrix, the L and U factors, the right hand side forcing function and the final current solution. This compressed matrix solution is applied to a frequency domain EM solution of Maxwell's equations using a standard Method of Moments approach. Compressed matrix storage and operations count lead to orders of magnitude reduction in memory and run time.
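The memory savings described here come from storing a rank-deficient m-by-n block as a product of two thin factors instead of m·n entries. The sketch below only does the storage arithmetic; it is not an implementation of the Adaptive Cross Approximation, and the block size, rank, and 16-byte complex double entry size are illustrative assumptions rather than values from the report.

```python
# Storage arithmetic for low-rank block compression.
# A rank-k approximation of an m-by-n block stores U (m-by-k) and V (k-by-n),
# i.e. k * (m + n) entries instead of m * n.

BYTES_PER_ENTRY = 16  # assumed: complex double precision


def dense_entries(m, n):
    """Entries in an uncompressed m-by-n block."""
    return m * n


def low_rank_entries(m, n, k):
    """Entries in the two factors of a rank-k approximation."""
    return k * (m + n)


def compression_ratio(m, n, k):
    """How many times smaller the factored block is than the dense one."""
    return dense_entries(m, n) / low_rank_entries(m, n, k)


def dense_matrix_bytes(unknowns):
    """Memory for a full dense system matrix with the assumed entry size."""
    return unknowns * unknowns * BYTES_PER_ENTRY


# Hypothetical far-field block: 2000 x 2000 unknowns, numerical rank 20.
print("block compression:", compression_ratio(2000, 2000, 20))
# Uncompressed system matrix for one million unknowns, in terabytes (1e12 B).
print("dense Z matrix TB:", dense_matrix_bytes(10**6) / 1e12)
```

Under these assumptions an uncompressed matrix for one million unknowns would need on the order of 16 TB, which illustrates why compression of the blocks is needed before such a problem fits on PC level hardware.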
Final Report - Cloud-Based Management Platform for Distributed, Multi-Domain Networks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chowdhury, Pulak; Mukherjee, Biswanath
2017-11-03
In this Department of Energy (DOE) Small Business Innovation Research (SBIR) Phase II project final report, Ennetix presents the development of a solution for end-to-end monitoring, analysis, and visualization of network performance for distributed networks. This solution benefits enterprises of all sizes, operators of distributed and federated networks, and service providers.
Nagasaki, Hideki; Mochizuki, Takako; Kodama, Yuichi; Saruhashi, Satoshi; Morizaki, Shota; Sugawara, Hideaki; Ohyanagi, Hajime; Kurata, Nori; Okubo, Kousaku; Takagi, Toshihisa; Kaminuma, Eli; Nakamura, Yasukazu
2013-08-01
High-performance next-generation sequencing (NGS) technologies are advancing genomics and molecular biological research. However, the immense amount of sequence data requires computational skills and suitable hardware resources that are a challenge to molecular biologists. The DNA Data Bank of Japan (DDBJ) of the National Institute of Genetics (NIG) has initiated a cloud computing-based analytical pipeline, the DDBJ Read Annotation Pipeline (DDBJ Pipeline), for a high-throughput annotation of NGS reads. The DDBJ Pipeline offers a user-friendly graphical web interface and processes massive NGS datasets using decentralized processing by NIG supercomputers currently free of charge. The proposed pipeline consists of two analysis components: basic analysis for reference genome mapping and de novo assembly and subsequent high-level analysis of structural and functional annotations. Users may smoothly switch between the two components in the pipeline, facilitating web-based operations on a supercomputer for high-throughput data analysis. Moreover, public NGS reads of the DDBJ Sequence Read Archive located on the same supercomputer can be imported into the pipeline through the input of only an accession number. This proposed pipeline will facilitate research by utilizing unified analytical workflows applied to the NGS data. The DDBJ Pipeline is accessible at http://p.ddbj.nig.ac.jp/.
Nagasaki, Hideki; Mochizuki, Takako; Kodama, Yuichi; Saruhashi, Satoshi; Morizaki, Shota; Sugawara, Hideaki; Ohyanagi, Hajime; Kurata, Nori; Okubo, Kousaku; Takagi, Toshihisa; Kaminuma, Eli; Nakamura, Yasukazu
2013-01-01
High-performance next-generation sequencing (NGS) technologies are advancing genomics and molecular biological research. However, the immense amount of sequence data requires computational skills and suitable hardware resources that are a challenge to molecular biologists. The DNA Data Bank of Japan (DDBJ) of the National Institute of Genetics (NIG) has initiated a cloud computing-based analytical pipeline, the DDBJ Read Annotation Pipeline (DDBJ Pipeline), for a high-throughput annotation of NGS reads. The DDBJ Pipeline offers a user-friendly graphical web interface and processes massive NGS datasets using decentralized processing by NIG supercomputers currently free of charge. The proposed pipeline consists of two analysis components: basic analysis for reference genome mapping and de novo assembly and subsequent high-level analysis of structural and functional annotations. Users may smoothly switch between the two components in the pipeline, facilitating web-based operations on a supercomputer for high-throughput data analysis. Moreover, public NGS reads of the DDBJ Sequence Read Archive located on the same supercomputer can be imported into the pipeline through the input of only an accession number. This proposed pipeline will facilitate research by utilizing unified analytical workflows applied to the NGS data. The DDBJ Pipeline is accessible at http://p.ddbj.nig.ac.jp/. PMID:23657089
Fang, Xiang; Li, Ning-qiu; Fu, Xiao-zhe; Li, Kai-bin; Lin, Qiang; Liu, Li-hui; Shi, Cun-bin; Wu, Shu-qin
2015-07-01
As a key component of life science, bioinformatics has been widely applied in genomics, transcriptomics, and proteomics. However, the requirement of high-performance computers rather than common personal computers for constructing a bioinformatics platform has significantly limited the application of bioinformatics in aquatic science. In this study, we constructed a bioinformatic analysis platform for aquatic pathogens based on the MilkyWay-2 supercomputer. The platform consisted of three functional modules: genomic and transcriptomic sequencing data analysis, protein structure prediction, and molecular dynamics simulations. To validate the practicability of the platform, we performed bioinformatic analysis on aquatic pathogenic organisms. For example, genes of Flavobacterium johnsoniae M168 were identified and annotated via Blast searches and GO and InterPro annotations. Protein structural models for five small segments of grass carp reovirus HZ-08 were constructed by homology modeling. Molecular dynamics simulations were performed on outer membrane protein A of Aeromonas hydrophila, and the changes in system temperature, total energy, root mean square deviation and conformation of the loops during equilibration were also observed. These results showed that the bioinformatic analysis platform for aquatic pathogens has been successfully built on the MilkyWay-2 supercomputer. This study will provide insights into the construction of bioinformatic analysis platforms for other subjects.
A special purpose silicon compiler for designing supercomputing VLSI systems
NASA Technical Reports Server (NTRS)
Venkateswaran, N.; Murugavel, P.; Kamakoti, V.; Shankarraman, M. J.; Rangarajan, S.; Mallikarjun, M.; Karthikeyan, B.; Prabhakar, T. S.; Satish, V.; Venkatasubramaniam, P. R.
1991-01-01
Design of general/special purpose supercomputing VLSI systems for numeric algorithm execution involves tackling two important aspects, namely their computational and communication complexities. Development of software tools for designing such systems itself becomes complex. Hence a novel design methodology has to be developed. For designing such complex systems a special purpose silicon compiler is needed in which: the computational and communicational structures of different numeric algorithms should be taken into account to simplify the silicon compiler design, the approach is macrocell based, and the software tools at different levels (algorithm down to the VLSI circuit layout) should get integrated. In this paper a special purpose silicon (SPS) compiler based on PACUBE macrocell VLSI arrays for designing supercomputing VLSI systems is presented. It is shown that turn-around time and silicon real estate get reduced over the silicon compilers based on PLA's, SLA's, and gate arrays. The first two silicon compiler characteristics mentioned above enable the SPS compiler to perform systolic mapping (at the macrocell level) of algorithms whose computational structures are of GIPOP (generalized inner product outer product) form. Direct systolic mapping on PLA's, SLA's, and gate arrays is very difficult as they are micro-cell based. A novel GIPOP processor is under development using this special purpose silicon compiler.
NASA Astrophysics Data System (ADS)
Belyaev, A.; Berezhnaya, A.; Betev, L.; Buncic, P.; De, K.; Drizhuk, D.; Klimentov, A.; Lazin, Y.; Lyalin, I.; Mashinistov, R.; Novikov, A.; Oleynik, D.; Polyakov, A.; Poyda, A.; Ryabinkin, E.; Teslyuk, A.; Tkachenko, I.; Yasnopolskiy, L.
2015-12-01
The LHC experiments are preparing for the precision measurements and further discoveries that will be made possible by higher LHC energies from April 2015 (LHC Run2). The need for simulation, data processing and analysis would overwhelm the expected capacity of grid infrastructure computing facilities deployed by the Worldwide LHC Computing Grid (WLCG). To meet this challenge, the integration of opportunistic resources into the LHC computing model is highly important. The Tier-1 facility at Kurchatov Institute (NRC-KI) in Moscow is a part of WLCG and it will process, simulate and store up to 10% of total data obtained from the ALICE, ATLAS and LHCb experiments. In addition, Kurchatov Institute has supercomputers with a peak performance of 0.12 PFLOPS. The delegation of even a fraction of supercomputing resources to LHC computing will notably increase total capacity. In 2014 the development of a portal combining a Tier-1 and a supercomputer at Kurchatov Institute was started to provide common interfaces and storage. The portal will be used not only for HENP experiments, but also by other data- and compute-intensive sciences like biology with genome sequencing analysis and astrophysics with cosmic ray analysis, antimatter and dark matter searches, etc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kristofferson, D.; Mack, D.
1996-10-01
This is the final report for a DOE funded project on BIOSCI Electronic Newsgroup Network for the biological sciences. A usable network for scientific discussion, major announcements, problem solving, etc. has been created.
Some Problems and Solutions in Transferring Ecosystem Simulation Codes to Supercomputers
NASA Technical Reports Server (NTRS)
Skiles, J. W.; Schulbach, C. H.
1994-01-01
Many computer codes for the simulation of ecological systems have been developed in the last twenty-five years. This development took place initially on main-frame computers, then mini-computers, and more recently, on micro-computers and workstations. Recent recognition of ecosystem science as a High Performance Computing and Communications Program Grand Challenge area emphasizes supercomputers (both parallel and distributed systems) as the next set of tools for ecological simulation. Transferring ecosystem simulation codes to such systems is not a matter of simply compiling and executing existing code on the supercomputer since there are significant differences in the system architectures of sequential, scalar computers and parallel and/or vector supercomputers. To more appropriately match the application to the architecture (necessary to achieve reasonable performance), the parallelism (if it exists) of the original application must be exploited. We discuss our work in transferring a general grassland simulation model (developed on a VAX in the FORTRAN computer programming language) to a Cray Y-MP. We show the Cray shared-memory vector-architecture, and discuss our rationale for selecting the Cray. We describe porting the model to the Cray and executing and verifying a baseline version, and we discuss the changes we made to exploit the parallelism in the application and to improve code execution. As a result, the Cray executed the model 30 times faster than the VAX 11/785 and 10 times faster than a Sun 4 workstation. We achieved an additional speed-up of approximately 30 percent over the original Cray run by using the compiler's vectorizing capabilities and the machine's ability to put subroutines and functions "in-line" in the code. With the modifications, the code still runs at only about 5% of the Cray's peak speed because it makes ineffective use of the vector processing capabilities of the Cray. We conclude with a discussion and future plans.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hazi, A U
2007-02-06
Setting performance goals is part of the business plan for almost every company. The same is true in the world of supercomputers. Ten years ago, the Department of Energy (DOE) launched the Accelerated Strategic Computing Initiative (ASCI) to help ensure the safety and reliability of the nation's nuclear weapons stockpile without nuclear testing. ASCI, which is now called the Advanced Simulation and Computing (ASC) Program and is managed by DOE's National Nuclear Security Administration (NNSA), set an initial 10-year goal to obtain computers that could process up to 100 trillion floating-point operations per second (teraflops). Many computer experts thought the goal was overly ambitious, but the program's results have proved them wrong. Last November, a Livermore-IBM team received the 2005 Gordon Bell Prize for achieving more than 100 teraflops while modeling the pressure-induced solidification of molten metal. The prestigious prize, which is named for a founding father of supercomputing, is awarded each year at the Supercomputing Conference to innovators who advance high-performance computing. Recipients for the 2005 prize included six Livermore scientists--physicists Fred Streitz, James Glosli, and Mehul Patel and computer scientists Bor Chan, Robert Yates, and Bronis de Supinski--as well as IBM researchers James Sexton and John Gunnels. This team produced the first atomic-scale model of metal solidification from the liquid phase with results that were independent of system size. The record-setting calculation used Livermore's domain decomposition molecular-dynamics (ddcMD) code running on BlueGene/L, a supercomputer developed by IBM in partnership with the ASC Program. BlueGene/L reached 280.6 teraflops on the Linpack benchmark, the industry standard used to measure computing speed. As a result, it ranks first on the list of Top500 Supercomputer Sites released in November 2005.
To evaluate the performance of nuclear weapons systems, scientists must understand how materials behave under extreme conditions. Because experiments at high pressures and temperatures are often difficult or impossible to conduct, scientists rely on computer models that have been validated with obtainable data. Of particular interest to weapons scientists is the solidification of metals. "To predict the performance of aging nuclear weapons, we need detailed information on a material's phase transitions," says Streitz, who leads the Livermore-IBM team. For example, scientists want to know what happens to a metal as it changes from molten liquid to a solid and how that transition affects the material's characteristics, such as its strength.
Experiences From NASA/Langley's DMSS Project
NASA Technical Reports Server (NTRS)
1996-01-01
There is a trend in institutions with high performance computing and data management requirements to explore mass storage systems with peripherals directly attached to a high speed network. The Distributed Mass Storage System (DMSS) Project at the NASA Langley Research Center (LaRC) has placed such a system into production use. This paper will present the experiences, both good and bad, we have had with this system since putting it into production usage. The system is comprised of: 1) National Storage Laboratory (NSL)/UniTree 2.1, 2) IBM 9570 HIPPI attached disk arrays (both RAID 3 and RAID 5), 3) IBM RS6000 server, 4) HIPPI/IPI3 third party transfers between the disk array systems and the supercomputer clients, a CRAY Y-MP and a CRAY 2, 5) a "warm spare" file server, 6) transition software to convert from CRAY's Data Migration Facility (DMF) based system to DMSS, 7) an NSC PS32 HIPPI switch, and 8) a STK 4490 robotic library accessed from the IBM RS6000 block mux interface. This paper will cover: the performance of the DMSS in file transfer rates, migration and recall, and file manipulation (listing, deleting, etc.); the appropriateness of a workstation class of file server for NSL/UniTree with LaRC's present storage requirements in mind; the role of the third party transfers between the supercomputers and the DMSS disk array systems in DMSS; a detailed comparison (both in performance and functionality) between the DMF and DMSS systems; LaRC's enhancements to the NSL/UniTree system administration environment; the mechanism for DMSS to provide file server redundancy; the statistics on the availability of DMSS; and the design of and experiences with the locally developed transparent transition software, which allowed us to make over 1.5 million DMF files available to NSL/UniTree with minimal system outage.