Sample records for supercomputing applications ncsa

  1. The ChemViz Project: Using a Supercomputer To Illustrate Abstract Concepts in Chemistry.

    ERIC Educational Resources Information Center

    Beckwith, E. Kenneth; Nelson, Christopher

    1998-01-01

    Describes the Chemistry Visualization (ChemViz) Project, a Web venture maintained by the University of Illinois National Center for Supercomputing Applications (NCSA) that enables high school students to use computational chemistry as a technique for understanding abstract concepts. Discusses the evolution of computational chemistry and provides a…

  2. A One-of-a-Kind Technology Expansion.

    ERIC Educational Resources Information Center

    Wiens, Janet

    2002-01-01

    Describes the design of the expansion of the National Center for Supercomputing Applications (NCSA) Advanced Computation Building at the University of Illinois, Champaign. Discusses how the design incorporated column-free space for flexibility, cooling capacity, a freight elevator, and a 6-foot raised access floor to neatly house airflow, wiring,…

  3. Radio Synthesis Imaging - A High Performance Computing and Communications Project

    NASA Astrophysics Data System (ADS)

    Crutcher, Richard M.

    The National Science Foundation has funded a five-year High Performance Computing and Communications project at the National Center for Supercomputing Applications (NCSA) for the direct implementation of several of the computing recommendations of the Astronomy and Astrophysics Survey Committee (the "Bahcall report"). This paper is a summary of the project goals and a progress report. The project will implement a prototype of the next generation of astronomical telescope systems - remotely located telescopes connected by high-speed networks to very high performance, scalable architecture computers and on-line data archives, which are accessed by astronomers over Gbit/sec networks. Specifically, a data link has been installed between the BIMA millimeter-wave synthesis array at Hat Creek, California and NCSA at Urbana, Illinois for real-time transmission of data to NCSA. Data are automatically archived, and may be browsed and retrieved by astronomers using the NCSA Mosaic software. In addition, an on-line digital library of processed images will be established. BIMA data will be processed on a very high performance distributed computing system, with I/O, user interface, and most of the software system running on the NCSA Convex C3880 supercomputer or Silicon Graphics Onyx workstations connected by HiPPI to the high performance, massively parallel Thinking Machines Corporation CM-5. The very computationally intensive algorithms for calibration and imaging of radio synthesis array observations will be optimized for the CM-5 and new algorithms which utilize the massively parallel architecture will be developed. Code running simultaneously on the distributed computers will communicate using the Data Transport Mechanism developed by NCSA. The project will also use the BLANCA Gbit/s testbed network between Urbana and Madison, Wisconsin to connect an Onyx workstation in the University of Wisconsin Astronomy Department to the NCSA CM-5, for development of long-distance distributed computing. Finally, the project is developing 2D and 3D visualization software as part of the international AIPS++ project. This research and development project is being carried out by a team of experts in radio astronomy, algorithm development for massively parallel architectures, high-speed networking, database management, and Thinking Machines Corporation personnel. The development of this complete software, distributed computing, and data archive and library solution to the radio astronomy computing problem will advance our expertise in high performance computing and communications technology and the application of these techniques to astronomical data processing.

  4. PATHFINDER: Probing Atmospheric Flows in an Integrated and Distributed Environment

    NASA Technical Reports Server (NTRS)

    Wilhelmson, R. B.; Wojtowicz, D. P.; Shaw, C.; Hagedorn, J.; Koch, S.

    1995-01-01

    PATHFINDER is a software effort to create a flexible, modular, collaborative, and distributed environment for studying atmospheric, astrophysical, and other fluid flows in the evolving networked metacomputer environment of the 1990s. It uses existing software, such as HDF (Hierarchical Data Format), DTM (Data Transfer Mechanism), GEMPAK (General Meteorological Package), AVS, SGI Explorer, and Inventor to provide the researcher with the ability to harness the latest in desktop to teraflop computing. Software modules developed during the project are available in the public domain via anonymous FTP from the National Center for Supercomputing Applications (NCSA). The address is ftp.ncsa.uiuc.edu, and the directory is /SGI/PATHFINDER.

  5. VRML and Collaborative Environments: New Tools for Networked Visualization

    NASA Astrophysics Data System (ADS)

    Crutcher, R. M.; Plante, R. L.; Rajlich, P.

    We present two new applications that engage the network as a tool for astronomical research and/or education. The first is a VRML server which allows users over the Web to interactively create three-dimensional visualizations of FITS images contained in the NCSA Astronomy Digital Image Library (ADIL). The server's Web interface allows users to select images from the ADIL, fill in processing parameters, and create renderings featuring isosurfaces, slices, contours, and annotations; the often extensive computations are carried out on an NCSA SGI supercomputer server without the user having an individual account on the system. The user can then download the 3D visualizations as VRML files, which may be rotated and manipulated locally on virtually any class of computer. The second application is the ADILBrowser, a part of the NCSA Horizon Image Data Browser Java package. ADILBrowser allows a group of participants to browse images from the ADIL within a collaborative session. The collaborative environment is provided by the NCSA Habanero package which includes text and audio chat tools and a white board. The ADILBrowser is just an example of a collaborative tool that can be built with the Horizon and Habanero packages. The classes provided by these packages can be assembled to create custom collaborative applications that visualize data either from local disk or from anywhere on the network.

  6. Application-level regression testing framework using Jenkins

    DOE PAGES

    Budiardja, Reuben; Bouvet, Timothy; Arnold, Galen

    2017-09-26

    Monitoring and testing for regression of large-scale systems such as NCSA's Blue Waters supercomputer are challenging tasks. In this paper, we describe the solution we developed to perform those tasks. The goal was to find an automated solution for running user-level regression tests to evaluate system usability and performance. Jenkins, an automation server, was chosen for its versatility, large user base, and multitude of plugins, including ones for collecting data and plotting test results over time. We also describe our Jenkins deployment to launch and monitor jobs on remote HPC systems, perform authentication with one-time passwords, and integrate with our LDAP server for authorization. We show some use cases and describe our best practices for successfully using Jenkins as a user-level, system-wide regression testing and monitoring framework for large supercomputer systems.
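
    The record above describes driving user-level regression tests from Jenkins and charting the results over time. As an illustration of that idea only, the following Python sketch shows the general shape of a test script such a Jenkins job could invoke on a login node; the job script path, the output line being parsed, and the pass/fail threshold are invented placeholders, not Blue Waters specifics.

        #!/usr/bin/env python3
        """Minimal sketch of a user-level regression check that a Jenkins job could
        invoke on an HPC login node. The batch script, its output format, and the
        performance threshold are illustrative placeholders."""
        import re
        import subprocess
        import sys

        JOB_SCRIPT = "benchmarks/stream_test.sh"   # hypothetical batch script
        MIN_BANDWIDTH_GBS = 80.0                   # hypothetical pass/fail threshold

        def run_benchmark() -> str:
            # Run the benchmark synchronously; on a real system this could be a
            # scheduler submission (e.g. a blocking "submit and wait" call) instead.
            result = subprocess.run(["bash", JOB_SCRIPT], capture_output=True,
                                    text=True, check=True)
            return result.stdout

        def extract_bandwidth(output: str) -> float:
            # Expect a line such as "Triad bandwidth: 93.2 GB/s" in the job output.
            match = re.search(r"Triad bandwidth:\s*([\d.]+)\s*GB/s", output)
            if match is None:
                raise RuntimeError("benchmark output did not contain a bandwidth line")
            return float(match.group(1))

        if __name__ == "__main__":
            bandwidth = extract_bandwidth(run_benchmark())
            print(f"measured {bandwidth:.1f} GB/s (threshold {MIN_BANDWIDTH_GBS} GB/s)")
            # A non-zero exit status marks the Jenkins build as failed; the printed
            # value can be archived by a plotting plugin to track performance over time.
            sys.exit(0 if bandwidth >= MIN_BANDWIDTH_GBS else 1)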

  7. Application-level regression testing framework using Jenkins

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Budiardja, Reuben; Bouvet, Timothy; Arnold, Galen

    Monitoring and testing for regression of large-scale systems such as NCSA's Blue Waters supercomputer are challenging tasks. In this paper, we describe the solution we developed to perform those tasks. The goal was to find an automated solution for running user-level regression tests to evaluate system usability and performance. Jenkins, an automation server, was chosen for its versatility, large user base, and multitude of plugins, including ones for collecting data and plotting test results over time. We also describe our Jenkins deployment to launch and monitor jobs on remote HPC systems, perform authentication with one-time passwords, and integrate with our LDAP server for authorization. We show some use cases and describe our best practices for successfully using Jenkins as a user-level, system-wide regression testing and monitoring framework for large supercomputer systems.

  8. IMIRSEL: a secure music retrieval testing environment

    NASA Astrophysics Data System (ADS)

    Downie, John S.

    2004-10-01

    The Music Information Retrieval (MIR) and Music Digital Library (MDL) research communities have long noted the need for formal evaluation mechanisms. Issues concerning the unavailability of freely-available music materials have greatly hindered the creation of standardized test collections with which these communities could scientifically assess the strengths and weaknesses of their various music retrieval techniques. The International Music Information Retrieval Systems Evaluation Laboratory (IMIRSEL) is being developed at the University of Illinois at Urbana-Champaign (UIUC) specifically to overcome this hindrance to the scientific evaluation of MIR/MDL systems. Together with its subsidiary Human Use of Music Information Retrieval Systems (HUMIRS) project, IMIRSEL will allow MIR/MDL researchers access to the standardized large-scale collection of copyright-sensitive music materials and standardized test queries being housed at UIUC's National Center for Supercomputing Applications (NCSA). Virtual Research Labs (VRL), based upon NCSA's Data-to-Knowledge (D2K) tool set, are being developed through which MIR/MDL researchers will interact with the music materials under a "trusted code" security model.

  9. Building the interspace: Digital library infrastructure for a University Engineering Community

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schatz, B.

    A large-scale digital library is being constructed and evaluated at the University of Illinois, with the goal of bringing professional search and display to Internet information services. A testbed planned to grow to 10K documents and 100K users is being constructed in the Grainger Engineering Library Information Center, as a joint effort of the University Library and the National Center for Supercomputing Applications (NCSA), with evaluation and research by the Graduate School of Library and Information Science and the Department of Computer Science. The electronic collection will be articles from engineering and science journals and magazines, obtained directly from publishers in SGML format and displayed containing all text, figures, tables, and equations. The publisher partners include IEEE Computer Society, AIAA (Aerospace Engineering), American Physical Society, and Wiley & Sons. The software will be based upon NCSA Mosaic as a network engine connected to commercial SGML displayers and full-text searchers. The users will include faculty/students across the midwestern universities in the Big Ten, with evaluations via interviews, surveys, and transaction logs. Concurrently, research into scaling the testbed is being conducted. This includes efforts in computer science, information science, library science, and information systems. These efforts will evaluate different semantic retrieval technologies, including automatic thesaurus and subject classification graphs. New architectures will be designed and implemented for a next generation digital library infrastructure, the Interspace, which supports interaction with information spread across information spaces within the Net.

  10. Analysis, Mining and Visualization Service at NCSA

    NASA Astrophysics Data System (ADS)

    Wilhelmson, R.; Cox, D.; Welge, M.

    2004-12-01

    NCSA's goal is to create a balanced system that fully supports high-end computing as well as: 1) high-end data management and analysis; 2) visualization of massive, highly complex data collections; 3) large databases; 4) geographically distributed Grid computing; and 5) collaboratories, all based on a secure computational environment and driven with workflow-based services. To this end NCSA has defined a new technology path that includes the integration and provision of cyberservices in support of data analysis, mining, and visualization. NCSA has begun to develop and apply a data mining system, NCSA Data-to-Knowledge (D2K), in conjunction with both the application and research communities. NCSA D2K will enable the formation of model-based application workflows and visual programming interfaces for rapid data analysis. The Java-based D2K framework, which integrates analytical data mining methods with data management, data transformation, and information visualization tools, will be configurable from the cyberservices (web and grid services, tools, etc.) viewpoint to solve a wide range of important data mining problems. This effort will use modules such as new classification methods for the detection of high-risk geoscience events, as well as existing D2K data management, machine learning, and information visualization modules. A D2K cyberservices interface will be developed to seamlessly connect client applications with remote back-end D2K servers, providing computational resources for data mining and integration with local or remote data stores. This work is being coordinated with SDSC's data and services efforts. The new NCSA Visualization embedded workflow environment (NVIEW) will be integrated with D2K functionality to tightly couple informatics and scientific visualization with the data analysis and management services. Visualization services will access and filter disparate data sources, simplifying tasks such as fusing related data from distinct sources into a coherent visual representation. This approach enables collaboration among geographically dispersed researchers via portals and front-end clients, and the coupling with data management services enables recording associations among datasets and building annotation systems into visualization tools and portals, giving scientists a persistent, shareable, virtual lab notebook. To facilitate provision of these cyberservices to the national community, NCSA will be providing a computational environment for large-scale data assimilation, analysis, mining, and visualization. This will be initially implemented on the new 512-processor shared-memory SGI systems recently purchased by NCSA. In addition to standard batch capabilities, NCSA will provide on-demand capabilities for those projects requiring rapid response (e.g., development of severe weather, earthquake events) for decision makers. It will also be used for non-sequential interactive analysis of data sets where it is important to have access to large data volumes over space and time.
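
    Since the abstract centers on assembling modular workflows (data access, transformation, mining, visualization) into analysis pipelines, here is a toy Python sketch of that dataflow-composition pattern. It is only an illustration of the idea; D2K itself is a Java framework, and the stage names below are invented.

        """Toy dataflow composition in the spirit of module-based workflows."""
        from typing import Callable, Iterable, List

        Module = Callable[[List[float]], List[float]]   # a stage maps data to data

        def load_stage(values: Iterable[float]) -> Module:
            return lambda _ignored: list(values)            # data-access stage

        def transform_stage(scale: float) -> Module:
            return lambda data: [scale * x for x in data]   # data-transformation stage

        def threshold_stage(limit: float) -> Module:
            # toy "classification" stage flagging high values as events of interest
            return lambda data: [x for x in data if x > limit]

        def run_workflow(stages: List[Module]) -> List[float]:
            data: List[float] = []
            for stage in stages:        # stages execute in order, like wired modules
                data = stage(data)
            return data

        if __name__ == "__main__":
            result = run_workflow([load_stage([0.2, 1.7, 3.1]),
                                   transform_stage(10.0),
                                   threshold_stage(15.0)])
            print(result)   # [17.0, 31.0]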

  11. NASA's Participation in the National Computational Grid

    NASA Technical Reports Server (NTRS)

    Feiereisen, William J.; Zornetzer, Steve F. (Technical Monitor)

    1998-01-01

    Over the last several years it has become evident that the character of NASA's supercomputing needs has changed. One of the major missions of the agency is to support the design and manufacture of aero- and space-vehicles with technologies that will significantly reduce their cost. It is becoming clear that improvements in the process of aerospace design and manufacturing will require a high performance information infrastructure that allows geographically dispersed teams to draw upon resources that are broader than traditional supercomputing. A computational grid draws together our information resources into one system. We can foresee the time when a Grid will allow engineers and scientists to use the tools of supercomputers, databases and on-line experimental devices in a virtual environment to collaborate with distant colleagues. The concept of a computational grid has been spoken of for many years, but several events in recent times are conspiring to allow us to actually build one. In late 1997 the National Science Foundation initiated the Partnerships for Advanced Computational Infrastructure (PACI), which is built around the idea of distributed high performance computing. The National Computational Science Alliance, led by the National Center for Supercomputing Applications (NCSA), and the National Partnership for Advanced Computational Infrastructure (NPACI), led by the San Diego Supercomputer Center, have been instrumental in drawing together the "Grid Community" to identify the technology bottlenecks and propose a research agenda to address them. During the same period NASA has begun to reformulate parts of two major high performance computing research programs to concentrate on distributed high performance computing and has banded together with the PACI centers to address the research agenda in common.

  12. Remedial Investigation Report. Volume 11. North Central Study Area, Section 1.0 Text. Version 3.3

    DTIC Science & Technology

    1989-07-01

    Excerpted list-of-tables fragments matching "NCSA" (North Central Study Area): Soils ... NCSA 1.5-1; Summary of Alluvial Aquifer Pumping Tests ... NCSA 1.5-2; Summary of Aquifer Parameters - Alluvial Aquifer ... NCSA 1.5-3; Summary of Results for Pumping Tests in the Denver Formation ... NCSA 1.5-4; Summary of Hydraulic ... The snippet continues with body text: "... of fluid was pumped to Basin C and the liner was repaired. The remaining fluid in Basins A and C was transferred to Basin F, which by this time was ..."

  13. A World Wide Web (WWW) server database engine for an organelle database, MitoDat.

    PubMed

    Lemkin, P F; Chipperfield, M; Merril, C; Zullo, S

    1996-03-01

    We describe a simple database search engine, "dbEngine", which may be used to quickly create a searchable database on a World Wide Web (WWW) server. Data may be prepared from spreadsheet programs (such as Excel) or from tables exported from relational database systems. This Common Gateway Interface (CGI-BIN) program is used with a WWW server such as those available commercially or from the National Center for Supercomputing Applications (NCSA) or CERN. Its capabilities include: (i) searching records by combinations of terms connected with ANDs or ORs; (ii) returning search results as hypertext links to other WWW database servers; (iii) mapping lists of literature reference identifiers to the full references; (iv) creating bidirectional hypertext links between pictures and the database. DbEngine has been used to support the MitoDat database (Mendelian and non-Mendelian inheritance associated with the mitochondrion) on the WWW.
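
    To make the AND/OR query capability listed under (i) concrete, here is a small Python sketch of that search logic over spreadsheet-style records. It is not the dbEngine CGI program itself, and the example records are invented.

        """Toy AND/OR keyword search over flat records, illustrating capability (i)."""
        from typing import Dict, List

        RECORDS: List[Dict[str, str]] = [
            {"gene": "MT-ND1", "function": "NADH dehydrogenase subunit 1"},
            {"gene": "MT-CO1", "function": "cytochrome c oxidase subunit I"},
            {"gene": "MT-ATP6", "function": "ATP synthase F0 subunit 6"},
        ]

        def matches(record: Dict[str, str], terms: List[str], mode: str = "AND") -> bool:
            """True if the record's fields contain the terms, combined with AND or OR."""
            text = " ".join(record.values()).lower()
            hits = [term.lower() in text for term in terms]
            return all(hits) if mode == "AND" else any(hits)

        def search(terms: List[str], mode: str = "AND") -> List[Dict[str, str]]:
            return [record for record in RECORDS if matches(record, terms, mode)]

        if __name__ == "__main__":
            print(search(["subunit", "oxidase"], mode="AND"))   # matches one record
            print(search(["ATP", "NADH"], mode="OR"))           # matches two records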

  14. Mapping to Irregular Torus Topologies and Other Techniques for Petascale Biomolecular Simulation

    PubMed Central

    Phillips, James C.; Sun, Yanhua; Jain, Nikhil; Bohm, Eric J.; Kalé, Laxmikant V.

    2014-01-01

    Currently deployed petascale supercomputers typically use toroidal network topologies in three or more dimensions. While these networks perform well for topology-agnostic codes on a few thousand nodes, leadership machines with 20,000 nodes require topology awareness to avoid network contention for communication-intensive codes. Topology adaptation is complicated by irregular node allocation shapes and holes due to dedicated input/output nodes or hardware failure. In the context of the popular molecular dynamics program NAMD, we present methods for mapping a periodic 3-D grid of fixed-size spatial decomposition domains to 3-D Cray Gemini and 5-D IBM Blue Gene/Q toroidal networks to enable hundred-million atom full machine simulations, and to similarly partition node allocations into compact domains for smaller simulations using multiple-copy algorithms. Additional enabling techniques are discussed and performance is reported for NCSA Blue Waters, ORNL Titan, ANL Mira, TACC Stampede, and NERSC Edison. PMID:25594075
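
    The core idea in the abstract, placing a periodic 3-D grid of spatial domains onto a toroidal network so that communicating neighbors stay close, can be illustrated with a deliberately simplified Python sketch. The proportional block mapping below is not NAMD's or Charm++'s actual algorithm (which must also handle irregular allocations and holes); it only shows the flavor of topology-aware placement and of measuring hop distance with wrap-around.

        """Simplified topology-aware placement of a periodic 3-D domain grid on a 3-D torus."""

        def block_map(grid_dims, torus_dims):
            """Proportionally map each grid index (i, j, k) to a torus node coordinate."""
            mapping = {}
            for i in range(grid_dims[0]):
                for j in range(grid_dims[1]):
                    for k in range(grid_dims[2]):
                        node = tuple((idx * t) // g
                                     for idx, g, t in zip((i, j, k), grid_dims, torus_dims))
                        mapping[(i, j, k)] = node
            return mapping

        def torus_distance(a, b, torus_dims):
            """Hop count on a torus, taking the shorter wrap-around direction per axis."""
            return sum(min(abs(x - y), t - abs(x - y))
                       for x, y, t in zip(a, b, torus_dims))

        if __name__ == "__main__":
            mapping = block_map((8, 8, 8), (4, 4, 4))
            # Neighboring domains land on the same or an adjacent node (distance 0 or 1).
            print(torus_distance(mapping[(0, 0, 0)], mapping[(1, 0, 0)], (4, 4, 4)))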

  15. Comparison of Accuracy and Performance for Lattice Boltzmann and Finite Difference Simulations of Steady Viscous Flow

    NASA Astrophysics Data System (ADS)

    Noble, David R.; Georgiadis, John G.; Buckius, Richard O.

    1996-07-01

    The lattice Boltzmann method (LBM) is used to simulate flow in an infinite periodic array of octagonal cylinders. Results are compared with those obtained by a finite difference (FD) simulation solved in terms of streamfunction and vorticity using an alternating direction implicit scheme. Computed velocity profiles are compared along lines common to both the lattice Boltzmann and finite difference grids. Along all such slices, both streamwise and transverse velocity predictions agree to within 0.5% of the average streamwise velocity. The local shear on the surface of the cylinders also compares well, with the only deviations occurring in the vicinity of the corners of the cylinders, where the slope of the shear is discontinuous. When a constant dimensionless relaxation time is maintained, LBM exhibits the same convergence behaviour as the FD algorithm, with the time step increasing as the square of the grid size. By adjusting the relaxation time such that a constant Mach number is achieved, the time step of LBM varies linearly with the grid size. The efficiency of LBM on the CM-5 parallel computer at the National Center for Supercomputing Applications (NCSA) is evaluated by examining each part of the algorithm. Overall, a speed of 13.9 GFLOPS is obtained using 512 processors for a domain size of 2176×2176.
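
    The two time-step statements near the end of the abstract follow from the standard lattice-BGK viscosity relation; the bookkeeping below is a general LBM identity offered for context, not a formula quoted from the paper. With lattice spacing \Delta x, time step \Delta t, lattice sound speed c_s, and dimensionless relaxation time \tau, the kinematic viscosity is

        \nu \;=\; c_s^{2}\left(\tau - \tfrac{1}{2}\right)\frac{\Delta x^{2}}{\Delta t}.

    Holding \nu and \tau fixed while refining the grid therefore forces \Delta t \propto \Delta x^{2} (the diffusive scaling matching the FD comparison), whereas holding the Mach number, proportional to u\,\Delta t/\Delta x, fixed gives \Delta t \propto \Delta x, at the price of a relaxation time that approaches 1/2 as the grid is refined.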

  16. The National Comorbidity Survey Adolescent Supplement (NCS-A): I. Background and Measures

    PubMed Central

    Merikangas, Kathleen R.; Avenevoli, Shelli; Costello, E. Jane; Koretz, Doreen; Kessler, Ronald C.

    2009-01-01

    Objective: This paper presents an overview of the background and measures used in the National Comorbidity Survey Replication Adolescent Supplement (NCS-A). Methods: The NCS-A is a national psychiatric epidemiological survey of adolescents ages 13–17. Results: The NCS-A was designed to provide the first nationally representative estimates of the prevalence, correlates and patterns of service use for DSM-IV mental disorders among US adolescents and to lay the groundwork for follow-up studies of risk-protective factors, consequences, and early expressions of adult mental disorders. The core NCS-A diagnostic interview, the World Health Organization (WHO) Composite International Diagnostic Interview (CIDI), is a fully structured research diagnostic interview designed for use by trained lay interviewers. A multi-construct, multi-method, multi-informant battery was also included to assess risk and protective factors and barriers to service use. Design limitations due to the NCS-A evolving as a supplement to an ongoing survey of mental disorders of US adults include the restricted age range of youth, cross-sectional assessment, and lack of full parental/surrogate informant reports on youth mental disorders and correlates. Conclusions: Despite these limitations, the NCS-A contains unparalleled information that can be used to generate national estimates of prevalence and correlates of adolescent mental disorders, risk and protective factors, patterns of service use, and barriers to receiving treatment for these disorders. The retrospective NCS-A data on the development of psychopathology can additionally complement data from longitudinal studies based on more geographically restricted samples and serve as a useful baseline for future prospective studies of the onset and progression of mental disorders in adulthood. PMID:19242382

  17. Highly parallel implementation of non-adiabatic Ehrenfest molecular dynamics

    NASA Astrophysics Data System (ADS)

    Kanai, Yosuke; Schleife, Andre; Draeger, Erik; Anisimov, Victor; Correa, Alfredo

    2014-03-01

    While the adiabatic Born-Oppenheimer approximation tremendously lowers computational effort, many questions in modern physics, chemistry, and materials science require an explicit description of coupled non-adiabatic electron-ion dynamics. Electronic stopping, i.e. the energy transfer of a fast projectile atom to the electronic system of the target material, is a notorious example. We recently implemented real-time time-dependent density functional theory based on the plane-wave pseudopotential formalism in the Qbox/qb@ll codes. We demonstrate that explicit integration using a fourth-order Runge-Kutta scheme is very suitable for modern highly parallelized supercomputers. Applying the new implementation to systems with hundreds of atoms and thousands of electrons, we achieved excellent performance and scalability on a large number of nodes both on the BlueGene-based "Sequoia" system at LLNL as well as the Cray architecture of "Blue Waters" at NCSA. As an example, we discuss our work on computing the electronic stopping power of aluminum and gold for hydrogen projectiles, showing excellent agreement with experiment. These first-principles calculations allow us to gain important insight into the fundamental physics of electronic stopping.
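
    As a toy companion to the integration scheme mentioned in the abstract, the following Python/NumPy sketch applies an explicit fourth-order Runge-Kutta step to i d(psi)/dt = H psi for a small model Hamiltonian. It only illustrates the RK4 propagation step; the actual Qbox/qb@ll implementation propagates plane-wave TDDFT orbitals on massively parallel machines.

        """RK4 propagation of i d(psi)/dt = H psi for a small Hermitian test matrix."""
        import numpy as np

        def rhs(H: np.ndarray, psi: np.ndarray) -> np.ndarray:
            return -1j * H @ psi          # d(psi)/dt = -i H psi (hbar = 1)

        def rk4_step(H: np.ndarray, psi: np.ndarray, dt: float) -> np.ndarray:
            k1 = rhs(H, psi)
            k2 = rhs(H, psi + 0.5 * dt * k1)
            k3 = rhs(H, psi + 0.5 * dt * k2)
            k4 = rhs(H, psi + dt * k3)
            return psi + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            A = rng.standard_normal((4, 4))
            H = 0.5 * (A + A.T)                      # small Hermitian test Hamiltonian
            psi = np.ones(4, dtype=complex) / 2.0    # normalized initial state
            for _ in range(1000):
                psi = rk4_step(H, psi, dt=1.0e-3)
            print(abs(np.vdot(psi, psi)))            # norm stays close to 1 for small dt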

  18. Pulling the Internet Together with Mosaic.

    ERIC Educational Resources Information Center

    Sheehan, Mark

    1995-01-01

    Presents the history of the Internet with specific emphasis on Mosaic; discusses hypertext and hypermedia information; and describes software and hardware requirements. Sidebars include information on the National Center for Supercomputing Applications (NCSA); World Wide Web browsers for use in Windows, Macintosh, and X-Windows (UNIX); and…

  19. Virtual Sensors in a Web 2.0 Digital Watershed

    NASA Astrophysics Data System (ADS)

    Liu, Y.; Hill, D. J.; Marini, L.; Kooper, R.; Rodriguez, A.; Myers, J. D.

    2008-12-01

    The lack of rainfall data in many watersheds is one of the major barriers for modeling and studying many environmental and hydrological processes and for supporting decision making. There are simply not enough rain gages on the ground. To overcome this data scarcity issue, a Web 2.0 digital watershed is developed at NCSA (National Center for Supercomputing Applications), where users can point and click on a web-based Google Maps interface and create new precipitation virtual sensors at any location within the same coverage region as a NEXRAD station. A set of scientific workflows is implemented to perform spatial, temporal and thematic transformations on the near-real-time NEXRAD Level II data. Such workflows can be triggered by the users' actions and generate either rainfall-rate or rainfall-accumulation streaming data at a user-specified time interval. We will discuss some underlying components of this digital watershed, which consists of a semantic content management middleware, a semantically enhanced streaming data toolkit, virtual sensor management functionality, and a RESTful (REpresentational State Transfer) web service that can trigger the workflow execution. Such a loosely coupled architecture presents a generic framework for constructing a Web 2.0 style digital watershed. An implementation of this architecture at the Upper Illinois River Basin will be presented. We will also discuss the implications of the virtual sensor concept for the broad environmental observatory community and how such a concept will help us move towards a participatory digital watershed.
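
    Because the abstract highlights a RESTful web service that triggers the virtual-sensor workflows, here is a hedged Python sketch of the kind of client call such a service could accept. The endpoint URL, parameter names, and response format are invented for illustration and are not the actual NCSA interface.

        """Hypothetical client for a REST endpoint that creates a precipitation virtual sensor."""
        import json
        import urllib.request

        SERVICE_URL = "http://example.org/digitalwatershed/virtualsensors"  # hypothetical

        def create_virtual_sensor(lat: float, lon: float, interval_minutes: int,
                                  quantity: str = "rainfall_rate") -> dict:
            """POST a sensor definition; a workflow engine would then start producing a
            data stream derived from NEXRAD Level II data at the requested location."""
            payload = json.dumps({"latitude": lat, "longitude": lon,
                                  "interval_minutes": interval_minutes,
                                  "quantity": quantity}).encode("utf-8")
            request = urllib.request.Request(SERVICE_URL, data=payload,
                                             headers={"Content-Type": "application/json"},
                                             method="POST")
            with urllib.request.urlopen(request) as response:
                return json.load(response)

        if __name__ == "__main__":
            # Example: request a 15-minute rainfall-rate sensor near Urbana, Illinois.
            print(create_virtual_sensor(40.11, -88.21, interval_minutes=15))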

  20. CI-KNOW: Cyberinfrastructure Knowledge Networks on the Web. A Social Network Enabled Recommender System for Locating Resources in Cyberinfrastructures

    NASA Astrophysics Data System (ADS)

    Green, H. D.; Contractor, N. S.; Yao, Y.

    2006-12-01

    A knowledge network is a multi-dimensional network created from the interactions and interconnections among the scientists, documents, data, analytic tools, and interactive collaboration spaces (like forums and wikis) associated with a collaborative environment. CI-KNOW is a suite of software tools that leverages automated data collection, social network theories, analysis techniques and algorithms to infer an individual's interests and expertise based on their interactions and activities within a knowledge network. The CI-KNOW recommender system mines the knowledge network associated with a scientific community's use of cyberinfrastructure tools and uses relational metadata to record connections among entities in the knowledge network. Recent developments in social network theories and methods provide the backbone for a modular system that creates recommendations from relational metadata. A network navigation portlet allows users to locate colleagues, documents, data or analytic tools in the knowledge network and to explore their networks through a visual, step-wise process. An internal auditing portlet offers administrators diagnostics to assess the growth and health of the entire knowledge network. The first instantiation of the prototype CI-KNOW system is part of the Environmental Cyberinfrastructure Demonstration project at the National Center for Supercomputing Applications, which supports the activities of hydrologic and environmental science communities (CLEANER and CUAHSI) under the umbrella of the WATERS network environmental observatory planning activities (http://cleaner.ncsa.uiuc.edu). This poster summarizes the key aspects of the CI-KNOW system, highlighting the key inputs, calculation mechanisms, and output modalities.

  1. NASA's supercomputing experience

    NASA Technical Reports Server (NTRS)

    Bailey, F. Ron

    1990-01-01

    A brief overview of NASA's recent experience in supercomputing is presented from two perspectives: early systems development and advanced supercomputing applications. NASA's role in supercomputing systems development is illustrated by discussion of activities carried out by the Numerical Aerodynamical Simulation Program. Current capabilities in advanced technology applications are illustrated with examples in turbulence physics, aerodynamics, aerothermodynamics, chemistry, and structural mechanics. Capabilities in science applications are illustrated by examples in astrophysics and atmospheric modeling. Future directions and NASA's new High Performance Computing Program are briefly discussed.

  2. Supercomputer networking for space science applications

    NASA Technical Reports Server (NTRS)

    Edelson, B. I.

    1992-01-01

    The initial design of a supercomputer network topology including the design of the communications nodes along with the communications interface hardware and software is covered. Several space science applications that are proposed experiments by GSFC and JPL for a supercomputer network using the NASA ACTS satellite are also reported.

  3. Lifetime Prevalence of Mental Disorders in U.S. Adolescents: Results from the National Comorbidity Survey Replication-Adolescent Supplement (NCS-A)

    ERIC Educational Resources Information Center

    Merikangas, Kathleen Ries; He, Jian-ping; Burstein, Marcy; Swanson, Sonja A.; Avenevoli, Shelli; Cui, Lihong; Benjet, Corina; Georgiades, Katholiki; Swendsen, Joel

    2010-01-01

    Objective: To present estimates of the lifetime prevalence of "DSM-IV" mental disorders with and without severe impairment, their comorbidity across broad classes of disorder, and their sociodemographic correlates. Method: The National Comorbidity Survey-Adolescent Supplement (NCS-A) is a nationally representative face-to-face survey of…

  4. Virtual Observatory and Distributed Data Mining

    NASA Astrophysics Data System (ADS)

    Borne, Kirk D.

    2012-03-01

    New modes of discovery are enabled by the growth of data and computational resources (i.e., cyberinfrastructure) in the sciences. This cyberinfrastructure includes structured databases, virtual observatories (distributed data, as described in Section 20.2.1 of this chapter), high-performance computing (petascale machines), distributed computing (e.g., the Grid, the Cloud, and peer-to-peer networks), intelligent search and discovery tools, and innovative visualization environments. Data streams from experiments, sensors, and simulations are increasingly complex and growing in volume. This is true in most sciences, including astronomy, climate simulations, Earth observing systems, remote sensing data collections, and sensor networks. At the same time, we see an emerging confluence of new technologies and approaches to science, most clearly visible in the growing synergism of the four modes of scientific discovery: sensors-modeling-computing-data (Eastman et al. 2005). This has been driven by numerous developments, including the information explosion, development of large-array sensors, acceleration in high-performance computing (HPC) power, advances in algorithms, and efficient modeling techniques. Among these, the most extreme is the growth in new data. Specifically, the acquisition of data in all scientific disciplines is rapidly accelerating and causing a data glut (Bell et al. 2007). It has been estimated that data volumes double every year—for example, the NCSA (National Center for Supercomputing Applications) reported that their users cumulatively generated one petabyte of data over the first 19 years of NCSA operation, but they then generated their next one petabyte in the next year alone, and the data production has been growing by almost 100% each year after that (Butler 2008). The NCSA example is just one of many demonstrations of the exponential (annual data-doubling) growth in scientific data collections. In general, this putative data-doubling is an inevitable result of several compounding factors: the proliferation of data-generating devices, sensors, projects, and enterprises; the 18-month doubling of the digital capacity of these microprocessor-based sensors and devices (commonly referred to as "Moore’s law"); the move to digital for nearly all forms of information; the increase in human-generated data (both unstructured information on the web and structured data from experiments, models, and simulation); and the ever-expanding capability of higher density media to hold greater volumes of data (i.e., data production expands to fill the available storage space). These factors are consequently producing an exponential data growth rate, which will soon (if not already) become an insurmountable technical challenge even with the great advances in computation and algorithms. This technical challenge is compounded by the ever-increasing geographic dispersion of important data sources—the data collections are not stored uniformly at a single location, or with a single data model, or in uniform formats and modalities (e.g., images, databases, structured and unstructured files, and XML data sets)—the data are in fact large, distributed, heterogeneous, and complex. The greatest scientific research challenge with these massive distributed data collections is consequently extracting all of the rich information and knowledge content contained therein, thus requiring new approaches to scientific research. 
This emerging data-intensive and data-oriented approach to scientific research is sometimes called discovery informatics or X-informatics (where X can be any science, such as bio, geo, astro, chem, eco, or anything; Agresti 2003; Gray 2003; Borne 2010). This data-oriented approach to science is now recognized by some (e.g., Mahootian and Eastman 2009; Hey et al. 2009) as the fourth paradigm of research, following (historically) experiment/observation, modeling/analysis, and computational science.

  5. Input/output behavior of supercomputing applications

    NASA Technical Reports Server (NTRS)

    Miller, Ethan L.

    1991-01-01

    The collection and analysis of supercomputer I/O traces and their use in a collection of buffering and caching simulations are described. This serves two purposes. First, it gives a model of how individual applications running on supercomputers request file system I/O, allowing system designers to optimize I/O hardware and file system algorithms to that model. Second, the buffering simulations show what resources are needed to maximize the CPU utilization of a supercomputer given a very bursty I/O request rate. By using read-ahead and write-behind in a large solid-state disk, one or two applications were sufficient to fully utilize a Cray Y-MP CPU.

  6. Optimization of Supercomputer Use on EADS II System

    NASA Technical Reports Server (NTRS)

    Ahmed, Ardsher

    1998-01-01

    The main objective of this research was to optimize supercomputer use to achieve better throughput and utilization of supercomputers and to help facilitate the movement of non-supercomputing (inappropriate for supercomputer) codes to mid-range systems for better use of Government resources at Marshall Space Flight Center (MSFC). This work involved the survey of architectures available on EADS II and monitoring customer (user) applications running on a CRAY T90 system.

  7. Supercomputer applications in molecular modeling.

    PubMed

    Gund, T M

    1988-01-01

    An overview of the functions performed by molecular modeling is given. Molecular modeling techniques benefiting from supercomputing are described, namely conformational searching, deriving bioactive conformations, pharmacophoric pattern searching, receptor mapping, and electrostatic properties. The use of supercomputers for problems that are computationally intensive, such as protein structure prediction, protein dynamics and reactivity, protein conformations, and energetics of binding, is also examined. The current status of supercomputing and supercomputer resources is discussed.

  8. Automotive applications of supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ginsberg, M.

    1987-01-01

    These proceedings compile papers on supercomputers in the automobile industry. Titles include: "An Automotive Engineer's Guide to the Effective Use of Scalar, Vector, and Parallel Computers"; "Fluid Mechanics, Finite Elements, and Supercomputers"; and "Automotive Crashworthiness Performance on a Supercomputer."

  9. 3D medical volume reconstruction using web services.

    PubMed

    Kooper, Rob; Shirk, Andrew; Lee, Sang-Chul; Lin, Amy; Folberg, Robert; Bajcsy, Peter

    2008-04-01

    We address the problem of 3D medical volume reconstruction using web services. The use of the proposed web services is motivated by the fact that the problem of 3D medical volume reconstruction requires significant computer resources and human expertise in medical and computer science areas. Web services are implemented as an additional layer to a dataflow framework called Data to Knowledge. In the collaboration between UIC and NCSA, pre-processed input images at NCSA are made accessible to medical collaborators for registration. Every time UIC medical collaborators inspect images and select corresponding features for registration, the web service at NCSA is contacted and the registration processing query is executed using the Image to Knowledge library of registration methods. Co-registered frames are returned for verification by medical collaborators in a new window. In this paper, we present the 3D volume reconstruction problem requirements and the architecture of the developed prototype system at http://isda.ncsa.uiuc.edu/MedVolume. We also explain the tradeoffs of our system design and provide experimental data to support our system implementation. The prototype system has been used for multiple 3D volume reconstructions of blood vessels and vasculogenic mimicry patterns in histological sections of uveal melanoma studied by fluorescent confocal laser scanning microscopy.
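
    The registration step described above (aligning consecutive sections from user-selected corresponding features) is, at its core, a least-squares fit of a transformation to point pairs. The Python/NumPy sketch below shows a rigid (rotation plus translation) fit via the Kabsch/Procrustes method as a stand-in for that computation; the NCSA Image to Knowledge library supports richer transformation models, so treat this purely as an illustration.

        """Rigid 2-D alignment from corresponding feature points (Kabsch/Procrustes)."""
        import numpy as np

        def rigid_fit(src: np.ndarray, dst: np.ndarray):
            """Return rotation R and translation t minimizing ||R @ src_i + t - dst_i||."""
            src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
            H = (src - src_c).T @ (dst - dst_c)       # cross-covariance of centered points
            U, _, Vt = np.linalg.svd(H)
            D = np.diag([1.0, np.sign(np.linalg.det(Vt.T @ U.T))])   # avoid reflections
            R = Vt.T @ D @ U.T
            t = dst_c - R @ src_c
            return R, t

        if __name__ == "__main__":
            rng = np.random.default_rng(1)
            pts = rng.random((5, 2)) * 100.0                    # features in section k
            theta = np.deg2rad(12.0)
            R_true = np.array([[np.cos(theta), -np.sin(theta)],
                               [np.sin(theta),  np.cos(theta)]])
            moved = pts @ R_true.T + np.array([3.0, -7.5])      # same features in section k+1
            R, t = rigid_fit(pts, moved)
            print(np.allclose(pts @ R.T + t, moved))            # True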

  10. Improved Access to Supercomputers Boosts Chemical Applications.

    ERIC Educational Resources Information Center

    Borman, Stu

    1989-01-01

    Supercomputing is described in terms of computing power and abilities. The increase in the availability of supercomputers for use in chemical calculations and modeling is reported. Efforts of the National Science Foundation and Cray Research are highlighted. (CW)

  11. Full speed ahead for software

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wolfe, A.

    1986-03-10

    Supercomputing software is moving into high gear, spurred by the rapid spread of supercomputers into new applications. The critical challenge is how to develop tools that will make it easier for programmers to write applications that take advantage of vectorizing in the classical supercomputer and the parallelism that is emerging in supercomputers and minisupercomputers. Writing parallel software is a challenge that every programmer must face because parallel architectures are springing up across the range of computing. Cray is developing a host of tools for programmers. Tools to support multitasking (in supercomputer parlance, multitasking means dividing up a single program to run on multiple processors) are high on Cray's agenda. On tap for multitasking is Premult, dubbed a microtasking tool. As a preprocessor for Cray's CFT77 FORTRAN compiler, Premult will provide fine-grain multitasking.

  12. Data-Driven Geospatial Visual Analytics for Real-Time Urban Flooding Decision Support

    NASA Astrophysics Data System (ADS)

    Liu, Y.; Hill, D.; Rodriguez, A.; Marini, L.; Kooper, R.; Myers, J.; Wu, X.; Minsker, B. S.

    2009-12-01

    Urban flooding is responsible for the loss of life and property as well as the release of pathogens and other pollutants into the environment. Previous studies have shown that the spatial distribution of intense rainfall significantly impacts the triggering and behavior of urban flooding. However, no general purpose tools yet exist for deriving rainfall data and rendering them in real-time at the resolution of hydrologic units used for analyzing urban flooding. This paper presents a new visual analytics system that derives and renders rainfall data from the NEXRAD weather radar system at the sewershed (i.e., urban hydrologic unit) scale in real-time for a Chicago stormwater management project. We introduce a lightweight Web 2.0 approach which takes advantage of scientific workflow management and publishing capabilities developed at NCSA (National Center for Supercomputing Applications), a streaming-data-aware semantic content management repository, web-based Google Earth/Maps and time-aware KML (Keyhole Markup Language). A collection of polygon-based virtual sensors is created from the NEXRAD Level II data using spatial, temporal and thematic transformations at the sewershed level in order to produce persistent virtual rainfall data sources for the animation. An animated, color-coded rainfall map of the sewersheds can be played in real-time as a movie using time-aware KML inside the web browser-based Google Earth for visually analyzing the spatiotemporal patterns of rainfall intensity. Such a system provides valuable information for situational awareness and improved decision support during extreme storm events in an urban area. Our further work includes incorporating additional data (such as basement flooding event data) or physics-based predictive models that can be used for more integrated data-driven decision support.
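
    Since the abstract leans on time-aware KML to animate sewershed rainfall in Google Earth, here is a small Python sketch that emits one time-stamped, color-coded KML polygon per interval. The sewershed outline and the color ramp are invented; only the KML elements themselves (TimeSpan, PolyStyle, Polygon) follow the KML 2.2 schema.

        """Emit a time-aware KML Placemark whose polygon color encodes rainfall intensity."""

        def kml_color(rain_mm_per_hr: float) -> str:
            """KML colors are aabbggrr hex; fade from green (light rain) to red (heavy)."""
            level = min(max(rain_mm_per_hr / 50.0, 0.0), 1.0)
            red, green = int(255 * level), int(255 * (1.0 - level))
            return f"b4{0:02x}{green:02x}{red:02x}"

        def rainfall_placemark(name, coords, begin, end, rain_mm_per_hr):
            ring = " ".join(f"{lon},{lat},0" for lon, lat in coords)
            return (f"<Placemark><name>{name}: {rain_mm_per_hr:.1f} mm/h</name>"
                    f"<TimeSpan><begin>{begin}</begin><end>{end}</end></TimeSpan>"
                    f"<Style><PolyStyle><color>{kml_color(rain_mm_per_hr)}</color></PolyStyle></Style>"
                    f"<Polygon><outerBoundaryIs><LinearRing><coordinates>{ring}</coordinates>"
                    f"</LinearRing></outerBoundaryIs></Polygon></Placemark>")

        if __name__ == "__main__":
            square = [(-87.70, 41.85), (-87.69, 41.85), (-87.69, 41.86),
                      (-87.70, 41.86), (-87.70, 41.85)]   # toy sewershed outline
            body = rainfall_placemark("Sewershed 42", square,
                                      "2009-09-13T06:00:00Z", "2009-09-13T06:15:00Z", 32.5)
            print('<?xml version="1.0" encoding="UTF-8"?>\n'
                  '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>\n'
                  f'{body}\n</Document></kml>')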

  13. The National Comorbidity Survey Adolescent Supplement (NCS-A): II. Overview and Design

    PubMed Central

    Kessler, Ronald C.; Avenevoli, Shelli; Costello, E. Jane; Green, Jennifer Greif; Gruber, Michael J.; Heeringa, Steven; Merikangas, Kathleen R.; Pennell, Beth-Ellen; Sampson, Nancy A.; Zaslavsky, Alan M.

    2009-01-01

    OBJECTIVE: To present an overview of the design and field procedures of the National Comorbidity Survey Replication Adolescent Supplement (NCS-A). METHOD: The NCS-A is a nationally representative face-to-face household survey of the prevalence and correlates of DSM-IV mental disorders among US adolescents (ages 13–17) that was carried out between February 2001 and January 2004 by the Survey Research Center of the Institute for Social Research at the University of Michigan. The sample was based on a dual-frame design that included 904 adolescent residents of the households that participated in the National Comorbidity Survey Replication (85.9% response rate) and 9244 adolescent students selected from a representative sample of 320 schools in the same nationally representative sample of counties as the NCS-R (74.7% response rate). RESULTS: Comparisons of sample and population distributions on Census socio-demographic variables and, in the school sample, school characteristics documented only minor differences that were corrected with post-stratification weighting. Comparisons of DSM-IV disorder prevalence estimates among household vs. school sample respondents in counties that differed in the use of replacement schools for originally selected schools that refused to participate showed that the use of replacement schools did not introduce bias into prevalence estimates. CONCLUSIONS: The NCS-A is a rich nationally representative dataset that will substantially increase understanding of the mental health and well-being of adolescents in the United States. PMID:19242381

  14. Archaeal Tuc1/Ncs6 Homolog Required for Wobble Uridine tRNA Thiolation Is Associated with Ubiquitin-Proteasome, Translation, and RNA Processing System Homologs

    PubMed Central

    Chavarria, Nikita E.; Hwang, Sungmin; Cao, Shiyun; Fu, Xian; Holman, Mary; Elbanna, Dina; Rodriguez, Suzanne; Arrington, Deanna; Englert, Markus; Uthandi, Sivakumar; Söll, Dieter; Maupin-Furlow, Julie A.

    2014-01-01

    While cytoplasmic tRNA 2-thiolation protein 1 (Tuc1/Ncs6) and ubiquitin-related modifier-1 (Urm1) are important in the 2-thiolation of 5-methoxycarbonylmethyl-2-thiouridine (mcm5s2U) at wobble uridines of tRNAs in eukaryotes, the biocatalytic roles and properties of Ncs6/Tuc1 and its homologs are poorly understood. Here we present the first report of an Ncs6 homolog of archaea (NcsA of Haloferax volcanii) that is essential for maintaining cellular pools of thiolated tRNA(Lys)UUU and for growth at high temperature. When purified from Hfx. volcanii, NcsA was found to be modified at Lys204 by isopeptide linkage to polymeric chains of the ubiquitin-fold protein SAMP2. The ubiquitin-activating E1 enzyme homolog of archaea (UbaA) was required for this covalent modification. Non-covalent protein partners that specifically associated with NcsA were also identified including UbaA, SAMP2, proteasome activating nucleotidase (PAN)-A/1, translation elongation factor aEF-1α and a β-CASP ribonuclease homolog of the archaeal cleavage and polyadenylation specificity factor 1 family (aCPSF1). Together, our study reveals that NcsA is essential for growth at high temperature, required for formation of thiolated tRNA(Lys)UUU and intimately linked to homologs of ubiquitin-proteasome, translation and RNA processing systems. PMID:24906001

  15. Archaeal Tuc1/Ncs6 homolog required for wobble uridine tRNA thiolation is associated with ubiquitin-proteasome, translation, and RNA processing system homologs.

    PubMed

    Chavarria, Nikita E; Hwang, Sungmin; Cao, Shiyun; Fu, Xian; Holman, Mary; Elbanna, Dina; Rodriguez, Suzanne; Arrington, Deanna; Englert, Markus; Uthandi, Sivakumar; Söll, Dieter; Maupin-Furlow, Julie A

    2014-01-01

    While cytoplasmic tRNA 2-thiolation protein 1 (Tuc1/Ncs6) and ubiquitin-related modifier-1 (Urm1) are important in the 2-thiolation of 5-methoxycarbonylmethyl-2-thiouridine (mcm5s2U) at wobble uridines of tRNAs in eukaryotes, the biocatalytic roles and properties of Ncs6/Tuc1 and its homologs are poorly understood. Here we present the first report of an Ncs6 homolog of archaea (NcsA of Haloferax volcanii) that is essential for maintaining cellular pools of thiolated tRNA(Lys)UUU and for growth at high temperature. When purified from Hfx. volcanii, NcsA was found to be modified at Lys204 by isopeptide linkage to polymeric chains of the ubiquitin-fold protein SAMP2. The ubiquitin-activating E1 enzyme homolog of archaea (UbaA) was required for this covalent modification. Non-covalent protein partners that specifically associated with NcsA were also identified including UbaA, SAMP2, proteasome activating nucleotidase (PAN)-A/1, translation elongation factor aEF-1α and a β-CASP ribonuclease homolog of the archaeal cleavage and polyadenylation specificity factor 1 family (aCPSF1). Together, our study reveals that NcsA is essential for growth at high temperature, required for formation of thiolated tRNA(Lys)UUU and intimately linked to homologs of ubiquitin-proteasome, translation and RNA processing systems.

  16. Desktop supercomputer: what can it do?

    NASA Astrophysics Data System (ADS)

    Bogdanov, A.; Degtyarev, A.; Korkhov, V.

    2017-12-01

    The paper addresses the issues of solving complex problems that require supercomputers or multiprocessor clusters, which are nowadays available to most researchers. Efficient distribution of high performance computing resources according to actual application needs has been a major research topic since high-performance computing (HPC) technologies became widely introduced. At the same time, comfortable and transparent access to these resources has been a key user requirement. In this paper we discuss approaches to building a virtual private supercomputer available at the user's desktop: a virtual computing environment tailored specifically for a target user with a particular target application. We describe and evaluate possibilities to create the virtual supercomputer based on lightweight virtualization technologies, and analyze the efficiency of our approach compared to traditional methods of HPC resource management.

  17. Exploiting Thread Parallelism for Ocean Modeling on Cray XC Supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sarje, Abhinav; Jacobsen, Douglas W.; Williams, Samuel W.

    The incorporation of increasing core counts in modern processors used to build state-of-the-art supercomputers is driving application development towards the exploitation of thread parallelism, in addition to distributed memory parallelism, with the goal of delivering efficient high-performance codes. In this work we describe the exploitation of threading and our experiences applying it to a real-world ocean modeling application code, MPAS-Ocean. We present detailed performance analysis and comparisons of various approaches and configurations for threading on the Cray XC series supercomputers.

  18. Automatic discovery of the communication network topology for building a supercomputer model

    NASA Astrophysics Data System (ADS)

    Sobolev, Sergey; Stefanov, Konstantin; Voevodin, Vadim

    2016-10-01

    The Research Computing Center of Lomonosov Moscow State University is developing the Octotron software suite for automatic monitoring and mitigation of emergency situations in supercomputers so as to maximize hardware reliability. The suite is based on a software model of the supercomputer. The model uses a graph to describe the computing system components and their interconnections. One of the most complex components of a supercomputer that needs to be included in the model is its communication network. This work describes the proposed approach for automatically discovering the Ethernet communication network topology in a supercomputer and its description in terms of the Octotron model. This suite automatically detects computing nodes and switches, collects information about them and identifies their interconnections. The application of this approach is demonstrated on the "Lomonosov" and "Lomonosov-2" supercomputers.
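
    As a very small illustration of the graph construction the abstract describes (components as vertices, communication links as edges), here is a Python sketch that turns discovered neighbor reports into such a graph and labels devices as switches or compute nodes. The device names and adjacency data are invented, and Octotron's actual model is written in its own description format rather than in code like this.

        """Build a toy component graph from discovered neighbor reports (e.g. LLDP-style data)."""
        from collections import defaultdict

        # (device, neighbor) pairs as a discovery pass over the Ethernet fabric might return
        NEIGHBOR_REPORTS = [
            ("switch-a", "node-001"), ("switch-a", "node-002"),
            ("switch-a", "switch-core"), ("switch-b", "node-003"),
            ("switch-b", "switch-core"),
        ]

        def build_graph(reports):
            graph = defaultdict(set)
            for left, right in reports:
                graph[left].add(right)     # store each link in both directions
                graph[right].add(left)
            return graph

        def classify(name: str) -> str:
            return "switch" if name.startswith("switch") else "compute_node"

        if __name__ == "__main__":
            graph = build_graph(NEIGHBOR_REPORTS)
            for device in sorted(graph):
                links = ", ".join(sorted(graph[device]))
                print(f"{classify(device):12s} {device:12s} -> {links}")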

  19. Longitudinal relaxation of initially straight flexible and stiff polymers

    NASA Astrophysics Data System (ADS)

    Dimitrakopoulos, Panagiotis; Dissanayake, Inuka

    2004-11-01

    The present talk considers the relaxation of a single flexible or stiff polymer chain from an initial straight configuration in a viscous solvent. This problem commonly arises when strong flows are turned off in both industrial and biological applications. The problem is also motivated by recent experiments with single biopolymer molecules relaxing after being fully extended by applied forces, as well as by the recent development of micro-devices involving stretched tethered biopolymers. Our results are applicable to a wide array of synthetic polymers such as polyacrylamides, Kevlar and polyesters as well as biopolymers such as DNA, actin filaments, microtubules and TMV. In this talk we discuss the mechanism of the polymer relaxation as revealed through Brownian dynamics simulations covering a broad range of time scales and chain stiffness. After the short-time free diffusion, the chain's longitudinal reduction at early intermediate times is shown to constitute a universal behavior for any chain stiffness, caused by a quasi-steady relaxation of tensions associated with the deforming action of the Brownian forces. Stiff chains are shown to exhibit a late intermediate-time longitudinal reduction associated with a relaxation of tensions affected by the deforming Brownian and the restoring bending forces. The longitudinal and transverse relaxations are shown to obey different laws, i.e. the chain relaxation is anisotropic at all times. In the talk, we show how, from knowledge of the relaxation mechanism, we can predict and explain the polymer properties, including the polymer stress and the solution birefringence. In addition, a generalized stress-optic law is derived valid for any time and chain stiffness. All polymer properties which depend on the polymer length are shown to exhibit two intermediate-time behaviors, with the early one constituting a universal behavior for any chain stiffness. This work was supported in part by the Minta Martin Research Fund. The computations were performed on multiprocessor computers provided by the National Center for Supercomputing Applications (NCSA) in Illinois (grant DMR000003), and by an Academic Equipment Grant from Sun Microsystems Inc.

  20. Ab initio phasing by molecular averaging in real space with new criteria: application to structure determination of a betanodavirus.

    PubMed

    Yoshimura, Masato; Chen, Nai Chi; Guan, Hong Hsiang; Chuankhayan, Phimonphan; Lin, Chien Chih; Nakagawa, Atsushi; Chen, Chun Jung

    2016-07-01

    Molecular averaging, including noncrystallographic symmetry (NCS) averaging, is a powerful method for ab initio phase determination and phase improvement. Applications of the cross-crystal averaging (CCA) method have been shown to be effective for phase improvement after initial phasing by molecular replacement, isomorphous replacement, anomalous dispersion or combinations of these methods. Here, a two-step process for phase determination in the X-ray structural analysis of a new coat protein from a betanodavirus, Grouper nervous necrosis virus, is described in detail. The first step is ab initio structure determination of the T = 3 icosahedral virus-like particle using NCS averaging (NCSA). The second step involves structure determination of the protrusion domain of the viral molecule using cross-crystal averaging. In this method, molecular averaging and solvent flattening constrain the electron density in real space. To quantify these constraints, a new, simple and general indicator, free fraction (ff), is introduced, where ff is defined as the ratio of the volume of the electron density that is freely changed to the total volume of the crystal unit cell. This indicator is useful and effective to evaluate the strengths of both NCSA and CCA. Under the condition that a mask (envelope) covers the target molecule well, an ff value of less than 0.1, as a new rule of thumb, gives sufficient phasing power for the successful construction of new structures.
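
    Restating the indicator defined in the abstract in symbols (this is only notation, not an addition to the method): the free fraction is

        \mathrm{ff} \;=\; \frac{V_{\mathrm{free}}}{V_{\mathrm{cell}}},

    where V_free is the volume of electron density left free to change under the molecular-averaging and solvent-flattening constraints and V_cell is the volume of the crystal unit cell; the reported rule of thumb is that ff < 0.1, given a mask that covers the target molecule well, provides sufficient phasing power.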

  1. Prospects for Boiling of Subcooled Dielectric Liquids for Supercomputer Cooling

    NASA Astrophysics Data System (ADS)

    Zeigarnik, Yu. A.; Vasil'ev, N. V.; Druzhinin, E. A.; Kalmykov, I. V.; Kosoi, A. S.; Khodakov, K. A.

    2018-02-01

    It is shown experimentally that forced-convection boiling of the dielectric coolant Novec 649, subcooled relative to the saturation temperature, makes it possible to remove heat fluxes of up to 100 W/cm² from a modern supercomputer chip interface. This fact creates prerequisites for the application of dielectric liquids in the cooling systems of modern supercomputers with increased requirements for operating reliability.

  2. SCEC Earthquake System Science Using High Performance Computing

    NASA Astrophysics Data System (ADS)

    Maechling, P. J.; Jordan, T. H.; Archuleta, R.; Beroza, G.; Bielak, J.; Chen, P.; Cui, Y.; Day, S.; Deelman, E.; Graves, R. W.; Minster, J. B.; Olsen, K. B.

    2008-12-01

    The SCEC Community Modeling Environment (SCEC/CME) collaboration performs basic scientific research using high performance computing with the goal of developing a predictive understanding of earthquake processes and seismic hazards in California. SCEC/CME research areas include dynamic rupture modeling, wave propagation modeling, probabilistic seismic hazard analysis (PSHA), and full 3D tomography. SCEC/CME computational capabilities are organized around the development and application of robust, reusable, well-validated simulation systems we call computational platforms. The SCEC earthquake system science research program includes a wide range of numerical modeling efforts and we continue to extend our numerical modeling codes to include more realistic physics and to run at higher and higher resolution. During this year, the SCEC/USGS OpenSHA PSHA computational platform was used to calculate PSHA hazard curves and hazard maps using the new UCERF2.0 ERF and new 2008 attenuation relationships. Three SCEC/CME modeling groups ran 1Hz ShakeOut simulations using different codes and computer systems and carefully compared the results. The DynaShake Platform was used to calculate several dynamic rupture-based source descriptions equivalent in magnitude and final surface slip to the ShakeOut 1.2 kinematic source description. A SCEC/CME modeler produced 10Hz synthetic seismograms for the ShakeOut 1.2 scenario rupture by combining 1Hz deterministic simulation results with 10Hz stochastic seismograms. SCEC/CME modelers ran an ensemble of seven ShakeOut-D simulations to investigate the variability of ground motions produced by dynamic rupture-based source descriptions. The CyberShake Platform was used to calculate more than 15 new probabilistic seismic hazard analysis (PSHA) hazard curves using full 3D waveform modeling and the new UCERF2.0 ERF. The SCEC/CME group has also produced significant computer science results this year. Large-scale SCEC/CME high performance codes were run on NSF TeraGrid sites including simulations that use the full PSC Big Ben supercomputer (4096 cores) and simulations that ran on more than 10K cores at TACC Ranger. The SCEC/CME group used scientific workflow tools and grid-computing to run more than 1.5 million jobs at NCSA for the CyberShake project. Visualizations produced by a SCEC/CME researcher of the 10Hz ShakeOut 1.2 scenario simulation data were used by USGS in ShakeOut publications and public outreach efforts. OpenSHA was ported onto an NSF supercomputer and was used to produce very high resolution PSHA maps that contained more than 1.6 million hazard curves.

  3. Qualifying for the Green500: Experience with the newest generation of supercomputers at LANL

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yilk, Todd

    The High Performance Computing Division of Los Alamos National Laboratory recently brought four new supercomputing platforms on line: Trinity with separate partitions built around the Haswell and Knights Landing CPU architectures for capability computing and Grizzly, Fire, and Ice for capacity computing applications. The power monitoring infrastructure of these machines is significantly enhanced over previous supercomputing generations at LANL and all were qualified at the highest level of the Green500 benchmark. Here, this paper discusses supercomputing at LANL, the Green500 benchmark, and notes on our experience meeting the Green500's reporting requirements.

  4. Qualifying for the Green500: Experience with the newest generation of supercomputers at LANL

    DOE PAGES

    Yilk, Todd

    2018-02-17

    The High Performance Computing Division of Los Alamos National Laboratory recently brought four new supercomputing platforms on line: Trinity with separate partitions built around the Haswell and Knights Landing CPU architectures for capability computing and Grizzly, Fire, and Ice for capacity computing applications. The power monitoring infrastructure of these machines is significantly enhanced over previous supercomputing generations at LANL and all were qualified at the highest level of the Green500 benchmark. Here, this paper discusses supercomputing at LANL, the Green500 benchmark, and notes on our experience meeting the Green500's reporting requirements.

  5. Supercomputing '91; Proceedings of the 4th Annual Conference on High Performance Computing, Albuquerque, NM, Nov. 18-22, 1991

    NASA Technical Reports Server (NTRS)

    1991-01-01

    Various papers on supercomputing are presented. The general topics addressed include: program analysis/data dependence, memory access, distributed memory code generation, numerical algorithms, supercomputer benchmarks, latency tolerance, parallel programming, applications, processor design, networks, performance tools, mapping and scheduling, characterization affecting performance, parallelism packaging, computing climate change, combinatorial algorithms, hardware and software performance issues, system issues. (No individual items are abstracted in this volume)

  6. Ab initio phasing by molecular averaging in real space with new criteria: application to structure determination of a betanodavirus

    PubMed Central

    Yoshimura, Masato; Chen, Nai-Chi; Guan, Hong-Hsiang; Chuankhayan, Phimonphan; Lin, Chien-Chih; Nakagawa, Atsushi; Chen, Chun-Jung

    2016-01-01

    Molecular averaging, including noncrystallographic symmetry (NCS) averaging, is a powerful method for ab initio phase determination and phase improvement. Applications of the cross-crystal averaging (CCA) method have been shown to be effective for phase improvement after initial phasing by molecular replacement, isomorphous replacement, anomalous dispersion or combinations of these methods. Here, a two-step process for phase determination in the X-ray structural analysis of a new coat protein from a betanodavirus, Grouper nervous necrosis virus, is described in detail. The first step is ab initio structure determination of the T = 3 icosahedral virus-like particle using NCS averaging (NCSA). The second step involves structure determination of the protrusion domain of the viral molecule using cross-crystal averaging. In this method, molecular averaging and solvent flattening constrain the electron density in real space. To quantify these constraints, a new, simple and general indicator, free fraction (ff), is introduced, where ff is defined as the ratio of the volume in which the electron density is free to change to the total volume of the crystal unit cell. This indicator is useful and effective for evaluating the strengths of both NCSA and CCA. Under the condition that a mask (envelope) covers the target molecule well, an ff value of less than 0.1, as a new rule of thumb, gives sufficient phasing power for the successful construction of new structures. PMID:27377380

  7. Supercomputer optimizations for stochastic optimal control applications

    NASA Technical Reports Server (NTRS)

    Chung, Siu-Leung; Hanson, Floyd B.; Xu, Huihuang

    1991-01-01

    Supercomputer optimizations for a computational method of solving stochastic, multibody, dynamic programming problems are presented. The computational method is valid for a general class of optimal control problems that are nonlinear, multibody dynamical systems, perturbed by general Markov noise in continuous time, i.e., nonsmooth Gaussian as well as jump Poisson random white noise. Optimization techniques for vector multiprocessors or vectorizing supercomputers include advanced data structures, loop restructuring, loop collapsing, blocking, and compiler directives. These advanced computing techniques and supercomputing hardware help alleviate Bellman's curse of dimensionality in dynamic programming computations by permitting the solution of large multibody problems. Possible applications include lumped flight dynamics models for uncertain environments, such as large scale and background random aerospace fluctuations.

  8. Technology advances and market forces: Their impact on high performance architectures

    NASA Technical Reports Server (NTRS)

    Best, D. R.

    1978-01-01

    Reasonable projections into future supercomputer architectures and technology require an analysis of the computer industry market environment, the current capabilities and trends within the component industry, and the research activities on computer architecture in the industrial and academic communities. Management, programmer, architect, and user must cooperate to increase the efficiency of supercomputer development efforts. Care must be taken to match the funding, compiler, architecture and application with greater attention to testability, maintainability, reliability, and usability than supercomputer development programs of the past.

  9. National Test Facility civilian agency use of supercomputers not feasible

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    NONE

    1994-12-01

    Based on interviews with civilian agencies cited in the House report (DOE, DoEd, HHS, FEMA, NOAA), none would be able to make effective use of NTF's excess supercomputing capabilities. These agencies stated they could not use the resources primarily because (1) NTF's supercomputers are older machines whose performance and costs cannot match those of more advanced computers available from other sources and (2) some agencies have not yet developed applications requiring supercomputer capabilities or do not have funding to support such activities. In addition, future support for the hardware and software at NTF is uncertain, making any investment by an outside user risky.

  10. Multiple DNA and protein sequence alignment on a workstation and a supercomputer.

    PubMed

    Tajima, K

    1988-11-01

    This paper describes a multiple alignment method using a workstation and a supercomputer. The method is based on aligning a set of already aligned sequences with a new sequence, applied recursively. The alignment is executed in reasonable computation time on systems ranging from a workstation to a supercomputer, judged both by the alignment results and by the computational speed gained through parallel processing. The application of the algorithm is illustrated by several examples of multiple alignment of 12 amino acid and DNA sequences of HIV (human immunodeficiency virus) env genes. Colour graphic programs on a workstation and parallel processing on a supercomputer are discussed.

  11. CFD applications: The Lockheed perspective

    NASA Technical Reports Server (NTRS)

    Miranda, Luis R.

    1987-01-01

    The Numerical Aerodynamic Simulator (NAS) epitomizes the coming of age of supercomputing and opens exciting horizons in the world of numerical simulation. An overview of supercomputing at Lockheed Corporation in the area of Computational Fluid Dynamics (CFD) is presented. This overview will focus on developments and applications of CFD as an aircraft design tool and will attempt to present an assessment, within this context, of the state-of-the-art in CFD methodology.

  12. An Integrated Cyberenvironment for Event-Driven Environmental Observatory Research and Education

    NASA Astrophysics Data System (ADS)

    Myers, J.; Minsker, B.; Butler, R.

    2006-12-01

    National environmental observatories will soon provide large-scale data from diverse sensor networks and community models. While much attention is focused on piping data from sensors to archives and users, truly integrating these resources into the everyday research activities of scientists and engineers across the community, and enabling their results and innovations to be brought back into the observatory - which is also critical to the long-term success of the observatories - is often neglected. This talk will give an overview of the Environmental Cyberinfrastructure Demonstrator (ECID) Cyberenvironment for observatory-centric environmental research and education, under development at the National Center for Supercomputing Applications (NCSA), which is designed to address these issues. Cyberenvironments incorporate collaboratory and grid technologies, web services, and other cyberinfrastructure into an overall framework that balances needs for efficient coordination and the ability to innovate. They are designed to support the full scientific lifecycle both in terms of individual experiments moving from data to workflows to publication and at the macro level where new discoveries lead to additional data, models, tools, and conceptual frameworks that augment and evolve community-scale systems such as observatories. The ECID cyberenvironment currently integrates five major components - a collaborative portal, a workflow engine, an event manager, a metadata repository, and social network personalization capabilities - that have novel features inspired by the Cyberenvironment concept and enabling powerful environmental research scenarios. A summary of these components and the overall cyberenvironment will be given in this talk, while other posters will give details on several of the components. The summary will be presented within the context of environmental use case scenarios created in collaboration with researchers from the WATERS (WATer and Environmental Research Systems) Network, a joint National Science Foundation-funded initiative of the hydrology and environmental engineering communities. The use case scenarios include identifying sensor anomalies in point- and streaming sensor data and notifying data managers in near-real time; and referring users of data or data products (e.g., workflows, publications) to related data or data products.
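    As a hedged illustration of the first use-case scenario (flagging anomalies in streaming sensor data), the sketch below applies a simple rolling z-score test to synthetic data. It is a generic example under assumed parameters, not the ECID event manager's actual logic.

```python
import numpy as np

def flag_anomalies(values, window=48, z_threshold=4.0):
    """Flag samples whose rolling z-score (against the trailing window)
    exceeds the threshold; returns the flagged indices."""
    flagged = []
    for i in range(window, len(values)):
        ref = values[i - window:i]
        mu, sigma = ref.mean(), ref.std()
        if sigma > 0 and abs(values[i] - mu) / sigma > z_threshold:
            flagged.append(i)
    return flagged

# Synthetic stream: a smooth diurnal-like signal plus noise, with two injected spikes.
rng = np.random.default_rng(0)
t = np.arange(1000)
stream = np.sin(2 * np.pi * t / 96) + rng.normal(0.0, 0.05, t.size)
stream[400] += 4.0
stream[700] -= 4.0
print(flag_anomalies(stream))   # -> [400, 700]
```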

  13. Porting Ordinary Applications to Blue Gene/Q Supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Maheshwari, Ketan C.; Wozniak, Justin M.; Armstrong, Timothy

    2015-08-31

    Efficiently porting ordinary applications to Blue Gene/Q supercomputers is a significant challenge. Codes are often originally developed without considering advanced architectures and related tool chains. Science needs frequently lead users to want to run large numbers of relatively small jobs (often called many-task computing, an ensemble, or a workflow), which can conflict with supercomputer configurations. In this paper, we discuss techniques developed to execute ordinary applications over leadership class supercomputers. We use the high-performance Swift parallel scripting framework and build two workflow execution techniques - sub-jobs and main-wrap. The sub-jobs technique, built on top of the IBM Blue Gene/Q resource manager Cobalt's sub-block jobs, lets users submit multiple, independent, repeated smaller jobs within a single larger resource block. The main-wrap technique is a scheme that enables C/C++ programs to be defined as functions that are wrapped by a high-performance Swift wrapper and that are invoked as a Swift script. We discuss the needs, benefits, technicalities, and current limitations of these techniques. We further discuss the real-world science enabled by these techniques and the results obtained.

  14. Validation of diagnoses of distress disorders in the US National Comorbidity Survey Replication Adolescent (NCS-A) Supplement

    PubMed Central

    Green, Jennifer Greif; Avenevoli, Shelli; Gruber, Michael; Kessler, Ronald C.; Lakoma, Matthew; Merikangas, Kathleen R.; Sampson, Nancy A.; Zaslavsky, Alan M.

    2012-01-01

    Research diagnostic interviews need to discriminate between closely related disorders in order to allow comorbidity among mental disorders to be studied reliably. Yet conventional studies of diagnostic validity generally focus on single disorders and do not examine discriminant validity. The current study examines the validity of fully-structured diagnoses of closely-related distress disorders (generalized anxiety disorder, post-traumatic stress disorder, major depressive episode, and dysthymic disorder) in the lay-administered Composite International Diagnostic Interview Version 3.0 (CIDI) with independent clinical diagnoses based on the Schedule for Affective Disorders and Schizophrenia for School-Age Children (K-SADS) in the US National Comorbidity Survey Replication Adolescent Supplement (NCS-A). The NCS-A is a national survey of DSM-IV mental disorders among 10,148 adolescents. A probability subsample of 347 of these adolescents and their parents were administered blinded follow-up K-SADS interviews. Good concordance (AUC; area under the receiver operating characteristic curve) was found between diagnoses based on the CIDI and the K-SADS for generalized anxiety disorder (AUC = .78), post-traumatic stress disorder (AUC = .79), and major depressive episode/dysthymic disorder (AUC = .86). Further, the CIDI was able to effectively discriminate among different types of distress disorders in the sub-sample of respondents with any distress disorder. PMID:22086845
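    The concordance statistic reported above is the area under the ROC curve. The following is a minimal sketch, with simulated data and hypothetical variable names rather than the NCS-A data or analysis code, showing how such an AUC could be computed with scikit-learn.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical blinded re-interview subsample (n = 347, as in the validation above):
# 1 = clinician-assigned K-SADS diagnosis present, 0 = absent.
ksads = rng.integers(0, 2, size=347)

# Hypothetical CIDI-based diagnostic scores (e.g. symptom counts or predicted
# probabilities), drawn here so that they correlate with the clinical diagnoses.
cidi_score = ksads + rng.normal(0.0, 0.8, size=347)

# Concordance summarized as the area under the ROC curve, the statistic quoted above.
print(f"AUC = {roc_auc_score(ksads, cidi_score):.2f}")
```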

  15. Computing and data processing

    NASA Technical Reports Server (NTRS)

    Smarr, Larry; Press, William; Arnett, David W.; Cameron, Alastair G. W.; Crutcher, Richard M.; Helfand, David J.; Horowitz, Paul; Kleinmann, Susan G.; Linsky, Jeffrey L.; Madore, Barry F.

    1991-01-01

    The applications of computers and data processing to astronomy are discussed. Among the topics covered are the emerging national information infrastructure, workstations and supercomputers, supertelescopes, digital astronomy, astrophysics in a numerical laboratory, community software, archiving of ground-based observations, dynamical simulations of complex systems, plasma astrophysics, and the remote control of fourth dimension supercomputers.

  16. Multi-petascale highly efficient parallel supercomputer

    DOEpatents

    Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; Blumrich, Matthias A.; Boyle, Peter; Brunheroto, Jose R.; Chen, Dong; Cher, Chen -Yong; Chiu, George L.; Christ, Norman; Coteus, Paul W.; Davis, Kristan D.; Dozsa, Gabor J.; Eichenberger, Alexandre E.; Eisley, Noel A.; Ellavsky, Matthew R.; Evans, Kahn C.; Fleischer, Bruce M.; Fox, Thomas W.; Gara, Alan; Giampapa, Mark E.; Gooding, Thomas M.; Gschwind, Michael K.; Gunnels, John A.; Hall, Shawn A.; Haring, Rudolf A.; Heidelberger, Philip; Inglett, Todd A.; Knudson, Brant L.; Kopcsay, Gerard V.; Kumar, Sameer; Mamidala, Amith R.; Marcella, James A.; Megerian, Mark G.; Miller, Douglas R.; Miller, Samuel J.; Muff, Adam J.; Mundy, Michael B.; O'Brien, John K.; O'Brien, Kathryn M.; Ohmacht, Martin; Parker, Jeffrey J.; Poole, Ruth J.; Ratterman, Joseph D.; Salapura, Valentina; Satterfield, David L.; Senger, Robert M.; Smith, Brian; Steinmacher-Burow, Burkhard; Stockdell, William M.; Stunkel, Craig B.; Sugavanam, Krishnan; Sugawara, Yutaka; Takken, Todd E.; Trager, Barry M.; Van Oosten, James L.; Wait, Charles D.; Walkup, Robert E.; Watson, Alfred T.; Wisniewski, Robert W.; Wu, Peng

    2015-07-14

    A Multi-Petascale Highly Efficient Parallel Supercomputer capable of 100 petaOPS-scale computing at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, each having full access to all system resources, and enables adaptive partitioning of the processors to functions such as compute or messaging I/O on an application-by-application basis and, preferably, adaptive partitioning of functions in accordance with various algorithmic phases within an application; if I/O or other processors are underutilized, they can participate in computation or communication. Nodes are interconnected by a five-dimensional torus network with DMA that optimally maximizes the throughput of packet communications between nodes and minimizes latency.
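    As an illustration of the interconnect geometry named above, here is a small self-contained sketch, not derived from the patent or from any IBM code, of hop distance and nearest neighbors on a torus with wraparound links in every dimension; the 5D shape used below is made up.

```python
def torus_hops(a, b, dims):
    """Minimum hop count between nodes a and b on a torus whose per-dimension
    sizes are given in dims (wraparound links in every dimension)."""
    return sum(min((x - y) % d, (y - x) % d) for x, y, d in zip(a, b, dims))

def torus_neighbors(node, dims):
    """The 2 * len(dims) nearest neighbors of a node: one step +/- per dimension."""
    nbrs = []
    for i, d in enumerate(dims):
        for step in (-1, 1):
            nbr = list(node)
            nbr[i] = (nbr[i] + step) % d
            nbrs.append(tuple(nbr))
    return nbrs

dims = (4, 4, 4, 8, 2)                                      # hypothetical 5D torus shape
print(torus_hops((0, 0, 0, 0, 0), (3, 2, 1, 7, 1), dims))   # -> 6 hops
print(len(torus_neighbors((0, 0, 0, 0, 0), dims)))          # -> 10 nearest neighbors
```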

  17. Final Scientific Report: A Scalable Development Environment for Peta-Scale Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Karbach, Carsten; Frings, Wolfgang

    2013-02-22

    This document is the final scientific report of the project DE-SC000120 (A scalable Development Environment for Peta-Scale Computing). The objective of this project is the extension of the Parallel Tools Platform (PTP) for applying it to peta-scale systems. PTP is an integrated development environment for parallel applications. It comprises code analysis, performance tuning, parallel debugging and system monitoring. The contribution of the Juelich Supercomputing Centre (JSC) aims to provide a scalable solution for system monitoring of supercomputers. This includes the development of a new communication protocol for exchanging status data between the target remote system and the client running PTP. The communication has to work under high latency. PTP needs to be implemented robustly and should hide the complexity of the supercomputer's architecture in order to provide a transparent access to various remote systems via a uniform user interface. This simplifies the porting of applications to different systems, because PTP functions as abstraction layer between parallel application developer and compute resources. The common requirement for all PTP components is that they have to interact with the remote supercomputer. For example, applications are built remotely, performance tools are attached to job submissions, and their output data resides on the remote system. Status data has to be collected by evaluating outputs of the remote job scheduler and the parallel debugger needs to control an application executed on the supercomputer. The challenge is to provide this functionality for peta-scale systems in real-time. The client server architecture of the established monitoring application LLview, developed by the JSC, can be applied to PTP's system monitoring. LLview provides a well-arranged overview of the supercomputer's current status. A set of statistics, a list of running and queued jobs as well as a node display mapping running jobs to their compute resources form the user display of LLview. These monitoring features have to be integrated into the development environment. Besides showing the current status, PTP's monitoring also needs to allow for submitting and canceling user jobs. Monitoring peta-scale systems especially deals with presenting the large amount of status data in a useful manner. Users need to select arbitrary levels of detail. The monitoring views have to provide a quick overview of the system state, but also need to allow for zooming into specific parts of the system in which the user is interested. At present, the major batch systems running on supercomputers are PBS, TORQUE, ALPS and LoadLeveler, which have to be supported by both the monitoring and the job controlling component. Finally, PTP needs to be designed to be as generic as possible, so that it can be extended for future batch systems.

  18. Design and Field Procedures in the US National Comorbidity Survey Replication Adolescent Supplement (NCS-A)

    PubMed Central

    Kessler, Ronald C.; Avenevoli, Shelli; Costello, E. Jane; Green, Jennifer Greif; Gruber, Michael J.; Heeringa, Steven; Merikangas, Kathleen R.; Pennell, Beth-Ellen; Sampson, Nancy A.; Zaslavsky, Alan M.

    2009-01-01

    An overview is presented of the design and field procedures of the US National Comorbidity Survey Replication Adolescent Supplement (NCS-A), a US face-to-face household survey of the prevalence and correlates of DSM-IV mental disorders. The survey was based on a dual-frame design that included 904 adolescent residents of the households that participated in the US National Comorbidity Survey Replication (85.9% response rate) and 9,244 adolescent students selected from a nationally representative sample of 320 schools (74.7% response rate). After explaining the logic of dual-frame designs, comparisons are presented of sample and population distributions on Census socio-demographic variables and, in the school sample, school characteristics. These document only minor differences between the samples and the population. The results of statistical analysis of the bias-efficiency trade-off in weight trimming are then presented. These show that modest trimming meaningfully reduces mean squared error. Analysis of comparative sample efficiency shows that the household sample is more efficient than the school sample, leading to the household sample receiving a higher weight relative to its size in the consolidated sample than the school sample. Taken together, these results show that the NCS-A is an efficient sample of the target population with good representativeness on a range of socio-demographic and geographic variables. PMID:19507169

  19. Communication Characterization and Optimization of Applications Using Topology-Aware Task Mapping on Large Supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sreepathi, Sarat; D'Azevedo, Eduardo; Philip, Bobby

    On large supercomputers, the job scheduling systems may assign a non-contiguous node allocation for user applications depending on available resources. With parallel applications using MPI (Message Passing Interface), the default process ordering does not take into account the actual physical node layout available to the application. This contributes to non-locality in terms of physical network topology and impacts communication performance of the application. In order to mitigate such performance penalties, this work describes techniques to identify suitable task mapping that takes the layout of the allocated nodes as well as the application's communication behavior into account. During the first phase of this research, we instrumented and collected performance data to characterize communication behavior of critical US DOE (United States - Department of Energy) applications using an augmented version of the mpiP tool. Subsequently, we developed several reordering methods (spectral bisection, neighbor join tree etc.) to combine node layout and application communication data for optimized task placement. We developed a tool called mpiAproxy to facilitate detailed evaluation of the various reordering algorithms without requiring full application executions. This work presents a comprehensive performance evaluation (14,000 experiments) of the various task mapping techniques in lowering communication costs on Titan, the leadership class supercomputer at Oak Ridge National Laboratory.
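    One of the reordering methods named above, spectral bisection, can be sketched in a few lines of NumPy: compute the Fiedler vector of the communication graph's Laplacian and split the ranks at its median. This is a generic textbook version under assumed inputs, not the authors' mpiP/mpiAproxy tooling.

```python
import numpy as np

def spectral_bisection(comm_matrix):
    """Split ranks into two halves using the Fiedler vector of the
    communication graph's Laplacian: ranks that exchange a lot of data
    tend to fall in the same half and can then be mapped to nearby nodes."""
    W = np.asarray(comm_matrix, dtype=float)
    W = 0.5 * (W + W.T)                   # symmetrize message volumes
    L = np.diag(W.sum(axis=1)) - W        # weighted graph Laplacian
    _, vecs = np.linalg.eigh(L)
    fiedler = vecs[:, 1]                  # eigenvector of the 2nd-smallest eigenvalue
    order = np.argsort(fiedler)
    half = len(order) // 2
    return order[:half], order[half:]

# Toy communication matrix: two 4-rank groups that mostly talk within themselves,
# plus one weak cross-group link so the graph stays connected.
W = np.zeros((8, 8))
W[:4, :4] = 5.0
W[4:, 4:] = 5.0
np.fill_diagonal(W, 0.0)
W[0, 7] = W[7, 0] = 1.0
left, right = spectral_bisection(W)
print(np.sort(left), np.sort(right))      # -> [0 1 2 3] [4 5 6 7] (halves may be swapped)
```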

  20. Role of HPC in Advancing Computational Aeroelasticity

    NASA Technical Reports Server (NTRS)

    Guruswamy, Guru P.

    2004-01-01

    On behalf of the High Performance Computing and Modernization Program (HPCMP) and the NASA Advanced Supercomputing Division (NAS), a study is conducted to assess the role of supercomputers in the computational aeroelasticity of aerospace vehicles. The study is mostly based on the responses to a web based questionnaire that was designed to capture the nuances of high performance computational aeroelasticity, particularly on parallel computers. A procedure is presented to assign a fidelity-complexity index to each application. Case studies based on major applications using HPCMP resources are presented.

  1. Panic Disorder among Adults

    MedlinePlus

    ... 20855043 Statistical Methods and Measurement Caveats National Comorbidity Survey Replication (NCS-R) Diagnostic Assessment and Population: The ... the NIMH NCS-R study page . National Comorbidity Survey Adolescent Supplement (NCS-A) Diagnostic Assessment and Population: ...

  2. Clinical utility of a functional lumen imaging probe in management of dysphagia following head and neck cancer therapies.

    PubMed

    Wu, Peter I; Szczesniak, Michal M; Maclean, Julia; Choo, Lennart; Quon, Harry; Graham, Peter H; Zhang, Teng; Cook, Ian J

    2017-09-01

    Background and aims: Chemoradiotherapy for head and neck cancer (HNC) with/without laryngectomy commonly causes dysphagia. Pharyngoesophageal junction (PEJ) stricturing is an important contributor. We aimed to validate a functional lumen imaging probe (the EndoFLIP system) as a tool for quantitating pretreatment PEJ distensibility and treatment-related changes in HNC survivors with dysphagia and to evaluate the diagnostic accuracy of EndoFLIP-derived distensibility in detecting PEJ strictures. Methods: We studied 34 consecutive HNC survivors with long-term (>12 months) dysphagia who underwent endoscopic dilation for suspected strictures. Twenty non-dysphagic patients undergoing routine endoscopy served as controls. PEJ distensibility was measured at endoscopy with the EndoFLIP system pre- and post-dilation. PEJ stricture was defined as the presence of a mucosal tear post-dilation. Results: PEJ stricture was confirmed in 22/34 HNC patients (65%). During distension up to 60 mmHg, the mean EndoFLIP-derived narrowest cross-sectional areas (nCSA) in HNC patients with strictures, without strictures, and in controls were 58 mm^2 (95% confidence interval [CI] 22 to 118), 195 mm^2 (95% CI 129 to 334), and 227 mm^2 (95% CI 168 to 316), respectively. A cutoff of 114 mm^2 for the nCSA at the PEJ had perfect diagnostic accuracy in detecting strictures (area under the receiver operating characteristic curve = 1). In patients with strictures, a single session of dilation increased the nCSA by 29 mm^2 (95% CI 20 to 37; P < 0.001). In patients with no strictures, dilation caused no change in the nCSA (mean difference 13 mm^2 [95% CI -4 to 30]; P = 0.13). Conclusions: EndoFLIP is a highly accurate technique for the detection of PEJ strictures. EndoFLIP may complement conventional diagnostic tools in the detection of pharyngeal outflow obstruction.

  3. High Performance Distributed Computing in a Supercomputer Environment: Computational Services and Applications Issues

    NASA Technical Reports Server (NTRS)

    Kramer, Williams T. C.; Simon, Horst D.

    1994-01-01

    This tutorial proposes to be a practical guide for the uninitiated to the main topics and themes of high-performance computing (HPC), with particular emphasis on distributed computing. The intent is first to provide some guidance and directions in the rapidly increasing field of scientific computing using both massively parallel and traditional supercomputers. Because of their considerable potential computational power, loosely or tightly coupled clusters of workstations are increasingly considered as a third alternative to both the more conventional supercomputers based on a small number of powerful vector processors and to massively parallel processors. Even though many research issues concerning the effective use of workstation clusters and their integration into a large scale production facility are still unresolved, such clusters are already used for production computing. In this tutorial we will utilize the unique experience gained at the NAS facility at NASA Ames Research Center. Over the last five years at NAS massively parallel supercomputers such as the Connection Machines CM-2 and CM-5 from Thinking Machines Corporation and the iPSC/860 (Touchstone Gamma Machine) and Paragon Machines from Intel were used in a production supercomputer center alongside traditional vector supercomputers such as the Cray Y-MP and C90.

  4. Development of the general interpolants method for the CYBER 200 series of supercomputers

    NASA Technical Reports Server (NTRS)

    Stalnaker, J. F.; Robinson, M. A.; Spradley, L. W.; Kurzius, S. C.; Thoenes, J.

    1988-01-01

    The General Interpolants Method (GIM) is a 3-D, time-dependent, hybrid procedure for generating numerical analogs of the conservation laws. This study is directed toward the development and application of the GIM computer code for fluid dynamic research applications as implemented for the Cyber 200 series of supercomputers. Elliptic and quasi-parabolic versions of the GIM code are discussed. Turbulence models, both algebraic and differential-equation based, were added to the basic viscous code. An equilibrium reacting chemistry model and an implicit finite difference scheme are also included.

  5. Toward a Proof of Concept Cloud Framework for Physics Applications on Blue Gene Supercomputers

    NASA Astrophysics Data System (ADS)

    Dreher, Patrick; Scullin, William; Vouk, Mladen

    2015-09-01

    Traditional high performance supercomputers are capable of delivering large sustained state-of-the-art computational resources to physics applications over extended periods of time using batch processing mode operating environments. However, today there is an increasing demand for more complex workflows that involve large fluctuations in the levels of HPC physics computational requirements during the simulations. Some of the workflow components may also require a richer set of operating system features and schedulers than normally found in a batch oriented HPC environment. This paper reports on progress toward a proof of concept design that implements a cloud framework onto BG/P and BG/Q platforms at the Argonne Leadership Computing Facility. The BG/P implementation utilizes the Kittyhawk utility and the BG/Q platform uses an experimental heterogeneous FusedOS operating system environment. Both platforms use the Virtual Computing Laboratory as the cloud computing system embedded within the supercomputer. This proof of concept design allows a cloud to be configured so that it can capitalize on the specialized infrastructure capabilities of a supercomputer and the flexible cloud configurations without resorting to virtualization. Initial testing of the proof of concept system is done using the lattice QCD MILC code. These types of user reconfigurable environments have the potential to deliver experimental schedulers and operating systems within a working HPC environment for physics computations that may be different from the native OS and schedulers on production HPC supercomputers.

  6. Characterizing output bottlenecks in a supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xie, Bing; Chase, Jeffrey; Dillow, David A

    2012-01-01

    Supercomputer I/O loads are often dominated by writes. HPC (High Performance Computing) file systems are designed to absorb these bursty outputs at high bandwidth through massive parallelism. However, the delivered write bandwidth often falls well below the peak. This paper characterizes the data absorption behavior of a center-wide shared Lustre parallel file system on the Jaguar supercomputer. We use a statistical methodology to address the challenges of accurately measuring a shared machine under production load and to obtain the distribution of bandwidth across samples of compute nodes, storage targets, and time intervals. We observe and quantify limitations from competing traffic, contention on storage servers and I/O routers, concurrency limitations in the client compute node operating systems, and the impact of variance (stragglers) on coupled output such as striping. We then examine the implications of our results for application performance and the design of I/O middleware systems on shared supercomputers.
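    A hedged illustration of the kind of distributional summary such a study produces: given hypothetical per-interval aggregate write-bandwidth samples (the numbers below are made up, not Jaguar measurements), report percentiles and the median fraction of peak.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-interval aggregate write-bandwidth samples (GB/s) on a shared
# parallel file system; the figures are illustrative only.
peak_gbs = 240.0
samples = rng.gamma(shape=4.0, scale=20.0, size=1000)   # skewed, mostly well below peak

p10, p50, p90 = np.percentile(samples, [10, 50, 90])
print(f"p10/p50/p90 = {p10:.0f}/{p50:.0f}/{p90:.0f} GB/s")
print(f"median fraction of peak = {p50 / peak_gbs:.2f}")
```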

  7. Particle simulation on heterogeneous distributed supercomputers

    NASA Technical Reports Server (NTRS)

    Becker, Jeffrey C.; Dagum, Leonardo

    1993-01-01

    We describe the implementation and performance of a three dimensional particle simulation distributed between a Thinking Machines CM-2 and a Cray Y-MP. These are connected by a combination of two high-speed networks: a high-performance parallel interface (HIPPI) and an optical network (UltraNet). This is the first application to use this configuration at NASA Ames Research Center. We describe our experience implementing and using the application and report the results of several timing measurements. We show that the distribution of applications across disparate supercomputing platforms is feasible and has reasonable performance. In addition, several practical aspects of the computing environment are discussed.

  8. Pre-installation customer satisfaction survey

    DOT National Transportation Integrated Search

    1996-10-01

    The National Center for Statistics and Analysis (NCSA) Information Services Branch (ISB) required a more effective method of receiving, tracking, and completing requests for data, statistics, and information. To enhance ISB's services, a new cus...

  9. Rural and Urban Crashes: A Comparative Analysis

    DOT National Transportation Integrated Search

    1996-08-01

    National Highway Traffic Safety Administration's National Center for Statistics and Analysis (NCSA) recently completed a study comparing the characteristics of crashes occurring in rural areas to the characteristics of crashes occurring in urba...

  10. Nonoccupant Fatalities Associated with Backing Crashes

    DOT National Transportation Integrated Search

    1997-02-01

    National Highway Traffic Safety Administration's National Center for Statistics and Analysis (NCSA) recently completed a study of data from the National Center for Health Statistics (NCHS) to obtain an estimate of the number of nonoccupant fataliti...

  11. Injuries Associated With Hazards Involving Motor Vehicle Power Windows

    DOT National Transportation Integrated Search

    1997-05-01

    National Highway Traffic Safety Administration's (NHTSA) National Center for Statistics and Analysis (NCSA) recently completed a study of data from the Consumer Product Safety Commission's (CPSC) National Electronic Injury Surveillance System (...

  12. Will Moore's law be sufficient?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    DeBenedictis, Erik P.

    2004-07-01

    It seems well understood that supercomputer simulation is an enabler for scientific discoveries, weapons, and other activities of value to society. It also seems widely believed that Moore's Law will make progressively more powerful supercomputers over time and thus enable more of these contributions. This paper seeks to add detail to these arguments, revealing them to be generally correct but not a smooth and effortless progression. This paper will review some key problems that can be solved with supercomputer simulation, showing that more powerful supercomputers will be useful up to a very high yet finite limit of around 10^21 FLOPS (1 Zettaflops). The review will also show the basic nature of these extreme problems. This paper will review work by others showing that the theoretical maximum supercomputer power is very high indeed, but will explain how a straightforward extrapolation of Moore's Law will lead to technological maturity in a few decades. The power of a supercomputer at the maturity of Moore's Law will be very high by today's standards at 10^16-10^19 FLOPS (100 Petaflops to 10 Exaflops), depending on architecture, but distinctly below the level required for the most ambitious applications. Having established that Moore's Law will not be the last word in supercomputing, this paper will explore the nearer term issue of what a supercomputer will look like at maturity of Moore's Law. Our approach will quantify the maximum performance as permitted by the laws of physics for extension of current technology and then find a design that approaches this limit closely. We study a 'multi-architecture' for supercomputers that combines a microprocessor with other 'advanced' concepts and find it can reach the limits as well. This approach should be quite viable in the future because the microprocessor would provide compatibility with existing codes and programming styles while the 'advanced' features would provide a boost to the limits of performance.
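    The gap the paper describes can be made concrete with a back-of-the-envelope extrapolation; the sketch below uses illustrative assumptions (a 2-year doubling period and the endpoints quoted above), not figures taken from the paper's own analysis.

```python
import math

# Back-of-the-envelope extrapolation in the spirit of the paper's argument.
# mature_flops is the upper end of the 10^16-10^19 FLOPS range quoted above;
# target_flops is the ~1 zettaflops limit of useful simulation; 2-year doubling assumed.
mature_flops = 1e19
target_flops = 1e21
doubling_years = 2.0

doublings = math.log2(target_flops / mature_flops)
print(f"{doublings:.1f} doublings ~ {doublings * doubling_years:.0f} more years of scaling")
# -> about 6.6 doublings, i.e. roughly 13 further years -- if scaling continued
#    at all past that maturity point, which the paper argues it will not.
```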

  13. Wheelchair User Injuries and Deaths Associated with Motor Vehicle Related Incidents

    DOT National Transportation Integrated Search

    1997-09-01

    National Highway Traffic Safety Administration's National Center for Statistics and Analysis (NCSA) recently completed a study of data from the Consumer Product Safety Commission's (CPSC) National Electronic Injury Surveillance System (NEISS) o...

  14. Seismic signal processing on heterogeneous supercomputers

    NASA Astrophysics Data System (ADS)

    Gokhberg, Alexey; Ermert, Laura; Fichtner, Andreas

    2015-04-01

    The processing of seismic signals - including the correlation of massive ambient noise data sets - represents an important part of a wide range of seismological applications. It is characterized by large data volumes as well as high computational input/output intensity. Development of efficient approaches towards seismic signal processing on emerging high performance computing systems is therefore essential. Heterogeneous supercomputing systems introduced in the recent years provide numerous computing nodes interconnected via high throughput networks, every node containing a mix of processing elements of different architectures, like several sequential processor cores and one or a few graphical processing units (GPU) serving as accelerators. A typical representative of such computing systems is "Piz Daint", a supercomputer of the Cray XC 30 family operated by the Swiss National Supercomputing Center (CSCS), which we used in this research. Heterogeneous supercomputers provide an opportunity for manifold application performance increase and are more energy-efficient, however they have much higher hardware complexity and are therefore much more difficult to program. The programming effort may be substantially reduced by the introduction of modular libraries of software components that can be reused for a wide class of seismology applications. The ultimate goal of this research is design of a prototype for such library suitable for implementing various seismic signal processing applications on heterogeneous systems. As a representative use case we have chosen an ambient noise correlation application. Ambient noise interferometry has developed into one of the most powerful tools to image and monitor the Earth's interior. Future applications will require the extraction of increasingly small details from noise recordings. To meet this demand, more advanced correlation techniques combined with very large data volumes are needed. This poses new computational problems that require dedicated HPC solutions. The chosen application is using a wide range of common signal processing methods, which include various IIR filter designs, amplitude and phase correlation, computing the analytic signal, and discrete Fourier transforms. Furthermore, various processing methods specific for seismology, like rotation of seismic traces, are used. Efficient implementation of all these methods on the GPU-accelerated systems represents several challenges. In particular, it requires a careful distribution of work between the sequential processors and accelerators. Furthermore, since the application is designed to process very large volumes of data, special attention had to be paid to the efficient use of the available memory and networking hardware resources in order to reduce intensity of data input and output. In our contribution we will explain the software architecture as well as principal engineering decisions used to address these challenges. We will also describe the programming model based on C++ and CUDA that we used to develop the software. Finally, we will demonstrate performance improvements achieved by using the heterogeneous computing architecture. This work was supported by a grant from the Swiss National Supercomputing Centre (CSCS) under project ID d26.
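    Among the signal-processing building blocks listed above, the central operation for ambient noise interferometry is cross-correlation via the discrete Fourier transform. The following is a minimal CPU-only NumPy sketch of that single operation, not the authors' GPU-accelerated library.

```python
import numpy as np

def noise_crosscorrelation(a, b):
    """Cross-correlate two equal-length traces via the FFT, returning lags
    from -(n-1) to n-1; zero-padding avoids circular wrap-around."""
    n = len(a)
    nfft = 2 * n
    A = np.fft.rfft(a, nfft)
    B = np.fft.rfft(b, nfft)
    cc = np.fft.irfft(A * np.conj(B), nfft)
    return np.roll(cc, n)[1:]              # re-order so index n-1 is zero lag

# Toy check: b is a delayed copy of a, so the correlation peaks at that lag.
rng = np.random.default_rng(0)
a = rng.normal(size=4096)
b = np.roll(a, 50)
cc = noise_crosscorrelation(a, b)
lags = np.arange(-(len(a) - 1), len(a))
print(lags[np.argmax(cc)])                 # -> -50: b lags a by 50 samples in this convention
```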

  15. Approaching the exa-scale: a real-world evaluation of rendering extremely large data sets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Patchett, John M; Ahrens, James P; Lo, Li - Ta

    2010-10-15

    Extremely large scale analysis is becoming increasingly important as supercomputers and their simulations move from petascale to exascale. The lack of dedicated hardware acceleration for rendering on today's supercomputing platforms motivates our detailed evaluation of the possibility of interactive rendering on the supercomputer. In order to facilitate our understanding of rendering on the supercomputing platform, we focus on scalability of rendering algorithms and architecture envisioned for exascale datasets. To understand tradeoffs for dealing with extremely large datasets, we compare three different rendering algorithms for large polygonal data: software based ray tracing, software based rasterization and hardware accelerated rasterization. We present a case study of strong and weak scaling of rendering extremely large data on both GPU and CPU based parallel supercomputers using ParaView, a parallel visualization tool. We use three different data sets: two synthetic and one from a scientific application. At an extreme scale, algorithmic rendering choices make a difference and should be considered while approaching exascale computing, visualization, and analysis. We find software based ray tracing offers a viable approach for scalable rendering of the projected future massive data sizes.

  16. Achieving supercomputer performance for neural net simulation with an array of digital signal processors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Muller, U.A.; Baumle, B.; Kohler, P.

    1992-10-01

    Music, a DSP-based system with a parallel distributed-memory architecture, provides enormous computing power yet retains the flexibility of a general-purpose computer. Reaching a peak performance of 2.7 Gflops at a significantly lower cost, power consumption, and space requirement than conventional supercomputers, Music is well suited to computationally intensive applications such as neural network simulation. 12 refs., 9 figs., 2 tabs.

  17. Designing a connectionist network supercomputer.

    PubMed

    Asanović, K; Beck, J; Feldman, J; Morgan, N; Wawrzynek, J

    1993-12-01

    This paper describes an effort at UC Berkeley and the International Computer Science Institute to develop a supercomputer for artificial neural network applications. Our perspective has been strongly influenced by earlier experiences with the construction and use of a simpler machine. In particular, we have observed Amdahl's Law in action in our designs and those of others. These observations inspire attention to many factors beyond fast multiply-accumulate arithmetic. We describe a number of these factors along with rough expressions for their influence and then give the applications targets, machine goals and the system architecture for the machine we are currently designing.
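    Amdahl's Law, which the authors cite as the reason to look beyond fast multiply-accumulate arithmetic, is easy to state as a formula; the numbers in the sketch below are illustrative and are not taken from the paper.

```python
def amdahl_speedup(parallel_fraction, n_units):
    """Amdahl's Law: overall speedup when only parallel_fraction of the work
    is accelerated n_units-fold and the rest runs at the original speed."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_units)

# Illustrative numbers (not from the paper): even if the multiply-accumulate
# portion is 95% of the runtime and is sped up 1000x, the serial remainder
# caps the overall speedup near 20x -- hence the attention to other factors.
print(round(amdahl_speedup(0.95, 1000), 1))   # -> 19.6
```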

  18. Revised Estimates of Child Restraint Effectiveness

    DOT National Transportation Integrated Search

    1996-12-01

    NHTSA's National Center for Statistics and Analysis (NCSA) recently completed an analysis of data from the Fatal Accident Reporting System (FARS) to reexamine the effectiveness of restraints in saving the lives of children, ages 0-4. This ana...

  19. The Terra Data Fusion Project: An Update

    NASA Astrophysics Data System (ADS)

    Di Girolamo, L.; Bansal, S.; Butler, M.; Fu, D.; Gao, Y.; Lee, H. J.; Liu, Y.; Lo, Y. L.; Raila, D.; Turner, K.; Towns, J.; Wang, S. W.; Yang, K.; Zhao, G.

    2017-12-01

    Terra is the flagship of NASA's Earth Observing System. Launched in 1999, Terra's five instruments continue to gather data that enable scientists to address fundamental Earth science questions. By design, the strength of the Terra mission has always been rooted in its five instruments and the ability to fuse the instrument data together for obtaining greater quality of information for Earth Science compared to individual instruments alone. As the data volume grows and the central Earth Science questions move towards problems requiring decadal-scale data records, the need for data fusion and the ability for scientists to perform large-scale analytics with long records have never been greater. The challenge is particularly acute for Terra, given its growing volume of data (> 1 petabyte), the storage of different instrument data at different archive centers, the different file formats and projection systems employed for different instrument data, and the inadequate cyberinfrastructure for scientists to access and process whole-mission fusion data (including Level 1 data). Sharing newly derived Terra products with the rest of the world also poses challenges. As such, the Terra Data Fusion Project aims to resolve two long-standing problems: 1) How do we efficiently generate and deliver Terra data fusion products? 2) How do we facilitate the use of Terra data fusion products by the community in generating new products and knowledge through national computing facilities, and disseminate these new products and knowledge through national data sharing services? Here, we will provide an update on significant progress made in addressing these problems by working with NASA and leveraging national facilities managed by the National Center for Supercomputing Applications (NCSA). The problems that we faced in deriving and delivering Terra L1B2 basic, reprojected and cloud-element fusion products, such as data transfer, data fusion, processing on different computer architectures, science, and sharing, will be presented with quantitative specifics. Results from several science-specific drivers for Terra fusion products will also be presented. We demonstrate that the Terra Data Fusion Project itself provides an excellent use-case for the community addressing Big Data and cyberinfrastructure problems.

  20. Introducing Argonne’s Theta Supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    Theta, the Argonne Leadership Computing Facility’s (ALCF) new Intel-Cray supercomputer, is officially open to the research community. Theta’s massively parallel, many-core architecture puts the ALCF on the path to Aurora, the facility’s future Intel-Cray system. Capable of nearly 10 quadrillion calculations per second, Theta enables researchers to break new ground in scientific investigations that range from modeling the inner workings of the brain to developing new materials for renewable energy applications.

  1. Traffic Tech: National Traffic Speeds Survey III: 2015

    DOT National Transportation Integrated Search

    2018-03-01

    Vehicle speeds are an important factor in traffic safety. NHTSA's most recent data estimates that approximately 27 percent of all fatal motor vehicle crashes are speeding-related (NCSA, 2018). NHTSA estimated the economic cost of speeding-related c...

  2. 12 & 15 passenger vans tire pressure study : preliminary results

    DOT National Transportation Integrated Search

    2005-05-01

    A study was conducted by the National Highway Traffic Safety Administration's (NHTSA's) National Center for Statistics and Analysis (NCSA) to determine the extent of underinflation and observe the tire condition in 12- and 15-passenger vans. This Res...

  3. Advances in petascale kinetic plasma simulation with VPIC and Roadrunner

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bowers, Kevin J; Albright, Brian J; Yin, Lin

    2009-01-01

    VPIC, a first-principles 3d electromagnetic charge-conserving relativistic kinetic particle-in-cell (PIC) code, was recently adapted to run on Los Alamos's Roadrunner, the first supercomputer to break a petaflop (10^15 floating point operations per second) in the TOP500 supercomputer performance rankings. They give a brief overview of the modeling capabilities and optimization techniques used in VPIC and the computational characteristics of petascale supercomputers like Roadrunner. They then discuss three applications enabled by VPIC's unprecedented performance on Roadrunner: modeling laser plasma interaction in upcoming inertial confinement fusion experiments at the National Ignition Facility (NIF), modeling short pulse laser GeV ion acceleration and modeling reconnection in magnetic confinement fusion experiments.

  4. Progress and supercomputing in computational fluid dynamics; Proceedings of U.S.-Israel Workshop, Jerusalem, Israel, December 1984

    NASA Technical Reports Server (NTRS)

    Murman, E. M. (Editor); Abarbanel, S. S. (Editor)

    1985-01-01

    Current developments and future trends in the application of supercomputers to computational fluid dynamics are discussed in reviews and reports. Topics examined include algorithm development for personal-size supercomputers, a multiblock three-dimensional Euler code for out-of-core and multiprocessor calculations, simulation of compressible inviscid and viscous flow, high-resolution solutions of the Euler equations for vortex flows, algorithms for the Navier-Stokes equations, and viscous-flow simulation by FEM and related techniques. Consideration is given to marching iterative methods for the parabolized and thin-layer Navier-Stokes equations, multigrid solutions to quasi-elliptic schemes, secondary instability of free shear flows, simulation of turbulent flow, and problems connected with weather prediction.

  5. The transition of a real-time single-rotor helicopter simulation program to a supercomputer

    NASA Technical Reports Server (NTRS)

    Martinez, Debbie

    1995-01-01

    This report presents the conversion effort and results of a real-time flight simulation application transition to a CONVEX supercomputer. Enclosed is a detailed description of the conversion process and a brief description of the Langley Research Center's (LaRC) flight simulation application program structure. Currently, this simulation program may be configured to represent a Sikorsky S-61 helicopter (a five-blade, single-rotor, commercial passenger-type helicopter) or an Army Cobra helicopter (either the AH-1G or AH-1S model). This report refers to the Sikorsky S-61 simulation program since it is the most frequently used configuration.

  6. An Interface for Biomedical Big Data Processing on the Tianhe-2 Supercomputer.

    PubMed

    Yang, Xi; Wu, Chengkun; Lu, Kai; Fang, Lin; Zhang, Yong; Li, Shengkang; Guo, Guixin; Du, YunFei

    2017-12-01

    Big data, cloud computing, and high-performance computing (HPC) are on the verge of convergence. Cloud computing is already playing an active part in big data processing with the help of big data frameworks like Hadoop and Spark. The recent upsurge of high-performance computing in China provides extra possibilities and capacity to address the challenges associated with big data. In this paper, we propose Orion - a big data interface on the Tianhe-2 supercomputer - to enable big data applications to run on Tianhe-2 via a single command or a shell script. Orion supports multiple users, and each user can launch multiple tasks. It minimizes the effort needed to initiate big data applications on the Tianhe-2 supercomputer via automated configuration. Orion follows the "allocate-when-needed" paradigm, and it avoids the idle occupation of computational resources. We tested the utility and performance of Orion using a big genomic dataset and achieved a satisfactory performance on Tianhe-2 with very few modifications to existing applications that were implemented in Hadoop/Spark. In summary, Orion provides a practical and economical interface for big data processing on Tianhe-2.

  7. Fatalities Associated with Carbon Monoxide Poisoning from Motor Vehicles in 1993

    DOT National Transportation Integrated Search

    1996-12-01

    National Highway Traffic Safety Administration's National Center for Statistics and Analysis (NCSA) recently completed a study of data from the National Center for Health Statistics (NCHS) to obtain an estimate of the number of persons killed a...

  8. An evaluation of the NCSA hydrochloric acid leaching procedure.

    DOT National Transportation Integrated Search

    1970-02-01

    In the past several years the National Crushed Stone Association has been conducting a research study into the skid resistance of carbonate aggregate. The aim of the study has been the establishment of a relationship between the skid resistance...

  9. A performance comparison of scalar, vector, and concurrent vector computers including supercomputers for modeling transport of reactive contaminants in groundwater

    NASA Astrophysics Data System (ADS)

    Tripathi, Vijay S.; Yeh, G. T.

    1993-06-01

    Sophisticated and highly computation-intensive models of transport of reactive contaminants in groundwater have been developed in recent years. Application of such models to real-world contaminant transport problems, e.g., simulation of groundwater transport of 10-15 chemically reactive elements (e.g., toxic metals) and relevant complexes and minerals in two and three dimensions over a distance of several hundred meters, requires high-performance computers including supercomputers. Although not widely recognized as such, the computational complexity and demand of these models compare with well-known computation-intensive applications including weather forecasting and quantum chemical calculations. A survey of the performance of a variety of available hardware, as measured by the run times for a reactive transport model HYDROGEOCHEM, showed that while supercomputers provide the fastest execution times for such problems, relatively low-cost reduced instruction set computer (RISC) based scalar computers provide the best performance-to-price ratio. Because supercomputers like the Cray X-MP are inherently multiuser resources, often the RISC computers also provide much better turnaround times. Furthermore, RISC-based workstations provide the best platforms for "visualization" of groundwater flow and contaminant plumes. The most notable result, however, is that current workstations costing less than $10,000 provide performance within a factor of 5 of a Cray X-MP.

  10. GLOBAL NON-SPHERICAL OSCILLATIONS IN THREE-DIMENSIONAL 4π SIMULATIONS OF THE H-INGESTION FLASH

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Herwig, Falk; Woodward, Paul R.; Lin, Pei-Hung

    2014-09-01

    We performed three-dimensional simulations of proton-rich material entrainment into ¹²C-rich He-shell flash convection and the subsequent H-ingestion flash that took place in the post-asymptotic giant branch star Sakurai's object. Observations of the transient nature and anomalous abundance features are available to validate our method and assumptions, with the aim of applying them to very low-metallicity stars in the future. We include nuclear energy feedback from H burning and cover the full 4π geometry of the shell. Runs on 768³ and 1536³ grids agree well with each other and have been followed for 1500 minutes and 1200 minutes. After an 850-minute-long quiescent entrainment phase, the simulations enter into a global non-spherical oscillation that is launched and sustained by individual ignition events of H-rich fluid pockets. Fast circumferential flows collide at the antipode and cause the formation and localized ignition of the next H-overabundant pocket. The cycle repeats more than a dozen times while its amplitude decreases. During the global oscillation, the entrainment rate increases temporarily by a factor of ≈100. Entrained entropy quenches convective motions in the upper layer until the burning of entrained H establishes a separate convection zone. The lower-resolution run hints at the possibility that another global oscillation, perhaps even more violent, will follow. The location of the H-burning convection zone agrees with a one-dimensional model in which the mixing efficiency is calibrated to reproduce the light curve. The simulations were performed on the NSF Blue Waters supercomputer at NCSA.

  11. Analytical Applications of Monte Carlo Techniques.

    ERIC Educational Resources Information Center

    Guell, Oscar A.; Holcombe, James A.

    1990-01-01

    Described are analytical applications of the theory of random processes, in particular solutions obtained by using statistical procedures known as Monte Carlo techniques. Supercomputer simulations, sampling, integration, ensemble, annealing, and explicit simulation are discussed. (CW)
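
    As a concrete illustration of the Monte Carlo integration mentioned in this record, the short Python sketch below estimates a definite integral by uniform random sampling. It is a generic textbook example, not code from the cited article.

      import random

      def mc_integrate(f, a, b, n_samples=100_000):
          """Estimate the integral of f over [a, b] by uniform random sampling."""
          total = sum(f(random.uniform(a, b)) for _ in range(n_samples))
          return (b - a) * total / n_samples

      # Example: integrate x**2 over [0, 1]; the exact value is 1/3.
      estimate = mc_integrate(lambda x: x * x, 0.0, 1.0)
      print(f"Monte Carlo estimate: {estimate:.4f} (exact 0.3333)")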

  12. 45 CFR 2551.12 - Definitions.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... Welfare Regulations Relating to Public Welfare (Continued) CORPORATION FOR NATIONAL AND COMMUNITY SERVICE... Officer of the Corporation appointed under the National and Community Service Act of 1990, as amended, (NCSA), 42 U.S.C. 12501 et seq. (f) Corporation. The Corporation for National and Community Service...

  13. Revised Vehicle Miles of Travel for Passenger Cars and Light Trucks, 1975 to 1993

    DOT National Transportation Integrated Search

    1995-09-21

    The Federal Highway Administration (FHWA) historically has supplied registered : vehicle and vehicle miles of travel (VMT) data for use with the National Center : for Statistics and Analysis (NCSA) motor vehicle injury and fatality data. : However, t...

  14. Development and Applications of a Modular Parallel Process for Large Scale Fluid/Structures Problems

    NASA Technical Reports Server (NTRS)

    Guruswamy, Guru P.; Kwak, Dochan (Technical Monitor)

    2002-01-01

    A modular process that can efficiently solve large scale multidisciplinary problems using massively parallel supercomputers is presented. The process integrates disciplines with diverse physical characteristics while retaining the efficiency of individual disciplines. Computational domain independence of individual disciplines is maintained using a meta-programming approach. The process integrates disciplines without affecting the combined performance. Results are demonstrated for large scale aerospace problems on several supercomputers. The super scalability and portability of the approach are demonstrated on several parallel computers.

  15. Intricacies of modern supercomputing illustrated with recent advances in simulations of strongly correlated electron systems

    NASA Astrophysics Data System (ADS)

    Schulthess, Thomas C.

    2013-03-01

    The continued thousand-fold improvement in sustained application performance per decade on modern supercomputers keeps opening new opportunities for scientific simulations. But supercomputers have become very complex machines, built with thousands or tens of thousands of complex nodes consisting of multiple CPU cores or, most recently, a combination of CPU and GPU processors. Efficient simulations on such high-end computing systems require tailored algorithms that optimally map numerical methods to particular architectures. These intricacies will be illustrated with simulations of strongly correlated electron systems, where the development of quantum cluster methods, Monte Carlo techniques, as well as their optimal implementation by means of algorithms with improved data locality and high arithmetic density have gone hand in hand with evolving computer architectures. The present work would not have been possible without continued access to computing resources at the National Center for Computational Science of Oak Ridge National Laboratory, which is funded by the Facilities Division of the Office of Advanced Scientific Computing Research, and the Swiss National Supercomputing Center (CSCS) that is funded by ETH Zurich.

  16. Motor Vehicle Crashes as a Leading Cause of Death in 1994

    DOT National Transportation Integrated Search

    1998-03-01

    National Highway Traffic Safety Administration's National Center for Statistics and Analysis (NCSA) recently completed a study of data on the causes of death for all persons, by age and sex, which occurred in the U.S. in 1994. The purpose of t...

  17. Transitioning to multiple imputation : a new method to impute missing blood alcohol concentration (BAC) values in FARS

    DOT National Transportation Integrated Search

    2002-01-01

    The National Center for Statistics and Analysis (NCSA) of the National Highway Traffic Safety Administration (NHTSA) has undertaken several approaches to remedy the problem of missing blood alcohol test results in the Fatality Analysis Reporting ...

  18. Use of World Wide Web and NCSA Mosaic at Langley

    NASA Technical Reports Server (NTRS)

    Nelson, Michael

    1994-01-01

    A brief history of the use of the World Wide Web at Langley Research Center is presented along with architecture of the Langley Web. Benefits derived from the Web and some Langley projects that have employed the World Wide Web are discussed.

  19. CFD code evaluation for internal flow modeling

    NASA Technical Reports Server (NTRS)

    Chung, T. J.

    1990-01-01

    Research on computational fluid dynamics (CFD) code evaluation, with emphasis on supercomputing in reacting flows, is discussed. Advantages of unstructured grids, multigrids, adaptive methods, improved flow solvers, vector processing, parallel processing, and reduction of memory requirements are discussed. As examples, the researchers include applications of supercomputing to reacting-flow Navier-Stokes equations, including shock waves and turbulence, and to combustion instability problems associated with solid and liquid propellants. Evaluation of codes developed by other organizations is not included. Instead, the basic criteria for accuracy and efficiency have been established, and some applications to rocket combustion have been made. Research toward an ultimate goal, the most accurate and efficient CFD code, is in progress and will continue for years to come.

  20. Integration Of PanDA Workload Management System With Supercomputers for ATLAS and Data Intensive Science

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    De, K; Jha, S; Klimentov, A

    2016-01-01

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3 petaFLOPS, LHC data taking runs require more resources than Grid computing can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in the United States, Europe and Russia (in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), the MIRA supercomputer at the Argonne Leadership Computing Facility (ALCF), the supercomputer at the National Research Center "Kurchatov Institute", IT4 in Ostrava, and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on the LCFs' multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms for the ALICE and ATLAS experiments, and it has been in full production for the ATLAS experiment since September 2015. We will present our current accomplishments in running PanDA WMS at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities' infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.
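
    The "light-weight MPI wrappers" mentioned in this and the following PanDA records bundle many independent single-threaded payloads into a single batch allocation. The mpi4py sketch below shows the general idea only; the task list and the simulate_events payload are hypothetical, and the real PanDA pilot wrapper is considerably more involved.

      # Minimal sketch of a light-weight MPI wrapper: each MPI rank runs one
      # single-threaded payload, so N independent tasks fill one batch allocation.
      # The task list and payload command are hypothetical placeholders.
      import subprocess
      from mpi4py import MPI

      comm = MPI.COMM_WORLD
      rank = comm.Get_rank()
      size = comm.Get_size()

      # One input file per task; in production the pilot would fetch these from PanDA.
      tasks = [f"event_block_{i:04d}.in" for i in range(size)]

      my_input = tasks[rank]
      # Run the (hypothetical) single-threaded simulation binary on this rank's input.
      result = subprocess.run(["./simulate_events", my_input], capture_output=True)

      # Gather return codes on rank 0 so the wrapper can report overall success.
      codes = comm.gather(result.returncode, root=0)
      if rank == 0:
          failed = [i for i, c in enumerate(codes) if c != 0]
          print(f"{size - len(failed)}/{size} payloads succeeded; failed ranks: {failed}")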

  1. Integration of Panda Workload Management System with supercomputers

    NASA Astrophysics Data System (ADS)

    De, K.; Jha, S.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Nilsson, P.; Novikov, A.; Oleynik, D.; Panitkin, S.; Poyda, A.; Read, K. F.; Ryabinkin, E.; Teslyuk, A.; Velikhov, V.; Wells, J. C.; Wenaus, T.

    2016-09-01

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 140 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250000 cores with a peak performance of 0.3+ petaFLOPS, next LHC data taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in the United States, Europe and Russia (in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), the supercomputer at the National Research Center "Kurchatov Institute", IT4 in Ostrava, and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on Titan's multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms. We will present our current accomplishments in running PanDA WMS at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facility's infrastructure for High Energy and Nuclear Physics, as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.

  2. Building more powerful less expensive supercomputers using Processing-In-Memory (PIM) LDRD final report.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Murphy, Richard C.

    2009-09-01

    This report details the accomplishments of the 'Building More Powerful Less Expensive Supercomputers Using Processing-In-Memory (PIM)' LDRD ('PIM LDRD', number 105809) for FY07-FY09. Latency dominates all levels of supercomputer design. Within a node, increasing memory latency, relative to processor cycle time, limits CPU performance. Between nodes, the same increase in relative latency impacts scalability. Processing-In-Memory (PIM) is an architecture that directly addresses this problem using enhanced chip fabrication technology and machine organization. PIMs combine high-speed logic and dense, low-latency, high-bandwidth DRAM, and lightweight threads that tolerate latency by performing useful work during memory transactions. This work examines the potential of PIM-based architectures to support mission-critical Sandia applications and an emerging class of more data-intensive informatics applications. This work has resulted in a stronger architecture/implementation collaboration between 1400 and 1700. Additionally, key technology components have impacted vendor roadmaps, and we are in the process of pursuing these new collaborations. This work has the potential to impact future supercomputer design and construction, reducing power and increasing performance. This final report is organized as follows: this summary chapter discusses the impact of the project (Section 1), provides an enumeration of publications and other public discussion of the work (Section 1), and concludes with a discussion of future work and impact from the project (Section 1). The appendix contains reprints of the refereed publications resulting from this work.

  3. Tomographic Imaging on Distributed Unattended Ground Sensor Arrays

    DTIC Science & Technology

    2002-05-14

    communication, the recently released Bluetooth standard warrants investigation into its usefulness on ground sensors. Although not as powerful or as fast...NTSC,” June 2001, http://archive.ncsa.uiuc.edu/SCMS/training/general/details/ntsc.html [14] Techfest, “PCI local bus technical summary,” 1999, http

  4. Data Service: Distributed Data Capture and Replication

    NASA Astrophysics Data System (ADS)

    Warner, P. B.; Pietrowicz, S. R.

    2007-10-01

    Data Service is a critical component of the NOAO Data Management and Science Support (DMaSS) Solutions Platform, which is based on a service-oriented architecture, and is to replace the current NOAO Data Transport System. Its responsibilities include capturing data from NOAO and partner telescopes and instruments and replicating the data across multiple (currently six) storage sites. Java 5 was chosen as the implementation language, and Java EE as the underlying enterprise framework. Application metadata persistence is performed using EJB and Hibernate on the JBoss Application Server, with PostgreSQL as the persistence back-end. Although potentially any underlying mass storage system may be used as the Data Service file persistence technology, DTS deployments and Data Service test deployments currently use the Storage Resource Broker from SDSC. This paper presents an overview and high-level design of the Data Service, including aspects of deployment, e.g., for the LSST Data Challenge at the NCSA computing facilities.

  5. Supercomputing resources empowering superstack with interactive and integrated systems

    NASA Astrophysics Data System (ADS)

    Rückemann, Claus-Peter

    2012-09-01

    This paper presents results from the development and implementation of Superstack algorithms for dynamic use with integrated systems and supercomputing resources. Processing of geophysical data, known as geoprocessing, is an essential part of the analysis of geoscientific data. The theory of Superstack algorithms and their practical application on modern computing architectures were inspired by developments that began with the processing of seismic data on mainframes and have led, in recent years, to high-end scientific computing applications. Several stacking algorithms are known, but when seismic data have a low signal-to-noise ratio, iterative algorithms like the Superstack can support analysis and interpretation. The new Superstack algorithms are in use with wave theory and optical phenomena on highly performant computing resources for huge data sets, as well as for sophisticated application scenarios in geosciences and archaeology.
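
    To make the stacking idea concrete, the NumPy sketch below shows plain trace stacking, the baseline that iterative Superstack-style schemes build on: averaging many traces suppresses incoherent noise. The synthetic traces are invented for illustration; this is not the authors' Superstack implementation.

      import numpy as np

      rng = np.random.default_rng(0)
      n_traces, n_samples = 64, 500

      # Synthetic data: a common signal buried in strong independent noise per trace.
      t = np.linspace(0, 1, n_samples)
      signal = np.exp(-((t - 0.5) ** 2) / 0.002)  # a simple wavelet
      traces = signal + 2.0 * rng.standard_normal((n_traces, n_samples))

      # Straight stack: averaging N traces suppresses incoherent noise by ~sqrt(N).
      stack = traces.mean(axis=0)

      snr_single = signal.max() / traces[0].std()
      snr_stack = signal.max() / (stack - signal).std()
      print(f"single-trace SNR ~ {snr_single:.2f}, stacked SNR ~ {snr_stack:.2f}")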

  6. Delivering Science on Day One

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Williams, Timothy J.

    2016-03-01

    While benchmarking software is useful for testing the performance limits and stability of Argonne National Laboratory’s new Theta supercomputer, there is no substitute for running real applications to explore the system’s potential. The Argonne Leadership Computing Facility’s Theta Early Science Program, modeled after its highly successful code migration program for the Mira supercomputer, has one primary aim: to deliver science on day one. Here is a closer look at the type of science problems that will be getting early access to Theta, a next-generation machine being rolled out this year.

  7. Towards future high performance computing: What will change? How can we be efficient?

    NASA Astrophysics Data System (ADS)

    Düben, Peter

    2017-04-01

    How can we make the most out of "exascale" supercomputers that will be available soon and enable us to perform an amazing 1,000,000,000,000,000,000 real-number operations within a single second? How do we need to design applications to use these machines efficiently? What are the limits? We will discuss opportunities and limits of the use of future high-performance computers from the perspective of Earth System Modelling. We will provide an overview of future challenges and outline how numerical applications will need to be changed to run efficiently on supercomputers in the future. We will also discuss how different disciplines can support each other and talk about data handling and numerical precision of data.

  8. Demonstration of NICT Space Weather Cloud --Integration of Supercomputer into Analysis and Visualization Environment--

    NASA Astrophysics Data System (ADS)

    Watari, S.; Morikawa, Y.; Yamamoto, K.; Inoue, S.; Tsubouchi, K.; Fukazawa, K.; Kimura, E.; Tatebe, O.; Kato, H.; Shimojo, S.; Murata, K. T.

    2010-12-01

    In the Solar-Terrestrial Physics (STP) field, the spatio-temporal resolution of computer simulations keeps increasing because of the tremendous advancement of supercomputers. A further advance is Grid computing, which integrates distributed computational resources to provide scalable computing capacity. In simulation research, it is effective for researchers to design their own physical models, perform calculations on a supercomputer, and then analyze and visualize the results with their own familiar methods. A supercomputer, however, is usually far removed from the analysis and visualization environment. In general, researchers analyze and visualize on a locally managed workstation (WS) because installing and operating software on a WS is easy, so data must be copied manually from the supercomputer to the WS. In practice, the time required to transfer data over a long-delay network hampers high-accuracy simulations. For usability, it is therefore important to integrate the supercomputer and the analysis and visualization environment seamlessly while preserving the researcher's familiar methods. NICT has been developing a cloud computing environment, the NICT Space Weather Cloud, in which disk servers are located near its supercomputer and the WSs used for data analysis and visualization. They are connected to JGN2plus, a high-speed network for research and development. Distributed virtual high-capacity storage is constructed with Grid Datafarm (Gfarm v2), and the huge data sets output from the supercomputer are transferred to this virtual storage over JGN2plus. Researchers can thus concentrate on their research using familiar methods, regardless of the distance between the supercomputer and the analysis and visualization environment. Currently, a total of 16 disk servers are set up at NICT headquarters (Koganei, Tokyo), the JGN2plus NOC (Otemachi, Tokyo), the Okinawa Subtropical Environment Remote-Sensing Center, and the Cybermedia Center, Osaka University. They are connected over JGN2plus and constitute 1 PB (physical size) of virtual storage under Gfarm v2. These disk servers are connected to the supercomputers of NICT and Osaka University, and a system has been built that automatically transfers data output from the supercomputers to the virtual storage. The measured transfer rate is about 50 GB/hour, which is estimated to be reasonable for a representative simulation and analysis task, the reconstruction of the coronal magnetic field. This research serves as an experiment with the system, and verification of its practicality is proceeding at the same time. Herein we introduce an overview of the space weather cloud system developed so far and demonstrate several scientific results obtained with it. We also introduce several web applications offered as services of the space weather cloud, collectively named "e-SpaceWeather" (e-SW), which provides a variety of online space weather services.

  9. NASA Advanced Supercomputing Facility Expansion

    NASA Technical Reports Server (NTRS)

    Thigpen, William W.

    2017-01-01

    The NASA Advanced Supercomputing (NAS) Division enables advances in high-end computing technologies and in modeling and simulation methods to tackle some of the toughest science and engineering challenges facing NASA today. The name "NAS" has long been associated with leadership and innovation throughout the high-end computing (HEC) community. We play a significant role in shaping HEC standards and paradigms, and provide leadership in the areas of large-scale InfiniBand fabrics, Lustre open-source filesystems, and hyperwall technologies. We provide an integrated high-end computing environment to accelerate NASA missions and make revolutionary advances in science. Pleiades, a petaflop-scale supercomputer, is used by scientists throughout the U.S. to support NASA missions, and is ranked among the most powerful systems in the world. One of our key focus areas is in modeling and simulation to support NASA's real-world engineering applications and make fundamental advances in modeling and simulation methods.

  10. 1993 Gordon Bell Prize Winners

    NASA Technical Reports Server (NTRS)

    Karp, Alan H.; Simon, Horst; Heller, Don; Cooper, D. M. (Technical Monitor)

    1994-01-01

    The Gordon Bell Prize recognizes significant achievements in the application of supercomputers to scientific and engineering problems. In 1993, finalists were named for work in three categories: (1) Performance, which recognizes those who solved a real problem in the quickest elapsed time. (2) Price/performance, which encourages the development of cost-effective supercomputing. (3) Compiler-generated speedup, which measures how well compiler writers are facilitating the programming of parallel processors. The winners were announced November 17 at the Supercomputing 93 conference in Portland, Oregon. Gordon Bell, an independent consultant in Los Altos, California, is sponsoring $2,000 in prizes each year for 10 years to promote practical parallel processing research. This is the sixth year of the prize, which Computer administers. Something unprecedented in Gordon Bell Prize competition occurred this year: A computer manufacturer was singled out for recognition. Nine entries reporting results obtained on the Cray C90 were received, seven of the submissions orchestrated by Cray Research. Although none of these entries showed sufficiently high performance to win outright, the judges were impressed by the breadth of applications that ran well on this machine, all nine running at more than a third of the peak performance of the machine.

  11. Calculation of Free Energy Landscape in Multi-Dimensions with Hamiltonian-Exchange Umbrella Sampling on Petascale Supercomputer.

    PubMed

    Jiang, Wei; Luo, Yun; Maragliano, Luca; Roux, Benoît

    2012-11-13

    An extremely scalable computational strategy is described for calculations of the potential of mean force (PMF) in multidimensions on massively distributed supercomputers. The approach involves coupling thousands of umbrella sampling (US) simulation windows distributed to cover the space of order parameters with a Hamiltonian molecular dynamics replica-exchange (H-REMD) algorithm to enhance the sampling of each simulation. In the present application, US/H-REMD is carried out in a two-dimensional (2D) space and exchanges are attempted alternately along the two axes corresponding to the two order parameters. The US/H-REMD strategy is implemented on the basis of a parallel/parallel multiple copy protocol at the MPI level, and therefore can fully exploit the computing power of large-scale supercomputers. Here the novel technique is illustrated using the leadership supercomputer IBM Blue Gene/P with an application to a typical biomolecular calculation of general interest, namely the binding of calcium ions to the small protein Calbindin D9k. The free energy landscape associated with two order parameters, the distance between the ion and its binding pocket and the root-mean-square deviation (rmsd) of the binding pocket relative to the crystal structure, was calculated using the US/H-REMD method. The results are then used to estimate the absolute binding free energy of the calcium ion to Calbindin D9k. The tests demonstrate that the 2D US/H-REMD scheme greatly accelerates the configurational sampling of the binding pocket, thereby improving the convergence of the potential of mean force calculation.
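
    The alternating-axis exchange scheme described above can be sketched schematically as follows: windows form a grid over the two order parameters, and Metropolis-style swaps alternate between the two axes. The harmonic bias, the jittered "MD step", and all parameters are placeholders for illustration; the production US/H-REMD implementation runs real MD under each window's Hamiltonian.

      # Schematic 2D umbrella-sampling replica exchange with alternating swap axes.
      # Energies, biases, and "MD" moves are placeholders, not a real simulation.
      import math
      import random

      BETA = 1.0 / 0.596  # 1/kT in kcal/mol near room temperature (illustrative)

      def bias(center, x, k=10.0):
          """Harmonic umbrella bias on a 2D order parameter (placeholder)."""
          return 0.5 * k * sum((xi - ci) ** 2 for xi, ci in zip(x, center))

      def attempt_swap(centers, coords, i, j):
          """Metropolis criterion for exchanging the configurations of windows i and j."""
          delta = (bias(centers[i], coords[j]) + bias(centers[j], coords[i])
                   - bias(centers[i], coords[i]) - bias(centers[j], coords[j]))
          if delta <= 0 or random.random() < math.exp(-BETA * delta):
              coords[i], coords[j] = coords[j], coords[i]

      nx, ny = 4, 4  # 4x4 grid of umbrella windows over the two order parameters
      centers = [(float(x), float(y)) for y in range(ny) for x in range(nx)]
      coords = [list(c) for c in centers]  # start each replica at its window center

      for sweep in range(100):
          for c in coords:  # placeholder "MD step": jitter each replica slightly
              c[0] += random.gauss(0.0, 0.1)
              c[1] += random.gauss(0.0, 0.1)
          if sweep % 2 == 0:  # swaps along the first order-parameter axis
              pairs = [(y * nx + x, y * nx + x + 1)
                       for y in range(ny) for x in range(nx - 1)]
          else:               # swaps along the second order-parameter axis
              pairs = [(y * nx + x, (y + 1) * nx + x)
                       for y in range(ny - 1) for x in range(nx)]
          for i, j in pairs:
              attempt_swap(centers, coords, i, j)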

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Meneses, Esteban; Ni, Xiang; Jones, Terry R

    The unprecedented computational power of current supercomputers now makes possible the exploration of complex problems in many scientific fields, from genomic analysis to computational fluid dynamics. Modern machines are powerful because they are massive: they assemble millions of cores and a huge quantity of disks, cards, routers, and other components. But it is precisely the size of these machines that clouds the future of supercomputing. A system that comprises many components has a high chance of failing, and failing often. In order to make the next generation of supercomputers usable, it is imperative to use some type of fault tolerance platform to run applications on large machines. Most fault tolerance strategies can be optimized for the peculiarities of each system and boost efficacy by keeping the system productive. In this paper, we aim to understand how failure characterization can improve resilience in several layers of the software stack: applications, runtime systems, and job schedulers. We examine the Titan supercomputer, one of the fastest systems in the world. We analyze a full year of Titan in production and distill the failure patterns of the machine. By looking into Titan's log files and using the criteria of experts, we provide a detailed description of the types of failures. In addition, we inspect the job submission files and describe how the system is used. Using those two sources, we cross-correlate failures in the machine to executing jobs and provide a picture of how failures affect the user experience. We believe such characterization is fundamental in developing appropriate fault tolerance solutions for Cray systems similar to Titan.
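
    A simple way to picture the cross-correlation of failures with executing jobs described above is to check which jobs were active on the failed node at the time of each failure. The sketch below does this with small, invented in-memory records; the real analysis parses Titan's log and job-submission files, whose formats are not reproduced here.

      from datetime import datetime

      # Hypothetical, hand-made records standing in for parsed log and scheduler data.
      failures = [
          {"time": datetime(2015, 3, 1, 14, 5), "node": "c1-0c0s3n1", "type": "GPU DBE"},
      ]
      jobs = [
          {"id": 101, "nodes": {"c1-0c0s3n1", "c1-0c0s3n2"},
           "start": datetime(2015, 3, 1, 13, 0), "end": datetime(2015, 3, 1, 15, 0)},
          {"id": 102, "nodes": {"c2-0c1s0n0"},
           "start": datetime(2015, 3, 1, 13, 30), "end": datetime(2015, 3, 1, 16, 0)},
      ]

      # A failure "hits" a job if it happens on one of the job's nodes while it runs.
      for f in failures:
          hit = [j["id"] for j in jobs
                 if j["start"] <= f["time"] <= j["end"] and f["node"] in j["nodes"]]
          print(f"{f['type']} at {f['time']} on {f['node']} affected jobs: {hit}")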

  13. INTEGRATION OF PANDA WORKLOAD MANAGEMENT SYSTEM WITH SUPERCOMPUTERS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    De, K; Jha, S; Maeno, T

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 140 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250000 cores with a peak performance of 0.3+ petaFLOPS, next LHC data taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in the United States, Europe and Russia (in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), the supercomputer at the National Research Center "Kurchatov Institute", IT4 in Ostrava, and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on Titan's multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms. We will present our current accomplishments in running PanDA WMS at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facility's infrastructure for High Energy and Nuclear Physics, as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.

  14. NAS Applications and Advanced Algorithms

    NASA Technical Reports Server (NTRS)

    Bailey, David H.; Biswas, Rupak; VanDerWijngaart, Rob; Kutler, Paul (Technical Monitor)

    1997-01-01

    This paper examines the applications most commonly run on the supercomputers at the Numerical Aerospace Simulation (NAS) facility. It analyzes the extent to which such applications are fundamentally oriented to vector computers, and whether or not they can be efficiently implemented on hierarchical memory machines, such as systems with cache memories and highly parallel, distributed memory systems.

  15. Integration Of PanDA Workload Management System With Supercomputers for ATLAS and Data Intensive Science

    NASA Astrophysics Data System (ADS)

    Klimentov, A.; De, K.; Jha, S.; Maeno, T.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Wells, J.; Wenaus, T.

    2016-10-01

    The LHC, operating at CERN, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3 petaFLOPS, LHC data taking runs require more resources than the Grid can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in the United States, in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility. The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on the LCFs' multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms for the ALICE and ATLAS experiments, and it has been in full production for ATLAS since September 2015. We will present our current accomplishments with running PanDA at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities' infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.

  16. Opportunities for leveraging OS virtualization in high-end supercomputing.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bridges, Patrick G.; Pedretti, Kevin Thomas Tauke

    2010-11-01

    This paper examines potential motivations for incorporating virtualization support in the system software stacks of high-end capability supercomputers. We advocate that this will increase the flexibility of these platforms significantly and enable new capabilities that are not possible with current fixed software stacks. Our results indicate that compute, virtual memory, and I/O virtualization overheads are low and can be further mitigated by utilizing well-known techniques such as large paging and VMM bypass. Furthermore, since the addition of virtualization support does not affect the performance of applications using the traditional native environment, there is essentially no disadvantage to its addition.

  17. Some Problems and Solutions in Transferring Ecosystem Simulation Codes to Supercomputers

    NASA Technical Reports Server (NTRS)

    Skiles, J. W.; Schulbach, C. H.

    1994-01-01

    Many computer codes for the simulation of ecological systems have been developed in the last twenty-five years. This development took place initially on main-frame computers, then mini-computers, and more recently, on micro-computers and workstations. Recent recognition of ecosystem science as a High Performance Computing and Communications Program Grand Challenge area emphasizes supercomputers (both parallel and distributed systems) as the next set of tools for ecological simulation. Transferring ecosystem simulation codes to such systems is not a matter of simply compiling and executing existing code on the supercomputer since there are significant differences in the system architectures of sequential, scalar computers and parallel and/or vector supercomputers. To more appropriately match the application to the architecture (necessary to achieve reasonable performance), the parallelism (if it exists) of the original application must be exploited. We discuss our work in transferring a general grassland simulation model (developed on a VAX in the FORTRAN computer programming language) to a Cray Y-MP. We show the Cray shared-memory vector-architecture, and discuss our rationale for selecting the Cray. We describe porting the model to the Cray and executing and verifying a baseline version, and we discuss the changes we made to exploit the parallelism in the application and to improve code execution. As a result, the Cray executed the model 30 times faster than the VAX 11/785 and 10 times faster than a Sun 4 workstation. We achieved an additional speed-up of approximately 30 percent over the original Cray run by using the compiler's vectorizing capabilities and the machine's ability to put subroutines and functions "in-line" in the code. With the modifications, the code still runs at only about 5% of the Cray's peak speed because it makes ineffective use of the vector processing capabilities of the Cray. We conclude with a discussion and future plans.
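
    The kind of restructuring described above, replacing element-by-element loops with whole-array operations that vector hardware (or a vectorizing compiler) can exploit, can be illustrated in miniature with NumPy. This is a generic example of the principle, not the grassland model's Fortran code.

      import numpy as np
      import time

      n = 1_000_000
      biomass = np.random.rand(n)
      growth_rate = np.random.rand(n)

      # Scalar-style loop: one element at a time, the pattern that leaves vector
      # (or SIMD) hardware idle.
      t0 = time.perf_counter()
      out_loop = np.empty(n)
      for i in range(n):
          out_loop[i] = biomass[i] * (1.0 + growth_rate[i])
      t_loop = time.perf_counter() - t0

      # Whole-array form: the same update expressed so the runtime can vectorize it.
      t0 = time.perf_counter()
      out_vec = biomass * (1.0 + growth_rate)
      t_vec = time.perf_counter() - t0

      assert np.allclose(out_loop, out_vec)
      print(f"loop: {t_loop:.3f}s  vectorized: {t_vec:.3f}s")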

  18. Comprehensive efficiency analysis of supercomputer resource usage based on system monitoring data

    NASA Astrophysics Data System (ADS)

    Mamaeva, A. A.; Shaykhislamov, D. I.; Voevodin, Vad V.; Zhumatiy, S. A.

    2018-03-01

    One of the main problems of modern supercomputers is the low efficiency of their usage, which leads to significant idle time of computational resources and, in turn, slows the pace of scientific research. This paper presents three approaches to studying the efficiency of supercomputer resource usage based on monitoring data analysis. The first approach performs an analysis of computing resource utilization statistics, which makes it possible to identify different typical classes of programs, to explore the structure of the supercomputer job flow, and to track overall trends in the supercomputer's behavior. The second approach is aimed specifically at analyzing off-the-shelf software packages and libraries installed on the supercomputer, since the efficiency of their usage is becoming an increasingly important factor for the efficient functioning of the entire supercomputer. Within the third approach, abnormal jobs – jobs with abnormally inefficient behavior that differs significantly from the standard behavior of the overall supercomputer job flow – are detected. For each approach, the results obtained in practice in the Supercomputer Center of Moscow State University are demonstrated.
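
    One simple way to flag the "abnormal jobs" of the third approach is a statistical outlier test on a monitoring metric. The sketch below applies a z-score to invented per-job CPU-utilization figures; the paper's actual detection method is not specified here and is likely more sophisticated.

      import statistics

      # Hypothetical average CPU utilization (%) per job from monitoring data.
      jobs = {"job-101": 92.0, "job-102": 88.5, "job-103": 3.1,
              "job-104": 90.2, "job-105": 85.7, "job-106": 2.4}

      values = list(jobs.values())
      mean = statistics.mean(values)
      stdev = statistics.stdev(values)

      # Jobs far below the typical utilization are flagged as abnormal.
      for name, util in jobs.items():
          z = (util - mean) / stdev
          if z < -1.0:
              print(f"{name}: utilization {util:.1f}% (z={z:.2f}) flagged as abnormal")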

  19. 45 CFR 2550.80 - What are the duties of the State entities?

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ...) The plan is subject to approval by the chief executive officer of the State. (9) The plan must be... national service program that receives financial assistance under section 121 of the NCSA or title II of... Federal financial assistance programs under the Community Services Block Grant Act (42 U.S.C. 9901 et seq...

  20. National Comorbidity Survey Replication Adolescent Supplement (NCS-A): III. Concordance of DSM-IV/CIDI Diagnoses with Clinical Reassessments

    ERIC Educational Resources Information Center

    Kessler, Ronald C.; Avenevoli, Shelli; Green, Jennifer; Gruber, Michael J.; Guyer, Margaret; He, Yulei; Jin, Robert; Kaufman, Joan; Sampson, Nancy A.; Zaslavsky, Alan M.; Merikangas, Kathleen R.

    2009-01-01

    The Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) diagnoses that were based on the World Health Organization's Composite International Diagnostic Interview (CIDI) and implemented in the National Comorbidity Survey Replication Adolescent Supplement are found to have good individual-level concordance with diagnoses based on blinded…

  1. [Construction and application of bioinformatic analysis platform for aquatic pathogen based on the MilkyWay-2 supercomputer].

    PubMed

    Fang, Xiang; Li, Ning-qiu; Fu, Xiao-zhe; Li, Kai-bin; Lin, Qiang; Liu, Li-hui; Shi, Cun-bin; Wu, Shu-qin

    2015-07-01

    As a key component of life science, bioinformatics has been widely applied in genomics, transcriptomics, and proteomics. However, the requirement of high-performance computers rather than common personal computers for constructing a bioinformatics platform has significantly limited the application of bioinformatics in aquatic science. In this study, we constructed a bioinformatic analysis platform for aquatic pathogens based on the MilkyWay-2 supercomputer. The platform consists of three functional modules: genomic and transcriptomic sequencing data analysis, protein structure prediction, and molecular dynamics simulations. To validate the practicability of the platform, we performed bioinformatic analysis on aquatic pathogenic organisms. For example, genes of Flavobacterium johnsoniae M168 were identified and annotated via Blast searches and GO and InterPro annotations. Protein structural models for five small segments of grass carp reovirus HZ-08 were constructed by homology modeling. Molecular dynamics simulations were performed on outer membrane protein A of Aeromonas hydrophila, and changes in system temperature, total energy, root-mean-square deviation, and loop conformation during equilibration were observed. These results show that the bioinformatic analysis platform for aquatic pathogens has been successfully built on the MilkyWay-2 supercomputer. This study will provide insights into the construction of bioinformatic analysis platforms for other subjects.
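
    As a small illustration of one quantity tracked during the equilibration described above, the sketch below computes the root-mean-square deviation (RMSD) of coordinate frames from a reference structure with NumPy, assuming the frames are already aligned. The toy trajectory is invented; this is the generic formula, not the platform's pipeline.

      import numpy as np

      def rmsd(frame, reference):
          """RMSD between two (n_atoms, 3) coordinate arrays, assumed pre-aligned."""
          diff = frame - reference
          return np.sqrt((diff ** 2).sum() / len(reference))

      # Toy trajectory: the reference structure plus growing random displacements.
      rng = np.random.default_rng(1)
      reference = rng.random((50, 3))
      trajectory = [reference + 0.01 * step * rng.standard_normal((50, 3))
                    for step in range(5)]

      for step, frame in enumerate(trajectory):
          print(f"frame {step}: RMSD = {rmsd(frame, reference):.4f}")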

  2. Computational fluid dynamics research at the United Technologies Research Center requiring supercomputers

    NASA Astrophysics Data System (ADS)

    Landgrebe, Anton J.

    1987-03-01

    An overview of research activities at the United Technologies Research Center (UTRC) in the area of Computational Fluid Dynamics (CFD) is presented. The requirement and use of various levels of computers, including supercomputers, for the CFD activities is described. Examples of CFD directed toward applications to helicopters, turbomachinery, heat exchangers, and the National Aerospace Plane are included. Helicopter rotor codes for the prediction of rotor and fuselage flow fields and airloads were developed with emphasis on rotor wake modeling. Airflow and airload predictions and comparisons with experimental data are presented. Examples are presented of recent parabolized Navier-Stokes and full Navier-Stokes solutions for hypersonic shock-wave/boundary layer interaction, and hydrogen/air supersonic combustion. In addition, other examples of CFD efforts in turbomachinery Navier-Stokes methodology and separated flow modeling are presented. A brief discussion of the 3-tier scientific computing environment is also presented, in which the researcher has access to workstations, mid-size computers, and supercomputers.

  3. Computational fluid dynamics research at the United Technologies Research Center requiring supercomputers

    NASA Technical Reports Server (NTRS)

    Landgrebe, Anton J.

    1987-01-01

    An overview of research activities at the United Technologies Research Center (UTRC) in the area of Computational Fluid Dynamics (CFD) is presented. The requirement and use of various levels of computers, including supercomputers, for the CFD activities is described. Examples of CFD directed toward applications to helicopters, turbomachinery, heat exchangers, and the National Aerospace Plane are included. Helicopter rotor codes for the prediction of rotor and fuselage flow fields and airloads were developed with emphasis on rotor wake modeling. Airflow and airload predictions and comparisons with experimental data are presented. Examples are presented of recent parabolized Navier-Stokes and full Navier-Stokes solutions for hypersonic shock-wave/boundary layer interaction, and hydrogen/air supersonic combustion. In addition, other examples of CFD efforts in turbomachinery Navier-Stokes methodology and separated flow modeling are presented. A brief discussion of the 3-tier scientific computing environment is also presented, in which the researcher has access to workstations, mid-size computers, and supercomputers.

  4. Antenna pattern control using impedance surfaces

    NASA Technical Reports Server (NTRS)

    Balanis, Constantine A.; Liu, Kefeng

    1992-01-01

    During this research period, we effectively transferred existing computer codes from the CRAY supercomputer to workstation-based systems. The workstation-based version of our code preserves the accuracy of the numerical computations while giving a much better turnaround time than the CRAY supercomputer. This transfer relieved us of heavy dependence on the supercomputer account budget and made the codes developed in this research project more practical for applications. The analysis of pyramidal horns with impedance surfaces was our major focus during this research period. Three different modeling algorithms for analyzing lossy impedance surfaces were investigated and compared with measured data. Through this investigation, we found that a hybrid Fourier transform technique, which uses the eigenmodes in the stepped waveguide section and the Fourier-transformed field distributions across the stepped discontinuities for lossy impedance coatings, gives better accuracy in analyzing lossy coatings. After further refinement of the present technique, we will perform accurate radiation pattern synthesis in the coming reporting period.

  5. ANALYSIS OF 2H-EVAPORATOR SCALE WALL [HTF-13-82] AND POT BOTTOM [HTF-13-77] SAMPLES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oji, L.

    2013-06-21

    Savannah River Remediation (SRR) is planning to remove a buildup of sodium aluminosilicate scale from the 2H-evaporator pot by loading and soaking the pot with heated 1.5 M nitric acid solution. Sampling and analysis of the scale material has been performed so that uranium and plutonium isotopic analysis can be input into a Nuclear Criticality Safety Assessment (NCSA) for scale removal by chemical cleaning. Historically, since the operation of the Defense Waste Processing Facility (DWPF), silicon in the DWPF recycle stream combines with aluminum in the typical tank farm supernate to form sodium aluminosilicate scale mineral deposits in the 2H-evaporator pot and gravity drain line. The 2H-evaporator scale samples analyzed by Savannah River National Laboratory (SRNL) came from the bottom cone sections of the 2H-evaporator pot [Sample HTF-13-77] and the wall of the 2H-evaporator [sample HTF-13-82]. X-ray diffraction analysis (XRD) confirmed that both the 2H-evaporator pot scale and the wall samples consist of nitrated cancrinite (a crystalline sodium aluminosilicate solid) and clarkeite (a uranium oxy-hydroxide mineral). On an "as received" basis, the bottom pot section scale sample contained an average of 2.59E+00 ± 1.40E-01 wt % total uranium with a U-235 enrichment of 6.12E-01 ± 1.48E-02 %, while the wall sample contained an average of 4.03E+00 ± 9.79E-01 wt % total uranium with a U-235 enrichment of 6.03E-01 ± 1.66E-02 %. The bottom pot section scale sample analyses results for Pu-238, Pu-239, and Pu-241 are 3.16E-05 ± 5.40E-06 wt %, 3.28E-04 ± 1.45E-05 wt %, and <8.80E-07 wt %, respectively. The evaporator wall scale sample analysis values for Pu-238, Pu-239, and Pu-241 average 3.74E-05 ± 6.01E-06 wt %, 4.38E-04 ± 5.08E-05 wt %, and <1.38E-06 wt %, respectively. The Pu-241 analyses results, as presented, are upper limit values. These results are provided so that SRR can calculate the equivalent uranium-235 concentrations for the NCSA. Results confirm that the uranium contained in the scale remains depleted with respect to natural uranium. SRNL did not calculate an equivalent U-235 enrichment, which takes into account other fissionable isotopes U-233, Pu-239 and Pu-241. The applicable method for calculation of equivalent U-235 will be determined in the NCSA.

  6. Accessing and visualizing scientific spatiotemporal data

    NASA Technical Reports Server (NTRS)

    Katz, Daniel S.; Bergou, Attila; Berriman, G. Bruce; Block, Gary L.; Collier, Jim; Curkendall, David W.; Good, John; Husman, Laura; Jacob, Joseph C.; Laity, Anastasia; hide

    2004-01-01

    This paper discusses work done by JPL's Parallel Applications Technologies Group in helping scientists access and visualize very large data sets through the use of multiple computing resources, such as parallel supercomputers, clusters, and grids.

  7. Performance Evaluation of an Intel Haswell- and Ivy Bridge-Based Supercomputer Using Scientific and Engineering Applications

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Hood, Robert T.; Chang, Johnny; Baron, John

    2016-01-01

    We present a performance evaluation conducted on a production supercomputer of the Intel Xeon Processor E5-2680v3, a twelve-core implementation of the fourth-generation Haswell architecture, and compare it with the Intel Xeon Processor E5-2680v2, an Ivy Bridge implementation of the third-generation Sandy Bridge architecture. Several new architectural features have been incorporated in Haswell, including improvements in all levels of the memory hierarchy as well as improvements to vector instructions and power management. We critically evaluate these new features of Haswell and compare with Ivy Bridge using several low-level benchmarks, including a subset of HPCC and HPCG, and four full-scale scientific and engineering applications. We also present a model that predicts the performance of HPCG and Cart3D to within 5% accuracy, and Overflow to within 10%.

  8. A Computational framework for telemedicine.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Foster, I.; von Laszewski, G.; Thiruvathukal, G. K.

    1998-07-01

    Emerging telemedicine applications require the ability to exploit diverse and geographically distributed resources. High-speed networks are used to integrate advanced visualization devices, sophisticated instruments, large databases, archival storage devices, PCs, workstations, and supercomputers. This form of telemedical environment is similar to networked virtual supercomputers, also known as metacomputers. Metacomputers are already being used in many scientific application areas. In this article, we analyze requirements necessary for a telemedical computing infrastructure and compare them with requirements found in a typical metacomputing environment. We will show that metacomputing environments can be used to enable a more powerful and unified computational infrastructure for telemedicine. The Globus metacomputing toolkit can provide the necessary low-level mechanisms to enable a large-scale telemedical infrastructure. The Globus toolkit components are designed in a modular fashion and can be extended to support the specific requirements for telemedicine.

  9. The role of graphics super-workstations in a supercomputing environment

    NASA Technical Reports Server (NTRS)

    Levin, E.

    1989-01-01

    A new class of very powerful workstations has recently become available which integrate near supercomputer computational performance with very powerful and high quality graphics capability. These graphics super-workstations are expected to play an increasingly important role in providing an enhanced environment for supercomputer users. Their potential uses include: off-loading the supercomputer (by serving as stand-alone processors, by post-processing of the output of supercomputer calculations, and by distributed or shared processing), scientific visualization (understanding of results, communication of results), and by real time interaction with the supercomputer (to steer an iterative computation, to abort a bad run, or to explore and develop new algorithms).

  10. 48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...

  11. 48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...

  12. 48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...

  13. 48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...

  14. 48 CFR 252.225-7011 - Restriction on acquisition of supercomputers.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... of supercomputers. 252.225-7011 Section 252.225-7011 Federal Acquisition Regulations System DEFENSE... CLAUSES Text of Provisions And Clauses 252.225-7011 Restriction on acquisition of supercomputers. As prescribed in 225.7012-3, use the following clause: Restriction on Acquisition of Supercomputers (JUN 2005...

  15. Data-intensive computing on numerically-insensitive supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ahrens, James P; Fasel, Patricia K; Habib, Salman

    2010-12-03

    With the advent of the era of petascale supercomputing, via the delivery of the Roadrunner supercomputing platform at Los Alamos National Laboratory, there is a pressing need to address the problem of visualizing massive petascale-sized results. In this presentation, I discuss progress on a number of approaches including in-situ analysis, multi-resolution out-of-core streaming and interactive rendering on the supercomputing platform. These approaches are placed in context by the emerging area of data-intensive supercomputing.

  16. Computer Electromagnetics and Supercomputer Architecture

    NASA Technical Reports Server (NTRS)

    Cwik, Tom

    1993-01-01

    The dramatic increase in performance over the last decade for microprocessor computations is compared with that for supercomputer computations. This performance, the projected performance, and a number of other issues such as cost and the inherent physical limitations of current supercomputer technology have naturally led to parallel supercomputers and ensembles of interconnected microprocessors.

  17. Hypercube technology

    NASA Technical Reports Server (NTRS)

    Parker, Jay W.; Cwik, Tom; Ferraro, Robert D.; Liewer, Paulett C.; Patterson, Jean E.

    1991-01-01

    The JPL designed MARKIII hypercube supercomputer has been in application service since June 1988 and has had successful application to a broad problem set including electromagnetic scattering, discrete event simulation, plasma transport, matrix algorithms, neural network simulation, image processing, and graphics. Currently, problems that are not homogeneous are being attempted, and, through this involvement with real world applications, the software is evolving to handle the heterogeneous class problems efficiently.

  18. Edison - A New Cray Supercomputer Advances Discovery at NERSC

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dosanjh, Sudip; Parkinson, Dula; Yelick, Kathy

    2014-02-06

    When a supercomputing center installs a new system, users are invited to make heavy use of the computer as part of the rigorous testing. In this video, find out what top scientists have discovered using Edison, a Cray XC30 supercomputer, and how NERSC's newest supercomputer will accelerate their future research.

  19. Edison - A New Cray Supercomputer Advances Discovery at NERSC

    ScienceCinema

    Dosanjh, Sudip; Parkinson, Dula; Yelick, Kathy; Trebotich, David; Broughton, Jeff; Antypas, Katie; Lukic, Zarija; Borrill, Julian; Draney, Brent; Chen, Jackie

    2018-01-16

    When a supercomputing center installs a new system, users are invited to make heavy use of the computer as part of the rigorous testing. In this video, find out what top scientists have discovered using Edison, a Cray XC30 supercomputer, and how NERSC's newest supercomputer will accelerate their future research.

  20. Expanding CyberShake Physics-Based Seismic Hazard Calculations to Central California

    NASA Astrophysics Data System (ADS)

    Silva, F.; Callaghan, S.; Maechling, P. J.; Goulet, C. A.; Milner, K. R.; Graves, R. W.; Olsen, K. B.; Jordan, T. H.

    2016-12-01

    As part of its program of earthquake system science, the Southern California Earthquake Center (SCEC) has developed a simulation platform, CyberShake, to perform physics-based probabilistic seismic hazard analysis (PSHA) using 3D deterministic wave propagation simulations. CyberShake performs PSHA by first simulating a tensor-valued wavefield of Strain Green Tensors. CyberShake then takes an earthquake rupture forecast and extends it by varying the hypocenter location and slip distribution, resulting in about 500,000 rupture variations. Seismic reciprocity is used to calculate synthetic seismograms for each rupture variation at each computation site. These seismograms are processed to obtain intensity measures, such as spectral acceleration, which are then combined with probabilities from the earthquake rupture forecast to produce a hazard curve. Hazard curves are calculated at seismic frequencies up to 1 Hz for hundreds of sites in a region and the results interpolated to obtain a hazard map. In developing and verifying CyberShake, we have focused our modeling in the greater Los Angeles region. We are now expanding the hazard calculations into Central California. Using workflow tools running jobs across two large-scale open-science supercomputers, NCSA Blue Waters and OLCF Titan, we calculated 1-Hz PSHA results for over 400 locations in Central California. For each location, we produced hazard curves using both a 3D central California velocity model created via tomographic inversion, and a regionally averaged 1D model. These new results provide low-frequency exceedance probabilities for the rapidly expanding metropolitan areas of Santa Barbara, Bakersfield, and San Luis Obispo, and lend new insights into the effects of directivity-basin coupling associated with basins juxtaposed to major faults such as the San Andreas. Particularly interesting are the basin effects associated with the deep sediments of the southern San Joaquin Valley. We will compare hazard estimates from the 1D and 3D models, summarize the challenges of expanding CyberShake to a new geographic region, and describe our future CyberShake plans.
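
    As a rough illustration of the hazard-curve combination step described above (rupture-forecast probabilities combined with per-rupture exceedance fractions from the rupture variations), the sketch below computes a toy hazard curve. The rupture probabilities, intensity-measure samples, and thresholds are hypothetical; this is not CyberShake code, it only shows the shape of the calculation.

      import numpy as np

      def hazard_curve(im_levels, rupture_probs, im_samples):
          """Probability of exceeding each intensity-measure (IM) level.

          rupture_probs : dict rupture_id -> probability of that rupture
          im_samples    : dict rupture_id -> IM values, one per rupture variation
          """
          curve = np.zeros_like(im_levels, dtype=float)
          for rup, p_rup in rupture_probs.items():
              ims = np.asarray(im_samples[rup])
              for i, level in enumerate(im_levels):
                  # P(IM > level | rupture): fraction of variations exceeding it
                  p_exc = np.mean(ims > level)
                  # Combine independent ruptures: 1 - product of non-exceedance
                  curve[i] = 1.0 - (1.0 - curve[i]) * (1.0 - p_rup * p_exc)
          return curve

      # Hypothetical example: two ruptures with a few variations each
      levels = np.linspace(0.05, 1.0, 20)
      probs = {"ruptureA": 0.01, "ruptureB": 0.002}
      samples = {"ruptureA": [0.12, 0.31, 0.22, 0.45],
                 "ruptureB": [0.55, 0.74, 0.60, 0.91]}
      print(hazard_curve(levels, probs, samples))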

  1. Service Utilization for Lifetime Mental Disorders in U.S. Adolescents: Results of the National Comorbidity Survey-Adolescent Supplement (NCS-A)

    ERIC Educational Resources Information Center

    Merikangas, Kathleen Ries; He, Jian-ping; Burstein, Marcy; Swendsen, Joel; Avenevoli, Shelli; Case, Brady; Georgiades, Katholiki; Heaton, Leanne; Swanson, Sonja; Olfson, Mark

    2011-01-01

    Objective: Mental health policy for youth has been constrained by a paucity of nationally representative data concerning patterns and correlates of mental health service utilization in this segment of the population. The objectives of this investigation were to examine the rates and sociodemographic correlates of lifetime mental health service use…

  2. 75 FR 8013 - Serve America Act Amendments to the National and Community Service Act of 1990

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-02-23

    ... implement changes to the operation of the National Service Trust under the Serve America Act. This proposed... Corporation issued an interim final rule to implement time-sensitive changes to the Corporation's AmeriCorps... changes resulting from the interim final rule were required as a result of amendments to the NCSA and DVSA...

  3. Integrating the Apache Big Data Stack with HPC for Big Data

    NASA Astrophysics Data System (ADS)

    Fox, G. C.; Qiu, J.; Jha, S.

    2014-12-01

    There is perhaps a broad consensus as to important issues in practical parallel computing as applied to large-scale simulations; this is reflected in supercomputer architectures, algorithms, libraries, languages, compilers and best practice for application development. However, the same is not so true for data-intensive computing, even though commercial clouds devote much more resources to data analytics than supercomputers devote to simulations. We look at a sample of over 50 big data applications to identify characteristics of data-intensive applications and to deduce needed runtimes and architectures. We suggest a big data version of the famous Berkeley dwarfs and NAS parallel benchmarks and use these to identify a few key classes of hardware/software architectures. Our analysis builds on combining HPC with ABDS, the Apache big data software stack that is widely used in modern cloud computing. Initial results on clouds and HPC systems are encouraging. We propose the development of SPIDAL (Scalable Parallel Interoperable Data Analytics Library), built on system and data abstractions suggested by the HPC-ABDS architecture. We discuss how it can be used in several application areas including Polar Science.

  4. Impact of the Columbia Supercomputer on NASA Space and Exploration Mission

    NASA Technical Reports Server (NTRS)

    Biswas, Rupak; Kwak, Dochan; Kiris, Cetin; Lawrence, Scott

    2006-01-01

    NASA's 10,240-processor Columbia supercomputer gained worldwide recognition in 2004 for increasing the space agency's computing capability ten-fold, and enabling U.S. scientists and engineers to perform significant, breakthrough simulations. Columbia has amply demonstrated its capability to accelerate NASA's key missions, including space operations, exploration systems, science, and aeronautics. Columbia is part of an integrated high-end computing (HEC) environment comprised of massive storage and archive systems, high-speed networking, high-fidelity modeling and simulation tools, application performance optimization, and advanced data analysis and visualization. In this paper, we illustrate the impact Columbia is having on NASA's numerous space and exploration applications, such as the development of the Crew Exploration and Launch Vehicles (CEV/CLV), effects of long-duration human presence in space, and damage assessment and repair recommendations for remaining shuttle flights. We conclude by discussing HEC challenges that must be overcome to solve space-related science problems in the future.

  5. Program optimizations: The interplay between power, performance, and energy

    DOE PAGES

    Leon, Edgar A.; Karlin, Ian; Grant, Ryan E.; ...

    2016-05-16

    Practical considerations for future supercomputer designs will impose limits on both instantaneous power consumption and total energy consumption. Working within these constraints while providing the maximum possible performance, application developers will need to optimize their code for speed alongside power and energy concerns. This paper analyzes the effectiveness of several code optimizations including loop fusion, data structure transformations, and global allocations. A per-component measurement and analysis of different architectures is performed, enabling the examination of code optimizations on different compute subsystems. Using an explicit hydrodynamics proxy application from the U.S. Department of Energy, LULESH, we show how code optimizations impact different computational phases of the simulation. This provides insight for simulation developers into the best optimizations to use during particular simulation compute phases when optimizing code for future supercomputing platforms. Here, we examine and contrast both x86 and Blue Gene architectures with respect to these optimizations.
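
    As a toy illustration of one optimization named above, loop fusion, the following sketch computes the same update in two passes and in one fused pass; the array names and the update itself are invented and are not taken from LULESH.

      import numpy as np

      n = 1_000_000
      pressure = np.random.rand(n)
      energy = np.random.rand(n)

      # Unfused: two passes over memory, with an intermediate array.
      def update_unfused(pressure, energy, dt=1e-3):
          work = pressure * dt        # pass 1 writes a temporary
          return energy - work        # pass 2 re-reads the temporary

      # Fused: one pass produces the same result with less memory traffic.
      def update_fused(pressure, energy, dt=1e-3):
          return energy - pressure * dt

      assert np.allclose(update_unfused(pressure, energy),
                         update_fused(pressure, energy))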

  6. Topical perspective on massive threading and parallelism.

    PubMed

    Farber, Robert M

    2011-09-01

    Unquestionably computer architectures have undergone a recent and noteworthy paradigm shift that now delivers multi- and many-core systems with tens to many thousands of concurrent hardware processing elements per workstation or supercomputer node. GPGPU (General Purpose Graphics Processor Unit) technology in particular has attracted significant attention as new software development capabilities, namely CUDA (Compute Unified Device Architecture) and OpenCL™, have made it possible for students as well as small and large research organizations to achieve excellent speedup for many applications over more conventional computing architectures. The current scientific literature reflects this shift with numerous examples of GPGPU applications that have achieved one, two, and in some special cases, three orders of magnitude increased computational performance through the use of massive threading to exploit parallelism. Multi-core architectures are also evolving quickly to exploit both massive threading and massive parallelism, such as the 1.3-million-thread Blue Waters supercomputer. The challenge confronting scientists in planning future experimental and theoretical research efforts--be they individual efforts with one computer or collaborative efforts proposing to use the largest supercomputers in the world--is how to capitalize on these new massively threaded computational architectures, especially as not all computational problems will scale to massive parallelism. In particular, the costs associated with restructuring software (and potentially redesigning algorithms) to exploit the parallelism of these multi- and many-threaded machines must be considered along with application scalability and lifespan. This perspective is an overview of the current state of threading and parallelism with some insight into the future. Published by Elsevier Inc.

  7. 48 CFR 225.7012 - Restriction on supercomputers.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 48 Federal Acquisition Regulations System 3 2014-10-01 2014-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...

  8. 48 CFR 225.7012 - Restriction on supercomputers.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 48 Federal Acquisition Regulations System 3 2010-10-01 2010-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...

  9. 48 CFR 225.7012 - Restriction on supercomputers.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 48 Federal Acquisition Regulations System 3 2013-10-01 2013-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...

  10. 48 CFR 225.7012 - Restriction on supercomputers.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 48 Federal Acquisition Regulations System 3 2011-10-01 2011-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...

  11. 48 CFR 225.7012 - Restriction on supercomputers.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 48 Federal Acquisition Regulations System 3 2012-10-01 2012-10-01 false Restriction on supercomputers. 225.7012 Section 225.7012 Federal Acquisition Regulations System DEFENSE ACQUISITION REGULATIONS... supercomputers. ...

  12. High performance computing applications in neurobiological research

    NASA Technical Reports Server (NTRS)

    Ross, Muriel D.; Cheng, Rei; Doshay, David G.; Linton, Samuel W.; Montgomery, Kevin; Parnas, Bruce R.

    1994-01-01

    The human nervous system is a massively parallel processor of information. The vast numbers of neurons, synapses and circuits are daunting to those seeking to understand the neural basis of consciousness and intellect. Pervading obstacles are the lack of knowledge of the detailed, three-dimensional (3-D) organization of even a simple neural system and the paucity of large-scale, biologically relevant computer simulations. We use high-performance graphics workstations and supercomputers to study the 3-D organization of gravity sensors as a prototype architecture foreshadowing more complex systems. Scaled-down simulations run on a Silicon Graphics workstation and scaled-up, three-dimensional versions run on the Cray Y-MP and CM5 supercomputers.

  13. Multi-petascale highly efficient parallel supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.

    A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five-dimensional torus network that optimally maximizes the throughput of packet communications between nodes and minimizes latency. The network implements a collective network and a global asynchronous network that provides global barrier and notification functions. Integrated in the node design is a list-based prefetcher. The memory system implements transactional memory, thread-level speculation, and a multiversioning cache that improves the soft error rate while also supporting DMA functionality, allowing for parallel-processing message passing.

  14. Internal computational fluid mechanics on supercomputers for aerospace propulsion systems

    NASA Technical Reports Server (NTRS)

    Andersen, Bernhard H.; Benson, Thomas J.

    1987-01-01

    The accurate calculation of three-dimensional internal flowfields for application towards aerospace propulsion systems requires computational resources available only on supercomputers. A survey is presented of three-dimensional calculations of hypersonic, transonic, and subsonic internal flowfields conducted at the Lewis Research Center. A steady state Parabolized Navier-Stokes (PNS) solution of flow in a Mach 5.0, mixed compression inlet, a Navier-Stokes solution of flow in the vicinity of a terminal shock, and a PNS solution of flow in a diffusing S-bend with vortex generators are presented and discussed. All of these calculations were performed on either the NAS Cray-2 or the Lewis Research Center Cray XMP.

  15. MEGADOCK 4.0: an ultra-high-performance protein-protein docking software for heterogeneous supercomputers.

    PubMed

    Ohue, Masahito; Shimoda, Takehiro; Suzuki, Shuji; Matsuzaki, Yuri; Ishida, Takashi; Akiyama, Yutaka

    2014-11-15

    The application of protein-protein docking in large-scale interactome analysis is a major challenge in structural bioinformatics and requires huge computing resources. In this work, we present MEGADOCK 4.0, an FFT-based docking software that makes extensive use of recent heterogeneous supercomputers and shows powerful, scalable performance of >97% strong scaling. MEGADOCK 4.0 is written in C++ with OpenMPI and NVIDIA CUDA 5.0 (or later) and is freely available to all academic and non-profit users at: http://www.bi.cs.titech.ac.jp/megadock. akiyama@cs.titech.ac.jp Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
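
    The FFT-based rigid-body scoring at the heart of such docking codes can be sketched in a few lines: the correlation of a receptor grid with a ligand grid over all translations is computed with forward and inverse FFTs. The grids below are random placeholders, and the sketch omits rotations, electrostatics, and the other score terms a real docking run uses; it is not MEGADOCK code.

      import numpy as np

      def fft_docking_scores(receptor_grid, ligand_grid):
          """Score all rigid translations of a ligand grid against a receptor grid.

          Both inputs are 3D arrays of the same shape encoding, e.g., shape
          complementarity. The correlation theorem replaces an O(N^2)
          translational scan with two forward FFTs and one inverse FFT.
          """
          R = np.fft.fftn(receptor_grid)
          L = np.fft.fftn(ligand_grid)
          return np.fft.ifftn(np.conj(R) * L).real

      rng = np.random.default_rng(0)
      receptor = rng.random((32, 32, 32))     # hypothetical 32^3 grids
      ligand = rng.random((32, 32, 32))
      scores = fft_docking_scores(receptor, ligand)
      best = np.unravel_index(np.argmax(scores), scores.shape)
      print("best translation (grid offset):", best)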

  16. CONVEX mini manual

    NASA Technical Reports Server (NTRS)

    Tennille, Geoffrey M.; Howser, Lona M.

    1993-01-01

    The use of the CONVEX computers that are an integral part of the Supercomputing Network Subsystems (SNS) of the Central Scientific Computing Complex of LaRC is briefly described. Features of the CONVEX computers that are significantly different from the CRAY supercomputers are covered, including: FORTRAN, C, architecture of the CONVEX computers, the CONVEX environment, batch job submittal, debugging, performance analysis, utilities unique to CONVEX, and documentation. This revision reflects the addition of the Applications Compiler and the X-based debugger, CXdb. The document is intended for all CONVEX users as a ready reference to frequently asked questions and to more detailed information contained within the vendor manuals. It is appropriate for both the novice and the experienced user.

  17. Detecting Silent Data Corruption for Extreme-Scale Applications through Data Mining

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bautista-Gomez, Leonardo; Cappello, Franck

    Supercomputers allow scientists to study natural phenomena by means of computer simulations. Next-generation machines are expected to have more components and, at the same time, consume several times less energy per operation. These trends are pushing supercomputer construction to the limits of miniaturization and energy-saving strategies. Consequently, the number of soft errors is expected to increase dramatically in the coming years. While mechanisms are in place to correct or at least detect some soft errors, a significant percentage of those errors pass unnoticed by the hardware. Such silent errors are extremely damaging because they can make applications silently produce wrong results. In this work we propose a technique that leverages certain properties of high-performance computing applications in order to detect silent errors at the application level. Our technique detects corruption solely based on the behavior of the application datasets and is completely application-agnostic. We propose multiple corruption detectors, and we couple them to work together in a fashion transparent to the user. We demonstrate that this strategy can detect the majority of the corruptions, while incurring negligible overhead. We show that with the help of these detectors, applications can have up to 80% coverage against data corruption.
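
    In the spirit of the behavior-based detection described above, the sketch below flags grid points whose update between two time steps falls far outside the typical update magnitude. It is a minimal stand-in, not the detectors proposed in the work; the sensitivity parameter and the injected error are hypothetical.

      import numpy as np

      def detect_silent_corruption(prev_step, curr_step, k=6.0):
          """Return indices whose change between time steps is anomalously large.

          Smooth HPC datasets evolve gradually, so an update far outside the
          robust spread of all updates is suspicious. 'k' is a hypothetical
          sensitivity knob.
          """
          delta = curr_step - prev_step
          center = np.median(delta)
          spread = np.median(np.abs(delta - center)) + 1e-12
          return np.argwhere(np.abs(delta - center) > k * spread)

      prev = np.sin(np.linspace(0, 3, 1000))
      curr = prev + 1e-3
      curr[417] += 0.5                              # injected silent corruption
      print(detect_silent_corruption(prev, curr))   # -> [[417]]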

  18. Challenges in scaling NLO generators to leadership computers

    NASA Astrophysics Data System (ADS)

    Benjamin, D.; Childers, JT; Hoeche, S.; LeCompte, T.; Uram, T.

    2017-10-01

    Exascale computing resources are roughly a decade away and will be capable of 100 times more computing than current supercomputers. In the last year, Energy Frontier experiments crossed a milestone of 100 million core-hours used at the Argonne Leadership Computing Facility, Oak Ridge Leadership Computing Facility, and NERSC. The Fortran-based leading-order parton generator called Alpgen was successfully scaled to millions of threads to achieve this level of usage on Mira. Sherpa and MadGraph are next-to-leading order generators used heavily by LHC experiments for simulation. Integration times for high-multiplicity or rare processes can take a week or more on standard Grid machines, even using all 16 cores. We will describe our ongoing work to scale the Sherpa generator to thousands of threads on leadership-class machines and reduce run times to less than a day. This work allows the experiments to leverage large-scale parallel supercomputers for event generation today, freeing tens of millions of grid hours for other work, and paving the way for future applications (simulation, reconstruction) on these and future supercomputers.

  19. Evolving the NCSA CyberCollaboratory for Distributed Environmental Observatory Networks

    NASA Astrophysics Data System (ADS)

    Myers, J.; Liu, Y.; Minsker, B.; Futrelle, J.; Downey, S.; Kim, I.; Rantanen, E.

    2007-12-01

    Since 2004, NCSA's CyberCollaboratory, which is built on top of the open source Liferay portal framework, has been evolving as part of NCSA's efforts to build national cyberinfrastructure to support collaborative research in environmental engineering and hydrological sciences and to allow users to efficiently share content (sensors, data, models, documents, etc.) in a context-sensitive way (e.g., providing different tools/data based on group affiliation and geospatial context). During this period, we provided the CyberCollaboratory to users in the CLEANER (Collaborative Large-scale Engineering Analysis Network for Environmental Research, now the WATer and Environmental Research Systems (WATERS) network) Project Office and several CLEANER/WATERS testbed projects. Preliminary statistics show that one in four of the more than 400 registered users contributed content, with many others reading or accessing that content (such as messages, documents, and wikis). During the course of this use, and in evaluations by others including representatives from the CUAHSI (Consortium of Universities for the Advancement of Hydrologic Science) community, we have received significant feedback on issues of usability and suitability for the various communities involved in environmental observatories. Much of this feedback applies to collaborative portals in general, and some reflects a comparison of portals with newer Web 2.0 style social-networking sites. For example, users working in multiple groups found it difficult to get an overview of all of their activities and found differences in group layouts confusing. Users also found the standard account creation and group management processes cumbersome compared to inviting people to be friends on social sites, and wanted a better sense of presence and social networks within the portal. The fragmentation of group documents between local stores, the portal document repository, and email, together with the issue of "lost updates," was another significant concern. This poster reviews the usability feedback, identifies key issues that hinder traditional portal-based collaboration environments, and presents design changes made to the CyberCollaboratory to address them. Feedback on the effectiveness of the new design from hydrologists and environmental researchers and preliminary results from a formal usability study will also be presented.

  20. Server-side Log Data Analytics for I/O Workload Characterization and Coordination on Large Shared Storage Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, Y.; Gunasekaran, Raghul; Ma, Xiaosong

    2016-01-01

    Inter-application I/O contention and performance interference have been recognized as severe problems. In this work, we demonstrate, through measurement from Titan (the world's No. 3 supercomputer), that high I/O variance co-exists with the fact that individual storage units remain under-utilized for the majority of the time. This motivates us to propose AID, a system that performs automatic application I/O characterization and I/O-aware job scheduling. AID analyzes existing I/O traffic and batch job history logs, without any prior knowledge of applications or user/developer involvement. It identifies the small set of I/O-intensive candidates among all applications running on a supercomputer and subsequently mines their I/O patterns, using more detailed per-I/O-node traffic logs. Based on such auto-extracted information, AID provides online I/O-aware scheduling recommendations to steer I/O-intensive applications away from heavy ongoing I/O activities. We evaluate AID on Titan, using both real applications (with extracted I/O patterns validated by contacting users) and our own pseudo-applications. Our results confirm that AID is able to (1) identify I/O-intensive applications and their detailed I/O characteristics, and (2) significantly reduce these applications' I/O performance degradation/variance by jointly evaluating outstanding applications' I/O patterns and the real-time system I/O load.
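
    The log-mining idea can be illustrated with a small sketch that joins server-side I/O records with batch job history and ranks applications by average I/O rate. The log schemas, numbers, and threshold are hypothetical; this is not the AID implementation.

      from collections import defaultdict

      # Hypothetical server-side I/O records: (timestamp, job_id, bytes_moved)
      io_log = [
          (100, "job1", 8e9), (110, "job1", 9e9), (120, "job2", 1e6),
          (130, "job3", 2e10), (140, "job3", 3e10), (150, "job2", 5e5),
      ]
      # Hypothetical batch history: job_id -> (application name, runtime in s)
      job_log = {"job1": ("climate_app", 3600), "job2": ("md_app", 7200),
                 "job3": ("cosmology_app", 1800)}

      bytes_per_app = defaultdict(float)
      runtime_per_app = defaultdict(float)
      for _, job, nbytes in io_log:
          bytes_per_app[job_log[job][0]] += nbytes
      for app, runtime in job_log.values():
          runtime_per_app[app] += runtime

      threshold = 1e6    # hypothetical bytes/s cutoff for "I/O-intensive"
      rates = {app: bytes_per_app[app] / runtime_per_app[app] for app in bytes_per_app}
      for app, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
          label = "I/O-intensive" if rate > threshold else "light I/O"
          print(f"{app:15s} {rate:12.3e} B/s  {label}")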

  1. TOP500 Supercomputers for June 2004

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack

    2004-06-23

    23rd Edition of TOP500 List of World's Fastest Supercomputers Released: Japan's Earth Simulator Enters Third Year in Top Position MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a closely watched event in the world of high-performance computing, the 23rd edition of the TOP500 list of the world's fastest supercomputers was released today (June 23, 2004) at the International Supercomputer Conference in Heidelberg, Germany.

  2. SMARTSware for SMARTS users to facilitate data reduction and data analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2005-01-01

    The software package SMARTSware is made by one of the instrument scientists on the engineering neutron diffractometer SMARTS at the Lujan Center, a national user facility at the Los Alamos Neutron Science Center (LANSCE). The purpose of the software is to facilitate the analysis of powder diffraction data recorded at the Lujan Center, and hence the target audience is users performing experiments at one of the powder diffractometers (SMARTS, HIPPO, HIPD and NPDF) at the Lujan Center. Beam time at the Lujan Center is allocated by peer review of internally and externally submitted proposals, and therefore many of the users who are granted beam time are from the international science community. Generally, the users are only at the Lujan Center for a short period of time while they are performing the experiments, and often they leave with several data sets that have not been analyzed. The distribution of the SMARTSware software package will minimize their efforts when analyzing the data once they are back at their institutions. Description of software: There are two main parts of the software: a part used to generate instrument parameter files from a set of calibration runs (Smartslparm, SmartsBin, SmartsFitDif and SmartsFitspec), and a part that facilitates the batch refinement of multiple diffraction patterns (SmartsRunRep, SmartsABC, SmartsSPF and SmartsExtract). The former part may only be peripheral to most users, but is a critical part of the instrument scientists' efforts in calibrating their instruments. The latter part is highly applicable to the users, as they often need to analyze or re-analyze large sets of data. The programs within the SMARTSware package rely heavily on GSAS for the Rietveld and single-peak refinements of diffraction data. GSAS (General Structure Analysis System) is publicly available software also originating from LANL. Subroutines and libraries from the NeXus project (a world-wide trust to standardize diffraction data formats) and the National Center for Supercomputing Applications (NCSA) at the University of Illinois (the Hierarchical Data Format Software Library and Utilities) are used in the programs. All these subroutines and libraries are publicly available through the GNU Public License and/or as freeware. The package also contains sample input and output text files and a manual (LA-UR 04-6581). The executables and sample files will be available for download at http://public.lanl.gov/clausen/SMARTSware.html and ftp://lansce.lanl.gov/clausen/SMARTSware/SMARTSware.zip, but the source code will only be made available by written request to clausen@lanl.gov.

  3. Performance and Scalability of the NAS Parallel Benchmarks in Java

    NASA Technical Reports Server (NTRS)

    Frumkin, Michael A.; Schultz, Matthew; Jin, Haoqiang; Yan, Jerry; Biegel, Bryan A. (Technical Monitor)

    2002-01-01

    Several features make Java an attractive choice for scientific applications. In order to gauge the applicability of Java to Computational Fluid Dynamics (CFD), we have implemented the NAS (NASA Advanced Supercomputing) Parallel Benchmarks in Java. The performance and scalability of the benchmarks point out the areas where improvement in Java compiler technology and in Java thread implementation would position Java closer to Fortran in the competition for scientific applications.

  4. Scientific Visualization in High Speed Network Environments

    NASA Technical Reports Server (NTRS)

    Vaziri, Arsi; Kutler, Paul (Technical Monitor)

    1997-01-01

    In several cases, new visualization techniques have vastly increased the researcher's ability to analyze and comprehend data. Similarly, the role of networks in providing an efficient supercomputing environment has become more critical and continues to grow at a faster rate than the increase in the processing capabilities of supercomputers. A close relationship between scientific visualization and high-speed networks in providing an important link to support efficient supercomputing is identified. The two technologies are driven by the increasing complexity and volume of supercomputer data. The interaction of scientific visualization and high-speed networks in a Computational Fluid Dynamics simulation/visualization environment is given. Current capabilities supported by high-speed networks, supercomputers, and high-performance graphics workstations at the Numerical Aerodynamic Simulation Facility (NAS) at NASA Ames Research Center are described. Applied research in providing a supercomputer visualization environment to support future computational requirements is summarized.

  5. Fast I/O for Massively Parallel Applications

    NASA Technical Reports Server (NTRS)

    OKeefe, Matthew T.

    1996-01-01

    The two primary goals of this report were the design, construction, and modeling of parallel disk arrays for scientific visualization and animation, and a study of the I/O requirements of highly parallel applications. In addition, the report covers further work on the parallel display systems required to project and animate the very high-resolution frames resulting from our supercomputing simulations in ocean circulation and compressible gas dynamics.

  6. Overview 1993: Computational applications

    NASA Technical Reports Server (NTRS)

    Benek, John A.

    1993-01-01

    Computational applications include projects that apply or develop computationally intensive computer programs. Such programs typically require supercomputers to obtain solutions in a timely fashion. This report describes two CSTAR projects involving Computational Fluid Dynamics (CFD) technology. The first, the Parallel Processing Initiative, is a joint development effort and the second, the Chimera Technology Development, is a transfer of government developed technology to American industry.

  7. Organizational Strategies for End-User Computing Support.

    ERIC Educational Resources Information Center

    Blackmun, Robert R.; And Others

    1988-01-01

    Effective support for end users of computers has been an important issue in higher education from the first applications of general purpose mainframe computers through minicomputers, microcomputers, and supercomputers. The development of end user support is reviewed and organizational models are examined. (Author/MLW)

  8. Motor vehicle traffic crashes as a leading cause of death in the U.S. : summary of the 1997 mortality experience and traffic crash fatality trend from 1992 to 1997

    DOT National Transportation Integrated Search

    2000-06-01

    The National Center for Statistics and Analysis (NCSA) recently completed a study of data on the causes of death for all persons, by age and sex, which occurred in the U.S. in 1997. The purpose of this study was to examine the status of motor vehicle...

  9. Towards Efficient Supercomputing: Searching for the Right Efficiency Metric

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hsu, Chung-Hsing; Kuehn, Jeffery A; Poole, Stephen W

    2012-01-01

    The efficiency of supercomputing has traditionally been measured by execution time. In the early 2000s, the concept of total cost of ownership was re-introduced, with efficiency measures extended to include aspects such as energy and space. Yet the supercomputing community has never agreed upon a metric that can cover these aspects altogether and also provide a fair basis for comparison. This paper examines the metrics that have been proposed in the past decade, and proposes a vector-valued metric for efficient supercomputing. Using this metric, the paper presents a study of where the supercomputing industry has been and how it stands today with respect to efficient supercomputing.
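
    The paper's actual metric is not reproduced here, but the idea of a vector-valued efficiency measure can be sketched as below: each system gets a vector of costs (time, energy, and space are assumed components for illustration), and systems are compared by dominance rather than collapsed into a single number.

      from dataclasses import dataclass

      @dataclass
      class EfficiencyVector:
          """Vector-valued efficiency record; smaller is better in every component."""
          time_s: float       # time to solution
          energy_kwh: float   # energy consumed by the run
          space_m2: float     # floor space attributable to the run

          def dominates(self, other: "EfficiencyVector") -> bool:
              """No worse in every component and strictly better in at least one."""
              no_worse = (self.time_s <= other.time_s and
                          self.energy_kwh <= other.energy_kwh and
                          self.space_m2 <= other.space_m2)
              better = (self.time_s < other.time_s or
                        self.energy_kwh < other.energy_kwh or
                        self.space_m2 < other.space_m2)
              return no_worse and better

      # Hypothetical systems: neither dominates, so a scalar rank would hide the trade-off.
      a = EfficiencyVector(time_s=3600, energy_kwh=500, space_m2=40)
      b = EfficiencyVector(time_s=5400, energy_kwh=300, space_m2=35)
      print(a.dominates(b), b.dominates(a))   # False False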

  10. Validation of the diagnoses of panic disorder and phobic disorders in the US National Comorbidity Survey Replication Adolescent (NCS-A) Supplement

    PubMed Central

    Green, Jennifer Greif; Avenevoli, Shelli; Finkelman, Matthew; Gruber, Michael J.; Kessler, Ronald C.; Merikangas, Kathleen R.; Sampson, Nancy A.; Zaslavsky, Alan M.

    2012-01-01

    Validity of the adolescent version of the World Health Organization Composite International Diagnostic Interview (CIDI) Version 3.0, a fully-structured research diagnostic interview designed to be used by trained lay interviewers, is assessed in comparison to independent clinical diagnoses based on the Schedule for Affective Disorders and Schizophrenia for School-Age Children (K-SADS). This assessment is carried out in the clinical reappraisal sub-sample (n = 347) of the US National Comorbidity Survey Adolescent Supplement (NCS-A), a large (n = 10,148) community epidemiological survey of the prevalence and correlates of adolescent mental disorders in the US. The diagnoses considered are panic disorder and phobic disorders (social phobia, specific phobia, agoraphobia). CIDI diagnoses are found to have good concordance with K-SADS diagnoses (AUC = .81–.94), although the CIDI diagnoses are consistently somewhat higher than the K-SADS diagnoses. Data are also presented on criterion-level concordance in an effort to pinpoint CIDI question series that might be improved in future modifications of the instrument. Finally, data are presented on the factor structure of the fears associated with social phobia, the only disorder in this series where substantial controversy exists about disorder subtypes. PMID:21495110

  11. Will Your Next Supercomputer Come from Costco?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Farber, Rob

    2007-04-15

    A fun topic for April, one that is not an April fool’s joke, is that you can purchase a commodity 200+ Gflop (single-precision) Linux supercomputer for around $600 from your favorite electronics vendor. Yes, it’s true. Just walk in and ask for a Sony Playstation 3 (PS3), take it home and install Linux on it. IBM has provided an excellent tutorial for installing Linux and building applications at http://www-128.ibm.com/developerworks/power/library/pa-linuxps3-1. If you want to raise some eyebrows at work, then submit a purchase request for a Sony PS3 game console and watch the reactions as your paperwork wends its way through the procurement process.

  12. Interactive 3D visualization speeds well, reservoir planning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Petzet, G.A.

    1997-11-24

    Texaco Exploration and Production has begun making expeditious analyses and drilling decisions that result from interactive, large-screen visualization of seismic and other three-dimensional data. A pumpkin-shaped room, or pod, inside a 3,500 sq ft, state-of-the-art facility in Southwest Houston houses a supercomputer and projection equipment that Texaco said will help its people sharply reduce 3D seismic project cycle time, boost production from existing fields, and find more reserves. Oil and gas related applications of the visualization center include reservoir engineering, plant walkthrough simulation for facilities/piping design, and new field exploration. The center houses a Silicon Graphics Onyx2 InfiniteReality supercomputer configured with 8 processors, 3 graphics pipelines, and 6 gigabytes of main memory.

  13. Development of seismic tomography software for hybrid supercomputers

    NASA Astrophysics Data System (ADS)

    Nikitin, Alexandr; Serdyukov, Alexandr; Duchkov, Anton

    2015-04-01

    Seismic tomography is a technique used for computing velocity model of geologic structure from first arrival travel times of seismic waves. The technique is used in processing of regional and global seismic data, in seismic exploration for prospecting and exploration of mineral and hydrocarbon deposits, and in seismic engineering for monitoring the condition of engineering structures and the surrounding host medium. As a consequence of development of seismic monitoring systems and increasing volume of seismic data, there is a growing need for new, more effective computational algorithms for use in seismic tomography applications with improved performance, accuracy and resolution. To achieve this goal, it is necessary to use modern high performance computing systems, such as supercomputers with hybrid architecture that use not only CPUs, but also accelerators and co-processors for computation. The goal of this research is the development of parallel seismic tomography algorithms and software package for such systems, to be used in processing of large volumes of seismic data (hundreds of gigabytes and more). These algorithms and software package will be optimized for the most common computing devices used in modern hybrid supercomputers, such as Intel Xeon CPUs, NVIDIA Tesla accelerators and Intel Xeon Phi co-processors. In this work, the following general scheme of seismic tomography is utilized. Using the eikonal equation solver, arrival times of seismic waves are computed based on assumed velocity model of geologic structure being analyzed. In order to solve the linearized inverse problem, tomographic matrix is computed that connects model adjustments with travel time residuals, and the resulting system of linear equations is regularized and solved to adjust the model. The effectiveness of parallel implementations of existing algorithms on target architectures is considered. During the first stage of this work, algorithms were developed for execution on supercomputers using multicore CPUs only, with preliminary performance tests showing good parallel efficiency on large numerical grids. Porting of the algorithms to hybrid supercomputers is currently ongoing.
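
    The linearized inversion step described above (a tomographic matrix relating model adjustments to travel-time residuals, regularized and solved) can be sketched as a damped least-squares solve. The matrix, damping value, and problem size below are hypothetical, and this is not the project's software.

      import numpy as np

      def tomography_update(G, residuals, damping=0.1):
          """One linearized step: solve min ||G dm - r||^2 + damping^2 ||dm||^2.

          G         : tomographic matrix (rays x cells); row i holds ray i's
                      path length in each model cell
          residuals : observed minus predicted first-arrival travel times
          """
          m = G.shape[1]
          A = np.vstack([G, damping * np.eye(m)])
          b = np.concatenate([residuals, np.zeros(m)])
          dm, *_ = np.linalg.lstsq(A, b, rcond=None)
          return dm                      # slowness adjustment per cell

      rng = np.random.default_rng(1)     # tiny hypothetical example: 5 rays, 4 cells
      G = rng.random((5, 4))
      true_dm = np.array([0.01, -0.02, 0.0, 0.015])
      print(tomography_update(G, G @ true_dm, damping=0.01))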

  14. Computational Approaches to Simulation and Optimization of Global Aircraft Trajectories

    NASA Technical Reports Server (NTRS)

    Ng, Hok Kwan; Sridhar, Banavar

    2016-01-01

    This study examines three possible approaches to improving the speed of generating wind-optimal routes for air traffic at the national or global level. They are: (a) using the resources of a supercomputer, (b) running the computations on multiple commercially available computers, and (c) implementing the same algorithms in NASA's Future ATM Concepts Evaluation Tool (FACET); each is compared to a standard implementation run on a single CPU. Wind-optimal aircraft trajectories are computed using global air traffic schedules. The run time and wait time on the supercomputer for trajectory optimization using various numbers of CPUs, ranging from 80 to 10,240 units, are compared with the total computational time for running the same computation on a single desktop computer and on multiple commercially available computers for potential computational enhancement through parallel processing on the computer clusters. This study also re-implements the trajectory optimization algorithm for further reduction of computational time through algorithm modifications and integrates it with FACET to facilitate the use of the new features, which calculate time-optimal routes between worldwide airport pairs in a wind field for use with existing FACET applications. The implementations of the trajectory optimization algorithms use the MATLAB, Python, and Java programming languages. The performance evaluations are done by comparing their computational efficiencies and are based on the potential application of optimized trajectories. The paper shows that, in the absence of special privileges on a supercomputer, a cluster of commercially available computers provides a feasible approach for national and global air traffic system studies.
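
    Because each airport pair is an independent optimization, the parallelization pattern behind options (a) and (b) is essentially a map over pairs. The sketch below shows that pattern with a placeholder cost function; the airport list and cost are hypothetical, and the real trajectory optimizer is not reproduced.

      from itertools import combinations
      from multiprocessing import Pool

      AIRPORTS = ["KSFO", "KJFK", "EGLL", "RJTT", "YSSY"]   # hypothetical pair list

      def wind_optimal_route(pair):
          """Stand-in for optimizing one airport pair.

          A real implementation would integrate aircraft dynamics through a
          gridded wind field; a dummy cost keeps the sketch self-contained.
          """
          origin, dest = pair
          cost = sum(ord(c) for c in origin + dest) * 1e-3   # placeholder
          return origin, dest, cost

      if __name__ == "__main__":
          pairs = list(combinations(AIRPORTS, 2))
          with Pool(processes=4) as pool:        # independent pairs map onto workers
              for origin, dest, cost in pool.map(wind_optimal_route, pairs):
                  print(f"{origin}->{dest}: cost {cost:.3f}")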

  15. Virtualizing Super-Computation On-Board Uas

    NASA Astrophysics Data System (ADS)

    Salami, E.; Soler, J. A.; Cuadrado, R.; Barrado, C.; Pastor, E.

    2015-04-01

    Unmanned aerial systems (UAS, also known as UAV, RPAS or drones) have great potential to support a wide variety of aerial remote sensing applications. Most UAS work by acquiring data using on-board sensors for later post-processing. Some require the data gathered to be downlinked to the ground in real time. However, depending on the volume of data and the cost of the communications, this latter option is not sustainable in the long term. This paper develops the concept of virtualizing super-computation on-board UAS, as a method to ease the operation by facilitating the downlink of high-level information products instead of raw data. Exploiting recent developments in miniaturized multi-core devices is the way to speed up on-board computation. This hardware must satisfy size, power and weight constraints. Several technologies are appearing with promising results for high-performance computing on unmanned platforms, such as the 36 cores of the TILE-Gx36 by Tilera (now EZchip) or the 64 cores of the Epiphany-IV by Adapteva. The strategy for virtualizing super-computation on-board includes benchmarking for hardware selection, the software architecture, and communications-aware design. A parallelization strategy is given for the 36-core TILE-Gx36 for a UAS in a fire mission or in similar target-detection applications. The results are obtained for payload image processing algorithms and determine in real time the data snapshot to gather and transfer to the ground according to the needs of the mission, the processing time, and the consumed watts.

  16. Accessing NASA Technology with the World Wide Web

    NASA Technical Reports Server (NTRS)

    Nelson, Michael L.; Bianco, David J.

    1995-01-01

    NASA Langley Research Center (LaRC) began using the World Wide Web (WWW) in the summer of 1993, becoming the first NASA installation to provide a Center-wide home page. This coincided with a reorganization of LaRC to provide a more concentrated focus on technology transfer to both aerospace and non-aerospace industry. Use of WWW and NCSA Mosaic not only provides automated information dissemination, but also allows for the implementation, evolution and integration of many technology transfer and technology awareness applications. This paper describes several of these innovative applications, including the on-line presentation of the entire Technology OPportunities Showcase (TOPS), an industrial partnering showcase that exists on the Web long after the actual 3-day event ended. The NASA Technical Report Server (NTRS) provides uniform access to many logically similar, yet physically distributed NASA report servers. WWW is also the foundation of the Langley Software Server (LSS), an experimental software distribution system which will distribute LaRC-developed software. In addition to the more formal technology distribution projects, WWW has been successful in connecting people with technologies and people with other people.

  17. OpenMP Performance on the Columbia Supercomputer

    NASA Technical Reports Server (NTRS)

    Haoqiang, Jin; Hood, Robert

    2005-01-01

    This presentation discusses the Columbia supercomputer, one of the world's fastest supercomputers, providing 61 TFLOPs (10/20/04). It was conceived, designed, built, and deployed in just 120 days as a 20-node supercomputer built on proven 512-processor nodes. It is the largest SGI system in the world, with over 10,000 Intel Itanium 2 processors, and provides the largest node size incorporating commodity parts (512) and the largest shared-memory environment (2048); with 88% efficiency it tops the scalar systems on the Top500 list.

  18. Grasping Reality Through Illusion: Interactive Graphics Serving Science

    DTIC Science & Technology

    1988-03-01

    Applies interactive graphics techniques to the enhancement of scientific computing; StarTours at Disneyland shows how stunningly far we have come. We need supercomputer matching and steering tools, and such tools must be universal and application...

  19. The Computer Simulation of Liquids by Molecular Dynamics.

    ERIC Educational Resources Information Center

    Smith, W.

    1987-01-01

    Proposes a mathematical computer model for the behavior of liquids using the classical dynamic principles of Sir Isaac Newton and the molecular dynamics method invented by other scientists. Concludes that other applications will be successful using supercomputers to go beyond simple Newtonian physics. (CW)

  20. HACC: Extreme Scaling and Performance Across Diverse Architectures

    NASA Astrophysics Data System (ADS)

    Habib, Salman; Morozov, Vitali; Frontiere, Nicholas; Finkel, Hal; Pope, Adrian; Heitmann, Katrin

    2013-11-01

    Supercomputing is evolving towards hybrid and accelerator-based architectures with millions of cores. The HACC (Hardware/Hybrid Accelerated Cosmology Code) framework exploits this diverse landscape at the largest scales of problem size, obtaining high scalability and sustained performance. Developed to satisfy the science requirements of cosmological surveys, HACC melds particle and grid methods using a novel algorithmic structure that flexibly maps across architectures, including CPU/GPU, multi/many-core, and Blue Gene systems. We demonstrate the success of HACC on two very different machines, the CPU/GPU system Titan and the BG/Q systems Sequoia and Mira, attaining unprecedented levels of scalable performance. We demonstrate strong and weak scaling on Titan, obtaining up to 99.2% parallel efficiency, evolving 1.1 trillion particles. On Sequoia, we reach 13.94 PFlops (69.2% of peak) and 90% parallel efficiency on 1,572,864 cores, with 3.6 trillion particles, the largest cosmological benchmark yet performed. HACC design concepts are applicable to several other supercomputer applications.

  1. Network issues for large mass storage requirements

    NASA Technical Reports Server (NTRS)

    Perdue, James

    1992-01-01

    File servers and supercomputing environments need high-performance networks to balance the I/O requirements seen in today's demanding computing scenarios. UltraNet is one solution which permits both high aggregate transfer rates and high task-to-task transfer rates, as demonstrated in actual tests. UltraNet provides this capability as both a server-to-server and server-to-client access network, giving the supercomputing center the following advantages: the highest-performance transport-level connections (up to 40 MBytes/sec effective rates); throughput matching that of the emerging high-performance disk technologies, such as RAID, parallel head transfer devices and software striping; support for standard network and file system applications using a SOCKETS-based application program interface, such as FTP, rcp, rdump, etc.; access to the Network File System (NFS) and large aggregate bandwidth for heavy NFS usage; access to a distributed, hierarchical data server capability using the DISCOS UniTree product; and support for file server solutions available from multiple vendors, including Cray, Convex, Alliant, FPS, IBM, and others.

  2. SiGN-SSM: open source parallel software for estimating gene networks with state space models.

    PubMed

    Tamada, Yoshinori; Yamaguchi, Rui; Imoto, Seiya; Hirose, Osamu; Yoshida, Ryo; Nagasaki, Masao; Miyano, Satoru

    2011-04-15

    SiGN-SSM is an open-source gene network estimation software package able to run in parallel on PCs and massively parallel supercomputers. The software estimates a state space model (SSM), a statistical dynamic model suitable for analyzing short time-series and/or replicated time-series gene expression profiles. SiGN-SSM implements a novel parameter constraint that is effective for stabilizing the estimated models. Also, by using a supercomputer, it is able to determine the gene network structure by a statistical permutation test in a practical time. SiGN-SSM is applicable not only to analyzing temporal regulatory dependencies between genes, but also to extracting differentially regulated genes from time-series expression profiles. SiGN-SSM is distributed under the GNU Affero General Public License (GNU AGPL) version 3 and can be downloaded at http://sign.hgc.jp/signssm/. Pre-compiled binaries for some architectures are available in addition to the source code. The pre-installed binaries are also available on the Human Genome Center supercomputer system. The online manual and the supplementary information for SiGN-SSM are available on our web site. tamada@ims.u-tokyo.ac.jp.
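
    For readers unfamiliar with state space models, the sketch below runs a minimal Kalman filter for a linear-Gaussian SSM, the class of model SiGN-SSM estimates; the parameters are random placeholders, and the parameter estimation and permutation-testing machinery of SiGN-SSM itself is not shown.

      import numpy as np

      def kalman_filter(y, F, H, Q, R, x0, P0):
          """Filtered state means for x_t = F x_{t-1} + w_t, y_t = H x_t + v_t."""
          x, P = x0, P0
          means = []
          for yt in y:
              x = F @ x                              # predict
              P = F @ P @ F.T + Q
              S = H @ P @ H.T + R                    # update
              K = P @ H.T @ np.linalg.inv(S)
              x = x + K @ (yt - H @ x)
              P = (np.eye(len(x)) - K @ H) @ P
              means.append(x.copy())
          return np.array(means)

      rng = np.random.default_rng(2)                 # 2 hidden states, 3 genes
      F = np.array([[0.9, 0.1], [0.0, 0.8]])
      H = rng.random((3, 2))
      Q, R = 0.01 * np.eye(2), 0.1 * np.eye(3)
      y = rng.random((10, 3))                        # 10 time points
      print(kalman_filter(y, F, H, Q, R, np.zeros(2), np.eye(2)).shape)   # (10, 2)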

  3. Transferring ecosystem simulation codes to supercomputers

    NASA Technical Reports Server (NTRS)

    Skiles, J. W.; Schulbach, C. H.

    1995-01-01

    Many ecosystem simulation computer codes have been developed in the last twenty-five years. This development took place initially on main-frame computers, then mini-computers, and more recently, on micro-computers and workstations. Supercomputing platforms (both parallel and distributed systems) have been largely unused, however, because of the perceived difficulty in accessing and using the machines. Also, significant differences in the system architectures of sequential, scalar computers and parallel and/or vector supercomputers must be considered. We have transferred a grassland simulation model (developed on a VAX) to a Cray Y-MP/C90. We describe porting the model to the Cray and the changes we made to exploit the parallelism in the application and improve code execution. The Cray executed the model 30 times faster than the VAX and 10 times faster than a Unix workstation. We achieved an additional speedup of 30 percent by using the compiler's vectoring and 'in-line' capabilities. The code runs at only about 5 percent of the Cray's peak speed because it ineffectively uses the vector and parallel processing capabilities of the Cray. We expect that by restructuring the code, it could execute an additional six to ten times faster.

  4. Cots Correlator Platform

    NASA Astrophysics Data System (ADS)

    Schaaf, Kjeld; Overeem, Ruud

    2004-06-01

    Moore’s law is best exploited by using consumer-market hardware. In particular, the gaming industry pushes the limit of processor performance, thus reducing the cost per raw flop even faster than Moore’s law predicts. Next to the cost benefits of Commercial-Off-The-Shelf (COTS) processing resources, there is a rapidly growing pool of experience in cluster-based processing. The typical Beowulf clusters of PCs are well known as supercomputers. Multiple examples exist of specialised cluster computers based on more advanced server nodes or even gaming stations. All these cluster machines build upon the same knowledge about cluster software management, scheduling, middleware libraries and mathematical libraries. In this study, we have integrated COTS processing resources and cluster nodes into a very high performance processing platform suitable for streaming data applications, in particular to implement a correlator. The required processing power for the correlator in modern radio telescopes is in the range of the larger supercomputers, which motivates the usage of supercomputer technology. Raw processing power is provided by graphical processors and is combined with an InfiniBand host bus adapter with integrated data stream handling logic. With this processing platform a scalable correlator can be built with continuously growing processing power at consumer-market prices.
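
    One common correlator organization, an FX pipeline, can be sketched as below: channelize each antenna stream with an FFT, then accumulate cross products for every antenna pair. The sample counts and channel width are hypothetical, delay and fringe tracking are omitted, and this is not necessarily the architecture used in the study above.

      import numpy as np

      def fx_correlate(signals, nchan=64):
          """Toy FX correlator for an array of antenna streams.

          signals : array (n_antennas, n_samples)
          Returns accumulated visibilities of shape (n_ant, n_ant, nchan).
          """
          n_ant, n_samp = signals.shape
          n_spectra = n_samp // nchan
          # F step: split each stream into blocks and Fourier transform them.
          spectra = np.fft.fft(signals[:, :n_spectra * nchan]
                               .reshape(n_ant, n_spectra, nchan), axis=2)
          # X step: accumulate cross-power for every antenna pair (incl. autos).
          vis = np.zeros((n_ant, n_ant, nchan), dtype=complex)
          for i in range(n_ant):
              for j in range(i, n_ant):
                  vis[i, j] = np.mean(spectra[i] * np.conj(spectra[j]), axis=0)
          return vis

      rng = np.random.default_rng(3)
      streams = rng.standard_normal((4, 4096))   # 4 hypothetical antennas
      print(fx_correlate(streams).shape)         # (4, 4, 64)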

  5. Most Social Scientists Shun Free Use of Supercomputers.

    ERIC Educational Resources Information Center

    Kiernan, Vincent

    1998-01-01

    Social scientists, who frequently complain that the federal government spends too little on them, are passing up what scholars in the physical and natural sciences see as the government's best give-aways: free access to supercomputers. Some social scientists say the supercomputers are difficult to use; others find desktop computers provide…

  6. A fault tolerant spacecraft supercomputer to enable a new class of scientific discovery

    NASA Technical Reports Server (NTRS)

    Katz, D. S.; McVittie, T. I.; Silliman, A. G., Jr.

    2000-01-01

    The goal of the Remote Exploration and Experimentation (REE) Project is to move supercomputing into space in a cost-effective manner and to allow the use of inexpensive, state-of-the-art, commercial-off-the-shelf components and subsystems in these space-based supercomputers.

  7. TOP500 Supercomputers for November 2003

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack

    2003-11-16

    22nd Edition of TOP500 List of World's Fastest Supercomputers Released MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 22nd edition of the TOP500 list of the world's fastest supercomputers was released today (November 16, 2003). The Earth Simulator supercomputer retains the number one position with its Linpack benchmark performance of 35.86 Tflop/s (''teraflops'' or trillions of calculations per second). It was built by NEC and installed last year at the Earth Simulator Center in Yokohama, Japan.

  8. Survey of new vector computers: The CRAY 1S from CRAY research; the CYBER 205 from CDC and the parallel computer from ICL - architecture and programming

    NASA Technical Reports Server (NTRS)

    Gentzsch, W.

    1982-01-01

    Problems which can arise with vector and parallel computers are discussed in a user-oriented context. Emphasis is placed on the algorithms used and the programming techniques adopted. Three recently developed supercomputers are examined and typical application examples are given in CRAY FORTRAN, CYBER 205 FORTRAN and DAP (distributed array processor) FORTRAN. The systems' performance is compared. The addition of parts of two N x N arrays is considered. The influence of the architecture on the algorithms and programming language is demonstrated. Numerical analysis of magnetohydrodynamic differential equations by an explicit difference method is illustrated, showing very good results for all three systems. The prognosis for supercomputer development is assessed.
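
    The worked example mentioned above, adding parts of two N x N arrays, contrasts a scalar loop with a whole-array (vector) formulation. The original examples were written in the FORTRAN dialects listed; the NumPy sketch below only illustrates the same contrast and is not taken from the paper.

      import numpy as np

      N = 128
      A, B = np.random.rand(N, N), np.random.rand(N, N)
      C = np.zeros((N, N))

      # Scalar formulation: explicit loops, one element at a time.
      def add_block_scalar(A, B, C, lo, hi):
          for i in range(lo, hi):
              for j in range(lo, hi):
                  C[i, j] = A[i, j] + B[i, j]
          return C

      # Vector formulation: the sub-block is a single array operation, the form
      # a vectorizing compiler (or NumPy) exploits.
      def add_block_vector(A, B, C, lo, hi):
          C[lo:hi, lo:hi] = A[lo:hi, lo:hi] + B[lo:hi, lo:hi]
          return C

      assert np.allclose(add_block_scalar(A, B, C.copy(), 32, 96),
                         add_block_vector(A, B, C.copy(), 32, 96))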

  9. Ultrascalable petaflop parallel supercomputer

    DOEpatents

    Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Chiu, George [Cross River, NY; Cipolla, Thomas M [Katonah, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Hall, Shawn [Pleasantville, NY; Haring, Rudolf A [Cortlandt Manor, NY; Heidelberger, Philip [Cortlandt Manor, NY; Kopcsay, Gerard V [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Salapura, Valentina [Chappaqua, NY; Sugavanam, Krishnan [Mahopac, NY; Takken, Todd [Brewster, NY]

    2010-07-20

    A massively parallel supercomputer of petaOPS scale includes node architectures based upon System-on-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing, including a torus network, a collective network, and a global asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. A DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.
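
    As a toy illustration of the torus interconnect mentioned above, the sketch below maps a node's 3D coordinates to its six nearest neighbours with wraparound links; the dimensions are arbitrary examples, not the machine's actual shape, and the code is not drawn from the patent itself.

      # Toy illustration of nearest-neighbour addressing on a 3D torus with
      # wraparound links. The dimensions are arbitrary examples.
      def torus_neighbors(coord, dims):
          """Return the six +/-1 neighbours of a node on a 3D torus."""
          x, y, z = coord
          X, Y, Z = dims
          return [
              ((x + 1) % X, y, z), ((x - 1) % X, y, z),
              (x, (y + 1) % Y, z), (x, (y - 1) % Y, z),
              (x, y, (z + 1) % Z), (x, y, (z - 1) % Z),
          ]

      print(torus_neighbors((0, 0, 7), (8, 8, 8)))   # wraps around in x, y and z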

  10. Vectorized program architectures for supercomputer-aided circuit design

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rizzoli, V.; Ferlito, M.; Neri, A.

    1986-01-01

    Vector processors (supercomputers) can be effectively employed in MIC or MMIC applications to solve problems of large numerical size such as broad-band nonlinear design or statistical design (yield optimization). In order to fully exploit the capabilities of vector hardware, any program architecture must be structured accordingly. This paper presents a possible approach to the "semantic" vectorization of microwave circuit design software. Speed-up factors of the order of 50 can be obtained on a typical vector processor (Cray X-MP) with respect to the most powerful scalar computers (CDC 7600), with cost reductions of more than one order of magnitude. This could broaden the horizon of microwave CAD techniques to include problems that are practically out of the reach of conventional systems.

  11. Parallel Computing and Model Evaluation for Environmental Systems: An Overview of the Supermuse and Frames Software Technologies

    EPA Science Inventory

    ERD’s Supercomputer for Model Uncertainty and Sensitivity Evaluation (SuperMUSE) is a key to enhancing quality assurance in environmental models and applications. Uncertainty analysis and sensitivity analysis remain critical, though often overlooked steps in the development and e...

  12. NASA Tech Briefs, November/December 1986, Special Edition

    NASA Technical Reports Server (NTRS)

    1986-01-01

    Topics: Computing: The View from NASA Headquarters; Earth Resources Laboratory Applications Software: Versatile Tool for Data Analysis; The Hypercube: Cost-Effective Supercomputing; Artificial Intelligence: Rendezvous with NASA; NASA's Ada Connection; COSMIC: NASA's Software Treasurehouse; Golden Oldies: Tried and True NASA Software; Computer Technical Briefs; NASA TU Services; Digital Fly-by-Wire.

  13. SuperComputers for Space Applications

    DTIC Science & Technology

    2005-07-13

    See also ADM001791, Potentially Disruptive Technologies and Their Impact in Space Programs, held in Marseille, France, on 4-6 July 2005. The original... "Performance Embedded Computing will allow Ambitious Space Science Investigation", Proc. First Symp. on Potentially Disruptive Technologies and Their Impact in Space Programs, 2005.

  14. Distributed user services for supercomputers

    NASA Technical Reports Server (NTRS)

    Sowizral, Henry A.

    1989-01-01

    User-service operations at supercomputer facilities are examined. The question is whether a single, possibly distributed, user-services organization could be shared by NASA's supercomputer sites in support of a diverse, geographically dispersed, user community. A possible structure for such an organization is identified as well as some of the technologies needed in operating such an organization.

  15. Implementation of the NAS Parallel Benchmarks in Java

    NASA Technical Reports Server (NTRS)

    Frumkin, Michael A.; Schultz, Matthew; Jin, Haoqiang; Yan, Jerry; Biegel, Bryan (Technical Monitor)

    2002-01-01

    Several features make Java an attractive choice for High Performance Computing (HPC). In order to gauge the applicability of Java to Computational Fluid Dynamics (CFD), we have implemented the NAS (NASA Advanced Supercomputing) Parallel Benchmarks in Java. The performance and scalability of the benchmarks point out the areas where improvement in Java compiler technology and in Java thread implementation would position Java closer to Fortran in the competition for CFD applications.

  16. Applications of Massive Mathematical Computations

    DTIC Science & Technology

    1990-04-01

    particles from the first principles of QCD. This problem is under intensive numerical study using special-purpose parallel supercomputers in... several places around the world. The method used here is Monte Carlo integration for fixed 3-D plus time lattices. Reliable results are still years... mathematical and theoretical physics, but its most promising applications are in the numerical realization of QCD computations. Our programs for the solution

  17. Demonstration of Cost-Effective, High-Performance Computing at Performance and Reliability Levels Equivalent to a 1994 Vector Supercomputer

    NASA Technical Reports Server (NTRS)

    Babrauckas, Theresa

    2000-01-01

    The Affordable High Performance Computing (AHPC) project demonstrated that high-performance computing based on a distributed network of computer workstations is a cost-effective alternative to vector supercomputers for running CPU- and memory-intensive design and analysis tools. The AHPC project created an integrated system called a Network Supercomputer. By connecting computer workstations through a network and utilizing the workstations when they are idle, the resulting distributed-workstation environment has the same performance and reliability levels as the Cray C90 vector supercomputer at less than 25 percent of the C90 cost. In fact, the cost comparison between a Cray C90 supercomputer and Sun workstations showed that the number of networked workstations equivalent to a C90 costs approximately 8 percent of the C90.

  18. Non-preconditioned conjugate gradient on cell and FPGA based hybrid supercomputer nodes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dubois, David H; Dubois, Andrew J; Boorman, Thomas M

    2009-01-01

    This work presents a detailed implementation of a double precision, non-preconditioned, Conjugate Gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture (TM) in conjunction with x86 Opteron (TM) processors from AMD. We implement a common Conjugate Gradient algorithm on a variety of systems to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, the SRC Computers, Inc. MAPStation SRC-6 FPGA-enhanced hybrid supercomputer, and an AMD Opteron-only system. In all hybrid implementations wall clock time is measured, including all transfer overhead and compute timings.
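
    For reference, the non-preconditioned Conjugate Gradient iteration the abstract refers to is shown below as a compact NumPy sketch for a symmetric positive definite system; this is the textbook algorithm, not the Cell/FPGA implementation benchmarked in the paper.

      # Textbook non-preconditioned Conjugate Gradient for a symmetric positive
      # definite matrix A; a reference sketch only, not the Cell/FPGA version.
      import numpy as np

      def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
          x = np.zeros_like(b)
          r = b - A @ x                       # initial residual
          p = r.copy()                        # initial search direction
          rs_old = r @ r
          for _ in range(max_iter):
              Ap = A @ p
              alpha = rs_old / (p @ Ap)       # step length
              x += alpha * p
              r -= alpha * Ap
              rs_new = r @ r
              if np.sqrt(rs_new) < tol:
                  break
              p = r + (rs_new / rs_old) * p   # update search direction
              rs_old = rs_new
          return x

      n = 100
      M = np.random.rand(n, n)
      A = M @ M.T + n * np.eye(n)             # small SPD test system
      b = np.random.rand(n)
      print(np.linalg.norm(A @ conjugate_gradient(A, b) - b))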

  19. Non-preconditioned conjugate gradient on cell and FPGA-based hybrid supercomputer nodes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dubois, David H; Dubois, Andrew J; Boorman, Thomas M

    2009-03-10

    This work presents a detailed implementation of a double precision, non-preconditioned, Conjugate Gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture (TM) in conjunction with x86 Opteron (TM) processors from AMD. We implement a common Conjugate Gradient algorithm on a variety of systems to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, the SRC Computers, Inc. MAPStation SRC-6 FPGA-enhanced hybrid supercomputer, and an AMD Opteron-only system. In all hybrid implementations wall clock time is measured, including all transfer overhead and compute timings.

  20. Validation of the diagnoses of panic disorder and phobic disorders in the US National Comorbidity Survey Replication Adolescent (NCS-A) supplement.

    PubMed

    Green, Jennifer Greif; Avenevoli, Shelli; Finkelman, Matthew; Gruber, Michael J; Kessler, Ronald C; Merikangas, Kathleen R; Sampson, Nancy A; Zaslavsky, Alan M

    2011-06-01

    Validity of the adolescent version of the World Health Organization Composite International Diagnostic Interview (CIDI) Version 3.0, a fully-structured research diagnostic interview designed to be used by trained lay interviewers, is assessed in comparison to independent clinical diagnoses based on the Schedule for Affective Disorders and Schizophrenia for School-age Children (K-SADS). This assessment is carried out in the clinical reappraisal sub-sample (n = 347) of the US National Comorbidity Survey Adolescent (NCS-A) supplement, a large (n = 10,148) community epidemiological survey of the prevalence and correlates of adolescent mental disorders in the United States. The diagnoses considered are panic disorder and phobic disorders (social phobia, specific phobia, agoraphobia). CIDI diagnoses are found to have good concordance with K-SADS diagnoses [area under the receiver operating characteristic curve (AUC) = 0.81-0.94], although the CIDI diagnoses are consistently somewhat higher than the K-SADS diagnoses. Data are also presented on criterion-level concordance in an effort to pinpoint CIDI question series that might be improved in future modifications of the instrument. Finally, data are presented on the factor structure of the fears associated with social phobia, the only disorder in this series where substantial controversy exists about disorder subtypes. Copyright © 2011 John Wiley & Sons, Ltd.

  1. High-Performance Computing: Industry Uses of Supercomputers and High-Speed Networks. Report to Congressional Requesters.

    ERIC Educational Resources Information Center

    General Accounting Office, Washington, DC. Information Management and Technology Div.

    This report was prepared in response to a request for information on supercomputers and high-speed networks from the Senate Committee on Commerce, Science, and Transportation, and the House Committee on Science, Space, and Technology. The following information was requested: (1) examples of how various industries are using supercomputers to…

  2. Supercomputer Provides Molecular Insight into Cellulose (Fact Sheet)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    2011-02-01

    Groundbreaking research at the National Renewable Energy Laboratory (NREL) has used supercomputing simulations to calculate the work that enzymes must do to deconstruct cellulose, which is a fundamental step in biomass conversion technologies for biofuels production. NREL used the new high-performance supercomputer Red Mesa to conduct several million central processing unit (CPU) hours of simulation.

  3. The Centre of High-Performance Scientific Computing, Geoverbund, ABC/J - Geosciences enabled by HPSC

    NASA Astrophysics Data System (ADS)

    Kollet, Stefan; Görgen, Klaus; Vereecken, Harry; Gasper, Fabian; Hendricks-Franssen, Harrie-Jan; Keune, Jessica; Kulkarni, Ketan; Kurtz, Wolfgang; Sharples, Wendy; Shrestha, Prabhakar; Simmer, Clemens; Sulis, Mauro; Vanderborght, Jan

    2016-04-01

    The Centre of High-Performance Scientific Computing (HPSC TerrSys) was founded in 2011 to establish a centre of competence in high-performance scientific computing in terrestrial systems and the geosciences, enabling fundamental and applied geoscientific research in the Geoverbund ABC/J (the geoscientific research alliance of the Universities of Aachen, Cologne, and Bonn and the Research Centre Jülich, Germany). The specific goals of HPSC TerrSys are to achieve relevance at the national and international level in (i) the development and application of HPSC technologies in the geoscientific community; (ii) student education; (iii) HPSC services and support, also to the wider geoscientific community; and (iv) the industry and public sectors via, e.g., useful applications and data products. A key feature of HPSC TerrSys is the Simulation Laboratory Terrestrial Systems, which is located at the Jülich Supercomputing Centre (JSC) and provides extensive capabilities with respect to porting, profiling, tuning and performance monitoring of geoscientific software in JSC's supercomputing environment. We will present a summary of success stories of HPSC applications, including integrated terrestrial model development, parallel profiling and its application from watersheds to the continent; massively parallel data assimilation using physics-based models and ensemble methods; quasi-operational terrestrial water and energy monitoring; and convection-permitting climate simulations over Europe. The success stories stress the need for a formalized education of students in the application of HPSC technologies in the future.

  4. High Performance Molecular Visualization: In-Situ and Parallel Rendering with EGL.

    PubMed

    Stone, John E; Messmer, Peter; Sisneros, Robert; Schulten, Klaus

    2016-05-01

    Large scale molecular dynamics simulations produce terabytes of data that is impractical to transfer to remote facilities. It is therefore necessary to perform visualization tasks in-situ as the data are generated, or by running interactive remote visualization sessions and batch analyses co-located with direct access to high performance storage systems. A significant challenge for deploying visualization software within clouds, clusters, and supercomputers involves the operating system software required to initialize and manage graphics acceleration hardware. Recently, it has become possible for applications to use the Embedded-system Graphics Library (EGL) to eliminate the requirement for windowing system software on compute nodes, thereby eliminating a significant obstacle to broader use of high performance visualization applications. We outline the potential benefits of this approach in the context of visualization applications used in the cloud, on commodity clusters, and supercomputers. We discuss the implementation of EGL support in VMD, a widely used molecular visualization application, and we outline benefits of the approach for molecular visualization tasks on petascale computers, clouds, and remote visualization servers. We then provide a brief evaluation of the use of EGL in VMD, with tests using developmental graphics drivers on conventional workstations and on Amazon EC2 G2 GPU-accelerated cloud instance types. We expect that the techniques described here will be of broad benefit to many other visualization applications.

  5. High Performance Molecular Visualization: In-Situ and Parallel Rendering with EGL

    PubMed Central

    Stone, John E.; Messmer, Peter; Sisneros, Robert; Schulten, Klaus

    2016-01-01

    Large scale molecular dynamics simulations produce terabytes of data that is impractical to transfer to remote facilities. It is therefore necessary to perform visualization tasks in-situ as the data are generated, or by running interactive remote visualization sessions and batch analyses co-located with direct access to high performance storage systems. A significant challenge for deploying visualization software within clouds, clusters, and supercomputers involves the operating system software required to initialize and manage graphics acceleration hardware. Recently, it has become possible for applications to use the Embedded-system Graphics Library (EGL) to eliminate the requirement for windowing system software on compute nodes, thereby eliminating a significant obstacle to broader use of high performance visualization applications. We outline the potential benefits of this approach in the context of visualization applications used in the cloud, on commodity clusters, and supercomputers. We discuss the implementation of EGL support in VMD, a widely used molecular visualization application, and we outline benefits of the approach for molecular visualization tasks on petascale computers, clouds, and remote visualization servers. We then provide a brief evaluation of the use of EGL in VMD, with tests using developmental graphics drivers on conventional workstations and on Amazon EC2 G2 GPU-accelerated cloud instance types. We expect that the techniques described here will be of broad benefit to many other visualization applications. PMID:27747137

  6. GREEN SUPERCOMPUTING IN A DESKTOP BOX

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    HSU, CHUNG-HSING; FENG, WU-CHUN; CHING, AVERY

    2007-01-17

    The computer workstation, introduced by Sun Microsystems in 1982, was the tool of choice for scientists and engineers as an interactive computing environment for the development of scientific codes. However, by the mid-1990s, the performance of workstations began to lag behind high-end commodity PCs. This, coupled with the disappearance of BSD-based operating systems in workstations and the emergence of Linux as an open-source operating system for PCs, arguably led to the demise of the workstation as we knew it. Around the same time, computational scientists started to leverage PCs running Linux to create a commodity-based (Beowulf) cluster that provided dedicated computer cycles, i.e., supercomputing for the rest of us, as a cost-effective alternative to large supercomputers, i.e., supercomputing for the few. However, as the cluster movement has matured, with respect to cluster hardware and open-source software, these clusters have become much more like their large-scale supercomputing brethren - a shared (and power-hungry) datacenter resource that must reside in a machine-cooled room in order to operate properly. Consequently, the above observations, when coupled with the ever-increasing performance gap between the PC and cluster supercomputer, provide the motivation for a 'green' desktop supercomputer - a turnkey solution that provides an interactive and parallel computing environment with the approximate form factor of a Sun SPARCstation 1 'pizza box' workstation. In this paper, they present the hardware and software architecture of such a solution as well as its prowess as a developmental platform for parallel codes. In short, imagine a 12-node personal desktop supercomputer that achieves 14 Gflops on Linpack but sips only 185 watts of power at load, resulting in a performance-power ratio that is over 300% better than their reference SMP platform.
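
    The quoted figures imply roughly 76 Mflops per watt; a quick check of that arithmetic (the values are taken directly from the abstract):

      # Quick check of the performance-per-watt figure implied by the abstract.
      linpack_gflops = 14.0   # quoted sustained Linpack performance
      power_watts = 185.0     # quoted power draw at load
      print(f"{1000 * linpack_gflops / power_watts:.1f} Mflops per watt")   # ~75.7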

  7. Use of high performance networks and supercomputers for real-time flight simulation

    NASA Technical Reports Server (NTRS)

    Cleveland, Jeff I., II

    1993-01-01

    In order to meet the stringent time-critical requirements for real-time man-in-the-loop flight simulation, computer processing operations must be consistent in processing time and be completed in as short a time as possible. These operations include simulation mathematical model computation and data input/output to the simulators. In 1986, in response to increased demands for flight simulation performance, NASA's Langley Research Center (LaRC), working with the contractor, developed extensions to the Computer Automated Measurement and Control (CAMAC) technology which resulted in a factor of ten increase in the effective bandwidth and reduced latency of modules necessary for simulator communication. This technology extension is being used by more than 80 leading technological developers in the United States, Canada, and Europe. Included among the commercial applications are nuclear process control, power grid analysis, process monitoring, real-time simulation, and radar data acquisition. Personnel at LaRC are completing the development of the use of supercomputers for mathematical model computation to support real-time flight simulation. This includes the development of a real-time operating system and development of specialized software and hardware for the simulator network. This paper describes the data acquisition technology and the development of supercomputing for flight simulation.

  8. Using a multifrontal sparse solver in a high performance, finite element code

    NASA Technical Reports Server (NTRS)

    King, Scott D.; Lucas, Robert; Raefsky, Arthur

    1990-01-01

    We consider the performance of the finite element method on a vector supercomputer. The computationally intensive parts of the finite element method are typically the individual element forms and the solution of the global stiffness matrix, both of which are vectorized in high performance codes. To further increase throughput, new algorithms are needed. We compare a multifrontal sparse solver to a traditional skyline solver in a finite element code on a vector supercomputer. The multifrontal solver uses the Multiple-Minimum Degree reordering heuristic to reduce the number of operations required to factor a sparse matrix and full matrix computational kernels (e.g., BLAS3) to enhance vector performance. The net result is an order-of-magnitude reduction in run time for a finite element application on one processor of a Cray X-MP.
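
    To make the contrast concrete, the sketch below solves a banded, finite-element-like system with a sparse direct factorization and checks the result against a dense solve. It uses SciPy's SuperLU, which is supernodal rather than multifrontal, so it only illustrates the general idea of exploiting sparsity, not the solvers compared in the paper.

      # Illustrative comparison of a sparse direct factorization against a dense
      # solve on a banded, finite-element-like system. SciPy's SuperLU is
      # supernodal rather than multifrontal, so this only sketches the idea.
      import numpy as np
      import scipy.sparse as sp
      import scipy.sparse.linalg as spla

      n = 2000
      # 1D Laplacian-style stiffness matrix (tridiagonal, symmetric positive definite)
      A = sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
      b = np.ones(n)

      x_sparse = spla.splu(A).solve(b)              # sparse LU with fill-reducing ordering
      x_dense = np.linalg.solve(A.toarray(), b)     # dense reference solve
      print(np.max(np.abs(x_sparse - x_dense)))     # should be near machine precision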

  9. The NEST Dry-Run Mode: Efficient Dynamic Analysis of Neuronal Network Simulation Code.

    PubMed

    Kunkel, Susanne; Schenck, Wolfram

    2017-01-01

    NEST is a simulator for spiking neuronal networks that commits to a general purpose approach: It allows for high flexibility in the design of network models, and its applications range from small-scale simulations on laptops to brain-scale simulations on supercomputers. Hence, developers need to test their code for various use cases and ensure that changes to code do not impair scalability. However, running a full set of benchmarks on a supercomputer takes up precious compute-time resources and can entail long queuing times. Here, we present the NEST dry-run mode, which enables comprehensive dynamic code analysis without requiring access to high-performance computing facilities. A dry-run simulation is carried out by a single process, which performs all simulation steps except communication as if it was part of a parallel environment with many processes. We show that measurements of memory usage and runtime of neuronal network simulations closely match the corresponding dry-run data. Furthermore, we demonstrate the successful application of the dry-run mode in the areas of profiling and performance modeling.
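
    The dry-run idea can be pictured with a toy sketch in which a single process sizes and steps through only its own share of the work as if many ranks existed, with communication stubbed out. This is a conceptual illustration under assumed numbers (e.g., bytes per neuron), not NEST's actual implementation or API.

      # Conceptual sketch of a "dry run": one process handles only its own share of
      # the work as if n_virtual_ranks existed, with communication stubbed out.
      # Illustration only; the bytes-per-neuron figure is an arbitrary assumption.
      def dry_run(n_neurons, n_virtual_ranks, my_rank=0, steps=10):
          # round-robin distribution: which neurons would live on this rank
          local = [gid for gid in range(n_neurons) if gid % n_virtual_ranks == my_rank]
          est_memory = len(local) * 1.2e3            # assumed bytes per local neuron
          for _ in range(steps):
              for _gid in local:
                  pass                               # local state update would go here
              # communication with other ranks is deliberately skipped in dry-run mode
          return {"local_neurons": len(local), "est_memory_bytes": est_memory}

      print(dry_run(n_neurons=1_000_000, n_virtual_ranks=1024))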

  10. The NEST Dry-Run Mode: Efficient Dynamic Analysis of Neuronal Network Simulation Code

    PubMed Central

    Kunkel, Susanne; Schenck, Wolfram

    2017-01-01

    NEST is a simulator for spiking neuronal networks that commits to a general purpose approach: It allows for high flexibility in the design of network models, and its applications range from small-scale simulations on laptops to brain-scale simulations on supercomputers. Hence, developers need to test their code for various use cases and ensure that changes to code do not impair scalability. However, running a full set of benchmarks on a supercomputer takes up precious compute-time resources and can entail long queuing times. Here, we present the NEST dry-run mode, which enables comprehensive dynamic code analysis without requiring access to high-performance computing facilities. A dry-run simulation is carried out by a single process, which performs all simulation steps except communication as if it was part of a parallel environment with many processes. We show that measurements of memory usage and runtime of neuronal network simulations closely match the corresponding dry-run data. Furthermore, we demonstrate the successful application of the dry-run mode in the areas of profiling and performance modeling. PMID:28701946

  11. ArcticDEM Validation and Accuracy Assessment

    NASA Astrophysics Data System (ADS)

    Candela, S. G.; Howat, I.; Noh, M. J.; Porter, C. C.; Morin, P. J.

    2017-12-01

    ArcticDEM comprises a growing inventory of Digital Elevation Models (DEMs) covering all land above 60°N. As of August 2017, ArcticDEM had openly released 2-m resolution individual DEMs covering over 51 million km2, which includes areas of repeat coverage for change detection, as well as over 15 million km2 of 5-m resolution seamless mosaics. By the end of the project, over 80 million km2 of 2-m DEMs will be produced, averaging four repeats of the 20 million km2 Arctic landmass. ArcticDEM is produced from sub-meter resolution, stereoscopic imagery using open source software (SETSM) on the NCSA Blue Waters supercomputer. These DEMs have known biases of several meters due to errors in the sensor models generated from satellite positioning. These systematic errors are removed through three-dimensional registration to high-precision lidar or other control datasets. ArcticDEM is registered to seasonally-subsetted ICESat elevations due to its global coverage and high reported accuracy (~10 cm). The vertical accuracy of ArcticDEM is then obtained from the statistics of the fit to the ICESat point cloud, which averages -0.01 m ± 0.07 m. ICESat, however, has a relatively coarse measurement footprint (~70 m), which may impact the precision of the registration. Further, the ICESat data predate the ArcticDEM imagery by a decade, so that temporal changes in the surface may also impact the registration. Finally, biases may exist between the different sensors in the ArcticDEM constellation. Here we assess the accuracy of ArcticDEM and the ICESat registration through comparison to multiple high-resolution airborne lidar datasets that were acquired within one year of the imagery used in ArcticDEM. We find the ICESat dataset is performing as anticipated, introducing no systematic bias during the coregistration process and reducing vertical errors to within the uncertainty of the airborne lidars. Preliminary sensor comparisons show no significant difference post coregistration, suggesting that there is no sensor bias between platforms and that all data are suitable for analysis without further correction. Here we will present accuracy assessments, observations and comparisons over diverse terrain in parts of Alaska and Greenland.
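
    The vertical-accuracy statistic quoted above (mean ± standard deviation of DEM-minus-reference elevation differences at control points) can be sketched as follows; the elevation arrays here are synthetic placeholders, not ArcticDEM or ICESat data.

      # Sketch of the vertical-accuracy statistic quoted above: mean and standard
      # deviation of DEM-minus-reference elevation differences at control points.
      # The elevation arrays are synthetic placeholders.
      import numpy as np

      rng = np.random.default_rng(1)
      ref_elev = rng.uniform(0, 500, size=10_000)               # e.g. lidar/ICESat control points
      dem_elev = ref_elev + rng.normal(-0.01, 0.07, 10_000)     # DEM with small bias and noise

      dz = dem_elev - ref_elev
      print(f"vertical bias = {dz.mean():+.3f} m +/- {dz.std():.3f} m")

      # a simple vertical coregistration step subtracts the bias before analysis
      dem_coregistered = dem_elev - dz.mean()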

  12. DataHub: Knowledge-based data management for data discovery

    NASA Astrophysics Data System (ADS)

    Handley, Thomas H.; Li, Y. Philip

    1993-08-01

    Currently available database technology is largely designed for business data-processing applications and seems inadequate for scientific applications. The research described in this paper, the DataHub, will address the issues associated with this shortfall in technology utilization and development. The DataHub development is addressing the key issues in scientific data management of scientific database models and resource sharing in a geographically distributed, multi-disciplinary, science research environment. Thus, the DataHub will be a server between the data suppliers and data consumers to facilitate data exchanges, to assist science data analysis, and to provide a systematic approach for science data management. More specifically, the DataHub's objectives are to provide support for (1) exploratory data analysis (i.e., data-driven analysis); (2) data transformations; (3) data semantics capture and usage; (4) analysis-related knowledge capture and usage; and (5) data discovery, ingestion, and extraction. Applying technologies that range from deductive databases, semantic data models, data discovery, knowledge representation and inferencing, and exploratory data analysis techniques to modern man-machine interfaces, DataHub will provide a prototype, integrated environment to support research scientists' needs in multiple disciplines (i.e., oceanography, geology, and atmospheric science) while addressing the more general science data management issues. Additionally, the DataHub will provide data management services to exploratory data analysis applications such as LinkWinds and NCSA's XIMAGE.

  13. Cyberinfrastructure for Atmospheric Discovery

    NASA Astrophysics Data System (ADS)

    Wilhelmson, R.; Moore, C. W.

    2004-12-01

    Each year across the United States, floods, tornadoes, hail, strong winds, lightning, hurricanes, and winter storms cause hundreds of deaths, routinely disrupt transportation and commerce, and result in billions of dollars in annual economic losses. MEAD and LEAD are two recent efforts aimed at developing the cyberinfrastructure for studying and forecasting these events through collection, integration, and analysis of observational data coupled with numerical simulation, data mining, and visualization. MEAD (Modeling Environment for Atmospheric Discovery) has been funded for two years as an NCSA (National Center for Supercomputing Applications) Alliance Expedition. The goal of this expedition has been the development and adaptation of cyberinfrastructure that will enable research simulations, data mining, machine learning and visualization of hurricanes and storms utilizing high performance computing environments including the TeraGrid. Portal, grid, and web infrastructure are being tested that will enable launching of hundreds of individual WRF (Weather Research and Forecasting) simulations. In a similar way, multiple Regional Ocean Modeling System (ROMS) or WRF/ROMS simulations can be carried out. Metadata and the resulting large volumes of data will then be made available for further study and for educational purposes using analysis, mining, and visualization services. Initial coupling of the ROMS and WRF codes has been completed and parallel I/O is being implemented for these models. Management of these activities (services) is being enabled through Grid workflow technologies (e.g. OGCE). LEAD (Linked Environments for Atmospheric Discovery) is a recently funded 5-year, large NSF ITR grant that involves 9 institutions developing a comprehensive national cyberinfrastructure in mesoscale meteorology, particularly one that can interoperate with others being developed. LEAD is addressing the fundamental information technology (IT) research challenges needed to create an integrated, scalable cyberinfrastructure for identifying, accessing, preparing, assimilating, predicting, managing, analyzing, mining, and visualizing a broad array of meteorological data and model output, independent of format and physical location. A transforming element of LEAD is Workflow Orchestration for On-Demand, Real-Time, Dynamically-Adaptive Systems (WOORDS), which allows the use of analysis tools, forecast models, and data repositories as dynamically adaptive, on-demand, Grid-enabled systems that can (a) change configuration rapidly and automatically in response to weather; (b) continually be steered by new data; (c) respond to decision-driven inputs from users; (d) initiate other processes automatically; and (e) steer remote observing technologies to optimize data collection for the problem at hand. Although LEAD efforts are primarily directed at mesoscale meteorology, the IT services being developed have general applicability to other geoscience and environmental science domains. Integration of traditional and new data sources is a crucial component in LEAD for data analysis and assimilation, for integration (ensemble mining) of data from sets of simulations, and for comparing results to observational data. As part of the integration effort, LEAD is creating a myLEAD metadata catalog service: a personal metacatalog that extends the Globus MCS system and is built on top of the OGSA-DAI system developed at the National e-Science Center in Edinburgh, Scotland.

  14. Implementing Scientific Simulation Codes Highly Tailored for Vector Architectures Using Custom Configurable Computing Machines

    NASA Technical Reports Server (NTRS)

    Rutishauser, David

    2006-01-01

    The motivation for this work comes from an observation that amidst the push for Massively Parallel (MP) solutions to high-end computing problems such as numerical physical simulations, large amounts of legacy code exist that are highly optimized for vector supercomputers. Because re-hosting legacy code often requires a complete re-write of the original code, which can be a very long and expensive effort, this work examines the potential to exploit reconfigurable computing machines in place of a vector supercomputer to implement an essentially unmodified legacy source code. Custom and reconfigurable computing resources could be used to emulate an original application's target platform to the extent required to achieve high performance. To arrive at an architecture that delivers the desired performance subject to limited resources involves solving a multi-variable optimization problem with constraints. Prior research in the area of reconfigurable computing has demonstrated that designing an optimum hardware implementation of a given application under hardware resource constraints is an NP-complete problem. The premise of the approach is that the general issue of applying reconfigurable computing resources to the implementation of an application, maximizing the performance of the computation subject to physical resource constraints, can be made a tractable problem by assuming a computational paradigm, such as vector processing. This research contributes a formulation of the problem and a methodology to design a reconfigurable vector processing implementation of a given application that satisfies a performance metric. A generic, parametric, architectural framework for vector processing implemented in reconfigurable logic is developed as a target for a scheduling/mapping algorithm that maps an input computation to a given instance of the architecture. This algorithm is integrated with an optimization framework to arrive at a specification of the architecture parameters that attempts to minimize execution time, while staying within resource constraints. The flexibility of using a custom reconfigurable implementation is exploited in a unique manner to leverage the lessons learned in vector supercomputer development. The vector processing framework is tailored to the application, with variable parameters that are fixed in traditional vector processing. Benchmark data that demonstrates the functionality and utility of the approach is presented. The benchmark data includes an identified bottleneck in a real case study example vector code, the NASA Langley Terminal Area Simulation System (TASS) application.

  15. Kriging for Spatial-Temporal Data on the Bridges Supercomputer

    NASA Astrophysics Data System (ADS)

    Hodgess, E. M.

    2017-12-01

    Currently, kriging of spatial-temporal data is slow and limited to relatively small vector sizes. We have developed a method on the Bridges supercomputer at the Pittsburgh Supercomputing Center that uses a combination of R, Fortran, the Message Passing Interface (MPI), OpenACC, and special R packages for big data. This combination of tools now permits us to complete tasks that previously could not be completed, or that took literally hours to complete. We ran simulation studies from a laptop against the supercomputer. We also look at "real world" data sets, such as the Irish wind data and some weather data, and compare the timings. We note that the timings are surprisingly good.
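
    The abstract does not describe the algorithm itself, but the core of simple kriging is a covariance solve for interpolation weights. Below is a minimal NumPy sketch under an assumed exponential covariance model; it is purely illustrative and unrelated to the R/Fortran/MPI/OpenACC workflow on Bridges.

      # Minimal simple-kriging sketch with an assumed exponential covariance model.
      # Purely illustrative; not the Bridges implementation described above.
      import numpy as np

      def exp_cov(h, sill=1.0, corr_range=10.0):
          return sill * np.exp(-np.abs(h) / corr_range)

      def simple_krige(obs_xy, obs_val, grid_xy, mean=0.0):
          d_oo = np.linalg.norm(obs_xy[:, None] - obs_xy[None, :], axis=2)   # obs-obs distances
          d_og = np.linalg.norm(obs_xy[:, None] - grid_xy[None, :], axis=2)  # obs-grid distances
          K = exp_cov(d_oo) + 1e-10 * np.eye(len(obs_xy))                    # jitter for stability
          weights = np.linalg.solve(K, exp_cov(d_og))                        # kriging weights
          return mean + weights.T @ (obs_val - mean)

      rng = np.random.default_rng(3)
      obs_xy = rng.random((50, 2)) * 100
      obs_val = np.sin(obs_xy[:, 0] / 15) + 0.1 * rng.normal(size=50)
      grid_xy = rng.random((200, 2)) * 100
      print(simple_krige(obs_xy, obs_val, grid_xy).shape)                    # (200,)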

  16. Parallel workflow manager for non-parallel bioinformatic applications to solve large-scale biological problems on a supercomputer.

    PubMed

    Suplatov, Dmitry; Popova, Nina; Zhumatiy, Sergey; Voevodin, Vladimir; Švedas, Vytas

    2016-04-01

    Rapid expansion of online resources providing access to genomic, structural, and functional information associated with biological macromolecules opens an opportunity to gain a deeper understanding of the mechanisms of biological processes through systematic analysis of large datasets. This, however, requires novel strategies to optimally utilize computer processing power. Some methods in bioinformatics and molecular modeling require extensive computational resources. Other algorithms have fast implementations that take at most several hours to analyze a common input on a modern desktop station; however, because of multiple invocations for a large number of subtasks, the full task requires significant computing power. Therefore, an efficient computational solution to large-scale biological problems requires both a wise parallel implementation of resource-hungry methods and a smart workflow to manage multiple invocations of relatively fast algorithms. In this work, a new computer software tool, mpiWrapper, has been developed to accommodate non-parallel implementations of scientific algorithms within the parallel supercomputing environment. The Message Passing Interface has been implemented to exchange information between nodes. Two specialized threads - one for task management and communication, and another for subtask execution - are invoked on each processing unit to avoid deadlock while using blocking calls to MPI. The mpiWrapper can be used to launch all conventional Linux applications without the need to modify their original source codes and supports resubmission of subtasks on node failure. We show that this approach can be used to process huge amounts of biological data efficiently by running non-parallel programs in parallel mode on a supercomputer. The C++ source code and documentation are available from http://biokinet.belozersky.msu.ru/mpiWrapper.
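
    The master/worker task farm such a wrapper implements can be sketched with mpi4py as shown below. This is a simplified illustration: the real mpiWrapper is written in C++ and additionally runs a dedicated management thread and resubmits subtasks on node failure, none of which is shown here; the command lines and script name are placeholders.

      # Sketch of a master/worker task farm for running serial programs in parallel.
      # Simplified illustration only. Run with e.g.: mpiexec -n 4 python task_farm.py
      from mpi4py import MPI
      import subprocess

      TAG_TASK, TAG_DONE, TAG_STOP = 1, 2, 3

      def master(comm, tasks):
          status = MPI.Status()
          pending = list(enumerate(tasks))
          active = 0
          for rank in range(1, comm.Get_size()):        # seed every worker once
              if pending:
                  comm.send(pending.pop(0), dest=rank, tag=TAG_TASK)
                  active += 1
              else:
                  comm.send(None, dest=rank, tag=TAG_STOP)
          while active:                                 # keep workers busy until done
              result = comm.recv(source=MPI.ANY_SOURCE, tag=TAG_DONE, status=status)
              print("finished:", result)
              if pending:
                  comm.send(pending.pop(0), dest=status.Get_source(), tag=TAG_TASK)
              else:
                  comm.send(None, dest=status.Get_source(), tag=TAG_STOP)
                  active -= 1

      def worker(comm):
          status = MPI.Status()
          while True:
              task = comm.recv(source=0, status=status)
              if status.Get_tag() == TAG_STOP:
                  return
              idx, cmdline = task
              proc = subprocess.run(cmdline, shell=True)             # run the serial program
              comm.send((idx, proc.returncode), dest=0, tag=TAG_DONE)

      if __name__ == "__main__":
          comm = MPI.COMM_WORLD
          if comm.Get_rank() == 0:
              master(comm, [f"echo subtask {i}" for i in range(10)])
          else:
              worker(comm)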

  17. On the energy footprint of I/O management in Exascale HPC systems

    DOE PAGES

    Dorier, Matthieu; Yildiz, Orcun; Ibrahim, Shadi; ...

    2016-03-21

    The advent of unprecedentedly scalable yet energy hungry Exascale supercomputers poses a major challenge in sustaining a high performance-per-watt ratio. With I/O management acquiring a crucial role in supporting scientific simulations, various I/O management approaches have been proposed to achieve high performance and scalability. But, the details of how these approaches affect energy consumption have not been studied yet. Therefore, this paper aims to explore how much energy a supercomputer consumes while running scientific simulations when adopting various I/O management approaches. In particular, we closely examine three radically different I/O schemes including time partitioning, dedicated cores, and dedicated nodes. To accomplish this, we implement the three approaches within the Damaris I/O middleware and perform extensive experiments with one of the target HPC applications of the Blue Waters sustained-petaflop supercomputer project: the CM1 atmospheric model. Our experimental results obtained on the French Grid'5000 platform highlight the differences among these three approaches and illustrate in which way various configurations of the application and of the system can impact performance and energy consumption. Moreover, we propose and validate a mathematical model that estimates the energy consumption of a HPC simulation under different I/O approaches. This proposed model gives hints to pre-select the most energy-efficient I/O approach for a particular simulation on a particular HPC system and therefore provides a step towards energy-efficient HPC simulations in Exascale systems. To the best of our knowledge, our work provides the first in-depth look into the energy-performance tradeoffs of I/O management approaches.
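
    The abstract does not give the model, but an energy model of this kind can be sketched generically as below; the functional form, power figures, and timings are illustrative assumptions, not the coefficients or equations proposed in the paper.

      # Generic sketch of an energy model for a simulation with interleaved compute
      # and I/O phases. The functional form and all coefficients are assumptions.
      def estimated_energy(n_nodes, t_compute, t_io, p_busy=300.0, p_idle=120.0,
                           io_overlapped=False):
          """Joules per iteration across n_nodes under a toy two-phase model."""
          if io_overlapped:
              # dedicated cores/nodes: I/O hides behind compute, but the extra
              # resources stay powered for the whole (shorter) wall-clock time
              return n_nodes * p_busy * max(t_compute, t_io)
          # time partitioning: every node computes, then stalls while writing
          return n_nodes * (p_busy * t_compute + p_idle * t_io)

      print(estimated_energy(1024, t_compute=60.0, t_io=12.0))                      # partitioned
      print(estimated_energy(1024, t_compute=60.0, t_io=12.0, io_overlapped=True))  # overlapped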

  18. Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers

    DOE PAGES

    Wang, Bei; Ethier, Stephane; Tang, William; ...

    2017-06-29

    The Gyrokinetic Toroidal Code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5D Vlasov-Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size with sufficient resolution, new thread-level optimizations have been introduced as well as a key additional domain decomposition. GTC-P's multiple levels of parallelism, including inter-node 2D domain decomposition and particle decomposition, as well as intra-node shared memory partition and vectorization have enabled pushing the scalability of the PIC method to extreme computational scales. In this paper, we describe the methods developed to build a highly parallelized PIC code across a broad range of supercomputer designs. This particularly includes implementations on heterogeneous systems using NVIDIA GPU accelerators and Intel Xeon Phi (MIC) co-processors and performance comparisons with state-of-the-art homogeneous HPC systems such as Blue Gene/Q. New discovery science capabilities in the magnetic fusion energy application domain are enabled, including investigations of Ion-Temperature-Gradient (ITG) driven turbulence simulations with unprecedented spatial resolution and long temporal duration. Performance studies with realistic fusion experimental parameters are carried out on multiple supercomputing systems spanning a wide range of cache capacities, cache-sharing configurations, memory bandwidth, interconnects and network topologies. These performance comparisons using a realistic discovery-science-capable domain application code provide valuable insights on optimization techniques across one of the broadest sets of current high-end computing platforms worldwide.
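
    The combination of 2D domain decomposition and particle decomposition mentioned above can be pictured with a toy rank-assignment rule: particles are first binned by spatial subdomain, and each subdomain's particles are then split across several ranks. The sketch below illustrates that idea only; it is not GTC-P code, and the grid and rank counts are arbitrary.

      # Toy rank assignment combining a 2D spatial domain decomposition with a
      # particle decomposition inside each subdomain. Illustration only.
      import numpy as np

      def owner_rank(xy, n_dom_x, n_dom_y, ranks_per_domain, particle_id):
          x, y = xy                                                  # positions in [0, 1)
          domain = int(x * n_dom_x) * n_dom_y + int(y * n_dom_y)     # 2D spatial subdomain
          return domain * ranks_per_domain + particle_id % ranks_per_domain

      rng = np.random.default_rng(2)
      for pid, xy in enumerate(rng.random((8, 2))):
          print(pid, xy.round(2), "-> rank", owner_rank(xy, 4, 4, 2, pid))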

  19. Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Bei; Ethier, Stephane; Tang, William

    The Gyrokinetic Toroidal Code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5D Vlasov-Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size with sufficient resolution, new thread-level optimizations have been introduced as well as a key additional domain decomposition. GTC-P's multiple levels of parallelism, including inter-node 2D domain decomposition and particle decomposition, as well as intra-node shared memory partition and vectorization have enabled pushing the scalability of the PIC method to extreme computational scales. In this paper, we describe the methods developed to build a highly parallelized PIC code across a broad range of supercomputer designs. This particularly includes implementations on heterogeneous systems using NVIDIA GPU accelerators and Intel Xeon Phi (MIC) co-processors and performance comparisons with state-of-the-art homogeneous HPC systems such as Blue Gene/Q. New discovery science capabilities in the magnetic fusion energy application domain are enabled, including investigations of Ion-Temperature-Gradient (ITG) driven turbulence simulations with unprecedented spatial resolution and long temporal duration. Performance studies with realistic fusion experimental parameters are carried out on multiple supercomputing systems spanning a wide range of cache capacities, cache-sharing configurations, memory bandwidth, interconnects and network topologies. These performance comparisons using a realistic discovery-science-capable domain application code provide valuable insights on optimization techniques across one of the broadest sets of current high-end computing platforms worldwide.

  20. Federated data storage system prototype for LHC experiments and data intensive science

    NASA Astrophysics Data System (ADS)

    Kiryanov, A.; Klimentov, A.; Krasnopevtsev, D.; Ryabinkin, E.; Zarochentsev, A.

    2017-10-01

    Rapid increase of data volume from the experiments running at the Large Hadron Collider (LHC) prompted the physics computing community to evaluate new data handling and processing solutions. Russian grid sites and university clusters scattered over a large area aim at the task of uniting their resources for future productive work, at the same time giving an opportunity to support large physics collaborations. In our project we address the fundamental problem of designing a computing architecture to integrate distributed storage resources for LHC experiments and other data-intensive science applications and to provide access to data from heterogeneous computing facilities. Studies include development and implementation of a federated data storage prototype for Worldwide LHC Computing Grid (WLCG) centres of different levels and university clusters within one National Cloud. The prototype is based on computing resources located in Moscow, Dubna, Saint Petersburg, Gatchina and Geneva. This project intends to implement a federated distributed storage for all kinds of operations, such as read/write/transfer, and access via WAN from Grid centres, university clusters, supercomputers, and academic and commercial clouds. The efficiency and performance of the system are demonstrated using synthetic and experiment-specific tests, including real data processing and analysis workflows from the ATLAS and ALICE experiments, as well as compute-intensive bioinformatics applications (PALEOMIX) running on supercomputers. We present the topology and architecture of the designed system, report performance and statistics for different access patterns, and show how federated data storage can be used efficiently by physicists and biologists. We also describe how sharing data on a widely distributed storage system can lead to a new computing model and reformations of computing style, for instance how a bioinformatics program running on supercomputers can read/write data from the federated storage.

  1. Supercomputing in Aerospace

    NASA Technical Reports Server (NTRS)

    Kutler, Paul; Yee, Helen

    1987-01-01

    Topics addressed include: numerical aerodynamic simulation; computational mechanics; supercomputers; aerospace propulsion systems; computational modeling in ballistics; turbulence modeling; computational chemistry; computational fluid dynamics; and computational astrophysics.

  2. NAS technical summaries: Numerical aerodynamic simulation program, March 1991 - February 1992

    NASA Technical Reports Server (NTRS)

    1992-01-01

    NASA created the Numerical Aerodynamic Simulation (NAS) Program in 1987 to focus resources on solving critical problems in aeroscience and related disciplines by utilizing the power of the most advanced supercomputers available. The NAS Program provides scientists with the necessary computing power to solve today's most demanding computational fluid dynamics problems and serves as a pathfinder in integrating leading-edge supercomputing technologies, thus benefiting other supercomputer centers in Government and industry. This report contains selected scientific results from the 1991-92 NAS Operational Year, March 4, 1991 to March 3, 1992, which is the fifth year of operation. During this year, the scientific community was given access to a Cray-2 and a Cray Y-MP. The Cray-2, the first generation supercomputer, has four processors, 256 megawords of central memory, and a total sustained speed of 250 million floating point operations per second. The Cray Y-MP, the second generation supercomputer, has eight processors and a total sustained speed of one billion floating point operations per second. Additional memory was installed this year, doubling capacity from 128 to 256 megawords of solid-state storage-device memory. Because of its higher performance, the Cray Y-MP delivered approximately 77 percent of the total number of supercomputer hours used during this year.

  3. Real science at the petascale.

    PubMed

    Saksena, Radhika S; Boghosian, Bruce; Fazendeiro, Luis; Kenway, Owain A; Manos, Steven; Mazzeo, Marco D; Sadiq, S Kashif; Suter, James L; Wright, David; Coveney, Peter V

    2009-06-28

    We describe computational science research that uses petascale resources to achieve scientific results at unprecedented scales and resolution. The applications span a wide range of domains, from investigation of fundamental problems in turbulence through computational materials science research to biomedical applications at the forefront of HIV/AIDS research and cerebrovascular haemodynamics. This work was mainly performed on the US TeraGrid 'petascale' resource, Ranger, at Texas Advanced Computing Center, in the first half of 2008 when it was the largest computing system in the world available for open scientific research. We have sought to use this petascale supercomputer optimally across application domains and scales, exploiting the excellent parallel scaling performance found on up to at least 32,768 cores for certain of our codes in the so-called 'capability computing' category as well as high-throughput intermediate-scale jobs for ensemble simulations in the 32-512 core range. Furthermore, this activity provides evidence that conventional parallel programming with MPI should be successful at the petascale in the short to medium term. We also report on the parallel performance of some of our codes on up to 65,536 cores on the IBM Blue Gene/P system at the Argonne Leadership Computing Facility, which has recently been named the fastest supercomputer in the world for open science.

  4. Multi-threaded ATLAS simulation on Intel Knights Landing processors

    NASA Astrophysics Data System (ADS)

    Farrell, Steven; Calafiura, Paolo; Leggett, Charles; Tsulaia, Vakhtang; Dotti, Andrea; ATLAS Collaboration

    2017-10-01

    The Knights Landing (KNL) release of the Intel Many Integrated Core (MIC) Xeon Phi line of processors is a potential game changer for HEP computing. With 72 cores and deep vector registers, the KNL cards promise significant performance benefits for highly-parallel, compute-heavy applications. Cori, the newest supercomputer at the National Energy Research Scientific Computing Center (NERSC), was delivered to its users in two phases with the first phase online at the end of 2015 and the second phase now online at the end of 2016. Cori Phase 2 is based on the KNL architecture and contains over 9000 compute nodes with 96GB DDR4 memory. ATLAS simulation with the multithreaded Athena Framework (AthenaMT) is a good potential use-case for the KNL architecture and supercomputers like Cori. ATLAS simulation jobs have a high ratio of CPU computation to disk I/O and have been shown to scale well in multi-threading and across many nodes. In this paper we will give an overview of the ATLAS simulation application with details on its multi-threaded design. Then, we will present a performance analysis of the application on KNL devices and compare it to a traditional x86 platform to demonstrate the capabilities of the architecture and evaluate the benefits of utilizing KNL platforms like Cori for ATLAS production.

  5. μπ: A Scalable and Transparent System for Simulating MPI Programs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Perumalla, Kalyan S

    2010-01-01

    μπ is a scalable, transparent system for experimenting with the execution of parallel programs on simulated computing platforms. The level of simulated detail can be varied for application behavior as well as for machine characteristics. Unique features of μπ are repeatability of execution, scalability to millions of simulated (virtual) MPI ranks, scalability to hundreds of thousands of host (real) MPI ranks, portability of the system to a variety of host supercomputing platforms, and the ability to experiment with scientific applications whose source code is available. The set of source-code interfaces supported by μπ is being expanded to support a wider set of applications, and MPI-based scientific computing benchmarks are being ported. In proof-of-concept experiments, μπ has been successfully exercised to spawn and sustain very large-scale executions of an MPI test program given in source code form. Low slowdowns are observed, due to its use of a purely discrete event style of execution and the scalability and efficiency of the underlying parallel discrete event simulation engine, μsik. In the largest runs, μπ has been executed on up to 216,000 cores of a Cray XT5 supercomputer, successfully simulating over 27 million virtual MPI ranks, each virtual rank containing its own thread context, and all ranks fully synchronized by virtual time.

  6. Chickscope Realized: A Situated Evaluation of a Sixth-Grade Classroom.

    ERIC Educational Resources Information Center

    Hogan, Maureen P.

    2000-01-01

    This is a case study of two sixth-grade teachers from Illinois who participated in a semester-long inservice to learn about Chickscope, a supercomputing application that allows students and teachers remote access to magnetic resonance images of chicken embryos. Shows how they produced an inquiry-based unit on chicken weight and measurement.…

  7. Measurements over distributed high performance computing and storage systems

    NASA Technical Reports Server (NTRS)

    Williams, Elizabeth; Myers, Tom

    1993-01-01

    A strawman proposal is given for a framework for presenting a common set of metrics for supercomputers, workstations, file servers, mass storage systems, and the networks that interconnect them. Production control and database systems are also included. Though other applications and third-party software systems are not addressed, it is important to measure them as well.

  8. Applications Development for a Parallel COTS Spaceborne Computer

    NASA Technical Reports Server (NTRS)

    Katz, Daniel S.; Springer, Paul L.; Granat, Robert; Turmon, Michael

    2000-01-01

    This presentation reviews the Remote Exploration and Experimentation (REE) program for utilization of scalable supercomputing technology in space. REE will use COTS hardware and software to the maximum extent possible, keeping overhead low. Since COTS systems will be used with little or no special modification, there will be significant cost reduction.

  9. Science & Technology Review June 2012

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Poyneer, L A

    2012-04-20

    This month's issue has the following articles: (1) A New Era in Climate System Analysis - Commentary by William H. Goldstein; (2) Seeking Clues to Climate Change - By comparing past climate records with results from computer simulations, Livermore scientists can better understand why Earth's climate has changed and how it might change in the future; (3) Finding and Fixing a Supercomputer's Faults - Livermore experts have developed innovative methods to detect hardware faults in supercomputers and help applications recover from errors that do occur; (4) Targeting Ignition - Enhancements to the cryogenic targets for National Ignition Facility experiments are furthering work to achieve fusion ignition with energy gain; (5) Neural Implants Come of Age - A new generation of fully implantable, biocompatible neural prosthetics offers hope to patients with neurological impairment; and (6) Incubator Busy Growing Energy Technologies - Six collaborations with industrial partners are using the Laboratory's high-performance computing resources to find solutions to urgent energy-related problems.

  10. High Performance Computing at NASA

    NASA Technical Reports Server (NTRS)

    Bailey, David H.; Cooper, D. M. (Technical Monitor)

    1994-01-01

    The speaker will give an overview of high performance computing in the U.S. in general and within NASA in particular, including a description of the recently signed NASA-IBM cooperative agreement. The latest performance figures of various parallel systems on the NAS Parallel Benchmarks will be presented. The speaker was one of the authors of the NAS (Numerical Aerodynamic Simulation) Parallel Benchmarks, which are now widely cited in the industry as a measure of sustained performance on realistic high-end scientific applications. It will be shown that significant progress has been made by the highly parallel supercomputer industry during the past year or so, with several new systems, based on high-performance RISC processors, that now deliver superior performance per dollar compared to conventional supercomputers. Various pitfalls in reporting performance will be discussed. The speaker will then conclude by assessing the general state of the high performance computing field.

  11. Using a Cray Y-MP as an array processor for a RISC Workstation

    NASA Technical Reports Server (NTRS)

    Lamaster, Hugh; Rogallo, Sarah J.

    1992-01-01

    As microprocessors increase in power, the economics of centralized computing has changed dramatically. At the beginning of the 1980s, mainframes and supercomputers were often considered to be cost-effective machines for scalar computing. Today, microprocessor-based RISC (reduced-instruction-set computer) systems have displaced many uses of mainframes and supercomputers. Supercomputers are still cost competitive when processing jobs that require both large memory size and high memory bandwidth. One such application is array processing. Certain numerical operations are appropriate to use in a Remote Procedure Call (RPC)-based environment. Matrix multiplication is an example of an operation that can have a sufficient number of arithmetic operations to amortize the cost of an RPC call. An experiment is described which demonstrates that matrix multiplication can be executed remotely on a large system to speed execution beyond that experienced on a workstation.
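
    The amortization argument in this record can be made concrete with a back-of-the-envelope model: an n x n matrix multiply performs about 2n^3 arithmetic operations but moves only about 3n^2 words over the network, so the compute-to-communication ratio grows linearly with n. The sketch below is not from the report; the processing rates, network bandwidth, and RPC overhead are illustrative assumptions only.

      # Hypothetical rates chosen only to illustrate the crossover; they do not
      # describe the Cray Y-MP, the workstation, or the network in the report.
      def local_time(n, local_gflops=0.05):
          """Seconds to multiply two n x n matrices on the workstation."""
          return 2.0 * n**3 / (local_gflops * 1e9)

      def remote_time(n, remote_gflops=1.0, net_mb_per_s=1.0, rpc_overhead_s=0.05):
          """Seconds to ship A and B out, multiply remotely, and return C via RPC."""
          bytes_moved = 3 * n**2 * 8          # A, B out and C back, 8 bytes per word
          return (rpc_overhead_s
                  + bytes_moved / (net_mb_per_s * 1e6)
                  + 2.0 * n**3 / (remote_gflops * 1e9))

      for n in (100, 500, 1000, 2000):
          print(n, round(local_time(n), 3), round(remote_time(n), 3))

    Under these made-up numbers the remote call starts to pay off somewhere between n = 500 and n = 1000, which is the same qualitative conclusion the experiment draws.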

  12. Hybrid petacomputing meets cosmology: The Roadrunner Universe project

    NASA Astrophysics Data System (ADS)

    Habib, Salman; Pope, Adrian; Lukić, Zarija; Daniel, David; Fasel, Patricia; Desai, Nehal; Heitmann, Katrin; Hsu, Chung-Hsing; Ankeny, Lee; Mark, Graham; Bhattacharya, Suman; Ahrens, James

    2009-07-01

    The target of the Roadrunner Universe project at Los Alamos National Laboratory is a set of very large cosmological N-body simulation runs on the hybrid supercomputer Roadrunner, the world's first petaflop platform. Roadrunner's architecture presents opportunities and difficulties characteristic of next-generation supercomputing. We describe a new code designed to optimize performance and scalability by explicitly matching the underlying algorithms to the machine architecture, and by using the physics of the problem as an essential aid in this process. While applications will differ in specific exploits, we believe that such a design process will become increasingly important in the future. The Roadrunner Universe project code, MC3 (Mesh-based Cosmology Code on the Cell), uses grid and direct particle methods to balance the capabilities of Roadrunner's conventional (Opteron) and accelerator (Cell BE) layers. Mirrored particle caches and spectral techniques are used to overcome communication bandwidth limitations and possible difficulties with complicated particle-grid interaction templates.

  13. Visualization at Supercomputing Centers: The Tale of Little Big Iron and the Three Skinny Guys

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bethel, E. Wes; van Rosendale, John; Southard, Dale

    2010-12-01

    Supercomputing Centers (SC's) are unique resources that aim to enable scientific knowledge discovery through the use of large computational resources, the Big Iron. Design, acquisition, installation, and management of the Big Iron are activities that are carefully planned and monitored. Since these Big Iron systems produce a tsunami of data, it is natural to co-locate visualization and analysis infrastructure as part of the same facility. This infrastructure consists of hardware (Little Iron) and staff (Skinny Guys). Our collective experience suggests that design, acquisition, installation, and management of the Little Iron and Skinny Guys do not receive the same level of treatment as that of the Big Iron. The main focus of this article is to explore different aspects of planning, designing, fielding, and maintaining the visualization and analysis infrastructure at supercomputing centers. Some of the questions we explore in this article include: "How should the Little Iron be sized to adequately support visualization and analysis of data coming off the Big Iron?" "What sort of capabilities does it need to have?" Related questions concern the size of the visualization support staff: "How big should a visualization program be (number of persons) and what should the staff do?" and "How much of the visualization should be provided as a support service, and how much should applications scientists be expected to do on their own?"

  14. Color graphics, interactive processing, and the supercomputer

    NASA Technical Reports Server (NTRS)

    Smith-Taylor, Rudeen

    1987-01-01

    The development of a common graphics environment for the NASA Langley Research Center user community and the integration of a supercomputer into this environment is examined. The initial computer hardware, the software graphics packages, and their configurations are described. The addition of improved computer graphics capability to the supercomputer, and the utilization of the graphic software and hardware are discussed. Consideration is given to the interactive processing system which supports the computer in an interactive debugging, processing, and graphics environment.

  15. Automated Help System For A Supercomputer

    NASA Technical Reports Server (NTRS)

    Callas, George P.; Schulbach, Catherine H.; Younkin, Michael

    1994-01-01

    Expert-system software developed to provide automated system of user-helping displays in supercomputer system at Ames Research Center Advanced Computer Facility. Users located at remote computer terminals connected to supercomputer and each other via gateway computers, local-area networks, telephone lines, and satellite links. Automated help system answers routine user inquiries about how to use services of computer system. Available 24 hours per day and reduces burden on human experts, freeing them to concentrate on helping users with complicated problems.

  16. Bellerophon

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lingerfelt, Eric J; Messer, II, Otis E

    2017-01-02

    The Bellerophon software system supports CHIMERA, a production-level HPC application that simulates the evolution of core-collapse supernovae. Bellerophon enables CHIMERA's geographically dispersed team of collaborators to perform job monitoring and real-time data analysis from multiple supercomputing resources, including platforms at OLCF, NERSC, and NICS. Its multi-tier architecture provides an encapsulated, end-to-end software solution that enables the CHIMERA team to quickly and easily access highly customizable animated and static views of results from anywhere in the world via a cross-platform desktop application.

  17. The NAS parallel benchmarks

    NASA Technical Reports Server (NTRS)

    Bailey, David (Editor); Barton, John (Editor); Lasinski, Thomas (Editor); Simon, Horst (Editor)

    1993-01-01

    A new set of benchmarks was developed for the performance evaluation of highly parallel supercomputers. These benchmarks consist of a set of kernels, the 'Parallel Kernels,' and a simulated application benchmark. Together they mimic the computation and data movement characteristics of large scale computational fluid dynamics (CFD) applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification - all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.
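
    To illustrate what a "pencil and paper" specification means in practice, the toy kernel below is written only from an algorithmic statement of the kind such benchmarks use (generate pseudorandom pairs, convert them to Gaussian deviates, tally counts per annulus); it is not the official NPB code or specification, and the generator, problem size, and binning are simplified stand-ins chosen for illustration.

      import math
      import random

      def ep_style_kernel(n, seed=42):
          """Toy embarrassingly-parallel kernel written from a prose specification."""
          rng = random.Random(seed)
          counts = [0] * 10
          for _ in range(n):
              x = 2.0 * rng.random() - 1.0
              y = 2.0 * rng.random() - 1.0
              t = x * x + y * y
              if 0.0 < t <= 1.0:                     # accept points inside the unit circle
                  scale = math.sqrt(-2.0 * math.log(t) / t)
                  gx, gy = x * scale, y * scale      # a pair of Gaussian deviates
                  counts[min(int(max(abs(gx), abs(gy))), 9)] += 1
          return counts

      print(ep_style_kernel(100_000))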

  18. NASA Advanced Supercomputing (NAS) User Services Group

    NASA Technical Reports Server (NTRS)

    Pandori, John; Hamilton, Chris; Niggley, C. E.; Parks, John W. (Technical Monitor)

    2002-01-01

    This viewgraph presentation provides an overview of NAS (NASA Advanced Supercomputing), its goals, and its mainframe computer assets. Also covered are its functions, including systems monitoring and technical support.

  19. Scientific Computing Paradigm

    NASA Technical Reports Server (NTRS)

    VanZandt, John

    1994-01-01

    The usage model of supercomputers for scientific applications, such as computational fluid dynamics (CFD), has changed over the years. Scientific visualization has moved scientists away from looking at numbers to looking at three-dimensional images, which capture the meaning of the data. This change has impacted the system models for computing. This report details the model which is used by scientists at NASA's research centers.

  20. Production experience with the ATLAS Event Service

    NASA Astrophysics Data System (ADS)

    Benjamin, D.; Calafiura, P.; Childers, T.; De, K.; Guan, W.; Maeno, T.; Nilsson, P.; Tsulaia, V.; Van Gemmeren, P.; Wenaus, T.; ATLAS Collaboration

    2017-10-01

    The ATLAS Event Service (AES) has been designed and implemented for efficient running of ATLAS production workflows on a variety of computing platforms, ranging from conventional Grid sites to opportunistic, often short-lived resources, such as spot market commercial clouds, supercomputers and volunteer computing. The Event Service architecture allows real time delivery of fine grained workloads to running payload applications which process dispatched events or event ranges and immediately stream the outputs to highly scalable Object Stores. Thanks to its agile and flexible architecture the AES is currently being used by grid sites for assigning low priority workloads to otherwise idle computing resources; similarly harvesting HPC resources in an efficient back-fill mode; and massively scaling out to the 50-100k concurrent core level on the Amazon spot market to efficiently utilize those transient resources for peak production needs. Platform ports in development include ATLAS@Home (BOINC) and the Google Compute Engine, and a growing number of HPC platforms. After briefly reviewing the concept and the architecture of the Event Service, we will report the status and experience gained in AES commissioning and production operations on supercomputers, and our plans for extending ES application beyond Geant4 simulation to other workflows, such as reconstruction and data analysis.

  1. Compiler-based code generation and autotuning for geometric multigrid on GPU-accelerated supercomputers

    DOE PAGES

    Basu, Protonu; Williams, Samuel; Van Straalen, Brian; ...

    2017-04-05

    GPUs, with their high bandwidths and computational capabilities, are an increasingly popular target for scientific computing. Unfortunately, to date, harnessing the power of the GPU has required use of a GPU-specific programming model like CUDA, OpenCL, or OpenACC. Thus, in order to deliver portability across CPU-based and GPU-accelerated supercomputers, programmers are forced to write and maintain two versions of their applications or frameworks. In this paper, we explore the use of a compiler-based autotuning framework based on CUDA-CHiLL to deliver not only portability, but also performance portability across CPU- and GPU-accelerated platforms for the geometric multigrid linear solvers found in many scientific applications. We also show that with autotuning we can attain near Roofline (a performance bound for a computation and target architecture) performance across the key operations in the miniGMG benchmark for both CPU- and GPU-based architectures, as well as for multiple stencil discretizations and smoothers. We show that our technology is readily interoperable with MPI, resulting in performance at scale equal to that obtained via a hand-optimized MPI+CUDA implementation.
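
    The Roofline bound mentioned in the abstract is simple to state: attainable performance is the lesser of the machine's peak compute rate and the product of a kernel's arithmetic intensity (flops per byte moved) and its memory bandwidth. The sketch below uses placeholder peak and bandwidth values, not the figures of any platform in the paper.

      def roofline_gflops(arithmetic_intensity, peak_gflops=1000.0, bandwidth_gbs=200.0):
          """Attainable GFLOP/s for a kernel with the given flops-per-byte ratio."""
          return min(peak_gflops, arithmetic_intensity * bandwidth_gbs)

      # A multigrid smoother streams roughly 16 bytes per grid point for on the
      # order of 10 flops, so it sits on the bandwidth-limited slope of the roof.
      print(roofline_gflops(10.0 / 16.0))   # ~125 GFLOP/s, bandwidth-bound
      print(roofline_gflops(10.0))          # 1000 GFLOP/s, compute-bound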

  2. Computational chemistry

    NASA Technical Reports Server (NTRS)

    Arnold, J. O.

    1987-01-01

    With the advent of supercomputers and modern computational chemistry algorithms and codes, a powerful tool was created to help fill NASA's continuing need for information on the properties of matter in hostile or unusual environments. Computational resources provided under the Numerical Aerodynamic Simulation (NAS) program were a cornerstone for recent advancements in this field. Properties of gases, materials, and their interactions can be determined from solutions of the governing equations. In the case of gases, for example, radiative transition probabilities per particle, bond-dissociation energies, and rates of simple chemical reactions can be determined computationally as reliably as from experiment. The data are proving to be quite valuable in providing inputs to real-gas flow simulation codes used to compute aerothermodynamic loads on NASA's aeroassist orbital transfer vehicles and a host of problems related to the National Aerospace Plane Program. Although more approximate, similar solutions can be obtained for ensembles of atoms simulating small particles of materials with and without the presence of gases. Computational chemistry has applications in studying catalysis and the properties of polymers, both of interest to various NASA missions, including those previously mentioned. In addition to discussing these applications of computational chemistry within NASA, the governing equations and the need for supercomputers for their solution are outlined.

  3. Compiler-based code generation and autotuning for geometric multigrid on GPU-accelerated supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Basu, Protonu; Williams, Samuel; Van Straalen, Brian

    GPUs, with their high bandwidths and computational capabilities, are an increasingly popular target for scientific computing. Unfortunately, to date, harnessing the power of the GPU has required use of a GPU-specific programming model like CUDA, OpenCL, or OpenACC. Thus, in order to deliver portability across CPU-based and GPU-accelerated supercomputers, programmers are forced to write and maintain two versions of their applications or frameworks. In this paper, we explore the use of a compiler-based autotuning framework based on CUDA-CHiLL to deliver not only portability, but also performance portability across CPU- and GPU-accelerated platforms for the geometric multigrid linear solvers found in many scientific applications. We also show that with autotuning we can attain near Roofline (a performance bound for a computation and target architecture) performance across the key operations in the miniGMG benchmark for both CPU- and GPU-based architectures, as well as for multiple stencil discretizations and smoothers. We show that our technology is readily interoperable with MPI, resulting in performance at scale equal to that obtained via a hand-optimized MPI+CUDA implementation.

  4. An efficient framework for Java data processing systems in HPC environments

    NASA Astrophysics Data System (ADS)

    Fries, Aidan; Castañeda, Javier; Isasi, Yago; Taboada, Guillermo L.; Portell de Mora, Jordi; Sirvent, Raül

    2011-11-01

    Java is a commonly used programming language, although its use in High Performance Computing (HPC) remains relatively low. One of the reasons is a lack of libraries offering specific HPC functions to Java applications. In this paper we present a Java-based framework, called DpcbTools, designed to provide a set of functions that fill this gap. It includes a set of efficient data communication functions based on message-passing, thus providing, when a low latency network such as Myrinet is available, higher throughputs and lower latencies than standard solutions used by Java. DpcbTools also includes routines for the launching, monitoring and management of Java applications on several computing nodes by making use of JMX to communicate with remote Java VMs. The Gaia Data Processing and Analysis Consortium (DPAC) is a real case where scientific data from the ESA Gaia astrometric satellite will be entirely processed using Java. In this paper we describe the main elements of DPAC and its usage of the DpcbTools framework. We also assess the usefulness and performance of DpcbTools through its performance evaluation and the analysis of its impact on some DPAC systems deployed in the MareNostrum supercomputer (Barcelona Supercomputing Center).

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bailey, David H.

    The NAS Parallel Benchmarks (NPB) are a suite of parallel computer performance benchmarks. They were originally developed at the NASA Ames Research Center in 1991 to assess high-end parallel supercomputers. Although they are no longer used as widely as they once were for comparing high-end system performance, they continue to be studied and analyzed a great deal in the high-performance computing community. The acronym 'NAS' originally stood for the Numerical Aeronautical Simulation Program at NASA Ames. The name of this organization was subsequently changed to the Numerical Aerospace Simulation Program, and more recently to the NASA Advanced Supercomputing Center, although the acronym remains 'NAS.' The developers of the original NPB suite were David H. Bailey, Eric Barszcz, John Barton, David Browning, Russell Carter, Leo Dagum, Rod Fatoohi, Samuel Fineberg, Paul Frederickson, Thomas Lasinski, Rob Schreiber, Horst Simon, V. Venkatakrishnan and Sisira Weeratunga. The original NAS Parallel Benchmarks consisted of eight individual benchmark problems, each of which focused on some aspect of scientific computing. The principal focus was in computational aerophysics, although most of these benchmarks have much broader relevance, since in a much larger sense they are typical of many real-world scientific computing applications. The NPB suite grew out of the need for a more rational procedure to select new supercomputers for acquisition by NASA. The emergence of commercially available highly parallel computer systems in the late 1980s offered an attractive alternative to parallel vector supercomputers that had been the mainstay of high-end scientific computing. However, the introduction of highly parallel systems was accompanied by a regrettable level of hype, not only on the part of the commercial vendors but even, in some cases, by scientists using the systems. As a result, it was difficult to discern whether the new systems offered any fundamental performance advantage over vector supercomputers, and, if so, which of the parallel offerings would be most useful in real-world scientific computation. In part to draw attention to some of the performance reporting abuses prevalent at the time, the present author wrote a humorous essay 'Twelve Ways to Fool the Masses,' which described in a light-hearted way a number of the questionable ways in which both vendor marketing people and scientists were inflating and distorting their performance results. All of this underscored the need for an objective and scientifically defensible measure to compare performance on these systems.

  6. NSF Commits to Supercomputers.

    ERIC Educational Resources Information Center

    Waldrop, M. Mitchell

    1985-01-01

    The National Science Foundation (NSF) has allocated at least $200 million over the next five years to support four new supercomputer centers. Issues and trends related to this NSF initiative are examined. (JN)

  7. Mira: Argonne's 10-petaflops supercomputer

    ScienceCinema

    Papka, Michael; Coghlan, Susan; Isaacs, Eric; Peters, Mark; Messina, Paul

    2018-02-13

    Mira, Argonne's petascale IBM Blue Gene/Q system, ushers in a new era of scientific supercomputing at the Argonne Leadership Computing Facility. An engineering marvel, the 10-petaflops supercomputer is capable of carrying out 10 quadrillion calculations per second. As a machine for open science, any researcher with a question that requires large-scale computing resources can submit a proposal for time on Mira, typically in allocations of millions of core-hours, to run programs for their experiments. This adds up to billions of hours of computing time per year.

  8. Adventures in Computational Grids

    NASA Technical Reports Server (NTRS)

    Walatka, Pamela P.; Biegel, Bryan A. (Technical Monitor)

    2002-01-01

    Sometimes one supercomputer is not enough. Or your local supercomputers are busy, or not configured for your job. Or you don't have any supercomputers. You might be trying to simulate worldwide weather changes in real time, requiring more compute power than you could get from any one machine. Or you might be collecting microbiological samples on an island, and need to examine them with a special microscope located on the other side of the continent. These are the times when you need a computational grid.

  9. Mira: Argonne's 10-petaflops supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Papka, Michael; Coghlan, Susan; Isaacs, Eric

    2013-07-03

    Mira, Argonne's petascale IBM Blue Gene/Q system, ushers in a new era of scientific supercomputing at the Argonne Leadership Computing Facility. An engineering marvel, the 10-petaflops supercomputer is capable of carrying out 10 quadrillion calculations per second. As a machine for open science, any researcher with a question that requires large-scale computing resources can submit a proposal for time on Mira, typically in allocations of millions of core-hours, to run programs for their experiments. This adds up to billions of hours of computing time per year.

  10. Breakthrough: NETL's Simulation-Based Engineering User Center (SBEUC)

    ScienceCinema

    Guenther, Chris

    2018-05-23

    The National Energy Technology Laboratory relies on supercomputers to develop many novel ideas that become tomorrow's energy solutions. Supercomputers provide a cost-effective, efficient platform for research and usher technologies into widespread use faster to bring benefits to the nation. In 2013, Secretary of Energy Dr. Ernest Moniz dedicated NETL's new supercomputer, the Simulation Based Engineering User Center, or SBEUC. The SBEUC is dedicated to fossil energy research and is a collaborative tool for all of NETL and our regional university partners.

  11. A high level language for a high performance computer

    NASA Technical Reports Server (NTRS)

    Perrott, R. H.

    1978-01-01

    The proposed computational aerodynamic facility will join the ranks of the supercomputers due to its architecture and increased execution speed. At present, the languages used to program these supercomputers have been modifications of programming languages which were designed many years ago for sequential machines. A new programming language should be developed based on the techniques which have proved valuable for sequential programming languages and incorporating the algorithmic techniques required for these supercomputers. The design objectives for such a language are outlined.

  12. Floating point arithmetic in future supercomputers

    NASA Technical Reports Server (NTRS)

    Bailey, David H.; Barton, John T.; Simon, Horst D.; Fouts, Martin J.

    1989-01-01

    Considerations in the floating-point design of a supercomputer are discussed. Particular attention is given to word size, hardware support for extended precision, format, and accuracy characteristics. These issues are discussed from the perspective of the Numerical Aerodynamic Simulation Systems Division at NASA Ames. The features believed to be most important for a future supercomputer floating-point design include: (1) a 64-bit IEEE floating-point format with 11 exponent bits, 52 mantissa bits, and one sign bit and (2) hardware support for reasonably fast double-precision arithmetic.
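
    The recommended format is exactly the IEEE 754 binary64 layout that later became universal: 1 sign bit, 11 exponent bits (biased by 1023), and 52 mantissa bits. A small sketch, included here only to make that bit layout concrete:

      import struct

      def ieee754_fields(x):
          """Split a Python float (IEEE 754 binary64) into sign, exponent, mantissa."""
          bits = struct.unpack(">Q", struct.pack(">d", x))[0]
          sign = bits >> 63
          exponent = (bits >> 52) & 0x7FF          # 11 bits, bias 1023
          mantissa = bits & ((1 << 52) - 1)        # 52 fraction bits
          return sign, exponent, mantissa

      sign, exponent, mantissa = ieee754_fields(-6.5)
      print(sign, exponent - 1023, hex(mantissa))  # 1 2 0xa000000000000, i.e. -1.625 * 2**2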

  13. Breakthrough: NETL's Simulation-Based Engineering User Center (SBEUC)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Guenther, Chris

    The National Energy Technology Laboratory relies on supercomputers to develop many novel ideas that become tomorrow's energy solutions. Supercomputers provide a cost-effective, efficient platform for research and usher technologies into widespread use faster to bring benefits to the nation. In 2013, Secretary of Energy Dr. Ernest Moniz dedicated NETL's new supercomputer, the Simulation Based Engineering User Center, or SBEUC. The SBEUC is dedicated to fossil energy research and is a collaborative tool for all of NETL and our regional university partners.

  14. Tracing Scientific Facilities through the Research Literature Using Persistent Identifiers

    NASA Astrophysics Data System (ADS)

    Mayernik, M. S.; Maull, K. E.

    2016-12-01

    Tracing persistent identifiers to their source publications is an easy task when authors use them, since it is a simple matter of matching the persistent identifier to the specific text string of the identifier. However, trying to understand if a publication uses the resource behind an identifier when such an identifier is not referenced explicitly is a harder task. In this research, we explore the effectiveness of alternative strategies of associating publications with uses of the resource referenced by an identifier when it may not be explicit. This project is explored within the context of the NCAR supercomputer, where we are broadly interested in the science that can be traced to the usage of the NCAR supercomputing facility, by way of the peer-reviewed research publications that utilize and reference it. In this project we explore several ways of drawing linkages between publications and the NCAR supercomputing resources. Identifying and compiling peer-reviewed publications related to NCAR supercomputer usage is explored via three sources: 1) user-supplied publications gathered through a community survey, 2) publications that were identified via manual searching of the Google Scholar search index, and 3) publications associated with National Science Foundation (NSF) grants extracted from a public NSF database. These three sources represent three styles of collecting information about publications that likely imply usage of the NCAR supercomputing facilities. Each source has strengths and weaknesses, thus our discussion will explore how our publication identification and analysis methods vary in terms of accuracy, reliability, and effort. We will also discuss strategies for enabling more efficient tracing of research impacts of supercomputing facilities going forward through the assignment of a persistent web identifier to the NCAR supercomputer. While this solution has potential to greatly enhance our ability to trace the use of the facility through publications, authors must cite the facility consistently. It is therefore necessary to provide recommendations for citation and attribution behavior, and we will conclude our discussion with how such recommendations have improved tracing of the supercomputer facility, allowing for more consistent and widespread measurement of its impact.

  15. High temporal resolution mapping of seismic noise sources using heterogeneous supercomputers

    NASA Astrophysics Data System (ADS)

    Gokhberg, Alexey; Ermert, Laura; Paitz, Patrick; Fichtner, Andreas

    2017-04-01

    The time- and space-dependent distribution of seismic noise sources is becoming a key ingredient of modern real-time monitoring of various geo-systems. Significant interest in seismic noise source maps with high temporal resolution (days) is expected to come from a number of domains, including natural resources exploration, analysis of active earthquake fault zones and volcanoes, as well as geothermal and hydrocarbon reservoir monitoring. Currently, knowledge of noise sources is insufficient for high-resolution subsurface monitoring applications. Near-real-time seismic data, as well as advanced imaging methods to constrain seismic noise sources, have recently become available. These methods are based on the massive cross-correlation of seismic noise records from all available seismic stations in the region of interest and are therefore very computationally intensive. Heterogeneous massively parallel supercomputing systems introduced in recent years combine conventional multi-core CPUs with GPU accelerators and provide an opportunity for a manifold increase in computing performance. Therefore, these systems represent an efficient platform for implementation of a noise source mapping solution. We present the first results of an ongoing research project conducted in collaboration with the Swiss National Supercomputing Centre (CSCS). The project aims at building a service that provides seismic noise source maps for Central Europe with high temporal resolution (days to a few weeks depending on frequency and data availability). The service is hosted on the CSCS computing infrastructure; all computationally intensive processing is performed on the massively parallel heterogeneous supercomputer "Piz Daint". The solution architecture is based on the Application-as-a-Service concept in order to provide interested external researchers with regular access to the noise source maps. The solution architecture includes the following sub-systems: (1) data acquisition responsible for collecting, on a periodic basis, raw seismic records from the European seismic networks, (2) high-performance noise source mapping application responsible for generation of source maps using cross-correlation of seismic records, (3) back-end infrastructure for the coordination of various tasks and computations, (4) front-end Web interface providing the service to the end-users and (5) data repository. The noise mapping application is composed of four principal modules: (1) pre-processing of raw data, (2) massive cross-correlation, (3) post-processing of correlation data based on computation of logarithmic energy ratio and (4) generation of source maps from post-processed data. Implementation of the solution posed various challenges, in particular, selection of data sources and transfer protocols, automation and monitoring of daily data downloads, ensuring the required data processing performance, design of a general service oriented architecture for coordination of various sub-systems, and engineering an appropriate data storage solution. The present pilot version of the service implements noise source maps for Switzerland. Extension of the solution to Central Europe is planned for the next project phase.
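
    The computational core of the mapping application described above is the cross-correlation of long noise records between station pairs. A minimal sketch of that single operation is shown below, assuming two synthetic traces and a made-up sampling rate; the real service adds pre-processing, energy-ratio post-processing, and runs at an entirely different scale.

      import numpy as np

      def cross_correlate(trace_a, trace_b):
          """Full linear cross-correlation of two equal-length real records via the FFT."""
          n = len(trace_a)
          nfft = 2 * n                                 # zero-pad to avoid wrap-around
          spec = np.fft.rfft(trace_a, nfft) * np.conj(np.fft.rfft(trace_b, nfft))
          cc = np.fft.irfft(spec, nfft)
          return np.concatenate((cc[-(n - 1):], cc[:n]))   # lags -(n-1) .. +(n-1)

      fs = 25.0                                        # Hz, assumed sampling rate
      rng = np.random.default_rng(0)
      noise = rng.standard_normal(10_000)
      record_a = noise
      record_b = np.roll(noise, 50)                    # station B sees the field 2 s later
      cc = cross_correlate(record_a, record_b)
      lag = int(np.argmax(cc)) - (len(record_a) - 1)
      print(lag / fs)                                  # -2.0 s under this lag convention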

  16. Energy Efficient Supercomputing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Anypas, Katie

    2014-10-17

    Katie Anypas, Head of NERSC's Services Department discusses the Lab's research into developing increasingly powerful and energy efficient supercomputers at our '8 Big Ideas' Science at the Theater event on October 8th, 2014, in Oakland, California.

  17. Energy Efficient Supercomputing

    ScienceCinema

    Anypas, Katie

    2018-05-07

    Katie Anypas, Head of NERSC's Services Department discusses the Lab's research into developing increasingly powerful and energy efficient supercomputers at our '8 Big Ideas' Science at the Theater event on October 8th, 2014, in Oakland, California.

  18. Job Management Requirements for NAS Parallel Systems and Clusters

    NASA Technical Reports Server (NTRS)

    Saphir, William; Tanner, Leigh Ann; Traversat, Bernard

    1995-01-01

    A job management system is a critical component of a production supercomputing environment, permitting oversubscribed resources to be shared fairly and efficiently. Job management systems that were originally designed for traditional vector supercomputers are not appropriate for the distributed-memory parallel supercomputers that are becoming increasingly important in the high performance computing industry. Newer job management systems offer new functionality but do not solve fundamental problems. We address some of the main issues in resource allocation and job scheduling we have encountered on two parallel computers - a 160-node IBM SP2 and a cluster of 20 high performance workstations located at the Numerical Aerodynamic Simulation facility. We describe the requirements for resource allocation and job management that are necessary to provide a production supercomputing environment on these machines, prioritizing according to difficulty and importance, and advocating a return to fundamental issues.

  19. Supercomputing Drives Innovation - Continuum Magazine | NREL

    Science.gov Websites

    NREL scientists have used supercomputers to simulate 3D models of primary enzymes, and 3D models of wind plant aerodynamics showing low-velocity wakes and their impact.

  20. A mass storage system for supercomputers based on Unix

    NASA Technical Reports Server (NTRS)

    Richards, J.; Kummell, T.; Zarlengo, D. G.

    1988-01-01

    The authors present the design, implementation, and utilization of a large mass storage subsystem (MSS) for the numerical aerodynamics simulation. The MSS supports a large networked, multivendor Unix-based supercomputing facility. The MSS at Ames Research Center provides all processors on the numerical aerodynamics system processing network, from workstations to supercomputers, the ability to store large amounts of data in a highly accessible, long-term repository. The MSS uses Unix System V and is capable of storing hundreds of thousands of files ranging from a few bytes to 2 Gb in size.

  1. Supercomputer algorithms for efficient linear octree encoding of three-dimensional brain images.

    PubMed

    Berger, S B; Reis, D J

    1995-02-01

    We designed and implemented algorithms for three-dimensional (3-D) reconstruction of brain images from serial sections using two important supercomputer architectures, vector and parallel. These architectures were represented by the Cray YMP and Connection Machine CM-2, respectively. The programs operated on linear octree representations of the brain data sets, and achieved 500-800 times acceleration when compared with a conventional laboratory workstation. As the need for higher resolution data sets increases, supercomputer algorithms may offer a means of performing 3-D reconstruction well above current experimental limits.
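
    A linear octree stores a sparse 3-D volume as a flat, sorted list of keys rather than as a pointer tree, which is what makes it friendly to both vector and data-parallel machines. Below is a minimal sketch of the encoding step (Morton, or Z-order, keys obtained by interleaving coordinate bits); the resolution and voxel coordinates are arbitrary examples, not values from the study.

      def morton_key(x, y, z, bits=10):
          """Interleave the low `bits` bits of the x, y, z voxel indices into one key."""
          key = 0
          for i in range(bits):
              key |= ((x >> i) & 1) << (3 * i)
              key |= ((y >> i) & 1) << (3 * i + 1)
              key |= ((z >> i) & 1) << (3 * i + 2)
          return key

      # The "linear octree" of a segmented image is then just the sorted keys of
      # its occupied voxels, a representation that sort-based parallel hardware handles well.
      occupied_voxels = [(1, 2, 3), (1, 2, 4), (200, 10, 57)]
      linear_octree = sorted(morton_key(x, y, z) for x, y, z in occupied_voxels)
      print([hex(k) for k in linear_octree])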

  2. Intelligent supercomputers: the Japanese computer sputnik

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Walter, G.

    1983-11-01

    Japan's government-supported fifth-generation computer project has had a pronounced effect on the American computer and information systems industry. The US firms are intensifying their research on and production of intelligent supercomputers, a combination of computer architecture and artificial intelligence software programs. While the present generation of computers is built for the processing of numbers, the new supercomputers will be designed specifically for the solution of symbolic problems and the use of artificial intelligence software. This article discusses new and exciting developments that will increase computer capabilities in the 1990s. 4 references.

  3. Introducing Mira, Argonne's Next-Generation Supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    2013-03-19

    Mira, the new petascale IBM Blue Gene/Q system installed at the ALCF, will usher in a new era of scientific supercomputing. An engineering marvel, the 10-petaflops machine is capable of carrying out 10 quadrillion calculations per second.

  4. Green Supercomputing at Argonne

    ScienceCinema

    Pete Beckman

    2017-12-09

    Pete Beckman, head of Argonne's Leadership Computing Facility (ALCF) talks about Argonne National Laboratory's green supercomputing—everything from designing algorithms to use fewer kilowatts per operation to using cold Chicago winter air to cool the machine more efficiently.

  5. TOP500 Supercomputers for June 2003

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack

    2003-06-23

    21st Edition of TOP500 List of World's Fastest Supercomputers Released. MANNHEIM, Germany; KNOXVILLE, Tenn.; and BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 21st edition of the TOP500 list of the world's fastest supercomputers was released today (June 23, 2003). The Earth Simulator supercomputer built by NEC and installed last year at the Earth Simulator Center in Yokohama, Japan, with its Linpack benchmark performance of 35.86 Tflop/s (teraflops or trillions of calculations per second), retains the number one position. The number 2 position is held by the re-measured ASCI Q system at Los Alamos National Laboratory. With 13.88 Tflop/s, it is the second system ever to exceed the 10 Tflop/s mark. ASCI Q was built by Hewlett-Packard and is based on the AlphaServer SC computer system.

  6. CFD lends the government a hand

    NASA Technical Reports Server (NTRS)

    Lekoudis, Spiro; Singleton, Robert E.; Mehta, Unmeel B.

    1992-01-01

    The present survey of important and novel CFD applications being developed and implemented by U.S. Government contractors gives attention to naval vessel flow-modeling, Army ballistic and rotary wing aerodynamics, and NASA hypersonic vehicle related applications of CFD. CFD-generated knowledge of numerical algorithms, fluid motion, and supercomputer use is being incorporated into such additional areas as computational electromagnetics and acoustics. Attention is presently given to CFD methods' development status in such fields as submarine boundary layers, hypersonic kinetic energy projectile shock structures, helicopter main rotor tip flows, and National Aerospace Plane aerothermodynamics.

  7. The NAS parallel benchmarks

    NASA Technical Reports Server (NTRS)

    Bailey, D. H.; Barszcz, E.; Barton, J. T.; Carter, R. L.; Lasinski, T. A.; Browning, D. S.; Dagum, L.; Fatoohi, R. A.; Frederickson, P. O.; Schreiber, R. S.

    1991-01-01

    A new set of benchmarks has been developed for the performance evaluation of highly parallel supercomputers in the framework of the NASA Ames Numerical Aerodynamic Simulation (NAS) Program. These consist of five 'parallel kernel' benchmarks and three 'simulated application' benchmarks. Together they mimic the computation and data movement characteristics of large-scale computational fluid dynamics applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification-all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.

  8. Evaluating the networking characteristics of the Cray XC-40 Intel Knights Landing-based Cori supercomputer at NERSC

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Doerfler, Douglas; Austin, Brian; Cook, Brandon

    There are many potential issues associated with deploying the Intel Xeon Phi™ (code named Knights Landing [KNL]) manycore processor in a large-scale supercomputer. One in particular is the ability to fully utilize the high-speed communications network, given that the serial performance of a Xeon Phi™ core is a fraction of a Xeon® core. In this paper, we take a look at the trade-offs associated with allocating enough cores to fully utilize the Aries high-speed network versus cores dedicated to computation, e.g., the trade-off between MPI and OpenMP. In addition, we evaluate new features of Cray MPI in support of KNL, such as internode optimizations. We also evaluate one-sided programming models such as Unified Parallel C. We quantify the impact of the above trade-offs and features using a suite of National Energy Research Scientific Computing Center applications.

  9. Preparing for in situ processing on upcoming leading-edge supercomputers

    DOE PAGES

    Kress, James; Churchill, Randy Michael; Klasky, Scott; ...

    2016-10-01

    High performance computing applications are producing increasingly large amounts of data and placing enormous stress on current capabilities for traditional post-hoc visualization techniques. Because of the growing compute and I/O imbalance, data reductions, including in situ visualization, are required. These reduced data are used for analysis and visualization in a variety of different ways. Many of the visualization and analysis requirements are known a priori, but when they are not, scientists are dependent on the reduced data to accurately represent the simulation in post hoc analysis. The contribution of this paper is a description of the directions we are pursuing to assist a large scale fusion simulation code succeed on the next generation of supercomputers. These directions include the role of in situ processing for performing data reductions, as well as the tradeoffs between data size and data integrity within the context of complex operations in a typical scientific workflow.

  10. Supercomputer description of human lung morphology for imaging analysis.

    PubMed

    Martonen, T B; Hwang, D; Guan, X; Fleming, J S

    1998-04-01

    A supercomputer code that describes the three-dimensional branching structure of the human lung has been developed. The algorithm was written for the Cray C94. In our simulations, the human lung was divided into a matrix containing discrete volumes (voxels) so as to be compatible with analyses of SPECT images. The matrix has 3840 voxels. The matrix can be segmented into transverse, sagittal and coronal layers analogous to human subject examinations. The compositions of individual voxels were identified by the type and respective number of airways present. The code provides a mapping of the spatial positions of the almost 17 million airways in human lungs and unambiguously assigns each airway to a voxel. Thus, the clinician and research scientist in the medical arena have a powerful new tool to be used in imaging analyses. The code was designed to be integrated into diverse applications, including the interpretation of SPECT images, the design of inhalation exposure experiments and the targeted delivery of inhaled pharmacologic drugs.
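
    The key bookkeeping step the abstract describes is assigning each airway to exactly one voxel of the regular matrix. A toy sketch of such a mapping is shown below; the 16 x 16 x 15 matrix shape (3840 voxels), the bounding box, and the example coordinates are made-up values for illustration, not the code or geometry of the study.

      import numpy as np

      nx, ny, nz = 16, 16, 15                    # hypothetical 3840-voxel matrix
      lung_min = np.array([0.0, 0.0, 0.0])       # cm, assumed bounding box
      lung_max = np.array([30.0, 25.0, 28.0])

      def voxel_index(point_cm):
          """Map a 3-D position (cm) to its (ix, iy, iz) voxel indices."""
          frac = (np.asarray(point_cm) - lung_min) / (lung_max - lung_min)
          idx = np.minimum((frac * [nx, ny, nz]).astype(int), [nx - 1, ny - 1, nz - 1])
          return tuple(idx)

      airway_counts = np.zeros((nx, ny, nz), dtype=int)
      for midpoint in [(15.0, 12.0, 3.0), (15.2, 12.1, 3.1), (2.0, 20.0, 27.0)]:
          airway_counts[voxel_index(midpoint)] += 1
      print(airway_counts.sum(), airway_counts.max())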

  11. Massively parallel implementation of 3D-RISM calculation with volumetric 3D-FFT.

    PubMed

    Maruyama, Yutaka; Yoshida, Norio; Tadano, Hiroto; Takahashi, Daisuke; Sato, Mitsuhisa; Hirata, Fumio

    2014-07-05

    A new three-dimensional reference interaction site model (3D-RISM) program for massively parallel machines combined with the volumetric 3D fast Fourier transform (3D-FFT) was developed, and tested on the RIKEN K supercomputer. The ordinary parallel 3D-RISM program has a limitation on the number of parallelizations because of the limitations of the slab-type 3D-FFT. The volumetric 3D-FFT relieves this limitation drastically. We tested the 3D-RISM calculation on a large and fine calculation cell (2048³ grid points) on 16,384 nodes, each having eight CPU cores. The new 3D-RISM program achieved excellent parallel scalability running on the RIKEN K supercomputer. As a benchmark application, we employed the program, combined with molecular dynamics simulation, to analyze the oligomerization process of the chymotrypsin inhibitor 2 mutant. The results demonstrate that the massively parallel 3D-RISM program is effective for analyzing the hydration properties of large biomolecular systems.
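
    The scaling limitation mentioned above comes down to simple counting: a slab-decomposed 3D-FFT can use at most one rank per plane of the grid, whereas a volumetric (pencil or block) decomposition splits more than one dimension. The arithmetic below merely restates that limit for the 2048³ grid of the abstract; it is not code from the 3D-RISM program.

      n = 2048                           # grid points per dimension
      slab_max_ranks = n                 # one x-y slab per rank
      pencil_max_ranks = n * n           # one x-pencil per rank
      cores_used = 16_384 * 8            # nodes x cores per node, from the abstract
      print(slab_max_ranks, pencil_max_ranks, cores_used)
      # 2,048 slab ranks cannot occupy 131,072 cores; 4,194,304 pencils easily can.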

  12. Portable implementation model for CFD simulations. Application to hybrid CPU/GPU supercomputers

    NASA Astrophysics Data System (ADS)

    Oyarzun, Guillermo; Borrell, Ricard; Gorobets, Andrey; Oliva, Assensi

    2017-10-01

    Nowadays, high performance computing (HPC) systems experience a disruptive moment with a variety of novel architectures and frameworks, without any clarity of which one is going to prevail. In this context, the portability of codes across different architectures is of major importance. This paper presents a portable implementation model based on an algebraic operational approach for direct numerical simulation (DNS) and large eddy simulation (LES) of incompressible turbulent flows using unstructured hybrid meshes. The strategy proposed consists in representing the whole time-integration algorithm using only three basic algebraic operations: sparse matrix-vector product, a linear combination of vectors and dot product. The main idea is based on decomposing the nonlinear operators into a concatenation of two SpMV operations. This provides high modularity and portability. An exhaustive analysis of the proposed implementation for hybrid CPU/GPU supercomputers has been conducted with tests using up to 128 GPUs. The main objective consists in understanding the challenges of implementing CFD codes on new architectures.
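
    A minimal sketch of the "three basic algebraic operations" idea, assuming a simple 1-D Laplacian stand-in for the unstructured CFD operator of the paper: one explicit time step is expressed purely as a sparse matrix-vector product, a vector update, and a dot product, so porting the whole integrator reduces to porting those three kernels.

      import numpy as np
      import scipy.sparse as sp

      n = 1000
      L = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n), format="csr")
      u = np.random.default_rng(0).standard_normal(n)
      dt = 0.25

      for _ in range(100):
          r = L @ u                  # sparse matrix-vector product (SpMV)
          u = u + dt * r             # linear combination of vectors (axpy)
      print(np.sqrt(r @ r))          # dot product, e.g. for a convergence check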

  13. The core legion object model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lewis, M.; Grimshaw, A.

    1996-12-31

    The Legion project at the University of Virginia is an architecture for designing and building system services that provide the illusion of a single virtual machine to users, a virtual machine that provides secure shared object and shared name spaces, application adjustable fault-tolerance, improved response time, and greater throughput. Legion targets wide area assemblies of workstations, supercomputers, and parallel supercomputers. Legion tackles problems not solved by existing workstation-based parallel processing tools; the system will enable fault-tolerance, wide area parallel processing, inter-operability, heterogeneity, a single global name space, protection, security, efficient scheduling, and comprehensive resource management. This paper describes the core Legion object model, which specifies the composition and functionality of Legion's core objects - those objects that cooperate to create, locate, manage, and remove objects in the Legion system. The object model facilitates a flexible extensible implementation, provides a single global name space, grants site autonomy to participating organizations, and scales to millions of sites and trillions of objects.

  14. Design applications for supercomputers

    NASA Technical Reports Server (NTRS)

    Studerus, C. J.

    1987-01-01

    The complexity of codes for solutions of real aerodynamic problems has progressed from simple two-dimensional models to three-dimensional inviscid and viscous models. As the algorithms used in the codes increased in accuracy, speed and robustness, the codes were steadily incorporated into standard design processes. The highly sophisticated codes, which provide solutions to the truly complex flows, require computers with large memory and high computational speed. The advent of high-speed supercomputers, such that the solutions of these complex flows become more practical, permits the introduction of the codes into the design system at an earlier stage. The results of several codes which either were already introduced into the design process or are rapidly in the process of becoming so, are presented. The codes fall into the area of turbomachinery aerodynamics and hypersonic propulsion. In the former category, results are presented for three-dimensional inviscid and viscous flows through nozzle and unducted fan bladerows. In the latter category, results are presented for two-dimensional inviscid and viscous flows for hypersonic vehicle forebodies and engine inlets.

  15. Optimizing the Performance of Reactive Molecular Dynamics Simulations for Multi-core Architectures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aktulga, Hasan Metin; Coffman, Paul; Shan, Tzu-Ray

    2015-12-01

    Hybrid parallelism allows high performance computing applications to better leverage the increasing on-node parallelism of modern supercomputers. In this paper, we present a hybrid parallel implementation of the widely used LAMMPS/ReaxC package, where the construction of bonded and nonbonded lists and evaluation of complex ReaxFF interactions are implemented efficiently using OpenMP parallelism. Additionally, the performance of the QEq charge equilibration scheme is examined and a dual-solver is implemented. We present the performance of the resulting ReaxC-OMP package on Mira, a state-of-the-art multi-core IBM Blue Gene/Q supercomputer. For system sizes ranging from 32 thousand to 16.6 million particles, speedups in the range of 1.5-4.5x are observed using the new ReaxC-OMP software. Sustained performance improvements have been observed for up to 262,144 cores (1,048,576 processes) of Mira with a weak scaling efficiency of 91.5% in larger simulations containing 16.6 million particles.

  16. Scaling of Multimillion-Atom Biological Molecular Dynamics Simulation on a Petascale Supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schulz, Roland; Lindner, Benjamin; Petridis, Loukas

    2009-01-01

    A strategy is described for a fast all-atom molecular dynamics simulation of multimillion-atom biological systems on massively parallel supercomputers. The strategy is developed using benchmark systems of particular interest to bioenergy research, comprising models of cellulose and lignocellulosic biomass in an aqueous solution. The approach involves using the reaction field (RF) method for the computation of long-range electrostatic interactions, which permits efficient scaling on many thousands of cores. Although the range of applicability of the RF method for biomolecular systems remains to be demonstrated, for the benchmark systems the use of the RF produces molecular dipole moments, Kirkwood G factors, other structural properties, and mean-square fluctuations in excellent agreement with those obtained with the commonly used Particle Mesh Ewald method. With RF, three million- and five million-atom biological systems scale well up to 30k cores, producing 30 ns/day. Atomistic simulations of very large systems for time scales approaching the microsecond would, therefore, appear now to be within reach.

  17. Scaling of Multimillion-Atom Biological Molecular Dynamics Simulation on a Petascale Supercomputer.

    PubMed

    Schulz, Roland; Lindner, Benjamin; Petridis, Loukas; Smith, Jeremy C

    2009-10-13

    A strategy is described for a fast all-atom molecular dynamics simulation of multimillion-atom biological systems on massively parallel supercomputers. The strategy is developed using benchmark systems of particular interest to bioenergy research, comprising models of cellulose and lignocellulosic biomass in an aqueous solution. The approach involves using the reaction field (RF) method for the computation of long-range electrostatic interactions, which permits efficient scaling on many thousands of cores. Although the range of applicability of the RF method for biomolecular systems remains to be demonstrated, for the benchmark systems the use of the RF produces molecular dipole moments, Kirkwood G factors, other structural properties, and mean-square fluctuations in excellent agreement with those obtained with the commonly used Particle Mesh Ewald method. With RF, three million- and five million-atom biological systems scale well up to ∼30k cores, producing ∼30 ns/day. Atomistic simulations of very large systems for time scales approaching the microsecond would, therefore, appear now to be within reach.
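
    The reaction-field idea is that beyond a cutoff the solvent is treated as a dielectric continuum, which adds a quadratic term to the Coulomb pair interaction and makes it vanish smoothly at the cutoff. The sketch below uses a common textbook form of that potential in reduced units; it is not code from the paper, and the cutoff radius and dielectric constant are placeholder values.

      def rf_pair_energy(qi, qj, r, r_cut=1.2, eps_rf=78.0):
          """Coulomb pair energy with a reaction-field correction, units with 1/(4*pi*eps0) = 1."""
          if r >= r_cut:
              return 0.0
          k_rf = (eps_rf - 1.0) / ((2.0 * eps_rf + 1.0) * r_cut**3)
          c_rf = 1.0 / r_cut + k_rf * r_cut**2
          return qi * qj * (1.0 / r + k_rf * r**2 - c_rf)

      print(rf_pair_energy(+1.0, -1.0, 0.3))     # strongly attractive at short range
      print(rf_pair_energy(+1.0, -1.0, 1.199))   # ~0 just inside the cutoff

    Because every interaction is strictly zero beyond the cutoff, each rank only needs data from its spatial neighbours, which is what allows the scheme to scale to many thousands of cores.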

  18. Advanced Computing for Manufacturing.

    ERIC Educational Resources Information Center

    Erisman, Albert M.; Neves, Kenneth W.

    1987-01-01

    Discusses ways that supercomputers are being used in the manufacturing industry, including the design and production of airplanes and automobiles. Describes problems that need to be solved in the next few years for supercomputers to assume a major role in industry. (TW)

  19. Efficient development of memory bounded geo-applications to scale on modern supercomputers

    NASA Astrophysics Data System (ADS)

    Räss, Ludovic; Omlin, Samuel; Licul, Aleksandar; Podladchikov, Yuri; Herman, Frédéric

    2016-04-01

    Numerical modeling is a key tool in the geosciences. The current challenge is to solve problems that are multi-physics and for which the length scale and the place of occurrence might not be known in advance. Also, the spatial extent of the investigated domain might strongly vary in size, ranging from millimeters for reactive transport to kilometers for glacier erosion dynamics. An efficient way to proceed is to develop simple but robust algorithms that perform well and scale on modern supercomputers and therefore permit very high-resolution simulations. We propose an efficient approach to solve memory-bounded real-world applications on modern supercomputer architectures. We optimize the software to run on our newly acquired state-of-the-art GPU cluster "octopus". Our approach shows promising preliminary results on important geodynamical and geomechanical problems: we have developed a Stokes solver for glacier flow and a poromechanical solver including complex rheologies for nonlinear waves in stressed porous rocks. We solve the system of partial differential equations on a regular Cartesian grid and use an iterative finite difference scheme with preconditioning of the residuals. The MPI communication happens only locally (point-to-point); this method is known to scale linearly by construction. The "octopus" GPU cluster, which we use for the computations, has been designed to achieve maximal data transfer throughput at minimal hardware cost. It is composed of twenty compute nodes, each hosting four Nvidia Titan X GPU accelerators. These high-density nodes are interconnected with a parallel (dual-rail) FDR InfiniBand network. Our efforts show promising preliminary results for the different physics investigated. The glacier flow solver achieves good accuracy in the relevant benchmarks, and the coupled poromechanical solver allows us to explain previously unresolvable focused fluid flow as a natural outcome of the porosity setup. In both cases, near-peak memory bandwidth transfer is achieved. Our approach allows us to get the best out of the current hardware.
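
    A minimal sketch of the iterative, matrix-free finite-difference idea described above: the solution is advanced in pseudo-time by repeatedly adding a scaled residual, so each sweep touches only nearest-neighbour values and is limited by memory bandwidth rather than by arithmetic. This 1-D Poisson toy, with illustrative parameters, is not the GPU Stokes or poromechanical solver itself.

      import numpy as np

      nx = 128
      dx = 1.0 / (nx - 1)
      f = np.ones(nx)                        # right-hand side of d2u/dx2 = -f
      u = np.zeros(nx)                       # Dirichlet boundaries u[0] = u[-1] = 0
      dtau = 0.4 * dx**2                     # pseudo-time step, stability-limited

      for it in range(200_000):
          residual = np.zeros(nx)
          residual[1:-1] = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx**2 + f[1:-1]
          u += dtau * residual               # only nearest-neighbour data is read
          if np.max(np.abs(residual)) < 1e-8:
              break
      print(it, u.max())                     # analytic maximum is 1/8 = 0.125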

  20. High Temporal Resolution Mapping of Seismic Noise Sources Using Heterogeneous Supercomputers

    NASA Astrophysics Data System (ADS)

    Paitz, P.; Gokhberg, A.; Ermert, L. A.; Fichtner, A.

    2017-12-01

    The time- and space-dependent distribution of seismic noise sources is becoming a key ingredient of modern real-time monitoring of various geo-systems like earthquake fault zones, volcanoes, geothermal and hydrocarbon reservoirs. We present results of an ongoing research project conducted in collaboration with the Swiss National Supercomputing Centre (CSCS). The project aims at building a service providing seismic noise source maps for Central Europe with high temporal resolution. We use source imaging methods based on the cross-correlation of seismic noise records from all seismic stations available in the region of interest. The service is hosted on the CSCS computing infrastructure; all computationally intensive processing is performed on the massively parallel heterogeneous supercomputer "Piz Daint". The solution architecture is based on the Application-as-a-Service concept to provide the interested researchers worldwide with regular access to the noise source maps. The solution architecture includes the following sub-systems: (1) data acquisition responsible for collecting, on a periodic basis, raw seismic records from the European seismic networks, (2) high-performance noise source mapping application responsible for the generation of source maps using cross-correlation of seismic records, (3) back-end infrastructure for the coordination of various tasks and computations, (4) front-end Web interface providing the service to the end-users and (5) data repository. The noise source mapping itself rests on the measurement of logarithmic amplitude ratios in suitably pre-processed noise correlations, and the use of simplified sensitivity kernels. During the implementation we addressed various challenges, in particular, selection of data sources and transfer protocols, automation and monitoring of daily data downloads, ensuring the required data processing performance, design of a general service-oriented architecture for coordination of various sub-systems, and engineering an appropriate data storage solution. The present pilot version of the service implements noise source maps for Switzerland. Extension of the solution to Central Europe is planned for the next project phase.

  1. Supercomputers Join the Fight against Cancer – U.S. Department of Energy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    The Department of Energy has some of the best supercomputers in the world. Now, they’re joining the fight against cancer. Learn about our new partnership with the National Cancer Institute and GlaxoSmithKline Pharmaceuticals.

  2. NAS-current status and future plans

    NASA Technical Reports Server (NTRS)

    Bailey, F. R.

    1987-01-01

    The Numerical Aerodynamic Simulation (NAS) Program has met its first major milestone, the NAS Processing System Network (NPSN) Initial Operating Configuration (IOC). The program has met its goal of providing a national supercomputer facility capable of greatly enhancing the Nation's research and development efforts. Furthermore, the program is fulfilling its pathfinder role by defining and implementing a paradigm for supercomputing system environments. The IOC is only the beginning, and the NAS Program will aggressively continue to develop and implement emerging supercomputer, communications, storage, and software technologies to strengthen computations as a critical element in supporting the Nation's leadership role in aeronautics.

  3. CRAY mini manual. Revision D

    NASA Technical Reports Server (NTRS)

    Tennille, Geoffrey M.; Howser, Lona M.

    1993-01-01

    This document briefly describes the use of the CRAY supercomputers that are an integral part of the Supercomputing Network Subsystem of the Central Scientific Computing Complex at LaRC. Features of the CRAY supercomputers are covered, including: FORTRAN, C, PASCAL, architectures of the CRAY-2 and CRAY Y-MP, the CRAY UNICOS environment, batch job submittal, debugging, performance analysis, parallel processing, utilities unique to CRAY, and documentation. The document is intended for all CRAY users as a ready reference to frequently asked questions and to more detailed information contained in the vendor manuals. It is appropriate for both the novice and the experienced user.

  4. Scaling of data communications for an advanced supercomputer network

    NASA Technical Reports Server (NTRS)

    Levin, E.; Eaton, C. K.; Young, Bruce

    1986-01-01

    The goal of NASA's Numerical Aerodynamic Simulation (NAS) Program is to provide a powerful computational environment for advanced research and development in aeronautics and related disciplines. The present NAS system consists of a Cray 2 supercomputer connected by a data network to a large mass storage system, to sophisticated local graphics workstations and by remote communication to researchers throughout the United States. The program plan is to continue acquiring the most powerful supercomputers as they become available. The implications of a projected 20-fold increase in processing power on the data communications requirements are described.

  5. Some recent applications of Navier-Stokes codes to rotorcraft

    NASA Technical Reports Server (NTRS)

    Mccroskey, W. J.

    1992-01-01

    Many operational limitations of helicopters and other rotary-wing aircraft are due to nonlinear aerodynamic phenomena including unsteady, three-dimensional transonic and separated flow near the surfaces and highly vortical flow in the wakes of rotating blades. Modern computational fluid dynamics (CFD) technology offers new tools to study and simulate these complex flows. However, existing Euler and Navier-Stokes codes have to be modified significantly for rotorcraft applications, and the enormous computational requirements presently limit their use in routine design applications. Nevertheless, the Euler/Navier-Stokes technology is progressing in anticipation of future supercomputers that will enable meaningful calculations to be made for complete rotorcraft configurations.

  6. JESPP: Joint Experimentation on Scalable Parallel Processors Supercomputers

    DTIC Science & Technology

    2010-03-01

    ... were for the relatively small market of scientific and engineering applications. Contrast this with GPUs that are designed to improve the end-user experience in mass-market arenas such as gaming. In order to get meaningful speed-up using the GPU, it was determined that the data transfer and ... Related report titles include "Effectively using a Large GPGPU-Enhanced Linux Cluster" (HPCMP UGC 2009) and "FLOPS per Watt: Heterogeneous-Computing's Approach".

  7. Performance Analysis and Scaling Behavior of the Terrestrial Systems Modeling Platform TerrSysMP in Large-Scale Supercomputing Environments

    NASA Astrophysics Data System (ADS)

    Kollet, S. J.; Goergen, K.; Gasper, F.; Shresta, P.; Sulis, M.; Rihani, J.; Simmer, C.; Vereecken, H.

    2013-12-01

    In studies of the terrestrial hydrologic, energy and biogeochemical cycles, integrated multi-physics simulation platforms take a central role in characterizing non-linear interactions, variances and uncertainties of system states and fluxes in reciprocity with observations. Recently developed integrated simulation platforms attempt to honor the complexity of the terrestrial system across multiple time and space scales, from the deeper subsurface including groundwater dynamics into the atmosphere. Technically, this requires the coupling of atmospheric, land surface, and subsurface-surface flow models in supercomputing environments, while ensuring a high degree of efficiency in the utilization of, e.g., standard Linux clusters and massively parallel resources. A systematic performance analysis including profiling and tracing in such an application is crucial for understanding the runtime behavior, identifying optimum model settings, and detecting potential parallel deficiencies. On sophisticated leadership-class supercomputers, such as the 28-rack 5.9 petaFLOP IBM Blue Gene/Q 'JUQUEEN' of the Jülich Supercomputing Centre (JSC), this is a challenging task, but all the more important when complex coupled component models are to be analysed. Here we present our experience with coupling, application tuning (e.g., a 5-times speedup through compiler optimizations), parallel scaling and performance monitoring of the parallel Terrestrial Systems Modeling Platform TerrSysMP. The modeling platform consists of the weather prediction system COSMO of the German Weather Service; the Community Land Model, CLM, of NCAR; and the variably saturated surface-subsurface flow code ParFlow. The model system relies on the Multiple Program Multiple Data (MPMD) execution model, in which the external Ocean-Atmosphere-Sea-Ice-Soil coupler (OASIS3) links the component models. TerrSysMP has been instrumented with the performance analysis tool Scalasca and analyzed on JUQUEEN with processor counts on the order of 10,000. The instrumentation is used in weak and strong scaling studies with real data cases and hypothetical idealized numerical experiments for detailed profiling and tracing analysis. The profiling is useful not only in identifying wait states that are due to the MPMD execution model, but also in fine-tuning resource allocation to the component models in search of the most suitable load balancing. This is especially necessary because, with numerical experiments that cover multiple (high-resolution) spatial scales, the time stepping, coupling frequencies, and communication overheads are constantly shifting, so the model setup must be re-determined with each new experimental design.
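
    A minimal sketch of the scaling quantities behind such a study (speedup and parallel efficiency from wall-clock timings in strong and weak scaling experiments) is given below. The timing numbers are invented for illustration and are not TerrSysMP measurements.

    ```python
    # Hedged sketch: speedup and parallel efficiency from wall-clock timings,
    # the basic quantities behind strong/weak scaling studies. The timings are
    # illustrative placeholders, not measurements from TerrSysMP or JUQUEEN.

    # strong scaling: fixed problem size, growing processor count
    strong = {512: 1000.0, 1024: 530.0, 2048: 290.0, 4096: 170.0}  # seconds
    p0 = min(strong)
    t0 = strong[p0]
    for p, t in sorted(strong.items()):
        speedup = t0 / t
        eff = speedup / (p / p0)
        print(f"strong  p={p:5d}  speedup={speedup:5.2f}  efficiency={eff:6.1%}")

    # weak scaling: problem size grows with processor count, ideal time is constant
    weak = {512: 1000.0, 1024: 1040.0, 2048: 1110.0, 4096: 1230.0}
    t0 = weak[min(weak)]
    for p, t in sorted(weak.items()):
        print(f"weak    p={p:5d}  efficiency={t0 / t:6.1%}")
    ```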

  8. Performance Analysis, Modeling and Scaling of HPC Applications and Tools

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bhatele, Abhinav

    2016-01-13

    Efficient use of supercomputers at DOE centers is vital for maximizing system throughput, minimizing energy costs and enabling science breakthroughs faster. This requires complementary efforts along several directions to optimize the performance of scientific simulation codes and the underlying runtimes and software stacks. This in turn requires providing scalable performance analysis tools and modeling techniques that can provide feedback to physicists and computer scientists developing the simulation codes and runtimes respectively. The PAMS project is using time allocations on supercomputers at ALCF, NERSC and OLCF to further the goals described above by performing research along the following fronts: 1. Scaling Study of HPC applications; 2. Evaluation of Programming Models; 3. Hardening of Performance Tools; 4. Performance Modeling of Irregular Codes; and 5. Statistical Analysis of Historical Performance Data. We are a team of computer and computational scientists funded by both DOE/NNSA and DOE/ASCR programs such as ECRP, XStack (Traleika Glacier, PIPER), ExaOSR (ARGO), SDMAV II (MONA) and PSAAP II (XPACC). This allocation will enable us to study big data issues when analyzing performance on leadership computing class systems and to assist the HPC community in making the most effective use of these resources.

  9. Roadrunner Supercomputer Breaks the Petaflop Barrier

    ScienceCinema

    Los Alamos National Lab - Brian Albright, Charlie McMillan, Lin Yin

    2017-12-09

    At 3:30 a.m. on May 26, 2008, Memorial Day, the "Roadrunner" supercomputer exceeded a sustained speed of 1 petaflop/s, or 1 million billion calculations per second. The sustained performance makes Roadrunner more than twice as fast as the current number 1

  10. QCD on the BlueGene/L Supercomputer

    NASA Astrophysics Data System (ADS)

    Bhanot, G.; Chen, D.; Gara, A.; Sexton, J.; Vranas, P.

    2005-03-01

    In June 2004 QCD was simulated for the first time at sustained speed exceeding 1 TeraFlops in the BlueGene/L supercomputer at the IBM T.J. Watson Research Lab. The implementation and performance of QCD in the BlueGene/L is presented.

  11. Supercomputer Issues from a University Perspective.

    ERIC Educational Resources Information Center

    Beering, Steven C.

    1984-01-01

    Discusses issues related to the access of and training of university researchers in using supercomputers, considering National Science Foundation's (NSF) role in this area, microcomputers on campuses, and the limited use of existing telecommunication networks. Includes examples of potential scientific projects (by subject area) utilizing…

  12. Towards the Interoperability of Web, Database, and Mass Storage Technologies for Petabyte Archives

    NASA Technical Reports Server (NTRS)

    Moore, Reagan; Marciano, Richard; Wan, Michael; Sherwin, Tom; Frost, Richard

    1996-01-01

    At the San Diego Supercomputer Center, a massive data analysis system (MDAS) is being developed to support data-intensive applications that manipulate terabyte-sized data sets. The objective is to support scientific application access to data whether it is located at a Web site, stored as an object in a database, and/or stored in an archival storage system. We are developing a suite of demonstration programs which illustrate how Web, database (DBMS), and archival storage (mass storage) technologies can be integrated. An application presentation interface is being designed that integrates data access to all of these sources. We have developed a data movement interface between the Illustra object-relational database and the NSL UniTree archival storage system running in a production mode at the San Diego Supercomputer Center. With this interface, an Illustra client can transparently access data on UniTree under the control of the Illustra DBMS server. The current implementation is based on the creation of a new DBMS storage manager class, and a set of library functions that allow the manipulation and migration of data stored as Illustra 'large objects'. We have extended this interface to allow a Web client application to control data movement between its local disk, the Web server, the DBMS Illustra server, and the UniTree mass storage environment. This paper describes some of the current approaches to successfully integrating these technologies. This framework is measured against a representative sample of environmental data extracted from the San Diego Bay Environmental Data Repository. Practical lessons are drawn and critical research areas are highlighted.

  13. CyberShake Physics-Based PSHA in Central California

    NASA Astrophysics Data System (ADS)

    Callaghan, S.; Maechling, P. J.; Goulet, C. A.; Milner, K. R.; Graves, R. W.; Olsen, K. B.; Jordan, T. H.

    2017-12-01

    The Southern California Earthquake Center (SCEC) has developed a simulation platform, CyberShake, which performs physics-based probabilistic seismic hazard analysis (PSHA) using 3D deterministic wave propagation simulations. CyberShake performs PSHA by simulating a wavefield of Strain Green Tensors. An earthquake rupture forecast (ERF) is then extended by varying hypocenters and slips on finite faults, generating about 500,000 events per site of interest. Seismic reciprocity is used to calculate synthetic seismograms, which are processed to obtain intensity measures (IMs) such as RotD100. These are combined with ERF probabilities to produce hazard curves. PSHA results from hundreds of locations across a region are interpolated to produce a hazard map. CyberShake simulations with SCEC 3D Community Velocity Models have shown how site and path effects vary with differences in upper crustal structure, and they are particularly informative about epistemic uncertainties in basin effects, which are not well parameterized by depths to iso-velocity surfaces, common inputs to GMPEs. In 2017, SCEC performed CyberShake Study 17.3, expanding into Central California for the first time. Seismic hazard calculations were performed at 1 Hz at 438 sites, using both a 3D tomographically-derived central California velocity model and a regionally averaged 1D model. Our simulation volumes extended outside of Central California, so we included other SCEC velocity models and developed a smoothing algorithm to minimize reflection and refraction effects along interfaces. CyberShake Study 17.3 ran for 31 days on NCSA's Blue Waters and ORNL's Titan supercomputers, consuming 21.6 million core-hours and producing 285 million two-component seismograms and 43 billion IMs. These results demonstrate that CyberShake can be successfully expanded into new regions, and lend insights into the effects of directivity-basin coupling associated with basins near major faults such as the San Andreas. In particular, we observe in the 3D results that basin amplification for sites in the southern San Joaquin Valley is less than for sites in smaller basins such as around Ventura. We will present CyberShake hazard estimates from the 1D and 3D models, compare results to those from previous CyberShake studies and GMPEs, and describe our future plans.
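
    As a hedged illustration of the final PSHA step described above (combining intensity measures with rupture rates into hazard curves), the sketch below computes an exceedance-rate curve from synthetic per-event rates and IMs. It is not CyberShake code, and all values are placeholders.

    ```python
    # Hedged sketch: combine per-rupture intensity measures (IMs) with rupture
    # rates into a hazard curve (annual rate of exceedance). Values are
    # synthetic; CyberShake itself uses ~500,000 events per site and RotD IMs.
    import numpy as np

    rng = np.random.default_rng(1)
    n_events = 5000
    annual_rate = rng.uniform(1e-6, 1e-3, n_events)                   # per rupture variation
    im = rng.lognormal(mean=np.log(0.05), sigma=0.9, size=n_events)   # e.g. spectral accel. in g

    im_levels = np.logspace(-2, 0, 25)                                # hazard-curve x-axis
    exceedance_rate = np.array([annual_rate[im > x].sum() for x in im_levels])

    # probability of exceedance in 50 years under a Poisson assumption
    poe_50yr = 1.0 - np.exp(-exceedance_rate * 50.0)

    for x, r, p in zip(im_levels[::6], exceedance_rate[::6], poe_50yr[::6]):
        print(f"IM > {x:6.3f} g : rate = {r:.3e} /yr, 50-yr PoE = {p:.3%}")
    ```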

  14. Advances and issues from the simulation of planetary magnetospheres with recent supercomputer systems

    NASA Astrophysics Data System (ADS)

    Fukazawa, K.; Walker, R. J.; Kimura, T.; Tsuchiya, F.; Murakami, G.; Kita, H.; Tao, C.; Murata, K. T.

    2016-12-01

    Planetary magnetospheres are very large, while phenomena within them occur on meso- and micro-scales. These scales range from tens of planetary radii down to kilometers. To understand dynamics in these multi-scale systems, numerical simulations have been performed using supercomputer systems. We have long studied the magnetospheres of Earth, Jupiter, and Saturn using three-dimensional magnetohydrodynamic (MHD) simulations; however, we have not captured phenomena near the limits of the MHD approximation. In particular, we have not studied the meso-scale phenomena that can be addressed with MHD. Recently we performed an MHD simulation of Earth's magnetosphere using the K computer, the first 10 PFlops supercomputer, and obtained multi-scale flow vorticity for both northward and southward IMF. Furthermore, we have access to supercomputer systems with Xeon, SPARC64, and vector-type CPUs and can compare simulation results between the different systems. Finally, we have compared the results of our parameter survey of the magnetosphere with observations from the HISAKI spacecraft. We have encountered a number of difficulties in effectively using the latest supercomputer systems. First, the size of the simulation output increases greatly: a simulation group now produces over 1 PB of output, and storage and analysis of this much data is difficult. The traditional way to analyze simulation results is to move them to the investigator's home computer, which takes over three months using an end-to-end 10 Gbps network; in reality, problems at some nodes, such as firewalls, can increase the transfer time to over one year. Another issue is post-processing: it is hard to handle even a few TB of simulation output due to the memory limitations of a post-processing computer. To overcome these issues, we have developed and introduced parallel network storage, a highly efficient network protocol, and CUI-based visualization tools. In this study, we will show the latest simulation results obtained with petascale supercomputers and discuss problems arising from the use of these supercomputer systems.

  15. Finite element methods on supercomputers - The scatter-problem

    NASA Technical Reports Server (NTRS)

    Loehner, R.; Morgan, K.

    1985-01-01

    Certain problems arise in connection with the use of supercomputers for the implementation of finite-element methods. These problems are related to the desirability of utilizing the power of the supercomputer as fully as possible for the rapid execution of the required computations, taking into account the gain in speed possible with the aid of pipelining of operations. For the finite-element method, the time-consuming operations may be divided into three categories. The first two present no problems, while the third type of operation can be a cause of inefficient performance of finite-element programs. Two possibilities for overcoming these difficulties are proposed, with attention given to the scatter process.
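
    The "scatter" operation referred to above can be illustrated with a minimal sketch: element contributions are accumulated into a global array through an index list containing repeated node numbers, which is the data dependence that hampers pipelined or vectorized execution. This is illustrative only, not the paper's formulation; the mesh and values are hypothetical.

    ```python
    # Hedged sketch of the finite-element "scatter" step: element contributions
    # are added into a global array through an index list with repeated node
    # numbers, the data dependence that frustrates naive vectorization.
    import numpy as np

    n_nodes = 10
    n_elem = n_nodes - 1
    conn = np.array([[e, e + 1] for e in range(n_elem)])   # element -> node indices
    elem_contrib = np.ones((n_elem, 2))                    # per-element nodal values

    # scalar scatter-add: safe but inherently sequential in spirit
    f_serial = np.zeros(n_nodes)
    for e in range(n_elem):
        for a in range(2):
            f_serial[conn[e, a]] += elem_contrib[e, a]

    # vector-style scatter-add: np.add.at handles the repeated indices correctly
    # (a plain fancy-index assignment would silently drop duplicate updates)
    f_vec = np.zeros(n_nodes)
    np.add.at(f_vec, conn.ravel(), elem_contrib.ravel())

    assert np.allclose(f_serial, f_vec)
    print(f_vec)     # interior nodes receive contributions from two elements
    ```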

  16. Code IN Exhibits - Supercomputing 2000

    NASA Technical Reports Server (NTRS)

    Yarrow, Maurice; McCann, Karen M.; Biswas, Rupak; VanderWijngaart, Rob F.; Kwak, Dochan (Technical Monitor)

    2000-01-01

    The creation of parameter study suites has recently become a more challenging problem as the parameter studies have become multi-tiered and the computational environment has become a supercomputer grid. The parameter spaces are vast, the individual problem sizes are getting larger, and researchers are seeking to combine several successive stages of parameterization and computation. Simultaneously, grid-based computing offers immense resource opportunities but at the expense of great difficulty of use. We present ILab, an advanced graphical user interface approach to this problem. Our novel strategy stresses intuitive visual design tools for parameter study creation and complex process specification, and also offers programming-free access to grid-based supercomputer resources and process automation.

  17. NSF Establishes First Four National Supercomputer Centers.

    ERIC Educational Resources Information Center

    Lepkowski, Wil

    1985-01-01

    The National Science Foundation (NSF) has awarded support for supercomputer centers at Cornell University, Princeton University, University of California (San Diego), and University of Illinois. These centers are to be the nucleus of a national academic network for use by scientists and engineers throughout the United States. (DH)

  18. Library Services in a Supercomputer Center.

    ERIC Educational Resources Information Center

    Layman, Mary

    1991-01-01

    Describes library services that are offered at the San Diego Supercomputer Center (SDSC), which is located at the University of California at San Diego. Topics discussed include the user population; online searching; microcomputer use; electronic networks; current awareness programs; library catalogs; and the slide collection. A sidebar outlines…

  19. Probing the cosmic causes of errors in supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    Cosmic rays from outer space are causing errors in supercomputers. The neutrons that pass through the CPU may be causing binary data to flip leading to incorrect calculations. Los Alamos National Laboratory has developed detectors to determine how much data is being corrupted by these cosmic particles.

  20. The feasibility of an efficient drug design method with high-performance computers.

    PubMed

    Yamashita, Takefumi; Ueda, Akihiko; Mitsui, Takashi; Tomonaga, Atsushi; Matsumoto, Shunji; Kodama, Tatsuhiko; Fujitani, Hideaki

    2015-01-01

    In this study, we propose a supercomputer-assisted drug design approach involving all-atom molecular dynamics (MD)-based binding free energy prediction after the traditional design/selection step. Because this prediction is more accurate than the empirical binding affinity scoring of the traditional approach, the compounds selected by the MD-based prediction should be better drug candidates. Here we discuss the applicability of the new approach using two examples. Although the MD-based binding free energy prediction has a huge computational cost, it is feasible with the latest 10-petaflop-scale computers. The supercomputer-assisted drug design approach also involves two important feedback procedures. The first feedback runs from the MD-based binding free energy prediction step to the drug design step. While experimental feedback usually provides binding affinities of tens of compounds at one time, the supercomputer allows us to simultaneously obtain the binding free energies of hundreds of compounds. Because the number of calculated binding free energies is sufficiently large, the compounds can be classified into different categories whose properties will aid in the design of the next generation of drug candidates. The second feedback, which runs from the experiments to the MD simulations, is important for validating the simulation parameters. To demonstrate this, we compare the binding free energies calculated with various force fields to the experimental ones. The results indicate that the prediction will not be very successful if we use an inaccurate force field. By improving/validating such simulation parameters, the next prediction can be made more accurate.
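
    As a hedged illustration of the force-field validation step described above, the sketch below compares calculated binding free energies with experimental values via RMSE and correlation. The numbers and force-field labels are synthetic placeholders, not data from the study.

    ```python
    # Hedged sketch of the validation step: compare calculated binding free
    # energies against experimental values for several force fields.
    # All values are synthetic placeholders, not results from the study.
    import numpy as np

    dg_exp = np.array([-8.1, -9.4, -7.2, -10.3, -6.8, -9.0])       # kcal/mol
    dg_calc = {
        "force_field_A": np.array([-7.9, -9.8, -6.9, -10.9, -6.2, -9.3]),
        "force_field_B": np.array([-5.1, -7.0, -8.8, -7.4, -9.9, -6.0]),
    }

    for name, dg in dg_calc.items():
        rmse = np.sqrt(np.mean((dg - dg_exp) ** 2))
        r = np.corrcoef(dg, dg_exp)[0, 1]
        print(f"{name}: RMSE = {rmse:.2f} kcal/mol, Pearson r = {r:.2f}")
    ```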

  1. ICON-MIC: Implementing a CPU/MIC Collaboration Parallel Framework for ICON on Tianhe-2 Supercomputer.

    PubMed

    Wang, Zihao; Chen, Yu; Zhang, Jingrong; Li, Lun; Wan, Xiaohua; Liu, Zhiyong; Sun, Fei; Zhang, Fa

    2018-03-01

    Electron tomography (ET) is an important technique for studying the three-dimensional structures of biological ultrastructure. Recently, ET has reached sub-nanometer resolution for investigating the native structure and conformational dynamics of macromolecular complexes when combined with the sub-tomogram averaging approach. Due to the limited sampling angles, ET reconstruction typically suffers from the "missing wedge" problem. Using a validation procedure, iterative compressed-sensing optimized nonuniform fast Fourier transform (NUFFT) reconstruction (ICON) demonstrates its power in restoring validated missing information for low-signal-to-noise-ratio biological ET datasets. However, the huge computational demand has become a bottleneck for the application of ICON. In this work, we implemented a parallel acceleration technology, ICON-MIC (many integrated core), on Xeon Phi cards to address the huge computational demand of ICON. We parallelize the element-wise matrix operations and use an efficient summation of a matrix to reduce the cost of matrix computation. We also developed parallel versions of NUFFT on MIC to achieve a high acceleration of ICON by using more efficient fast Fourier transform (FFT) calculation. We then proposed a hybrid task allocation strategy (two-level load balancing) to improve the overall performance of ICON-MIC by making full use of the idle resources on the Tianhe-2 supercomputer. Experimental results using two different datasets show that ICON-MIC has high accuracy in biological specimens under different noise levels and a significant acceleration, up to 13.3×, compared with the CPU version. Further, ICON-MIC has good scalability efficiency and overall performance on the Tianhe-2 supercomputer.
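
    A hedged sketch of a two-level task-allocation idea of the kind described above is shown below: a static first-level split of work between the host CPU and accelerator cards in proportion to assumed relative throughput, followed by fixed-size chunking within each device. It is illustrative only; the device names and throughput figures are assumptions, and it is not the ICON-MIC strategy itself.

    ```python
    # Hedged sketch of a two-level work split: static proportional allocation
    # across devices, then fixed-size chunks inside each device's queue.
    # Device names and throughput ratios are hypothetical.
    from collections import deque

    n_slices = 1000
    devices = {"cpu": 1.0, "mic0": 2.6, "mic1": 2.6}     # relative throughput (assumed)

    # level 1: static proportional split of slices across devices
    total = sum(devices.values())
    shares = {d: int(round(n_slices * w / total)) for d, w in devices.items()}
    shares["cpu"] += n_slices - sum(shares.values())     # absorb rounding remainder

    # level 2: each device consumes its share in fixed-size chunks, so load
    # imbalance inside a device is bounded by one chunk
    chunk = 32
    offset = 0
    for dev, count in shares.items():
        work = deque(range(offset, offset + count, chunk))   # chunk start indices
        offset += count
        print(f"{dev}: slices [{offset - count}, {offset}), "
              f"{len(work)} chunks of up to {chunk}")
    ```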

  2. Flux-Level Transit Injection Experiments with NASA Pleiades Supercomputer

    NASA Astrophysics Data System (ADS)

    Li, Jie; Burke, Christopher J.; Catanzarite, Joseph; Seader, Shawn; Haas, Michael R.; Batalha, Natalie; Henze, Christopher; Christiansen, Jessie; Kepler Project, NASA Advanced Supercomputing Division

    2016-06-01

    Flux-Level Transit Injection (FLTI) experiments are executed with NASA's Pleiades supercomputer for the Kepler Mission. The latest release (9.3, January 2016) of the Kepler Science Operations Center Pipeline is used in the FLTI experiments. Their purpose is to validate the Analytic Completeness Model (ACM), which can be computed for all Kepler target stars, thereby enabling exoplanet occurrence rate studies. Pleiades, a facility of NASA's Advanced Supercomputing Division, is one of the world's most powerful supercomputers and represents NASA's state-of-the-art technology. We discuss the details of implementing the FLTI experiments on the Pleiades supercomputer. For example, taking into account that ~16 injections are generated by one core of the Pleiades processors in an hour, the “shallow” FLTI experiment, in which ~2000 injections are required per target star, can be done for 16% of all Kepler target stars in about 200 hours. Stripping down the transit search to bare bones, i.e. only searching adjacent high/low periods at high/low pulse durations, makes the computationally intensive FLTI experiments affordable. The design of the FLTI experiments and the analysis of the resulting data are presented in “Validating an Analytic Completeness Model for Kepler Target Stars Based on Flux-level Transit Injection Experiments” by Catanzarite et al. (#2494058).Kepler was selected as the 10th mission of the Discovery Program. Funding for the Kepler Mission has been provided by the NASA Science Mission Directorate.
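
    A back-of-envelope check of the throughput estimate quoted above is sketched below. The total Kepler target-star count (~200,000) is an approximation introduced here, not a figure from the abstract.

    ```python
    # Hedged arithmetic check of the quoted estimate: ~16 injections per
    # core-hour, ~2000 injections per star, 16% of targets in ~200 hours.
    # The Kepler target-star count is an assumed round number (~200,000).
    injections_per_core_hour = 16
    injections_per_star = 2000
    kepler_targets = 200_000            # assumed, approximate
    fraction_of_targets = 0.16
    wall_clock_hours = 200

    stars = fraction_of_targets * kepler_targets
    total_injections = stars * injections_per_star
    core_hours = total_injections / injections_per_core_hour
    cores_needed = core_hours / wall_clock_hours

    print(f"stars: {stars:,.0f}, injections: {total_injections:,.0f}")
    print(f"core-hours: {core_hours:,.0f}, "
          f"cores for {wall_clock_hours} h: {cores_needed:,.0f}")
    ```

    Under these assumptions the experiment needs on the order of 20,000 concurrently busy cores, which is a plausible slice of a machine the size of Pleiades.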

  3. Data Flow for the TERRA-REF project

    NASA Astrophysics Data System (ADS)

    Kooper, R.; Burnette, M.; Maloney, J.; LeBauer, D.

    2017-12-01

    The Transportation Energy Resources from Renewable Agriculture Phenotyping Reference Platform (TERRA-REF) program aims to identify crop traits that are best suited to producing high-energy sustainable biofuels and match those plant characteristics to their genes to speed the plant breeding process. One tool used to achieve this goal is a high-throughput phenotyping robot outfitted with sensors and cameras to monitor the growth of 1.25 acres of sorghum. Data types range from hyperspectral imaging to 3D reconstructions and thermal profiles, all at 1mm resolution. This system produces thousands of daily measurements with high spatiotemporal resolution. The team at NCSA processes, annotates, organizes and stores the massive amounts of data produced by this system - up to 5 TB per day. Data from the sensors is streamed to a local gantry-cache server. The standardized sensor raw data stream is automatically and securely delivered to NCSA using Globus Connect service. Once files have been successfully received by the Globus endpoint, the files are removed from the gantry-cache server. As each dataset arrives or is created the Clowder system automatically triggers different software tools to analyze each file, extract information, and convert files to a common format. Other tools can be triggered to run after all required data is uploaded. For example, a stitched image of the entire field is created after all images of the field become available. Some of these tools were developed by external collaborators based on predictive models and algorithms, others were developed as part of other projects and could be leveraged by the TERRA project. Data will be stored for the lifetime of the project and is estimated to reach 10 PB over 3 years. The Clowder system, BETY and other systems will allow users to easily find data by browsing or searching the extracted information.
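
    As a hedged, generic illustration of the arrival-triggered processing pattern described above (new files land, registered extractors run), the sketch below uses a simple polling watcher. It does not use the Clowder or Globus APIs; every path, extension, and function name is hypothetical.

    ```python
    # Generic sketch of arrival-triggered processing: NOT the Clowder API.
    # All paths, file extensions, and handler names are hypothetical.
    import time
    from pathlib import Path

    LANDING_DIR = Path("/tmp/terra_landing")         # hypothetical landing zone
    extractors = []                                  # (predicate, handler) pairs

    def register(predicate):
        """Register a handler to run on files matching the predicate."""
        def wrap(fn):
            extractors.append((predicate, fn))
            return fn
        return wrap

    @register(lambda p: p.suffix == ".bin")
    def convert_to_common_format(path: Path):
        print(f"converting {path.name} to a common format")

    @register(lambda p: p.suffix == ".tif")
    def extract_metadata(path: Path):
        print(f"extracting metadata from {path.name}")

    def poll_once(seen: set):
        """Run every matching extractor on files not processed yet."""
        for path in sorted(LANDING_DIR.glob("*")):
            if path in seen:
                continue
            seen.add(path)
            for predicate, handler in extractors:
                if predicate(path):
                    handler(path)

    if __name__ == "__main__":
        LANDING_DIR.mkdir(parents=True, exist_ok=True)
        (LANDING_DIR / "scan_001.bin").touch()       # simulate arriving sensor data
        (LANDING_DIR / "scan_001.tif").touch()
        seen = set()
        for _ in range(2):                           # a real service would loop forever
            poll_once(seen)
            time.sleep(1)
    ```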

  4. A parallel algorithm for generation and assembly of finite element stiffness and mass matrices

    NASA Technical Reports Server (NTRS)

    Storaasli, O. O.; Carmona, E. A.; Nguyen, D. T.; Baddourah, M. A.

    1991-01-01

    A new algorithm is proposed for parallel generation and assembly of the finite element stiffness and mass matrices. The proposed assembly algorithm is based on a node-by-node approach rather than the more conventional element-by-element approach. The new algorithm's generality and computation speed-up when using multiple processors are demonstrated for several practical applications on multi-processor Cray Y-MP and Cray 2 supercomputers.
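
    The node-by-node idea can be illustrated with a minimal 1D sketch: each node gathers contributions from the elements attached to it, so different nodes can be processed independently without conflicting writes (unlike element-by-element scatter). This is an illustration under assumed data layouts, not the algorithm as implemented in the paper.

    ```python
    # Hedged 1D sketch of node-by-node assembly: each node gathers from its
    # attached elements, so the loop over nodes has no conflicting writes.
    import numpy as np

    n_nodes = 10
    n_elem = n_nodes - 1
    conn = np.array([[e, e + 1] for e in range(n_elem)])    # element -> nodes
    elem_contrib = np.ones((n_elem, 2))                     # per-element nodal values

    # build node -> (element, local slot) adjacency once
    node_to_elem = [[] for _ in range(n_nodes)]
    for e in range(n_elem):
        for a in range(2):
            node_to_elem[conn[e, a]].append((e, a))

    # node-by-node gather: this loop is embarrassingly parallel over nodes
    f = np.zeros(n_nodes)
    for node, attached in enumerate(node_to_elem):
        f[node] = sum(elem_contrib[e, a] for e, a in attached)

    print(f)        # interior nodes: 2.0, boundary nodes: 1.0
    ```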

  5. Monthly average polar sea-ice concentration

    USGS Publications Warehouse

    Schweitzer, Peter N.

    1995-01-01

    The data contained in this CD-ROM depict monthly averages of sea-ice concentration in the modern polar oceans. These averages were derived from the Scanning Multichannel Microwave Radiometer (SMMR) and Special Sensor Microwave/Imager (SSM/I) instruments aboard satellites of the U.S. Air Force Defense Meteorological Satellite Program from 1978 through 1992. The data are provided as 8-bit images using the Hierarchical Data Format (HDF) developed by the National Center for Supercomputing Applications.

  6. MODA A Framework for Memory Centric Performance Characterization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shrestha, Sunil; Su, Chun-Yi; White, Amanda M.

    2012-06-29

    In the age of massive parallelism, the focus of performance analysis has switched from the processor and related structures to the memory and I/O resources. Adapting to this new reality, a performance analysis tool has to provide a way to analyze resource usage to pinpoint existing and potential problems in a given application. This paper provides an overview of the Memory Observant Data Analysis (MODA) tool, a memory-centric tool first implemented on the Cray XMT supercomputer. Throughout the paper, MODA's capabilities have been showcased with experiments done on matrix multiply and Graph-500 application codes.

  7. NAS: The first year

    NASA Technical Reports Server (NTRS)

    Bailey, F. R.; Kutler, Paul

    1988-01-01

    Discussed are the capabilities of NASA's Numerical Aerodynamic Simulation (NAS) Program and its application as an advanced supercomputing system for computational fluid dynamics (CFD) research. First, the paper describes the NAS computational system, called the NAS Processing System Network, and the advanced computational capabilities it offers as a consequence of carrying out the NAS pathfinder objective. Second, it presents examples of pioneering CFD research accomplished during NAS's first operational year. Examples are included which illustrate CFD applications for predicting fluid phenomena, complementing and supplementing experimentation, and aiding in design. Finally, pacing elements and future directions for CFD and NAS are discussed.

  8. The Sky's the Limit When Super Students Meet Supercomputers.

    ERIC Educational Resources Information Center

    Trotter, Andrew

    1991-01-01

    In a few select high schools in the U.S., supercomputers are allowing talented students to attempt sophisticated research projects using simultaneous simulations of nature, culture, and technology not achievable by ordinary microcomputers. Schools can get their students online by entering contests and seeking grants and partnerships with…

  9. NSF Says It Will Support Supercomputer Centers in California and Illinois.

    ERIC Educational Resources Information Center

    Strosnider, Kim; Young, Jeffrey R.

    1997-01-01

    The National Science Foundation will increase support for supercomputer centers at the University of California, San Diego and the University of Illinois, Urbana-Champaign, while leaving unclear the status of the program at Cornell University (New York) and a cooperative Carnegie-Mellon University (Pennsylvania) and University of Pittsburgh…

  10. Access to Supercomputers. Higher Education Panel Report 69.

    ERIC Educational Resources Information Center

    Holmstrom, Engin Inel

    This survey was conducted to provide the National Science Foundation with baseline information on current computer use in the nation's major research universities, including the actual and potential use of supercomputers. Questionnaires were sent to 207 doctorate-granting institutions; after follow-ups, 167 institutions (91% of the institutions…

  11. NOAA announces significant investment in next generation of supercomputers

    Science.gov Websites

    Today, NOAA announced the next phase in the agency's efforts to increase supercomputing capacity, which in turn will lead to more timely, accurate, and reliable weather forecasts.

  12. Developments in the simulation of compressible inviscid and viscous flow on supercomputers

    NASA Technical Reports Server (NTRS)

    Steger, J. L.; Buning, P. G.

    1985-01-01

    In anticipation of future supercomputers, finite difference codes are rapidly being extended to simulate three-dimensional compressible flow about complex configurations. Some of these developments are reviewed. The importance of computational flow visualization and diagnostic methods to three-dimensional flow simulation is also briefly discussed.

  13. PoPLAR: Portal for Petascale Lifescience Applications and Research

    PubMed Central

    2013-01-01

    Background We are focusing specifically on fast data analysis and retrieval in bioinformatics that will have a direct impact on the quality of human health and the environment. The exponential growth of data generated in biology research, from small atoms to big ecosystems, necessitates an increasingly large computational component to perform analyses. Novel DNA sequencing technologies and complementary high-throughput approaches--such as proteomics, genomics, metabolomics, and meta-genomics--drive data-intensive bioinformatics. While individual research centers or universities could once provide for these applications, this is no longer the case. Today, only specialized national centers can deliver the level of computing resources required to meet the challenges posed by rapid data growth and the resulting computational demand. Consequently, we are developing massively parallel applications to analyze the growing flood of biological data and contribute to the rapid discovery of novel knowledge. Methods The efforts of previous National Science Foundation (NSF) projects provided for the generation of parallel modules for widely used bioinformatics applications on the Kraken supercomputer. We have profiled and optimized the code of some of the scientific community's most widely used desktop and small-cluster-based applications, including BLAST from the National Center for Biotechnology Information (NCBI), HMMER, and MUSCLE; scaled them to tens of thousands of cores on high-performance computing (HPC) architectures; made them robust and portable to next-generation architectures; and incorporated these parallel applications in science gateways with a web-based portal. Results This paper will discuss the various developmental stages, challenges, and solutions involved in taking bioinformatics applications from the desktop to petascale with a front-end portal for very-large-scale data analysis in the life sciences. Conclusions This research will help to bridge the gap between the rate of data generation and the speed at which scientists can study this data. The ability to rapidly analyze data at such a large scale is having a significant, direct impact on science achieved by collaborators who are currently using these tools on supercomputers. PMID:23902523

  14. Adapting the serial Alpgen parton-interaction generator to simulate LHC collisions on millions of parallel threads

    NASA Astrophysics Data System (ADS)

    Childers, J. T.; Uram, T. D.; LeCompte, T. J.; Papka, M. E.; Benjamin, D. P.

    2017-01-01

    As the LHC moves to higher energies and luminosity, the demand for computing resources increases accordingly and will soon outpace the growth of the Worldwide LHC Computing Grid. To meet this greater demand, event generation Monte Carlo was targeted for adaptation to run on Mira, the supercomputer at the Argonne Leadership Computing Facility. Alpgen is a Monte Carlo event generation application that is used by LHC experiments in the simulation of collisions that take place in the Large Hadron Collider. This paper details the process by which Alpgen was adapted from a single-processor serial-application to a large-scale parallel-application and the performance that was achieved.

  15. High Performance Computing Software Applications for Space Situational Awareness

    NASA Astrophysics Data System (ADS)

    Giuliano, C.; Schumacher, P.; Matson, C.; Chun, F.; Duncan, B.; Borelli, K.; Desonia, R.; Gusciora, G.; Roe, K.

    The High Performance Computing Software Applications Institute for Space Situational Awareness (HSAI-SSA) has completed its first full year of applications development. The emphasis of our work in this first year was on improving space surveillance sensor models and image enhancement software. These applications are the Space Surveillance Network Analysis Model (SSNAM), the Air Force Space Fence simulation (SimFence), and the physically constrained iterative de-convolution (PCID) image enhancement software tool. Specifically, we have demonstrated order-of-magnitude speed-ups in those codes running on the latest Cray XD-1 Linux supercomputer (Hoku) at the Maui High Performance Computing Center. The software application improvements that HSAI-SSA has made have had a significant impact on the warfighter and have fundamentally changed the role of high performance computing in SSA.

  16. Supercomputer use in orthopaedic biomechanics research: focus on functional adaptation of bone.

    PubMed

    Hart, R T; Thongpreda, N; Van Buskirk, W C

    1988-01-01

    The authors describe two biomechanical analyses carried out using numerical methods. One is an analysis of the stress and strain in a human mandible, and the other involves modeling the adaptive response of a sheep bone to mechanical loading. The computing environment required for the two types of analyses is discussed. It is shown that a simple stress analysis of a geometrically complex mandible can be accomplished using a minicomputer. However, more sophisticated analyses of the same model with dynamic loading or nonlinear materials would require supercomputer capabilities. A supercomputer is also required for modeling the adaptive response of living bone, even when simple geometric and material models are used.

  17. NREL's Building-Integrated Supercomputer Provides Heating and Efficient Computing (Fact Sheet)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    2014-09-01

    NREL's Energy Systems Integration Facility (ESIF) is meant to investigate new ways to integrate energy sources so they work together efficiently, and one of the key tools in that investigation, a new supercomputer, is itself a prime example of energy systems integration. NREL teamed with Hewlett-Packard (HP) and Intel to develop the innovative warm-water, liquid-cooled Peregrine supercomputer, which not only operates efficiently but also serves as the primary source of building heat for ESIF offices and laboratories. This innovative high-performance computer (HPC) can perform more than a quadrillion calculations per second as part of the world's most energy-efficient HPC data center.

  18. Optimization of large matrix calculations for execution on the Cray X-MP vector supercomputer

    NASA Technical Reports Server (NTRS)

    Hornfeck, William A.

    1988-01-01

    A considerable volume of large computational codes has been developed for NASA over the past twenty-five years. These codes implement algorithms designed for machines of an earlier generation. With the emergence of the vector supercomputer as a viable, commercially available machine, an opportunity exists to evaluate optimization strategies to improve the efficiency of existing software. This opportunity arises primarily from architectural differences between the latest generation of large-scale machines and the earlier, mostly uniprocessor, machines. A software package used by NASA to perform computations on large matrices is described, along with a strategy for converting it to the Cray X-MP vector supercomputer.

  19. NAS Technical Summaries, March 1993 - February 1994

    NASA Technical Reports Server (NTRS)

    1995-01-01

    NASA created the Numerical Aerodynamic Simulation (NAS) Program in 1987 to focus resources on solving critical problems in aeroscience and related disciplines by utilizing the power of the most advanced supercomputers available. The NAS Program provides scientists with the necessary computing power to solve today's most demanding computational fluid dynamics problems and serves as a pathfinder in integrating leading-edge supercomputing technologies, thus benefitting other supercomputer centers in government and industry. The 1993-94 operational year concluded with 448 high-speed processor projects and 95 parallel projects representing NASA, the Department of Defense, other government agencies, private industry, and universities. This document provides a glimpse at some of the significant scientific results for the year.

  20. NAS technical summaries. Numerical aerodynamic simulation program, March 1992 - February 1993

    NASA Technical Reports Server (NTRS)

    1994-01-01

    NASA created the Numerical Aerodynamic Simulation (NAS) Program in 1987 to focus resources on solving critical problems in aeroscience and related disciplines by utilizing the power of the most advanced supercomputers available. The NAS Program provides scientists with the necessary computing power to solve today's most demanding computational fluid dynamics problems and serves as a pathfinder in integrating leading-edge supercomputing technologies, thus benefitting other supercomputer centers in government and industry. The 1992-93 operational year concluded with 399 high-speed processor projects and 91 parallel projects representing NASA, the Department of Defense, other government agencies, private industry, and universities. This document provides a glimpse at some of the significant scientific results for the year.

  1. 369 TFlop/s molecular dynamics simulations on the Roadrunner general-purpose heterogeneous supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Swaminarayan, Sriram; Germann, Timothy C; Kadau, Kai

    2008-01-01

    The authors present timing and performance numbers for a short-range parallel molecular dynamics (MD) code, SPaSM, that has been rewritten for the heterogeneous Roadrunner supercomputer. Each Roadrunner compute node consists of two AMD Opteron dual-core microprocessors and four PowerXCell 8i enhanced Cell microprocessors, so that there are four MPI ranks per node, each with one Opteron and one Cell. The interatomic forces are computed on the Cells (each with one PPU and eight SPU cores), while the Opterons are used to direct inter-rank communication and perform I/O-heavy periodic analysis, visualization, and checkpointing tasks. The performance measured for our initial implementation of a standard Lennard-Jones pair potential benchmark reached a peak of 369 Tflop/s double-precision floating-point performance on the full Roadrunner system (27.7% of peak), corresponding to 124 MFlop/Watt/s at a price of approximately 3.69 MFlops/dollar. They demonstrate an initial target application, the jetting and ejection of material from a shocked surface.
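
    The figures quoted above imply a few derived quantities, sketched below by reading the reported ratios as MFlop/s per watt and MFlop/s per dollar; this interpretation of the units is an assumption.

    ```python
    # Hedged arithmetic implied by the quoted figures, reading the reported
    # ratios as MFlop/s per watt and MFlop/s per dollar respectively.
    sustained = 369e12          # Flop/s
    peak_fraction = 0.277
    flops_per_watt = 124e6      # Flop/s per W (as reported)
    flops_per_dollar = 3.69e6   # Flop/s per $ (as reported)

    print(f"implied peak:   {sustained / peak_fraction / 1e15:.2f} PFlop/s")
    print(f"implied power:  {sustained / flops_per_watt / 1e6:.2f} MW")
    print(f"implied cost:  ${sustained / flops_per_dollar / 1e6:.0f} million")
    ```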

  2. High Resolution Aerospace Applications using the NASA Columbia Supercomputer

    NASA Technical Reports Server (NTRS)

    Mavriplis, Dimitri J.; Aftosmis, Michael J.; Berger, Marsha

    2005-01-01

    This paper focuses on the parallel performance of two high-performance aerodynamic simulation packages on the newly installed NASA Columbia supercomputer. These packages include both a high-fidelity, unstructured, Reynolds-averaged Navier-Stokes solver, and a fully-automated inviscid flow package for cut-cell Cartesian grids. The complementary combination of these two simulation codes enables high-fidelity characterization of aerospace vehicle design performance over the entire flight envelope through extensive parametric analysis and detailed simulation of critical regions of the flight envelope. Both packages are industrial-level codes designed for complex geometry and incorporate customized multigrid solution algorithms. The performance of these codes on Columbia is examined using both MPI and OpenMP and using both the NUMAlink and InfiniBand interconnect fabrics. Numerical results demonstrate good scalability on up to 2016 CPUs using the NUMAlink4 interconnect, with measured computational rates in the vicinity of 3 TFLOP/s, while InfiniBand showed some performance degradation at high CPU counts, particularly with multigrid. Nonetheless, the results are encouraging enough to indicate that larger test cases using combined MPI/OpenMP communication should scale well on even more processors.

  3. Magnetosphere simulations with a high-performance 3D AMR MHD Code

    NASA Astrophysics Data System (ADS)

    Gombosi, Tamas; Dezeeuw, Darren; Groth, Clinton; Powell, Kenneth; Song, Paul

    1998-11-01

    BATS-R-US is a high-performance 3D AMR MHD code for space physics applications running on massively parallel supercomputers. In BATS-R-US the electromagnetic and fluid equations are solved with a high-resolution upwind numerical scheme in a tightly coupled manner. The code is very robust and is capable of spanning a wide range of plasma parameters (such as β and the acoustic and Alfvénic Mach numbers). Our code is highly scalable: it achieved a sustained performance of 233 GFLOPS on a Cray T3E-1200 supercomputer with 1024 PEs. This talk reports results from the BATS-R-US code for the GGCM (Geospace General Circulation Model) Phase 1 Standard Model Suite. This model suite contains 10 different steady-state configurations: 5 IMF clock angles (north, south, and three equally spaced angles in between) with 2 IMF field strengths for each angle (5 nT and 10 nT). The other parameters are: solar wind speed = 400 km/s; solar wind number density = 5 protons/cc; Hall conductance = 0; Pedersen conductance = 5 S; parallel conductivity = ∞.

  4. Large-Scale NASA Science Applications on the Columbia Supercluster

    NASA Technical Reports Server (NTRS)

    Brooks, Walter

    2005-01-01

    Columbia, NASA's newest 61 teraflops supercomputer that became operational late last year, is a highly integrated Altix cluster of 10,240 processors, and was named to honor the crew of the Space Shuttle lost in early 2003. Constructed in just four months, Columbia increased NASA's computing capability ten-fold, and revitalized the Agency's high-end computing efforts. Significant cutting-edge science and engineering simulations in the areas of space and Earth sciences, as well as aeronautics and space operations, are already occurring on this largest operational Linux supercomputer, demonstrating its capacity and capability to accelerate NASA's space exploration vision. The presentation will describe how an integrated environment consisting not only of next-generation systems, but also modeling and simulation, high-speed networking, parallel performance optimization, and advanced data analysis and visualization, is being used to reduce design cycle time, accelerate scientific discovery, conduct parametric analysis of multiple scenarios, and enhance safety during the life cycle of NASA missions. The talk will conclude by discussing how NAS partnered with various NASA centers, other government agencies, computer industry, and academia, to create a national resource in large-scale modeling and simulation.

  5. A new understanding of inert gas narcosis

    NASA Astrophysics Data System (ADS)

    Meng, Zhang; Yi, Gao; Haiping, Fang

    2016-01-01

    Anesthetics are extremely important in modern surgery to greatly reduce the patient’s pain. The understanding of anesthesia at molecular level is the preliminary step for the application of anesthetics in clinic safely and effectively. Inert gases, with low chemical activity, have been found to cause anesthesia for centuries, but the mechanism is unclear yet. In this review, we first summarize the progress of theories about general anesthesia, especially for inert gas narcosis, and then propose a new hypothesis that the aggregated rather than the dispersed inert gas molecules are the key to trigger the narcosis to explain the steep dose-response relationship of anesthesia. Project supported by the Supercomputing Center of Chinese Academy of Sciences in Beijing, China, the Shanghai Supercomputer Center, China, the National Natural Science Foundation of China (Grant Nos. 21273268, 11290164, and 11175230), the Startup Funding from Shanghai Institute of Applied Physics, Chinese Academy of Sciences (Grant No. Y290011011), “Hundred People Project” from Chinese Academy of Sciences, and “Pu-jiang Rencai Project” from Science and Technology Commission of Shanghai Municipality, China (Grant No. 13PJ1410400).

  6. Congressional Panel Seeks To Curb Access of Foreign Students to U.S. Supercomputers.

    ERIC Educational Resources Information Center

    Kiernan, Vincent

    1999-01-01

    Fearing security problems, a congressional committee on Chinese espionage recommends that foreign students and other foreign nationals be barred from using supercomputers at national laboratories unless they first obtain export licenses from the federal government. University officials dispute the data on which the report is based and find the…

  7. The Age of the Supercomputer Gives Way to the Age of the Super Infrastructure.

    ERIC Educational Resources Information Center

    Young, Jeffrey R.

    1997-01-01

    In October 1997, the National Science Foundation will discontinue financial support for two university-based supercomputer facilities to concentrate resources on partnerships led by facilities at the University of California, San Diego and the University of Illinois, Urbana-Champaign. The reconfigured program will develop more user-friendly and…

  8. Extracting the Textual and Temporal Structure of Supercomputing Logs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jain, S; Singh, I; Chandra, A

    2009-05-26

    Supercomputers are prone to frequent faults that adversely affect their performance, reliability and functionality. System logs collected on these systems are a valuable resource of information about their operational status and health. However, their massive size, complexity, and lack of standard format make it difficult to automatically extract information that can be used to improve system management. In this work we propose a novel method to succinctly represent the contents of supercomputing logs, by using textual clustering to automatically find the syntactic structures of log messages. This information is used to automatically classify messages into semantic groups via an online clustering algorithm. Further, we describe a methodology for using the temporal proximity between groups of log messages to identify correlated events in the system. We apply our proposed methods to two large, publicly available supercomputing logs and show that our technique features nearly perfect accuracy for online log-classification and extracts meaningful structural and temporal message patterns that can be used to improve the accuracy of other log analysis techniques.
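
    As a hedged illustration of the textual-clustering idea described above, the sketch below masks variable tokens (numbers, hexadecimal addresses, node names) so that messages with the same syntactic structure collapse onto one template, then counts messages per template. The message examples and masking rules are hypothetical, and this is not the paper's online clustering algorithm.

    ```python
    # Hedged sketch: collapse log messages onto syntactic templates by masking
    # variable fields, then count messages per template. Example messages and
    # masking rules are invented for illustration.
    import re
    from collections import Counter

    logs = [
        "node n0123 link error on port 4",
        "node n0457 link error on port 7",
        "memory ECC error at address 0x7fa3b2",
        "memory ECC error at address 0x1b99d0",
        "job 88123 started on 512 nodes",
    ]

    def template(msg: str) -> str:
        msg = re.sub(r"0x[0-9a-fA-F]+", "<HEX>", msg)     # hex addresses
        msg = re.sub(r"\bn\d+\b", "<NODE>", msg)          # node names (assumed form)
        msg = re.sub(r"\d+", "<NUM>", msg)                # remaining numbers
        return msg

    groups = Counter(template(m) for m in logs)
    for tpl, count in groups.most_common():
        print(f"{count:3d}  {tpl}")
    ```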

  9. Supercomputations and big-data analysis in strong-field ultrafast optical physics: filamentation of high-peak-power ultrashort laser pulses

    NASA Astrophysics Data System (ADS)

    Voronin, A. A.; Panchenko, V. Ya; Zheltikov, A. M.

    2016-06-01

    High-intensity ultrashort laser pulses propagating in gas media or in condensed matter undergo complex nonlinear spatiotemporal evolution where temporal transformations of optical field waveforms are strongly coupled to an intricate beam dynamics and ultrafast field-induced ionization processes. At the level of laser peak powers orders of magnitude above the critical power of self-focusing, the beam exhibits modulation instabilities, producing random field hot spots and breaking up into multiple noise-seeded filaments. This problem is described by a (3  +  1)-dimensional nonlinear field evolution equation, which needs to be solved jointly with the equation for ultrafast ionization of a medium. Analysis of this problem, which is equivalent to solving a billion-dimensional evolution problem, is only possible by means of supercomputer simulations augmented with coordinated big-data processing of large volumes of information acquired through theory-guiding experiments and supercomputations. Here, we review the main challenges of supercomputations and big-data processing encountered in strong-field ultrafast optical physics and discuss strategies to confront these challenges.

  10. A distributed analysis and visualization system for model and observational data

    NASA Technical Reports Server (NTRS)

    Wilhelmson, Robert B.

    1994-01-01

    Software was developed with NASA support to aid in the analysis and display of the massive amounts of data generated from satellites, observational field programs, and model simulations. This software was developed in the context of the PATHFINDER (Probing ATmospHeric Flows in an Interactive and Distributed EnviRonment) Project. The overall aim of this project is to create a flexible, modular, and distributed environment for data handling, modeling simulations, data analysis, and visualization of atmospheric and fluid flows. Software completed with NASA support includes GEMPAK analysis, data handling, and display modules, for which collaborators at NASA had primary responsibility, and prototype software modules for three-dimensional interactive and distributed control and display as well as data handling, for which NCSA was responsible. Overall process control was handled through a scientific and visualization application builder from Silicon Graphics known as the Iris Explorer. In addition, the GEMPAK-related work (GEMVIS) was also ported to the Advanced Visualization System (AVS) application builder. Many modules were developed to enhance those already available in Iris Explorer, including HDF file support, improved visualization and display, simple lattice math, and the handling of metadata through development of a new grid datatype. Complete source and runtime binaries along with on-line documentation are available via the World Wide Web at: http://redrock.ncsa.uiuc.edu/PATHFINDER/pathre12/top/top.html.

  11. The " Swarm of Ants vs. Herd of Elephants" Debated Revisited: Performance Measurements of PVM-Overflow Across a Wide Spectrum of Architectures

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Jespersen, Dennis; Buning, Peter; Bailey, David (Technical Monitor)

    1996-01-01

    The Gordon Bell Prizes given out at Supercomputing every year include at least two categories: performance (highest GFLOP count) and price-performance (GFLOP/million $$) for real applications. In the past five years, the winners of the price-performance category all came from networks of workstations. This reflects three important facts: 1. supercomputers are still too expensive for the masses; 2. achieving high performance for real applications takes real work; and, most importantly, 3. it is possible to obtain acceptable performance for certain real applications on networks of workstations. With the continued advance of network technology as well as the increased performance of "desktop" workstations, the "Swarm of Ants vs. Herd of Elephants" debate, which began with vector multiprocessors (VPPs) against SIMD-type multiprocessors (e.g. the CM2), is now recast as VPPs against Symmetric Multiprocessors (SMPs, e.g. the SGI Power Challenge). This paper reports on performance studies we performed solving a large-scale (2-million grid point) CFD problem involving a Boeing 747, based on a parallel version of OVERFLOW that utilizes message passing on PVM. A performance monitoring tool developed under NASA HPCC, called AIMS, was used to instrument and analyze the performance data thus obtained. We plan to compare performance data obtained across a wide spectrum of architectures, including the Cray C90, IBM SP2, and SGI Power Challenge cluster, as well as a group of workstations connected over a simple network. The metrics of comparison include speed-up, price-performance, throughput, and turn-around time. We also plan to present a plan of attack for the various issues that will make the execution of Grand Challenge Applications across the Global Information Infrastructure a reality.

  12. A survey of parallel programming tools

    NASA Technical Reports Server (NTRS)

    Cheng, Doreen Y.

    1991-01-01

    This survey examines 39 parallel programming tools. Focus is placed on those tool capabilities needed for parallel scientific programming rather than for general computer science. The tools are classified with the current and future needs of the Numerical Aerodynamic Simulation (NAS) program in mind: existing and anticipated NAS supercomputers and workstations; operating systems; programming languages; and applications. They are divided into four categories: suggested acquisitions; tools already brought in; tools worth tracking; and tools eliminated from further consideration at this time.

  13. Algorithms for the Euler and Navier-Stokes equations for supercomputers

    NASA Technical Reports Server (NTRS)

    Turkel, E.

    1985-01-01

    The steady state Euler and Navier-Stokes equations are considered for both compressible and incompressible flow. Methods are found for accelerating the convergence to a steady state. This acceleration is based on preconditioning the system so that it is no longer time consistent. In order that the acceleration technique be scheme-independent, this preconditioning is done at the differential equation level. Applications are presented for very slow flows and also for the incompressible equations.
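
    For orientation, the acceleration idea can be written schematically as follows (a generic preconditioned form in standard notation, not the specific preconditioner constructed in the paper): the steady state of interest satisfies R(U) = 0, and the time-consistent pseudo-time system is replaced by a preconditioned one that shares the same steady state.

      % Schematic form of convergence acceleration by preconditioning (illustrative only).
      \[
        \frac{\partial U}{\partial t} + R(U) = 0
        \quad\longrightarrow\quad
        P^{-1}\,\frac{\partial U}{\partial t} + R(U) = 0
        \quad\Longleftrightarrow\quad
        \frac{\partial U}{\partial t} = -P\,R(U),
      \]
      \[
        \text{with } R(U^{*}) = 0 \text{ unchanged at the steady state } U^{*},
        \text{ while the transient (and hence time consistency) is altered by } P.
      \]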

  14. Air Force Maui Optical and Supercomputing Site (AMOS) Application Briefs 2004

    DTIC Science & Technology

    2004-01-01

    ... SOR methods, the work in parallel matrix iterative methods is very relevant to the parallelization of a CA. The Moore neighborhood used in this work ...

  15. Chemical calculations on Cray computers

    NASA Technical Reports Server (NTRS)

    Taylor, Peter R.; Bauschlicher, Charles W., Jr.; Schwenke, David W.

    1989-01-01

    The influence of recent developments in supercomputing on computational chemistry is discussed, with particular reference to Cray computers and their pipelined vector/limited parallel architectures. After reviewing Cray hardware and software, the performance of different elementary program structures is examined, and effective methods for improving program performance are outlined. The computational strategies appropriate for obtaining optimum performance in applications to quantum chemistry and dynamics are discussed. Finally, some discussion is given of new developments and future hardware and software improvements.

  16. Challenges and opportunities of cloud computing for atmospheric sciences

    NASA Astrophysics Data System (ADS)

    Pérez Montes, Diego A.; Añel, Juan A.; Pena, Tomás F.; Wallom, David C. H.

    2016-04-01

    Cloud computing is an emerging technological solution widely used in many fields. Initially developed as a flexible way of managing peak demand, it has begun to make its way into scientific research. One of the greatest advantages of cloud computing for scientific research is that a project no longer depends on access to a large local cyberinfrastructure to be funded or performed. Cloud computing can avoid maintenance expenses for large supercomputers and has the potential to 'democratize' access to high-performance computing, giving flexibility to funding bodies for allocating budgets for the computational costs associated with a project. Two of the most challenging problems in atmospheric sciences are computational cost and uncertainty in meteorological forecasting and climate projections. Both problems are closely related: usually uncertainty can be reduced when computational resources are available to better reproduce a phenomenon or to perform a larger number of experiments. Here we present results of the application of cloud computing resources for climate modeling using cloud computing infrastructures of three major vendors and two climate models. We show how the cloud infrastructure compares in performance to traditional supercomputers and how it provides the capability to complete experiments in shorter periods of time. The associated monetary cost is also analyzed. Finally we discuss the future potential of this technology for meteorological and climatological applications, both from the point of view of operational use and research.

  17. Constructing a Foundational Platform Driven by Japan's K Supercomputer for Next-Generation Drug Design.

    PubMed

    Brown, J B; Nakatsui, Masahiko; Okuno, Yasushi

    2014-12-01

    The cost of pharmaceutical R&D has risen enormously, both worldwide and in Japan. However, Japan faces a particularly difficult situation in that its population is aging rapidly, and the cost of pharmaceutical R&D affects not only the industry but the entire medical system as well. To attempt to reduce costs, the newly launched K supercomputer is available for big data drug discovery and structural simulation-based drug discovery. We have implemented both primary (direct) and secondary (infrastructure, data processing) methods for the two types of drug discovery, custom tailored to maximally use the 88,128 compute nodes/CPUs of K, and evaluated the implementations. We present two types of results. In the first, we executed the virtual screening of nearly 19 billion compound-protein interactions, and calculated the accuracy of predictions against publicly available experimental data. In the second investigation, we implemented a very computationally intensive binding free energy algorithm, and found that our computed binding free energies were considerably accurate when validated against another type of publicly available experimental data. The common feature of both result types is the scale at which the computations were executed. The frameworks presented in this article provide perspectives and applications that, while tuned to the computing resources available in Japan, are equally applicable to any equivalent large-scale infrastructure provided elsewhere. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. A report documenting the completion of the Los Alamos National Laboratory portion of the ASC Level II milestone "Visualization on the supercomputing platform"

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ahrens, James P; Patchett, John M; Lo, Li - Ta

    2011-01-24

    This report provides documentation for the completion of the Los Alamos portion of the ASC Level II 'Visualization on the Supercomputing Platform' milestone. This ASC Level II milestone is a joint milestone between Sandia National Laboratory and Los Alamos National Laboratory. The milestone text is shown in Figure 1 with the Los Alamos portions highlighted in boldfaced text. Visualization and analysis of petascale data is limited by several factors which must be addressed as ACES delivers the Cielo platform. Two primary difficulties are: (1) Performance of interactive rendering, which is the most computationally intensive portion of the visualization process. For terascale platforms, commodity clusters with graphics processors (GPUs) have been used for interactive rendering. For petascale platforms, visualization and rendering may be able to run efficiently on the supercomputer platform itself. (2) I/O bandwidth, which limits how much information can be written to disk. If we simply analyze the sparse information that is saved to disk we miss the opportunity to analyze the rich information produced every timestep by the simulation. For the first issue, we are pursuing in-situ analysis, in which simulations are coupled directly with analysis libraries at runtime. This milestone will evaluate the visualization and rendering performance of current and next-generation supercomputers in contrast to GPU-based visualization clusters, and evaluate the performance of common analysis libraries coupled with the simulation that analyze and write data to disk during a running simulation. This milestone will explore, evaluate and advance the maturity level of these technologies and their applicability to problems of interest to the ASC program. In conclusion, we improved CPU-based rendering performance by a factor of 2-10 times in our tests. In addition, we evaluated CPU- and GPU-based rendering performance. We encourage production visualization experts to consider using CPU-based rendering solutions when appropriate. For example, on remote supercomputers CPU-based rendering can offer a means of viewing data without having to offload the data or geometry onto a GPU-based visualization system. In terms of the comparative performance of the CPU and GPU, we believe that further optimizations of both CPU- and GPU-based rendering are possible. The simulation community is currently confronting this reality as they work to port their simulations to different hardware architectures. What is interesting about CPU rendering of massive datasets is that for the past two decades GPU performance has significantly outperformed CPU-based systems. Based on our advancements, evaluations and explorations we believe that CPU-based rendering has returned as one viable option for the visualization of massive datasets.

  19. Fast and Accurate Simulation of the Cray XMT Multithreaded Supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Villa, Oreste; Tumeo, Antonino; Secchi, Simone

    Irregular applications, such as data mining and analysis or graph-based computations, show unpredictable memory/network access patterns and control structures. Highly multithreaded architectures with large processor counts, like the Cray MTA-1, MTA-2 and XMT, appear to address their requirements better than commodity clusters. However, the research on highly multithreaded systems is currently limited by the lack of adequate architectural simulation infrastructures due to issues such as size of the machines, memory footprint, simulation speed, accuracy and customization. At the same time, Shared-memory MultiProcessors (SMPs) with multi-core processors have become an attractive platform to simulate large scale machines. In this paper, we introduce a cycle-level simulator of the highly multithreaded Cray XMT supercomputer. The simulator runs unmodified XMT applications. We discuss how we tackled the challenges posed by its development, detailing the techniques introduced to make the simulation as fast as possible while maintaining a high accuracy. By mapping XMT processors (ThreadStorm with 128 hardware threads) to host computing cores, the simulation speed remains constant as the number of simulated processors increases, up to the number of available host cores. The simulator supports zero-overhead switching among different accuracy levels at run-time and includes a network model that takes into account contention. On a modern 48-core SMP host, our infrastructure simulates a large set of irregular applications 500 to 2000 times slower than real time when compared to a 128-processor XMT, while remaining within 10% of accuracy. Emulation is only 25 to 200 times slower than real time.

  20. The impact of the U.S. supercomputing initiative will be global

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Crawford, Dona

    2016-01-15

    Last July, President Obama issued an executive order that created a coordinated federal strategy for HPC research, development, and deployment, called the U.S. National Strategic Computing Initiative (NSCI). This bold, necessary step toward building the next generation of supercomputers has inaugurated a new era for U.S. high-performance computing (HPC).

  1. Parallel-vector solution of large-scale structural analysis problems on supercomputers

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.

    1989-01-01

    A direct linear equation solution method based on the Choleski factorization procedure is presented which exploits both parallel and vector features of supercomputers. The new equation solver is described, and its performance is evaluated by solving structural analysis problems on three high-performance computers. The method has been implemented using Force, a generic parallel FORTRAN language.
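
    As a rough illustration of the factorization underlying the solver (a minimal serial sketch in Python, not the Force-based parallel/vector FORTRAN implementation described in the report), the column-by-column update below is the step that lends itself to vectorization and parallelism:

      # Minimal serial Cholesky sketch (illustrative only; the report's solver adds
      # parallel and vector constructs through the Force FORTRAN extensions).
      import numpy as np

      def cholesky_lower(a):
          """Return lower-triangular L with A = L @ L.T for symmetric positive-definite A."""
          a = np.asarray(a, dtype=float)
          n = a.shape[0]
          L = np.zeros_like(a)
          for j in range(n):                      # factor one column at a time
              s = a[j, j] - np.dot(L[j, :j], L[j, :j])
              L[j, j] = np.sqrt(s)
              # Column update below the diagonal: the vectorizable/parallelizable step.
              L[j+1:, j] = (a[j+1:, j] - L[j+1:, :j] @ L[j, :j]) / L[j, j]
          return L

      A = np.array([[4.0, 2.0, 2.0],
                    [2.0, 5.0, 3.0],
                    [2.0, 3.0, 6.0]])
      L = cholesky_lower(A)
      print(np.allclose(L @ L.T, A))              # True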

  2. Predicting Hurricanes with Supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    2010-01-01

    Hurricane Emily, which formed in the Atlantic Ocean on July 10, 2005, was the strongest hurricane ever to form before August. By checking computer models against the actual path of the storm, researchers can improve hurricane prediction. In 2010, NOAA researchers were awarded 25 million processor-hours on Argonne's Blue Gene/P supercomputer for the project. Read more at http://go.usa.gov/OLh

  3. Supercomputers Of The Future

    NASA Technical Reports Server (NTRS)

    Peterson, Victor L.; Kim, John; Holst, Terry L.; Deiwert, George S.; Cooper, David M.; Watson, Andrew B.; Bailey, F. Ron

    1992-01-01

    Report evaluates supercomputer needs of five key disciplines: turbulence physics, aerodynamics, aerothermodynamics, chemistry, and mathematical modeling of human vision. Predicts these fields will require computer speed greater than 10^18 floating-point operations per second (FLOPS) and memory capacity greater than 10^15 words. Also, new parallel computer architectures and new structured numerical methods will make the necessary speed and capacity available.

  4. Supercomputing Sheds Light on the Dark Universe

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Habib, Salman; Heitmann, Katrin

    2012-11-15

    At Argonne National Laboratory, scientists are using supercomputers to shed light on one of the great mysteries in science today, the Dark Universe. With Mira, a petascale supercomputer at the Argonne Leadership Computing Facility, a team led by physicists Salman Habib and Katrin Heitmann will run the largest, most complex simulation of the universe ever attempted. By contrasting the results from Mira with state-of-the-art telescope surveys, the scientists hope to gain new insights into the distribution of matter in the universe, advancing future investigations of dark energy and dark matter into a new realm. The team's research was named a finalist for the 2012 Gordon Bell Prize, an award recognizing outstanding achievement in high-performance computing.

  5. Surprise

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Curran, L.

    1988-03-03

    Interest has been building in recent months over the imminent arrival of a new class of supercomputer, called the "supercomputer on a desk" or the single-user model. Most observers expected the first such product to come from either of two startups, Ardent Computer Corp. or Stellar Computer Inc. But a surprise entry has shown up. Apollo Computer Inc. is launching a new workstation this week that racks up an impressive list of industry firsts as it puts supercomputer power at the disposal of a single user. The new Series 10000 from the Chelmsford, Mass., company is built around a reduced-instruction-set architecture that the company calls Prism, for parallel reduced-instruction-set multiprocessor. This article describes the 10000 and Prism.

  6. The Pawsey Supercomputer geothermal cooling project

    NASA Astrophysics Data System (ADS)

    Regenauer-Lieb, K.; Horowitz, F.; Western Australian Geothermal Centre Of Excellence, T.

    2010-12-01

    The Australian Government has funded the Pawsey supercomputer in Perth, Western Australia, providing computational infrastructure intended to support the future operations of the Australian Square Kilometre Array radiotelescope and to boost next-generation computational geosciences in Australia. Supplementary funds have been directed to the development of a geothermal exploration well to research the potential for direct heat use applications at the Pawsey Centre site. Cooling the Pawsey supercomputer may be achieved by geothermal heat exchange rather than by conventional electrical power cooling, thus reducing the carbon footprint of the Pawsey Centre and demonstrating an innovative green technology that is widely applicable in industry and urban centres across the world. The exploration well is scheduled to be completed in 2013, with drilling due to commence in the third quarter of 2011. One year is allocated to finalizing the design of the exploration, monitoring and research well. Success in the geothermal exploration and research program will result in an industrial-scale geothermal cooling facility at the Pawsey Centre, and will provide a world-class student training environment in geothermal energy systems. A similar system is partially funded and in advanced planning to provide base-load air-conditioning for the main campus of the University of Western Australia. Both systems are expected to draw ~80-95 degrees C water from aquifers lying between 2000 and 3000 meters depth from naturally permeable rocks of the Perth sedimentary basin. The geothermal water will be run through absorption chilling devices, which only require heat (as opposed to mechanical work) to power a chilled water stream adequate to meet the cooling requirements. Once the heat has been removed from the geothermal water, licensing issues require the water to be re-injected back into the aquifer system. These systems are intended to demonstrate the feasibility of powering large-scale air-conditioning systems from the direct use of geothermal power from Hot Sedimentary Aquifer (HSA) systems. HSA systems underlie many of the world's population centers, and thus have the potential to offset a significant fraction of the world's consumption of electrical power for air-conditioning.

  7. Analysis Of 2H-Evaporator Scale Wall [HTF-13-82] And Pot Bottom [HTF-13-77] Samples

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oji, L. N.

    2013-09-11

    Savannah River Remediation (SRR) is planning to remove a buildup of sodium aluminosilicate scale from the 2H-evaporator pot by loading and soaking the pot with heated 1.5 M nitric acid solution. Sampling and analysis of the scale material has been performed so that uranium and plutonium isotopic analysis can be input into a Nuclear Criticality Safety Assessment (NCSA) for scale removal by chemical cleaning. Historically, since the operation of the Defense Waste Processing Facility (DWPF), silicon in the DWPF recycle stream combines with aluminum in the typical tank farm supernate to form sodium aluminosilicate scale mineral deposits in the 2H-evaporator pot and gravity drain line. The 2H-evaporator scale samples analyzed by Savannah River National Laboratory (SRNL) came from two different locations within the evaporator pot: the bottom cone section of the 2H-evaporator pot [sample HTF-13-77] and the 2H-evaporator wall [sample HTF-13-82]. X-ray diffraction analysis (XRD) confirmed that both the 2H-evaporator pot scale and wall samples consist of nitrated cancrinite (a crystalline sodium aluminosilicate solid) and clarkeite (a uranium oxyhydroxide mineral). On an "as received" basis, the bottom pot section scale sample contained an average of 2.59E+00 ± 1.40E-01 wt % total uranium with a U-235 enrichment of 6.12E-01 ± 1.48E-02 %, while the wall sample contained an average of 4.03E+00 ± 9.79E-01 wt % total uranium with a U-235 enrichment of 6.03E-01 ± 1.66E-02 %. The bottom pot section scale sample analysis results for Pu-238, Pu-239, and Pu-241 are 3.16E-05 ± 5.40E-06 wt %, 3.28E-04 ± 1.45E-05 wt %, and <8.80E-07 wt %, respectively. The evaporator wall scale sample analysis values for Pu-238, Pu-239, and Pu-241 average 3.74E-05 ± 6.01E-06 wt %, 4.38E-04 ± 5.08E-05 wt %, and <1.38E-06 wt %, respectively. The Pu-241 analysis results, as presented, are upper-limit values. For these two evaporator scale samples, obtained at two different locations within the evaporator pot, the major radioactive components (on a mass basis) in the additional radionuclide analyses were Sr-90, Cs-137, Np-237, Pu-239/240 and Th-232. Small quantities of americium and curium were detected in the blanks used for the Am/Cm method for these radionuclides. These trace radionuclide amounts are assumed to come from airborne contamination in the shielded cells drying or digestion oven, which has been replaced. Therefore, the Am/Cm results, as presented, may be higher than the true Am/Cm values for these samples. These results are provided so that SRR can calculate the equivalent uranium-235 concentrations for the NCSA. Results confirm that the uranium contained in the scale remains depleted with respect to natural uranium. SRNL did not calculate an equivalent U-235 enrichment, which takes into account other fissionable isotopes U-233, Pu-239 and Pu-241. The applicable method for calculation of equivalent U-235 will be determined in the NCSA. With a few exceptions, a comparison of select radionuclide measurements from this 2013 2H-evaporator scale characterization (pot bottom and wall scale samples) with measurements for the same radionuclides in the 2010 2H-evaporator scale analysis shows that the results for both years are fairly comparable; the analysis results are about the same order of magnitude.

  8. Requirements for migration of NSSD code systems from LTSS to NLTSS

    NASA Technical Reports Server (NTRS)

    Pratt, M.

    1984-01-01

    The purpose of this document is to address the requirements necessary for a successful conversion of the Nuclear Design (ND) application code systems to the NLTSS environment. The ND application code system community can be characterized as large-scale scientific computation carried out on supercomputers. NLTSS is a distributed operating system being developed at LLNL to replace the LTSS system currently in use. The implications of the change are examined, including a description of the computational environment and users in ND. The discussion then turns to requirements, first in a general way, followed by specific requirements, including a proposal for managing the transition.

  9. Clock Agreement Among Parallel Supercomputer Nodes

    DOE Data Explorer

    Jones, Terry R.; Koenig, Gregory A.

    2014-04-30

    This dataset presents measurements that quantify the clock synchronization time-agreement characteristics among several high performance computers, including the current world's most powerful machine for open science, the U.S. Department of Energy's Titan machine sited at Oak Ridge National Laboratory. These ultra-fast machines derive much of their computational capability from extreme node counts (over 18,000 nodes in the case of the Titan machine). Time-agreement is commonly utilized by parallel programming applications and tools, distributed programming applications and tools, and system software. Our time-agreement measurements detail the degree of time variance between nodes and how that variance changes over time. The dataset includes empirical measurements and the accompanying spreadsheets.
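
    The kind of measurement involved can be sketched under the assumption that mpi4py and an MPI launcher are available; this toy version simply reports the apparent spread of per-rank clock samples taken just after a barrier, and is not the methodology used to produce the dataset:

      # Toy sketch of quantifying clock disagreement across MPI ranks (not the
      # dataset's actual measurement procedure). Run with, e.g., mpiexec -n 8.
      from mpi4py import MPI

      comm = MPI.COMM_WORLD
      comm.Barrier()                      # loosely align ranks before sampling
      t_local = MPI.Wtime()               # each rank samples its own clock

      samples = comm.gather(t_local, root=0)
      if comm.Get_rank() == 0:
          spread = max(samples) - min(samples)
          print(f"apparent clock spread across {comm.Get_size()} ranks: {spread:.3e} s")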

  10. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lusk, Ewing; Butler, Ralph; Pieper, Steven C.

    Here, we take a historical approach to our presentation of self-scheduled task parallelism, a programming model with its origins in early irregular and nondeterministic computations encountered in automated theorem proving and logic programming. We show how an extremely simple task model has evolved into a system, asynchronous dynamic load balancing (ADLB), and a scalable implementation capable of supporting sophisticated applications on today’s (and tomorrow’s) largest supercomputers; and we illustrate the use of ADLB with a Green’s function Monte Carlo application, a modern, mature nuclear physics code in production use. Our lesson is that by surrendering a certain amount of generality and thus applicability, a minimal programming model (in terms of its basic concepts and the size of its application programmer interface) can achieve extreme scalability without introducing complexity.
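
    The self-scheduling idea, in which idle workers repeatedly pull whatever work is available rather than being assigned a fixed share, can be illustrated with a toy Python sketch; this is only a schematic of the programming model, not the ADLB library or its MPI-based interface:

      # Toy illustration of self-scheduled task parallelism: workers pull tasks from
      # a shared queue until a sentinel tells them the work pool is exhausted.
      # This sketches the model only, not the ADLB API.
      import multiprocessing as mp

      def worker(tasks, results):
          while True:
              item = tasks.get()
              if item is None:                    # sentinel: no more work
                  break
              results.put((item, item * item))    # stand-in for real computation

      if __name__ == "__main__":
          tasks, results = mp.Queue(), mp.Queue()
          workers = [mp.Process(target=worker, args=(tasks, results)) for _ in range(4)]
          for p in workers:
              p.start()
          for i in range(100):                    # enqueue the work units
              tasks.put(i)
          for _ in workers:                       # one sentinel per worker
              tasks.put(None)
          done = [results.get() for _ in range(100)]
          for p in workers:
              p.join()
          print(len(done), "results collected")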

  11. Semiconductor lasers for versatile applications from global communications to on-chip interconnects

    NASA Astrophysics Data System (ADS)

    Arai, Shigehisa

    2015-01-01

    Since semiconductor lasers were realized in 1962, various efforts have been made to enrich human life through novel equipment and services. Among them, optical fiber communication has brought about the marvelous information technology age represented by the Internet. In this paper, emerging work on GaInAsP/InP-based long-wavelength lasers, aimed at ultra-low-power-consumption semiconductor lasers for optical interconnects in supercomputers as well as in future LSIs, is presented.

  12. A Spectral Algorithm for Solving the Relativistic Vlasov-Maxwell Equations

    NASA Technical Reports Server (NTRS)

    Shebalin, John V.

    2001-01-01

    A spectral method algorithm is developed for the numerical solution of the full six-dimensional Vlasov-Maxwell system of equations. Here, the focus is on the electron distribution function, with positive ions providing a constant background. The algorithm consists of a Jacobi polynomial-spherical harmonic formulation in velocity space and a trigonometric formulation in position space. A transform procedure is used to evaluate nonlinear terms. The algorithm is suitable for performing moderate resolution simulations on currently available supercomputers for both scientific and engineering applications.
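
    For reference, the system being solved has the schematic form below (standard relativistic Vlasov-Maxwell notation for the electron distribution f(x, p, t); this is not transcribed from the paper itself):

      % Relativistic Vlasov equation for electrons, coupled to Maxwell's equations
      % (schematic, standard form; ions enter only as a fixed neutralizing background).
      \[
        \frac{\partial f}{\partial t}
        + \mathbf{v}\cdot\nabla_{\mathbf{x}} f
        - e\,(\mathbf{E} + \mathbf{v}\times\mathbf{B})\cdot\nabla_{\mathbf{p}} f = 0,
        \qquad
        \mathbf{v} = \frac{\mathbf{p}}{\gamma m_e},
        \quad
        \gamma = \sqrt{1 + \frac{|\mathbf{p}|^{2}}{m_e^{2}c^{2}}},
      \]
      \[
        \nabla\times\mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t},
        \qquad
        \nabla\times\mathbf{B} = \mu_0\,\mathbf{J} + \frac{1}{c^{2}}\frac{\partial \mathbf{E}}{\partial t},
        \qquad
        \mathbf{J} = -e\int \mathbf{v}\, f \, d^{3}p .
      \]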

  13. Parallel-vector out-of-core equation solver for computational mechanics

    NASA Technical Reports Server (NTRS)

    Qin, J.; Agarwal, T. K.; Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.

    1993-01-01

    A parallel/vector out-of-core equation solver is developed for shared-memory computers, such as the Cray Y-MP machine. The input/output (I/O) time is reduced by using the asynchronous BUFFER IN and BUFFER OUT statements, which can be executed simultaneously with the CPU instructions. The parallel and vector capability provided by the supercomputers is also exploited to enhance the performance. Numerical applications in large-scale structural analysis are given to demonstrate the efficiency of the present out-of-core solver.
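
    The overlap that asynchronous BUFFER IN and BUFFER OUT provide can be sketched generically as a double-buffered loop, here in Python with a background reader thread rather than the Cray Fortran constructs; the file format and block size are assumptions made for illustration:

      # Double-buffered out-of-core sketch: the next block is read in the background
      # while the current block is processed, mimicking asynchronous BUFFER IN/OUT.
      # Illustrative only; assumes a flat binary file of float64 values.
      import numpy as np
      from concurrent.futures import ThreadPoolExecutor

      BLOCK = 1_000_000                                # assumed values per block

      def read_block(f):
          return np.fromfile(f, dtype=np.float64, count=BLOCK)

      def out_of_core_sum(path):
          total = 0.0
          with open(path, "rb") as f, ThreadPoolExecutor(max_workers=1) as pool:
              pending = pool.submit(read_block, f)     # prefetch the first block
              while True:
                  block = pending.result()
                  if block.size == 0:                  # end of file reached
                      break
                  pending = pool.submit(read_block, f) # overlap next read with compute
                  total += block.sum()                 # stand-in for the solver's work
          return total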

  14. Development and Applications of a Modular Parallel Process for Large Scale Fluid/Structures Problems

    NASA Technical Reports Server (NTRS)

    Guruswamy, Guru P.; Byun, Chansup; Kwak, Dochan (Technical Monitor)

    2001-01-01

    A modular process that can efficiently solve large-scale multidisciplinary problems using massively parallel supercomputers is presented. The process integrates disciplines with diverse physical characteristics while retaining the efficiency of individual disciplines. Computational domain independence of individual disciplines is maintained using a meta-programming approach. The process integrates disciplines without affecting the combined performance. Results are demonstrated for large-scale aerospace problems on several supercomputers, and the scalability and portability of the approach are demonstrated on several parallel computers.

  15. An OpenACC-Based Unified Programming Model for Multi-accelerator Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, Jungwon; Lee, Seyong; Vetter, Jeffrey S

    2015-01-01

    This paper proposes a novel SPMD programming model of OpenACC. Our model integrates the different granularities of parallelism from vector-level parallelism to node-level parallelism into a single, unified model based on OpenACC. It allows programmers to write programs for multiple accelerators using a uniform programming model whether they are in shared or distributed memory systems. We implement a prototype of our model and evaluate its performance with a GPU-based supercomputer using three benchmark applications.

  16. Floating-point performance of ARM cores and their efficiency in classical molecular dynamics

    NASA Astrophysics Data System (ADS)

    Nikolskiy, V.; Stegailov, V.

    2016-02-01

    Supercomputing in the exascale era is going to be inevitably limited by power efficiency. Nowadays, different possible variants of CPU architectures are being considered. Recently the development of ARM processors has come to the point where their floating-point performance can be seriously considered for a range of scientific applications. In this work we present an analysis of the floating-point performance of the latest ARM cores and their efficiency for the algorithms of classical molecular dynamics.

  17. Assessing the Need for Supercomputing Resources Within the Pacific Area of Responsibility

    DTIC Science & Technology

    2015-05-26

    ... portion of today’s research and development dollars are going toward developing machines that will be better suited for addressing big data applications ...

  18. Adapting the serial Alpgen parton-interaction generator to simulate LHC collisions on millions of parallel threads

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Childers, J. T.; Uram, T. D.; LeCompte, T. J.

    As the LHC moves to higher energies and luminosity, the demand for computing resources increases accordingly and will soon outpace the growth of the Worldwide LHC Computing Grid. To meet this greater demand, event generation Monte Carlo was targeted for adaptation to run on Mira, the supercomputer at the Argonne Leadership Computing Facility. Alpgen is a Monte Carlo event generation application that is used by LHC experiments in the simulation of collisions that take place in the Large Hadron Collider. This paper details the process by which Alpgen was adapted from a single-processor serial application to a large-scale parallel application and the performance that was achieved.

  19. Adapting the serial Alpgen parton-interaction generator to simulate LHC collisions on millions of parallel threads

    DOE PAGES

    Childers, J. T.; Uram, T. D.; LeCompte, T. J.; ...

    2016-09-29

    As the LHC moves to higher energies and luminosity, the demand for computing resources increases accordingly and will soon outpace the growth of the Worldwide LHC Computing Grid. To meet this greater demand, event generation Monte Carlo was targeted for adaptation to run on Mira, the supercomputer at the Argonne Leadership Computing Facility. Alpgen is a Monte Carlo event generation application that is used by LHC experiments in the simulation of collisions that take place in the Large Hadron Collider. This paper details the process by which Alpgen was adapted from a single-processor serial application to a large-scale parallel application and the performance that was achieved.

  20. Adapting the serial Alpgen parton-interaction generator to simulate LHC collisions on millions of parallel threads

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Childers, J. T.; Uram, T. D.; LeCompte, T. J.

    As the LHC moves to higher energies and luminosity, the demand for computing resources increases accordingly and will soon outpace the growth of the Worldwide LHC Computing Grid. To meet this greater demand, event generation Monte Carlo was targeted for adaptation to run on Mira, the supercomputer at the Argonne Leadership Computing Facility. Alpgen is a Monte Carlo event generation application that is used by LHC experiments in the simulation of collisions that take place in the Large Hadron Collider. This paper details the process by which Alpgen was adapted from a single-processor serial application to a large-scale parallel application and the performance that was achieved.

  1. Ice Storm Supercomputer

    ScienceCinema

    None

    2018-05-01

    A new Idaho National Laboratory supercomputer is helping scientists create more realistic simulations of nuclear fuel. Dubbed "Ice Storm," this 2048-processor machine allows researchers to model and predict the complex physics behind nuclear reactor behavior. And with a new visualization lab, the team can see the results of its simulations on the big screen. For more information about INL research, visit http://www.facebook.com/idahonationallaboratory.

  2. Open Skies Project Computational Fluid Dynamic Analysis

    DTIC Science & Technology

    1994-03-01

    ... VSAERO Results on the Alternate Fairing ... Centerline Cp Comparisons ... VSAERO Wing Effects Study ... The assistance Mrs. Mary Ann Mages, at the Kirtland Supercomputer Center (PL/SCPR), gave by setting a precedent for supercomputer account ...

  3. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack

    20th Edition of TOP500 List of World's Fastest Supercomputers Released. MANNHEIM, Germany; KNOXVILLE, Tenn.; and BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 20th edition of the TOP500 list of the world's fastest supercomputers was released today (November 15, 2002). The Earth Simulator supercomputer, installed earlier this year at the Earth Simulator Center in Yokohama, Japan, retains the number one position with its Linpack benchmark performance of 35.86 Tflop/s (trillions of calculations per second). The No. 2 and No. 3 positions are held by two new, identical ASCI Q systems at Los Alamos National Laboratory (7.73 Tflop/s each). These systems are built by Hewlett-Packard and based on the AlphaServer SC computer system.

  4. STAMPS: Software Tool for Automated MRI Post-processing on a supercomputer.

    PubMed

    Bigler, Don C; Aksu, Yaman; Miller, David J; Yang, Qing X

    2009-08-01

    This paper describes a Software Tool for Automated MRI Post-processing (STAMP) of multiple types of brain MRIs on a workstation and for parallel processing on a supercomputer (STAMPS). This software tool enables the automation of nonlinear registration for a large image set and for multiple MR image types. The tool uses standard brain MRI post-processing tools (such as SPM, FSL, and HAMMER) for multiple MR image types in a pipeline fashion. It also contains novel MRI post-processing features. The STAMP image outputs can be used to perform brain analysis using Statistical Parametric Mapping (SPM) or single-/multi-image modality brain analysis using Support Vector Machines (SVMs). Since STAMPS is PBS-based, the supercomputer may be a multi-node computer cluster or one of the latest multi-core computers.

  5. Japanese project aims at supercomputer that executes 10 gflops

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Burskey, D.

    1984-05-03

    Dubbed Supercom by its multicompany design team, the decade-long project's goal is an engineering supercomputer that can execute 10 billion floating-point operations/s, about 20 times faster than today's supercomputers. The project, guided by Japan's Ministry of International Trade and Industry (MITI) and the Agency of Industrial Science and Technology, encompasses three parallel research programs, all aimed at some aspect of the supercomputer. One program should lead to superfast logic and memory circuits, another to a system architecture that will afford the best performance, and the last to the software that will ultimately control the computer. The work on logic and memory chips is based on GaAs circuits, Josephson junction devices, and high-electron-mobility transistor structures. The architecture will involve parallel processing.

  6. Status and future perspective of applications of high temperature superconductors

    NASA Astrophysics Data System (ADS)

    Tanaka, Shoji

    The materials research on high temperature superconductivity over the past ten years has given us sufficient information on the new phenomena of these new materials. New applications across a very wide range of industries are increasing rapidly. In this report three main topics of the applications are given: [a] progress of superconducting bulk materials and their applications to the flywheel electricity storage system and others, [b] progress in the development of superconducting tapes and their applications to power cables and to high-field superconducting magnets for SMES and for the pulling system for large silicon single crystals, and [c] development of new superconducting electronic devices (SFQ) and the possibility of their application to next-generation supercomputers. These examples show the great capability of superconductivity technology, and it is expected that a real superconductivity industry will take off around the year 2005.

  7. The World Wide Web and Technology Transfer at NASA Langley Research Center

    NASA Technical Reports Server (NTRS)

    Nelson, Michael L.; Bianco, David J.

    1994-01-01

    NASA Langley Research Center (LaRC) began using the World Wide Web (WWW) in the summer of 1993, becoming the first NASA installation to provide a Center-wide home page. This coincided with a reorganization of LaRC to provide a more concentrated focus on technology transfer to both aerospace and non-aerospace industry. Use of the WWW and NCSA Mosaic not only provides automated information dissemination, but also allows for the implementation, evolution and integration of many technology transfer applications. This paper describes several of these innovative applications, including the on-line presentation of the entire Technology Opportunities Showcase (TOPS), an industrial partnering showcase that exists on the Web long after the actual 3-day event ended. During its first year on the Web, LaRC also developed several WWW-based information repositories. The Langley Technical Report Server (LTRS), a technical paper delivery system with integrated searching and retrieval, has proved to be quite popular. The NASA Technical Report Server (NTRS), an outgrowth of LTRS, provides uniform access to many logically similar, yet physically distributed NASA report servers. WWW is also the foundation of the Langley Software Server (LSS), an experimental software distribution system which will distribute LaRC-developed software with the possible phase-out of NASA's COSMIC program. In addition to the more formal technology distribution projects, WWW has been successful in connecting people with technologies and people with other people. With the completion of the LaRC reorganization, the Technology Applications Group, charged with interfacing with non-aerospace companies, opened for business with a popular home page.

  8. SCORPIO: A Scalable Two-Phase Parallel I/O Library With Application To A Large Scale Subsurface Simulator

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sreepathi, Sarat; Sripathi, Vamsi; Mills, Richard T

    2013-01-01

    Inefficient parallel I/O is known to be a major bottleneck among scientific applications employed on supercomputers as the number of processor cores grows into the thousands. Our prior experience indicated that parallel I/O libraries such as HDF5 that rely on MPI-IO do not scale well beyond 10K processor cores, especially on parallel file systems (like Lustre) with a single point of resource contention. Our previous optimization efforts for a massively parallel multi-phase and multi-component subsurface simulator (PFLOTRAN) led to a two-phase I/O approach at the application level, where a set of designated processes participate in the I/O process by splitting the I/O operation into a communication phase and a disk I/O phase. The designated I/O processes are created by splitting the MPI global communicator into multiple sub-communicators. The root process in each sub-communicator is responsible for performing the I/O operations for the entire group and then distributing the data to the rest of the group. This approach resulted in over 25X speedup in HDF I/O read performance and 3X speedup in write performance for PFLOTRAN at over 100K processor cores on the ORNL Jaguar supercomputer. This research describes the design and development of a general purpose parallel I/O library, SCORPIO (SCalable block-ORiented Parallel I/O), that incorporates our optimized two-phase I/O approach. The library provides a simplified higher-level abstraction to the user, sitting atop existing parallel I/O libraries (such as HDF5), and implements optimized I/O access patterns that can scale on a larger number of processors. Performance results with standard benchmark problems and PFLOTRAN indicate that our library is able to maintain the same speedups as before with the added flexibility of being applicable to a wider range of I/O intensive applications.
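
    The two-phase pattern described above (gather within a sub-communicator, then let only each group's root touch the file system) can be sketched with mpi4py roughly as follows; the group size and file naming are assumptions for illustration, and this is not the SCORPIO interface itself:

      # Sketch of two-phase I/O: ranks form groups, data are gathered onto each
      # group's root, and only those roots perform disk writes. Not the SCORPIO API.
      import numpy as np
      from mpi4py import MPI

      comm = MPI.COMM_WORLD
      rank = comm.Get_rank()

      GROUP_SIZE = 4                                   # assumed ranks per aggregator
      color = rank // GROUP_SIZE
      subcomm = comm.Split(color=color, key=rank)      # build the I/O groups

      local = np.full(8, rank, dtype=np.float64)       # each rank's chunk of output
      gathered = subcomm.gather(local, root=0)         # phase 1: communication

      if subcomm.Get_rank() == 0:                      # phase 2: root-only disk I/O
          np.concatenate(gathered).tofile(f"out_group{color}.bin")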

  9. An interactive environment for the analysis of large Earth observation and model data sets

    NASA Technical Reports Server (NTRS)

    Bowman, Kenneth P.; Walsh, John E.; Wilhelmson, Robert B.

    1993-01-01

    We propose to develop an interactive environment for the analysis of large Earth science observation and model data sets. We will use a standard scientific data storage format and a large capacity (greater than 20 GB) optical disk system for data management; develop libraries for coordinate transformation and regridding of data sets; modify the NCSA X Image and X DataSlice software for typical Earth observation data sets by including map transformations and missing data handling; develop analysis tools for common mathematical and statistical operations; integrate the components described above into a system for the analysis and comparison of observations and model results; and distribute software and documentation to the scientific community.

  10. An interactive environment for the analysis of large Earth observation and model data sets

    NASA Technical Reports Server (NTRS)

    Bowman, Kenneth P.; Walsh, John E.; Wilhelmson, Robert B.

    1992-01-01

    We propose to develop an interactive environment for the analysis of large Earth science observation and model data sets. We will use a standard scientific data storage format and a large capacity (greater than 20 GB) optical disk system for data management; develop libraries for coordinate transformation and regridding of data sets; modify the NCSA X Image and X Data Slice software for typical Earth observation data sets by including map transformations and missing data handling; develop analysis tools for common mathematical and statistical operations; integrate the components described above into a system for the analysis and comparison of observations and model results; and distribute software and documentation to the scientific community.

  11. Japanese supercomputer technology.

    PubMed

    Buzbee, B L; Ewald, R H; Worlton, W J

    1982-12-17

    Under the auspices of the Ministry for International Trade and Industry the Japanese have launched a National Superspeed Computer Project intended to produce high-performance computers for scientific computation and a Fifth-Generation Computer Project intended to incorporate and exploit concepts of artificial intelligence. If these projects are successful, which appears likely, advanced economic and military research in the United States may become dependent on access to supercomputers of foreign manufacture.

  12. Supercomputer Simulations Help Develop New Approach to Fight Antibiotic Resistance

    ScienceCinema

    Zgurskaya, Helen; Smith, Jeremy

    2018-06-13

    ORNL leveraged powerful supercomputing to support research led by University of Oklahoma scientists to identify chemicals that seek out and disrupt bacterial proteins called efflux pumps, known to be a major cause of antibiotic resistance. By running simulations on Titan, the team selected molecules most likely to target and potentially disable the assembly of efflux pumps found in E. coli bacteria cells.

  13. Enhancing Environmental HPC Applications: The EnCompAS approach

    NASA Astrophysics Data System (ADS)

    Frank, Anton; Donners, John; Pursula, Antti; Seinstra, Frank; Kranzlmüller, Dieter

    2015-04-01

    Many HPC applications in geoscience are of very high scientific quality and highly optimized for supercomputers. However, some of these codes lack uptake by adjacent scientific communities or industry due to deficiencies in usability, quality, and availability. Since enhancing software by, e.g., adding a graphical user interface, respecting data standards, setting up a support structure, or writing extensive documentation is not of direct and immediate scientific relevance, most developers are not willing to invest additional effort in these issues. Furthermore, if scientists who are not directly involved in the development of some scientific software could benefit from additional features or interfaces, respective requests are often turned down due to the lack of time and resources. On the other hand, such enhancements are crucial for the sustainability of the scientific assets as well as the widespread or even worldwide distribution of European environmental software. Closely collaborating with environmental scientists, the national supercomputing and eScience centres in Helsinki, Amsterdam, and Munich have identified that an enhancement of HPC and data analysis software must be provided as a service to the scientists developing such software. Therefore, first steps have been taken to establish respective services at these centres. In this talk we will present the existing and envisioned service portfolio, some first success stories, and our approach to harmonizing the current status, aiming to turn this local effort into a pan-European service offering for environmental science.

  14. Simulating functional magnetic materials on supercomputers.

    PubMed

    Gruner, Markus Ernst; Entel, Peter

    2009-07-22

    The recent passing of the petaflop per second landmark by the Roadrunner project at the Los Alamos National Laboratory marks a preliminary peak of an impressive world-wide development in the high-performance scientific computing sector. Also, purely academic state-of-the-art supercomputers such as the IBM Blue Gene/P at Forschungszentrum Jülich allow us nowadays to investigate large systems of the order of 10^3 spin-polarized transition metal atoms by means of density functional theory. Three applications will be presented where large-scale ab initio calculations contribute to the understanding of key properties emerging from a close interrelation between structure and magnetism. The first two examples discuss the size-dependent evolution of equilibrium structural motifs in elementary iron and binary Fe-Pt and Co-Pt transition metal nanoparticles, which are currently discussed as promising candidates for ultra-high-density magnetic data storage media. However, the preference for multiply twinned morphologies at smaller cluster sizes counteracts the formation of a single-crystalline L1_0 phase, which alone provides the required hard magnetic properties. The third application is concerned with the magnetic shape memory effect in the Ni-Mn-Ga Heusler alloy, which is a technologically relevant candidate for magnetomechanical actuators and sensors. In this material strains of up to 10% can be induced by external magnetic fields due to the field-induced shifting of martensitic twin boundaries, requiring an extremely high mobility of the martensitic twin boundaries, but also the selection of the appropriate martensitic structure from the rich phase diagram.

  15. Aviation Research and the Internet

    NASA Technical Reports Server (NTRS)

    Scott, Antoinette M.

    1995-01-01

    The Internet is a network of networks. It was originally funded by the Defense Advanced Research Projects Agency or DOD/DARPA and evolved in part from the connection of supercomputer sites across the United States. The National Science Foundation (NSF) made the most of their supercomputers by connecting the sites to each other. This made the supercomputers more efficient and now allows scientists, engineers and researchers to access the supercomputers from their own labs and offices. The high speed networks that connect the NSF supercomputers form the backbone of the Internet. The World Wide Web (WWW) is a menu system. It gathers Internet resources from all over the world into a series of screens that appear on your computer. The WWW is also a distributed system: it stores information on many computers (servers), and these servers can go out and get data when you ask for it. Hypermedia is the base of the WWW. One can 'click' on a section and visit other hypermedia (pages). Our approach to demonstrating the importance of aviation research through the Internet began with learning how to put pages on the Internet (on-line) ourselves. We were assigned two aviation companies: Vision Micro Systems Inc. and Innovative Aerodynamic Technologies (IAT). We developed home pages for these SBIR companies. The pages were created on UNIX and Macintosh machines, using HTML Supertext software to write the pages and the Sharp JX600S scanner to scan the images. As a result, with the use of the UNIX, Macintosh, Sun, PC, and AXIL machines, we were able to present our home pages to over 800,000 visitors.

  16. Enabling Diverse Software Stacks on Supercomputers using High Performance Virtual Clusters.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Younge, Andrew J.; Pedretti, Kevin; Grant, Ryan

    While large-scale simulations have been the hallmark of the High Performance Computing (HPC) community for decades, Large Scale Data Analytics (LSDA) workloads are gaining attention within the scientific community not only as a processing component of large HPC simulations, but also as standalone scientific tools for knowledge discovery. With the path towards Exascale, new HPC runtime systems are also emerging in a way that differs from classical distributed computing models. However, system software for such capabilities on the latest extreme-scale DOE supercomputers needs to be enhanced to more appropriately support these types of emerging software ecosystems. In this paper, we propose the use of Virtual Clusters on advanced supercomputing resources to enable systems to support not only HPC workloads, but also emerging big data stacks. Specifically, we have deployed the KVM hypervisor within Cray's Compute Node Linux on an XC-series supercomputer testbed. We also use libvirt and QEMU to manage and provision VMs directly on compute nodes, leveraging Ethernet-over-Aries network emulation. To our knowledge, this is the first known use of KVM on a true MPP supercomputer. We investigate the overhead of our solution using HPC benchmarks, both evaluating single-node performance as well as weak scaling of a 32-node virtual cluster. Overall, we find single-node performance of our solution using KVM on a Cray is very efficient, with near-native performance. However, overhead increases by up to 20% as virtual cluster size increases, due to limitations of the Ethernet-over-Aries bridged network. Furthermore, we deploy Apache Spark with large data analysis workloads in a Virtual Cluster, effectively demonstrating how diverse software ecosystems can be supported by High Performance Virtual Clusters.
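
    Provisioning a guest through libvirt and QEMU can be sketched with the libvirt Python bindings roughly as shown below; the domain XML, image path, and sizing are placeholders, and the Cray-specific Ethernet-over-Aries network emulation used in the paper is not reproduced here:

      # Rough sketch of provisioning a KVM guest via the libvirt Python bindings.
      # The XML is a placeholder; a real deployment needs an actual disk image,
      # sizing, and a network device appropriate to the host.
      import libvirt

      DOMAIN_XML = """
      <domain type='kvm'>
        <name>vcluster-node0</name>
        <memory unit='MiB'>4096</memory>
        <vcpu>4</vcpu>
        <os><type arch='x86_64'>hvm</type></os>
        <devices>
          <disk type='file' device='disk'>
            <source file='/var/lib/libvirt/images/vcluster-node0.qcow2'/>
            <target dev='vda' bus='virtio'/>
          </disk>
        </devices>
      </domain>
      """

      conn = libvirt.open("qemu:///system")    # connect to the local QEMU/KVM hypervisor
      dom = conn.defineXML(DOMAIN_XML)         # register the guest definition
      dom.create()                             # boot the virtual machine
      print(dom.name(), "active:", dom.isActive() == 1)
      conn.close()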

  17. Time-dependent density-functional theory in massively parallel computer architectures: the octopus project

    NASA Astrophysics Data System (ADS)

    Andrade, Xavier; Alberdi-Rodriguez, Joseba; Strubbe, David A.; Oliveira, Micael J. T.; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Louie, Steven G.; Aspuru-Guzik, Alán; Rubio, Angel; Marques, Miguel A. L.

    2012-06-01

    Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.
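
    For reference, the real-time propagation that is being parallelized can be summarized schematically (standard time-dependent Kohn-Sham notation in atomic units; not equations or code taken from octopus itself):

      % Time-dependent Kohn-Sham equations propagated directly in real time (schematic).
      \[
        i\,\frac{\partial \varphi_j(\mathbf{r},t)}{\partial t}
        = \hat{H}_{\mathrm{KS}}[n](\mathbf{r},t)\,\varphi_j(\mathbf{r},t),
        \qquad
        n(\mathbf{r},t) = \sum_j \lvert \varphi_j(\mathbf{r},t)\rvert^{2},
      \]
      \[
        \varphi_j(\mathbf{r},t+\Delta t)
        \approx \exp\!\bigl[-i\,\hat{H}_{\mathrm{KS}}(t)\,\Delta t\bigr]\,\varphi_j(\mathbf{r},t),
      \]
      % the independent Kohn-Sham states and the real-space grid domains provide the
      % main levels of parallelism combined in the multi-level scheme described above.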

  18. Energy consumption optimization of the total-FETI solver by changing the CPU frequency

    NASA Astrophysics Data System (ADS)

    Horak, David; Riha, Lubomir; Sojka, Radim; Kruzik, Jakub; Beseda, Martin; Cermak, Martin; Schuchart, Joseph

    2017-07-01

    The energy consumption of supercomputers is one of the critical problems for the upcoming Exascale supercomputing era. Awareness of power and energy consumption is required on both the software and hardware side. This paper deals with the energy consumption evaluation of the Finite Element Tearing and Interconnect (FETI) based solvers of linear systems, which is an established method for solving real-world engineering problems. We have evaluated the effect of the CPU frequency on the energy consumption of the FETI solver using a linear elasticity 3D cube synthetic benchmark. In this problem, we have evaluated the effect of frequency tuning on the energy consumption of the essential processing kernels of the FETI method. The paper provides results for two types of frequency tuning: (1) static tuning and (2) dynamic tuning. For static tuning experiments, the frequency is set before execution and kept constant during the runtime. For dynamic tuning, the frequency is changed during the program execution to adapt the system to the actual needs of the application. The paper shows that static tuning brings up to 12% energy savings when compared to default CPU settings (the highest clock rate). The dynamic tuning improves this further by up to 3%.
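
    Static tuning of the kind described can be approximated on a Linux node through the cpufreq sysfs interface; the sketch below assumes root privileges and a driver that exposes the 'userspace' governor, and it is not the tuning infrastructure used in the paper:

      # Rough sketch of static CPU frequency tuning via Linux cpufreq sysfs
      # (assumes root and the 'userspace' governor; not the paper's own tooling).
      import glob

      def set_static_frequency(khz):
          """Pin every core to a fixed frequency (kHz) before launching the solver."""
          for cpu in glob.glob("/sys/devices/system/cpu/cpu[0-9]*/cpufreq"):
              with open(f"{cpu}/scaling_governor", "w") as f:
                  f.write("userspace")
              with open(f"{cpu}/scaling_setspeed", "w") as f:
                  f.write(str(khz))

      set_static_frequency(2_000_000)   # e.g. pin to 2.0 GHz; adjust to the platform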

  19. Time-dependent density-functional theory in massively parallel computer architectures: the OCTOPUS project.

    PubMed

    Andrade, Xavier; Alberdi-Rodriguez, Joseba; Strubbe, David A; Oliveira, Micael J T; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Louie, Steven G; Aspuru-Guzik, Alán; Rubio, Angel; Marques, Miguel A L

    2012-06-13

    Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.

  20. Combining density functional theory calculations, supercomputing, and data-driven methods to design new materials (Conference Presentation)

    NASA Astrophysics Data System (ADS)

    Jain, Anubhav

    2017-04-01

    Density functional theory (DFT) simulations solve for the electronic structure of materials starting from the Schrödinger equation. Many case studies have now demonstrated that researchers can often use DFT to design new compounds in the computer (e.g., for batteries, catalysts, and hydrogen storage) before synthesis and characterization in the lab. In this talk, I will focus on how DFT calculations can be executed on large supercomputing resources in order to generate very large data sets on new materials for functional applications. First, I will briefly describe the Materials Project, an effort at LBNL that has virtually characterized over 60,000 materials using DFT and has shared the results with over 17,000 registered users. Next, I will talk about how such data can help discover new materials, describing how preliminary computational screening led to the identification and confirmation of a new family of bulk AMX2 thermoelectric compounds with measured zT reaching 0.8. I will outline future plans for how such data-driven methods can be used to better understand the factors that control thermoelectric behavior, e.g., for the rational design of electronic band structures, in ways that are different from conventional approaches.

  1. TopoLens: Building a cyberGIS community data service for enhancing the usability of high-resolution National Topographic datasets

    USGS Publications Warehouse

    Hu, Hao; Hong, Xingchen; Terstriep, Jeff; Liu, Yan; Finn, Michael P.; Rush, Johnathan; Wendel, Jeffrey; Wang, Shaowen

    2016-01-01

    Geospatial data, often embedded with geographic references, are important to many application and science domains, and represent a major type of big data. The increased volume and diversity of geospatial data have caused serious usability issues for researchers in various scientific domains, which call for innovative cyberGIS solutions. To address these issues, this paper describes a cyberGIS community data service framework to facilitate geospatial big data access, processing, and sharing based on a hybrid supercomputer architecture. Through the collaboration between the CyberGIS Center at the University of Illinois at Urbana-Champaign (UIUC) and the U.S. Geological Survey (USGS), a community data service for accessing, customizing, and sharing digital elevation model (DEM) and its derived datasets from the 10-meter national elevation dataset, namely TopoLens, is created to demonstrate the workflow integration of geospatial big data sources, computation, analysis needed for customizing the original dataset for end user needs, and a friendly online user environment. TopoLens provides online access to precomputed and on-demand computed high-resolution elevation data by exploiting the ROGER supercomputer. The usability of this prototype service has been acknowledged in community evaluation.

  2. Computing at the speed limit (supercomputers)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bernhard, R.

    1982-07-01

    The author discusses how unheralded efforts in the United States, mainly in universities, have removed major stumbling blocks to building cost-effective superfast computers for scientific and engineering applications within five years. These computers would have sustained speeds of billions of floating-point operations per second (flops), whereas with the fastest machines today the top sustained speed is only 25 million flops, with bursts to 160 megaflops. Cost-effective superfast machines can be built because of advances in very large-scale integration and the special software needed to program the new machines. VLSI greatly reduces the cost per unit of computing power. The development of such computers would come at an opportune time. Although the US leads the world in large-scale computer technology, its supremacy is now threatened, not surprisingly, by the Japanese. Publicized reports indicate that the Japanese government is funding a cooperative effort by commercial computer manufacturers to develop superfast computers-about 1000 times faster than modern supercomputers. The US computer industry, by contrast, has balked at attempting to boost computer power so sharply because of the uncertain market for the machines and the failure of similar projects in the past to show significant results.

  3. Visualization at supercomputing centers: the tale of little big iron and the three skinny guys.

    PubMed

    Bethel, E W; van Rosendale, J; Southard, D; Gaither, K; Childs, H; Brugger, E; Ahern, S

    2011-01-01

    Supercomputing centers are unique resources that aim to enable scientific knowledge discovery by employing large computational resources-the "Big Iron." Design, acquisition, installation, and management of the Big Iron are carefully planned and monitored. Because these Big Iron systems produce a tsunami of data, it's natural to colocate the visualization and analysis infrastructure. This infrastructure consists of hardware (Little Iron) and staff (Skinny Guys). Our collective experience suggests that design, acquisition, installation, and management of the Little Iron and Skinny Guys doesn't receive the same level of treatment as that of the Big Iron. This article explores the following questions about the Little Iron: How should we size the Little Iron to adequately support visualization and analysis of data coming off the Big Iron? What sort of capabilities must it have? Related questions concern the size of visualization support staff: How big should a visualization program be-that is, how many Skinny Guys should it have? What should the staff do? How much of the visualization should be provided as a support service, and how much should applications scientists be expected to do on their own?

  4. PFLOTRAN: Reactive Flow & Transport Code for Use on Laptops to Leadership-Class Supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hammond, Glenn E.; Lichtner, Peter C.; Lu, Chuan

    PFLOTRAN, a next-generation reactive flow and transport code for modeling subsurface processes, has been designed from the ground up to run efficiently on machines ranging from leadership-class supercomputers to laptops. Based on an object-oriented design, the code is easily extensible to incorporate additional processes. It can interface seamlessly with Fortran 9X, C and C++ codes. Domain decomposition parallelism is employed, with the PETSc parallel framework used to manage parallel solvers, data structures and communication. Features of the code include a modular input file, implementation of high-performance I/O using parallel HDF5, ability to perform multiple realization simulations with multiple processors per realization in a seamless manner, and multiple modes for multiphase flow and multicomponent geochemical transport. Chemical reactions currently implemented in the code include homogeneous aqueous complexing reactions and heterogeneous mineral precipitation/dissolution, ion exchange, surface complexation and a multirate kinetic sorption model. PFLOTRAN has demonstrated petascale performance using 2^17 processor cores with over 2 billion degrees of freedom. Accomplishments achieved to date include applications to the Hanford 300 Area and modeling CO2 sequestration in deep geologic formations.
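
    The "multiple realizations with multiple processors per realization" capability can be pictured as a communicator split, as in the hedged mpi4py sketch below; this is not PFLOTRAN source code, and the realization count and placeholder solve are assumptions.

```python
# Minimal mpi4py sketch (not PFLOTRAN source): run several stochastic
# realizations concurrently, each on its own sub-communicator, by splitting
# MPI_COMM_WORLD. The per-realization "solve" is a placeholder.
from mpi4py import MPI

world = MPI.COMM_WORLD
n_realizations = 4                                  # assumed; world size must be divisible
procs_per_real = world.size // n_realizations
color = world.rank // procs_per_real                # which realization this rank belongs to
realization_comm = world.Split(color=color, key=world.rank)

def solve_realization(comm, realization_id):
    # Placeholder for a domain-decomposed flow/transport solve on `comm`.
    local_value = float(realization_id)
    return comm.allreduce(local_value, op=MPI.SUM) / comm.size

result = solve_realization(realization_comm, color)
if realization_comm.rank == 0:
    print(f"realization {color}: result {result:.3f} on {realization_comm.size} ranks")
```

    Launched with, for example, mpiexec -n 8, four realizations each run concurrently on two ranks.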

  5. Evolution of a minimal parallel programming model

    DOE PAGES

    Lusk, Ewing; Butler, Ralph; Pieper, Steven C.

    2017-04-30

    Here, we take a historical approach to our presentation of self-scheduled task parallelism, a programming model with its origins in early irregular and nondeterministic computations encountered in automated theorem proving and logic programming. We show how an extremely simple task model has evolved into a system, asynchronous dynamic load balancing (ADLB), and a scalable implementation capable of supporting sophisticated applications on today’s (and tomorrow’s) largest supercomputers; and we illustrate the use of ADLB with a Green’s function Monte Carlo application, a modern, mature nuclear physics code in production use. Our lesson is that by surrendering a certain amount of generality and thus applicability, a minimal programming model (in terms of its basic concepts and the size of its application programmer interface) can achieve extreme scalability without introducing complexity.
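
    The put/get work-pool shape of this programming model can be sketched with a single manager rank, as below; the real ADLB library spreads the pool across dedicated server processes, so the tags, payloads, and manager layout here are illustrative assumptions only.

```python
# Sketch of the put/get work-pool idea behind self-scheduled task parallelism
# (mpi4py, single manager rank). The real ADLB library distributes the pool
# across dedicated server processes; tags and task payloads here are made up.
from mpi4py import MPI

comm, rank = MPI.COMM_WORLD, MPI.COMM_WORLD.rank
TAG_REQUEST, TAG_TASK, TAG_DONE = 1, 2, 3

if rank == 0:                                   # manager: owns the shared pool
    pool = [{"task_id": i, "payload": i * i} for i in range(100)]
    workers_left = comm.size - 1
    while workers_left:
        status = MPI.Status()
        comm.recv(source=MPI.ANY_SOURCE, tag=TAG_REQUEST, status=status)
        if pool:
            comm.send(pool.pop(), dest=status.Get_source(), tag=TAG_TASK)
        else:
            comm.send(None, dest=status.Get_source(), tag=TAG_DONE)
            workers_left -= 1
else:                                           # workers: "get" work until the pool drains
    while True:
        comm.send(None, dest=0, tag=TAG_REQUEST)
        status = MPI.Status()
        task = comm.recv(source=0, tag=MPI.ANY_TAG, status=status)
        if status.Get_tag() == TAG_DONE:
            break
        # ... do the work; new tasks could be "put" back by sending them to rank 0
        print(f"rank {rank} handled task {task['task_id']}")
```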

  6. Towards real-time photon Monte Carlo dose calculation in the cloud

    NASA Astrophysics Data System (ADS)

    Ziegenhein, Peter; Kozin, Igor N.; Kamerling, Cornelis Ph; Oelfke, Uwe

    2017-06-01

    Near real-time application of Monte Carlo (MC) dose calculation in clinic and research is hindered by the long computational runtimes of established software. Currently, fast MC software solutions are available utilising accelerators such as graphical processing units (GPUs) or clusters based on central processing units (CPUs). Both platforms are expensive in terms of purchase costs and maintenance and, in the case of GPUs, provide only limited scalability. In this work we propose a cloud-based MC solution, which offers high scalability of accurate photon dose calculations. The MC simulations run on a private virtual supercomputer that is formed in the cloud. Computational resources can be provisioned dynamically at low cost without upfront investment in expensive hardware. A client-server software solution has been developed which controls the simulations and transports data to and from the cloud efficiently and securely. The client application integrates seamlessly into a treatment planning system. It runs the MC simulation workflow automatically and securely exchanges simulation data with the server side application that controls the virtual supercomputer. Advanced encryption standards were used to add an additional security layer, which encrypts and decrypts patient data on-the-fly at the processor register level. We show that our cloud-based MC framework enables near real-time dose computation. It delivers excellent linear scaling for high-resolution datasets with absolute runtimes of 1.1 seconds to 10.9 seconds for simulating a clinical prostate and liver case up to 1% statistical uncertainty. The computation runtimes include the transportation of data to and from the cloud as well as process scheduling and synchronisation overhead. Cloud-based MC simulations offer a fast, affordable and easily accessible alternative for near real-time accurate dose calculations to currently used GPU or cluster solutions.
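
    The on-the-fly encryption layer mentioned above can be approximated with authenticated AES-GCM encryption of the dataset before upload, as in the hedged sketch below; it is not the authors' client-server implementation, and the key handling, file name, and associated data are assumptions (requires the cryptography package).

```python
# Illustrative sketch only (not the authors' client-server code): encrypt a
# patient dataset with AES-GCM before it leaves the clinic, decrypt it on the
# cloud side. File names and associated data are made up.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_for_upload(plaintext: bytes, key: bytes) -> bytes:
    nonce = os.urandom(12)                       # 96-bit nonce recommended for GCM
    return nonce + AESGCM(key).encrypt(nonce, plaintext, b"ct-dataset")

def decrypt_in_cloud(blob: bytes, key: bytes) -> bytes:
    nonce, ciphertext = blob[:12], blob[12:]
    return AESGCM(key).decrypt(nonce, ciphertext, b"ct-dataset")

key = AESGCM.generate_key(bit_length=256)        # in practice, exchanged out of band
with open("patient_ct.raw", "rb") as f:          # hypothetical CT volume
    blob = encrypt_for_upload(f.read(), key)
# ... transfer `blob` to the virtual supercomputer, then:
volume = decrypt_in_cloud(blob, key)
```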

  7. Performance Evaluation and Modeling Techniques for Parallel Processors. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Dimpsey, Robert Tod

    1992-01-01

    In practice, the performance evaluation of supercomputers is still substantially driven by single-point estimates of metrics (e.g., MFLOPS) obtained by running characteristic benchmarks or workloads. With the rapid increase in the use of time-shared multiprogramming in these systems, such measurements are clearly inadequate. This is because multiprogramming and system overhead, as well as other degradations in performance due to time varying characteristics of workloads, are not taken into account. In multiprogrammed environments, multiple jobs and users can dramatically increase the amount of system overhead and degrade the performance of the machine. Performance techniques, such as benchmarking, which characterize performance on a dedicated machine ignore this major component of true computer performance. Due to the complexity of analysis, there has been little work done in analyzing, modeling, and predicting the performance of applications in multiprogrammed environments. This is especially true for parallel processors, where the costs and benefits of multi-user workloads are exacerbated. While some may claim that the issue of multiprogramming is not a viable one in the supercomputer market, experience shows otherwise. Even in recent massively parallel machines, multiprogramming is a key component. It has even been claimed that a partial cause of the demise of the CM2 was the fact that it did not efficiently support time-sharing. In the same paper, Gordon Bell postulates that multicomputers will evolve to multiprocessors in order to support efficient multiprogramming. Therefore, it is clear that parallel processors of the future will be required to offer the user a time-shared environment with reasonable response times for the applications. In this type of environment, the most important performance metric is the completion or response time of a given application. However, only a few evaluation efforts have addressed this issue.

  8. Towards real-time photon Monte Carlo dose calculation in the cloud.

    PubMed

    Ziegenhein, Peter; Kozin, Igor N; Kamerling, Cornelis Ph; Oelfke, Uwe

    2017-06-07

    Near real-time application of Monte Carlo (MC) dose calculation in clinic and research is hindered by the long computational runtimes of established software. Currently, fast MC software solutions are available utilising accelerators such as graphical processing units (GPUs) or clusters based on central processing units (CPUs). Both platforms are expensive in terms of purchase costs and maintenance and, in the case of GPUs, provide only limited scalability. In this work we propose a cloud-based MC solution, which offers high scalability of accurate photon dose calculations. The MC simulations run on a private virtual supercomputer that is formed in the cloud. Computational resources can be provisioned dynamically at low cost without upfront investment in expensive hardware. A client-server software solution has been developed which controls the simulations and transports data to and from the cloud efficiently and securely. The client application integrates seamlessly into a treatment planning system. It runs the MC simulation workflow automatically and securely exchanges simulation data with the server side application that controls the virtual supercomputer. Advanced encryption standards were used to add an additional security layer, which encrypts and decrypts patient data on-the-fly at the processor register level. We show that our cloud-based MC framework enables near real-time dose computation. It delivers excellent linear scaling for high-resolution datasets with absolute runtimes of 1.1 seconds to 10.9 seconds for simulating a clinical prostate and liver case up to 1% statistical uncertainty. The computation runtimes include the transportation of data to and from the cloud as well as process scheduling and synchronisation overhead. Cloud-based MC simulations offer a fast, affordable and easily accessible alternative for near real-time accurate dose calculations to currently used GPU or cluster solutions.

  9. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bland, Arthur S Buddy; Hack, James J; Baker, Ann E

    Oak Ridge National Laboratory's (ORNL's) Cray XT5 supercomputer, Jaguar, kicked off the era of petascale scientific computing in 2008 with applications that sustained more than a thousand trillion floating point calculations per second - or 1 petaflop. Jaguar continues to grow even more powerful as it helps researchers broaden the boundaries of knowledge in virtually every domain of computational science, including weather and climate, nuclear energy, geosciences, combustion, bioenergy, fusion, and materials science. Their insights promise to broaden our knowledge in areas that are vitally important to the Department of Energy (DOE) and the nation as a whole, particularly energy assurance and climate change. The science of the 21st century, however, will demand further revolutions in computing, supercomputers capable of a million trillion calculations a second - 1 exaflop - and beyond. These systems will allow investigators to continue attacking global challenges through modeling and simulation and to unravel longstanding scientific questions. Creating such systems will also require new approaches to daunting challenges. High-performance systems of the future will need to be codesigned for scientific and engineering applications with best-in-class communications networks and data-management infrastructures and teams of skilled researchers able to take full advantage of these new resources. The Oak Ridge Leadership Computing Facility (OLCF) provides the nation's most powerful open resource for capability computing, with a sustainable path that will maintain and extend national leadership for DOE's Office of Science (SC). The OLCF has engaged a world-class team to support petascale science and to take a dramatic step forward, fielding new capabilities for high-end science. This report highlights the successful delivery and operation of a petascale system and shows how the OLCF fosters application development teams, developing cutting-edge tools and resources for next-generation systems.

  10. Statistical correlations and risk analyses techniques for a diving dual phase bubble model and data bank using massively parallel supercomputers.

    PubMed

    Wienke, B R; O'Leary, T R

    2008-05-01

    Linking model and data, we detail the LANL diving reduced gradient bubble model (RGBM), dynamical principles, and correlation with data in the LANL Data Bank. Table, profile, and meter risks are obtained from likelihood analysis and quoted for air, nitrox, helitrox no-decompression time limits, repetitive dive tables, and selected mixed gas and repetitive profiles. Application analyses include the EXPLORER decompression meter algorithm, NAUI tables, University of Wisconsin Seafood Diver tables, comparative NAUI, PADI, Oceanic NDLs and repetitive dives, comparative nitrogen and helium mixed gas risks, USS Perry deep rebreather (RB) exploration dive, world record open circuit (OC) dive, and Woodville Karst Plain Project (WKPP) extreme cave exploration profiles. The algorithm has seen extensive and utilitarian application in mixed gas diving, both in recreational and technical sectors, and forms the basis for released tables and decompression meters used by scientific, commercial, and research divers. The LANL Data Bank is described, and the methods used to deduce risk are detailed. Risk functions for dissolved gas and bubbles are summarized. Parameters that can be used to estimate profile risk are tallied. To fit data, a modified Levenberg-Marquardt routine is employed with L2 error norm. Appendices sketch the numerical methods, and list reports from field testing for (real) mixed gas diving. A Monte Carlo-like sampling scheme for fast numerical analysis of the data is also detailed, as a coupled variance reduction technique and additional check on the canonical approach to estimating diving risk. The method suggests alternatives to the canonical approach. This work represents a first time correlation effort linking a dynamical bubble model with deep stop data. Supercomputing resources are requisite to connect model and data in application.
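
    The parameter-fitting step can be illustrated with a Levenberg-Marquardt least-squares fit of a simple risk function to synthetic outcome data, as sketched below; the model form and data are invented for demonstration and are not the LANL RGBM risk functions or Data Bank.

```python
# Hedged sketch: fit parameters of an illustrative dive-risk function to
# synthetic outcome data with a Levenberg-Marquardt least-squares routine.
# The model form and data are invented; they are not the LANL risk functions.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(42)
exposure = rng.uniform(0.5, 3.0, size=200)            # e.g. a normalized supersaturation dose
true_a, true_b = 0.02, 1.7
observed_risk = 1.0 - np.exp(-true_a * exposure**true_b)
observed_risk += rng.normal(0.0, 0.005, size=exposure.size)

def model(params, x):
    a, b = params
    return 1.0 - np.exp(-a * x**b)

def residuals(params):
    return model(params, exposure) - observed_risk     # L2 norm minimized below

fit = least_squares(residuals, x0=[0.01, 1.0], method="lm")
print("estimated (a, b):", fit.x)
```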

  11. Next Generation Security for the 10,240 Processor Columbia System

    NASA Technical Reports Server (NTRS)

    Hinke, Thomas; Kolano, Paul; Shaw, Derek; Keller, Chris; Tweton, Dave; Welch, Todd; Liu, Wen (Betty)

    2005-01-01

    This presentation includes a discussion of the Columbia 10,240-processor system located at the NASA Advanced Supercomputing (NAS) division at the NASA Ames Research Center which supports each of NASA's four missions: science, exploration systems, aeronautics, and space operations. It is comprised of 20 Silicon Graphics nodes, each consisting of 512 Itanium II processors. A 64 processor Columbia front-end system supports users as they prepare their jobs and then submits them to the PBS system. Columbia nodes and front-end systems use the Linux OS. Prior to SC04, the Columbia system was used to attain a processing speed of 51.87 TeraFlops, which made it number two on the Top 500 list of the world's supercomputers and the world's fastest "operational" supercomputer since it was fully engaged in supporting NASA users.

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fang, Aiman; Laguna, Ignacio; Sato, Kento

    Future high-performance computing systems may face frequent failures with their rapid increase in scale and complexity. Resilience to faults has become a major challenge for large-scale applications running on supercomputers, which demands fault tolerance support for prevalent MPI applications. Among failure scenarios, process failures are one of the most severe issues as they usually lead to termination of applications. However, the widely used MPI implementations do not provide mechanisms for fault tolerance. We propose FTA-MPI (Fault Tolerance Assistant MPI), a programming model that provides support for failure detection, failure notification and recovery. Specifically, FTA-MPI exploits a try/catch model that enables failure localization and transparent recovery of process failures in MPI applications. We demonstrate FTA-MPI with synthetic applications and a molecular dynamics code CoMD, and show that FTA-MPI provides high programmability for users and enables convenient and flexible recovery of process failures.
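
    The try/catch failure-handling pattern can be sketched as below with mpi4py, where MPI errors are surfaced as catchable exceptions and the application falls back to its last checkpoint; this is not the FTA-MPI API, and the checkpoint helpers are hypothetical placeholders.

```python
# Conceptual sketch of a try/catch fault-handling pattern for MPI applications.
# This is NOT the FTA-MPI interface; the checkpoint helpers are hypothetical
# placeholders. With mpi4py, setting MPI.ERRORS_RETURN makes MPI failures
# surface as catchable MPI.Exception objects instead of aborting the job.
from mpi4py import MPI

comm = MPI.COMM_WORLD
comm.Set_errhandler(MPI.ERRORS_RETURN)          # report errors instead of aborting

def save_checkpoint(step):                      # hypothetical persistence helper
    pass

def restore_checkpoint():                       # hypothetical recovery helper
    return 0                                    # last safely completed step

step = restore_checkpoint()
while step < 1000:
    try:
        # ... local computation for this step ...
        comm.allreduce(step, op=MPI.SUM)        # a collective that may observe a failure
        if step % 100 == 0:
            save_checkpoint(step)
        step += 1
    except MPI.Exception as err:
        # A ULFM-style runtime would let survivors repair the communicator here;
        # this sketch simply rolls back to the last checkpoint.
        print(f"rank {comm.rank}: MPI failure detected ({err}); rolling back")
        step = restore_checkpoint()
```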

  13. Computational mechanics analysis tools for parallel-vector supercomputers

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O.; Nguyen, Duc T.; Baddourah, Majdi; Qin, Jiangning

    1993-01-01

    Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigensolution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization search analysis and domain decomposition. The source code for many of these algorithms is available.

  14. A Layered Solution for Supercomputing Storage

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grider, Gary

    To solve the supercomputing challenge of memory keeping up with processing speed, a team at Los Alamos National Laboratory developed two innovative memory management and storage technologies. Burst buffers peel off data onto flash memory to support the checkpoint/restart paradigm of large simulations. MarFS adds a thin software layer enabling a new tier for campaign storage—based on inexpensive, failure-prone disk drives—between disk drives and tape archives.

  15. A Heterogeneous High-Performance System for Computational and Computer Science

    DTIC Science & Technology

    2016-11-15

    ...a team of research faculty from the departments of computer science and natural science at Bowie State University. The supercomputer is not only ... accelerated HPC systems. The supercomputer is also ideal for the research conducted in the Department of Natural Science, as research faculty work on ...

  16. LLMapReduce: Multi-Lingual Map-Reduce for Supercomputing Environments

    DTIC Science & Technology

    2015-11-20

    Popularized by Google [36] and Apache Hadoop [37], the map-reduce parallel programming model has become a staple technology of the ever-growing big data community. ... LLMapReduce brings map-reduce to big data users running on a supercomputer and dramatically simplifies map-reduce programming by providing simple parallel programming ...

  17. Advanced Numerical Techniques of Performance Evaluation. Volume 1

    DTIC Science & Technology

    1990-06-01

    ...the scheduling thread then runs any other ready thread that can be found; a thread can only sleep or switch out on itself... [Cited reference: C.D. Polychronopoulos and D.J. Kuck, "Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers," IEEE Transactions on Computers, 1987.]

  18. Supporting geoscience with graphical-user-interface Internet tools for the Macintosh

    NASA Astrophysics Data System (ADS)

    Robin, Bernard

    1995-07-01

    This paper describes a suite of Macintosh graphical-user-interface (GUI) software programs that can be used in conjunction with the Internet to support geoscience education. These software programs allow science educators to access and retrieve a large body of resources from an increasing number of network sites, taking advantage of the intuitive, simple-to-use Macintosh operating system. With these tools, educators easily can locate, download, and exchange not only text files but also sound resources, video movie clips, and software application files from their desktop computers. Another major advantage of these software tools is that they are available at no cost and may be distributed freely. The following GUI software tools are described including examples of how they can be used in an educational setting: ∗ Eudora—an e-mail program ∗ NewsWatcher—a newsreader ∗ TurboGopher—a Gopher program ∗ Fetch—a software application for easy File Transfer Protocol (FTP) ∗ NCSA Mosaic—a worldwide hypertext browsing program. An explosive growth of online archives currently is underway as new electronic sites are being added continuously to the Internet. Many of these resources may be of interest to science educators who learn they can share not only ASCII text files, but also graphic image files, sound resources, QuickTime movie clips, and hypermedia projects with colleagues from locations around the world. These powerful, yet simple to learn GUI software tools are providing a revolution in how knowledge can be accessed, retrieved, and shared.

  19. A Web Portal-Based Time-Aware KML Animation Tool for Exploring Spatiotemporal Dynamics of Hydrological Events

    NASA Astrophysics Data System (ADS)

    Bao, X.; Cai, X.; Liu, Y.

    2009-12-01

    Understanding spatiotemporal dynamics of hydrological events such as storms and droughts is highly valuable for decision making on disaster mitigation and recovery. Virtual Globe-based technologies such as Google Earth and Open Geospatial Consortium KML standards show great promise for collaborative exploration of such events using visual analytical approaches. However, currently there are two barriers to wider usage of such approaches. First, there is no easy way to use open source tools to convert legacy or existing data formats such as shapefiles, GeoTIFF, or web services-based data sources to KML and to produce time-aware KML files. Second, an integrated web portal-based time-aware animation tool is currently not available. Thus users usually share their files in the portal but have no means to visually explore them without leaving the portal environment with which they are familiar. We develop a web portal-based time-aware KML animation tool for viewing extreme hydrologic events. The tool is based on Google Earth JavaScript API and Java Portlet standard 2.0 JSR-286, and it is currently deployable in one of the most popular open source portal frameworks, namely Liferay. We have also developed an open source toolkit kml-soc-ncsa (http://code.google.com/p/kml-soc-ncsa/) to facilitate the conversion of multiple formats into KML and the creation of time-aware KML files. We illustrate our tool using example cases in which drought and storm events can be explored in both time and space within this web-based KML animation portlet. The tool provides an easy-to-use web browser-based portal environment for multiple users to collaboratively share and explore their time-aware KML files as well as improving the understanding of the spatiotemporal dynamics of the hydrological events.
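
    Producing a time-aware KML file boils down to attaching TimeSpan elements to features, as in the minimal standard-library sketch below; it is not the kml-soc-ncsa toolkit, and the storm-track coordinates and timestamps are invented.

```python
# Minimal sketch of producing a time-aware KML file with the standard library
# (not the kml-soc-ncsa toolkit). Each placemark carries a <TimeSpan> so
# Google Earth's time slider can animate it; the storm-track points are made up.
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
ET.register_namespace("", KML_NS)

def q(tag):                                   # qualified-tag helper
    return f"{{{KML_NS}}}{tag}"

track = [  # (lon, lat, begin, end) -- hypothetical hourly storm positions
    (-91.0, 40.0, "2008-06-12T00:00:00Z", "2008-06-12T01:00:00Z"),
    (-90.6, 40.2, "2008-06-12T01:00:00Z", "2008-06-12T02:00:00Z"),
    (-90.1, 40.5, "2008-06-12T02:00:00Z", "2008-06-12T03:00:00Z"),
]

kml = ET.Element(q("kml"))
doc = ET.SubElement(kml, q("Document"))
for lon, lat, begin, end in track:
    pm = ET.SubElement(doc, q("Placemark"))
    span = ET.SubElement(pm, q("TimeSpan"))
    ET.SubElement(span, q("begin")).text = begin
    ET.SubElement(span, q("end")).text = end
    point = ET.SubElement(pm, q("Point"))
    ET.SubElement(point, q("coordinates")).text = f"{lon},{lat},0"

ET.ElementTree(kml).write("storm_track.kml", xml_declaration=True, encoding="UTF-8")
```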

  20. An evaluation of the state of time synchronization on leadership class supercomputers

    DOE PAGES

    Jones, Terry; Ostrouchov, George; Koenig, Gregory A.; ...

    2017-10-09

    We present a detailed examination of time agreement characteristics for nodes within extreme-scale parallel computers. Using a software tool we introduce in this paper, we quantify attributes of clock skew among nodes in three representative high-performance computers sited at three national laboratories. Our measurements detail the statistical properties of time agreement among nodes and how time agreement drifts over typical application execution durations. We discuss the implications of our measurements, why the current state of the field is inadequate, and propose strategies to address observed shortcomings.
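
    A basic version of such a clock-skew measurement can be written as a ping-pong offset estimate against a reference rank, as in the hedged mpi4py sketch below; this is a generic technique, not the measurement tool introduced in the paper, and the trial count and lowest-latency filtering are assumptions.

```python
# Generic sketch (not the paper's measurement tool): estimate each rank's
# clock offset relative to rank 0 with a ping-pong exchange, assuming the
# one-way latency is roughly half the measured round trip.
from mpi4py import MPI

comm, rank = MPI.COMM_WORLD, MPI.COMM_WORLD.rank

def estimate_offset(trials=100):
    """Offset of this rank's MPI.Wtime() clock relative to rank 0 (seconds)."""
    if rank == 0:
        for _ in range(trials * (comm.size - 1)):
            peer = comm.recv(source=MPI.ANY_SOURCE)      # ping carries the sender's rank
            comm.send(MPI.Wtime(), dest=peer)            # reply with rank-0 timestamp
        return 0.0
    best = None
    for _ in range(trials):
        t0 = MPI.Wtime()
        comm.send(rank, dest=0)
        remote = comm.recv(source=0)
        t1 = MPI.Wtime()
        offset = remote - (t0 + t1) / 2.0                # remote clock minus local midpoint
        rtt = t1 - t0
        if best is None or rtt < best[0]:                # keep the lowest-latency sample
            best = (rtt, offset)
    return best[1]

offsets = comm.gather(estimate_offset(), root=0)
if rank == 0:
    print("per-rank clock offsets (s):", offsets)
```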

  1. Application of technology developed for flight simulation at NASA. Langley Research Center

    NASA Technical Reports Server (NTRS)

    Cleveland, Jeff I., II

    1991-01-01

    In order to meet the stringent time-critical requirements for real-time man-in-the-loop flight simulation, computer processing operations including mathematical model computation and data input/output to the simulators must be deterministic and be completed in as short a time as possible. Personnel at NASA's Langley Research Center are currently developing the use of supercomputers for simulation mathematical model computation for real-time simulation. This, coupled with the use of an open systems software architecture, will advance the state-of-the-art in real-time flight simulation.

  2. Unstructured grids on SIMD torus machines

    NASA Technical Reports Server (NTRS)

    Bjorstad, Petter E.; Schreiber, Robert

    1994-01-01

    Unstructured grids lead to unstructured communication on distributed memory parallel computers, a problem that has been considered difficult. Here, we consider adaptive, offline communication routing for a SIMD processor grid. Our approach is empirical. We use large data sets drawn from supercomputing applications instead of an analytic model of communication load. The chief contribution of this paper is an experimental demonstration of the effectiveness of certain routing heuristics. Our routing algorithm is adaptive, nonminimal, and is generally designed to exploit locality. We have a parallel implementation of the router, and we report on its performance.

  3. An evaluation of the state of time synchronization on leadership class supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jones, Terry; Ostrouchov, George; Koenig, Gregory A.

    We present a detailed examination of time agreement characteristics for nodes within extreme-scale parallel computers. Using a software tool we introduce in this paper, we quantify attributes of clock skew among nodes in three representative high-performance computers sited at three national laboratories. Our measurements detail the statistical properties of time agreement among nodes and how time agreement drifts over typical application execution durations. We discuss the implications of our measurements, why the current state of the field is inadequate, and propose strategies to address observed shortcomings.

  4. Computational mechanics analysis tools for parallel-vector supercomputers

    NASA Technical Reports Server (NTRS)

    Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.; Qin, J.

    1993-01-01

    Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigen-solution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization algorithm and domain decomposition. The source code for many of these algorithms is available from NASA Langley.

  5. NASA's Pleiades Supercomputer Crunches Data For Groundbreaking Analysis and Visualizations

    NASA Image and Video Library

    2016-11-23

    The Pleiades supercomputer at NASA's Ames Research Center, recently named the 13th fastest computer in the world, provides scientists and researchers high-fidelity numerical modeling of complex systems and processes. By using detailed analyses and visualizations of large-scale data, Pleiades is helping to advance human knowledge and technology, from designing the next generation of aircraft and spacecraft to understanding the Earth's climate and the mysteries of our galaxy.

  6. A Layered Solution for Supercomputing Storage

    ScienceCinema

    Grider, Gary

    2018-06-13

    To solve the supercomputing challenge of memory keeping up with processing speed, a team at Los Alamos National Laboratory developed two innovative memory management and storage technologies. Burst buffers peel off data onto flash memory to support the checkpoint/restart paradigm of large simulations. MarFS adds a thin software layer enabling a new tier for campaign storage—based on inexpensive, failure-prone disk drives—between disk drives and tape archives.

  7. A Long History of Supercomputing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grider, Gary

    As part of its national security science mission, Los Alamos National Laboratory and HPC have a long, entwined history dating back to the earliest days of computing. From bringing the first problem to the nation’s first computer to building the first machine to break the petaflop barrier, Los Alamos holds many “firsts” in HPC breakthroughs. Today, supercomputers are integral to stockpile stewardship and the Laboratory continues to work with vendors in developing the future of HPC.

  8. ParaBTM: A Parallel Processing Framework for Biomedical Text Mining on Supercomputers.

    PubMed

    Xing, Yuting; Wu, Chengkun; Yang, Xi; Wang, Wei; Zhu, En; Yin, Jianping

    2018-04-27

    A prevailing way of extracting valuable information from biomedical literature is to apply text mining methods on unstructured texts. However, the massive amount of literature that needs to be analyzed poses a big data challenge to the processing efficiency of text mining. In this paper, we address this challenge by introducing parallel processing on a supercomputer. We developed paraBTM, a runnable framework that enables parallel text mining on the Tianhe-2 supercomputer. It employs a low-cost yet effective load balancing strategy to maximize the efficiency of parallel processing. We evaluated the performance of paraBTM on several datasets, utilizing three types of named entity recognition tasks as demonstrations. Results show that, in most cases, the processing efficiency can be greatly improved with parallel processing, and the proposed load balancing strategy is simple and effective. In addition, our framework can be readily applied to other tasks of biomedical text mining besides NER.

  9. Graphics supercomputer for computational fluid dynamics research

    NASA Astrophysics Data System (ADS)

    Liaw, Goang S.

    1994-11-01

    The objective of this project is to purchase a state-of-the-art graphics supercomputer to improve the Computational Fluid Dynamics (CFD) research capability at Alabama A & M University (AAMU) and to support the Air Force research projects. A cutting-edge graphics supercomputer system, Onyx VTX, from Silicon Graphics Computer Systems (SGI), was purchased and installed. Other equipment including a desktop personal computer, PC-486 DX2 with a built-in 10-BaseT Ethernet card, a 10-BaseT hub, an Apple Laser Printer Select 360, and a notebook computer from Zenith were also purchased. A reading room has been converted to a research computer lab by adding some furniture and an air conditioning unit in order to provide an appropriate working environment for researchers and the purchased equipment. All of the purchased equipment was successfully installed and is fully functional. Several research projects, including two existing Air Force projects, are being performed using these facilities.

  10. Modelling sodium cobaltate by mapping onto magnetic Ising model

    NASA Astrophysics Data System (ADS)

    Gemperline, Patrick; Morris, David Jonathan Pryce

    Fast ion conductors are a class of crystals that are frequently used as battery materials, especially in smart phones, laptops, and other portable devices. Sodium cobalt oxide, NaxCoO2, falls into this class of crystals, but is unique because it possesses the ability to act as a thermoelectric material and a superconductor at different concentrations of Na+. The crystal lattice is mapped onto an Ising magnetic spin model, and a Monte Carlo simulation is used to find the most energetically favorable configuration of spins. This spin configuration is mapped back to the crystal lattice, resulting in the most stable crystal structure of sodium cobalt oxide at various concentrations. Knowing the atomic structures of the crystals will aid in the research of the material's capabilities and the possible commercial uses of the material. Acknowledgments: the Ohio Supercomputer Center (Columbus, OH) and the John Hauck Foundation.
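
    The mapping can be illustrated with a standard Metropolis Monte Carlo sweep over a 2D Ising lattice, as in the sketch below, with spin +1/-1 standing in for occupied/vacant Na sites; the lattice size, coupling, and temperature are illustrative, and the study's actual NaxCoO2-specific interactions are not reproduced here.

```python
# Minimal Metropolis Monte Carlo sketch for a 2D Ising model, the kind of spin
# mapping described above (spin +1/-1 standing in for occupied/vacant Na sites).
# Lattice size, coupling J and temperature T are illustrative only.
import numpy as np

rng = np.random.default_rng(1)
L, J, T, sweeps = 32, 1.0, 2.0, 500
spins = rng.choice([-1, 1], size=(L, L))

def metropolis_sweep(spins):
    for _ in range(L * L):
        i, j = rng.integers(L), rng.integers(L)
        neighbors = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
                     + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        dE = 2.0 * J * spins[i, j] * neighbors        # energy cost of flipping site (i, j)
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            spins[i, j] *= -1

for _ in range(sweeps):
    metropolis_sweep(spins)
print("final magnetization per site:", spins.mean())
```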

  11. PyMercury: Interactive Python for the Mercury Monte Carlo Particle Transport Code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Iandola, F N; O'Brien, M J; Procassini, R J

    2010-11-29

    Monte Carlo particle transport applications are often written in low-level languages (C/C++) for optimal performance on clusters and supercomputers. However, this development approach often sacrifices straightforward usability and testing in the interest of fast application performance. To improve usability, some high-performance computing applications employ mixed-language programming with high-level and low-level languages. In this study, we consider the benefits of incorporating an interactive Python interface into a Monte Carlo application. With PyMercury, a new Python extension to the Mercury general-purpose Monte Carlo particle transport code, we improve application usability without diminishing performance. In two case studies, we illustrate how PyMercury improves usability and simplifies testing and validation in a Monte Carlo application. In short, PyMercury demonstrates the value of interactive Python for Monte Carlo particle transport applications. In the future, we expect interactive Python to play an increasingly significant role in Monte Carlo usage and testing.

  12. Ubiquitous Green Computing Techniques for High Demand Applications in Smart Environments

    PubMed Central

    Zapater, Marina; Sanchez, Cesar; Ayala, Jose L.; Moya, Jose M.; Risco-Martín, José L.

    2012-01-01

    Ubiquitous sensor network deployments, such as the ones found in Smart cities and Ambient intelligence applications, require constantly increasing high computational demands in order to process data and offer services to users. The nature of these applications implies the usage of data centers. Research has paid much attention to the energy consumption of the sensor nodes in WSNs infrastructures. However, supercomputing facilities are the ones presenting a higher economic and environmental impact due to their very high power consumption. The latter problem, however, has been disregarded in the field of smart environment services. This paper proposes an energy-minimization workload assignment technique, based on heterogeneity and application-awareness, that redistributes low-demand computational tasks from high-performance facilities to idle nodes with low and medium resources in the WSN infrastructure. These non-optimal allocation policies reduce the energy consumed by the whole infrastructure and the total execution time. PMID:23112621

  13. Scalable parallel distance field construction for large-scale applications

    DOE PAGES

    Yu, Hongfeng; Xie, Jinrong; Ma, Kwan -Liu; ...

    2015-10-01

    Computing distance fields is fundamental to many scientific and engineering applications. Distance fields can be used to direct analysis and reduce data. In this paper, we present a highly scalable method for computing 3D distance fields on massively parallel distributed-memory machines. A new distributed spatial data structure, named parallel distance tree, is introduced to manage the level sets of data and facilitate surface tracking over time, resulting in significantly reduced computation and communication costs for calculating the distance to the surface of interest from any spatial locations. Our method supports several data types and distance metrics from real-world applications. We demonstrate its efficiency and scalability on state-of-the-art supercomputers using both large-scale volume datasets and surface models. We also demonstrate in-situ distance field computation on dynamic turbulent flame surfaces for a petascale combustion simulation. In conclusion, our work greatly extends the usability of distance fields for demanding applications.
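
    As a serial stand-in for the parallel distance-tree method, the sketch below computes a 3D distance field to a simple surface of interest with scipy's Euclidean distance transform; the spherical test surface and volume size are assumptions for illustration.

```python
# Serial stand-in for the parallel method in the paper: compute a 3D Euclidean
# distance field to a surface of interest using scipy's distance transform.
# The "surface" here is the boundary of a sphere embedded in a small volume.
import numpy as np
from scipy.ndimage import distance_transform_edt

n = 64
z, y, x = np.mgrid[0:n, 0:n, 0:n]
radius = np.sqrt((x - n / 2) ** 2 + (y - n / 2) ** 2 + (z - n / 2) ** 2)
surface = np.abs(radius - n / 4) < 1.0          # voxels near the sphere surface

# distance_transform_edt measures distance to the nearest zero voxel,
# so pass the complement of the surface mask.
distance_field = distance_transform_edt(~surface)
print("max distance to surface:", distance_field.max())
```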

  14. Scalable Parallel Distance Field Construction for Large-Scale Applications.

    PubMed

    Yu, Hongfeng; Xie, Jinrong; Ma, Kwan-Liu; Kolla, Hemanth; Chen, Jacqueline H

    2015-10-01

    Computing distance fields is fundamental to many scientific and engineering applications. Distance fields can be used to direct analysis and reduce data. In this paper, we present a highly scalable method for computing 3D distance fields on massively parallel distributed-memory machines. A new distributed spatial data structure, named parallel distance tree, is introduced to manage the level sets of data and facilitate surface tracking over time, resulting in significantly reduced computation and communication costs for calculating the distance to the surface of interest from any spatial locations. Our method supports several data types and distance metrics from real-world applications. We demonstrate its efficiency and scalability on state-of-the-art supercomputers using both large-scale volume datasets and surface models. We also demonstrate in-situ distance field computation on dynamic turbulent flame surfaces for a petascale combustion simulation. Our work greatly extends the usability of distance fields for demanding applications.

  15. Ubiquitous green computing techniques for high demand applications in Smart environments.

    PubMed

    Zapater, Marina; Sanchez, Cesar; Ayala, Jose L; Moya, Jose M; Risco-Martín, José L

    2012-01-01

    Ubiquitous sensor network deployments, such as the ones found in Smart cities and Ambient intelligence applications, require constantly increasing high computational demands in order to process data and offer services to users. The nature of these applications implies the usage of data centers. Research has paid much attention to the energy consumption of the sensor nodes in WSNs infrastructures. However, supercomputing facilities are the ones presenting a higher economic and environmental impact due to their very high power consumption. The latter problem, however, has been disregarded in the field of smart environment services. This paper proposes an energy-minimization workload assignment technique, based on heterogeneity and application-awareness, that redistributes low-demand computational tasks from high-performance facilities to idle nodes with low and medium resources in the WSN infrastructure. These non-optimal allocation policies reduce the energy consumed by the whole infrastructure and the total execution time.

  16. Neighborhood Sociodemographic Predictors of Serious Emotional Disturbance (SED) in Schools: Demonstrating a Small Area Estimation Method in the National Comorbidity Survey (NCS-A) Adolescent Supplement

    PubMed Central

    Alegría, Margarita; Kessler, Ronald C.; McLaughlin, Katie A.; Gruber, Michael J.; Sampson, Nancy A.; Zaslavsky, Alan M.

    2014-01-01

    We evaluate the precision of a model estimating school prevalence of SED using a small area estimation method based on readily-available predictors from area-level census block data and school principal questionnaires. Adolescents at 314 schools participated in the National Comorbidity Supplement, a national survey of DSM-IV disorders among adolescents. A multilevel model indicated that predictors accounted for under half of the variance in school-level SED and even less when considering block-group predictors or principal report alone. While Census measures and principal questionnaires are significant predictors of individual-level SED, associations are too weak to generate precise school-level predictions of SED prevalence. PMID:24740174

  17. US Department of Energy High School Student Supercomputing Honors Program: A follow-up assessment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    1987-01-01

    The US DOE High School Student Supercomputing Honors Program was designed to recognize high school students with superior skills in mathematics and computer science and to provide them with formal training and experience with advanced computer equipment. This document reports on the participants who attended the first such program, which was held at the National Magnetic Fusion Energy Computer Center at the Lawrence Livermore National Laboratory (LLNL) during August 1985.

  18. Green Supercomputing at Argonne

    ScienceCinema

    Beckman, Pete

    2018-02-07

    Pete Beckman, head of Argonne's Leadership Computing Facility (ALCF) talks about Argonne National Laboratory's green supercomputing—everything from designing algorithms to use fewer kilowatts per operation to using cold Chicago winter air to cool the machine more efficiently. Argonne was recognized for green computing in the 2009 HPCwire Readers Choice Awards. More at http://www.anl.gov/Media_Center/News/2009/news091117.html Read more about the Argonne Leadership Computing Facility at http://www.alcf.anl.gov/

  19. Unified, Cross-Platform, Open-Source Library Package for High-Performance Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kozacik, Stephen

    Compute power is continually increasing, but this increased performance is largely found in sophisticated computing devices and supercomputer resources that are difficult to use, resulting in under-utilization. We developed a unified set of programming tools that will allow users to take full advantage of the new technology by allowing them to work at a level abstracted away from the platform specifics, encouraging the use of modern computing systems, including government-funded supercomputer facilities.

  20. A highly scalable particle tracking algorithm using partitioned global address space (PGAS) programming for extreme-scale turbulence simulations

    NASA Astrophysics Data System (ADS)

    Buaria, D.; Yeung, P. K.

    2017-12-01

    A new parallel algorithm utilizing a partitioned global address space (PGAS) programming model to achieve high scalability is reported for particle tracking in direct numerical simulations of turbulent fluid flow. The work is motivated by the desire to obtain Lagrangian information necessary for the study of turbulent dispersion at the largest problem sizes feasible on current and next-generation multi-petaflop supercomputers. A large population of fluid particles is distributed among parallel processes dynamically, based on instantaneous particle positions such that all of the interpolation information needed for each particle is available either locally on its host process or on neighboring processes holding adjacent sub-domains of the velocity field. With cubic splines as the preferred interpolation method, the new algorithm is designed to minimize the need for communication, by transferring between adjacent processes only those spline coefficients determined to be necessary for specific particles. This transfer is implemented very efficiently as a one-sided communication, using Co-Array Fortran (CAF) features which facilitate small data movements between different local partitions of a large global array. The cost of monitoring transfer of particle properties between adjacent processes for particles migrating across sub-domain boundaries is found to be small. Detailed benchmarks are obtained on the Cray petascale supercomputer Blue Waters at the University of Illinois, Urbana-Champaign. For operations on the particles in an 8192^3 simulation (0.55 trillion grid points) on 262,144 Cray XE6 cores, the new algorithm is found to be orders of magnitude faster relative to a prior algorithm in which each particle is tracked by the same parallel process at all times. This large speedup reduces the additional cost of tracking of order 300 million particles to just over 50% of the cost of computing the Eulerian velocity field at this scale. Improving support for PGAS models in major compilers suggests that this algorithm will be of wider applicability on most upcoming supercomputers.
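
    The one-sided transfer of spline coefficients can be sketched with MPI RMA standing in for the Co-Array Fortran features, as below: each rank exposes its coefficient block in a window and a neighbor fetches only the slab it needs; the sizes, layout, and displacement are invented for the example.

```python
# Hedged mpi4py sketch of the one-sided pattern described above: each rank
# exposes its local spline-coefficient block in an RMA window, and a rank that
# needs coefficients held by a neighbor fetches just that slab with a one-sided
# Get (MPI RMA standing in for Co-Array Fortran one-sided reads; sizes and
# layout are invented).
import numpy as np
from mpi4py import MPI

comm, rank = MPI.COMM_WORLD, MPI.COMM_WORLD.rank
coeffs_per_rank = 1024

# Local coefficient block, exposed for remote access (displacement unit = one double).
local_coeffs = np.full(coeffs_per_rank, float(rank))
win = MPI.Win.Create(local_coeffs, disp_unit=local_coeffs.itemsize, comm=comm)

# Fetch a small slab of coefficients from the neighboring rank without any
# matching call on the target side.
neighbor = (rank + 1) % comm.size
slab = np.empty(16, dtype=np.float64)
win.Lock(neighbor, lock_type=MPI.LOCK_SHARED)
win.Get([slab, MPI.DOUBLE], neighbor, target=(100, slab.size, MPI.DOUBLE))
win.Unlock(neighbor)

print(f"rank {rank} fetched coefficients from rank {neighbor}: {slab[:3]}")
win.Free()
```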

  1. Remote visualization and scale analysis of large turbulence datasets

    NASA Astrophysics Data System (ADS)

    Livescu, D.; Pulido, J.; Burns, R.; Canada, C.; Ahrens, J.; Hamann, B.

    2015-12-01

    Accurate simulations of turbulent flows require solving all the dynamically relevant scales of motions. This technique, called Direct Numerical Simulation, has been successfully applied to a variety of simple flows; however, the large-scale flows encountered in Geophysical Fluid Dynamics (GFD) would require meshes outside the range of the most powerful supercomputers for the foreseeable future. Nevertheless, the current generation of petascale computers has enabled unprecedented simulations of many types of turbulent flows which focus on various GFD aspects, from the idealized configurations extensively studied in the past to more complex flows closer to the practical applications. The pace at which such simulations are performed only continues to increase; however, the simulations themselves are restricted to a small number of groups with access to large computational platforms. Yet the petabytes of turbulence data offer almost limitless information on many different aspects of the flow, from the hierarchy of turbulence moments, spectra and correlations, to structure-functions, geometrical properties, etc. The ability to share such datasets with other groups can significantly reduce the time to analyze the data, help the creative process and increase the pace of discovery. Using the largest DOE supercomputing platforms, we have performed some of the biggest turbulence simulations to date, in various configurations, addressing specific aspects of turbulence production and mixing mechanisms. Until recently, the visualization and analysis of such datasets was restricted by access to large supercomputers. The public Johns Hopkins Turbulence database simplifies the access to multi-Terabyte turbulence datasets and facilitates turbulence analysis through the use of commodity hardware. First, one of our datasets, which is part of the database, will be described and then a framework that adds high-speed visualization and wavelet support for multi-resolution analysis of turbulence will be highlighted. The addition of wavelet support reduces the latency and bandwidth requirements for visualization, allowing for many concurrent users, and enables new types of analyses, including scale decomposition and coherent feature extraction.
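
    The wavelet-based multi-resolution access can be illustrated with PyWavelets, as in the sketch below: a 3D field is decomposed, the finest-scale details are dropped, and a coarse version is reconstructed, mimicking reduced-bandwidth remote visualization; the synthetic field and wavelet choice are assumptions, not the database's implementation.

```python
# Illustrative sketch (not the Johns Hopkins database implementation): a
# multilevel 3D wavelet decomposition of a velocity component with PyWavelets,
# dropping the finest-scale details to mimic reduced-bandwidth, multi-resolution
# access. The field and sizes are synthetic.
import numpy as np
import pywt

rng = np.random.default_rng(3)
u = rng.standard_normal((64, 64, 64))              # stand-in velocity component

coeffs = pywt.wavedecn(u, wavelet="db2", level=3)  # [approx, {lvl3 details}, ..., {lvl1 details}]

# Zero out the finest level of detail coefficients before reconstruction.
coeffs[-1] = {key: np.zeros_like(val) for key, val in coeffs[-1].items()}
u_coarse = pywt.waverecn(coeffs, wavelet="db2")

kept = sum(np.count_nonzero(c) for c in
           [coeffs[0]] + [v for d in coeffs[1:] for v in d.values()])
print("nonzero coefficients kept:", kept, "of", u.size)
print("relative L2 error:",
      np.linalg.norm(u_coarse[:64, :64, :64] - u) / np.linalg.norm(u))
```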

  2. An Analysis of Performance Enhancement Techniques for Overset Grid Applications

    NASA Technical Reports Server (NTRS)

    Djomehri, J. J.; Biswas, R.; Potsdam, M.; Strawn, R. C.; Biegel, Bryan (Technical Monitor)

    2002-01-01

    The overset grid methodology has significantly reduced time-to-solution of high-fidelity computational fluid dynamics (CFD) simulations about complex aerospace configurations. The solution process resolves the geometrical complexity of the problem domain by using separately generated but overlapping structured discretization grids that periodically exchange information through interpolation. However, high performance computations of such large-scale realistic applications must be handled efficiently on state-of-the-art parallel supercomputers. This paper analyzes the effects of various performance enhancement techniques on the parallel efficiency of an overset grid Navier-Stokes CFD application running on an SGI Origin2000 machine. Specifically, the role of asynchronous communication, grid splitting, and grid grouping strategies are presented and discussed. Results indicate that performance depends critically on the level of latency hiding and the quality of load balancing across the processors.

  3. Study of the TRAC Airfoil Table Computational System

    NASA Technical Reports Server (NTRS)

    Hu, Hong

    1999-01-01

    The report documents the study of the application of the TRAC airfoil table computational package (TRACFOIL) to the prediction of 2D airfoil force and moment data over a wide range of angle of attack and Mach number. The TRACFOIL generates the standard C-81 airfoil table for input into rotorcraft comprehensive codes such as CAMRAD. The existing TRACFOIL computer package is successfully modified to run on Digital alpha workstations and on Cray-C90 supercomputers. A step-by-step instruction for using the package on both computer platforms is provided. Application of the newer version of TRACFOIL is made for two airfoil sections. The C-81 data obtained using the TRACFOIL method are compared with those of wind-tunnel data and results are presented.

  4. RIACS

    NASA Technical Reports Server (NTRS)

    Oliger, Joseph

    1997-01-01

    Topics considered include: high-performance computing; cognitive and perceptual prostheses (computational aids designed to leverage human abilities); autonomous systems. Also included: development of a 3D unstructured grid code based on a finite volume formulation and applied to the Navier-stokes equations; Cartesian grid methods for complex geometry; multigrid methods for solving elliptic problems on unstructured grids; algebraic non-overlapping domain decomposition methods for compressible fluid flow problems on unstructured meshes; numerical methods for the compressible navier-stokes equations with application to aerodynamic flows; research in aerodynamic shape optimization; S-HARP: a parallel dynamic spectral partitioner; numerical schemes for the Hamilton-Jacobi and level set equations on triangulated domains; application of high-order shock capturing schemes to direct simulation of turbulence; multicast technology; network testbeds; supercomputer consolidation project.

  5. Investigation of Carbohydrate Recognition via Computer Simulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Johnson, Quentin R.; Lindsay, Richard J.; Petridis, Loukas

    Carbohydrate recognition by proteins, such as lectins and other (bio)molecules, can be essential for many biological functions. Interest has arisen due to potential protein and drug design and future bioengineering applications. A quantitative measurement of carbohydrate-protein interaction is thus important for the full characterization of sugar recognition. Here, we focus on the aspect of utilizing computer simulations and biophysical models to evaluate the strength and specificity of carbohydrate recognition in this review. With increasing computational resources, better algorithms and refined modeling parameters, using state-of-the-art supercomputers to calculate the strength of the interaction between molecules has become increasingly mainstream. We review the current state of this technique and its successful applications for studying protein-sugar interactions in recent years.

  6. Investigation of Carbohydrate Recognition via Computer Simulation

    DOE PAGES

    Johnson, Quentin R.; Lindsay, Richard J.; Petridis, Loukas; ...

    2015-04-28

    Carbohydrate recognition by proteins, such as lectins and other (bio)molecules, can be essential for many biological functions. Interest has arisen due to potential protein and drug design and future bioengineering applications. A quantitative measurement of carbohydrate-protein interaction is thus important for the full characterization of sugar recognition. Here, we focus on the aspect of utilizing computer simulations and biophysical models to evaluate the strength and specificity of carbohydrate recognition in this review. With increasing computational resources, better algorithms and refined modeling parameters, using state-of-the-art supercomputers to calculate the strength of the interaction between molecules has become increasingly mainstream. We review the current state of this technique and its successful applications for studying protein-sugar interactions in recent years.

  7. A Performance Evaluation of the Cray X1 for Scientific Applications

    NASA Technical Reports Server (NTRS)

    Oliker, Leonid; Biswas, Rupak; Borrill, Julian; Canning, Andrew; Carter, Jonathan; Djomehri, M. Jahed; Shan, Hongzhang; Skinner, David

    2004-01-01

    The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to build high-end capability and capacity computers because of their generality, scalability, and cost effectiveness. However, the recent development of massively parallel vector systems is having a significant effect on the supercomputing landscape. In this paper, we compare the performance of the recently released Cray X1 vector system with that of the cacheless NEC SX-6 vector machine, and the superscalar cache-based IBM Power3 and Power4 architectures for scientific applications. Overall results demonstrate that the X1 is quite promising, but performance improvements are expected as the hardware, systems software, and numerical libraries mature. Code reengineering to effectively utilize the complex architecture may also lead to significant efficiency enhancements.

  8. Supercomputing with TOUGH2 family codes for coupled multi-physics simulations of geologic carbon sequestration

    NASA Astrophysics Data System (ADS)

    Yamamoto, H.; Nakajima, K.; Zhang, K.; Nanai, S.

    2015-12-01

    Powerful numerical codes that are capable of modeling complex coupled processes of physics and chemistry have been developed for predicting the fate of CO2 in reservoirs as well as its potential impacts on groundwater and subsurface environments. However, they are often computationally demanding for solving highly non-linear models at sufficient spatial and temporal resolution. Geological heterogeneity and uncertainties further increase the challenges in modeling work. Two-phase flow simulations in heterogeneous media usually require much longer computational times than those in homogeneous media, and uncertainties in reservoir properties may necessitate stochastic simulations with multiple realizations. Recently, massively parallel supercomputers with many thousands of processors have become available to the scientific and engineering communities. Such supercomputers may attract attention from geoscientists and reservoir engineers for solving large, non-linear models at higher resolution within a reasonable time. To make them a useful tool, however, it is essential to tackle several practical obstacles so that general-purpose reservoir simulators can use large numbers of processors effectively. We have implemented massively parallel versions of two TOUGH2 family codes (the multi-phase flow simulator TOUGH2 and the chemically reactive transport simulator TOUGHREACT) on two different types (vector and scalar) of supercomputers with a thousand to tens of thousands of processors. After completing the implementation and extensive tuning on the supercomputers, computational performance was measured for three simulations with multi-million-cell grid models, including a simulation of the dissolution-diffusion-convection process that requires high spatial and temporal resolution to capture the growth of small convective fingers of CO2-dissolved water up to reservoir scale. The measurements confirmed that both simulators exhibit excellent scalability, showing almost linear speedup with the number of processors up to more than ten thousand cores. This generally allows coupled multi-physics (THC) simulations on high-resolution geologic models with multi-million-cell grids to be performed in a practical time (e.g., less than a second per time step).
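
    A minimal sketch of the strong-scaling bookkeeping behind such a speedup claim is shown below; the core counts and wall-clock times are invented for illustration, not measurements from TOUGH2 or TOUGHREACT.

```python
# Strong-scaling speedup and parallel efficiency relative to the smallest run.
# All numbers here are hypothetical placeholders.
core_counts = [128, 1024, 10240]          # processors used in each run
wall_times  = [8000.0, 1050.0, 115.0]     # seconds per run (invented)

base_p, base_t = core_counts[0], wall_times[0]
for p, t in zip(core_counts, wall_times):
    speedup = base_t / t                  # how much faster than the base run
    efficiency = speedup * base_p / p     # 1.0 would be ideal linear scaling
    print(f"{p:6d} cores: speedup x{speedup:6.1f}, parallel efficiency {efficiency:5.2f}")
```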

  9. Requirements for a network storage service

    NASA Technical Reports Server (NTRS)

    Kelly, Suzanne M.; Haynes, Rena A.

    1991-01-01

    Sandia National Laboratories provides a high performance classified computer network as a core capability in support of its mission of nuclear weapons design and engineering, physical sciences research, and energy research and development. The network, locally known as the Internal Secure Network (ISN), comprises multiple distributed local area networks (LAN's) residing in New Mexico and California. The TCP/IP protocol suite is used for inter-node communications. Scientific workstations and mid-range computers, running UNIX-based operating systems, compose most LAN's. One LAN, operated by the Sandia Corporate Computing Directorate, is a general purpose resource providing a supercomputer and a file server to the entire ISN. The current file server on the supercomputer LAN is an implementation of the Common File Server (CFS). Subsequent to the design of the ISN, Sandia reviewed its mass storage requirements and chose to enter into a competitive procurement to replace the existing file server with one more adaptable to a UNIX/TCP/IP environment. The requirements study for the network was the starting point for the requirements study for the new file server. The file server is called the Network Storage Service (NSS) and its requirements are described. An application or functional description of the NSS is given. The final section adds performance, capacity, and access constraints to the requirements.

  10. Homemade Buckeye-Pi: A Learning Many-Node Platform for High-Performance Parallel Computing

    NASA Astrophysics Data System (ADS)

    Amooie, M. A.; Moortgat, J.

    2017-12-01

    We report on the "Buckeye-Pi" cluster, a supercomputer developed in The Ohio State University School of Earth Sciences from 128 inexpensive Raspberry Pi (RPi) 3 Model B single-board computers. Each RPi is equipped with a quad-core 1.2 GHz ARMv8 64-bit processor, 1 GB of RAM, and a 32 GB microSD card for local storage. The cluster therefore has a total of 128 GB of RAM distributed across the individual nodes, a flash capacity of 4 TB, and 512 processor cores, while benefiting from low power consumption, easy portability, and low total cost. The cluster uses the Message Passing Interface protocol to manage the communication between nodes. These features make our platform the most powerful RPi supercomputer to date and suitable for educational applications in high-performance computing (HPC) and the handling of large datasets. In particular, we use the Buckeye-Pi to implement optimized parallel codes in our in-house simulator for subsurface media flows with the goal of achieving a massively parallelized, scalable code. We present benchmarking results for the computational performance across various numbers of RPi nodes. We believe our project could inspire scientists and students to consider the proposed unconventional cluster architecture as a mainstream and feasible learning platform for challenging engineering and scientific problems.
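
    The message-passing pattern such a cluster relies on can be illustrated with a short mpi4py program; mpi4py on every node and the script name are assumptions, and the sketch is generic rather than the authors' simulator code.

```python
# Minimal MPI sketch: each rank works on its own data, partial results are
# combined with a reduction. Run with e.g.:  mpiexec -n 512 python reduce_demo.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

local = np.random.rand(1_000_000)                 # each node owns its own slice
local_sum = np.array([local.sum()])

total = np.zeros(1)
comm.Reduce(local_sum, total, op=MPI.SUM, root=0)  # combine partial results at rank 0
if rank == 0:
    print(f"global sum over {size} ranks: {total[0]:.3f}")
```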

  11. HEP Computing Tools, Grid and Supercomputers for Genome Sequencing Studies

    NASA Astrophysics Data System (ADS)

    De, K.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Novikov, A.; Poyda, A.; Tertychnyy, I.; Wenaus, T.

    2017-10-01

    PanDA - the Production and Distributed Analysis workload management system - has been developed to address the data processing and analysis challenges of the ATLAS experiment at the LHC. Recently PanDA has been extended to run HEP scientific applications on Leadership Class Facilities and supercomputers. The success of the projects using PanDA beyond HEP and the Grid has drawn attention from other compute-intensive sciences such as bioinformatics. Recent advances in Next Generation Genome Sequencing (NGS) technology have led to increasing streams of sequencing data that need to be processed, analysed and made available for bioinformaticians worldwide. Analysis of genome sequencing data with the popular software pipeline PALEOMIX can take a month even on a powerful computing resource. In this paper we describe the adaptation of the PALEOMIX pipeline to run in a distributed computing environment powered by PanDA. To run the pipeline we split the input files into chunks that are processed separately on different nodes as independent PALEOMIX inputs and then merge the output files, much as ATLAS does for its data processing and simulation. We dramatically decreased the total wall time through automated job (re)submission and brokering within PanDA. Using software tools initially developed for HEP and the Grid reduces payload execution time for mammoth DNA samples from weeks to days.
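
    The split-and-merge idea described above can be sketched as follows; the chunk size, file names, and FASTQ framing are assumptions for illustration and not part of the published workflow.

```python
# Hedged sketch of split-run-merge: input reads are divided into chunks, each
# chunk becomes an independent job payload, and per-chunk outputs are
# concatenated afterwards. File names and chunk size are hypothetical.
from pathlib import Path

def split_into_chunks(fastq_path, lines_per_chunk=4_000_000):
    """Write fixed-size chunks of a FASTQ file (4 lines per read, so the
    chunk size is kept a multiple of 4) and return the chunk paths."""
    chunk_paths, buffer, idx = [], [], 0
    with open(fastq_path) as fh:
        for line in fh:
            buffer.append(line)
            if len(buffer) == lines_per_chunk:
                out = Path(f"{fastq_path}.chunk{idx:04d}")
                out.write_text("".join(buffer))
                chunk_paths.append(out)
                buffer, idx = [], idx + 1
    if buffer:                                    # flush the final partial chunk
        out = Path(f"{fastq_path}.chunk{idx:04d}")
        out.write_text("".join(buffer))
        chunk_paths.append(out)
    return chunk_paths

def merge_outputs(chunk_outputs, merged_path):
    """Concatenate per-chunk results into a single output file."""
    with open(merged_path, "w") as out:
        for part in chunk_outputs:
            out.write(Path(part).read_text())
```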

  12. Earth and environmental science in the 1980's: Part 1: Environmental data systems, supercomputer facilities and networks

    NASA Technical Reports Server (NTRS)

    1986-01-01

    Overview descriptions of on-line environmental data systems, supercomputer facilities, and networks are presented. Each description addresses the concepts of content, capability, and user access relevant to the point of view of potential utilization by the Earth and environmental science community. The information on similar systems or facilities is presented in parallel fashion to encourage and facilitate intercomparison. In addition, summary sheets are given for each description, and a summary table precedes each section.

  13. A Long History of Supercomputing

    ScienceCinema

    Grider, Gary

    2018-06-13

    As part of its national security science mission, Los Alamos National Laboratory and HPC have a long, entwined history dating back to the earliest days of computing. From bringing the first problem to the nation’s first computer to building the first machine to break the petaflop barrier, Los Alamos holds many “firsts” in HPC breakthroughs. Today, supercomputers are integral to stockpile stewardship and the Laboratory continues to work with vendors in developing the future of HPC.

  14. LightForce Photon-Pressure Collision Avoidance: Updated Efficiency Analysis Utilizing a Highly Parallel Simulation Approach

    DTIC Science & Technology

    2014-09-01

    simulation time frame from 30 days to one year. This was enabled by porting the simulation to the Pleiades supercomputer at NASA Ames Research Center, a...including the motivation for changes to our past approach. We then present the software implementation (3) on the NASA Ames Pleiades supercomputer...significantly updated since last year’s paper [25]. The main incentive for that was the shift to a highly parallel approach in order to utilize the Pleiades

  15. Parallel-Vector Algorithm For Rapid Structural Analysis

    NASA Technical Reports Server (NTRS)

    Agarwal, Tarun R.; Nguyen, Duc T.; Storaasli, Olaf O.

    1993-01-01

    New algorithm developed to overcome deficiency of skyline storage scheme by use of variable-band storage scheme. Exploits both parallel and vector capabilities of modern high-performance computers. Gives engineers and designers opportunity to include more design variables and constraints during optimization of structures. Enables use of more refined finite-element meshes to obtain improved understanding of complex behaviors of aerospace structures, leading to better, safer designs. Attractive not only for current supercomputers but also for the next generation of shared-memory supercomputers.
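
    A minimal sketch of what a variable-band (profile) storage layout looks like is given below; it is a generic illustration of the idea, not the algorithm from the tech brief, and the matrix is a toy example.

```python
# Illustrative variable-band storage of a symmetric matrix: each row keeps only
# the entries from its first nonzero column up to the diagonal, so rows may
# have different bandwidths (unlike a fixed-band or full skyline layout).
import numpy as np

def to_variable_band(A, tol=0.0):
    """Pack the lower triangle of symmetric A row by row, variable bandwidth."""
    n = A.shape[0]
    rows, first_cols = [], []
    for i in range(n):
        nz = np.nonzero(np.abs(A[i, :i + 1]) > tol)[0]
        j0 = int(nz[0]) if nz.size else i        # first stored column of row i
        first_cols.append(j0)
        rows.append(A[i, j0:i + 1].copy())       # contiguous band segment
    return rows, first_cols

A = np.array([[4., 1., 0., 0.],
              [1., 5., 2., 0.],
              [0., 2., 6., 0.],
              [0., 0., 0., 7.]])
bands, firsts = to_variable_band(A)
print(firsts)                                    # per-row first stored column indices
print([b.tolist() for b in bands])               # the stored band segments
```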

  16. Science and Technology Review June 2000

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    de Pruneda, J.H.

    2000-06-01

    This issue contains the following articles: (1) ''Accelerating on the ASCI Challenge''. (2) ''New Day Dawns in Supercomputing'': When the ASCI White supercomputer comes online this summer, DOE's Stockpile Stewardship Program will make another significant advance toward helping to ensure the safety, reliability, and performance of the nation's nuclear weapons. (3) ''Uncovering the Secrets of Actinides'': Researchers are obtaining fundamental information about the actinides, a group of elements with a key role in nuclear weapons and fuels. (4) ''A Predictable Structure for Aerogels''. (5) ''Tibet--Where Continents Collide''.

  17. PerSEUS: Ultra-Low-Power High Performance Computing for Plasma Simulations

    NASA Astrophysics Data System (ADS)

    Doxas, I.; Andreou, A.; Lyon, J.; Angelopoulos, V.; Lu, S.; Pritchett, P. L.

    2017-12-01

    The Peta-op SupErcomputing Unconventional System (PerSEUS) project aims to explore the use of ultra-low-power, mixed-signal unconventional computational elements developed by Johns Hopkins University (JHU) for High Performance Scientific Computing (HPC), and to demonstrate that capability on both fluid and particle plasma codes. We will describe the JHU Mixed-signal Unconventional Supercomputing Elements (MUSE) and report initial results for the Lyon-Fedder-Mobarry (LFM) global magnetospheric MHD code and a UCLA general-purpose relativistic Particle-In-Cell (PIC) code.

  18. Heart Fibrillation and Parallel Supercomputers

    NASA Technical Reports Server (NTRS)

    Kogan, B. Y.; Karplus, W. J.; Chudin, E. E.

    1997-01-01

    The Luo and Rudy 3 cardiac cell mathematical model is implemented on the parallel supercomputer CRAY T3D. The splitting algorithm, combined with a variable time step and an explicit method of integration, provides reasonable solution times and almost perfect scaling for rectilinear wave propagation. The computer simulation makes it possible to observe new phenomena: the break-up of spiral waves caused by intracellular calcium dynamics, and the non-uniformity of the calcium distribution in space during the onset of the spiral wave.
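
    The splitting-with-variable-time-step idea can be sketched on a much simpler 1D cable problem; the FitzHugh-Nagumo-style kinetics below merely stand in for the far richer Luo-Rudy membrane model, and every constant is illustrative.

```python
# Reduced sketch of operator splitting with a variable time step on a 1D cable:
# step 1 advances the (stand-in) membrane kinetics, step 2 advances diffusion,
# and the step size shrinks where the kinetics are fast. All values are toy.
import numpy as np

n, dx, D = 200, 0.1, 1.0
v = np.zeros(n); w = np.zeros(n)
v[:10] = 1.0                                     # stimulate one end of the cable

t, t_end = 0.0, 50.0
while t < t_end:
    dvdt_react = v * (v - 0.1) * (1.0 - v) - w   # reaction (membrane) term
    dwdt = 0.01 * (v - 0.5 * w)
    # variable time step: bounded by diffusion stability and the fastest kinetics
    dt = min(0.25 * dx * dx / D, 0.1 / (np.max(np.abs(dvdt_react)) + 1e-9))
    # step 1: explicit reaction update
    v += dt * dvdt_react
    w += dt * dwdt
    # step 2: explicit diffusion update with no-flux ends
    lap = np.zeros(n)
    lap[1:-1] = (v[2:] - 2 * v[1:-1] + v[:-2]) / (dx * dx)
    lap[0], lap[-1] = (v[1] - v[0]) / (dx * dx), (v[-2] - v[-1]) / (dx * dx)
    v += dt * D * lap
    t += dt
```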

  19. AIC Computations Using Navier-Stokes Equations on Single Image Supercomputers For Design Optimization

    NASA Technical Reports Server (NTRS)

    Guruswamy, Guru

    2004-01-01

    A procedure to accurately generate AICs using a Navier-Stokes solver, including grid deformation, is presented. Preliminary results show good agreement between experimental and computed flutter boundaries for a rectangular wing. A full wing-body configuration of an orbital space plane is selected for demonstration on a large number of processors. In the final paper the AICs of the full wing-body configuration will be computed, and the scalability of the procedure on supercomputers will be demonstrated.

  20. Discover Supercomputer 5

    NASA Image and Video Library

    2017-12-08

    Two rows of the “Discover” supercomputer at the NASA Center for Climate Simulation (NCCS) contain more than 4,000 computer processors. Discover has a total of nearly 15,000 processors. Credit: NASA/Pat Izzo. To learn more about NCCS, go to: www.nasa.gov/topics/earth/features/climate-sim-center.html. NASA Goddard Space Flight Center is home to the nation's largest organization of combined scientists, engineers and technologists that build spacecraft, instruments and new technology to study the Earth, the sun, our solar system, and the universe.

  1. Discover Supercomputer 4

    NASA Image and Video Library

    2017-12-08

    This close-up view highlights one row—approximately 2,000 computer processors—of the “Discover” supercomputer at the NASA Center for Climate Simulation (NCCS). Discover has a total of nearly 15,000 processors. Credit: NASA/Pat Izzo. To learn more about NCCS, go to: www.nasa.gov/topics/earth/features/climate-sim-center.html. NASA Goddard Space Flight Center is home to the nation's largest organization of combined scientists, engineers and technologists that build spacecraft, instruments and new technology to study the Earth, the sun, our solar system, and the universe.

  2. Extreme Scale Plasma Turbulence Simulations on Top Supercomputers Worldwide

    DOE PAGES

    Tang, William; Wang, Bei; Ethier, Stephane; ...

    2016-11-01

    The goal of the extreme scale plasma turbulence studies described in this paper is to expedite the delivery of reliable predictions on confinement physics in large magnetic fusion systems by using world-class supercomputers to carry out simulations with unprecedented resolution and temporal duration. This has involved architecture-dependent optimizations of performance scaling and addressing code portability and energy issues, with the metrics for multi-platform comparisons being 'time-to-solution' and 'energy-to-solution'. Realistic results addressing how confinement losses caused by plasma turbulence scale from present-day devices to the much larger $25 billion international ITER fusion facility have been enabled by innovative advances in the GTC-P code including (i) implementation of one-sided communication from the MPI 3.0 standard; (ii) creative optimization techniques on Xeon Phi processors; and (iii) development of a novel performance model for the key kernels of the PIC code. Our results show that modeling data movement is sufficient to predict performance on modern supercomputer platforms.

  3. Scheduling for Parallel Supercomputing: A Historical Perspective of Achievable Utilization

    NASA Technical Reports Server (NTRS)

    Jones, James Patton; Nitzberg, Bill

    1999-01-01

    The NAS facility has operated parallel supercomputers for the past 11 years, including the Intel iPSC/860, Intel Paragon, Thinking Machines CM-5, IBM SP-2, and Cray Origin 2000. Across this wide variety of machine architectures, across a span of 10 years, across a large number of different users, and through thousands of minor configuration and policy changes, the utilization of these machines shows three general trends: (1) scheduling using a naive FIFO first-fit policy results in 40-60% utilization, (2) switching to the more sophisticated dynamic backfilling scheduling algorithm improves utilization by about 15 percentage points (yielding about 70% utilization), and (3) reducing the maximum allowable job size further increases utilization. Most surprising is the consistency of these trends. Over the lifetime of the NAS parallel systems, we made hundreds, perhaps thousands, of small changes to hardware, software, and policy, yet, utilization was affected little. In particular these results show that the goal of achieving near 100% utilization while supporting a real parallel supercomputing workload is unrealistic.
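
    The utilization figures quoted above come down to simple node-hour bookkeeping; a toy version of that calculation, with invented job records, is sketched below.

```python
# Machine utilization from a job schedule: node-hours actually delivered to
# jobs divided by node-hours available over the busy window. Job data invented.
def utilization(jobs, total_nodes):
    used = sum((end - start) * nodes for start, end, nodes in jobs)
    span = max(end for _, end, _ in jobs) - min(start for start, _, _ in jobs)
    return used / (span * total_nodes)

jobs = [(0, 4, 64), (0, 2, 32), (2, 6, 96), (5, 8, 128)]   # (start h, end h, nodes)
print(f"utilization: {utilization(jobs, total_nodes=256):.0%}")
```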

  4. Data communication requirements for the advanced NAS network

    NASA Technical Reports Server (NTRS)

    Levin, Eugene; Eaton, C. K.; Young, Bruce

    1986-01-01

    The goal of the Numerical Aerodynamic Simulation (NAS) Program is to provide a powerful computational environment for advanced research and development in aeronautics and related disciplines. The present NAS system consists of a Cray 2 supercomputer connected by a data network to a large mass storage system, to sophisticated local graphics workstations, and by remote communications to researchers throughout the United States. The program plan is to continue acquiring the most powerful supercomputers as they become available. In the 1987/1988 time period it is anticipated that a computer with 4 times the processing speed of a Cray 2 will be obtained and by 1990 an additional supercomputer with 16 times the speed of the Cray 2. The implications of this 20-fold increase in processing power on the data communications requirements are described. The analysis was based on models of the projected workload and system architecture. The results are presented together with the estimates of their sensitivity to assumptions inherent in the models.

  5. Improved cyberinfrastructure for integrated hydrometeorological predictions within the fully-coupled WRF-Hydro modeling system

    NASA Astrophysics Data System (ADS)

    Gochis, David; Hooper, Rick; Parodi, Antonio; Jha, Shantenu; Yu, Wei; Zaslavsky, Ilya; Ganapati, Dinesh

    2014-05-01

    The community WRF-Hydro system is currently being used in a variety of flood prediction and regional hydroclimate impacts assessment applications around the world. Despite its increasingly wide use, certain cyberinfrastructure bottlenecks exist in the setup, execution and post-processing of WRF-Hydro model runs. These bottlenecks result in wasted time, labor, data transfer bandwidth and computational resources. Appropriate development and use of cyberinfrastructure to set up and manage WRF-Hydro modeling applications will streamline the entire workflow of hydrologic model predictions. This talk will present recent advances in the development and use of new open-source cyberinfrastructure tools for the WRF-Hydro architecture. These tools include new web-accessible pre-processing applications, supercomputer job management applications and automated verification and visualization applications. The tools will be described in turn and then demonstrated in a set of flash flood use cases for recent destructive flood events in the U.S. and in Europe. Throughout, emphasis is placed on the implementation and use of community data standards for data exchange.

  6. Radio Astronomy at the Centre for High Performance Computing in South Africa

    NASA Astrophysics Data System (ADS)

    Catherine Cress; UWC Simulation Team

    2014-04-01

    I will present results on galaxy evolution and cosmology which we obtained using the supercomputing facilities at the CHPC. These include cosmological-scale N-body simulations modelling neutral hydrogen as well as the study of the clustering of radio galaxies to probe the relationship between dark and luminous matter in the universe. I will also discuss the various roles that the CHPC is playing in Astronomy in SA, including the provision of HPC for a variety of Astronomical applications, the provision of storage for radio data, our educational programs and our participation in planning for the SKA.

  7. Simulation of High-Resolution Magnetic Resonance Images on the IBM Blue Gene/L Supercomputer Using SIMRI

    DOE PAGES

    Baum, K. G.; Menezes, G.; Helguera, M.

    2011-01-01

    Medical imaging system simulators are tools that provide a means to evaluate system architecture and create artificial image sets that are appropriate for specific applications. We have modified SIMRI, a Bloch equation-based magnetic resonance image simulator, in order to successfully generate high-resolution 3D MR images of the Montreal brain phantom using Blue Gene/L systems. Results show that redistribution of the workload allows an anatomically accurate 256³ voxel spin-echo simulation in less than 5 hours when executed on an 8192-node partition of a Blue Gene/L system.
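
    The per-voxel kernel such a Bloch-equation simulator evaluates can be illustrated in a few lines; the relaxation times and off-resonance value below are placeholders, and this is a generic illustration rather than SIMRI's implementation.

```python
# Minimal Bloch-equation kernel: free precession about B0 plus T1/T2 relaxation
# of one voxel's magnetization vector. Tissue parameters are placeholders.
import numpy as np

def bloch_step(M, dt, df_hz, T1, T2, M0=1.0):
    """Advance magnetization M=[Mx,My,Mz] by dt with off-resonance df_hz."""
    phi = 2 * np.pi * df_hz * dt                 # precession angle about z
    c, s = np.cos(phi), np.sin(phi)
    Mx = c * M[0] + s * M[1]
    My = -s * M[0] + c * M[1]
    e1, e2 = np.exp(-dt / T1), np.exp(-dt / T2)  # relaxation factors
    return np.array([Mx * e2, My * e2, M[2] * e1 + M0 * (1 - e1)])

M = np.array([1.0, 0.0, 0.0])                    # transverse, e.g. after a 90-degree pulse
for _ in range(1000):
    M = bloch_step(M, dt=1e-3, df_hz=50.0, T1=0.9, T2=0.1)
print(M)                                         # decayed transverse, recovering Mz
```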

  8. A visualization environment for supercomputing-based applications in computational mechanics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pavlakos, C.J.; Schoof, L.A.; Mareda, J.F.

    1993-06-01

    In this paper, we characterize a visualization environment that has been designed and prototyped for a large community of scientists and engineers, with an emphasis on supercomputing-based computational mechanics. The proposed environment makes use of a visualization server concept to provide effective, interactive visualization to the user's desktop. Benefits of using the visualization server approach are discussed. Some thoughts regarding desirable features for visualization server hardware architectures are also addressed. A brief discussion of the software environment is included. The paper concludes by summarizing certain observations which we have made regarding the implementation of such visualization environments.

  9. Simulation of High-Resolution Magnetic Resonance Images on the IBM Blue Gene/L Supercomputer Using SIMRI.

    PubMed

    Baum, K G; Menezes, G; Helguera, M

    2011-01-01

    Medical imaging system simulators are tools that provide a means to evaluate system architecture and create artificial image sets that are appropriate for specific applications. We have modified SIMRI, a Bloch equation-based magnetic resonance image simulator, in order to successfully generate high-resolution 3D MR images of the Montreal brain phantom using Blue Gene/L systems. Results show that redistribution of the workload allows an anatomically accurate 256³ voxel spin-echo simulation in less than 5 hours when executed on an 8192-node partition of a Blue Gene/L system.

  10. A numerical code for the simulation of non-equilibrium chemically reacting flows on hybrid CPU-GPU clusters

    NASA Astrophysics Data System (ADS)

    Kudryavtsev, Alexey N.; Kashkovsky, Alexander V.; Borisov, Semyon P.; Shershnev, Anton A.

    2017-10-01

    In the present work a computer code, RCFS, for the numerical simulation of chemically reacting compressible flows on hybrid CPU/GPU supercomputers is developed. It solves the 3D unsteady Euler equations for multispecies chemically reacting flows in general curvilinear coordinates using shock-capturing TVD schemes. Time advancement is carried out using explicit Runge-Kutta TVD schemes. The program implementation uses the CUDA application programming interface to perform GPU computations. Data are distributed between GPUs via a domain decomposition technique. The developed code is verified on a number of test cases including supersonic flow over a cylinder.
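
    As a generic illustration of what a TVD scheme's slope limiting does (not the RCFS code itself), the sketch below applies the classic minmod limiter to a discrete shock profile; the data are toy values.

```python
# Minmod-limited slopes for piecewise-linear reconstruction: the limiter picks
# the smaller one-sided slope and returns zero at extrema, which suppresses
# spurious oscillations near discontinuities.
import numpy as np

def minmod(a, b):
    return np.where(a * b > 0.0, np.where(np.abs(a) < np.abs(b), a, b), 0.0)

def limited_slopes(u):
    du_left = u[1:-1] - u[:-2]                   # backward differences
    du_right = u[2:] - u[1:-1]                   # forward differences
    return minmod(du_left, du_right)             # one slope per interior cell

u = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])     # a discrete shock profile
print(limited_slopes(u))                         # slopes vanish next to the jump
```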

  11. The CSM testbed software system: A development environment for structural analysis methods on the NAS CRAY-2

    NASA Technical Reports Server (NTRS)

    Gillian, Ronnie E.; Lotts, Christine G.

    1988-01-01

    The Computational Structural Mechanics (CSM) Activity at Langley Research Center is developing methods for structural analysis on modern computers. To facilitate that research effort, an applications development environment has been constructed to insulate the researcher from the many computer operating systems of a widely distributed computer network. The CSM Testbed development system was ported to the Numerical Aerodynamic Simulator (NAS) Cray-2, at the Ames Research Center, to provide a high end computational capability. This paper describes the implementation experiences, the resulting capability, and the future directions for the Testbed on supercomputers.

  12. Application of computational physics within Northrop

    NASA Technical Reports Server (NTRS)

    George, M. W.; Ling, R. T.; Mangus, J. F.; Thompkins, W. T.

    1987-01-01

    An overview of Northrop programs in computational physics is presented. These programs depend on access to today's supercomputers, such as the Numerical Aerodynamical Simulator (NAS), and future growth on the continuing evolution of computational engines. The descriptions here concentrate on the following areas: computational fluid dynamics (CFD), computational electromagnetics (CEM), computer architectures, and expert systems. Current efforts and future directions in these areas are presented. The impact of advances in the CFD area is described, and parallels are drawn to analogous developments in CEM. The relationship between advances in these areas and the development of advanced (parallel) architectures and expert systems is also presented.

  13. A History of High-Performance Computing

    NASA Technical Reports Server (NTRS)

    2006-01-01

    Faster than most speedy computers. More powerful than its NASA data-processing predecessors. Able to leap large, mission-related computational problems in a single bound. Clearly, it's neither a bird nor a plane, nor does it need to don a red cape, because it's super in its own way. It's Columbia, NASA's newest supercomputer and one of the world's most powerful production/processing units. Named Columbia to honor the STS-107 Space Shuttle Columbia crewmembers, the new supercomputer is making it possible for NASA to achieve breakthroughs in science and engineering, fulfilling the Agency's missions, and, ultimately, the Vision for Space Exploration. Shortly after being built in 2004, Columbia achieved a benchmark rating of 51.9 teraflop/s on 10,240 processors, making it the world's fastest operational computer at the time of completion. Putting this speed into perspective, 20 years ago, the most powerful computer at NASA's Ames Research Center, home of the NASA Advanced Supercomputing Division (NAS), ran at a speed of about 1 gigaflop (one billion calculations per second). The Columbia supercomputer is 50,000 times faster than this computer and offers a tenfold increase in capacity over the prior system housed at Ames. What's more, Columbia is considered the world's largest Linux-based, shared-memory system. The system is offering immeasurable benefits to society and is the zenith of years of NASA/private industry collaboration that has spawned new generations of commercial, high-speed computing systems.

  14. Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

    PubMed

    Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

    2011-01-01

    The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
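
    A toy version of the sorted k-mer list idea mentioned above is sketched here; the sequences, the k value, and the merge-by-dictionary shortcut are illustrative assumptions, not the progressiveMauve data structures.

```python
# Sorted k-mer lists: record every k-mer of each sequence with its position,
# sort them, and find shared seeds between two genomes. Toy sequences only.
def sorted_kmer_list(seq, k=8):
    return sorted((seq[i:i + k], i) for i in range(len(seq) - k + 1))

genome_a = "ACGTACGTGGCTAACGTACGT"
genome_b = "TTACGTACGTGGCTA"

kmers_a = dict(sorted_kmer_list(genome_a, k=8))   # k-mer -> (last) position in A
shared = [(kmer, pos_b, kmers_a[kmer])            # (k-mer, position in B, position in A)
          for kmer, pos_b in sorted_kmer_list(genome_b, k=8)
          if kmer in kmers_a]
print(shared[:3])                                 # a few shared anchor seeds
```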

  15. Integrating Visualization Applications, such as ParaView, into HEP Software Frameworks for In-situ Event Displays

    NASA Astrophysics Data System (ADS)

    Lyon, A. L.; Kowalkowski, J. B.; Jones, C. D.

    2017-10-01

    ParaView is a high performance visualization application not widely used in High Energy Physics (HEP). It is a long-standing open source project led by Kitware and involves several Department of Energy (DOE) and Department of Defense (DOD) laboratories. Furthermore, it has been adopted by many DOE supercomputing centers and other sites. ParaView is unique in speed and efficiency because it uses state-of-the-art techniques developed by the academic visualization community that are often not found in applications written by the HEP community. In-situ visualization of events, where event details are visualized during processing/analysis, is a common task for experiment software frameworks. Kitware supplies Catalyst, a library that enables scientific software to serve visualization objects to client ParaView viewers, yielding a real-time event display. Connecting ParaView to the Fermilab art framework is described and the capabilities it brings are discussed.

  16. High End Computing Technologies for Earth Science Applications: Trends, Challenges, and Innovations

    NASA Technical Reports Server (NTRS)

    Parks, John (Technical Monitor); Biswas, Rupak; Yan, Jerry C.; Brooks, Walter F.; Sterling, Thomas L.

    2003-01-01

    Earth science applications of the future will stress the capabilities of even the highest performance supercomputers in the areas of raw compute power, mass storage management, and software environments. These NASA mission critical problems demand usable multi-petaflops and exabyte-scale systems to fully realize their science goals. With an exciting vision of the technologies needed, NASA has established a comprehensive program of advanced research in computer architecture, software tools, and device technology to ensure that, in partnership with US industry, it can meet these demanding requirements with reliable, cost effective, and usable ultra-scale systems. NASA will exploit, explore, and influence emerging high end computing architectures and technologies to accelerate the next generation of engineering, operations, and discovery processes for NASA Enterprises. This article captures this vision and describes the concepts, accomplishments, and the potential payoff of the key thrusts that will help meet the computational challenges in Earth science applications.

  17. Quantum information processing with superconducting circuits: a review.

    PubMed

    Wendin, G

    2017-10-01

    During the last ten years, superconducting circuits have passed from being interesting physical devices to becoming contenders for near-future useful and scalable quantum information processing (QIP). Advanced quantum simulation experiments have been shown with up to nine qubits, while a demonstration of quantum supremacy with fifty qubits is anticipated in just a few years. Quantum supremacy means that the quantum system can no longer be simulated by the most powerful classical supercomputers. Integrated classical-quantum computing systems are already emerging that can be used for software development and experimentation, even via web interfaces. Therefore, the time is ripe for describing some of the recent development of superconducting devices, systems and applications. As such, the discussion of superconducting qubits and circuits is limited to devices that are proven useful for current or near future applications. Consequently, the centre of interest is the practical applications of QIP, such as computation and simulation in Physics and Chemistry.

  18. Quantum information processing with superconducting circuits: a review

    NASA Astrophysics Data System (ADS)

    Wendin, G.

    2017-10-01

    During the last ten years, superconducting circuits have passed from being interesting physical devices to becoming contenders for near-future useful and scalable quantum information processing (QIP). Advanced quantum simulation experiments have been shown with up to nine qubits, while a demonstration of quantum supremacy with fifty qubits is anticipated in just a few years. Quantum supremacy means that the quantum system can no longer be simulated by the most powerful classical supercomputers. Integrated classical-quantum computing systems are already emerging that can be used for software development and experimentation, even via web interfaces. Therefore, the time is ripe for describing some of the recent development of superconducting devices, systems and applications. As such, the discussion of superconducting qubits and circuits is limited to devices that are proven useful for current or near future applications. Consequently, the centre of interest is the practical applications of QIP, such as computation and simulation in Physics and Chemistry.

  19. Performance Enhancement Strategies for Multi-Block Overset Grid CFD Applications

    NASA Technical Reports Server (NTRS)

    Djomehri, M. Jahed; Biswas, Rupak

    2003-01-01

    The overset grid methodology has significantly reduced the time-to-solution of high-fidelity computational fluid dynamics (CFD) simulations about complex aerospace configurations. The solution process resolves the geometrical complexity of the problem domain by using separately generated but overlapping structured discretization grids that periodically exchange information through interpolation. However, high performance computations of such large-scale realistic applications must be handled efficiently on state-of-the-art parallel supercomputers. This paper analyzes the effects of various performance enhancement strategies on the parallel efficiency of an overset grid Navier-Stokes CFD application running on an SGI Origin2000 machine. Specifically, the roles of asynchronous communication, grid splitting, and grid grouping strategies are presented and discussed. Details of a sophisticated graph partitioning technique for grid grouping are also provided. Results indicate that performance depends critically on the level of latency hiding and the quality of load balancing across the processors.

  20. Supercomputer analysis of sedimentary basins.

    PubMed

    Bethke, C M; Altaner, S P; Harrison, W J; Upson, C

    1988-01-15

    Geological processes of fluid transport and chemical reaction in sedimentary basins have formed many of the earth's energy and mineral resources. These processes can be analyzed on natural time and distance scales with the use of supercomputers. Numerical experiments are presented that give insights to the factors controlling subsurface pressures, temperatures, and reactions; the origin of ores; and the distribution and quality of hydrocarbon reservoirs. The results show that numerical analysis combined with stratigraphic, sea level, and plate tectonic histories provides a powerful tool for studying the evolution of sedimentary basins over geologic time.

  1. Discover Supercomputer 3

    NASA Image and Video Library

    2017-12-08

    The heart of the NASA Center for Climate Simulation (NCCS) is the “Discover” supercomputer. In 2009, NCCS added more than 8,000 computer processors to Discover, for a total of nearly 15,000 processors. Credit: NASA/Pat Izzo. To learn more about NCCS, go to: www.nasa.gov/topics/earth/features/climate-sim-center.html. NASA Goddard Space Flight Center is home to the nation's largest organization of combined scientists, engineers and technologists that build spacecraft, instruments and new technology to study the Earth, the sun, our solar system, and the universe.

  2. Discover Supercomputer 2

    NASA Image and Video Library

    2017-12-08

    The heart of the NASA Center for Climate Simulation (NCCS) is the “Discover” supercomputer. In 2009, NCCS added more than 8,000 computer processors to Discover, for a total of nearly 15,000 processors. Credit: NASA/Pat Izzo. To learn more about NCCS, go to: www.nasa.gov/topics/earth/features/climate-sim-center.html. NASA Goddard Space Flight Center is home to the nation's largest organization of combined scientists, engineers and technologists that build spacecraft, instruments and new technology to study the Earth, the sun, our solar system, and the universe.

  3. Discover Supercomputer 1

    NASA Image and Video Library

    2017-12-08

    The heart of the NASA Center for Climate Simulation (NCCS) is the “Discover” supercomputer. In 2009, NCCS added more than 8,000 computer processors to Discover, for a total of nearly 15,000 processors. Credit: NASA/Pat Izzo. To learn more about NCCS, go to: www.nasa.gov/topics/earth/features/climate-sim-center.html. NASA Goddard Space Flight Center is home to the nation's largest organization of combined scientists, engineers and technologists that build spacecraft, instruments and new technology to study the Earth, the sun, our solar system, and the universe.

  4. The Navier-Stokes computer

    NASA Technical Reports Server (NTRS)

    Nosenchuck, D. M.; Littman, M. G.

    1986-01-01

    The Navier-Stokes computer (NSC) has been developed for solving problems in fluid mechanics involving complex flow simulations that require more speed and capacity than provided by current and proposed Class VI supercomputers. The machine is a parallel processing supercomputer with several new architectural elements which can be programmed to address a wide range of problems meeting the following criteria: (1) the problem is numerically intensive, and (2) the code makes use of long vectors. A simulation of two-dimensional nonsteady viscous flows is presented to illustrate the architecture, programming, and some of the capabilities of the NSC.

  5. Merging the Machines of Modern Science

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wolf, Laura; Collins, Jim

    Two recent projects have harnessed supercomputing resources at the US Department of Energy’s Argonne National Laboratory in a novel way to support major fusion science and particle collider experiments. Using leadership computing resources, one team ran fine-grid analysis of real-time data to make near-real-time adjustments to an ongoing experiment, while a second team is working to integrate Argonne’s supercomputers into the Large Hadron Collider/ATLAS workflow. Together these efforts represent a new paradigm of the high-performance computing center as a partner in experimental science.

  6. A Performance Evaluation of the Cray X1 for Scientific Applications

    NASA Technical Reports Server (NTRS)

    Oliker, Leonid; Biswas, Rupak; Borrill, Julian; Canning, Andrew; Carter, Jonathan; Djomehri, M. Jahed; Shan, Hongzhang; Skinner, David

    2003-01-01

    The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to build high-end capability and capacity computers because of their generality, scalability, and cost effectiveness. However, the recent development of massively parallel vector systems is having a significant effect on the supercomputing landscape. In this paper, we compare the performance of the recently-released Cray X1 vector system with that of the cacheless NEC SX-6 vector machine, and the superscalar cache-based IBM Power3 and Power4 architectures for scientific applications. Overall results demonstrate that the X1 is quite promising, but performance improvements are expected as the hardware, systems software, and numerical libraries mature. Code reengineering to effectively utilize the complex architecture may also lead to significant efficiency enhancements.

  7. Merlin - Massively parallel heterogeneous computing

    NASA Technical Reports Server (NTRS)

    Wittie, Larry; Maples, Creve

    1989-01-01

    Hardware and software for Merlin, a new kind of massively parallel computing system, are described. Eight computers are linked as a 300-MIPS prototype to develop system software for a larger Merlin network with 16 to 64 nodes, totaling 600 to 3000 MIPS. These working prototypes help refine a mapped reflective memory technique that offers a new, very general way of linking many types of computer to form supercomputers. Processors share data selectively and rapidly on a word-by-word basis. Fast firmware virtual circuits are reconfigured to match topological needs of individual application programs. Merlin's low-latency memory-sharing interfaces solve many problems in the design of high-performance computing systems. The Merlin prototypes are intended to run parallel programs for scientific applications and to determine hardware and software needs for a future Teraflops Merlin network.

  8. Evaluation of Job Queuing/Scheduling Software: Phase I Report

    NASA Technical Reports Server (NTRS)

    Jones, James Patton

    1996-01-01

    The recent proliferation of high performance workstations and the increased reliability of parallel systems have illustrated the need for robust job management systems to support parallel applications. To address this issue, the Numerical Aerodynamic Simulation (NAS) supercomputer facility compiled a requirements checklist for job queuing/scheduling software. Next, NAS began an evaluation of the leading job management system (JMS) software packages against the checklist. This report describes the three-phase evaluation process, and presents the results of Phase 1: Capabilities versus Requirements. We show that JMS support for running parallel applications on clusters of workstations and parallel systems is still insufficient, even in the leading JMSs. However, by ranking each JMS evaluated against the requirements, we provide data that will be useful to other sites in selecting a JMS.

  9. Towards quantum chemistry on a quantum computer.

    PubMed

    Lanyon, B P; Whitfield, J D; Gillett, G G; Goggin, M E; Almeida, M P; Kassal, I; Biamonte, J D; Mohseni, M; Powell, B J; Barbieri, M; Aspuru-Guzik, A; White, A G

    2010-02-01

    Exact first-principles calculations of molecular properties are currently intractable because their computational cost grows exponentially with both the number of atoms and basis set size. A solution is to move to a radically different model of computing by building a quantum computer, which is a device that uses quantum systems themselves to store and process data. Here we report the application of the latest photonic quantum computer technology to calculate properties of the smallest molecular system: the hydrogen molecule in a minimal basis. We calculate the complete energy spectrum to 20 bits of precision and discuss how the technique can be expanded to solve large-scale chemical problems that lie beyond the reach of modern supercomputers. These results represent an early practical step toward a powerful tool with a broad range of quantum-chemical applications.

  10. Characterizing parallel file-access patterns on a large-scale multiprocessor

    NASA Technical Reports Server (NTRS)

    Purakayastha, A.; Ellis, Carla; Kotz, David; Nieuwejaar, Nils; Best, Michael L.

    1995-01-01

    High-performance parallel file systems are needed to satisfy tremendous I/O requirements of parallel scientific applications. The design of such high-performance parallel file systems depends on a comprehensive understanding of the expected workload, but so far there have been very few usage studies of multiprocessor file systems. This paper is part of the CHARISMA project, which intends to fill this void by measuring real file-system workloads on various production parallel machines. In particular, we present results from the CM-5 at the National Center for Supercomputing Applications. Our results are unique because we collect information about nearly every individual I/O request from the mix of jobs running on the machine. Analysis of the traces leads to various recommendations for parallel file-system design.
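
    The kind of per-request reduction behind such a workload study can be sketched as a small trace summary; the (file, offset, size) tuples below are invented, and the metrics are generic rather than CHARISMA's exact definitions.

```python
# Summarize an I/O request trace: request-size histogram and the fraction of
# requests that continue sequentially from the previous access to the same file.
from collections import Counter, defaultdict

trace = [("a.dat", 0, 512), ("a.dat", 512, 512), ("b.dat", 8192, 65536),
         ("a.dat", 1024, 512), ("b.dat", 0, 65536)]     # (file, offset, size), invented

size_hist = Counter(size for _, _, size in trace)
last_end, sequential = defaultdict(lambda: None), 0
for fname, offset, size in trace:
    if last_end[fname] == offset:
        sequential += 1                          # request starts where the last one ended
    last_end[fname] = offset + size

print(size_hist)                                 # distribution of request sizes
print(f"sequential fraction: {sequential / len(trace):.2f}")
```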

  11. Havens: Explicit Reliable Memory Regions for HPC Applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hukerikar, Saurabh; Engelmann, Christian

    2016-01-01

    Supporting error resilience in future exascale-class supercomputing systems is a critical challenge. Due to transistor scaling trends and increasing memory density, scientific simulations are expected to experience more interruptions caused by transient errors in the system memory. Existing hardware-based detection and recovery techniques will be inadequate to manage the presence of high memory fault rates. In this paper we propose a partial memory protection scheme based on region-based memory management. We define the concept of regions called havens that provide fault protection for program objects. We provide reliability for the regions through a software-based parity protection mechanism. Our approach enables critical program objects to be placed in these havens. The fault coverage provided by our approach is application agnostic, unlike algorithm-based fault tolerance techniques.
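
    As a rough illustration of software parity protection over a memory region (a generic sketch, not the paper's haven implementation), the code below keeps an XOR parity block for a byte buffer and uses it to detect corruption; the block size and data are arbitrary.

```python
# Software parity over a memory region: XOR all fixed-size blocks into one
# parity block; recomputing it later reveals whether the region was corrupted.
import numpy as np

BLOCK = 4096

def region_parity(buf: np.ndarray) -> np.ndarray:
    """XOR all BLOCK-byte blocks of a byte buffer into one parity block."""
    n_blocks = (len(buf) + BLOCK - 1) // BLOCK
    padded = np.zeros(n_blocks * BLOCK, dtype=np.uint8)
    padded[:len(buf)] = buf
    return np.bitwise_xor.reduce(padded.reshape(-1, BLOCK), axis=0)

region = np.frombuffer(np.random.bytes(4 * BLOCK), dtype=np.uint8).copy()
parity = region_parity(region)                   # checksum stored alongside the region

region[5] ^= 0xFF                                # inject a transient corruption
print("corruption detected:", not np.array_equal(parity, region_parity(region)))
```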

  12. Adventures in supercomputing, a K-12 program in computational science: An assessment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oliver, C.E.; Hicks, H.R.; Iles-Brechak, K.D.

    1994-10-01

    In this paper, the authors describe only those elements of the Department of Energy Adventures in Supercomputing (AiS) program for high school teachers, such as school selection, which have a direct bearing on assessment. Schools submit an application to participate in the AiS program. They propose a team of at least two teachers to implement the AiS curriculum. The applications are evaluated by selection committees in each of the five participating states to determine which schools are the most qualified to carry out the program and reach a significant number of women, minorities, and economically disadvantaged students, all of whom have historically been underrepresented in the sciences. Typically, selected schools either have a large disadvantaged student population, or the applying teachers propose specific means to attract these segments of their student body into AiS classes. Some areas with AiS schools have significant numbers of minority students, some have economically disadvantaged, usually rural, students, and all areas have the potential to reach a higher proportion of women than technical classes usually attract. This report presents preliminary findings based on three types of data: demographic, student journals, and contextual. Demographic information is obtained for both students and teachers. Students have been asked to maintain journals which include replies to specific questions that are posed each month. An analysis of the answers to these questions helps to form a picture of how students progress through the course of the school year. Onsite visits by assessment professionals conducting student and teacher interviews provide a more in-depth, qualitative basis for understanding student motivations.

  13. Fortran for the nineties

    NASA Technical Reports Server (NTRS)

    Himer, J. T.

    1992-01-01

    Fortran has largely enjoyed prominence for the past few decades as the computer programming language of choice for numerically intensive scientific, engineering, and process control applications. Fortran's well understood static language syntax has allowed the resulting parsers and compiler optimizing technologies to generate among the most efficient and fastest run-time executables, particularly on high-end scalar and vector supercomputers. Computing architectures and paradigms have changed considerably since the last ANSI/ISO Fortran release in 1978, and while FORTRAN 77 has more than survived, its aged features provide only partial functionality for today's demanding computing environments. The simple block procedural languages have been necessarily evolving, or giving way, to specialized supercomputing, network resource, and object-oriented paradigms. To address these new computing demands, ANSI has worked for the last 12 years, with three international public reviews, to deliver Fortran 90. Fortran 90 has superseded and replaced ISO FORTRAN 77 internationally as the sole Fortran standard; in the US, Fortran 90 is expected to be adopted as the ANSI standard this summer, coexisting with ANSI FORTRAN 77 until at least 1996. The development path and current state of Fortran will be briefly described, highlighting the many new Fortran 90 syntactic and semantic additions which support (among others): free form source; array syntax; new control structures; modules and interfaces; pointers; derived data types; dynamic memory; enhanced I/O; operator overloading; data abstraction; optional user arguments; new intrinsics for array, bit manipulation, and system inquiry; and enhanced portability through better generic control of underlying system arithmetic models. Examples from dynamical astronomy and signal and image processing will attempt to illustrate Fortran 90's applicability to today's general scalar, vector, and parallel scientific and engineering requirements and object-oriented programming paradigms. Time permitting, current work on the future development of Fortran 2000 and collateral standards will be introduced.

  14. Automatic Parallelization of Numerical Python Applications using the Global Arrays Toolkit

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Daily, Jeffrey A.; Lewis, Robert R.

    2011-11-30

    Global Arrays is a software system from Pacific Northwest National Laboratory that enables an efficient, portable, and parallel shared-memory programming interface to manipulate distributed dense arrays. The NumPy module is the de facto standard for numerical calculation in the Python programming language, a language whose use is growing rapidly in the scientific and engineering communities. NumPy provides a powerful N-dimensional array class as well as other scientific computing capabilities. However, like the majority of the core Python modules, NumPy is inherently serial. Using a combination of Global Arrays and NumPy, we have reimplemented NumPy as a distributed drop-in replacement called Global Arrays in NumPy (GAiN). Serial NumPy applications can become parallel, scalable GAiN applications with only minor source code changes. Scalability studies of several different GAiN applications will be presented showing the utility of developing serial NumPy codes which can later run on more capable clusters or supercomputers.

  15. HACC: Simulating sky surveys on state-of-the-art supercomputing architectures

    NASA Astrophysics Data System (ADS)

    Habib, Salman; Pope, Adrian; Finkel, Hal; Frontiere, Nicholas; Heitmann, Katrin; Daniel, David; Fasel, Patricia; Morozov, Vitali; Zagaris, George; Peterka, Tom; Vishwanath, Venkatram; Lukić, Zarija; Sehrish, Saba; Liao, Wei-keng

    2016-01-01

    Current and future surveys of large-scale cosmic structure are associated with a massive and complex datastream to study, characterize, and ultimately understand the physics behind the two major components of the 'Dark Universe', dark energy and dark matter. In addition, the surveys also probe primordial perturbations and carry out fundamental measurements, such as determining the sum of neutrino masses. Large-scale simulations of structure formation in the Universe play a critical role in the interpretation of the data and extraction of the physics of interest. Just as survey instruments continue to grow in size and complexity, so do the supercomputers that enable these simulations. Here we report on HACC (Hardware/Hybrid Accelerated Cosmology Code), a recently developed and evolving cosmology N-body code framework, designed to run efficiently on diverse computing architectures and to scale to millions of cores and beyond. HACC can run on all current supercomputer architectures and supports a variety of programming models and algorithms. It has been demonstrated at scale on Cell- and GPU-accelerated systems, standard multi-core node clusters, and Blue Gene systems. HACC's design allows for ease of portability, and at the same time, high levels of sustained performance on the fastest supercomputers available. We present a description of the design philosophy of HACC, the underlying algorithms and code structure, and outline implementation details for several specific architectures. We show selected accuracy and performance results from some of the largest high resolution cosmological simulations so far performed, including benchmarks evolving more than 3.6 trillion particles.
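
    As a rough illustration of the N-body physics such a code advances (nothing like HACC's tree and particle-mesh solvers, and with invented parameters), a direct-summation kick-drift-kick step might look like this:

```python
# Toy gravitational N-body integrator: softened direct summation of pairwise
# forces with a leapfrog (kick-drift-kick) time step. Units and values are toy.
import numpy as np

def accelerations(pos, mass, eps=1e-2):
    diff = pos[None, :, :] - pos[:, None, :]            # pairwise separation vectors
    dist3 = (np.sum(diff ** 2, axis=-1) + eps ** 2) ** 1.5
    np.fill_diagonal(dist3, np.inf)                     # suppress self-force
    return np.sum(mass[None, :, None] * diff / dist3[:, :, None], axis=1)

n, dt = 256, 0.01
pos = np.random.rand(n, 3)
vel = np.zeros((n, 3))
mass = np.full(n, 1.0 / n)

acc = accelerations(pos, mass)
for _ in range(100):
    vel += 0.5 * dt * acc                               # kick
    pos += dt * vel                                     # drift
    acc = accelerations(pos, mass)
    vel += 0.5 * dt * acc                               # kick
```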

  16. HACC: Simulating sky surveys on state-of-the-art supercomputing architectures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Habib, Salman; Pope, Adrian; Finkel, Hal

    2016-01-01

    Current and future surveys of large-scale cosmic structure are associated with a massive and complex datastream to study, characterize, and ultimately understand the physics behind the two major components of the 'Dark Universe', dark energy and dark matter. In addition, the surveys also probe primordial perturbations and carry out fundamental measurements, such as determining the sum of neutrino masses. Large-scale simulations of structure formation in the Universe play a critical role in the interpretation of the data and extraction of the physics of interest. Just as survey instruments continue to grow in size and complexity, so do the supercomputers that enable these simulations. Here we report on HACC (Hardware/Hybrid Accelerated Cosmology Code), a recently developed and evolving cosmology N-body code framework, designed to run efficiently on diverse computing architectures and to scale to millions of cores and beyond. HACC can run on all current supercomputer architectures and supports a variety of programming models and algorithms. It has been demonstrated at scale on Cell- and GPU-accelerated systems, standard multi-core node clusters, and Blue Gene systems. HACC's design allows for ease of portability, and at the same time, high levels of sustained performance on the fastest supercomputers available. We present a description of the design philosophy of HACC, the underlying algorithms and code structure, and outline implementation details for several specific architectures. We show selected accuracy and performance results from some of the largest high resolution cosmological simulations so far performed, including benchmarks evolving more than 3.6 trillion particles.

  17. Calibrating Building Energy Models Using Supercomputer Trained Machine Learning Agents

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sanyal, Jibonananda; New, Joshua Ryan; Edwards, Richard

    2014-01-01

    Building Energy Modeling (BEM) is an approach to model the energy usage in buildings for design and retrofit purposes. EnergyPlus is the flagship Department of Energy software that performs BEM for different types of buildings. The input to EnergyPlus often runs to a few thousand parameters, which have to be calibrated manually by an expert for realistic energy modeling. This makes calibration challenging and expensive, putting building energy modeling out of reach for smaller projects. In this paper, we describe the Autotune research, which employs machine learning algorithms to generate agents for the different kinds of standard reference buildings in the U.S. building stock. The parametric space and the variety of building locations and types make this a challenging computational problem necessitating the use of supercomputers. Millions of EnergyPlus simulations are run on supercomputers and subsequently used to train machine learning algorithms to generate agents. These agents, once created, can then run in a fraction of the time, thereby allowing cost-effective calibration of building models.
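
    As a hedged illustration of the surrogate idea described above, the sketch below fits a fast regressor to a table of (building parameters, simulated energy use) pairs and then queries it in place of the simulator during calibration. The synthetic data, parameter count, target value, and the choice of a random-forest model are all assumptions for illustration; this is not the Autotune agent code.

        # Minimal sketch: train a fast surrogate ("agent") on precomputed
        # simulation results, then query it instead of the full simulator.
        # Synthetic data stands in for EnergyPlus runs (illustrative only).
        import numpy as np
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(1)
        n_runs, n_params = 5000, 8                          # pretend parametric sweep
        X = rng.uniform(0.0, 1.0, size=(n_runs, n_params))  # building parameters
        y = X @ rng.uniform(5.0, 50.0, n_params) + rng.normal(0.0, 1.0, n_runs)  # energy proxy

        X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
        agent = RandomForestRegressor(n_estimators=200, random_state=0)
        agent.fit(X_train, y_train)
        print("surrogate R^2 on held-out runs:", agent.score(X_test, y_test))

        # Calibration then becomes a cheap search over the surrogate:
        candidates = rng.uniform(0.0, 1.0, size=(100000, n_params))
        measured = 150.0                                    # hypothetical metered usage
        best = candidates[np.argmin(np.abs(agent.predict(candidates) - measured))]
        print("best-matching parameter vector:", best.round(3))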

  18. SiGN: large-scale gene network estimation environment for high performance computing.

    PubMed

    Tamada, Yoshinori; Shimamura, Teppei; Yamaguchi, Rui; Imoto, Seiya; Nagasaki, Masao; Miyano, Satoru

    2011-01-01

    Our research group is currently developing software for estimating large-scale gene networks from gene expression data. The software, called SiGN, is specifically designed for the Japanese flagship supercomputer "K computer" which is planned to achieve 10 petaflops in 2012, and other high performance computing environments including Human Genome Center (HGC) supercomputer system. SiGN is a collection of gene network estimation software with three different sub-programs: SiGN-BN, SiGN-SSM and SiGN-L1. In these three programs, five different models are available: static and dynamic nonparametric Bayesian networks, state space models, graphical Gaussian models, and vector autoregressive models. All these models require a huge amount of computational resources for estimating large-scale gene networks and therefore are designed to be able to exploit the speed of 10 petaflops. The software will be available freely for "K computer" and HGC supercomputer system users. The estimated networks can be viewed and analyzed by Cell Illustrator Online and SBiP (Systems Biology integrative Pipeline). The software project web site is available at http://sign.hgc.jp/ .
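
    One of the model classes listed above, the graphical Gaussian model, can be illustrated at toy scale with an off-the-shelf sparse inverse-covariance estimator. The synthetic expression matrix, the planted dependencies, and the use of scikit-learn's GraphicalLassoCV are illustrative assumptions; SiGN's own estimators and scale are not reproduced here.

        # Minimal sketch: estimate a small gene network as a graphical Gaussian
        # model (sparse inverse covariance), at toy scale.
        import numpy as np
        from sklearn.covariance import GraphicalLassoCV

        rng = np.random.default_rng(2)
        n_samples, n_genes = 200, 12
        expr = rng.normal(size=(n_samples, n_genes))        # synthetic expression data
        expr[:, 1] += 0.8 * expr[:, 0]                      # plant a few dependencies
        expr[:, 5] += 0.6 * expr[:, 4]

        model = GraphicalLassoCV().fit(expr)
        precision = model.precision_                        # nonzeros ~ network edges
        edges = [(i, j) for i in range(n_genes) for j in range(i + 1, n_genes)
                 if abs(precision[i, j]) > 1e-3]
        print("estimated edges:", edges)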

  19. Massively parallel algorithm and implementation of RI-MP2 energy calculation for peta-scale many-core supercomputers.

    PubMed

    Katouda, Michio; Naruse, Akira; Hirano, Yukihiko; Nakajima, Takahito

    2016-11-15

    A new parallel algorithm and its implementation for the RI-MP2 energy calculation utilizing peta-flop-class many-core supercomputers are presented. Two improvements over the previous algorithm (J. Chem. Theory Comput. 2013, 9, 5373) have been made: (1) a dual-level hierarchical parallelization scheme that enables the use of more than 10,000 Message Passing Interface (MPI) processes and (2) a new data communication scheme that reduces network communication overhead. A multi-node and multi-GPU implementation of the present algorithm is presented for calculations on a central processing unit (CPU)/graphics processing unit (GPU) hybrid supercomputer. Benchmark results of the new algorithm and its implementation using the K computer (CPU clustering system) and TSUBAME 2.5 (CPU/GPU hybrid system) demonstrate high efficiency. The peak performance of 3.1 PFLOPS is attained using 80,199 nodes of the K computer. The peak performance of the multi-node and multi-GPU implementation is 514 TFLOPS using 1349 nodes and 4047 GPUs of TSUBAME 2.5. © 2016 Wiley Periodicals, Inc.
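
    The dual-level hierarchical parallelization mentioned in point (1) can be sketched with MPI communicator splitting: the world communicator is divided into groups, each group works on a coarse-grained unit with fine-grained parallelism inside it, and group leaders combine the results. The group size and the dummy workload below are assumptions for illustration; this is not the authors' RI-MP2 kernel.

        # Minimal sketch of dual-level (hierarchical) MPI parallelism using
        # communicator splitting; run with e.g. `mpiexec -n 8 python sketch.py`.
        from mpi4py import MPI

        world = MPI.COMM_WORLD
        rank, size = world.Get_rank(), world.Get_size()

        group_size = 4                                  # assumed inner-group width
        color = rank // group_size                      # which group this rank joins
        inner = world.Split(color=color, key=rank)      # level 2: within a group

        # Leaders of each group form the outer (level 1) communicator.
        outer = world.Split(color=0 if inner.Get_rank() == 0 else MPI.UNDEFINED,
                            key=rank)

        # Coarse work unit chosen by group, fine-grained piece by inner rank.
        partial = float(color * 1000 + inner.Get_rank())     # dummy workload
        group_sum = inner.reduce(partial, op=MPI.SUM, root=0)

        if outer != MPI.COMM_NULL:                      # group leaders only
            total = outer.reduce(group_sum, op=MPI.SUM, root=0)
            if outer.Get_rank() == 0:
                print("total over all groups:", total)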

  20. Optical clock distribution in supercomputers using polyimide-based waveguides

    NASA Astrophysics Data System (ADS)

    Bihari, Bipin; Gan, Jianhua; Wu, Linghui; Liu, Yujie; Tang, Suning; Chen, Ray T.

    1999-04-01

    Guided-wave optics is a promising way to deliver high-speed clock signals in supercomputers with minimized clock skew. Si-CMOS compatible polymer-based waveguides for optoelectronic interconnects and packaging have been fabricated and characterized. A 1-to-48 fanout optoelectronic interconnection layer (OIL) structure based on Ultradel 9120/9020 for high-speed massive clock signal distribution on a Cray T-90 supercomputer board has been constructed. The OIL employs multimode polymeric channel waveguides in conjunction with surface-normal waveguide output couplers and 1-to-2 splitters. The surface-normal couplers couple the optical clock signals into and out of the H-tree polyimide waveguides surface-normally, which facilitates the integration of photodetectors to convert optical signals to electrical signals. A 45-degree surface-normal coupler has been integrated at each output end. The measured output coupling efficiency is nearly 100 percent. The output profile from the 45-degree surface-normal coupler was calculated using the Fresnel approximation; the theoretical result is in good agreement with the experimental result. A total insertion loss of 7.98 dB at 850 nm was measured experimentally.

  1. Flow visualization of CFD using graphics workstations

    NASA Technical Reports Server (NTRS)

    Lasinski, Thomas; Buning, Pieter; Choi, Diana; Rogers, Stuart; Bancroft, Gordon

    1987-01-01

    High performance graphics workstations are used to visualize the fluid flow dynamics obtained from supercomputer solutions of computational fluid dynamic programs. The visualizations can be done independently on the workstation or while the workstation is connected to the supercomputer in a distributed computing mode. In the distributed mode, the supercomputer interactively performs the computationally intensive graphics rendering tasks while the workstation performs the viewing tasks. A major advantage of the workstations is that the viewers can interactively change their viewing position while watching the dynamics of the flow fields. An overview of the computer hardware and software required to create these displays is presented. For complex scenes the workstation cannot create the displays fast enough for good motion analysis. For these cases, the animation sequences are recorded on video tape or 16 mm film a frame at a time and played back at the desired speed. The additional software and hardware required to create these video tapes or 16 mm movies are also described. Photographs illustrating current visualization techniques are discussed. Examples of the use of the workstations for flow visualization through animation are available on video tape.

  2. Two-dimensional nonsteady viscous flow simulation on the Navier-Stokes computer miniNode

    NASA Technical Reports Server (NTRS)

    Nosenchuck, Daniel M.; Littman, Michael G.; Flannery, William

    1986-01-01

    The needs of large-scale scientific computation are outpacing the growth in performance of mainframe supercomputers. In particular, problems in fluid mechanics involving complex flow simulations require far more speed and capacity than that provided by current and proposed Class VI supercomputers. To address this concern, the Navier-Stokes Computer (NSC) was developed. The NSC is a parallel-processing machine, comprised of individual Nodes, each comparable in performance to current supercomputers. The global architecture is that of a hypercube, and a 128-Node NSC has been designed. New architectural features, such as a reconfigurable many-function ALU pipeline and a multifunction memory-ALU switch, have provided the capability to efficiently implement a wide range of algorithms. Efficient algorithms typically involve numerically intensive tasks, which often include conditional operations. These operations may be efficiently implemented on the NSC without, in general, sacrificing vector-processing speed. To illustrate the architecture, programming, and several of the capabilities of the NSC, the simulation of two-dimensional, nonsteady viscous flows on a prototype Node, called the miniNode, is presented.

  3. Long-Term file activity patterns in a UNIX workstation environment

    NASA Technical Reports Server (NTRS)

    Gibson, Timothy J.; Miller, Ethan L.

    1998-01-01

    As mass storage technology becomes more affordable for sites smaller than supercomputer centers, understanding their file access patterns becomes crucial for developing systems to store rarely used data on tertiary storage devices such as tapes and optical disks. This paper presents a new way to collect and analyze file system statistics for UNIX-based file systems. The collection system runs in user-space and requires no modification of the operating system kernel. The statistics package provides details about file system operations at the file level: creations, deletions, modifications, etc. The paper analyzes four months of file system activity on a university file system. The results confirm previously published results gathered from supercomputer file systems, but differ in several important areas. Files in this study were considerably smaller than those at supercomputer centers, and they were accessed less frequently. Additionally, the long-term creation rate on workstation file systems is sufficiently low so that all data more than a day old could be cheaply saved on a mass storage device, allowing the integration of time travel into every file system.
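
    As a hedged illustration of user-space collection at the file level, the sketch below takes two snapshots of a directory tree and diffs them to count creations, deletions, and modifications, with no kernel changes. The snapshot format and the use of size/mtime as the modification signal are simplifying assumptions, not the instrumentation used in the study.

        # Minimal sketch: user-space file-activity collection by diffing two
        # snapshots of a directory tree (no kernel modification required).
        import os, sys

        def snapshot(root):
            """Map path -> (size, mtime) for every regular file under root."""
            snap = {}
            for dirpath, _dirnames, filenames in os.walk(root):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    try:
                        st = os.stat(path)
                    except OSError:
                        continue                      # file vanished mid-walk
                    snap[path] = (st.st_size, st.st_mtime)
            return snap

        def diff(old, new):
            created  = [p for p in new if p not in old]
            deleted  = [p for p in old if p not in new]
            modified = [p for p in new if p in old and new[p] != old[p]]
            return created, deleted, modified

        if __name__ == "__main__":
            root = sys.argv[1] if len(sys.argv) > 1 else "."
            before = snapshot(root)
            input("make some changes under the tree, then press Enter...")
            created, deleted, modified = diff(before, snapshot(root))
            print(len(created), "created,", len(deleted), "deleted,",
                  len(modified), "modified")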

  4. A comprehensive study of MPI parallelism in three-dimensional discrete element method (DEM) simulation of complex-shaped granular particles

    NASA Astrophysics Data System (ADS)

    Yan, Beichuan; Regueiro, Richard A.

    2018-02-01

    A three-dimensional (3D) DEM code for simulating complex-shaped granular particles is parallelized using the Message Passing Interface (MPI). The concepts of link-block, ghost/border layer, and migration layer are put forward for the design of the parallel algorithm, and theoretical functions for 3D DEM scalability and memory usage are derived. Many performance-critical implementation details are handled to achieve high performance and scalability, such as minimizing communication overhead, maintaining dynamic load balance, handling particle migrations across block borders, transmitting C++ dynamic objects of particles efficiently between MPI processes, and eliminating redundant contact information between adjacent MPI processes. The code executes on multiple US Department of Defense (DoD) supercomputers and is tested on up to 2048 compute nodes for simulating 10 million three-axis ellipsoidal particles. Performance analyses of the code, including speedup, efficiency, scalability, and granularity across five orders of magnitude of simulation scale (number of particles), are provided and demonstrate high speedup and excellent scalability. It is also found that communication time is a decreasing function of the number of compute nodes in strong scaling measurements. The code's capability to simulate a large number of complex-shaped particles on modern supercomputers will be of value in both laboratory studies of the micromechanical properties of granular materials and many realistic engineering applications involving granular materials.
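
    The ghost/border-layer concept can be sketched in one dimension: each MPI rank owns a slab of the domain and, before contact detection, exchanges the particles lying within one interaction radius of its slab faces with its neighbours. The 1D decomposition, periodic neighbours, and pickled-object sendrecv below are illustrative simplifications of the link-block scheme described above.

        # Minimal sketch: 1-D domain decomposition with ghost-particle exchange
        # between neighbouring MPI ranks (run with mpiexec -n 4 python sketch.py).
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        cutoff = 0.05                                    # assumed interaction radius
        lo, hi = rank / size, (rank + 1) / size          # this rank's slab of [0, 1)
        rng = np.random.default_rng(rank)
        local = np.column_stack([rng.uniform(lo, hi, 1000),        # x inside slab
                                 rng.uniform(0.0, 1.0, (1000, 2))])  # y, z anywhere

        left, right = (rank - 1) % size, (rank + 1) % size   # periodic neighbours

        # Border layers: particles close enough to the slab faces to matter
        # for the neighbours' contact detection.
        send_left  = local[local[:, 0] < lo + cutoff]
        send_right = local[local[:, 0] > hi - cutoff]

        ghost_from_right = comm.sendrecv(send_left,  dest=left,  source=right)
        ghost_from_left  = comm.sendrecv(send_right, dest=right, source=left)

        ghosts = np.vstack([ghost_from_left, ghost_from_right])
        print(f"rank {rank}: {len(local)} owned, {len(ghosts)} ghost particles")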

  5. Requirements for a network storage service

    NASA Technical Reports Server (NTRS)

    Kelly, Suzanne M.; Haynes, Rena A.

    1992-01-01

    Sandia National Laboratories provides a high performance classified computer network as a core capability in support of its mission of nuclear weapons design and engineering, physical sciences research, and energy research and development. The network, locally known as the Internal Secure Network (ISN), was designed in 1989 and comprises multiple distributed local area networks (LANs) residing in Albuquerque, New Mexico and Livermore, California. The TCP/IP protocol suite is used for inter-node communications. Scientific workstations and mid-range computers running UNIX-based operating systems make up most LANs. One LAN, operated by the Sandia Corporate Computing Directorate, is a general purpose resource providing a supercomputer and a file server to the entire ISN. The current file server on the supercomputer LAN is an implementation of the Common File System (CFS) developed by Los Alamos National Laboratory. Subsequent to the design of the ISN, Sandia reviewed its mass storage requirements and chose to enter into a competitive procurement to replace the existing file server with one more adaptable to a UNIX/TCP/IP environment. The requirements study for the network was the starting point for the requirements study for the new file server. The file server is called the Network Storage Services (NSS), and its requirements are described in this paper. The next section gives an application or functional description of the NSS. The final section adds performance, capacity, and access constraints to the requirements.

  6. A unified framework for spiking and gap-junction interactions in distributed neuronal network simulations.

    PubMed

    Hahne, Jan; Helias, Moritz; Kunkel, Susanne; Igarashi, Jun; Bolten, Matthias; Frommer, Andreas; Diesmann, Markus

    2015-01-01

    Contemporary simulators for networks of point and few-compartment model neurons come with a plethora of ready-to-use neuron and synapse models and support complex network topologies. Recent technological advancements have broadened the spectrum of application further to the efficient simulation of brain-scale networks on supercomputers. In distributed network simulations the amount of spike data that accrues per millisecond and process is typically low, such that a common optimization strategy is to communicate spikes at relatively long intervals, where the upper limit is given by the shortest synaptic transmission delay in the network. This approach is well-suited for simulations that employ only chemical synapses but it has so far impeded the incorporation of gap-junction models, which require instantaneous neuronal interactions. Here, we present a numerical algorithm based on a waveform-relaxation technique which allows for network simulations with gap junctions in a way that is compatible with the delayed communication strategy. Using a reference implementation in the NEST simulator, we demonstrate that the algorithm and the required data structures can be smoothly integrated with existing code such that they complement the infrastructure for spiking connections. To show that the unified framework for gap-junction and spiking interactions achieves high performance and delivers high accuracy in the presence of gap junctions, we present benchmarks for workstations, clusters, and supercomputers. Finally, we discuss limitations of the novel technology.
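
    The waveform-relaxation idea can be shown on a toy pair of passive membranes coupled by a gap junction: each sweep re-integrates one cell over the whole communication interval using the other cell's waveform from the previous sweep, until the waveforms stop changing. The model, parameters, and explicit Euler integrator below are assumptions for illustration, not the NEST implementation.

        # Minimal sketch: waveform relaxation for two passive membranes coupled
        # by a gap junction. Each sweep integrates one cell over the whole
        # communication interval using the other cell's waveform from the
        # previous sweep (toy model only).
        import numpy as np

        tau, g_gap, dt, steps = 10.0, 0.5, 0.1, 200      # ms, coupling, step, count
        I_ext = np.array([1.5, 0.0])                     # constant drive to cell 0

        V = np.zeros((2, steps + 1))                     # current waveform guess
        for sweep in range(20):                          # relaxation iterations
            V_prev = V.copy()
            for cell, other in ((0, 1), (1, 0)):
                for k in range(steps):
                    coupling = g_gap * (V_prev[other, k] - V[cell, k])
                    dV = (-V[cell, k] + I_ext[cell] + coupling) / tau
                    V[cell, k + 1] = V[cell, k] + dt * dV
            if np.max(np.abs(V - V_prev)) < 1e-9:        # waveforms converged
                print("converged after", sweep + 1, "sweeps")
                break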

  7. RISC Processors and High Performance Computing

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Bailey, David H.; Lasinski, T. A. (Technical Monitor)

    1995-01-01

    In this tutorial, we will discuss the top five current RISC microprocessors: the IBM Power2, which is used in the IBM RS6000/590 workstation and in the IBM SP2 parallel supercomputer; the DEC Alpha, which is in the DEC Alpha workstation and in the Cray T3D; the MIPS R8000, which is used in the SGI Power Challenge; the HP PA-RISC 7100, which is used in the HP 700 series workstations and in the Convex Exemplar; and the Cray proprietary processor, which is used in the new Cray J916. The architecture of these microprocessors will first be presented. The effective performance of these processors will then be compared, both by citing standard benchmarks and in the context of implementing real applications. In the process, different programming models such as data parallel (CM Fortran and HPF) and message passing (PVM and MPI) will be introduced and compared. The latest NAS Parallel Benchmark (NPB) absolute performance and performance per dollar figures will be presented. The next generation of the NPB will also be described. The tutorial will conclude with a discussion of general trends in the field of high performance computing, including likely future developments in hardware and software technology, and the relative roles of vector supercomputers, tightly coupled parallel computers, and clusters of workstations. This tutorial will provide a unique cross-machine comparison not available elsewhere.

  8. A unified framework for spiking and gap-junction interactions in distributed neuronal network simulations

    PubMed Central

    Hahne, Jan; Helias, Moritz; Kunkel, Susanne; Igarashi, Jun; Bolten, Matthias; Frommer, Andreas; Diesmann, Markus

    2015-01-01

    Contemporary simulators for networks of point and few-compartment model neurons come with a plethora of ready-to-use neuron and synapse models and support complex network topologies. Recent technological advancements have broadened the spectrum of application further to the efficient simulation of brain-scale networks on supercomputers. In distributed network simulations the amount of spike data that accrues per millisecond and process is typically low, such that a common optimization strategy is to communicate spikes at relatively long intervals, where the upper limit is given by the shortest synaptic transmission delay in the network. This approach is well-suited for simulations that employ only chemical synapses but it has so far impeded the incorporation of gap-junction models, which require instantaneous neuronal interactions. Here, we present a numerical algorithm based on a waveform-relaxation technique which allows for network simulations with gap junctions in a way that is compatible with the delayed communication strategy. Using a reference implementation in the NEST simulator, we demonstrate that the algorithm and the required data structures can be smoothly integrated with existing code such that they complement the infrastructure for spiking connections. To show that the unified framework for gap-junction and spiking interactions achieves high performance and delivers high accuracy in the presence of gap junctions, we present benchmarks for workstations, clusters, and supercomputers. Finally, we discuss limitations of the novel technology. PMID:26441628

  9. Extensible Computational Chemistry Environment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2012-08-09

    ECCE provides a sophisticated graphical user interface, scientific visualization tools, and the underlying data management framework enabling scientists to efficiently set up calculations and store, retrieve, and analyze the rapidly growing volumes of data produced by computational chemistry studies. ECCE was conceived as part of the Environmental Molecular Sciences Laboratory construction to solve the problem of enabling researchers to effectively utilize complex computational chemistry codes and massively parallel high performance compute resources. Bringing the power of these codes and resources to the desktops of researchers, and thus enabling world-class research without users needing a detailed understanding of the inner workings of either the theoretical codes or the supercomputers needed to run them, was a grand challenge problem in the original version of the EMSL. ECCE allows collaboration among researchers using a web-based data repository where the inputs and results for all calculations done within ECCE are organized. ECCE is a first-of-its-kind end-to-end problem solving environment for all phases of computational chemistry research: setting up calculations with sophisticated GUI and direct manipulation visualization tools, submitting and monitoring calculations on remote high performance supercomputers without having to be familiar with the details of using these compute resources, and performing results visualization and analysis, including creating publication quality images. ECCE is a suite of tightly integrated applications that are employed as the user moves through the modeling process.

  10. Networking Technologies Enable Advances in Earth Science

    NASA Technical Reports Server (NTRS)

    Johnson, Marjory; Freeman, Kenneth; Gilstrap, Raymond; Beck, Richard

    2004-01-01

    This paper describes an experiment to prototype a new way of conducting science by applying networking and distributed computing technologies to an Earth Science application. A combination of satellite, wireless, and terrestrial networking provided geologists at a remote field site with interactive access to supercomputer facilities at two NASA centers, thus enabling them to validate and calibrate remotely sensed geological data in near-real time. This represents a fundamental shift in the way that Earth scientists analyze remotely sensed data. In this paper we describe the experiment and the network infrastructure that enabled it, analyze the data flow during the experiment, and discuss the scientific impact of the results.

  11. Wakefield Computations for the CLIC PETS using the Parallel Finite Element Time-Domain Code T3P

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Candel, A; Kabel, A.; Lee, L.

    In recent years, SLAC's Advanced Computations Department (ACD) has developed the high-performance parallel 3D electromagnetic time-domain code, T3P, for simulations of wakefields and transients in complex accelerator structures. T3P is based on advanced higher-order Finite Element methods on unstructured grids with quadratic surface approximation. Optimized for large-scale parallel processing on leadership supercomputing facilities, T3P allows simulations of realistic 3D structures with unprecedented accuracy, aiding the design of the next generation of accelerator facilities. Applications to the Compact Linear Collider (CLIC) Power Extraction and Transfer Structure (PETS) are presented.

  12. A Laboratory Facility for Research in Parallel Computation: Project Final Report.

    DTIC Science & Technology

    1987-07-01

    (OCR fragment from a scanned DTIC report form; only fragments are recoverable, including the phrase "A software tool for Building Supercomputer Applications" and a passage noting that processors may display different behaviors, e.g., a processor g with a "good" local structure versus a processor b with a "bad" local structure.)

  13. Building black holes: supercomputer cinema.

    PubMed

    Shapiro, S L; Teukolsky, S A

    1988-07-22

    A new computer code can solve Einstein's equations of general relativity for the dynamical evolution of a relativistic star cluster. The cluster may contain a large number of stars that move in a strong gravitational field at speeds approaching the speed of light. Unstable star clusters undergo catastrophic collapse to black holes. The collapse of an unstable cluster to a supermassive black hole at the center of a galaxy may explain the origin of quasars and active galactic nuclei. By means of a supercomputer simulation and color graphics, the whole process can be viewed in real time on a movie screen.

  14. Supercomputer analysis of purine and pyrimidine metabolism leading to DNA synthesis.

    PubMed

    Heinmets, F

    1989-06-01

    A model system is established to analyze purine and pyrimidine metabolism leading to DNA synthesis. The principal aim is to explore the flow and regulation of terminal deoxynucleoside triphosphates (dNTPs) under various input and parametric conditions. A series of flow equations is established and subsequently converted to differential equations. These are programmed in Fortran and analyzed on a Cray X-MP/48 supercomputer. The pool concentrations are presented as a function of time under conditions in which various pertinent parameters of the system are modified. The system is formulated by 100 differential equations.
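
    The flow-equations-to-ODEs approach can be sketched with a deliberately tiny stand-in: a precursor pool feeding a dNTP pool that drains into DNA synthesis, with a crude feedback term, integrated with SciPy. The species, rate constants, and feedback form are hypothetical; the study's 100-equation Fortran system is not reproduced.

        # Minimal sketch: a tiny flow/pool model integrated as ODEs, standing in
        # for the 100-equation metabolic system described above (hypothetical
        # species and rate constants).
        import numpy as np
        from scipy.integrate import solve_ivp

        def pools(t, y, k_in=1.0, k_conv=0.4, k_dna=0.2, k_fb=0.05):
            precursor, dntp = y
            d_precursor = k_in - k_conv * precursor
            d_dntp = (k_conv * precursor          # synthesis from precursor pool
                      - k_dna * dntp              # drain into DNA synthesis
                      - k_fb * dntp ** 2)         # crude feedback inhibition
            return [d_precursor, d_dntp]

        sol = solve_ivp(pools, t_span=(0.0, 60.0), y0=[0.0, 0.0], dense_output=True)
        for ti, (prec, dntp) in zip(np.linspace(0.0, 60.0, 7),
                                    sol.sol(np.linspace(0.0, 60.0, 7)).T):
            print(f"t={ti:5.1f}  precursor={prec:6.3f}  dNTP={dntp:6.3f}")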

  15. Performance of the Widely-Used CFD Code OVERFLOW on the Pleiades Supercomputer

    NASA Technical Reports Server (NTRS)

    Guruswamy, Guru P.

    2017-01-01

    Computational performance studies were made for NASA's widely used Computational Fluid Dynamics code OVERFLOW on the Pleiades Supercomputer. Two test cases were considered: a full launch vehicle with a grid of 286 million points and a full rotorcraft model with a grid of 614 million points. Computations using up to 8000 cores were run on Sandy Bridge and Ivy Bridge nodes. Performance was monitored using times reported in the day files from the Portable Batch System utility. Results for two grid topologies are presented and compared in detail. Observations and suggestions for future work are made.
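
    The post-processing step of turning wall-clock times at different core counts into speedup and parallel efficiency is simple enough to sketch; the timings below are invented placeholders, not the OVERFLOW results reported above.

        # Minimal sketch: speedup and parallel-efficiency tabulation from
        # wall-clock times at several core counts (timings are invented,
        # not the results of the study described above).
        cores   = [500, 1000, 2000, 4000, 8000]
        seconds = [4200.0, 2210.0, 1180.0, 660.0, 410.0]   # hypothetical run times

        base_cores, base_time = cores[0], seconds[0]
        print(f"{'cores':>6} {'time(s)':>9} {'speedup':>8} {'efficiency':>11}")
        for n, t in zip(cores, seconds):
            speedup = base_time / t
            efficiency = speedup / (n / base_cores)
            print(f"{n:>6} {t:>9.1f} {speedup:>8.2f} {efficiency:>11.2f}")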

  16. Performance Evaluation of Supercomputers using HPCC and IMB Benchmarks

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Ciotti, Robert; Gunney, Brian T. N.; Spelce, Thomas E.; Koniges, Alice; Dossa, Don; Adamidis, Panagiotis; Rabenseifner, Rolf; Tiyyagura, Sunil R.; Mueller, Matthias

    2006-01-01

    The HPC Challenge (HPCC) benchmark suite and the Intel MPI Benchmark (IMB) are used to compare and evaluate the combined performance of processor, memory subsystem, and interconnect fabric of five leading supercomputers - SGI Altix BX2, Cray X1, Cray Opteron Cluster, Dell Xeon cluster, and NEC SX-8. These five systems use five different networks (SGI NUMALINK4, Cray network, Myrinet, InfiniBand, and NEC IXS). The complete set of HPCC benchmarks is run on each of these systems. Additionally, we present Intel MPI Benchmark (IMB) results to study the performance of 11 MPI communication functions on these systems.
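
    A two-rank ping-pong in the spirit of the IMB point-to-point tests can be sketched with mpi4py; the message sizes, repetition count, and single timing loop below are arbitrary simplifications of how IMB actually measures latency and bandwidth.

        # Minimal sketch: a two-rank ping-pong latency/bandwidth test in the
        # spirit of the IMB point-to-point benchmarks (run with mpiexec -n 2).
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()
        reps = 100

        for nbytes in (8, 1024, 1024 * 1024):
            buf = np.zeros(nbytes, dtype=np.uint8)
            comm.Barrier()
            t0 = MPI.Wtime()
            for _ in range(reps):
                if rank == 0:
                    comm.Send(buf, dest=1); comm.Recv(buf, source=1)
                elif rank == 1:
                    comm.Recv(buf, source=0); comm.Send(buf, dest=0)
            dt = (MPI.Wtime() - t0) / reps                 # average round-trip time
            if rank == 0:
                print(f"{nbytes:>8} bytes  round trip {dt * 1e6:8.1f} us  "
                      f"{2 * nbytes / dt / 1e6:8.1f} MB/s")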

  17. A heterogeneous computing environment for simulating astrophysical fluid flows

    NASA Technical Reports Server (NTRS)

    Cazes, J.

    1994-01-01

    In the Concurrent Computing Laboratory in the Department of Physics and Astronomy at Louisiana State University we have constructed a heterogeneous computing environment that permits us to routinely simulate complicated three-dimensional fluid flows and to readily visualize the results of each simulation via three-dimensional animation sequences. An 8192-node MasPar MP-1 computer with 0.5 GBytes of RAM provides 250 MFlops of execution speed for our fluid flow simulations. Utilizing the parallel virtual machine (PVM) language, at periodic intervals data is automatically transferred from the MP-1 to a cluster of workstations where individual three-dimensional images are rendered for inclusion in a single animation sequence. Work is underway to replace executions on the MP-1 with simulations performed on the 512-node CM-5 at NCSA and to simultaneously gain access to more potent volume rendering workstations.

  18. An object oriented fully 3D tomography visual toolkit.

    PubMed

    Agostinelli, S; Paoli, G

    2001-04-01

    In this paper we present a modern object oriented Component Object Model (COM) C++ toolkit dedicated to fully 3D cone-beam tomography. The toolkit allows the display and visual manipulation of analytical phantoms, projection sets and volumetric data through a standard Windows graphical user interface. Data input/output is performed using proprietary file formats, but import/export of industry standard file formats, including raw binary, Windows bitmap and AVI, ACR/NEMA DICOM 3 and NCSA HDF, is available. At the time of writing, built-in data manipulators include a basic phantom ray-tracer and a Matrox Genesis frame grabbing facility. A COM plug-in interface is provided for user-defined custom backprojector algorithms: a simple Feldkamp ActiveX control, including source code, is provided as an example; our fast Feldkamp plug-in is also available.
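
    As a toy analogue of the phantom-to-projections-to-reconstruction pipeline the toolkit wraps, the sketch below builds a Shepp-Logan phantom, projects it, and reconstructs it with filtered backprojection using scikit-image. This is 2D parallel-beam FBP, not the cone-beam Feldkamp algorithm implemented by the toolkit's plug-ins.

        # Minimal sketch: phantom -> projections -> filtered backprojection,
        # a 2-D parallel-beam analogue of the cone-beam pipeline described above.
        import numpy as np
        from skimage.data import shepp_logan_phantom
        from skimage.transform import radon, iradon, rescale

        phantom = rescale(shepp_logan_phantom(), 0.25)      # small analytical phantom
        theta = np.linspace(0.0, 180.0, 180, endpoint=False)
        sinogram = radon(phantom, theta=theta)              # simulated projection set
        recon = iradon(sinogram, theta=theta)               # default ramp filter
        err = np.sqrt(np.mean((recon - phantom) ** 2))
        print("reconstruction RMS error:", round(float(err), 4))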

  19. Pre-Hardware Optimization of Spacecraft Image Processing Algorithms and Hardware Implementation

    NASA Technical Reports Server (NTRS)

    Kizhner, Semion; Petrick, David J.; Flatley, Thomas P.; Hestnes, Phyllis; Jentoft-Nilsen, Marit; Day, John H. (Technical Monitor)

    2002-01-01

    Spacecraft telemetry rates and telemetry product complexity have steadily increased over the last decade, presenting a problem for real-time processing by ground facilities. This paper proposes a solution to a related problem for the Geostationary Operational Environmental Satellite (GOES-8) image data processing and color picture generation application. Although large supercomputer facilities are the obvious heritage solution, they are very costly, making it imperative to seek a feasible alternative engineering solution at a fraction of the cost. The proposed solution is based on a Personal Computer (PC) platform and a synergy of optimized software algorithms and reconfigurable computing (RC) hardware technologies, such as Field Programmable Gate Arrays (FPGA) and Digital Signal Processors (DSP). It has been shown that this approach can provide superior, inexpensive performance for a chosen application on the ground station or on board a spacecraft.

  20. Using supercomputers for the time history analysis of old gravity dams

    NASA Astrophysics Data System (ADS)

    Rouve, G.; Peters, A.

    Some of the old masonry dams that were built in Germany at the beginning of this century are a matter of concern today. In the course of time, certain deterioration caused or amplified by aging has appeared and raised questions about the safety of these old dams. The Finite Element Method, which in the past two decades has found widespread application, offers a suitable tool to re-evaluate the safety of these old gravity dams. The reliability of the results, however, strongly depends on knowledge of the material parameters. Using historical records and observations, a numerical back-analysis model has been developed to simulate the behaviour of these old masonry structures and to estimate their material properties by calibration. Only an implementation on a fourth-generation vector computer made the application of this large model possible in practice.
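
    Back-analysis by calibration can be illustrated with a stand-in forward model: guess material parameters, predict the observed response, and adjust the parameters to minimize the misfit against historical records. The one-line 'forward model', synthetic observations, and parameter bounds below are hypothetical; the actual study calibrates a large finite element model on a vector computer.

        # Minimal sketch: calibrating material parameters against historical
        # observations via least squares. The forward model here is a toy
        # stand-in, not a finite element analysis of a masonry dam.
        import numpy as np
        from scipy.optimize import least_squares

        def forward(params, loads):
            """Toy displacement model: a stiffness term plus a creep-like term."""
            stiffness, creep = params
            return loads / stiffness + creep * np.log1p(loads)

        rng = np.random.default_rng(3)
        loads = np.linspace(10.0, 100.0, 25)                     # recorded load cases
        true = np.array([50.0, 0.3])                             # "unknown" properties
        observed = forward(true, loads) + rng.normal(0.0, 0.02, loads.size)

        fit = least_squares(lambda p: forward(p, loads) - observed,
                            x0=[20.0, 0.1], bounds=([1.0, 0.0], [500.0, 5.0]))
        print("calibrated stiffness, creep:", fit.x.round(3))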
