Modular HPC I/O characterization with Darshan
DOE Office of Scientific and Technical Information (OSTI.GOV)
Snyder, Shane; Carns, Philip; Harms, Kevin
2016-11-13
Contemporary high-performance computing (HPC) applications encompass a broad range of distinct I/O strategies and are often executed on a number of different compute platforms in their lifetime. These large-scale HPC platforms employ increasingly complex I/O subsystems to provide a suitable level of I/O performance to applications. Tuning I/O workloads for such a system is nontrivial, and the results generally are not portable to other HPC systems. I/O profiling tools can help to address this challenge, but most existing tools only instrument specific components within the I/O subsystem that provide a limited perspective on I/O performance. The increasing diversity of scientificmore » applications and computing platforms calls for greater flexibililty and scope in I/O characterization.« less
On the Impact of Execution Models: A Case Study in Computational Chemistry
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chavarría-Miranda, Daniel; Halappanavar, Mahantesh; Krishnamoorthy, Sriram
2015-05-25
Efficient utilization of high-performance computing (HPC) platforms is an important and complex problem. Execution models, abstract descriptions of the dynamic runtime behavior of the execution stack, have significant impact on the utilization of HPC systems. Using a computational chemistry kernel as a case study and a wide variety of execution models combined with load balancing techniques, we explore the impact of execution models on the utilization of an HPC system. We demonstrate a 50 percent improvement in performance by using work stealing relative to a more traditional static scheduling approach. We also use a novel semi-matching technique for load balancingmore » that has comparable performance to a traditional hypergraph-based partitioning implementation, which is computationally expensive. Using this study, we found that execution model design choices and assumptions can limit critical optimizations such as global, dynamic load balancing and finding the correct balance between available work units and different system and runtime overheads. With the emergence of multi- and many-core architectures and the consequent growth in the complexity of HPC platforms, we believe that these lessons will be beneficial to researchers tuning diverse applications on modern HPC platforms, especially on emerging dynamic platforms with energy-induced performance variability.« less
RAPPORT: running scientific high-performance computing applications on the cloud.
Cohen, Jeremy; Filippis, Ioannis; Woodbridge, Mark; Bauer, Daniela; Hong, Neil Chue; Jackson, Mike; Butcher, Sarah; Colling, David; Darlington, John; Fuchs, Brian; Harvey, Matt
2013-01-28
Cloud computing infrastructure is now widely used in many domains, but one area where there has been more limited adoption is research computing, in particular for running scientific high-performance computing (HPC) software. The Robust Application Porting for HPC in the Cloud (RAPPORT) project took advantage of existing links between computing researchers and application scientists in the fields of bioinformatics, high-energy physics (HEP) and digital humanities, to investigate running a set of scientific HPC applications from these domains on cloud infrastructure. In this paper, we focus on the bioinformatics and HEP domains, describing the applications and target cloud platforms. We conclude that, while there are many factors that need consideration, there is no fundamental impediment to the use of cloud infrastructure for running many types of HPC applications and, in some cases, there is potential for researchers to benefit significantly from the flexibility offered by cloud platforms.
An Application-Based Performance Evaluation of NASAs Nebula Cloud Computing Platform
NASA Technical Reports Server (NTRS)
Saini, Subhash; Heistand, Steve; Jin, Haoqiang; Chang, Johnny; Hood, Robert T.; Mehrotra, Piyush; Biswas, Rupak
2012-01-01
The high performance computing (HPC) community has shown tremendous interest in exploring cloud computing as it promises high potential. In this paper, we examine the feasibility, performance, and scalability of production quality scientific and engineering applications of interest to NASA on NASA's cloud computing platform, called Nebula, hosted at Ames Research Center. This work represents the comprehensive evaluation of Nebula using NUTTCP, HPCC, NPB, I/O, and MPI function benchmarks as well as four applications representative of the NASA HPC workload. Specifically, we compare Nebula performance on some of these benchmarks and applications to that of NASA s Pleiades supercomputer, a traditional HPC system. We also investigate the impact of virtIO and jumbo frames on interconnect performance. Overall results indicate that on Nebula (i) virtIO and jumbo frames improve network bandwidth by a factor of 5x, (ii) there is a significant virtualization layer overhead of about 10% to 25%, (iii) write performance is lower by a factor of 25x, (iv) latency for short MPI messages is very high, and (v) overall performance is 15% to 48% lower than that on Pleiades for NASA HPC applications. We also comment on the usability of the cloud platform.
Canon, Shane
2018-01-24
DOE JGI's Zhong Wang, chair of the High-performance Computing session, gives a brief introduction before Berkeley Lab's Shane Canon talks about "Exploiting HPC Platforms for Metagenomics: Challenges and Opportunities" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.
Analysis of scalability of high-performance 3D image processing platform for virtual colonoscopy
NASA Astrophysics Data System (ADS)
Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli
2014-03-01
One of the key challenges in three-dimensional (3D) medical imaging is to enable the fast turn-around time, which is often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that need to be processed. For this purpose, we previously developed a software platform for high-performance 3D medical image processing, called HPC 3D-MIP platform, which employs increasingly available and affordable commodity computing systems such as the multicore, cluster, and cloud computing systems. To achieve scalable high-performance computing, the platform employed size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D-MIP algorithms, supported task scheduling for efficient load distribution and balancing, and consisted of a layered parallel software libraries that allow image processing applications to share the common functionalities. We evaluated the performance of the HPC 3D-MIP platform by applying it to computationally intensive processes in virtual colonoscopy. Experimental results showed a 12-fold performance improvement on a workstation with 12-core CPUs over the original sequential implementation of the processes, indicating the efficiency of the platform. Analysis of performance scalability based on the Amdahl's law for symmetric multicore chips showed the potential of a high performance scalability of the HPC 3DMIP platform when a larger number of cores is available.
Computing Platforms for Big Biological Data Analytics: Perspectives and Challenges.
Yin, Zekun; Lan, Haidong; Tan, Guangming; Lu, Mian; Vasilakos, Athanasios V; Liu, Weiguo
2017-01-01
The last decade has witnessed an explosion in the amount of available biological sequence data, due to the rapid progress of high-throughput sequencing projects. However, the biological data amount is becoming so great that traditional data analysis platforms and methods can no longer meet the need to rapidly perform data analysis tasks in life sciences. As a result, both biologists and computer scientists are facing the challenge of gaining a profound insight into the deepest biological functions from big biological data. This in turn requires massive computational resources. Therefore, high performance computing (HPC) platforms are highly needed as well as efficient and scalable algorithms that can take advantage of these platforms. In this paper, we survey the state-of-the-art HPC platforms for big biological data analytics. We first list the characteristics of big biological data and popular computing platforms. Then we provide a taxonomy of different biological data analysis applications and a survey of the way they have been mapped onto various computing platforms. After that, we present a case study to compare the efficiency of different computing platforms for handling the classical biological sequence alignment problem. At last we discuss the open issues in big biological data analytics.
Toward a Proof of Concept Cloud Framework for Physics Applications on Blue Gene Supercomputers
NASA Astrophysics Data System (ADS)
Dreher, Patrick; Scullin, William; Vouk, Mladen
2015-09-01
Traditional high performance supercomputers are capable of delivering large sustained state-of-the-art computational resources to physics applications over extended periods of time using batch processing mode operating environments. However, today there is an increasing demand for more complex workflows that involve large fluctuations in the levels of HPC physics computational requirements during the simulations. Some of the workflow components may also require a richer set of operating system features and schedulers than normally found in a batch oriented HPC environment. This paper reports on progress toward a proof of concept design that implements a cloud framework onto BG/P and BG/Q platforms at the Argonne Leadership Computing Facility. The BG/P implementation utilizes the Kittyhawk utility and the BG/Q platform uses an experimental heterogeneous FusedOS operating system environment. Both platforms use the Virtual Computing Laboratory as the cloud computing system embedded within the supercomputer. This proof of concept design allows a cloud to be configured so that it can capitalize on the specialized infrastructure capabilities of a supercomputer and the flexible cloud configurations without resorting to virtualization. Initial testing of the proof of concept system is done using the lattice QCD MILC code. These types of user reconfigurable environments have the potential to deliver experimental schedulers and operating systems within a working HPC environment for physics computations that may be different from the native OS and schedulers on production HPC supercomputers.
2012-09-30
platform (HPC) was developed, called the HPC-Acoustic Data Accelerator, or HPC-ADA for short. The HPC-ADA was designed based on fielded systems [1-4...software (Detection cLassificaiton for MAchine learning - High Peformance Computing). The software package was designed to utilize parallel and...Sedna [7] and is designed using a parallel architecture2, allowing existing algorithms to distribute to the various processing nodes with minimal changes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huang, Renke; Jin, Shuangshuang; Chen, Yousu
This paper presents a faster-than-real-time dynamic simulation software package that is designed for large-size power system dynamic simulation. It was developed on the GridPACKTM high-performance computing (HPC) framework. The key features of the developed software package include (1) faster-than-real-time dynamic simulation for a WECC system (17,000 buses) with different types of detailed generator, controller, and relay dynamic models, (2) a decoupled parallel dynamic simulation algorithm with optimized computation architecture to better leverage HPC resources and technologies, (3) options for HPC-based linear and iterative solvers, (4) hidden HPC details, such as data communication and distribution, to enable development centered on mathematicalmore » models and algorithms rather than on computational details for power system researchers, and (5) easy integration of new dynamic models and related algorithms into the software package.« less
Using Performance Tools to Support Experiments in HPC Resilience
DOE Office of Scientific and Technical Information (OSTI.GOV)
Naughton, III, Thomas J; Boehm, Swen; Engelmann, Christian
2014-01-01
The high performance computing (HPC) community is working to address fault tolerance and resilience concerns for current and future large scale computing platforms. This is driving enhancements in the programming environ- ments, specifically research on enhancing message passing libraries to support fault tolerant computing capabilities. The community has also recognized that tools for resilience experimentation are greatly lacking. However, we argue that there are several parallels between performance tools and resilience tools . As such, we believe the rich set of HPC performance-focused tools can be extended (repurposed) to benefit the resilience community. In this paper, we describe the initialmore » motivation to leverage standard HPC per- formance analysis techniques to aid in developing diagnostic tools to assist fault tolerance experiments for HPC applications. These diagnosis procedures help to provide context for the system when the errors (failures) occurred. We describe our initial work in leveraging an MPI performance trace tool to assist in provid- ing global context during fault injection experiments. Such tools will assist the HPC resilience community as they extend existing and new application codes to support fault tolerances.« less
Computational Science and Innovation
NASA Astrophysics Data System (ADS)
Dean, D. J.
2011-09-01
Simulations - utilizing computers to solve complicated science and engineering problems - are a key ingredient of modern science. The U.S. Department of Energy (DOE) is a world leader in the development of high-performance computing (HPC), the development of applied math and algorithms that utilize the full potential of HPC platforms, and the application of computing to science and engineering problems. An interesting general question is whether the DOE can strategically utilize its capability in simulations to advance innovation more broadly. In this article, I will argue that this is certainly possible.
NASA Astrophysics Data System (ADS)
Evans, B. J. K.; Pugh, T.; Wyborn, L. A.; Porter, D.; Allen, C.; Smillie, J.; Antony, J.; Trenham, C.; Evans, B. J.; Beckett, D.; Erwin, T.; King, E.; Hodge, J.; Woodcock, R.; Fraser, R.; Lescinsky, D. T.
2014-12-01
The National Computational Infrastructure (NCI) has co-located a priority set of national data assets within a HPC research platform. This powerful in-situ computational platform has been created to help serve and analyse the massive amounts of data across the spectrum of environmental collections - in particular the climate, observational data and geoscientific domains. This paper examines the infrastructure, innovation and opportunity for this significant research platform. NCI currently manages nationally significant data collections (10+ PB) categorised as 1) earth system sciences, climate and weather model data assets and products, 2) earth and marine observations and products, 3) geosciences, 4) terrestrial ecosystem, 5) water management and hydrology, and 6) astronomy, social science and biosciences. The data is largely sourced from the NCI partners (who include the custodians of many of the national scientific records), major research communities, and collaborating overseas organisations. By co-locating these large valuable data assets, new opportunities have arisen by harmonising the data collections, making a powerful transdisciplinary research platformThe data is accessible within an integrated HPC-HPD environment - a 1.2 PFlop supercomputer (Raijin), a HPC class 3000 core OpenStack cloud system and several highly connected large scale and high-bandwidth Lustre filesystems. New scientific software, cloud-scale techniques, server-side visualisation and data services have been harnessed and integrated into the platform, so that analysis is performed seamlessly across the traditional boundaries of the underlying data domains. Characterisation of the techniques along with performance profiling ensures scalability of each software component, all of which can either be enhanced or replaced through future improvements. A Development-to-Operations (DevOps) framework has also been implemented to manage the scale of the software complexity alone. This ensures that software is both upgradable and maintainable, and can be readily reused with complexly integrated systems and become part of the growing global trusted community tools for cross-disciplinary research.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lingerfelt, Eric J; Endeve, Eirik; Hui, Yawei
Improvements in scientific instrumentation allow imaging at mesoscopic to atomic length scales, many spectroscopic modes, and now--with the rise of multimodal acquisition systems and the associated processing capability--the era of multidimensional, informationally dense data sets has arrived. Technical issues in these combinatorial scientific fields are exacerbated by computational challenges best summarized as a necessity for drastic improvement in the capability to transfer, store, and analyze large volumes of data. The Bellerophon Environment for Analysis of Materials (BEAM) platform provides material scientists the capability to directly leverage the integrated computational and analytical power of High Performance Computing (HPC) to perform scalablemore » data analysis and simulation and manage uploaded data files via an intuitive, cross-platform client user interface. This framework delivers authenticated, "push-button" execution of complex user workflows that deploy data analysis algorithms and computational simulations utilizing compute-and-data cloud infrastructures and HPC environments like Titan at the Oak Ridge Leadershp Computing Facility (OLCF).« less
Pinthong, Watthanai; Muangruen, Panya
2016-01-01
Development of high-throughput technologies, such as Next-generation sequencing, allows thousands of experiments to be performed simultaneously while reducing resource requirement. Consequently, a massive amount of experiment data is now rapidly generated. Nevertheless, the data are not readily usable or meaningful until they are further analysed and interpreted. Due to the size of the data, a high performance computer (HPC) is required for the analysis and interpretation. However, the HPC is expensive and difficult to access. Other means were developed to allow researchers to acquire the power of HPC without a need to purchase and maintain one such as cloud computing services and grid computing system. In this study, we implemented grid computing in a computer training center environment using Berkeley Open Infrastructure for Network Computing (BOINC) as a job distributor and data manager combining all desktop computers to virtualize the HPC. Fifty desktop computers were used for setting up a grid system during the off-hours. In order to test the performance of the grid system, we adapted the Basic Local Alignment Search Tools (BLAST) to the BOINC system. Sequencing results from Illumina platform were aligned to the human genome database by BLAST on the grid system. The result and processing time were compared to those from a single desktop computer and HPC. The estimated durations of BLAST analysis for 4 million sequence reads on a desktop PC, HPC and the grid system were 568, 24 and 5 days, respectively. Thus, the grid implementation of BLAST by BOINC is an efficient alternative to the HPC for sequence alignment. The grid implementation by BOINC also helped tap unused computing resources during the off-hours and could be easily modified for other available bioinformatics software. PMID:27547555
Platform for Automated Real-Time High Performance Analytics on Medical Image Data.
Allen, William J; Gabr, Refaat E; Tefera, Getaneh B; Pednekar, Amol S; Vaughn, Matthew W; Narayana, Ponnada A
2018-03-01
Biomedical data are quickly growing in volume and in variety, providing clinicians an opportunity for better clinical decision support. Here, we demonstrate a robust platform that uses software automation and high performance computing (HPC) resources to achieve real-time analytics of clinical data, specifically magnetic resonance imaging (MRI) data. We used the Agave application programming interface to facilitate communication, data transfer, and job control between an MRI scanner and an off-site HPC resource. In this use case, Agave executed the graphical pipeline tool GRAphical Pipeline Environment (GRAPE) to perform automated, real-time, quantitative analysis of MRI scans. Same-session image processing will open the door for adaptive scanning and real-time quality control, potentially accelerating the discovery of pathologies and minimizing patient callbacks. We envision this platform can be adapted to other medical instruments, HPC resources, and analytics tools.
The Design of a High Performance Earth Imagery and Raster Data Management and Processing Platform
NASA Astrophysics Data System (ADS)
Xie, Qingyun
2016-06-01
This paper summarizes the general requirements and specific characteristics of both geospatial raster database management system and raster data processing platform from a domain-specific perspective as well as from a computing point of view. It also discusses the need of tight integration between the database system and the processing system. These requirements resulted in Oracle Spatial GeoRaster, a global scale and high performance earth imagery and raster data management and processing platform. The rationale, design, implementation, and benefits of Oracle Spatial GeoRaster are described. Basically, as a database management system, GeoRaster defines an integrated raster data model, supports image compression, data manipulation, general and spatial indices, content and context based queries and updates, versioning, concurrency, security, replication, standby, backup and recovery, multitenancy, and ETL. It provides high scalability using computer and storage clustering. As a raster data processing platform, GeoRaster provides basic operations, image processing, raster analytics, and data distribution featuring high performance computing (HPC). Specifically, HPC features include locality computing, concurrent processing, parallel processing, and in-memory computing. In addition, the APIs and the plug-in architecture are discussed.
NASA Astrophysics Data System (ADS)
Valasek, Lukas; Glasa, Jan
2017-12-01
Current fire simulation systems are capable to utilize advantages of high-performance computer (HPC) platforms available and to model fires efficiently in parallel. In this paper, efficiency of a corridor fire simulation on a HPC computer cluster is discussed. The parallel MPI version of Fire Dynamics Simulator is used for testing efficiency of selected strategies of allocation of computational resources of the cluster using a greater number of computational cores. Simulation results indicate that if the number of cores used is not equal to a multiple of the total number of cluster node cores there are allocation strategies which provide more efficient calculations.
ERIC Educational Resources Information Center
Abuzaghleh, Omar; Goldschmidt, Kathleen; Elleithy, Yasser; Lee, Jeongkyu
2013-01-01
With the advances in computing power, high-performance computing (HPC) platforms have had an impact on not only scientific research in advanced organizations but also computer science curriculum in the educational community. For example, multicore programming and parallel systems are highly desired courses in the computer science major. However,…
NASA Astrophysics Data System (ADS)
Zhu, Dehua; Echendu, Shirley; Xuan, Yunqing; Webster, Mike; Cluckie, Ian
2016-11-01
Impact-focused studies of extreme weather require coupling of accurate simulations of weather and climate systems and impact-measuring hydrological models which themselves demand larger computer resources. In this paper, we present a preliminary analysis of a high-performance computing (HPC)-based hydrological modelling approach, which is aimed at utilizing and maximizing HPC power resources, to support the study on extreme weather impact due to climate change. Here, four case studies are presented through implementation on the HPC Wales platform of the UK mesoscale meteorological Unified Model (UM) with high-resolution simulation suite UKV, alongside a Linux-based hydrological model, Hydrological Predictions for the Environment (HYPE). The results of this study suggest that the coupled hydro-meteorological model was still able to capture the major flood peaks, compared with the conventional gauge- or radar-driving forecast, but with the added value of much extended forecast lead time. The high-resolution rainfall estimation produced by the UKV performs similarly to that of radar rainfall products in the first 2-3 days of tested flood events, but the uncertainties particularly increased as the forecast horizon goes beyond 3 days. This study takes a step forward to identify how the online mode approach can be used, where both numerical weather prediction and the hydrological model are executed, either simultaneously or on the same hardware infrastructures, so that more effective interaction and communication can be achieved and maintained between the models. But the concluding comments are that running the entire system on a reasonably powerful HPC platform does not yet allow for real-time simulations, even without the most complex and demanding data simulation part.
BrainFrame: a node-level heterogeneous accelerator platform for neuron simulations
NASA Astrophysics Data System (ADS)
Smaragdos, Georgios; Chatzikonstantis, Georgios; Kukreja, Rahul; Sidiropoulos, Harry; Rodopoulos, Dimitrios; Sourdis, Ioannis; Al-Ars, Zaid; Kachris, Christoforos; Soudris, Dimitrios; De Zeeuw, Chris I.; Strydis, Christos
2017-12-01
Objective. The advent of high-performance computing (HPC) in recent years has led to its increasing use in brain studies through computational models. The scale and complexity of such models are constantly increasing, leading to challenging computational requirements. Even though modern HPC platforms can often deal with such challenges, the vast diversity of the modeling field does not permit for a homogeneous acceleration platform to effectively address the complete array of modeling requirements. Approach. In this paper we propose and build BrainFrame, a heterogeneous acceleration platform that incorporates three distinct acceleration technologies, an Intel Xeon-Phi CPU, a NVidia GP-GPU and a Maxeler Dataflow Engine. The PyNN software framework is also integrated into the platform. As a challenging proof of concept, we analyze the performance of BrainFrame on different experiment instances of a state-of-the-art neuron model, representing the inferior-olivary nucleus using a biophysically-meaningful, extended Hodgkin-Huxley representation. The model instances take into account not only the neuronal-network dimensions but also different network-connectivity densities, which can drastically affect the workload’s performance characteristics. Main results. The combined use of different HPC technologies demonstrates that BrainFrame is better able to cope with the modeling diversity encountered in realistic experiments while at the same time running on significantly lower energy budgets. Our performance analysis clearly shows that the model directly affects performance and all three technologies are required to cope with all the model use cases. Significance. The BrainFrame framework is designed to transparently configure and select the appropriate back-end accelerator technology for use per simulation run. The PyNN integration provides a familiar bridge to the vast number of models already available. Additionally, it gives a clear roadmap for extending the platform support beyond the proof of concept, with improved usability and directly useful features to the computational-neuroscience community, paving the way for wider adoption.
BrainFrame: a node-level heterogeneous accelerator platform for neuron simulations.
Smaragdos, Georgios; Chatzikonstantis, Georgios; Kukreja, Rahul; Sidiropoulos, Harry; Rodopoulos, Dimitrios; Sourdis, Ioannis; Al-Ars, Zaid; Kachris, Christoforos; Soudris, Dimitrios; De Zeeuw, Chris I; Strydis, Christos
2017-12-01
The advent of high-performance computing (HPC) in recent years has led to its increasing use in brain studies through computational models. The scale and complexity of such models are constantly increasing, leading to challenging computational requirements. Even though modern HPC platforms can often deal with such challenges, the vast diversity of the modeling field does not permit for a homogeneous acceleration platform to effectively address the complete array of modeling requirements. In this paper we propose and build BrainFrame, a heterogeneous acceleration platform that incorporates three distinct acceleration technologies, an Intel Xeon-Phi CPU, a NVidia GP-GPU and a Maxeler Dataflow Engine. The PyNN software framework is also integrated into the platform. As a challenging proof of concept, we analyze the performance of BrainFrame on different experiment instances of a state-of-the-art neuron model, representing the inferior-olivary nucleus using a biophysically-meaningful, extended Hodgkin-Huxley representation. The model instances take into account not only the neuronal-network dimensions but also different network-connectivity densities, which can drastically affect the workload's performance characteristics. The combined use of different HPC technologies demonstrates that BrainFrame is better able to cope with the modeling diversity encountered in realistic experiments while at the same time running on significantly lower energy budgets. Our performance analysis clearly shows that the model directly affects performance and all three technologies are required to cope with all the model use cases. The BrainFrame framework is designed to transparently configure and select the appropriate back-end accelerator technology for use per simulation run. The PyNN integration provides a familiar bridge to the vast number of models already available. Additionally, it gives a clear roadmap for extending the platform support beyond the proof of concept, with improved usability and directly useful features to the computational-neuroscience community, paving the way for wider adoption.
Automating NEURON Simulation Deployment in Cloud Resources.
Stockton, David B; Santamaria, Fidel
2017-01-01
Simulations in neuroscience are performed on local servers or High Performance Computing (HPC) facilities. Recently, cloud computing has emerged as a potential computational platform for neuroscience simulation. In this paper we compare and contrast HPC and cloud resources for scientific computation, then report how we deployed NEURON, a widely used simulator of neuronal activity, in three clouds: Chameleon Cloud, a hybrid private academic cloud for cloud technology research based on the OpenStack software; Rackspace, a public commercial cloud, also based on OpenStack; and Amazon Elastic Cloud Computing, based on Amazon's proprietary software. We describe the manual procedures and how to automate cloud operations. We describe extending our simulation automation software called NeuroManager (Stockton and Santamaria, Frontiers in Neuroinformatics, 2015), so that the user is capable of recruiting private cloud, public cloud, HPC, and local servers simultaneously with a simple common interface. We conclude by performing several studies in which we examine speedup, efficiency, total session time, and cost for sets of simulations of a published NEURON model.
Automating NEURON Simulation Deployment in Cloud Resources
Santamaria, Fidel
2016-01-01
Simulations in neuroscience are performed on local servers or High Performance Computing (HPC) facilities. Recently, cloud computing has emerged as a potential computational platform for neuroscience simulation. In this paper we compare and contrast HPC and cloud resources for scientific computation, then report how we deployed NEURON, a widely used simulator of neuronal activity, in three clouds: Chameleon Cloud, a hybrid private academic cloud for cloud technology research based on the Open-Stack software; Rackspace, a public commercial cloud, also based on OpenStack; and Amazon Elastic Cloud Computing, based on Amazon’s proprietary software. We describe the manual procedures and how to automate cloud operations. We describe extending our simulation automation software called NeuroManager (Stockton and Santamaria, Frontiers in Neuroinformatics, 2015), so that the user is capable of recruiting private cloud, public cloud, HPC, and local servers simultaneously with a simple common interface. We conclude by performing several studies in which we examine speedup, efficiency, total session time, and cost for sets of simulations of a published NEURON model. PMID:27655341
Arc4nix: A cross-platform geospatial analytical library for cluster and cloud computing
NASA Astrophysics Data System (ADS)
Tang, Jingyin; Matyas, Corene J.
2018-02-01
Big Data in geospatial technology is a grand challenge for processing capacity. The ability to use a GIS for geospatial analysis on Cloud Computing and High Performance Computing (HPC) clusters has emerged as a new approach to provide feasible solutions. However, users lack the ability to migrate existing research tools to a Cloud Computing or HPC-based environment because of the incompatibility of the market-dominating ArcGIS software stack and Linux operating system. This manuscript details a cross-platform geospatial library "arc4nix" to bridge this gap. Arc4nix provides an application programming interface compatible with ArcGIS and its Python library "arcpy". Arc4nix uses a decoupled client-server architecture that permits geospatial analytical functions to run on the remote server and other functions to run on the native Python environment. It uses functional programming and meta-programming language to dynamically construct Python codes containing actual geospatial calculations, send them to a server and retrieve results. Arc4nix allows users to employ their arcpy-based script in a Cloud Computing and HPC environment with minimal or no modification. It also supports parallelizing tasks using multiple CPU cores and nodes for large-scale analyses. A case study of geospatial processing of a numerical weather model's output shows that arcpy scales linearly in a distributed environment. Arc4nix is open-source software.
Getting ready for petaflop capacities and beyond: a utility perspective
NASA Astrophysics Data System (ADS)
Hamelin, J. F.; Berthou, J. Y.
2008-07-01
Why should EDF, the leading producer and marketer of electricity in Europe, start adding teraflops to its terawatt-hours and become involved in high-performance computing (HPC)? In this paper we answer this question through examples of major opportunities that HPC brings to our business today and, we hope well into the future of petaflop and exaflop computing. Five cases are presented dealing with nondestructive testing, nuclear fuel management, mechanical behavior of nuclear fuel assemblies, water management, and energy management. For each case we show the benefits brought by HPC, describe the current level of numerical simulation performance, and discuss the perspectives for future steps. We also present the general background that explains why EDF is moving to this technology and briefly comment on the development of user-oriented simulation platforms.
First experience with particle-in-cell plasma physics code on ARM-based HPC systems
NASA Astrophysics Data System (ADS)
Sáez, Xavier; Soba, Alejandro; Sánchez, Edilberto; Mantsinen, Mervi; Mateo, Sergi; Cela, José M.; Castejón, Francisco
2015-09-01
In this work, we will explore the feasibility of porting a Particle-in-cell code (EUTERPE) to an ARM multi-core platform from the Mont-Blanc project. The used prototype is based on a system-on-chip Samsung Exynos 5 with an integrated GPU. It is the first prototype that could be used for High-Performance Computing (HPC), since it supports double precision and parallel programming languages.
Production experience with the ATLAS Event Service
NASA Astrophysics Data System (ADS)
Benjamin, D.; Calafiura, P.; Childers, T.; De, K.; Guan, W.; Maeno, T.; Nilsson, P.; Tsulaia, V.; Van Gemmeren, P.; Wenaus, T.; ATLAS Collaboration
2017-10-01
The ATLAS Event Service (AES) has been designed and implemented for efficient running of ATLAS production workflows on a variety of computing platforms, ranging from conventional Grid sites to opportunistic, often short-lived resources, such as spot market commercial clouds, supercomputers and volunteer computing. The Event Service architecture allows real time delivery of fine grained workloads to running payload applications which process dispatched events or event ranges and immediately stream the outputs to highly scalable Object Stores. Thanks to its agile and flexible architecture the AES is currently being used by grid sites for assigning low priority workloads to otherwise idle computing resources; similarly harvesting HPC resources in an efficient back-fill mode; and massively scaling out to the 50-100k concurrent core level on the Amazon spot market to efficiently utilize those transient resources for peak production needs. Platform ports in development include ATLAS@Home (BOINC) and the Google Compute Engine, and a growing number of HPC platforms. After briefly reviewing the concept and the architecture of the Event Service, we will report the status and experience gained in AES commissioning and production operations on supercomputers, and our plans for extending ES application beyond Geant4 simulation to other workflows, such as reconstruction and data analysis.
The HARNESS Workbench: Unified and Adaptive Access to Diverse HPC Platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sunderam, Vaidy S.
2012-03-20
The primary goal of the Harness WorkBench (HWB) project is to investigate innovative software environments that will help enhance the overall productivity of applications science on diverse HPC platforms. Two complementary frameworks were designed: one, a virtualized command toolkit for application building, deployment, and execution, that provides a common view across diverse HPC systems, in particular the DOE leadership computing platforms (Cray, IBM, SGI, and clusters); and two, a unified runtime environment that consolidates access to runtime services via an adaptive framework for execution-time and post processing activities. A prototype of the first was developed based on the concept ofmore » a 'system-call virtual machine' (SCVM), to enhance portability of the HPC application deployment process across heterogeneous high-end machines. The SCVM approach to portable builds is based on the insertion of toolkit-interpretable directives into original application build scripts. Modifications resulting from these directives preserve the semantics of the original build instruction flow. The execution of the build script is controlled by our toolkit that intercepts build script commands in a manner transparent to the end-user. We have applied this approach to a scientific production code (Gamess-US) on the Cray-XT5 machine. The second facet, termed Unibus, aims to facilitate provisioning and aggregation of multifaceted resources from resource providers and end-users perspectives. To achieve that, Unibus proposes a Capability Model and mediators (resource drivers) to virtualize access to diverse resources, and soft and successive conditioning to enable automatic and user-transparent resource provisioning. A proof of concept implementation has demonstrated the viability of this approach on high end machines, grid systems and computing clouds.« less
A high performance scientific cloud computing environment for materials simulations
NASA Astrophysics Data System (ADS)
Jorissen, K.; Vila, F. D.; Rehr, J. J.
2012-09-01
We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.
FLAME: A platform for high performance computing of complex systems, applied for three case studies
Kiran, Mariam; Bicak, Mesude; Maleki-Dizaji, Saeedeh; ...
2011-01-01
FLAME allows complex models to be automatically parallelised on High Performance Computing (HPC) grids enabling large number of agents to be simulated over short periods of time. Modellers are hindered by complexities of porting models on parallel platforms and time taken to run large simulations on a single machine, which FLAME overcomes. Three case studies from different disciplines were modelled using FLAME, and are presented along with their performance results on a grid.
Scalability improvements to NRLMOL for DFT calculations of large molecules
NASA Astrophysics Data System (ADS)
Diaz, Carlos Manuel
Advances in high performance computing (HPC) have provided a way to treat large, computationally demanding tasks using thousands of processors. With the development of more powerful HPC architectures, the need to create efficient and scalable code has grown more important. Electronic structure calculations are valuable in understanding experimental observations and are routinely used for new materials predictions. For the electronic structure calculations, the memory and computation time are proportional to the number of atoms. Memory requirements for these calculations scale as N2, where N is the number of atoms. While the recent advances in HPC offer platforms with large numbers of cores, the limited amount of memory available on a given node and poor scalability of the electronic structure code hinder their efficient usage of these platforms. This thesis will present some developments to overcome these bottlenecks in order to study large systems. These developments, which are implemented in the NRLMOL electronic structure code, involve the use of sparse matrix storage formats and the use of linear algebra using sparse and distributed matrices. These developments along with other related development now allow ground state density functional calculations using up to 25,000 basis functions and the excited state calculations using up to 17,000 basis functions while utilizing all cores on a node. An example on a light-harvesting triad molecule is described. Finally, future plans to further improve the scalability will be presented.
OCCAM: a flexible, multi-purpose and extendable HPC cluster
NASA Astrophysics Data System (ADS)
Aldinucci, M.; Bagnasco, S.; Lusso, S.; Pasteris, P.; Rabellino, S.; Vallero, S.
2017-10-01
The Open Computing Cluster for Advanced data Manipulation (OCCAM) is a multipurpose flexible HPC cluster designed and operated by a collaboration between the University of Torino and the Sezione di Torino of the Istituto Nazionale di Fisica Nucleare. It is aimed at providing a flexible, reconfigurable and extendable infrastructure to cater to a wide range of different scientific computing use cases, including ones from solid-state chemistry, high-energy physics, computer science, big data analytics, computational biology, genomics and many others. Furthermore, it will serve as a platform for R&D activities on computational technologies themselves, with topics ranging from GPU acceleration to Cloud Computing technologies. A heterogeneous and reconfigurable system like this poses a number of challenges related to the frequency at which heterogeneous hardware resources might change their availability and shareability status, which in turn affect methods and means to allocate, manage, optimize, bill, monitor VMs, containers, virtual farms, jobs, interactive bare-metal sessions, etc. This work describes some of the use cases that prompted the design and construction of the HPC cluster, its architecture and resource provisioning model, along with a first characterization of its performance by some synthetic benchmark tools and a few realistic use-case tests.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kang, Shujiang; Kline, Keith L; Nair, S. Surendran
A global energy crop productivity model that provides geospatially explicit quantitative details on biomass potential and factors affecting sustainability would be useful, but does not exist now. This study describes a modeling platform capable of meeting many challenges associated with global-scale agro-ecosystem modeling. We designed an analytical framework for bioenergy crops consisting of six major components: (i) standardized natural resources datasets, (ii) global field-trial data and crop management practices, (iii) simulation units and management scenarios, (iv) model calibration and validation, (v) high-performance computing (HPC) simulation, and (vi) simulation output processing and analysis. The HPC-Environmental Policy Integrated Climate (HPC-EPIC) model simulatedmore » a perennial bioenergy crop, switchgrass (Panicum virgatum L.), estimating feedstock production potentials and effects across the globe. This modeling platform can assess soil C sequestration, net greenhouse gas (GHG) emissions, nonpoint source pollution (e.g., nutrient and pesticide loss), and energy exchange with the atmosphere. It can be expanded to include additional bioenergy crops (e.g., miscanthus, energy cane, and agave) and food crops under different management scenarios. The platform and switchgrass field-trial dataset are available to support global analysis of biomass feedstock production potential and corresponding metrics of sustainability.« less
The Convergence of High Performance Computing and Large Scale Data Analytics
NASA Astrophysics Data System (ADS)
Duffy, D.; Bowen, M. K.; Thompson, J. H.; Yang, C. P.; Hu, F.; Wills, B.
2015-12-01
As the combinations of remote sensing observations and model outputs have grown, scientists are increasingly burdened with both the necessity and complexity of large-scale data analysis. Scientists are increasingly applying traditional high performance computing (HPC) solutions to solve their "Big Data" problems. While this approach has the benefit of limiting data movement, the HPC system is not optimized to run analytics, which can create problems that permeate throughout the HPC environment. To solve these issues and to alleviate some of the strain on the HPC environment, the NASA Center for Climate Simulation (NCCS) has created the Advanced Data Analytics Platform (ADAPT), which combines both HPC and cloud technologies to create an agile system designed for analytics. Large, commonly used data sets are stored in this system in a write once/read many file system, such as Landsat, MODIS, MERRA, and NGA. High performance virtual machines are deployed and scaled according to the individual scientist's requirements specifically for data analysis. On the software side, the NCCS and GMU are working with emerging commercial technologies and applying them to structured, binary scientific data in order to expose the data in new ways. Native NetCDF data is being stored within a Hadoop Distributed File System (HDFS) enabling storage-proximal processing through MapReduce while continuing to provide accessibility of the data to traditional applications. Once the data is stored within HDFS, an additional indexing scheme is built on top of the data and placed into a relational database. This spatiotemporal index enables extremely fast mappings of queries to data locations to dramatically speed up analytics. These are some of the first steps toward a single unified platform that optimizes for both HPC and large-scale data analysis, and this presentation will elucidate the resulting and necessary exascale architectures required for future systems.
Towards New Metrics for High-Performance Computing Resilience
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hukerikar, Saurabh; Ashraf, Rizwan A; Engelmann, Christian
Ensuring the reliability of applications is becoming an increasingly important challenge as high-performance computing (HPC) systems experience an ever-growing number of faults, errors and failures. While the HPC community has made substantial progress in developing various resilience solutions, it continues to rely on platform-based metrics to quantify application resiliency improvements. The resilience of an HPC application is concerned with the reliability of the application outcome as well as the fault handling efficiency. To understand the scope of impact, effective coverage and performance efficiency of existing and emerging resilience solutions, there is a need for new metrics. In this paper, wemore » develop new ways to quantify resilience that consider both the reliability and the performance characteristics of the solutions from the perspective of HPC applications. As HPC systems continue to evolve in terms of scale and complexity, it is expected that applications will experience various types of faults, errors and failures, which will require applications to apply multiple resilience solutions across the system stack. The proposed metrics are intended to be useful for understanding the combined impact of these solutions on an application's ability to produce correct results and to evaluate their overall impact on an application's performance in the presence of various modes of faults.« less
Implementation of Parallel Dynamic Simulation on Shared-Memory vs. Distributed-Memory Environments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Shuangshuang; Chen, Yousu; Wu, Di
2015-12-09
Power system dynamic simulation computes the system response to a sequence of large disturbance, such as sudden changes in generation or load, or a network short circuit followed by protective branch switching operation. It consists of a large set of differential and algebraic equations, which is computational intensive and challenging to solve using single-processor based dynamic simulation solution. High-performance computing (HPC) based parallel computing is a very promising technology to speed up the computation and facilitate the simulation process. This paper presents two different parallel implementations of power grid dynamic simulation using Open Multi-processing (OpenMP) on shared-memory platform, and Messagemore » Passing Interface (MPI) on distributed-memory clusters, respectively. The difference of the parallel simulation algorithms and architectures of the two HPC technologies are illustrated, and their performances for running parallel dynamic simulation are compared and demonstrated.« less
Seismic waveform modeling over cloud
NASA Astrophysics Data System (ADS)
Luo, Cong; Friederich, Wolfgang
2016-04-01
With the fast growing computational technologies, numerical simulation of seismic wave propagation achieved huge successes. Obtaining the synthetic waveforms through numerical simulation receives an increasing amount of attention from seismologists. However, computational seismology is a data-intensive research field, and the numerical packages usually come with a steep learning curve. Users are expected to master considerable amount of computer knowledge and data processing skills. Training users to use the numerical packages, correctly access and utilize the computational resources is a troubled task. In addition to that, accessing to HPC is also a common difficulty for many users. To solve these problems, a cloud based solution dedicated on shallow seismic waveform modeling has been developed with the state-of-the-art web technologies. It is a web platform integrating both software and hardware with multilayer architecture: a well designed SQL database serves as the data layer, HPC and dedicated pipeline for it is the business layer. Through this platform, users will no longer need to compile and manipulate various packages on the local machine within local network to perform a simulation. By providing users professional access to the computational code through its interfaces and delivering our computational resources to the users over cloud, users can customize the simulation at expert-level, submit and run the job through it.
Bethel, EW; Bauer, A; Abbasi, H; ...
2016-06-10
The considerable interest in the high performance computing (HPC) community regarding analyzing and visualization data without first writing to disk, i.e., in situ processing, is due to several factors. First is an I/O cost savings, where data is analyzed /visualized while being generated, without first storing to a filesystem. Second is the potential for increased accuracy, where fine temporal sampling of transient analysis might expose some complex behavior missed in coarse temporal sampling. Third is the ability to use all available resources, CPU’s and accelerators, in the computation of analysis products. This STAR paper brings together researchers, developers and practitionersmore » using in situ methods in extreme-scale HPC with the goal to present existing methods, infrastructures, and a range of computational science and engineering applications using in situ analysis and visualization.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dennig, Yasmin
Sandia National Laboratories has a long history of significant contributions to the high performance community and industry. Our innovative computer architectures allowed the United States to become the first to break the teraFLOP barrier—propelling us to the international spotlight. Our advanced simulation and modeling capabilities have been integral in high consequence US operations such as Operation Burnt Frost. Strong partnerships with industry leaders, such as Cray, Inc. and Goodyear, have enabled them to leverage our high performance computing (HPC) capabilities to gain a tremendous competitive edge in the marketplace. As part of our continuing commitment to providing modern computing infrastructuremore » and systems in support of Sandia missions, we made a major investment in expanding Building 725 to serve as the new home of HPC systems at Sandia. Work is expected to be completed in 2018 and will result in a modern facility of approximately 15,000 square feet of computer center space. The facility will be ready to house the newest National Nuclear Security Administration/Advanced Simulation and Computing (NNSA/ASC) Prototype platform being acquired by Sandia, with delivery in late 2019 or early 2020. This new system will enable continuing advances by Sandia science and engineering staff in the areas of operating system R&D, operation cost effectiveness (power and innovative cooling technologies), user environment and application code performance.« less
An Approach to Integrate a Space-Time GIS Data Model with High Performance Computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Dali; Zhao, Ziliang; Shaw, Shih-Lung
2011-01-01
In this paper, we describe an approach to integrate a Space-Time GIS data model on a high performance computing platform. The Space-Time GIS data model has been developed on a desktop computing environment. We use the Space-Time GIS data model to generate GIS module, which organizes a series of remote sensing data. We are in the process of porting the GIS module into an HPC environment, in which the GIS modules handle large dataset directly via parallel file system. Although it is an ongoing project, authors hope this effort can inspire further discussions on the integration of GIS on highmore » performance computing platforms.« less
Python and HPC for High Energy Physics Data Analyses
Sehrish, S.; Kowalkowski, J.; Paterno, M.; ...
2017-01-01
High level abstractions in Python that can utilize computing hardware well seem to be an attractive option for writing data reduction and analysis tasks. In this paper, we explore the features available in Python which are useful and efficient for end user analysis in High Energy Physics (HEP). A typical vertical slice of an HEP data analysis is somewhat fragmented: the state of the reduction/analysis process must be saved at certain stages to allow for selective reprocessing of only parts of a generally time-consuming workflow. Also, algorithms tend to to be modular because of the heterogeneous nature of most detectorsmore » and the need to analyze different parts of the detector separately before combining the information. This fragmentation causes difficulties for interactive data analysis, and as data sets increase in size and complexity (O10 TiB for a “small” neutrino experiment to the O10 PiB currently held by the CMS experiment at the LHC), data analysis methods traditional to the field must evolve to make optimum use of emerging HPC technologies and platforms. Mainstream big data tools, while suggesting a direction in terms of what can be done if an entire data set can be available across a system and analysed with high-level programming abstractions, are not designed with either scientific computing generally, or modern HPC platform features in particular, such as data caching levels, in mind. Our example HPC use case is a search for a new elementary particle which might explain the phenomenon known as “Dark Matter”. Here, using data from the CMS detector, we will use HDF5 as our input data format, and MPI with Python to implement our use case.« less
Could Blobs Fuel Storage-Based Convergence between HPC and Big Data?
DOE Office of Scientific and Technical Information (OSTI.GOV)
Matri, Pierre; Alforov, Yevhen; Brandon, Alvaro
The increasingly growing data sets processed on HPC platforms raise major challenges for the underlying storage layer. A promising alternative to POSIX-IO- compliant file systems are simpler blobs (binary large objects), or object storage systems. Such systems offer lower overhead and better performance at the cost of largely unused features such as file hierarchies or permissions. Similarly, blobs are increasingly considered for replacing distributed file systems for big data analytics or as a base for storage abstractions such as key-value stores or time-series databases. This growing interest in such object storage on HPC and big data platforms raises the question:more » Are blobs the right level of abstraction to enable storage-based convergence between HPC and Big Data? In this paper we study the impact of blob-based storage for real-world applications on HPC and cloud environments. The results show that blobbased storage convergence is possible, leading to a significant performance improvement on both platforms« less
1001 Ways to run AutoDock Vina for virtual screening
NASA Astrophysics Data System (ADS)
Jaghoori, Mohammad Mahdi; Bleijlevens, Boris; Olabarriaga, Silvia D.
2016-03-01
Large-scale computing technologies have enabled high-throughput virtual screening involving thousands to millions of drug candidates. It is not trivial, however, for biochemical scientists to evaluate the technical alternatives and their implications for running such large experiments. Besides experience with the molecular docking tool itself, the scientist needs to learn how to run it on high-performance computing (HPC) infrastructures, and understand the impact of the choices made. Here, we review such considerations for a specific tool, AutoDock Vina, and use experimental data to illustrate the following points: (1) an additional level of parallelization increases virtual screening throughput on a multi-core machine; (2) capturing of the random seed is not enough (though necessary) for reproducibility on heterogeneous distributed computing systems; (3) the overall time spent on the screening of a ligand library can be improved by analysis of factors affecting execution time per ligand, including number of active torsions, heavy atoms and exhaustiveness. We also illustrate differences among four common HPC infrastructures: grid, Hadoop, small cluster and multi-core (virtual machine on the cloud). Our analysis shows that these platforms are suitable for screening experiments of different sizes. These considerations can guide scientists when choosing the best computing platform and set-up for their future large virtual screening experiments.
Computational Aspects of Data Assimilation and the ESMF
NASA Technical Reports Server (NTRS)
daSilva, A.
2003-01-01
The scientific challenge of developing advanced data assimilation applications is a daunting task. Independently developed components may have incompatible interfaces or may be written in different computer languages. The high-performance computer (HPC) platforms required by numerically intensive Earth system applications are complex, varied, rapidly evolving and multi-part systems themselves. Since the market for high-end platforms is relatively small, there is little robust middleware available to buffer the modeler from the difficulties of HPC programming. To complicate matters further, the collaborations required to develop large Earth system applications often span initiatives, institutions and agencies, involve geoscience, software engineering, and computer science communities, and cross national borders.The Earth System Modeling Framework (ESMF) project is a concerted response to these challenges. Its goal is to increase software reuse, interoperability, ease of use and performance in Earth system models through the use of a common software framework, developed in an open manner by leaders in the modeling community. The ESMF addresses the technical and to some extent the cultural - aspects of Earth system modeling, laying the groundwork for addressing the more difficult scientific aspects, such as the physical compatibility of components, in the future. In this talk we will discuss the general philosophy and architecture of the ESMF, focussing on those capabilities useful for developing advanced data assimilation applications.
1001 Ways to run AutoDock Vina for virtual screening.
Jaghoori, Mohammad Mahdi; Bleijlevens, Boris; Olabarriaga, Silvia D
2016-03-01
Large-scale computing technologies have enabled high-throughput virtual screening involving thousands to millions of drug candidates. It is not trivial, however, for biochemical scientists to evaluate the technical alternatives and their implications for running such large experiments. Besides experience with the molecular docking tool itself, the scientist needs to learn how to run it on high-performance computing (HPC) infrastructures, and understand the impact of the choices made. Here, we review such considerations for a specific tool, AutoDock Vina, and use experimental data to illustrate the following points: (1) an additional level of parallelization increases virtual screening throughput on a multi-core machine; (2) capturing of the random seed is not enough (though necessary) for reproducibility on heterogeneous distributed computing systems; (3) the overall time spent on the screening of a ligand library can be improved by analysis of factors affecting execution time per ligand, including number of active torsions, heavy atoms and exhaustiveness. We also illustrate differences among four common HPC infrastructures: grid, Hadoop, small cluster and multi-core (virtual machine on the cloud). Our analysis shows that these platforms are suitable for screening experiments of different sizes. These considerations can guide scientists when choosing the best computing platform and set-up for their future large virtual screening experiments.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aguilo Valentin, Miguel Alejandro; Trujillo, Susie
During calendar year 2017, Sandia National Laboratories (SNL) made strides towards developing an open portable design platform rich in highperformance computing (HPC) enabled modeling, analysis and synthesis tools. The main focus was to lay the foundations of the core interfaces that will enable plug-n-play insertion of synthesis optimization technologies in the areas of modeling, analysis and synthesis.
The Generation Challenge Programme Platform: Semantic Standards and Workbench for Crop Science
Bruskiewich, Richard; Senger, Martin; Davenport, Guy; Ruiz, Manuel; Rouard, Mathieu; Hazekamp, Tom; Takeya, Masaru; Doi, Koji; Satoh, Kouji; Costa, Marcos; Simon, Reinhard; Balaji, Jayashree; Akintunde, Akinnola; Mauleon, Ramil; Wanchana, Samart; Shah, Trushar; Anacleto, Mylah; Portugal, Arllet; Ulat, Victor Jun; Thongjuea, Supat; Braak, Kyle; Ritter, Sebastian; Dereeper, Alexis; Skofic, Milko; Rojas, Edwin; Martins, Natalia; Pappas, Georgios; Alamban, Ryan; Almodiel, Roque; Barboza, Lord Hendrix; Detras, Jeffrey; Manansala, Kevin; Mendoza, Michael Jonathan; Morales, Jeffrey; Peralta, Barry; Valerio, Rowena; Zhang, Yi; Gregorio, Sergio; Hermocilla, Joseph; Echavez, Michael; Yap, Jan Michael; Farmer, Andrew; Schiltz, Gary; Lee, Jennifer; Casstevens, Terry; Jaiswal, Pankaj; Meintjes, Ayton; Wilkinson, Mark; Good, Benjamin; Wagner, James; Morris, Jane; Marshall, David; Collins, Anthony; Kikuchi, Shoshi; Metz, Thomas; McLaren, Graham; van Hintum, Theo
2008-01-01
The Generation Challenge programme (GCP) is a global crop research consortium directed toward crop improvement through the application of comparative biology and genetic resources characterization to plant breeding. A key consortium research activity is the development of a GCP crop bioinformatics platform to support GCP research. This platform includes the following: (i) shared, public platform-independent domain models, ontology, and data formats to enable interoperability of data and analysis flows within the platform; (ii) web service and registry technologies to identify, share, and integrate information across diverse, globally dispersed data sources, as well as to access high-performance computational (HPC) facilities for computationally intensive, high-throughput analyses of project data; (iii) platform-specific middleware reference implementations of the domain model integrating a suite of public (largely open-access/-source) databases and software tools into a workbench to facilitate biodiversity analysis, comparative analysis of crop genomic data, and plant breeding decision making. PMID:18483570
Large-scale parallel genome assembler over cloud computing environment.
Das, Arghya Kusum; Koppa, Praveen Kumar; Goswami, Sayan; Platania, Richard; Park, Seung-Jong
2017-06-01
The size of high throughput DNA sequencing data has already reached the terabyte scale. To manage this huge volume of data, many downstream sequencing applications started using locality-based computing over different cloud infrastructures to take advantage of elastic (pay as you go) resources at a lower cost. However, the locality-based programming model (e.g. MapReduce) is relatively new. Consequently, developing scalable data-intensive bioinformatics applications using this model and understanding the hardware environment that these applications require for good performance, both require further research. In this paper, we present a de Bruijn graph oriented Parallel Giraph-based Genome Assembler (GiGA), as well as the hardware platform required for its optimal performance. GiGA uses the power of Hadoop (MapReduce) and Giraph (large-scale graph analysis) to achieve high scalability over hundreds of compute nodes by collocating the computation and data. GiGA achieves significantly higher scalability with competitive assembly quality compared to contemporary parallel assemblers (e.g. ABySS and Contrail) over traditional HPC cluster. Moreover, we show that the performance of GiGA is significantly improved by using an SSD-based private cloud infrastructure over traditional HPC cluster. We observe that the performance of GiGA on 256 cores of this SSD-based cloud infrastructure closely matches that of 512 cores of traditional HPC cluster.
Towards Portable Large-Scale Image Processing with High-Performance Computing.
Huo, Yuankai; Blaber, Justin; Damon, Stephen M; Boyd, Brian D; Bao, Shunxing; Parvathaneni, Prasanna; Noguera, Camilo Bermudez; Chaganti, Shikha; Nath, Vishwesh; Greer, Jasmine M; Lyu, Ilwoo; French, William R; Newton, Allen T; Rogers, Baxter P; Landman, Bennett A
2018-05-03
High-throughput, large-scale medical image computing demands tight integration of high-performance computing (HPC) infrastructure for data storage, job distribution, and image processing. The Vanderbilt University Institute for Imaging Science (VUIIS) Center for Computational Imaging (CCI) has constructed a large-scale image storage and processing infrastructure that is composed of (1) a large-scale image database using the eXtensible Neuroimaging Archive Toolkit (XNAT), (2) a content-aware job scheduling platform using the Distributed Automation for XNAT pipeline automation tool (DAX), and (3) a wide variety of encapsulated image processing pipelines called "spiders." The VUIIS CCI medical image data storage and processing infrastructure have housed and processed nearly half-million medical image volumes with Vanderbilt Advanced Computing Center for Research and Education (ACCRE), which is the HPC facility at the Vanderbilt University. The initial deployment was natively deployed (i.e., direct installations on a bare-metal server) within the ACCRE hardware and software environments, which lead to issues of portability and sustainability. First, it could be laborious to deploy the entire VUIIS CCI medical image data storage and processing infrastructure to another HPC center with varying hardware infrastructure, library availability, and software permission policies. Second, the spiders were not developed in an isolated manner, which has led to software dependency issues during system upgrades or remote software installation. To address such issues, herein, we describe recent innovations using containerization techniques with XNAT/DAX which are used to isolate the VUIIS CCI medical image data storage and processing infrastructure from the underlying hardware and software environments. The newly presented XNAT/DAX solution has the following new features: (1) multi-level portability from system level to the application level, (2) flexible and dynamic software development and expansion, and (3) scalable spider deployment compatible with HPC clusters and local workstations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duro, Francisco Rodrigo; Blas, Javier Garcia; Isaila, Florin
The increasing volume of scientific data and the limited scalability and performance of storage systems are currently presenting a significant limitation for the productivity of the scientific workflows running on both high-performance computing (HPC) and cloud platforms. Clearly needed is better integration of storage systems and workflow engines to address this problem. This paper presents and evaluates a novel solution that leverages codesign principles for integrating Hercules—an in-memory data store—with a workflow management system. We consider four main aspects: workflow representation, task scheduling, task placement, and task termination. As a result, the experimental evaluation on both cloud and HPC systemsmore » demonstrates significant performance and scalability improvements over existing state-of-the-art approaches.« less
Self-service for software development projects and HPC activities
NASA Astrophysics Data System (ADS)
Husejko, M.; Høimyr, N.; Gonzalez, A.; Koloventzos, G.; Asbury, D.; Trzcinska, A.; Agtzidis, I.; Botrel, G.; Otto, J.
2014-05-01
This contribution describes how CERN has implemented several essential tools for agile software development processes, ranging from version control (Git) to issue tracking (Jira) and documentation (Wikis). Running such services in a large organisation like CERN requires many administrative actions both by users and service providers, such as creating software projects, managing access rights, users and groups, and performing tool-specific customisation. Dealing with these requests manually would be a time-consuming task. Another area of our CERN computing services that has required dedicated manual support has been clusters for specific user communities with special needs. Our aim is to move all our services to a layered approach, with server infrastructure running on the internal cloud computing infrastructure at CERN. This contribution illustrates how we plan to optimise the management of our of services by means of an end-user facing platform acting as a portal into all the related services for software projects, inspired by popular portals for open-source developments such as Sourceforge, GitHub and others. Furthermore, the contribution will discuss recent activities with tests and evaluations of High Performance Computing (HPC) applications on different hardware and software stacks, and plans to offer a dynamically scalable HPC service at CERN, based on affordable hardware.
NASA Astrophysics Data System (ADS)
Alameda, J. C.
2011-12-01
Development and optimization of computational science models, particularly on high performance computers, and with the advent of ubiquitous multicore processor systems, practically on every system, has been accomplished with basic software tools, typically, command-line based compilers, debuggers, performance tools that have not changed substantially from the days of serial and early vector computers. However, model complexity, including the complexity added by modern message passing libraries such as MPI, and the need for hybrid code models (such as openMP and MPI) to be able to take full advantage of high performance computers with an increasing core count per shared memory node, has made development and optimization of such codes an increasingly arduous task. Additional architectural developments, such as many-core processors, only complicate the situation further. In this paper, we describe how our NSF-funded project, "SI2-SSI: A Productive and Accessible Development Workbench for HPC Applications Using the Eclipse Parallel Tools Platform" (WHPC) seeks to improve the Eclipse Parallel Tools Platform, an environment designed to support scientific code development targeted at a diverse set of high performance computing systems. Our WHPC project to improve Eclipse PTP takes an application-centric view to improve PTP. We are using a set of scientific applications, each with a variety of challenges, and using PTP to drive further improvements to both the scientific application, as well as to understand shortcomings in Eclipse PTP from an application developer perspective, to drive our list of improvements we seek to make. We are also partnering with performance tool providers, to drive higher quality performance tool integration. We have partnered with the Cactus group at Louisiana State University to improve Eclipse's ability to work with computational frameworks and extremely complex build systems, as well as to develop educational materials to incorporate into computational science and engineering codes. Finally, we are partnering with the lead PTP developers at IBM, to ensure we are as effective as possible within the Eclipse community development. We are also conducting training and outreach to our user community, including conference BOF sessions, monthly user calls, and an annual user meeting, so that we can best inform the improvements we make to Eclipse PTP. With these activities we endeavor to encourage use of modern software engineering practices, as enabled through the Eclipse IDE, with computational science and engineering applications. These practices include proper use of source code repositories, tracking and rectifying issues, measuring and monitoring code performance changes against both optimizations as well as ever-changing software stacks and configurations on HPC systems, as well as ultimately encouraging development and maintenance of testing suites -- things that have become commonplace in many software endeavors, but have lagged in the development of science applications. We view that the challenge with the increased complexity of both HPC systems and science applications demands the use of better software engineering methods, preferably enabled by modern tools such as Eclipse PTP, to help the computational science community thrive as we evolve the HPC landscape.
Workload Characterization of a Leadership Class Storage Cluster
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Youngjae; Gunasekaran, Raghul; Shipman, Galen M
2010-01-01
Understanding workload characteristics is critical for optimizing and improving the performance of current systems and software, and architecting new storage systems based on observed workload patterns. In this paper, we characterize the scientific workloads of the world s fastest HPC (High Performance Computing) storage cluster, Spider, at the Oak Ridge Leadership Computing Facility (OLCF). Spider provides an aggregate bandwidth of over 240 GB/s with over 10 petabytes of RAID 6 formatted capacity. OLCFs flagship petascale simulation platform, Jaguar, and other large HPC clusters, in total over 250 thousands compute cores, depend on Spider for their I/O needs. We characterize themore » system utilization, the demands of reads and writes, idle time, and the distribution of read requests to write requests for the storage system observed over a period of 6 months. From this study we develop synthesized workloads and we show that the read and write I/O bandwidth usage as well as the inter-arrival time of requests can be modeled as a Pareto distribution.« less
Integration of High-Performance Computing into Cloud Computing Services
NASA Astrophysics Data System (ADS)
Vouk, Mladen A.; Sills, Eric; Dreher, Patrick
High-Performance Computing (HPC) projects span a spectrum of computer hardware implementations ranging from peta-flop supercomputers, high-end tera-flop facilities running a variety of operating systems and applications, to mid-range and smaller computational clusters used for HPC application development, pilot runs and prototype staging clusters. What they all have in common is that they operate as a stand-alone system rather than a scalable and shared user re-configurable resource. The advent of cloud computing has changed the traditional HPC implementation. In this article, we will discuss a very successful production-level architecture and policy framework for supporting HPC services within a more general cloud computing infrastructure. This integrated environment, called Virtual Computing Lab (VCL), has been operating at NC State since fall 2004. Nearly 8,500,000 HPC CPU-Hrs were delivered by this environment to NC State faculty and students during 2009. In addition, we present and discuss operational data that show that integration of HPC and non-HPC (or general VCL) services in a cloud can substantially reduce the cost of delivering cloud services (down to cents per CPU hour).
2011 Computation Directorate Annual Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crawford, D L
2012-04-11
From its founding in 1952 until today, Lawrence Livermore National Laboratory (LLNL) has made significant strategic investments to develop high performance computing (HPC) and its application to national security and basic science. Now, 60 years later, the Computation Directorate and its myriad resources and capabilities have become a key enabler for LLNL programs and an integral part of the effort to support our nation's nuclear deterrent and, more broadly, national security. In addition, the technological innovation HPC makes possible is seen as vital to the nation's economic vitality. LLNL, along with other national laboratories, is working to make supercomputing capabilitiesmore » and expertise available to industry to boost the nation's global competitiveness. LLNL is on the brink of an exciting milestone with the 2012 deployment of Sequoia, the National Nuclear Security Administration's (NNSA's) 20-petaFLOP/s resource that will apply uncertainty quantification to weapons science. Sequoia will bring LLNL's total computing power to more than 23 petaFLOP/s-all brought to bear on basic science and national security needs. The computing systems at LLNL provide game-changing capabilities. Sequoia and other next-generation platforms will enable predictive simulation in the coming decade and leverage industry trends, such as massively parallel and multicore processors, to run petascale applications. Efficient petascale computing necessitates refining accuracy in materials property data, improving models for known physical processes, identifying and then modeling for missing physics, quantifying uncertainty, and enhancing the performance of complex models and algorithms in macroscale simulation codes. Nearly 15 years ago, NNSA's Accelerated Strategic Computing Initiative (ASCI), now called the Advanced Simulation and Computing (ASC) Program, was the critical element needed to shift from test-based confidence to science-based confidence. Specifically, ASCI/ASC accelerated the development of simulation capabilities necessary to ensure confidence in the nuclear stockpile-far exceeding what might have been achieved in the absence of a focused initiative. While stockpile stewardship research pushed LLNL scientists to develop new computer codes, better simulation methods, and improved visualization technologies, this work also stimulated the exploration of HPC applications beyond the standard sponsor base. As LLNL advances to a petascale platform and pursues exascale computing (1,000 times faster than Sequoia), ASC will be paramount to achieving predictive simulation and uncertainty quantification. Predictive simulation and quantifying the uncertainty of numerical predictions where little-to-no data exists demands exascale computing and represents an expanding area of scientific research important not only to nuclear weapons, but to nuclear attribution, nuclear reactor design, and understanding global climate issues, among other fields. Aside from these lofty goals and challenges, computing at LLNL is anything but 'business as usual.' International competition in supercomputing is nothing new, but the HPC community is now operating in an expanded, more aggressive climate of global competitiveness. More countries understand how science and technology research and development are inextricably linked to economic prosperity, and they are aggressively pursuing ways to integrate HPC technologies into their native industrial and consumer products. In the interest of the nation's economic security and the science and technology that underpins it, LLNL is expanding its portfolio and forging new collaborations. We must ensure that HPC remains an asymmetric engine of innovation for the Laboratory and for the U.S. and, in doing so, protect our research and development dynamism and the prosperity it makes possible. One untapped area of opportunity LLNL is pursuing is to help U.S. industry understand how supercomputing can benefit their business. Industrial investment in HPC applications has historically been limited by the prohibitive cost of entry, the inaccessibility of software to run the powerful systems, and the years it takes to grow the expertise to develop codes and run them in an optimal way. LLNL is helping industry better compete in the global market place by providing access to some of the world's most powerful computing systems, the tools to run them, and the experts who are adept at using them. Our scientists are collaborating side by side with industrial partners to develop solutions to some of industry's toughest problems. The goal of the Livermore Valley Open Campus High Performance Computing Innovation Center is to allow American industry the opportunity to harness the power of supercomputing by leveraging the scientific and computational expertise at LLNL in order to gain a competitive advantage in the global economy.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gentile, Ann C.; Brandt, James M.; Tucker, Thomas
2011-09-01
This report provides documentation for the completion of the Sandia Level II milestone 'Develop feedback system for intelligent dynamic resource allocation to improve application performance'. This milestone demonstrates the use of a scalable data collection analysis and feedback system that enables insight into how an application is utilizing the hardware resources of a high performance computing (HPC) platform in a lightweight fashion. Further we demonstrate utilizing the same mechanisms used for transporting data for remote analysis and visualization to provide low latency run-time feedback to applications. The ultimate goal of this body of work is performance optimization in the facemore » of the ever increasing size and complexity of HPC systems.« less
NASA Astrophysics Data System (ADS)
Bielski, Conrad; Lemoine, Guido; Syryczynski, Jacek
2009-09-01
High Performance Computing (HPC) hardware solutions such as grid computing and General Processing on a Graphics Processing Unit (GPGPU) are now accessible to users with general computing needs. Grid computing infrastructures in the form of computing clusters or blades are becoming common place and GPGPU solutions that leverage the processing power of the video card are quickly being integrated into personal workstations. Our interest in these HPC technologies stems from the need to produce near real-time maps from a combination of pre- and post-event satellite imagery in support of post-disaster management. Faster processing provides a twofold gain in this situation: 1. critical information can be provided faster and 2. more elaborate automated processing can be performed prior to providing the critical information. In our particular case, we test the use of the PANTEX index which is based on analysis of image textural measures extracted using anisotropic, rotation-invariant GLCM statistics. The use of this index, applied in a moving window, has been shown to successfully identify built-up areas in remotely sensed imagery. Built-up index image masks are important input to the structuring of damage assessment interpretation because they help optimise the workload. The performance of computing the PANTEX workflow is compared on two different HPC hardware architectures: (1) a blade server with 4 blades, each having dual quad-core CPUs and (2) a CUDA enabled GPU workstation. The reference platform is a dual CPU-quad core workstation and the PANTEX workflow total computing time is measured. Furthermore, as part of a qualitative evaluation, the differences in setting up and configuring various hardware solutions and the related software coding effort is presented.
Exploiting Parallel R in the Cloud with SPRINT
Piotrowski, M.; McGilvary, G.A.; Sloan, T. M.; Mewissen, M.; Lloyd, A.D.; Forster, T.; Mitchell, L.; Ghazal, P.; Hill, J.
2012-01-01
Background Advances in DNA Microarray devices and next-generation massively parallel DNA sequencing platforms have led to an exponential growth in data availability but the arising opportunities require adequate computing resources. High Performance Computing (HPC) in the Cloud offers an affordable way of meeting this need. Objectives Bioconductor, a popular tool for high-throughput genomic data analysis, is distributed as add-on modules for the R statistical programming language but R has no native capabilities for exploiting multi-processor architectures. SPRINT is an R package that enables easy access to HPC for genomics researchers. This paper investigates: setting up and running SPRINT-enabled genomic analyses on Amazon’s Elastic Compute Cloud (EC2), the advantages of submitting applications to EC2 from different parts of the world and, if resource underutilization can improve application performance. Methods The SPRINT parallel implementations of correlation, permutation testing, partitioning around medoids and the multi-purpose papply have been benchmarked on data sets of various size on Amazon EC2. Jobs have been submitted from both the UK and Thailand to investigate monetary differences. Results It is possible to obtain good, scalable performance but the level of improvement is dependent upon the nature of algorithm. Resource underutilization can further improve the time to result. End-user’s location impacts on costs due to factors such as local taxation. Conclusions: Although not designed to satisfy HPC requirements, Amazon EC2 and cloud computing in general provides an interesting alternative and provides new possibilities for smaller organisations with limited funds. PMID:23223611
Exploiting parallel R in the cloud with SPRINT.
Piotrowski, M; McGilvary, G A; Sloan, T M; Mewissen, M; Lloyd, A D; Forster, T; Mitchell, L; Ghazal, P; Hill, J
2013-01-01
Advances in DNA Microarray devices and next-generation massively parallel DNA sequencing platforms have led to an exponential growth in data availability but the arising opportunities require adequate computing resources. High Performance Computing (HPC) in the Cloud offers an affordable way of meeting this need. Bioconductor, a popular tool for high-throughput genomic data analysis, is distributed as add-on modules for the R statistical programming language but R has no native capabilities for exploiting multi-processor architectures. SPRINT is an R package that enables easy access to HPC for genomics researchers. This paper investigates: setting up and running SPRINT-enabled genomic analyses on Amazon's Elastic Compute Cloud (EC2), the advantages of submitting applications to EC2 from different parts of the world and, if resource underutilization can improve application performance. The SPRINT parallel implementations of correlation, permutation testing, partitioning around medoids and the multi-purpose papply have been benchmarked on data sets of various size on Amazon EC2. Jobs have been submitted from both the UK and Thailand to investigate monetary differences. It is possible to obtain good, scalable performance but the level of improvement is dependent upon the nature of the algorithm. Resource underutilization can further improve the time to result. End-user's location impacts on costs due to factors such as local taxation. Although not designed to satisfy HPC requirements, Amazon EC2 and cloud computing in general provides an interesting alternative and provides new possibilities for smaller organisations with limited funds.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sehrish, S.; Kowalkowski, J.; Paterno, M.
High level abstractions in Python that can utilize computing hardware well seem to be an attractive option for writing data reduction and analysis tasks. In this paper, we explore the features available in Python which are useful and efficient for end user analysis in High Energy Physics (HEP). A typical vertical slice of an HEP data analysis is somewhat fragmented: the state of the reduction/analysis process must be saved at certain stages to allow for selective reprocessing of only parts of a generally time-consuming workflow. Also, algorithms tend to to be modular because of the heterogeneous nature of most detectorsmore » and the need to analyze different parts of the detector separately before combining the information. This fragmentation causes difficulties for interactive data analysis, and as data sets increase in size and complexity (O10 TiB for a “small” neutrino experiment to the O10 PiB currently held by the CMS experiment at the LHC), data analysis methods traditional to the field must evolve to make optimum use of emerging HPC technologies and platforms. Mainstream big data tools, while suggesting a direction in terms of what can be done if an entire data set can be available across a system and analysed with high-level programming abstractions, are not designed with either scientific computing generally, or modern HPC platform features in particular, such as data caching levels, in mind. Our example HPC use case is a search for a new elementary particle which might explain the phenomenon known as “Dark Matter”. Here, using data from the CMS detector, we will use HDF5 as our input data format, and MPI with Python to implement our use case.« less
NASA Astrophysics Data System (ADS)
Puzyrkov, Dmitry; Polyakov, Sergey; Podryga, Viktoriia; Markizov, Sergey
2018-02-01
At the present stage of computer technology development it is possible to study the properties and processes in complex systems at molecular and even atomic levels, for example, by means of molecular dynamics methods. The most interesting are problems related with the study of complex processes under real physical conditions. Solving such problems requires the use of high performance computing systems of various types, for example, GRID systems and HPC clusters. Considering the time consuming computational tasks, the need arises of software for automatic and unified monitoring of such computations. A complex computational task can be performed over different HPC systems. It requires output data synchronization between the storage chosen by a scientist and the HPC system used for computations. The design of the computational domain is also quite a problem. It requires complex software tools and algorithms for proper atomistic data generation on HPC systems. The paper describes the prototype of a cloud service, intended for design of atomistic systems of large volume for further detailed molecular dynamic calculations and computational management for this calculations, and presents the part of its concept aimed at initial data generation on the HPC systems.
Duro, Francisco Rodrigo; Blas, Javier Garcia; Isaila, Florin; ...
2016-10-06
The increasing volume of scientific data and the limited scalability and performance of storage systems are currently presenting a significant limitation for the productivity of the scientific workflows running on both high-performance computing (HPC) and cloud platforms. Clearly needed is better integration of storage systems and workflow engines to address this problem. This paper presents and evaluates a novel solution that leverages codesign principles for integrating Hercules—an in-memory data store—with a workflow management system. We consider four main aspects: workflow representation, task scheduling, task placement, and task termination. As a result, the experimental evaluation on both cloud and HPC systemsmore » demonstrates significant performance and scalability improvements over existing state-of-the-art approaches.« less
Understanding I/O workload characteristics of a Peta-scale storage system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Youngjae; Gunasekaran, Raghul
2015-01-01
Understanding workload characteristics is critical for optimizing and improving the performance of current systems and software, and architecting new storage systems based on observed workload patterns. In this paper, we characterize the I/O workloads of scientific applications of one of the world s fastest high performance computing (HPC) storage cluster, Spider, at the Oak Ridge Leadership Computing Facility (OLCF). OLCF flagship petascale simulation platform, Titan, and other large HPC clusters, in total over 250 thousands compute cores, depend on Spider for their I/O needs. We characterize the system utilization, the demands of reads and writes, idle time, storage space utilization,more » and the distribution of read requests to write requests for the Peta-scale Storage Systems. From this study, we develop synthesized workloads, and we show that the read and write I/O bandwidth usage as well as the inter-arrival time of requests can be modeled as a Pareto distribution. We also study the I/O load imbalance problems using I/O performance data collected from the Spider storage system.« less
The Prodiguer Messaging Platform
NASA Astrophysics Data System (ADS)
Denvil, S.; Greenslade, M. A.; Carenton, N.; Levavasseur, G.; Raciazek, J.
2015-12-01
CONVERGENCE is a French multi-partner national project designed to gather HPC and informatics expertise to innovate in the context of running French global climate models with differing grids and at differing resolutions. Efficient and reliable execution of these models and the management and dissemination of model output are some of the complexities that CONVERGENCE aims to resolve.At any one moment in time, researchers affiliated with the Institut Pierre Simon Laplace (IPSL) climate modeling group, are running hundreds of global climate simulations. These simulations execute upon a heterogeneous set of French High Performance Computing (HPC) environments. The IPSL's simulation execution runtime libIGCM (library for IPSL Global Climate Modeling group) has recently been enhanced so as to support hitherto impossible realtime use cases such as simulation monitoring, data publication, metrics collection, simulation control, visualizations … etc. At the core of this enhancement is Prodiguer: an AMQP (Advanced Message Queue Protocol) based event driven asynchronous distributed messaging platform. libIGCM now dispatches copious amounts of information, in the form of messages, to the platform for remote processing by Prodiguer software agents at IPSL servers in Paris. Such processing takes several forms: Persisting message content to database(s); Launching rollback jobs upon simulation failure; Notifying downstream applications; Automation of visualization pipelines; We will describe and/or demonstrate the platform's: Technical implementation; Inherent ease of scalability; Inherent adaptiveness in respect to supervising simulations; Web portal receiving simulation notifications in realtime.
WinHPC System | High-Performance Computing | NREL
System WinHPC System NREL's WinHPC system is a computing cluster running the Microsoft Windows operating system. It allows users to run jobs requiring a Windows environment such as ANSYS and MATLAB
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tock, Yoav; Mandler, Benjamin; Moreira, Jose
2013-01-01
As HPC systems and applications get bigger and more complex, we are approaching an era in which resiliency and run-time elasticity concerns be- come paramount.We offer a building block for an alternative resiliency approach in which computations will be able to make progress while components fail, in addition to enabling a dynamic set of nodes throughout a computation lifetime. The core of our solution is a hierarchical scalable membership service provid- ing eventual consistency semantics. An attribute replication service is used for hierarchy organization, and is exposed to external applications. Our solution is based on P2P technologies and provides resiliencymore » and elastic runtime support at ultra large scales. Resulting middleware is general purpose while exploiting HPC platform unique features and architecture. We have implemented and tested this system on BlueGene/P with Linux, and using worst-case analysis, evaluated the service scalability as effective for up to 1M nodes.« less
PHoToNs–A parallel heterogeneous and threads oriented code for cosmological N-body simulation
NASA Astrophysics Data System (ADS)
Wang, Qiao; Cao, Zong-Yan; Gao, Liang; Chi, Xue-Bin; Meng, Chen; Wang, Jie; Wang, Long
2018-06-01
We introduce a new code for cosmological simulations, PHoToNs, which incorporates features for performing massive cosmological simulations on heterogeneous high performance computer (HPC) systems and threads oriented programming. PHoToNs adopts a hybrid scheme to compute gravitational force, with the conventional Particle-Mesh (PM) algorithm to compute the long-range force, the Tree algorithm to compute the short range force and the direct summation Particle-Particle (PP) algorithm to compute gravity from very close particles. A self-similar space filling a Peano-Hilbert curve is used to decompose the computing domain. Threads programming is advantageously used to more flexibly manage the domain communication, PM calculation and synchronization, as well as Dual Tree Traversal on the CPU+MIC platform. PHoToNs scales well and efficiency of the PP kernel achieves 68.6% of peak performance on MIC and 74.4% on CPU platforms. We also test the accuracy of the code against the much used Gadget-2 in the community and found excellent agreement.
birgHPC: creating instant computing clusters for bioinformatics and molecular dynamics.
Chew, Teong Han; Joyce-Tan, Kwee Hong; Akma, Farizuwana; Shamsir, Mohd Shahir
2011-05-01
birgHPC, a bootable Linux Live CD has been developed to create high-performance clusters for bioinformatics and molecular dynamics studies using any Local Area Network (LAN)-networked computers. birgHPC features automated hardware and slots detection as well as provides a simple job submission interface. The latest versions of GROMACS, NAMD, mpiBLAST and ClustalW-MPI can be run in parallel by simply booting the birgHPC CD or flash drive from the head node, which immediately positions the rest of the PCs on the network as computing nodes. Thus, a temporary, affordable, scalable and high-performance computing environment can be built by non-computing-based researchers using low-cost commodity hardware. The birgHPC Live CD and relevant user guide are available for free at http://birg1.fbb.utm.my/birghpc.
NASA Astrophysics Data System (ADS)
Bocharov, A. N.; Bityurin, V. A.; Golovin, N. N.; Evstigneev, N. M.; Petrovskiy, V. P.; Ryabkov, O. I.; Teplyakov, I. O.; Shustov, A. A.; Solomonov, Yu S.; Fortov, V. E.
2016-11-01
In this paper, an approach to solve conjugate heat- and mass-transfer problems is considered to be applied to hypersonic vehicle surface of arbitrary shape. The approach under developing should satisfy the following demands. (i) The surface of the body of interest may have arbitrary geometrical shape. (ii) The shape of the body can change during calculation. (iii) The flight characteristics may vary in a wide range, specifically flight altitude, free-stream Mach number, angle-of-attack, etc. (iv) The approach should be realized with using the high-performance-computing (HPC) technologies. The approach is based on coupled solution of 3D unsteady hypersonic flow equations and 3D unsteady heat conductance problem for the thick wall. Iterative process is applied to account for ablation of wall material and, consequently, mass injection from the surface and changes in the surface shape. While iterations, unstructured computational grids both in the flow region and within the wall interior are adapted to the current geometry and flow conditions. The flow computations are done on HPC platform and are most time-consuming part of the whole problem, while heat conductance problem can be solved on many kinds of computers.
CBRAIN: a web-based, distributed computing platform for collaborative neuroimaging research
Sherif, Tarek; Rioux, Pierre; Rousseau, Marc-Etienne; Kassis, Nicolas; Beck, Natacha; Adalat, Reza; Das, Samir; Glatard, Tristan; Evans, Alan C.
2014-01-01
The Canadian Brain Imaging Research Platform (CBRAIN) is a web-based collaborative research platform developed in response to the challenges raised by data-heavy, compute-intensive neuroimaging research. CBRAIN offers transparent access to remote data sources, distributed computing sites, and an array of processing and visualization tools within a controlled, secure environment. Its web interface is accessible through any modern browser and uses graphical interface idioms to reduce the technical expertise required to perform large-scale computational analyses. CBRAIN's flexible meta-scheduling has allowed the incorporation of a wide range of heterogeneous computing sites, currently including nine national research High Performance Computing (HPC) centers in Canada, one in Korea, one in Germany, and several local research servers. CBRAIN leverages remote computing cycles and facilitates resource-interoperability in a transparent manner for the end-user. Compared with typical grid solutions available, our architecture was designed to be easily extendable and deployed on existing remote computing sites with no tool modification, administrative intervention, or special software/hardware configuration. As October 2013, CBRAIN serves over 200 users spread across 53 cities in 17 countries. The platform is built as a generic framework that can accept data and analysis tools from any discipline. However, its current focus is primarily on neuroimaging research and studies of neurological diseases such as Autism, Parkinson's and Alzheimer's diseases, Multiple Sclerosis as well as on normal brain structure and development. This technical report presents the CBRAIN Platform, its current deployment and usage and future direction. PMID:24904400
CBRAIN: a web-based, distributed computing platform for collaborative neuroimaging research.
Sherif, Tarek; Rioux, Pierre; Rousseau, Marc-Etienne; Kassis, Nicolas; Beck, Natacha; Adalat, Reza; Das, Samir; Glatard, Tristan; Evans, Alan C
2014-01-01
The Canadian Brain Imaging Research Platform (CBRAIN) is a web-based collaborative research platform developed in response to the challenges raised by data-heavy, compute-intensive neuroimaging research. CBRAIN offers transparent access to remote data sources, distributed computing sites, and an array of processing and visualization tools within a controlled, secure environment. Its web interface is accessible through any modern browser and uses graphical interface idioms to reduce the technical expertise required to perform large-scale computational analyses. CBRAIN's flexible meta-scheduling has allowed the incorporation of a wide range of heterogeneous computing sites, currently including nine national research High Performance Computing (HPC) centers in Canada, one in Korea, one in Germany, and several local research servers. CBRAIN leverages remote computing cycles and facilitates resource-interoperability in a transparent manner for the end-user. Compared with typical grid solutions available, our architecture was designed to be easily extendable and deployed on existing remote computing sites with no tool modification, administrative intervention, or special software/hardware configuration. As October 2013, CBRAIN serves over 200 users spread across 53 cities in 17 countries. The platform is built as a generic framework that can accept data and analysis tools from any discipline. However, its current focus is primarily on neuroimaging research and studies of neurological diseases such as Autism, Parkinson's and Alzheimer's diseases, Multiple Sclerosis as well as on normal brain structure and development. This technical report presents the CBRAIN Platform, its current deployment and usage and future direction.
NASA Astrophysics Data System (ADS)
Shute, J.; Carriere, L.; Duffy, D.; Hoy, E.; Peters, J.; Shen, Y.; Kirschbaum, D.
2017-12-01
The NASA Center for Climate Simulation (NCCS) at the Goddard Space Flight Center is building and maintaining an Enterprise GIS capability for its stakeholders, to include NASA scientists, industry partners, and the public. This platform is powered by three GIS subsystems operating in a highly-available, virtualized environment: 1) the Spatial Analytics Platform is the primary NCCS GIS and provides users discoverability of the vast DigitalGlobe/NGA raster assets within the NCCS environment; 2) the Disaster Mapping Platform provides mapping and analytics services to NASA's Disaster Response Group; and 3) the internal (Advanced Data Analytics Platform/ADAPT) enterprise GIS provides users with the full suite of Esri and open source GIS software applications and services. All systems benefit from NCCS's cutting edge infrastructure, to include an InfiniBand network for high speed data transfers; a mixed/heterogeneous environment featuring seamless sharing of information between Linux and Windows subsystems; and in-depth system monitoring and warning systems. Due to its co-location with the NCCS Discover High Performance Computing (HPC) environment and the Advanced Data Analytics Platform (ADAPT), the GIS platform has direct access to several large NCCS datasets including DigitalGlobe/NGA, Landsat, MERRA, and MERRA2. Additionally, the NCCS ArcGIS Desktop Windows virtual machines utilize existing NetCDF and OPeNDAP assets for visualization, modelling, and analysis - thus eliminating the need for data duplication. With the advent of this platform, Earth scientists have full access to vast data repositories and the industry-leading tools required for successful management and analysis of these multi-petabyte, global datasets. The full system architecture and integration with scientific datasets will be presented. Additionally, key applications and scientific analyses will be explained, to include the NASA Global Landslide Catalog (GLC) Reporter crowdsourcing application, the NASA GLC Viewer discovery and analysis tool, the DigitalGlobe/NGA Data Discovery Tool, the NASA Disaster Response Group Mapping Platform (https://maps.disasters.nasa.gov), and support for NASA's Arctic - Boreal Vulnerability Experiment (ABoVE).
High-performance computing — an overview
NASA Astrophysics Data System (ADS)
Marksteiner, Peter
1996-08-01
An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, various parallel computers like symmetric multiprocessors, workstation clusters, massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, optimization for RISC processors; parallel programming techniques like shared-memory parallelism, message passing and data parallelism; and numerical libraries.
CyberShake: Running Seismic Hazard Workflows on Distributed HPC Resources
NASA Astrophysics Data System (ADS)
Callaghan, S.; Maechling, P. J.; Graves, R. W.; Gill, D.; Olsen, K. B.; Milner, K. R.; Yu, J.; Jordan, T. H.
2013-12-01
As part of its program of earthquake system science research, the Southern California Earthquake Center (SCEC) has developed a simulation platform, CyberShake, to perform physics-based probabilistic seismic hazard analysis (PSHA) using 3D deterministic wave propagation simulations. CyberShake performs PSHA by simulating a tensor-valued wavefield of Strain Green Tensors, and then using seismic reciprocity to calculate synthetic seismograms for about 415,000 events per site of interest. These seismograms are processed to compute ground motion intensity measures, which are then combined with probabilities from an earthquake rupture forecast to produce a site-specific hazard curve. Seismic hazard curves for hundreds of sites in a region can be used to calculate a seismic hazard map, representing the seismic hazard for a region. We present a recently completed PHSA study in which we calculated four CyberShake seismic hazard maps for the Southern California area to compare how CyberShake hazard results are affected by different SGT computational codes (AWP-ODC and AWP-RWG) and different community velocity models (Community Velocity Model - SCEC (CVM-S4) v11.11 and Community Velocity Model - Harvard (CVM-H) v11.9). We present our approach to running workflow applications on distributed HPC resources, including systems without support for remote job submission. We show how our approach extends the benefits of scientific workflows, such as job and data management, to large-scale applications on Track 1 and Leadership class open-science HPC resources. We used our distributed workflow approach to perform CyberShake Study 13.4 on two new NSF open-science HPC computing resources, Blue Waters and Stampede, executing over 470 million tasks to calculate physics-based hazard curves for 286 locations in the Southern California region. For each location, we calculated seismic hazard curves with two different community velocity models and two different SGT codes, resulting in over 1100 hazard curves. We will report on the performance of this CyberShake study, four times larger than previous studies. Additionally, we will examine the challenges we face applying these workflow techniques to additional open-science HPC systems and discuss whether our workflow solutions continue to provide value to our large-scale PSHA calculations.
Hierarchical parallelisation of functional renormalisation group calculations - hp-fRG
NASA Astrophysics Data System (ADS)
Rohe, Daniel
2016-10-01
The functional renormalisation group (fRG) has evolved into a versatile tool in condensed matter theory for studying important aspects of correlated electron systems. Practical applications of the method often involve a high numerical effort, motivating the question in how far High Performance Computing (HPC) can leverage the approach. In this work we report on a multi-level parallelisation of the underlying computational machinery and show that this can speed up the code by several orders of magnitude. This in turn can extend the applicability of the method to otherwise inaccessible cases. We exploit three levels of parallelisation: Distributed computing by means of Message Passing (MPI), shared-memory computing using OpenMP, and vectorisation by means of SIMD units (single-instruction-multiple-data). Results are provided for two distinct High Performance Computing (HPC) platforms, namely the IBM-based BlueGene/Q system JUQUEEN and an Intel Sandy-Bridge-based development cluster. We discuss how certain issues and obstacles were overcome in the course of adapting the code. Most importantly, we conclude that this vast improvement can actually be accomplished by introducing only moderate changes to the code, such that this strategy may serve as a guideline for other researcher to likewise improve the efficiency of their codes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Habib, Salman; Roser, Robert; Gerber, Richard
The U.S. Department of Energy (DOE) Office of Science (SC) Offices of High Energy Physics (HEP) and Advanced Scientific Computing Research (ASCR) convened a programmatic Exascale Requirements Review on June 10–12, 2015, in Bethesda, Maryland. This report summarizes the findings, results, and recommendations derived from that meeting. The high-level findings and observations are as follows. Larger, more capable computing and data facilities are needed to support HEP science goals in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of the demand at the 2025 timescale is at least two orders of magnitude — and in some cases greatermore » — than that available currently. The growth rate of data produced by simulations is overwhelming the current ability of both facilities and researchers to store and analyze it. Additional resources and new techniques for data analysis are urgently needed. Data rates and volumes from experimental facilities are also straining the current HEP infrastructure in its ability to store and analyze large and complex data volumes. Appropriately configured leadership-class facilities can play a transformational role in enabling scientific discovery from these datasets. A close integration of high-performance computing (HPC) simulation and data analysis will greatly aid in interpreting the results of HEP experiments. Such an integration will minimize data movement and facilitate interdependent workflows. Long-range planning between HEP and ASCR will be required to meet HEP’s research needs. To best use ASCR HPC resources, the experimental HEP program needs (1) an established, long-term plan for access to ASCR computational and data resources, (2) the ability to map workflows to HPC resources, (3) the ability for ASCR facilities to accommodate workflows run by collaborations potentially comprising thousands of individual members, (4) to transition codes to the next-generation HPC platforms that will be available at ASCR facilities, (5) to build up and train a workforce capable of developing and using simulations and analysis to support HEP scientific research on next-generation systems.« less
ERIC Educational Resources Information Center
Fredette, Michelle
2012-01-01
"Rent or buy?" is a question people ask about everything from housing to textbooks. It is also a question universities must consider when it comes to high-performance computing (HPC). With the advent of Amazon's Elastic Compute Cloud (EC2), Microsoft Windows HPC Server, Rackspace's OpenStack, and other cloud-based services, researchers now have…
Singularity: Scientific containers for mobility of compute.
Kurtzer, Gregory M; Sochat, Vanessa; Bauer, Michael W
2017-01-01
Here we present Singularity, software developed to bring containers and reproducibility to scientific computing. Using Singularity containers, developers can work in reproducible environments of their choosing and design, and these complete environments can easily be copied and executed on other platforms. Singularity is an open source initiative that harnesses the expertise of system and software engineers and researchers alike, and integrates seamlessly into common workflows for both of these groups. As its primary use case, Singularity brings mobility of computing to both users and HPC centers, providing a secure means to capture and distribute software and compute environments. This ability to create and deploy reproducible environments across these centers, a previously unmet need, makes Singularity a game changing development for computational science.
Singularity: Scientific containers for mobility of compute
Kurtzer, Gregory M.; Bauer, Michael W.
2017-01-01
Here we present Singularity, software developed to bring containers and reproducibility to scientific computing. Using Singularity containers, developers can work in reproducible environments of their choosing and design, and these complete environments can easily be copied and executed on other platforms. Singularity is an open source initiative that harnesses the expertise of system and software engineers and researchers alike, and integrates seamlessly into common workflows for both of these groups. As its primary use case, Singularity brings mobility of computing to both users and HPC centers, providing a secure means to capture and distribute software and compute environments. This ability to create and deploy reproducible environments across these centers, a previously unmet need, makes Singularity a game changing development for computational science. PMID:28494014
Integrating Reconfigurable Hardware-Based Grid for High Performance Computing
Dondo Gazzano, Julio; Sanchez Molina, Francisco; Rincon, Fernando; López, Juan Carlos
2015-01-01
FPGAs have shown several characteristics that make them very attractive for high performance computing (HPC). The impressive speed-up factors that they are able to achieve, the reduced power consumption, and the easiness and flexibility of the design process with fast iterations between consecutive versions are examples of benefits obtained with their use. However, there are still some difficulties when using reconfigurable platforms as accelerator that need to be addressed: the need of an in-depth application study to identify potential acceleration, the lack of tools for the deployment of computational problems in distributed hardware platforms, and the low portability of components, among others. This work proposes a complete grid infrastructure for distributed high performance computing based on dynamically reconfigurable FPGAs. Besides, a set of services designed to facilitate the application deployment is described. An example application and a comparison with other hardware and software implementations are shown. Experimental results show that the proposed architecture offers encouraging advantages for deployment of high performance distributed applications simplifying development process. PMID:25874241
Mixing HTC and HPC Workloads with HTCondor and Slurm
NASA Astrophysics Data System (ADS)
Hollowell, C.; Barnett, J.; Caramarcu, C.; Strecker-Kellogg, W.; Wong, A.; Zaytsev, A.
2017-10-01
Traditionally, the RHIC/ATLAS Computing Facility (RACF) at Brookhaven National Laboratory (BNL) has only maintained High Throughput Computing (HTC) resources for our HEP/NP user community. We’ve been using HTCondor as our batch system for many years, as this software is particularly well suited for managing HTC processor farm resources. Recently, the RACF has also begun to design/administrate some High Performance Computing (HPC) systems for a multidisciplinary user community at BNL. In this paper, we’ll discuss our experiences using HTCondor and Slurm in an HPC context, and our facility’s attempts to allow our HTC and HPC processing farms/clusters to make opportunistic use of each other’s computing resources.
ASCR/HEP Exascale Requirements Review Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Habib, Salman; Roser, Robert; Gerber, Richard
This draft report summarizes and details the findings, results, and recommendations derived from the ASCR/HEP Exascale Requirements Review meeting held in June, 2015. The main conclusions are as follows. 1) Larger, more capable computing and data facilities are needed to support HEP science goals in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of the demand at the 2025 timescale is at least two orders of magnitude -- and in some cases greater -- than that available currently. 2) The growth rate of data produced by simulations is overwhelming the current ability, of both facilities and researchers, tomore » store and analyze it. Additional resources and new techniques for data analysis are urgently needed. 3) Data rates and volumes from HEP experimental facilities are also straining the ability to store and analyze large and complex data volumes. Appropriately configured leadership-class facilities can play a transformational role in enabling scientific discovery from these datasets. 4) A close integration of HPC simulation and data analysis will aid greatly in interpreting results from HEP experiments. Such an integration will minimize data movement and facilitate interdependent workflows. 5) Long-range planning between HEP and ASCR will be required to meet HEP's research needs. To best use ASCR HPC resources the experimental HEP program needs a) an established long-term plan for access to ASCR computational and data resources, b) an ability to map workflows onto HPC resources, c) the ability for ASCR facilities to accommodate workflows run by collaborations that can have thousands of individual members, d) to transition codes to the next-generation HPC platforms that will be available at ASCR facilities, e) to build up and train a workforce capable of developing and using simulations and analysis to support HEP scientific research on next-generation systems.« less
ASCR/HEP Exascale Requirements Review Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Habib, Salman; et al.
2016-03-30
This draft report summarizes and details the findings, results, and recommendations derived from the ASCR/HEP Exascale Requirements Review meeting held in June, 2015. The main conclusions are as follows. 1) Larger, more capable computing and data facilities are needed to support HEP science goals in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of the demand at the 2025 timescale is at least two orders of magnitude -- and in some cases greater -- than that available currently. 2) The growth rate of data produced by simulations is overwhelming the current ability, of both facilities and researchers, tomore » store and analyze it. Additional resources and new techniques for data analysis are urgently needed. 3) Data rates and volumes from HEP experimental facilities are also straining the ability to store and analyze large and complex data volumes. Appropriately configured leadership-class facilities can play a transformational role in enabling scientific discovery from these datasets. 4) A close integration of HPC simulation and data analysis will aid greatly in interpreting results from HEP experiments. Such an integration will minimize data movement and facilitate interdependent workflows. 5) Long-range planning between HEP and ASCR will be required to meet HEP's research needs. To best use ASCR HPC resources the experimental HEP program needs a) an established long-term plan for access to ASCR computational and data resources, b) an ability to map workflows onto HPC resources, c) the ability for ASCR facilities to accommodate workflows run by collaborations that can have thousands of individual members, d) to transition codes to the next-generation HPC platforms that will be available at ASCR facilities, e) to build up and train a workforce capable of developing and using simulations and analysis to support HEP scientific research on next-generation systems.« less
NASA Astrophysics Data System (ADS)
Rohde, Mitchell M.; Crawford, Justin; Toschlog, Matthew; Iagnemma, Karl D.; Kewlani, Guarav; Cummins, Christopher L.; Jones, Randolph A.; Horner, David A.
2009-05-01
It is widely recognized that simulation is pivotal to vehicle development, whether manned or unmanned. There are few dedicated choices, however, for those wishing to perform realistic, end-to-end simulations of unmanned ground vehicles (UGVs). The Virtual Autonomous Navigation Environment (VANE), under development by US Army Engineer Research and Development Center (ERDC), provides such capabilities but utilizes a High Performance Computing (HPC) Computational Testbed (CTB) and is not intended for on-line, real-time performance. A product of the VANE HPC research is a real-time desktop simulation application under development by the authors that provides a portal into the HPC environment as well as interaction with wider-scope semi-automated force simulations (e.g. OneSAF). This VANE desktop application, dubbed the Autonomous Navigation Virtual Environment Laboratory (ANVEL), enables analysis and testing of autonomous vehicle dynamics and terrain/obstacle interaction in real-time with the capability to interact within the HPC constructive geo-environmental CTB for high fidelity sensor evaluations. ANVEL leverages rigorous physics-based vehicle and vehicle-terrain interaction models in conjunction with high-quality, multimedia visualization techniques to form an intuitive, accurate engineering tool. The system provides an adaptable and customizable simulation platform that allows developers a controlled, repeatable testbed for advanced simulations. ANVEL leverages several key technologies not common to traditional engineering simulators, including techniques from the commercial video-game industry. These enable ANVEL to run on inexpensive commercial, off-the-shelf (COTS) hardware. In this paper, the authors describe key aspects of ANVEL and its development, as well as several initial applications of the system.
CDAC Student Report: Summary of LLNL Internship
DOE Office of Scientific and Technical Information (OSTI.GOV)
Herriman, Jane E.
Multiple objectives motivated me to apply for an internship at LLNL: I wanted to experience the work environment at a national lab, to learn about research and job opportunities at LLNL in particular, and to gain greater experience with code development, particularly within the realm of high performance computing (HPC). This summer I was selected to participate in LLNL's Computational Chemistry and Material Science Summer Institute (CCMS). CCMS is a 10 week program hosted by the Quantum Simulations group leader, Dr. Eric Schwegler. CCMS connects graduate students to mentors at LLNL involved in similar re- search and provides weekly seminarsmore » on a broad array of topics from within chemistry and materials science. Dr. Xavier Andrade and Dr. Erik Draeger served as my co-mentors over the summer, and Dr. Andrade continues to mentor me now that CCMS has concluded. Dr. Andrade is a member of the Quantum Simulations group within the Physical and Life Sciences at LLNL, and Dr. Draeger leads the HPC group within the Center for Applied Scientific Computing (CASC). The two have worked together to develop Qb@ll, an open-source first principles molecular dynamics code that was the platform for my summer research project.« less
NASA Astrophysics Data System (ADS)
Evans, B. J. K.; Foster, C.; Minchin, S. A.; Pugh, T.; Lewis, A.; Wyborn, L. A.; Evans, B. J.; Uhlherr, A.
2014-12-01
The National Computational Infrastructure (NCI) has established a powerful in-situ computational environment to enable both high performance computing and data-intensive science across a wide spectrum of national environmental data collections - in particular climate, observational data and geoscientific assets. This paper examines 1) the computational environments that supports the modelling and data processing pipelines, 2) the analysis environments and methods to support data analysis, and 3) the progress in addressing harmonisation of the underlying data collections for future transdisciplinary research that enable accurate climate projections. NCI makes available 10+ PB major data collections from both the government and research sectors based on six themes: 1) weather, climate, and earth system science model simulations, 2) marine and earth observations, 3) geosciences, 4) terrestrial ecosystems, 5) water and hydrology, and 6) astronomy, social and biosciences. Collectively they span the lithosphere, crust, biosphere, hydrosphere, troposphere, and stratosphere. The data is largely sourced from NCI's partners (which include the custodians of many of the national scientific records), major research communities, and collaborating overseas organisations. The data is accessible within an integrated HPC-HPD environment - a 1.2 PFlop supercomputer (Raijin), a HPC class 3000 core OpenStack cloud system and several highly connected large scale and high-bandwidth Lustre filesystems. This computational environment supports a catalogue of integrated reusable software and workflows from earth system and ecosystem modelling, weather research, satellite and other observed data processing and analysis. To enable transdisciplinary research on this scale, data needs to be harmonised so that researchers can readily apply techniques and software across the corpus of data available and not be constrained to work within artificial disciplinary boundaries. Future challenges will involve the further integration and analysis of this data across the social sciences to facilitate the impacts across the societal domain, including timely analysis to more accurately predict and forecast future climate and environmental state.
Training | High-Performance Computing | NREL
Training Training Find training resources for using NREL's high-performance computing (HPC) systems as well as related online tutorials. Upcoming Training HPC User Workshop - June 12th We will be Conference, a group meets to discuss Best Practices in HPC Training. This group developed a list of resources
NASA Astrophysics Data System (ADS)
Demenev, A. G.
2018-02-01
The present work is devoted to analyze high-performance computing (HPC) infrastructure capabilities for aircraft engine aeroacoustics problems solving at Perm State University. We explore here the ability to develop new computational aeroacoustics methods/solvers for computer-aided engineering (CAE) systems to handle complicated industrial problems of engine noise prediction. Leading aircraft engine engineering company, including “UEC-Aviadvigatel” JSC (our industrial partners in Perm, Russia), require that methods/solvers to optimize geometry of aircraft engine for fan noise reduction. We analysed Perm State University HPC-hardware resources and software services to use efficiently. The performed results demonstrate that Perm State University HPC-infrastructure are mature enough to face out industrial-like problems of development CAE-system with HPC-method and CFD-solvers.
PREFACE: High Performance Computing Symposium 2011
NASA Astrophysics Data System (ADS)
Talon, Suzanne; Mousseau, Normand; Peslherbe, Gilles; Bertrand, François; Gauthier, Pierre; Kadem, Lyes; Moitessier, Nicolas; Rouleau, Guy; Wittig, Rod
2012-02-01
HPCS (High Performance Computing Symposium) is a multidisciplinary conference that focuses on research involving High Performance Computing and its application. Attended by Canadian and international experts and renowned researchers in the sciences, all areas of engineering, the applied sciences, medicine and life sciences, mathematics, the humanities and social sciences, it is Canada's pre-eminent forum for HPC. The 25th edition was held in Montréal, at the Université du Québec à Montréal, from 15-17 June and focused on HPC in Medical Science. The conference was preceded by tutorials held at Concordia University, where 56 participants learned about HPC best practices, GPU computing, parallel computing, debugging and a number of high-level languages. 274 participants from six countries attended the main conference, which involved 11 invited and 37 contributed oral presentations, 33 posters, and an exhibit hall with 16 booths from our sponsors. The work that follows is a collection of papers presented at the conference covering HPC topics ranging from computer science to bioinformatics. They are divided here into four sections: HPC in Engineering, Physics and Materials Science, HPC in Medical Science, HPC Enabling to Explore our World and New Algorithms for HPC. We would once more like to thank the participants and invited speakers, the members of the Scientific Committee, the referees who spent time reviewing the papers and our invaluable sponsors. To hear the invited talks and learn about 25 years of HPC development in Canada visit the Symposium website: http://2011.hpcs.ca/lang/en/conference/keynote-speakers/ Enjoy the excellent papers that follow, and we look forward to seeing you in Vancouver for HPCS 2012! Gilles Peslherbe Chair of the Scientific Committee Normand Mousseau Co-Chair of HPCS 2011 Suzanne Talon Chair of the Organizing Committee UQAM Sponsors The PDF also contains photographs from the conference banquet.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Connor, Carolyn Marie; Jacobson, Andree Lars; Bonnie, Amanda Marie
Sustainable and effective computing infrastructure depends critically on the skills and expertise of domain scientists and of committed and well-trained advanced computing professionals. But, in its ongoing High Performance Computing (HPC) work, Los Alamos National Laboratory noted a persistent shortage of well-prepared applicants, particularly for entry-level cluster administration, file systems administration, and high speed networking positions. Further, based upon recruiting efforts and interactions with universities graduating students in related majors of interest (e.g., computer science (CS)), there has been a long standing skillset gap, as focused training in HPC topics is typically lacking or absent in undergraduate and in evenmore » many graduate programs. Given that the effective operation and use of HPC systems requires specialized and often advanced training, that there is a recognized HPC skillset gap, and that there is intense global competition for computing and computational science talent, there is a long-standing and critical need for innovative approaches to help bridge the gap and create a well-prepared, next generation HPC workforce. Our paper places this need in the context of the HPC work and workforce requirements at Los Alamos National Laboratory (LANL) and presents one such innovative program conceived to address the need, bridge the gap, and grow an HPC workforce pipeline at LANL. The Computer System, Cluster, and Networking Summer Institute (CSCNSI) completed its 10th year in 2016. The story of the CSCNSI and its evolution is detailed below with a description of the design of its Boot Camp, and a summary of its success and some key factors that have enabled that success.« less
Connor, Carolyn Marie; Jacobson, Andree Lars; Bonnie, Amanda Marie; ...
2016-11-01
Sustainable and effective computing infrastructure depends critically on the skills and expertise of domain scientists and of committed and well-trained advanced computing professionals. But, in its ongoing High Performance Computing (HPC) work, Los Alamos National Laboratory noted a persistent shortage of well-prepared applicants, particularly for entry-level cluster administration, file systems administration, and high speed networking positions. Further, based upon recruiting efforts and interactions with universities graduating students in related majors of interest (e.g., computer science (CS)), there has been a long standing skillset gap, as focused training in HPC topics is typically lacking or absent in undergraduate and in evenmore » many graduate programs. Given that the effective operation and use of HPC systems requires specialized and often advanced training, that there is a recognized HPC skillset gap, and that there is intense global competition for computing and computational science talent, there is a long-standing and critical need for innovative approaches to help bridge the gap and create a well-prepared, next generation HPC workforce. Our paper places this need in the context of the HPC work and workforce requirements at Los Alamos National Laboratory (LANL) and presents one such innovative program conceived to address the need, bridge the gap, and grow an HPC workforce pipeline at LANL. The Computer System, Cluster, and Networking Summer Institute (CSCNSI) completed its 10th year in 2016. The story of the CSCNSI and its evolution is detailed below with a description of the design of its Boot Camp, and a summary of its success and some key factors that have enabled that success.« less
High-Performance Computing Systems and Operations | Computational Science |
NREL Systems and Operations High-Performance Computing Systems and Operations NREL operates high-performance computing (HPC) systems dedicated to advancing energy efficiency and renewable energy technologies. Capabilities NREL's HPC capabilities include: High-Performance Computing Systems We operate
Web-based reactive transport modeling using PFLOTRAN
NASA Astrophysics Data System (ADS)
Zhou, H.; Karra, S.; Lichtner, P. C.; Versteeg, R.; Zhang, Y.
2017-12-01
Actionable understanding of system behavior in the subsurface is required for a wide spectrum of societal and engineering needs by both commercial firms and government entities and academia. These needs include, for example, water resource management, precision agriculture, contaminant remediation, unconventional energy production, CO2 sequestration monitoring, and climate studies. Such understanding requires the ability to numerically model various coupled processes that occur across different temporal and spatial scales as well as multiple physical domains (reservoirs - overburden, surface-subsurface, groundwater-surface water, saturated-unsaturated zone). Currently, this ability is typically met through an in-house approach where computational resources, model expertise, and data for model parameterization are brought together to meet modeling needs. However, such an approach has multiple drawbacks which limit the application of high-end reactive transport codes such as the Department of Energy funded[?] PFLOTRAN code. In addition, while many end users have a need for the capabilities provided by high-end reactive transport codes, they do not have the expertise - nor the time required to obtain the expertise - to effectively use these codes. We have developed and are actively enhancing a cloud-based software platform through which diverse users are able to easily configure, execute, visualize, share, and interpret PFLOTRAN models. This platform consists of a web application and available on-demand HPC computational infrastructure. The web application consists of (1) a browser-based graphical user interface which allows users to configure models and visualize results interactively, and (2) a central server with back-end relational databases which hold configuration, data, modeling results, and Python scripts for model configuration, and (3) a HPC environment for on-demand model execution. We will discuss lessons learned in the development of this platform, the rationale for different interfaces, implementation choices, as well as the planned path forward.
WinHPC System User Basics | High-Performance Computing | NREL
guidance for starting to use this high-performance computing (HPC) system at NREL. Also see WinHPC policies ) when you are finished. Simply quitting Remote Desktop will keep your session active and using resources node). 2. Log in with your NREL.gov username/password. Remember to log out when finished. Mac 1. If you
One-Time Password Tokens | High-Performance Computing | NREL
One-Time Password Tokens One-Time Password Tokens For connecting to NREL's high-performance computing (HPC) systems, learn how to set up a one-time password (OTP) token for remote and privileged a one-time pass code from the HPC Operations team. At the sign-in screen Enter your HPC Username in
The Ettention software package.
Dahmen, Tim; Marsalek, Lukas; Marniok, Nico; Turoňová, Beata; Bogachev, Sviatoslav; Trampert, Patrick; Nickels, Stefan; Slusallek, Philipp
2016-02-01
We present a novel software package for the problem "reconstruction from projections" in electron microscopy. The Ettention framework consists of a set of modular building-blocks for tomographic reconstruction algorithms. The well-known block iterative reconstruction method based on Kaczmarz algorithm is implemented using these building-blocks, including adaptations specific to electron tomography. Ettention simultaneously features (1) a modular, object-oriented software design, (2) optimized access to high-performance computing (HPC) platforms such as graphic processing units (GPU) or many-core architectures like Xeon Phi, and (3) accessibility to microscopy end-users via integration in the IMOD package and eTomo user interface. We also provide developers with a clean and well-structured application programming interface (API) that allows for extending the software easily and thus makes it an ideal platform for algorithmic research while hiding most of the technical details of high-performance computing. Copyright © 2015 Elsevier B.V. All rights reserved.
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel; Oliker, Leonid; Vuduc, Richard
2007-01-01
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD dual-core and Intel quad-core designs, the heterogeneous STI Cell, as well as the first scientificmore » study of the highly multithreaded Sun Niagara2. We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural tradeoffs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.« less
High-Performance Computing User Facility | Computational Science | NREL
User Facility High-Performance Computing User Facility The High-Performance Computing User Facility technologies. Photo of the Peregrine supercomputer The High Performance Computing (HPC) User Facility provides Gyrfalcon Mass Storage System. Access Our HPC User Facility Learn more about these systems and how to access
HPC Software Stack Testing Framework
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garvey, Cormac
The HPC Software stack testing framework (hpcswtest) is used in the INL Scientific Computing Department to test the basic sanity and integrity of the HPC Software stack (Compilers, MPI, Numerical libraries and Applications) and to quickly discover hard failures, and as a by-product it will indirectly check the HPC infrastructure (network, PBS and licensing servers).
Roy Fraley Roy Fraley Professional II-Engineer Roy.Fraley@nrel.gov | 303-384-6468 Roy Fraley is the high-performance computing (HPC) data center engineer with the Computational Science Center's HPC
The Prodiguer Messaging Platform
NASA Astrophysics Data System (ADS)
Greenslade, Mark; Denvil, Sebastien; Raciazek, Jerome; Carenton, Nicolas; Levavasseur, Guillame
2014-05-01
CONVERGENCE is a French multi-partner national project designed to gather HPC and informatics expertise to innovate in the context of running French climate models with differing grids and at differing resolutions. Efficient and reliable execution of these models and the management and dissemination of model output (data and meta-data) are just some of the complexities that CONVERGENCE aims to resolve. The Institut Pierre Simon Laplace (IPSL) is responsible for running climate simulations upon a set of heterogenous HPC environments within France. With heterogeneity comes added complexity in terms of simulation instrumentation and control. Obtaining a global perspective upon the state of all simulations running upon all HPC environments has hitherto been problematic. In this presentation we detail how, within the context of CONVERGENCE, the implementation of the Prodiguer messaging platform resolves complexity and permits the development of real-time applications such as: 1. a simulation monitoring dashboard; 2. a simulation metrics visualizer; 3. an automated simulation runtime notifier; 4. an automated output data & meta-data publishing pipeline; The Prodiguer messaging platform leverages a widely used open source message broker software called RabbitMQ. RabbitMQ itself implements the Advanced Message Queue Protocol (AMPQ). Hence it will be demonstrated that the Prodiguer messaging platform is built upon both open source and open standards.
RELIABILITY, AVAILABILITY, AND SERVICEABILITY FOR PETASCALE HIGH-END COMPUTING AND BEYOND
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chokchai "Box" Leangsuksun
2011-05-31
Our project is a multi-institutional research effort that adopts interplay of RELIABILITY, AVAILABILITY, and SERVICEABILITY (RAS) aspects for solving resilience issues in highend scientific computing in the next generation of supercomputers. results lie in the following tracks: Failure prediction in a large scale HPC; Investigate reliability issues and mitigation techniques including in GPGPU-based HPC system; HPC resilience runtime & tools.
Hukerikar, Saurabh; Teranishi, Keita; Diniz, Pedro C.; ...
2017-02-11
In the presence of accelerated fault rates, which are projected to be the norm on future exascale systems, it will become increasingly difficult for high-performance computing (HPC) applications to accomplish useful computation. Due to the fault-oblivious nature of current HPC programming paradigms and execution environments, HPC applications are insufficiently equipped to deal with errors. We believe that HPC applications should be enabled with capabilities to actively search for and correct errors in their computations. The redundant multithreading (RMT) approach offers lightweight replicated execution streams of program instructions within the context of a single application process. Furthermore, the use of completemore » redundancy incurs significant overhead to the application performance.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hukerikar, Saurabh; Teranishi, Keita; Diniz, Pedro C.
In the presence of accelerated fault rates, which are projected to be the norm on future exascale systems, it will become increasingly difficult for high-performance computing (HPC) applications to accomplish useful computation. Due to the fault-oblivious nature of current HPC programming paradigms and execution environments, HPC applications are insufficiently equipped to deal with errors. We believe that HPC applications should be enabled with capabilities to actively search for and correct errors in their computations. The redundant multithreading (RMT) approach offers lightweight replicated execution streams of program instructions within the context of a single application process. Furthermore, the use of completemore » redundancy incurs significant overhead to the application performance.« less
High Performance Computing Innovation Service Portal Study (HPC-ISP)
2009-04-01
threatened by global competition. It is essential that these suppliers remain competitive and maintain their technological advantage . In this increasingly...place themselves, as well as customers who rely on them, in competitive jeopardy. Despite the potential competitive advantage associated with adopting...computing users into the HPC fold and to enable more entry-level users to exploit HPC more fully for competitive advantage . About half of the surveyed
SCEAPI: A unified Restful Web API for High-Performance Computing
NASA Astrophysics Data System (ADS)
Rongqiang, Cao; Haili, Xiao; Shasha, Lu; Yining, Zhao; Xiaoning, Wang; Xuebin, Chi
2017-10-01
The development of scientific computing is increasingly moving to collaborative web and mobile applications. All these applications need high-quality programming interface for accessing heterogeneous computing resources consisting of clusters, grid computing or cloud computing. In this paper, we introduce our high-performance computing environment that integrates computing resources from 16 HPC centers across China. Then we present a bundle of web services called SCEAPI and describe how it can be used to access HPC resources with HTTP or HTTPs protocols. We discuss SCEAPI from several aspects including architecture, implementation and security, and address specific challenges in designing compatible interfaces and protecting sensitive data. We describe the functions of SCEAPI including authentication, file transfer and job management for creating, submitting and monitoring, and how to use SCEAPI in an easy-to-use way. Finally, we discuss how to exploit more HPC resources quickly for the ATLAS experiment by implementing the custom ARC compute element based on SCEAPI, and our work shows that SCEAPI is an easy-to-use and effective solution to extend opportunistic HPC resources.
Homemade Buckeye-Pi: A Learning Many-Node Platform for High-Performance Parallel Computing
NASA Astrophysics Data System (ADS)
Amooie, M. A.; Moortgat, J.
2017-12-01
We report on the "Buckeye-Pi" cluster, the supercomputer developed in The Ohio State University School of Earth Sciences from 128 inexpensive Raspberry Pi (RPi) 3 Model B single-board computers. Each RPi is equipped with fast Quad Core 1.2GHz ARMv8 64bit processor, 1GB of RAM, and 32GB microSD card for local storage. Therefore, the cluster has a total RAM of 128GB that is distributed on the individual nodes and a flash capacity of 4TB with 512 processors, while it benefits from low power consumption, easy portability, and low total cost. The cluster uses the Message Passing Interface protocol to manage the communications between each node. These features render our platform the most powerful RPi supercomputer to date and suitable for educational applications in high-performance-computing (HPC) and handling of large datasets. In particular, we use the Buckeye-Pi to implement optimized parallel codes in our in-house simulator for subsurface media flows with the goal of achieving a massively-parallelized scalable code. We present benchmarking results for the computational performance across various number of RPi nodes. We believe our project could inspire scientists and students to consider the proposed unconventional cluster architecture as a mainstream and a feasible learning platform for challenging engineering and scientific problems.
DOVIS: an implementation for high-throughput virtual screening using AutoDock.
Zhang, Shuxing; Kumar, Kamal; Jiang, Xiaohui; Wallqvist, Anders; Reifman, Jaques
2008-02-27
Molecular-docking-based virtual screening is an important tool in drug discovery that is used to significantly reduce the number of possible chemical compounds to be investigated. In addition to the selection of a sound docking strategy with appropriate scoring functions, another technical challenge is to in silico screen millions of compounds in a reasonable time. To meet this challenge, it is necessary to use high performance computing (HPC) platforms and techniques. However, the development of an integrated HPC system that makes efficient use of its elements is not trivial. We have developed an application termed DOVIS that uses AutoDock (version 3) as the docking engine and runs in parallel on a Linux cluster. DOVIS can efficiently dock large numbers (millions) of small molecules (ligands) to a receptor, screening 500 to 1,000 compounds per processor per day. Furthermore, in DOVIS, the docking session is fully integrated and automated in that the inputs are specified via a graphical user interface, the calculations are fully integrated with a Linux cluster queuing system for parallel processing, and the results can be visualized and queried. DOVIS removes most of the complexities and organizational problems associated with large-scale high-throughput virtual screening, and provides a convenient and efficient solution for AutoDock users to use this software in a Linux cluster platform.
Cognitive Model Exploration and Optimization: A New Challenge for Computational Science
2010-03-01
the generation and analysis of computational cognitive models to explain various aspects of cognition. Typically the behavior of these models...computational scale of a workstation, so we have turned to high performance computing (HPC) clusters and volunteer computing for large-scale...computational resources. The majority of applications on the Department of Defense HPC clusters focus on solving partial differential equations (Post
Running ANSYS Fluent on the WinHPC System | High-Performance Computing |
. If you don't have one, see WinHPC system user basics. Check License Use Status Start > All Jason Lustbader. Run Using Fluent Launcher Start Fluent launcher by opening: Start > All Programs > . Available node groups can be found from HPC Job Manager. Start > All Programs > Microsoft HPC Pack
NASA Astrophysics Data System (ADS)
Christou, Michalis; Christoudias, Theodoros; Morillo, Julián; Alvarez, Damian; Merx, Hendrik
2016-09-01
We examine an alternative approach to heterogeneous cluster-computing in the many-core era for Earth system models, using the European Centre for Medium-Range Weather Forecasts Hamburg (ECHAM)/Modular Earth Submodel System (MESSy) Atmospheric Chemistry (EMAC) model as a pilot application on the Dynamical Exascale Entry Platform (DEEP). A set of autonomous coprocessors interconnected together, called Booster, complements a conventional HPC Cluster and increases its computing performance, offering extra flexibility to expose multiple levels of parallelism and achieve better scalability. The EMAC model atmospheric chemistry code (Module Efficiently Calculating the Chemistry of the Atmosphere (MECCA)) was taskified with an offload mechanism implemented using OmpSs directives. The model was ported to the MareNostrum 3 supercomputer to allow testing with Intel Xeon Phi accelerators on a production-size machine. The changes proposed in this paper are expected to contribute to the eventual adoption of Cluster-Booster division and Many Integrated Core (MIC) accelerated architectures in presently available implementations of Earth system models, towards exploiting the potential of a fully Exascale-capable platform.
A Long History of Supercomputing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grider, Gary
As part of its national security science mission, Los Alamos National Laboratory and HPC have a long, entwined history dating back to the earliest days of computing. From bringing the first problem to the nation’s first computer to building the first machine to break the petaflop barrier, Los Alamos holds many “firsts” in HPC breakthroughs. Today, supercomputers are integral to stockpile stewardship and the Laboratory continues to work with vendors in developing the future of HPC.
System Resource Allocations | High-Performance Computing | NREL
Allocations System Resource Allocations To use NREL's high-performance computing (HPC) resources : Compute hours on NREL HPC Systems including Peregrine and Eagle Storage space (in Terabytes) on Peregrine , Eagle and Gyrfalcon. Allocations are principally done in response to an annual call for allocation
Advanced Biomedical Computing Center (ABCC) | DSITP
The Advanced Biomedical Computing Center (ABCC), located in Frederick Maryland (MD), provides HPC resources for both NIH/NCI intramural scientists and the extramural biomedical research community. Its mission is to provide HPC support, to provide collaborative research, and to conduct in-house research in various areas of computational biology and biomedical research.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Klitsner, Tom
The recent Executive Order creating the National Strategic Computing Initiative (NSCI) recognizes the value of high performance computing for economic competitiveness and scientific discovery and commits to accelerate delivery of exascale computing. The HPC programs at Sandia –the NNSA ASC program and Sandia’s Institutional HPC Program– are focused on ensuring that Sandia has the resources necessary to deliver computation in the national interest.
NASA Astrophysics Data System (ADS)
Guan, Zhen; Pekurovsky, Dmitry; Luce, Jason; Thornton, Katsuyo; Lowengrub, John
The structural phase field crystal (XPFC) model can be used to model grain growth in polycrystalline materials at diffusive time-scales while maintaining atomic scale resolution. However, the governing equation of the XPFC model is an integral-partial-differential-equation (IPDE), which poses challenges in implementation onto high performance computing (HPC) platforms. In collaboration with the XSEDE Extended Collaborative Support Service, we developed a distributed memory HPC solver for the XPFC model, which combines parallel multigrid and P3DFFT. The performance benchmarking on the Stampede supercomputer indicates near linear strong and weak scaling for both multigrid and transfer time between multigrid and FFT modules up to 1024 cores. Scalability of the FFT module begins to decline at 128 cores, but it is sufficient for the type of problem we will be examining. We have demonstrated simulations using 1024 cores, and we expect to achieve 4096 cores and beyond. Ongoing work involves optimization of MPI/OpenMP-based codes for the Intel KNL Many-Core Architecture. This optimizes the code for coming pre-exascale systems, in particular many-core systems such as Stampede 2.0 and Cori 2 at NERSC, without sacrificing efficiency on other general HPC systems.
What Physicists Should Know About High Performance Computing - Circa 2002
NASA Astrophysics Data System (ADS)
Frederick, Donald
2002-08-01
High Performance Computing (HPC) is a dynamic, cross-disciplinary field that traditionally has involved applied mathematicians, computer scientists, and others primarily from the various disciplines that have been major users of HPC resources - physics, chemistry, engineering, with increasing use by those in the life sciences. There is a technological dynamic that is powered by economic as well as by technical innovations and developments. This talk will discuss practical ideas to be considered when developing numerical applications for research purposes. Even with the rapid pace of development in the field, the author believes that these concepts will not become obsolete for a while, and will be of use to scientists who either are considering, or who have already started down the HPC path. These principles will be applied in particular to current parallel HPC systems, but there will also be references of value to desktop users. The talk will cover such topics as: computing hardware basics, single-cpu optimization, compilers, timing, numerical libraries, debugging and profiling tools and the emergence of Computational Grids.
A Cross-Platform Infrastructure for Scalable Runtime Application Performance Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jack Dongarra; Shirley Moore; Bart Miller, Jeffrey Hollingsworth
2005-03-15
The purpose of this project was to build an extensible cross-platform infrastructure to facilitate the development of accurate and portable performance analysis tools for current and future high performance computing (HPC) architectures. Major accomplishments include tools and techniques for multidimensional performance analysis, as well as improved support for dynamic performance monitoring of multithreaded and multiprocess applications. Previous performance tool development has been limited by the burden of having to re-write a platform-dependent low-level substrate for each architecture/operating system pair in order to obtain the necessary performance data from the system. Manual interpretation of performance data is not scalable for large-scalemore » long-running applications. The infrastructure developed by this project provides a foundation for building portable and scalable performance analysis tools, with the end goal being to provide application developers with the information they need to analyze, understand, and tune the performance of terascale applications on HPC architectures. The backend portion of the infrastructure provides runtime instrumentation capability and access to hardware performance counters, with thread-safety for shared memory environments and a communication substrate to support instrumentation of multiprocess and distributed programs. Front end interfaces provides tool developers with a well-defined, platform-independent set of calls for requesting performance data. End-user tools have been developed that demonstrate runtime data collection, on-line and off-line analysis of performance data, and multidimensional performance analysis. The infrastructure is based on two underlying performance instrumentation technologies. These technologies are the PAPI cross-platform library interface to hardware performance counters and the cross-platform Dyninst library interface for runtime modification of executable images. The Paradyn and KOJAK projects have made use of this infrastructure to build performance measurement and analysis tools that scale to long-running programs on large parallel and distributed systems and that automate much of the search for performance bottlenecks.« less
Cockrell, Chase; An, Gary
2017-10-07
Sepsis affects nearly 1 million people in the United States per year, has a mortality rate of 28-50% and requires more than $20 billion a year in hospital costs. Over a quarter century of research has not yielded a single reliable diagnostic test or a directed therapeutic agent for sepsis. Central to this insufficiency is the fact that sepsis remains a clinical/physiological diagnosis representing a multitude of molecularly heterogeneous pathological trajectories. Advances in computational capabilities offered by High Performance Computing (HPC) platforms call for an evolution in the investigation of sepsis to attempt to define the boundaries of traditional research (bench, clinical and computational) through the use of computational proxy models. We present a novel investigatory and analytical approach, derived from how HPC resources and simulation are used in the physical sciences, to identify the epistemic boundary conditions of the study of clinical sepsis via the use of a proxy agent-based model of systemic inflammation. Current predictive models for sepsis use correlative methods that are limited by patient heterogeneity and data sparseness. We address this issue by using an HPC version of a system-level validated agent-based model of sepsis, the Innate Immune Response ABM (IIRBM), as a proxy system in order to identify boundary conditions for the possible behavioral space for sepsis. We then apply advanced analysis derived from the study of Random Dynamical Systems (RDS) to identify novel means for characterizing system behavior and providing insight into the tractability of traditional investigatory methods. The behavior space of the IIRABM was examined by simulating over 70 million sepsis patients for up to 90 days in a sweep across the following parameters: cardio-respiratory-metabolic resilience; microbial invasiveness; microbial toxigenesis; and degree of nosocomial exposure. In addition to using established methods for describing parameter space, we developed two novel methods for characterizing the behavior of a RDS: Probabilistic Basins of Attraction (PBoA) and Stochastic Trajectory Analysis (STA). Computationally generated behavioral landscapes demonstrated attractor structures around stochastic regions of behavior that could be described in a complementary fashion through use of PBoA and STA. The stochasticity of the boundaries of the attractors highlights the challenge for correlative attempts to characterize and classify clinical sepsis. HPC simulations of models like the IIRABM can be used to generate approximations of the behavior space of sepsis to both establish "boundaries of futility" with respect to existing investigatory approaches and apply system engineering principles to investigate the general dynamic properties of sepsis to provide a pathway for developing control strategies. The issues that bedevil the study and treatment of sepsis, namely clinical data sparseness and inadequate experimental sampling of system behavior space, are fundamental to nearly all biomedical research, manifesting in the "Crisis of Reproducibility" at all levels. HPC-augmented simulation-based research offers an investigatory strategy more consistent with that seen in the physical sciences (which combine experiment, theory and simulation), and an opportunity to utilize the leading advances in HPC, namely deep machine learning and evolutionary computing, to form the basis of an iterative scientific process to meet the full promise of Precision Medicine (right drug, right patient, right time). Copyright © 2017. Published by Elsevier Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mayo, Jackson R.; Chen, Frank Xiaoxiao; Pebay, Philippe Pierre
2010-06-01
Effective failure prediction and mitigation strategies in high-performance computing systems could provide huge gains in resilience of tightly coupled large-scale scientific codes. These gains would come from prediction-directed process migration and resource servicing, intelligent resource allocation, and checkpointing driven by failure predictors rather than at regular intervals based on nominal mean time to failure. Given probabilistic associations of outlier behavior in hardware-related metrics with eventual failure in hardware, system software, and/or applications, this paper explores approaches for quantifying the effects of prediction and mitigation strategies and demonstrates these using actual production system data. We describe context-relevant methodologies for determining themore » accuracy and cost-benefit of predictors. While many research studies have quantified the expected impact of growing system size, and the associated shortened mean time to failure (MTTF), on application performance in large-scale high-performance computing (HPC) platforms, there has been little if any work to quantify the possible gains from predicting system resource failures with significant but imperfect accuracy. This possibly stems from HPC system complexity and the fact that, to date, no one has established any good predictors of failure in these systems. Our work in the OVIS project aims to discover these predictors via a variety of data collection techniques and statistical analysis methods that yield probabilistic predictions. The question then is, 'How good or useful are these predictions?' We investigate methods for answering this question in a general setting, and illustrate them using a specific failure predictor discovered on a production system at Sandia.« less
High Performance Computing Modeling Advances Accelerator Science for High-Energy Physics
Amundson, James; Macridin, Alexandru; Spentzouris, Panagiotis
2014-07-28
The development and optimization of particle accelerators are essential for advancing our understanding of the properties of matter, energy, space, and time. Particle accelerators are complex devices whose behavior involves many physical effects on multiple scales. Therefore, advanced computational tools utilizing high-performance computing are essential for accurately modeling them. In the past decade, the US Department of Energy's SciDAC program has produced accelerator-modeling tools that have been employed to tackle some of the most difficult accelerator science problems. The authors discuss the Synergia framework and its applications to high-intensity particle accelerator physics. Synergia is an accelerator simulation package capable ofmore » handling the entire spectrum of beam dynamics simulations. Our authors present Synergia's design principles and its performance on HPC platforms.« less
-275-4303 Kevin Regimbal oversees NREL's High Performance Computing (HPC) Systems & Operations , engineering, and operations. Kevin is interested in data center design and computing as well as data center integration and optimization. Professional Experience HPC oversight: program manager, project manager, center
Scaling up ATLAS Event Service to production levels on opportunistic computing platforms
NASA Astrophysics Data System (ADS)
Benjamin, D.; Caballero, J.; Ernst, M.; Guan, W.; Hover, J.; Lesny, D.; Maeno, T.; Nilsson, P.; Tsulaia, V.; van Gemmeren, P.; Vaniachine, A.; Wang, F.; Wenaus, T.; ATLAS Collaboration
2016-10-01
Continued growth in public cloud and HPC resources is on track to exceed the dedicated resources available for ATLAS on the WLCG. Examples of such platforms are Amazon AWS EC2 Spot Instances, Edison Cray XC30 supercomputer, backfill at Tier 2 and Tier 3 sites, opportunistic resources at the Open Science Grid (OSG), and ATLAS High Level Trigger farm between the data taking periods. Because of specific aspects of opportunistic resources such as preemptive job scheduling and data I/O, their efficient usage requires workflow innovations provided by the ATLAS Event Service. Thanks to the finer granularity of the Event Service data processing workflow, the opportunistic resources are used more efficiently. We report on our progress in scaling opportunistic resource usage to double-digit levels in ATLAS production.
A Long History of Supercomputing
Grider, Gary
2018-06-13
As part of its national security science mission, Los Alamos National Laboratory and HPC have a long, entwined history dating back to the earliest days of computing. From bringing the first problem to the nationâs first computer to building the first machine to break the petaflop barrier, Los Alamos holds many âfirstsâ in HPC breakthroughs. Today, supercomputers are integral to stockpile stewardship and the Laboratory continues to work with vendors in developing the future of HPC.
Comparative Implementation of High Performance Computing for Power System Dynamic Simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Shuangshuang; Huang, Zhenyu; Diao, Ruisheng
Dynamic simulation for transient stability assessment is one of the most important, but intensive, computations for power system planning and operation. Present commercial software is mainly designed for sequential computation to run a single simulation, which is very time consuming with a single processer. The application of High Performance Computing (HPC) to dynamic simulations is very promising in accelerating the computing process by parallelizing its kernel algorithms while maintaining the same level of computation accuracy. This paper describes the comparative implementation of four parallel dynamic simulation schemes in two state-of-the-art HPC environments: Message Passing Interface (MPI) and Open Multi-Processing (OpenMP).more » These implementations serve to match the application with dedicated multi-processor computing hardware and maximize the utilization and benefits of HPC during the development process.« less
System-Level Virtualization Research at Oak Ridge National Laboratory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Scott, Stephen L; Vallee, Geoffroy R; Naughton, III, Thomas J
2010-01-01
System-level virtualization is today enjoying a rebirth as a technique to effectively share what were then considered large computing resources to subsequently fade from the spotlight as individual workstations gained in popularity with a one machine - one user approach. One reason for this resurgence is that the simple workstation has grown in capability to rival that of anything available in the past. Thus, computing centers are again looking at the price/performance benefit of sharing that single computing box via server consolidation. However, industry is only concentrating on the benefits of using virtualization for server consolidation (enterprise computing) whereas ourmore » interest is in leveraging virtualization to advance high-performance computing (HPC). While these two interests may appear to be orthogonal, one consolidating multiple applications and users on a single machine while the other requires all the power from many machines to be dedicated solely to its purpose, we propose that virtualization does provide attractive capabilities that may be exploited to the benefit of HPC interests. This does raise the two fundamental questions of: is the concept of virtualization (a machine sharing technology) really suitable for HPC and if so, how does one go about leveraging these virtualization capabilities for the benefit of HPC. To address these questions, this document presents ongoing studies on the usage of system-level virtualization in a HPC context. These studies include an analysis of the benefits of system-level virtualization for HPC, a presentation of research efforts based on virtualization for system availability, and a presentation of research efforts for the management of virtual systems. The basis for this document was material presented by Stephen L. Scott at the Collaborative and Grid Computing Technologies meeting held in Cancun, Mexico on April 12-14, 2007.« less
Cognitive Model Exploration and Optimization: A New Challenge for Computational Science
2010-01-01
Introduction Research in cognitive science often involves the generation and analysis of computational cognitive models to explain various...HPC) clusters and volunteer computing for large-scale computational resources. The majority of applications on the Department of Defense HPC... clusters focus on solving partial differential equations (Post, 2009). These tend to be lean, fast models with little noise. While we lack specific
2011-08-01
5 Figure 4 Architetural diagram of running Blender on Amazon EC2 through Nimbis...classification of streaming data. Example input images (top left). All digit prototypes (cluster centers) found, with size proportional to frequency (top...Figure 4 Architetural diagram of running Blender on Amazon EC2 through Nimbis 1 http
Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel; Oliker, Leonid; Vuduc, Richard
2008-10-16
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific-optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD quad-core, AMD dual-core, and Intel quad-core designs, the heterogeneous STI Cell, as well as one ofmore » the first scientific studies of the highly multithreaded Sun Victoria Falls (a Niagara2 SMP). We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural trade-offs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.« less
Parallel computing in genomic research: advances and applications
Ocaña, Kary; de Oliveira, Daniel
2015-01-01
Today’s genomic experiments have to process the so-called “biological big data” that is now reaching the size of Terabytes and Petabytes. To process this huge amount of data, scientists may require weeks or months if they use their own workstations. Parallelism techniques and high-performance computing (HPC) environments can be applied for reducing the total processing time and to ease the management, treatment, and analyses of this data. However, running bioinformatics experiments in HPC environments such as clouds, grids, clusters, and graphics processing unit requires the expertise from scientists to integrate computational, biological, and mathematical techniques and technologies. Several solutions have already been proposed to allow scientists for processing their genomic experiments using HPC capabilities and parallelism techniques. This article brings a systematic review of literature that surveys the most recently published research involving genomics and parallel computing. Our objective is to gather the main characteristics, benefits, and challenges that can be considered by scientists when running their genomic experiments to benefit from parallelism techniques and HPC capabilities. PMID:26604801
High-performance computing with quantum processing units
Britt, Keith A.; Oak Ridge National Lab.; Humble, Travis S.; ...
2017-03-01
The prospects of quantum computing have driven efforts to realize fully functional quantum processing units (QPUs). Recent success in developing proof-of-principle QPUs has prompted the question of how to integrate these emerging processors into modern high-performance computing (HPC) systems. We examine how QPUs can be integrated into current and future HPC system architectures by accounting for func- tional and physical design requirements. We identify two integration pathways that are differentiated by infrastructure constraints on the QPU and the use cases expected for the HPC system. This includes a tight integration that assumes infrastructure bottlenecks can be overcome as well asmore » a loose integration that as- sumes they cannot. We find that the performance of both approaches is likely to depend on the quantum interconnect that serves to entangle multiple QPUs. As a result, we also identify several challenges in assessing QPU performance for HPC, and we consider new metrics that capture the interplay between system architecture and the quantum parallelism underlying computational performance.« less
High-performance computing with quantum processing units
DOE Office of Scientific and Technical Information (OSTI.GOV)
Britt, Keith A.; Oak Ridge National Lab.; Humble, Travis S.
The prospects of quantum computing have driven efforts to realize fully functional quantum processing units (QPUs). Recent success in developing proof-of-principle QPUs has prompted the question of how to integrate these emerging processors into modern high-performance computing (HPC) systems. We examine how QPUs can be integrated into current and future HPC system architectures by accounting for func- tional and physical design requirements. We identify two integration pathways that are differentiated by infrastructure constraints on the QPU and the use cases expected for the HPC system. This includes a tight integration that assumes infrastructure bottlenecks can be overcome as well asmore » a loose integration that as- sumes they cannot. We find that the performance of both approaches is likely to depend on the quantum interconnect that serves to entangle multiple QPUs. As a result, we also identify several challenges in assessing QPU performance for HPC, and we consider new metrics that capture the interplay between system architecture and the quantum parallelism underlying computational performance.« less
Parallel computing in genomic research: advances and applications.
Ocaña, Kary; de Oliveira, Daniel
2015-01-01
Today's genomic experiments have to process the so-called "biological big data" that is now reaching the size of Terabytes and Petabytes. To process this huge amount of data, scientists may require weeks or months if they use their own workstations. Parallelism techniques and high-performance computing (HPC) environments can be applied for reducing the total processing time and to ease the management, treatment, and analyses of this data. However, running bioinformatics experiments in HPC environments such as clouds, grids, clusters, and graphics processing unit requires the expertise from scientists to integrate computational, biological, and mathematical techniques and technologies. Several solutions have already been proposed to allow scientists for processing their genomic experiments using HPC capabilities and parallelism techniques. This article brings a systematic review of literature that surveys the most recently published research involving genomics and parallel computing. Our objective is to gather the main characteristics, benefits, and challenges that can be considered by scientists when running their genomic experiments to benefit from parallelism techniques and HPC capabilities.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sreepathi, Sarat; Kumar, Jitendra; Mills, Richard T.
A proliferation of data from vast networks of remote sensing platforms (satellites, unmanned aircraft systems (UAS), airborne etc.), observational facilities (meteorological, eddy covariance etc.), state-of-the-art sensors, and simulation models offer unprecedented opportunities for scientific discovery. Unsupervised classification is a widely applied data mining approach to derive insights from such data. However, classification of very large data sets is a complex computational problem that requires efficient numerical algorithms and implementations on high performance computing (HPC) platforms. Additionally, increasing power, space, cooling and efficiency requirements has led to the deployment of hybrid supercomputing platforms with complex architectures and memory hierarchies like themore » Titan system at Oak Ridge National Laboratory. The advent of such accelerated computing architectures offers new challenges and opportunities for big data analytics in general and specifically, large scale cluster analysis in our case. Although there is an existing body of work on parallel cluster analysis, those approaches do not fully meet the needs imposed by the nature and size of our large data sets. Moreover, they had scaling limitations and were mostly limited to traditional distributed memory computing platforms. We present a parallel Multivariate Spatio-Temporal Clustering (MSTC) technique based on k-means cluster analysis that can target hybrid supercomputers like Titan. We developed a hybrid MPI, CUDA and OpenACC implementation that can utilize both CPU and GPU resources on computational nodes. We describe performance results on Titan that demonstrate the scalability and efficacy of our approach in processing large ecological data sets.« less
Ayachit, Utkarsh; Bauer, Andrew; Duque, Earl P. N.; ...
2016-11-01
A key trend facing extreme-scale computational science is the widening gap between computational and I/O rates, and the challenge that follows is how to best gain insight from simulation data when it is increasingly impractical to save it to persistent storage for subsequent visual exploration and analysis. One approach to this challenge is centered around the idea of in situ processing, where visualization and analysis processing is performed while data is still resident in memory. Our paper examines several key design and performance issues related to the idea of in situ processing at extreme scale on modern platforms: Scalability, overhead,more » performance measurement and analysis, comparison and contrast with a traditional post hoc approach, and interfacing with simulation codes. We illustrate these principles in practice with studies, conducted on large-scale HPC platforms, that include a miniapplication and multiple science application codes, one of which demonstrates in situ methods in use at greater than 1M-way concurrency.« less
Status report of the end-to-end ASKAP software system: towards early science operations
NASA Astrophysics Data System (ADS)
Guzman, Juan Carlos; Chapman, Jessica; Marquarding, Malte; Whiting, Matthew
2016-08-01
The Australian SKA Pathfinder (ASKAP) is a novel centimetre radio synthesis telescope currently in the commissioning phase and located in the midwest region of Western Australia. It comprises of 36 x 12 m diameter reflector antennas each equipped with state-of-the-art and award winning Phased Array Feeds (PAF) technology. The PAFs provide a wide, 30 square degree field-of-view by forming up to 36 separate dual-polarisation beams at once. This results in a high data rate: 70 TB of correlated visibilities in an 8-hour observation, requiring custom-written, high-performance software running in dedicated High Performance Computing (HPC) facilities. The first six antennas equipped with first-generation PAF technology (Mark I), named the Boolardy Engineering Test Array (BETA) have been in use since 2014 as a platform to test PAF calibration and imaging techniques, and along the way it has been producing some great science results. Commissioning of the ASKAP Array Release 1, that is the first six antennas with second-generation PAFs (Mark II) is currently under way. An integral part of the instrument is the Central Processor platform hosted at the Pawsey Supercomputing Centre in Perth, which executes custom-written software pipelines, designed specifically to meet the ASKAP imaging requirements of wide field of view and high dynamic range. There are three key hardware components of the Central Processor: The ingest nodes (16 x node cluster), the fast temporary storage (1 PB Lustre file system) and the processing supercomputer (200 TFlop system). This High-Performance Computing (HPC) platform is managed and supported by the Pawsey support team. Due to the limited amount of data generated by BETA and the first ASKAP Array Release, the Central Processor platform has been running in a more "traditional" or user-interactive mode. But this is about to change: integration and verification of the online ingest pipeline starts in early 2016, which is required to support the full 300 MHz bandwidth for Array Release 1; followed by the deployment of the real-time data processing components. In addition to the Central Processor, the first production release of the CSIRO ASKAP Science Data Archive (CASDA) has also been deployed in one of the Pawsey Supercomputing Centre facilities and it is integrated to the end-to-end ASKAP data flow system. This paper describes the current status of the "end-to-end" data flow software system from preparing observations to data acquisition, processing and archiving; and the challenges of integrating an HPC facility as a key part of the instrument. It also shares some lessons learned since the start of integration activities and the challenges ahead in preparation for the start of the Early Science program.
Notredame, Cedric
2018-05-02
Cedric Notredame from the Centre for Genomic Regulation gives a presentation on New Challenges of the Computation of Multiple Sequence Alignments in the High-Throughput Era at the JGI/Argonne HPC Workshop on January 26, 2010.
Cockrell, Robert Chase; Christley, Scott; Chang, Eugene; An, Gary
2015-01-01
Perhaps the greatest challenge currently facing the biomedical research community is the ability to integrate highly detailed cellular and molecular mechanisms to represent clinical disease states as a pathway to engineer effective therapeutics. This is particularly evident in the representation of organ-level pathophysiology in terms of abnormal tissue structure, which, through histology, remains a mainstay in disease diagnosis and staging. As such, being able to generate anatomic scale simulations is a highly desirable goal. While computational limitations have previously constrained the size and scope of multi-scale computational models, advances in the capacity and availability of high-performance computing (HPC) resources have greatly expanded the ability of computational models of biological systems to achieve anatomic, clinically relevant scale. Diseases of the intestinal tract are exemplary examples of pathophysiological processes that manifest at multiple scales of spatial resolution, with structural abnormalities present at the microscopic, macroscopic and organ-levels. In this paper, we describe a novel, massively parallel computational model of the gut, the Spatially Explicitly General-purpose Model of Enteric Tissue_HPC (SEGMEnT_HPC), which extends an existing model of the gut epithelium, SEGMEnT, in order to create cell-for-cell anatomic scale simulations. We present an example implementation of SEGMEnT_HPC that simulates the pathogenesis of ileal pouchitis, and important clinical entity that affects patients following remedial surgery for ulcerative colitis.
On-demand provisioning of HEP compute resources on cloud sites and shared HPC centers
NASA Astrophysics Data System (ADS)
Erli, G.; Fischer, F.; Fleig, G.; Giffels, M.; Hauth, T.; Quast, G.; Schnepf, M.; Heese, J.; Leppert, K.; Arnaez de Pedro, J.; Sträter, R.
2017-10-01
This contribution reports on solutions, experiences and recent developments with the dynamic, on-demand provisioning of remote computing resources for analysis and simulation workflows. Local resources of a physics institute are extended by private and commercial cloud sites, ranging from the inclusion of desktop clusters over institute clusters to HPC centers. Rather than relying on dedicated HEP computing centers, it is nowadays more reasonable and flexible to utilize remote computing capacity via virtualization techniques or container concepts. We report on recent experience from incorporating a remote HPC center (NEMO Cluster, Freiburg University) and resources dynamically requested from the commercial provider 1&1 Internet SE into our intitute’s computing infrastructure. The Freiburg HPC resources are requested via the standard batch system, allowing HPC and HEP applications to be executed simultaneously, such that regular batch jobs run side by side to virtual machines managed via OpenStack [1]. For the inclusion of the 1&1 commercial resources, a Python API and SDK as well as the possibility to upload images were available. Large scale tests prove the capability to serve the scientific use case in the European 1&1 datacenters. The described environment at the Institute of Experimental Nuclear Physics (IEKP) at KIT serves the needs of researchers participating in the CMS and Belle II experiments. In total, resources exceeding half a million CPU hours have been provided by remote sites.
NASA Astrophysics Data System (ADS)
Huang, Qian
2014-09-01
Scientific computing often requires the availability of a massive number of computers for performing large-scale simulations, and computing in mineral physics is no exception. In order to investigate physical properties of minerals at extreme conditions in computational mineral physics, parallel computing technology is used to speed up the performance by utilizing multiple computer resources to process a computational task simultaneously thereby greatly reducing computation time. Traditionally, parallel computing has been addressed by using High Performance Computing (HPC) solutions and installed facilities such as clusters and super computers. Today, it has been seen that there is a tremendous growth in cloud computing. Infrastructure as a Service (IaaS), the on-demand and pay-as-you-go model, creates a flexible and cost-effective mean to access computing resources. In this paper, a feasibility report of HPC on a cloud infrastructure is presented. It is found that current cloud services in IaaS layer still need to improve performance to be useful to research projects. On the other hand, Software as a Service (SaaS), another type of cloud computing, is introduced into an HPC system for computing in mineral physics, and an application of which is developed. In this paper, an overall description of this SaaS application is presented. This contribution can promote cloud application development in computational mineral physics, and cross-disciplinary studies.
DOE Office of Scientific and Technical Information (OSTI.GOV)
O'Leary, Patrick
The primary challenge motivating this project is the widening gap between the ability to compute information and to store it for subsequent analysis. This gap adversely impacts science code teams, who can perform analysis only on a small fraction of the data they calculate, resulting in the substantial likelihood of lost or missed science, when results are computed but not analyzed. Our approach is to perform as much analysis or visualization processing on data while it is still resident in memory, which is known as in situ processing. The idea in situ processing was not new at the time ofmore » the start of this effort in 2014, but efforts in that space were largely ad hoc, and there was no concerted effort within the research community that aimed to foster production-quality software tools suitable for use by Department of Energy (DOE) science projects. Our objective was to produce and enable the use of production-quality in situ methods and infrastructure, at scale, on DOE high-performance computing (HPC) facilities, though we expected to have an impact beyond DOE due to the widespread nature of the challenges, which affect virtually all large-scale computational science efforts. To achieve this objective, we engaged in software technology research and development (R&D), in close partnerships with DOE science code teams, to produce software technologies that were shown to run efficiently at scale on DOE HPC platforms.« less
EMHP: an accurate automated hole masking algorithm for single-particle cryo-EM image processing.
Berndsen, Zachary; Bowman, Charles; Jang, Haerin; Ward, Andrew B
2017-12-01
The Electron Microscopy Hole Punch (EMHP) is a streamlined suite of tools for quick assessment, sorting and hole masking of electron micrographs. With recent advances in single-particle electron cryo-microscopy (cryo-EM) data processing allowing for the rapid determination of protein structures using a smaller computational footprint, we saw the need for a fast and simple tool for data pre-processing that could run independent of existing high-performance computing (HPC) infrastructures. EMHP provides a data preprocessing platform in a small package that requires minimal python dependencies to function. https://www.bitbucket.org/chazbot/emhp Apache 2.0 License. bowman@scripps.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
Connecting Performance Analysis and Visualization to Advance Extreme Scale Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bremer, Peer-Timo; Mohr, Bernd; Schulz, Martin
2015-07-29
The characterization, modeling, analysis, and tuning of software performance has been a central topic in High Performance Computing (HPC) since its early beginnings. The overall goal is to make HPC software run faster on particular hardware, either through better scheduling, on-node resource utilization, or more efficient distributed communication.
NASA Astrophysics Data System (ADS)
Evans, Ben; Allen, Chris; Antony, Joseph; Bastrakova, Irina; Gohar, Kashif; Porter, David; Pugh, Tim; Santana, Fabiana; Smillie, Jon; Trenham, Claire; Wang, Jingbo; Wyborn, Lesley
2015-04-01
The National Computational Infrastructure (NCI) has established a powerful and flexible in-situ petascale computational environment to enable both high performance computing and Data-intensive Science across a wide spectrum of national environmental and earth science data collections - in particular climate, observational data and geoscientific assets. This paper examines 1) the computational environments that supports the modelling and data processing pipelines, 2) the analysis environments and methods to support data analysis, and 3) the progress so far to harmonise the underlying data collections for future interdisciplinary research across these large volume data collections. NCI has established 10+ PBytes of major national and international data collections from both the government and research sectors based on six themes: 1) weather, climate, and earth system science model simulations, 2) marine and earth observations, 3) geosciences, 4) terrestrial ecosystems, 5) water and hydrology, and 6) astronomy, social and biosciences. Collectively they span the lithosphere, crust, biosphere, hydrosphere, troposphere, and stratosphere. The data is largely sourced from NCI's partners (which include the custodians of many of the major Australian national-scale scientific collections), leading research communities, and collaborating overseas organisations. New infrastructures created at NCI mean the data collections are now accessible within an integrated High Performance Computing and Data (HPC-HPD) environment - a 1.2 PFlop supercomputer (Raijin), a HPC class 3000 core OpenStack cloud system and several highly connected large-scale high-bandwidth Lustre filesystems. The hardware was designed at inception to ensure that it would allow the layered software environment to flexibly accommodate the advancement of future data science. New approaches to software technology and data models have also had to be developed to enable access to these large and exponentially increasing data volumes at NCI. Traditional HPC and data environments are still made available in a way that flexibly provides the tools, services and supporting software systems on these new petascale infrastructures. But to enable the research to take place at this scale, the data, metadata and software now need to evolve together - creating a new integrated high performance infrastructure. The new infrastructure at NCI currently supports a catalogue of integrated, reusable software and workflows from earth system and ecosystem modelling, weather research, satellite and other observed data processing and analysis. One of the challenges for NCI has been to support existing techniques and methods, while carefully preparing the underlying infrastructure for the transition needed for the next class of Data-intensive Science. In doing so, a flexible range of techniques and software can be made available for application across the corpus of data collections available, and to provide a new infrastructure for future interdisciplinary research.
2016-11-01
Feasibility of using Shape Memory Alloys for Gas Turbine Blade Actuation by Kathryn Esham, Luis Bravo, Anindya Ghoshal, Muthuvel Murugan, and Michael...Computational Study on the Feasibility of using Shape Memory Alloys for Gas Turbine Blade Actuation by Luis Bravo, Anindya Ghoshal, Muthuvel...High Performance Computing (HPC)-Enabled Computational Study on the Feasibility of using Shape Memory Alloys for Gas Turbine Blade Actuation 5a
OpenTopography: Addressing Big Data Challenges Using Cloud Computing, HPC, and Data Analytics
NASA Astrophysics Data System (ADS)
Crosby, C. J.; Nandigam, V.; Phan, M.; Youn, C.; Baru, C.; Arrowsmith, R.
2014-12-01
OpenTopography (OT) is a geoinformatics-based data facility initiated in 2009 for democratizing access to high-resolution topographic data, derived products, and tools. Hosted at the San Diego Supercomputer Center (SDSC), OT utilizes cyberinfrastructure, including large-scale data management, high-performance computing, and service-oriented architectures to provide efficient Web based access to large, high-resolution topographic datasets. OT collocates data with processing tools to enable users to quickly access custom data and derived products for their application. OT's ongoing R&D efforts aim to solve emerging technical challenges associated with exponential growth in data, higher order data products, as well as user base. Optimization of data management strategies can be informed by a comprehensive set of OT user access metrics that allows us to better understand usage patterns with respect to the data. By analyzing the spatiotemporal access patterns within the datasets, we can map areas of the data archive that are highly active (hot) versus the ones that are rarely accessed (cold). This enables us to architect a tiered storage environment consisting of high performance disk storage (SSD) for the hot areas and less expensive slower disk for the cold ones, thereby optimizing price to performance. From a compute perspective, OT is looking at cloud based solutions such as the Microsoft Azure platform to handle sudden increases in load. An OT virtual machine image in Microsoft's VM Depot can be invoked and deployed quickly in response to increased system demand. OT has also integrated SDSC HPC systems like the Gordon supercomputer into our infrastructure tier to enable compute intensive workloads like parallel computation of hydrologic routing on high resolution topography. This capability also allows OT to scale to HPC resources during high loads to meet user demand and provide more efficient processing. With a growing user base and maturing scientific user community comes new requests for algorithms and processing capabilities. To address this demand, OT is developing an extensible service based architecture for integrating community-developed software. This "plugable" approach to Web service deployment will enable new processing and analysis tools to run collocated with OT hosted data.
High Productivity Computing Systems and Competitiveness Initiative
2007-07-01
planning committee for the annual, international Supercomputing Conference in 2004 and 2005. This is the leading HPC industry conference in the world. It...sector partnerships. Partnerships will form a key part of discussions at the 2nd High Performance Computing Users Conference, planned for July 13, 2005...other things an interagency roadmap for high-end computing core technologies and an accessibility improvement plan . Improving HPC Education and
The Livermore Brain: Massive Deep Learning Networks Enabled by High Performance Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Barry Y.
The proliferation of inexpensive sensor technologies like the ubiquitous digital image sensors has resulted in the collection and sharing of vast amounts of unsorted and unexploited raw data. Companies and governments who are able to collect and make sense of large datasets to help them make better decisions more rapidly will have a competitive advantage in the information era. Machine Learning technologies play a critical role for automating the data understanding process; however, to be maximally effective, useful intermediate representations of the data are required. These representations or “features” are transformations of the raw data into a form where patternsmore » are more easily recognized. Recent breakthroughs in Deep Learning have made it possible to learn these features from large amounts of labeled data. The focus of this project is to develop and extend Deep Learning algorithms for learning features from vast amounts of unlabeled data and to develop the HPC neural network training platform to support the training of massive network models. This LDRD project succeeded in developing new unsupervised feature learning algorithms for images and video and created a scalable neural network training toolkit for HPC. Additionally, this LDRD helped create the world’s largest freely-available image and video dataset supporting open multimedia research and used this dataset for training our deep neural networks. This research helped LLNL capture several work-for-others (WFO) projects, attract new talent, and establish collaborations with leading academic and commercial partners. Finally, this project demonstrated the successful training of the largest unsupervised image neural network using HPC resources and helped establish LLNL leadership at the intersection of Machine Learning and HPC research.« less
User Account Passwords | High-Performance Computing | NREL
Account Passwords User Account Passwords For NREL's high-performance computing (HPC) systems, learn about user account password requirements and how to set up, log in, and change passwords. Password Logging In the First Time After you request an HPC user account, you'll receive a temporary password. Set
Expanding HPC and Research Computing--The Sustainable Way
ERIC Educational Resources Information Center
Grush, Mary
2009-01-01
Increased demands for research and high-performance computing (HPC)--along with growing expectations for cost and environmental savings--are putting new strains on the campus data center. More and more, CIOs like the University of Notre Dame's (Indiana) Gordon Wishon are seeking creative ways to build more sustainable models for data center and…
Data Security Policy | High-Performance Computing | NREL
to use its high-performance computing (HPC) systems. NREL HPC systems are operated as research systems and may only contain data related to scientific research. These systems are categorized as low per sensitive or non-sensitive. One example of sensitive data would be personally identifiable information (PII
Shared Storage Usage Policy | High-Performance Computing | NREL
Shared Storage Usage Policy Shared Storage Usage Policy To use NREL's high-performance computing (HPC) systems, you must abide by the Shared Storage Usage Policy. /projects NREL HPC allocations include storage space in the /projects filesystem. However, /projects is a shared resource and project
Business Models of High Performance Computing Centres in Higher Education in Europe
ERIC Educational Resources Information Center
Eurich, Markus; Calleja, Paul; Boutellier, Roman
2013-01-01
High performance computing (HPC) service centres are a vital part of the academic infrastructure of higher education organisations. However, despite their importance for research and the necessary high capital expenditures, business research on HPC service centres is mostly missing. From a business perspective, it is important to find an answer to…
Uncover the Cloud for Geospatial Sciences and Applications to Adopt Cloud Computing
NASA Astrophysics Data System (ADS)
Yang, C.; Huang, Q.; Xia, J.; Liu, K.; Li, J.; Xu, C.; Sun, M.; Bambacus, M.; Xu, Y.; Fay, D.
2012-12-01
Cloud computing is emerging as the future infrastructure for providing computing resources to support and enable scientific research, engineering development, and application construction, as well as work force education. On the other hand, there is a lot of doubt about the readiness of cloud computing to support a variety of scientific research, development and educations. This research is a project funded by NASA SMD to investigate through holistic studies how ready is the cloud computing to support geosciences. Four applications with different computing characteristics including data, computing, concurrent, and spatiotemporal intensities are taken to test the readiness of cloud computing to support geosciences. Three popular and representative cloud platforms including Amazon EC2, Microsoft Azure, and NASA Nebula as well as a traditional cluster are utilized in the study. Results illustrates that cloud is ready to some degree but more research needs to be done to fully implemented the cloud benefit as advertised by many vendors and defined by NIST. Specifically, 1) most cloud platform could help stand up new computing instances, a new computer, in a few minutes as envisioned, therefore, is ready to support most computing needs in an on demand fashion; 2) the load balance and elasticity, a defining characteristic, is ready in some cloud platforms, such as Amazon EC2, to support bigger jobs, e.g., needs response in minutes, while some are not ready to support the elasticity and load balance well. All cloud platform needs further research and development to support real time application at subminute level; 3) the user interface and functionality of cloud platforms vary a lot and some of them are very professional and well supported/documented, such as Amazon EC2, some of them needs significant improvement for the general public to adopt cloud computing without professional training or knowledge about computing infrastructure; 4) the security is a big concern in cloud computing platform, with the sharing spirit of cloud computing, it is very hard to ensure higher level security, except a private cloud is built for a specific organization without public access, public cloud platform does not support FISMA medium level yet and may never be able to support FISMA high level; 5) HPC jobs needs of cloud computing is not well supported and only Amazon EC2 supports this well. The research is being taken by NASA and other agencies to consider cloud computing adoption. We hope the publication of the research would also benefit the public to adopt cloud computing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Joseph, Earl C.; Conway, Steve; Dekate, Chirag
This study investigated how high-performance computing (HPC) investments can improve economic success and increase scientific innovation. This research focused on the common good and provided uses for DOE, other government agencies, industry, and academia. The study created two unique economic models and an innovation index: 1 A macroeconomic model that depicts the way HPC investments result in economic advancements in the form of ROI in revenue (GDP), profits (and cost savings), and jobs. 2 A macroeconomic model that depicts the way HPC investments result in basic and applied innovations, looking at variations by sector, industry, country, and organization size. Amore » new innovation index that provides a means of measuring and comparing innovation levels. Key findings of the pilot study include: IDC collected the required data across a broad set of organizations, with enough detail to create these models and the innovation index. The research also developed an expansive list of HPC success stories.« less
Cockrell, Robert Chase; Christley, Scott; Chang, Eugene; An, Gary
2015-01-01
Perhaps the greatest challenge currently facing the biomedical research community is the ability to integrate highly detailed cellular and molecular mechanisms to represent clinical disease states as a pathway to engineer effective therapeutics. This is particularly evident in the representation of organ-level pathophysiology in terms of abnormal tissue structure, which, through histology, remains a mainstay in disease diagnosis and staging. As such, being able to generate anatomic scale simulations is a highly desirable goal. While computational limitations have previously constrained the size and scope of multi-scale computational models, advances in the capacity and availability of high-performance computing (HPC) resources have greatly expanded the ability of computational models of biological systems to achieve anatomic, clinically relevant scale. Diseases of the intestinal tract are exemplary examples of pathophysiological processes that manifest at multiple scales of spatial resolution, with structural abnormalities present at the microscopic, macroscopic and organ-levels. In this paper, we describe a novel, massively parallel computational model of the gut, the Spatially Explicitly General-purpose Model of Enteric Tissue_HPC (SEGMEnT_HPC), which extends an existing model of the gut epithelium, SEGMEnT, in order to create cell-for-cell anatomic scale simulations. We present an example implementation of SEGMEnT_HPC that simulates the pathogenesis of ileal pouchitis, and important clinical entity that affects patients following remedial surgery for ulcerative colitis. PMID:25806784
WinHPC System Software | High-Performance Computing | NREL
Software WinHPC System Software Learn about the software applications, tools, toolchains, and for industrial applications. Intel Compilers Development Tool, Toolchain Suite featuring an industry
Low latency network and distributed storage for next generation HPC systems: the ExaNeSt project
NASA Astrophysics Data System (ADS)
Ammendola, R.; Biagioni, A.; Cretaro, P.; Frezza, O.; Lo Cicero, F.; Lonardo, A.; Martinelli, M.; Paolucci, P. S.; Pastorelli, E.; Pisani, F.; Simula, F.; Vicini, P.; Navaridas, J.; Chaix, F.; Chrysos, N.; Katevenis, M.; Papaeustathiou, V.
2017-10-01
With processor architecture evolution, the HPC market has undergone a paradigm shift. The adoption of low-cost, Linux-based clusters extended the reach of HPC from its roots in modelling and simulation of complex physical systems to a broader range of industries, from biotechnology, cloud computing, computer analytics and big data challenges to manufacturing sectors. In this perspective, the near future HPC systems can be envisioned as composed of millions of low-power computing cores, densely packed — meaning cooling by appropriate technology — with a tightly interconnected, low latency and high performance network and equipped with a distributed storage architecture. Each of these features — dense packing, distributed storage and high performance interconnect — represents a challenge, made all the harder by the need to solve them at the same time. These challenges lie as stumbling blocks along the road towards Exascale-class systems; the ExaNeSt project acknowledges them and tasks itself with investigating ways around them.
Prediction and characterization of application power use in a high-performance computing environment
Bugbee, Bruce; Phillips, Caleb; Egan, Hilary; ...
2017-02-27
Power use in data centers and high-performance computing (HPC) facilities has grown in tandem with increases in the size and number of these facilities. Substantial innovation is needed to enable meaningful reduction in energy footprints in leadership-class HPC systems. In this paper, we focus on characterizing and investigating application-level power usage. We demonstrate potential methods for predicting power usage based on a priori and in situ characteristics. Lastly, we highlight a potential use case of this method through a simulated power-aware scheduler using historical jobs from a real scientific HPC system.
User Accounts | High-Performance Computing | NREL
see information on user account policies. ACCOUNT PASSWORDS Logging in for the first time? Forgot your Accounts User Accounts Learn how to request an NREL HPC user account. Request an HPC Account To request an HPC account, please complete our request form. This form is provided using DocuSign. REQUEST
NASA Technical Reports Server (NTRS)
Son, Chang H.
2012-01-01
The Human Powered Centrifuge (HPC) is a facility that is planned to be installed on board the International Space Station (ISS) to enable crew exercises under the artificial gravity conditions. The HPC equipment includes a "bicycle" for long-term exercises of a crewmember that provides power for rotation of HPC at a speed of 30 rpm. The crewmember exercising vigorously on the centrifuge generates the amount of carbon dioxide of about two times higher than a crewmember in ordinary conditions. The goal of the study is to analyze the airflow and carbon dioxide distribution within Pressurized Multipurpose Module (PMM) cabin when HPC is operating. A full unsteady formulation is used for airflow and CO2 transport CFD-based modeling with the so-called sliding mesh concept when the HPC equipment with the adjacent Bay 4 cabin volume is considered in the rotating reference frame while the rest of the cabin volume is considered in the stationary reference frame. The rotating part of the computational domain includes also a human body model. Localized effects of carbon dioxide dispersion are examined. Strong influence of the rotating HPC equipment on the CO2 distribution detected is discussed.
The Case for Modular Redundancy in Large-Scale High Performance Computing Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Engelmann, Christian; Ong, Hong Hoe; Scott, Stephen L
2009-01-01
Recent investigations into resilience of large-scale high-performance computing (HPC) systems showed a continuous trend of decreasing reliability and availability. Newly installed systems have a lower mean-time to failure (MTTF) and a higher mean-time to recover (MTTR) than their predecessors. Modular redundancy is being used in many mission critical systems today to provide for resilience, such as for aerospace and command \\& control systems. The primary argument against modular redundancy for resilience in HPC has always been that the capability of a HPC system, and respective return on investment, would be significantly reduced. We argue that modular redundancy can significantly increasemore » compute node availability as it removes the impact of scale from single compute node MTTR. We further argue that single compute nodes can be much less reliable, and therefore less expensive, and still be highly available, if their MTTR/MTTF ratio is maintained.« less
Chip-scale integrated optical interconnects: a key enabler for future high-performance computing
NASA Astrophysics Data System (ADS)
Haney, Michael; Nair, Rohit; Gu, Tian
2012-01-01
High Performance Computing (HPC) systems are putting ever-increasing demands on the throughput efficiency of their interconnection fabrics. In this paper, the limits of conventional metal trace-based inter-chip interconnect fabrics are examined in the context of state-of-the-art HPC systems, which currently operate near the 1 GFLOPS/W level. The analysis suggests that conventional metal trace interconnects will limit performance to approximately 6 GFLOPS/W in larger HPC systems that require many computer chips to be interconnected in parallel processing architectures. As the HPC communications bottlenecks push closer to the processing chips, integrated Optical Interconnect (OI) technology may provide the ultra-high bandwidths needed at the inter- and intra-chip levels. With inter-chip photonic link energies projected to be less than 1 pJ/bit, integrated OI is projected to enable HPC architecture scaling to the 50 GFLOPS/W level and beyond - providing a path to Peta-FLOPS-level HPC within a single rack, and potentially even Exa-FLOPSlevel HPC for large systems. A new hybrid integrated chip-scale OI approach is described and evaluated. The concept integrates a high-density polymer waveguide fabric directly on top of a multiple quantum well (MQW) modulator array that is area-bonded to the Silicon computing chip. Grayscale lithography is used to fabricate 5 μm x 5 μm polymer waveguides and associated novel small-footprint total internal reflection-based vertical input/output couplers directly onto a layer containing an array of GaAs MQW devices configured to be either absorption modulators or photodetectors. An external continuous wave optical "power supply" is coupled into the waveguide links. Contrast ratios were measured using a test rider chip in place of a Silicon processing chip. The results suggest that sub-pJ/b chip-scale communication is achievable with this concept. When integrated into high-density integrated optical interconnect fabrics, it could provide a seamless interconnect fabric spanning the intra-
The impact of the U.S. supercomputing initiative will be global
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crawford, Dona
2016-01-15
Last July, President Obama issued an executive order that created a coordinated federal strategy for HPC research, development, and deployment called the U.S. National Strategic Computing Initiative (NSCI). However, this bold, necessary step toward building the next generation of supercomputers has inaugurated a new era for U.S. high performance computing (HPC).
HPC in a HEP lab: lessons learned from setting up cost-effective HPC clusters
NASA Astrophysics Data System (ADS)
Husejko, Michal; Agtzidis, Ioannis; Baehler, Pierre; Dul, Tadeusz; Evans, John; Himyr, Nils; Meinhard, Helge
2015-12-01
In this paper we present our findings gathered during the evaluation and testing of Windows Server High-Performance Computing (Windows HPC) in view of potentially using it as a production HPC system for engineering applications. The Windows HPC package, an extension of Microsofts Windows Server product, provides all essential interfaces, utilities and management functionality for creating, operating and monitoring a Windows-based HPC cluster infrastructure. The evaluation and test phase was focused on verifying the functionalities of Windows HPC, its performance, support of commercial tools and the integration with the users work environment. We describe constraints imposed by the way the CERN Data Centre is operated, licensing for engineering tools and scalability and behaviour of the HPC engineering applications used at CERN. We will present an initial set of requirements, which were created based on the above constraints and requests from the CERN engineering user community. We will explain how we have configured Windows HPC clusters to provide job scheduling functionalities required to support the CERN engineering user community, quality of service, user- and project-based priorities, and fair access to limited resources. Finally, we will present several performance tests we carried out to verify Windows HPC performance and scalability.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Yousu; Etingov, Pavel V.; Ren, Huiying
This paper describes a probabilistic look-ahead contingency analysis application that incorporates smart sampling and high-performance computing (HPC) techniques. Smart sampling techniques are implemented to effectively represent the structure and statistical characteristics of uncertainty introduced by different sources in the power system. They can significantly reduce the data set size required for multiple look-ahead contingency analyses, and therefore reduce the time required to compute them. High-performance-computing (HPC) techniques are used to further reduce computational time. These two techniques enable a predictive capability that forecasts the impact of various uncertainties on potential transmission limit violations. The developed package has been tested withmore » real world data from the Bonneville Power Administration. Case study results are presented to demonstrate the performance of the applications developed.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Willis, D. K.
2016-12-01
High performance computing (HPC) has been a defining strength of Lawrence Livermore National Laboratory (LLNL) since its founding. Livermore scientists have designed and used some of the world’s most powerful computers to drive breakthroughs in nearly every mission area. Today, the Laboratory is recognized as a world leader in the application of HPC to complex science, technology, and engineering challenges. Most importantly, HPC has been integral to the National Nuclear Security Administration’s (NNSA’s) Stockpile Stewardship Program—designed to ensure the safety, security, and reliability of our nuclear deterrent without nuclear testing. A critical factor behind Lawrence Livermore’s preeminence in HPC ismore » the ongoing investments made by the Laboratory Directed Research and Development (LDRD) Program in cutting-edge concepts to enable efficient utilization of these powerful machines. Congress established the LDRD Program in 1991 to maintain the technical vitality of the Department of Energy (DOE) national laboratories. Since then, LDRD has been, and continues to be, an essential tool for exploring anticipated needs that lie beyond the planning horizon of our programs and for attracting the next generation of talented visionaries. Through LDRD, Livermore researchers can examine future challenges, propose and explore innovative solutions, and deliver creative approaches to support our missions. The present scientific and technical strengths of the Laboratory are, in large part, a product of past LDRD investments in HPC. Here, we provide seven examples of LDRD projects from the past decade that have played a critical role in building LLNL’s HPC, computer science, mathematics, and data science research capabilities, and describe how they have impacted LLNL’s mission.« less
Scalable Analysis Methods and In Situ Infrastructure for Extreme Scale Knowledge Discovery
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bethel, Wes
2016-07-24
The primary challenge motivating this team’s work is the widening gap between the ability to compute information and to store it for subsequent analysis. This gap adversely impacts science code teams, who are able to perform analysis only on a small fraction of the data they compute, resulting in the very real likelihood of lost or missed science, when results are computed but not analyzed. Our approach is to perform as much analysis or visualization processing on data while it is still resident in memory, an approach that is known as in situ processing. The idea in situ processing wasmore » not new at the time of the start of this effort in 2014, but efforts in that space were largely ad hoc, and there was no concerted effort within the research community that aimed to foster production-quality software tools suitable for use by DOE science projects. In large, our objective was produce and enable use of production-quality in situ methods and infrastructure, at scale, on DOE HPC facilities, though we expected to have impact beyond DOE due to the widespread nature of the challenges, which affect virtually all large-scale computational science efforts. To achieve that objective, we assembled a unique team of researchers consisting of representatives from DOE national laboratories, academia, and industry, and engaged in software technology R&D, as well as engaged in close partnerships with DOE science code teams, to produce software technologies that were shown to run effectively at scale on DOE HPC platforms.« less
Gyrokinetic particle-in-cell optimization on emerging multi- and manycore platforms
Madduri, Kamesh; Im, Eun-Jin; Ibrahim, Khaled Z.; ...
2011-03-02
The next decade of high-performance computing (HPC) systems will see a rapid evolution and divergence of multi- and manycore architectures as power and cooling constraints limit increases in microprocessor clock speeds. Understanding efficient optimization methodologies on diverse multicore designs in the context of demanding numerical methods is one of the greatest challenges faced today by the HPC community. In this paper, we examine the efficient multicore optimization of GTC, a petascale gyrokinetic toroidal fusion code for studying plasma microturbulence in tokamak devices. For GTC’s key computational components (charge deposition and particle push), we explore efficient parallelization strategies across a broadmore » range of emerging multicore designs, including the recently-released Intel Nehalem-EX, the AMD Opteron Istanbul, and the highly multithreaded Sun UltraSparc T2+. We also present the first study on tuning gyrokinetic particle-in-cell (PIC) algorithms for graphics processors, using the NVIDIA C2050 (Fermi). Our work discusses several novel optimization approaches for gyrokinetic PIC, including mixed-precision computation, particle binning and decomposition strategies, grid replication, SIMDized atomic floating-point operations, and effective GPU texture memory utilization. Overall, we achieve significant performance improvements of 1.3–4.7× on these complex PIC kernels, despite the inherent challenges of data dependency and locality. Finally, our work also points to several architectural and programming features that could significantly enhance PIC performance and productivity on next-generation architectures.« less
NASA Astrophysics Data System (ADS)
Callaghan, S.; Maechling, P. J.; Juve, G.; Vahi, K.; Deelman, E.; Jordan, T. H.
2015-12-01
The CyberShake computational platform, developed by the Southern California Earthquake Center (SCEC), is an integrated collection of scientific software and middleware that performs 3D physics-based probabilistic seismic hazard analysis (PSHA) for Southern California. CyberShake integrates large-scale and high-throughput research codes to produce probabilistic seismic hazard curves for individual locations of interest and hazard maps for an entire region. A recent CyberShake calculation produced about 500,000 two-component seismograms for each of 336 locations, resulting in over 300 million synthetic seismograms in a Los Angeles-area probabilistic seismic hazard model. CyberShake calculations require a series of scientific software programs. Early computational stages produce data used as inputs by later stages, so we describe CyberShake calculations using a workflow definition language. Scientific workflow tools automate and manage the input and output data and enable remote job execution on large-scale HPC systems. To satisfy the requests of broad impact users of CyberShake data, such as seismologists, utility companies, and building code engineers, we successfully completed CyberShake Study 15.4 in April and May 2015, calculating a 1 Hz urban seismic hazard map for Los Angeles. We distributed the calculation between the NSF Track 1 system NCSA Blue Waters, the DOE Leadership-class system OLCF Titan, and USC's Center for High Performance Computing. This study ran for over 5 weeks, burning about 1.1 million node-hours and producing over half a petabyte of data. The CyberShake Study 15.4 results doubled the maximum simulated seismic frequency from 0.5 Hz to 1.0 Hz as compared to previous studies, representing a factor of 16 increase in computational complexity. We will describe how our workflow tools supported splitting the calculation across multiple systems. We will explain how we modified CyberShake software components, including GPU implementations and migrating from file-based communication to MPI messaging, to greatly reduce the I/O demands and node-hour requirements of CyberShake. We will also present performance metrics from CyberShake Study 15.4, and discuss challenges that producers of Big Data on open-science HPC resources face moving forward.
Evolution of the Virtualized HPC Infrastructure of Novosibirsk Scientific Center
NASA Astrophysics Data System (ADS)
Adakin, A.; Anisenkov, A.; Belov, S.; Chubarov, D.; Kalyuzhny, V.; Kaplin, V.; Korol, A.; Kuchin, N.; Lomakin, S.; Nikultsev, V.; Skovpen, K.; Sukharev, A.; Zaytsev, A.
2012-12-01
Novosibirsk Scientific Center (NSC), also known worldwide as Akademgorodok, is one of the largest Russian scientific centers hosting Novosibirsk State University (NSU) and more than 35 research organizations of the Siberian Branch of Russian Academy of Sciences including Budker Institute of Nuclear Physics (BINP), Institute of Computational Technologies, and Institute of Computational Mathematics and Mathematical Geophysics (ICM&MG). Since each institute has specific requirements on the architecture of computing farms involved in its research field, currently we've got several computing facilities hosted by NSC institutes, each optimized for a particular set of tasks, of which the largest are the NSU Supercomputer Center, Siberian Supercomputer Center (ICM&MG), and a Grid Computing Facility of BINP. A dedicated optical network with the initial bandwidth of 10 Gb/s connecting these three facilities was built in order to make it possible to share the computing resources among the research communities, thus increasing the efficiency of operating the existing computing facilities and offering a common platform for building the computing infrastructure for future scientific projects. Unification of the computing infrastructure is achieved by extensive use of virtualization technology based on XEN and KVM platforms. This contribution gives a thorough review of the present status and future development prospects for the NSC virtualized computing infrastructure and the experience gained while using it for running production data analysis jobs related to HEP experiments being carried out at BINP, especially the KEDR detector experiment at the VEPP-4M electron-positron collider.
2012-01-01
Background Bioinformatics services have been traditionally provided in the form of a web-server that is hosted at institutional infrastructure and serves multiple users. This model, however, is not flexible enough to cope with the increasing number of users, increasing data size, and new requirements in terms of speed and availability of service. The advent of cloud computing suggests a new service model that provides an efficient solution to these problems, based on the concepts of "resources-on-demand" and "pay-as-you-go". However, cloud computing has not yet been introduced within bioinformatics servers due to the lack of usage scenarios and software layers that address the requirements of the bioinformatics domain. Results In this paper, we provide different use case scenarios for providing cloud computing based services, considering both the technical and financial aspects of the cloud computing service model. These scenarios are for individual users seeking computational power as well as bioinformatics service providers aiming at provision of personalized bioinformatics services to their users. We also present elasticHPC, a software package and a library that facilitates the use of high performance cloud computing resources in general and the implementation of the suggested bioinformatics scenarios in particular. Concrete examples that demonstrate the suggested use case scenarios with whole bioinformatics servers and major sequence analysis tools like BLAST are presented. Experimental results with large datasets are also included to show the advantages of the cloud model. Conclusions Our use case scenarios and the elasticHPC package are steps towards the provision of cloud based bioinformatics services, which would help in overcoming the data challenge of recent biological research. All resources related to elasticHPC and its web-interface are available at http://www.elasticHPC.org. PMID:23281941
El-Kalioby, Mohamed; Abouelhoda, Mohamed; Krüger, Jan; Giegerich, Robert; Sczyrba, Alexander; Wall, Dennis P; Tonellato, Peter
2012-01-01
Bioinformatics services have been traditionally provided in the form of a web-server that is hosted at institutional infrastructure and serves multiple users. This model, however, is not flexible enough to cope with the increasing number of users, increasing data size, and new requirements in terms of speed and availability of service. The advent of cloud computing suggests a new service model that provides an efficient solution to these problems, based on the concepts of "resources-on-demand" and "pay-as-you-go". However, cloud computing has not yet been introduced within bioinformatics servers due to the lack of usage scenarios and software layers that address the requirements of the bioinformatics domain. In this paper, we provide different use case scenarios for providing cloud computing based services, considering both the technical and financial aspects of the cloud computing service model. These scenarios are for individual users seeking computational power as well as bioinformatics service providers aiming at provision of personalized bioinformatics services to their users. We also present elasticHPC, a software package and a library that facilitates the use of high performance cloud computing resources in general and the implementation of the suggested bioinformatics scenarios in particular. Concrete examples that demonstrate the suggested use case scenarios with whole bioinformatics servers and major sequence analysis tools like BLAST are presented. Experimental results with large datasets are also included to show the advantages of the cloud model. Our use case scenarios and the elasticHPC package are steps towards the provision of cloud based bioinformatics services, which would help in overcoming the data challenge of recent biological research. All resources related to elasticHPC and its web-interface are available at http://www.elasticHPC.org.
FOSS GIS on the GFZ HPC cluster: Towards a service-oriented Scientific Geocomputation Environment
NASA Astrophysics Data System (ADS)
Loewe, P.; Klump, J.; Thaler, J.
2012-12-01
High performance compute clusters can be used as geocomputation workbenches. Their wealth of resources enables us to take on geocomputation tasks which exceed the limitations of smaller systems. These general capabilities can be harnessed via tools such as Geographic Information System (GIS), provided they are able to utilize the available cluster configuration/architecture and provide a sufficient degree of user friendliness to allow for wide application. While server-level computing is clearly not sufficient for the growing numbers of data- or computation-intense tasks undertaken, these tasks do not get even close to the requirements needed for access to "top shelf" national cluster facilities. So until recently such kind of geocomputation research was effectively barred due to lack access to of adequate resources. In this paper we report on the experiences gained by providing GRASS GIS as a software service on a HPC compute cluster at the German Research Centre for Geosciences using Platform Computing's Load Sharing Facility (LSF). GRASS GIS is the oldest and largest Free Open Source (FOSS) GIS project. During ramp up in 2011, multiple versions of GRASS GIS (v 6.4.2, 6.5 and 7.0) were installed on the HPC compute cluster, which currently consists of 234 nodes with 480 CPUs providing 3084 cores. Nineteen different processing queues with varying hardware capabilities and priorities are provided, allowing for fine-grained scheduling and load balancing. After successful initial testing, mechanisms were developed to deploy scripted geocomputation tasks onto dedicated processing queues. The mechanisms are based on earlier work by NETELER et al. (2008) and allow to use all 3084 cores for GRASS based geocomputation work. However, in practice applications are limited to fewer resources as assigned to their respective queue. Applications of the new GIS functionality comprise so far of hydrological analysis, remote sensing and the generation of maps of simulated tsunamis in the Mediterranean Sea for the Tsunami Atlas of the FP-7 TRIDEC Project (www.tridec-online.eu). This included the processing of complex problems, requiring significant amounts of processing time up to full 20 CPU days. This GRASS GIS-based service is provided as a research utility in the sense of "Software as a Service" (SaaS) and is a first step towards a GFZ corporate cloud service.
Modeling Cardiac Electrophysiology at the Organ Level in the Peta FLOPS Computing Age
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mitchell, Lawrence; Bishop, Martin; Hoetzl, Elena
2010-09-30
Despite a steep increase in available compute power, in-silico experimentation with highly detailed models of the heart remains to be challenging due to the high computational cost involved. It is hoped that next generation high performance computing (HPC) resources lead to significant reductions in execution times to leverage a new class of in-silico applications. However, performance gains with these new platforms can only be achieved by engaging a much larger number of compute cores, necessitating strongly scalable numerical techniques. So far strong scalability has been demonstrated only for a moderate number of cores, orders of magnitude below the range requiredmore » to achieve the desired performance boost.In this study, strong scalability of currently used techniques to solve the bidomain equations is investigated. Benchmark results suggest that scalability is limited to 512-4096 cores within the range of relevant problem sizes even when systems are carefully load-balanced and advanced IO strategies are employed.« less
Clearing your Desk! Software and Data Services for Collaborative Web Based GIS Analysis
NASA Astrophysics Data System (ADS)
Tarboton, D. G.; Idaszak, R.; Horsburgh, J. S.; Ames, D. P.; Goodall, J. L.; Band, L. E.; Merwade, V.; Couch, A.; Hooper, R. P.; Maidment, D. R.; Dash, P. K.; Stealey, M.; Yi, H.; Gan, T.; Gichamo, T.; Yildirim, A. A.; Liu, Y.
2015-12-01
Can your desktop computer crunch the large GIS datasets that are becoming increasingly common across the geosciences? Do you have access to or the know-how to take advantage of advanced high performance computing (HPC) capability? Web based cyberinfrastructure takes work off your desk or laptop computer and onto infrastructure or "cloud" based data and processing servers. This talk will describe the HydroShare collaborative environment and web based services being developed to support the sharing and processing of hydrologic data and models. HydroShare supports the upload, storage, and sharing of a broad class of hydrologic data including time series, geographic features and raster datasets, multidimensional space-time data, and other structured collections of data. Web service tools and a Python client library provide researchers with access to HPC resources without requiring them to become HPC experts. This reduces the time and effort spent in finding and organizing the data required to prepare the inputs for hydrologic models and facilitates the management of online data and execution of models on HPC systems. This presentation will illustrate the use of web based data and computation services from both the browser and desktop client software. These web-based services implement the Terrain Analysis Using Digital Elevation Model (TauDEM) tools for watershed delineation, generation of hydrology-based terrain information, and preparation of hydrologic model inputs. They allow users to develop scripts on their desktop computer that call analytical functions that are executed completely in the cloud, on HPC resources using input datasets stored in the cloud, without installing specialized software, learning how to use HPC, or transferring large datasets back to the user's desktop. These cases serve as examples for how this approach can be extended to other models to enhance the use of web and data services in the geosciences.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gittens, Alex; Devarakonda, Aditya; Racah, Evan
We explore the trade-offs of performing linear algebra using Apache Spark, compared to traditional C and MPI implementations on HPC platforms. Spark is designed for data analytics on cluster computing platforms with access to local disks and is optimized for data-parallel tasks. We examine three widely-used and important matrix factorizations: NMF (for physical plausibility), PCA (for its ubiquity) and CX (for data interpretability). We apply these methods to 1.6TB particle physics, 2.2TB and 16TB climate modeling and 1.1TB bioimaging data. The data matrices are tall-and-skinny which enable the algorithms to map conveniently into Spark’s data parallel model. We perform scalingmore » experiments on up to 1600 Cray XC40 nodes, describe the sources of slowdowns, and provide tuning guidance to obtain high performance.« less
Agelastos, Anthony; Allan, Benjamin; Brandt, Jim; ...
2016-05-18
A detailed understanding of HPC applications’ resource needs and their complex interactions with each other and HPC platform resources are critical to achieving scalability and performance. Such understanding has been difficult to achieve because typical application profiling tools do not capture the behaviors of codes under the potentially wide spectrum of actual production conditions and because typical monitoring tools do not capture system resource usage information with high enough fidelity to gain sufficient insight into application performance and demands. In this paper we present both system and application profiling results based on data obtained through synchronized system wide monitoring onmore » a production HPC cluster at Sandia National Laboratories (SNL). We demonstrate analytic and visualization techniques that we are using to characterize application and system resource usage under production conditions for better understanding of application resource needs. Furthermore, our goals are to improve application performance (through understanding application-to-resource mapping and system throughput) and to ensure that future system capabilities match their intended workloads.« less
Hierarchically Porous Carbon Materials for CO 2 Capture: The Role of Pore Structure
DOE Office of Scientific and Technical Information (OSTI.GOV)
Estevez, Luis; Barpaga, Dushyant; Zheng, Jian
2018-01-17
With advances in porous carbon synthesis techniques, hierarchically porous carbon (HPC) materials are being utilized as relatively new porous carbon sorbents for CO2 capture applications. These HPC materials were used as a platform to prepare samples with differing textural properties and morphologies to elucidate structure-property relationships. It was found that high microporous content, rather than overall surface area was of primary importance for predicting good CO2 capture performance. Two HPC materials were analyzed, each with near identical high surface area (~2700 m2/g) and colossally high pore volume (~10 cm3/g), but with different microporous content and pore size distributions, which ledmore » to dramatically different CO2 capture performance. Overall, large pore volumes obtained from distinct mesopores were found to significantly impact adsorption performance. From these results, an optimized HPC material was synthesized that achieved a high CO2 capacity of ~3.7 mmol/g at 25°C and 1 bar.« less
NASA Astrophysics Data System (ADS)
Topping, David; Alibay, Irfan; Bane, Michael
2017-04-01
To predict the evolving concentration, chemical composition and ability of aerosol particles to act as cloud droplets, we rely on numerical modeling. Mechanistic models attempt to account for the movement of compounds between the gaseous and condensed phases at a molecular level. This 'bottom up' approach is designed to increase our fundamental understanding. However, such models rely on predicting the properties of molecules and subsequent mixtures. For partitioning between the gaseous and condensed phases this includes: saturation vapour pressures; Henrys law coefficients; activity coefficients; diffusion coefficients and reaction rates. Current gas phase chemical mechanisms predict the existence of potentially millions of individual species. Within a dynamic ensemble model, this can often be used as justification for neglecting computationally expensive process descriptions. Indeed, on whether we can quantify the true sensitivity to uncertainties in molecular properties, even at the single aerosol particle level it has been impossible to embed fully coupled representations of process level knowledge with all possible compounds, typically relying on heavily parameterised descriptions. Relying on emerging numerical frameworks, and designed for the changing landscape of high-performance computing (HPC), in this study we focus specifically on the ability to capture activity coefficients in liquid solutions using the UNIFAC method. Activity coefficients are often neglected with the largely untested hypothesis that they are simply too computationally expensive to include in dynamic frameworks. We present results demonstrating increased computational efficiency for a range of typical scenarios, including a profiling of the energy use resulting from reliance on such computations. As the landscape of HPC changes, the latter aspect is important to consider in future applications.
Fingerprinting Communication and Computation on HPC Machines
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peisert, Sean
2010-06-02
How do we identify what is actually running on high-performance computing systems? Names of binaries, dynamic libraries loaded, or other elements in a submission to a batch queue can give clues, but binary names can be changed, and libraries provide limited insight and resolution on the code being run. In this paper, we present a method for"fingerprinting" code running on HPC machines using elements of communication and computation. We then discuss how that fingerprint can be used to determine if the code is consistent with certain other types of codes, what a user usually runs, or what the user requestedmore » an allocation to do. In some cases, our techniques enable us to fingerprint HPC codes using runtime MPI data with a high degree of accuracy.« less
Optimizing CyberShake Seismic Hazard Workflows for Large HPC Resources
NASA Astrophysics Data System (ADS)
Callaghan, S.; Maechling, P. J.; Juve, G.; Vahi, K.; Deelman, E.; Jordan, T. H.
2014-12-01
The CyberShake computational platform is a well-integrated collection of scientific software and middleware that calculates 3D simulation-based probabilistic seismic hazard curves and hazard maps for the Los Angeles region. Currently each CyberShake model comprises about 235 million synthetic seismograms from about 415,000 rupture variations computed at 286 sites. CyberShake integrates large-scale parallel and high-throughput serial seismological research codes into a processing framework in which early stages produce files used as inputs by later stages. Scientific workflow tools are used to manage the jobs, data, and metadata. The Southern California Earthquake Center (SCEC) developed the CyberShake platform using USC High Performance Computing and Communications systems and open-science NSF resources.CyberShake calculations were migrated to the NSF Track 1 system NCSA Blue Waters when it became operational in 2013, via an interdisciplinary team approach including domain scientists, computer scientists, and middleware developers. Due to the excellent performance of Blue Waters and CyberShake software optimizations, we reduced the makespan (a measure of wallclock time-to-solution) of a CyberShake study from 1467 to 342 hours. We will describe the technical enhancements behind this improvement, including judicious introduction of new GPU software, improved scientific software components, increased workflow-based automation, and Blue Waters-specific workflow optimizations.Our CyberShake performance improvements highlight the benefits of scientific workflow tools. The CyberShake workflow software stack includes the Pegasus Workflow Management System (Pegasus-WMS, which includes Condor DAGMan), HTCondor, and Globus GRAM, with Pegasus-mpi-cluster managing the high-throughput tasks on the HPC resources. The workflow tools handle data management, automatically transferring about 13 TB back to SCEC storage.We will present performance metrics from the most recent CyberShake study, executed on Blue Waters. We will compare the performance of CPU and GPU versions of our large-scale parallel wave propagation code, AWP-ODC-SGT. Finally, we will discuss how these enhancements have enabled SCEC to move forward with plans to increase the CyberShake simulation frequency to 1.0 Hz.
Integrating the Apache Big Data Stack with HPC for Big Data
NASA Astrophysics Data System (ADS)
Fox, G. C.; Qiu, J.; Jha, S.
2014-12-01
There is perhaps a broad consensus as to important issues in practical parallel computing as applied to large scale simulations; this is reflected in supercomputer architectures, algorithms, libraries, languages, compilers and best practice for application development. However, the same is not so true for data intensive computing, even though commercially clouds devote much more resources to data analytics than supercomputers devote to simulations. We look at a sample of over 50 big data applications to identify characteristics of data intensive applications and to deduce needed runtime and architectures. We suggest a big data version of the famous Berkeley dwarfs and NAS parallel benchmarks and use these to identify a few key classes of hardware/software architectures. Our analysis builds on combining HPC and ABDS the Apache big data software stack that is well used in modern cloud computing. Initial results on clouds and HPC systems are encouraging. We propose the development of SPIDAL - Scalable Parallel Interoperable Data Analytics Library -- built on system aand data abstractions suggested by the HPC-ABDS architecture. We discuss how it can be used in several application areas including Polar Science.
NASA Astrophysics Data System (ADS)
Filipcic, A.; Haug, S.; Hostettler, M.; Walker, R.; Weber, M.
2015-12-01
The Piz Daint Cray XC30 HPC system at CSCS, the Swiss National Supercomputing centre, was the highest ranked European system on TOP500 in 2014, also featuring GPU accelerators. Event generation and detector simulation for the ATLAS experiment have been enabled for this machine. We report on the technical solutions, performance, HPC policy challenges and possible future opportunities for HEP on extreme HPC systems. In particular a custom made integration to the ATLAS job submission system has been developed via the Advanced Resource Connector (ARC) middleware. Furthermore, a partial GPU acceleration of the Geant4 detector simulations has been implemented.
NASA Astrophysics Data System (ADS)
Gerhardt, Lisa; Bhimji, Wahid; Canon, Shane; Fasel, Markus; Jacobsen, Doug; Mustafa, Mustafa; Porter, Jeff; Tsulaia, Vakho
2017-10-01
Bringing HEP computing to HPC can be difficult. Software stacks are often very complicated with numerous dependencies that are difficult to get installed on an HPC system. To address this issue, NERSC has created Shifter, a framework that delivers Docker-like functionality to HPC. It works by extracting images from native formats and converting them to a common format that is optimally tuned for the HPC environment. We have used Shifter to deliver the CVMFS software stack for ALICE, ATLAS, and STAR on the supercomputers at NERSC. As well as enabling the distribution multi-TB sized CVMFS stacks to HPC, this approach also offers performance advantages. Software startup times are significantly reduced and load times scale with minimal variation to 1000s of nodes. We profile several successful examples of scientists using Shifter to make scientific analysis easily customizable and scalable. We will describe the Shifter framework and several efforts in HEP and NP to use Shifter to deliver their software on the Cori HPC system.
DOE Office of Scientific and Technical Information (OSTI.GOV)
McCaskey, Alexander J.
There is a lack of state-of-the-art HPC simulation tools for simulating general quantum computing. Furthermore, there are no real software tools that integrate current quantum computers into existing classical HPC workflows. This product, the Quantum Virtual Machine (QVM), solves this problem by providing an extensible framework for pluggable virtual, or physical, quantum processing units (QPUs). It enables the execution of low level quantum assembly codes and returns the results of such executions.
Connecting to HPC Systems | High-Performance Computing | NREL
one of the following methods, which use multi-factor authentication. First, you will need to set up If you just need access to a command line on an HPC system, use one of the following methods
Direct SSH Gateway Access to Peregrine | High Performance Computing |
can access peregrine-ssh.nrel.gov, you must have: An active NREL HPC user account (see User Accounts ) An OTP Token (see One Time Password Tokens) Logging into peregrine-ssh.nrel.gov With your HPC account
Towards Cloud-based Asynchronous Elasticity for Iterative HPC Applications
NASA Astrophysics Data System (ADS)
da Rosa Righi, Rodrigo; Facco Rodrigues, Vinicius; André da Costa, Cristiano; Kreutz, Diego; Heiss, Hans-Ulrich
2015-10-01
Elasticity is one of the key features of cloud computing. It allows applications to dynamically scale computing and storage resources, avoiding over- and under-provisioning. In high performance computing (HPC), initiatives are normally modeled to handle bag-of-tasks or key-value applications through a load balancer and a loosely-coupled set of virtual machine (VM) instances. In the joint-field of Message Passing Interface (MPI) and tightly-coupled HPC applications, we observe the need of rewriting source codes, previous knowledge of the application and/or stop-reconfigure-and-go approaches to address cloud elasticity. Besides, there are problems related to how profit this new feature in the HPC scope, since in MPI 2.0 applications the programmers need to handle communicators by themselves, and a sudden consolidation of a VM, together with a process, can compromise the entire execution. To address these issues, we propose a PaaS-based elasticity model, named AutoElastic. It acts as a middleware that allows iterative HPC applications to take advantage of dynamic resource provisioning of cloud infrastructures without any major modification. AutoElastic provides a new concept denoted here as asynchronous elasticity, i.e., it provides a framework to allow applications to either increase or decrease their computing resources without blocking the current execution. The feasibility of AutoElastic is demonstrated through a prototype that runs a CPU-bound numerical integration application on top of the OpenNebula middleware. The results showed the saving of about 3 min at each scaling out operations, emphasizing the contribution of the new concept on contexts where seconds are precious.
Supervising simulations with the Prodiguer Messaging Platform
NASA Astrophysics Data System (ADS)
Greenslade, Mark; Carenton, Nicolas; Denvil, Sebastien
2015-04-01
At any one moment in time, researchers affiliated with the Institut Pierre Simon Laplace (IPSL) climate modeling group, are running hundreds of global climate simulations. These simulations execute upon a heterogeneous set of High Performance Computing (HPC) environments spread throughout France. The IPSL's simulation execution runtime is called libIGCM (library for IPSL Global Climate Modeling group). libIGCM has recently been enhanced so as to support realtime operational use cases. Such use cases include simulation monitoring, data publication, environment metrics collection, automated simulation control … etc. At the core of this enhancement is the Prodiguer messaging platform. libIGCM now emits information, in the form of messages, for remote processing at IPSL servers in Paris. The remote message processing takes several forms, for example: 1. Persisting message content to database(s); 2. Notifying an operator of changes in a simulation's execution status; 3. Launching rollback jobs upon simulation failure; 4. Dynamically updating controlled vocabularies; 5. Notifying downstream applications such as the Prodiguer web portal; We will describe how the messaging platform has been implemented from a technical perspective and demonstrate the Prodiguer web portal receiving realtime notifications.
Optimisation of multiplet identifier processing on a PLAYSTATION® 3
NASA Astrophysics Data System (ADS)
Hattori, Masami; Mizuno, Takashi
2010-02-01
To enable high-performance computing (HPC) for applications with large datasets using a Sony® PLAYSTATION® 3 (PS3™) video game console, we configured a hybrid system consisting of a Windows® PC and a PS3™. To validate this system, we implemented the real-time multiplet identifier (RTMI) application, which identifies multiplets of microearthquakes in terms of the similarity of their waveforms. The cross-correlation computation, which is a core algorithm of the RTMI application, was optimised for the PS3™ platform, while the rest of the computation, including data input and output remained on the PC. With this configuration, the core part of the algorithm ran 69 times faster than the original program, accelerating total computation speed more than five times. As a result, the system processed up to 2100 total microseismic events, whereas the original implementation had a limit of 400 events. These results indicate that this system enables high-performance computing for large datasets using the PS3™, as long as data transfer time is negligible compared with computation time.
Direct numerical simulation of reactor two-phase flows enabled by high-performance computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fang, Jun; Cambareri, Joseph J.; Brown, Cameron S.
Nuclear reactor two-phase flows remain a great engineering challenge, where the high-resolution two-phase flow database which can inform practical model development is still sparse due to the extreme reactor operation conditions and measurement difficulties. Owing to the rapid growth of computing power, the direct numerical simulation (DNS) is enjoying a renewed interest in investigating the related flow problems. A combination between DNS and an interface tracking method can provide a unique opportunity to study two-phase flows based on first principles calculations. More importantly, state-of-the-art high-performance computing (HPC) facilities are helping unlock this great potential. This paper reviews the recent researchmore » progress of two-phase flow DNS related to reactor applications. The progress in large-scale bubbly flow DNS has been focused not only on the sheer size of those simulations in terms of resolved Reynolds number, but also on the associated advanced modeling and analysis techniques. Specifically, the current areas of active research include modeling of sub-cooled boiling, bubble coalescence, as well as the advanced post-processing toolkit for bubbly flow simulations in reactor geometries. A novel bubble tracking method has been developed to track the evolution of bubbles in two-phase bubbly flow. Also, spectral analysis of DNS database in different geometries has been performed to investigate the modulation of the energy spectrum slope due to bubble-induced turbulence. In addition, the single-and two-phase analysis results are presented for turbulent flows within the pressurized water reactor (PWR) core geometries. The related simulations are possible to carry out only with the world leading HPC platforms. These simulations are allowing more complex turbulence model development and validation for use in 3D multiphase computational fluid dynamics (M-CFD) codes.« less
Parallel Application Performance on Two Generations of Intel Xeon HPC Platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Christopher H.; Long, Hai; Sides, Scott
2015-10-15
Two next-generation node configurations hosting the Haswell microarchitecture were tested with a suite of microbenchmarks and application examples, and compared with a current Ivy Bridge production node on NREL" tm s Peregrine high-performance computing cluster. A primary conclusion from this study is that the additional cores are of little value to individual task performance--limitations to application parallelism, or resource contention among concurrently running but independent tasks, limits effective utilization of these added cores. Hyperthreading generally impacts throughput negatively, but can improve performance in the absence of detailed attention to runtime workflow configuration. The observations offer some guidance to procurement ofmore » future HPC systems at NREL. First, raw core count must be balanced with available resources, particularly memory bandwidth. Balance-of-system will determine value more than processor capability alone. Second, hyperthreading continues to be largely irrelevant to the workloads that are commonly seen, and were tested here, at NREL. Finally, perhaps the most impactful enhancement to productivity might occur through enabling multiple concurrent jobs per node. Given the right type and size of workload, more may be achieved by doing many slow things at once, than fast things in order.« less
Visualization assisted by parallel processing
NASA Astrophysics Data System (ADS)
Lange, B.; Rey, H.; Vasques, X.; Puech, W.; Rodriguez, N.
2011-01-01
This paper discusses the experimental results of our visualization model for data extracted from sensors. The objective of this paper is to find a computationally efficient method to produce a real time rendering visualization for a large amount of data. We develop visualization method to monitor temperature variance of a data center. Sensors are placed on three layers and do not cover all the room. We use particle paradigm to interpolate data sensors. Particles model the "space" of the room. In this work we use a partition of the particle set, using two mathematical methods: Delaunay triangulation and Voronoý cells. Avis and Bhattacharya present these two algorithms in. Particles provide information on the room temperature at different coordinates over time. To locate and update particles data we define a computational cost function. To solve this function in an efficient way, we use a client server paradigm. Server computes data and client display this data on different kind of hardware. This paper is organized as follows. The first part presents related algorithm used to visualize large flow of data. The second part presents different platforms and methods used, which was evaluated in order to determine the better solution for the task proposed. The benchmark use the computational cost of our algorithm that formed based on located particles compared to sensors and on update of particles value. The benchmark was done on a personal computer using CPU, multi core programming, GPU programming and hybrid GPU/CPU. GPU programming method is growing in the research field; this method allows getting a real time rendering instates of a precompute rendering. For improving our results, we compute our algorithm on a High Performance Computing (HPC), this benchmark was used to improve multi-core method. HPC is commonly used in data visualization (astronomy, physic, etc) for improving the rendering and getting real-time.
NASA Astrophysics Data System (ADS)
Xue, Bo; Mao, Bingjing; Chen, Xiaomei; Ni, Guoqiang
2010-11-01
This paper renders a configurable distributed high performance computing(HPC) framework for TDI-CCD imaging simulation. It uses strategy pattern to adapt multi-algorithms. Thus, this framework help to decrease the simulation time with low expense. Imaging simulation for TDI-CCD mounted on satellite contains four processes: 1) atmosphere leads degradation, 2) optical system leads degradation, 3) electronic system of TDI-CCD leads degradation and re-sampling process, 4) data integration. Process 1) to 3) utilize diversity data-intensity algorithms such as FFT, convolution and LaGrange Interpol etc., which requires powerful CPU. Even uses Intel Xeon X5550 processor, regular series process method takes more than 30 hours for a simulation whose result image size is 1500 * 1462. With literature study, there isn't any mature distributing HPC framework in this field. Here we developed a distribute computing framework for TDI-CCD imaging simulation, which is based on WCF[1], uses Client/Server (C/S) layer and invokes the free CPU resources in LAN. The server pushes the process 1) to 3) tasks to those free computing capacity. Ultimately we rendered the HPC in low cost. In the computing experiment with 4 symmetric nodes and 1 server , this framework reduced about 74% simulation time. Adding more asymmetric nodes to the computing network, the time decreased namely. In conclusion, this framework could provide unlimited computation capacity in condition that the network and task management server are affordable. And this is the brand new HPC solution for TDI-CCD imaging simulation and similar applications.
Development of a HIPAA-compliant environment for translational research data and analytics.
Bradford, Wayne; Hurdle, John F; LaSalle, Bernie; Facelli, Julio C
2014-01-01
High-performance computing centers (HPC) traditionally have far less restrictive privacy management policies than those encountered in healthcare. We show how an HPC can be re-engineered to accommodate clinical data while retaining its utility in computationally intensive tasks such as data mining, machine learning, and statistics. We also discuss deploying protected virtual machines. A critical planning step was to engage the university's information security operations and the information security and privacy office. Access to the environment requires a double authentication mechanism. The first level of authentication requires access to the university's virtual private network and the second requires that the users be listed in the HPC network information service directory. The physical hardware resides in a data center with controlled room access. All employees of the HPC and its users take the university's local Health Insurance Portability and Accountability Act training series. In the first 3 years, researcher count has increased from 6 to 58.
NASA Astrophysics Data System (ADS)
Harris, C. K.; Overeem, I.; Hutton, E.; Moriarty, J.; Wiberg, P.
2016-12-01
Numerical models are increasingly used for both research and applied sciences, and it is important that we train students to run models and analyze model data. This is especially true within oceanographic sciences, many of which use hydrodynamic models to address oceanographic transport problems. These models, however, often require a fair amount of training and computer skills before a student can run the models and analyze the large data sets produced by the models. One example is the Regional Ocean Modeling System (ROMS), an open source, three-dimensional primitive equation hydrodynamic ocean model that uses a structured curvilinear horizontal grid. It currently has thousands of users worldwide, and the full model includes modules for sediment transport and biogeochemistry, and several options for turbulence closures and numerical schemes. Implementing ROMS can be challenging to students, however, in part because the code was designed to provide flexibility for the choice of model parameterizations and processes, and to run on a variety of High Performance Computing (HPC) platforms. To provide a more accessible tool for classroom use, we have modified an existing idealized ROMS implementation to be run on a High Performance Computer (HPC) via the WMT (Web Modeling Toolkit), and developed a series of lesson plans that explore sediment transport within the idealized model domain. This has addressed our goal to provide a relatively easy introduction to the numerical modeling process that can be used within upper level undergraduate and graduate classes to explore sediment transport on continental shelves. The model implementation includes wave forcing, along-shelf currents, a riverine source, and suspended sediment transport. The model calculates suspended transport and deposition of sediment delivered to the continental shelf by a riverine flood. Lesson plans lead the students through running the model on a remote HPC, modifying the standard model. The lesson plans also include instruction for visualizing the model output within Matlab and Panoply. The lesson plans have been used within graduate, undergraduate classrooms, as well as in clinics aimed at educators. Feedback from these exercises has been used to improve the lesson plans and model implementation.
IonGAP: integrative bacterial genome analysis for Ion Torrent sequence data.
Baez-Ortega, Adrian; Lorenzo-Diaz, Fabian; Hernandez, Mariano; Gonzalez-Vila, Carlos Ignacio; Roda-Garcia, Jose Luis; Colebrook, Marcos; Flores, Carlos
2015-09-01
We introduce IonGAP, a publicly available Web platform designed for the analysis of whole bacterial genomes using Ion Torrent sequence data. Besides assembly, it integrates a variety of comparative genomics, annotation and bacterial classification routines, based on the widely used FASTQ, BAM and SRA file formats. Benchmarking with different datasets evidenced that IonGAP is a fast, powerful and simple-to-use bioinformatics tool. By releasing this platform, we aim to translate low-cost bacterial genome analysis for microbiological prevention and control in healthcare, agroalimentary and pharmaceutical industry applications. IonGAP is hosted by the ITER's Teide-HPC supercomputer and is freely available on the Web for non-commercial use at http://iongap.hpc.iter.es. mcolesan@ull.edu.es or cflores@ull.edu.es Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gerber, Richard; Allcock, William; Beggio, Chris
2014-10-17
U.S. Department of Energy (DOE) High Performance Computing (HPC) facilities are on the verge of a paradigm shift in the way they deliver systems and services to science and engineering teams. Research projects are producing a wide variety of data at unprecedented scale and level of complexity, with community-specific services that are part of the data collection and analysis workflow. On June 18-19, 2014 representatives from six DOE HPC centers met in Oakland, CA at the DOE High Performance Operational Review (HPCOR) to discuss how they can best provide facilities and services to enable large-scale data-driven scientific discovery at themore » DOE national laboratories. The report contains findings from that review.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Armstrong, Robert C.; Ray, Jaideep; Malony, A.
2003-11-01
We present a case study of performance measurement and modeling of a CCA (Common Component Architecture) component-based application in a high performance computing environment. We explore issues peculiar to component-based HPC applications and propose a performance measurement infrastructure for HPC based loosely on recent work done for Grid environments. A prototypical implementation of the infrastructure is used to collect data for a three components in a scientific application and construct performance models for two of them. Both computational and message-passing performance are addressed.
NASA Astrophysics Data System (ADS)
Spinuso, Alessandro; Krause, Amy; Ramos Garcia, Clàudia; Casarotti, Emanuele; Magnoni, Federica; Klampanos, Iraklis A.; Frobert, Laurent; Krischer, Lion; Trani, Luca; David, Mario; Leong, Siew Hoon; Muraleedharan, Visakh
2014-05-01
The EU-funded project VERCE (Virtual Earthquake and seismology Research Community in Europe) aims to deploy technologies which satisfy the HPC and data-intensive requirements of modern seismology. As a result of VERCE's official collaboration with the EU project SCI-BUS, access to computational resources, like local clusters and international infrastructures (EGI and PRACE), is made homogeneous and integrated within a dedicated science gateway based on the gUSE framework. In this presentation we give a detailed overview on the progress achieved with the developments of the VERCE Science Gateway, according to a use-case driven implementation strategy. More specifically, we show how the computational technologies and data services have been integrated within a tool for Seismic Forward Modelling, whose objective is to offer the possibility to perform simulations of seismic waves as a service to the seismological community. We will introduce the interactive components of the OGC map based web interface and how it supports the user with setting up the simulation. We will go through the selection of input data, which are either fetched from federated seismological web services, adopting community standards, or provided by the users themselves by accessing their own document data store. The HPC scientific codes can be selected from a number of waveform simulators, currently available to the seismological community as batch tools or with limited configuration capabilities in their interactive online versions. The results will be staged out from the HPC via a secure GridFTP transfer to a VERCE data layer managed by iRODS. The provenance information of the simulation will be automatically cataloged by the data layer via NoSQL techonologies. We will try to demonstrate how data access, validation and visualisation can be supported by a general purpose provenance framework which, besides common provenance concepts imported from the OPM and the W3C-PROV initiatives, also offers an extensible metadata archive including community and user defined metadata and annotations. Finally, we will show how the VERCE Gateway platform will allow the customisation of pre and post processing phases of the simulation workflows, thanks to the availability of a registry of processing elements (PEs,) which are easily developed and maintained by the seismologists.
REVEAL: An Extensible Reduced Order Model Builder for Simulation and Modeling
DOE Office of Scientific and Technical Information (OSTI.GOV)
Agarwal, Khushbu; Sharma, Poorva; Ma, Jinliang
2013-04-30
Many science domains need to build computationally efficient and accurate representations of high fidelity, computationally expensive simulations. These computationally efficient versions are known as reduced-order models. This paper presents the design and implementation of a novel reduced-order model (ROM) builder, the REVEAL toolset. This toolset generates ROMs based on science- and engineering-domain specific simulations executed on high performance computing (HPC) platforms. The toolset encompasses a range of sampling and regression methods that can be used to generate a ROM, automatically quantifies the ROM accuracy, and provides support for an iterative approach to improve ROM accuracy. REVEAL is designed to bemore » extensible in order to utilize the core functionality with any simulator that has published input and output formats. It also defines programmatic interfaces to include new sampling and regression techniques so that users can ‘mix and match’ mathematical techniques to best suit the characteristics of their model. In this paper, we describe the architecture of REVEAL and demonstrate its usage with a computational fluid dynamics model used in carbon capture.« less
Algorithm for fast event parameters estimation on GEM acquired data
NASA Astrophysics Data System (ADS)
Linczuk, Paweł; Krawczyk, Rafał D.; Poźniak, Krzysztof T.; Kasprowicz, Grzegorz; Wojeński, Andrzej; Chernyshova, Maryna; Czarski, Tomasz
2016-09-01
We present study of a software-hardware environment for developing fast computation with high throughput and low latency methods, which can be used as back-end in High Energy Physics (HEP) and other High Performance Computing (HPC) systems, based on high amount of input from electronic sensor based front-end. There is a parallelization possibilities discussion and testing on Intel HPC solutions with consideration of applications with Gas Electron Multiplier (GEM) measurement systems presented in this paper.
Contact Us | High-Performance Computing | NREL
Select Peregrine Merlin WinHPC Allocation project handle (if requesting HPC account) Description of "SEND REQUEST" and nothing happens, it most likely means you forgot to provide information in a required field. You may need to scroll up to see what required information is missing
WinHPC System Programming | High-Performance Computing | NREL
Programming WinHPC System Programming Learn how to build and run an MPI (message passing interface (mpi.h) and library (msmpi.lib) are. To build from the command line, run... Start > Intel Software Development Tools > Intel C++ Compiler Professional... > C++ Build Environment for applications running
National Energy Research Scientific Computing Center
Overview NERSC Mission Contact us Staff Org Chart NERSC History NERSC Stakeholders Usage and User HPC Requirements Reviews NERSC HPC Achievement Awards User Submitted Research Citations NERSC User data archive NERSC Resources Table For Users Live Status User Announcements My NERSC Getting Started
A framework for graph-based synthesis, analysis, and visualization of HPC cluster job data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mayo, Jackson R.; Kegelmeyer, W. Philip, Jr.; Wong, Matthew H.
The monitoring and system analysis of high performance computing (HPC) clusters is of increasing importance to the HPC community. Analysis of HPC job data can be used to characterize system usage and diagnose and examine failure modes and their effects. This analysis is not straightforward, however, due to the complex relationships that exist between jobs. These relationships are based on a number of factors, including shared compute nodes between jobs, proximity of jobs in time, etc. Graph-based techniques represent an approach that is particularly well suited to this problem, and provide an effective technique for discovering important relationships in jobmore » queuing and execution data. The efficacy of these techniques is rooted in the use of a semantic graph as a knowledge representation tool. In a semantic graph job data, represented in a combination of numerical and textual forms, can be flexibly processed into edges, with corresponding weights, expressing relationships between jobs, nodes, users, and other relevant entities. This graph-based representation permits formal manipulation by a number of analysis algorithms. This report presents a methodology and software implementation that leverages semantic graph-based techniques for the system-level monitoring and analysis of HPC clusters based on job queuing and execution data. Ontology development and graph synthesis is discussed with respect to the domain of HPC job data. The framework developed automates the synthesis of graphs from a database of job information. It also provides a front end, enabling visualization of the synthesized graphs. Additionally, an analysis engine is incorporated that provides performance analysis, graph-based clustering, and failure prediction capabilities for HPC systems.« less
Optimization of a Lattice Boltzmann Computation on State-of-the-Art Multicore Platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel; Carter, Jonathan; Oliker, Leonid
2009-04-10
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of search-based performance optimizations, popular in linear algebra and FFT libraries, to application-specific computational kernels. Our work applies this strategy to a lattice Boltzmann application (LBMHD) that historically has made poor use of scalar microprocessors due to its complex data structures and memory access patterns. We explore one of the broadest sets of multicore architectures in the HPC literature, including the Intel Xeon E5345 (Clovertown), AMD Opteron 2214 (Santa Rosa), AMD Opteron 2356 (Barcelona), Sun T5140 T2+ (Victoria Falls), as well asmore » a QS20 IBM Cell Blade. Rather than hand-tuning LBMHD for each system, we develop a code generator that allows us to identify a highly optimized version for each platform, while amortizing the human programming effort. Results show that our auto-tuned LBMHD application achieves up to a 15x improvement compared with the original code at a given concurrency. Additionally, we present detailed analysis of each optimization, which reveal surprising hardware bottlenecks and software challenges for future multicore systems and applications.« less
Mining Software Usage with the Automatic Library Tracking Database (ALTD)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hadri, Bilel; Fahey, Mark R
2013-01-01
Tracking software usage is important for HPC centers, computer vendors, code developers and funding agencies to provide more efficient and targeted software support, and to forecast needs and guide HPC software effort towards the Exascale era. However, accurately tracking software usage on HPC systems has been a challenging task. In this paper, we present a tool called Automatic Library Tracking Database (ALTD) that has been developed and put in production on several Cray systems. The ALTD infrastructure prototype automatically and transparently stores information about libraries linked into an application at compilation time and also the executables launched in a batchmore » job. We will illustrate the usage of libraries, compilers and third party software applications on a system managed by the National Institute for Computational Sciences.« less
WEI, GUANGQUAN; KANG, XIAOWEI; LIU, XIANPING; TANG, XING; LI, QINLONG; HAN, JUNTAO; YIN, HONG
2015-01-01
Regardless of the controversial pathogenesis, intracranial meningeal hemangiopericytoma (M-HPC) is a rare, highly cellular and vascularized mesenchymal tumor that is characterized by a high tendency for recurrence and extraneural metastasis, despite radical excision and postoperative radiotherapy. M-HPC shares similar clinical manifestations and radiological findings with meningioma, which causes difficulty in differentiation of this entity from those prognostically favorable mimics prior to surgery. Treatment of M-HPC, particularly in metastatic settings, remains a challenge. A case is described of primary M-HPC with recurrence at the initial and distant intracranial sites and extraneural multiple-organ metastases in a 36-year-old female. The metastasis of M-HPC was extremely extensive, and to the best of our knowledge this is the first case of M-HPC with delayed metastasis to the bilateral kidneys. The data suggests that preoperative computed tomography and magnetic resonance imaging could provide certain diagnostic clues and useful information for more optimal treatment planning. The results may imply that novel drugs, such as temozolomide and bevacizumab, as a component of multimodality therapy of M-HPC may deserve further investigation. PMID:26171177
Convergence in France facing Big Data era and Exascale challenges for Climate Sciences
NASA Astrophysics Data System (ADS)
Denvil, Sébastien; Dufresne, Jean-Louis; Salas, David; Meurdesoif, Yann; Valcke, Sophie; Caubel, Arnaud; Foujols, Marie-Alice; Servonnat, Jérôme; Sénési, Stéphane; Derouillat, Julien; Voury, Pascal
2014-05-01
The presentation will introduce a french national project : CONVERGENCE that has been funded for four years. This project will tackle big data and computational challenges faced by climate modeling community in HPC context. Model simulations are central to the study of complex mechanisms and feedbacks in the climate system and to provide estimates of future and past climate changes. Recent trends in climate modelling are to add more physical components in the modelled system, increasing the resolution of each individual component and the more systematic use of large suites of simulations to address many scientific questions. Climate simulations may therefore differ in their initial state, parameter values, representation of physical processes, spatial resolution, model complexity, and degree of realism or degree of idealisation. In addition, there is a strong need for evaluating, improving and monitoring the performance of climate models using a large ensemble of diagnostics and better integration of model outputs and observational data. High performance computing is currently reaching the exascale and has the potential to produce this exponential increase of size and numbers of simulations. However, post-processing, analysis, and exploration of the generated data have stalled and there is a strong need for new tools to cope with the growing size and complexity of the underlying simulations and datasets. Exascale simulations require new scalable software tools to generate, manage and mine those simulations ,and data to extract the relevant information and to take the correct decision. The primary purpose of this project is to develop a platform capable of running large ensembles of simulations with a suite of models, to handle the complex and voluminous datasets generated, to facilitate the evaluation and validation of the models and the use of higher resolution models. We propose to gather interdisciplinary skills to design, using a component-based approach, a specific programming environment for scalable scientific simulations and analytics, integrating new and efficient ways of deploying and analysing the applications on High Performance Computing (HPC) system. CONVERGENCE, gathering HPC and informatics expertise that cuts across the individual partners and the broader HPC community, will allow the national climate community to leverage information technology (IT) innovations to address its specific needs. Our methodology consists in developing an ensemble of generic elements needed to run the French climate models with different grids and different resolution, ensuring efficient and reliable execution of these models, managing large volume and number of data and allowing analysis of the results and precise evaluation of the models. These elements include data structure definition and input-output (IO), code coupling and interpolation, as well as runtime and pre/post-processing environments. A common data and metadata structure will allow transferring consistent information between the various elements. All these generic elements will be open source and publicly available. The IPSL-CM and CNRM-CM climate models will make use of these elements that will constitute a national platform for climate modelling. This platform will be used, in its entirety, to optimise and tune the next version of the IPSL-CM model and to develop a global coupled climate model with a regional grid refinement. It will also be used, at least partially, to run ensembles of the CNRM-CM model at relatively high resolution and to run a very-high resolution prototype of this model. The climate models we developed are already involved in many international projects. For instance we participate to the CMIP (Coupled Model Intercomparison Project) project that is very demanding but has a high visibility: its results are widely used and are in particular synthesised in the IPCC (Intergovernmental Panel on Climate Change) assessment reports. The CONVERGENCE project will constitute an invaluable step for the French climate community to prepare and better contribute to the next phase of the CMIP project.
Data Services in Support of High Performance Computing-Based Distributed Hydrologic Models
NASA Astrophysics Data System (ADS)
Tarboton, D. G.; Horsburgh, J. S.; Dash, P. K.; Gichamo, T.; Yildirim, A. A.; Jones, N.
2014-12-01
We have developed web-based data services to support the application of hydrologic models on High Performance Computing (HPC) systems. The purposes of these services are to provide hydrologic researchers, modelers, water managers, and users access to HPC resources without requiring them to become HPC experts and understanding the intrinsic complexities of the data services, so as to reduce the amount of time and effort spent in finding and organizing the data required to execute hydrologic models and data preprocessing tools on HPC systems. These services address some of the data challenges faced by hydrologic models that strive to take advantage of HPC. Needed data is often not in the form needed by such models, requiring researchers to spend time and effort on data preparation and preprocessing that inhibits or limits the application of these models. Another limitation is the difficult to use batch job control and queuing systems used by HPC systems. We have developed a REST-based gateway application programming interface (API) for authenticated access to HPC systems that abstracts away many of the details that are barriers to HPC use and enhances accessibility from desktop programming and scripting languages such as Python and R. We have used this gateway API to establish software services that support the delineation of watersheds to define a modeling domain, then extract terrain and land use information to automatically configure the inputs required for hydrologic models. These services support the Terrain Analysis Using Digital Elevation Model (TauDEM) tools for watershed delineation and generation of hydrology-based terrain information such as wetness index and stream networks. These services also support the derivation of inputs for the Utah Energy Balance snowmelt model used to address questions such as how climate, land cover and land use change may affect snowmelt inputs to runoff generation. To enhance access to the time varying climate data used to drive hydrologic models, we have developed services to downscale and re-grid nationally available climate analysis data from systems such as NLDAS and MERRA. These cases serve as examples for how this approach can be extended to other models to enhance the use of HPC for hydrologic modeling.
IGMS: An Integrated ISO-to-Appliance Scale Grid Modeling System
DOE Office of Scientific and Technical Information (OSTI.GOV)
Palmintier, Bryan; Hale, Elaine; Hansen, Timothy M.
This paper describes the Integrated Grid Modeling System (IGMS), a novel electric power system modeling platform for integrated transmission-distribution analysis that co-simulates off-the-shelf tools on high performance computing (HPC) platforms to offer unprecedented resolution from ISO markets down to appliances and other end uses. Specifically, the system simultaneously models hundreds or thousands of distribution systems in co-simulation with detailed Independent System Operator (ISO) markets and AGC-level reserve deployment. IGMS uses a new MPI-based hierarchical co-simulation framework to connect existing sub-domain models. Our initial efforts integrate opensource tools for wholesale markets (FESTIV), bulk AC power flow (MATPOWER), and full-featured distribution systemsmore » including physics-based end-use and distributed generation models (many instances of GridLAB-D[TM]). The modular IGMS framework enables tool substitution and additions for multi-domain analyses. This paper describes the IGMS tool, characterizes its performance, and demonstrates the impacts of the coupled simulations for analyzing high-penetration solar PV and price responsive load scenarios.« less
Peer-to-peer architectures for exascale computing : LDRD final report.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vorobeychik, Yevgeniy; Mayo, Jackson R.; Minnich, Ronald G.
2010-09-01
The goal of this research was to investigate the potential for employing dynamic, decentralized software architectures to achieve reliability in future high-performance computing platforms. These architectures, inspired by peer-to-peer networks such as botnets that already scale to millions of unreliable nodes, hold promise for enabling scientific applications to run usefully on next-generation exascale platforms ({approx} 10{sup 18} operations per second). Traditional parallel programming techniques suffer rapid deterioration of performance scaling with growing platform size, as the work of coping with increasingly frequent failures dominates over useful computation. Our studies suggest that new architectures, in which failures are treated as ubiquitousmore » and their effects are considered as simply another controllable source of error in a scientific computation, can remove such obstacles to exascale computing for certain applications. We have developed a simulation framework, as well as a preliminary implementation in a large-scale emulation environment, for exploration of these 'fault-oblivious computing' approaches. High-performance computing (HPC) faces a fundamental problem of increasing total component failure rates due to increasing system sizes, which threaten to degrade system reliability to an unusable level by the time the exascale range is reached ({approx} 10{sup 18} operations per second, requiring of order millions of processors). As computer scientists seek a way to scale system software for next-generation exascale machines, it is worth considering peer-to-peer (P2P) architectures that are already capable of supporting 10{sup 6}-10{sup 7} unreliable nodes. Exascale platforms will require a different way of looking at systems and software because the machine will likely not be available in its entirety for a meaningful execution time. Realistic estimates of failure rates range from a few times per day to more than once per hour for these platforms. P2P architectures give us a starting point for crafting applications and system software for exascale. In the context of the Internet, P2P applications (e.g., file sharing, botnets) have already solved this problem for 10{sup 6}-10{sup 7} nodes. Usually based on a fractal distributed hash table structure, these systems have proven robust in practice to constant and unpredictable outages, failures, and even subversion. For example, a recent estimate of botnet turnover (i.e., the number of machines leaving and joining) is about 11% per week. Nonetheless, P2P networks remain effective despite these failures: The Conficker botnet has grown to {approx} 5 x 10{sup 6} peers. Unlike today's system software and applications, those for next-generation exascale machines cannot assume a static structure and, to be scalable over millions of nodes, must be decentralized. P2P architectures achieve both, and provide a promising model for 'fault-oblivious computing'. This project aimed to study the dynamics of P2P networks in the context of a design for exascale systems and applications. Having no single point of failure, the most successful P2P architectures are adaptive and self-organizing. While there has been some previous work applying P2P to message passing, little attention has been previously paid to the tightly coupled exascale domain. Typically, the per-node footprint of P2P systems is small, making them ideal for HPC use. The implementation on each peer node cooperates en masse to 'heal' disruptions rather than relying on a controlling 'master' node. Understanding this cooperative behavior from a complex systems viewpoint is essential to predicting useful environments for the inextricably unreliable exascale platforms of the future. We sought to obtain theoretical insight into the stability and large-scale behavior of candidate architectures, and to work toward leveraging Sandia's Emulytics platform to test promising candidates in a realistic (ultimately {ge} 10{sup 7} nodes) setting. Our primary example applications are drawn from linear algebra: a Jacobi relaxation solver for the heat equation, and the closely related technique of value iteration in optimization. We aimed to apply P2P concepts in designing implementations capable of surviving an unreliable machine of 10{sup 6} nodes.« less
NREL Evaluates Aquarius Liquid-Cooled High-Performance Computing Technology
HPC and influence the modern data center designer towards adoption of liquid cooling. Our shared technology. Aquila and Sandia chose NREL's HPC Data Center for the initial installation and evaluation because the data center is configured for liquid cooling, along with the required instrumentation to
High performance computing (HPC) requirements for the new generation variable grid resolution (VGR) global climate models differ from that of traditional global models. A VGR global model with 15 km grids over the CONUS stretching to 60 km grids elsewhere will have about ~2.5 tim...
P43-S Computational Biology Applications Suite for High-Performance Computing (BioHPC.net)
Pillardy, J.
2007-01-01
One of the challenges of high-performance computing (HPC) is user accessibility. At the Cornell University Computational Biology Service Unit, which is also a Microsoft HPC institute, we have developed a computational biology application suite that allows researchers from biological laboratories to submit their jobs to the parallel cluster through an easy-to-use Web interface. Through this system, we are providing users with popular bioinformatics tools including BLAST, HMMER, InterproScan, and MrBayes. The system is flexible and can be easily customized to include other software. It is also scalable; the installation on our servers currently processes approximately 8500 job submissions per year, many of them requiring massively parallel computations. It also has a built-in user management system, which can limit software and/or database access to specified users. TAIR, the major database of the plant model organism Arabidopsis, and SGN, the international tomato genome database, are both using our system for storage and data analysis. The system consists of a Web server running the interface (ASP.NET C#), Microsoft SQL server (ADO.NET), compute cluster running Microsoft Windows, ftp server, and file server. Users can interact with their jobs and data via a Web browser, ftp, or e-mail. The interface is accessible at http://cbsuapps.tc.cornell.edu/.
Molgenis-impute: imputation pipeline in a box.
Kanterakis, Alexandros; Deelen, Patrick; van Dijk, Freerk; Byelas, Heorhiy; Dijkstra, Martijn; Swertz, Morris A
2015-08-19
Genotype imputation is an important procedure in current genomic analysis such as genome-wide association studies, meta-analyses and fine mapping. Although high quality tools are available that perform the steps of this process, considerable effort and expertise is required to set up and run a best practice imputation pipeline, particularly for larger genotype datasets, where imputation has to scale out in parallel on computer clusters. Here we present MOLGENIS-impute, an 'imputation in a box' solution that seamlessly and transparently automates the set up and running of all the steps of the imputation process. These steps include genome build liftover (liftovering), genotype phasing with SHAPEIT2, quality control, sample and chromosomal chunking/merging, and imputation with IMPUTE2. MOLGENIS-impute builds on MOLGENIS-compute, a simple pipeline management platform for submission and monitoring of bioinformatics tasks in High Performance Computing (HPC) environments like local/cloud servers, clusters and grids. All the required tools, data and scripts are downloaded and installed in a single step. Researchers with diverse backgrounds and expertise have tested MOLGENIS-impute on different locations and imputed over 30,000 samples so far using the 1,000 Genomes Project and new Genome of the Netherlands data as the imputation reference. The tests have been performed on PBS/SGE clusters, cloud VMs and in a grid HPC environment. MOLGENIS-impute gives priority to the ease of setting up, configuring and running an imputation. It has minimal dependencies and wraps the pipeline in a simple command line interface, without sacrificing flexibility to adapt or limiting the options of underlying imputation tools. It does not require knowledge of a workflow system or programming, and is targeted at researchers who just want to apply best practices in imputation via simple commands. It is built on the MOLGENIS compute workflow framework to enable customization with additional computational steps or it can be included in other bioinformatics pipelines. It is available as open source from: https://github.com/molgenis/molgenis-imputation.
Kim, Cheolho; Moon, Jun Hyuk
2018-06-13
Micro-supercapacitors (MSCs) are attractive for applications in next-generation mobile and wearable devices and have the potential to complement or even replace lithium batteries. However, many previous MSCs have often exhibited a low volumetric energy density with high-loading electrodes because of the nonuniform pore structure of the electrodes. To address this issue, we introduced a uniform-pore carbon electrode fabricated by 3D interference lithography. Furthermore, a hierarchical pore-patterned carbon (hPC) electrode was formed by introducing a micropore by chemical etching into the macropore carbon skeleton. The hPC electrodes were applied to solid-state MSCs. We achieved a constant volumetric capacitance and a corresponding volumetric energy density for electrodes of various thicknesses. The hPC MSC reached a volumetric energy density of approximately 1.43 mW h/cm 3 . The power density of the hPC MSC was 1.69 W/cm 3 . We could control the capacitance and voltage additionally by connecting the unit MSC cells in series or parallel, and we confirmed the operation of a light-emitting diode. We believe that our pore-patterned electrodes will provide a new platform for compact but high-performance energy storage devices.
Active Flash: Out-of-core Data Analytics on Flash Storage
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boboila, Simona; Kim, Youngjae; Vazhkudai, Sudharshan S
2012-01-01
Next generation science will increasingly come to rely on the ability to perform efficient, on-the-fly analytics of data generated by high-performance computing (HPC) simulations, modeling complex physical phenomena. Scientific computing workflows are stymied by the traditional chaining of simulation and data analysis, creating multiple rounds of redundant reads and writes to the storage system, which grows in cost with the ever-increasing gap between compute and storage speeds in HPC clusters. Recent HPC acquisitions have introduced compute node-local flash storage as a means to alleviate this I/O bottleneck. We propose a novel approach, Active Flash, to expedite data analysis pipelines bymore » migrating to the location of the data, the flash device itself. We argue that Active Flash has the potential to enable true out-of-core data analytics by freeing up both the compute core and the associated main memory. By performing analysis locally, dependence on limited bandwidth to a central storage system is reduced, while allowing this analysis to proceed in parallel with the main application. In addition, offloading work from the host to the more power-efficient controller reduces peak system power usage, which is already in the megawatt range and poses a major barrier to HPC system scalability. We propose an architecture for Active Flash, explore energy and performance trade-offs in moving computation from host to storage, demonstrate the ability of appropriate embedded controllers to perform data analysis and reduction tasks at speeds sufficient for this application, and present a simulation study of Active Flash scheduling policies. These results show the viability of the Active Flash model, and its capability to potentially have a transformative impact on scientific data analysis.« less
HPC enabled real-time remote processing of laparoscopic surgery
NASA Astrophysics Data System (ADS)
Ronaghi, Zahra; Sapra, Karan; Izard, Ryan; Duffy, Edward; Smith, Melissa C.; Wang, Kuang-Ching; Kwartowitz, David M.
2016-03-01
Laparoscopic surgery is a minimally invasive surgical technique. The benefit of small incisions has a disadvantage of limited visualization of subsurface tissues. Image-guided surgery (IGS) uses pre-operative and intra-operative images to map subsurface structures. One particular laparoscopic system is the daVinci-si robotic surgical system. The video streams generate approximately 360 megabytes of data per second. Real-time processing this large stream of data on a bedside PC, single or dual node setup, has become challenging and a high-performance computing (HPC) environment may not always be available at the point of care. To process this data on remote HPC clusters at the typical 30 frames per second rate, it is required that each 11.9 MB video frame be processed by a server and returned within 1/30th of a second. We have implement and compared performance of compression, segmentation and registration algorithms on Clemson's Palmetto supercomputer using dual NVIDIA K40 GPUs per node. Our computing framework will also enable reliability using replication of computation. We will securely transfer the files to remote HPC clusters utilizing an OpenFlow-based network service, Steroid OpenFlow Service (SOS) that can increase performance of large data transfers over long-distance and high bandwidth networks. As a result, utilizing high-speed OpenFlow- based network to access computing clusters with GPUs will improve surgical procedures by providing real-time medical image processing and laparoscopic data.
Mannina, Paolo; Segale, Lorena; Giovannelli, Lorella; Bonda, Andrea Foglio; Pattarino, Franco
2016-02-29
In this work, alginate, alginate-pectin and alginate-hydroxypropylcellulose pellets were produced by ionotropic gelation and characterized. Ibuprofen was selected as model drug; it was suspended in the polymeric solution in crystalline form or dissolved in a self-emulsifying phase and then dispersed into the polymeric solution. The self-emulsifying excipient platform composed of Labrasol (PEG-8 caprylic/capric glycerides) and d-α-tocopherol polyethylene glycol 1000 succinate (TPGS), able to solubilize the drug was used to improve the technological and biopharmaceutical properties of the alginate pellets. The pellets had diameters between 1317 and 2026 μm and a high drug content (>51%). DSC analysis showed the amorphous state of drug in the pellets containing the self-emulsifying phase. All the systems restricted drug release in conditions simulating the gastric environment and made the drug completely available at a pH value typical for the intestine. Only alginate-HPC systems containing the drug solubilized into the self-emulsifying phase showed the ability to partially control the release of ibuprofen at neutral pH. The self-emulsifying excipient platform is a useful tool to improve technological and biopharmaceutical properties of alginate-HPC pellets. Copyright © 2015 Elsevier B.V. All rights reserved.
Additive Manufacturing and High-Performance Computing: a Disruptive Latent Technology
NASA Astrophysics Data System (ADS)
Goodwin, Bruce
2015-03-01
This presentation will discuss the relationship between recent advances in Additive Manufacturing (AM) technology, High-Performance Computing (HPC) simulation and design capabilities, and related advances in Uncertainty Quantification (UQ), and then examines their impacts upon national and international security. The presentation surveys how AM accelerates the fabrication process, while HPC combined with UQ provides a fast track for the engineering design cycle. The combination of AM and HPC/UQ almost eliminates the engineering design and prototype iterative cycle, thereby dramatically reducing cost of production and time-to-market. These methods thereby present significant benefits for US national interests, both civilian and military, in an age of austerity. Finally, considering cyber security issues and the advent of the ``cloud,'' these disruptive, currently latent technologies may well enable proliferation and so challenge both nuclear and non-nuclear aspects of international security.
Spark and HPC for High Energy Physics Data Analyses
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sehrish, Saba; Kowalkowski, Jim; Paterno, Marc
A full High Energy Physics (HEP) data analysis is divided into multiple data reduction phases. Processing within these phases is extremely time consuming, therefore intermediate results are stored in files held in mass storage systems and referenced as part of large datasets. This processing model limits what can be done with interactive data analytics. Growth in size and complexity of experimental datasets, along with emerging big data tools are beginning to cause changes to the traditional ways of doing data analyses. Use of big data tools for HEP analysis looks promising, mainly because extremely large HEP datasets can be representedmore » and held in memory across a system, and accessed interactively by encoding an analysis using highlevel programming abstractions. The mainstream tools, however, are not designed for scientific computing or for exploiting the available HPC platform features. We use an example from the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC) in Geneva, Switzerland. The LHC is the highest energy particle collider in the world. Our use case focuses on searching for new types of elementary particles explaining Dark Matter in the universe. We use HDF5 as our input data format, and Spark to implement the use case. We show the benefits and limitations of using Spark with HDF5 on Edison at NERSC.« less
Scaling deep learning on GPU and knights landing clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
You, Yang; Buluc, Aydin; Demmel, James
Training neural networks has become a big bottleneck. For example, training ImageNet dataset on one Nvidia K20 GPU needs 21 days. To speed up the training process, the current deep learning systems heavily rely on the hardware accelerators. However, these accelerators have limited on-chip memory compared with CPUs. We use both self-host Intel Knights Landing (KNL) clusters and multi-GPU clusters as our target platforms. From the algorithm aspect, we focus on Elastic Averaging SGD (EASGD) to design algorithms for HPC clusters. We redesign four efficient algorithms for HPC systems to improve EASGD's poor scaling on clusters. Async EASGD, Async MEASGD,more » and Hogwild EASGD are faster than existing counter-part methods (Async SGD, Async MSGD, and Hogwild SGD) in all comparisons. Sync EASGD achieves 5.3X speedup over original EASGD on the same platform. We achieve 91.5% weak scaling efficiency on 4253 KNL cores, which is higher than the state-of-the-art implementation.« less
System Resource Allocation Requests | High-Performance Computing | NREL
Account to utilize the online allocation request system. If you need a HPC User Account, please request one online: Visit User Accounts. Click the green "Request Account" Button - this will direct . Follow the online instructions provided in the DocuSign form. Write "Need HPC User Account to use
Khoo, James B.; Sittampalam, Kesavan; Chee, Soo K.
2008-01-01
Abstract We report an extremely rare case of malignant hemangiopericytoma (HPC) of the parotid gland and its metastatic spread to lung, liver, and skeletal muscle. Computed tomography (CT) imaging, histopathological and immunohistochemical methods were employed to study the features of malignant HPC and its metastases. CT imaging was helpful to determine the exact location, involvement of adjacent structures and vascularity, as well as evaluating pulmonary, hepatic, peritoneal, and muscular metastases. Immunohistochemical and histopatholgical features of the primary tumor as well as the metastases were consistent with the diagnosis of malignant HPC. PMID:18940737
Final Report on the Proposal to Provide Asian Science and Technology Information
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kahaner, David K.
2003-07-23
The Asian Technology Information Program (ATIP) conducted a seven-month Asian science and technology information program for the Office:of Energy Research (ER), U.S: Department of Energy (DOE.) The seven-month program consists of 1) monitoring, analyzing, and dissemiuating science and technology trends and developments associated with Asian high performance computing and communications (HPC), networking, and associated topics, 2) access to ATIP's annual series of Asian S&T reports for ER and HPC related personnel and, 3) supporting DOE and ER designated visits to Asia to study and assess Asian HPC.
Quantifying Scheduling Challenges for Exascale System Software
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mondragon, Oscar; Bridges, Patrick G.; Jones, Terry R
2015-01-01
The move towards high-performance computing (HPC) ap- plications comprised of coupled codes and the need to dra- matically reduce data movement is leading to a reexami- nation of time-sharing vs. space-sharing in HPC systems. In this paper, we discuss and begin to quantify the perfor- mance impact of a move away from strict space-sharing of nodes for HPC applications. Specifically, we examine the po- tential performance cost of time-sharing nodes between ap- plication components, we determine whether a simple coor- dinated scheduling mechanism can address these problems, and we research how suitable simple constraint-based opti- mization techniques are for solvingmore » scheduling challenges in this regime. Our results demonstrate that current general- purpose HPC system software scheduling and resource al- location systems are subject to significant performance de- ciencies which we quantify for six representative applica- tions. Based on these results, we discuss areas in which ad- ditional research is needed to meet the scheduling challenges of next-generation HPC systems.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moreland, Kenneth; Sewell, Christopher; Usher, William
Here, one of the most critical challenges for high-performance computing (HPC) scientific visualization is execution on massively threaded processors. Of the many fundamental changes we are seeing in HPC systems, one of the most profound is a reliance on new processor types optimized for execution bandwidth over latency hiding. Our current production scientific visualization software is not designed for these new types of architectures. To address this issue, the VTK-m framework serves as a container for algorithms, provides flexible data representation, and simplifies the design of visualization algorithms on new and future computer architecture.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moreland, Kenneth; Sewell, Christopher; Usher, William
Execution on massively threaded processors is one of the most critical challenges for high-performance computing (HPC) scientific visualization. Of the many fundamental changes we are seeing in HPC systems, one of the most profound is a reliance on new processor types optimized for execution bandwidth over latency hiding. Moreover, our current production scientific visualization software is not designed for these new types of architectures. In order to address this issue, the VTK-m framework serves as a container for algorithms, provides flexible data representation, and simplifies the design of visualization algorithms on new and future computer architecture.
Havery Mudd 2014-2015 Computer Science Conduit Clinic Final Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aspesi, G; Bai, J; Deese, R
2015-05-12
Conduit, a new open-source library developed at Lawrence Livermore National Laboratories, provides a C++ application programming interface (API) to describe and access scientific data. Conduit’s primary use is for inmemory data exchange in high performance computing (HPC) applications. Our team tested and improved Conduit to make it more appealing to potential adopters in the HPC community. We extended Conduit’s capabilities by prototyping four libraries: one for parallel communication using MPI, one for I/O functionality, one for aggregating performance data, and one for data visualization.
Enabling Diverse Software Stacks on Supercomputers using High Performance Virtual Clusters.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Younge, Andrew J.; Pedretti, Kevin; Grant, Ryan
While large-scale simulations have been the hallmark of the High Performance Computing (HPC) community for decades, Large Scale Data Analytics (LSDA) workloads are gaining attention within the scientific community not only as a processing component to large HPC simulations, but also as standalone scientific tools for knowledge discovery. With the path towards Exascale, new HPC runtime systems are also emerging in a way that differs from classical distributed com- puting models. However, system software for such capabilities on the latest extreme-scale DOE supercomputing needs to be enhanced to more appropriately support these types of emerging soft- ware ecosystems. In thismore » paper, we propose the use of Virtual Clusters on advanced supercomputing resources to enable systems to support not only HPC workloads, but also emerging big data stacks. Specifi- cally, we have deployed the KVM hypervisor within Cray's Compute Node Linux on a XC-series supercomputer testbed. We also use libvirt and QEMU to manage and provision VMs directly on compute nodes, leveraging Ethernet-over-Aries network emulation. To our knowledge, this is the first known use of KVM on a true MPP supercomputer. We investigate the overhead our solution using HPC benchmarks, both evaluating single-node performance as well as weak scaling of a 32-node virtual cluster. Overall, we find single node performance of our solution using KVM on a Cray is very efficient with near-native performance. However overhead increases by up to 20% as virtual cluster size increases, due to limitations of the Ethernet-over-Aries bridged network. Furthermore, we deploy Apache Spark with large data analysis workloads in a Virtual Cluster, ef- fectively demonstrating how diverse software ecosystems can be supported by High Performance Virtual Clusters.« less
Improving User Notification on Frequently Changing HPC Environments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fuson, Christopher B; Renaud, William A
2016-01-01
Today s HPC centers user environments can be very complex. Centers often contain multiple large complicated computational systems each with their own user environment. Changes to a system s environment can be very impactful; however, a center s user environment is, in one-way or another, frequently changing. Because of this, it is vital for centers to notify users of change. For users, untracked changes can be costly, resulting in unnecessary debug time as well as wasting valuable compute allocations and research time. Communicating frequent change to diverse user communities is a common and ongoing task for HPC centers. This papermore » will cover the OLCF s current processes and methods used to communicate change to users of the center s large Cray systems and supporting resources. The paper will share lessons learned and goals as well as practices, tools, and methods used to continually improve and reach members of the OLCF user community.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Agelastos, Anthony; Allan, Benjamin; Brandt, Jim
A detailed understanding of HPC applications’ resource needs and their complex interactions with each other and HPC platform resources are critical to achieving scalability and performance. Such understanding has been difficult to achieve because typical application profiling tools do not capture the behaviors of codes under the potentially wide spectrum of actual production conditions and because typical monitoring tools do not capture system resource usage information with high enough fidelity to gain sufficient insight into application performance and demands. In this paper we present both system and application profiling results based on data obtained through synchronized system wide monitoring onmore » a production HPC cluster at Sandia National Laboratories (SNL). We demonstrate analytic and visualization techniques that we are using to characterize application and system resource usage under production conditions for better understanding of application resource needs. Furthermore, our goals are to improve application performance (through understanding application-to-resource mapping and system throughput) and to ensure that future system capabilities match their intended workloads.« less
Divide and Conquer (DC) BLAST: fast and easy BLAST execution within HPC environments
Yim, Won Cheol; Cushman, John C.
2017-07-22
Bioinformatics is currently faced with very large-scale data sets that lead to computational jobs, especially sequence similarity searches, that can take absurdly long times to run. For example, the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST and BLAST+) suite, which is by far the most widely used tool for rapid similarity searching among nucleic acid or amino acid sequences, is highly central processing unit (CPU) intensive. While the BLAST suite of programs perform searches very rapidly, they have the potential to be accelerated. In recent years, distributed computing environments have become more widely accessible andmore » used due to the increasing availability of high-performance computing (HPC) systems. Therefore, simple solutions for data parallelization are needed to expedite BLAST and other sequence analysis tools. However, existing software for parallel sequence similarity searches often requires extensive computational experience and skill on the part of the user. In order to accelerate BLAST and other sequence analysis tools, Divide and Conquer BLAST (DCBLAST) was developed to perform NCBI BLAST searches within a cluster, grid, or HPC environment by using a query sequence distribution approach. Scaling from one (1) to 256 CPU cores resulted in significant improvements in processing speed. Thus, DCBLAST dramatically accelerates the execution of BLAST searches using a simple, accessible, robust, and parallel approach. DCBLAST works across multiple nodes automatically and it overcomes the speed limitation of single-node BLAST programs. DCBLAST can be used on any HPC system, can take advantage of hundreds of nodes, and has no output limitations. Thus, this freely available tool simplifies distributed computation pipelines to facilitate the rapid discovery of sequence similarities between very large data sets.« less
Divide and Conquer (DC) BLAST: fast and easy BLAST execution within HPC environments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yim, Won Cheol; Cushman, John C.
Bioinformatics is currently faced with very large-scale data sets that lead to computational jobs, especially sequence similarity searches, that can take absurdly long times to run. For example, the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST and BLAST+) suite, which is by far the most widely used tool for rapid similarity searching among nucleic acid or amino acid sequences, is highly central processing unit (CPU) intensive. While the BLAST suite of programs perform searches very rapidly, they have the potential to be accelerated. In recent years, distributed computing environments have become more widely accessible andmore » used due to the increasing availability of high-performance computing (HPC) systems. Therefore, simple solutions for data parallelization are needed to expedite BLAST and other sequence analysis tools. However, existing software for parallel sequence similarity searches often requires extensive computational experience and skill on the part of the user. In order to accelerate BLAST and other sequence analysis tools, Divide and Conquer BLAST (DCBLAST) was developed to perform NCBI BLAST searches within a cluster, grid, or HPC environment by using a query sequence distribution approach. Scaling from one (1) to 256 CPU cores resulted in significant improvements in processing speed. Thus, DCBLAST dramatically accelerates the execution of BLAST searches using a simple, accessible, robust, and parallel approach. DCBLAST works across multiple nodes automatically and it overcomes the speed limitation of single-node BLAST programs. DCBLAST can be used on any HPC system, can take advantage of hundreds of nodes, and has no output limitations. Thus, this freely available tool simplifies distributed computation pipelines to facilitate the rapid discovery of sequence similarities between very large data sets.« less
Macchi, Elena; Zema, Lucia; Pandey, Preetanshu; Gazzaniga, Andrea; Felton, Linda A
2016-03-01
In a previous study, hydroxypropyl cellulose (HPC)-based capsular shells prepared by injection molding and intended for pulsatile release were successfully coated with 10mg/cm(2) Eudragit® L film. The suitability of HPC capsules for the development of a colon delivery platform based on a time dependent approach was demonstrated. In the present work, data logging devices (PyroButton®) were used to monitor the microenvironmental conditions, i.e. temperature (T) and relative humidity (RH), during coating processes performed under different spray rates (1.2, 2.5 and 5.5g/min). As HPC-based capsules present special features, a preliminary study was conducted on commercially available gelatin capsules for comparison purposes. By means of PyroButton data-loggers it was possible to acquire information about the impact of the effective T and RH conditions experienced by HPC substrates during the process on the technological properties and release performance of the coated systems. The use of increasing spray rates seemed to promote a tendency of the HPC shells to slightly swell at the beginning of the spraying process; moreover, capsules coated under spray rates of 1.2 and 2.5g/min showed the desired release performance, i.e. ability to withstand the acidic media followed by the pulsatile release expected for uncoated capsules. Preliminary stability studies seemed to show that coating conditions might also influence the release performance of the system upon storage. Copyright © 2015 Elsevier B.V. All rights reserved.
Goel, Deepa; Babu, Sasidhara; Prayaga, Aruna K; Sundaram, Challa
2008-01-01
Meningeal hemangiopericytoma (HPC) is a rare neoplasm. It is closely related to hemangiopericytomas in systemic tissues, with a tendency to recur and metastasize outside the CNS. Only a few case reports describe the cytomorphologic appearance of these metastasizing lesions, most having primary tumor in deep soft tissues. We report a case of recurrent meningeal HPC metastasizing to lungs. A 48-year-old woman presented with a history of headache. She underwent primary surgery 10 years previously for left parietal tumor. Histopathologic diagnosis was HPC. Radiotherapy was given postoperatively. Brain magnetic resonance imaging (MRI) at admission suggested local recurrence. She also complained of dry cough and shortness of breath. On evaluation, computed tomography (CT) scan lung showed multiple, bilateral, small nodules. Fine needle aspiration cytology (FNAC) of a larger nodule revealed spindle-shaped cells arranged around blood vessels. Immunohistochemistry with CD34 on cell block confirmed metastatic HPC. FNAC is an easy, accurate, relatively noninvasive procedure for diagnosing metastases, especially in patients with a history of recurrent intracranial HPC. Immunohistochemistry on cell block material collected at the time of FNAC may aid in distinguishing HPC from other tumors that are close mimics cytologically.
GraphMeta: Managing HPC Rich Metadata in Graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dai, Dong; Chen, Yong; Carns, Philip
High-performance computing (HPC) systems face increasingly critical metadata management challenges, especially in the approaching exascale era. These challenges arise not only from exploding metadata volumes, but also from increasingly diverse metadata, which contains data provenance and arbitrary user-defined attributes in addition to traditional POSIX metadata. This ‘rich’ metadata is becoming critical to supporting advanced data management functionality such as data auditing and validation. In our prior work, we identified a graph-based model as a promising solution to uniformly manage HPC rich metadata due to its flexibility and generality. However, at the same time, graph-based HPC rich metadata anagement also introducesmore » significant challenges to the underlying infrastructure. In this study, we first identify the challenges on the underlying infrastructure to support scalable, high-performance rich metadata management. Based on that, we introduce GraphMeta, a graphbased engine designed for this use case. It achieves performance scalability by introducing a new graph partitioning algorithm and a write-optimal storage engine. We evaluate GraphMeta under both synthetic and real HPC metadata workloads, compare it with other approaches, and demonstrate its advantages in terms of efficiency and usability for rich metadata management in HPC systems.« less
Linux VPN Set Up | High-Performance Computing | NREL
methods to connect to NREL's HPC systems via the HPC VPN: one using a simple command line, and a second UserID in place of the one in the example image. Connection name: hpcvpn Gateway: hpcvpn.nrel.gov User hpcvpn option as seen in the following screen shot. Screenshot image NetworkManager will present you with
WinHPC System Policies | High-Performance Computing | NREL
requiring high CPU utilization or large amounts of memory should be run on the worker nodes. WinHPC02 is not associated data are removed when NREL worker status is discontinued. Users should make arrangements to save other users. Licenses are returned to the license pool when other users close the application or after
Trends in data locality abstractions for HPC systems
Unat, Didem; Dubey, Anshu; Hoefler, Torsten; ...
2017-05-10
The cost of data movement has always been an important concern in high performance computing (HPC) systems. It has now become the dominant factor in terms of both energy consumption and performance. Support for expression of data locality has been explored in the past, but those efforts have had only modest success in being adopted in HPC applications for various reasons. However, with the increasing complexity of the memory hierarchy and higher parallelism in emerging HPC systems, locality management has acquired a new urgency. Developers can no longer limit themselves to low-level solutions and ignore the potential for productivity andmore » performance portability obtained by using locality abstractions. Fortunately, the trend emerging in recent literature on the topic alleviates many of the concerns that got in the way of their adoption by application developers. Data locality abstractions are available in the forms of libraries, data structures, languages and runtime systems; a common theme is increasing productivity without sacrificing performance. Furthermore, this paper examines these trends and identifies commonalities that can combine various locality concepts to develop a comprehensive approach to expressing and managing data locality on future large-scale high-performance computing systems.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hussain, Hameed; Malik, Saif Ur Rehman; Hameed, Abdul
An efficient resource allocation is a fundamental requirement in high performance computing (HPC) systems. Many projects are dedicated to large-scale distributed computing systems that have designed and developed resource allocation mechanisms with a variety of architectures and services. In our study, through analysis, a comprehensive survey for describing resource allocation in various HPCs is reported. The aim of the work is to aggregate under a joint framework, the existing solutions for HPC to provide a thorough analysis and characteristics of the resource management and allocation strategies. Resource allocation mechanisms and strategies play a vital role towards the performance improvement ofmore » all the HPCs classifications. Therefore, a comprehensive discussion of widely used resource allocation strategies deployed in HPC environment is required, which is one of the motivations of this survey. Moreover, we have classified the HPC systems into three broad categories, namely: (a) cluster, (b) grid, and (c) cloud systems and define the characteristics of each class by extracting sets of common attributes. All of the aforementioned systems are cataloged into pure software and hybrid/hardware solutions. The system classification is used to identify approaches followed by the implementation of existing resource allocation strategies that are widely presented in the literature.« less
Planck 2015 results. XII. Full focal plane simulations
NASA Astrophysics Data System (ADS)
Planck Collaboration; Ade, P. A. R.; Aghanim, N.; Arnaud, M.; Ashdown, M.; Aumont, J.; Baccigalupi, C.; Banday, A. J.; Barreiro, R. B.; Bartlett, J. G.; Bartolo, N.; Battaner, E.; Benabed, K.; Benoît, A.; Benoit-Lévy, A.; Bernard, J.-P.; Bersanelli, M.; Bielewicz, P.; Bock, J. J.; Bonaldi, A.; Bonavera, L.; Bond, J. R.; Borrill, J.; Bouchet, F. R.; Boulanger, F.; Bucher, M.; Burigana, C.; Butler, R. C.; Calabrese, E.; Cardoso, J.-F.; Castex, G.; Catalano, A.; Challinor, A.; Chamballu, A.; Chiang, H. C.; Christensen, P. R.; Clements, D. L.; Colombi, S.; Colombo, L. P. L.; Combet, C.; Couchot, F.; Coulais, A.; Crill, B. P.; Curto, A.; Cuttaia, F.; Danese, L.; Davies, R. D.; Davis, R. J.; de Bernardis, P.; de Rosa, A.; de Zotti, G.; Delabrouille, J.; Delouis, J.-M.; Désert, F.-X.; Dickinson, C.; Diego, J. M.; Dolag, K.; Dole, H.; Donzelli, S.; Doré, O.; Douspis, M.; Ducout, A.; Dupac, X.; Efstathiou, G.; Elsner, F.; Enßlin, T. A.; Eriksen, H. K.; Fergusson, J.; Finelli, F.; Forni, O.; Frailis, M.; Fraisse, A. A.; Franceschi, E.; Frejsel, A.; Galeotta, S.; Galli, S.; Ganga, K.; Ghosh, T.; Giard, M.; Giraud-Héraud, Y.; Gjerløw, E.; González-Nuevo, J.; Górski, K. M.; Gratton, S.; Gregorio, A.; Gruppuso, A.; Gudmundsson, J. E.; Hansen, F. K.; Hanson, D.; Harrison, D. L.; Henrot-Versillé, S.; Hernández-Monteagudo, C.; Herranz, D.; Hildebrandt, S. R.; Hivon, E.; Hobson, M.; Holmes, W. A.; Hornstrup, A.; Hovest, W.; Huffenberger, K. M.; Hurier, G.; Jaffe, A. H.; Jaffe, T. R.; Jones, W. C.; Juvela, M.; Karakci, A.; Keihänen, E.; Keskitalo, R.; Kiiveri, K.; Kisner, T. S.; Kneissl, R.; Knoche, J.; Kunz, M.; Kurki-Suonio, H.; Lagache, G.; Lamarre, J.-M.; Lasenby, A.; Lattanzi, M.; Lawrence, C. R.; Leonardi, R.; Lesgourgues, J.; Levrier, F.; Liguori, M.; Lilje, P. B.; Linden-Vørnle, M.; Lindholm, V.; López-Caniego, M.; Lubin, P. M.; Macías-Pérez, J. F.; Maggio, G.; Maino, D.; Mandolesi, N.; Mangilli, A.; Maris, M.; Martin, P. G.; Martínez-González, E.; Masi, S.; Matarrese, S.; McGehee, P.; Meinhold, P. R.; Melchiorri, A.; Melin, J.-B.; Mendes, L.; Mennella, A.; Migliaccio, M.; Mitra, S.; Miville-Deschênes, M.-A.; Moneti, A.; Montier, L.; Morgante, G.; Mortlock, D.; Moss, A.; Munshi, D.; Murphy, J. A.; Naselsky, P.; Nati, F.; Natoli, P.; Netterfield, C. B.; Nørgaard-Nielsen, H. U.; Noviello, F.; Novikov, D.; Novikov, I.; Oxborrow, C. A.; Paci, F.; Pagano, L.; Pajot, F.; Paoletti, D.; Pasian, F.; Patanchon, G.; Pearson, T. J.; Perdereau, O.; Perotto, L.; Perrotta, F.; Pettorino, V.; Piacentini, F.; Piat, M.; Pierpaoli, E.; Pietrobon, D.; Plaszczynski, S.; Pointecouteau, E.; Polenta, G.; Pratt, G. W.; Prézeau, G.; Prunet, S.; Puget, J.-L.; Rachen, J. P.; Rebolo, R.; Reinecke, M.; Remazeilles, M.; Renault, C.; Renzi, A.; Ristorcelli, I.; Rocha, G.; Roman, M.; Rosset, C.; Rossetti, M.; Roudier, G.; Rubiño-Martín, J. A.; Rusholme, B.; Sandri, M.; Santos, D.; Savelainen, M.; Scott, D.; Seiffert, M. D.; Shellard, E. P. S.; Spencer, L. D.; Stolyarov, V.; Stompor, R.; Sudiwala, R.; Sutton, D.; Suur-Uski, A.-S.; Sygnet, J.-F.; Tauber, J. A.; Terenzi, L.; Toffolatti, L.; Tomasi, M.; Tristram, M.; Tucci, M.; Tuovinen, J.; Valenziano, L.; Valiviita, J.; Van Tent, B.; Vielva, P.; Villa, F.; Wade, L. A.; Wandelt, B. D.; Wehus, I. K.; Welikala, N.; Yvon, D.; Zacchei, A.; Zonca, A.
2016-09-01
We present the 8th full focal plane simulation set (FFP8), deployed in support of the Planck 2015 results. FFP8 consists of 10 fiducial mission realizations reduced to 18 144 maps, together with the most massive suite of Monte Carlo realizations of instrument noise and CMB ever generated, comprising 104 mission realizations reduced to about 106 maps. The resulting maps incorporate the dominant instrumental, scanning, and data analysis effects, and the remaining subdominant effects will be included in future updates. Generated at a cost of some 25 million CPU-hours spread across multiple high-performance-computing (HPC) platforms, FFP8 is used to validate and verify analysis algorithms and their implementations, and to remove biases from and quantify uncertainties in the results of analyses of the real data.
Webinar: Delivering Transformational HPC Solutions to Industry
Streitz, Frederick
2018-01-16
Dr. Frederick Streitz, director of the High Performance Computing Innovation Center, discusses Lawrence Livermore National Laboratory computational capabilities and expertise available to industry in this webinar.
Research | Computational Science | NREL
Research Research NREL's computational science experts use advanced high-performance computing (HPC technologies, thereby accelerating the transformation of our nation's energy system. Enabling High-Impact Research NREL's computational science capabilities enable high-impact research. Some recent examples
Data Transfer Study HPSS Archiving
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wynne, James; Parete-Koon, Suzanne T; Mitchell, Quinn
2015-01-01
The movement of the large amounts of data produced by codes run in a High Performance Computing (HPC) environment can be a bottleneck for project workflows. To balance filesystem capacity and performance requirements, HPC centers enforce data management policies to purge old files to make room for new computation and analysis results. Users at Oak Ridge Leadership Computing Facility (OLCF) and many other HPC user facilities must archive data to avoid data loss during purges, therefore the time associated with data movement for archiving is something that all users must consider. This study observed the difference in transfer speed frommore » the originating location on the Lustre filesystem to the more permanent High Performance Storage System (HPSS). The tests were done with a number of different transfer methods for files that spanned a variety of sizes and compositions that reflect OLCF user data. This data will be used to help users of Titan and other Cray supercomputers plan their workflow and data transfers so that they are most efficient for their project. We will also discuss best practice for maintaining data at shared user facilities.« less
2013 R&D 100 Award: âMiniappsâ Bolster High Performance Computing
Belak, Jim; Richards, David
2018-06-12
Two Livermore computer scientists served on a Sandia National Laboratories-led team that developed Mantevo Suite 1.0, the first integrated suite of small software programs, also called "miniapps," to be made available to the high performance computing (HPC) community. These miniapps facilitate the development of new HPC systems and the applications that run on them. Miniapps (miniature applications) serve as stripped down surrogates for complex, full-scale applications that can require a great deal of time and effort to port to a new HPC system because they often consist of hundreds of thousands of lines of code. The miniapps are a prototype that contains some or all of the essentials of the real application but with many fewer lines of code, making the miniapp more versatile for experimentation. This allows researchers to more rapidly explore options and optimize system design, greatly improving the chances the full-scale application will perform successfully. These miniapps have become essential tools for exploring complex design spaces because they can reliably predict the performance of full applications.
Performance Analysis Tool for HPC and Big Data Applications on Scientific Clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoo, Wucherl; Koo, Michelle; Cao, Yu
Big data is prevalent in HPC computing. Many HPC projects rely on complex workflows to analyze terabytes or petabytes of data. These workflows often require running over thousands of CPU cores and performing simultaneous data accesses, data movements, and computation. It is challenging to analyze the performance involving terabytes or petabytes of workflow data or measurement data of the executions, from complex workflows over a large number of nodes and multiple parallel task executions. To help identify performance bottlenecks or debug the performance issues in large-scale scientific applications and scientific clusters, we have developed a performance analysis framework, using state-ofthe-more » art open-source big data processing tools. Our tool can ingest system logs and application performance measurements to extract key performance features, and apply the most sophisticated statistical tools and data mining methods on the performance data. It utilizes an efficient data processing engine to allow users to interactively analyze a large amount of different types of logs and measurements. To illustrate the functionality of the big data analysis framework, we conduct case studies on the workflows from an astronomy project known as the Palomar Transient Factory (PTF) and the job logs from the genome analysis scientific cluster. Our study processed many terabytes of system logs and application performance measurements collected on the HPC systems at NERSC. The implementation of our tool is generic enough to be used for analyzing the performance of other HPC systems and Big Data workows.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Di, Sheng; Berrocal, Eduardo; Cappello, Franck
The silent data corruption (SDC) problem is attracting more and more attentions because it is expected to have a great impact on exascale HPC applications. SDC faults are hazardous in that they pass unnoticed by hardware and can lead to wrong computation results. In this work, we formulate SDC detection as a runtime one-step-ahead prediction method, leveraging multiple linear prediction methods in order to improve the detection results. The contributions are twofold: (1) we propose an error feedback control model that can reduce the prediction errors for different linear prediction methods, and (2) we propose a spatial-data-based even-sampling method tomore » minimize the detection overheads (including memory and computation cost). We implement our algorithms in the fault tolerance interface, a fault tolerance library with multiple checkpoint levels, such that users can conveniently protect their HPC applications against both SDC errors and fail-stop errors. We evaluate our approach by using large-scale traces from well-known, large-scale HPC applications, as well as by running those HPC applications on a real cluster environment. Experiments show that our error feedback control model can improve detection sensitivity by 34-189% for bit-flip memory errors injected with the bit positions in the range [20,30], without any degradation on detection accuracy. Furthermore, memory size can be reduced by 33% with our spatial-data even-sampling method, with only a slight and graceful degradation in the detection sensitivity.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lingerfelt, Eric J; Messer, II, Otis E
2017-01-02
The Bellerophon software system supports CHIMERA, a production-level HPC application that simulates the evolution of core-collapse supernovae. Bellerophon enables CHIMERA's geographically dispersed team of collaborators to perform job monitoring and real-time data analysis from multiple supercomputing resources, including platforms at OLCF, NERSC, and NICS. Its multi-tier architecture provides an encapsulated, end-to-end software solution that enables the CHIMERA team to quickly and easily access highly customizable animated and static views of results from anywhere in the world via a cross-platform desktop application.
2007-06-08
Lightwave VDE /200 KVM-over-Fiber (Keyboard, Video and Mouse) devices installed throughout the TARDEC campus. Implementation of this system required...development effort through the pursuit of an Army-funded Phase-II Small Business Innovative Research (SBIR) effort with IP Video Systems (formerly known as...visualization capabilities of a DoD High- Performance Computing facility, many advanced features are necessary. TARDEC-HPC’s SBIR with IP Video Systems
VTK-m: Accelerating the Visualization Toolkit for Massively Threaded Architectures
Moreland, Kenneth; Sewell, Christopher; Usher, William; ...
2016-05-09
Here, one of the most critical challenges for high-performance computing (HPC) scientific visualization is execution on massively threaded processors. Of the many fundamental changes we are seeing in HPC systems, one of the most profound is a reliance on new processor types optimized for execution bandwidth over latency hiding. Our current production scientific visualization software is not designed for these new types of architectures. To address this issue, the VTK-m framework serves as a container for algorithms, provides flexible data representation, and simplifies the design of visualization algorithms on new and future computer architecture.
VTK-m: Accelerating the Visualization Toolkit for Massively Threaded Architectures
Moreland, Kenneth; Sewell, Christopher; Usher, William; ...
2016-05-09
Execution on massively threaded processors is one of the most critical challenges for high-performance computing (HPC) scientific visualization. Of the many fundamental changes we are seeing in HPC systems, one of the most profound is a reliance on new processor types optimized for execution bandwidth over latency hiding. Moreover, our current production scientific visualization software is not designed for these new types of architectures. In order to address this issue, the VTK-m framework serves as a container for algorithms, provides flexible data representation, and simplifies the design of visualization algorithms on new and future computer architecture.
76 FR 45786 - Advanced Scientific Computing Advisory Committee; Meeting
Federal Register 2010, 2011, 2012, 2013, 2014
2011-08-01
... updates. EU Data Initiative. HPC & EERE Wind Program. Early Career Research on Energy Efficient Interconnect for Exascale Computing. Separating Algorithm and Implentation. Update on ASCR exascale planning...
Software engineering and automatic continuous verification of scientific software
NASA Astrophysics Data System (ADS)
Piggott, M. D.; Hill, J.; Farrell, P. E.; Kramer, S. C.; Wilson, C. R.; Ham, D.; Gorman, G. J.; Bond, T.
2011-12-01
Software engineering of scientific code is challenging for a number of reasons including pressure to publish and a lack of awareness of the pitfalls of software engineering by scientists. The Applied Modelling and Computation Group at Imperial College is a diverse group of researchers that employ best practice software engineering methods whilst developing open source scientific software. Our main code is Fluidity - a multi-purpose computational fluid dynamics (CFD) code that can be used for a wide range of scientific applications from earth-scale mantle convection, through basin-scale ocean dynamics, to laboratory-scale classic CFD problems, and is coupled to a number of other codes including nuclear radiation and solid modelling. Our software development infrastructure consists of a number of free tools that could be employed by any group that develops scientific code and has been developed over a number of years with many lessons learnt. A single code base is developed by over 30 people for which we use bazaar for revision control, making good use of the strong branching and merging capabilities. Using features of Canonical's Launchpad platform, such as code review, blueprints for designing features and bug reporting gives the group, partners and other Fluidity uers an easy-to-use platform to collaborate and allows the induction of new members of the group into an environment where software development forms a central part of their work. The code repositoriy are coupled to an automated test and verification system which performs over 20,000 tests, including unit tests, short regression tests, code verification and large parallel tests. Included in these tests are build tests on HPC systems, including local and UK National HPC services. The testing of code in this manner leads to a continuous verification process; not a discrete event performed once development has ceased. Much of the code verification is done via the "gold standard" of comparisons to analytical solutions via the method of manufactured solutions. By developing and verifying code in tandem we avoid a number of pitfalls in scientific software development and advocate similar procedures for other scientific code applications.
CartograTree: connecting tree genomes, phenotypes and environment.
Vasquez-Gross, Hans A; Yu, John J; Figueroa, Ben; Gessler, Damian D G; Neale, David B; Wegrzyn, Jill L
2013-05-01
Today, researchers spend a tremendous amount of time gathering, formatting, filtering and visualizing data collected from disparate sources. Under the umbrella of forest tree biology, we seek to provide a platform and leverage modern technologies to connect biotic and abiotic data. Our goal is to provide an integrated web-based workspace that connects environmental, genomic and phenotypic data via geo-referenced coordinates. Here, we connect the genomic query web-based workspace, DiversiTree and a novel geographical interface called CartograTree to data housed on the TreeGenes database. To accomplish this goal, we implemented Simple Semantic Web Architecture and Protocol to enable the primary genomics database, TreeGenes, to communicate with semantic web services regardless of platform or back-end technologies. The novelty of CartograTree lies in the interactive workspace that allows for geographical visualization and engagement of high performance computing (HPC) resources. The application provides a unique tool set to facilitate research on the ecology, physiology and evolution of forest tree species. CartograTree can be accessed at: http://dendrome.ucdavis.edu/cartogratree. © 2013 Blackwell Publishing Ltd.
Lattice Boltzmann Simulation Optimization on Leading Multicore Platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel; Carter, Jonathan; Oliker, Leonid
2008-02-01
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of search-based performance optimizations, popular in linear algebra and FFT libraries, to application-specific computational kernels. Our work applies this strategy to a lattice Boltzmann application (LBMHD) that historically has made poor use of scalar microprocessors due to its complex data structures and memory access patterns. We explore one of the broadest sets of multicore architectures in the HPC literature, including the Intel Clovertown, AMD Opteron X2, Sun Niagara2, STI Cell, as well as the single core Intel Itanium2. Rather than hand-tuning LBMHDmore » for each system, we develop a code generator that allows us identify a highly optimized version for each platform, while amortizing the human programming effort. Results show that our auto-tuned LBMHD application achieves up to a 14x improvement compared with the original code. Additionally, we present detailed analysis of each optimization, which reveal surprising hardware bottlenecks and software challenges for future multicore systems and applications.« less
Lattice Boltzmann simulation optimization on leading multicore platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, S.; Carter, J.; Oliker, L.
2008-01-01
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of searchbased performance optimizations, popular in linear algebra and FFT libraries, to application-specific computational kernels. Our work applies this strategy to a lattice Boltzmann application (LBMHD) that historically has made poor use of scalar microprocessors due to its complex data structures and memory access patterns. We explore one of the broadest sets of multicore architectures in the HPC literature, including the Intel Clovertown, AMD Opteron X2, Sun Niagara2, STI Cell, as well as the single core Intel Itanium2. Rather than hand-tuning LBMHDmore » for each system, we develop a code generator that allows us identify a highly optimized version for each platform, while amortizing the human programming effort. Results show that our autotuned LBMHD application achieves up to a 14 improvement compared with the original code. Additionally, we present detailed analysis of each optimization, which reveal surprising hardware bottlenecks and software challenges for future multicore systems and applications.« less
PERI - Auto-tuning Memory Intensive Kernels for Multicore
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bailey, David H; Williams, Samuel; Datta, Kaushik
2008-06-24
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of search-based performance optimizations, popular in linear algebra and FFT libraries, to application-specific computational kernels. Our work applies this strategy to Sparse Matrix Vector Multiplication (SpMV), the explicit heat equation PDE on a regular grid (Stencil), and a lattice Boltzmann application (LBMHD). We explore one of the broadest sets of multicore architectures in the HPC literature, including the Intel Xeon Clovertown, AMD Opteron Barcelona, Sun Victoria Falls, and the Sony-Toshiba-IBM (STI) Cell. Rather than hand-tuning each kernel for each system, we developmore » a code generator for each kernel that allows us to identify a highly optimized version for each platform, while amortizing the human programming effort. Results show that our auto-tuned kernel applications often achieve a better than 4X improvement compared with the original code. Additionally, we analyze a Roofline performance model for each platform to reveal hardware bottlenecks and software challenges for future multicore systems and applications.« less
Towards Large-scale Twitter Mining for Drug-related Adverse Events.
Bian, Jiang; Topaloglu, Umit; Yu, Fan
2012-10-29
Drug-related adverse events pose substantial risks to patients who consume post-market or Drug-related adverse events pose substantial risks to patients who consume post-market or investigational drugs. Early detection of adverse events benefits not only the drug regulators, but also the manufacturers for pharmacovigilance. Existing methods rely on patients' "spontaneous" self-reports that attest problems. The increasing popularity of social media platforms like the Twitter presents us a new information source for finding potential adverse events. Given the high frequency of user updates, mining Twitter messages can lead us to real-time pharmacovigilance. In this paper, we describe an approach to find drug users and potential adverse events by analyzing the content of twitter messages utilizing Natural Language Processing (NLP) and to build Support Vector Machine (SVM) classifiers. Due to the size nature of the dataset (i.e., 2 billion Tweets), the experiments were conducted on a High Performance Computing (HPC) platform using MapReduce, which exhibits the trend of big data analytics. The results suggest that daily-life social networking data could help early detection of important patient safety issues.
Computational Science News | Computational Science | NREL
-Cooled High-Performance Computing Technology at the ESIF February 28, 2018 NREL Launches New Website for High-Performance Computing System Users The National Renewable Energy Laboratory (NREL) Computational Science Center has launched a revamped website for users of the lab's high-performance computing (HPC
High-Performance Computing and Visualization | Energy Systems Integration
Facility | NREL High-Performance Computing and Visualization High-Performance Computing and Visualization High-performance computing (HPC) and visualization at NREL propel technology innovation as a . Capabilities High-Performance Computing NREL is home to Peregrine-the largest high-performance computing system
BelleII@home: Integrate volunteer computing resources into DIRAC in a secure way
NASA Astrophysics Data System (ADS)
Wu, Wenjing; Hara, Takanori; Miyake, Hideki; Ueda, Ikuo; Kan, Wenxiao; Urquijo, Phillip
2017-10-01
The exploitation of volunteer computing resources has become a popular practice in the HEP computing community as the huge amount of potential computing power it provides. In the recent HEP experiments, the grid middleware has been used to organize the services and the resources, however it relies heavily on the X.509 authentication, which is contradictory to the untrusted feature of volunteer computing resources, therefore one big challenge to utilize the volunteer computing resources is how to integrate them into the grid middleware in a secure way. The DIRAC interware which is commonly used as the major component of the grid computing infrastructure for several HEP experiments proposes an even bigger challenge to this paradox as its pilot is more closely coupled with operations requiring the X.509 authentication compared to the implementations of pilot in its peer grid interware. The Belle II experiment is a B-factory experiment at KEK, and it uses DIRAC for its distributed computing. In the project of BelleII@home, in order to integrate the volunteer computing resources into the Belle II distributed computing platform in a secure way, we adopted a new approach which detaches the payload running from the Belle II DIRAC pilot which is a customized pilot pulling and processing jobs from the Belle II distributed computing platform, so that the payload can run on volunteer computers without requiring any X.509 authentication. In this approach we developed a gateway service running on a trusted server which handles all the operations requiring the X.509 authentication. So far, we have developed and deployed the prototype of BelleII@home, and tested its full workflow which proves the feasibility of this approach. This approach can also be applied on HPC systems whose work nodes do not have outbound connectivity to interact with the DIRAC system in general.
NASA Astrophysics Data System (ADS)
Wyborn, L. A.; Evans, B. J. K.
2015-12-01
The National Computational Infrastructure (NCI) at the Australian National University (ANU) has evolved to become Australia's peak computing centre for national computational and Data-intensive Earth system science. More recently NCI collocated 10 Petabytes of 34 major national and international environmental, climate, earth system, geophysics and astronomy data collections to create the National Environmental Research Interoperability Data Platform (NERDIP). Spatial scales of the collections range from global to local ultra-high resolution, whilst sizes range from 3PB down to a few GB. The data is highly connected to both NCI HPC and cloud resources via low latency internal networks with massive bandwidth. Now that the collections are collocated on a single data platform, the 'Hype' and expectations around potential use cases for the NERDIP are high. Not unexpected issues are emerging such as access, licensing issues, ownership, and incompatible data standards. Many communities are standardised within their domain, but achieving true interdisciplinary science will require all communities to move towards open interoperable data formats such as NetCDF4/HDF5. This transition will impact on software using proprietary or non-open standards. But before we reach the 'Plateau of Productivity', there needs to be greater 'Enlightenment' of users to encourage them to realise that this unprecedented Earth system science platform provides a rich mine of opportunities for discovery and innovation for a diverse range of both domain-specific and interdisciplinary investigations including climate and weather research, impact analysis, environment, remote sensing and geophysics and develop new and innovative interdisciplinary use cases that will guide those architecting the system and help minimise the amplitude of the 'Trough of Disillusionment' and ensure greater productivity and uptake of the collections that make NERDIP unique in the next generation of Data-intensive Science.
NASA Technical Reports Server (NTRS)
Chaudhary, Aashish; Votava, Petr; Nemani, Ramakrishna R.; Michaelis, Andrew; Kotfila, Chris
2016-01-01
We are developing capabilities for an integrated petabyte-scale Earth science collaborative analysis and visualization environment. The ultimate goal is to deploy this environment within the NASA Earth Exchange (NEX) and OpenNEX in order to enhance existing science data production pipelines in both high-performance computing (HPC) and cloud environments. Bridging of HPC and cloud is a fairly new concept under active research and this system significantly enhances the ability of the scientific community to accelerate analysis and visualization of Earth science data from NASA missions, model outputs and other sources. We have developed a web-based system that seamlessly interfaces with both high-performance computing (HPC) and cloud environments, providing tools that enable science teams to develop and deploy large-scale analysis, visualization and QA pipelines of both the production process and the data products, and enable sharing results with the community. Our project is developed in several stages each addressing separate challenge - workflow integration, parallel execution in either cloud or HPC environments and big-data analytics or visualization. This work benefits a number of existing and upcoming projects supported by NEX, such as the Web Enabled Landsat Data (WELD), where we are developing a new QA pipeline for the 25PB system.
Analytics and Visualization Pipelines for Big Data on the NASA Earth Exchange (NEX) and OpenNEX
NASA Astrophysics Data System (ADS)
Chaudhary, A.; Votava, P.; Nemani, R. R.; Michaelis, A.; Kotfila, C.
2016-12-01
We are developing capabilities for an integrated petabyte-scale Earth science collaborative analysis and visualization environment. The ultimate goal is to deploy this environment within the NASA Earth Exchange (NEX) and OpenNEX in order to enhance existing science data production pipelines in both high-performance computing (HPC) and cloud environments. Bridging of HPC and cloud is a fairly new concept under active research and this system significantly enhances the ability of the scientific community to accelerate analysis and visualization of Earth science data from NASA missions, model outputs and other sources. We have developed a web-based system that seamlessly interfaces with both high-performance computing (HPC) and cloud environments, providing tools that enable science teams to develop and deploy large-scale analysis, visualization and QA pipelines of both the production process and the data products, and enable sharing results with the community. Our project is developed in several stages each addressing separate challenge - workflow integration, parallel execution in either cloud or HPC environments and big-data analytics or visualization. This work benefits a number of existing and upcoming projects supported by NEX, such as the Web Enabled Landsat Data (WELD), where we are developing a new QA pipeline for the 25PB system.
Desktop supercomputer: what can it do?
NASA Astrophysics Data System (ADS)
Bogdanov, A.; Degtyarev, A.; Korkhov, V.
2017-12-01
The paper addresses the issues of solving complex problems that require using supercomputers or multiprocessor clusters available for most researchers nowadays. Efficient distribution of high performance computing resources according to actual application needs has been a major research topic since high-performance computing (HPC) technologies became widely introduced. At the same time, comfortable and transparent access to these resources was a key user requirement. In this paper we discuss approaches to build a virtual private supercomputer available at user's desktop: a virtual computing environment tailored specifically for a target user with a particular target application. We describe and evaluate possibilities to create the virtual supercomputer based on light-weight virtualization technologies, and analyze the efficiency of our approach compared to traditional methods of HPC resource management.
Performance Analysis, Modeling and Scaling of HPC Applications and Tools
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bhatele, Abhinav
2016-01-13
E cient use of supercomputers at DOE centers is vital for maximizing system throughput, mini- mizing energy costs and enabling science breakthroughs faster. This requires complementary e orts along several directions to optimize the performance of scienti c simulation codes and the under- lying runtimes and software stacks. This in turn requires providing scalable performance analysis tools and modeling techniques that can provide feedback to physicists and computer scientists developing the simulation codes and runtimes respectively. The PAMS project is using time allocations on supercomputers at ALCF, NERSC and OLCF to further the goals described above by performing research alongmore » the following fronts: 1. Scaling Study of HPC applications; 2. Evaluation of Programming Models; 3. Hardening of Performance Tools; 4. Performance Modeling of Irregular Codes; and 5. Statistical Analysis of Historical Performance Data. We are a team of computer and computational scientists funded by both DOE/NNSA and DOE/ ASCR programs such as ECRP, XStack (Traleika Glacier, PIPER), ExaOSR (ARGO), SDMAV II (MONA) and PSAAP II (XPACC). This allocation will enable us to study big data issues when analyzing performance on leadership computing class systems and to assist the HPC community in making the most e ective use of these resources.« less
NASA Astrophysics Data System (ADS)
Ryu, Hoon; Jeong, Yosang; Kang, Ji-Hoon; Cho, Kyu Nam
2016-12-01
Modelling of multi-million atomic semiconductor structures is important as it not only predicts properties of physically realizable novel materials, but can accelerate advanced device designs. This work elaborates a new Technology-Computer-Aided-Design (TCAD) tool for nanoelectronics modelling, which uses a sp3d5s∗ tight-binding approach to describe multi-million atomic structures, and simulate electronic structures with high performance computing (HPC), including atomic effects such as alloy and dopant disorders. Being named as Quantum simulation tool for Advanced Nanoscale Devices (Q-AND), the tool shows nice scalability on traditional multi-core HPC clusters implying the strong capability of large-scale electronic structure simulations, particularly with remarkable performance enhancement on latest clusters of Intel Xeon PhiTM coprocessors. A review of the recent modelling study conducted to understand an experimental work of highly phosphorus-doped silicon nanowires, is presented to demonstrate the utility of Q-AND. Having been developed via Intel Parallel Computing Center project, Q-AND will be open to public to establish a sound framework of nanoelectronics modelling with advanced HPC clusters of a many-core base. With details of the development methodology and exemplary study of dopant electronics, this work will present a practical guideline for TCAD development to researchers in the field of computational nanoelectronics.
NASA Astrophysics Data System (ADS)
Kjærgaard, Thomas; Baudin, Pablo; Bykov, Dmytro; Eriksen, Janus Juul; Ettenhuber, Patrick; Kristensen, Kasper; Larkin, Jeff; Liakh, Dmitry; Pawłowski, Filip; Vose, Aaron; Wang, Yang Min; Jørgensen, Poul
2017-03-01
We present a scalable cross-platform hybrid MPI/OpenMP/OpenACC implementation of the Divide-Expand-Consolidate (DEC) formalism with portable performance on heterogeneous HPC architectures. The Divide-Expand-Consolidate formalism is designed to reduce the steep computational scaling of conventional many-body methods employed in electronic structure theory to linear scaling, while providing a simple mechanism for controlling the error introduced by this approximation. Our massively parallel implementation of this general scheme has three levels of parallelism, being a hybrid of the loosely coupled task-based parallelization approach and the conventional MPI +X programming model, where X is either OpenMP or OpenACC. We demonstrate strong and weak scalability of this implementation on heterogeneous HPC systems, namely on the GPU-based Cray XK7 Titan supercomputer at the Oak Ridge National Laboratory. Using the "resolution of the identity second-order Møller-Plesset perturbation theory" (RI-MP2) as the physical model for simulating correlated electron motion, the linear-scaling DEC implementation is applied to 1-aza-adamantane-trione (AAT) supramolecular wires containing up to 40 monomers (2440 atoms, 6800 correlated electrons, 24 440 basis functions and 91 280 auxiliary functions). This represents the largest molecular system treated at the MP2 level of theory, demonstrating an efficient removal of the scaling wall pertinent to conventional quantum many-body methods.
Performance Analysis of Ivshmem for High-Performance Computing in Virtual Machines
NASA Astrophysics Data System (ADS)
Ivanovic, Pavle; Richter, Harald
2018-01-01
High-Performance computing (HPC) is rarely accomplished via virtual machines (VMs). In this paper, we present a remake of ivshmem which can change this. Ivshmem was a shared memory (SHM) between virtual machines on the same server, with SHM-access synchronization included, until about 5 years ago when newer versions of Linux and its virtualization library libvirt evolved. We restored that SHM-access synchronization feature because it is indispensable for HPC and made ivshmem runnable with contemporary versions of Linux, libvirt, KVM, QEMU and especially MPICH, which is an implementation of MPI - the standard HPC communication library. Additionally, MPICH was transparently modified by us to get ivshmem included, resulting in a three to ten times performance improvement compared to TCP/IP. Furthermore, we have transparently replaced MPI_PUT, a single-side MPICH communication mechanism, by an own MPI_PUT wrapper. As a result, our ivshmem even surpasses non-virtualized SHM data transfers for block lengths greater than 512 KBytes, showing the benefits of virtualization. All improvements were possible without using SR-IOV.
High Performance Proactive Digital Forensics
NASA Astrophysics Data System (ADS)
Alharbi, Soltan; Moa, Belaid; Weber-Jahnke, Jens; Traore, Issa
2012-10-01
With the increase in the number of digital crimes and in their sophistication, High Performance Computing (HPC) is becoming a must in Digital Forensics (DF). According to the FBI annual report, the size of data processed during the 2010 fiscal year reached 3,086 TB (compared to 2,334 TB in 2009) and the number of agencies that requested Regional Computer Forensics Laboratory assistance increasing from 689 in 2009 to 722 in 2010. Since most investigation tools are both I/O and CPU bound, the next-generation DF tools are required to be distributed and offer HPC capabilities. The need for HPC is even more evident in investigating crimes on clouds or when proactive DF analysis and on-site investigation, requiring semi-real time processing, are performed. Although overcoming the performance challenge is a major goal in DF, as far as we know, there is almost no research on HPC-DF except for few papers. As such, in this work, we extend our work on the need of a proactive system and present a high performance automated proactive digital forensic system. The most expensive phase of the system, namely proactive analysis and detection, uses a parallel extension of the iterative z algorithm. It also implements new parallel information-based outlier detection algorithms to proactively and forensically handle suspicious activities. To analyse a large number of targets and events and continuously do so (to capture the dynamics of the system), we rely on a multi-resolution approach to explore the digital forensic space. Data set from the Honeynet Forensic Challenge in 2001 is used to evaluate the system from DF and HPC perspectives.
Planck 2015 results: XII. Full focal plane simulations
Ade, P. A. R.; Aghanim, N.; Arnaud, M.; ...
2016-09-20
In this paper, we present the 8th full focal plane simulation set (FFP8), deployed in support of the Planck 2015 results. FFP8 consists of 10 fiducial mission realizations reduced to 18 144 maps, together with the most massive suite of Monte Carlo realizations of instrument noise and CMB ever generated, comprising 10 4 mission realizations reduced to about 10 6 maps. The resulting maps incorporate the dominant instrumental, scanning, and data analysis effects, and the remaining subdominant effects will be included in future updates. Finally, generated at a cost of some 25 million CPU-hours spread across multiple high-performance-computing (HPC) platforms,more » FFP8 is used to validate and verify analysis algorithms and their implementations, and to remove biases from and quantify uncertainties in the results of analyses of the real data.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Palmintier, Bryan; Hale, Elaine; Hodge, Bri-Mathias
2016-08-11
This paper discusses the development of, approaches for, experiences with, and some results from a large-scale, high-performance-computer-based (HPC-based) co-simulation of electric power transmission and distribution systems using the Integrated Grid Modeling System (IGMS). IGMS was developed at the National Renewable Energy Laboratory (NREL) as a novel Independent System Operator (ISO)-to-appliance scale electric power system modeling platform that combines off-the-shelf tools to simultaneously model 100s to 1000s of distribution systems in co-simulation with detailed ISO markets, transmission power flows, and AGC-level reserve deployment. Lessons learned from the co-simulation architecture development are shared, along with a case study that explores the reactivemore » power impacts of PV inverter voltage support on the bulk power system.« less
NASA Astrophysics Data System (ADS)
Elantkowska, Magdalena; Ruczkowski, Jarosław; Sikorski, Andrzej; Dembczyński, Jerzy
2017-11-01
A parametric analysis of the hyperfine structure (hfs) for the even parity configurations of atomic terbium (Tb I) is presented in this work. We introduce the complete set of 4fN-core states in our high-performance computing (HPC) calculations. For calculations of the huge hyperfine structure matrix, requiring approximately 5000 hours when run on a single CPU, we propose the methods utilizing a personal computer cluster or, alternatively a cluster of Microsoft Azure virtual machines (VM). These methods give a factor 12 performance boost, enabling the calculations to complete in an acceptable time.
Zhou, Ji; Applegate, Christopher; Alonso, Albor Dobon; Reynolds, Daniel; Orford, Simon; Mackiewicz, Michal; Griffiths, Simon; Penfield, Steven; Pullen, Nick
2017-01-01
Plants demonstrate dynamic growth phenotypes that are determined by genetic and environmental factors. Phenotypic analysis of growth features over time is a key approach to understand how plants interact with environmental change as well as respond to different treatments. Although the importance of measuring dynamic growth traits is widely recognised, available open software tools are limited in terms of batch image processing, multiple traits analyses, software usability and cross-referencing results between experiments, making automated phenotypic analysis problematic. Here, we present Leaf-GP (Growth Phenotypes), an easy-to-use and open software application that can be executed on different computing platforms. To facilitate diverse scientific communities, we provide three software versions, including a graphic user interface (GUI) for personal computer (PC) users, a command-line interface for high-performance computer (HPC) users, and a well-commented interactive Jupyter Notebook (also known as the iPython Notebook) for computational biologists and computer scientists. The software is capable of extracting multiple growth traits automatically from large image datasets. We have utilised it in Arabidopsis thaliana and wheat ( Triticum aestivum ) growth studies at the Norwich Research Park (NRP, UK). By quantifying a number of growth phenotypes over time, we have identified diverse plant growth patterns between different genotypes under several experimental conditions. As Leaf-GP has been evaluated with noisy image series acquired by different imaging devices (e.g. smartphones and digital cameras) and still produced reliable biological outputs, we therefore believe that our automated analysis workflow and customised computer vision based feature extraction software implementation can facilitate a broader plant research community for their growth and development studies. Furthermore, because we implemented Leaf-GP based on open Python-based computer vision, image analysis and machine learning libraries, we believe that our software not only can contribute to biological research, but also demonstrates how to utilise existing open numeric and scientific libraries (e.g. Scikit-image, OpenCV, SciPy and Scikit-learn) to build sound plant phenomics analytic solutions, in a efficient and effective way. Leaf-GP is a sophisticated software application that provides three approaches to quantify growth phenotypes from large image series. We demonstrate its usefulness and high accuracy based on two biological applications: (1) the quantification of growth traits for Arabidopsis genotypes under two temperature conditions; and (2) measuring wheat growth in the glasshouse over time. The software is easy-to-use and cross-platform, which can be executed on Mac OS, Windows and HPC, with open Python-based scientific libraries preinstalled. Our work presents the advancement of how to integrate computer vision, image analysis, machine learning and software engineering in plant phenomics software implementation. To serve the plant research community, our modulated source code, detailed comments, executables (.exe for Windows; .app for Mac), and experimental results are freely available at https://github.com/Crop-Phenomics-Group/Leaf-GP/releases.
System Connection via SSH Gateway | High-Performance Computing | NREL
;@peregrine.hpc.nrel.gov First time logging in? If this is the first time you've logged in with your new account, you will password. You will be prompted to enter it a second time, then you will be logged off. Just reconnect with your HPC password at any time, you can simply use the passwd command. Remote Users If you're connecting
High-Performance Computing Data Center Warm-Water Liquid Cooling |
Computational Science | NREL Warm-Water Liquid Cooling High-Performance Computing Data Center Warm-Water Liquid Cooling NREL's High-Performance Computing Data Center (HPC Data Center) is liquid water Liquid cooling technologies offer a more energy-efficient solution that also allows for effective
2013-12-01
authors present a Computing on Dissemination with predictable contacts ( pCoD ) algorithm, since it is impossible to reserve task execution time in advance...Computing While Charging DAG Directed Acyclic Graph 18 TTL Time-to-live pCoD Predictable contacts CoD Computing on Dissemination upCoD Unpredictable
Data Storage and Transfer | High-Performance Computing | NREL
High-Performance Computing (HPC) systems. Photo of computer server wiring and lights, blurred to show data. WinSCP for Windows File Transfers Use to transfer files from a local computer to a remote computer. Robinhood for File Management Use this tool to manage your data files on Peregrine. Best
Brown, David K; Penkler, David L; Musyoka, Thommas M; Bishop, Özlem Tastan
2015-01-01
Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.
Brown, David K.; Penkler, David L.; Musyoka, Thommas M.; Bishop, Özlem Tastan
2015-01-01
Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS. PMID:26280450
Running Jobs on the Peregrine System | High-Performance Computing | NREL
on the Peregrine high-performance computing (HPC) system. Running Different Types of Jobs Batch jobs scheduling policies - queue names, limits, etc. Requesting different node types Sample batch scripts
Planning for Pre-Exascale Platform Environment (Fiscal Year 2015 Level 2 Milestone 5216)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Springmeyer, R.; Lang, M.; Noe, J.
This Plan for ASC Pre-Exascale Platform Environments document constitutes the deliverable for the fiscal year 2015 (FY15) Advanced Simulation and Computing (ASC) Program Level 2 milestone Planning for Pre-Exascale Platform Environment. It acknowledges and quantifies challenges and recognized gaps for moving the ASC Program towards effective use of exascale platforms and recommends strategies to address these gaps. This document also presents an update to the concerns, strategies, and plans presented in the FY08 predecessor document that dealt with the upcoming (at the time) petascale high performance computing (HPC) platforms. With the looming push towards exascale systems, a review of themore » earlier document was appropriate in light of the myriad architectural choices currently under consideration. The ASC Program believes the platforms to be fielded in the 2020s will be fundamentally different systems that stress ASC’s ability to modify codes to take full advantage of new or unique features. In addition, the scale of components will increase the difficulty of maintaining an errorfree system, thus driving new approaches to resilience and error detection/correction. The code revamps of the past, from serial- to vector-centric code to distributed memory to threaded implementations, will be revisited as codes adapt to a new message passing interface (MPI) plus “x” or more advanced and dynamic programming models based on architectural specifics. Development efforts are already underway in some cases, and more difficult or uncertain aspects of the new architectures will require research and analysis that may inform future directions for program choices. In addition, the potential diversity of system architectures may require parallel if not duplicative efforts to analyze and modify environments, codes, subsystems, libraries, debugging tools, and performance analysis techniques as well as exploring new monitoring methodologies. It is difficult if not impossible to selectively eliminate some of these activities until more information is available through simulations of potential architectures, analysis of systems designs, and informed study of commodity technologies that will be the constituent parts of future platforms.« less
NASA Astrophysics Data System (ADS)
Brockmann, J. M.; Schuh, W.-D.
2011-07-01
The estimation of the global Earth's gravity field parametrized as a finite spherical harmonic series is computationally demanding. The computational effort depends on the one hand on the maximal resolution of the spherical harmonic expansion (i.e. the number of parameters to be estimated) and on the other hand on the number of observations (which are several millions for e.g. observations from the GOCE satellite missions). To circumvent these restrictions, a massive parallel software based on high-performance computing (HPC) libraries as ScaLAPACK, PBLAS and BLACS was designed in the context of GOCE HPF WP6000 and the GOCO consortium. A prerequisite for the use of these libraries is that all matrices are block-cyclic distributed on a processor grid comprised by a large number of (distributed memory) computers. Using this set of standard HPC libraries has the benefit that once the matrices are distributed across the computer cluster, a huge set of efficient and highly scalable linear algebra operations can be used.
On the energy footprint of I/O management in Exascale HPC systems
Dorier, Matthieu; Yildiz, Orcun; Ibrahim, Shadi; ...
2016-03-21
The advent of unprecedentedly scalable yet energy hungry Exascale supercomputers poses a major challenge in sustaining a high performance-per-watt ratio. With I/O management acquiring a crucial role in supporting scientific simulations, various I/O management approaches have been proposed to achieve high performance and scalability. But, the details of how these approaches affect energy consumption have not been studied yet. Therefore, this paper aims to explore how much energy a supercomputer consumes while running scientific simulations when adopting various I/O management approaches. In particular, we closely examine three radically different I/O schemes including time partitioning, dedicated cores, and dedicated nodes. Tomore » accomplish this, we implement the three approaches within the Damaris I/O middleware and perform extensive experiments with one of the target HPC applications of the Blue Waters sustained-petaflop supercomputer project: the CM1 atmospheric model. Our experimental results obtained on the French Grid'5000 platform highlight the differences among these three approaches and illustrate in which way various configurations of the application and of the system can impact performance and energy consumption. Moreover, we propose and validate a mathematical model that estimates the energy consumption of a HPC simulation under different I/O approaches. This proposed model gives hints to pre-select the most energy-efficient I/O approach for a particular simulation on a particular HPC system and therefore provides a step towards energy-efficient HPC simulations in Exascale systems. To the best of our knowledge, our work provides the first in-depth look into the energy-performance tradeoffs of I/O management approaches.« less
Template Interfaces for Agile Parallel Data-Intensive Science
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ramakrishnan, Lavanya; Gunter, Daniel; Pastorello, Gilerto Z.
Tigres provides a programming library to compose and execute large-scale data-intensive scientific workflows from desktops to supercomputers. DOE User Facilities and large science collaborations are increasingly generating large enough data sets that it is no longer practical to download them to a desktop to operate on them. They are instead stored at centralized compute and storage resources such as high performance computing (HPC) centers. Analysis of this data requires an ability to run on these facilities, but with current technologies, scaling an analysis to an HPC center and to a large data set is difficult even for experts. Tigres ismore » addressing the challenge of enabling collaborative analysis of DOE Science data through a new concept of reusable "templates" that enable scientists to easily compose, run and manage collaborative computational tasks. These templates define common computation patterns used in analyzing a data set.« less
Climate simulations and services on HPC, Cloud and Grid infrastructures
NASA Astrophysics Data System (ADS)
Cofino, Antonio S.; Blanco, Carlos; Minondo Tshuma, Antonio
2017-04-01
Cloud, Grid and High Performance Computing have changed the accessibility and availability of computing resources for Earth Science research communities, specially for Climate community. These paradigms are modifying the way how climate applications are being executed. By using these technologies the number, variety and complexity of experiments and resources are increasing substantially. But, although computational capacity is increasing, traditional applications and tools used by the community are not good enough to manage this large volume and variety of experiments and computing resources. In this contribution, we evaluate the challenges to run climate simulations and services on Grid, Cloud and HPC infrestructures and how to tackle them. The Grid and Cloud infrastructures provided by EGI's VOs ( esr , earth.vo.ibergrid and fedcloud.egi.eu) will be evaluated, as well as HPC resources from PRACE infrastructure and institutional clusters. To solve those challenges, solutions using DRM4G framework will be shown. DRM4G provides a good framework to manage big volume and variety of computing resources for climate experiments. This work has been supported by the Spanish National R&D Plan under projects WRF4G (CGL2011-28864), INSIGNIA (CGL2016-79210-R) and MULTI-SDM (CGL2015-66583-R) ; the IS-ENES2 project from the 7FP of the European Commission (grant agreement no. 312979); the European Regional Development Fund—ERDF and the Programa de Personal Investigador en Formación Predoctoral from Universidad de Cantabria and Government of Cantabria.
KITTEN Lightweight Kernel 0.1 Beta
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pedretti, Kevin; Levenhagen, Michael; Kelly, Suzanne
2007-12-12
The Kitten Lightweight Kernel is a simplified OS (operating system) kernel that is intended to manage a compute node's hardware resources. It provides a set of mechanisms to user-level applications for utilizing hardware resources (e.g., allocating memory, creating processes, accessing the network). Kitten is much simpler than general-purpose OS kernels, such as Linux or Windows, but includes all of the esssential functionality needed to support HPC (high-performance computing) MPI, PGAS and OpenMP applications. Kitten provides unique capabilities such as physically contiguous application memory, transparent large page support, and noise-free tick-less operation, which enable HPC applications to obtain greater efficiency andmore » scalability than with general purpose OS kernels.« less
NREL's Building-Integrated Supercomputer Provides Heating and Efficient Computing (Fact Sheet)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
2014-09-01
NREL's Energy Systems Integration Facility (ESIF) is meant to investigate new ways to integrate energy sources so they work together efficiently, and one of the key tools to that investigation, a new supercomputer, is itself a prime example of energy systems integration. NREL teamed with Hewlett-Packard (HP) and Intel to develop the innovative warm-water, liquid-cooled Peregrine supercomputer, which not only operates efficiently but also serves as the primary source of building heat for ESIF offices and laboratories. This innovative high-performance computer (HPC) can perform more than a quadrillion calculations per second as part of the world's most energy-efficient HPC datamore » center.« less
Providing a parallel and distributed capability for JMASS using SPEEDES
NASA Astrophysics Data System (ADS)
Valinski, Maria; Driscoll, Jonathan; McGraw, Robert M.; Meyer, Bob
2002-07-01
The Joint Modeling And Simulation System (JMASS) is a Tri-Service simulation environment that supports engineering and engagement-level simulations. As JMASS is expanded to support other Tri-Service domains, the current set of modeling services must be expanded for High Performance Computing (HPC) applications by adding support for advanced time-management algorithms, parallel and distributed topologies, and high speed communications. By providing support for these services, JMASS can better address modeling domains requiring parallel computationally intense calculations such clutter, vulnerability and lethality calculations, and underwater-based scenarios. A risk reduction effort implementing some HPC services for JMASS using the SPEEDES (Synchronous Parallel Environment for Emulation and Discrete Event Simulation) Simulation Framework has recently concluded. As an artifact of the JMASS-SPEEDES integration, not only can HPC functionality be brought to the JMASS program through SPEEDES, but an additional HLA-based capability can be demonstrated that further addresses interoperability issues. The JMASS-SPEEDES integration provided a means of adding HLA capability to preexisting JMASS scenarios through an implementation of the standard JMASS port communication mechanism that allows players to communicate.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Unat, Didem; Dubey, Anshu; Hoefler, Torsten
The cost of data movement has always been an important concern in high performance computing (HPC) systems. It has now become the dominant factor in terms of both energy consumption and performance. Support for expression of data locality has been explored in the past, but those efforts have had only modest success in being adopted in HPC applications for various reasons. However, with the increasing complexity of the memory hierarchy and higher parallelism in emerging HPC systems, locality management has acquired a new urgency. Developers can no longer limit themselves to low-level solutions and ignore the potential for productivity andmore » performance portability obtained by using locality abstractions. Fortunately, the trend emerging in recent literature on the topic alleviates many of the concerns that got in the way of their adoption by application developers. Data locality abstractions are available in the forms of libraries, data structures, languages and runtime systems; a common theme is increasing productivity without sacrificing performance. Furthermore, this paper examines these trends and identifies commonalities that can combine various locality concepts to develop a comprehensive approach to expressing and managing data locality on future large-scale high-performance computing systems.« less
Analysis of Application Power and Schedule Composition in a High Performance Computing Environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Elmore, Ryan; Gruchalla, Kenny; Phillips, Caleb
As the capacity of high performance computing (HPC) systems continues to grow, small changes in energy management have the potential to produce significant energy savings. In this paper, we employ an extensive informatics system for aggregating and analyzing real-time performance and power use data to evaluate energy footprints of jobs running in an HPC data center. We look at the effects of algorithmic choices for a given job on the resulting energy footprints, and analyze application-specific power consumption, and summarize average power use in the aggregate. All of these views reveal meaningful power variance between classes of applications as wellmore » as chosen methods for a given job. Using these data, we discuss energy-aware cost-saving strategies based on reordering the HPC job schedule. Using historical job and power data, we present a hypothetical job schedule reordering that: (1) reduces the facility's peak power draw and (2) manages power in conjunction with a large-scale photovoltaic array. Lastly, we leverage this data to understand the practical limits on predicting key power use metrics at the time of submission.« less
NASA Astrophysics Data System (ADS)
Myre, Joseph M.
Heterogeneous computing systems have recently come to the forefront of the High-Performance Computing (HPC) community's interest. HPC computer systems that incorporate special purpose accelerators, such as Graphics Processing Units (GPUs), are said to be heterogeneous. Large scale heterogeneous computing systems have consistently ranked highly on the Top500 list since the beginning of the heterogeneous computing trend. By using heterogeneous computing systems that consist of both general purpose processors and special- purpose accelerators, the speed and problem size of many simulations could be dramatically increased. Ultimately this results in enhanced simulation capabilities that allows, in some cases for the first time, the execution of parameter space and uncertainty analyses, model optimizations, and other inverse modeling techniques that are critical for scientific discovery and engineering analysis. However, simplifying the usage and optimization of codes for heterogeneous computing systems remains a challenge. This is particularly true for scientists and engineers for whom understanding HPC architectures and undertaking performance analysis may not be primary research objectives. To enable scientists and engineers to remain focused on their primary research objectives, a modular environment for geophysical inversion and run-time autotuning on heterogeneous computing systems is presented. This environment is composed of three major components: 1) CUSH---a framework for reducing the complexity of programming heterogeneous computer systems, 2) geophysical inversion routines which can be used to characterize physical systems, and 3) run-time autotuning routines designed to determine configurations of heterogeneous computing systems in an attempt to maximize the performance of scientific and engineering codes. Using three case studies, a lattice-Boltzmann method, a non-negative least squares inversion, and a finite-difference fluid flow method, it is shown that this environment provides scientists and engineers with means to reduce the programmatic complexity of their applications, to perform geophysical inversions for characterizing physical systems, and to determine high-performing run-time configurations of heterogeneous computing systems using a run-time autotuner.
Multi-Core Processor Memory Contention Benchmark Analysis Case Study
NASA Technical Reports Server (NTRS)
Simon, Tyler; McGalliard, James
2009-01-01
Multi-core processors dominate current mainframe, server, and high performance computing (HPC) systems. This paper provides synthetic kernel and natural benchmark results from an HPC system at the NASA Goddard Space Flight Center that illustrate the performance impacts of multi-core (dual- and quad-core) vs. single core processor systems. Analysis of processor design, application source code, and synthetic and natural test results all indicate that multi-core processors can suffer from significant memory subsystem contention compared to similar single-core processors.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hornung, Richard D.; Hones, Holger E.
The RAJA Performance Suite is designed to evaluate performance of the RAJA performance portability library on a wide variety of important high performance computing (HPC) algorithmic lulmels. These kernels assess compiler optimizations and various parallel programming model backends accessible through RAJA, such as OpenMP, CUDA, etc. The Initial version of the suite contains 25 computational kernels, each of which appears in 6 variants: Baseline SequcntiaJ, RAJA SequentiaJ, Baseline OpenMP, RAJA OpenMP, Baseline CUDA, RAJA CUDA. All variants of each kernel perform essentially the same mathematical operations and the loop body code for each kernel is identical across all variants. Theremore » are a few kernels, such as those that contain reduction operations, that require CUDA-specific coding for their CUDA variants. ActuaJ computer instructions executed and how they run in parallel differs depending on the parallel programming model backend used and which optimizations are perfonned by the compiler used to build the Perfonnance Suite executable. The Suite will be used primarily by RAJA developers to perform regular assessments of RAJA performance across a range of hardware platforms and compilers as RAJA features are being developed. It will also be used by LLNL hardware and software vendor panners for new defining requirements for future computing platform procurements and acceptance testing. In particular, the RAJA Performance Suite will be used for compiler acceptance testing of the upcoming CORAUSierra machine {initial LLNL delivery expected in late-2017/early 2018) and the CORAL-2 procurement. The Suite will aJso be used to generate concise source code reproducers of compiler and runtime issues we uncover so that we may provide them to relevant vendors to be fixed.« less
Unmet needs for analyzing biological big data: A survey of 704 NSF principal investigators.
Barone, Lindsay; Williams, Jason; Micklos, David
2017-10-01
In a 2016 survey of 704 National Science Foundation (NSF) Biological Sciences Directorate principal investigators (BIO PIs), nearly 90% indicated they are currently or will soon be analyzing large data sets. BIO PIs considered a range of computational needs important to their work, including high performance computing (HPC), bioinformatics support, multistep workflows, updated analysis software, and the ability to store, share, and publish data. Previous studies in the United States and Canada emphasized infrastructure needs. However, BIO PIs said the most pressing unmet needs are training in data integration, data management, and scaling analyses for HPC-acknowledging that data science skills will be required to build a deeper understanding of life. This portends a growing data knowledge gap in biology and challenges institutions and funding agencies to redouble their support for computational training in biology.
Performance of the fusion code GYRO on four generations of Cray computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fahey, Mark R
2014-01-01
GYRO is a code used for the direct numerical simulation of plasma microturbulence. It has been ported to a variety of modern MPP platforms including several modern commodity clusters, IBM SPs, and Cray XC, XT, and XE series machines. We briefly describe the mathematical structure of the equations, the data layout, and the redistribution scheme. Also, while the performance and scaling of GYRO on many of these systems has been shown before, here we show the comparative performance and scaling on four generations of Cray supercomputers including the newest addition - the Cray XC30. The more recently added hybrid OpenMP/MPImore » imple- mentation also shows a great deal of promise on custom HPC systems that utilize fast CPUs and proprietary interconnects. Four machines of varying sizes were used in the experiment, all of which are located at the National Institute for Computational Sciences at the University of Tennessee at Knoxville and Oak Ridge National Laboratory. The advantages, limitations, and performance of using each system are discussed.« less
Kim, Gyungock; Park, Hyundai; Joo, Jiho; Jang, Ki-Seok; Kwack, Myung-Joon; Kim, Sanghoon; Kim, In Gyoo; Oh, Jin Hyuk; Kim, Sun Ae; Park, Jaegyu; Kim, Sanggi
2015-06-10
When silicon photonic integrated circuits (PICs), defined for transmitting and receiving optical data, are successfully monolithic-integrated into major silicon electronic chips as chip-level optical I/Os (inputs/outputs), it will bring innovative changes in data computing and communications. Here, we propose new photonic integration scheme, a single-chip optical transceiver based on a monolithic-integrated vertical photonic I/O device set including light source on bulk-silicon. This scheme can solve the major issues which impede practical implementation of silicon-based chip-level optical interconnects. We demonstrated a prototype of a single-chip photonic transceiver with monolithic-integrated vertical-illumination type Ge-on-Si photodetectors and VCSELs-on-Si on the same bulk-silicon substrate operating up to 50 Gb/s and 20 Gb/s, respectively. The prototype realized 20 Gb/s low-power chip-level optical interconnects for λ ~ 850 nm between fabricated chips. This approach can have a significant impact on practical electronic-photonic integration in high performance computers (HPC), cpu-memory interface, hybrid memory cube, and LAN, SAN, data center and network applications.
Antonioletti, Mario; Biktashev, Vadim N; Jackson, Adrian; Kharche, Sanjay R; Stary, Tomas; Biktasheva, Irina V
2017-01-01
The BeatBox simulation environment combines flexible script language user interface with the robust computational tools, in order to setup cardiac electrophysiology in-silico experiments without re-coding at low-level, so that cell excitation, tissue/anatomy models, stimulation protocols may be included into a BeatBox script, and simulation run either sequentially or in parallel (MPI) without re-compilation. BeatBox is a free software written in C language to be run on a Unix-based platform. It provides the whole spectrum of multi scale tissue modelling from 0-dimensional individual cell simulation, 1-dimensional fibre, 2-dimensional sheet and 3-dimensional slab of tissue, up to anatomically realistic whole heart simulations, with run time measurements including cardiac re-entry tip/filament tracing, ECG, local/global samples of any variables, etc. BeatBox solvers, cell, and tissue/anatomy models repositories are extended via robust and flexible interfaces, thus providing an open framework for new developments in the field. In this paper we give an overview of the BeatBox current state, together with a description of the main computational methods and MPI parallelisation approaches.
NASA Astrophysics Data System (ADS)
Mphuthi, Ntsoaki G.; Adekunle, Abolanle S.; Fayemi, Omolola E.; Olasunkanmi, Lukman O.; Ebenso, Eno E.
2017-03-01
The electrocatalytic properties of metal oxides (MO = Fe3O4, ZnO) nanoparticles doped phthalocyanine (Pc) and functionalized MWCNTs, decorated on glassy carbon electrode (GCE) was investigated. Successful synthesis of the metal oxide nanoparticles and the MO/Pc/MWCNT composite were confirmed using UV-Vis, EDX, XRD and TEM techniques. Successful modification of GCE with the MO and their composite was also confirmed using cyclic voltammetry (CV) technique. GCE-MWCNT/ZnO/29H,31H-Pc was the best electrode towards DA detection with very low detection limit (0.75 μM) which compared favourably with literature, good sensitivity (1.45 μA/μM), resistance to electrode fouling, and excellent ability to detect DA without interference from AA signal. Electrocatalytic oxidation of DA on GCE-MWCNT/ZnO/29H,31H-Pc electrode was diffusion controlled but characterized with some adsorption of electro-oxidation reaction intermediates products. The fabricated sensors are easy to prepare, cost effective and can be applied for real sample analysis of dopamine in drug composition. The good electrocatalytic properties of 29H,31H-Pc and 2,3-Nc were related to their (quantum chemically derived) frontier molecular orbital energies and global electronegativities. The better performance of 29H,31H-Pc than 2,3-Nc in aiding electrochemical oxidation of DA might be due to its better electron accepting ability, which is inferred from its lower ELUMO and higher χ.
Mphuthi, Ntsoaki G.; Adekunle, Abolanle S.; Fayemi, Omolola E.; Olasunkanmi, Lukman O.; Ebenso, Eno E.
2017-01-01
The electrocatalytic properties of metal oxides (MO = Fe3O4, ZnO) nanoparticles doped phthalocyanine (Pc) and functionalized MWCNTs, decorated on glassy carbon electrode (GCE) was investigated. Successful synthesis of the metal oxide nanoparticles and the MO/Pc/MWCNT composite were confirmed using UV-Vis, EDX, XRD and TEM techniques. Successful modification of GCE with the MO and their composite was also confirmed using cyclic voltammetry (CV) technique. GCE-MWCNT/ZnO/29H,31H-Pc was the best electrode towards DA detection with very low detection limit (0.75 μM) which compared favourably with literature, good sensitivity (1.45 μA/μM), resistance to electrode fouling, and excellent ability to detect DA without interference from AA signal. Electrocatalytic oxidation of DA on GCE-MWCNT/ZnO/29H,31H-Pc electrode was diffusion controlled but characterized with some adsorption of electro-oxidation reaction intermediates products. The fabricated sensors are easy to prepare, cost effective and can be applied for real sample analysis of dopamine in drug composition. The good electrocatalytic properties of 29H,31H-Pc and 2,3-Nc were related to their (quantum chemically derived) frontier molecular orbital energies and global electronegativities. The better performance of 29H,31H-Pc than 2,3-Nc in aiding electrochemical oxidation of DA might be due to its better electron accepting ability, which is inferred from its lower ELUMO and higher χ. PMID:28256521
Scaling Deep Learning on GPU and Knights Landing clusters
You, Yang; Buluc, Aydin; Demmel, James
2017-09-26
The speed of deep neural networks training has become a big bottleneck of deep learning research and development. For example, training GoogleNet by ImageNet dataset on one Nvidia K20 GPU needs 21 days. To speed up the training process, the current deep learning systems heavily rely on the hardware accelerators. However, these accelerators have limited on-chip memory compared with CPUs. To handle large datasets, they need to fetch data from either CPU memory or remote processors. We use both self-hosted Intel Knights Landing (KNL) clusters and multi-GPU clusters as our target platforms. From an algorithm aspect, current distributed machine learningmore » systems are mainly designed for cloud systems. These methods are asynchronous because of the slow network and high fault-tolerance requirement on cloud systems. We focus on Elastic Averaging SGD (EASGD) to design algorithms for HPC clusters. Original EASGD used round-robin method for communication and updating. The communication is ordered by the machine rank ID, which is inefficient on HPC clusters. First, we redesign four efficient algorithms for HPC systems to improve EASGD's poor scaling on clusters. Async EASGD, Async MEASGD, and Hogwild EASGD are faster \\textcolor{black}{than} their existing counterparts (Async SGD, Async MSGD, and Hogwild SGD, resp.) in all the comparisons. Finally, we design Sync EASGD, which ties for the best performance among all the methods while being deterministic. In addition to the algorithmic improvements, we use some system-algorithm codesign techniques to scale up the algorithms. By reducing the percentage of communication from 87% to 14%, our Sync EASGD achieves 5.3x speedup over original EASGD on the same platform. We get 91.5% weak scaling efficiency on 4253 KNL cores, which is higher than the state-of-the-art implementation.« less
Scaling Deep Learning on GPU and Knights Landing clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
You, Yang; Buluc, Aydin; Demmel, James
The speed of deep neural networks training has become a big bottleneck of deep learning research and development. For example, training GoogleNet by ImageNet dataset on one Nvidia K20 GPU needs 21 days. To speed up the training process, the current deep learning systems heavily rely on the hardware accelerators. However, these accelerators have limited on-chip memory compared with CPUs. To handle large datasets, they need to fetch data from either CPU memory or remote processors. We use both self-hosted Intel Knights Landing (KNL) clusters and multi-GPU clusters as our target platforms. From an algorithm aspect, current distributed machine learningmore » systems are mainly designed for cloud systems. These methods are asynchronous because of the slow network and high fault-tolerance requirement on cloud systems. We focus on Elastic Averaging SGD (EASGD) to design algorithms for HPC clusters. Original EASGD used round-robin method for communication and updating. The communication is ordered by the machine rank ID, which is inefficient on HPC clusters. First, we redesign four efficient algorithms for HPC systems to improve EASGD's poor scaling on clusters. Async EASGD, Async MEASGD, and Hogwild EASGD are faster \\textcolor{black}{than} their existing counterparts (Async SGD, Async MSGD, and Hogwild SGD, resp.) in all the comparisons. Finally, we design Sync EASGD, which ties for the best performance among all the methods while being deterministic. In addition to the algorithmic improvements, we use some system-algorithm codesign techniques to scale up the algorithms. By reducing the percentage of communication from 87% to 14%, our Sync EASGD achieves 5.3x speedup over original EASGD on the same platform. We get 91.5% weak scaling efficiency on 4253 KNL cores, which is higher than the state-of-the-art implementation.« less
Kjaergaard, Thomas; Baudin, Pablo; Bykov, Dmytro; ...
2016-11-16
Here, we present a scalable cross-platform hybrid MPI/OpenMP/OpenACC implementation of the Divide–Expand–Consolidate (DEC) formalism with portable performance on heterogeneous HPC architectures. The Divide–Expand–Consolidate formalism is designed to reduce the steep computational scaling of conventional many-body methods employed in electronic structure theory to linear scaling, while providing a simple mechanism for controlling the error introduced by this approximation. Our massively parallel implementation of this general scheme has three levels of parallelism, being a hybrid of the loosely coupled task-based parallelization approach and the conventional MPI +X programming model, where X is either OpenMP or OpenACC. We demonstrate strong and weak scalabilitymore » of this implementation on heterogeneous HPC systems, namely on the GPU-based Cray XK7 Titan supercomputer at the Oak Ridge National Laboratory. Using the “resolution of the identity second-order Moller–Plesset perturbation theory” (RI-MP2) as the physical model for simulating correlated electron motion, the linear-scaling DEC implementation is applied to 1-aza-adamantane-trione (AAT) supramolecular wires containing up to 40 monomers (2440 atoms, 6800 correlated electrons, 24 440 basis functions and 91 280 auxiliary functions). This represents the largest molecular system treated at the MP2 level of theory, demonstrating an efficient removal of the scaling wall pertinent to conventional quantum many-body methods.« less
ERIC Educational Resources Information Center
Amenyo, John-Thones
2012-01-01
Carefully engineered playable games can serve as vehicles for students and practitioners to learn and explore the programming of advanced computer architectures to execute applications, such as high performance computing (HPC) and complex, inter-networked, distributed systems. The article presents families of playable games that are grounded in…
Tuning HDF5 for Lustre File Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Howison, Mark; Koziol, Quincey; Knaak, David
2010-09-24
HDF5 is a cross-platform parallel I/O library that is used by a wide variety of HPC applications for the flexibility of its hierarchical object-database representation of scientific data. We describe our recent work to optimize the performance of the HDF5 and MPI-IO libraries for the Lustre parallel file system. We selected three different HPC applications to represent the diverse range of I/O requirements, and measured their performance on three different systems to demonstrate the robustness of our optimizations across different file system configurations and to validate our optimization strategy. We demonstrate that the combined optimizations improve HDF5 parallel I/O performancemore » by up to 33 times in some cases running close to the achievable peak performance of the underlying file system and demonstrate scalable performance up to 40,960-way concurrency.« less
Data Provenance Hybridization Supporting Extreme-Scale Scientific WorkflowApplications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Elsethagen, Todd O.; Stephan, Eric G.; Raju, Bibi
As high performance computing (HPC) infrastructures continue to grow in capability and complexity, so do the applications that they serve. HPC and distributed-area computing (DAC) (e.g. grid and cloud) users are looking increasingly toward workflow solutions to orchestrate their complex application coupling, pre- and post-processing needs To gain insight and a more quantitative understanding of a workflow’s performance our method includes not only the capture of traditional provenance information, but also the capture and integration of system environment metrics helping to give context and explanation for a workflow’s execution. In this paper, we describe IPPD’s provenance management solution (ProvEn) andmore » its hybrid data store combining both of these data provenance perspectives.« less
CFD Ventilation Study for the Human Powered Centrifuge at the International Space Station
NASA Technical Reports Server (NTRS)
Son, Chang H.
2011-01-01
The Human Powered Centrifuge (HPC) is a hyper gravity facility that will be installed on board the International Space Station (ISS) to enable crew exercises under the artificial gravity conditions. The HPC equipment includes a bicycle for long-term exercises of a crewmember that provides power for rotation of HPC at a speed of 30 rpm. The crewmember exercising vigorously on the centrifuge generates the amount of carbon dioxide of several times higher than a crewmember in ordinary conditions. The goal of the study is to analyze the airflow and carbon dioxide distribution within Pressurized Multipurpose Module (PMM) cabin. The 3D computational model included PMM cabin. The full unsteady formulation was used for airflow and CO2 transport modeling with the so-called sliding mesh concept is considered in the rotating reference frame while the rest of the cabin volume is considered in the stationary reference frame. The localized effects of carbon dioxide dispersion are examined. Strong influence of the rotating HPC equipment on the CO2 distribution is detected and discussed.
Exploring the capabilities of support vector machines in detecting silent data corruptions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Subasi, Omer; Di, Sheng; Bautista-Gomez, Leonardo
As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant challenges in reliability. Silent data corruptions (SDCs), or silent errors, are one of the major sources that corrupt the execution results of HPC applications without being detected. Here in this paper, we explore a set of novel SDC detectors – by leveraging epsilon-insensitive support vector machine regression – to detect SDCs that occur in HPC applications. The key contributions are threefold. (1) Our exploration takes temporal, spatial, and spatiotemporal features into account and analyzes different detectors based onmore » different features. (2) We provide an in-depth study on the detection ability and performance with different parameters, and we optimize the detection range carefully. (3) Experiments with eight real-world HPC applications show that support-vector-machine-based detectors can achieve detection sensitivity (i.e., recall) up to 99% yet suffer a less than 1% false positive rate for most cases. Our detectors incur low performance overhead, 5% on average, for all benchmarks studied in this work.« less
Exploring the capabilities of support vector machines in detecting silent data corruptions
Subasi, Omer; Di, Sheng; Bautista-Gomez, Leonardo; ...
2018-02-01
As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant challenges in reliability. Silent data corruptions (SDCs), or silent errors, are one of the major sources that corrupt the execution results of HPC applications without being detected. Here in this paper, we explore a set of novel SDC detectors – by leveraging epsilon-insensitive support vector machine regression – to detect SDCs that occur in HPC applications. The key contributions are threefold. (1) Our exploration takes temporal, spatial, and spatiotemporal features into account and analyzes different detectors based onmore » different features. (2) We provide an in-depth study on the detection ability and performance with different parameters, and we optimize the detection range carefully. (3) Experiments with eight real-world HPC applications show that support-vector-machine-based detectors can achieve detection sensitivity (i.e., recall) up to 99% yet suffer a less than 1% false positive rate for most cases. Our detectors incur low performance overhead, 5% on average, for all benchmarks studied in this work.« less
User-level framework for performance monitoring of HPC applications
NASA Astrophysics Data System (ADS)
Hristova, R.; Goranov, G.
2013-10-01
HP-SEE is an infrastructure that links the existing HPC facilities in South East Europe in a common infrastructure. The analysis of the performance monitoring of the High-Performance Computing (HPC) applications in the infrastructure can be useful for the end user as diagnostic for the overall performance of his applications. The existing monitoring tools for HP-SEE provide to the end user only aggregated information for all applications. Usually, the user does not have permissions to select only the relevant information for him and for his applications. In this article we present a framework for performance monitoring of the HPC applications in the HP-SEE infrastructure. The framework provides standardized performance metrics, which every user can use in order to monitor his applications. Furthermore as a part of the framework a program interface is developed. The interface allows the user to publish metrics data from his application and to read and analyze gathered information. Publishing and reading through the framework is possible only with grid certificate valid for the infrastructure. Therefore the user is authorized to access only the data for his applications.
Scaling Support Vector Machines On Modern HPC Platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
You, Yang; Fu, Haohuan; Song, Shuaiwen
2015-02-01
We designed and implemented MIC-SVM, a highly efficient parallel SVM for x86 based multicore and many-core architectures, such as the Intel Ivy Bridge CPUs and Intel Xeon Phi co-processor (MIC). We propose various novel analysis methods and optimization techniques to fully utilize the multilevel parallelism provided by these architectures and serve as general optimization methods for other machine learning tools.
Jones, David S; Margetson, Daniel N; McAllister, Mark S; Andrews, Gavin P
2015-12-30
Given the growing interest in thermal processing methods, this study describes the use of an advanced rheological technique, capillary rheometry, to accurately determine the thermorheological properties of two pharmaceutical polymers, Eudragit E100 (E100) and hydroxypropylcellulose JF (HPC) and their blends, both in the presence and absence of a model therapeutic agent (quinine, as the base and hydrochloride salt). Furthermore, the glass transition temperatures (Tg) of the cooled extrudates produced using capillary rheometry were characterised using Dynamic Mechanical Thermal Analysis (DMTA) thereby enabling correlations to be drawn between the information derived from capillary rheometry and the glass transition properties of the extrudates. The shear viscosities of E100 and HPC (and their blends) decreased as functions of increasing temperature and shear rates, with the shear viscosity of E100 being significantly greater than that of HPC at all temperatures and shear rates. All platforms were readily processed at shear rates relevant to extrusion (approximately 200-300s(-1)) and injection moulding (approximately 900s(-1)). Quinine base was observed to lower the shear viscosities of E100 and E100/HPC blends during processing and the Tg of extrudates, indicative of plasticisation at processing temperatures and when cooled (i.e. in the solid state). Quinine hydrochloride (20% w/w) increased the shear viscosities of E100 and HPC and their blends during processing and did not affect the Tg of the parent polymer. However, the shear viscosities of these systems were not prohibitive to processing at shear rates relevant to extrusion and injection moulding. As the ratio of E100:HPC increased within the polymer blends the effects of quinine base on the lowering of both shear viscosity and Tg of the polymer blends increased, reflecting the greater solubility of quinine within E100. In conclusion, this study has highlighted the importance of capillary rheometry in identifying processing conditions, polymer miscibility and plasticisation phenomena. Copyright © 2015. Published by Elsevier B.V.
Jones, David S; Margetson, Daniel N; McAllister, Mark S; Andrews, Gavin P
2015-09-30
Given the growing interest in thermal processing methods, this study describes the use of an advanced rheological technique, capillary rheometry, to accurately determine the thermorheological properties of two pharmaceutical polymers, Eudragit E100 (E100) and hydroxypropylcellulose JF (HPC) and their blends, both in the presence and absence of a model therapeutic agent (quinine, as the base and hydrochloride salt). Furthermore, the glass transition temperatures (Tg) of the cooled extrudates produced using capillary rheometry were characterised using Dynamic Mechanical Thermal Analysis (DMTA) thereby enabling correlations to be drawn between the information derived from capillary rheometry and the glass transition properties of the extrudates. The shear viscosities of E100 and HPC (and their blends) decreased as functions of increasing temperature and shear rates, with the shear viscosity of E100 being significantly greater than that of HPC at all temperatures and shear rates. All platforms were readily processed at shear rates relevant to extrusion (approximately 200-300 s(-1)) and injection moulding (approximately 900 s(-1)). Quinine base was observed to lower the shear viscosities of E100 and E100/HPC blends during processing and the Tg of extrudates, indicative of plasticisation at processing temperatures and when cooled (i.e. in the solid state). Quinine hydrochloride (20% w/w) increased the shear viscosities of E100 and HPC and their blends during processing and did not affect the Tg of the parent polymer. However, the shear viscosities of these systems were not prohibitive to processing at shear rates relevant to extrusion and injection moulding. As the ratio of E100:HPC increased within the polymer blends the effects of quinine base on the lowering of both shear viscosity and Tg of the polymer blends increased, reflecting the greater solubility of quinine within E100. In conclusion, this study has highlighted the importance of capillary rheometry in identifying processing conditions, polymer miscibility and plasticisation phenomena. Copyright © 2015. Published by Elsevier B.V.
DOE Office of Scientific and Technical Information (OSTI.GOV)
East, D. R.; Sexton, J.
This was a collaborative effort between Lawrence Livermore National Security, LLC as manager and operator of Lawrence Livermore National Laboratory (LLNL) and IBM TJ Watson Research Center to research, assess feasibility and develop an implementation plan for a High Performance Computing Innovation Center (HPCIC) in the Livermore Valley Open Campus (LVOC). The ultimate goal of this work was to help advance the State of California and U.S. commercial competitiveness in the arena of High Performance Computing (HPC) by accelerating the adoption of computational science solutions, consistent with recent DOE strategy directives. The desired result of this CRADA was a well-researched,more » carefully analyzed market evaluation that would identify those firms in core sectors of the US economy seeking to adopt or expand their use of HPC to become more competitive globally, and to define how those firms could be helped by the HPCIC with IBM as an integral partner.« less
An Integrated Software Package to Enable Predictive Simulation Capabilities
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Yousu; Fitzhenry, Erin B.; Jin, Shuangshuang
The power grid is increasing in complexity due to the deployment of smart grid technologies. Such technologies vastly increase the size and complexity of power grid systems for simulation and modeling. This increasing complexity necessitates not only the use of high-performance-computing (HPC) techniques, but a smooth, well-integrated interplay between HPC applications. This paper presents a new integrated software package that integrates HPC applications and a web-based visualization tool based on a middleware framework. This framework can support the data communication between different applications. Case studies with a large power system demonstrate the predictive capability brought by the integrated software package,more » as well as the better situational awareness provided by the web-based visualization tool in a live mode. Test results validate the effectiveness and usability of the integrated software package.« less
Using Formal Grammars to Predict I/O Behaviors in HPC: The Omnisc'IO Approach
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dorier, Matthieu; Ibrahim, Shadi; Antoniu, Gabriel
2016-08-01
The increasing gap between the computation performance of post-petascale machines and the performance of their I/O subsystem has motivated many I/O optimizations including prefetching, caching, and scheduling. In order to further improve these techniques, modeling and predicting spatial and temporal I/O patterns of HPC applications as they run has become crucial. In this paper we present Omnisc'IO, an approach that builds a grammar-based model of the I/O behavior of HPC applications and uses it to predict when future I/O operations will occur, and where and how much data will be accessed. To infer grammars, Omnisc'IO is based on StarSequitur, amore » novel algorithm extending Nevill-Manning's Sequitur algorithm. Omnisc'IO is transparently integrated into the POSIX and MPI I/O stacks and does not require any modification in applications or higher-level I/O libraries. It works without any prior knowledge of the application and converges to accurate predictions of any N future I/O operations within a couple of iterations. Its implementation is efficient in both computation time and memory footprint.« less
An efficient framework for Java data processing systems in HPC environments
NASA Astrophysics Data System (ADS)
Fries, Aidan; Castañeda, Javier; Isasi, Yago; Taboada, Guillermo L.; Portell de Mora, Jordi; Sirvent, Raül
2011-11-01
Java is a commonly used programming language, although its use in High Performance Computing (HPC) remains relatively low. One of the reasons is a lack of libraries offering specific HPC functions to Java applications. In this paper we present a Java-based framework, called DpcbTools, designed to provide a set of functions that fill this gap. It includes a set of efficient data communication functions based on message-passing, thus providing, when a low latency network such as Myrinet is available, higher throughputs and lower latencies than standard solutions used by Java. DpcbTools also includes routines for the launching, monitoring and management of Java applications on several computing nodes by making use of JMX to communicate with remote Java VMs. The Gaia Data Processing and Analysis Consortium (DPAC) is a real case where scientific data from the ESA Gaia astrometric satellite will be entirely processed using Java. In this paper we describe the main elements of DPAC and its usage of the DpcbTools framework. We also assess the usefulness and performance of DpcbTools through its performance evaluation and the analysis of its impact on some DPAC systems deployed in the MareNostrum supercomputer (Barcelona Supercomputing Center).
Exploiting opportunistic resources for ATLAS with ARC CE and the Event Service
NASA Astrophysics Data System (ADS)
Cameron, D.; Filipčič, A.; Guan, W.; Tsulaia, V.; Walker, R.; Wenaus, T.;
2017-10-01
With ever-greater computing needs and fixed budgets, big scientific experiments are turning to opportunistic resources as a means to add much-needed extra computing power. These resources can be very different in design from those that comprise the Grid computing of most experiments, therefore exploiting them requires a change in strategy for the experiment. They may be highly restrictive in what can be run or in connections to the outside world, or tolerate opportunistic usage only on condition that tasks may be terminated without warning. The Advanced Resource Connector Computing Element (ARC CE) with its nonintrusive architecture is designed to integrate resources such as High Performance Computing (HPC) systems into a computing Grid. The ATLAS experiment developed the ATLAS Event Service (AES) primarily to address the issue of jobs that can be terminated at any point when opportunistic computing capacity is needed by someone else. This paper describes the integration of these two systems in order to exploit opportunistic resources for ATLAS in a restrictive environment. In addition to the technical details, results from deployment of this solution in the SuperMUC HPC centre in Munich are shown.
Resilience Design Patterns - A Structured Approach to Resilience at Extreme Scale (version 1.0)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hukerikar, Saurabh; Engelmann, Christian
Reliability is a serious concern for future extreme-scale high-performance computing (HPC) systems. Projections based on the current generation of HPC systems and technology roadmaps suggest that very high fault rates in future systems. The errors resulting from these faults will propagate and generate various kinds of failures, which may result in outcomes ranging from result corruptions to catastrophic application crashes. Practical limits on power consumption in HPC systems will require future systems to embrace innovative architectures, increasing the levels of hardware and software complexities. The resilience challenge for extreme-scale HPC systems requires management of various hardware and software technologies thatmore » are capable of handling a broad set of fault models at accelerated fault rates. These techniques must seek to improve resilience at reasonable overheads to power consumption and performance. While the HPC community has developed various solutions, application-level as well as system-based solutions, the solution space of HPC resilience techniques remains fragmented. There are no formal methods and metrics to investigate and evaluate resilience holistically in HPC systems that consider impact scope, handling coverage, and performance & power eciency across the system stack. Additionally, few of the current approaches are portable to newer architectures and software ecosystems, which are expected to be deployed on future systems. In this document, we develop a structured approach to the management of HPC resilience based on the concept of resilience-based design patterns. A design pattern is a general repeatable solution to a commonly occurring problem. We identify the commonly occurring problems and solutions used to deal with faults, errors and failures in HPC systems. The catalog of resilience design patterns provides designers with reusable design elements. We define a design framework that enhances our understanding of the important constraints and opportunities for solutions deployed at various layers of the system stack. The framework may be used to establish mechanisms and interfaces to coordinate flexible fault management across hardware and software components. The framework also enables optimization of the cost-benefit trade-os among performance, resilience, and power consumption. The overall goal of this work is to enable a systematic methodology for the design and evaluation of resilience technologies in extreme-scale HPC systems that keep scientific applications running to a correct solution in a timely and cost-ecient manner in spite of frequent faults, errors, and failures of various types.« less
Create full-scale predictive economic models on ROI and innovation with performance computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Joseph, Earl C.; Conway, Steve
The U.S. Department of Energy (DOE), the world's largest buyer and user of supercomputers, awarded IDC Research, Inc. a grant to create two macroeconomic models capable of quantifying, respectively, financial and non-financial (innovation) returns on investments in HPC resources. Following a 2013 pilot study in which we created the models and tested them on about 200 real-world HPC cases, DOE authorized us to conduct a full-out, three-year grant study to collect and measure many more examples, a process that would also subject the methodology to further testing and validation. A secondary, "stretch" goal of the full-out study was to advancemore » the methodology from association toward (but not all the way to) causation, by eliminating the effects of some of the other factors that might be contributing, along with HPC investments, to the returns produced in the investigated projects.« less
Appropriate Use Policy | High-Performance Computing | NREL
users of the National Renewable Energy Laboratory (NREL) High Performance Computing (HPC) resources government agency, National Laboratory, University, or private entity, the intellectual property terms (if issued a multifactor token which may be a physical token or a virtual token used with one-time password
Bioinformatics and Astrophysics Cluster (BinAc)
NASA Astrophysics Data System (ADS)
Krüger, Jens; Lutz, Volker; Bartusch, Felix; Dilling, Werner; Gorska, Anna; Schäfer, Christoph; Walter, Thomas
2017-09-01
BinAC provides central high performance computing capacities for bioinformaticians and astrophysicists from the state of Baden-Württemberg. The bwForCluster BinAC is part of the implementation concept for scientific computing for the universities in Baden-Württemberg. Community specific support is offered through the bwHPC-C5 project.
NASA Astrophysics Data System (ADS)
Gregory, A. E.; Benedict, K. K.; Zhang, S.; Savickas, J.
2017-12-01
Large scale, high severity wildfires in forests have become increasingly prevalent in the western United States due to fire exclusion. Although past work has focused on the immediate consequences of wildfire (ie. runoff magnitude and debris flow), little has been done to understand the post wildfire hydrologic consequences of vegetation regrowth. Furthermore, vegetation is often characterized by static parameterizations within hydrological models. In order to understand the temporal relationship between hydrologic processes and revegetation, we modularized and partially automated the hydrologic modeling process to increase connectivity between remotely sensed data, the Virtual Watershed Platform (a data management resource, called the VWP), input meteorological data, and the Precipitation-Runoff Modeling System (PRMS). This process was used to run simulations in the Valles Caldera of NM, an area impacted by the 2011 Las Conchas Fire, in PRMS before and after the Las Conchas to evaluate hydrologic process changes. The modeling environment addressed some of the existing challenges faced by hydrological modelers. At present, modelers are somewhat limited in their ability to push the boundaries of hydrologic understanding. Specific issues faced by modelers include limited computational resources to model processes at large spatial and temporal scales, data storage capacity and accessibility from the modeling platform, computational and time contraints for experimental modeling, and the skills to integrate modeling software in ways that have not been explored. By taking an interdisciplinary approach, we were able to address some of these challenges by leveraging the skills of hydrologic, data, and computer scientists; and the technical capabilities provided by a combination of on-demand/high-performance computing, distributed data, and cloud services. The hydrologic modeling process was modularized to include options for distributing meteorological data, parameter space experimentation, data format transformation, looping, validation of models and containerization for enabling new analytic scenarios. The user interacts with the modules through Jupyter Notebooks which can be connected to an on-demand computing and HPC environment, and data services built as part of the VWP.
Towards Test Driven Development for Computational Science with pFUnit
NASA Technical Reports Server (NTRS)
Rilee, Michael L.; Clune, Thomas L.
2014-01-01
Developers working in Computational Science & Engineering (CSE)/High Performance Computing (HPC) must contend with constant change due to advances in computing technology and science. Test Driven Development (TDD) is a methodology that mitigates software development risks due to change at the cost of adding comprehensive and continuous testing to the development process. Testing frameworks tailored for CSE/HPC, like pFUnit, can lower the barriers to such testing, yet CSE software faces unique constraints foreign to the broader software engineering community. Effective testing of numerical software requires a comprehensive suite of oracles, i.e., use cases with known answers, as well as robust estimates for the unavoidable numerical errors associated with implementation with finite-precision arithmetic. At first glance these concerns often seem exceedingly challenging or even insurmountable for real-world scientific applications. However, we argue that this common perception is incorrect and driven by (1) a conflation between model validation and software verification and (2) the general tendency in the scientific community to develop relatively coarse-grained, large procedures that compound numerous algorithmic steps.We believe TDD can be applied routinely to numerical software if developers pursue fine-grained implementations that permit testing, neatly side-stepping concerns about needing nontrivial oracles as well as the accumulation of errors. We present an example of a successful, complex legacy CSE/HPC code whose development process shares some aspects with TDD, which we contrast with current and potential capabilities. A mix of our proposed methodology and framework support should enable everyday use of TDD by CSE-expert developers.
INFN-Pisa scientific computation environment (GRID, HPC and Interactive Analysis)
NASA Astrophysics Data System (ADS)
Arezzini, S.; Carboni, A.; Caruso, G.; Ciampa, A.; Coscetti, S.; Mazzoni, E.; Piras, S.
2014-06-01
The INFN-Pisa Tier2 infrastructure is described, optimized not only for GRID CPU and Storage access, but also for a more interactive use of the resources in order to provide good solutions for the final data analysis step. The Data Center, equipped with about 6700 production cores, permits the use of modern analysis techniques realized via advanced statistical tools (like RooFit and RooStat) implemented in multicore systems. In particular a POSIX file storage access integrated with standard SRM access is provided. Therefore the unified storage infrastructure is described, based on GPFS and Xrootd, used both for SRM data repository and interactive POSIX access. Such a common infrastructure allows a transparent access to the Tier2 data to the users for their interactive analysis. The organization of a specialized many cores CPU facility devoted to interactive analysis is also described along with the login mechanism integrated with the INFN-AAI (National INFN Infrastructure) to extend the site access and use to a geographical distributed community. Such infrastructure is used also for a national computing facility in use to the INFN theoretical community, it enables a synergic use of computing and storage resources. Our Center initially developed for the HEP community is now growing and includes also HPC resources fully integrated. In recent years has been installed and managed a cluster facility (1000 cores, parallel use via InfiniBand connection) and we are now updating this facility that will provide resources for all the intermediate level HPC computing needs of the INFN theoretical national community.
U.S. Army Research Laboratory (ARL) multimodal signatures database
NASA Astrophysics Data System (ADS)
Bennett, Kelly
2008-04-01
The U.S. Army Research Laboratory (ARL) Multimodal Signatures Database (MMSDB) is a centralized collection of sensor data of various modalities that are co-located and co-registered. The signatures include ground and air vehicles, personnel, mortar, artillery, small arms gunfire from potential sniper weapons, explosives, and many other high value targets. This data is made available to Department of Defense (DoD) and DoD contractors, Intel agencies, other government agencies (OGA), and academia for use in developing target detection, tracking, and classification algorithms and systems to protect our Soldiers. A platform independent Web interface disseminates the signatures to researchers and engineers within the scientific community. Hierarchical Data Format 5 (HDF5) signature models provide an excellent solution for the sharing of complex multimodal signature data for algorithmic development and database requirements. Many open source tools for viewing and plotting HDF5 signatures are available over the Web. Seamless integration of HDF5 signatures is possible in both proprietary computational environments, such as MATLAB, and Free and Open Source Software (FOSS) computational environments, such as Octave and Python, for performing signal processing, analysis, and algorithm development. Future developments include extending the Web interface into a portal system for accessing ARL algorithms and signatures, High Performance Computing (HPC) resources, and integrating existing database and signature architectures into sensor networking environments.
Enabling parallel simulation of large-scale HPC network systems
Mubarak, Misbah; Carothers, Christopher D.; Ross, Robert B.; ...
2016-04-07
Here, with the increasing complexity of today’s high-performance computing (HPC) architectures, simulation has become an indispensable tool for exploring the design space of HPC systems—in particular, networks. In order to make effective design decisions, simulations of these systems must possess the following properties: (1) have high accuracy and fidelity, (2) produce results in a timely manner, and (3) be able to analyze a broad range of network workloads. Most state-of-the-art HPC network simulation frameworks, however, are constrained in one or more of these areas. In this work, we present a simulation framework for modeling two important classes of networks usedmore » in today’s IBM and Cray supercomputers: torus and dragonfly networks. We use the Co-Design of Multi-layer Exascale Storage Architecture (CODES) simulation framework to simulate these network topologies at a flit-level detail using the Rensselaer Optimistic Simulation System (ROSS) for parallel discrete-event simulation. Our simulation framework meets all the requirements of a practical network simulation and can assist network designers in design space exploration. First, it uses validated and detailed flit-level network models to provide an accurate and high-fidelity network simulation. Second, instead of relying on serial time-stepped or traditional conservative discrete-event simulations that limit simulation scalability and efficiency, we use the optimistic event-scheduling capability of ROSS to achieve efficient and scalable HPC network simulations on today’s high-performance cluster systems. Third, our models give network designers a choice in simulating a broad range of network workloads, including HPC application workloads using detailed network traces, an ability that is rarely offered in parallel with high-fidelity network simulations« less
Enabling parallel simulation of large-scale HPC network systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mubarak, Misbah; Carothers, Christopher D.; Ross, Robert B.
Here, with the increasing complexity of today’s high-performance computing (HPC) architectures, simulation has become an indispensable tool for exploring the design space of HPC systems—in particular, networks. In order to make effective design decisions, simulations of these systems must possess the following properties: (1) have high accuracy and fidelity, (2) produce results in a timely manner, and (3) be able to analyze a broad range of network workloads. Most state-of-the-art HPC network simulation frameworks, however, are constrained in one or more of these areas. In this work, we present a simulation framework for modeling two important classes of networks usedmore » in today’s IBM and Cray supercomputers: torus and dragonfly networks. We use the Co-Design of Multi-layer Exascale Storage Architecture (CODES) simulation framework to simulate these network topologies at a flit-level detail using the Rensselaer Optimistic Simulation System (ROSS) for parallel discrete-event simulation. Our simulation framework meets all the requirements of a practical network simulation and can assist network designers in design space exploration. First, it uses validated and detailed flit-level network models to provide an accurate and high-fidelity network simulation. Second, instead of relying on serial time-stepped or traditional conservative discrete-event simulations that limit simulation scalability and efficiency, we use the optimistic event-scheduling capability of ROSS to achieve efficient and scalable HPC network simulations on today’s high-performance cluster systems. Third, our models give network designers a choice in simulating a broad range of network workloads, including HPC application workloads using detailed network traces, an ability that is rarely offered in parallel with high-fidelity network simulations« less
Dynamic updating of hippocampal object representations reflects new conceptual knowledge
Mack, Michael L.; Love, Bradley C.; Preston, Alison R.
2016-01-01
Concepts organize the relationship among individual stimuli or events by highlighting shared features. Often, new goals require updating conceptual knowledge to reflect relationships based on different goal-relevant features. Here, our aim is to determine how hippocampal (HPC) object representations are organized and updated to reflect changing conceptual knowledge. Participants learned two classification tasks in which successful learning required attention to different stimulus features, thus providing a means to index how representations of individual stimuli are reorganized according to changing task goals. We used a computational learning model to capture how people attended to goal-relevant features and organized object representations based on those features during learning. Using representational similarity analyses of functional magnetic resonance imaging data, we demonstrate that neural representations in left anterior HPC correspond with model predictions of concept organization. Moreover, we show that during early learning, when concept updating is most consequential, HPC is functionally coupled with prefrontal regions. Based on these findings, we propose that when task goals change, object representations in HPC can be organized in new ways, resulting in updated concepts that highlight the features most critical to the new goal. PMID:27803320
Large Scale Computing and Storage Requirements for High Energy Physics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gerber, Richard A.; Wasserman, Harvey
2010-11-24
The National Energy Research Scientific Computing Center (NERSC) is the leading scientific computing facility for the Department of Energy's Office of Science, providing high-performance computing (HPC) resources to more than 3,000 researchers working on about 400 projects. NERSC provides large-scale computing resources and, crucially, the support and expertise needed for scientists to make effective use of them. In November 2009, NERSC, DOE's Office of Advanced Scientific Computing Research (ASCR), and DOE's Office of High Energy Physics (HEP) held a workshop to characterize the HPC resources needed at NERSC to support HEP research through the next three to five years. Themore » effort is part of NERSC's legacy of anticipating users needs and deploying resources to meet those demands. The workshop revealed several key points, in addition to achieving its goal of collecting and characterizing computing requirements. The chief findings: (1) Science teams need access to a significant increase in computational resources to meet their research goals; (2) Research teams need to be able to read, write, transfer, store online, archive, analyze, and share huge volumes of data; (3) Science teams need guidance and support to implement their codes on future architectures; and (4) Projects need predictable, rapid turnaround of their computational jobs to meet mission-critical time constraints. This report expands upon these key points and includes others. It also presents a number of case studies as representative of the research conducted within HEP. Workshop participants were asked to codify their requirements in this case study format, summarizing their science goals, methods of solution, current and three-to-five year computing requirements, and software and support needs. Participants were also asked to describe their strategy for computing in the highly parallel, multi-core environment that is expected to dominate HPC architectures over the next few years. The report includes a section that describes efforts already underway or planned at NERSC that address requirements collected at the workshop. NERSC has many initiatives in progress that address key workshop findings and are aligned with NERSC's strategic plans.« less
ERIC Educational Resources Information Center
Congress of the U.S., Washington, DC. Senate Committee on Commerce, Science, and Transportation.
This committee report is intended to accompany S. 1067, a bill designed to provide for a coordinated federal research program in high-performance computing (HPC). The primary objective of the legislation is given as the acceleration of research, development, and application of the most advanced computing technology in research, education, and…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Michael Pernice
2010-09-01
INL has agreed to provide participants in the Nuclear Energy Advanced Mod- eling and Simulation (NEAMS) program with access to its high performance computing (HPC) resources under sponsorship of the Enabling Computational Technologies (ECT) program element. This report documents the process used to select applications and the software stack in place at INL.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Venkata, Manjunath Gorentla; Aderholdt, William F
The pre-exascale systems are expected to have a significant amount of hierarchical and heterogeneous on-node memory, and this trend of system architecture in extreme-scale systems is expected to continue into the exascale era. along with hierarchical-heterogeneous memory, the system typically has a high-performing network ad a compute accelerator. This system architecture is not only effective for running traditional High Performance Computing (HPC) applications (Big-Compute), but also for running data-intensive HPC applications and Big-Data applications. As a consequence, there is a growing desire to have a single system serve the needs of both Big-Compute and Big-Data applications. Though the system architecturemore » supports the convergence of the Big-Compute and Big-Data, the programming models and software layer have yet to evolve to support either hierarchical-heterogeneous memory systems or the convergence. A programming abstraction to address this problem. The programming abstraction is implemented as a software library and runs on pre-exascale and exascale systems supporting current and emerging system architecture. Using distributed data-structures as a central concept, it provides (1) a simple, usable, and portable abstraction for hierarchical-heterogeneous memory and (2) a unified programming abstraction for Big-Compute and Big-Data applications.« less
Mobile high-performance computing (HPC) for synthetic aperture radar signal processing
NASA Astrophysics Data System (ADS)
Misko, Joshua; Kim, Youngsoo; Qi, Chenchen; Sirkeci, Birsen
2018-04-01
The importance of mobile high-performance computing has emerged in numerous battlespace applications at the tactical edge in hostile environments. Energy efficient computing power is a key enabler for diverse areas ranging from real-time big data analytics and atmospheric science to network science. However, the design of tactical mobile data centers is dominated by power, thermal, and physical constraints. Presently, it is very unlikely to achieve required computing processing power by aggregating emerging heterogeneous many-core processing platforms consisting of CPU, Field Programmable Gate Arrays and Graphic Processor cores constrained by power and performance. To address these challenges, we performed a Synthetic Aperture Radar case study for Automatic Target Recognition (ATR) using Deep Neural Networks (DNNs). However, these DNN models are typically trained using GPUs with gigabytes of external memories and massively used 32-bit floating point operations. As a result, DNNs do not run efficiently on hardware appropriate for low power or mobile applications. To address this limitation, we proposed for compressing DNN models for ATR suited to deployment on resource constrained hardware. This proposed compression framework utilizes promising DNN compression techniques including pruning and weight quantization while also focusing on processor features common to modern low-power devices. Following this methodology as a guideline produced a DNN for ATR tuned to maximize classification throughput, minimize power consumption, and minimize memory footprint on a low-power device.
NASA Astrophysics Data System (ADS)
Vilotte, J.-P.; Atkinson, M.; Michelini, A.; Igel, H.; van Eck, T.
2012-04-01
Increasingly dense seismic and geodetic networks are continuously transmitting a growing wealth of data from around the world. The multi-use of these data leaded the seismological community to pioneer globally distributed open-access data infrastructures, standard services and formats, e.g., the Federation of Digital Seismic Networks (FDSN) and the European Integrated Data Archives (EIDA). Our ability to acquire observational data outpaces our ability to manage, analyze and model them. Research in seismology is today facing a fundamental paradigm shift. Enabling advanced data-intensive analysis and modeling applications challenges conventional storage, computation and communication models and requires a new holistic approach. It is instrumental to exploit the cornucopia of data, and to guarantee optimal operation and design of the high-cost monitoring facilities. The strategy of VERCE is driven by the needs of the seismological data-intensive applications in data analysis and modeling. It aims to provide a comprehensive architecture and framework adapted to the scale and the diversity of those applications, and integrating the data infrastructures with Grid, Cloud and HPC infrastructures. It will allow prototyping solutions for new use cases as they emerge within the European Plate Observatory Systems (EPOS), the ESFRI initiative of the solid Earth community. Computational seismology, and information management, is increasingly revolving around massive amounts of data that stem from: (1) the flood of data from the observational systems; (2) the flood of data from large-scale simulations and inversions; (3) the ability to economically store petabytes of data online; (4) the evolving Internet and Data-aware computing capabilities. As data-intensive applications are rapidly increasing in scale and complexity, they require additional services-oriented architectures offering a virtualization-based flexibility for complex and re-usable workflows. Scientific information management poses computer science challenges: acquisition, organization, query and visualization tasks scale almost linearly with the data volumes. Commonly used FTP-GREP metaphor allows today to scan gigabyte-sized datasets but will not work for scanning terabyte-sized continuous waveform datasets. New data analysis and modeling methods, exploiting the signal coherence within dense network arrays, are nonlinear. Pair-algorithms on N points scale as N2. Wave form inversion and stochastic simulations raise computing and data handling challenges These applications are unfeasible for tera-scale datasets without new parallel algorithms that use near-linear processing, storage and bandwidth, and that can exploit new computing paradigms enabled by the intersection of several technologies (HPC, parallel scalable database crawler, data-aware HPC). This issues will be discussed based on a number of core pilot data-intensive applications and use cases retained in VERCE. This core applications are related to: (1) data processing and data analysis methods based on correlation techniques; (2) cpu-intensive applications such as large-scale simulation of synthetic waveforms in complex earth systems, and full waveform inversion and tomography. We shall analyze their workflow and data flow, and their requirements for a new service-oriented architecture and a data-aware platform with services and tools. Finally, we will outline the importance of a new collaborative environment between seismology and computer science, together with the need for the emergence and the recognition of 'research technologists' mastering the evolving data-aware technologies and the data-intensive research goals in seismology.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fadika, Zacharia; Dede, Elif; Govindaraju, Madhusudhan
MapReduce is increasingly becoming a popular framework, and a potent programming model. The most popular open source implementation of MapReduce, Hadoop, is based on the Hadoop Distributed File System (HDFS). However, as HDFS is not POSIX compliant, it cannot be fully leveraged by applications running on a majority of existing HPC environments such as Teragrid and NERSC. These HPC environments typicallysupport globally shared file systems such as NFS and GPFS. On such resourceful HPC infrastructures, the use of Hadoop not only creates compatibility issues, but also affects overall performance due to the added overhead of the HDFS. This paper notmore » only presents a MapReduce implementation directly suitable for HPC environments, but also exposes the design choices for better performance gains in those settings. By leveraging inherent distributed file systems' functions, and abstracting them away from its MapReduce framework, MARIANE (MApReduce Implementation Adapted for HPC Environments) not only allows for the use of the model in an expanding number of HPCenvironments, but also allows for better performance in such settings. This paper shows the applicability and high performance of the MapReduce paradigm through MARIANE, an implementation designed for clustered and shared-disk file systems and as such not dedicated to a specific MapReduce solution. The paper identifies the components and trade-offs necessary for this model, and quantifies the performance gains exhibited by our approach in distributed environments over Apache Hadoop in a data intensive setting, on the Magellan testbed at the National Energy Research Scientific Computing Center (NERSC).« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shao, Yan-Lin, E-mail: yanlin.shao@dnvgl.com; Faltinsen, Odd M.
2014-10-01
We propose a new efficient and accurate numerical method based on harmonic polynomials to solve boundary value problems governed by 3D Laplace equation. The computational domain is discretized by overlapping cells. Within each cell, the velocity potential is represented by the linear superposition of a complete set of harmonic polynomials, which are the elementary solutions of Laplace equation. By its definition, the method is named as Harmonic Polynomial Cell (HPC) method. The characteristics of the accuracy and efficiency of the HPC method are demonstrated by studying analytical cases. Comparisons will be made with some other existing boundary element based methods,more » e.g. Quadratic Boundary Element Method (QBEM) and the Fast Multipole Accelerated QBEM (FMA-QBEM) and a fourth order Finite Difference Method (FDM). To demonstrate the applications of the method, it is applied to some studies relevant for marine hydrodynamics. Sloshing in 3D rectangular tanks, a fully-nonlinear numerical wave tank, fully-nonlinear wave focusing on a semi-circular shoal, and the nonlinear wave diffraction of a bottom-mounted cylinder in regular waves are studied. The comparisons with the experimental results and other numerical results are all in satisfactory agreement, indicating that the present HPC method is a promising method in solving potential-flow problems. The underlying procedure of the HPC method could also be useful in other fields than marine hydrodynamics involved with solving Laplace equation.« less
Kim, Gyungock; Park, Hyundai; Joo, Jiho; Jang, Ki-Seok; Kwack, Myung-Joon; Kim, Sanghoon; Gyoo Kim, In; Hyuk Oh, Jin; Ae Kim, Sun; Park, Jaegyu; Kim, Sanggi
2015-01-01
When silicon photonic integrated circuits (PICs), defined for transmitting and receiving optical data, are successfully monolithic-integrated into major silicon electronic chips as chip-level optical I/Os (inputs/outputs), it will bring innovative changes in data computing and communications. Here, we propose new photonic integration scheme, a single-chip optical transceiver based on a monolithic-integrated vertical photonic I/O device set including light source on bulk-silicon. This scheme can solve the major issues which impede practical implementation of silicon-based chip-level optical interconnects. We demonstrated a prototype of a single-chip photonic transceiver with monolithic-integrated vertical-illumination type Ge-on-Si photodetectors and VCSELs-on-Si on the same bulk-silicon substrate operating up to 50 Gb/s and 20 Gb/s, respectively. The prototype realized 20 Gb/s low-power chip-level optical interconnects for λ ~ 850 nm between fabricated chips. This approach can have a significant impact on practical electronic-photonic integration in high performance computers (HPC), cpu-memory interface, hybrid memory cube, and LAN, SAN, data center and network applications. PMID:26061463
Sirocco Storage Server v. pre-alpha 0.1
DOE Office of Scientific and Technical Information (OSTI.GOV)
Curry, Matthew L.; Danielson, Geoffrey; Ward, H. Lee
Sirocco is a parallel storage system under development, designed for write-intensive workloads on large-scale HPC platforms. It implements a keyvalue object store on top of a set of loosely federated storage servers that cooperate to ensure data integrity and performance. It includes support for a range of different types of storage transactions. This software release constitutes a conformant storage server, along with the client-side libraries to access the storage over a network.
Flexible Description and Adaptive Processing of Earth Observation Data through the BigEarth Platform
NASA Astrophysics Data System (ADS)
Gorgan, Dorian; Bacu, Victor; Stefanut, Teodor; Nandra, Cosmin; Mihon, Danut
2016-04-01
The Earth Observation data repositories extending periodically by several terabytes become a critical issue for organizations. The management of the storage capacity of such big datasets, accessing policy, data protection, searching, and complex processing require high costs that impose efficient solutions to balance the cost and value of data. Data can create value only when it is used, and the data protection has to be oriented toward allowing innovation that sometimes depends on creative people, which achieve unexpected valuable results through a flexible and adaptive manner. The users need to describe and experiment themselves different complex algorithms through analytics in order to valorize data. The analytics uses descriptive and predictive models to gain valuable knowledge and information from data analysis. Possible solutions for advanced processing of big Earth Observation data are given by the HPC platforms such as cloud. With platforms becoming more complex and heterogeneous, the developing of applications is even harder and the efficient mapping of these applications to a suitable and optimum platform, working on huge distributed data repositories, is challenging and complex as well, even by using specialized software services. From the user point of view, an optimum environment gives acceptable execution times, offers a high level of usability by hiding the complexity of computing infrastructure, and supports an open accessibility and control to application entities and functionality. The BigEarth platform [1] supports the entire flow of flexible description of processing by basic operators and adaptive execution over cloud infrastructure [2]. The basic modules of the pipeline such as the KEOPS [3] set of basic operators, the WorDeL language [4], the Planner for sequential and parallel processing, and the Executor through virtual machines, are detailed as the main components of the BigEarth platform [5]. The presentation exemplifies the development of some Earth Observation oriented applications based on flexible description of processing, and adaptive and portable execution over Cloud infrastructure. Main references for further information: [1] BigEarth project, http://cgis.utcluj.ro/projects/bigearth [2] Gorgan, D., "Flexible and Adaptive Processing of Earth Observation Data over High Performance Computation Architectures", International Conference and Exhibition Satellite 2015, August 17-19, Houston, Texas, USA. [3] Mihon, D., Bacu, V., Colceriu, V., Gorgan, D., "Modeling of Earth Observation Use Cases through the KEOPS System", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp. 455-460, (2015). [4] Nandra, C., Gorgan, D., "Workflow Description Language for Defining Big Earth Data Processing Tasks", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp. 461-468, (2015). [5] Bacu, V., Stefan, T., Gorgan, D., "Adaptive Processing of Earth Observation Data on Cloud Infrastructures Based on Workflow Description", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp.444-454, (2015).
Running R Statistical Computing Environment Software on the Peregrine
for the development of new statistical methodologies and enjoys a large user base. Please consult the distribution details. Natural language support but running in an English locale R is a collaborative project programming paradigms to better leverage modern HPC systems. The CRAN task view for High Performance Computing
Opportunities for nonvolatile memory systems in extreme-scale high-performance computing
Vetter, Jeffrey S.; Mittal, Sparsh
2015-01-12
For extreme-scale high-performance computing systems, system-wide power consumption has been identified as one of the key constraints moving forward, where DRAM main memory systems account for about 30 to 50 percent of a node's overall power consumption. As the benefits of device scaling for DRAM memory slow, it will become increasingly difficult to keep memory capacities balanced with increasing computational rates offered by next-generation processors. However, several emerging memory technologies related to nonvolatile memory (NVM) devices are being investigated as an alternative for DRAM. Moving forward, NVM devices could offer solutions for HPC architectures. Researchers are investigating how to integratemore » these emerging technologies into future extreme-scale HPC systems and how to expose these capabilities in the software stack and applications. In addition, current results show several of these strategies could offer high-bandwidth I/O, larger main memory capacities, persistent data structures, and new approaches for application resilience and output postprocessing, such as transaction-based incremental checkpointing and in situ visualization, respectively.« less
BOWS (bioinformatics open web services) to centralize bioinformatics tools in web services.
Velloso, Henrique; Vialle, Ricardo A; Ortega, J Miguel
2015-06-02
Bioinformaticians face a range of difficulties to get locally-installed tools running and producing results; they would greatly benefit from a system that could centralize most of the tools, using an easy interface for input and output. Web services, due to their universal nature and widely known interface, constitute a very good option to achieve this goal. Bioinformatics open web services (BOWS) is a system based on generic web services produced to allow programmatic access to applications running on high-performance computing (HPC) clusters. BOWS intermediates the access to registered tools by providing front-end and back-end web services. Programmers can install applications in HPC clusters in any programming language and use the back-end service to check for new jobs and their parameters, and then to send the results to BOWS. Programs running in simple computers consume the BOWS front-end service to submit new processes and read results. BOWS compiles Java clients, which encapsulate the front-end web service requisitions, and automatically creates a web page that disposes the registered applications and clients. Bioinformatics open web services registered applications can be accessed from virtually any programming language through web services, or using standard java clients. The back-end can run in HPC clusters, allowing bioinformaticians to remotely run high-processing demand applications directly from their machines.
System-Level Virtualization for High Performance Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vallee, Geoffroy R; Naughton, III, Thomas J; Engelmann, Christian
2008-01-01
System-level virtualization has been a research topic since the 70's but regained popularity during the past few years because of the availability of efficient solution such as Xen and the implementation of hardware support in commodity processors (e.g. Intel-VT, AMD-V). However, a majority of system-level virtualization projects is guided by the server consolidation market. As a result, current virtualization solutions appear to not be suitable for high performance computing (HPC) which is typically based on large-scale systems. On another hand there is significant interest in exploiting virtual machines (VMs) within HPC for a number of other reasons. By virtualizing themore » machine, one is able to run a variety of operating systems and environments as needed by the applications. Virtualization allows users to isolate workloads, improving security and reliability. It is also possible to support non-native environments and/or legacy operating environments through virtualization. In addition, it is possible to balance work loads, use migration techniques to relocate applications from failing machines, and isolate fault systems for repair. This document presents the challenges for the implementation of a system-level virtualization solution for HPC. It also presents a brief survey of the different approaches and techniques to address these challenges.« less
Continuous Security and Configuration Monitoring of HPC Clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garcia-Lomeli, H. D.; Bertsch, A. D.; Fox, D. M.
Continuous security and configuration monitoring of information systems has been a time consuming and laborious task for system administrators at the High Performance Computing (HPC) center. Prior to this project, system administrators had to manually check the settings of thousands of nodes, which required a significant number of hours rendering the old process ineffective and inefficient. This paper explains the application of Splunk Enterprise, a software agent, and a reporting tool in the development of a user application interface to track and report on critical system updates and security compliance status of HPC Clusters. In conjunction with other configuration managementmore » systems, the reporting tool is to provide continuous situational awareness to system administrators of the compliance state of information systems. Our approach consisted of the development, testing, and deployment of an agent to collect any arbitrary information across a massively distributed computing center, and organize that information into a human-readable format. Using Splunk Enterprise, this raw data was then gathered into a central repository and indexed for search, analysis, and correlation. Following acquisition and accumulation, the reporting tool generated and presented actionable information by filtering the data according to command line parameters passed at run time. Preliminary data showed results for over six thousand nodes. Further research and expansion of this tool could lead to the development of a series of agents to gather and report critical system parameters. However, in order to make use of the flexibility and resourcefulness of the reporting tool the agent must conform to specifications set forth in this paper. This project has simplified the way system administrators gather, analyze, and report on the configuration and security state of HPC clusters, maintaining ongoing situational awareness. Rather than querying each cluster independently, compliance checking can be managed from one central location.« less
VisIVO: A Library and Integrated Tools for Large Astrophysical Dataset Exploration
NASA Astrophysics Data System (ADS)
Becciani, U.; Costa, A.; Ersotelos, N.; Krokos, M.; Massimino, P.; Petta, C.; Vitello, F.
2012-09-01
VisIVO provides an integrated suite of tools and services that can be used in many scientific fields. VisIVO development starts in the Virtual Observatory framework. VisIVO allows users to visualize meaningfully highly-complex, large-scale datasets and create movies of these visualizations based on distributed infrastructures. VisIVO supports high-performance, multi-dimensional visualization of large-scale astrophysical datasets. Users can rapidly obtain meaningful visualizations while preserving full and intuitive control of the relevant parameters. VisIVO consists of VisIVO Desktop - a stand-alone application for interactive visualization on standard PCs, VisIVO Server - a platform for high performance visualization, VisIVO Web - a custom designed web portal, VisIVOSmartphone - an application to exploit the VisIVO Server functionality and the latest VisIVO features: VisIVO Library allows a job running on a computational system (grid, HPC, etc.) to produce movies directly with the code internal data arrays without the need to produce intermediate files. This is particularly important when running on large computational facilities, where the user wants to have a look at the results during the data production phase. For example, in grid computing facilities, images can be produced directly in the grid catalogue while the user code is running in a system that cannot be directly accessed by the user (a worker node). The deployment of VisIVO on the DG and gLite is carried out with the support of EDGI and EGI-Inspire projects. Depending on the structure and size of datasets under consideration, the data exploration process could take several hours of CPU for creating customized views and the production of movies could potentially last several days. For this reason an MPI parallel version of VisIVO could play a fundamental role in increasing performance, e.g. it could be automatically deployed on nodes that are MPI aware. A central concept in our development is thus to produce unified code that can run either on serial nodes or in parallel by using HPC oriented grid nodes. Another important aspect, to obtain as high performance as possible, is the integration of VisIVO processes with grid nodes where GPUs are available. We have selected CUDA for implementing a range of computationally heavy modules. VisIVO is supported by EGI-Inspire, EDGI and SCI-BUS projects.
Resilience Design Patterns - A Structured Approach to Resilience at Extreme Scale (version 1.1)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hukerikar, Saurabh; Engelmann, Christian
Reliability is a serious concern for future extreme-scale high-performance computing (HPC) systems. Projections based on the current generation of HPC systems and technology roadmaps suggest the prevalence of very high fault rates in future systems. The errors resulting from these faults will propagate and generate various kinds of failures, which may result in outcomes ranging from result corruptions to catastrophic application crashes. Therefore the resilience challenge for extreme-scale HPC systems requires management of various hardware and software technologies that are capable of handling a broad set of fault models at accelerated fault rates. Also, due to practical limits on powermore » consumption in HPC systems future systems are likely to embrace innovative architectures, increasing the levels of hardware and software complexities. As a result the techniques that seek to improve resilience must navigate the complex trade-off space between resilience and the overheads to power consumption and performance. While the HPC community has developed various resilience solutions, application-level techniques as well as system-based solutions, the solution space of HPC resilience techniques remains fragmented. There are no formal methods and metrics to investigate and evaluate resilience holistically in HPC systems that consider impact scope, handling coverage, and performance & power efficiency across the system stack. Additionally, few of the current approaches are portable to newer architectures and software environments that will be deployed on future systems. In this document, we develop a structured approach to the management of HPC resilience using the concept of resilience-based design patterns. A design pattern is a general repeatable solution to a commonly occurring problem. We identify the commonly occurring problems and solutions used to deal with faults, errors and failures in HPC systems. Each established solution is described in the form of a pattern that addresses concrete problems in the design of resilient systems. The complete catalog of resilience design patterns provides designers with reusable design elements. We also define a framework that enhances a designer's understanding of the important constraints and opportunities for the design patterns to be implemented and deployed at various layers of the system stack. This design framework may be used to establish mechanisms and interfaces to coordinate flexible fault management across hardware and software components. The framework also supports optimization of the cost-benefit trade-offs among performance, resilience, and power consumption. The overall goal of this work is to enable a systematic methodology for the design and evaluation of resilience technologies in extreme-scale HPC systems that keep scientific applications running to a correct solution in a timely and cost-efficient manner in spite of frequent faults, errors, and failures of various types.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Justin; Karra, Satish; Nakshatrala, Kalyana B.
It is well-known that the standard Galerkin formulation, which is often the formulation of choice under the finite element method for solving self-adjoint diffusion equations, does not meet maximum principles and the non-negative constraint for anisotropic diffusion equations. Recently, optimization-based methodologies that satisfy maximum principles and the non-negative constraint for steady-state and transient diffusion-type equations have been proposed. To date, these methodologies have been tested only on small-scale academic problems. The purpose of this paper is to systematically study the performance of the non-negative methodology in the context of high performance computing (HPC). PETSc and TAO libraries are, respectively, usedmore » for the parallel environment and optimization solvers. For large-scale problems, it is important for computational scientists to understand the computational performance of current algorithms available in these scientific libraries. The numerical experiments are conducted on the state-of-the-art HPC systems, and a single-core performance model is used to better characterize the efficiency of the solvers. Furthermore, our studies indicate that the proposed non-negative computational framework for diffusion-type equations exhibits excellent strong scaling for real-world large-scale problems.« less
A Framework for Debugging Geoscience Projects in a High Performance Computing Environment
NASA Astrophysics Data System (ADS)
Baxter, C.; Matott, L.
2012-12-01
High performance computing (HPC) infrastructure has become ubiquitous in today's world with the emergence of commercial cloud computing and academic supercomputing centers. Teams of geoscientists, hydrologists and engineers can take advantage of this infrastructure to undertake large research projects - for example, linking one or more site-specific environmental models with soft computing algorithms, such as heuristic global search procedures, to perform parameter estimation and predictive uncertainty analysis, and/or design least-cost remediation systems. However, the size, complexity and distributed nature of these projects can make identifying failures in the associated numerical experiments using conventional ad-hoc approaches both time- consuming and ineffective. To address these problems a multi-tiered debugging framework has been developed. The framework allows for quickly isolating and remedying a number of potential experimental failures, including: failures in the HPC scheduler; bugs in the soft computing code; bugs in the modeling code; and permissions and access control errors. The utility of the framework is demonstrated via application to a series of over 200,000 numerical experiments involving a suite of 5 heuristic global search algorithms and 15 mathematical test functions serving as cheap analogues for the simulation-based optimization of pump-and-treat subsurface remediation systems.
Chang, Justin; Karra, Satish; Nakshatrala, Kalyana B.
2016-07-26
It is well-known that the standard Galerkin formulation, which is often the formulation of choice under the finite element method for solving self-adjoint diffusion equations, does not meet maximum principles and the non-negative constraint for anisotropic diffusion equations. Recently, optimization-based methodologies that satisfy maximum principles and the non-negative constraint for steady-state and transient diffusion-type equations have been proposed. To date, these methodologies have been tested only on small-scale academic problems. The purpose of this paper is to systematically study the performance of the non-negative methodology in the context of high performance computing (HPC). PETSc and TAO libraries are, respectively, usedmore » for the parallel environment and optimization solvers. For large-scale problems, it is important for computational scientists to understand the computational performance of current algorithms available in these scientific libraries. The numerical experiments are conducted on the state-of-the-art HPC systems, and a single-core performance model is used to better characterize the efficiency of the solvers. Furthermore, our studies indicate that the proposed non-negative computational framework for diffusion-type equations exhibits excellent strong scaling for real-world large-scale problems.« less
The Use of High Performance Computing (HPC) to Strengthen the Development of Army Systems
2011-11-01
accurately predicting the supersonic magus effect about spinning cones, ogive- cylinders , and boat-tailed afterbodies. This work led to the successful...successful computer model of the proposed product or system, one can then build prototypes on the computer and study the effects on the performance of...needed. The NRC report discusses the requirements for effective use of such computing power. One needs “models, algorithms, software, hardware
Resilience Design Patterns: A Structured Approach to Resilience at Extreme Scale
Engelmann, Christian; Hukerikar, Saurabh
2017-09-01
Reliability is a serious concern for future extreme-scale high-performance computing (HPC) systems. Projections based on the current generation of HPC systems and technology roadmaps suggest the prevalence of very high fault rates in future systems. While the HPC community has developed various resilience solutions, application-level techniques as well as system-based solutions, the solution space remains fragmented. There are no formal methods and metrics to integrate the various HPC resilience techniques into composite solutions, nor are there methods to holistically evaluate the adequacy and efficacy of such solutions in terms of their protection coverage, and their performance \\& power efficiency characteristics.more » Additionally, few of the current approaches are portable to newer architectures and software environments that will be deployed on future systems. In this paper, we develop a structured approach to the design, evaluation and optimization of HPC resilience using the concept of design patterns. A design pattern is a general repeatable solution to a commonly occurring problem. We identify the problems caused by various types of faults, errors and failures in HPC systems and the techniques used to deal with these events. Each well-known solution that addresses a specific HPC resilience challenge is described in the form of a pattern. We develop a complete catalog of such resilience design patterns, which may be used by system architects, system software and tools developers, application programmers, as well as users and operators as essential building blocks when designing and deploying resilience solutions. We also develop a design framework that enhances a designer's understanding the opportunities for integrating multiple patterns across layers of the system stack and the important constraints during implementation of the individual patterns. It is also useful for defining mechanisms and interfaces to coordinate flexible fault management across hardware and software components. The resilience patterns and the design framework also enable exploration and evaluation of design alternatives and support optimization of the cost-benefit trade-offs among performance, protection coverage, and power consumption of resilience solutions. Here, the overall goal of this work is to establish a systematic methodology for the design and evaluation of resilience technologies in extreme-scale HPC systems that keep scientific applications running to a correct solution in a timely and cost-efficient manner despite frequent faults, errors, and failures of various types.« less
Resilience Design Patterns: A Structured Approach to Resilience at Extreme Scale
DOE Office of Scientific and Technical Information (OSTI.GOV)
Engelmann, Christian; Hukerikar, Saurabh
Reliability is a serious concern for future extreme-scale high-performance computing (HPC) systems. Projections based on the current generation of HPC systems and technology roadmaps suggest the prevalence of very high fault rates in future systems. While the HPC community has developed various resilience solutions, application-level techniques as well as system-based solutions, the solution space remains fragmented. There are no formal methods and metrics to integrate the various HPC resilience techniques into composite solutions, nor are there methods to holistically evaluate the adequacy and efficacy of such solutions in terms of their protection coverage, and their performance \\& power efficiency characteristics.more » Additionally, few of the current approaches are portable to newer architectures and software environments that will be deployed on future systems. In this paper, we develop a structured approach to the design, evaluation and optimization of HPC resilience using the concept of design patterns. A design pattern is a general repeatable solution to a commonly occurring problem. We identify the problems caused by various types of faults, errors and failures in HPC systems and the techniques used to deal with these events. Each well-known solution that addresses a specific HPC resilience challenge is described in the form of a pattern. We develop a complete catalog of such resilience design patterns, which may be used by system architects, system software and tools developers, application programmers, as well as users and operators as essential building blocks when designing and deploying resilience solutions. We also develop a design framework that enhances a designer's understanding the opportunities for integrating multiple patterns across layers of the system stack and the important constraints during implementation of the individual patterns. It is also useful for defining mechanisms and interfaces to coordinate flexible fault management across hardware and software components. The resilience patterns and the design framework also enable exploration and evaluation of design alternatives and support optimization of the cost-benefit trade-offs among performance, protection coverage, and power consumption of resilience solutions. Here, the overall goal of this work is to establish a systematic methodology for the design and evaluation of resilience technologies in extreme-scale HPC systems that keep scientific applications running to a correct solution in a timely and cost-efficient manner despite frequent faults, errors, and failures of various types.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aderholdt, Ferrol; Caldwell, Blake A.; Hicks, Susan Elaine
High performance computing environments are often used for a wide variety of workloads ranging from simulation, data transformation and analysis, and complex workflows to name just a few. These systems may process data at various security levels but in so doing are often enclaved at the highest security posture. This approach places significant restrictions on the users of the system even when processing data at a lower security level and exposes data at higher levels of confidentiality to a much broader population than otherwise necessary. The traditional approach of isolation, while effective in establishing security enclaves poses significant challenges formore » the use of shared infrastructure in HPC environments. This report details current state-of-the-art in reconfigurable network enclaving through Software Defined Networking (SDN) and Network Function Virtualization (NFV) and their applicability to secure enclaves in HPC environments. SDN and NFV methods are based on a solid foundation of system wide virtualization. The purpose of which is very straight forward, the system administrator can deploy networks that are more amenable to customer needs, and at the same time achieve increased scalability making it easier to increase overall capacity as needed without negatively affecting functionality. The network administration of both the server system and the virtual sub-systems is simplified allowing control of the infrastructure through well-defined APIs (Application Programming Interface). While SDN and NFV technologies offer significant promise in meeting these goals, they also provide the ability to address a significant component of the multi-tenant challenge in HPC environments, namely resource isolation. Traditional HPC systems are built upon scalable high-performance networking technologies designed to meet specific application requirements. Dynamic isolation of resources within these environments has remained difficult to achieve. SDN and NFV methodology provide us with relevant concepts and available open standards based APIs that isolate compute and storage resources within an otherwise common networking infrastructure. Additionally, the integration of the networking APIs within larger system frameworks such as OpenStack provide the tools necessary to establish isolated enclaves dynamically allowing the benefits of HPC while providing a controlled security structure surrounding these systems.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hamlet, Jason R.; Keliiaa, Curtis M.
This report assesses current public domain cyber security practices with respect to cyber indications and warnings. It describes cybersecurity industry and government activities, including cybersecurity tools, methods, practices, and international and government-wide initiatives known to be impacting current practice. Of particular note are the U.S. Government's Trusted Internet Connection (TIC) and 'Einstein' programs, which are serving to consolidate the Government's internet access points and to provide some capability to monitor and mitigate cyber attacks. Next, this report catalogs activities undertaken by various industry and government entities. In addition, it assesses the benchmarks of HPC capability and other HPC attributes thatmore » may lend themselves to assist in the solution of this problem. This report draws few conclusions, as it is intended to assess current practice in preparation for future work, however, no explicit references to HPC usage for the purpose of analyzing cyber infrastructure in near-real-time were found in the current practice. This report and a related SAND2010-4766 National Cyber Defense High Performance Computing and Analysis: Concepts, Planning and Roadmap report are intended to provoke discussion throughout a broad audience about developing a cohesive HPC centric solution to wide-area cybersecurity problems.« less
PuTTY | High-Performance Computing | NREL
PuTTY PuTTY Learn how to use PuTTY to connect to NREL's high-performance computing (HPC) systems . Connecting When you start the PuTTY app, the program will display PuTTY's Configuration menu. When this comes
Evolution of Embedded Processing for Wide Area Surveillance
2014-01-01
future vision . 15. SUBJECT TERMS Embedded processing; high performance computing; general-purpose graphical processing units (GPGPUs) 16. SECURITY...recon- naissance (ISR) mission capabilities. The capabilities these advancements are achieving include the ability to provide persistent all...fighters to support and positively affect their mission . Significant improvements in high-performance computing (HPC) technology make it possible to
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tan, Li; Chen, Zizhong; Song, Shuaiwen
2016-01-18
Energy efficiency and resilience are two crucial challenges for HPC systems to reach exascale. While energy efficiency and resilience issues have been extensively studied individually, little has been done to understand the interplay between energy efficiency and resilience for HPC systems. Decreasing the supply voltage associated with a given operating frequency for processors and other CMOS-based components can significantly reduce power consumption. However, this often raises system failure rates and consequently increases application execution time. In this work, we present an energy saving undervolting approach that leverages the mainstream resilience techniques to tolerate the increased failures caused by undervolting.
Investigating the Interplay between Energy Efficiency and Resilience in High Performance Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tan, Li; Song, Shuaiwen; Wu, Panruo
2015-05-29
Energy efficiency and resilience are two crucial challenges for HPC systems to reach exascale. While energy efficiency and resilience issues have been extensively studied individually, little has been done to understand the interplay between energy efficiency and resilience for HPC systems. Decreasing the supply voltage associated with a given operating frequency for processors and other CMOS-based components can significantly reduce power consumption. However, this often raises system failure rates and consequently increases application execution time. In this work, we present an energy saving undervolting approach that leverages the mainstream resilience techniques to tolerate the increased failures caused by undervolting.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tan, Li; Chen, Zizhong; Song, Shuaiwen Leon
2015-11-16
Energy efficiency and resilience are two crucial challenges for HPC systems to reach exascale. While energy efficiency and resilience issues have been extensively studied individually, little has been done to understand the interplay between energy efficiency and resilience for HPC systems. Decreasing the supply voltage associated with a given operating frequency for processors and other CMOS-based components can significantly reduce power consumption. However, this often raises system failure rates and consequently increases application execution time. In this work, we present an energy saving undervolting approach that leverages the mainstream resilience techniques to tolerate the increased failures caused by undervolting.
Enteric-coating of pulsatile-release HPC capsules prepared by injection molding.
Macchi, E; Zema, L; Maroni, A; Gazzaniga, A; Felton, L A
2015-04-05
Capsular devices based on hydroxypropyl cellulose (Klucel® LF) intended for pulsatile release were prepared by injection molding (IM). In the present work, the possibility of exploiting such capsules for the development of colonic delivery systems based on a time-dependent approach was evaluated. For this purpose, it was necessary to demonstrate the ability of molded cores to undergo a coating process and that coated systems yield the desired performance (gastric resistance). Although no information was available on the coating of IM substrates, some issues relevant to that of commercially-available capsules are known. Thus, preliminary studies were conducted on molded disks for screening purposes prior to the spray-coating of HPC capsular cores with Eudragit® L 30 D 55. The ability of the polymeric suspension to wet the substrate, spread, start penetrating and initiate hydration/swelling, as well as to provide a gastroresistant barrier was demonstrated. The coating of prototype HPC capsules was carried out successfully, leading to coated systems with good technological properties and able to withstand the acidic medium with no need for sealing at the cap/body joint. Such systems maintained the original pulsatile release performance after dissolution of the enteric film in pH 6.8 fluid. Therefore, they appeared potentially suitable for the development of a colon delivery platform based on a time-dependent approach. Copyright © 2015 Elsevier B.V. All rights reserved.
Role of HPC in Advancing Computational Aeroelasticity
NASA Technical Reports Server (NTRS)
Guruswamy, Guru P.
2004-01-01
On behalf of the High Performance Computing and Modernization Program (HPCMP) and NASA Advanced Supercomputing Division (NAS) a study is conducted to assess the role of supercomputers on computational aeroelasticity of aerospace vehicles. The study is mostly based on the responses to a web based questionnaire that was designed to capture the nuances of high performance computational aeroelasticity, particularly on parallel computers. A procedure is presented to assign a fidelity-complexity index to each application. Case studies based on major applications using HPCMP resources are presented.
Yokohama, Noriya
2013-07-01
This report was aimed at structuring the design of architectures and studying performance measurement of a parallel computing environment using a Monte Carlo simulation for particle therapy using a high performance computing (HPC) instance within a public cloud-computing infrastructure. Performance measurements showed an approximately 28 times faster speed than seen with single-thread architecture, combined with improved stability. A study of methods of optimizing the system operations also indicated lower cost.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karthik, Rajasekar
2014-01-01
In this paper, an architecture for building Scalable And Mobile Environment For High-Performance Computing with spatial capabilities called SAME4HPC is described using cutting-edge technologies and standards such as Node.js, HTML5, ECMAScript 6, and PostgreSQL 9.4. Mobile devices are increasingly becoming powerful enough to run high-performance apps. At the same time, there exist a significant number of low-end and older devices that rely heavily on the server or the cloud infrastructure to do the heavy lifting. Our architecture aims to support both of these types of devices to provide high-performance and rich user experience. A cloud infrastructure consisting of OpenStack withmore » Ubuntu, GeoServer, and high-performance JavaScript frameworks are some of the key open-source and industry standard practices that has been adopted in this architecture.« less
Computational challenges in atomic, molecular and optical physics.
Taylor, Kenneth T
2002-06-15
Six challenges are discussed. These are the laser-driven helium atom; the laser-driven hydrogen molecule and hydrogen molecular ion; electron scattering (with ionization) from one-electron atoms; the vibrational and rotational structure of molecules such as H(3)(+) and water at their dissociation limits; laser-heated clusters; and quantum degeneracy and Bose-Einstein condensation. The first four concern fundamental few-body systems where use of high-performance computing (HPC) is currently making possible accurate modelling from first principles. This leads to reliable predictions and support for laboratory experiment as well as true understanding of the dynamics. Important aspects of these challenges addressable only via a terascale facility are set out. Such a facility makes the last two challenges in the above list meaningfully accessible for the first time, and the scientific interest together with the prospective role for HPC in these is emphasized.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aderholdt, Ferrol; Caldwell, Blake A.; Hicks, Susan Elaine
High performance computing environments are often used for a wide variety of workloads ranging from simulation, data transformation and analysis, and complex workflows to name just a few. These systems may process data at various security levels but in so doing are often enclaved at the highest security posture. This approach places significant restrictions on the users of the system even when processing data at a lower security level and exposes data at higher levels of confidentiality to a much broader population than otherwise necessary. The traditional approach of isolation, while effective in establishing security enclaves poses significant challenges formore » the use of shared infrastructure in HPC environments. This report details current state-of-the-art in virtualization, reconfigurable network enclaving via Software Defined Networking (SDN), and storage architectures and bridging techniques for creating secure enclaves in HPC environments.« less
Experience Paper: Software Engineering and Community Codes Track in ATPESC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dubey, Anshu; Riley, Katherine M.
Argonne Training Program in Extreme Scale Computing (ATPESC) was started by the Argonne National Laboratory with the objective of expanding the ranks of better prepared users of high performance computing (HPC) machines. One of the unique aspects of the program was inclusion of software engineering and community codes track. The inclusion was motivated by the observation that the projects with a good scientific and software process were better able to meet their scientific goals. In this paper we present our experience of running the software track from the beginning of the program until now. We discuss the motivations, the reception,more » and the evolution of the track over the years. We welcome discussion and input from the community to enhance the track in ATPESC, and also to facilitate inclusion of similar tracks in other HPC oriented training programs.« less
The OptIPuter microscopy demonstrator: enabling science through a transatlantic lightpath
Ellisman, M.; Hutton, T.; Kirkland, A.; Lin, A.; Lin, C.; Molina, T.; Peltier, S.; Singh, R.; Tang, K.; Trefethen, A.E.; Wallom, D.C.H.; Xiong, X.
2009-01-01
The OptIPuter microscopy demonstrator project has been designed to enable concurrent and remote usage of world-class electron microscopes located in Oxford and San Diego. The project has constructed a network consisting of microscopes and computational and data resources that are all connected by a dedicated network infrastructure using the UK Lightpath and US Starlight systems. Key science drivers include examples from both materials and biological science. The resulting system is now a permanent link between the Oxford and San Diego microscopy centres. This will form the basis of further projects between the sites and expansion of the types of systems that can be remotely controlled, including optical, as well as electron, microscopy. Other improvements will include the updating of the Microsoft cluster software to the high performance computing (HPC) server 2008, which includes the HPC basic profile implementation that will enable the development of interoperable clients. PMID:19487201
The OptIPuter microscopy demonstrator: enabling science through a transatlantic lightpath.
Ellisman, M; Hutton, T; Kirkland, A; Lin, A; Lin, C; Molina, T; Peltier, S; Singh, R; Tang, K; Trefethen, A E; Wallom, D C H; Xiong, X
2009-07-13
The OptIPuter microscopy demonstrator project has been designed to enable concurrent and remote usage of world-class electron microscopes located in Oxford and San Diego. The project has constructed a network consisting of microscopes and computational and data resources that are all connected by a dedicated network infrastructure using the UK Lightpath and US Starlight systems. Key science drivers include examples from both materials and biological science. The resulting system is now a permanent link between the Oxford and San Diego microscopy centres. This will form the basis of further projects between the sites and expansion of the types of systems that can be remotely controlled, including optical, as well as electron, microscopy. Other improvements will include the updating of the Microsoft cluster software to the high performance computing (HPC) server 2008, which includes the HPC basic profile implementation that will enable the development of interoperable clients.
A Heterogeneous High-Performance System for Computational and Computer Science
2016-11-15
Patents Submitted Patents Awarded Awards Graduate Students Names of Post Doctorates Names of Faculty Supported Names of Under Graduate students supported...team of research faculty from the departments of computer science and natural science at Bowie State University. The supercomputer is not only to...accelerated HPC systems. The supercomputer is also ideal for the research conducted in the Department of Natural Science, as research faculty work on
NASA's Climate in a Box: Desktop Supercomputing for Open Scientific Model Development
NASA Astrophysics Data System (ADS)
Wojcik, G. S.; Seablom, M. S.; Lee, T. J.; McConaughy, G. R.; Syed, R.; Oloso, A.; Kemp, E. M.; Greenseid, J.; Smith, R.
2009-12-01
NASA's High Performance Computing Portfolio in cooperation with its Modeling, Analysis, and Prediction program intends to make its climate and earth science models more accessible to a larger community. A key goal of this effort is to open the model development and validation process to the scientific community at large such that a natural selection process is enabled and results in a more efficient scientific process. One obstacle to others using NASA models is the complexity of the models and the difficulty in learning how to use them. This situation applies not only to scientists who regularly use these models but also non-typical users who may want to use the models such as scientists from different domains, policy makers, and teachers. Another obstacle to the use of these models is that access to high performance computing (HPC) accounts, from which the models are implemented, can be restrictive with long wait times in job queues and delays caused by an arduous process of obtaining an account, especially for foreign nationals. This project explores the utility of using desktop supercomputers in providing a complete ready-to-use toolkit of climate research products to investigators and on demand access to an HPC system. One objective of this work is to pre-package NASA and NOAA models so that new users will not have to spend significant time porting the models. In addition, the prepackaged toolkit will include tools, such as workflow, visualization, social networking web sites, and analysis tools, to assist users in running the models and analyzing the data. The system architecture to be developed will allow for automatic code updates for each user and an effective means with which to deal with data that are generated. We plan to investigate several desktop systems, but our work to date has focused on a Cray CX1. Currently, we are investigating the potential capabilities of several non-traditional development environments. While most NASA and NOAA models are designed for Linux operating systems (OS), the arrival of the WindowsHPC 2008 OS provides the opportunity to evaluate the use of a new platform on which to develop and port climate and earth science models. In particular, we are evaluating Microsoft's Visual Studio Integrated Developer Environment to determine its appropriateness for the climate modeling community. In the initial phases of this project, we have ported GEOS-5, WRF, GISS ModelE, and GFS to Linux on a CX1 and are in the process of porting WRF and ModelE to WindowsHPC 2008. Initial tests on the CX1 Linux OS indicate favorable comparisons in terms of performance and consistency of scientific results when compared with experiments executed on NASA high end systems. As in the past, NASA's large clusters will continue to be an important part of our objectives. We envision a seamless environment in which an investigator performs model development and testing on a desktop system and can seamlessly transfer execution to supercomputer clusters for production.
Final Report Extreme Computing and U.S. Competitiveness DOE Award. DE-FG02-11ER26087/DE-SC0008764
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mustain, Christopher J.
The Council has acted on each of the grant deliverables during the funding period. The deliverables are: (1) convening the Council’s High Performance Computing Advisory Committee (HPCAC) on a bi-annual basis; (2) broadening public awareness of high performance computing (HPC) and exascale developments; (3) assessing the industrial applications of extreme computing; and (4) establishing a policy and business case for an exascale economy.
GPU Implementation of High Rayleigh Number Three-Dimensional Mantle Convection
NASA Astrophysics Data System (ADS)
Sanchez, D. A.; Yuen, D. A.; Wright, G. B.; Barnett, G. A.
2010-12-01
Although we have entered the age of petascale computing, many factors are still prohibiting high-performance computing (HPC) from infiltrating all suitable scientific disciplines. For this reason and others, application of GPU to HPC is gaining traction in the scientific world. With its low price point, high performance potential, and competitive scalability, GPU has been an option well worth considering for the last few years. Moreover with the advent of NVIDIA's Fermi architecture, which brings ECC memory, better double-precision performance, and more RAM to GPU, there is a strong message of corporate support for GPU in HPC. However many doubts linger concerning the practicality of using GPU for scientific computing. In particular, GPU has a reputation for being difficult to program and suitable for only a small subset of problems. Although inroads have been made in addressing these concerns, for many scientists GPU still has hurdles to clear before becoming an acceptable choice. We explore the applicability of GPU to geophysics by implementing a three-dimensional, second-order finite-difference model of Rayleigh-Benard thermal convection on an NVIDIA GPU using C for CUDA. Our code reaches sufficient resolution, on the order of 500x500x250 evenly-spaced finite-difference gridpoints, on a single GPU. We make extensive use of highly optimized CUBLAS routines, allowing us to achieve performance on the order of O( 0.1 ) µs per timestep*gridpoint at this resolution. This performance has allowed us to study high Rayleigh number simulations, on the order of 2x10^7, on a single GPU.
HPC USER WORKSHOP - JUNE 12TH | High-Performance Computing | NREL
to CentOS 7, changes to modules management, Singularity and containers on Peregrine, and using of changes, with the remaining two hours dedicated to demos and one-on-one interaction as needed
Innovative HPC architectures for the study of planetary plasma environments
NASA Astrophysics Data System (ADS)
Amaya, Jorge; Wolf, Anna; Lembège, Bertrand; Zitz, Anke; Alvarez, Damian; Lapenta, Giovanni
2016-04-01
DEEP-ER is an European Commission founded project that develops a new type of High Performance Computer architecture. The revolutionary system is currently used by KU Leuven to study the effects of the solar wind on the global environments of the Earth and Mercury. The new architecture combines the versatility of Intel Xeon computing nodes with the power of the upcoming Intel Xeon Phi accelerators. Contrary to classical heterogeneous HPC architectures, where it is customary to find CPU and accelerators in the same computing nodes, in the DEEP-ER system CPU nodes are grouped together (Cluster) and independently from the accelerator nodes (Booster). The system is equipped with a state of the art interconnection network, a highly scalable and fast I/O and a fail recovery resiliency system. The final objective of the project is to introduce a scalable system that can be used to create the next generation of exascale supercomputers. The code iPic3D from KU Leuven is being adapted to this new architecture. This particle-in-cell code can now perform the computation of the electromagnetic fields in the Cluster while the particles are moved in the Booster side. Using fast and scalable Xeon Phi accelerators in the Booster we can introduce many more particles per cell in the simulation than what is possible in the current generation of HPC systems, allowing to calculate fully kinetic plasmas with very low interpolation noise. The system will be used to perform fully kinetic, low noise, 3D simulations of the interaction of the solar wind with the magnetosphere of the Earth and Mercury. Preliminary simulations have been performed in other HPC centers in order to compare the results in different systems. In this presentation we show the complexity of the plasma flow around the planets, including the development of hydrodynamic instabilities at the flanks, the presence of the collision-less shock, the magnetosheath, the magnetopause, reconnection zones, the formation of the plasma sheet and the magnetotail, and the variation of ion/electron plasma flows when crossing these frontiers. The simulations also give access to detailed information about the particle dynamics and their velocity distribution at locations that can be used for comparison with satellite data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brust, Frederick W.; Punch, Edward F.; Twombly, Elizabeth Kurth
This report summarizes the final product developed for the US DOE Small Business Innovation Research (SBIR) Phase II grant made to Engineering Mechanics Corporation of Columbus (Emc 2) between April 16, 2014 and August 31, 2016 titled ‘Adoption of High Performance Computational (HPC) Modeling Software for Widespread Use in the Manufacture of Welded Structures’. Many US companies have moved fabrication and production facilities off shore because of cheaper labor costs. A key aspect in bringing these jobs back to the US is the use of technology to render US-made fabrications more cost-efficient overall with higher quality. One significant advantage thatmore » has emerged in the US over the last two decades is the use of virtual design for fabrication of small and large structures in weld fabrication industries. Industries that use virtual design and analysis tools have reduced material part size, developed environmentally-friendly fabrication processes, improved product quality and performance, and reduced manufacturing costs. Indeed, Caterpillar Inc. (CAT), one of the partners in this effort, continues to have a large fabrication presence in the US because of the use of weld fabrication modeling to optimize fabrications by controlling weld residual stresses and distortions and improving fatigue, corrosion, and fracture performance. This report describes Emc 2’s DOE SBIR Phase II final results to extend an existing, state-of-the-art software code, Virtual Fabrication Technology (VFT®), currently used to design and model large welded structures prior to fabrication - to a broader range of products with widespread applications for small and medium-sized enterprises (SMEs). VFT® helps control distortion, can minimize and/or control residual stresses, control welding microstructure, and pre-determine welding parameters such as weld-sequencing, pre-bending, thermal-tensioning, etc. VFT® uses material properties, consumable properties, etc. as inputs. Through VFT®, manufacturing companies can avoid costly design changes after fabrication. This leads to the concept of joint design/fabrication where these important disciplines are intimately linked to minimize fabrication costs. Finally service performance (such as fatigue, corrosion, and fracture/damage) can be improved using this product. Emc 2’s DOE SBIR Phase II effort successfully adapted VFT® to perform efficiently in an HPC environment independent of commercial software on a platform to permit easy and cost effective access to the code. This provides the key for SMEs to access this sophisticated and proven methodology that is quick, accurate, cost effective and available “on-demand” to address weld-simulation and fabrication problems prior to manufacture. In addition, other organizations, such as Government agencies and large companies, may have a need for spot use of such a tool. The open source code, WARP3D, a high performance finite element code used in fracture and damage assessment of structures, was significantly modified so computational weld problems can be solved efficiently on multiple processors and threads with VFT®. The thermal solver for VFT®, based on a series of closed form solution approximations, was extensively enhanced for solution on multiple processors greatly increasing overall speed. In addition, the graphical user interface (GUI) was re-written to permit SMEs access to an HPC environment at the Ohio Super Computer Center (OSC) to integrate these solutions with WARP3D. The GUI is used to define all weld pass descriptions, number of passes, material properties, consumable properties, weld speed, etc. for the structure to be modeled. The GUI was enhanced to make it more user-friendly so that non-experts can perform weld modeling. Finally, an extensive outreach program to market this capability to fabrication companies was performed. This access will permit SMEs to perform weld modeling to improve their competitiveness at a reasonable cost.« less
Hierarchically porous carbon/polyaniline hybrid for use in supercapacitors.
Joo, Min Jae; Yun, Young Soo; Jin, Hyoung-Joon
2014-12-01
A hierarchically porous carbon (HPC)/polyaniline (PANI) hybrid electrode was prepared by the polymerization of PANI on the surface of the HPC via rapid-mixing polymerization. The surface morphologies and chemical composition of the HPC/PANI hybrid electrode were characterized using transmission electron microscopy and X-ray photoelectron spectroscopy (XPS), respectively. The surface morphologies and XPS results for the HPC, PANI and HPC/PANI hybrids indicate that PANI is coated on the surface of HPC in the HPC/PANI hybrids which have two different nitrogen groups as a benzenoid amine (-NH-) peak and positively charged nitrogen (N+) peak. The electrochemical performances of the HPC/PANI hybrids were analyzed by performing cyclic voltammetry and galvanostatic charge-discharge tests. The HPC/PANI hybrids showed a better specific capacitance (222 F/g) than HPC (111 F/g) because of effect of pseudocapacitor behavior. In addition, good cycle stabilities were maintained over 1000 cycles.
Toward Transparent Data Management in Multi-layer Storage Hierarchy for HPC Systems
Wadhwa, Bharti; Byna, Suren; Butt, Ali R.
2018-04-17
Upcoming exascale high performance computing (HPC) systems are expected to comprise multi-tier storage hierarchy, and thus will necessitate innovative storage and I/O mechanisms. Traditional disk and block-based interfaces and file systems face severe challenges in utilizing capabilities of storage hierarchies due to the lack of hierarchy support and semantic interfaces. Object-based and semantically-rich data abstractions for scientific data management on large scale systems offer a sustainable solution to these challenges. Such data abstractions can also simplify users involvement in data movement. Here, we take the first steps of realizing such an object abstraction and explore storage mechanisms for these objectsmore » to enhance I/O performance, especially for scientific applications. We explore how an object-based interface can facilitate next generation scalable computing systems by presenting the mapping of data I/O from two real world HPC scientific use cases: a plasma physics simulation code (VPIC) and a cosmology simulation code (HACC). Our storage model stores data objects in different physical organizations to support data movement across layers of memory/storage hierarchy. Our implementation sclaes well to 16K parallel processes, and compared to the state of the art, such as MPI-IO and HDF5, our object-based data abstractions and data placement strategy in multi-level storage hierarchy achieves up to 7 X I/O performance improvement for scientific data.« less
Toward Transparent Data Management in Multi-layer Storage Hierarchy for HPC Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wadhwa, Bharti; Byna, Suren; Butt, Ali R.
Upcoming exascale high performance computing (HPC) systems are expected to comprise multi-tier storage hierarchy, and thus will necessitate innovative storage and I/O mechanisms. Traditional disk and block-based interfaces and file systems face severe challenges in utilizing capabilities of storage hierarchies due to the lack of hierarchy support and semantic interfaces. Object-based and semantically-rich data abstractions for scientific data management on large scale systems offer a sustainable solution to these challenges. Such data abstractions can also simplify users involvement in data movement. Here, we take the first steps of realizing such an object abstraction and explore storage mechanisms for these objectsmore » to enhance I/O performance, especially for scientific applications. We explore how an object-based interface can facilitate next generation scalable computing systems by presenting the mapping of data I/O from two real world HPC scientific use cases: a plasma physics simulation code (VPIC) and a cosmology simulation code (HACC). Our storage model stores data objects in different physical organizations to support data movement across layers of memory/storage hierarchy. Our implementation sclaes well to 16K parallel processes, and compared to the state of the art, such as MPI-IO and HDF5, our object-based data abstractions and data placement strategy in multi-level storage hierarchy achieves up to 7 X I/O performance improvement for scientific data.« less
Allocation Usage Tracking and Management | High-Performance Computing |
NREL's high-performance computing (HPC) systems, learn how to track and manage your allocations. The alloc_tracker script (/usr/local/bin/alloc_tracker) may be used to see what allocations you have access to, how much of the allocation has been used, how much remains and how many node hours will be forfeited at the
Pope, Bernard J; Fitch, Blake G; Pitman, Michael C; Rice, John J; Reumann, Matthias
2011-10-01
Future multiscale and multiphysics models that support research into human disease, translational medical science, and treatment can utilize the power of high-performance computing (HPC) systems. We anticipate that computationally efficient multiscale models will require the use of sophisticated hybrid programming models, mixing distributed message-passing processes [e.g., the message-passing interface (MPI)] with multithreading (e.g., OpenMP, Pthreads). The objective of this study is to compare the performance of such hybrid programming models when applied to the simulation of a realistic physiological multiscale model of the heart. Our results show that the hybrid models perform favorably when compared to an implementation using only the MPI and, furthermore, that OpenMP in combination with the MPI provides a satisfactory compromise between performance and code complexity. Having the ability to use threads within MPI processes enables the sophisticated use of all processor cores for both computation and communication phases. Considering that HPC systems in 2012 will have two orders of magnitude more cores than what was used in this study, we believe that faster than real-time multiscale cardiac simulations can be achieved on these systems.
Grids, Clouds, and Virtualization
NASA Astrophysics Data System (ADS)
Cafaro, Massimo; Aloisio, Giovanni
This chapter introduces and puts in context Grids, Clouds, and Virtualization. Grids promised to deliver computing power on demand. However, despite a decade of active research, no viable commercial grid computing provider has emerged. On the other hand, it is widely believed - especially in the Business World - that HPC will eventually become a commodity. Just as some commercial consumers of electricity have mission requirements that necessitate they generate their own power, some consumers of computational resources will continue to need to provision their own supercomputers. Clouds are a recent business-oriented development with the potential to render this eventually as rare as organizations that generate their own electricity today, even among institutions who currently consider themselves the unassailable elite of the HPC business. Finally, Virtualization is one of the key technologies enabling many different Clouds. We begin with a brief history in order to put them in context, and recall the basic principles and concepts underlying and clearly differentiating them. A thorough overview and survey of existing technologies provides the basis to delve into details as the reader progresses through the book.
Connecting to HPC VPN | High-Performance Computing | NREL
and password will match your NREL network account login/password. From OS X or Linux, open a terminal finalized. Open a Remote Desktop connection using server name WINHPC02 (this is the login node). Mac Mac
NASA Astrophysics Data System (ADS)
Fasel, Markus
2016-10-01
High-Performance Computing Systems are powerful tools tailored to support large- scale applications that rely on low-latency inter-process communications to run efficiently. By design, these systems often impose constraints on application workflows, such as limited external network connectivity and whole node scheduling, that make more general-purpose computing tasks, such as those commonly found in high-energy nuclear physics applications, more difficult to carry out. In this work, we present a tool designed to simplify access to such complicated environments by handling the common tasks of job submission, software management, and local data management, in a framework that is easily adaptable to the specific requirements of various computing systems. The tool, initially constructed to process stand-alone ALICE simulations for detector and software development, was successfully deployed on the NERSC computing systems, Carver, Hopper and Edison, and is being configured to provide access to the next generation NERSC system, Cori. In this report, we describe the tool and discuss our experience running ALICE applications on NERSC HPC systems. The discussion will include our initial benchmarks of Cori compared to other systems and our attempts to leverage the new capabilities offered with Cori to support data-intensive applications, with a future goal of full integration of such systems into ALICE grid operations.
NASA Technical Reports Server (NTRS)
Warner, James E.; Zubair, Mohammad; Ranjan, Desh
2017-01-01
This work investigates novel approaches to probabilistic damage diagnosis that utilize surrogate modeling and high performance computing (HPC) to achieve substantial computational speedup. Motivated by Digital Twin, a structural health management (SHM) paradigm that integrates vehicle-specific characteristics with continual in-situ damage diagnosis and prognosis, the methods studied herein yield near real-time damage assessments that could enable monitoring of a vehicle's health while it is operating (i.e. online SHM). High-fidelity modeling and uncertainty quantification (UQ), both critical to Digital Twin, are incorporated using finite element method simulations and Bayesian inference, respectively. The crux of the proposed Bayesian diagnosis methods, however, is the reformulation of the numerical sampling algorithms (e.g. Markov chain Monte Carlo) used to generate the resulting probabilistic damage estimates. To this end, three distinct methods are demonstrated for rapid sampling that utilize surrogate modeling and exploit various degrees of parallelism for leveraging HPC. The accuracy and computational efficiency of the methods are compared on the problem of strain-based crack identification in thin plates. While each approach has inherent problem-specific strengths and weaknesses, all approaches are shown to provide accurate probabilistic damage diagnoses and several orders of magnitude computational speedup relative to a baseline Bayesian diagnosis implementation.
Institute for Sustained Performance, Energy, and Resilience (SuPER)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jagode, Heike; Bosilca, George; Danalis, Anthony
The University of Tennessee (UTK) and University of Texas at El Paso (UTEP) partnership supported the three main thrusts of the SUPER project---performance, energy, and resilience. The UTK-UTEP effort thus helped advance the main goal of SUPER, which was to ensure that DOE's computational scientists can successfully exploit the emerging generation of high performance computing (HPC) systems. This goal is being met by providing application scientists with strategies and tools to productively maximize performance, conserve energy, and attain resilience. The primary vehicle through which UTK provided performance measurement support to SUPER and the larger HPC community is the Performance Applicationmore » Programming Interface (PAPI). PAPI is an ongoing project that provides a consistent interface and methodology for collecting hardware performance information from various hardware and software components, including most major CPUs, GPUs and accelerators, interconnects, I/O systems, and power interfaces, as well as virtual cloud environments. The PAPI software is widely used for performance modeling of scientific and engineering applications---for example, the HOMME (High Order Methods Modeling Environment) climate code, and the GAMESS and NWChem computational chemistry codes---on DOE supercomputers. PAPI is widely deployed as middleware for use by higher-level profiling, tracing, and sampling tools (e.g., CrayPat, HPCToolkit, Scalasca, Score-P, TAU, Vampir, PerfExpert), making it the de facto standard for hardware counter analysis. PAPI has established itself as fundamental software infrastructure in every application domain (spanning academia, government, and industry), where improving performance can be mission critical. Ultimately, as more application scientists migrate their applications to HPC platforms, they will benefit from the extended capabilities this grant brought to PAPI to analyze and optimize performance in these environments, whether they use PAPI directly, or via third-party performance tools. Capabilities added to PAPI through this grant include support for new architectures such as the lastest GPU and Xeon Phi accelerators, and advanced power measurement and management features. Another important topic for the UTK team was providing support for a rich ecosystem of different fault management strategies in the context of parallel computing. Our long term efforts have been oriented toward proposing flexible strategies and providing building boxes that application developers can use to build the most efficient fault management technique for their application. These efforts span across the entire software spectrum, from theoretical models of existing strategies to easily assess their performance, to algorithmic modifications to take advantage of specific mathematical properties for data redundancy and to extensions to widely used programming paradigms to empower the application developers to deal with all types of faults. We have also continued our tight collaborations with users to help them adopt these technologies to ensure their application always deliver meaningful scientific data. Large supercomputer systems are becoming more and more power and energy constrained, and future systems and applications running on them will need to be optimized to run under power caps and/or minimize energy consumption. The UTEP team contributed to the SUPER energy thrust by developing power modeling methodologies and investigating power management strategies. Scalability modeling results showed that some applications can scale better with respect to an increasing power budget than with respect to only the number of processors. Power management, in particular shifting power to processors on the critical path of an application execution, can reduce perturbation due to system noise and other sources of runtime variability, which are growing problems on large-scale power-constrained computer systems.« less
Flexible Description Language for HPC based Processing of Remote Sense Data
NASA Astrophysics Data System (ADS)
Nandra, Constantin; Gorgan, Dorian; Bacu, Victor
2016-04-01
When talking about Big Data, the most challenging aspect lays in processing them in order to gain new insight, find new patterns and gain knowledge from them. This problem is likely most apparent in the case of Earth Observation (EO) data. With ever higher numbers of data sources and increasing data acquisition rates, dealing with EO data is indeed a challenge [1]. Geoscientists should address this challenge by using flexible and efficient tools and platforms. To answer this trend, the BigEarth project [2] aims to combine the advantages of high performance computing solutions with flexible processing description methodologies in order to reduce both task execution times and task definition time and effort. As a component of the BigEarth platform, WorDeL (Workflow Description Language) [3] is intended to offer a flexible, compact and modular approach to the task definition process. WorDeL, unlike other description alternatives such as Python or shell scripts, is oriented towards the description topologies, using them as abstractions for the processing programs. This feature is intended to make it an attractive alternative for users lacking in programming experience. By promoting modular designs, WorDeL not only makes the processing descriptions more user-readable and intuitive, but also helps organizing the processing tasks into independent sub-tasks, which can be executed in parallel on multi-processor platforms in order to improve execution times. As a BigEarth platform [4] component, WorDeL represents the means by which the user interacts with the system, describing processing algorithms in terms of existing operators and workflows [5], which are ultimately translated into sets of executable commands. The WorDeL language has been designed to help in the definition of compute-intensive, batch tasks which can be distributed and executed on high-performance, cloud or grid-based architectures in order to improve the processing time. Main references for further information: [1] Gorgan, D., "Flexible and Adaptive Processing of Earth Observation Data over High Performance Computation Architectures", International Conference and Exhibition Satellite 2015, August 17-19, Houston, Texas, USA. [2] Bigearth project - flexible processing of big earth data over high performance computing architectures. http://cgis.utcluj.ro/bigearth, (2014) [3] Nandra, C., Gorgan, D., "Workflow Description Language for Defining Big Earth Data Processing Tasks", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp. 461-468, (2015). [4] Bacu, V., Stefan, T., Gorgan, D., "Adaptive Processing of Earth Observation Data on Cloud Infrastructures Based on Workflow Description", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp.444-454, (2015). [5] Mihon, D., Bacu, V., Colceriu, V., Gorgan, D., "Modeling of Earth Observation Use Cases through the KEOPS System", Proceedings of the Intelligent Computer Communication and Processing (ICCP), IEEE-Press, pp. 455-460, (2015).
Who watches the watchers?: preventing fault in a fault tolerance library
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stanavige, C. D.
The Scalable Checkpoint/Restart library (SCR) was developed and is used by researchers at Lawrence Livermore National Laboratory to provide a fast and efficient method of saving and recovering large applications during runtime on high-performance computing (HPC) systems. Though SCR protects other programs, up until June 2017, nothing was actively protecting SCR. The goal of this project was to automate the building and testing of this library on the varying HPC architectures on which it is used. Our methods centered around the use of a continuous integration tool called Bamboo that allowed for automation agents to be installed on the HPCmore » systems themselves. These agents provided a way for us to establish a new and unique way to automate and customize the allocation of resources and running of tests with CMake’s unit testing framework, CTest, as well as integration testing scripts though an HPC package manager called Spack. These methods provided a parallel environment in which to test the more complex features of SCR. As a result, SCR is now automatically built and tested on several HPC architectures any time changes are made by developers to the library’s source code. The results of these tests are then communicated back to the developers for immediate feedback, allowing them to fix functionality of SCR that may have broken. Hours of developers’ time are now being saved from the tedious process of manually testing and debugging, which saves money and allows the SCR project team to focus their efforts towards development. Thus, HPC system users can use SCR in conjunction with their own applications to efficiently and effectively checkpoint and restart as needed with the assurance that SCR itself is functioning properly.« less
When to Renew Software Licences at HPC Centres? A Mathematical Analysis
NASA Astrophysics Data System (ADS)
Baolai, Ge; MacIsaac, Allan B.
2010-11-01
In this paper we study a common problem faced by many high performance computing (HPC) centres: When and how to renew commercial software licences. Software vendors often sell perpetual licences along with forward update and support contracts at an additional, annual cost. Every year or so, software support personnel and the budget units of HPC centres are required to make the decision of whether or not to renew such support, and usually such decisions are made intuitively. The total cost for a continuing support contract can, however, be costly. One might therefore want a rational answer to the question of whether the option for a renewal should be exercised and when. In an attempt to study this problem within a market framework, we present the mathematical problem derived for the day to day operation of a hypothetical HPC centre that charges for the use of software packages. In the mathematical model, we assume that the uncertainty comes from the demand, number of users using the packages, as well as the price. Further we assume the availability of up to date software versions may also affect the demand. We develop a renewal strategy that aims to maximize the expected profit from the use the software under consideration. The derived problem involves a decision tree, which constitutes a numerical procedure that can be processed in parallel.
Spatial Support Vector Regression to Detect Silent Errors in the Exascale Era
DOE Office of Scientific and Technical Information (OSTI.GOV)
Subasi, Omer; Di, Sheng; Bautista-Gomez, Leonardo
As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant challenges in reliability. Silent data corruptions (SDCs) or silent errors are one of the major sources that corrupt the executionresults of HPC applications without being detected. In this work, we explore a low-memory-overhead SDC detector, by leveraging epsilon-insensitive support vector machine regression, to detect SDCs that occur in HPC applications that can be characterized by an impact error bound. The key contributions are three fold. (1) Our design takes spatialfeatures (i.e., neighbouring data values for each data pointmore » in a snapshot) into training data, such that little memory overhead (less than 1%) is introduced. (2) We provide an in-depth study on the detection ability and performance with different parameters, and we optimize the detection range carefully. (3) Experiments with eight real-world HPC applications show thatour detector can achieve the detection sensitivity (i.e., recall) up to 99% yet suffer a less than 1% of false positive rate for most cases. Our detector incurs low performance overhead, 5% on average, for all benchmarks studied in the paper. Compared with other state-of-the-art techniques, our detector exhibits the best tradeoff considering the detection ability and overheads.« less
Auto-Versioning Systems Image Manager
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pezzaglia, Larry
2013-08-01
The av_sys_image_mgr utility provides an interface for the creation, manipulation, and analysis of system boot images for computer systems. It is primarily intended to provide a convenient method for managing the introduction of changes to boot images for long-lived production HPC systems.
76 FR 64330 - Advanced Scientific Computing Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2011-10-18
... talks on HPC Reliability, Diffusion on Complex Networks, and Reversible Software Execution Systems Report from Applied Math Workshop on Mathematics for the Analysis, Simulation, and Optimization of Complex Systems Report from ASCR-BES Workshop on Data Challenges from Next Generation Facilities Public...
Advances in Parallel Computing and Databases for Digital Pathology in Cancer Research
2016-11-13
these technologies and how we have used them in the past. We are interested in learning more about the needs of clinical pathologists as we continue to...such as image processing and correlation. Further, High Performance Computing (HPC) paradigms such as the Message Passing Interface (MPI) have been...Defense for Research and Engineering. such as pMatlab [4], or bcMPI [5] can significantly reduce the need for deep knowledge of parallel computing. In
SoAx: A generic C++ Structure of Arrays for handling particles in HPC codes
NASA Astrophysics Data System (ADS)
Homann, Holger; Laenen, Francois
2018-03-01
The numerical study of physical problems often require integrating the dynamics of a large number of particles evolving according to a given set of equations. Particles are characterized by the information they are carrying such as an identity, a position other. There are generally speaking two different possibilities for handling particles in high performance computing (HPC) codes. The concept of an Array of Structures (AoS) is in the spirit of the object-oriented programming (OOP) paradigm in that the particle information is implemented as a structure. Here, an object (realization of the structure) represents one particle and a set of many particles is stored in an array. In contrast, using the concept of a Structure of Arrays (SoA), a single structure holds several arrays each representing one property (such as the identity) of the whole set of particles. The AoS approach is often implemented in HPC codes due to its handiness and flexibility. For a class of problems, however, it is known that the performance of SoA is much better than that of AoS. We confirm this observation for our particle problem. Using a benchmark we show that on modern Intel Xeon processors the SoA implementation is typically several times faster than the AoS one. On Intel's MIC co-processors the performance gap even attains a factor of ten. The same is true for GPU computing, using both computational and multi-purpose GPUs. Combining performance and handiness, we present the library SoAx that has optimal performance (on CPUs, MICs, and GPUs) while providing the same handiness as AoS. For this, SoAx uses modern C++ design techniques such template meta programming that allows to automatically generate code for user defined heterogeneous data structures.
3D streamers simulation in a pin to plane configuration using massively parallel computing
NASA Astrophysics Data System (ADS)
Plewa, J.-M.; Eichwald, O.; Ducasse, O.; Dessante, P.; Jacobs, C.; Renon, N.; Yousfi, M.
2018-03-01
This paper concerns the 3D simulation of corona discharge using high performance computing (HPC) managed with the message passing interface (MPI) library. In the field of finite volume methods applied on non-adaptive mesh grids and in the case of a specific 3D dynamic benchmark test devoted to streamer studies, the great efficiency of the iterative R&B SOR and BiCGSTAB methods versus the direct MUMPS method was clearly demonstrated in solving the Poisson equation using HPC resources. The optimization of the parallelization and the resulting scalability was undertaken as a function of the HPC architecture for a number of mesh cells ranging from 8 to 512 million and a number of cores ranging from 20 to 1600. The R&B SOR method remains at least about four times faster than the BiCGSTAB method and requires significantly less memory for all tested situations. The R&B SOR method was then implemented in a 3D MPI parallelized code that solves the classical first order model of an atmospheric pressure corona discharge in air. The 3D code capabilities were tested by following the development of one, two and four coplanar streamers generated by initial plasma spots for 6 ns. The preliminary results obtained allowed us to follow in detail the formation of the tree structure of a corona discharge and the effects of the mutual interactions between the streamers in terms of streamer velocity, trajectory and diameter. The computing time for 64 million of mesh cells distributed over 1000 cores using the MPI procedures is about 30 min ns-1, regardless of the number of streamers.
Power-Time Curve Comparison between Weightlifting Derivatives
Suchomel, Timothy J.; Sole, Christopher J.
2017-01-01
This study examined the power production differences between weightlifting derivatives through a comparison of power-time (P-t) curves. Thirteen resistance-trained males performed hang power clean (HPC), jump shrug (JS), and hang high pull (HHP) repetitions at relative loads of 30%, 45%, 65%, and 80% of their one repetition maximum (1RM) HPC. Relative peak power (PPRel), work (WRel), and P-t curves were compared. The JS produced greater PPRel than the HPC (p < 0.001, d = 2.53) and the HHP (p < 0.001, d = 2.14). In addition, the HHP PPRel was statistically greater than the HPC (p = 0.008, d = 0.80). Similarly, the JS produced greater WRel compared to the HPC (p < 0.001, d = 1.89) and HHP (p < 0.001, d = 1.42). Furthermore, HHP WRel was statistically greater than the HPC (p = 0.003, d = 0.73). The P-t profiles of each exercise were similar during the first 80-85% of the movement; however, during the final 15-20% of the movement the P-t profile of the JS was found to be greater than the HPC and HHP. The JS produced greater PPRel and WRel compared to the HPC and HHP with large effect size differences. The HHP produced greater PPRel and WRel than the HPC with moderate effect size differences. The JS and HHP produced markedly different P-t profiles in the final 15-20% of the movement compared to the HPC. Thus, these exercises may be superior methods of training to enhance PPRel. The greatest differences in PPRel between the JS and HHP and the HPC occurred at lighter loads, suggesting that loads of 30-45% 1RM HPC may provide the best training stimulus when using the JS and HHP. In contrast, loads ranging 65-80% 1RM HPC may provide an optimal stimulus for power production during the HPC. Key points The JS and HHP exercises produced greater relative peak power and relative work compared to the HPC. Although the power-time curves were similar during the first 80-85% of the movement, the JS and HHP possessed unique power-time characteristics during the final 15-20% of the movement compared to the HPC. The JS and HHP may be effectively implemented to train peak power characteristics, especially using loads ranging from 30-45% of an individual’s 1RM HPC. The HPC may be best implemented using loads ranging from 65-80% of an individual’s 1RM HPC. PMID:28912659
Discrete Particle Model for Porous Media Flow using OpenFOAM at Intel Xeon Phi Coprocessors
NASA Astrophysics Data System (ADS)
Shang, Zhi; Nandakumar, Krishnaswamy; Liu, Honggao; Tyagi, Mayank; Lupo, James A.; Thompson, Karten
2015-11-01
The discrete particle model (DPM) in OpenFOAM was used to study the turbulent solid particle suspension flows through the porous media of a natural dual-permeability rock. The 2D and 3D pore geometries of the porous media were generated by sphere packing with the radius ratio of 3. The porosity is about 38% same as the natural dual-permeability rock. In the 2D case, the mesh cells reach 5 million with 1 million solid particles and in the 3D case, the mesh cells are above 10 million with 5 million solid particles. The solid particles are distributed by Gaussian distribution from 20 μm to 180 μm with expectation as 100 μm. Through the numerical simulations, not only was the HPC studied using Intel Xeon Phi Coprocessors but also the flow behaviors of large scale solid suspension flows in porous media were studied. The authors would like to thank the support by IPCC@LSU-Intel Parallel Computing Center (LSU # Y1SY1-1) and the HPC resources at Louisiana State University (http://www.hpc.lsu.edu).
Visualisation methods for large provenance collections in data-intensive collaborative platforms
NASA Astrophysics Data System (ADS)
Spinuso, Alessandro; Fligueira, Rosa; Atkinson, Malcolm; Gemuend, Andre
2016-04-01
This work investigates improving the methods of visually representing provenance information in the context of modern data-driven scientific research. It explores scenarios where data-intensive workflows systems are serving communities of researchers within collaborative environments, supporting the sharing of data and methods, and offering a variety of computation facilities, including HPC, HTC and Cloud. It focuses on the exploration of big-data visualization techniques aiming at producing comprehensive and interactive views on top of large and heterogeneous provenance data. The same approach is applicable to control-flow and data-flow workflows or to combinations of the two. This flexibility is achieved using the W3C-PROV recommendation as a reference model, especially its workflow oriented profiles such as D-PROV (Messier et al. 2013). Our implementation is based on the provenance records produced by the dispel4py data-intensive processing library (Filgueira et al. 2015). dispel4py is an open-source Python framework for describing abstract stream-based workflows for distributed data-intensive applications, developed during the VERCE project. dispel4py enables scientists to develop their scientific methods and applications on their laptop and then run them at scale on a wide range of e-Infrastructures (Cloud, Cluster, etc.) without making changes. Users can therefore focus on designing their workflows at an abstract level, describing actions, input and output streams, and how they are connected. The dispel4py system then maps these descriptions to the enactment platforms, such as MPI, Storm, multiprocessing. It provides a mechanism which allows users to determine the provenance information to be collected and to analyze it at runtime. For this work we consider alternative visualisation methods for provenance data, from infinite lists and localised interactive graphs, to radial-views. The latter technique has been positively explored in many fields, from text data visualisation to genomics and social networking analysis. Its adoption for provenance has been presented in literature (Borkin et al. 2013) in the context of parent-child relationships across processes, constructed from control-flow information. Computer graphics research has focused on the advantage of this radial distribution of interlinked information and on ways to improve the visual efficiency and tunability of such representations, like the Hierarchical Edge Bundles visualisation method, (Holten et al. 2006), which aims at reducing visual clutter of highly connected structures via the generation of bundles. Our approach explores the potential of the combination of these methods. It serves environments where the size of the provenance collection, coupled with the diversity of the infrastructures and the domain metadata, make the extrapolation of usage trends extremely challenging. Applications of such visualisation systems can engage groups of scientists, data providers and computational engineers, by serving visual snapshots that highlight relationships between an item and its connected processes. We will present examples of comprehensive views on the distribution of processing and data transfers during a workflow's execution in HPC, as well as cross workflows interactions and internal dynamics. The latter in the context of faceted searches on domain metadata values-range. These are obtained from the analysis of real provenance data generated by the processing of seismic traces performed through the VERCE platform.
Riaz, Sadia; Schumacher, Anett; Sivagurunathan, Seyon; Van Der Meer, Matthijs; Ito, Rutsuko
2017-07-01
The hippocampus (HPC) has been widely implicated in the contextual control of appetitive and aversive conditioning. However, whole hippocampal lesions do not invariably impair all forms of contextual processing, as in the case of complex biconditional context discrimination, leading to contention over the exact nature of the contribution of the HPC in contextual processing. Moreover, the increasingly well-established functional dissociation between the dorsal (dHPC) and ventral (vHPC) subregions of the HPC has been largely overlooked in the existing literature on hippocampal-based contextual memory processing in appetitively motivated tasks. Thus, the present study sought to investigate the individual roles of the dHPC and the vHPC in contextual biconditional discrimination (CBD) performance and memory retrieval. To this end, we examined the effects of transient post-acquisition pharmacological inactivation (using a combination of GABA A and GABA B receptor agonists muscimol and baclofen) of functionally distinct subregions of the HPC (CA1/CA3 subfields of the dHPC and vHPC) on CBD memory retrieval. Additional behavioral assays including novelty preference, light-dark box and locomotor activity test were also performed to confirm that the respective sites of inactivation were functionally silent. We observed robust deficits in CBD performance and memory retrieval following inactivation of the vHPC, but not the dHPC. Our data provides novel insight into the differential roles of the ventral and dorsal HPC in reward contextual processing, under conditions in which the context is defined by proximal cues. © 2017 Wiley Periodicals, Inc.
Distributed Accounting on the Grid
NASA Technical Reports Server (NTRS)
Thigpen, William; Hacker, Thomas J.; McGinnis, Laura F.; Athey, Brian D.
2001-01-01
By the late 1990s, the Internet was adequately equipped to move vast amounts of data between HPC (High Performance Computing) systems, and efforts were initiated to link together the national infrastructure of high performance computational and data storage resources together into a general computational utility 'grid', analogous to the national electrical power grid infrastructure. The purpose of the Computational grid is to provide dependable, consistent, pervasive, and inexpensive access to computational resources for the computing community in the form of a computing utility. This paper presents a fully distributed view of Grid usage accounting and a methodology for allocating Grid computational resources for use on a Grid computing system.
High-Throughput and Low-Latency Network Communication with NetIO
NASA Astrophysics Data System (ADS)
Schumacher, Jörn; Plessl, Christian; Vandelli, Wainer
2017-10-01
HPC network technologies like Infiniband, TrueScale or OmniPath provide low- latency and high-throughput communication between hosts, which makes them attractive options for data-acquisition systems in large-scale high-energy physics experiments. Like HPC networks, DAQ networks are local and include a well specified number of systems. Unfortunately traditional network communication APIs for HPC clusters like MPI or PGAS exclusively target the HPC community and are not suited well for DAQ applications. It is possible to build distributed DAQ applications using low-level system APIs like Infiniband Verbs, but it requires a non-negligible effort and expert knowledge. At the same time, message services like ZeroMQ have gained popularity in the HEP community. They make it possible to build distributed applications with a high-level approach and provide good performance. Unfortunately, their usage usually limits developers to TCP/IP- based networks. While it is possible to operate a TCP/IP stack on top of Infiniband and OmniPath, this approach may not be very efficient compared to a direct use of native APIs. NetIO is a simple, novel asynchronous message service that can operate on Ethernet, Infiniband and similar network fabrics. In this paper the design and implementation of NetIO is presented and described, and its use is evaluated in comparison to other approaches. NetIO supports different high-level programming models and typical workloads of HEP applications. The ATLAS FELIX project [1] successfully uses NetIO as its central communication platform. The architecture of NetIO is described in this paper, including the user-level API and the internal data-flow design. The paper includes a performance evaluation of NetIO including throughput and latency measurements. The performance is compared against the state-of-the- art ZeroMQ message service. Performance measurements are performed in a lab environment with Ethernet and FDR Infiniband networks.
NASA Astrophysics Data System (ADS)
van Hemert, Jano; Vilotte, Jean-Pierre
2010-05-01
Research in earthquake and seismology addresses fundamental problems in understanding Earth's internal wave sources and structures, and augment applications to societal concerns about natural hazards, energy resources and environmental change. This community is central to the European Plate Observing System (EPOS)—the ESFRI initiative in solid Earth Sciences. Global and regional seismology monitoring systems are continuously operated and are transmitting a growing wealth of data from Europe and from around the world. These tremendous volumes of seismograms, i.e., records of ground motions as a function of time, have a definite multi-use attribute, which puts a great premium on open-access data infrastructures that are integrated globally. In Europe, the earthquake and seismology community is part of the European Integrated Data Archives (EIDA) infrastructure and is structured as "horizontal" data services. On top of this distributed data archive system, the community has developed recently within the EC project NERIES advanced SOA-based web services and a unified portal system. Enabling advanced analysis of these data by utilising a data-aware distributed computing environment is instrumental to fully exploit the cornucopia of data and to guarantee optimal operation of the high-cost monitoring facilities. The strategy of VERCE is driven by the needs of data-intensive applications in data mining and modelling and will be illustrated through a set of applications. It aims to provide a comprehensive architecture and framework adapted to the scale and the diversity of these applications, and to integrate the community data infrastructure with Grid and HPC infrastructures. A first novel aspect is a service-oriented architecture that provides well-equipped integrated workbenches, with an efficient communication layer between data and Grid infrastructures, augmented with bridges to the HPC facilities. A second novel aspect is the coupling between Grid data analysis and HPC data modelling applications through workflow and data sharing mechanisms. VERCE will develop important interactions with the European infrastructure initiatives in Grid and HPC computing. The VERCE team: CNRS-France (IPG Paris, LGIT Grenoble), UEDIN (UK), KNMI-ORFEUS (Holland), EMSC, INGV (Italy), LMU (Germany), ULIV (UK), BADW-LRZ (Germany), SCAI (Germany), CINECA (Italy)
SEE-GRID eInfrastructure for Regional eScience
NASA Astrophysics Data System (ADS)
Prnjat, Ognjen; Balaz, Antun; Vudragovic, Dusan; Liabotis, Ioannis; Sener, Cevat; Marovic, Branko; Kozlovszky, Miklos; Neagu, Gabriel
In the past 6 years, a number of targeted initiatives, funded by the European Commission via its information society and RTD programmes and Greek infrastructure development actions, have articulated a successful regional development actions in South East Europe that can be used as a role model for other international developments. The SEEREN (South-East European Research and Education Networking initiative) project, through its two phases, established the SEE segment of the pan-European G ´EANT network and successfully connected the research and scientific communities in the region. Currently, the SEE-LIGHT project is working towards establishing a dark-fiber backbone that will interconnect most national Research and Education networks in the region. On the distributed computing and storage provisioning i.e. Grid plane, the SEE-GRID (South-East European GRID e-Infrastructure Development) project, similarly through its two phases, has established a strong human network in the area of scientific computing and has set up a powerful regional Grid infrastructure, and attracted a number of applications from different fields from countries throughout the South-East Europe. The current SEEGRID-SCI project, ending in April 2010, empowers the regional user communities from fields of meteorology, seismology and environmental protection in common use and sharing of the regional e-Infrastructure. Current technical initiatives in formulation are focusing on a set of coordinated actions in the area of HPC and application fields making use of HPC initiatives. Finally, the current SEERA-EI project brings together policy makers - programme managers from 10 countries in the region. The project aims to establish a communication platform between programme managers, pave the way towards common e-Infrastructure strategy and vision, and implement concrete actions for common funding of electronic infrastructures on the regional level. The regional vision on establishing an e-Infrastructure compatible with European developments, and empowering the scientists in the region in equal participation in the use of pan- European infrastructures, is materializing through the above initiatives. This model has a number of concrete operational and organizational guidelines which can be adapted to help e-Infrastructure developments in other world regions. In this paper we review the most important developments and contributions by the SEEGRID- SCI project.
NASA Astrophysics Data System (ADS)
Druken, K. A.; Trenham, C. E.; Steer, A.; Evans, B. J. K.; Richards, C. J.; Smillie, J.; Allen, C.; Pringle, S.; Wang, J.; Wyborn, L. A.
2016-12-01
The Australian National Computational Infrastructure (NCI) provides access to petascale data in climate, weather, Earth observations, and genomics, and terascale data in astronomy, geophysics, ecology and land use, as well as social sciences. The data is centralized in a closely integrated High Performance Computing (HPC), High Performance Data (HPD) and cloud facility. Despite this, there remain significant barriers for many users to find and access the data: simply hosting a large volume of data is not helpful if researchers are unable to find, access, and use the data for their particular need. Use cases demonstrate we need to support a diverse range of users who are increasingly crossing traditional research discipline boundaries. To support their varying experience, access needs and research workflows, NCI has implemented an integrated data platform providing a range of services that enable users to interact with our data holdings. These services include: - A GeoNetwork catalog built on standardized Data Management Plans to search collection metadata, and find relevant datasets; - Web data services to download or remotely access data via OPeNDAP, WMS, WCS and other protocols; - Virtual Desktop Infrastructure (VDI) built on a highly integrated on-site cloud with access to both the HPC peak machine and research data collections. The VDI is a fully featured environment allowing visualization, code development and analysis to take place in an interactive desktop environment; and - A Learning Management System (LMS) containing User Guides, Use Case examples and Jupyter Notebooks structured into courses, so that users can self-teach how to use these facilities with examples from our system across a range of disciplines. We will briefly present these components, and discuss how we engage with data custodians and consumers to develop standardized data structures and services that support the range of needs. We will also highlight some key developments that have improved user experience in utilizing the services, particularly enabling transdisciplinary science. This work combines with other developments at NCI to increase the confidence of scientists from any field to undertake research and analysis on these important data collections regardless of their preferred work environment or level of skill.
Conversion of HSPF Legacy Model to a Platform-Independent, Open-Source Language
NASA Astrophysics Data System (ADS)
Heaphy, R. T.; Burke, M. P.; Love, J. T.
2015-12-01
Since its initial development over 30 years ago, the Hydrologic Simulation Program - FORTAN (HSPF) model has been used worldwide to support water quality planning and management. In the United States, HSPF receives widespread endorsement as a regulatory tool at all levels of government and is a core component of the EPA's Better Assessment Science Integrating Point and Nonpoint Sources (BASINS) system, which was developed to support nationwide Total Maximum Daily Load (TMDL) analysis. However, the model's legacy code and data management systems have limitations in their ability to integrate with modern software, hardware, and leverage parallel computing, which have left voids in optimization, pre-, and post-processing tools. Advances in technology and our scientific understanding of environmental processes that have occurred over the last 30 years mandate that upgrades be made to HSPF to allow it to evolve and continue to be a premiere tool for water resource planners. This work aims to mitigate the challenges currently facing HSPF through two primary tasks: (1) convert code to a modern widely accepted, open-source, high-performance computing (hpc) code; and (2) convert model input and output files to modern widely accepted, open-source, data model, library, and binary file format. Python was chosen as the new language for the code conversion. It is an interpreted, object-oriented, hpc code with dynamic semantics that has become one of the most popular open-source languages. While python code execution can be slow compared to compiled, statically typed programming languages, such as C and FORTRAN, the integration of Numba (a just-in-time specializing compiler) has allowed this challenge to be overcome. For the legacy model data management conversion, HDF5 was chosen to store the model input and output. The code conversion for HSPF's hydrologic and hydraulic modules has been completed. The converted code has been tested against HSPF's suite of "test" runs and shown good agreement and similar execution times while using the Numba compiler. Continued verification of the accuracy of the converted code against more complex legacy applications and improvement upon execution times by incorporating an intelligent network change detection tool is currently underway, and preliminary results will be presented.
Unleashing spatially distributed ecohydrology modeling using Big Data tools
NASA Astrophysics Data System (ADS)
Miles, B.; Idaszak, R.
2015-12-01
Physically based spatially distributed ecohydrology models are useful for answering science and management questions related to the hydrology and biogeochemistry of prairie, savanna, forested, as well as urbanized ecosystems. However, these models can produce hundreds of gigabytes of spatial output for a single model run over decadal time scales when run at regional spatial scales and moderate spatial resolutions (~100-km2+ at 30-m spatial resolution) or when run for small watersheds at high spatial resolutions (~1-km2 at 3-m spatial resolution). Numerical data formats such as HDF5 can store arbitrarily large datasets. However even in HPC environments, there are practical limits on the size of single files that can be stored and reliably backed up. Even when such large datasets can be stored, querying and analyzing these data can suffer from poor performance due to memory limitations and I/O bottlenecks, for example on single workstations where memory and bandwidth are limited, or in HPC environments where data are stored separately from computational nodes. The difficulty of storing and analyzing spatial data from ecohydrology models limits our ability to harness these powerful tools. Big Data tools such as distributed databases have the potential to surmount the data storage and analysis challenges inherent to large spatial datasets. Distributed databases solve these problems by storing data close to computational nodes while enabling horizontal scalability and fault tolerance. Here we present the architecture of and preliminary results from PatchDB, a distributed datastore for managing spatial output from the Regional Hydro-Ecological Simulation System (RHESSys). The initial version of PatchDB uses message queueing to asynchronously write RHESSys model output to an Apache Cassandra cluster. Once stored in the cluster, these data can be efficiently queried to quickly produce both spatial visualizations for a particular variable (e.g. maps and animations), as well as point time series of arbitrary variables at arbitrary points in space within a watershed or river basin. By treating ecohydrology modeling as a Big Data problem, we hope to provide a platform for answering transformative science and management questions related to water quantity and quality in a world of non-stationary climate.
Exploring similarities among many species distributions
Simmerman, Scott; Wang, Jingyuan; Osborne, James; Shook, Kimberly; Huang, Jian; Godsoe, William; Simons, Theodore R.
2012-01-01
Collecting species presence data and then building models to predict species distribution has been long practiced in the field of ecology for the purpose of improving our understanding of species relationships with each other and with the environment. Due to limitations of computing power as well as limited means of using modeling software on HPC facilities, past species distribution studies have been unable to fully explore diverse data sets. We build a system that can, for the first time to our knowledge, leverage HPC to support effective exploration of species similarities in distribution as well as their dependencies on common environmental conditions. Our system can also compute and reveal uncertainties in the modeling results enabling domain experts to make informed judgments about the data. Our work was motivated by and centered around data collection efforts within the Great Smoky Mountains National Park that date back to the 1940s. Our findings present new research opportunities in ecology and produce actionable field-work items for biodiversity management personnel to include in their planning of daily management activities.
Climate Science Performance, Data and Productivity on Titan
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mayer, Benjamin W; Worley, Patrick H; Gaddis, Abigail L
2015-01-01
Climate Science models are flagship codes for the largest of high performance computing (HPC) resources, both in visibility, with the newly launched Department of Energy (DOE) Accelerated Climate Model for Energy (ACME) effort, and in terms of significant fractions of system usage. The performance of the DOE ACME model is captured with application level timers and examined through a sizeable run archive. Performance and variability of compute, queue time and ancillary services are examined. As Climate Science advances in the use of HPC resources there has been an increase in the required human and data systems to achieve programs goals.more » A description of current workflow processes (hardware, software, human) and planned automation of the workflow, along with historical and projected data in motion and at rest data usage, are detailed. The combination of these two topics motivates a description of future systems requirements for DOE Climate Modeling efforts, focusing on the growth of data storage and network and disk bandwidth required to handle data at an acceptable rate.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barbara Chapman
OpenMP was not well recognized at the beginning of the project, around year 2003, because of its limited use in DoE production applications and the inmature hardware support for an efficient implementation. Yet in the recent years, it has been graduately adopted both in HPC applications, mostly in the form of MPI+OpenMP hybrid code, and in mid-scale desktop applications for scientific and experimental studies. We have observed this trend and worked deligiently to improve our OpenMP compiler and runtimes, as well as to work with the OpenMP standard organization to make sure OpenMP are evolved in the direction close tomore » DoE missions. In the Center for Programming Models for Scalable Parallel Computing project, the HPCTools team at the University of Houston (UH), directed by Dr. Barbara Chapman, has been working with project partners, external collaborators and hardware vendors to increase the scalability and applicability of OpenMP for multi-core (and future manycore) platforms and for distributed memory systems by exploring different programming models, language extensions, compiler optimizations, as well as runtime library support.« less
WMT: The CSDMS Web Modeling Tool
NASA Astrophysics Data System (ADS)
Piper, M.; Hutton, E. W. H.; Overeem, I.; Syvitski, J. P.
2015-12-01
The Community Surface Dynamics Modeling System (CSDMS) has a mission to enable model use and development for research in earth surface processes. CSDMS strives to expand the use of quantitative modeling techniques, promotes best practices in coding, and advocates for the use of open-source software. To streamline and standardize access to models, CSDMS has developed the Web Modeling Tool (WMT), a RESTful web application with a client-side graphical interface and a server-side database and API that allows users to build coupled surface dynamics models in a web browser on a personal computer or a mobile device, and run them in a high-performance computing (HPC) environment. With WMT, users can: Design a model from a set of components Edit component parameters Save models to a web-accessible server Share saved models with the community Submit runs to an HPC system Download simulation results The WMT client is an Ajax application written in Java with GWT, which allows developers to employ object-oriented design principles and development tools such as Ant, Eclipse and JUnit. For deployment on the web, the GWT compiler translates Java code to optimized and obfuscated JavaScript. The WMT client is supported on Firefox, Chrome, Safari, and Internet Explorer. The WMT server, written in Python and SQLite, is a layered system, with each layer exposing a web service API: wmt-db: database of component, model, and simulation metadata and output wmt-api: configure and connect components wmt-exe: launch simulations on remote execution servers The database server provides, as JSON-encoded messages, the metadata for users to couple model components, including descriptions of component exchange items, uses and provides ports, and input parameters. Execution servers are network-accessible computational resources, ranging from HPC systems to desktop computers, containing the CSDMS software stack for running a simulation. Once a simulation completes, its output, in NetCDF, is packaged and uploaded to a data server where it is stored and from which a user can download it as a single compressed archive file.
Equivalent cardioprotection induced by ischemic and hypoxic preconditioning.
Xiang, Xujin; Lin, Haixia; Liu, Jin; Duan, Zeyan
2013-04-01
We aimed to compare cardioprotection induced by various hypoxic preconditioning (HPC) and ischemic preconditioning (IPC) protocols. Isolated rat hearts were randomly divided into 7 groups (n = 7 per group) and received 3 or 5 cycles of 3-minute ischemia or hypoxia followed by 3-minute reperfusion (IPC33 or HPC33 or IPC53 or HPC53 group), 3 cycles of 5-minute ischemia or hypoxia followed by 5-minute reperfusion (IPC35 group or HPC35 group), or 30-minute perfusion (ischemic/reperfusion group), respectively. Then all the hearts were subjected to 50-minute ischemia and 120-minute reperfusion. Cardiac function, infarct size, and coronary flow rate (CFR) were evaluated. Recovery of cardiac function and CFR in IPC35, HPC35, and HPC53 groups was significantly improved as compared with I/R group (p < 0.01). There were no significant differences in cardiac function parameters between IPC35 and HPC35 groups. Consistently, infarct size was significantly reduced in IPC35, HPC35, and HPC53 groups compared with ischemic/reperfusion group. Multiple-cycle short duration HPC exerted cardioprotection, which was as powerful as that of IPC. Georg Thieme Verlag KG Stuttgart · New York.
A population-based analysis of Head and Neck hemangiopericytoma.
Shaigany, Kevin; Fang, Christina H; Patel, Tapan D; Park, Richard Chan; Baredes, Soly; Eloy, Jean Anderson
2016-03-01
Hemangiopericytomas (HPC) are tumors that arise from pericytes. Hemangiopericytomas of the head and neck are rare and occur both extracranially and intracranially. This study analyzes the demographic, clinicopathologic, treatment modalities, and survival characteristics of extracranial head and neck hemangiopericytomas (HN-HPC) and compares them to HPCs at other body sites (Other-HPC). The Surveillance, Epidemiology, and End Results (SEER) database (1973-2012) was queried for HN-HPC (121 cases) and Other-HPC (510 cases). Data were analyzed comparatively with respect to various demographic and clinicopathologic factors. Disease-specific survival (DSS) was analyzed using the Kaplan-Meier model. There was no significant difference in age at time of diagnosis between HN-HPC and Other-HPC. Head and neck HPC was most commonly located in the connective and soft tissue (18.4%), followed by the nasal cavity and paranasal sinuses (8.5%). Head and neck HPCs were smaller than Other-HPC (P < 0.0001) and more likely to be a lower histologic grade (P < 0.0097). The primary treatment modality for HN-HPC was surgery alone, used in 55.8% of cases. The 5-, 10-, and 20-year DSS for HN-HPC were 84.0%, 79.4%, and 69.4%, respectfully. Higher histologic grade and the presence of distant metastases were poor prognostic factors for HN-HPC. Head and neck HPCs are rare tumors. This study represents the largest series of HN-HPCs to date. Surgery alone is the primary treatment modality for HN-HPC, with a favorable prognosis. Adjuvant radiotherapy does not appear to confer a survival benefit for any body site. 4. Laryngoscope, 126:643-650, 2016. © 2015 The American Laryngological, Rhinological and Otological Society, Inc.
PerSEUS: Ultra-Low-Power High Performance Computing for Plasma Simulations
NASA Astrophysics Data System (ADS)
Doxas, I.; Andreou, A.; Lyon, J.; Angelopoulos, V.; Lu, S.; Pritchett, P. L.
2017-12-01
Peta-op SupErcomputing Unconventional System (PerSEUS) aims to explore the use for High Performance Scientific Computing (HPC) of ultra-low-power mixed signal unconventional computational elements developed by Johns Hopkins University (JHU), and demonstrate that capability on both fluid and particle Plasma codes. We will describe the JHU Mixed-signal Unconventional Supercomputing Elements (MUSE), and report initial results for the Lyon-Fedder-Mobarry (LFM) global magnetospheric MHD code, and a UCLA general purpose relativistic Particle-In-Cell (PIC) code.
Unified Performance and Power Modeling of Scientific Workloads
DOE Office of Scientific and Technical Information (OSTI.GOV)
Song, Shuaiwen; Barker, Kevin J.; Kerbyson, Darren J.
2013-11-17
It is expected that scientific applications executing on future large-scale HPC must be optimized not only in terms of performance, but also in terms of power consumption. As power and energy become increasingly constrained resources, researchers and developers must have access to tools that will allow for accurate prediction of both performance and power consumption. Reasoning about performance and power consumption in concert will be critical for achieving maximum utilization of limited resources on future HPC systems. To this end, we present a unified performance and power model for the Nek-Bone mini-application developed as part of the DOE's CESAR Exascalemore » Co-Design Center. Our models consider the impact of computation, point-to-point communication, and collective communication« less
Final Report for File System Support for Burst Buffers on HPC Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yu, W.; Mohror, K.
Distributed burst buffers are a promising storage architecture for handling I/O workloads for exascale computing. As they are being deployed on more supercomputers, a file system that efficiently manages these burst buffers for fast I/O operations carries great consequence. Over the past year, FSU team has undertaken several efforts to design, prototype and evaluate distributed file systems for burst buffers on HPC systems. These include MetaKV: a Key-Value Store for Metadata Management of Distributed Burst Buffers, a user-level file system with multiple backends, and a specialized file system for large datasets of deep neural networks. Our progress for these respectivemore » efforts are elaborated further in this report.« less
Implementation of the NAS Parallel Benchmarks in Java
NASA Technical Reports Server (NTRS)
Frumkin, Michael A.; Schultz, Matthew; Jin, Haoqiang; Yan, Jerry; Biegel, Bryan (Technical Monitor)
2002-01-01
Several features make Java an attractive choice for High Performance Computing (HPC). In order to gauge the applicability of Java to Computational Fluid Dynamics (CFD), we have implemented the NAS (NASA Advanced Supercomputing) Parallel Benchmarks in Java. The performance and scalability of the benchmarks point out the areas where improvement in Java compiler technology and in Java thread implementation would position Java closer to Fortran in the competition for CFD applications.
Wonczak, Stephan; Thiele, Holger; Nieroda, Lech; Jabbari, Kamel; Borowski, Stefan; Sinha, Vishal; Gunia, Wilfried; Lang, Ulrich; Achter, Viktor; Nürnberg, Peter
2015-01-01
Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings. In order to run these analyses fast, automated workflows implemented on high performance computers are state of the art. While providing sufficient compute power and storage to meet the NGS data challenge, high performance computing (HPC) systems require special care when utilized for high throughput processing. This is especially true if the HPC system is shared by different users. Here, stability, robustness and maintainability are as important for automated workflows as speed and throughput. To achieve all of these aims, dedicated solutions have to be developed. In this paper, we present the tricks and twists that we utilized in the implementation of our exome data processing workflow. It may serve as a guideline for other high throughput data analysis projects using a similar infrastructure. The code implementing our solutions is provided in the supporting information files. PMID:25942438
Cloud CPFP: a shotgun proteomics data analysis pipeline using cloud and high performance computing.
Trudgian, David C; Mirzaei, Hamid
2012-12-07
We have extended the functionality of the Central Proteomics Facilities Pipeline (CPFP) to allow use of remote cloud and high performance computing (HPC) resources for shotgun proteomics data processing. CPFP has been modified to include modular local and remote scheduling for data processing jobs. The pipeline can now be run on a single PC or server, a local cluster, a remote HPC cluster, and/or the Amazon Web Services (AWS) cloud. We provide public images that allow easy deployment of CPFP in its entirety in the AWS cloud. This significantly reduces the effort necessary to use the software, and allows proteomics laboratories to pay for compute time ad hoc, rather than obtaining and maintaining expensive local server clusters. Alternatively the Amazon cloud can be used to increase the throughput of a local installation of CPFP as necessary. We demonstrate that cloud CPFP allows users to process data at higher speed than local installations but with similar cost and lower staff requirements. In addition to the computational improvements, the web interface to CPFP is simplified, and other functionalities are enhanced. The software is under active development at two leading institutions and continues to be released under an open-source license at http://cpfp.sourceforge.net.
ERIC Educational Resources Information Center
O'Hanlon, Charlene
2007-01-01
Traditionally, the high-performance computing (HPC) systems used to conduct research at universities have amounted to silos of technology scattered across the campus and falling under the purview of the researchers themselves. This article reports that a growing number of universities are now taking over the management of those systems and…
SMT-Aware Instantaneous Footprint Optimization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roy, Probir; Liu, Xu; Song, Shuaiwen
Modern architectures employ simultaneous multithreading (SMT) to increase thread-level parallelism. SMT threads share many functional units and the whole memory hierarchy of a physical core. Without a careful code design, SMT threads can easily contend with each other for these shared resources, causing severe performance degradation. Minimizing SMT thread contention for HPC applications running on dedicated platforms is very challenging, because they usually spawn threads within Single Program Multiple Data (SPMD) models. To address this important issue, we introduce a simple scheme for SMT-aware code optimization, which aims to reduce the memory contention across SMT threads.
NASA Astrophysics Data System (ADS)
Szelag, Bertrand; Abraham, Alexis; Brision, Stéphane; Gindre, Paul; Blampey, Benjamin; Myko, André; Olivier, Segolene; Kopp, Christophe
2017-05-01
Silicon photonic is becoming a reality for next generation communication system addressing the increasing needs of HPC (High Performance Computing) systems and datacenters. CMOS compatible photonic platforms are developed in many foundries integrating passive and active devices. The use of existing and qualified microelectronics process guarantees cost efficient and mature photonic technologies. Meanwhile, photonic devices have their own fabrication constraints, not similar to those of cmos devices, which can affect their performances. In this paper, we are addressing the integration of PN junction Mach Zehnder modulator in a 200mm CMOS compatible photonic platform. Implantation based device characteristics are impacted by many process variations among which screening layer thickness, dopant diffusion, implantation mask overlay. CMOS devices are generally quite robust with respect to these processes thanks to dedicated design rules. For photonic devices, the situation is different since, most of the time, doped areas must be carefully located within waveguides and CMOS solutions like self-alignment to the gate cannot be applied. In this work, we present different robust integration solutions for junction-based modulators. A simulation setup has been built in order to optimize of the process conditions. It consist in a Mathlab interface coupling process and device electro-optic simulators in order to run many iterations. Illustrations of modulator characteristic variations with process parameters are done using this simulation setup. Parameters under study are, for instance, X and Y direction lithography shifts, screening oxide and slab thicknesses. A robust process and design approach leading to a pn junction Mach Zehnder modulator insensitive to lithography misalignment is then proposed. Simulation results are compared with experimental datas. Indeed, various modulators have been fabricated with different process conditions and integration schemes. Extensive electro-optic characterization of these components will be presented.
The clinical phenotype of hereditary versus sporadic prostate cancer: HPC definition revisited.
Cremers, Ruben G; Aben, Katja K; van Oort, Inge M; Sedelaar, J P Michiel; Vasen, Hans F; Vermeulen, Sita H; Kiemeney, Lambertus A
2016-07-01
The definition of hereditary prostate cancer (HPC) is based on family history and age at onset. Intuitively, HPC is a serious subtype of prostate cancer but there are only limited data on the clinical phenotype of HPC. Here, we aimed to compare the prognosis of HPC to the sporadic form of prostate cancer (SPC). HPC patients were identified through a national registry of HPC families in the Netherlands, selecting patients diagnosed from the year 2000 onward (n = 324). SPC patients were identified from the Netherlands Cancer Registry (NCR) between 2003 and 2006 for a population-based study into the genetic susceptibility of PC (n = 1,664). Detailed clinical data were collected by NCR-registrars, using a standardized registration form. Follow-up extended up to the end of 2013. Differences between the groups were evaluated by cross-tabulations and tested for statistical significance while accounting for familial dependency of observations by GEE. Differences in progression-free and overall survival were evaluated using χ(2) testing with GEE in a proportional-hazards model. HPC patients were on average 3 years younger at diagnosis, had lower PSA values, lower Gleason scores, and more often locally confined disease. Of the HPC patients, 35% had high-risk disease (NICE-criteria) versus 51% of the SPC patients. HPC patients were less often treated with active surveillance. Kaplan-Meier 5-year progression-free survival after radical prostatectomy was comparable for HPC (78%) and SPC (74%; P = 0.30). The 5-year overall survival was 85% (95%CI 81-89%) for HPC versus 80% (95%CI 78-82%) for SPC (P = 0.03). HPC has a favorable clinical phenotype but patients more often underwent radical treatment. The major limitation of HPC is the absence of a genetics-based definition of HPC, which may lead to over-diagnosis of PC in men with a family history of prostate cancer. The HPC definition should, therefore, be re-evaluated, aiming at a reduction of over-diagnosis and overtreatment among men with multiple relatives diagnosed with PC. Prostate 76:897-904, 2016. © 2016 The Authors. The Prostate published by Wiley Periodicals, Inc. © 2016 The Authors. The Prostate published by Wiley Periodicals, Inc.
Primary osseous hemangiopericytoma in the thoracic spine.
Ren, Ke; Zhou, Xing; Wu, SuJia; Sun, Xiaoliang
2014-01-01
Hemangiopericytoma (HPC) is a rare tumor of the central nervous system, most commonly found in the cranial cavity. HPCs in the spine are rare, and very few of them are primary osseous HPC. The aims of this study were to describe a rare case of primary osseous HPC in the thoracic spine and review the literature. A 54-year-old man presented with a 3-month history of back pain. Aneuro logical examination revealed no motor or sensory deficits. Magnetic resonance imaging (MRI) and computed tomography (CT) scan showed a tumor originating from the bone structure of the T10 vertebra with paravertebral extension, and chest CT revealed pulmonary metastases. A laminectomy, face-totomy,and subtotal resection of the tumor was performed with posterior pedicle screw system fixation followed by radiotherapy. The post-operative course was uneventful. His back pain was resolved completely after surgery. The patient survived with tumor during the 18-month follow-up period. Histopathology and immunohistologic findings were consistent with HPC. On immunohistochemistry, the tumor was positive for vimentin and CD34, partially positive for S-100, but negative for EMA, desmin, CD117, and CD1a. A literature review identified eight such cases reported between 1942 and 2013. As a conclusion, clinical manifestations of primary osseous spinal HPCs are different from intraspinal meningeal HPCs. Although showing certain variability, histopathology and immunohistochemical examinations are essential to establish the diagnosis. Surgical resection and radiotherapy are the treatment of choice. *These authors contributed equally to this work.
Pope, Bernard J; Fitch, Blake G; Pitman, Michael C; Rice, John J; Reumann, Matthias
2011-01-01
Future multiscale and multiphysics models must use the power of high performance computing (HPC) systems to enable research into human disease, translational medical science, and treatment. Previously we showed that computationally efficient multiscale models will require the use of sophisticated hybrid programming models, mixing distributed message passing processes (e.g. the message passing interface (MPI)) with multithreading (e.g. OpenMP, POSIX pthreads). The objective of this work is to compare the performance of such hybrid programming models when applied to the simulation of a lightweight multiscale cardiac model. Our results show that the hybrid models do not perform favourably when compared to an implementation using only MPI which is in contrast to our results using complex physiological models. Thus, with regards to lightweight multiscale cardiac models, the user may not need to increase programming complexity by using a hybrid programming approach. However, considering that model complexity will increase as well as the HPC system size in both node count and number of cores per node, it is still foreseeable that we will achieve faster than real time multiscale cardiac simulations on these systems using hybrid programming models.
Xi-cam: Flexible High Throughput Data Processing for GISAXS
NASA Astrophysics Data System (ADS)
Pandolfi, Ronald; Kumar, Dinesh; Venkatakrishnan, Singanallur; Sarje, Abinav; Krishnan, Hari; Pellouchoud, Lenson; Ren, Fang; Fournier, Amanda; Jiang, Zhang; Tassone, Christopher; Mehta, Apurva; Sethian, James; Hexemer, Alexander
With increasing capabilities and data demand for GISAXS beamlines, supporting software is under development to handle larger data rates, volumes, and processing needs. We aim to provide a flexible and extensible approach to GISAXS data treatment as a solution to these rising needs. Xi-cam is the CAMERA platform for data management, analysis, and visualization. The core of Xi-cam is an extensible plugin-based GUI platform which provides users an interactive interface to processing algorithms. Plugins are available for SAXS/GISAXS data and data series visualization, as well as forward modeling and simulation through HipGISAXS. With Xi-cam's advanced mode, data processing steps are designed as a graph-based workflow, which can be executed locally or remotely. Remote execution utilizes HPC or de-localized resources, allowing for effective reduction of high-throughput data. Xi-cam is open-source and cross-platform. The processing algorithms in Xi-cam include parallel cpu and gpu processing optimizations, also taking advantage of external processing packages such as pyFAI. Xi-cam is available for download online.
Operational flash flood forecasting platform based on grid technology
NASA Astrophysics Data System (ADS)
Thierion, V.; Ayral, P.-A.; Angelini, V.; Sauvagnargues-Lesage, S.; Nativi, S.; Payrastre, O.
2009-04-01
Flash flood events of south of France such as the 8th and 9th September 2002 in the Grand Delta territory caused important economic and human damages. Further to this catastrophic hydrological situation, a reform of flood warning services have been initiated (set in 2006). Thus, this political reform has transformed the 52 existing flood warning services (SAC) in 22 flood forecasting services (SPC), in assigning them territories more hydrological consistent and new effective hydrological forecasting mission. Furthermore, national central service (SCHAPI) has been created to ease this transformation and support local services in their new objectives. New functioning requirements have been identified: - SPC and SCHAPI carry the responsibility to clearly disseminate to public organisms, civil protection actors and population, crucial hydrologic information to better anticipate potential dramatic flood event, - a new effective hydrological forecasting mission to these flood forecasting services seems essential particularly for the flash floods phenomenon. Thus, models improvement and optimization was one of the most critical requirements. Initially dedicated to support forecaster in their monitoring mission, thanks to measuring stations and rainfall radar images analysis, hydrological models have to become more efficient in their capacity to anticipate hydrological situation. Understanding natural phenomenon occuring during flash floods mainly leads present hydrological research. Rather than trying to explain such complex processes, the presented research try to manage the well-known need of computational power and data storage capacities of these services. Since few years, Grid technology appears as a technological revolution in high performance computing (HPC) allowing large-scale resource sharing, computational power using and supporting collaboration across networks. Nowadays, EGEE (Enabling Grids for E-science in Europe) project represents the most important effort in term of grid technology development. This paper presents an operational flash flood forecasting platform which have been developed in the framework of CYCLOPS European project providing one of virtual organizations of EGEE project. This platform has been designed to enable multi-simulations processes to ease forecasting operations of several supervised watersheds on Grand Delta (SPC-GD) territory. Grid technology infrastructure, in providing multiple remote computing elements enables the processing of multiple rainfall scenarios, derived to the original meteorological forecasting transmitted by Meteo-France, and their respective hydrological simulations. First results show that from one forecasting scenario, this new presented approach can permit simulations of more than 200 different scenarios to support forecasters in their aforesaid mission and appears as an efficient hydrological decision-making tool. Although, this system seems operational, model validity has to be confirmed. So, further researches are necessary to improve models core to be more efficient in term of hydrological aspects. Finally, this platform could be an efficient tool for developing others modelling aspects as calibration or data assimilation in real time processing.
the one illustrated here, the outer membrane protein OprF of Pseudomonas aeruginosa in its -1990s, NWChem was designed to run on networked processors, as in an HPC system, using one-sided communication, says Jeff Hammond of Intel Corp.'s Parallel Computing Laboratory. In one-sided communication, a
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peles, Slaven
2016-11-06
GridKit is a software development kit for interfacing power systems and power grid application software with high performance computing (HPC) libraries developed at National Labs and academia. It is also intended as interoperability layer between different numerical libraries. GridKit is not a standalone application, but comes with a suite of test examples illustrating possible usage.
Mishima, Hiroyuki; Lidral, Andrew C; Ni, Jun
2008-05-28
Genetic association studies have been used to map disease-causing genes. A newly introduced statistical method, called exhaustive haplotype association study, analyzes genetic information consisting of different numbers and combinations of DNA sequence variations along a chromosome. Such studies involve a large number of statistical calculations and subsequently high computing power. It is possible to develop parallel algorithms and codes to perform the calculations on a high performance computing (HPC) system. However, most existing commonly-used statistic packages for genetic studies are non-parallel versions. Alternatively, one may use the cutting-edge technology of grid computing and its packages to conduct non-parallel genetic statistical packages on a centralized HPC system or distributed computing systems. In this paper, we report the utilization of a queuing scheduler built on the Grid Engine and run on a Rocks Linux cluster for our genetic statistical studies. Analysis of both consecutive and combinational window haplotypes was conducted by the FBAT (Laird et al., 2000) and Unphased (Dudbridge, 2003) programs. The dataset consisted of 26 loci from 277 extended families (1484 persons). Using the Rocks Linux cluster with 22 compute-nodes, FBAT jobs performed about 14.4-15.9 times faster, while Unphased jobs performed 1.1-18.6 times faster compared to the accumulated computation duration. Execution of exhaustive haplotype analysis using non-parallel software packages on a Linux-based system is an effective and efficient approach in terms of cost and performance.
Mishima, Hiroyuki; Lidral, Andrew C; Ni, Jun
2008-01-01
Background Genetic association studies have been used to map disease-causing genes. A newly introduced statistical method, called exhaustive haplotype association study, analyzes genetic information consisting of different numbers and combinations of DNA sequence variations along a chromosome. Such studies involve a large number of statistical calculations and subsequently high computing power. It is possible to develop parallel algorithms and codes to perform the calculations on a high performance computing (HPC) system. However, most existing commonly-used statistic packages for genetic studies are non-parallel versions. Alternatively, one may use the cutting-edge technology of grid computing and its packages to conduct non-parallel genetic statistical packages on a centralized HPC system or distributed computing systems. In this paper, we report the utilization of a queuing scheduler built on the Grid Engine and run on a Rocks Linux cluster for our genetic statistical studies. Results Analysis of both consecutive and combinational window haplotypes was conducted by the FBAT (Laird et al., 2000) and Unphased (Dudbridge, 2003) programs. The dataset consisted of 26 loci from 277 extended families (1484 persons). Using the Rocks Linux cluster with 22 compute-nodes, FBAT jobs performed about 14.4–15.9 times faster, while Unphased jobs performed 1.1–18.6 times faster compared to the accumulated computation duration. Conclusion Execution of exhaustive haplotype analysis using non-parallel software packages on a Linux-based system is an effective and efficient approach in terms of cost and performance. PMID:18541045
Frequency-specific hippocampal-prefrontal interactions during associative learning
Brincat, Scott L.; Miller, Earl K.
2015-01-01
Much of our knowledge of the world depends on learning associations (e.g., face-name), for which the hippocampus (HPC) and prefrontal cortex (PFC) are critical. HPC-PFC interactions have rarely been studied in monkeys, whose cognitive/mnemonic abilities are akin to humans. Here, we show functional differences and frequency-specific interactions between HPC and PFC of monkeys learning object-pair associations, an animal model of human explicit memory. PFC spiking activity reflected learning in parallel with behavioral performance, while HPC neurons reflected feedback about whether trial-and-error guesses were correct or incorrect. Theta-band HPC-PFC synchrony was stronger after errors, was driven primarily by PFC to HPC directional influences, and decreased with learning. In contrast, alpha/beta-band synchrony was stronger after correct trials, was driven more by HPC, and increased with learning. Rapid object associative learning may occur in PFC, while HPC may guide neocortical plasticity by signaling success or failure via oscillatory synchrony in different frequency bands. PMID:25706471
1995-01-01
possible to determine communication points. For this version, a C program spawning Posix threads and using semaphores to synchronize would have to...performance such as the time required for network communication and synchronization as well as issues of asynchrony and memory hierarchy. For example...enhances reusability. Process (or task) parallel computations can also be succinctly expressed with a small set of process creation and synchronization
High Performance Computing Operations Review Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cupps, Kimberly C.
2013-12-19
The High Performance Computing Operations Review (HPCOR) meeting—requested by the ASC and ASCR program headquarters at DOE—was held November 5 and 6, 2013, at the Marriott Hotel in San Francisco, CA. The purpose of the review was to discuss the processes and practices for HPC integration and its related software and facilities. Experiences and lessons learned from the most recent systems deployed were covered in order to benefit the deployment of new systems.
Implementation of NAS Parallel Benchmarks in Java
NASA Technical Reports Server (NTRS)
Frumkin, Michael; Schultz, Matthew; Jin, Hao-Qiang; Yan, Jerry
2000-01-01
A number of features make Java an attractive but a debatable choice for High Performance Computing (HPC). In order to gauge the applicability of Java to the Computational Fluid Dynamics (CFD) we have implemented NAS Parallel Benchmarks in Java. The performance and scalability of the benchmarks point out the areas where improvement in Java compiler technology and in Java thread implementation would move Java closer to Fortran in the competition for CFD applications.
Austin, from 2001 to 2007. There he was principal in HPC applications and user support, as well as in research and development in large-scale scientific applications and different HPC systems and technologies Interests HPC applications performance and optimizations|HPC systems and accelerator technologies|Scientific
Toward an optimal online checkpoint solution under a two-level HPC checkpoint model
Di, Sheng; Robert, Yves; Vivien, Frederic; ...
2016-03-29
The traditional single-level checkpointing method suffers from significant overhead on large-scale platforms. Hence, multilevel checkpointing protocols have been studied extensively in recent years. The multilevel checkpoint approach allows different levels of checkpoints to be set (each with different checkpoint overheads and recovery abilities), in order to further improve the fault tolerance performance of extreme-scale HPC applications. How to optimize the checkpoint intervals for each level, however, is an extremely difficult problem. In this paper, we construct an easy-to-use two-level checkpoint model. Checkpoint level 1 deals with errors with low checkpoint/recovery overheads such as transient memory errors, while checkpoint level 2more » deals with hardware crashes such as node failures. Compared with previous optimization work, our new optimal checkpoint solution offers two improvements: (1) it is an online solution without requiring knowledge of the job length in advance, and (2) it shows that periodic patterns are optimal and determines the best pattern. We evaluate the proposed solution and compare it with the most up-to-date related approaches on an extreme-scale simulation testbed constructed based on a real HPC application execution. Simulation results show that our proposed solution outperforms other optimized solutions and can improve the performance significantly in some cases. Specifically, with the new solution the wall-clock time can be reduced by up to 25.3% over that of other state-of-the-art approaches. Lastly, a brute-force comparison with all possible patterns shows that our solution is always within 1% of the best pattern in the experiments.« less
Bu, Xiangning; Zhang, Nan; Yang, Xuan; Liu, Yanyan; Du, Jianli; Liang, Jing; Xu, Qunyuan; Li, Junfa
2011-04-01
Hypoxic preconditioning (HPC) initiates intracellular signaling pathway to provide protection against subsequent cerebral ischemic injuries, and its mechanism may provide molecular targets for therapy in stroke. According to our study of conventional protein kinase C βII (cPKCβII) activation in HPC, the role of cPKCβII in HPC-induced neuroprotection and its interacting proteins were determined in this study. The autohypoxia-induced HPC and middle cerebral artery occlusion (MCAO)-induced cerebral ischemia mouse models were prepared as reported. We found that HPC reduced 6 h MCAO-induced neurological deficits, infarct volume, edema ratio and cell apoptosis in peri-infarct region (penumbra), but cPKCβII inhibitors Go6983 and LY333531 blocked HPC-induced neuroprotection. Proteomic analysis revealed that the expression of four proteins in cytosol and eight proteins in particulate fraction changed significantly among 49 identified cPKCβII-interacting proteins in cortex of HPC mice. In addition, HPC could inhibit the decrease of phosphorylated collapsin response mediator protein-2 (CRMP-2) level and increase of CRMP-2 breakdown product. TAT-CRMP-2 peptide, which prevents the cleavage of endogenous CRMP-2, could inhibit CRMP-2 dephosphorylation and proteolysis as well as the infarct volume of 6 h MCAO mice. This study is the first to report multiple cPKCβII-interacting proteins in HPC mouse brain and the role of cPKCβII-CRMP-2 in HPC-induced neuroprotection against early stages of ischemic injuries in mice. © 2011 The Authors. Journal of Neurochemistry © 2011 International Society for Neurochemistry.
Large Scale GW Calculations on the Cori System
NASA Astrophysics Data System (ADS)
Deslippe, Jack; Del Ben, Mauro; da Jornada, Felipe; Canning, Andrew; Louie, Steven
The NERSC Cori system, powered by 9000+ Intel Xeon-Phi processors, represents one of the largest HPC systems for open-science in the United States and the world. We discuss the optimization of the GW methodology for this system, including both node level and system-scale optimizations. We highlight multiple large scale (thousands of atoms) case studies and discuss both absolute application performance and comparison to calculations on more traditional HPC architectures. We find that the GW method is particularly well suited for many-core architectures due to the ability to exploit a large amount of parallelism across many layers of the system. This work was supported by the U.S. Department of Energy, Office of Science, Basic Energy Sciences, Materials Sciences and Engineering Division, as part of the Computational Materials Sciences Program.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-09-27
... plated nuts attaching the HPC stage 3 to 8 drum to the HPC stage 9 to 12 drum, removal of silver residue... plated nuts attaching the HPC stage 3 to 8 drum to the HPC stage 9 to 12 drum, removal of silver residue... AD, removal from service of the fully silver plated nuts attaching the HPC stage 3 to 8 drum to the...
Using Rollback Avoidance to Mitigate Failures in Next-Generation Extreme-Scale Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Levy, Scott N.
2016-05-01
High-performance computing (HPC) systems enable scientists to numerically model complex phenomena in many important physical systems. The next major milestone in the development of HPC systems is the construction of the rst supercomputer capable executing more than an exa op, 10 18 oating point operations per second. On systems of this scale, failures will occur much more frequently than on current systems. As a result, resilience is a key obstacle to building next-generation extremescale systems. Coordinated checkpointing is currently the most widely-used mechanism for handling failures on HPC systems. Although coordinated checkpointing remains e ective on current systems, increasing themore » scale of today's systems to build next-generation systems will increase the cost of fault tolerance as more and more time is taken away from the application to protect against or recover from failure. Rollback avoidance techniques seek to mitigate the cost of checkpoint/restart by allowing an application to continue its execution rather than rolling back to an earlier checkpoint when failures occur. These techniqes include failure prediction and preventive migration, replicated computation, fault-tolerant algorithms, and softwarebased memory fault correction. In this thesis, we examine how rollback avoidance techniques can be used to address failures on extreme-scale systems. Using a combination of analytic modeling and simulation, we evaluate the potential impact of rollback avoidance on these systems. We then present a novel rollback avoidance technique that exploits similarities in application memory. Finally, we examine the feasibility of using this technique to protect against memory faults in kernel memory.« less
Lee, Justin Q; Sutherland, Robert J; McDonald, Robert J
2017-09-01
There is a substantial body of evidence that the hippocampus (HPC) plays and essential role in context discrimination in rodents. Studies reporting anterograde amnesia (AA) used repeated, alternating, distributed conditioning and extinction sessions to measure context fear discrimination. In addition, there is uncertainty about the extent of damage to the HPC. Here, we induced conditioned fear prior to discrimination tests and rats sustained extensive, quantified pre- or post-training HPC damage. Unlike previous work, we found that extensive HPC damage spares context discrimination, we observed no AA. There must be a non-HPC system that can acquire long-term memories that support context fear discrimination. Post-training HPC damage caused retrograde amnesia (RA) for context discrimination, even when rats are fear conditioned for multiple sessions. We discuss the implications of these findings for understanding the role of HPC in long-term memory. © 2017 Wiley Periodicals, Inc.
Self-desiccation mechanism of high-performance concrete.
Yang, Quan-Bing; Zhang, Shu-Qing
2004-12-01
Investigations on the effects of W/C ratio and silica fume on the autogenous shrinkage and internal relative humidity of high performance concrete (HPC), and analysis of the self-desiccation mechanisms of HPC showed that the autogenous shrinkage and internal relative humidity of HPC increases and decreases with the reduction of W/C respectively; and that these phenomena were amplified by the addition of silica fume. Theoretical analyses indicated that the reduction of RH in HPC was not due to shortage of water, but due to the fact that the evaporable water in HPC was not evaporated freely. The reduction of internal relative humidity or the so-called self-desiccation of HPC was chiefly caused by the increase in mole concentration of soluble ions in HPC and the reduction of pore size or the increase in the fraction of micro-pore water in the total evaporable water (T(r)/T(te) ratio).
NASA Center for Climate Simulation (NCCS) Advanced Technology AT5 Virtualized Infiniband Report
NASA Technical Reports Server (NTRS)
Thompson, John H.; Bledsoe, Benjamin C.; Wagner, Mark; Shakshober, John; Fromkin, Russ
2013-01-01
The NCCS is part of the Computational and Information Sciences and Technology Office (CISTO) of Goddard Space Flight Center's (GSFC) Sciences and Exploration Directorate. The NCCS's mission is to enable scientists to increase their understanding of the Earth, the solar system, and the universe by supplying state-of-the-art high performance computing (HPC) solutions. To accomplish this mission, the NCCS (https://www.nccs.nasa.gov) provides high performance compute engines, mass storage, and network solutions to meet the specialized needs of the Earth and space science user communities
The novel high-performance 3-D MT inverse solver
NASA Astrophysics Data System (ADS)
Kruglyakov, Mikhail; Geraskin, Alexey; Kuvshinov, Alexey
2016-04-01
We present novel, robust, scalable, and fast 3-D magnetotelluric (MT) inverse solver. The solver is written in multi-language paradigm to make it as efficient, readable and maintainable as possible. Separation of concerns and single responsibility concepts go through implementation of the solver. As a forward modelling engine a modern scalable solver extrEMe, based on contracting integral equation approach, is used. Iterative gradient-type (quasi-Newton) optimization scheme is invoked to search for (regularized) inverse problem solution, and adjoint source approach is used to calculate efficiently the gradient of the misfit. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT responses, and supports massive parallelization. Moreover, different parallelization strategies implemented in the code allow optimal usage of available computational resources for a given problem statement. To parameterize an inverse domain the so-called mask parameterization is implemented, which means that one can merge any subset of forward modelling cells in order to account for (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments carried out at different platforms ranging from modern laptops to HPC Piz Daint (6th supercomputer in the world) demonstrate practically linear scalability of the code up to thousands of nodes.
Exposure to high ambient temperatures alters embryology in rabbits
NASA Astrophysics Data System (ADS)
García, M. L.; Argente, M. J.
2017-09-01
High ambient temperatures are a determining factor in the deterioration of embryo quality and survival in mammals. The aim of this study was to evaluate the effect of heat stress on embryo development, embryonic size and size of the embryonic coats in rabbits. A total of 310 embryos from 33 females in thermal comfort zone and 264 embryos of 28 females in heat stress conditions were used in the experiment. The traits studied were ovulation rate, percentage of total embryos, percentage of normal embryos, embryo area, zona pellucida thickness and mucin coat thickness. Traits were measured at 24 and 48 h post-coitum (hpc); mucin coat thickness was only measured at 48 hpc. The embryos were classified as zygotes or two-cell embryos at 24 hpc, and 16-cells or early morulae at 48 hpc. The ovulation rate was one oocyte lower in heat stress conditions than in thermal comfort. Percentage of normal embryos was lower in heat stress conditions at 24 hpc (17.2%) and 48 hpc (13.2%). No differences in percentage of zygotes or two-cell embryos were found at 24 hpc. The embryo development and area was affected by heat stress at 48 hpc (10% higher percentage of 16-cells and 883 μm2 smaller, respectively). Zona pellucida was thicker under thermal stress at 24 hpc (1.2 μm) and 48 hpc (1.5 μm). No differences in mucin coat thickness were found. In conclusion, heat stress appears to alter embryology in rabbits.
Data Retention Policy | High-Performance Computing | NREL
HPC Data Retention Policy. File storage areas on Peregrine and Gyrfalcon are either user-centric to reclaim storage. We can make special arrangements for permanent storage, if needed. User-Centric > is 3 months after the last project ends. During this retention period, the user may log in to
Peregrine Software Toolchains | High-Performance Computing | NREL
toolchain is an open-source alternative against which many technical applications are natively developed and tested. The Portland Group compilers are not fully supported, but are available to the HPC community. Use Group (PGI) C/C++ and Fortran (partially supported) The PGI Accelerator compilers include NVIDIA GPU
Accelerating Science with the NERSC Burst Buffer Early User Program
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bhimji, Wahid; Bard, Debbie; Romanus, Melissa
NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has a diverse user base comprised of over 6500 users in 700 different projects spanning a wide variety of scientific computing applications. The use-cases of the Burst Buffer at NERSC are therefore also considerable and diverse. We describe here performance measurements and lessons learned from the Burstmore » Buffer Early User Program at NERSC, which selected a number of research projects to gain early access to the Burst Buffer and exercise its capability to enable new scientific advancements. To the best of our knowledge this is the first time a Burst Buffer has been stressed at scale by diverse, real user workloads and therefore these lessons will be of considerable benefit to shaping the developing use of Burst Buffers at HPC centers.« less
Fine grained event processing on HPCs with the ATLAS Yoda system
NASA Astrophysics Data System (ADS)
Calafiura, Paolo; De, Kaushik; Guan, Wen; Maeno, Tadashi; Nilsson, Paul; Oleynik, Danila; Panitkin, Sergey; Tsulaia, Vakhtang; Van Gemmeren, Peter; Wenaus, Torre
2015-12-01
High performance computing facilities present unique challenges and opportunities for HEP event processing. The massive scale of many HPC systems means that fractionally small utilization can yield large returns in processing throughput. Parallel applications which can dynamically and efficiently fill any scheduling opportunities the resource presents benefit both the facility (maximal utilization) and the (compute-limited) science. The ATLAS Yoda system provides this capability to HEP-like event processing applications by implementing event-level processing in an MPI-based master-client model that integrates seamlessly with the more broadly scoped ATLAS Event Service. Fine grained, event level work assignments are intelligently dispatched to parallel workers to sustain full utilization on all cores, with outputs streamed off to destination object stores in near real time with similarly fine granularity, such that processing can proceed until termination with full utilization. The system offers the efficiency and scheduling flexibility of preemption without requiring the application actually support or employ check-pointing. We will present the new Yoda system, its motivations, architecture, implementation, and applications in ATLAS data processing at several US HPC centers.
NASA Astrophysics Data System (ADS)
Gutzwiller, David; Gontier, Mathieu; Demeulenaere, Alain
2014-11-01
Multi-Block structured solvers hold many advantages over their unstructured counterparts, such as a smaller memory footprint and efficient serial performance. Historically, multi-block structured solvers have not been easily adapted for use in a High Performance Computing (HPC) environment, and the recent trend towards hybrid GPU/CPU architectures has further complicated the situation. This paper will elaborate on developments and innovations applied to the NUMECA FINE/Turbo solver that have allowed near-linear scalability with real-world problems on over 250 hybrid GPU/GPU cluster nodes. Discussion will focus on the implementation of virtual partitioning and load balancing algorithms using a novel meta-block concept. This implementation is transparent to the user, allowing all pre- and post-processing steps to be performed using a simple, unpartitioned grid topology. Additional discussion will elaborate on developments that have improved parallel performance, including fully parallel I/O with the ADIOS API and the GPU porting of the computationally heavy CPUBooster convergence acceleration module. Head of HPC and Release Management, Numeca International.
2015-09-01
the network Mac8 Medium Access Control ( Mac ) (Ethernet) address observed as destination for outgoing packets subsessionid8 Zero-based index of...15. SUBJECT TERMS tactical networks, data reduction, high-performance computing, data analysis, big data 16. SECURITY CLASSIFICATION OF: 17...Integer index of row cts_deid Device (instrument) Identifier where observation took place cts_collpt Collection point or logical observation point on
Diversity in computing technologies and strategies for dynamic resource allocation
Garzoglio, G.; Gutsche, O.
2015-12-23
Here, High Energy Physics (HEP) is a very data intensive and trivially parallelizable science discipline. HEP is probing nature at increasingly finer details requiring ever increasing computational resources to process and analyze experimental data. In this paper, we discuss how HEP provisioned resources so far using Grid technologies, how HEP is starting to include new resource providers like commercial Clouds and HPC installations, and how HEP is transparently provisioning resources at these diverse providers.
2013-01-01
M. Ahmadi, and M. Shridhar, “ Handwritten Numeral Recognition with Multiple Features and Multistage Classifiers,” Proc. IEEE Int’l Symp. Circuits...ARTICLE (Post Print) 3. DATES COVERED (From - To) SEP 2011 – SEP 2013 4. TITLE AND SUBTITLE A PARALLEL NEUROMORPHIC TEXT RECOGNITION SYSTEM AND ITS...research in computational intelligence has entered a new era. In this paper, we present an HPC-based context-aware intelligent text recognition
Research of aerohydrodynamic and aeroelastic processes on PNRPU HPC system
NASA Astrophysics Data System (ADS)
Modorskii, V. Ya.; Shevelev, N. A.
2016-10-01
Research of aerohydrodynamic and aeroelastic processes with the High Performance Computing Complex in PNIPU is actively conducted within the university priority development direction "Aviation engine and gas turbine technology". Work is carried out in two areas: development and use of domestic software and use of well-known foreign licensed applied software packets. In addition, the third direction associated with the verification of computational experiments - physical modeling, with unique proprietary experimental installations is being developed.
Programming for 1.6 Millon cores: Early experiences with IBM's BG/Q SMP architecture
NASA Astrophysics Data System (ADS)
Glosli, James
2013-03-01
With the stall in clock cycle improvements a decade ago, the drive for computational performance has continues along a path of increasing core counts on a processor. The multi-core evolution has been expressed in both a symmetric multi processor (SMP) architecture and cpu/GPU architecture. Debates rage in the high performance computing (HPC) community which architecture best serves HPC. In this talk I will not attempt to resolve that debate but perhaps fuel it. I will discuss the experience of exploiting Sequoia, a 98304 node IBM Blue Gene/Q SMP at Lawrence Livermore National Laboratory. The advantages and challenges of leveraging the computational power BG/Q will be detailed through the discussion of two applications. The first application is a Molecular Dynamics code called ddcMD. This is a code developed over the last decade at LLNL and ported to BG/Q. The second application is a cardiac modeling code called Cardioid. This is a code that was recently designed and developed at LLNL to exploit the fine scale parallelism of BG/Q's SMP architecture. Through the lenses of these efforts I'll illustrate the need to rethink how we express and implement our computational approaches. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Evaluating Application Resilience with XRay
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Sui; Bronevetsky, Greg; Li, Bin
2015-05-07
The rising count and shrinking feature size of transistors within modern computers is making them increasingly vulnerable to various types of soft faults. This problem is especially acute in high-performance computing (HPC) systems used for scientific computing, because these systems include many thousands of compute cores and nodes, all of which may be utilized in a single large-scale run. The increasing vulnerability of HPC applications to errors induced by soft faults is motivating extensive work on techniques to make these applications more resiilent to such faults, ranging from generic techniques such as replication or checkpoint/restart to algorithmspecific error detection andmore » tolerance techniques. Effective use of such techniques requires a detailed understanding of how a given application is affected by soft faults to ensure that (i) efforts to improve application resilience are spent in the code regions most vulnerable to faults and (ii) the appropriate resilience technique is applied to each code region. This paper presents XRay, a tool to view the application vulnerability to soft errors, and illustrates how XRay can be used in the context of a representative application. In addition to providing actionable insights into application behavior XRay automatically selects the number of fault injection experiments required to provide an informative view of application behavior, ensuring that the information is statistically well-grounded without performing unnecessary experiments.« less
Drug repurposing: translational pharmacology, chemistry, computers and the clinic.
Issa, Naiem T; Byers, Stephen W; Dakshanamurthy, Sivanesan
2013-01-01
The process of discovering a pharmacological compound that elicits a desired clinical effect with minimal side effects is a challenge. Prior to the advent of high-performance computing and large-scale screening technologies, drug discovery was largely a serendipitous endeavor, as in the case of thalidomide for erythema nodosum leprosum or cancer drugs in general derived from flora located in far-reaching geographic locations. More recently, de novo drug discovery has become a more rationalized process where drug-target-effect hypotheses are formulated on the basis of already known compounds/protein targets and their structures. Although this approach is hypothesis-driven, the actual success has been very low, contributing to the soaring costs of research and development as well as the diminished pharmaceutical pipeline in the United States. In this review, we discuss the evolution in computational pharmacology as the next generation of successful drug discovery and implementation in the clinic where high-performance computing (HPC) is used to generate and validate drug-target-effect hypotheses completely in silico. The use of HPC would decrease development time and errors while increasing productivity prior to in vitro, animal and human testing. We highlight approaches in chemoinformatics, bioinformatics as well as network biopharmacology to illustrate potential avenues from which to design clinically efficacious drugs. We further discuss the implications of combining these approaches into an integrative methodology for high-accuracy computational predictions within the context of drug repositioning for the efficient streamlining of currently approved drugs back into clinical trials for possible new indications.
Sacro-anterior haemangiopericytoma: a case report
Ge, Xiu-Hong; Liu, Shuai-Shuai; Shan, Hu-Sheng; Wang, Zhi-Min; Li, Qian-Wen
2014-01-01
Haemangiopericytoma (HPC) is a rare vascular tumor with borderline malignancy, considerable histological variability, and unpredictable clinical and biological behavior. HPC can present a diagnostic challenge because of its indeterminate clinical, radiological, and pathological features. HPC generally presents in adulthood and is equally frequent in both sexes. HPC can arise in any site in the body as a slowly growing and painless mass. The precise cell type origin of HPC is uncertain. One third of HPCs occur in the head and neck areas. Exceptional cases of hemangioblastoma arising outside the head and neck areas have been reported, but little is known about their clinicopathologic and immunohistochemical features. This study reports on a case of a large sacro-anterior HPC in a 65-year-old male. PMID:25009757
Parallel tools GUI framework-DOE SBIR phase I final technical report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Galarowicz, James
2013-12-05
Many parallel performance, profiling, and debugging tools require a graphical way of displaying the very large datasets typically gathered from high performance computing (HPC) applications. Most tool projects create their graphical user interfaces (GUI) from scratch, many times spending their project resources on simply redeveloping commonly used infrastructure. Our goal was to create a multiplatform GUI framework, based on Nokia/Digia’s popular Qt libraries, which will specifically address the needs of these parallel tools. The Parallel Tools GUI Framework (PTGF) uses a plugin architecture facilitating rapid GUI development and reduced development costs for new and existing tool projects by allowing themore » reuse of many common GUI elements, called “widgets.” Widgets created include, 2D data visualizations, a source code viewer with syntax highlighting, and integrated help and welcome screens. Application programming interface (API) design was focused on minimizing the time to getting a functional tool working. Having a standard, unified, and userfriendly interface which operates on multiple platforms will benefit HPC application developers by reducing training time and allowing users to move between tools rapidly during a single session. However, Argo Navis Technologies LLC will not be submitting a DOE SBIR Phase II proposal and commercialization plan for the PTGF project. Our preliminary estimates for gross income over the next several years was based upon initial customer interest and income generated by similar projects. Unfortunately, as we further assessed the market during Phase I, we grew to realize that there was not enough demand to warrant such a large investment. While we do find that the project is worth our continued investment of time and money, we do not think it worthy of the DOE's investment at this time. We are grateful that the DOE has afforded us the opportunity to make this assessment, and come to this conclusion.« less
An Overview of the Computational Physics and Methods Group at Los Alamos National Laboratory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Randal Scott
CCS Division was formed to strengthen the visibility and impact of computer science and computational physics research on strategic directions for the Laboratory. Both computer science and computational science are now central to scientific discovery and innovation. They have become indispensable tools for all other scientific missions at the Laboratory. CCS Division forms a bridge between external partners and Laboratory programs, bringing new ideas and technologies to bear on today’s important problems and attracting high-quality technical staff members to the Laboratory. The Computational Physics and Methods Group CCS-2 conducts methods research and develops scientific software aimed at the latest andmore » emerging HPC systems.« less
Li, Xinwei; Li, Qiongling; Wang, Xuetong; Li, Deyu; Li, Shuyu
2018-01-01
The hippocampus plays an important role in memory function relying on information interaction between distributed brain areas. The hippocampus can be divided into the anterior and posterior sections with different structure and function along its long axis. The aim of this study is to investigate the effects of normal aging on the structural covariance of the anterior hippocampus (aHPC) and the posterior hippocampus (pHPC). In this study, 240 healthy subjects aged 18-89 years were selected and subdivided into young (18-23 years), middle-aged (30-58 years), and older (61-89 years) groups. The aHPC and pHPC was divided based on the location of uncal apex in the MNI space. Then, the structural covariance networks were constructed by examining their covariance in gray matter volumes with other brain regions. Finally, the influence of age on the structural covariance of these hippocampal sections was explored. We found that the aHPC and pHPC had different structural covariance patterns, but both of them were associated with the medial temporal lobe and insula. Moreover, both increased and decreased covariances were found with the aHPC but only increased covariance was found with the pHPC with age ( p < 0.05, family-wise error corrected). These decreased connections occurred within the default mode network, while the increased connectivity mainly occurred in other memory systems that differ from the hippocampus. This study reveals different age-related influence on the structural networks of the aHPC and pHPC, providing an essential insight into the mechanisms of the hippocampus in normal aging.
Sarode, Ashish; Wang, Peng; Cote, Catherine; Worthen, David R
2013-03-01
Hydroxypropylcellulose (HPC)-SL and -SSL, low-viscosity hydroxypropylcellulose polymers, are versatile pharmaceutical excipients. The utility of HPC polymers was assessed for both dissolution enhancement and sustained release of pharmaceutical drugs using various processing techniques. The BCS class II drugs carbamazepine (CBZ), hydrochlorthiazide, and phenytoin (PHT) were hot melt mixed (HMM) with various polymers. PHT formulations produced by solvent evaporation (SE) and ball milling (BM) were prepared using HPC-SSL. HMM formulations of BCS class I chlorpheniramine maleate (CPM) were prepared using HPC-SL and -SSL. These solid dispersions (SDs) manufactured using different processes were evaluated for amorphous transformation and dissolution characteristics. Drug degradation because of HMM processing was also assessed. Amorphous conversion using HMM could be achieved only for relatively low-melting CBZ and CPM. SE and BM did not produce amorphous SDs of PHT using HPC-SSL. Chemical stability of all the drugs was maintained using HPC during the HMM process. Dissolution enhancement was observed in HPC-based HMMs and compared well to other polymers. The dissolution enhancement of PHT was in the order of SE>BM>HMM>physical mixtures, as compared to the pure drug, perhaps due to more intimate mixing that occurred during SE and BM than in HMM. Dissolution of CPM could be significantly sustained in simulated gastric and intestinal fluids using HPC polymers. These studies revealed that low-viscosity HPC-SL and -SSL can be employed to produce chemically stable SDs of poorly as well as highly water-soluble drugs using various pharmaceutical processes in order to control drug dissolution.
Directional hippocampal-prefrontal interactions during working memory.
Liu, Tiaotiao; Bai, Wenwen; Xia, Mi; Tian, Xin
2018-02-15
Working memory refers to a system that is essential for performing complex cognitive tasks such as reasoning, comprehension and learning. Evidence shows that hippocampus (HPC) and prefrontal cortex (PFC) play important roles in working memory. The HPC-PFC interaction via theta-band oscillatory synchronization is critical for successful execution of working memory. However, whether one brain region is leading or lagging relative to another is still unclear. Therefore, in the present study, we simultaneously recorded local field potentials (LFPs) from rat ventral hippocampus (vHPC) and medial prefrontal cortex (mPFC) and while the rats performed a Y-maze working memory task. We then applied instantaneous amplitudes cross-correlation method to calculate the time lag between PFC and vHPC to explore the functional dynamics of the HPC-PFC interaction. Our results showed a strong lead from vHPC to mPFC preceded an animal's correct choice during the working memory task. These findings suggest the vHPC-leading interaction contributes to the successful execution of working memory. Copyright © 2017. Published by Elsevier B.V.
Gut vagal sensory signaling regulates hippocampus function through multi-order pathways.
Suarez, Andrea N; Hsu, Ted M; Liu, Clarissa M; Noble, Emily E; Cortella, Alyssa M; Nakamoto, Emily M; Hahn, Joel D; de Lartigue, Guillaume; Kanoski, Scott E
2018-06-05
The vagus nerve is the primary means of neural communication between the gastrointestinal (GI) tract and the brain. Vagally mediated GI signals activate the hippocampus (HPC), a brain region classically linked with memory function. However, the endogenous relevance of GI-derived vagal HPC communication is unknown. Here we utilize a saporin (SAP)-based lesioning procedure to reveal that selective GI vagal sensory/afferent ablation in rats impairs HPC-dependent episodic and spatial memory, effects associated with reduced HPC neurotrophic and neurogenesis markers. To determine the neural pathways connecting the gut to the HPC, we utilize monosynaptic and multisynaptic virus-based tracing methods to identify the medial septum as a relay connecting the medial nucleus tractus solitarius (where GI vagal afferents synapse) to dorsal HPC glutamatergic neurons. We conclude that endogenous GI-derived vagal sensory signaling promotes HPC-dependent memory function via a multi-order brainstem-septal pathway, thereby identifying a previously unknown role for the gut-brain axis in memory control.
[Biological behavior of hypopharyngeal carcinoma].
Zhou, L X
1997-01-01
Hypopharyngeal squamous cell carcinomas (HPC) has an extremely poor prognosis. Characteristics of cell lines of head and neck squamous cell carcinomas including HPC were studied by various methods, e.g., chemosensitivity test and the immunohistochemistry staining method, to determine whether this poor prognosis is due to the biological behavior of this cancer. An HPC cell line was found to be resistant to anti tumor drugs, i.e., PEP, MTX and CPM and moderately sensitive to CDDP, 5-FU and ADM. Thermoresistance to hyperthermatic treatment and weak expression of ICAM-1 on the HPC cell line were observed. DNA synthesis by the HPC cell line was induced by stimulation with a low concentration of EGF and the amount of EGFR on these HPC cells was very high. In addition, cyclinD1 overexpression was found in the HPC cell line. Based on the above findings, further analysis of hypopharyngeal carcinoma cells and the development of a new treatment modality to control tumor growth and metastatic factors influencing the poor outcome are necessary to improve the prognosis of this cancer.
Levy, Scott; Ferreira, Kurt B.; Bridges, Patrick G.; ...
2014-12-09
Building the next-generation of extreme-scale distributed systems will require overcoming several challenges related to system resilience. As the number of processors in these systems grow, the failure rate increases proportionally. One of the most common sources of failure in large-scale systems is memory. In this paper, we propose a novel runtime for transparently exploiting memory content similarity to improve system resilience by reducing the rate at which memory errors lead to node failure. We evaluate the viability of this approach by examining memory snapshots collected from eight high-performance computing (HPC) applications and two important HPC operating systems. Based on themore » characteristics of the similarity uncovered, we conclude that our proposed approach shows promise for addressing system resilience in large-scale systems.« less
The OSG Open Facility: an on-ramp for opportunistic scientific computing
NASA Astrophysics Data System (ADS)
Jayatilaka, B.; Levshina, T.; Sehgal, C.; Gardner, R.; Rynge, M.; Würthwein, F.
2017-10-01
The Open Science Grid (OSG) is a large, robust computing grid that started primarily as a collection of sites associated with large HEP experiments such as ATLAS, CDF, CMS, and DZero, but has evolved in recent years to a much larger user and resource platform. In addition to meeting the US LHC community’s computational needs, the OSG continues to be one of the largest providers of distributed high-throughput computing (DHTC) to researchers from a wide variety of disciplines via the OSG Open Facility. The Open Facility consists of OSG resources that are available opportunistically to users other than resource owners and their collaborators. In the past two years, the Open Facility has doubled its annual throughput to over 200 million wall hours. More than half of these resources are used by over 100 individual researchers from over 60 institutions in fields such as biology, medicine, math, economics, and many others. Over 10% of these individual users utilized in excess of 1 million computational hours each in the past year. The largest source of these cycles is temporary unused capacity at institutions affiliated with US LHC computational sites. An increasing fraction, however, comes from university HPC clusters and large national infrastructure supercomputers offering unused capacity. Such expansions have allowed the OSG to provide ample computational resources to both individual researchers and small groups as well as sizable international science collaborations such as LIGO, AMS, IceCube, and sPHENIX. Opening up access to the Fermilab FabrIc for Frontier Experiments (FIFE) project has also allowed experiments such as mu2e and NOvA to make substantial use of Open Facility resources, the former with over 40 million wall hours in a year. We present how this expansion was accomplished as well as future plans for keeping the OSG Open Facility at the forefront of enabling scientific research by way of DHTC.
The OSG Open Facility: An On-Ramp for Opportunistic Scientific Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jayatilaka, B.; Levshina, T.; Sehgal, C.
The Open Science Grid (OSG) is a large, robust computing grid that started primarily as a collection of sites associated with large HEP experiments such as ATLAS, CDF, CMS, and DZero, but has evolved in recent years to a much larger user and resource platform. In addition to meeting the US LHC community’s computational needs, the OSG continues to be one of the largest providers of distributed high-throughput computing (DHTC) to researchers from a wide variety of disciplines via the OSG Open Facility. The Open Facility consists of OSG resources that are available opportunistically to users other than resource ownersmore » and their collaborators. In the past two years, the Open Facility has doubled its annual throughput to over 200 million wall hours. More than half of these resources are used by over 100 individual researchers from over 60 institutions in fields such as biology, medicine, math, economics, and many others. Over 10% of these individual users utilized in excess of 1 million computational hours each in the past year. The largest source of these cycles is temporary unused capacity at institutions affiliated with US LHC computational sites. An increasing fraction, however, comes from university HPC clusters and large national infrastructure supercomputers offering unused capacity. Such expansions have allowed the OSG to provide ample computational resources to both individual researchers and small groups as well as sizable international science collaborations such as LIGO, AMS, IceCube, and sPHENIX. Opening up access to the Fermilab FabrIc for Frontier Experiments (FIFE) project has also allowed experiments such as mu2e and NOvA to make substantial use of Open Facility resources, the former with over 40 million wall hours in a year. We present how this expansion was accomplished as well as future plans for keeping the OSG Open Facility at the forefront of enabling scientific research by way of DHTC.« less
Virtualizing access to scientific applications with the Application Hosting Environment
NASA Astrophysics Data System (ADS)
Zasada, S. J.; Coveney, P. V.
2009-12-01
The growing power and number of high performance computing resources made available through computational grids present major opportunities as well as a number of challenges to the user. At issue is how these resources can be accessed and how their power can be effectively exploited. In this paper we first present our views on the usability of contemporary high-performance computational resources. We introduce the concept of grid application virtualization as a solution to some of the problems with grid-based HPC usability. We then describe a middleware tool that we have developed to realize the virtualization of grid applications, the Application Hosting Environment (AHE), and describe the features of the new release, AHE 2.0, which provides access to a common platform of federated computational grid resources in standard and non-standard ways. Finally, we describe a case study showing how AHE supports clinical use of whole brain blood flow modelling in a routine and automated fashion. Program summaryProgram title: Application Hosting Environment 2.0 Catalogue identifier: AEEJ_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEEJ_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: GNU Public Licence, Version 2 No. of lines in distributed program, including test data, etc.: not applicable No. of bytes in distributed program, including test data, etc.: 1 685 603 766 Distribution format: tar.gz Programming language: Perl (server), Java (Client) Computer: x86 Operating system: Linux (Server), Linux/Windows/MacOS (Client) RAM: 134 217 728 (server), 67 108 864 (client) bytes Classification: 6.5 External routines: VirtualBox (server), Java (client) Nature of problem: The middleware that makes grid computing possible has been found by many users to be too unwieldy, and presents an obstacle to use rather than providing assistance [1,2]. Such problems are compounded when one attempts to harness the power of a grid, or a federation of different grids, rather than just a single resource on the grid. Solution method: To address the above problem, we have developed AHE, a lightweight interface, designed to simplify the process of running scientific codes on a grid of HPC and local resources. AHE does this by introducing a layer of middleware between the user and the grid, which encapsulates much of the complexity associated with launching grid applications. Unusual features: The server is distributed as a VirtualBox virtual machine. VirtualBox ( http://www.virtualbox.org) must be downloaded and installed in order to run the AHE server virtual machine. Details of how to do this are given in the AHE 2.0 Quick Start Guide. Running time: Not applicable References:J. Chin, P.V. Coveney, Towards tractable toolkits for the grid: A plea for lightweight, useable middleware, NeSC Technical Report, 2004, http://nesc.ac.uk/technical_papers/UKeS-2004-01.pdf. P.V. Coveney, R.S. Saksena, S.J. Zasada, M. McKeown, S. Pickles, The Application Hosting Environment: Lightweight middleware for grid-based computational science, Computer Physics Communications 176 (2007) 406-418.
Re-Form: FPGA-Powered True Codesign Flow for High-Performance Computing In The Post-Moore Era
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cappello, Franck; Yoshii, Kazutomo; Finkel, Hal
Multicore scaling will end soon because of practical power limits. Dark silicon is becoming a major issue even more than the end of Moore’s law. In the post-Moore era, the energy efficiency of computing will be a major concern. FPGAs could be a key to maximizing the energy efficiency. In this paper we address severe challenges in the adoption of FPGA in HPC and describe “Re-form,” an FPGA-powered codesign flow.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Muller, Richard P.
2017-07-01
Sandia National Laboratories has developed a broad set of capabilities in quantum information science (QIS), including elements of quantum computing, quantum communications, and quantum sensing. The Sandia QIS program is built atop unique DOE investments at the laboratories, including the MESA microelectronics fabrication facility, the Center for Integrated Nanotechnologies (CINT) facilities (joint with LANL), the Ion Beam Laboratory, and ASC High Performance Computing (HPC) facilities. Sandia has invested $75 M of LDRD funding over 12 years to develop unique, differentiating capabilities that leverage these DOE infrastructure investments.
Optimization of atmospheric transport models on HPC platforms
NASA Astrophysics Data System (ADS)
de la Cruz, Raúl; Folch, Arnau; Farré, Pau; Cabezas, Javier; Navarro, Nacho; Cela, José María
2016-12-01
The performance and scalability of atmospheric transport models on high performance computing environments is often far from optimal for multiple reasons including, for example, sequential input and output, synchronous communications, work unbalance, memory access latency or lack of task overlapping. We investigate how different software optimizations and porting to non general-purpose hardware architectures improve code scalability and execution times considering, as an example, the FALL3D volcanic ash transport model. To this purpose, we implement the FALL3D model equations in the WARIS framework, a software designed from scratch to solve in a parallel and efficient way different geoscience problems on a wide variety of architectures. In addition, we consider further improvements in WARIS such as hybrid MPI-OMP parallelization, spatial blocking, auto-tuning and thread affinity. Considering all these aspects together, the FALL3D execution times for a realistic test case running on general-purpose cluster architectures (Intel Sandy Bridge) decrease by a factor between 7 and 40 depending on the grid resolution. Finally, we port the application to Intel Xeon Phi (MIC) and NVIDIA GPUs (CUDA) accelerator-based architectures and compare performance, cost and power consumption on all the architectures. Implications on time-constrained operational model configurations are discussed.
2013-01-01
Background Solitary Fibrous Tumours (SFT) and haemangiopericytomas (HPC) are rare meningeal tumours that have to be distinguished from meningiomas and more rarely from synovial sarcomas. We recently found that ALDH1A1 was overexpressed in SFT and HPC as compared to soft tissue sarcomas. Using whole-genome DNA microarrays, we defined the gene expression profiles of 16 SFT/HPC (9 HPC and 7 SFT). Expression profiles were compared to publicly available expression profiles of additional SFT or HPC, meningiomas and synovial sarcomas. We also performed an immunohistochemical (IHC) study with anti-ALDH1 and anti-CD34 antibodies on Tissue Micro-Arrays including 38 SFT (25 meningeal and 13 extrameningeal), 55 meningeal haemangiopericytomas (24 grade II, 31 grade III), 163 meningiomas (86 grade I, 62 grade II, 15 grade III) and 98 genetically confirmed synovial sarcomas. Results ALDH1A1 gene was overexpressed in SFT/HPC, as compared to meningiomas and synovial sarcomas. These findings were confirmed at the protein level. 84% of the SFT and 85.4% of the HPC were positive with anti-ALDH1 antibody, while only 7.1% of synovial sarcomas and 1.2% of meningiomas showed consistent expression. Positivity was usually more diffuse in SFT/HPC compared to other tumours with more than 50% of tumour cells immunostained in 32% of SFT and 50.8% of HPC. ALDH1 was a sensitive and specific marker for the diagnosis of SFT (SE = 84%, SP = 98.8%) and HPC (SE = 84.5%, SP = 98.7%) of the meninges. In association with CD34, ALDH1 expression had a specificity and positive predictive value of 100%. Conclusion We show that ALDH1, a stem cell marker, is an accurate diagnostic marker for SFT and HPC, which improves the diagnostic value of CD34. ALDH1 could also be a new therapeutic target for these tumours which are not sensitive to conventional chemotherapy. PMID:24252471
2015-09-01
this report made use of posttest processing techniques to provide packet-level time tagging with an accuracy close to 3 µs relative to Coordinated...h set of test records. The process described herein made use of posttest processing techniques to provide packet-level time tagging with an accuracy
Peregrine System User Basics | High-Performance Computing | NREL
peregrine.hpc.nrel.gov or to one of the login nodes. Example commands to access Peregrine from a Linux or Mac OS X system Code Example Create a file called hello.F90 containing the following code: program hello write(6 information by enclosing it in brackets < >. For example: $ ssh -Y
SaaS enabled admission control for MCMC simulation in cloud computing infrastructures
NASA Astrophysics Data System (ADS)
Vázquez-Poletti, J. L.; Moreno-Vozmediano, R.; Han, R.; Wang, W.; Llorente, I. M.
2017-02-01
Markov Chain Monte Carlo (MCMC) methods are widely used in the field of simulation and modelling of materials, producing applications that require a great amount of computational resources. Cloud computing represents a seamless source for these resources in the form of HPC. However, resource over-consumption can be an important drawback, specially if the cloud provision process is not appropriately optimized. In the present contribution we propose a two-level solution that, on one hand, takes advantage of approximate computing for reducing the resource demand and on the other, uses admission control policies for guaranteeing an optimal provision to running applications.
Dynamic provisioning of local and remote compute resources with OpenStack
NASA Astrophysics Data System (ADS)
Giffels, M.; Hauth, T.; Polgart, F.; Quast, G.
2015-12-01
Modern high-energy physics experiments rely on the extensive usage of computing resources, both for the reconstruction of measured events as well as for Monte-Carlo simulation. The Institut fur Experimentelle Kernphysik (EKP) at KIT is participating in both the CMS and Belle experiments with computing and storage resources. In the upcoming years, these requirements are expected to increase due to growing amount of recorded data and the rise in complexity of the simulated events. It is therefore essential to increase the available computing capabilities by tapping into all resource pools. At the EKP institute, powerful desktop machines are available to users. Due to the multi-core nature of modern CPUs, vast amounts of CPU time are not utilized by common desktop usage patterns. Other important providers of compute capabilities are classical HPC data centers at universities or national research centers. Due to the shared nature of these installations, the standardized software stack required by HEP applications cannot be installed. A viable way to overcome this constraint and offer a standardized software environment in a transparent manner is the usage of virtualization technologies. The OpenStack project has become a widely adopted solution to virtualize hardware and offer additional services like storage and virtual machine management. This contribution will report on the incorporation of the institute's desktop machines into a private OpenStack Cloud. The additional compute resources provisioned via the virtual machines have been used for Monte-Carlo simulation and data analysis. Furthermore, a concept to integrate shared, remote HPC centers into regular HEP job workflows will be presented. In this approach, local and remote resources are merged to form a uniform, virtual compute cluster with a single point-of-entry for the user. Evaluations of the performance and stability of this setup and operational experiences will be discussed.
Leppert, Ulrike; Gillespie, Allan; Orphal, Miriam; Böhme, Karen; Plum, Claudia; Nagorsen, Kaj; Berkholz, Janine; Kreutz, Reinhold; Eisenreich, Andreas
2017-09-05
Human podocytes (hPC) are essential for maintaining normal kidney function and dysfunction or loss of hPC play a pivotal role in the manifestation and progression of chronic kidney diseases including diabetic nephropathy. Previously, α-Lipoic acid (α-LA), a licensed drug for treatment of diabetic neuropathy, was shown to exhibit protective effects on diabetic nephropathy in vivo. However, the effect of α-LA on hPC under non-diabetic conditions is unknown. Therefore, we analyzed the impact of α-LA on cell viability and expression of nephrin and zinc finger protein 580 (ZNF580) in normal hPC in vitro. Protein analyses were done via Western blot techniques. Cell viability was determined using a functional assay. hPC viability was dynamically modulated via α-LA stimulation in a concentration-dependent manner. This was associated with reduced nephrin and ZNF580 expression and increased nephrin phosphorylation in normal hPC. Moreover, α-LA reduced nephrin and ZNF580 protein expression via 'kappa-light-chain-enhancer' of activated B-cells (NF-κB) inhibition. These data demonstrate that low α-LA had no negative influence on hPC viability, whereas, high α-LA concentrations induced cytotoxic effects on normal hPC and reduced nephrin and ZNF580 expression via NF-κB inhibition. These data provide first novel information about potential cytotoxic effects of α-LA on hPC under non-diabetic conditions. Copyright © 2017 Elsevier B.V. All rights reserved.
Towards real-time remote processing of laparoscopic video
NASA Astrophysics Data System (ADS)
Ronaghi, Zahra; Duffy, Edward B.; Kwartowitz, David M.
2015-03-01
Laparoscopic surgery is a minimally invasive surgical technique where surgeons insert a small video camera into the patient's body to visualize internal organs and small tools to perform surgical procedures. However, the benefit of small incisions has a drawback of limited visualization of subsurface tissues, which can lead to navigational challenges in the delivering of therapy. Image-guided surgery (IGS) uses images to map subsurface structures and can reduce the limitations of laparoscopic surgery. One particular laparoscopic camera system of interest is the vision system of the daVinci-Si robotic surgical system (Intuitive Surgical, Sunnyvale, CA, USA). The video streams generate approximately 360 megabytes of data per second, demonstrating a trend towards increased data sizes in medicine, primarily due to higher-resolution video cameras and imaging equipment. Processing this data on a bedside PC has become challenging and a high-performance computing (HPC) environment may not always be available at the point of care. To process this data on remote HPC clusters at the typical 30 frames per second (fps) rate, it is required that each 11.9 MB video frame be processed by a server and returned within 1/30th of a second. The ability to acquire, process and visualize data in real-time is essential for performance of complex tasks as well as minimizing risk to the patient. As a result, utilizing high-speed networks to access computing clusters will lead to real-time medical image processing and improve surgical experiences by providing real-time augmented laparoscopic data. We aim to develop a medical video processing system using an OpenFlow software defined network that is capable of connecting to multiple remote medical facilities and HPC servers.
Baba, Toshiaki; Tsujimoto, Yasuhisa
2016-01-01
The purpose of this study was to improve the operability of calcium silicate cements (CSCs) such as mineral trioxide aggregate (MTA) cement. The flow, working time, and setting time of CSCs with different compositions containing low-viscosity methyl cellulose (MC) or hydroxypropyl cellulose (HPC) additive were examined according to ISO 6876-2012; calcium ion release analysis was also conducted. MTA and low-heat Portland cement (LPC) including 20% fine particle zirconium oxide (ZO group), LPC including zirconium oxide and 2 wt% low-viscosity MC (MC group), and HPC (HPC group) were tested. MC and HPC groups exhibited significantly higher flow values and setting times than other groups ( p < 0.05). Additionally, flow values of these groups were higher than the ISO 6876-2012 reference values; furthermore, working times were over 10 min. Calcium ion release was retarded with ZO, MC, and HPC groups compared with MTA. The concentration of calcium ions was decreased by the addition of the MC or HPC group compared with the ZO group. When low-viscosity MC or HPC was added, the composition of CSCs changed, thus fulfilling the requirements for use as root canal sealer. Calcium ion release by CSCs was affected by changing the CSC composition via the addition of MC or HPC.
Tsujimoto, Yasuhisa
2016-01-01
The purpose of this study was to improve the operability of calcium silicate cements (CSCs) such as mineral trioxide aggregate (MTA) cement. The flow, working time, and setting time of CSCs with different compositions containing low-viscosity methyl cellulose (MC) or hydroxypropyl cellulose (HPC) additive were examined according to ISO 6876-2012; calcium ion release analysis was also conducted. MTA and low-heat Portland cement (LPC) including 20% fine particle zirconium oxide (ZO group), LPC including zirconium oxide and 2 wt% low-viscosity MC (MC group), and HPC (HPC group) were tested. MC and HPC groups exhibited significantly higher flow values and setting times than other groups (p < 0.05). Additionally, flow values of these groups were higher than the ISO 6876-2012 reference values; furthermore, working times were over 10 min. Calcium ion release was retarded with ZO, MC, and HPC groups compared with MTA. The concentration of calcium ions was decreased by the addition of the MC or HPC group compared with the ZO group. When low-viscosity MC or HPC was added, the composition of CSCs changed, thus fulfilling the requirements for use as root canal sealer. Calcium ion release by CSCs was affected by changing the CSC composition via the addition of MC or HPC. PMID:27981048
Hybrid data storage system in an HPC exascale environment
Bent, John M.; Faibish, Sorin; Gupta, Uday K.; Tzelnic, Percy; Ting, Dennis P. J.
2015-08-18
A computer-executable method, system, and computer program product for managing I/O requests from a compute node in communication with a data storage system, including a first burst buffer node and a second burst buffer node, the computer-executable method, system, and computer program product comprising striping data on the first burst buffer node and the second burst buffer node, wherein a first portion of the data is communicated to the first burst buffer node and a second portion of the data is communicated to the second burst buffer node, processing the first portion of the data at the first burst buffer node, and processing the second portion of the data at the second burst buffer node.
Roter, Debra L; Erby, Lori H; Adams, Ann; Buckingham, Christopher D; Vail, Laura; Realpe, Alba; Larson, Susan; Hall, Judith A
2014-09-01
To disentangle the effects of physician gender and patient-centered communication style on patients' oral engagement in depression care. Physician gender, physician race and communication style (high patient-centered (HPC) and low patient-centered (LPC)) were manipulated and presented as videotaped actors within a computer simulated medical visit to assess effects on analogue patient (AP) verbal responsiveness and care ratings. 307 APs (56% female; 70% African American) were randomly assigned to conditions and instructed to verbally respond to depression-related questions and indicate willingness to continue care. Disclosures were coded using Roter Interaction Analysis System (RIAS). Both male and female APs talked more overall and conveyed more psychosocial and emotional talk to HPC gender discordant doctors (all p<.05). APs were more willing to continue treatment with gender-discordant HPC physicians (p<.05). No effects were evident in the LPC condition. Findings highlight a role for physician gender when considering active patient engagement in patient-centered depression care. This pattern suggests that there may be largely under-appreciated and consequential effects associated with patient expectations in regard to physician gender that these differ by patient gender. High patient-centeredness increases active patient engagement in depression care especially in gender discordant dyads. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
A new deadlock resolution protocol and message matching algorithm for the extreme-scale simulator
Engelmann, Christian; Naughton, III, Thomas J.
2016-03-22
Investigating the performance of parallel applications at scale on future high-performance computing (HPC) architectures and the performance impact of different HPC architecture choices is an important component of HPC hardware/software co-design. The Extreme-scale Simulator (xSim) is a simulation toolkit for investigating the performance of parallel applications at scale. xSim scales to millions of simulated Message Passing Interface (MPI) processes. The overhead introduced by a simulation tool is an important performance and productivity aspect. This paper documents two improvements to xSim: (1)~a new deadlock resolution protocol to reduce the parallel discrete event simulation overhead and (2)~a new simulated MPI message matchingmore » algorithm to reduce the oversubscription management overhead. The results clearly show a significant performance improvement. The simulation overhead for running the NAS Parallel Benchmark suite was reduced from 102% to 0% for the embarrassingly parallel (EP) benchmark and from 1,020% to 238% for the conjugate gradient (CG) benchmark. xSim offers a highly accurate simulation mode for better tracking of injected MPI process failures. Furthermore, with highly accurate simulation, the overhead was reduced from 3,332% to 204% for EP and from 37,511% to 13,808% for CG.« less
Atlas : A library for numerical weather prediction and climate modelling
NASA Astrophysics Data System (ADS)
Deconinck, Willem; Bauer, Peter; Diamantakis, Michail; Hamrud, Mats; Kühnlein, Christian; Maciel, Pedro; Mengaldo, Gianmarco; Quintino, Tiago; Raoult, Baudouin; Smolarkiewicz, Piotr K.; Wedi, Nils P.
2017-11-01
The algorithms underlying numerical weather prediction (NWP) and climate models that have been developed in the past few decades face an increasing challenge caused by the paradigm shift imposed by hardware vendors towards more energy-efficient devices. In order to provide a sustainable path to exascale High Performance Computing (HPC), applications become increasingly restricted by energy consumption. As a result, the emerging diverse and complex hardware solutions have a large impact on the programming models traditionally used in NWP software, triggering a rethink of design choices for future massively parallel software frameworks. In this paper, we present Atlas, a new software library that is currently being developed at the European Centre for Medium-Range Weather Forecasts (ECMWF), with the scope of handling data structures required for NWP applications in a flexible and massively parallel way. Atlas provides a versatile framework for the future development of efficient NWP and climate applications on emerging HPC architectures. The applications range from full Earth system models, to specific tools required for post-processing weather forecast products. The Atlas library thus constitutes a step towards affordable exascale high-performance simulations by providing the necessary abstractions that facilitate the application in heterogeneous HPC environments by promoting the co-design of NWP algorithms with the underlying hardware.
High throughput computing: a solution for scientific analysis
O'Donnell, M.
2011-01-01
handle job failures due to hardware, software, or network interruptions (obviating the need to manually resubmit the job after each stoppage); be affordable; and most importantly, allow us to complete very large, complex analyses that otherwise would not even be possible. In short, we envisioned a job-management system that would take advantage of unused FORT CPUs within a local area network (LAN) to effectively distribute and run highly complex analytical processes. What we found was a solution that uses High Throughput Computing (HTC) and High Performance Computing (HPC) systems to do exactly that (Figure 1).
Mini-Ckpts: Surviving OS Failures in Persistent Memory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fiala, David; Mueller, Frank; Ferreira, Kurt Brian
Concern is growing in the high-performance computing (HPC) community on the reliability of future extreme-scale systems. Current efforts have focused on application fault-tolerance rather than the operating system (OS), despite the fact that recent studies have suggested that failures in OS memory are more likely. The OS is critical to a system's correct and efficient operation of the node and processes it governs -- and in HPC also for any other nodes a parallelized application runs on and communicates with: Any single node failure generally forces all processes of this application to terminate due to tight communication in HPC. Therefore,more » the OS itself must be capable of tolerating failures. In this work, we introduce mini-ckpts, a framework which enables application survival despite the occurrence of a fatal OS failure or crash. Mini-ckpts achieves this tolerance by ensuring that the critical data describing a process is preserved in persistent memory prior to the failure. Following the failure, the OS is rejuvenated via a warm reboot and the application continues execution effectively making the failure and restart transparent. The mini-ckpts rejuvenation and recovery process is measured to take between three to six seconds and has a failure-free overhead of between 3-5% for a number of key HPC workloads. In contrast to current fault-tolerance methods, this work ensures that the operating and runtime system can continue in the presence of faults. This is a much finer-grained and dynamic method of fault-tolerance than the current, coarse-grained, application-centric methods. Handling faults at this level has the potential to greatly reduce overheads and enables mitigation of additional fault scenarios.« less
The Overshoot Phenomenon in Geodynamics Codes
NASA Astrophysics Data System (ADS)
Kommu, R. K.; Heien, E. M.; Kellogg, L. H.; Bangerth, W.; Heister, T.; Studley, E. H.
2013-12-01
The overshoot phenomenon is a common occurrence in numerical software when a continuous function on a finite dimensional discretized space is used to approximate a discontinuous jump, in temperature and material concentration, for example. The resulting solution overshoots, and undershoots, the discontinuous jump. Numerical simulations play an extremely important role in mantle convection research. This is both due to the strong temperature and stress dependence of viscosity and also due to the inaccessibility of deep earth. Under these circumstances, it is essential that mantle convection simulations be extremely accurate and reliable. CitcomS and ASPECT are two finite element based mantle convection simulations developed and maintained by the Computational Infrastructure for Geodynamics. CitcomS is a finite element based mantle convection code that is designed to run on multiple high-performance computing platforms. ASPECT, an adaptive mesh refinement (AMR) code built on the Deal.II library, is also a finite element based mantle convection code that scales well on various HPC platforms. CitcomS and ASPECT both exhibit the overshoot phenomenon. One attempt at controlling the overshoot uses the Entropy Viscosity method, which introduces an artificial diffusion term in the energy equation of mantle convection. This artificial diffusion term is small where the temperature field is smooth. We present results from CitcomS and ASPECT that quantify the effect of the Entropy Viscosity method in reducing the overshoot phenomenon. In the discontinuous Galerkin (DG) finite element method, the test functions used in the method are continuous within each element but are discontinuous across inter-element boundaries. The solution space in the DG method is discontinuous. FEniCS is a collection of free software tools that automate the solution of differential equations using finite element methods. In this work we also present results from a finite element mantle convection simulation implemented in FEniCS that investigates the effect of using DG elements in reducing the overshoot problem.
Are Hypomineralized Primary Molars and Canines Associated with Molar-Incisor Hypomineralization?
da Silva Figueiredo Sé, Maria Jose; Ribeiro, Ana Paula Dias; Dos Santos-Pinto, Lourdes Aparecida Martins; de Cassia Loiola Cordeiro, Rita; Cabral, Renata Nunes; Leal, Soraya Coelho
2017-11-01
The purpose of this study was to evaluate the prevalence of and relationship between hypomineralized second primary molars (HSPM) and hypomineralized primary canines (HPC) with molar-incisor hypomineralization (MIH) in 1,963 schoolchildren. The European Academy of Paediatric Dentistry (EAPD) criterion was used for scoring HSPM/HPC and MIH. Only children with four permanent first molars and eight incisors were considered in calculating MIH prevalence (n equals 858); for HSPM/HPC prevalence, only children with four primary second molars (n equals 1,590) and four primary canines (n equals 1,442) were considered. To evaluate the relationship between MIH/HSPM, only children meeting both criteria cited were considered (n equals 534), as was true of MIH/HPC (n equals 408) and HSPM/HPC (n equals 360; chi-square test and logistic regression). The prevalence of MIH was 14.69 percent (126 of 858 children). For HSPM and HPC, the prevalence was 6.48 percent (103 of 1,592) and 2.22 percent (32 of 1,442), respectively. A significant relationship was observed between MIH and both HSPM/HPC (P<0.001). The odds ratio for MIH based on HSPM was 6.31 (95 percent confidence interval [CI] equals 2.59 to 15.13) and for HPC was 6.02 (95 percent CI equals 1.08 to 33.05). The results led to the conclusion that both hypomineralized second primary molars and hypomineralized primary canines are associated with molar-incisor hypomineralization, because children with HSPM/HPC are six times more likely to develop MIH.
INDIGO-DataCloud solutions for Earth Sciences
NASA Astrophysics Data System (ADS)
Aguilar Gómez, Fernando; de Lucas, Jesús Marco; Fiore, Sandro; Monna, Stephen; Chen, Yin
2017-04-01
INDIGO-DataCloud (https://www.indigo-datacloud.eu/) is a European Commission funded project aiming to develop a data and computing platform targeting scientific communities, deployable on multiple hardware and provisioned over hybrid (private or public) e-infrastructures. The development of INDIGO solutions covers the different layers in cloud computing (IaaS, PaaS, SaaS), and provides tools to exploit resources like HPC or GPGPUs. INDIGO is oriented to support European Scientific research communities, that are well represented in the project. Twelve different Case Studies have been analyzed in detail from different fields: Biological & Medical sciences, Social sciences & Humanities, Environmental and Earth sciences and Physics & Astrophysics. INDIGO-DataCloud provides solutions to emerging challenges in Earth Science like: -Enabling an easy deployment of community services at different cloud sites. Many Earth Science research infrastructures often involve distributed observation stations across countries, and also have distributed data centers to support the corresponding data acquisition and curation. There is a need to easily deploy new data center services while the research infrastructure continuous spans. As an example: LifeWatch (ESFRI, Ecosystems and Biodiversity) uses INDIGO solutions to manage the deployment of services to perform complex hydrodynamics and water quality modelling over a Cloud Computing environment, predicting algae blooms, using the Docker technology: TOSCA requirement description, Docker repository, Orchestrator for deployment, AAI (AuthN, AuthZ) and OneData (Distributed Storage System). -Supporting Big Data Analysis. Nowadays, many Earth Science research communities produce large amounts of data and and are challenged by the difficulties of processing and analysing it. A climate models intercomparison data analysis case study for the European Network for Earth System Modelling (ENES) community has been setup, based on the Ophidia big data analysis framework and the Kepler workflow management system. Such services normally involve a large and distributed set of data and computing resources. In this regard, this case study exploits the INDIGO PaaS for a flexible and dynamic allocation of the resources at the infrastructural level. -Providing Distributed Data Storage Solutions. In order to allow scientific communities to perform heavy computation on huge datasets, INDIGO provides global data access solutions allowing researchers to access data in a distributed environment like fashion regardless of its location, and also to publish and share their research results with public or close communities. INDIGO solutions that support the access to distributed data storage (OneData) are being tested on EMSO infrastructure (Ocean Sciences and Geohazards) data. Another aspect of interest for the EMSO community is in efficient data processing by exploiting INDIGO services like PaaS Orchestrator. Further, for HPC exploitation, a new solution named Udocker has been implemented, enabling users to execute docker containers in supercomputers, without requiring administration privileges. This presentation will overview INDIGO solutions that are interesting and useful for Earth science communities and will show how they can be applied to other Case Studies.
Facilitating Co-Design for Extreme-Scale Systems Through Lightweight Simulation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Engelmann, Christian; Lauer, Frank
This work focuses on tools for investigating algorithm performance at extreme scale with millions of concurrent threads and for evaluating the impact of future architecture choices to facilitate the co-design of high-performance computing (HPC) architectures and applications. The approach focuses on lightweight simulation of extreme-scale HPC systems with the needed amount of accuracy. The prototype presented in this paper is able to provide this capability using a parallel discrete event simulation (PDES), such that a Message Passing Interface (MPI) application can be executed at extreme scale, and its performance properties can be evaluated. The results of an initial prototype aremore » encouraging as a simple 'hello world' MPI program could be scaled up to 1,048,576 virtual MPI processes on a four-node cluster, and the performance properties of two MPI programs could be evaluated at up to 16,384 virtual MPI processes on the same system.« less
Toward performance portability of the Albany finite element analysis code using the Kokkos library
DOE Office of Scientific and Technical Information (OSTI.GOV)
Demeshko, Irina; Watkins, Jerry; Tezaur, Irina K.
Performance portability on heterogeneous high-performance computing (HPC) systems is a major challenge faced today by code developers: parallel code needs to be executed correctly as well as with high performance on machines with different architectures, operating systems, and software libraries. The finite element method (FEM) is a popular and flexible method for discretizing partial differential equations arising in a wide variety of scientific, engineering, and industrial applications that require HPC. This paper presents some preliminary results pertaining to our development of a performance portable implementation of the FEM-based Albany code. Performance portability is achieved using the Kokkos library. We presentmore » performance results for the Aeras global atmosphere dynamical core module in Albany. Finally, numerical experiments show that our single code implementation gives reasonable performance across three multicore/many-core architectures: NVIDIA General Processing Units (GPU’s), Intel Xeon Phis, and multicore CPUs.« less
An Optimizing Compiler for Petascale I/O on Leadership-Class Architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kandemir, Mahmut Taylan; Choudary, Alok; Thakur, Rajeev
In high-performance computing (HPC), parallel I/O architectures usually have very complex hierarchies with multiple layers that collectively constitute an I/O stack, including high-level I/O libraries such as PnetCDF and HDF5, I/O middleware such as MPI-IO, and parallel file systems such as PVFS and Lustre. Our DOE project explored automated instrumentation and compiler support for I/O intensive applications. Our project made significant progress towards understanding the complex I/O hierarchies of high-performance storage systems (including storage caches, HDDs, and SSDs), and designing and implementing state-of-the-art compiler/runtime system technology that targets I/O intensive HPC applications that target leadership class machine. This final reportmore » summarizes the major achievements of the project and also points out promising future directions Two new sections in this report compared to the previous report are IOGenie and SSD/NVM-specific optimizations.« less
Toward performance portability of the Albany finite element analysis code using the Kokkos library
Demeshko, Irina; Watkins, Jerry; Tezaur, Irina K.; ...
2018-02-05
Performance portability on heterogeneous high-performance computing (HPC) systems is a major challenge faced today by code developers: parallel code needs to be executed correctly as well as with high performance on machines with different architectures, operating systems, and software libraries. The finite element method (FEM) is a popular and flexible method for discretizing partial differential equations arising in a wide variety of scientific, engineering, and industrial applications that require HPC. This paper presents some preliminary results pertaining to our development of a performance portable implementation of the FEM-based Albany code. Performance portability is achieved using the Kokkos library. We presentmore » performance results for the Aeras global atmosphere dynamical core module in Albany. Finally, numerical experiments show that our single code implementation gives reasonable performance across three multicore/many-core architectures: NVIDIA General Processing Units (GPU’s), Intel Xeon Phis, and multicore CPUs.« less
Constructive Engineering of Simulations
NASA Technical Reports Server (NTRS)
Snyder, Daniel R.; Barsness, Brendan
2011-01-01
Joint experimentation that investigates sensor optimization, re-tasking and management has far reaching implications for Department of Defense, Interagency and multinational partners. An adaption of traditional human in the loop (HITL) Modeling and Simulation (M&S) was one approach used to generate the findings necessary to derive and support these implications. Here an entity-based simulation was re-engineered to run on USJFCOM's High Performance Computer (HPC). The HPC was used to support the vast number of constructive runs necessary to produce statistically significant data in a timely manner. Then from the resulting sensitivity analysis, event designers blended the necessary visualization and decision making components into a synthetic environment for the HITL simulations trials. These trials focused on areas where human decision making had the greatest impact on the sensor investigations. Thus, this paper discusses how re-engineering existing M&S for constructive applications can positively influence the design of an associated HITL experiment.