Sample records for high performance computers

  1. High-Performance Computing and Visualization | Energy Systems Integration

    Science.gov Websites

    Facility | NREL High-Performance Computing and Visualization High-Performance Computing and Visualization High-performance computing (HPC) and visualization at NREL propel technology innovation as a . Capabilities High-Performance Computing NREL is home to Peregrine-the largest high-performance computing system

  2. High-Performance Computing Data Center | Energy Systems Integration

    Science.gov Websites

    Facility | NREL High-Performance Computing Data Center High-Performance Computing Data Center The Energy Systems Integration Facility's High-Performance Computing Data Center is home to Peregrine -the largest high-performance computing system in the world exclusively dedicated to advancing

  3. High-Performance Computing Systems and Operations | Computational Science |

    Science.gov Websites

    NREL Systems and Operations High-Performance Computing Systems and Operations NREL operates high-performance computing (HPC) systems dedicated to advancing energy efficiency and renewable energy technologies. Capabilities NREL's HPC capabilities include: High-Performance Computing Systems We operate

  4. High-Performance Computing User Facility | Computational Science | NREL

    Science.gov Websites

    User Facility High-Performance Computing User Facility The High-Performance Computing User Facility technologies. Photo of the Peregrine supercomputer The High Performance Computing (HPC) User Facility provides Gyrfalcon Mass Storage System. Access Our HPC User Facility Learn more about these systems and how to access

  5. High Performance Computing Meets Energy Efficiency - Continuum Magazine |

    Science.gov Websites

    NREL High Performance Computing Meets Energy Efficiency High Performance Computing Meets Energy turbines. Simulation by Patrick J. Moriarty and Matthew J. Churchfield, NREL The new High Performance Computing Data Center at the National Renewable Energy Laboratory (NREL) hosts high-speed, high-volume data

  6. High performance computing and communications program

    NASA Technical Reports Server (NTRS)

    Holcomb, Lee

    1992-01-01

    A review of the High Performance Computing and Communications (HPCC) program is provided in vugraph format. The goals and objectives of this federal program are as follows: extend U.S. leadership in high performance computing and computer communications; disseminate the technologies to speed innovation and to serve national goals; and spur gains in industrial competitiveness by making high performance computing integral to design and production.

  7. Computational Science News | Computational Science | NREL

    Science.gov Websites

    -Cooled High-Performance Computing Technology at the ESIF February 28, 2018 NREL Launches New Website for High-Performance Computing System Users The National Renewable Energy Laboratory (NREL) Computational Science Center has launched a revamped website for users of the lab's high-performance computing (HPC

  8. High Performance Computer Cluster for Theoretical Studies of Roaming in Chemical Reactions

    DTIC Science & Technology

    2016-08-30

    High-performance Computer Cluster for Theoretical Studies of Roaming in Chemical Reactions. A dedicated high-performance computer cluster was... U.S. Army Research Office, P.O. Box 12211, Research Triangle Park, NC 27709-2211... peer-reviewed journals: Final Report: High-performance Computer Cluster for Theoretical Studies of Roaming in Chemical Reactions. Report Title: A dedicated

  9. Grand Challenges: High Performance Computing and Communications. The FY 1992 U.S. Research and Development Program.

    ERIC Educational Resources Information Center

    Federal Coordinating Council for Science, Engineering and Technology, Washington, DC.

    This report presents a review of the High Performance Computing and Communications (HPCC) Program, which has as its goal the acceleration of the commercial availability and utilization of the next generation of high performance computers and networks in order to: (1) extend U.S. technological leadership in high performance computing and computer…

  10. Facilities | Integrated Energy Solutions | NREL

    Science.gov Websites

    strategies needed to optimize our entire energy system. A photo of the high-performance computer at NREL . High-Performance Computing Data Center High-performance computing facilities at NREL provide high-speed

  11. High-Performance Computing Data Center Warm-Water Liquid Cooling |

    Science.gov Websites

    Computational Science | NREL Warm-Water Liquid Cooling High-Performance Computing Data Center Warm-Water Liquid Cooling NREL's High-Performance Computing Data Center (HPC Data Center) is liquid water Liquid cooling technologies offer a more energy-efficient solution that also allows for effective

  12. Research Activity in Computational Physics utilizing High Performance Computing: Co-authorship Network Analysis

    NASA Astrophysics Data System (ADS)

    Ahn, Sul-Ah; Jung, Youngim

    2016-10-01

    The research activities of computational physicists utilizing high-performance computing are analyzed by bibliometric approaches. This study aims at providing the computational physicists utilizing high-performance computing and policy planners with useful bibliometric results for an assessment of research activities. In order to achieve this purpose, we carried out a co-authorship network analysis of journal articles to assess the research activities of researchers for high-performance computational physics as a case study. For this study, we used journal articles of the Scopus database from Elsevier covering the time period of 2004-2013. We extracted the author rank in the physics field utilizing high-performance computing by the number of papers published during ten years from 2004. Finally, we drew the co-authorship network for 45 top authors and their coauthors, and described some features of the co-authorship network in relation to the author rank. Suggestions for further studies are discussed.
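
    For readers unfamiliar with the method named in this abstract, a minimal sketch of a co-authorship network analysis follows; the article records, author names, and ranking rule are illustrative stand-ins, not the study's Scopus data.

    ```python
    from collections import Counter, defaultdict
    from itertools import combinations

    # Illustrative article records (hypothetical data; the study used Scopus
    # records from 2004-2013 restricted to HPC-related computational physics).
    articles = [
        {"title": "Parallel lattice QCD", "authors": ["Kim", "Lee", "Park"]},
        {"title": "GPU molecular dynamics", "authors": ["Lee", "Chen"]},
        {"title": "Exascale plasma simulation", "authors": ["Kim", "Chen", "Singh"]},
    ]

    paper_counts = Counter()           # author -> number of papers (used for ranking)
    coauthor_graph = defaultdict(set)  # author -> set of co-authors (network edges)

    for record in articles:
        authors = record["authors"]
        paper_counts.update(authors)
        for a, b in combinations(authors, 2):
            coauthor_graph[a].add(b)
            coauthor_graph[b].add(a)

    # Rank authors by publication count, echoing the study's ten-year author ranking.
    for rank, (author, n) in enumerate(paper_counts.most_common(), start=1):
        print(f"{rank:2d}. {author}: {n} papers, {len(coauthor_graph[author])} co-authors")
    ```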

  13. Importance of balanced architectures in the design of high-performance imaging systems

    NASA Astrophysics Data System (ADS)

    Sgro, Joseph A.; Stanton, Paul C.

    1999-03-01

    Imaging systems employed in demanding military and industrial applications, such as automatic target recognition and computer vision, typically require real-time high-performance computing resources. While high-performance computing systems have traditionally relied on proprietary architectures and custom components, recent advances in high-performance general-purpose microprocessor technology have produced an abundance of low-cost components suitable for use in high-performance computing systems. A common pitfall in the design of high-performance imaging systems, particularly systems employing scalable multiprocessor architectures, is the failure to balance computational and memory bandwidth. The performance of standard cluster designs, for example, in which several processors share a common memory bus, is typically constrained by memory bandwidth. The symptom characteristic of this problem is the failure of system performance to scale as more processors are added. The problem is exacerbated if I/O and memory functions share the same bus. The recent introduction of microprocessors with large internal caches and high-performance external memory interfaces makes it practical to design high-performance imaging systems with balanced computational and memory bandwidth. Real-world examples of such designs will be presented, along with a discussion of adapting algorithm design to best utilize available memory bandwidth.
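
    The balance argument above can be illustrated with a few lines of arithmetic; the per-processor throughput, bus bandwidth, and arithmetic intensity below are assumed placeholder values, not figures from the paper.

    ```python
    # Attainable throughput on a shared memory bus is the lesser of the compute
    # capacity and the bandwidth-limited rate; all numbers here are hypothetical.
    PEAK_PER_CPU_GFLOPS = 1.0   # per-processor compute capability (assumed)
    SHARED_BUS_GBPS = 4.0       # memory bandwidth shared by all processors (assumed)
    BYTES_PER_FLOP = 2.0        # arithmetic intensity of the imaging kernel (assumed)

    def attainable_gflops(n_procs: int) -> float:
        compute_limit = n_procs * PEAK_PER_CPU_GFLOPS
        bandwidth_limit = SHARED_BUS_GBPS / BYTES_PER_FLOP  # independent of n_procs
        return min(compute_limit, bandwidth_limit)

    for n in (1, 2, 4, 8):
        print(f"{n} processors -> {attainable_gflops(n):.1f} GFLOP/s attainable")
    # The printed rate flattens at 2.0 GFLOP/s beyond 2 processors: the
    # "failure to scale" symptom described in the abstract.
    ```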

  14. Marc Henry de Frahan | NREL

    Science.gov Websites

    Computing Project, Marc develops high-fidelity turbulence models to enhance simulation accuracy and efficient numerical algorithms for future high performance computing hardware architectures. Research Interests High performance computing High order numerical methods for computational fluid dynamics Fluid

  15. High-performance conjugate-gradient benchmark: A new metric for ranking high-performance computing systems

    DOE PAGES

    Dongarra, Jack; Heroux, Michael A.; Luszczek, Piotr

    2015-08-17

    Here, we describe a new high-performance conjugate-gradient (HPCG) benchmark. HPCG is composed of computations and data-access patterns commonly found in scientific applications. HPCG strives for a better correlation to existing codes from the computational science domain and to be representative of their performance. Furthermore, HPCG is meant to help drive the computer system design and implementation in directions that will better impact future performance improvement.
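
    As an illustration of the computation pattern HPCG is built around, here is a minimal conjugate-gradient iteration on a small symmetric positive-definite system; the reference HPCG benchmark itself is a standardized C++/MPI code, so this sketch is only a didactic stand-in.

    ```python
    import numpy as np

    def conjugate_gradient(A, b, tol=1e-8, max_iter=1000):
        """Plain CG for a symmetric positive-definite system Ax = b."""
        x = np.zeros_like(b)
        r = b - A @ x           # residual
        p = r.copy()            # search direction
        rs_old = r @ r
        for _ in range(max_iter):
            Ap = A @ p                        # matrix-vector product dominates the cost
            alpha = rs_old / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            rs_new = r @ r
            if np.sqrt(rs_new) < tol:
                break
            p = r + (rs_new / rs_old) * p     # update search direction
            rs_old = rs_new
        return x

    # Small SPD test problem (1D Laplacian) standing in for HPCG's 3D grid operator.
    n = 100
    A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    b = np.ones(n)
    x = conjugate_gradient(A, b)
    print("residual norm:", np.linalg.norm(b - A @ x))
    ```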

  16. Implementing an Affordable High-Performance Computing for Teaching-Oriented Computer Science Curriculum

    ERIC Educational Resources Information Center

    Abuzaghleh, Omar; Goldschmidt, Kathleen; Elleithy, Yasser; Lee, Jeongkyu

    2013-01-01

    With the advances in computing power, high-performance computing (HPC) platforms have had an impact on not only scientific research in advanced organizations but also computer science curriculum in the educational community. For example, multicore programming and parallel systems are highly desired courses in the computer science major. However,…

  17. High-Performance Computing Act of 1991. Report of the Senate Committee on Commerce, Science, and Transportation on S. 272. Senate, 102d Congress, 1st Session.

    ERIC Educational Resources Information Center

    Congress of the U.S., Washington, DC. Senate Committee on Commerce, Science, and Transportation.

    This report discusses Senate Bill no. 272, which provides for a coordinated federal research and development program to ensure continued U.S. leadership in high-performance computing. High performance computing is defined as representing the leading edge of technological advancement in computing, i.e., the most sophisticated computer chips, the…

  18. 15 CFR 743.2 - High performance computers: Post shipment verification reporting.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 15 Commerce and Foreign Trade 2 2012-01-01 2012-01-01 false High performance computers: Post... Commerce and Foreign Trade (Continued) BUREAU OF INDUSTRY AND SECURITY, DEPARTMENT OF COMMERCE EXPORT ADMINISTRATION REGULATIONS SPECIAL REPORTING § 743.2 High performance computers: Post shipment verification...

  19. 15 CFR 743.2 - High performance computers: Post shipment verification reporting.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 15 Commerce and Foreign Trade 2 2011-01-01 2011-01-01 false High performance computers: Post... Commerce and Foreign Trade (Continued) BUREAU OF INDUSTRY AND SECURITY, DEPARTMENT OF COMMERCE EXPORT ADMINISTRATION REGULATIONS SPECIAL REPORTING § 743.2 High performance computers: Post shipment verification...

  20. 15 CFR 743.2 - High performance computers: Post shipment verification reporting.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 15 Commerce and Foreign Trade 2 2010-01-01 2010-01-01 false High performance computers: Post... Commerce and Foreign Trade (Continued) BUREAU OF INDUSTRY AND SECURITY, DEPARTMENT OF COMMERCE EXPORT ADMINISTRATION REGULATIONS SPECIAL REPORTING § 743.2 High performance computers: Post shipment verification...

  21. 15 CFR 743.2 - High performance computers: Post shipment verification reporting.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 15 Commerce and Foreign Trade 2 2013-01-01 2013-01-01 false High performance computers: Post... Commerce and Foreign Trade (Continued) BUREAU OF INDUSTRY AND SECURITY, DEPARTMENT OF COMMERCE EXPORT ADMINISTRATION REGULATIONS SPECIAL REPORTING § 743.2 High performance computers: Post shipment verification...

  22. 15 CFR 743.2 - High performance computers: Post shipment verification reporting.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 15 Commerce and Foreign Trade 2 2014-01-01 2014-01-01 false High performance computers: Post... Commerce and Foreign Trade (Continued) BUREAU OF INDUSTRY AND SECURITY, DEPARTMENT OF COMMERCE EXPORT ADMINISTRATION REGULATIONS SPECIAL REPORTING AND NOTIFICATION § 743.2 High performance computers: Post shipment...

  23. Support Expressed in Congress for U.S. High-Performance Computing

    NASA Astrophysics Data System (ADS)

    Showstack, Randy

    2004-06-01

    Advocates for a stronger U.S. position in high-performance computing, which could help with a number of grand challenges in the Earth sciences and other disciplines, hope that legislation recently introduced in the House of Representatives will help to revitalize U.S. efforts. The High-Performance Computing Revitalization Act of 2004 would amend the earlier High-Performance Computing Act of 1991 (Public Law 102-194), which is partially credited with helping to strengthen U.S. capabilities in this area. The bill has the support of the Bush administration.

  24. Alliance for Computational Science Collaboration: HBCU Partnership at Alabama A&M University Continuing High Performance Computing Research and Education at AAMU

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Qian, Xiaoqing; Deng, Z. T.

    2009-11-10

    This is the final report for the Department of Energy (DOE) project DE-FG02-06ER25746, entitled "Continuing High Performance Computing Research and Education at AAMU". This three-year project started on August 15, 2006, and ended on August 14, 2009. The objective of this project was to enhance high-performance computing research and education capabilities at Alabama A&M University (AAMU), and to train African-American and other minority students and scientists in the computational science field for eventual employment with DOE. AAMU has successfully completed all the proposed research and educational tasks. Through the support of DOE, AAMU was able to provide opportunities to minority students through summer internships and the DOE computational science scholarship program. In the past three years, AAMU (1) supported three graduate research assistants in image processing for a hypersonic shockwave control experiment and in computational-science-related areas; (2) recruited and provided full financial support for six AAMU undergraduate summer research interns to participate in the Research Alliance in Math and Science (RAMS) program at Oak Ridge National Laboratory (ORNL); (3) awarded 30 highly competitive DOE High Performance Computing Scholarships ($1,500 each) to qualified top AAMU undergraduate students in science and engineering majors; (4) improved the high-performance computing laboratory at AAMU with the addition of three high-performance Linux workstations; and (5) conducted image analysis for the electromagnetic shockwave control experiment and computation of shockwave interactions to verify the design and operation of the AAMU supersonic wind tunnel. The high-performance computing research and education activities at AAMU had a great impact on minority students. As praised by the Accreditation Board for Engineering and Technology (ABET) in 2009: "The work on high performance computing that is funded by the Department of Energy provides scholarships to undergraduate students as computational science scholars. This is a wonderful opportunity to recruit under-represented students." Three ASEE papers were published in the 2007, 2008, and 2009 proceedings of the ASEE Annual Conferences, and presentations of these papers were made at the conferences. It is very critical to continue these research and education activities.

  25. Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers

    NASA Technical Reports Server (NTRS)

    Morgan, Philip E.

    2004-01-01

    This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers" and "Develop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications." The discussion of Scalable High Performance Computing reports on three objectives: validate, assess the scalability of, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulation (DNS) and Large Eddy Simulation (LES) problems; and investigate and develop a high-order Reynolds-averaged Navier-Stokes turbulence model. The discussion of High-Performance Time-Domain Computational Electromagnetics reports on five objectives: enhancement of an electromagnetics code (CHARGE) to effectively model antenna problems; application of lessons learned in the high-order/spectral solution of swirling 3D jets to the electromagnetics project; transition of a high-order fluids code, FDL3DI, to solve Maxwell's equations using compact differencing; development and demonstration of improved radiation-absorbing boundary conditions for high-order CEM; and extension of the high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.

  26. National High-Performance Computing and Networking Act. Report To Accompany S. 343, Senate, 102d Congress, 1st Session.

    ERIC Educational Resources Information Center

    Congress of the U.S., Washington, DC. Senate Committee on Energy and Natural Resources.

    The purpose of the bill (S. 343), as reported by the Senate Committee on Energy and Natural Resources, is to establish a federal commitment to the advancement of high-performance computing, improve interagency planning and coordination of federal high-performance computing and networking activities, authorize a national high-speed computer…

  27. High-performance computing — an overview

    NASA Astrophysics Data System (ADS)

    Marksteiner, Peter

    1996-08-01

    An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, various parallel computers like symmetric multiprocessors, workstation clusters, massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, optimization for RISC processors; parallel programming techniques like shared-memory parallelism, message passing and data parallelism; and numerical libraries.
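
    One of the programming techniques surveyed above, message passing, can be shown in a few lines; this sketch assumes the mpi4py package and an MPI runtime (run with, e.g., `mpiexec -n 2 python demo.py`) and is not tied to any particular system discussed in the overview.

    ```python
    # Minimal point-to-point message passing, one of the HPC techniques the
    # overview surveys (alongside shared-memory and data parallelism).
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    if rank == 0:
        data = {"payload": list(range(8))}
        comm.send(data, dest=1, tag=11)     # explicit send to rank 1
        print("rank 0 sent", data)
    elif rank == 1:
        data = comm.recv(source=0, tag=11)  # matching receive from rank 0
        print("rank 1 received", data)
    ```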

  28. Grand Challenges 1993: High Performance Computing and Communications. A Report by the Committee on Physical, Mathematical, and Engineering Sciences. The FY 1993 U.S. Research and Development Program.

    ERIC Educational Resources Information Center

    Office of Science and Technology Policy, Washington, DC.

    This report presents the United States research and development program for 1993 for high performance computing and computer communications (HPCC) networks. The first of four chapters presents the program goals and an overview of the federal government's emphasis on high performance computing as an important factor in the nation's scientific and…

  29. Design and Implementation of High-Performance GIS Dynamic Objects Rendering Engine

    NASA Astrophysics Data System (ADS)

    Zhong, Y.; Wang, S.; Li, R.; Yun, W.; Song, G.

    2017-12-01

    Spatio-temporal dynamic visualization is more vivid than static visualization. It is important to use dynamic visualization techniques to reveal the variation process and trend of a geographical phenomenon vividly and comprehensively. Dealing with the challenges posed by dynamic visualization of both 2D and 3D spatial dynamic targets, especially across different spatial data types, requires a high-performance GIS dynamic objects rendering engine. The main approach for improving a rendering engine that handles vast numbers of dynamic targets relies on key technologies of high-performance GIS, including in-memory computing, parallel computing, GPU computing, and high-performance algorithms. In this study, a high-performance GIS dynamic objects rendering engine is designed and implemented to solve this problem using hybrid acceleration techniques. The engine combines GPU computing, OpenGL technology, and high-performance algorithms, with the advantage of 64-bit in-memory computing. It processes 2D and 3D dynamic target data efficiently and runs smoothly with vast amounts of dynamic target data. A prototype of the rendering engine is developed based on SuperMap GIS iObjects. Experiments designed for large-scale spatial data visualization show that the engine achieves high performance: rendering two-dimensional and three-dimensional dynamic objects is about 20 times faster on the GPU than on the CPU.

  30. Early experiences in developing and managing the neuroscience gateway.

    PubMed

    Sivagnanam, Subhashini; Majumdar, Amit; Yoshimoto, Kenneth; Astakhov, Vadim; Bandrowski, Anita; Martone, MaryAnn; Carnevale, Nicholas T

    2015-02-01

    The last few decades have seen the emergence of computational neuroscience as a mature field in which researchers are interested in modeling complex and large neuronal systems and require access to high-performance computing machines and the associated cyberinfrastructure to manage computational workflow and data. The neuronal simulation tools used in this research field are also implemented for parallel computers and are suitable for high-performance computing machines. But using these tools on complex high-performance computing machines remains a challenge because of issues with acquiring computer time on machines located at national supercomputer centers, dealing with the complex user interfaces of these machines, and dealing with data management and retrieval. The Neuroscience Gateway is being developed to alleviate and/or hide these barriers to entry for computational neuroscientists. It hides or eliminates, from the point of view of the users, all the administrative and technical barriers and makes parallel neuronal simulation tools easily available and accessible on complex high-performance computing machines. It handles the running of jobs and data management and retrieval. This paper shares the early experiences in bringing up this gateway and describes the software architecture it is based on, how it is implemented, and how users can use it for computational neuroscience research using high-performance computing at the back end. We also look at parallel scaling of some publicly available neuronal models and analyze the recent usage data of the neuroscience gateway.

  31. Early experiences in developing and managing the neuroscience gateway

    PubMed Central

    Sivagnanam, Subhashini; Majumdar, Amit; Yoshimoto, Kenneth; Astakhov, Vadim; Bandrowski, Anita; Martone, MaryAnn; Carnevale, Nicholas. T.

    2015-01-01

    The last few decades have seen the emergence of computational neuroscience as a mature field in which researchers are interested in modeling complex and large neuronal systems and require access to high-performance computing machines and the associated cyberinfrastructure to manage computational workflow and data. The neuronal simulation tools used in this research field are also implemented for parallel computers and are suitable for high-performance computing machines. But using these tools on complex high-performance computing machines remains a challenge because of issues with acquiring computer time on machines located at national supercomputer centers, dealing with the complex user interfaces of these machines, and dealing with data management and retrieval. The Neuroscience Gateway is being developed to alleviate and/or hide these barriers to entry for computational neuroscientists. It hides or eliminates, from the point of view of the users, all the administrative and technical barriers and makes parallel neuronal simulation tools easily available and accessible on complex high-performance computing machines. It handles the running of jobs and data management and retrieval. This paper shares the early experiences in bringing up this gateway and describes the software architecture it is based on, how it is implemented, and how users can use it for computational neuroscience research using high-performance computing at the back end. We also look at parallel scaling of some publicly available neuronal models and analyze the recent usage data of the neuroscience gateway. PMID:26523124

  32. Data Storage and Transfer | High-Performance Computing | NREL

    Science.gov Websites

    High-Performance Computing (HPC) systems. Photo of computer server wiring and lights, blurred to show data. WinSCP for Windows File Transfers Use to transfer files from a local computer to a remote computer. Robinhood for File Management Use this tool to manage your data files on Peregrine. Best

  33. Asymmetric Core Computing for U.S. Army High-Performance Computing Applications

    DTIC Science & Technology

    2009-04-01

    Playstation 4 (should one be announced). 4.2 FPGAs: Reconfigurable computing refers to performing computations using Field Programmable Gate Arrays... Asymmetric Core Computing for U.S. Army High-Performance Computing Applications... Contents: Acknowledgments; Introduction; Relevant Technologies; Technical Approach; Research and Development Highlights; 4.1 Cell

  34. HPCCP/CAS Workshop Proceedings 1998

    NASA Technical Reports Server (NTRS)

    Schulbach, Catherine; Mata, Ellen (Editor); Schulbach, Catherine (Editor)

    1999-01-01

    This publication is a collection of extended abstracts of presentations given at the HPCCP/CAS (High Performance Computing and Communications Program/Computational Aerosciences Project) Workshop held on August 24-26, 1998, at NASA Ames Research Center, Moffett Field, California. The objective of the Workshop was to bring together the aerospace high performance computing community, consisting of airframe and propulsion companies, independent software vendors, university researchers, and government scientists and engineers. The Workshop was sponsored by the HPCCP Office at NASA Ames Research Center. The Workshop consisted of over 40 presentations, including an overview of NASA's High Performance Computing and Communications Program and the Computational Aerosciences Project; ten sessions of papers representative of the high performance computing research conducted within the Program by the aerospace industry, academia, NASA, and other government laboratories; two panel sessions; and a special presentation by Mr. James Bailey.

  35. Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

    NASA Astrophysics Data System (ADS)

    Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; Eulisse, Giulio; Knight, Robert; Muzaffar, Shahzad

    2015-05-01

    Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).
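
    The performance-per-watt metric emphasized in this abstract reduces to a simple ratio; the throughput and power figures below are invented purely to show how candidate platforms would be compared.

    ```python
    # Toy performance-per-watt comparison; the numbers are made-up placeholders,
    # not measurements from the paper.
    platforms = {
        "x86 server node":  {"events_per_sec": 1200.0, "watts": 350.0},
        "ARMv8 SoC node":   {"events_per_sec": 300.0,  "watts": 60.0},
        "Xeon Phi co-proc": {"events_per_sec": 1500.0, "watts": 300.0},
    }

    for name, p in platforms.items():
        ppw = p["events_per_sec"] / p["watts"]   # the performance-per-watt metric
        print(f"{name:18s} {ppw:6.2f} events/s per watt")
    ```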

  36. High-performance computing on GPUs for resistivity logging of oil and gas wells

    NASA Astrophysics Data System (ADS)

    Glinskikh, V.; Dudaev, A.; Nechaev, O.; Surodina, I.

    2017-10-01

    We developed and implemented into software an algorithm for high-performance simulation of electrical logs from oil and gas wells using high-performance heterogeneous computing. The numerical solution of the 2D forward problem is based on the finite-element method and the Cholesky decomposition for solving a system of linear algebraic equations (SLAE). Software implementations of the algorithm were made using NVIDIA CUDA technology and computing libraries, allowing us to perform the decomposition of the SLAE and find its solution on the central processing unit (CPU) and the graphics processing unit (GPU). The calculation time is analyzed depending on the matrix size and the number of its non-zero elements. We estimated the computing speed on the CPU and GPU, including high-performance heterogeneous CPU-GPU computing. Using the developed algorithm, we simulated resistivity data in realistic models.
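
    The core numerical step described above, solving the finite-element system by Cholesky decomposition, looks roughly as follows on the CPU; this sketch uses SciPy on a small dense stand-in matrix, whereas the paper offloads the same operation to the GPU with NVIDIA CUDA libraries.

    ```python
    import numpy as np
    from scipy.linalg import cho_factor, cho_solve

    rng = np.random.default_rng(0)
    n = 500
    M = rng.standard_normal((n, n))
    A = M @ M.T + n * np.eye(n)     # symmetric positive-definite stand-in for the FE matrix
    b = rng.standard_normal(n)      # right-hand side (source term)

    c, low = cho_factor(A)          # Cholesky decomposition A = L L^T
    x = cho_solve((c, low), b)      # forward/backward substitution
    print("residual:", np.linalg.norm(A @ x - b))
    ```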

  37. Optical interconnection networks for high-performance computing systems

    NASA Astrophysics Data System (ADS)

    Biberman, Aleksandr; Bergman, Keren

    2012-04-01

    Enabled by silicon photonic technology, optical interconnection networks have the potential to be a key disruptive technology in the computing and communication industries. The enduring pursuit of performance gains in computing, combined with stringent power constraints, has fostered the ever-growing computational parallelism associated with chip multiprocessors, memory systems, high-performance computing systems and data centers. Sustaining these parallelism growths introduces unique challenges for on- and off-chip communications, shifting the focus toward novel and fundamentally different communication approaches. Chip-scale photonic interconnection networks, enabled by high-performance silicon photonic devices, offer unprecedented bandwidth scalability with reduced power consumption. We demonstrate that silicon photonic platforms have already produced all the high-performance photonic devices required to realize these types of networks. Through extensive empirical characterization in much of our work, we demonstrate the feasibility of waveguides, modulators, switches and photodetectors. We also demonstrate systems that simultaneously combine many functionalities to achieve more complex building blocks. We propose novel silicon photonic devices, subsystems, network topologies and architectures to enable unprecedented performance of these photonic interconnection networks. Furthermore, the advantages of photonic interconnection networks extend far beyond the chip, offering advanced communication environments for memory systems, high-performance computing systems, and data centers.

  38. Quantum Accelerators for High-performance Computing Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Humble, Travis S.; Britt, Keith A.; Mohiyaddin, Fahd A.

    We define some of the programming and system-level challenges facing the application of quantum processing to high-performance computing. Alongside barriers to physical integration, prominent differences in the execution of quantum and conventional programs challenge the intersection of these computational models. Following a brief overview of the state of the art, we discuss recent advances in programming and execution models for hybrid quantum-classical computing. We discuss a novel quantum-accelerator framework that uses specialized kernels to offload select workloads while integrating with existing computing infrastructure. We elaborate on the role of the host operating system in managing these unique accelerator resources, the prospects for deploying quantum modules, and the requirements placed on the language hierarchy connecting these different system components. We draw on recent advances in the modeling and simulation of quantum computing systems with the development of architectures for hybrid high-performance computing systems and the realization of software stacks for controlling quantum devices. Finally, we present simulation results that describe the expected system-level behavior of high-performance computing systems composed from compute nodes with quantum processing units. We describe performance for these hybrid systems in terms of time-to-solution, accuracy, and energy consumption, and we use simple application examples to estimate the performance advantage of quantum acceleration.

  39. Staff | Computational Science | NREL

    Science.gov Websites

    develops and leads laboratory-wide efforts in high-performance computing and energy-efficient data centers Professional IV-High Perf Computing Jim.Albin@nrel.gov 303-275-4069 Ananthan, Shreyas Senior Scientist - High -Performance Algorithms and Modeling Shreyas.Ananthan@nrel.gov 303-275-4807 Bendl, Kurt IT Professional IV-High

  40. Computational Fluency Performance Profile of High School Students with Mathematics Disabilities

    ERIC Educational Resources Information Center

    Calhoon, Mary Beth; Emerson, Robert Wall; Flores, Margaret; Houchins, David E.

    2007-01-01

    The purpose of this descriptive study was to develop a computational fluency performance profile of 224 high school (Grades 9-12) students with mathematics disabilities (MD). Computational fluency performance was examined by grade-level expectancy (Grades 2-6) and skill area (whole numbers: addition, subtraction, multiplication, division;…

  41. High Performance Computing and Communications Act of 1991. Hearing Before the Subcommittee on Science, Technology, and Space of the Committee on Commerce, Science, and Transportation. One Hundred Second Congress, First Session on S. 272 To Provide for a Coordinated Federal Research Program To Ensure Continued United States Leadership in High-Performance Computing.

    ERIC Educational Resources Information Center

    Congress of the U.S., Washington, DC. Senate Committee on Commerce, Science, and Transportation.

    This hearing before the Senate Subcommittee on Science, Technology, and Space focuses on S. 272, the High-Performance Computing and Communications Act of 1991, a bill that provides for a coordinated federal research and development program to ensure continued U.S. leadership in this area. High-performance computing is defined as representing the…

  42. The DoD's High Performance Computing Modernization Program - Ensuring the National Earth System Prediction Capability Becomes Operational

    NASA Astrophysics Data System (ADS)

    Burnett, W.

    2016-12-01

    The Department of Defense's (DoD) High Performance Computing Modernization Program (HPCMP) provides high performance computing to address the most significant challenges in computational resources, software application support and nationwide research and engineering networks. Today, the HPCMP has a critical role in ensuring the National Earth System Prediction Capability (N-ESPC) achieves initial operational status in 2019. A 2015 study commissioned by the HPCMP found that N-ESPC computational requirements will exceed interconnect bandwidth capacity due to the additional load from data assimilation and passing connecting data between ensemble codes. Memory bandwidth and I/O bandwidth will continue to be significant bottlenecks for the Navy's Hybrid Coordinate Ocean Model (HYCOM) scalability - by far the major driver of computing resource requirements in the N-ESPC. The study also found that few of the N-ESPC model developers have detailed plans to ensure their respective codes scale through 2024. Three HPCMP initiatives are designed to directly address and support these issues: Productivity Enhancement, Technology, Transfer and Training (PETTT), the HPCMP Applications Software Initiative (HASI), and Frontier Projects. PETTT supports code conversion by providing assistance, expertise and training in scalable and high-end computing architectures. HASI addresses the continuing need for modern application software that executes effectively and efficiently on next-generation high-performance computers. Frontier Projects enable research and development that could not be achieved using typical HPCMP resources by providing multi-disciplinary teams access to exceptional amounts of high performance computing resources. Finally, the Navy's DoD Supercomputing Resource Center (DSRC) currently operates a 6 Petabyte system, of which Naval Oceanography receives 15% of operational computational system use, or approximately 1 Petabyte of the processing capability. The DSRC will provide the DoD with future computing assets to initially operate the N-ESPC in 2019. This talk will further describe how DoD's HPCMP will ensure N-ESPC becomes operational, efficiently and effectively, using next-generation high performance computing.

  43. Heterogeneous high throughput scientific computing with APM X-Gene and Intel Xeon Phi

    DOE PAGES

    Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; ...

    2015-05-22

    Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. As a result, we report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).

  44. Promoting High-Performance Computing and Communications. A CBO Study.

    ERIC Educational Resources Information Center

    Webre, Philip

    In 1991 the Federal Government initiated the multiagency High Performance Computing and Communications program (HPCC) to further the development of U.S. supercomputer technology and high-speed computer network technology. This overview by the Congressional Budget Office (CBO) concentrates on obstacles that might prevent the growth of the…

  45. HPC on Competitive Cloud Resources

    NASA Astrophysics Data System (ADS)

    Bientinesi, Paolo; Iakymchuk, Roman; Napper, Jeff

    Computing as a utility has reached the mainstream. Scientists can now easily rent time on large commercial clusters that can be expanded and reduced on demand in real time. However, current commercial cloud computing performance falls short of systems specifically designed for scientific applications. Scientific computing needs are quite different from those of the web applications that have been the focus of cloud computing vendors. In this chapter we demonstrate through empirical evaluation the computational efficiency of high-performance numerical applications in a commercial cloud environment when resources are shared under high contention. Using the Linpack benchmark as a case study, we show that cache utilization becomes highly unpredictable and similarly affects computation time. For some problems, not only is it more efficient to underutilize resources, but the solution can be reached sooner in real time (wall time). We also show that the smallest, cheapest (64-bit) instance on the studied environment offers the best price-to-performance ratio. In light of the high contention we witness, we believe that alternative definitions of efficiency for commercial cloud environments should be introduced where strong performance guarantees do not exist. Concepts like average and expected performance and execution time, expected cost to completion, and variance measures--traditionally ignored in the high-performance computing context--should now complement or even substitute the standard definitions of efficiency.
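
    The alternative efficiency measures proposed above (expected execution time, expected cost to completion, variance) are straightforward to compute from repeated runs; the timings and hourly price in this sketch are invented for illustration.

    ```python
    import statistics

    # Repeated Linpack-style run times on a contended cloud instance (hypothetical).
    runtimes_hours = [1.8, 2.4, 3.1, 2.0, 4.2, 2.2]
    price_per_hour = 0.10                       # instance price in $ (hypothetical)

    expected_runtime = statistics.mean(runtimes_hours)
    runtime_variance = statistics.variance(runtimes_hours)
    expected_cost = expected_runtime * price_per_hour

    print(f"expected wall time : {expected_runtime:.2f} h")
    print(f"variance           : {runtime_variance:.2f} h^2")
    print(f"expected cost      : ${expected_cost:.3f} per run")
    ```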

  46. Lanczos eigensolution method for high-performance computers

    NASA Technical Reports Server (NTRS)

    Bostic, Susan W.

    1991-01-01

    The theory, computational analysis, and applications of a Lanczos algorithm on high-performance computers are presented. The computationally intensive steps of the algorithm are identified as: the matrix factorization, the forward/backward equation solution, and the matrix-vector multiplications. These computational steps are optimized to exploit the vector and parallel capabilities of high-performance computers. The savings in computational time from applying optimization techniques such as variable-band and sparse data storage and access, loop unrolling, use of local memory, and compiler directives are presented. Two large-scale structural analysis applications are described: the buckling of a composite blade-stiffened panel with a cutout, and the vibration analysis of a high-speed civil transport. The sequential computational time of 181.6 seconds for the panel problem executed on a CONVEX computer was decreased to 14.1 seconds with the optimized vector algorithm. The best computational time of 23 seconds for the transport problem, with 17,000 degrees of freedom, was obtained on the Cray Y-MP using an average of 3.63 processors.
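
    A bare-bones version of the Lanczos iteration referenced in this abstract is sketched below; it omits re-orthogonalization and all of the vector/parallel optimizations the report describes, and the test matrix is a toy stand-in.

    ```python
    import numpy as np

    def lanczos(A, m, seed=0):
        """Reduce symmetric A to an m x m tridiagonal T and return T's eigenvalues."""
        n = A.shape[0]
        rng = np.random.default_rng(seed)
        q = rng.standard_normal(n)
        q /= np.linalg.norm(q)
        q_prev = np.zeros(n)
        beta = 0.0
        alphas, betas = [], []
        for _ in range(m):
            w = A @ q - beta * q_prev       # matrix-vector multiplication dominates the cost
            alpha = q @ w
            w -= alpha * q
            beta = np.linalg.norm(w)
            alphas.append(alpha)
            betas.append(beta)
            if beta < 1e-12:                # invariant subspace found; stop early
                break
            q_prev, q = q, w / beta
        T = np.diag(alphas) + np.diag(betas[:-1], 1) + np.diag(betas[:-1], -1)
        return np.linalg.eigvalsh(T)

    A = np.diag(np.arange(1.0, 201.0))      # toy symmetric "stiffness" matrix
    print("largest Ritz values:", lanczos(A, 30)[-3:])   # approximate extreme eigenvalues
    ```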

  47. Micromagnetics on high-performance workstation and mobile computational platforms

    NASA Astrophysics Data System (ADS)

    Fu, S.; Chang, R.; Couture, S.; Menarini, M.; Escobar, M. A.; Kuteifan, M.; Lubarda, M.; Gabay, D.; Lomakin, V.

    2015-05-01

    The feasibility of using high-performance desktop and embedded mobile computational platforms is presented, including multi-core Intel central processing units, Nvidia desktop graphics processing units, and the Nvidia Jetson TK1 platform. The FastMag finite-element-method-based micromagnetic simulator is used as a testbed, showing high efficiency on all the platforms. Optimization aspects of improving the performance of the mobile systems are discussed. The high performance, low cost, low power consumption, and rapid performance increase of the embedded mobile systems make them a promising candidate for micromagnetic simulations. Such architectures can be used as standalone systems or can be combined into low-power computing clusters.

  48. High performance systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vigil, M.B.

    1995-03-01

    This document provides a written compilation of the presentations and viewgraphs from the 1994 Conference on High Speed Computing, "High Performance Systems," held at Gleneden Beach, Oregon, on April 18 through 21, 1994.

  49. Demonstration of Cost-Effective, High-Performance Computing at Performance and Reliability Levels Equivalent to a 1994 Vector Supercomputer

    NASA Technical Reports Server (NTRS)

    Babrauckas, Theresa

    2000-01-01

    The Affordable High Performance Computing (AHPC) project demonstrated that high-performance computing based on a distributed network of computer workstations is a cost-effective alternative to vector supercomputers for running CPU- and memory-intensive design and analysis tools. The AHPC project created an integrated system called a Network Supercomputer. By connecting computer workstations through a network and utilizing the workstations when they are idle, the resulting distributed-workstation environment has the same performance and reliability levels as the Cray C90 vector supercomputer at less than 25 percent of the C90 cost. In fact, the cost comparison between a Cray C90 supercomputer and Sun workstations showed that a set of distributed networked workstations equivalent to a C90 costs approximately 8 percent of the C90.

  50. High Performance Computing at NASA

    NASA Technical Reports Server (NTRS)

    Bailey, David H.; Cooper, D. M. (Technical Monitor)

    1994-01-01

    The speaker will give an overview of high-performance computing in the U.S. in general and within NASA in particular, including a description of the recently signed NASA-IBM cooperative agreement. The latest performance figures of various parallel systems on the NAS Parallel Benchmarks will be presented. The speaker was one of the authors of the NAS (Numerical Aerodynamic Simulation) Parallel Benchmarks, which are now widely cited in the industry as a measure of sustained performance on realistic high-end scientific applications. It will be shown that significant progress has been made by the highly parallel supercomputer industry during the past year or so, with several new systems, based on high-performance RISC processors, that now deliver superior performance per dollar compared to conventional supercomputers. Various pitfalls in reporting performance will be discussed. The speaker will then conclude by assessing the general state of the high-performance computing field.

  51. Computational Particle Dynamic Simulations on Multicore Processors (CPDMu) Final Report Phase I

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schmalz, Mark S

    2011-07-24

    Statement of Problem - The Department of Energy has many legacy codes for simulation of computational particle dynamics and computational fluid dynamics applications that are designed to run on sequential processors and are not easily parallelized. Emerging high-performance computing architectures employ massively parallel multicore architectures (e.g., graphics processing units) to increase throughput. Parallelization of legacy simulation codes is a high priority, to achieve compatibility, efficiency, accuracy, and extensibility. General Statement of Solution - A legacy simulation application designed for implementation on mainly sequential processors has been represented as a graph G. Mathematical transformations, applied to G, produce a graph representation G′ for a high-performance architecture. Key computational and data-movement kernels of the application were analyzed and optimized for parallel execution using the mapping G → G′, which can be performed semi-automatically. This approach is widely applicable to many types of high-performance computing systems, such as graphics processing units or clusters comprised of nodes that contain one or more such units. Phase I Accomplishments - Phase I research decomposed/profiled computational particle dynamics simulation code for rocket fuel combustion into low and high computational cost regions (respectively, mainly sequential and mainly parallel kernels), with analysis of space and time complexity. Using the research team's expertise in algorithm-to-architecture mappings, the high-cost kernels were transformed, parallelized, and implemented on Nvidia Fermi GPUs. Measured speedups (GPU with respect to single-core CPU) were approximately 20-32X for realistic model parameters, without final optimization. Error analysis showed no loss of computational accuracy. Commercial Applications and Other Benefits - The proposed research will constitute a breakthrough in the solution of problems related to efficient parallel computation of particle and fluid dynamics simulations. These problems occur throughout the DOE, military, and commercial sectors; the potential payoff is high. We plan to license or sell the solution to contractors for military and domestic applications such as disaster simulation (aerodynamic and hydrodynamic), government agencies (hydrological and environmental simulations), and medical applications (e.g., in tomographic image reconstruction). Keywords - High-performance Computing, Graphics Processing Unit, Fluid/Particle Simulation. Summary for Members of Congress - The Department of Energy has many simulation codes that must compute faster to be effective. The Phase I research parallelized particle/fluid simulations for rocket combustion for high-performance computing systems.
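
    The graph-based mapping G → G′ described above can be caricatured in a few lines; the kernel names, costs, cost threshold, and assumed GPU speedup below are illustrative only and do not come from the report.

    ```python
    # Represent the legacy application as a map of kernels with estimated costs,
    # then produce G' by marking the parallelizable high-cost kernels for the GPU.
    kernels_G = {  # kernel -> (cost in CPU-seconds per step, parallelizable?)
        "read_input":      (0.5,  False),
        "particle_push":   (40.0, True),
        "collision_terms": (25.0, True),
        "write_output":    (1.0,  False),
    }
    GPU_SPEEDUP = 20.0        # assumed speedup for offloaded kernels
    COST_THRESHOLD = 10.0     # assumed boundary between low- and high-cost regions

    def map_to_accelerator(G):
        """Produce G': the same kernels, with parallel high-cost ones placed on the GPU."""
        G_prime = {}
        for name, (cost, parallel) in G.items():
            if parallel and cost > COST_THRESHOLD:
                G_prime[name] = ("GPU", cost / GPU_SPEEDUP)
            else:
                G_prime[name] = ("CPU", cost)
        return G_prime

    G_prime = map_to_accelerator(kernels_G)
    before = sum(cost for cost, _ in kernels_G.values())
    after = sum(cost for _, cost in G_prime.values())
    print(G_prime)
    print(f"estimated time per step: {before:.1f}s -> {after:.2f}s")
    ```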

  52. David Sickinger | NREL

    Science.gov Websites

    Sickinger David Sickinger Researcher III-High Performance Computing David.Sickinger@nrel.gov | 303 -275-3724 David Sickinger works with NREL's High Performance Computing Systems & Operations group

  53. Performance measurement and modeling of component applications in a high performance computing environment: a case study.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Armstrong, Robert C.; Ray, Jaideep; Malony, A.

    2003-11-01

    We present a case study of performance measurement and modeling of a CCA (Common Component Architecture) component-based application in a high-performance computing environment. We explore issues peculiar to component-based HPC applications and propose a performance measurement infrastructure for HPC based loosely on recent work done for Grid environments. A prototypical implementation of the infrastructure is used to collect data for three components in a scientific application and to construct performance models for two of them. Both computational and message-passing performance are addressed.

  54. High-Performance Computing for the Electromagnetic Modeling and Simulation of Interconnects

    NASA Technical Reports Server (NTRS)

    Schutt-Aine, Jose E.

    1996-01-01

    The electromagnetic modeling of packages and interconnects plays a very important role in the design of high-speed digital circuits, and is most efficiently performed by using computer-aided design algorithms. In recent years, packaging has become a critical area in the design of high-speed communication systems and fast computers, and the importance of the software support for their development has increased accordingly. Throughout this project, our efforts have focused on the development of modeling and simulation techniques and algorithms that permit the fast computation of the electrical parameters of interconnects and the efficient simulation of their electrical performance.

  55. Effect of computer game playing on baseline laparoscopic simulator skills.

    PubMed

    Halvorsen, Fredrik H; Cvancarova, Milada; Fosse, Erik; Mjåland, Odd

    2013-08-01

    Studies examining the possible association between computer game playing and laparoscopic performance in general have yielded conflicting results, and a relationship between computer game playing and baseline performance on laparoscopic simulators has not been established. The aim of this study was to examine the possible association between previous and present computer game playing and baseline performance on a virtual reality laparoscopic simulator in a sample of potential future medical students. The participating students completed a questionnaire covering the weekly amount and type of computer game playing activity during the previous year and 3 years ago. They then performed 2 repetitions of 2 tasks ("gallbladder dissection" and "traverse tube") on a virtual reality laparoscopic simulator. Performance on the simulator was then analyzed for association with their computer game experience. The setting was a local high school in Norway; forty-eight students from 2 high school classes volunteered to participate in the study. No association between prior and present computer game playing and baseline performance was found. The results were similar both for prior and present action game playing and for prior and present computer game playing in general. Our results indicate that prior and present computer game playing may not affect baseline performance in a virtual reality simulator.

  56. Accelerating Large Scale Image Analyses on Parallel, CPU-GPU Equipped Systems

    PubMed Central

    Teodoro, George; Kurc, Tahsin M.; Pan, Tony; Cooper, Lee A.D.; Kong, Jun; Widener, Patrick; Saltz, Joel H.

    2014-01-01

    The past decade has witnessed a major paradigm shift in high performance computing with the introduction of accelerators as general purpose processors. These computing devices make available very high parallel computing power at low cost and power consumption, transforming current high performance platforms into heterogeneous CPU-GPU equipped systems. Although the theoretical performance achieved by these hybrid systems is impressive, taking practical advantage of this computing power remains a very challenging problem. Most applications are still deployed to either GPU or CPU, leaving the other resource under- or un-utilized. In this paper, we propose, implement, and evaluate a performance aware scheduling technique along with optimizations to make efficient collaborative use of CPUs and GPUs on a parallel system. In the context of feature computations in large scale image analysis applications, our evaluations show that intelligently co-scheduling CPUs and GPUs can significantly improve performance over GPU-only or multi-core CPU-only approaches. PMID:25419545
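
    A toy version of the performance-aware co-scheduling idea is sketched below: each task goes to whichever device is predicted to finish it soonest; the device speeds and task sizes are invented, and the paper's actual scheduler is considerably more sophisticated.

    ```python
    # Greedy earliest-finish-time assignment of feature-computation tasks to a GPU
    # and two CPU workers, using assumed per-device throughputs.
    device_speed = {"gpu": 8.0, "cpu0": 1.0, "cpu1": 1.0}   # work units per second (assumed)
    tasks = [12.0, 3.0, 7.0, 1.0, 9.0, 2.0, 5.0]            # work per image tile (assumed)

    free_at = {dev: 0.0 for dev in device_speed}
    assignment = []

    for work in sorted(tasks, reverse=True):                # place the largest tiles first
        # Choose the device with the earliest predicted completion time for this task.
        dev = min(device_speed, key=lambda d: free_at[d] + work / device_speed[d])
        finish = free_at[dev] + work / device_speed[dev]
        free_at[dev] = finish
        assignment.append((dev, work, round(finish, 2)))

    print(assignment)
    print("makespan:", round(max(free_at.values()), 2))
    ```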

  57. Petascale supercomputing to accelerate the design of high-temperature alloys

    DOE PAGES

    Shin, Dongwon; Lee, Sangkeun; Shyam, Amit; ...

    2017-10-25

    Recent progress in high-performance computing and data informatics has opened up numerous opportunities to aid the design of advanced materials. Herein, we demonstrate a computational workflow that includes rapid population of high-fidelity materials datasets via petascale computing and subsequent analyses with modern data science techniques. We use a first-principles approach based on density functional theory to derive the segregation energies of 34 microalloying elements at the coherent and semi-coherent interfaces between the aluminium matrix and the θ′-Al2Cu precipitate, which requires several hundred supercell calculations. We also perform extensive correlation analyses to identify materials descriptors that affect the segregation behaviour of solutes at the interfaces. Finally, we show an example of leveraging machine learning techniques to predict segregation energies without performing computationally expensive physics-based simulations. As a result, the approach demonstrated in the present work can be applied to any high-temperature alloy system for which key materials data can be obtained using high-performance computing.

  58. Petascale supercomputing to accelerate the design of high-temperature alloys

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shin, Dongwon; Lee, Sangkeun; Shyam, Amit

    Recent progress in high-performance computing and data informatics has opened up numerous opportunities to aid the design of advanced materials. Herein, we demonstrate a computational workflow that includes rapid population of high-fidelity materials datasets via petascale computing and subsequent analyses with modern data science techniques. We use a first-principles approach based on density functional theory to derive the segregation energies of 34 microalloying elements at the coherent and semi-coherent interfaces between the aluminium matrix and the θ′-Al2Cu precipitate, which requires several hundred supercell calculations. We also perform extensive correlation analyses to identify materials descriptors that affect the segregation behaviour of solutes at the interfaces. Finally, we show an example of leveraging machine learning techniques to predict segregation energies without performing computationally expensive physics-based simulations. As a result, the approach demonstrated in the present work can be applied to any high-temperature alloy system for which key materials data can be obtained using high-performance computing.

  59. Petascale supercomputing to accelerate the design of high-temperature alloys

    NASA Astrophysics Data System (ADS)

    Shin, Dongwon; Lee, Sangkeun; Shyam, Amit; Haynes, J. Allen

    2017-12-01

    Recent progress in high-performance computing and data informatics has opened up numerous opportunities to aid the design of advanced materials. Herein, we demonstrate a computational workflow that includes rapid population of high-fidelity materials datasets via petascale computing and subsequent analyses with modern data science techniques. We use a first-principles approach based on density functional theory to derive the segregation energies of 34 microalloying elements at the coherent and semi-coherent interfaces between the aluminium matrix and the θ′-Al2Cu precipitate, which requires several hundred supercell calculations. We also perform extensive correlation analyses to identify materials descriptors that affect the segregation behaviour of solutes at the interfaces. Finally, we show an example of leveraging machine learning techniques to predict segregation energies without performing computationally expensive physics-based simulations. The approach demonstrated in the present work can be applied to any high-temperature alloy system for which key materials data can be obtained using high-performance computing.
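
    The final step mentioned in these abstracts, predicting segregation energies from materials descriptors with machine learning, can be sketched with a standard regression model; the descriptors and energies below are synthetic placeholders rather than the paper's DFT dataset.

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(42)
    n_solutes = 34
    # Hypothetical elemental descriptors (e.g., atomic radius, electronegativity, valence).
    X = rng.uniform(size=(n_solutes, 3))
    # Synthetic "segregation energies" with a known dependence plus noise (eV).
    y = 0.8 * X[:, 0] - 0.5 * X[:, 1] + 0.2 * X[:, 2] + 0.05 * rng.standard_normal(n_solutes)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
    model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)
    print("held-out R^2:", model.score(X_test, y_test))
    ```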

  20. Using High Performance Computing to Examine the Processes of Neurogenesis Underlying Pattern Separation/Completion of Episodic Information.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aimone, James Bradley; Betty, Rita

    Sandia researchers developed novel methods and metrics for studying the computational function of neurogenesis, work with a substantial impact on the neuroscience and neural computing communities. This work could benefit applications in machine learning and other analysis activities.

  1. Templet Web: the use of volunteer computing approach in PaaS-style cloud

    NASA Astrophysics Data System (ADS)

    Vostokin, Sergei; Artamonov, Yuriy; Tsarev, Daniil

    2018-03-01

    This article presents the Templet Web cloud service. The service is designed for the automation of high-performance scientific computing. The use of high-performance technology is specifically required by new fields of computational science such as data mining, artificial intelligence, machine learning, and others. Cloud technologies provide a significant cost reduction for high-performance scientific applications. The main objectives to achieve this cost reduction in the Templet Web service design are: (a) the implementation of "on-demand" access; (b) source code deployment management; (c) automation of high-performance computing program development. The distinctive feature of the service is an approach used mainly in the field of volunteer computing, in which a person who has access to a computer system delegates his or her access rights to a requesting user. We developed an access procedure, algorithms, and software for utilization of free computational resources of the academic cluster system in line with the methods of volunteer computing. The Templet Web service has been in operation for five years. It has been successfully used for conducting laboratory workshops and solving research problems, some of which are considered in this article. The article also provides an overview of research directions related to service development.

  2. Research | Computational Science | NREL

    Science.gov Websites

    NREL's computational science experts use advanced high-performance computing (HPC) technologies, thereby accelerating the transformation of our nation's energy system. These computational science capabilities enable high-impact research.

  3. Generic Divide and Conquer Internet-Based Computing

    NASA Technical Reports Server (NTRS)

    Follen, Gregory J. (Technical Monitor); Radenski, Atanas

    2003-01-01

    The growth of Internet-based applications and the proliferation of networking technologies have been transforming traditional commercial application areas as well as computer and computational sciences and engineering. This growth stimulates the exploration of Peer to Peer (P2P) software technologies that can open new research and application opportunities not only for the commercial world, but also for the scientific and high-performance computing applications community. The general goal of this project is to achieve better understanding of the transition to Internet-based high-performance computing and to develop solutions for some of the technical challenges of this transition. In particular, we are interested in creating long-term motivation for end users to provide their idle processor time to support computationally intensive tasks. We believe that a practical P2P architecture should provide useful service to both clients with high-performance computing needs and contributors of lower-end computing resources. To achieve this, we are designing a dual-service architecture for P2P high-performance divide-and-conquer computing; we are also experimenting with a prototype implementation. Our proposed architecture incorporates a master server, utilizes dual satellite servers, and operates on the Internet in a dynamically changing large configuration of lower-end nodes provided by volunteer contributors. A dual satellite server comprises a high-performance computing engine and a lower-end contributor service engine. The computing engine provides generic support for divide-and-conquer computations. The service engine is intended to provide free useful HTTP-based services to contributors of lower-end computing resources. Our proposed architecture is complementary to and accessible from computational grids, such as Globus, Legion, and Condor. Grids provide remote access to existing higher-end computing resources; in contrast, our goal is to utilize idle processor time of lower-end Internet nodes. Our project is focused on a generic divide-and-conquer paradigm and on mobile applications of this paradigm that can operate on a loose and ever-changing pool of lower-end Internet nodes.
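
    The generic divide-and-conquer pattern the project builds on can be sketched in a few lines; the toy version below uses local worker processes in place of Internet-distributed satellite servers, and the task and threshold are purely illustrative.

      # Toy divide-and-conquer: split the input until pieces are small enough, solve
      # leaves in a worker pool, and combine on the way back up. Local processes
      # stand in for the remote lower-end nodes described in the abstract.
      from concurrent.futures import ProcessPoolExecutor

      def solve_leaf(chunk):
          return sum(x * x for x in chunk)          # the "conquer" step on a small piece

      def divide(data, pool, threshold=10_000):
          if len(data) <= threshold:
              return pool.submit(solve_leaf, data)  # farm a leaf task out to a worker
          mid = len(data) // 2
          return divide(data[:mid], pool, threshold), divide(data[mid:], pool, threshold)

      def combine(node):
          if isinstance(node, tuple):
              return combine(node[0]) + combine(node[1])
          return node.result()

      if __name__ == "__main__":
          with ProcessPoolExecutor() as pool:
              print(combine(divide(list(range(100_000)), pool)))   # sum of squares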

  4. High Performance Computing Software Applications for Space Situational Awareness

    NASA Astrophysics Data System (ADS)

    Giuliano, C.; Schumacher, P.; Matson, C.; Chun, F.; Duncan, B.; Borelli, K.; Desonia, R.; Gusciora, G.; Roe, K.

    The High Performance Computing Software Applications Institute for Space Situational Awareness (HSAI-SSA) has completed its first full year of applications development. The emphasis of our work in this first year was on improving space surveillance sensor models and image enhancement software. These applications are the Space Surveillance Network Analysis Model (SSNAM), the Air Force Space Fence simulation (SimFence), and the physically constrained iterative deconvolution (PCID) image enhancement software tool. Specifically, we have demonstrated an order-of-magnitude speed-up in those codes running on the latest Cray XD-1 Linux supercomputer (Hoku) at the Maui High Performance Computing Center. The software application improvements that HSAI-SSA has made have had a significant impact on the warfighter and have fundamentally changed the role of high performance computing in SSA.

  5. System Resource Allocations | High-Performance Computing | NREL

    Science.gov Websites

    To use NREL's high-performance computing (HPC) resources, users request allocations of compute hours on NREL HPC systems, including Peregrine and Eagle, and of storage space (in terabytes) on Peregrine, Eagle, and Gyrfalcon. Allocations are principally made in response to an annual call for allocations.

  6. Achieving High Performance with FPGA-Based Computing

    PubMed Central

    Herbordt, Martin C.; VanCourt, Tom; Gu, Yongfeng; Sukhwani, Bharat; Conti, Al; Model, Josh; DiSabello, Doug

    2011-01-01

    Numerous application areas, including bioinformatics and computational biology, demand increasing amounts of processing capability. In many cases, the computation cores and data types are suited to field-programmable gate arrays. The challenge is identifying the design techniques that can extract high performance potential from the FPGA fabric. PMID:21603088

  7. Debugging a high performance computing program

    DOEpatents

    Gooding, Thomas M.

    2014-08-19

    Methods, apparatus, and computer program products are disclosed for debugging a high performance computing program by gathering lists of addresses of calling instructions for a plurality of threads of execution of the program, assigning the threads to groups in dependence upon the addresses, and displaying the groups to identify defective threads.
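
    The grouping idea in this patent abstract can be illustrated with a toy sketch (not the patented implementation): threads whose calling-instruction addresses match collapse into one group, so the small outlier groups are the ones to inspect. The addresses below are made up.

      # Toy illustration: group threads by the addresses of their calling instructions.
      # A real debugger would gather these addresses from the running program.
      from collections import defaultdict

      thread_call_addresses = {
          "thread-0": (0x4005a0, 0x400b10),
          "thread-1": (0x4005a0, 0x400b10),
          "thread-2": (0x4005a0, 0x400c44),   # diverges: a candidate defective thread
          "thread-3": (0x4005a0, 0x400b10),
      }

      groups = defaultdict(list)
      for tid, addrs in thread_call_addresses.items():
          groups[addrs].append(tid)

      for addrs, tids in sorted(groups.items(), key=lambda kv: len(kv[1])):
          print([hex(a) for a in addrs], "->", tids)   # smallest groups first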

  8. Debugging a high performance computing program

    DOEpatents

    Gooding, Thomas M.

    2013-08-20

    Methods, apparatus, and computer program products are disclosed for debugging a high performance computing program by gathering lists of addresses of calling instructions for a plurality of threads of execution of the program, assigning the threads to groups in dependence upon the addresses, and displaying the groups to identify defective threads.

  9. Expanding the Scope of High-Performance Computing Facilities

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Uram, Thomas D.; Papka, Michael E.

    The high-performance computing centers of the future will expand their roles as service providers, and as the machines scale up, so should the sizes of the communities they serve. National facilities must cultivate their users as much as they focus on operating machines reliably. The authors present five interrelated topic areas that are essential to expanding the value provided to those performing computational science.

  10. GPU-based High-Performance Computing for Radiation Therapy

    PubMed Central

    Jia, Xun; Ziegenhein, Peter; Jiang, Steve B.

    2014-01-01

    Recent developments in radiation therapy demand high computational power to solve challenging problems in a timely fashion in a clinical environment. The graphics processing unit (GPU), as an emerging high-performance computing platform, has been introduced to radiotherapy. It is particularly attractive due to its high computational power, small size, and low cost for facility deployment and maintenance. Over the past few years, GPU-based high-performance computing in radiotherapy has experienced rapid development. A tremendous number of studies have been conducted, in which large acceleration factors compared with the conventional CPU platform have been observed. In this article, we will first give a brief introduction to the GPU hardware structure and programming model. We will then review the current applications of GPU in major imaging-related and therapy-related problems encountered in radiotherapy. A comparison of GPU with other platforms will also be presented. PMID:24486639

  11. RGCA: A Reliable GPU Cluster Architecture for Large-Scale Internet of Things Computing Based on Effective Performance-Energy Optimization

    PubMed Central

    Chen, Qingkui; Zhao, Deyu; Wang, Jingjuan

    2017-01-01

    This paper aims to develop a low-cost, high-performance and high-reliability computing system to process large-scale data using common data mining algorithms in the Internet of Things (IoT) computing environment. Considering the characteristics of IoT data processing, similar to mainstream high performance computing, we use a GPU (Graphics Processing Unit) cluster to achieve better IoT services. Firstly, we present an energy consumption calculation method (ECCM) based on WSNs. Then, using the CUDA (Compute Unified Device Architecture) programming model, we propose a Two-level Parallel Optimization Model (TLPOM) which exploits reasonable resource planning and common compiler optimization techniques to obtain the best blocks and threads configuration considering the resource constraints of each node. The key to this part is dynamically coupling Thread-Level Parallelism (TLP) and Instruction-Level Parallelism (ILP) to improve the performance of the algorithms without additional energy consumption. Finally, combining the ECCM and the TLPOM, we use the Reliable GPU Cluster Architecture (RGCA) to obtain a high-reliability computing system considering the nodes' diversity, algorithm characteristics, etc. The results show that the performance of the algorithms increased significantly, by 34.1%, 33.96% and 24.07% for Fermi, Kepler and Maxwell on average with the TLPOM, and the RGCA ensures that our IoT computing system provides low-cost and high-reliability services. PMID:28777325
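
    As a rough illustration of the resource-constrained launch-configuration search that the TLPOM description implies (this is a generic heuristic, not the authors' model), one can enumerate block sizes and keep the one that leaves the most threads resident per multiprocessor. The resource limits below are assumed, not taken from the paper or any specific GPU.

      # Generic sketch: pick a CUDA-style launch configuration under per-SM limits.
      # All limits are illustrative assumptions.
      def pick_launch_config(n_items, regs_per_thread, smem_per_block,
                             max_threads_per_block=1024, regs_per_sm=65536,
                             smem_per_sm=49152, warp=32):
          best = (0, warp)
          for threads in range(warp, max_threads_per_block + 1, warp):
              if threads * regs_per_thread > regs_per_sm or smem_per_block > smem_per_sm:
                  continue
              # blocks resident per SM, limited by register file and shared memory
              blocks_per_sm = min(regs_per_sm // (threads * regs_per_thread),
                                  smem_per_sm // max(smem_per_block, 1))
              resident = blocks_per_sm * threads      # crude occupancy proxy
              if resident > best[0]:
                  best = (resident, threads)
          threads = best[1]
          return threads, (n_items + threads - 1) // threads   # (block size, grid size)

      print(pick_launch_config(n_items=1_000_000, regs_per_thread=32, smem_per_block=8192))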

  12. RGCA: A Reliable GPU Cluster Architecture for Large-Scale Internet of Things Computing Based on Effective Performance-Energy Optimization.

    PubMed

    Fang, Yuling; Chen, Qingkui; Xiong, Neal N; Zhao, Deyu; Wang, Jingjuan

    2017-08-04

    This paper aims to develop a low-cost, high-performance and high-reliability computing system to process large-scale data using common data mining algorithms in the Internet of Things (IoT) computing environment. Considering the characteristics of IoT data processing, similar to mainstream high performance computing, we use a GPU (Graphics Processing Unit) cluster to achieve better IoT services. Firstly, we present an energy consumption calculation method (ECCM) based on WSNs. Then, using the CUDA (Compute Unified Device Architecture) programming model, we propose a Two-level Parallel Optimization Model (TLPOM) which exploits reasonable resource planning and common compiler optimization techniques to obtain the best blocks and threads configuration considering the resource constraints of each node. The key to this part is dynamically coupling Thread-Level Parallelism (TLP) and Instruction-Level Parallelism (ILP) to improve the performance of the algorithms without additional energy consumption. Finally, combining the ECCM and the TLPOM, we use the Reliable GPU Cluster Architecture (RGCA) to obtain a high-reliability computing system considering the nodes' diversity, algorithm characteristics, etc. The results show that the performance of the algorithms increased significantly, by 34.1%, 33.96% and 24.07% for Fermi, Kepler and Maxwell on average with the TLPOM, and the RGCA ensures that our IoT computing system provides low-cost and high-reliability services.

  13. A high performance scientific cloud computing environment for materials simulations

    NASA Astrophysics Data System (ADS)

    Jorissen, K.; Vila, F. D.; Rehr, J. J.

    2012-09-01

    We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.

  14. DoD High Performance Computing Modernization Program Users Group Conference (HPCMP UGC 2011) Held in Portland, Oregon on June 20-23, 2011

    DTIC Science & Technology

    2011-06-01

    4. Conclusion: The Web-based AGeS system described in this paper is a computationally efficient and scalable system for high-throughput genome... A method for protecting web services involves making them more resilient to attack using autonomic computing techniques. This paper presents our initial... The papers in this book comprise the proceedings of the 2011 DoD High Performance Computing Modernization Program Users Group Conference (HPCMP UGC 2011), held June 20-23, 2011.

  15. High performance network and channel-based storage

    NASA Technical Reports Server (NTRS)

    Katz, Randy H.

    1991-01-01

    In the traditional mainframe-centered view of a computer system, storage devices are coupled to the system through complex hardware subsystems called input/output (I/O) channels. With the dramatic shift towards workstation-based computing, and its associated client/server model of computation, storage facilities are now found attached to file servers and distributed throughout the network. We discuss the underlying technology trends that are leading to high performance network-based storage, namely advances in networks, storage devices, and I/O controller and server architectures. We review several commercial systems and research prototypes that are leading to a new approach to high performance computing based on network-attached storage.

  16. High Performance Computing (HPC)-Enabled Computational Study on the Feasibility of using Shape Memory Alloys for Gas Turbine Blade Actuation

    DTIC Science & Technology

    2016-11-01

    Authors: Kathryn Esham, Luis Bravo, Anindya Ghoshal, Muthuvel Murugan, and Michael... (the report title and author list are repeated in the document front matter).

  17. A Research and Development Strategy for High Performance Computing.

    ERIC Educational Resources Information Center

    Office of Science and Technology Policy, Washington, DC.

    This report is the result of a systematic review of the status and directions of high performance computing and its relationship to federal research and development. Conducted by the Federal Coordinating Council for Science, Engineering, and Technology (FCCSET), the review involved a series of workshops attended by numerous computer scientists and…

  18. DCL System Using Deep Learning Approaches for Land-Based or Ship-Based Real Time Recognition and Localization of Marine Mammals

    DTIC Science & Technology

    2015-09-30

    Clark (2014), "Using High Performance Computing to Explore Large Complex Bioacoustic Soundscapes : Case Study for Right Whale Acoustics," Procedia...34Using High Performance Computing to Explore Large Complex Bioacoustic Soundscapes : Case Study for Right Whale Acoustics," Procedia Computer Science 20

  19. P2P Technology for High-Performance Computing: An Overview

    NASA Technical Reports Server (NTRS)

    Follen, Gregory J. (Technical Monitor); Berry, Jason

    2003-01-01

    The transition from cluster computing to peer-to-peer (P2P) high-performance computing has recently attracted the attention of the computer science community. It has been recognized that existing local networks and dedicated clusters of headless workstations can serve as inexpensive yet powerful virtual supercomputers. It has also been recognized that the vast number of lower-end computers connected to the Internet stay idle for as long as 90% of the time. The growing speed of Internet connections and the high availability of free CPU time encourage exploration of the possibility of using the whole Internet, rather than local clusters, as a massively parallel yet almost freely available P2P supercomputer. As a part of a larger project on P2P high-performance computing, it has been my goal to compile an overview of the P2P paradigm. I have studied various P2P platforms and I have compiled systematic brief descriptions of their most important characteristics. I have also experimented and obtained hands-on experience with selected P2P platforms, focusing on those that seem promising with respect to P2P high-performance computing. I have also compiled relevant literature and web references. I have prepared a draft technical report and I have summarized my findings in a poster paper.

  20. Distributed Accounting on the Grid

    NASA Technical Reports Server (NTRS)

    Thigpen, William; Hacker, Thomas J.; McGinnis, Laura F.; Athey, Brian D.

    2001-01-01

    By the late 1990s, the Internet was adequately equipped to move vast amounts of data between HPC (High Performance Computing) systems, and efforts were initiated to link the national infrastructure of high performance computational and data storage resources together into a general computational utility 'grid', analogous to the national electrical power grid infrastructure. The purpose of the Computational Grid is to provide dependable, consistent, pervasive, and inexpensive access to computational resources for the computing community in the form of a computing utility. This paper presents a fully distributed view of Grid usage accounting and a methodology for allocating Grid computational resources for use on a Grid computing system.

  1. Evaluating the Efficacy of the Cloud for Cluster Computation

    NASA Technical Reports Server (NTRS)

    Knight, David; Shams, Khawaja; Chang, George; Soderstrom, Tom

    2012-01-01

    Computing requirements vary by industry, and it follows that NASA and other research organizations have computing demands that fall outside the mainstream. While cloud computing made rapid inroads for tasks such as powering web applications, performance issues on highly distributed tasks hindered early adoption for scientific computation. One venture to address this problem is Nebula, NASA's homegrown cloud project tasked with delivering science-quality cloud computing resources. However, another industry development is Amazon's high-performance computing (HPC) instances on Elastic Compute Cloud (EC2), which promise improved performance for cluster computation. This paper presents results from a series of benchmarks run on Amazon EC2 and discusses the efficacy of current commercial cloud technology for running scientific applications across a cluster. In particular, a 240-core cluster of cloud instances achieved 2 TFLOPS on High-Performance Linpack (HPL) at 70% of theoretical computational performance. The cluster's local network also demonstrated sub-100 μs inter-process latency with sustained inter-node throughput in excess of 8 Gbps. Beyond HPL, a real-world Hadoop image processing task from NASA's Lunar Mapping and Modeling Project (LMMP) was run on a 29-instance cluster to process lunar and Martian surface images with sizes on the order of tens of gigapixels. These results demonstrate that while not a rival of dedicated supercomputing clusters, commercial cloud technology is now a feasible option for moderately demanding scientific workloads.
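
    A quick back-of-the-envelope check of the efficiency figure quoted above (sustained HPL result relative to theoretical peak), using only the numbers stated in the abstract:

      # Implied theoretical peak from the quoted HPL result and efficiency.
      sustained_tflops = 2.0     # measured High-Performance Linpack result (abstract)
      efficiency = 0.70          # quoted fraction of theoretical peak (abstract)
      peak_tflops = sustained_tflops / efficiency
      print(f"implied peak ~= {peak_tflops:.2f} TFLOPS, "
            f"or {peak_tflops * 1e3 / 240:.1f} GFLOPS per core on 240 cores")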

  2. Radio Synthesis Imaging - A High Performance Computing and Communications Project

    NASA Astrophysics Data System (ADS)

    Crutcher, Richard M.

    The National Science Foundation has funded a five-year High Performance Computing and Communications project at the National Center for Supercomputing Applications (NCSA) for the direct implementation of several of the computing recommendations of the Astronomy and Astrophysics Survey Committee (the "Bahcall report"). This paper is a summary of the project goals and a progress report. The project will implement a prototype of the next generation of astronomical telescope systems - remotely located telescopes connected by high-speed networks to very high performance, scalable architecture computers and on-line data archives, which are accessed by astronomers over Gbit/sec networks. Specifically, a data link has been installed between the BIMA millimeter-wave synthesis array at Hat Creek, California and NCSA at Urbana, Illinois for real-time transmission of data to NCSA. Data are automatically archived, and may be browsed and retrieved by astronomers using the NCSA Mosaic software. In addition, an on-line digital library of processed images will be established. BIMA data will be processed on a very high performance distributed computing system, with I/O, user interface, and most of the software system running on the NCSA Convex C3880 supercomputer or Silicon Graphics Onyx workstations connected by HiPPI to the high performance, massively parallel Thinking Machines Corporation CM-5. The very computationally intensive algorithms for calibration and imaging of radio synthesis array observations will be optimized for the CM-5 and new algorithms which utilize the massively parallel architecture will be developed. Code running simultaneously on the distributed computers will communicate using the Data Transport Mechanism developed by NCSA. The project will also use the BLANCA Gbit/s testbed network between Urbana and Madison, Wisconsin to connect an Onyx workstation in the University of Wisconsin Astronomy Department to the NCSA CM-5, for development of long-distance distributed computing. Finally, the project is developing 2D and 3D visualization software as part of the international AIPS++ project. This research and development project is being carried out by a team of experts in radio astronomy, algorithm development for massively parallel architectures, high-speed networking, database management, and Thinking Machines Corporation personnel. The development of this complete software, distributed computing, and data archive and library solution to the radio astronomy computing problem will advance our expertise in high performance computing and communications technology and the application of these techniques to astronomical data processing.

  3. Running Jobs on the Peregrine System | High-Performance Computing | NREL

    Science.gov Websites

    Learn how to run jobs on the Peregrine high-performance computing (HPC) system, including running different types of jobs, batch job scheduling policies (queue names, limits, etc.), requesting different node types, and sample batch scripts.

  4. High-performance scientific computing in the cloud

    NASA Astrophysics Data System (ADS)

    Jorissen, Kevin; Vila, Fernando; Rehr, John

    2011-03-01

    Cloud computing has the potential to open up high-performance computational science to a much broader class of researchers, owing to its ability to provide on-demand, virtualized computational resources. However, before such approaches can become commonplace, user-friendly tools must be developed that hide the unfamiliar cloud environment and streamline the management of cloud resources for many scientific applications. We have recently shown that high-performance cloud computing is feasible for parallelized x-ray spectroscopy calculations. We now present benchmark results for a wider selection of scientific applications focusing on electronic structure and spectroscopic simulation software in condensed matter physics. These applications are driven by an improved portable interface that can manage virtual clusters and run various applications in the cloud. We also describe a next generation of cluster tools, aimed at improved performance and a more robust cluster deployment. Supported by NSF grant OCI-1048052.

  5. Hadoop-MCC: Efficient Multiple Compound Comparison Algorithm Using Hadoop.

    PubMed

    Hua, Guan-Jie; Hung, Che-Lun; Tang, Chuan Yi

    2018-01-01

    In the past decade, drug design technologies have improved enormously. Computer-aided drug design (CADD) has played an important role in analysis and prediction in drug development, making the procedure more economical and efficient. However, computation with big data, such as ZINC containing more than 60 million compounds and GDB-13 with more than 930 million small molecules, poses a noticeable time-consumption problem. Therefore, we propose a novel heterogeneous high performance computing method, named Hadoop-MCC, integrating Hadoop and GPU, to cope with big chemical structure data efficiently. Hadoop-MCC gains high availability and fault tolerance from Hadoop, as Hadoop is used to scatter input data to GPU devices and gather the results from them. The Hadoop framework adopts a mapper/reducer computation model. In the proposed method, mappers are responsible for fetching SMILES data segments and performing the LINGO method on the GPU; reducers then collect all comparison results produced by the mappers. Due to the high availability of Hadoop, all LINGO computational jobs on the mappers can be completed even if some of the mappers encounter problems. A LINGO comparison is performed on each GPU device in parallel. According to the experimental results, the proposed method on multiple GPU devices achieves better computational performance than CUDA-MCC on a single GPU device. Hadoop-MCC achieves the scalability, high availability, and fault tolerance granted by Hadoop, as well as high performance by integrating the computational power of both Hadoop and GPU. It has been shown that a heterogeneous architecture such as Hadoop-MCC can deliver better computational performance than a single GPU device.
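
    The comparison that the mappers perform can be sketched in plain Python (no Hadoop or GPU here); LINGO similarity is commonly computed as a Tanimoto coefficient over the multisets of length-4 SMILES substrings, which is the assumption made below. The example SMILES strings are illustrative.

      # Minimal LINGO-style multiple compound comparison, written as map/reduce steps
      # run locally in plain Python rather than on Hadoop or a GPU device.
      from collections import Counter
      from itertools import combinations

      def lingos(smiles, q=4):
          """Multiset of length-q substrings of a SMILES string (ring digits -> '0')."""
          s = "".join("0" if c.isdigit() else c for c in smiles)
          return Counter(s[i:i + q] for i in range(len(s) - q + 1))

      def lingo_similarity(a, b):
          """Tanimoto coefficient over the two LINGO multisets."""
          keys = set(a) | set(b)
          inter = sum(min(a[k], b[k]) for k in keys)
          union = sum(max(a[k], b[k]) for k in keys)
          return inter / union if union else 0.0

      compounds = {"aspirin": "CC(=O)OC1=CC=CC=C1C(=O)O",
                   "salicylic_acid": "OC(=O)C1=CC=CC=C1O",
                   "caffeine": "CN1C=NC2=C1C(=O)N(C)C(=O)N2C"}

      profiles = {name: lingos(smi) for name, smi in compounds.items()}   # "map" step
      for (n1, p1), (n2, p2) in combinations(profiles.items(), 2):        # "reduce" step
          print(n1, n2, round(lingo_similarity(p1, p2), 3))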

  6. Role of HPC in Advancing Computational Aeroelasticity

    NASA Technical Reports Server (NTRS)

    Guruswamy, Guru P.

    2004-01-01

    On behalf of the High Performance Computing Modernization Program (HPCMP) and the NASA Advanced Supercomputing (NAS) Division, a study is conducted to assess the role of supercomputers in the computational aeroelasticity of aerospace vehicles. The study is mostly based on responses to a web-based questionnaire that was designed to capture the nuances of high performance computational aeroelasticity, particularly on parallel computers. A procedure is presented to assign a fidelity-complexity index to each application. Case studies based on major applications using HPCMP resources are presented.

  7. H.R. 656--The High Performance Computer Technology Act of 1991. Hearing before the Subcommittee on Science, and the Subcommittee on Technology and Competitiveness of the Committee on Science, Space, and Technology. U.S. House of Representatives, One Hundred Second Congress, First Session.

    ERIC Educational Resources Information Center

    Congress of the U.S., Washington, DC. House Committee on Science, Space and Technology.

    This hearing focused on H. R. 656, companion bill of S. 272, which calls for high performance computing legislation. This is one of several initiatives to provide for a coordinated federal research program to ensure continued U.S. leadership in high performance computing. The bill authorizes the development of a National Research and Education…

  8. [Series: Medical Applications of the PHITS Code (2): Acceleration by Parallel Computing].

    PubMed

    Furuta, Takuya; Sato, Tatsuhiko

    2015-01-01

    Time-consuming Monte Carlo dose calculation has become feasible owing to the development of computer technology. However, the recent gains are due to the emergence of multi-core high performance computers, so parallel computing has become key to achieving good performance in software programs. The Monte Carlo simulation code PHITS contains two parallel computing functions: distributed-memory parallelization using the message passing interface (MPI) protocol, and shared-memory parallelization using open multi-processing (OpenMP) directives. Users can choose between the two functions according to their needs. This paper explains the two functions, with their advantages and disadvantages. Some test applications are also provided to show their performance on a typical multi-core high performance workstation.
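
    The distributed-memory option described above can be illustrated with a small mpi4py sketch (not PHITS itself): each rank simulates its share of the Monte Carlo histories with an independent random stream, and the partial tallies are reduced to rank 0. The tally itself is a stand-in.

      # Illustrative MPI split of Monte Carlo histories across ranks (not PHITS code).
      import random
      from mpi4py import MPI

      comm = MPI.COMM_WORLD
      rank, size = comm.Get_rank(), comm.Get_size()

      total_histories = 1_000_000
      local_histories = total_histories // size + (1 if rank < total_histories % size else 0)

      rng = random.Random(12345 + rank)       # independent stream per rank
      local_tally = sum(rng.random() for _ in range(local_histories))   # stand-in for a dose tally

      global_tally = comm.reduce(local_tally, op=MPI.SUM, root=0)
      if rank == 0:
          print("mean score per history:", global_tally / total_histories)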

  9. GRAPE project

    NASA Astrophysics Data System (ADS)

    Makino, Junichiro

    2002-12-01

    We overview our GRAvity PipE (GRAPE) project to develop special-purpose computers for astrophysical N-body simulations. The basic idea of GRAPE is to attach a custom-built computer, dedicated to the calculation of gravitational interactions between particles, to a general-purpose programmable computer. With this hybrid architecture, we can achieve both a wide range of applications and very high peak performance. Our newest machine, GRAPE-6, achieved a peak speed of 32 Tflops and a sustained performance of 11.55 Tflops, for a total budget of about 4 million USD. We also discuss the relative advantages of special-purpose and general-purpose computers and the future of high-performance computing for science and technology.
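
    The kernel that GRAPE-class hardware accelerates is the direct O(N^2) pairwise gravitational interaction; a plain NumPy sketch of that kernel is shown below (the softening length eps is an assumed parameter, and the particle data are random).

      # Direct-summation gravitational accelerations: the O(N^2) kernel GRAPE accelerates.
      import numpy as np

      def accelerations(pos, mass, G=1.0, eps=1e-3):
          """pos: (N, 3) positions, mass: (N,) masses -> (N, 3) accelerations."""
          dx = pos[None, :, :] - pos[:, None, :]        # displacement from particle i to j
          r2 = (dx ** 2).sum(axis=-1) + eps ** 2        # softened squared distances
          inv_r3 = r2 ** -1.5
          np.fill_diagonal(inv_r3, 0.0)                 # no self-interaction
          return G * (dx * (mass[None, :, None] * inv_r3[:, :, None])).sum(axis=1)

      rng = np.random.default_rng(0)
      pos = rng.standard_normal((256, 3))
      mass = np.full(256, 1.0 / 256)
      print(accelerations(pos, mass).shape)             # (256, 3)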

  10. Construction of Blaze at the University of Illinois at Chicago: A Shared, High-Performance, Visual Computer for Next-Generation Cyberinfrastructure-Accelerated Scientific, Engineering, Medical and Public Policy Research

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, Maxine D.; Leigh, Jason

    2014-02-17

    The Blaze high-performance visual computing system serves the high-performance computing research and education needs of the University of Illinois at Chicago (UIC). Blaze consists of a state-of-the-art, networked, computer cluster and ultra-high-resolution visualization system called CAVE2(TM) that is currently not available anywhere else in Illinois. This system is connected via a high-speed 100-Gigabit network to the State of Illinois' I-WIRE optical network, as well as to national and international high speed networks, such as the Internet2 and the Global Lambda Integrated Facility. This enables Blaze to serve as an on-ramp to national cyberinfrastructure, such as the National Science Foundation's Blue Waters petascale computer at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign and the Department of Energy's Argonne Leadership Computing Facility (ALCF) at Argonne National Laboratory. DOE award # DE-SC005067, leveraged with NSF award #CNS-0959053 for "Development of the Next-Generation CAVE Virtual Environment (NG-CAVE)," enabled us to create a first-of-its-kind high-performance visual computing system. The UIC Electronic Visualization Laboratory (EVL) worked with two U.S. companies to advance their commercial products and maintain U.S. leadership in the global information technology economy. New applications are being enabled with the CAVE2/Blaze visual computing system that are advancing scientific research and education in the U.S. and globally, and helping to train the next-generation workforce.

  11. Development of a small-scale computer cluster

    NASA Astrophysics Data System (ADS)

    Wilhelm, Jay; Smith, Justin T.; Smith, James E.

    2008-04-01

    An increase in demand for computing power in academia has created a need for high-performance machines. The computing power of a single processor has been steadily increasing, but lags behind the demand for fast simulations. Since a single processor has hard limits to its performance, a cluster of computers can, with the proper software, multiply the performance of a single computer. Cluster computing has therefore become a much sought-after technology. Typical desktop computers could be used for cluster computing, but they are not intended for constant full-speed operation and take up more space than rack-mount servers. Specialty computers that are designed to be used in clusters meet high-availability and space requirements, but can be costly. A market segment exists where custom-built desktop computers can be arranged in a rack-mount configuration, gaining the space savings of traditional rack-mount computers while remaining cost effective. To explore these possibilities, an experiment was performed to develop a computing cluster using desktop components for the purpose of decreasing the computation time of advanced simulations. This study indicates that a small-scale cluster can be built from off-the-shelf components that multiplies the performance of a single desktop machine, while minimizing occupied space and still remaining cost effective.

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bonachea, D.; Dickens, P.; Thakur, R.

    There is a growing interest in using Java as the language for developing high-performance computing applications. To be successful in the high-performance computing domain, however, Java must not only be able to provide high computational performance, but also high-performance I/O. In this paper, we first examine several approaches that attempt to provide high-performance I/O in Java - many of which are not obvious at first glance - and evaluate their performance on two parallel machines, the IBM SP and the SGI Origin2000. We then propose extensions to the Java I/O library that address the deficiencies in the Java I/O API and improve performance dramatically. The extensions add bulk (array) I/O operations to Java, thereby removing much of the overhead currently associated with array I/O in Java. We have implemented the extensions in two ways: in a standard JVM using the Java Native Interface (JNI) and in a high-performance parallel dialect of Java called Titanium. We describe the two implementations and present performance results that demonstrate the benefits of the proposed extensions.

  13. Workload Characterization of CFD Applications Using Partial Differential Equation Solvers

    NASA Technical Reports Server (NTRS)

    Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)

    1998-01-01

    Workload characterization is used for modeling and evaluating computing systems at different levels of detail. We present workload characterization for a class of Computational Fluid Dynamics (CFD) applications that solve Partial Differential Equations (PDEs). This workload characterization focuses on three high performance computing platforms: SGI Origin2000, IBM SP-2, and a cluster of Intel Pentium Pro-based PCs. We execute extensive measurement-based experiments on these platforms to gather statistics of system resource usage, which results in workload characterization. Our workload characterization approach yields a coarse-grain resource utilization behavior that is being applied for performance modeling and evaluation of distributed high performance metacomputing systems. In addition, this study enhances our understanding of interactions between PDE solver workloads and high performance computing platforms and is useful for tuning these applications.

  14. On the "Exchangeability" of Hands-On and Computer-Simulated Science Performance Assessments. CSE Technical Report.

    ERIC Educational Resources Information Center

    Rosenquist, Anders; Shavelson, Richard J.; Ruiz-Primo, Maria Araceli

    Inconsistencies in scores from computer-simulated and "hands-on" science performance assessments have led to questions about the exchangeability of these two methods in spite of the highly touted potential of computer-simulated performance assessment. This investigation considered possible explanations for students' inconsistent performances: (1)…

  15. Benefits of computer screen-based simulation in learning cardiac arrest procedures.

    PubMed

    Bonnetain, Elodie; Boucheix, Jean-Michel; Hamet, Maël; Freysz, Marc

    2010-07-01

    What is the best way to train medical students early so that they acquire basic skills in cardiopulmonary resuscitation as effectively as possible? Studies have shown the benefits of high-fidelity patient simulators, but have also demonstrated their limits. New computer screen-based multimedia simulators have fewer constraints than high-fidelity patient simulators. In this area, as yet, there has been no research on the effectiveness of transfer of learning from a computer screen-based simulator to more realistic situations such as those encountered with high-fidelity patient simulators. We tested the benefits of learning cardiac arrest procedures using a multimedia computer screen-based simulator in 28 Year 2 medical students. Just before the end of the traditional resuscitation course, we compared two groups. An experiment group (EG) was first asked to learn to perform the appropriate procedures in a cardiac arrest scenario (CA1) in the computer screen-based learning environment and was then tested on a high-fidelity patient simulator in another cardiac arrest simulation (CA2). While the EG was learning to perform CA1 procedures in the computer screen-based learning environment, a control group (CG) actively continued to learn cardiac arrest procedures using practical exercises in a traditional class environment. Both groups were given the same amount of practice, exercises and trials. The CG was then also tested on the high-fidelity patient simulator for CA2, after which it was asked to perform CA1 using the computer screen-based simulator. Performances with both simulators were scored on a precise 23-point scale. On the test on a high-fidelity patient simulator, the EG trained with a multimedia computer screen-based simulator performed significantly better than the CG trained with traditional exercises and practice (16.21 versus 11.13 of 23 possible points, respectively; p<0.001). Computer screen-based simulation appears to be effective in preparing learners to use high-fidelity patient simulators, which present simulations that are closer to real-life situations.

  16. Linear static structural and vibration analysis on high-performance computers

    NASA Technical Reports Server (NTRS)

    Baddourah, M. A.; Storaasli, O. O.; Bostic, S. W.

    1993-01-01

    Parallel computers offer the opportunity to significantly reduce the computation time necessary to analyze large-scale aerospace structures. This paper presents algorithms developed for and implemented on massively-parallel computers, hereafter referred to as Scalable High-Performance Computers (SHPC), for the most computationally intensive tasks involved in structural analysis, namely, generation and assembly of system matrices, solution of systems of equations, and calculation of the eigenvalues and eigenvectors. Results on SHPC are presented for large-scale structural problems (i.e., models for the High-Speed Civil Transport). The goal of this research is to develop a new, efficient technique which extends structural analysis to SHPC and makes large-scale structural analyses tractable.

  17. Combining high performance simulation, data acquisition, and graphics display computers

    NASA Technical Reports Server (NTRS)

    Hickman, Robert J.

    1989-01-01

    Issues involved in the continuing development of an advanced simulation complex are discussed. This approach provides the capability to perform the majority of tests on advanced systems non-destructively. The controlled test environments can be replicated to examine the response of the systems under test to alternative treatments of the system control design, or to test the function and qualification of specific hardware. Field tests verify that the elements simulated in the laboratories are sufficient. The digital computer is hosted by a Digital Equipment Corp. MicroVAX computer, with an Aptec Computer Systems Model 24 I/O computer performing the communication function. An Applied Dynamics International AD100 performs the high-speed simulation computing, and an Evans and Sutherland PS350 performs on-line graphics display. A Scientific Computer Systems SCS40 acts as a high-performance FORTRAN program processor to support the complex by generating numerous large files, from programs coded in FORTRAN, that are required for the real-time processing. Four programming languages are involved in the process: FORTRAN, ADSIM, ADRIO, and STAPLE. FORTRAN is employed on the MicroVAX host to initialize and terminate the simulation runs on the system. The generation of the data files on the SCS40 is also performed with FORTRAN programs. ADSIM and ADRIO are used to program the processing elements of the AD100 and its IOCP processor. STAPLE is used to program the Aptec DIP and DIA processors.

  18. Analysis of scalability of high-performance 3D image processing platform for virtual colonoscopy

    NASA Astrophysics Data System (ADS)

    Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli

    2014-03-01

    One of the key challenges in three-dimensional (3D) medical imaging is to enable the fast turn-around time which is often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that need to be processed. For this purpose, we previously developed a software platform for high-performance 3D medical image processing, called the HPC 3D-MIP platform, which employs increasingly available and affordable commodity computing systems such as multicore, cluster, and cloud computing systems. To achieve scalable high-performance computing, the platform employed size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D-MIP algorithms, supported task scheduling for efficient load distribution and balancing, and consisted of layered parallel software libraries that allow image processing applications to share common functionalities. We evaluated the performance of the HPC 3D-MIP platform by applying it to computationally intensive processes in virtual colonoscopy. Experimental results showed a 12-fold performance improvement on a workstation with 12-core CPUs over the original sequential implementation of the processes, indicating the efficiency of the platform. Analysis of performance scalability based on Amdahl's law for symmetric multicore chips showed the potential for high-performance scalability of the HPC 3D-MIP platform when a larger number of cores is available.
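
    For reference, the scalability analysis mentioned above rests on Amdahl's law; in its basic form, for a parallelizable fraction f of the work run on n cores, the speedup bound is

      S(n) = \frac{1}{(1 - f) + f/n}, \qquad \lim_{n \to \infty} S(n) = \frac{1}{1 - f}.

    Under this bound, the 12-fold improvement reported on 12 cores corresponds to a workload whose parallelizable fraction f is close to 1. (The symmetric-multicore extension used in the cited analysis adds per-core performance terms to this basic form.)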

  19. Welcome to the NASA High Performance Computing and Communications Computational Aerosciences (CAS) Workshop 2000

    NASA Technical Reports Server (NTRS)

    Schulbach, Catherine H. (Editor)

    2000-01-01

    The purpose of the CAS workshop is to bring together NASA's scientists and engineers and their counterparts in industry, other government agencies, and academia working in the Computational Aerosciences and related fields. This workshop is part of the technology transfer plan of the NASA High Performance Computing and Communications (HPCC) Program. Specific objectives of the CAS workshop are to: (1) communicate the goals and objectives of HPCC and CAS, (2) promote and disseminate CAS technology within the appropriate technical communities, including NASA, industry, academia, and other government labs, (3) help promote synergy among CAS and other HPCC scientists, and (4) permit feedback from peer researchers on issues facing High Performance Computing in general and the CAS project in particular. This year we had a number of exciting presentations in the traditional aeronautics, aerospace sciences, and high-end computing areas and in the less familiar (to many of us affiliated with CAS) earth science, space science, and revolutionary computing areas. Presentations of more than 40 high quality papers were organized into ten sessions and presented over the three-day workshop. The proceedings are organized here for easy access: by author, title and topic.

  20. The High-Performance Computing and Communications program, the national information infrastructure and health care.

    PubMed Central

    Lindberg, D A; Humphreys, B L

    1995-01-01

    The High-Performance Computing and Communications (HPCC) program is a multiagency federal effort to advance the state of computing and communications and to provide the technologic platform on which the National Information Infrastructure (NII) can be built. The HPCC program supports the development of high-speed computers, high-speed telecommunications, related software and algorithms, education and training, and information infrastructure technology and applications. The vision of the NII is to extend access to high-performance computing and communications to virtually every U.S. citizen so that the technology can be used to improve the civil infrastructure, lifelong learning, energy management, health care, etc. Development of the NII will require resolution of complex economic and social issues, including information privacy. Health-related applications supported under the HPCC program and NII initiatives include connection of health care institutions to the Internet; enhanced access to gene sequence data; the "Visible Human" Project; and test-bed projects in telemedicine, electronic patient records, shared informatics tool development, and image systems. PMID:7614116

  1. A cross-sectional study of the effects of load carriage on running characteristics and tibial mechanical stress: implications for stress fracture injuries in women

    DTIC Science & Technology

    2017-03-23

    High-performance computing resources were made available by the US Department of Defense High Performance Computing Modernization Program at the Air Force... Affiliation: Department of Defense Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, United States Army Medical Research and Materiel Command, Fort Detrick, Maryland, USA.

  2. Exploring Cloud Computing for Large-scale Scientific Applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lin, Guang; Han, Binh; Yin, Jian

    This paper explores cloud computing for large-scale data-intensive scientific applications. Cloud computing is attractive because it provides hardware and software resources on-demand, which relieves the burden of acquiring and maintaining a huge amount of resources that may be used only once by a scientific application. However, unlike typical commercial applications that often just require a moderate amount of ordinary resources, large-scale scientific applications often need to process enormous amounts of data in the terabyte or even petabyte range and require special high performance hardware with low latency connections to complete computation in a reasonable amount of time. To address these challenges, we build an infrastructure that can dynamically select high performance computing hardware across institutions and dynamically adapt the computation to the selected resources to achieve high performance. We have also demonstrated the effectiveness of our infrastructure by building a system biology application and an uncertainty quantification application for carbon sequestration, which can efficiently utilize data and computation resources across several institutions.

  3. High-Performance Computing Act of 1990: Report of the Senate Committee on Commerce, Science, and Transportation on S. 1067.

    ERIC Educational Resources Information Center

    Congress of the U.S., Washington, DC. Senate Committee on Commerce, Science, and Transportation.

    This committee report is intended to accompany S. 1067, a bill designed to provide for a coordinated federal research program in high-performance computing (HPC). The primary objective of the legislation is given as the acceleration of research, development, and application of the most advanced computing technology in research, education, and…

  4. Performance of High-Reliability Space-Qualified Processors Implementing Software Defined Radios

    DTIC Science & Technology

    2014-03-01

    Naval Postgraduate School, Department of Electrical and Computer Engineering, 833 Dyer Road, Monterey, CA 93943-5121. ... Radiation in space poses a considerable threat to modern microelectronic devices, in particular to the high-performance low-cost computing...

  5. Process for selecting NEAMS applications for access to Idaho National Laboratory high performance computing resources

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Michael Pernice

    2010-09-01

    INL has agreed to provide participants in the Nuclear Energy Advanced Modeling and Simulation (NEAMS) program with access to its high performance computing (HPC) resources under sponsorship of the Enabling Computational Technologies (ECT) program element. This report documents the process used to select applications and the software stack in place at INL.

  6. A Study of Complex Deep Learning Networks on High Performance, Neuromorphic, and Quantum Computers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Potok, Thomas E; Schuman, Catherine D; Young, Steven R

    Current Deep Learning models use highly optimized convolutional neural networks (CNN) trained on large graphical processing unit (GPU)-based computers with a fairly simple layered network topology, i.e., highly connected layers, without intra-layer connections. Complex topologies have been proposed, but are intractable to train on current systems. Building the topologies of the deep learning network requires hand tuning, and implementing the network in hardware is expensive in both cost and power. In this paper, we evaluate deep learning models using three different computing architectures to address these problems: quantum computing to train complex topologies, high performance computing (HPC) to automatically determine network topology, and neuromorphic computing for a low-power hardware implementation. Due to input size limitations of current quantum computers, we use the MNIST dataset for our evaluation. The results show the possibility of using the three architectures in tandem to explore complex deep learning networks that are untrainable using a von Neumann architecture. We show that a quantum computer can find high quality values of intra-layer connections and weights, while yielding a tractable time result as the complexity of the network increases; a high performance computer can find optimal layer-based topologies; and a neuromorphic computer can represent the complex topology and weights derived from the other architectures in low power memristive hardware. This represents a new capability that is not feasible with current von Neumann architecture. It potentially enables the ability to solve very complicated problems unsolvable with current computing technologies.

  7. Final Report Extreme Computing and U.S. Competitiveness DOE Award. DE-FG02-11ER26087/DE-SC0008764

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mustain, Christopher J.

    The Council has acted on each of the grant deliverables during the funding period. The deliverables are: (1) convening the Council’s High Performance Computing Advisory Committee (HPCAC) on a bi-annual basis; (2) broadening public awareness of high performance computing (HPC) and exascale developments; (3) assessing the industrial applications of extreme computing; and (4) establishing a policy and business case for an exascale economy.

  8. Multidisciplinary High-Fidelity Analysis and Optimization of Aerospace Vehicles. Part 2; Preliminary Results

    NASA Technical Reports Server (NTRS)

    Walsh, J. L.; Weston, R. P.; Samareh, J. A.; Mason, B. H.; Green, L. L.; Biedron, R. T.

    2000-01-01

    An objective of the High Performance Computing and Communication Program at the NASA Langley Research Center is to demonstrate multidisciplinary shape and sizing optimization of a complete aerospace vehicle configuration by using high-fidelity finite-element structural analysis and computational fluid dynamics aerodynamic analysis in a distributed, heterogeneous computing environment that includes high performance parallel computing. A software system has been designed and implemented to integrate a set of existing discipline analysis codes, some of them computationally intensive, into a distributed computational environment for the design of a high-speed civil transport configuration. The paper describes both the preliminary results from implementing and validating the multidisciplinary analysis and the results from an aerodynamic optimization. The discipline codes are integrated by using the Java programming language and a Common Object Request Broker Architecture compliant software product. A companion paper describes the formulation of the multidisciplinary analysis and optimization system.

  9. Tools for 3D scientific visualization in computational aerodynamics

    NASA Technical Reports Server (NTRS)

    Bancroft, Gordon; Plessel, Todd; Merritt, Fergus; Watson, Val

    1989-01-01

    The purpose is to describe the tools and techniques in use at the NASA Ames Research Center for performing visualization of computational aerodynamics, for example, visualization of flow fields from computer simulations of fluid dynamics about vehicles such as the Space Shuttle. The hardware used for visualization is a high-performance graphics workstation connected to a supercomputer with a high-speed channel. At present, the workstation is a Silicon Graphics IRIS 3130, the supercomputer is a CRAY2, and the high-speed channel is a hyperchannel. The three techniques used for visualization are post-processing, tracking, and steering. Post-processing analysis is done after the simulation. Tracking analysis is done during a simulation but is not interactive, whereas steering analysis involves modifying the simulation interactively during the simulation. Using post-processing methods, a flow simulation is executed on a supercomputer and, after the simulation is complete, the results of the simulation are processed for viewing. The software in use and under development at NASA Ames Research Center for performing these types of tasks in computational aerodynamics is described. Workstation performance issues, benchmarking, and high-performance networks for this purpose are also discussed, as well as descriptions of other hardware for digital video and film recording.

  10. Optical high-performance computing: introduction to the JOSA A and Applied Optics feature.

    PubMed

    Caulfield, H John; Dolev, Shlomi; Green, William M J

    2009-08-01

    The feature issues in both Applied Optics and the Journal of the Optical Society of America A focus on topics of immediate relevance to the community working in the area of optical high-performance computing.

  11. High-Performance Java Codes for Computational Fluid Dynamics

    NASA Technical Reports Server (NTRS)

    Riley, Christopher; Chatterjee, Siddhartha; Biswas, Rupak; Biegel, Bryan (Technical Monitor)

    2001-01-01

    The computational science community is reluctant to write large-scale computationally intensive applications in Java due to concerns over Java's poor performance, despite the claimed software engineering advantages of its object-oriented features. Naive Java implementations of numerical algorithms can perform poorly compared to corresponding Fortran or C implementations. To achieve high performance, Java applications must be designed with good performance as a primary goal. This paper presents the object-oriented design and implementation of two real-world applications from the field of Computational Fluid Dynamics (CFD): a finite-volume fluid flow solver (LAURA, from NASA Langley Research Center), and an unstructured mesh adaptation algorithm (2D_TAG, from NASA Ames Research Center). This work builds on our previous experience with the design of high-performance numerical libraries in Java. We examine the performance of the applications using the currently available Java infrastructure and show that the Java version of the flow solver LAURA performs almost within a factor of 2 of the original procedural version. Our Java version of the mesh adaptation algorithm 2D_TAG performs within a factor of 1.5 of its original procedural version on certain platforms. Our results demonstrate that object-oriented software design principles are not necessarily inimical to high performance.

  12. Turbulence modeling of free shear layers for high-performance aircraft

    NASA Technical Reports Server (NTRS)

    Sondak, Douglas L.

    1993-01-01

    The High Performance Aircraft (HPA) Grand Challenge of the High Performance Computing and Communications (HPCC) program involves the computation of the flow over a high performance aircraft. A variety of free shear layers, including mixing layers over cavities, impinging jets, blown flaps, and exhaust plumes, may be encountered in such flowfields. Since these free shear layers are usually turbulent, appropriate turbulence models must be utilized in computations in order to accurately simulate these flow features. The HPCC program is relying heavily on parallel computers. A Navier-Stokes solver (POVERFLOW) utilizing the Baldwin-Lomax algebraic turbulence model was developed and tested on a 128-node Intel iPSC/860. Algebraic turbulence models run very fast, and give good results for many flowfields. For complex flowfields such as those mentioned above, however, they are often inadequate. It was therefore deemed that a two-equation turbulence model will be required for the HPA computations. The k-epsilon two-equation turbulence model was implemented on the Intel iPSC/860. Both the Chien low-Reynolds-number model and a generalized wall-function formulation were included.

  13. Scaling predictive modeling in drug development with cloud computing.

    PubMed

    Moghadam, Behrooz Torabi; Alvarsson, Jonathan; Holm, Marcus; Eklund, Martin; Carlsson, Lars; Spjuth, Ola

    2015-01-26

    Growing data sets, and the increased analysis time they require, are hampering predictive modeling in drug discovery. Model building can be carried out on high-performance computer clusters, but these can be expensive to purchase and maintain. We have evaluated ligand-based modeling on cloud computing resources where computations are parallelized and run on the Amazon Elastic Cloud. We trained models on open data sets of varying sizes for the end points logP and Ames mutagenicity and compared the results with model building parallelized on a traditional high-performance computing cluster. We show that while high-performance computing results in faster model building, the use of cloud computing resources is feasible for large data sets and scales well within cloud instances. An additional advantage of cloud computing is that the costs of predictive models can be easily quantified, and a choice can be made between speed and economy. The easy access to computational resources with no up-front investments makes cloud computing an attractive alternative for scientists, especially for those without access to a supercomputer, and our study shows that it enables cost-efficient modeling of large data sets on demand within reasonable time.
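
    A minimal sketch (not from the study above) of the embarrassingly parallel model building it describes: each model fit is independent, so fits can be fanned out over worker processes locally or over cloud instances. The train_model function and the toy data partitions are hypothetical stand-ins.

        # Sketch: parallel model building over independent data chunks.
        # train_model and the data chunks are invented placeholders; the point
        # is only that independent fits can run concurrently on cluster nodes
        # or cloud instances.
        from concurrent.futures import ProcessPoolExecutor

        def train_model(dataset_chunk):
            # placeholder for fitting one predictive model (e.g., a logP or
            # Ames mutagenicity end point)
            return {"n_samples": len(dataset_chunk), "model": "fitted-model-placeholder"}

        def build_models_in_parallel(chunks, max_workers=4):
            with ProcessPoolExecutor(max_workers=max_workers) as pool:
                return list(pool.map(train_model, chunks))

        if __name__ == "__main__":
            chunks = [list(range(n)) for n in (100, 500, 1000)]  # toy partitions
            for result in build_models_in_parallel(chunks):
                print(result)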

  14. A High Performance Computing Approach to the Simulation of Fluid Solid Interaction Problems with Rigid and Flexible Components (Open Access Publisher’s Version)

    DTIC Science & Technology

    2014-08-01

    Keywords: high performance computing, smoothed particle hydrodynamics, rigid body dynamics, flexible body dynamics. Authors: Arman Pazouki, Radu Serban, Dan Negrut. This work outlines a unified high performance computing approach to the simulation of fluid-solid interaction problems with rigid and flexible components; ... are implemented to model rigid and flexible multibody dynamics, and the two-way coupling of the fluid and solid phases is supported through use of ...

  15. Current state and future direction of computer systems at NASA Langley Research Center

    NASA Technical Reports Server (NTRS)

    Rogers, James L. (Editor); Tucker, Jerry H. (Editor)

    1992-01-01

    Computer systems have advanced at a rate unmatched by any other area of technology. As performance has dramatically increased there has been an equally dramatic reduction in cost. This constant cost-performance improvement has precipitated the pervasiveness of computer systems into virtually all areas of technology. This improvement is due primarily to advances in microelectronics. Most people are now convinced that the new generation of supercomputers will be built using a large number (possibly thousands) of high performance microprocessors. Although the spectacular improvements in computer systems have come about because of these hardware advances, there has also been a steady improvement in software techniques. In an effort to understand how these hardware and software advances will affect research at NASA LaRC, the Computer Systems Technical Committee drafted this white paper to examine the current state and possible future directions of computer systems at the Center. This paper discusses selected important areas of computer systems including real-time systems, embedded systems, high performance computing, distributed computing networks, data acquisition systems, artificial intelligence, and visualization.

  16. beta-Aminoalcohols as Potential Reactivators of Aged Sarin-/Soman-Inhibited Acetylcholinesterase

    DTIC Science & Technology

    2017-02-08

    This approach includes high-quality quantum mechanical/molecular mechanical calculations, providing reliable reactivation steps and energetics ... Dr. I. V. Khavrutskii and Dr. A. Wallqvist, Department of Defense Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced ...

  17. Business Models of High Performance Computing Centres in Higher Education in Europe

    ERIC Educational Resources Information Center

    Eurich, Markus; Calleja, Paul; Boutellier, Roman

    2013-01-01

    High performance computing (HPC) service centres are a vital part of the academic infrastructure of higher education organisations. However, despite their importance for research and the necessary high capital expenditures, business research on HPC service centres is mostly missing. From a business perspective, it is important to find an answer to…

  18. Charging Ahead into the Next Millennium: Proceedings of the Systems and Technology Symposium (20th) Held in Denver, Colorado on 7-10 June 1999

    DTIC Science & Technology

    1999-06-01

    Fragments of the proceedings' acronym glossary: ... Tactical Radar Correlator; EV Electric Vehicle, EW Electronic Warfare, F Frequency, FA False Alarm, FAO Foreign Area Officer, FBE Fleet Battle ...; Electric Vehicle, High Frequency, Horsepower, High-Performance Computing, High Performance Computing and Communications, High Performance Knowledge ...; A/D Analog-to-Digital, A/G Air-to-Ground, AAN Army After Next, AAV Advanced Air Vehicle, ABCCC Airborne Battlefield Command, Control and ...

  19. Advanced Certification Program for Computer Graphic Specialists. Final Performance Report.

    ERIC Educational Resources Information Center

    Parkland Coll., Champaign, IL.

    A pioneer program in computer graphics was implemented at Parkland College (Illinois) to meet the demand for specialized technicians to visualize data generated on high performance computers. In summer 1989, 23 students were accepted into the pilot program. Courses included C programming, calculus and analytic geometry, computer graphics, and…

  20. Compute Server Performance Results

    NASA Technical Reports Server (NTRS)

    Stockdale, I. E.; Barton, John; Woodrow, Thomas (Technical Monitor)

    1994-01-01

    Parallel-vector supercomputers have been the workhorses of high performance computing. As expectations of future computing needs have risen faster than projected vector supercomputer performance, much work has been done investigating the feasibility of using Massively Parallel Processor systems as supercomputers. An even more recent development is the availability of high performance workstations which have the potential, when clustered together, to replace parallel-vector systems. We present a systematic comparison of floating point performance and price-performance for various compute server systems. A suite of highly vectorized programs was run on systems including traditional vector systems such as the Cray C90, and RISC workstations such as the IBM RS/6000 590 and the SGI R8000. The C90 system delivers 460 million floating point operations per second (FLOPS), the highest single processor rate of any vendor. However, if the price-performance ratio (PPR) is considered to be most important, then the IBM and SGI processors are superior to the C90 processors. Even without code tuning, the IBM and SGI PPRs of 260 and 220 FLOPS per dollar exceed the C90 PPR of 160 FLOPS per dollar when running our highly vectorized suite.
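
    The price-performance comparison above reduces to simple arithmetic: PPR = sustained FLOPS divided by system price. A small sketch using only the figures quoted in the abstract; the implied prices are back-computed placeholders, not actual vendor prices.

        # Price-performance ratio (PPR) from the figures quoted above.
        # PPR = sustained floating-point rate (FLOPS) / system price (dollars).
        systems = {
            # name: (sustained FLOPS, reported PPR in FLOPS per dollar)
            "Cray C90 (1 CPU)": (460e6, 160),
            "IBM RS/6000 590":  (None,  260),
            "SGI R8000":        (None,  220),
        }

        for name, (flops, ppr) in systems.items():
            if flops is not None:
                implied_price = flops / ppr   # hypothetical, back-computed
                print(f"{name}: {flops/1e6:.0f} MFLOPS, PPR={ppr} FLOPS/$, "
                      f"implied price ~${implied_price/1e6:.2f}M")
            else:
                print(f"{name}: PPR={ppr} FLOPS/$ (sustained rate not quoted above)")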

  1. DURIP: High Performance Computing in Biomathematics Applications

    DTIC Science & Technology

    2017-05-10

    The goal of this award was to enhance the capabilities of the Department of Applied Mathematics and Statistics (AMS) at the University of California, Santa Cruz (UCSC) to conduct research and research-related education in areas of high performance computing in biomathematics applications.

  2. High performance computing for advanced modeling and simulation of materials

    NASA Astrophysics Data System (ADS)

    Wang, Jue; Gao, Fei; Vazquez-Poletti, Jose Luis; Li, Jianjiang

    2017-02-01

    The First International Workshop on High Performance Computing for Advanced Modeling and Simulation of Materials (HPCMS2015) was held in Austin, Texas, USA, Nov. 18, 2015. HPCMS 2015 was organized by Computer Network Information Center (Chinese Academy of Sciences), University of Michigan, Universidad Complutense de Madrid, University of Science and Technology Beijing, Pittsburgh Supercomputing Center, China Institute of Atomic Energy, and Ames Laboratory.

  3. A Queue Simulation Tool for a High Performance Scientific Computing Center

    NASA Technical Reports Server (NTRS)

    Spear, Carrie; McGalliard, James

    2007-01-01

    The NASA Center for Computational Sciences (NCCS) at the Goddard Space Flight Center provides high performance highly parallel processors, mass storage, and supporting infrastructure to a community of computational Earth and space scientists. Long running (days) and highly parallel (hundreds of CPUs) jobs are common in the workload. NCCS management structures batch queues and allocates resources to optimize system use and prioritize workloads. NCCS technical staff use a locally developed discrete event simulation tool to model the impacts of evolving workloads, potential system upgrades, alternative queue structures and resource allocation policies.
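
    A toy discrete-event sketch in the spirit of the queue simulation described above, with a single batch queue, a fixed CPU pool, and invented job arrival/size/runtime figures; the locally developed NCCS tool itself is not reproduced.

        # Toy discrete-event simulation of a batch queue on a fixed pool of CPUs.
        # Jobs are (arrival_time, cpus_requested, runtime); all figures invented.
        import heapq

        def mean_wait(jobs, total_cpus):
            jobs = sorted(jobs)            # by arrival time
            running = []                   # min-heap of (finish_time, cpus)
            free, clock, waits = total_cpus, 0.0, []
            for arrival, cpus, runtime in jobs:
                clock = max(clock, arrival)
                # release finished jobs; wait until enough CPUs are free
                while running and (running[0][0] <= clock or free < cpus):
                    finish, c = heapq.heappop(running)
                    clock = max(clock, finish)
                    free += c
                waits.append(clock - arrival)
                free -= cpus
                heapq.heappush(running, (clock + runtime, cpus))
            return sum(waits) / len(waits)

        jobs = [(0, 64, 10.0), (1, 128, 5.0), (2, 256, 8.0), (3, 64, 2.0)]
        print("mean wait:", mean_wait(jobs, total_cpus=256))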

  4. Generic Divide and Conquer Internet-Based Computing

    NASA Technical Reports Server (NTRS)

    Radenski, Atanas; Follen, Gregory J. (Technical Monitor)

    2001-01-01

    The rapid growth of internet-based applications and the proliferation of networking technologies have been transforming traditional commercial application areas as well as computer and computational sciences and engineering. This growth stimulates the exploration of new, internet-oriented software technologies that can open new research and application opportunities not only for the commercial world, but also for the scientific and high-performance computing applications community. The general goal of this research project is to contribute to better understanding of the transition to internet-based high-performance computing and to develop solutions for some of the difficulties of this transition. More specifically, our goal is to design an architecture for generic divide and conquer internet-based computing, to develop a portable implementation of this architecture, to create an example library of high-performance divide-and-conquer computing agents that run on top of this architecture, and to evaluate the performance of these agents. We have been designing an architecture that incorporates a master task-pool server and utilizes satellite computational servers that operate on the Internet in a dynamically changing large configuration of lower-end nodes provided by volunteer contributors. Our designed architecture is intended to be complementary to and accessible from computational grids such as Globus, Legion, and Condor. Grids provide remote access to existing high-end computing resources; in contrast, our goal is to utilize idle processor time of lower-end internet nodes. Our project is focused on a generic divide-and-conquer paradigm and its applications that operate on a loose and ever-changing pool of lower-end internet nodes.
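
    A minimal sketch of the divide-and-conquer master/worker pattern described above, with a local thread pool standing in for the pool of volunteer satellite servers; the project's task-pool server, agents, and grid interfaces are not reproduced, and the range-sum problem is an invented stand-in for real work.

        # Divide-and-conquer computation farmed out to workers. Here the
        # "satellite servers" are threads in a local pool; in the architecture
        # above they would be volunteer internet nodes pulling subproblems
        # from a master task-pool server.
        from concurrent.futures import ThreadPoolExecutor

        def split(problem, threshold=1000):
            """Divide a range-sum problem until the pieces are small enough."""
            lo, hi = problem
            if hi - lo <= threshold:
                return [problem]
            mid = (lo + hi) // 2
            return split((lo, mid), threshold) + split((mid, hi), threshold)

        def solve_leaf(problem):
            lo, hi = problem
            return sum(range(lo, hi))      # stand-in for real work

        with ThreadPoolExecutor(max_workers=8) as pool:
            leaves = split((0, 1_000_000))
            total = sum(pool.map(solve_leaf, leaves))
        print(total)                       # 499999500000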

  5. NASA HPCC Technology for Aerospace Analysis and Design

    NASA Technical Reports Server (NTRS)

    Schulbach, Catherine H.

    1999-01-01

    The Computational Aerosciences (CAS) Project is part of NASA's High Performance Computing and Communications Program. Its primary goal is to accelerate the availability of high-performance computing technology to the US aerospace community-thus providing the US aerospace community with key tools necessary to reduce design cycle times and increase fidelity in order to improve safety, efficiency and capability of future aerospace vehicles. A complementary goal is to hasten the emergence of a viable commercial market within the aerospace community for the advantage of the domestic computer hardware and software industry. The CAS Project selects representative aerospace problems (especially design) and uses them to focus efforts on advancing aerospace algorithms and applications, systems software, and computing machinery to demonstrate vast improvements in system performance and capability over the life of the program. Recent demonstrations have served to assess the benefits of possible performance improvements while reducing the risk of adopting high-performance computing technology. This talk will discuss past accomplishments in providing technology to the aerospace community, present efforts, and future goals. For example, the times to do full combustor and compressor simulations (of aircraft engines) have been reduced by factors of 320:1 and 400:1 respectively. While this has enabled new capabilities in engine simulation, the goal of an overnight, dynamic, multi-disciplinary, 3-dimensional simulation of an aircraft engine is still years away and will require new generations of high-end technology.

  6. Integrating Reconfigurable Hardware-Based Grid for High Performance Computing

    PubMed Central

    Dondo Gazzano, Julio; Sanchez Molina, Francisco; Rincon, Fernando; López, Juan Carlos

    2015-01-01

    FPGAs have shown several characteristics that make them very attractive for high performance computing (HPC). The impressive speed-up factors that they are able to achieve, the reduced power consumption, and the ease and flexibility of the design process with fast iterations between consecutive versions are examples of benefits obtained with their use. However, there are still some difficulties when using reconfigurable platforms as accelerators that need to be addressed: the need of an in-depth application study to identify potential acceleration, the lack of tools for the deployment of computational problems in distributed hardware platforms, and the low portability of components, among others. This work proposes a complete grid infrastructure for distributed high performance computing based on dynamically reconfigurable FPGAs. Besides, a set of services designed to facilitate the application deployment is described. An example application and a comparison with other hardware and software implementations are shown. Experimental results show that the proposed architecture offers encouraging advantages for deployment of high performance distributed applications, simplifying the development process. PMID:25874241

  7. Scalable and massively parallel Monte Carlo photon transport simulations for heterogeneous computing platforms

    NASA Astrophysics Data System (ADS)

    Yu, Leiming; Nina-Paravecino, Fanny; Kaeli, David; Fang, Qianqian

    2018-01-01

    We present a highly scalable Monte Carlo (MC) three-dimensional photon transport simulation platform designed for heterogeneous computing systems. Through the development of a massively parallel MC algorithm using the Open Computing Language framework, this research extends our existing graphics processing unit (GPU)-accelerated MC technique to a highly scalable vendor-independent heterogeneous computing environment, achieving significantly improved performance and software portability. A number of parallel computing techniques are investigated to achieve portable performance over a wide range of computing hardware. Furthermore, multiple thread-level and device-level load-balancing strategies are developed to obtain efficient simulations using multiple central processing units and GPUs.
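
    The device-level load balancing mentioned above can be illustrated by splitting a photon budget across heterogeneous devices in proportion to an assumed relative throughput, so that all devices finish at roughly the same time. The device names and speed factors below are invented, and the platform's actual OpenCL kernels are not shown.

        # Sketch: proportional partitioning of a Monte Carlo photon budget
        # across CPUs/GPUs with assumed relative throughputs.
        def partition_photons(total_photons, device_speeds):
            total_speed = sum(device_speeds.values())
            shares = {dev: int(total_photons * s / total_speed)
                      for dev, s in device_speeds.items()}
            # give any rounding remainder to the fastest device
            fastest = max(device_speeds, key=device_speeds.get)
            shares[fastest] += total_photons - sum(shares.values())
            return shares

        devices = {"cpu0": 1.0, "gpu0": 9.0, "gpu1": 10.0}   # hypothetical speeds
        print(partition_photons(10_000_000, devices))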

  8. National research and education network

    NASA Technical Reports Server (NTRS)

    Villasenor, Tony

    1991-01-01

    Some goals of this network are as follows: Extend U.S. technological leadership in high performance computing and computer communications; Provide wide dissemination and application of the technologies both to speed the pace of innovation and to serve the national economy, national security, education, and the global environment; and Spur gains in U.S. productivity and industrial competitiveness by making high performance computing and networking technologies an integral part of the design and production process. Strategies for achieving these goals are as follows: Support solutions to important scientific and technical challenges through a vigorous R and D effort; Reduce the uncertainties to industry for R and D and use of this technology through increased cooperation between government, industry, and universities and by the continued use of government and government funded facilities as a prototype user for early commercial HPCC products; and Support underlying research, network, and computational infrastructures on which U.S. high performance computing technology is based.

  9. Computational Issues in Damping Identification for Large Scale Problems

    NASA Technical Reports Server (NTRS)

    Pilkey, Deborah L.; Roe, Kevin P.; Inman, Daniel J.

    1997-01-01

    Two damping identification methods are tested for efficiency in large-scale applications. One is an iterative routine, and the other a least squares method. Numerical simulations have been performed on multiple degree-of-freedom models to test the effectiveness of the algorithm and the usefulness of parallel computation for the problems. High Performance Fortran is used to parallelize the algorithm. Tests were performed using the IBM-SP2 at NASA Ames Research Center. The least squares method tested incurs high communication costs, which reduces the benefit of high performance computing. This method's memory requirement grows at a very rapid rate, meaning that larger problems can quickly exceed available computer memory. The iterative method's memory requirement grows at a much slower pace and is able to handle problems with 500+ degrees of freedom on a single processor. This method benefits from parallelization, and significant speedup can be seen for problems of 100+ degrees of freedom.
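
    A minimal sketch of a least-squares identification step of the kind contrasted above, for a single-degree-of-freedom system with an invented mass, stiffness, and damping coefficient; it illustrates only the least-squares fit itself, not the large-scale parallel algorithms compared in the study.

        # Least-squares estimation of a viscous damping coefficient c from
        # simulated free-decay data of m*x'' + c*x' + k*x = 0.
        # System parameters and integration settings are invented.
        import numpy as np

        m, k, c_true = 1.0, 100.0, 0.8
        dt, n = 0.001, 5000
        x, v = 1.0, 0.0
        xs, vs, accs = [], [], []
        for _ in range(n):                 # simple semi-implicit integration
            a = (-c_true * v - k * x) / m
            xs.append(x); vs.append(v); accs.append(a)
            v += a * dt
            x += v * dt

        xs, vs, accs = map(np.asarray, (xs, vs, accs))
        # residual form: c * v = -(m*acc + k*x)  ->  least squares in one unknown
        A = vs.reshape(-1, 1)
        b = -(m * accs + k * xs)
        c_est, *_ = np.linalg.lstsq(A, b, rcond=None)
        print("estimated c:", float(c_est[0]))   # close to 0.8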

  10. Accelerated Application Development: The ORNL Titan Experience

    DOE PAGES

    Joubert, Wayne; Archibald, Richard K.; Berrill, Mark A.; ...

    2015-05-09

    The use of computational accelerators such as NVIDIA GPUs and Intel Xeon Phi processors is now widespread in the high performance computing community, with many applications delivering impressive performance gains. However, programming these systems for high performance, performance portability and software maintainability has been a challenge. In this paper we discuss experiences porting applications to the Titan system. Titan, which began planning in 2009 and was deployed for general use in 2013, was the first multi-petaflop system based on accelerator hardware. To ready applications for accelerated computing, a preparedness effort was undertaken prior to delivery of Titan. In this paper we report experiences and lessons learned from this process and describe how users are currently making use of computational accelerators on Titan.

  11. Accelerated application development: The ORNL Titan experience

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Joubert, Wayne; Archibald, Rick; Berrill, Mark

    2015-08-01

    The use of computational accelerators such as NVIDIA GPUs and Intel Xeon Phi processors is now widespread in the high performance computing community, with many applications delivering impressive performance gains. However, programming these systems for high performance, performance portability and software maintainability has been a challenge. In this paper we discuss experiences porting applications to the Titan system. Titan, which began planning in 2009 and was deployed for general use in 2013, was the first multi-petaflop system based on accelerator hardware. To ready applications for accelerated computing, a preparedness effort was undertaken prior to delivery of Titan. In this paper we report experiences and lessons learned from this process and describe how users are currently making use of computational accelerators on Titan.

  12. Fermilab | Science at Fermilab | Experiments & Projects | Intensity

    Science.gov Websites

    Theory Computing High-performance Computing Grid Computing Networking Mass Storage Plan for the Future List Historic Results Inquiring Minds Questions About Physics Other High-Energy Physics Sites More About Particle Physics Library Visual Media Services Timeline History High-Energy Physics Accelerator

  13. Computational nuclear quantum many-body problem: The UNEDF project

    NASA Astrophysics Data System (ADS)

    Bogner, S.; Bulgac, A.; Carlson, J.; Engel, J.; Fann, G.; Furnstahl, R. J.; Gandolfi, S.; Hagen, G.; Horoi, M.; Johnson, C.; Kortelainen, M.; Lusk, E.; Maris, P.; Nam, H.; Navratil, P.; Nazarewicz, W.; Ng, E.; Nobre, G. P. A.; Ormand, E.; Papenbrock, T.; Pei, J.; Pieper, S. C.; Quaglioni, S.; Roche, K. J.; Sarich, J.; Schunck, N.; Sosonkina, M.; Terasaki, J.; Thompson, I.; Vary, J. P.; Wild, S. M.

    2013-10-01

    The UNEDF project was a large-scale collaborative effort that applied high-performance computing to the nuclear quantum many-body problem. The primary focus of the project was on constructing, validating, and applying an optimized nuclear energy density functional, which entailed a wide range of pioneering developments in microscopic nuclear structure and reactions, algorithms, high-performance computing, and uncertainty quantification. UNEDF demonstrated that close associations among nuclear physicists, mathematicians, and computer scientists can lead to novel physics outcomes built on algorithmic innovations and computational developments. This review showcases a wide range of UNEDF science results to illustrate this interplay.

  14. Multidisciplinary High-Fidelity Analysis and Optimization of Aerospace Vehicles. Part 1; Formulation

    NASA Technical Reports Server (NTRS)

    Walsh, J. L.; Townsend, J. C.; Salas, A. O.; Samareh, J. A.; Mukhopadhyay, V.; Barthelemy, J.-F.

    2000-01-01

    An objective of the High Performance Computing and Communication Program at the NASA Langley Research Center is to demonstrate multidisciplinary shape and sizing optimization of a complete aerospace vehicle configuration by using high-fidelity finite-element structural analysis and computational fluid dynamics aerodynamic analysis in a distributed, heterogeneous computing environment that includes high performance parallel computing. A software system has been designed and implemented to integrate a set of existing discipline analysis codes, some of them computationally intensive, into a distributed computational environment for the design of a high-speed civil transport configuration. The paper describes the engineering aspects of formulating the optimization by integrating these analysis codes and associated interface codes into the system. The discipline codes are integrated by using the Java programming language and a Common Object Request Broker Architecture (CORBA) compliant software product. A companion paper presents currently available results.

  15. Scout: high-performance heterogeneous computing made simple

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jablin, James; Mc Cormick, Patrick; Herlihy, Maurice

    2011-01-26

    Researchers must often write their own simulation and analysis software. During this process they simultaneously confront both computational and scientific problems. Current strategies for aiding the generation of performance-oriented programs do not abstract the software development from the science. Furthermore, the problem is becoming increasingly complex and pressing with the continued development of many-core and heterogeneous (CPU-GPU) architectures. To achieve high performance, scientists must expertly navigate both software and hardware. Co-design between computer scientists and research scientists can alleviate but not solve this problem. The science community requires better tools for developing, optimizing, and future-proofing codes, allowing scientists to focus on their research while still achieving high computational performance. Scout is a parallel programming language and extensible compiler framework targeting heterogeneous architectures. It provides the abstraction required to buffer scientists from the constantly-shifting details of hardware while still realizing high performance by encapsulating software and hardware optimization within a compiler framework.

  16. [Design and study of parallel computing environment of Monte Carlo simulation for particle therapy planning using a public cloud-computing infrastructure].

    PubMed

    Yokohama, Noriya

    2013-07-01

    This report was aimed at structuring the design of architectures and studying performance measurement of a parallel computing environment using a Monte Carlo simulation for particle therapy using a high performance computing (HPC) instance within a public cloud-computing infrastructure. Performance measurements showed an approximately 28 times faster speed than seen with single-thread architecture, combined with improved stability. A study of methods of optimizing the system operations also indicated lower cost.
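
    The reported factor can be framed with the usual speedup and efficiency definitions; a tiny sketch in which only the roughly 28x factor comes from the abstract, while the single-thread runtime and core count are invented placeholders.

        # Speedup and parallel efficiency for the ~28x figure quoted above.
        t_serial = 8.0 * 3600      # hypothetical single-thread time, seconds
        speedup = 28.0             # reported speedup over single-thread
        cores = 32                 # hypothetical HPC-instance core count
        t_parallel = t_serial / speedup
        efficiency = speedup / cores
        print(f"parallel time ~{t_parallel/3600:.2f} h, efficiency ~{efficiency:.0%}")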

  17. Cooperative high-performance storage in the accelerated strategic computing initiative

    NASA Technical Reports Server (NTRS)

    Gary, Mark; Howard, Barry; Louis, Steve; Minuzzo, Kim; Seager, Mark

    1996-01-01

    The use and acceptance of new high-performance, parallel computing platforms will be impeded by the absence of an infrastructure capable of supporting orders-of-magnitude improvement in hierarchical storage and high-speed I/O (Input/Output). The distribution of these high-performance platforms and supporting infrastructures across a wide-area network further compounds this problem. We describe an architectural design and phased implementation plan for a distributed, Cooperative Storage Environment (CSE) to achieve the necessary performance, user transparency, site autonomy, communication, and security features needed to support the Accelerated Strategic Computing Initiative (ASCI). ASCI is a Department of Energy (DOE) program attempting to apply terascale platforms and Problem-Solving Environments (PSEs) toward real-world computational modeling and simulation problems. The ASCI mission must be carried out through a unified, multilaboratory effort, and will require highly secure, efficient access to vast amounts of data. The CSE provides a logically simple, geographically distributed, storage infrastructure of semi-autonomous cooperating sites to meet the strategic ASCI PSE goal of high-performance data storage and access at the user desktop.

  18. High-Performance Computing in Neuroscience for Data-Driven Discovery, Integration, and Dissemination

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bouchard, Kristofer E.; Aimone, James B.; Chun, Miyoung

    A lack of coherent plans to analyze, manage, and understand data threatens the various opportunities offered by new neuro-technologies. High-performance computing will allow exploratory analysis of massive datasets stored in standardized formats, hosted in open repositories, and integrated with simulations.

  19. High-Performance Computing in Neuroscience for Data-Driven Discovery, Integration, and Dissemination

    DOE PAGES

    Bouchard, Kristofer E.; Aimone, James B.; Chun, Miyoung; ...

    2016-11-01

    A lack of coherent plans to analyze, manage, and understand data threatens the various opportunities offered by new neuro-technologies. High-performance computing will allow exploratory analysis of massive datasets stored in standardized formats, hosted in open repositories, and integrated with simulations.

  20. Medical applications for high-performance computers in SKIF-GRID network.

    PubMed

    Zhuchkov, Alexey; Tverdokhlebov, Nikolay

    2009-01-01

    The paper presents a set of software services for massive mammography image processing by using high-performance parallel computers of SKIF-family which are linked into a service-oriented grid-network. An experience of a prototype system implementation in two medical institutions is also described.

  1. Software Systems for High-performance Quantum Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Humble, Travis S; Britt, Keith A

    Quantum computing promises new opportunities for solving hard computational problems, but harnessing this novelty requires breakthrough concepts in the design, operation, and application of computing systems. We define some of the challenges facing the development of quantum computing systems as well as software-based approaches that can be used to overcome these challenges. Following a brief overview of the state of the art, we present models for quantum programming and execution, the development of architectures for hybrid high-performance computing systems, and the realization of software stacks for quantum networking. This leads to a discussion of the role that conventional computing plays in the quantum paradigm and how some of the current challenges for exascale computing overlap with those facing quantum computing.

  2. High-performance computing for airborne applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Quinn, Heather M; Manuzzato, Andrea; Fairbanks, Tom

    2010-06-28

    Recently, there have been attempts to move common satellite tasks to unmanned aerial vehicles (UAVs). UAVs are significantly cheaper to buy than satellites and easier to deploy on an as-needed basis. The more benign radiation environment also allows for an aggressive adoption of state-of-the-art commercial computational devices, which increases the amount of data that can be collected. There are a number of commercial computing devices currently available that are well-suited to high-performance computing. These devices range from specialized computational devices, such as field-programmable gate arrays (FPGAs) and digital signal processors (DSPs), to traditional computing platforms, such as microprocessors. Even though the radiation environment is relatively benign, these devices could be susceptible to single-event effects. In this paper, we will present radiation data for high-performance computing devices in an accelerated neutron environment. These devices include a multi-core digital signal processor, two field-programmable gate arrays, and a microprocessor. From these results, we found that all of these devices are suitable for many airplane environments without reliability problems.

  3. NASA's Participation in the National Computational Grid

    NASA Technical Reports Server (NTRS)

    Feiereisen, William J.; Zornetzer, Steve F. (Technical Monitor)

    1998-01-01

    Over the last several years it has become evident that the character of NASA's supercomputing needs has changed. One of the major missions of the agency is to support the design and manufacture of aero- and space-vehicles with technologies that will significantly reduce their cost. It is becoming clear that improvements in the process of aerospace design and manufacturing will require a high performance information infrastructure that allows geographically dispersed teams to draw upon resources that are broader than traditional supercomputing. A computational grid draws together our information resources into one system. We can foresee the time when a Grid will allow engineers and scientists to use the tools of supercomputers, databases and on line experimental devices in a virtual environment to collaborate with distant colleagues. The concept of a computational grid has been spoken of for many years, but several events in recent times are conspiring to allow us to actually build one. In late 1997 the National Science Foundation initiated the Partnerships for Advanced Computational Infrastructure (PACI) which is built around the idea of distributed high performance computing. The Alliance, led by the National Computational Science Alliance (NCSA), and the National Partnership for Advanced Computational Infrastructure (NPACI), led by the San Diego Supercomputing Center, have been instrumental in drawing together the "Grid Community" to identify the technology bottlenecks and propose a research agenda to address them. During the same period NASA has begun to reformulate parts of two major high performance computing research programs to concentrate on distributed high performance computing and has banded together with the PACI centers to address the research agenda in common.

  4. Silicon photonics for high-performance interconnection networks

    NASA Astrophysics Data System (ADS)

    Biberman, Aleksandr

    2011-12-01

    We assert in the course of this work that silicon photonics has the potential to be a key disruptive technology in computing and communication industries. The enduring pursuit of performance gains in computing, combined with stringent power constraints, has fostered the ever-growing computational parallelism associated with chip multiprocessors, memory systems, high-performance computing systems, and data centers. Sustaining these parallelism growths introduces unique challenges for on- and off-chip communications, shifting the focus toward novel and fundamentally different communication approaches. This work showcases that chip-scale photonic interconnection networks, enabled by high-performance silicon photonic devices, enable unprecedented bandwidth scalability with reduced power consumption. We demonstrate that the silicon photonic platforms have already produced all the high-performance photonic devices required to realize these types of networks. Through extensive empirical characterization in much of this work, we demonstrate such feasibility of waveguides, modulators, switches, and photodetectors. We also demonstrate systems that simultaneously combine many functionalities to achieve more complex building blocks. Furthermore, we leverage the unique properties of available silicon photonic materials to create novel silicon photonic devices, subsystems, network topologies, and architectures to enable unprecedented performance of these photonic interconnection networks and computing systems. We show that the advantages of photonic interconnection networks extend far beyond the chip, offering advanced communication environments for memory systems, high-performance computing systems, and data centers. Furthermore, we explore the immense potential of all-optical functionalities implemented using parametric processing in the silicon platform, demonstrating unique methods that have the ability to revolutionize computation and communication. Silicon photonics enables new sets of opportunities that we can leverage for performance gains, as well as new sets of challenges that we must solve. Leveraging its inherent compatibility with standard fabrication techniques of the semiconductor industry, combined with its capability of dense integration with advanced microelectronics, silicon photonics also offers a clear path toward commercialization through low-cost mass-volume production. Combining empirical validations of feasibility, demonstrations of massive performance gains in large-scale systems, and the potential for commercial penetration of silicon photonics, the impact of this work will become evident in the many decades that follow.

  5. High performance computing and communications: Advancing the frontiers of information technology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    NONE

    1997-12-31

    This report, which supplements the President's Fiscal Year 1997 Budget, describes the interagency High Performance Computing and Communications (HPCC) Program. The HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of accomplishments to its credit. Over its five-year history, the HPCC Program has focused on developing high performance computing and communications technologies that can be applied to computation-intensive applications. Major highlights for FY 1996: (1) High performance computing systems enable practical solutions to complex problems with accuracies not possible five years ago; (2) HPCC-funded research in very large scale networking techniques has been instrumental in the evolution of the Internet, which continues exponential growth in size, speed, and availability of information; (3) The combination of hardware capability measured in gigaflop/s, networking technology measured in gigabit/s, and new computational science techniques for modeling phenomena has demonstrated that very large scale accurate scientific calculations can be executed across heterogeneous parallel processing systems located thousands of miles apart; (4) Federal investments in HPCC software R and D support researchers who pioneered the development of parallel languages and compilers, high performance mathematical, engineering, and scientific libraries, and software tools--technologies that allow scientists to use powerful parallel systems to focus on Federal agency mission applications; and (5) HPCC support for virtual environments has enabled the development of immersive technologies, where researchers can explore and manipulate multi-dimensional scientific and engineering problems. Educational programs fostered by the HPCC Program have brought into classrooms new science and engineering curricula designed to teach computational science. This document contains a small sample of the significant HPCC Program accomplishments in FY 1996.

  6. FPGA-Based High-Performance Embedded Systems for Adaptive Edge Computing in Cyber-Physical Systems: The ARTICo³ Framework.

    PubMed

    Rodríguez, Alfonso; Valverde, Juan; Portilla, Jorge; Otero, Andrés; Riesgo, Teresa; de la Torre, Eduardo

    2018-06-08

    Cyber-Physical Systems are experiencing a paradigm shift in which processing has been relocated to the distributed sensing layer and is no longer performed in a centralized manner. This approach, usually referred to as Edge Computing, demands the use of hardware platforms that are able to manage the steadily increasing requirements in computing performance, while keeping energy efficiency and the adaptability imposed by the interaction with the physical world. In this context, SRAM-based FPGAs and their inherent run-time reconfigurability, when coupled with smart power management strategies, are a suitable solution. However, they usually fail in user accessibility and ease of development. In this paper, an integrated framework to develop FPGA-based high-performance embedded systems for Edge Computing in Cyber-Physical Systems is presented. This framework provides a hardware-based processing architecture, an automated toolchain, and a runtime to transparently generate and manage reconfigurable systems from high-level system descriptions without additional user intervention. Moreover, it provides users with support for dynamically adapting the available computing resources to switch the working point of the architecture in a solution space defined by computing performance, energy consumption and fault tolerance. Results show that it is indeed possible to explore this solution space at run time and prove that the proposed framework is a competitive alternative to software-based edge computing platforms, being able to provide not only faster solutions, but also higher energy efficiency for computing-intensive algorithms with significant levels of data-level parallelism.

  7. Mapping land cover change over continental Africa using Landsat and Google Earth Engine cloud computing.

    PubMed

    Midekisa, Alemayehu; Holl, Felix; Savory, David J; Andrade-Pacheco, Ricardo; Gething, Peter W; Bennett, Adam; Sturrock, Hugh J W

    2017-01-01

    Quantifying and monitoring the spatial and temporal dynamics of the global land cover is critical for better understanding many of the Earth's land surface processes. However, the lack of regularly updated, continental-scale, and high spatial resolution (30 m) land cover data limits our ability to better understand the spatial extent and the temporal dynamics of land surface changes. Despite the free availability of high spatial resolution Landsat satellite data, continental-scale land cover mapping using high resolution Landsat satellite data was not feasible until now due to the need for high-performance computing to store, process, and analyze this large volume of high resolution satellite data. In this study, we present an approach to quantify continental land cover and impervious surface changes over a long period of time (15 years) using high resolution Landsat satellite observations and the Google Earth Engine cloud computing platform. The approach applied here to overcome the computational challenges of handling big earth observation data by using cloud computing can help scientists and practitioners who lack high-performance computational resources.

  8. Mapping land cover change over continental Africa using Landsat and Google Earth Engine cloud computing

    PubMed Central

    Holl, Felix; Savory, David J.; Andrade-Pacheco, Ricardo; Gething, Peter W.; Bennett, Adam; Sturrock, Hugh J. W.

    2017-01-01

    Quantifying and monitoring the spatial and temporal dynamics of the global land cover is critical for better understanding many of the Earth's land surface processes. However, the lack of regularly updated, continental-scale, and high spatial resolution (30 m) land cover data limits our ability to better understand the spatial extent and the temporal dynamics of land surface changes. Despite the free availability of high spatial resolution Landsat satellite data, continental-scale land cover mapping using high resolution Landsat satellite data was not feasible until now due to the need for high-performance computing to store, process, and analyze this large volume of high resolution satellite data. In this study, we present an approach to quantify continental land cover and impervious surface changes over a long period of time (15 years) using high resolution Landsat satellite observations and the Google Earth Engine cloud computing platform. The approach applied here to overcome the computational challenges of handling big earth observation data by using cloud computing can help scientists and practitioners who lack high-performance computational resources. PMID:28953943

  9. Low cost, high performance processing of single particle cryo-electron microscopy data in the cloud.

    PubMed

    Cianfrocco, Michael A; Leschziner, Andres E

    2015-05-08

    The advent of a new generation of electron microscopes and direct electron detectors has realized the potential of single particle cryo-electron microscopy (cryo-EM) as a technique to generate high-resolution structures. Calculating these structures requires high performance computing clusters, a resource that may be limiting to many likely cryo-EM users. To address this limitation and facilitate the spread of cryo-EM, we developed a publicly available 'off-the-shelf' computing environment on Amazon's elastic cloud computing infrastructure. This environment provides users with single particle cryo-EM software packages and the ability to create computing clusters with 16-480+ CPUs. We tested our computing environment using a publicly available 80S yeast ribosome dataset and estimate that laboratories could determine high-resolution cryo-EM structures for $50 to $1500 per structure within a timeframe comparable to local clusters. Our analysis shows that Amazon's cloud computing environment may offer a viable computing environment for cryo-EM.
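
    A back-of-the-envelope cost sketch in the spirit of the figures above (16-480+ CPUs, $50 to $1500 per structure); the per-CPU-hour price and runtimes are hypothetical placeholders, not Amazon's actual pricing.

        # Rough cloud cost estimate for a cryo-EM refinement run.
        def estimate_cost(cpus, hours, price_per_cpu_hour=0.05):   # assumed rate
            return cpus * hours * price_per_cpu_hour

        for cpus, hours in [(16, 72), (128, 24), (480, 20)]:
            print(f"{cpus:4d} CPUs x {hours:3d} h -> ${estimate_cost(cpus, hours):,.0f}")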

  10. InfoMall: An Innovative Strategy for High-Performance Computing and Communications Applications Development.

    ERIC Educational Resources Information Center

    Mills, Kim; Fox, Geoffrey

    1994-01-01

    Describes the InfoMall, a program led by the Northeast Parallel Architectures Center (NPAC) at Syracuse University (New York). The InfoMall features a partnership of approximately 24 organizations offering linked programs in High Performance Computing and Communications (HPCC) technology integration, software development, marketing, education and…

  11. Appropriate Use Policy | High-Performance Computing | NREL

    Science.gov Websites

    Fragments of the NREL HPC appropriate use policy page: the policy applies to users of the National Renewable Energy Laboratory (NREL) High Performance Computing (HPC) resources, whether from a government agency, National Laboratory, University, or private entity; it references intellectual property terms and multifactor tokens (a physical token or a virtual token used with a one-time password).

  12. High Performance Computing and Networking for Science--Background Paper.

    ERIC Educational Resources Information Center

    Congress of the U.S., Washington, DC. Office of Technology Assessment.

    The Office of Technology Assessment is conducting an assessment of the effects of new information technologies--including high performance computing, data networking, and mass data archiving--on research and development. This paper offers a view of the issues and their implications for current discussions about Federal supercomputer initiatives…

  13. High Performance Computing and Communications Panel Report.

    ERIC Educational Resources Information Center

    President's Council of Advisors on Science and Technology, Washington, DC.

    This report offers advice on the strengths and weaknesses of the High Performance Computing and Communications (HPCC) initiative, one of five presidential initiatives launched in 1992 and coordinated by the Federal Coordinating Council for Science, Engineering, and Technology. The HPCC program has the following objectives: (1) to extend U.S.…

  14. Web Based Parallel Programming Workshop for Undergraduate Education.

    ERIC Educational Resources Information Center

    Marcus, Robert L.; Robertson, Douglass

    Central State University (Ohio), under a contract with Nichols Research Corporation, has developed a World Wide Web-based workshop on high performance computing entitled "IBM SP2 Parallel Programming Workshop." The research is part of the DoD (Department of Defense) High Performance Computing Modernization Program. The research…

  15. Dense, Efficient Chip-to-Chip Communication at the Extremes of Computing

    ERIC Educational Resources Information Center

    Loh, Matthew

    2013-01-01

    The scalability of CMOS technology has driven computation into a diverse range of applications across the power consumption, performance and size spectra. Communication is a necessary adjunct to computation, and whether this is to push data from node-to-node in a high-performance computing cluster or from the receiver of a wireless link to a neural…

  16. Distributed Computing Architecture for Image-Based Wavefront Sensing and 2 D FFTs

    NASA Technical Reports Server (NTRS)

    Smith, Jeffrey S.; Dean, Bruce H.; Haghani, Shadan

    2006-01-01

    Image-based wavefront sensing (WFS) provides significant advantages over interferometric-based wavefront sensors such as optical design simplicity and stability. However, the image-based approach is computationally intensive, and therefore, specialized high-performance computing architectures are required in applications utilizing the image-based approach. The development and testing of these high-performance computing architectures are essential to such missions as James Webb Space Telescope (JWST), Terrestrial Planet Finder-Coronagraph (TPF-C and CorSpec), and Spherical Primary Optical Telescope (SPOT). The development of these specialized computing architectures requires numerous two-dimensional Fourier Transforms, which necessitate an all-to-all communication when applied on a distributed computational architecture. Several solutions for distributed computing are presented with an emphasis on a 64 Node cluster of DSPs, multiple DSP FPGAs, and an application of low-diameter graph theory. Timing results and performance analysis will be presented. The solutions offered could be applied to other all-to-all communication and scientifically computationally complex problems.
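
    A small sketch of why the two-dimensional FFT drives an all-to-all exchange: it can be computed as row FFTs, a transpose, and a second set of row FFTs, and on a distributed machine the transpose is the all-to-all step. The NumPy check below verifies the decomposition against a direct 2-D FFT; the distributed communication itself is not shown.

        # 2-D FFT as row FFTs + transpose + row FFTs (the transpose is the
        # all-to-all exchange when rows are spread across nodes or DSPs).
        import numpy as np

        x = np.random.rand(64, 64) + 1j * np.random.rand(64, 64)

        step1 = np.fft.fft(x, axis=1)       # FFT along rows (local to each node)
        step2 = step1.T                     # transpose = all-to-all communication
        step3 = np.fft.fft(step2, axis=1)   # FFT along what were columns
        decomposed = step3.T                # transpose back to original layout

        print(np.allclose(decomposed, np.fft.fft2(x)))   # True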

  17. Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations

    NASA Technical Reports Server (NTRS)

    Oliker, Leonid; Carter, Jonathan; Shalf, John; Skinner, David; Ethier, Stephane; Biswas, Rupak; Djomehri, Jahed; VanderWijngaart, Rob

    2003-01-01

    The growing gap between sustained and peak performance for scientific applications has become a well-known problem in high performance computing. The recent development of parallel vector systems offers the potential to bridge this gap for a significant number of computational science codes and deliver a substantial increase in computing capabilities. This paper examines the intranode performance of the NEC SX6 vector processor and the cache-based IBM Power3/4 superscalar architectures across a number of key scientific computing areas. First, we present the performance of a microbenchmark suite that examines a full spectrum of low-level machine characteristics. Next, we study the behavior of the NAS Parallel Benchmarks using some simple optimizations. Finally, we evaluate the performance of several numerical codes from key scientific computing domains. Overall results demonstrate that the SX6 achieves high performance on a large fraction of our application suite and in many cases significantly outperforms the RISC-based architectures. However, certain classes of applications are not easily amenable to vectorization and would likely require extensive reengineering of both algorithm and implementation to utilize the SX6 effectively.

  18. High Performance Computing and Communications Program. Hearing before the Subcommittee on Science of the Committee on Science, Space, and Technology. U.S. House of Representatives, One Hundred Third Congress, First Session (October 26, 1993).

    ERIC Educational Resources Information Center

    Congress of the U.S., Washington, DC. House Committee on Science, Space and Technology.

    This hearing explores how the High Performance Computing and Communications Program (HPCC) relates to the technology needs of industry. Testimony and prepared statements from the following witnesses on future effects of computing and networking technologies on their companies are included: (1) F. Brett Berlin, president, Brett Berlin Associates,…

  19. [Earth Science Technology Office's Computational Technologies Project

    NASA Technical Reports Server (NTRS)

    Fischer, James (Technical Monitor); Merkey, Phillip

    2005-01-01

    This grant supported the effort to characterize the problem domain of the Earth Science Technology Office's Computational Technologies Project, to engage the Beowulf Cluster Computing Community as well as the High Performance Computing Research Community so that we can predict the applicability of said technologies to the scientific community represented by the CT project and formulate long term strategies to provide the computational resources necessary to attain the anticipated scientific objectives of the CT project. Specifically, the goal of the evaluation effort is to use the information gathered over the course of the Round-3 investigations to quantify the trends in scientific expectations, the algorithmic requirements and capabilities of high-performance computers to satisfy this anticipated need.

  20. An Approach to Integrate a Space-Time GIS Data Model with High Performance Computers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Dali; Zhao, Ziliang; Shaw, Shih-Lung

    2011-01-01

    In this paper, we describe an approach to integrate a Space-Time GIS data model on a high performance computing platform. The Space-Time GIS data model has been developed on a desktop computing environment. We use the Space-Time GIS data model to generate a GIS module, which organizes a series of remote sensing data. We are in the process of porting the GIS module into an HPC environment, in which the GIS modules handle large datasets directly via a parallel file system. Although it is an ongoing project, the authors hope this effort can inspire further discussions on the integration of GIS on high performance computing platforms.

  1. Multidisciplinary Design Optimization of a Full Vehicle with High Performance Computing

    NASA Technical Reports Server (NTRS)

    Yang, R. J.; Gu, L.; Tho, C. H.; Sobieszczanski-Sobieski, Jaroslaw

    2001-01-01

    Multidisciplinary design optimization (MDO) of a full vehicle under the constraints of crashworthiness, NVH (Noise, Vibration and Harshness), durability, and other performance attributes is one of the imperative goals for the automotive industry. However, it is often infeasible due to the lack of computational resources, robust simulation capabilities, and efficient optimization methodologies. This paper intends to move closer towards that goal by using parallel computers for the intensive computation and combining different approximations for dissimilar analyses in the MDO process. The MDO process presented in this paper is an extension of the previous work reported by Sobieski et al. In addition to the roof crush, two full vehicle crash modes are added: full frontal impact and 50% frontal offset crash. Instead of using an adaptive polynomial response surface method, this paper employs a DOE/RSM method for exploring the design space and constructing highly nonlinear crash functions. Two MDO strategies are used and results are compared. This paper demonstrates that with high performance computing, a conventionally intractable real-world full vehicle multidisciplinary optimization problem considering all performance attributes with a large number of design variables becomes feasible.
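
    A minimal sketch of the DOE/RSM idea mentioned above: sample the design space, fit an inexpensive response surface to expensive simulation outputs, then optimize on the surface instead of rerunning the simulation. The quadratic surface, the toy "crash response" function, and the two design variables are invented; the paper's actual crash models and MDO strategies are not reproduced.

        # DOE/RSM sketch: fit a quadratic response surface to sampled outputs
        # of a stand-in "expensive simulation" and pick the best surface point.
        import numpy as np

        rng = np.random.default_rng(0)

        def expensive_simulation(x1, x2):   # stand-in for a crash analysis
            return (x1 - 0.3) ** 2 + 2.0 * (x2 - 0.7) ** 2 + 0.1 * x1 * x2

        # design of experiments: random samples in the unit square
        X = rng.uniform(0.0, 1.0, size=(30, 2))
        y = np.array([expensive_simulation(a, b) for a, b in X])

        # quadratic response surface basis: 1, x1, x2, x1^2, x2^2, x1*x2
        def basis(P):
            x1, x2 = P[:, 0], P[:, 1]
            return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1*x2])

        coef, *_ = np.linalg.lstsq(basis(X), y, rcond=None)

        # optimize on the cheap surface over a dense grid
        g = np.linspace(0.0, 1.0, 101)
        G1, G2 = np.meshgrid(g, g)
        grid = np.column_stack([G1.ravel(), G2.ravel()])
        best = grid[np.argmin(basis(grid) @ coef)]
        print("surrogate optimum near:", best)   # roughly (0.27, 0.69)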

  2. Onward to Petaflops Computing

    NASA Technical Reports Server (NTRS)

    Bailey, David H.; Chancellor, Marisa K. (Technical Monitor)

    1997-01-01

    With programs such as the US High Performance Computing and Communications Program (HPCCP), the attention of scientists and engineers worldwide has been focused on the potential of very high performance scientific computing, namely systems that are hundreds or thousands of times more powerful than those typically available in desktop systems at any given point in time. Extending the frontiers of computing in this manner has resulted in remarkable advances, both in computing technology itself and also in the various scientific and engineering disciplines that utilize these systems. Within the next month or two, a sustained rate of 1 Tflop/s (also written 1 teraflops, or 10^12 floating-point operations per second) is likely to be achieved by the 'ASCI Red' system at Sandia National Laboratories in New Mexico. With this milestone in sight, it is reasonable to ask what lies ahead for high-end computing.

  3. High performance computing and communications: FY 1997 implementation plan

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    NONE

    1996-12-01

    The High Performance Computing and Communications (HPCC) Program was formally authorized by passage, with bipartisan support, of the High-Performance Computing Act of 1991, signed on December 9, 1991. The original Program, in which eight Federal agencies participated, has now grown to twelve agencies. This Plan provides a detailed description of the agencies' FY 1996 HPCC accomplishments and FY 1997 HPCC plans. Section 3 of this Plan provides an overview of the HPCC Program. Section 4 contains more detailed definitions of the Program Component Areas, with an emphasis on the overall directions and milestones planned for each PCA. Appendix A provides a detailed look at HPCC Program activities within each agency.

  4. Exascale computing and big data

    DOE PAGES

    Reed, Daniel A.; Dongarra, Jack

    2015-06-25

    Scientific discovery and engineering innovation require unifying traditionally separated high-performance computing and big data analytics. The tools and cultures of high-performance computing and big data analytics have diverged, to the detriment of both; unification is essential to address a spectrum of major research domains. The challenges of scale tax our ability to transmit data, compute complicated functions on that data, or store a substantial part of it; new approaches are required to meet these challenges. Finally, the international nature of science demands further development of advanced computer architectures and global standards for processing data, even as international competition complicates the openness of the scientific process.

  5. Exascale computing and big data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Reed, Daniel A.; Dongarra, Jack

    Scientific discovery and engineering innovation require unifying traditionally separated high-performance computing and big data analytics. The tools and cultures of high-performance computing and big data analytics have diverged, to the detriment of both; unification is essential to address a spectrum of major research domains. The challenges of scale tax our ability to transmit data, compute complicated functions on that data, or store a substantial part of it; new approaches are required to meet these challenges. Finally, the international nature of science demands further development of advanced computer architectures and global standards for processing data, even as international competition complicates the openness of the scientific process.

  6. VLab: A Science Gateway for Distributed First Principles Calculations in Heterogeneous High Performance Computing Systems

    ERIC Educational Resources Information Center

    da Silveira, Pedro Rodrigo Castro

    2014-01-01

    This thesis describes the development and deployment of a cyberinfrastructure for distributed high-throughput computations of materials properties at high pressures and/or temperatures--the Virtual Laboratory for Earth and Planetary Materials--VLab. VLab was developed to leverage the aggregated computational power of grid systems to solve…

  7. Benchmarking high performance computing architectures with CMS’ skeleton framework

    NASA Astrophysics Data System (ADS)

    Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

    2017-10-01

    In 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel's Threading Building Blocks library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many-core architectures; machines such as Cori Phase 1 and 2, Theta, and Mira. Because of this we have revived the 2012 benchmark to test its performance and conclusions on these new architectures. This talk will discuss the results of this exercise.

  8. An Application-Based Performance Evaluation of NASA's Nebula Cloud Computing Platform

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Heistand, Steve; Jin, Haoqiang; Chang, Johnny; Hood, Robert T.; Mehrotra, Piyush; Biswas, Rupak

    2012-01-01

    The high performance computing (HPC) community has shown tremendous interest in exploring cloud computing because of its promised benefits. In this paper, we examine the feasibility, performance, and scalability of production quality scientific and engineering applications of interest to NASA on NASA's cloud computing platform, called Nebula, hosted at Ames Research Center. This work represents a comprehensive evaluation of Nebula using NUTTCP, HPCC, NPB, I/O, and MPI function benchmarks as well as four applications representative of the NASA HPC workload. Specifically, we compare Nebula performance on some of these benchmarks and applications to that of NASA's Pleiades supercomputer, a traditional HPC system. We also investigate the impact of virtIO and jumbo frames on interconnect performance. Overall results indicate that on Nebula (i) virtIO and jumbo frames improve network bandwidth by a factor of 5x, (ii) there is a significant virtualization layer overhead of about 10% to 25%, (iii) write performance is lower by a factor of 25x, (iv) latency for short MPI messages is very high, and (v) overall performance is 15% to 48% lower than that on Pleiades for NASA HPC applications. We also comment on the usability of the cloud platform.

  9. NAS Parallel Benchmarks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bailey, David H.

    The NAS Parallel Benchmarks (NPB) are a suite of parallel computer performance benchmarks. They were originally developed at the NASA Ames Research Center in 1991 to assess high-end parallel supercomputers. Although they are no longer used as widely as they once were for comparing high-end system performance, they continue to be studied and analyzed a great deal in the high-performance computing community. The acronym 'NAS' originally stood for the Numerical Aeronautical Simulation Program at NASA Ames. The name of this organization was subsequently changed to the Numerical Aerospace Simulation Program, and more recently to the NASA Advanced Supercomputing Center, although the acronym remains 'NAS.' The developers of the original NPB suite were David H. Bailey, Eric Barszcz, John Barton, David Browning, Russell Carter, Leo Dagum, Rod Fatoohi, Samuel Fineberg, Paul Frederickson, Thomas Lasinski, Rob Schreiber, Horst Simon, V. Venkatakrishnan and Sisira Weeratunga. The original NAS Parallel Benchmarks consisted of eight individual benchmark problems, each of which focused on some aspect of scientific computing. The principal focus was in computational aerophysics, although most of these benchmarks have much broader relevance, since in a much larger sense they are typical of many real-world scientific computing applications. The NPB suite grew out of the need for a more rational procedure to select new supercomputers for acquisition by NASA. The emergence of commercially available highly parallel computer systems in the late 1980s offered an attractive alternative to parallel vector supercomputers that had been the mainstay of high-end scientific computing. However, the introduction of highly parallel systems was accompanied by a regrettable level of hype, not only on the part of the commercial vendors but even, in some cases, by scientists using the systems. As a result, it was difficult to discern whether the new systems offered any fundamental performance advantage over vector supercomputers, and, if so, which of the parallel offerings would be most useful in real-world scientific computation. In part to draw attention to some of the performance reporting abuses prevalent at the time, the present author wrote a humorous essay 'Twelve Ways to Fool the Masses,' which described in a light-hearted way a number of the questionable ways in which both vendor marketing people and scientists were inflating and distorting their performance results. All of this underscored the need for an objective and scientifically defensible measure to compare performance on these systems.
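
    To give a flavor of what one of the simpler kernels in such a suite looks like, the sketch below is loosely modeled on an "embarrassingly parallel" style of benchmark kernel: independent workers generate pseudorandom pairs, apply an acceptance test, and the partial tallies are reduced at the end. It is an illustrative stand-in, not one of the actual NPB codes.

        # Sketch loosely modeled on an "embarrassingly parallel" benchmark kernel:
        # independent workers generate random pairs, apply an acceptance test,
        # and their partial tallies are summed at the end.
        import numpy as np
        from multiprocessing import Pool

        def kernel(seed, n=100_000):
            rng = np.random.default_rng(seed)
            x = rng.uniform(-1.0, 1.0, n)
            y = rng.uniform(-1.0, 1.0, n)
            return int(np.count_nonzero(x * x + y * y <= 1.0))  # accepted pairs

        if __name__ == "__main__":
            with Pool(4) as pool:
                partial = pool.map(kernel, range(4))   # 4 independent random streams
            accepted = sum(partial)
            print("pi estimate from tallies:", 4.0 * accepted / (4 * 100_000))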

  10. Tse computers [Chinese pictograph character binary image processor design for high speed applications]

    NASA Technical Reports Server (NTRS)

    Strong, J. P., III

    1973-01-01

    Tse computers have the potential of operating four or five orders of magnitude faster than present digital computers. The computers of the new design use binary images as their basic computational entity. The word 'tse' is the transliteration of the Chinese word for 'pictograph character.' Tse computers are large collections of devices that perform logical operations on binary images. The operations on binary images are to be performed over the entire image simultaneously.
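
    The "operations over the entire image simultaneously" idea can be made concrete with a small software sketch, with NumPy array operations standing in for the optical/hardware image logic described above.

        # Sketch: logical operations applied to whole binary images at once,
        # illustrating (in software) the image-wide operations described above.
        import numpy as np

        rng = np.random.default_rng(1)
        a = rng.random((512, 512)) > 0.5      # binary image A
        b = rng.random((512, 512)) > 0.5      # binary image B

        image_and = a & b                     # AND of the two images, all pixels at once
        image_or = a | b                      # OR
        image_not = ~a                        # NOT
        shifted = np.roll(a, 1, axis=1)       # simple whole-image neighborhood shift

        print(image_and.sum(), image_or.sum(), image_not.sum(), shifted.shape)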

  11. Numerical Stability and Control Analysis Towards Falling-Leaf Prediction Capabilities of Splitflow for Two Generic High-Performance Aircraft Models

    NASA Technical Reports Server (NTRS)

    Charlton, Eric F.

    1998-01-01

    Aerodynamic analyses are performed using the Lockheed-Martin Tactical Aircraft Systems (LMTAS) Splitflow computational fluid dynamics code to investigate the computational prediction capabilities for vortex-dominated flow fields of two different tailless aircraft models at large angles of attack and sideslip. These computations are performed with the goal of providing useful stability and control data to designers of high performance aircraft. Appropriate metrics for accuracy, time, and ease of use are determined in consultation with both the LMTAS Advanced Design and the Stability and Control groups. Results are obtained and compared to wind-tunnel data for all six components of forces and moments. Moment data are combined to form a "falling leaf" stability analysis. Finally, a handful of viscous simulations were also performed to further investigate nonlinearities and possible viscous effects in the differences between the accumulated inviscid computational and experimental data.

  12. Geocomputation over Hybrid Computer Architecture and Systems: Prior Works and On-going Initiatives at UARK

    NASA Astrophysics Data System (ADS)

    Shi, X.

    2015-12-01

    As NSF indicated - "Theory and experimentation have for centuries been regarded as two fundamental pillars of science. It is now widely recognized that computational and data-enabled science forms a critical third pillar." Geocomputation is the third pillar of GIScience and the geosciences. With the exponential growth of geodata, the challenge of scalable and high-performance computing for big data analytics becomes urgent, because many research activities are constrained by software and tools that simply cannot complete the computation. Heterogeneous geodata integration and analytics obviously magnify the complexity and operational time frame. Many large-scale geospatial problems may not be processable at all if the computer system does not have sufficient memory or computational power. Emerging computer architectures, such as Intel's Many Integrated Core (MIC) Architecture and the Graphics Processing Unit (GPU), and advanced computing technologies provide promising solutions that employ massive parallelism and hardware resources to achieve scalability and high performance for data-intensive computing over large spatiotemporal and social media data. Exploring novel algorithms and deploying the solutions in massively parallel computing environments to achieve scalable data processing and analytics over large-scale, complex, and heterogeneous geodata with consistent quality and high performance has been the central theme of our research team in the Department of Geosciences at the University of Arkansas (UARK). New multi-core architectures combined with application accelerators hold the promise of achieving scalability and high performance by exploiting task- and data-level parallelism that is not supported by conventional computing systems. Such a parallel or distributed computing environment is particularly suitable for large-scale geocomputation over big data, as demonstrated by our prior work, while the potential of such advanced infrastructure remains largely unexplored in this domain. Within this presentation, our prior and on-going initiatives will be summarized to exemplify how we exploit multicore CPUs, GPUs, and MICs, and clusters of CPUs, GPUs and MICs, to accelerate geocomputation in different applications.
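
    A minimal sketch of the data-parallel pattern described above: a large raster is split into tiles and the tiles are processed concurrently. Python multiprocessing stands in here for the accelerators and clusters discussed in the abstract, and the per-tile operation is a hypothetical placeholder.

        # Sketch: split a large raster into tiles and process the tiles in parallel.
        # multiprocessing stands in for the GPU/MIC accelerators discussed above.
        import numpy as np
        from multiprocessing import Pool

        def process_tile(tile):
            # hypothetical per-tile analytic, e.g. a local mean
            return float(tile.mean())

        def split_tiles(raster, size=256):
            h, w = raster.shape
            return [raster[i:i + size, j:j + size]
                    for i in range(0, h, size) for j in range(0, w, size)]

        if __name__ == "__main__":
            raster = np.random.default_rng(2).random((1024, 1024))
            with Pool() as pool:
                results = pool.map(process_tile, split_tiles(raster))
            print(len(results), "tiles processed")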

  13. User Account Passwords | High-Performance Computing | NREL

    Science.gov Websites

    User Account Passwords: For NREL's high-performance computing (HPC) systems, learn about user account password requirements and how to set up, log in, and change passwords. Logging In the First Time: After you request an HPC user account, you'll receive a temporary password. Set

  14. Numerical Prediction of Pitch Damping Stability Derivatives for Finned Projectiles

    DTIC Science & Technology

    2013-11-01

    in part by a grant of high-performance computing time from the U.S. DOD High Performance Computing Modernization Program (HPCMP) at the Army...

  15. Data Security Policy | High-Performance Computing | NREL

    Science.gov Websites

    to use its high-performance computing (HPC) systems. NREL HPC systems are operated as research systems and may only contain data related to scientific research. These systems are categorized as low risk; data may be sensitive or non-sensitive. One example of sensitive data would be personally identifiable information (PII

  16. Federal High Performance Computing and Communications Program. The Department of Energy Component.

    ERIC Educational Resources Information Center

    Department of Energy, Washington, DC. Office of Energy Research.

    This report, profusely illustrated with color photographs and other graphics, elaborates on the Department of Energy (DOE) research program in High Performance Computing and Communications (HPCC). The DOE is one of seven agency programs within the Federal Research and Development Program working on HPCC. The DOE HPCC program emphasizes research in…

  17. Training | High-Performance Computing | NREL

    Science.gov Websites

    Training: Find training resources for using NREL's high-performance computing (HPC) systems as well as related online tutorials. Upcoming Training: HPC User Workshop - June 12th. ... Conference, a group meets to discuss Best Practices in HPC Training. This group developed a list of resources

  18. Shared Storage Usage Policy | High-Performance Computing | NREL

    Science.gov Websites

    Shared Storage Usage Policy: To use NREL's high-performance computing (HPC) systems, you must abide by the Shared Storage Usage Policy. /projects: NREL HPC allocations include storage space in the /projects filesystem. However, /projects is a shared resource and project

  19. Scalable, High-performance 3D Imaging Software Platform: System Architecture and Application to Virtual Colonoscopy

    PubMed Central

    Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli; Brett, Bevin

    2013-01-01

    One of the key challenges in three-dimensional (3D) medical imaging is to enable the fast turn-around times often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that need to be processed. In this work, we have developed a software platform that is designed to support high-performance 3D medical image processing for a wide range of applications using increasingly available and affordable commodity computing systems: multi-core, clusters, and cloud computing systems. To achieve scalable, high-performance computing, our platform (1) employs size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D image processing algorithms; (2) supports task scheduling for efficient load distribution and balancing; and (3) consists of layered parallel software libraries that allow a wide range of medical applications to share the same functionalities. We evaluated the performance of our platform by applying it to an electronic cleansing system in virtual colonoscopy, with initial experimental results showing a 10 times performance improvement on an 8-core workstation over the original sequential implementation of the system. PMID:23366803
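
    The block-volume idea can be sketched in a few lines: a 3D volume is split into fixed-size blocks that can be distributed and processed independently. This is a greatly simplified, hypothetical stand-in for the platform's size-adaptive block volumes and scheduler.

        # Sketch: decompose a 3D volume into distributable blocks and process them
        # independently; a simplified stand-in for the size-adaptive block volumes
        # and task scheduling described above.
        import numpy as np
        from concurrent.futures import ProcessPoolExecutor

        def blocks(volume, size=64):
            nz, ny, nx = volume.shape
            for z in range(0, nz, size):
                for y in range(0, ny, size):
                    for x in range(0, nx, size):
                        yield volume[z:z + size, y:y + size, x:x + size]

        def filter_block(block):
            return float(block.max())      # hypothetical per-block image operation

        if __name__ == "__main__":
            vol = np.random.default_rng(3).random((128, 128, 128)).astype(np.float32)
            with ProcessPoolExecutor() as ex:
                maxima = list(ex.map(filter_block, blocks(vol)))
            print("global max from block results:", max(maxima))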

  20. Validation of MCNP6 Version 1.0 with the ENDF/B-VII.1 Cross Section Library for Plutonium Metals, Oxides, and Solutions on the High Performance Computing Platform Moonlight

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chapman, Bryan Scott; Gough, Sean T.

    This report documents a validation of the MCNP6 Version 1.0 computer code on the high performance computing platform Moonlight, for operations at Los Alamos National Laboratory (LANL) that involve plutonium metals, oxides, and solutions. The validation is conducted using the ENDF/B-VII.1 continuous energy group cross section library at room temperature. The results are for use by nuclear criticality safety personnel in performing analysis and evaluation of various facility activities involving plutonium materials.

  1. High Performance Computing of Meshless Time Domain Method on Multi-GPU Cluster

    NASA Astrophysics Data System (ADS)

    Ikuno, Soichiro; Nakata, Susumu; Hirokawa, Yuta; Itoh, Taku

    2015-01-01

    High performance computing of the Meshless Time Domain Method (MTDM) on multi-GPU using the supercomputer HA-PACS (Highly Accelerated Parallel Advanced system for Computational Sciences) at the University of Tsukuba is investigated. Generally, the finite difference time domain (FDTD) method is adopted for numerical simulation of electromagnetic wave propagation phenomena. However, the numerical domain must be divided into rectangular meshes, and it is difficult to apply the method to problems in complex domains. On the other hand, MTDM can easily be adapted to such problems because it does not require meshes. In the present study, we implement MTDM on a multi-GPU cluster to speed up the method, and numerically investigate its performance on the multi-GPU cluster. To reduce the computation time, the communication time between the decomposed domains is hidden behind the perfectly matched layer (PML) calculation procedure. The results show that MTDM on 128 GPUs is 173 times faster than a single-CPU calculation.
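
    The communication-hiding trick described above can be sketched with non-blocking MPI: boundary (halo) data are exchanged while independent work proceeds, and the boundary update finishes once the messages arrive. The sketch below uses mpi4py with hypothetical array names; the "independent work" stands in for the PML update behind which the paper hides communication.

        # Sketch: overlap halo exchange with independent computation using
        # non-blocking MPI (mpi4py). Run with e.g.: mpiexec -n 4 python overlap.py
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()
        left, right = (rank - 1) % size, (rank + 1) % size

        field = np.full(1000, float(rank))
        send_l, send_r = field[:1].copy(), field[-1:].copy()
        recv_l, recv_r = np.empty(1), np.empty(1)

        reqs = [comm.Isend(send_l, dest=left),   comm.Isend(send_r, dest=right),
                comm.Irecv(recv_l, source=left), comm.Irecv(recv_r, source=right)]

        field[1:-1] *= 0.5                # independent interior work overlaps communication

        MPI.Request.Waitall(reqs)         # halos now available; finish the boundary update
        field[0] = 0.5 * (field[0] + recv_l[0])
        field[-1] = 0.5 * (field[-1] + recv_r[0])
        print(rank, field[:2], field[-2:])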

  2. Modeling Cardiac Electrophysiology at the Organ Level in the Peta FLOPS Computing Age

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mitchell, Lawrence; Bishop, Martin; Hoetzl, Elena

    2010-09-30

    Despite a steep increase in available compute power, in-silico experimentation with highly detailed models of the heart remains challenging due to the high computational cost involved. It is hoped that next-generation high performance computing (HPC) resources will lead to significant reductions in execution times and enable a new class of in-silico applications. However, performance gains on these new platforms can only be achieved by engaging a much larger number of compute cores, necessitating strongly scalable numerical techniques. So far, strong scalability has been demonstrated only for a moderate number of cores, orders of magnitude below the range required to achieve the desired performance boost. In this study, strong scalability of currently used techniques to solve the bidomain equations is investigated. Benchmark results suggest that scalability is limited to 512-4096 cores within the range of relevant problem sizes, even when systems are carefully load-balanced and advanced IO strategies are employed.

  3. High Performance Computing Modeling Advances Accelerator Science for High-Energy Physics

    DOE PAGES

    Amundson, James; Macridin, Alexandru; Spentzouris, Panagiotis

    2014-07-28

    The development and optimization of particle accelerators are essential for advancing our understanding of the properties of matter, energy, space, and time. Particle accelerators are complex devices whose behavior involves many physical effects on multiple scales. Therefore, advanced computational tools utilizing high-performance computing are essential for accurately modeling them. In the past decade, the US Department of Energy's SciDAC program has produced accelerator-modeling tools that have been employed to tackle some of the most difficult accelerator science problems. The authors discuss the Synergia framework and its applications to high-intensity particle accelerator physics. Synergia is an accelerator simulation package capable of handling the entire spectrum of beam dynamics simulations. The authors present Synergia's design principles and its performance on HPC platforms.

  4. A Modular Environment for Geophysical Inversion and Run-time Autotuning using Heterogeneous Computing Systems

    NASA Astrophysics Data System (ADS)

    Myre, Joseph M.

    Heterogeneous computing systems have recently come to the forefront of the High-Performance Computing (HPC) community's interest. HPC computer systems that incorporate special purpose accelerators, such as Graphics Processing Units (GPUs), are said to be heterogeneous. Large scale heterogeneous computing systems have consistently ranked highly on the Top500 list since the beginning of the heterogeneous computing trend. By using heterogeneous computing systems that consist of both general purpose processors and special-purpose accelerators, the speed and problem size of many simulations could be dramatically increased. Ultimately this results in enhanced simulation capabilities that allow, in some cases for the first time, the execution of parameter space and uncertainty analyses, model optimizations, and other inverse modeling techniques that are critical for scientific discovery and engineering analysis. However, simplifying the usage and optimization of codes for heterogeneous computing systems remains a challenge. This is particularly true for scientists and engineers for whom understanding HPC architectures and undertaking performance analysis may not be primary research objectives. To enable scientists and engineers to remain focused on their primary research objectives, a modular environment for geophysical inversion and run-time autotuning on heterogeneous computing systems is presented. This environment is composed of three major components: 1) CUSH---a framework for reducing the complexity of programming heterogeneous computer systems, 2) geophysical inversion routines which can be used to characterize physical systems, and 3) run-time autotuning routines designed to determine configurations of heterogeneous computing systems in an attempt to maximize the performance of scientific and engineering codes. Using three case studies, a lattice-Boltzmann method, a non-negative least squares inversion, and a finite-difference fluid flow method, it is shown that this environment provides scientists and engineers with means to reduce the programmatic complexity of their applications, to perform geophysical inversions for characterizing physical systems, and to determine high-performing run-time configurations of heterogeneous computing systems using a run-time autotuner.
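
    The run-time autotuning component can be illustrated with a tiny sketch: time a kernel under several candidate configurations and keep the fastest. The kernel, parameter, and function names below are hypothetical and are not CUSH's actual API.

        # Sketch: a tiny run-time autotuner that times a kernel under candidate
        # configurations and selects the fastest one; names are hypothetical.
        import time
        import numpy as np

        def kernel(data, block_size):
            # stand-in compute kernel whose speed depends on a tunable parameter
            out = 0.0
            for start in range(0, len(data), block_size):
                out += float(np.sum(data[start:start + block_size] ** 2))
            return out

        def autotune(data, candidates):
            best_cfg, best_time = None, float("inf")
            for cfg in candidates:
                t0 = time.perf_counter()
                kernel(data, cfg)
                dt = time.perf_counter() - t0
                if dt < best_time:
                    best_cfg, best_time = cfg, dt
            return best_cfg, best_time

        data = np.random.default_rng(4).random(1_000_000)
        cfg, dt = autotune(data, [1_000, 10_000, 100_000])
        print("selected block size:", cfg, "time:", round(dt, 4), "s")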

  5. THC-MP: High performance numerical simulation of reactive transport and multiphase flow in porous media

    NASA Astrophysics Data System (ADS)

    Wei, Xiaohui; Li, Weishan; Tian, Hailong; Li, Hongliang; Xu, Haixiao; Xu, Tianfu

    2015-07-01

    The numerical simulation of multiphase flow and reactive transport in porous media for complex subsurface problems is a computationally intensive application. To meet the increasing computational requirements, this paper presents a parallel computing method and architecture. Derived from TOUGHREACT, a well-established code for simulating subsurface multiphase flow and reactive transport problems, we developed THC-MP, a high performance computing code for massively parallel computers, which greatly extends the computational capability of the original code. The domain decomposition method was applied to the coupled numerical computing procedure in THC-MP. We designed the distributed data structures, implemented the data initialization and exchange between the computing nodes, and implemented the core solving module using a hybrid parallel iterative and direct solver. Numerical accuracy of THC-MP was verified on a CO2 injection-induced reactive transport problem by comparing the results obtained from parallel computing with those from sequential computing (the original code). Execution efficiency and code scalability were examined through field-scale carbon sequestration applications on a multicore cluster. The results successfully demonstrate the enhanced performance obtained with THC-MP on parallel computing facilities.

  6. HEPCloud, a New Paradigm for HEP Facilities: CMS Amazon Web Services Investigation

    DOE PAGES

    Holzman, Burt; Bauerdick, Lothar A. T.; Bockelman, Brian; ...

    2017-09-29

    Historically, high energy physics computing has been performed on large purpose-built computing systems. These began as single-site compute facilities, but have evolved into the distributed computing grids used today. Recently, there has been an exponential increase in the capacity and capability of commercial clouds. Cloud resources are highly virtualized and intended to be able to be flexibly deployed for a variety of computing tasks. There is a growing interest among the cloud providers to demonstrate the capability to perform large-scale scientific computing. In this paper, we discuss results from the CMS experiment using the Fermilab HEPCloud facility, which utilized both local Fermilab resources and virtual machines in the Amazon Web Services Elastic Compute Cloud. We discuss the planning, technical challenges, and lessons learned involved in performing physics workflows on a large-scale set of virtualized resources. Additionally, we will discuss the economics and operational efficiencies when executing workflows both in the cloud and on dedicated resources.

  7. HEPCloud, a New Paradigm for HEP Facilities: CMS Amazon Web Services Investigation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Holzman, Burt; Bauerdick, Lothar A. T.; Bockelman, Brian

    Historically, high energy physics computing has been performed on large purpose-built computing systems. These began as single-site compute facilities, but have evolved into the distributed computing grids used today. Recently, there has been an exponential increase in the capacity and capability of commercial clouds. Cloud resources are highly virtualized and intended to be able to be flexibly deployed for a variety of computing tasks. There is a growing interest among the cloud providers to demonstrate the capability to perform large-scale scientific computing. In this paper, we discuss results from the CMS experiment using the Fermilab HEPCloud facility, which utilized both local Fermilab resources and virtual machines in the Amazon Web Services Elastic Compute Cloud. We discuss the planning, technical challenges, and lessons learned involved in performing physics workflows on a large-scale set of virtualized resources. Additionally, we will discuss the economics and operational efficiencies when executing workflows both in the cloud and on dedicated resources.

  8. Progress in a novel architecture for high performance processing

    NASA Astrophysics Data System (ADS)

    Zhang, Zhiwei; Liu, Meng; Liu, Zijun; Du, Xueliang; Xie, Shaolin; Ma, Hong; Ding, Guangxin; Ren, Weili; Zhou, Fabiao; Sun, Wenqin; Wang, Huijuan; Wang, Donglin

    2018-04-01

    High performance processing (HPP) is an innovative architecture that targets high-performance computing with excellent power efficiency and computing performance. It is suitable for data-intensive applications such as supercomputing, machine learning and wireless communication. An example chip with four application-specific integrated circuit (ASIC) cores, the first generation of HPP cores, has been taped out successfully under the Taiwan Semiconductor Manufacturing Company (TSMC) 40 nm low power process. The innovative architecture shows great energy efficiency over the traditional central processing unit (CPU) and general-purpose computing on graphics processing units (GPGPU). Compared with MaPU, HPP has made great improvements in architecture. A chip with 32 HPP cores is being developed under the TSMC 16 nm FFC technology process and is planned for commercial use. The peak performance of this chip can reach 4.3 teraFLOPS (TFLOPS) and its power efficiency is up to 89.5 gigaFLOPS per watt (GFLOPS/W).

  9. Computational Analysis of a Prototype Martian Rotorcraft Experiment

    NASA Technical Reports Server (NTRS)

    Corfeld, Kelly J.; Strawn, Roger C.; Long, Lyle N.

    2002-01-01

    This paper presents Reynolds-averaged Navier-Stokes calculations for a prototype Martian rotorcraft. The computations are intended for comparison with an ongoing Mars rotor hover test at NASA Ames Research Center. These computational simulations present a new and challenging problem, since rotors that operate on Mars will experience a unique low Reynolds number and high Mach number environment. Computed results for the 3-D rotor differ substantially from 2-D sectional computations in that the 3-D results exhibit a stall delay phenomenon caused by rotational forces along the blade span. Computational results have yet to be compared to experimental data, but computed performance predictions match the experimental design goals fairly well. In addition, the computed results provide a high level of detail in the rotor wake and blade surface aerodynamics. These details provide an important supplement to the expected experimental performance data.

  10. Cell-NPE (Numerical Performance Evaluation): Programming the IBM Cell Broadband Engine -- A General Parallelization Strategy

    DTIC Science & Technology

    2008-04-01

    Space GmbH as follows: B. TECHNICAL PROPOSAL/DESCRIPTION OF WORK. Cell: A Revolutionary High Performance Computing Platform. On 29 June 2005 [1]...IBM announced that it has partnered with Mercury Computer Systems, a maker of specialized computers. The Cell chip provides massive floating-point...the computing industry away from the traditional processor technology dominated by Intel. While in the past, the development of computing power has

  11. High-speed multiple sequence alignment on a reconfigurable platform.

    PubMed

    Oliver, Tim; Schmidt, Bertil; Maskell, Douglas; Nathan, Darran; Clemens, Ralf

    2006-01-01

    Progressive alignment is a widely used approach to compute multiple sequence alignments (MSAs). However, aligning several hundred sequences with popular progressive alignment tools requires hours on sequential computers. Due to the rapid growth of sequence databases, biologists have to compute MSAs in a far shorter time. In this paper we present a new approach to MSA on reconfigurable hardware platforms to gain high performance at low cost. We have constructed a linear systolic array to perform pairwise sequence distance computations using dynamic programming. This results in an implementation with significant runtime savings on a standard FPGA.
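
    The pairwise distance computation mentioned above is, in software terms, a classic dynamic-programming recurrence. The sketch below computes plain edit distance between sequences; the FPGA systolic array evaluates this kind of recurrence cell by cell in hardware, and the actual tool may use a different distance definition.

        # Sketch: pairwise sequence distance by dynamic programming (edit distance);
        # the systolic array evaluates a recurrence of this kind in hardware.
        def edit_distance(a, b):
            prev = list(range(len(b) + 1))
            for i, ca in enumerate(a, 1):
                curr = [i]
                for j, cb in enumerate(b, 1):
                    cost = 0 if ca == cb else 1
                    curr.append(min(prev[j] + 1,          # deletion
                                    curr[j - 1] + 1,      # insertion
                                    prev[j - 1] + cost))  # substitution or match
                prev = curr
            return prev[-1]

        seqs = ["GATTACA", "GACTATA", "GATTGCA"]
        for i in range(len(seqs)):
            for j in range(i + 1, len(seqs)):
                print(seqs[i], seqs[j], edit_distance(seqs[i], seqs[j]))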

  12. Vivaldi: A Domain-Specific Language for Volume Processing and Visualization on Distributed Heterogeneous Systems.

    PubMed

    Choi, Hyungsuk; Choi, Woohyuk; Quan, Tran Minh; Hildebrand, David G C; Pfister, Hanspeter; Jeong, Won-Ki

    2014-12-01

    As the size of image data from microscopes and telescopes increases, the need for high-throughput processing and visualization of large volumetric data has become more pressing. At the same time, many-core processors and GPU accelerators are commonplace, making high-performance distributed heterogeneous computing systems affordable. However, effectively utilizing GPU clusters is difficult for novice programmers, and even experienced programmers often fail to fully leverage the computing power of new parallel architectures due to their steep learning curve and programming complexity. In this paper, we propose Vivaldi, a new domain-specific language for volume processing and visualization on distributed heterogeneous computing systems. Vivaldi's Python-like grammar and parallel processing abstractions provide flexible programming tools for non-experts to easily write high-performance parallel computing code. Vivaldi provides commonly used functions and numerical operators for customized visualization and high-throughput image processing applications. We demonstrate the performance and usability of Vivaldi on several examples ranging from volume rendering to image segmentation.

  13. Toward real-time Monte Carlo simulation using a commercial cloud computing infrastructure.

    PubMed

    Wang, Henry; Ma, Yunzhi; Pratx, Guillem; Xing, Lei

    2011-09-07

    Monte Carlo (MC) methods are the gold standard for modeling photon and electron transport in a heterogeneous medium; however, their computational cost prohibits their routine use in the clinic. Cloud computing, wherein computing resources are allocated on-demand from a third party, is a new approach to high performance computing and is used here to perform ultra-fast MC calculations in radiation therapy. We deployed the EGS5 MC package in a commercial cloud environment. Launched from a single local computer with Internet access, a Python script allocates a remote virtual cluster. A handshaking protocol designates master and worker nodes. The EGS5 binaries and the simulation data are initially loaded onto the master node. The simulation is then distributed among independent worker nodes via the message passing interface, and the results are aggregated on the local computer for display and data analysis. The described approach is evaluated for pencil beams and broad beams of high-energy electrons and photons. The output of cloud-based MC simulation is identical to that produced by the single-threaded implementation. For 1 million electrons, a simulation that takes 2.58 h on a local computer can be executed in 3.3 min on the cloud with 100 nodes, a 47× speed-up. Simulation time scales inversely with the number of parallel nodes. The parallelization overhead is also negligible for large simulations. Cloud computing represents one of the most important recent advances in supercomputing technology and provides a promising platform for substantially improved MC simulation. In addition to the significant speed-up, cloud computing builds a layer of abstraction for high performance parallel computing, which may change the way dose calculations are performed and radiation treatment plans are completed.
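
    A minimal sketch of the master/worker distribution pattern described above, written with mpi4py: the master hands each worker an independent random seed, workers run a stand-in "simulation", and the partial tallies are gathered and combined. This illustrates the pattern only; it is not the EGS5 workflow itself.

        # Sketch: master/worker Monte Carlo distribution over MPI (mpi4py).
        # A stand-in "simulate" replaces the EGS5 transport calculation.
        # Run with e.g.: mpiexec -n 4 python mc_cloud.py
        import numpy as np
        from mpi4py import MPI

        def simulate(seed, histories=250_000):
            rng = np.random.default_rng(seed)
            # stand-in tally: fraction of sampled depths beyond some threshold
            return float(np.mean(rng.exponential(1.0, histories) > 2.0))

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        seeds = list(range(size)) if rank == 0 else None
        seed = comm.scatter(seeds, root=0)        # master hands each worker a seed
        partial = simulate(seed)
        tallies = comm.gather(partial, root=0)    # results aggregated on the master

        if rank == 0:
            print("mean tally over", size, "workers:", sum(tallies) / size)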

  14. Low cost, high performance processing of single particle cryo-electron microscopy data in the cloud

    PubMed Central

    Cianfrocco, Michael A; Leschziner, Andres E

    2015-01-01

    The advent of a new generation of electron microscopes and direct electron detectors has realized the potential of single particle cryo-electron microscopy (cryo-EM) as a technique to generate high-resolution structures. Calculating these structures requires high performance computing clusters, a resource that may be limiting to many likely cryo-EM users. To address this limitation and facilitate the spread of cryo-EM, we developed a publicly available ‘off-the-shelf’ computing environment on Amazon's elastic cloud computing infrastructure. This environment provides users with single particle cryo-EM software packages and the ability to create computing clusters with 16–480+ CPUs. We tested our computing environment using a publicly available 80S yeast ribosome dataset and estimate that laboratories could determine high-resolution cryo-EM structures for $50 to $1500 per structure within a timeframe comparable to local clusters. Our analysis shows that Amazon's cloud computing environment may offer a viable computing environment for cryo-EM. DOI: http://dx.doi.org/10.7554/eLife.06664.001 PMID:25955969

  15. PerSEUS: Ultra-Low-Power High Performance Computing for Plasma Simulations

    NASA Astrophysics Data System (ADS)

    Doxas, I.; Andreou, A.; Lyon, J.; Angelopoulos, V.; Lu, S.; Pritchett, P. L.

    2017-12-01

    Peta-op SupErcomputing Unconventional System (PerSEUS) aims to explore the use for High Performance Scientific Computing (HPC) of ultra-low-power mixed signal unconventional computational elements developed by Johns Hopkins University (JHU), and demonstrate that capability on both fluid and particle Plasma codes. We will describe the JHU Mixed-signal Unconventional Supercomputing Elements (MUSE), and report initial results for the Lyon-Fedder-Mobarry (LFM) global magnetospheric MHD code, and a UCLA general purpose relativistic Particle-In-Cell (PIC) code.

  16. SOI layout decomposition for double patterning lithography on high-performance computer platforms

    NASA Astrophysics Data System (ADS)

    Verstov, Vladimir; Zinchenko, Lyudmila; Makarchuk, Vladimir

    2014-12-01

    In this paper, silicon-on-insulator layout decomposition algorithms for double patterning lithography on high performance computing platforms are discussed. Our approach is based on the use of a contradiction graph and a modified concurrent breadth-first search algorithm. We evaluate our technique on the 45 nm Nangate Open Cell Library, including non-Manhattan geometry. Experimental results show that our soft computing algorithms decompose the layout successfully and that the minimal distance between polygons in the layout is increased.
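
    The decomposition step can be illustrated with a generic breadth-first two-coloring of a conflict ("contradiction") graph: features spaced closer than the lithographic minimum get an edge and must be assigned to different masks. The sketch below shows the basic idea only; it is not the authors' modified concurrent algorithm.

        # Sketch: two-coloring a conflict graph by breadth-first search, the basic
        # idea behind assigning layout features to the two exposure masks.
        from collections import deque

        def two_color(n, edges):
            adj = [[] for _ in range(n)]
            for u, v in edges:
                adj[u].append(v)
                adj[v].append(u)
            color = [-1] * n
            for start in range(n):
                if color[start] != -1:
                    continue
                color[start] = 0
                queue = deque([start])
                while queue:
                    u = queue.popleft()
                    for v in adj[u]:
                        if color[v] == -1:
                            color[v] = 1 - color[u]
                            queue.append(v)
                        elif color[v] == color[u]:
                            return None   # odd cycle: not decomposable without stitching
            return color

        # features 0-3 with spacing conflicts between (0,1), (1,2), (2,3)
        print(two_color(4, [(0, 1), (1, 2), (2, 3)]))   # e.g. [0, 1, 0, 1]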

  17. Federal Plan for High-End Computing. Report of the High-End Computing Revitalization Task Force (HECRTF)

    DTIC Science & Technology

    2004-07-01

    steadily for the past fifteen years, while memory latency and bandwidth have improved much more slowly. For example, Intel processor clock rates have... processor and memory performance) all greatly restrict the ability to achieve high levels of performance for science, engineering, and national...sub-nuclear distances. Guide experiments to identify transition from quantum chromodynamics to quark-gluon plasma. Accelerator Physics Accurate

  18. Surrogate based wind farm layout optimization using manifold mapping

    NASA Astrophysics Data System (ADS)

    Kaja Kamaludeen, Shaafi M.; van Zuijle, Alexander; Bijl, Hester

    2016-09-01

    The high computational cost associated with high-fidelity wake models such as RANS or LES is the primary bottleneck to performing direct high-fidelity wind farm layout optimization (WFLO) using accurate CFD-based wake models. Therefore, a surrogate-based multi-fidelity WFLO methodology (SWFLO) is proposed. The surrogate model is built using an SBO method referred to as manifold mapping (MM). As a verification, optimization of the spacing between two staggered wind turbines was performed using the proposed surrogate-based methodology and its performance was compared with that of direct optimization using the high-fidelity model. Significant reduction in computational cost was achieved using MM: a maximum computational cost reduction of 65%, while arriving at the same optimum as direct high-fidelity optimization. The similarity between the responses of the models, and the number and position of the mapping points, strongly influence the computational efficiency of the proposed method. As a proof of concept, a realistic WFLO of a small 7-turbine wind farm is performed using the proposed surrogate-based methodology. Two variants of the Jensen wake model with different decay coefficients were used as the fine and coarse models. The proposed SWFLO method arrived at the same optimum as the fine model with very few fine-model simulations.
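
    The Jensen wake model used here as the coarse/fine pair has a compact closed form: a top-hat velocity deficit that decays with downstream distance through a wake-decay coefficient k. The sketch below implements that standard textbook form with illustrative parameter values; it is not the paper's optimization code.

        # Sketch: the standard (top-hat) Jensen wake model; the wake-decay
        # coefficient k is the parameter varied between the coarse and fine models.
        import math

        def jensen_deficit(x, ct=0.8, r0=40.0, k=0.05):
            """Fractional velocity deficit a distance x downstream of the rotor."""
            return (1.0 - math.sqrt(1.0 - ct)) / (1.0 + k * x / r0) ** 2

        u_inf = 8.0                                  # free-stream wind speed, m/s
        for x in (200.0, 500.0, 1000.0):             # downstream distances, m
            u = u_inf * (1.0 - jensen_deficit(x))
            print(f"x = {x:6.0f} m  ->  waked wind speed {u:.2f} m/s")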

  19. Benchmarking high performance computing architectures with CMS’ skeleton framework

    DOE PAGES

    Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

    2017-11-23

    Here, in 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel's Threading Building Blocks library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many-core architectures; machines such as Cori Phase 1 and 2, Theta, and Mira. Because of this we have revived the 2012 benchmark to test its performance and conclusions on these new architectures. This talk will discuss the results of this exercise.

  20. Benchmarking high performance computing architectures with CMS’ skeleton framework

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

    Here, in 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel's Threading Building Blocks library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many-core architectures; machines such as Cori Phase 1 and 2, Theta, and Mira. Because of this we have revived the 2012 benchmark to test its performance and conclusions on these new architectures. This talk will discuss the results of this exercise.

  1. 15 CFR 762.2 - Records to be retained.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... pertaining to the types of transactions described in § 762.1(a) of this part, which are made or obtained by a..., High Performance Computers; (7) supplement No. 3 to part 742 High Performance Computers, Safeguards and...; (44) § 745.2, End-use certificates; (45) § 758.2(c), Assumption writing; and (46) § 734.4(g), de...

  2. PuTTY | High-Performance Computing | NREL

    Science.gov Websites

    PuTTY: Learn how to use PuTTY to connect to NREL's high-performance computing (HPC) systems. Connecting: When you start the PuTTY app, the program will display PuTTY's Configuration menu. ... When prompted, type your password again. Note: to increase

  3. Evolution of Embedded Processing for Wide Area Surveillance

    DTIC Science & Technology

    2014-01-01

    future vision. Subject terms: embedded processing; high performance computing; general-purpose graphical processing units (GPGPUs). ...reconnaissance (ISR) mission capabilities. The capabilities these advancements are achieving include the ability to provide persistent all...fighters to support and positively affect their mission. Significant improvements in high-performance computing (HPC) technology make it possible to

  4. Monitoring SLAC High Performance UNIX Computing Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lettsome, Annette K.; /Bethune-Cookman Coll. /SLAC

    2005-12-15

    Knowledge of the effectiveness and efficiency of computers is important when working with high performance systems. The monitoring of such systems is advantageous in order to foresee possible misfortunes or system failures. Ganglia is a software system designed for high performance computing systems to retrieve specific monitoring information. An alternative storage facility for Ganglia's collected data is needed since its default storage system, the round-robin database (RRD), struggles with data integrity. The creation of a script-driven MySQL database solves this dilemma. This paper describes the process taken in the creation and implementation of the MySQL database for use by Ganglia. Comparisons between data storage by both databases are made using gnuplot and Ganglia's real-time graphical user interface.
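
    The script-driven database idea can be sketched with Python's standard library, with sqlite3 standing in for the MySQL server described in the paper: monitoring samples are inserted as rows and queried back for plotting. Table, host, and metric names are hypothetical.

        # Sketch: storing Ganglia-style monitoring samples in a relational table;
        # sqlite3 stands in here for the MySQL database described above.
        import sqlite3
        import time

        conn = sqlite3.connect(":memory:")
        conn.execute("""CREATE TABLE metrics (
                            host TEXT, metric TEXT, ts REAL, value REAL)""")

        samples = [("node01", "load_one", time.time(), 0.42),
                   ("node01", "mem_free", time.time(), 31250.0),
                   ("node02", "load_one", time.time(), 1.87)]
        conn.executemany("INSERT INTO metrics VALUES (?, ?, ?, ?)", samples)

        for row in conn.execute(
                "SELECT host, value FROM metrics WHERE metric = 'load_one' ORDER BY host"):
            print(row)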

  5. Jungle Computing: Distributed Supercomputing Beyond Clusters, Grids, and Clouds

    NASA Astrophysics Data System (ADS)

    Seinstra, Frank J.; Maassen, Jason; van Nieuwpoort, Rob V.; Drost, Niels; van Kessel, Timo; van Werkhoven, Ben; Urbani, Jacopo; Jacobs, Ceriel; Kielmann, Thilo; Bal, Henri E.

    In recent years, the application of high-performance and distributed computing in scientific practice has become increasingly widespread. Among the most widely available platforms to scientists are clusters, grids, and cloud systems. Such infrastructures currently are undergoing revolutionary change due to the integration of many-core technologies, providing orders-of-magnitude speed improvements for selected compute kernels. With high-performance and distributed computing systems thus becoming more heterogeneous and hierarchical, programming complexity is vastly increased. Further complexities arise because the urgent desire for scalability, together with issues including data distribution, software heterogeneity, and ad hoc hardware availability, commonly forces scientists into simultaneous use of multiple platforms (e.g., clusters, grids, and clouds used concurrently). A true computing jungle.

  6. [Earth and Space Sciences Project Services for NASA HPCC

    NASA Technical Reports Server (NTRS)

    Merkey, Phillip

    2002-01-01

    This grant supported the effort to characterize the problem domain of the Earth Science Technology Office's Computational Technologies (CT) Project and to engage the Beowulf cluster computing community as well as the high-performance computing research community, so that we could predict the applicability of these technologies to the scientific community represented by the CT Project and formulate long-term strategies to provide the computational resources necessary to attain the anticipated scientific objectives of the CT Project. Specifically, the goal of the evaluation effort is to use the information gathered over the course of the Round-3 investigations to quantify trends in scientific expectations and algorithmic requirements, and the capability of high-performance computers to satisfy this anticipated need.

  7. Parameters that affect parallel processing for computational electromagnetic simulation codes on high performance computing clusters

    NASA Astrophysics Data System (ADS)

    Moon, Hongsik

    What is the impact of multicore and associated advanced technologies on computational software for science? Most researchers and students have multicore laptops or desktops for their research and they need computing power to run computational software packages. Computing power was initially derived from Central Processing Unit (CPU) clock speed. That changed when increases in clock speed became constrained by power requirements. Chip manufacturers turned to multicore CPU architectures and associated technological advancements to create the CPUs for the future. Most software applications benefited by the increased computing power the same way that increases in clock speed helped applications run faster. However, for Computational ElectroMagnetics (CEM) software developers, this change was not an obvious benefit - it appeared to be a detriment. Developers were challenged to find a way to correctly utilize the advancements in hardware so that their codes could benefit. The solution was parallelization, and this dissertation details the investigation to address these challenges. Prior to multicore CPUs, advanced computer technologies were compared using benchmark software, and the metric was FLoating-point Operations Per Second (FLOPS), which indicates system performance for scientific applications that make heavy use of floating-point calculations. Is FLOPS an effective metric for parallelized CEM simulation tools on new multicore systems? Parallel CEM software needs to be benchmarked not only by FLOPS but also by the performance of other parameters related to the type and utilization of the hardware, such as CPU, Random Access Memory (RAM), hard disk, network, etc. The codes need to be optimized for more than just FLOPS, and new parameters must be included in benchmarking. In this dissertation, the parallel CEM software named High Order Basis Based Integral Equation Solver (HOBBIES) is introduced. This code was developed to address the needs of the changing computer hardware platforms in order to provide fast, accurate and efficient solutions to large, complex electromagnetic problems. The research in this dissertation proves that the performance of parallel code is intimately related to the configuration of the computer hardware and can be maximized for different hardware platforms. To benchmark and optimize the performance of parallel CEM software, a variety of large, complex projects are created and executed on a variety of computer platforms. The computer platforms used in this research are detailed in this dissertation. The projects run as benchmarks are also described in detail and results are presented. The parameters that affect parallel CEM software on High Performance Computing Clusters (HPCC) are investigated. This research demonstrates methods to maximize the performance of parallel CEM software code.
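
    As a concrete, deliberately simplified example of the FLOPS metric discussed above, the sketch below times a dense matrix multiply, whose floating-point operation count is known in closed form, and divides by the elapsed time. As the dissertation argues, a realistic benchmark must also account for RAM, disk, and network behavior.

        # Sketch: estimating a sustained floating-point rate (GFLOPS) from a dense
        # matrix multiply; multiplying two n x n matrices costs about 2*n**3 flops.
        import time
        import numpy as np

        n = 1024
        a = np.random.default_rng(5).random((n, n))
        b = np.random.default_rng(6).random((n, n))

        t0 = time.perf_counter()
        c = a @ b
        elapsed = time.perf_counter() - t0

        gflops = 2.0 * n ** 3 / elapsed / 1e9
        print(f"{gflops:.1f} GFLOPS sustained on this matmul")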

  8. Modeling Materials: Design for Planetary Entry, Electric Aircraft, and Beyond

    NASA Technical Reports Server (NTRS)

    Thompson, Alexander; Lawson, John W.

    2014-01-01

    NASA missions push the limits of what is possible. The development of high-performance materials must keep pace with the agency's demanding, cutting-edge applications. Researchers at NASA's Ames Research Center are performing multiscale computational modeling to accelerate development times and further the design of next-generation aerospace materials. Multiscale modeling combines several computationally intensive techniques ranging from the atomic level to the macroscale, passing output from one level as input to the next level. These methods are applicable to a wide variety of materials systems. For example: (a) Ultra-high-temperature ceramics for hypersonic aircraft: we utilized the full range of multiscale modeling to characterize thermal protection materials for faster, safer air- and spacecraft; (b) Planetary entry heat shields for space vehicles: we computed thermal and mechanical properties of ablative composites by combining several methods, from atomistic simulations to macroscale computations; (c) Advanced batteries for electric aircraft: we performed large-scale molecular dynamics simulations of advanced electrolytes for ultra-high-energy capacity batteries to enable long-distance electric aircraft service; and (d) Shape-memory alloys for high-efficiency aircraft: we used high-fidelity electronic structure calculations to determine phase diagrams in shape-memory transformations. Advances in high-performance computing have been critical to the development of multiscale materials modeling. We used nearly one million processor hours on NASA's Pleiades supercomputer to characterize electrolytes with a fidelity that would be otherwise impossible. For this and other projects, Pleiades enables us to push the physics and accuracy of our calculations to new levels.

  9. Optimising the Parallelisation of OpenFOAM Simulations

    DTIC Science & Technology

    2014-06-01

    Optimising the Parallelisation of OpenFOAM Simulations, Shannon Keough, Maritime Division, Defence Science and Technology Organisation, DSTO-TR-2987. Abstract: The OpenFOAM computational fluid dynamics toolbox allows parallel computation of...performance of a given high performance computing cluster with several OpenFOAM cases, running using a combination of MPI libraries and corresponding MPI

  10. Global Magnetohydrodynamic Simulation Using High Performance FORTRAN on Parallel Computers

    NASA Astrophysics Data System (ADS)

    Ogino, T.

    High Performance Fortran (HPF) is one of the modern and common techniques to achieve high performance parallel computation. We have translated a 3-dimensional magnetohydrodynamic (MHD) simulation code of the Earth's magnetosphere from VPP Fortran to HPF/JA on the Fujitsu VPP5000/56 vector-parallel supercomputer; the MHD code was fully vectorized and fully parallelized in VPP Fortran. The entire performance and capability of the HPF MHD code could be shown to be almost comparable to that of VPP Fortran. A 3-dimensional global MHD simulation of the Earth's magnetosphere was performed at a speed of over 400 Gflops, with an efficiency of 76.5% on the VPP5000/56 in vector and parallel computation, which permitted comparison with catalog values. We have concluded that fluid and MHD codes that are fully vectorized and fully parallelized in VPP Fortran can be translated with relative ease to HPF/JA, and a code in HPF/JA may be expected to perform comparably to the same code written in VPP Fortran.

  11. Performance of Fourth-Grade Students in the 2012 NAEP Computer-Based Writing Pilot Assessment: Scores, Text Length, and Use of Editing Tools. Working Paper Series. NCES 2015-119

    ERIC Educational Resources Information Center

    White, Sheida; Kim, Young Yee; Chen, Jing; Liu, Fei

    2015-01-01

    This study examined whether or not fourth-graders could fully demonstrate their writing skills on the computer and factors associated with their performance on the National Assessment of Educational Progress (NAEP) computer-based writing assessment. The results suggest that high-performing fourth-graders (those who scored in the upper 20 percent…

  12. Intelligent Computer Assisted Instruction (ICAI): Formative Evaluation of Two Systems

    DTIC Science & Technology

    1986-03-01

    appreciation for the power of computer technology. Interpretation: Yale students are a strikingly high-performing group by traditional academic ... Intelligent Computer Assisted Instruction (ICAI): Formative Evaluation of Two Systems, April 1984 - August 1985. ... Jet Propulsion Laboratory

  13. A performance comparison of scalar, vector, and concurrent vector computers including supercomputers for modeling transport of reactive contaminants in groundwater

    NASA Astrophysics Data System (ADS)

    Tripathi, Vijay S.; Yeh, G. T.

    1993-06-01

    Sophisticated and highly computation-intensive models of transport of reactive contaminants in groundwater have been developed in recent years. Application of such models to real-world contaminant transport problems, e.g., simulation of groundwater transport of 10-15 chemically reactive elements (e.g., toxic metals) and relevant complexes and minerals in two and three dimensions over a distance of several hundred meters, requires high-performance computers including supercomputers. Although not widely recognized as such, the computational complexity and demand of these models compare with well-known computation-intensive applications including weather forecasting and quantum chemical calculations. A survey of the performance of a variety of available hardware, as measured by the run times for a reactive transport model HYDROGEOCHEM, showed that while supercomputers provide the fastest execution times for such problems, relatively low-cost reduced instruction set computer (RISC) based scalar computers provide the best performance-to-price ratio. Because supercomputers like the Cray X-MP are inherently multiuser resources, often the RISC computers also provide much better turnaround times. Furthermore, RISC-based workstations provide the best platforms for "visualization" of groundwater flow and contaminant plumes. The most notable result, however, is that current workstations costing less than $10,000 provide performance within a factor of 5 of a Cray X-MP.

  14. Acceleration of Cherenkov angle reconstruction with the new Intel Xeon/FPGA compute platform for the particle identification in the LHCb Upgrade

    NASA Astrophysics Data System (ADS)

    Faerber, Christian

    2017-10-01

    The LHCb experiment at the LHC will upgrade its detector by 2018/2019 to a ‘triggerless’ readout scheme, in which all the readout electronics and several sub-detector parts will be replaced. The new readout electronics will be able to read out the detector at 40 MHz. This increases the data bandwidth from the detector down to the Event Filter farm to 40 TBit/s, which then has to be processed to select the interesting proton-proton collisions for later storage. Designing the architecture of a computing farm that can process this amount of data as efficiently as possible is a challenging task, and several compute accelerator technologies are being considered for use inside the new Event Filter farm. In the high performance computing sector, more and more FPGA compute accelerators are used to improve compute performance and reduce power consumption (e.g. in the Microsoft Catapult project and the Bing search engine). For the LHCb upgrade, the use of an experimental FPGA-accelerated computing platform in the Event Building or in the Event Filter farm is therefore being considered and tested. This platform from Intel hosts a general-purpose CPU and a high-performance FPGA linked via a high-speed link, which for this platform is a QPI link; an accelerator is implemented on the FPGA. The system used is a two-socket platform from Intel with a Xeon CPU and an FPGA. The FPGA has cache-coherent memory access to the main memory of the server and can collaborate with the CPU. As a first step, a computationally intensive algorithm to reconstruct Cherenkov angles for the LHCb RICH particle identification was successfully ported in Verilog to the Intel Xeon/FPGA platform and accelerated by a factor of 35. The same algorithm was then ported to the Intel Xeon/FPGA platform with OpenCL, and the implementation work and performance of the two approaches are compared. Another FPGA accelerator, the Nallatech 385 PCIe accelerator with the same Stratix V FPGA, was also tested for performance. The results show that the Intel Xeon/FPGA platforms, which are built in general for high performance computing, are also very interesting for the High Energy Physics community.

  15. High Performance, Dependable Multiprocessor

    NASA Technical Reports Server (NTRS)

    Ramos, Jeremy; Samson, John R.; Troxel, Ian; Subramaniyan, Rajagopal; Jacobs, Adam; Greco, James; Cieslewski, Grzegorz; Curreri, John; Fischer, Michael; Grobelny, Eric; hide

    2006-01-01

    With the ever increasing demand for higher bandwidth and processing capacity of today's space exploration, space science, and defense missions, the ability to efficiently apply commercial-off-the-shelf (COTS) processors for on-board computing is now a critical need. In response to this need, NASA's New Millennium Program office has commissioned the development of Dependable Multiprocessor (DM) technology for use in payload and robotic missions. The Dependable Multiprocessor technology is a COTS-based, power efficient, high performance, highly dependable, fault tolerant cluster computer. To date, Honeywell has successfully demonstrated a TRL4 prototype of the Dependable Multiprocessor [1], and is now working on the development of a TRL5 prototype. For the present effort Honeywell has teamed up with the University of Florida's High-performance Computing and Simulation (HCS) Lab, and together the team has demonstrated major elements of the Dependable Multiprocessor TRL5 system.

  16. Quo vadis: Hydrologic inverse analyses using high-performance computing and a D-Wave quantum annealer

    NASA Astrophysics Data System (ADS)

    O'Malley, D.; Vesselinov, V. V.

    2017-12-01

    Classical microprocessors have had a dramatic impact on hydrology for decades, due largely to the exponential growth in computing power predicted by Moore's law. However, this growth is not expected to continue indefinitely and has already begun to slow. Quantum computing is an emerging alternative to classical microprocessors. Here, we demonstrated cutting edge inverse model analyses utilizing some of the best available resources in both worlds: high-performance classical computing and a D-Wave quantum annealer. The classical high-performance computing resources are utilized to build an advanced numerical model that assimilates data from O(10^5) observations, including water levels, drawdowns, and contaminant concentrations. The developed model accurately reproduces the hydrologic conditions at a Los Alamos National Laboratory contamination site, and can be leveraged to inform decision-making about site remediation. We demonstrate the use of a D-Wave 2X quantum annealer to solve hydrologic inverse problems. This work can be seen as an early step in quantum-computational hydrology. We compare and contrast our results with an early inverse approach in classical-computational hydrology that is comparable to the approach we use with quantum annealing. Our results show that quantum annealing can be useful for identifying regions of high and low permeability within an aquifer. While the problems we consider are small-scale compared to the problems that can be solved with modern classical computers, they are large compared to the problems that could be solved with early classical CPUs. Further, the binary nature of the high/low permeability problem makes it well-suited to quantum annealing, but challenging for classical computers.
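    The binary high/low-permeability question described above is the kind of problem a quantum annealer accepts as a quadratic unconstrained binary optimization (QUBO). As a rough, generic sketch (the coefficients and the mapping to aquifer cells below are illustrative placeholders, not values from this study), such a problem takes the form

        \min_{x \in \{0,1\}^n} \; \sum_i h_i x_i \; + \; \sum_{i<j} J_{ij} x_i x_j ,

    where x_i = 1 might mark cell i as high permeability and x_i = 0 as low permeability, and the coefficients h_i and J_{ij} encode the misfit to the observed water levels, drawdowns, and concentrations.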

  17. Comparative Implementation of High Performance Computing for Power System Dynamic Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jin, Shuangshuang; Huang, Zhenyu; Diao, Ruisheng

    Dynamic simulation for transient stability assessment is one of the most important, and most computationally intensive, computations for power system planning and operation. Present commercial software is mainly designed for sequential computation to run a single simulation, which is very time consuming with a single processor. The application of High Performance Computing (HPC) to dynamic simulations is very promising in accelerating the computing process by parallelizing its kernel algorithms while maintaining the same level of computation accuracy. This paper describes the comparative implementation of four parallel dynamic simulation schemes in two state-of-the-art HPC environments: Message Passing Interface (MPI) and Open Multi-Processing (OpenMP). These implementations serve to match the application with dedicated multi-processor computing hardware and maximize the utilization and benefits of HPC during the development process.
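    As a minimal sketch of the two programming models being compared (not the authors' code; run_case and the case count are hypothetical stand-ins for a dynamic simulation kernel), the same loop over independent simulation cases can be expressed with OpenMP threads in shared memory or with an MPI rank-strided decomposition:

      #include <mpi.h>
      #include <omp.h>
      #include <cstdio>

      // Hypothetical stand-in for one dynamic-simulation run.
      static double run_case(int id) { return id * 0.5; }

      int main(int argc, char** argv) {
          const int n_cases = 1000;

          // Shared-memory scheme: OpenMP spreads loop iterations over threads.
          double sum_omp = 0.0;
          #pragma omp parallel for reduction(+ : sum_omp)
          for (int i = 0; i < n_cases; ++i) sum_omp += run_case(i);

          // Distributed-memory scheme: each MPI rank takes a strided subset.
          MPI_Init(&argc, &argv);
          int rank = 0, size = 1;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);
          double local = 0.0, total = 0.0;
          for (int i = rank; i < n_cases; i += size) local += run_case(i);
          MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
          if (rank == 0) std::printf("OpenMP sum %f, MPI sum %f\n", sum_omp, total);
          MPI_Finalize();
          return 0;
      }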

  18. High-Performance Computing: High-Speed Computer Networks in the United States, Europe, and Japan. Report to Congressional Requesters.

    ERIC Educational Resources Information Center

    General Accounting Office, Washington, DC. Information Management and Technology Div.

    This report was prepared in response to a request from the Senate Committee on Commerce, Science, and Transportation, and from the House Committee on Science, Space, and Technology, for information on efforts to develop high-speed computer networks in the United States, Europe (limited to France, Germany, Italy, the Netherlands, and the United…

  19. Reducing Communication in Algebraic Multigrid Using Additive Variants

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vassilevski, Panayot S.; Yang, Ulrike Meier

    Algebraic multigrid (AMG) has proven to be an effective scalable solver on many high performance computers. However, its increasing communication complexity on coarser levels has been shown to seriously impact its performance on computers with high communication cost. Additive AMG variants provide increased parallelism and fewer messages per cycle, but generally exhibit slower convergence. Here we present various new additive variants with convergence rates that are significantly improved compared to the classical additive algebraic multigrid method, and investigate their potential for decreased communication and improved communication-computation overlap, features that are essential for good performance on future exascale architectures.
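    For orientation, a textbook two-level picture (generic notation, not the specific variants introduced in this work) makes the trade-off visible. With smoother M, interpolation P and coarse operator A_c = P^T A P, the classical additive preconditioner applies the smoother and the coarse-grid correction independently,

        B_{\mathrm{add}}^{-1} = M^{-1} + P\,A_c^{-1}P^{T},

    so the two terms can be evaluated concurrently with fewer messages per cycle, whereas a multiplicative cycle applies the coarse correction to the residual already updated by the smoother; the steps are sequential but usually converge faster.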

  20. Reducing Communication in Algebraic Multigrid Using Additive Variants

    DOE PAGES

    Vassilevski, Panayot S.; Yang, Ulrike Meier

    2014-02-12

    Algebraic multigrid (AMG) has proven to be an effective scalable solver on many high performance computers. However, its increasing communication complexity on coarser levels has been shown to seriously impact its performance on computers with high communication cost. Additive AMG variants provide increased parallelism and fewer messages per cycle, but generally exhibit slower convergence. Here we present various new additive variants with convergence rates that are significantly improved compared to the classical additive algebraic multigrid method, and investigate their potential for decreased communication and improved communication-computation overlap, features that are essential for good performance on future exascale architectures.

  1. An analysis for high speed propeller-nacelle aerodynamic performance prediction. Volume 2: User's manual

    NASA Technical Reports Server (NTRS)

    Egolf, T. Alan; Anderson, Olof L.; Edwards, David E.; Landgrebe, Anton J.

    1988-01-01

    A user's manual for the computer program developed for the prediction of propeller-nacelle aerodynamic performance reported in 'An Analysis for High Speed Propeller-Nacelle Aerodynamic Performance Prediction: Volume 1 -- Theory and Application' is presented. The manual describes the computer program's mode of operation, requirements, input structure, input data requirements, and program output. In addition, it provides the user with documentation of the internal program structure and the software used in the computer program as it relates to the theory presented in Volume 1. Sample input data setups are provided along with selected printout of the program output for one of the sample setups.

  2. Graphics Processing Unit Assisted Thermographic Compositing

    NASA Technical Reports Server (NTRS)

    Ragasa, Scott; Russell, Samuel S.

    2012-01-01

    Objective: Develop a software application utilizing high performance computing techniques, including general-purpose graphics processing units (GPGPUs), for the analysis and visualization of large thermographic data sets. Over the past several years, an increasing effort among scientists and engineers to utilize graphics processing units (GPUs) in a more general-purpose fashion is allowing for previously unobtainable levels of computation by individual workstations. As data sets grow, the methods needed to work with them grow at an equal, and often greater, pace. Certain common computations can take advantage of the massively parallel and optimized hardware constructs of the GPU, which yield significant increases in performance. These common computations have high degrees of data parallelism; that is, they are the same computation applied to a large set of data where the result does not depend on other data elements. Image processing is one area where GPUs are being used to greatly increase the performance of certain analysis and visualization techniques.
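    The data-parallel pattern described here (the same independent operation applied to every element) can be illustrated even on the CPU with C++17 parallel algorithms; the per-pixel rescaling below is a hypothetical stand-in for a thermographic processing step, not code from the application itself:

      #include <algorithm>
      #include <execution>
      #include <vector>
      #include <cstdio>

      int main() {
          // Hypothetical thermographic frame stored as a flat array of temperatures.
          std::vector<float> frame(1920 * 1080, 22.5f);

          // Each output pixel depends only on the corresponding input pixel,
          // so the transform can run with full data parallelism.
          std::transform(std::execution::par_unseq, frame.begin(), frame.end(),
                         frame.begin(),
                         [](float t) { return (t - 20.0f) * 1.8f; });

          std::printf("first pixel after rescale: %f\n", frame.front());
          return 0;
      }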

  3. Computing a Comprehensible Model for Spam Filtering

    NASA Astrophysics Data System (ADS)

    Ruiz-Sepúlveda, Amparo; Triviño-Rodriguez, José L.; Morales-Bueno, Rafael

    In this paper, we describe the application of the Decision Tree Boosting (DTB) learning model to spam email filtering. This classification task implies learning in a high-dimensional feature space, so it is an example of how the DTB algorithm performs on such feature space problems. In [1], it has been shown that hypotheses computed by the DTB model are more comprehensible than those computed by other ensemble methods. Hence, this paper tries to show that the DTB algorithm maintains the same comprehensibility of hypotheses in high-dimensional feature space problems while achieving the performance of other ensemble methods. Four traditional evaluation measures (precision, recall, F1 and accuracy) have been considered for performance comparison between DTB and other models usually applied to spam email filtering. The hypothesis computed by DTB is smaller and more comprehensible than those computed by AdaBoost and Naïve Bayes.
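    For reference, the four measures mentioned are the standard ones defined from the true/false positive and negative counts (TP, FP, TN, FN) on the spam class:

        \mathrm{precision} = \frac{TP}{TP+FP}, \quad
        \mathrm{recall} = \frac{TP}{TP+FN}, \quad
        F_1 = \frac{2\,\mathrm{precision}\cdot\mathrm{recall}}{\mathrm{precision}+\mathrm{recall}}, \quad
        \mathrm{accuracy} = \frac{TP+TN}{TP+FP+TN+FN}.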

  4. Redirecting Under-Utilised Computer Laboratories into Cluster Computing Facilities

    ERIC Educational Resources Information Center

    Atkinson, John S.; Spenneman, Dirk H. R.; Cornforth, David

    2005-01-01

    Purpose: To provide administrators at an Australian university with data on the feasibility of redirecting under-utilised computer laboratories facilities into a distributed high performance computing facility. Design/methodology/approach: The individual log-in records for each computer located in the computer laboratories at the university were…

  5. High Performance Parallel Computational Nanotechnology

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Craw, James M. (Technical Monitor)

    1995-01-01

    At a recent press conference, NASA Administrator Dan Goldin encouraged NASA Ames Research Center to take a lead role in promoting research and development of advanced, high-performance computer technology, including nanotechnology. Manufacturers of leading-edge microprocessors currently perform large-scale simulations in the design and verification of semiconductor devices and microprocessors. Recently, the need for this intensive simulation and modeling analysis has greatly increased, due in part to the ever-increasing complexity of these devices, as well as the lessons of experiences such as the Pentium fiasco. Simulation, modeling, testing, and validation will be even more important for designing molecular computers because of the complex specification of millions of atoms, thousands of assembly steps, as well as the simulation and modeling needed to ensure reliable, robust and efficient fabrication of the molecular devices. The software for this capacity does not exist today, but it can be extrapolated from the software currently used in molecular modeling for other applications: semi-empirical methods, ab initio methods, self-consistent field methods, Hartree-Fock methods, molecular mechanics, and simulation methods for diamondoid structures. Inasmuch as it seems clear that the application of such methods in nanotechnology will require powerful, highly parallel computing systems, this talk will discuss techniques and issues for performing these types of computations on parallel systems. We will describe system design issues (memory, I/O, mass storage, operating system requirements, special user interface issues, interconnects, bandwidths, and programming languages) involved in parallel methods for scalable classical, semiclassical, quantum, molecular mechanics, and continuum models; molecular nanotechnology computer-aided designs (NanoCAD) techniques; visualization using virtual reality techniques of structural models and assembly sequences; software required to control mini robotic manipulators for positional control; scalable numerical algorithms for reliability, verifications and testability. There appears to be no fundamental obstacle to simulating molecular compilers and molecular computers on high performance parallel computers, just as the Boeing 777 was simulated on a computer before manufacturing it.

  6. High-Performance Computing Unlocks Innovation at NREL

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    Need to fly around a wind farm? Or step inside a molecule? NREL scientists use a super powerful (and highly energy-efficient) computer to visualize and solve big problems in renewable energy research.

  7. H.R. 1757--High Performance Computing and High Speed Networking Applications Act of 1993. Hearings before the Subcommittee on Science of the Committee on Science, Space, and Technology. House of Representatives, One Hundred Third Congress, First Session (April 27, May 6, May 11, 1993).

    ERIC Educational Resources Information Center

    Congress of the U.S., Washington, DC. House Committee on Science, Space and Technology.

    This document contains the transcript of three hearings on the High Performance Computing and High Speed Networking Applications Act of 1993 (H.R. 1757). The hearings were designed to obtain specific suggestions for improvements to the legislation and alternative or additional application areas that should be pursued. Testimony and prepared…

  8. Play for Performance: Using Computer Games to Improve Motivation and Test-Taking Performance

    ERIC Educational Resources Information Center

    Dennis, Alan R.; Bhagwatwar, Akshay; Minas, Randall K.

    2013-01-01

    The importance of testing, especially certification and high-stakes testing, has increased substantially over the past decade. Building on the "serious gaming" literature and the psychology "priming" literature, we developed a computer game designed to improve test-taking performance using psychological priming. The game primed…

  9. Connecting Performance Analysis and Visualization to Advance Extreme Scale Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bremer, Peer-Timo; Mohr, Bernd; Schulz, Martin

    2015-07-29

    The characterization, modeling, analysis, and tuning of software performance has been a central topic in High Performance Computing (HPC) since its early beginnings. The overall goal is to make HPC software run faster on particular hardware, either through better scheduling, on-node resource utilization, or more efficient distributed communication.

  10. Department of Defense In-House RDT and E Activities: Management Analysis Report for Fiscal Year 1993

    DTIC Science & Technology

    1994-11-01

    A worldwide unique lab because it houses a high-speed modeling and simulation system, a prototype...E Division, San Diego, CA: High Performance Computing Laboratory providing a wide range of advanced computer systems for the scientific investigation...Machines CM-200 and a 256-node Thinking Machines CM-5. The CM-5 is in a very large memory, high-performance (32 Gbytes) configuration,

  11. The Design of a High Performance Earth Imagery and Raster Data Management and Processing Platform

    NASA Astrophysics Data System (ADS)

    Xie, Qingyun

    2016-06-01

    This paper summarizes the general requirements and specific characteristics of both geospatial raster database management system and raster data processing platform from a domain-specific perspective as well as from a computing point of view. It also discusses the need of tight integration between the database system and the processing system. These requirements resulted in Oracle Spatial GeoRaster, a global scale and high performance earth imagery and raster data management and processing platform. The rationale, design, implementation, and benefits of Oracle Spatial GeoRaster are described. Basically, as a database management system, GeoRaster defines an integrated raster data model, supports image compression, data manipulation, general and spatial indices, content and context based queries and updates, versioning, concurrency, security, replication, standby, backup and recovery, multitenancy, and ETL. It provides high scalability using computer and storage clustering. As a raster data processing platform, GeoRaster provides basic operations, image processing, raster analytics, and data distribution featuring high performance computing (HPC). Specifically, HPC features include locality computing, concurrent processing, parallel processing, and in-memory computing. In addition, the APIs and the plug-in architecture are discussed.

  12. An efficient implementation of 3D high-resolution imaging for large-scale seismic data with GPU/CPU heterogeneous parallel computing

    NASA Astrophysics Data System (ADS)

    Xu, Jincheng; Liu, Wei; Wang, Jin; Liu, Linong; Zhang, Jianfeng

    2018-02-01

    De-absorption pre-stack time migration (QPSTM) compensates for the absorption and dispersion of seismic waves by introducing an effective Q parameter, thereby making it an effective tool for 3D, high-resolution imaging of seismic data. Although the optimal aperture obtained via stationary-phase migration reduces the computational cost of 3D QPSTM and yields 3D stationary-phase QPSTM, the associated computational efficiency is still the main problem in the processing of 3D, high-resolution images for real large-scale seismic data. In the current paper, we proposed a division method for large-scale, 3D seismic data to optimize the performance of stationary-phase QPSTM on clusters of graphics processing units (GPUs). Then, we designed an imaging point parallel strategy to achieve an optimal parallel computing performance. Afterward, we adopted an asynchronous double-buffering scheme with multiple streams to perform the GPU/CPU parallel computing. Moreover, several key optimization strategies of computation and storage based on the compute unified device architecture (CUDA) were adopted to accelerate the 3D stationary-phase QPSTM algorithm. Compared with the initial GPU code, the implementation of the key optimization steps, including thread optimization, shared memory optimization, register optimization and special function units (SFU), greatly improved the efficiency. A numerical example employing real large-scale, 3D seismic data showed that our scheme is nearly 80 times faster than the CPU-QPSTM algorithm. Our GPU/CPU heterogeneous parallel computing framework significantly reduces the computational cost and facilitates 3D high-resolution imaging for large-scale seismic data.

  13. Studying an Eulerian Computer Model on Different High-performance Computer Platforms and Some Applications

    NASA Astrophysics Data System (ADS)

    Georgiev, K.; Zlatev, Z.

    2010-11-01

    The Danish Eulerian Model (DEM) is an Eulerian model for studying the transport of air pollutants on a large scale. Originally, the model was developed at the National Environmental Research Institute of Denmark. The model's computational domain covers Europe and some neighbouring parts of the Atlantic Ocean, Asia and Africa. If the DEM is applied on fine grids, its discretization leads to a huge computational problem. This implies that such a model as DEM must be run only on high-performance computer architectures. The implementation and tuning of such a complex large-scale model on each different computer is a non-trivial task. Here, comparison results from running this model on different kinds of vector computers (CRAY C92A, Fujitsu, etc.), parallel computers with distributed memory (IBM SP, CRAY T3E, Beowulf clusters, Macintosh G4 clusters, etc.), parallel computers with shared memory (SGI Origin, SUN, etc.) and parallel computers with two levels of parallelism (IBM SMP, IBM BlueGene/P, clusters of multiprocessor nodes, etc.) are presented. The parallel version of DEM is based on a domain partitioning approach. The effective use of caches and the hierarchical memories of modern computers is discussed, together with the performance, speed-ups and efficiency achieved. The parallel DEM code, written with the standard MPI library, is highly portable and shows good efficiency and scalability on different kinds of vector and parallel computers. Some important applications of the model output are briefly presented.
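    A minimal sketch of the domain partitioning idea (a generic 1-D decomposition with halo exchange between neighbouring MPI ranks; this is not the DEM source, and the grid size and update rule are placeholders):

      #include <mpi.h>
      #include <vector>
      #include <cstdio>

      int main(int argc, char** argv) {
          MPI_Init(&argc, &argv);
          int rank = 0, size = 1;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          const int local_n = 100;                  // cells owned by this rank
          std::vector<double> u(local_n + 2, rank); // two extra ghost cells

          int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
          int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

          // Exchange halo values with the neighbouring subdomains.
          MPI_Sendrecv(&u[1], 1, MPI_DOUBLE, left, 0,
                       &u[local_n + 1], 1, MPI_DOUBLE, right, 0,
                       MPI_COMM_WORLD, MPI_STATUS_IGNORE);
          MPI_Sendrecv(&u[local_n], 1, MPI_DOUBLE, right, 1,
                       &u[0], 1, MPI_DOUBLE, left, 1,
                       MPI_COMM_WORLD, MPI_STATUS_IGNORE);

          // Placeholder stencil update on the interior cells.
          for (int i = 1; i <= local_n; ++i)
              u[i] = 0.5 * (u[i - 1] + u[i + 1]);

          if (rank == 0) std::printf("halo exchange done on %d ranks\n", size);
          MPI_Finalize();
          return 0;
      }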

  14. From Desktop to Teraflop: Exploiting the U.S. Lead in High Performance Computing. NSF Blue Ribbon Panel on High Performance Computing.

    ERIC Educational Resources Information Center

    National Science Foundation, Washington, DC.

    This report addresses an opportunity to accelerate progress in virtually every branch of science and engineering concurrently, while also boosting the American economy as business firms also learn to exploit these new capabilities. The successful rapid advancement in both science and technology creates its own challenges, four of which are…

  15. Allocation Usage Tracking and Management | High-Performance Computing |

    Science.gov Websites

    NREL's high-performance computing (HPC) systems, learn how to track and manage your allocations. The alloc_tracker script (/usr/local/bin/alloc_tracker) may be used to see what allocations you have access to, how much of the allocation has been used, how much remains and how many node hours will be forfeited at the

  16. WinHPC System User Basics | High-Performance Computing | NREL

    Science.gov Websites

    Guidance for starting to use this high-performance computing (HPC) system at NREL; also see the WinHPC policies. Log in with your NREL.gov username/password and remember to log out when you are finished; simply quitting Remote Desktop will keep your session active and using resources on the node.

  17. One-Time Password Tokens | High-Performance Computing | NREL

    Science.gov Websites

    One-Time Password Tokens One-Time Password Tokens For connecting to NREL's high-performance computing (HPC) systems, learn how to set up a one-time password (OTP) token for remote and privileged a one-time pass code from the HPC Operations team. At the sign-in screen Enter your HPC Username in

  18. About High-Performance Computing at NREL | High-Performance Computing |

    Science.gov Websites

    Walk-in sessions (schedule from the NREL HPC website): first Thursday of every month, 11 a.m. - 12 p.m., ESIF B211-Edison Conference Room, contact Jennifer Southerland; Insight Center - Visualization Tools: every Monday, 10 a.m.; Insight Center - Data System: every Monday, 10 a.m. - 11 a.m., ESIF B308-Insight Center.

  19. Hybrid parallel computing architecture for multiview phase shifting

    NASA Astrophysics Data System (ADS)

    Zhong, Kai; Li, Zhongwei; Zhou, Xiaohui; Shi, Yusheng; Wang, Congjun

    2014-11-01

    The multiview phase-shifting method shows its powerful capability in achieving high resolution three-dimensional (3-D) shape measurement. Unfortunately, this ability results in very high computation costs, and 3-D computations have to be processed offline. To realize real-time 3-D shape measurement, a hybrid parallel computing architecture is proposed for multiview phase shifting. In this architecture, the central processing unit can co-operate with the graphic processing unit (GPU) to achieve hybrid parallel computing. The high computation cost procedures, including lens distortion rectification, phase computation, correspondence, and 3-D reconstruction, are implemented on the GPU, and a three-layer kernel function model is designed to simultaneously realize coarse-grained and fine-grained parallel computing. Experimental results verify that the developed system can perform 50 fps (frames per second) real-time 3-D measurement with 260 K 3-D points per frame. A speedup of up to 180 times is obtained with the proposed technique on an NVIDIA GT560Ti graphics card compared with a sequential C implementation on a 3.4 GHz Intel Core i7 3770.
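    As background for the phase computation stage (a standard textbook relation, not necessarily the exact variant used in this system), the wrapped phase for a four-step phase-shifting sequence with intensities I_1 ... I_4 captured at 90-degree shifts is

        \phi = \arctan\!\left(\frac{I_4 - I_2}{I_1 - I_3}\right),

    evaluated independently at every pixel, which is why this step maps so naturally onto fine-grained GPU parallelism.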

  20. Integration of High-Performance Computing into Cloud Computing Services

    NASA Astrophysics Data System (ADS)

    Vouk, Mladen A.; Sills, Eric; Dreher, Patrick

    High-Performance Computing (HPC) projects span a spectrum of computer hardware implementations ranging from peta-flop supercomputers, high-end tera-flop facilities running a variety of operating systems and applications, to mid-range and smaller computational clusters used for HPC application development, pilot runs and prototype staging clusters. What they all have in common is that they operate as a stand-alone system rather than a scalable and shared user re-configurable resource. The advent of cloud computing has changed the traditional HPC implementation. In this article, we will discuss a very successful production-level architecture and policy framework for supporting HPC services within a more general cloud computing infrastructure. This integrated environment, called Virtual Computing Lab (VCL), has been operating at NC State since fall 2004. Nearly 8,500,000 HPC CPU-Hrs were delivered by this environment to NC State faculty and students during 2009. In addition, we present and discuss operational data that show that integration of HPC and non-HPC (or general VCL) services in a cloud can substantially reduce the cost of delivering cloud services (down to cents per CPU hour).

  1. SCEAPI: A unified Restful Web API for High-Performance Computing

    NASA Astrophysics Data System (ADS)

    Rongqiang, Cao; Haili, Xiao; Shasha, Lu; Yining, Zhao; Xiaoning, Wang; Xuebin, Chi

    2017-10-01

    The development of scientific computing is increasingly moving to collaborative web and mobile applications. All these applications need high-quality programming interfaces for accessing heterogeneous computing resources consisting of clusters, grids or clouds. In this paper, we introduce our high-performance computing environment that integrates computing resources from 16 HPC centers across China. Then we present a bundle of web services called SCEAPI and describe how it can be used to access HPC resources over the HTTP or HTTPS protocols. We discuss SCEAPI from several aspects including architecture, implementation and security, and address specific challenges in designing compatible interfaces and protecting sensitive data. We describe the functions of SCEAPI, including authentication, file transfer and job management (creating, submitting and monitoring jobs), and show how SCEAPI can be used in a straightforward way. Finally, we discuss how to quickly exploit more HPC resources for the ATLAS experiment by implementing a custom ARC compute element based on SCEAPI; our work shows that SCEAPI is an easy-to-use and effective solution for extending opportunistic HPC resources.
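    A RESTful job-management service of this kind is typically driven from client code with plain HTTP requests. The sketch below uses libcurl to POST a job description; the endpoint URL, token header and JSON fields are hypothetical placeholders and do not reflect the documented SCEAPI interface:

      #include <curl/curl.h>
      #include <cstdio>

      int main() {
          curl_global_init(CURL_GLOBAL_DEFAULT);
          CURL* curl = curl_easy_init();
          if (!curl) return 1;

          // Hypothetical endpoint and auth header; the real API paths may differ.
          struct curl_slist* headers = nullptr;
          headers = curl_slist_append(headers, "Content-Type: application/json");
          headers = curl_slist_append(headers, "X-Auth-Token: <token>");

          const char* body =
              "{\"app\":\"example\",\"args\":\"input.dat\",\"cores\":64}";

          curl_easy_setopt(curl, CURLOPT_URL, "https://api.example.org/v1/jobs");
          curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
          curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body);

          CURLcode rc = curl_easy_perform(curl);  // submit the job request
          if (rc != CURLE_OK)
              std::fprintf(stderr, "request failed: %s\n", curl_easy_strerror(rc));

          curl_slist_free_all(headers);
          curl_easy_cleanup(curl);
          curl_global_cleanup();
          return rc == CURLE_OK ? 0 : 1;
      }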

  2. Engineering and Computing Portal to Solve Environmental Problems

    NASA Astrophysics Data System (ADS)

    Gudov, A. M.; Zavozkin, S. Y.; Sotnikov, I. Y.

    2018-01-01

    This paper describes the architecture and services of the Engineering and Computing Portal, an integrated solution that provides access to high-performance computing resources and enables users to carry out computational experiments, teach parallel technologies, and solve computing tasks, including technogenic safety problems.

  3. Message Passing vs. Shared Address Space on a Cluster of SMPs

    NASA Technical Reports Server (NTRS)

    Shan, Hongzhang; Singh, Jaswinder Pal; Oliker, Leonid; Biswas, Rupak

    2000-01-01

    Scalable computer architectures using clusters of PCs (or PC-SMPs) with commodity networking have become an attractive platform for high-end scientific computing. Currently, message-passing and shared address space (SAS) are the two leading programming paradigms for these systems. Message-passing has been standardized with MPI, and is the most common and mature programming approach. However, message-passing code development can be extremely difficult, especially for irregularly structured computations. SAS offers substantial ease of programming, but may suffer from performance limitations due to poor spatial locality and high protocol overhead. In this paper, we compare the performance of, and the programming effort required for, six applications under both programming models on a 32-CPU PC-SMP cluster. Our application suite consists of codes that typically do not exhibit high efficiency under shared memory programming, due to their high communication-to-computation ratios and complex communication patterns. Results indicate that SAS can achieve about half the parallel efficiency of MPI for most of our applications; however, on certain classes of problems SAS performance is competitive with MPI. We also present new algorithms for improving the PC cluster performance of MPI collective operations.

  4. Parallel computing works

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    An account of the Caltech Concurrent Computation Program (C³P), a five year project that focused on answering the question: 'Can parallel computers be used to do large-scale scientific computations?' As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C³P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C³P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.

  5. High-performance computing with quantum processing units

    DOE PAGES

    Britt, Keith A.; Oak Ridge National Lab.; Humble, Travis S.; ...

    2017-03-01

    The prospects of quantum computing have driven efforts to realize fully functional quantum processing units (QPUs). Recent success in developing proof-of-principle QPUs has prompted the question of how to integrate these emerging processors into modern high-performance computing (HPC) systems. We examine how QPUs can be integrated into current and future HPC system architectures by accounting for functional and physical design requirements. We identify two integration pathways that are differentiated by infrastructure constraints on the QPU and the use cases expected for the HPC system. This includes a tight integration that assumes infrastructure bottlenecks can be overcome as well as a loose integration that assumes they cannot. We find that the performance of both approaches is likely to depend on the quantum interconnect that serves to entangle multiple QPUs. As a result, we also identify several challenges in assessing QPU performance for HPC, and we consider new metrics that capture the interplay between system architecture and the quantum parallelism underlying computational performance.

  6. Performance Evaluation of Counter-Based Dynamic Load Balancing Schemes for Massive Contingency Analysis with Different Computing Environments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Yousu; Huang, Zhenyu; Chavarría-Miranda, Daniel

    Contingency analysis is a key function in the Energy Management System (EMS) to assess the impact of various combinations of power system component failures based on state estimation. Contingency analysis is also extensively used in power market operation for feasibility test of market solutions. High performance computing holds the promise of faster analysis of more contingency cases for the purpose of safe and reliable operation of today’s power grids with less operating margin and more intermittent renewable energy sources. This paper evaluates the performance of counter-based dynamic load balancing schemes for massive contingency analysis under different computing environments. Insights from the performance evaluation can be used as guidance for users to select suitable schemes in the application of massive contingency analysis. Case studies, as well as MATLAB simulations, of massive contingency cases using the Western Electricity Coordinating Council power grid model are presented to illustrate the application of high performance computing with counter-based dynamic load balancing schemes.
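    The counter-based idea amounts to workers drawing the next contingency case from a shared counter, so faster workers automatically take on more cases. A minimal shared-memory sketch (run_contingency and the case count are hypothetical; the schemes evaluated in the paper target distributed, MPI-style counters rather than threads):

      #include <atomic>
      #include <thread>
      #include <vector>
      #include <cstdio>

      // Hypothetical stand-in for simulating one contingency case.
      static void run_contingency(int id) { /* ... solve the case ... */ }

      int main() {
          const int n_cases = 10000;
          std::atomic<int> counter{0};           // shared case counter

          auto worker = [&]() {
              for (;;) {
                  int id = counter.fetch_add(1); // atomically claim the next case
                  if (id >= n_cases) break;      // all cases dispatched
                  run_contingency(id);
              }
          };

          unsigned n_threads = std::thread::hardware_concurrency();
          if (n_threads == 0) n_threads = 4;     // fallback if unknown
          std::vector<std::thread> pool;
          for (unsigned i = 0; i < n_threads; ++i) pool.emplace_back(worker);
          for (auto& t : pool) t.join();

          std::printf("processed %d contingency cases\n", n_cases);
          return 0;
      }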

  7. High-performance computing with quantum processing units

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Britt, Keith A.; Oak Ridge National Lab.; Humble, Travis S.

    The prospects of quantum computing have driven efforts to realize fully functional quantum processing units (QPUs). Recent success in developing proof-of-principle QPUs has prompted the question of how to integrate these emerging processors into modern high-performance computing (HPC) systems. We examine how QPUs can be integrated into current and future HPC system architectures by accounting for functional and physical design requirements. We identify two integration pathways that are differentiated by infrastructure constraints on the QPU and the use cases expected for the HPC system. This includes a tight integration that assumes infrastructure bottlenecks can be overcome as well as a loose integration that assumes they cannot. We find that the performance of both approaches is likely to depend on the quantum interconnect that serves to entangle multiple QPUs. As a result, we also identify several challenges in assessing QPU performance for HPC, and we consider new metrics that capture the interplay between system architecture and the quantum parallelism underlying computational performance.

  8. Challenges of Future High-End Computing

    NASA Technical Reports Server (NTRS)

    Bailey, David; Kutler, Paul (Technical Monitor)

    1998-01-01

    The next major milestone in high performance computing is a sustained rate of one Pflop/s (also written one petaflops, or 10^15 floating-point operations per second). In addition to prodigiously high computational performance, such systems must of necessity feature very large main memories, as well as comparably high I/O bandwidth and huge mass storage facilities. The current consensus of scientists who have studied these issues is that "affordable" petaflops systems may be feasible by the year 2010, assuming that certain key technologies continue to progress at current rates. One important question is whether applications can be structured to perform efficiently on such systems, which are expected to incorporate many thousands of processors and deeply hierarchical memory systems. To answer these questions, advanced performance modeling techniques, including simulation of future architectures and applications, may be required. It may also be necessary to formulate "latency tolerant algorithms" and other completely new algorithmic approaches for certain applications. This talk will give an overview of these challenges.

  9. OpenCluster: A Flexible Distributed Computing Framework for Astronomical Data Processing

    NASA Astrophysics Data System (ADS)

    Wei, Shoulin; Wang, Feng; Deng, Hui; Liu, Cuiyin; Dai, Wei; Liang, Bo; Mei, Ying; Shi, Congming; Liu, Yingbo; Wu, Jingping

    2017-02-01

    The volume of data generated by modern astronomical telescopes is extremely large and rapidly growing. However, current high-performance data processing architectures/frameworks are not well suited for astronomers because of their limitations and programming difficulties. In this paper, we therefore present OpenCluster, an open-source distributed computing framework to support rapidly developing high-performance processing pipelines of astronomical big data. We first detail the OpenCluster design principles and implementations and present the APIs facilitated by the framework. We then demonstrate a case in which OpenCluster is used to resolve complex data processing problems for developing a pipeline for the Mingantu Ultrawide Spectral Radioheliograph. Finally, we present our OpenCluster performance evaluation. Overall, OpenCluster provides not only high fault tolerance and simple programming interfaces, but also a flexible means of scaling up the number of interacting entities. OpenCluster thereby provides an easily integrated distributed computing framework for quickly developing a high-performance data processing system of astronomical telescopes and for significantly reducing software development expenses.

  10. A Real-Time High Performance Computation Architecture for Multiple Moving Target Tracking Based on Wide-Area Motion Imagery via Cloud and Graphic Processing Units

    PubMed Central

    Liu, Kui; Wei, Sixiao; Chen, Zhijiang; Jia, Bin; Chen, Genshe; Ling, Haibin; Sheaff, Carolyn; Blasch, Erik

    2017-01-01

    This paper presents the first attempt at combining Cloud with Graphic Processing Units (GPUs) in a complementary manner within the framework of a real-time high performance computation architecture for the application of detecting and tracking multiple moving targets based on Wide Area Motion Imagery (WAMI). More specifically, the GPU and Cloud Moving Target Tracking (GC-MTT) system applied a front-end web based server to perform the interaction with Hadoop and highly parallelized computation functions based on the Compute Unified Device Architecture (CUDA©). The introduced multiple moving target detection and tracking method can be extended to other applications such as pedestrian tracking, group tracking, and Patterns of Life (PoL) analysis. The cloud- and GPU-based computing provides an efficient real-time target recognition and tracking approach compared with workflows that use only central processing units (CPUs). The simultaneous tracking and recognition results demonstrate that a GC-MTT based approach provides drastically improved tracking with low frame rates over realistic conditions. PMID:28208684

  11. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Simon, Horst D.; Zorn, Manfred D.; Spengler, Sylvia J.

    The pace of extraordinary advances in molecular biology has accelerated in the past decade due in large part to discoveries coming from genome projects on human and model organisms. The advances in the genome project so far, happening well ahead of schedule and under budget, have exceeded any dreams by its protagonists, let alone formal expectations. Biologists expect the next phase of the genome project to be even more startling in terms of dramatic breakthroughs in our understanding of human biology, the biology of health and of disease. Only today can biologists begin to envision the experimental, computational and theoretical steps necessary to exploit genome sequence information for its medical impact, its contribution to biotechnology and economic competitiveness, and its ultimate contribution to environmental quality. High performance computing has become one of the critical enabling technologies, which will help to translate this vision of future advances in biology into reality. Biologists are increasingly becoming aware of the potential of high performance computing. The goal of this tutorial is to introduce the exciting new developments in computational biology and genomics to the high performance computing community.

  12. A Real-Time High Performance Computation Architecture for Multiple Moving Target Tracking Based on Wide-Area Motion Imagery via Cloud and Graphic Processing Units.

    PubMed

    Liu, Kui; Wei, Sixiao; Chen, Zhijiang; Jia, Bin; Chen, Genshe; Ling, Haibin; Sheaff, Carolyn; Blasch, Erik

    2017-02-12

    This paper presents the first attempt at combining Cloud with Graphic Processing Units (GPUs) in a complementary manner within the framework of a real-time high performance computation architecture for the application of detecting and tracking multiple moving targets based on Wide Area Motion Imagery (WAMI). More specifically, the GPU and Cloud Moving Target Tracking (GC-MTT) system applied a front-end web based server to perform the interaction with Hadoop and highly parallelized computation functions based on the Compute Unified Device Architecture (CUDA©). The introduced multiple moving target detection and tracking method can be extended to other applications such as pedestrian tracking, group tracking, and Patterns of Life (PoL) analysis. The cloud- and GPU-based computing provides an efficient real-time target recognition and tracking approach compared with workflows that use only central processing units (CPUs). The simultaneous tracking and recognition results demonstrate that a GC-MTT based approach provides drastically improved tracking with low frame rates over realistic conditions.

  13. Shaded-Color Picture Generation of Computer-Defined Arbitrary Shapes

    NASA Technical Reports Server (NTRS)

    Cozzolongo, J. V.; Hermstad, D. L.; Mccoy, D. S.; Clark, J.

    1986-01-01

    SHADE computer program generates realistic color-shaded pictures from computer-defined arbitrary shapes. Objects defined for computer representation displayed as smooth, color-shaded surfaces, including varying degrees of transparency. Results also used for presentation of computational results. By performing color mapping, SHADE colors model surface to display analysis results such as pressures, stresses, and temperatures. NASA has used SHADE extensively in design and analysis of high-performance aircraft. Industry should find applications for SHADE in computer-aided design and computer-aided manufacturing. SHADE written in VAX FORTRAN and MACRO Assembler for either interactive or batch execution.

  14. RAPPORT: running scientific high-performance computing applications on the cloud.

    PubMed

    Cohen, Jeremy; Filippis, Ioannis; Woodbridge, Mark; Bauer, Daniela; Hong, Neil Chue; Jackson, Mike; Butcher, Sarah; Colling, David; Darlington, John; Fuchs, Brian; Harvey, Matt

    2013-01-28

    Cloud computing infrastructure is now widely used in many domains, but one area where there has been more limited adoption is research computing, in particular for running scientific high-performance computing (HPC) software. The Robust Application Porting for HPC in the Cloud (RAPPORT) project took advantage of existing links between computing researchers and application scientists in the fields of bioinformatics, high-energy physics (HEP) and digital humanities, to investigate running a set of scientific HPC applications from these domains on cloud infrastructure. In this paper, we focus on the bioinformatics and HEP domains, describing the applications and target cloud platforms. We conclude that, while there are many factors that need consideration, there is no fundamental impediment to the use of cloud infrastructure for running many types of HPC applications and, in some cases, there is potential for researchers to benefit significantly from the flexibility offered by cloud platforms.

  15. The Case for Modular Redundancy in Large-Scale High Performance Computing Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Engelmann, Christian; Ong, Hong Hoe; Scott, Stephen L

    2009-01-01

    Recent investigations into resilience of large-scale high-performance computing (HPC) systems showed a continuous trend of decreasing reliability and availability. Newly installed systems have a lower mean-time to failure (MTTF) and a higher mean-time to recover (MTTR) than their predecessors. Modular redundancy is being used in many mission critical systems today to provide for resilience, such as for aerospace and command & control systems. The primary argument against modular redundancy for resilience in HPC has always been that the capability of a HPC system, and respective return on investment, would be significantly reduced. We argue that modular redundancy can significantly increase compute node availability as it removes the impact of scale from single compute node MTTR. We further argue that single compute nodes can be much less reliable, and therefore less expensive, and still be highly available, if their MTTR/MTTF ratio is maintained.
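    The trade-off argued here can be made concrete with the standard steady-state availability relation (a generic formula, not specific to this report):

        A = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}} = \frac{1}{1 + \mathrm{MTTR}/\mathrm{MTTF}},

    so a node may become less reliable (smaller MTTF) yet remain highly available as long as redundancy keeps the effective MTTR small enough to hold the MTTR/MTTF ratio constant.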

  16. Ku-Band rendezvous radar performance computer simulation model

    NASA Technical Reports Server (NTRS)

    Magnusson, H. G.; Goff, M. F.

    1984-01-01

    All work performed on the Ku-band rendezvous radar performance computer simulation model program since the release of the preliminary final report is summarized. Developments on the program fall into three distinct categories: (1) modifications to the existing Ku-band radar tracking performance computer model; (2) the addition of a highly accurate, nonrealtime search and acquisition performance computer model to the total software package developed on this program; and (3) development of radar cross section (RCS) computation models for three additional satellites. All changes in the tracking model involved improvements in the automatic gain control (AGC) and the radar signal strength (RSS) computer models. Although the search and acquisition computer models were developed under the auspices of the Hughes Aircraft Company Ku-Band Integrated Radar and Communications Subsystem program office, they have been supplied to NASA as part of the Ku-band radar performance computer model package. Their purpose is to predict Ku-band acquisition performance for specific satellite targets on specific missions. The RCS models were developed for three satellites: the Long Duration Exposure Facility (LDEF) spacecraft, the Solar Maximum Mission (SMM) spacecraft, and the Space Telescopes.

  17. Ku-Band rendezvous radar performance computer simulation model

    NASA Astrophysics Data System (ADS)

    Magnusson, H. G.; Goff, M. F.

    1984-06-01

    All work performed on the Ku-band rendezvous radar performance computer simulation model program since the release of the preliminary final report is summarized. Developments on the program fall into three distinct categories: (1) modifications to the existing Ku-band radar tracking performance computer model; (2) the addition of a highly accurate, nonrealtime search and acquisition performance computer model to the total software package developed on this program; and (3) development of radar cross section (RCS) computation models for three additional satellites. All changes in the tracking model involved improvements in the automatic gain control (AGC) and the radar signal strength (RSS) computer models. Although the search and acquisition computer models were developed under the auspices of the Hughes Aircraft Company Ku-Band Integrated Radar and Communications Subsystem program office, they have been supplied to NASA as part of the Ku-band radar performance computer model package. Their purpose is to predict Ku-band acquisition performance for specific satellite targets on specific missions. The RCS models were developed for three satellites: the Long Duration Exposure Facility (LDEF) spacecraft, the Solar Maximum Mission (SMM) spacecraft, and the Space Telescopes.

  18. High Productivity Computing Systems and Competitiveness Initiative

    DTIC Science & Technology

    2007-07-01

    planning committee for the annual, international Supercomputing Conference in 2004 and 2005. This is the leading HPC industry conference in the world. It...sector partnerships. Partnerships will form a key part of discussions at the 2nd High Performance Computing Users Conference, planned for July 13, 2005...other things an interagency roadmap for high-end computing core technologies and an accessibility improvement plan. Improving HPC Education and

  19. NASA Center for Climate Simulation (NCCS) Advanced Technology AT5 Virtualized Infiniband Report

    NASA Technical Reports Server (NTRS)

    Thompson, John H.; Bledsoe, Benjamin C.; Wagner, Mark; Shakshober, John; Fromkin, Russ

    2013-01-01

    The NCCS is part of the Computational and Information Sciences and Technology Office (CISTO) of Goddard Space Flight Center's (GSFC) Sciences and Exploration Directorate. The NCCS's mission is to enable scientists to increase their understanding of the Earth, the solar system, and the universe by supplying state-of-the-art high performance computing (HPC) solutions. To accomplish this mission, the NCCS (https://www.nccs.nasa.gov) provides high performance compute engines, mass storage, and network solutions to meet the specialized needs of the Earth and space science user communities

  20. STEMsalabim: A high-performance computing cluster friendly code for scanning transmission electron microscopy image simulations of thin specimens.

    PubMed

    Oelerich, Jan Oliver; Duschek, Lennart; Belz, Jürgen; Beyer, Andreas; Baranovskii, Sergei D; Volz, Kerstin

    2017-06-01

    We present a new multislice code for the computer simulation of scanning transmission electron microscope (STEM) images based on the frozen lattice approximation. Unlike existing software packages, the code is optimized to perform well on highly parallelized computing clusters, combining distributed and shared memory architectures. This enables efficient calculation of large lateral scanning areas of the specimen within the frozen lattice approximation and fine-grained sweeps of parameter space.

  1. The Use of High Performance Computing (HPC) to Strengthen the Development of Army Systems

    DTIC Science & Technology

    2011-11-01

    accurately predicting the supersonic Magnus effect about spinning cones, ogive-cylinders, and boat-tailed afterbodies. This work led to the successful...successful computer model of the proposed product or system, one can then build prototypes on the computer and study the effects on the performance of...needed. The NRC report discusses the requirements for effective use of such computing power. One needs “models, algorithms, software, hardware

  2. High-Bandwidth Tactical-Network Data Analysis in a High-Performance-Computing (HPC) Environment: Transport Protocol (Transmission Control Protocol/User Datagram Protocol [TCP/UDP]) Analysis

    DTIC Science & Technology

    2015-09-01

    the network ... mac8: Medium Access Control (MAC) (Ethernet) address observed as destination for outgoing packets; subsessionid8: zero-based index of ... SUBJECT TERMS: tactical networks, data reduction, high-performance computing, data analysis, big data ... integer index of row; cts_deid: device (instrument) identifier where observation took place; cts_collpt: collection point or logical observation point on

  3. Circuit-Switched Memory Access in Photonic Interconnection Networks for High-Performance Embedded Computing

    DTIC Science & Technology

    2010-07-22

    dependent, providing a natural bandwidth match between compute cores and the memory subsystem. • High Bandwidth Density. Waveguides crossing the chip...simulate this memory access architecture on a 256-core chip with a concentrated 64-node network using detailed traces of high-performance embedded...memory modules, we place memory access points (MAPs) around the periphery of the chip connected to the network. These MAPs, shown in Figure 4, contain

  4. Design consideration in constructing high performance embedded Knowledge-Based Systems (KBS)

    NASA Technical Reports Server (NTRS)

    Dalton, Shelly D.; Daley, Philip C.

    1988-01-01

    As the hardware trends for artificial intelligence (AI) involve more and more complexity, the process of optimizing the computer system design for a particular problem will also increase in complexity. Space applications of knowledge based systems (KBS) will often require an ability to perform both numerically intensive vector computations and real time symbolic computations. Although parallel machines can theoretically achieve the speeds necessary for most of these problems, if the application itself is not highly parallel, the machine's power cannot be utilized. A scheme is presented which will provide the computer systems engineer with a tool for analyzing machines with various configurations of array, symbolic, scaler, and multiprocessors. High speed networks and interconnections make customized, distributed, intelligent systems feasible for the application of AI in space. The method presented can be used to optimize such AI system configurations and to make comparisons between existing computer systems. It is an open question whether or not, for a given mission requirement, a suitable computer system design can be constructed for any amount of money.
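    The observation that a machine's parallel power goes unused when the application is not highly parallel is captured by Amdahl's law (quoted here only as general context for the analysis scheme described):

        S(N) = \frac{1}{(1-p) + p/N},

    where p is the fraction of the work that can be parallelized and N the number of processors; even as N grows without bound, the speedup is limited to 1/(1-p).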

  5. Hyperswitch Communication Network Computer

    NASA Technical Reports Server (NTRS)

    Peterson, John C.; Chow, Edward T.; Priel, Moshe; Upchurch, Edwin T.

    1993-01-01

    Hyperswitch Communications Network (HCN) computer is prototype multiple-processor computer being developed. Incorporates improved version of hyperswitch communication network described in "Hyperswitch Network For Hypercube Computer" (NPO-16905). Designed to support high-level software and expansion of itself. HCN computer is message-passing, multiple-instruction/multiple-data computer offering significant advantages over older single-processor and bus-based multiple-processor computers, with respect to price/performance ratio, reliability, availability, and manufacturing. Design of HCN operating-system software provides flexible computing environment accommodating both parallel and distributed processing. Also achieves balance among following competing factors: performance in processing and communications, ease of use, and tolerance of (and recovery from) faults.

  6. ASIC For Complex Fixed-Point Arithmetic

    NASA Technical Reports Server (NTRS)

    Petilli, Stephen G.; Grimm, Michael J.; Olson, Erlend M.

    1995-01-01

    Application-specific integrated circuit (ASIC) performs 24-bit, fixed-point arithmetic operations on arrays of complex-valued input data. High-performance, wide-band arithmetic logic unit (ALU) designed for use in computing fast Fourier transforms (FFTs) and for performing digital filtering functions. Other applications include general computations involved in analysis of spectra and digital signal processing.

  7. FLAME: A platform for high performance computing of complex systems, applied for three case studies

    DOE PAGES

    Kiran, Mariam; Bicak, Mesude; Maleki-Dizaji, Saeedeh; ...

    2011-01-01

    FLAME allows complex models to be automatically parallelised on High Performance Computing (HPC) grids, enabling large numbers of agents to be simulated in short periods of time. Modellers are typically hindered by the complexity of porting models to parallel platforms and by the time taken to run large simulations on a single machine; FLAME overcomes both obstacles. Three case studies from different disciplines were modelled using FLAME and are presented along with their performance results on a grid.

  8. Extreme Scale Computing to Secure the Nation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, D L; McGraw, J R; Johnson, J R

    2009-11-10

    Since the dawn of modern electronic computing in the mid 1940's, U.S. national security programs have been dominant users of every new generation of high-performance computer. Indeed, the first general-purpose electronic computer, ENIAC (the Electronic Numerical Integrator and Computer), was used to calculate the expected explosive yield of early thermonuclear weapons designs. Even the U.S. numerical weather prediction program, another early application for high-performance computing, was initially funded jointly by sponsors that included the U.S. Air Force and Navy, agencies interested in accurate weather predictions to support U.S. military operations. For the decades of the cold war, national security requirements continued to drive the development of high performance computing (HPC), including advancement of the computing hardware and development of sophisticated simulation codes to support weapons and military aircraft design, numerical weather prediction, as well as data-intensive applications such as cryptography and cybersecurity. U.S. national security concerns continue to drive the development of high-performance computers and software in the U.S., and in fact, events following the end of the cold war have driven an increase in the growth rate of computer performance at the high end of the market. This mainly derives from our nation's observance of a moratorium on underground nuclear testing beginning in 1992, followed by our voluntary adherence to the Comprehensive Test Ban Treaty (CTBT) beginning in 1995. The CTBT prohibits further underground nuclear tests, which in the past had been a key component of the nation's science-based program for assuring the reliability, performance and safety of U.S. nuclear weapons. In response to this change, the U.S. Department of Energy (DOE) initiated the Science-Based Stockpile Stewardship (SBSS) program under the Fiscal Year 1994 National Defense Authorization Act, which requires, 'in the absence of nuclear testing, a program to: (1) Support a focused, multifaceted program to increase the understanding of the enduring stockpile; (2) Predict, detect, and evaluate potential problems of the aging of the stockpile; (3) Refurbish and re-manufacture weapons and components, as required; and (4) Maintain the science and engineering institutions needed to support the nation's nuclear deterrent, now and in the future'. This program continues to fulfill its national security mission by adding significant new capabilities for producing scientific results through large-scale computational simulation coupled with careful experimentation, including sub-critical nuclear experiments permitted under the CTBT. To develop the computational science and the computational horsepower needed to support its mission, SBSS initiated the Accelerated Strategic Computing Initiative, later renamed the Advanced Simulation & Computing (ASC) program (sidebar: 'History of ASC Computing Program Computing Capability'). The modern 3D computational simulation capability of the ASC program supports the assessment and certification of the current nuclear stockpile through calibration with past underground test (UGT) data. While an impressive accomplishment, continued evolution of national security mission requirements will demand computing resources at a significantly greater scale than we have today.
In particular, continued observance and potential Senate confirmation of the Comprehensive Test Ban Treaty (CTBT), together with the U.S. administration's promise of a significant reduction in the size of the stockpile and the inexorable aging and consequent refurbishment of the stockpile, all demand increasing refinement of our computational simulation capabilities. Assessment of the present and future stockpile with increased confidence in its safety and reliability, without reliance upon calibration with past or future test data, is a long-term goal of the ASC program. This will be accomplished through significant increases in the scientific bases that underlie the computational tools. Computer codes must be developed that replace phenomenology with increased levels of scientific understanding together with an accompanying quantification of uncertainty. These advanced codes will place significantly higher demands on the computing infrastructure than do the current 3D ASC codes. This article not only discusses the need for a future computing capability at the exascale for the SBSS program, but also considers high-performance computing requirements for broader national security questions. For example, the increasing concern over potential nuclear terrorist threats demands a capability to assess threats and potential disablement technologies as well as a rapid forensic capability for determining a nuclear weapons design from post-detonation evidence (nuclear counterterrorism).

  9. Translational bioinformatics in the cloud: an affordable alternative

    PubMed Central

    2010-01-01

    With the continued exponential expansion of publicly available genomic data and access to low-cost, high-throughput molecular technologies for profiling patient populations, computational technologies and informatics are becoming vital considerations in genomic medicine. Although cloud computing technology is being heralded as a key enabling technology for the future of genomic research, available case studies are limited to applications in the domain of high-throughput sequence data analysis. The goal of this study was to evaluate the computational and economic characteristics of cloud computing in performing a large-scale data integration and analysis representative of research problems in genomic medicine. We find that the cloud-based analysis compares favorably in both performance and cost in comparison to a local computational cluster, suggesting that cloud computing technologies might be a viable resource for facilitating large-scale translational research in genomic medicine. PMID:20691073

  10. Effects of Learning Style and Training Method on Computer Attitude and Performance in World Wide Web Page Design Training.

    ERIC Educational Resources Information Center

    Chou, Huey-Wen; Wang, Yu-Fang

    1999-01-01

    Compares the effects of two training methods on computer attitude and performance in a World Wide Web page design program in a field experiment with high school students in Taiwan. Discusses individual differences, Kolb's Experiential Learning Theory and Learning Style Inventory, Computer Attitude Scale, and results of statistical analyses.…

  11. Smart Sampling and HPC-based Probabilistic Look-ahead Contingency Analysis Implementation and its Evaluation with Real-world Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Yousu; Etingov, Pavel V.; Ren, Huiying

    This paper describes a probabilistic look-ahead contingency analysis application that incorporates smart sampling and high-performance computing (HPC) techniques. Smart sampling techniques are implemented to effectively represent the structure and statistical characteristics of uncertainty introduced by different sources in the power system. They can significantly reduce the data set size required for multiple look-ahead contingency analyses, and therefore reduce the time required to compute them. High-performance-computing (HPC) techniques are used to further reduce computational time. These two techniques enable a predictive capability that forecasts the impact of various uncertainties on potential transmission limit violations. The developed package has been tested with real world data from the Bonneville Power Administration. Case study results are presented to demonstrate the performance of the applications developed.
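
    The abstract does not name the specific smart-sampling method used, so the sketch below uses Latin hypercube sampling purely as an illustration of how a small, well-spread scenario set can stand in for a much larger Monte Carlo draw when feeding a look-ahead contingency analysis; the function name, dimensions, and scaling are hypothetical.

    ```python
    # Illustrative sketch only: the paper does not specify its smart-sampling method,
    # so Latin hypercube sampling is assumed here purely to show how a small,
    # well-spread sample set can stand in for a much larger Monte Carlo draw.
    import numpy as np

    def latin_hypercube(n_samples, n_dims, rng=None):
        """Return n_samples points in [0, 1]^n_dims with one sample per stratum."""
        rng = np.random.default_rng(rng)
        # Stratify each dimension into n_samples equal bins and jitter within bins.
        edges = (np.arange(n_samples) + rng.random((n_dims, n_samples))) / n_samples
        # Shuffle strata independently per dimension to decorrelate them.
        for d in range(n_dims):
            rng.shuffle(edges[d])
        return edges.T  # shape (n_samples, n_dims)

    # Hypothetical example: 50 scenarios of forecast uncertainty for 3 injections,
    # scaled to +/- 10% of a 100 MW base case, feeding a look-ahead contingency run.
    scenarios = 100.0 * (1.0 + 0.2 * (latin_hypercube(50, 3, rng=1) - 0.5))
    print(scenarios.shape)  # (50, 3)
    ```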

  12. Inquiring Minds

    Science.gov Websites

    Proposed Projects and Experiments Fermilab's Tevatron Questions for the Universe Theory Computing High-performance Computing Grid Computing Networking Mass Storage Plan for the Future State of the Laboratory Homeland Security Industry Computing Sciences Workforce Development A Growing List Historic Results

  13. Accelerating phylogenetics computing on the desktop: experiments with executing UPGMA in programmable logic.

    PubMed

    Davis, J P; Akella, S; Waddell, P H

    2004-01-01

    Having greater computational power on the desktop for processing taxa data sets has been a dream of biologists/statisticians involved in phylogenetics data analysis. Many existing algorithms have been highly optimized; one example is Felsenstein's PHYLIP code, written in C, for UPGMA and neighbor joining algorithms. However, conventional computers have not delivered a satisfactory speedup for processing more than a few tens of taxa in a reasonable amount of time, making it difficult for phylogenetics practitioners to quickly explore data sets, such as might be done from a laptop computer. We discuss the application of custom computing techniques to phylogenetics. In particular, we apply this technology to speed up UPGMA algorithm execution by a factor of a hundred, against that of PHYLIP code running on the same PC. We report on these experiments and discuss how custom computing techniques can be used to accelerate phylogenetics algorithm performance not only on the desktop, but also on larger, high-performance computing engines, thus enabling the high-speed processing of data sets involving thousands of taxa.
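
    For reference, a minimal NumPy implementation of UPGMA is sketched below; it only illustrates the clustering loop that the paper accelerates in programmable logic, is not the PHYLIP or FPGA code, and the toy distance matrix is invented.

    ```python
    # Minimal reference UPGMA in NumPy, included only to illustrate the algorithm the
    # paper accelerates in programmable logic; it is not the PHYLIP or FPGA code.
    import numpy as np

    def upgma(dist, labels):
        """Cluster taxa from a symmetric distance matrix; returns a nested-tuple tree."""
        d = dist.astype(float).copy()
        clusters = [(lbl, 1) for lbl in labels]          # (subtree, number of leaves)
        np.fill_diagonal(d, np.inf)
        while len(clusters) > 1:
            i, j = np.unravel_index(np.argmin(d), d.shape)
            if i > j:
                i, j = j, i
            (ti, ni), (tj, nj) = clusters[i], clusters[j]
            # Average-linkage update: merge cluster j into i, weighting by sizes.
            merged = (d[i] * ni + d[j] * nj) / (ni + nj)
            d[i], d[:, i] = merged, merged
            d[i, i] = np.inf
            d = np.delete(np.delete(d, j, axis=0), j, axis=1)
            clusters[i] = ((ti, tj), ni + nj)
            clusters.pop(j)
        return clusters[0][0]

    dist = np.array([[0, 2, 6], [2, 0, 6], [6, 6, 0]])
    print(upgma(dist, ["A", "B", "C"]))   # (('A', 'B'), 'C')
    ```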

  14. Dataflow computing approach in high-speed digital simulation

    NASA Technical Reports Server (NTRS)

    Ercegovac, M. D.; Karplus, W. J.

    1984-01-01

    New computational tools and methodologies for the digital simulation of continuous systems were explored. Programmability and cost-effective performance in multiprocessor organizations for real-time simulation were investigated. The approach is based on functional-style languages and dataflow computing principles, which allow for the natural representation of parallelism in algorithms and provide a suitable basis for the design of cost-effective, high-performance distributed systems. The objectives of this research are to: (1) perform comparative evaluation of several existing data flow languages and develop an experimental data flow language suitable for real time simulation using multiprocessor systems; (2) investigate the main issues that arise in the architecture and organization of data flow multiprocessors for real time simulation; and (3) develop and apply performance evaluation models in typical applications.

  15. PICSiP: new system-in-package technology using a high bandwidth photonic interconnection layer for converged microsystems

    NASA Astrophysics Data System (ADS)

    Tekin, Tolga; Töpper, Michael; Reichl, Herbert

    2009-05-01

    Technological frontiers between semiconductor technology, packaging, and system design are disappearing. Scaling down geometries [1] alone does not provide improved performance, lower power, smaller size, and lower cost. It will require "More than Moore" [2] through the tighter integration of system level components at the package level. System-in-Package (SiP) will deliver the efficient use of three dimensions (3D) through innovation in packaging and interconnect technology. A key bottleneck to the implementation of high-performance microelectronic systems, including SiP, is the lack of low-latency, high-bandwidth, and high-density off-chip interconnects. Some of the challenges in achieving high-bandwidth chip-to-chip communication using electrical interconnects include the high losses in the substrate dielectric, reflections and impedance discontinuities, and susceptibility to crosstalk [3]. Obviously, the incentive for the use of photonics to overcome these challenges and leverage low-latency and high-bandwidth communication will enable the vision of optical computing within next generation architectures. Supercomputers of today offer sustained performance of more than petaflops, which can be increased by utilizing optical interconnects. Next-generation computing architectures are needed with ultra-low power consumption and ultra-high performance enabled by novel interconnection technologies. In this paper we will discuss a CMOS compatible underlying technology to enable next generation optical computing architectures. By introducing a new optical layer within the 3D SiP, the development of converged microsystems and their deployment for next-generation optical computing architectures will be leveraged.

  16. Cognitive Correlates of Performance in Algorithms in a Computer Science Course for High School

    ERIC Educational Resources Information Center

    Avancena, Aimee Theresa; Nishihara, Akinori

    2014-01-01

    Computer science for high school faces many challenging issues. One of these is whether the students possess the appropriate cognitive ability for learning the fundamentals of computer science. Online tests were created based on known cognitive factors and fundamental algorithms and were implemented among the second grade students in the…

  17. Unified, Cross-Platform, Open-Source Library Package for High-Performance Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kozacik, Stephen

    Compute power is continually increasing, but this increased performance is largely found in sophisticated computing devices and supercomputer resources that are difficult to use, resulting in under-utilization. We developed a unified set of programming tools that will allow users to take full advantage of the new technology by allowing them to work at a level abstracted away from the platform specifics, encouraging the use of modern computing systems, including government-funded supercomputer facilities.

  18. SAME4HPC: A Promising Approach in Building a Scalable and Mobile Environment for High-Performance Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Karthik, Rajasekar

    2014-01-01

    In this paper, an architecture for building a Scalable And Mobile Environment For High-Performance Computing with spatial capabilities, called SAME4HPC, is described using cutting-edge technologies and standards such as Node.js, HTML5, ECMAScript 6, and PostgreSQL 9.4. Mobile devices are increasingly becoming powerful enough to run high-performance apps. At the same time, there exist a significant number of low-end and older devices that rely heavily on the server or the cloud infrastructure to do the heavy lifting. Our architecture aims to support both of these types of devices to provide high performance and a rich user experience. A cloud infrastructure consisting of OpenStack with Ubuntu, GeoServer, and high-performance JavaScript frameworks are some of the key open-source and industry-standard practices that have been adopted in this architecture.

  19. Toward real-time Monte Carlo simulation using a commercial cloud computing infrastructure

    NASA Astrophysics Data System (ADS)

    Wang, Henry; Ma, Yunzhi; Pratx, Guillem; Xing, Lei

    2011-09-01

    Monte Carlo (MC) methods are the gold standard for modeling photon and electron transport in a heterogeneous medium; however, their computational cost prohibits their routine use in the clinic. Cloud computing, wherein computing resources are allocated on-demand from a third party, is a new approach for high performance computing and is implemented to perform ultra-fast MC calculation in radiation therapy. We deployed the EGS5 MC package in a commercial cloud environment. Launched from a single local computer with Internet access, a Python script allocates a remote virtual cluster. A handshaking protocol designates master and worker nodes. The EGS5 binaries and the simulation data are initially loaded onto the master node. The simulation is then distributed among independent worker nodes via the message passing interface, and the results aggregated on the local computer for display and data analysis. The described approach is evaluated for pencil beams and broad beams of high-energy electrons and photons. The output of cloud-based MC simulation is identical to that produced by single-threaded implementation. For 1 million electrons, a simulation that takes 2.58 h on a local computer can be executed in 3.3 min on the cloud with 100 nodes, a 47× speed-up. Simulation time scales inversely with the number of parallel nodes. The parallelization overhead is also negligible for large simulations. Cloud computing represents one of the most important recent advances in supercomputing technology and provides a promising platform for substantially improved MC simulation. In addition to the significant speed up, cloud computing builds a layer of abstraction for high performance parallel computing, which may change the way dose calculations are performed and radiation treatment plans are completed. This work was presented in part at the 2010 Annual Meeting of the American Association of Physicists in Medicine (AAPM), Philadelphia, PA.
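
    A simplified stand-in for the master/worker layout described above is sketched below, assuming an MPI environment such as mpi4py on the provisioned cluster; the dummy simulate_histories kernel replaces the actual EGS5 transport code and the history counts are illustrative.

    ```python
    # Simplified stand-in for the described master/worker layout, assuming an MPI
    # environment (mpi4py) on the provisioned cluster; the dummy kernel below replaces
    # the actual EGS5 transport code, which is outside the scope of this sketch.
    from mpi4py import MPI
    import numpy as np

    def simulate_histories(n, seed):
        """Placeholder for an EGS5 run: deposit random 'dose' values into a 1-D grid."""
        rng = np.random.default_rng(seed)
        depth_bins = np.zeros(100)
        hits = rng.integers(0, 100, size=n)
        np.add.at(depth_bins, hits, rng.exponential(1.0, size=n))
        return depth_bins

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    total_histories = 1_000_000
    local_n = total_histories // size              # each worker node gets an equal share
    local_dose = simulate_histories(local_n, seed=rank)

    # Workers return partial tallies to rank 0, mirroring the aggregation the paper
    # performs on the local computer once the cloud-based workers finish.
    partials = comm.gather(local_dose, root=0)
    if rank == 0:
        global_dose = np.sum(partials, axis=0)
        print("peak dose bin:", int(np.argmax(global_dose)))
    ```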

  20. The Research and Implementation of MUSER CLEAN Algorithm Based on OpenCL

    NASA Astrophysics Data System (ADS)

    Feng, Y.; Chen, K.; Deng, H.; Wang, F.; Mei, Y.; Wei, S. L.; Dai, W.; Yang, Q. P.; Liu, Y. B.; Wu, J. P.

    2017-03-01

    It is urgent to carry out high-performance data processing on a single machine in the development of astronomical software. However, due to differing machine configurations, traditional programming techniques such as multi-threading and CUDA (Compute Unified Device Architecture)+GPU (Graphic Processing Unit) have obvious limitations in portability and seamlessness between different operating systems. The OpenCL (Open Computing Language) used in the development of the MUSER (MingantU SpEctral Radioheliograph) data processing system is introduced, and the Högbom CLEAN algorithm is re-implemented as a parallel CLEAN algorithm using the Python language and the PyOpenCL extension package. The experimental results show that the CLEAN algorithm based on OpenCL has approximately equal operating efficiency compared with the former CLEAN algorithm based on CUDA. More importantly, data processing in a CPU-only (Central Processing Unit) environment can also achieve high performance, which solves the problem of the environmental dependence of CUDA+GPU. Overall, the research improves the adaptability of the system with emphasis on the performance of MUSER image clean computing. Meanwhile, the realization of OpenCL in MUSER proves its availability in scientific data processing. In view of the high-performance computing features of OpenCL in heterogeneous environments, it will probably become the preferred technology in future high-performance astronomical software development.
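
    A serial NumPy sketch of the Högbom CLEAN loop is given below solely to make the algorithm concrete; the MUSER system parallelizes this loop with PyOpenCL, and the array sizes, gain, and stopping threshold here are illustrative assumptions rather than MUSER parameters.

    ```python
    # Serial NumPy sketch of Hogbom CLEAN, shown only to illustrate the loop that the
    # MUSER system reimplements in parallel with PyOpenCL; array sizes, gain, and the
    # stopping threshold are illustrative assumptions, not MUSER parameters.
    import numpy as np

    def hogbom_clean(dirty, psf, gain=0.1, threshold=0.01, max_iter=500):
        """Iteratively subtract scaled, shifted copies of the PSF from the dirty map."""
        residual = dirty.copy()
        model = np.zeros_like(dirty)
        cy, cx = np.array(psf.shape) // 2          # PSF peak assumed at its centre
        for _ in range(max_iter):
            y, x = np.unravel_index(np.argmax(np.abs(residual)), residual.shape)
            peak = residual[y, x]
            if abs(peak) < threshold:
                break
            model[y, x] += gain * peak
            # Subtract the PSF aligned on the current peak, clipped to the map edges.
            ylo, xlo = max(0, y - cy), max(0, x - cx)
            yhi = min(residual.shape[0], y - cy + psf.shape[0])
            xhi = min(residual.shape[1], x - cx + psf.shape[1])
            residual[ylo:yhi, xlo:xhi] -= gain * peak * psf[
                ylo - (y - cy):yhi - (y - cy), xlo - (x - cx):xhi - (x - cx)]
        return model, residual

    yy, xx = np.mgrid[-8:9, -8:9]
    psf = np.exp(-0.5 * (yy ** 2 + xx ** 2) / 4.0)   # toy Gaussian beam, peak 1 at centre
    dirty = np.zeros((64, 64))
    dirty[12:29, 22:39] = psf                        # one point source blurred by the beam
    model, residual = hogbom_clean(dirty, psf)
    print(np.unravel_index(np.argmax(model), model.shape))   # recovers the source at (20, 30)
    ```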

  1. Cloud object store for checkpoints of high performance computing applications using decoupling middleware

    DOEpatents

    Bent, John M.; Faibish, Sorin; Grider, Gary

    2016-04-19

    Cloud object storage is enabled for checkpoints of high performance computing applications using a middleware process. A plurality of files, such as checkpoint files, generated by a plurality of processes in a parallel computing system are stored by obtaining said plurality of files from said parallel computing system; converting said plurality of files to objects using a log structured file system middleware process; and providing said objects for storage in a cloud object storage system. The plurality of processes may run, for example, on a plurality of compute nodes. The log structured file system middleware process may be embodied, for example, as a Parallel Log-Structured File System (PLFS). The log structured file system middleware process optionally executes on a burst buffer node.
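
    The sketch below illustrates, under stated assumptions, the three steps named in the abstract (obtain checkpoint files, convert them to objects, hand the objects to an object store); the in-memory store, chunking scheme, directory path, and key format are placeholders and do not reproduce the patented PLFS-based middleware.

    ```python
    # Illustrative sketch of the three steps named in the abstract (obtain checkpoint
    # files, convert them to objects, hand the objects to an object store). The
    # in-memory store and naming scheme are placeholders, not the patented PLFS path.
    import hashlib
    from pathlib import Path

    class ToyObjectStore:
        """Stand-in for a cloud object store: maps object keys to byte payloads."""
        def __init__(self):
            self._objects = {}
        def put(self, key, data):
            self._objects[key] = data
        def get(self, key):
            return self._objects[key]

    def checkpoint_to_objects(checkpoint_dir, store, chunk_size=1 << 20):
        """Split each checkpoint file into fixed-size chunks and store them as objects."""
        for path in sorted(Path(checkpoint_dir).glob("*.chk")):
            with open(path, "rb") as f:
                index = 0
                while chunk := f.read(chunk_size):
                    # Key encodes file name, chunk index, and a content hash for integrity.
                    key = f"{path.name}/{index:06d}/{hashlib.sha256(chunk).hexdigest()[:12]}"
                    store.put(key, chunk)
                    index += 1

    store = ToyObjectStore()
    checkpoint_to_objects("/tmp/checkpoints", store)   # directory path is hypothetical
    print(len(store._objects), "objects written")
    ```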

  2. Particle Identification on an FPGA Accelerated Compute Platform for the LHCb Upgrade

    NASA Astrophysics Data System (ADS)

    Fäerber, Christian; Schwemmer, Rainer; Machen, Jonathan; Neufeld, Niko

    2017-07-01

    The current LHCb readout system will be upgraded in 2018 to a “triggerless” readout of the entire detector at the Large Hadron Collider collision rate of 40 MHz. The corresponding bandwidth from the detector down to the foreseen dedicated computing farm (event filter farm), which acts as the trigger, has to be increased by a factor of almost 100 from currently 500 Gb/s up to 40 Tb/s. The event filter farm will preanalyze the data and will select the events on an event by event basis. This will reduce the bandwidth down to a manageable size to write the interesting physics data to tape. The design of such a system is a challenging task, which is why different new technologies are considered and have to be investigated for the different parts of the system. For use in the event building farm or in the event filter farm (trigger), an experimental field-programmable gate array (FPGA) accelerated computing platform is considered and therefore tested. FPGA compute accelerators are used more and more in standard servers, for example for Microsoft Bing search or Baidu search. The platform we use hosts a general Intel CPU and a high-performance FPGA linked via the high-speed Intel QuickPath Interconnect. An accelerator is implemented on the FPGA. It is very likely that these platforms, which are built, in general, for high-performance computing, are also very interesting for the high-energy physics community. First, the performance results of smaller test cases performed at the beginning are presented. Afterward, a part of the existing LHCb RICH particle identification is ported to the experimental FPGA-accelerated platform and tested. We have compared the performance of the LHCb RICH particle identification running on a normal CPU with the performance of the same algorithm running on the Xeon-FPGA compute accelerator platform.

  3. Effects of Computer Skill on Mouse Move and Click Performance

    ERIC Educational Resources Information Center

    Panagiotakopoulos, Chris; Sarris, Menelaos

    2008-01-01

    This study focuses on the use of computers in the field of education. It reports a series of experimental mouse move and click tasks on constant and moving stimuli. These experiments attempt to explore the efficiency with which individuals of different skill level and age group perform using a mouse. Differences in performance between high-skill…

  4. Concurrent Probabilistic Simulation of High Temperature Composite Structural Response

    NASA Technical Reports Server (NTRS)

    Abdi, Frank

    1996-01-01

    A computational structural/material analysis and design tool which would meet industry's future demand for expedience and reduced cost is presented. This unique software 'GENOA' is dedicated to parallel and high speed analysis to perform probabilistic evaluation of high temperature composite response of aerospace systems. The development is based on detailed integration and modification of diverse fields of specialized analysis techniques and mathematical models to combine their latest innovative capabilities into a commercially viable software package. The technique is specifically designed to exploit the availability of processors to perform computationally intense probabilistic analysis assessing uncertainties in structural reliability analysis and composite micromechanics. The primary objectives which were achieved in performing the development were: (1) Utilization of the power of parallel processing and static/dynamic load balancing optimization to make the complex simulation of structure, material and processing of high temperature composite affordable; (2) Computational integration and synchronization of probabilistic mathematics, structural/material mechanics and parallel computing; (3) Implementation of an innovative multi-level domain decomposition technique to identify the inherent parallelism and increase convergence rates through high- and low-level processor assignment; (4) Creating the framework for a portable parallel architecture for machine-independent Multiple Instruction Multiple Data (MIMD), Single Instruction Multiple Data (SIMD), hybrid, and distributed workstation types of computers; and (5) Market evaluation. The results of the Phase-2 effort provide a good basis for continuation and warrant a Phase-3 government and industry partnership.

  5. Bifocal computational near eye light field displays and Structure parameters determination scheme for bifocal computational display.

    PubMed

    Liu, Mali; Lu, Chihao; Li, Haifeng; Liu, Xu

    2018-02-19

    We propose a bifocal computational near-eye light field display (bifocal computational display) and a structure parameters determination scheme (SPDS) for it that achieve greater depth of field (DOF), high resolution, accommodation, and a compact form factor. Using a liquid varifocal lens, two single-focal computational light fields are superimposed by time multiplexing to reconstruct a virtual object's light field, avoiding the limitation of requiring a high refresh rate. By minimizing the deviation between the reconstructed light field and the original light field, we propose a framework for determining the structure parameters of the bifocal computational light field display. When applied with different objectives, SPDS can achieve either high average resolution or uniform resolution over the scene depth range. To analyze the advantages and limitations of our proposed method, we conducted simulations and constructed a simple prototype comprising a liquid varifocal lens, dual-layer LCDs, and a uniform backlight. The results of simulations and experiments show that the proposed system achieves the expected performance. Owing to this performance, we expect the bifocal computational display and SPDS to contribute to daily-use and commercial virtual reality displays.

  6. ERDC MSRC (Major Shared Resource Center) Resource. High Performance Computing for the Warfighter. Fall 2008

    DTIC Science & Technology

    2008-01-01

    “Solving the Hard Problems” at UGC 2008 in Seattle, by Rose J. Dykes, ERDC MSRC... two fields to remain competitive in the global market. The ERDC MSRC attempts to take every available opportunity to encourage students to enter these... Attendees of the 18th annual DoD High Performance Computing Modernization Program (HPCMP) Users Group Conference (UGC

  7. Biocellion: accelerating computer simulation of multicellular biological system models

    PubMed Central

    Kang, Seunghwa; Kahan, Simon; McDermott, Jason; Flann, Nicholas; Shmulevich, Ilya

    2014-01-01

    Motivation: Biological system behaviors are often the outcome of complex interactions among a large number of cells and their biotic and abiotic environment. Computational biologists attempt to understand, predict and manipulate biological system behavior through mathematical modeling and computer simulation. Discrete agent-based modeling (in combination with high-resolution grids to model the extracellular environment) is a popular approach for building biological system models. However, the computational complexity of this approach forces computational biologists to resort to coarser resolution approaches to simulate large biological systems. High-performance parallel computers have the potential to address the computing challenge, but writing efficient software for parallel computers is difficult and time-consuming. Results: We have developed Biocellion, a high-performance software framework, to solve this computing challenge using parallel computers. To support a wide range of multicellular biological system models, Biocellion asks users to provide their model specifics by filling the function body of pre-defined model routines. Using Biocellion, modelers without parallel computing expertise can efficiently exploit parallel computers with less effort than writing sequential programs from scratch. We simulate cell sorting, microbial patterning and a bacterial system in soil aggregate as case studies. Availability and implementation: Biocellion runs on x86 compatible systems with the 64 bit Linux operating system and is freely available for academic use. Visit http://biocellion.com for additional information. Contact: seunghwa.kang@pnnl.gov PMID:25064572
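
    Biocellion itself is a C++ framework; the Python sketch below only illustrates the programming pattern the abstract describes, in which modellers fill in the bodies of pre-defined model routines and the framework drives the simulation loop. The class and method names are invented for illustration and are not Biocellion's API.

    ```python
    # Biocellion itself is a C++ framework; this Python sketch only illustrates the
    # programming pattern the abstract describes, in which modellers supply the bodies
    # of pre-defined model routines and the framework drives the simulation loop.
    import numpy as np

    class AgentModel:
        """Pre-defined routines a modeller overrides; names here are illustrative."""
        def init_agents(self, n):
            raise NotImplementedError
        def update_agent(self, position, rng):
            raise NotImplementedError

    class Framework:
        """Framework-owned loop; a real engine would partition agents across nodes."""
        def run(self, model, n_agents=1000, steps=10, seed=0):
            rng = np.random.default_rng(seed)
            agents = model.init_agents(n_agents)
            for _ in range(steps):
                agents = np.array([model.update_agent(p, rng) for p in agents])
            return agents

    class RandomWalkModel(AgentModel):
        def init_agents(self, n):
            return np.zeros((n, 2))                         # all agents start at the origin
        def update_agent(self, position, rng):
            return position + rng.normal(0.0, 1.0, size=2)  # unbiased 2-D random walk

    final = Framework().run(RandomWalkModel())
    print("mean displacement:", np.linalg.norm(final.mean(axis=0)))
    ```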

  8. GPU Accelerated Prognostics

    NASA Technical Reports Server (NTRS)

    Gorospe, George E., Jr.; Daigle, Matthew J.; Sankararaman, Shankar; Kulkarni, Chetan S.; Ng, Eley

    2017-01-01

    Prognostic methods enable operators and maintainers to predict the future performance for critical systems. However, these methods can be computationally expensive and may need to be performed each time new information about the system becomes available. In light of these computational requirements, we have investigated the application of graphics processing units (GPUs) as a computational platform for real-time prognostics. Recent advances in GPU technology have reduced cost and increased the computational capability of these highly parallel processing units, making them more attractive for the deployment of prognostic software. We present a survey of model-based prognostic algorithms with considerations for leveraging the parallel architecture of the GPU and a case study of GPU-accelerated battery prognostics with computational performance results.

  9. Allocating Tactical High-Performance Computer (HPC) Resources to Offloaded Computation in Battlefield Scenarios

    DTIC Science & Technology

    2013-12-01

    The authors present a Computing on Dissemination with predictable contacts (pCoD) algorithm, since it is impossible to reserve task execution time in advance... Glossary fragments: Computing While Charging; DAG, Directed Acyclic Graph; TTL, Time-to-live; pCoD, predictable contacts; CoD, Computing on Dissemination; upCoD, unpredictable

  10. Argonne Out Loud: Computation, Big Data, and the Future of Cities

    ScienceCinema

    Catlett, Charlie

    2018-01-16

    Charlie Catlett, a Senior Computer Scientist at Argonne and Director of the Urban Center for Computation and Data at the Computation Institute of the University of Chicago and Argonne, talks about how he and his colleagues are using high-performance computing, data analytics, and embedded systems to better understand and design cities.

  11. Leveraging the Power of High Performance Computing for Next Generation Sequencing Data Analysis: Tricks and Twists from a High Throughput Exome Workflow

    PubMed Central

    Wonczak, Stephan; Thiele, Holger; Nieroda, Lech; Jabbari, Kamel; Borowski, Stefan; Sinha, Vishal; Gunia, Wilfried; Lang, Ulrich; Achter, Viktor; Nürnberg, Peter

    2015-01-01

    Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings. In order to run these analyses fast, automated workflows implemented on high performance computers are state of the art. While providing sufficient compute power and storage to meet the NGS data challenge, high performance computing (HPC) systems require special care when utilized for high throughput processing. This is especially true if the HPC system is shared by different users. Here, stability, robustness and maintainability are as important for automated workflows as speed and throughput. To achieve all of these aims, dedicated solutions have to be developed. In this paper, we present the tricks and twists that we utilized in the implementation of our exome data processing workflow. It may serve as a guideline for other high throughput data analysis projects using a similar infrastructure. The code implementing our solutions is provided in the supporting information files. PMID:25942438

  12. Outcomes and challenges of global high-resolution non-hydrostatic atmospheric simulations using the K computer

    NASA Astrophysics Data System (ADS)

    Satoh, Masaki; Tomita, Hirofumi; Yashiro, Hisashi; Kajikawa, Yoshiyuki; Miyamoto, Yoshiaki; Yamaura, Tsuyoshi; Miyakawa, Tomoki; Nakano, Masuo; Kodama, Chihiro; Noda, Akira T.; Nasuno, Tomoe; Yamada, Yohei; Fukutomi, Yoshiki

    2017-12-01

    This article reviews the major outcomes of a 5-year (2011-2016) project using the K computer to perform global numerical atmospheric simulations based on the non-hydrostatic icosahedral atmospheric model (NICAM). The K computer was made available to the public in September 2012 and was used as a primary resource for Japan's Strategic Programs for Innovative Research (SPIRE), an initiative to investigate five strategic research areas; the NICAM project fell under the research area of climate and weather simulation sciences. Combining NICAM with high-performance computing has created new opportunities in three areas of research: (1) higher resolution global simulations that produce more realistic representations of convective systems, (2) multi-member ensemble simulations that are able to perform extended-range forecasts 10-30 days in advance, and (3) multi-decadal simulations for climatology and variability. Before the K computer era, NICAM was used to demonstrate realistic simulations of intra-seasonal oscillations including the Madden-Julian oscillation (MJO), merely as a case study approach. Thanks to the big leap in computational performance of the K computer, we could greatly increase the number of cases of MJO events for numerical simulations, in addition to integrating time and horizontal resolution. We conclude that the high-resolution global non-hydrostatic model, as used in this five-year project, improves the ability to forecast intra-seasonal oscillations and associated tropical cyclogenesis compared with that of the relatively coarser operational models currently in use. The impacts of the sub-kilometer resolution simulation and the multi-decadal simulations using NICAM are also reviewed.

  13. High performance transcription factor-DNA docking with GPU computing

    PubMed Central

    2012-01-01

    Background Protein-DNA docking is a very challenging problem in structural bioinformatics and has important implications in a number of applications, such as structure-based prediction of transcription factor binding sites and rational drug design. Protein-DNA docking is very computationally demanding due to the high cost of energy calculation and the statistical nature of conformational sampling algorithms. More importantly, experiments show that the docking quality depends on the coverage of the conformational sampling space. It is therefore desirable to accelerate the computation of the docking algorithm, not only to reduce computing time, but also to improve docking quality. Methods In an attempt to accelerate the sampling process and to improve the docking performance, we developed a graphics processing unit (GPU)-based protein-DNA docking algorithm. The algorithm employs a potential-based energy function to describe the binding affinity of a protein-DNA pair, and integrates Monte-Carlo simulation and a simulated annealing method to search through the conformational space. Algorithmic techniques were developed to improve the computation efficiency and scalability on GPU-based high performance computing systems. Results The effectiveness of our approach is tested on a non-redundant set of 75 TF-DNA complexes and a newly developed TF-DNA docking benchmark. We demonstrated that the GPU-based docking algorithm can significantly accelerate the simulation process and thereby improve the chance of finding near-native TF-DNA complex structures. This study also suggests that further improvement in protein-DNA docking research would require efforts from two integral aspects: improvement in computation efficiency and energy function design. Conclusions We present a high performance computing approach for improving the prediction accuracy of protein-DNA docking. The GPU-based docking algorithm accelerates the search of the conformational space and thus increases the chance of finding more near-native structures. To the best of our knowledge, this is the first ad hoc effort of applying GPU or GPU clusters to the protein-DNA docking problem. PMID:22759575
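
    A generic Monte Carlo simulated-annealing loop of the kind the paper accelerates on GPUs is sketched below; the toy energy function, proposal width, and cooling schedule are assumptions standing in for the potential-based TF-DNA scoring described in the abstract.

    ```python
    # Generic Monte Carlo simulated-annealing sketch of the search strategy the paper
    # accelerates on GPUs; the toy energy function and cooling schedule are assumptions
    # standing in for the potential-based TF-DNA scoring described in the abstract.
    import numpy as np

    def toy_energy(pose):
        """Placeholder score: minimum at pose = (1, 2, 3) in an abstract 3-D pose space."""
        return np.sum((pose - np.array([1.0, 2.0, 3.0])) ** 2)

    def anneal(energy, x0, t_start=5.0, t_end=1e-3, steps=20000, seed=0):
        rng = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float)
        e = energy(x)
        best_x, best_e = x.copy(), e
        for k in range(steps):
            t = t_start * (t_end / t_start) ** (k / steps)   # geometric cooling schedule
            candidate = x + rng.normal(0.0, 0.3, size=x.shape)
            e_new = energy(candidate)
            # Metropolis criterion: always accept downhill moves, sometimes uphill ones.
            if e_new < e or rng.random() < np.exp(-(e_new - e) / t):
                x, e = candidate, e_new
                if e < best_e:
                    best_x, best_e = x.copy(), e
        return best_x, best_e

    pose, score = anneal(toy_energy, x0=[0.0, 0.0, 0.0])
    print(np.round(pose, 2), round(score, 4))
    ```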

  14. Sequence Tolerance of a Highly Stable Single Domain Antibody: Comparison of Computational and Experimental Profiles

    DTIC Science & Technology

    2016-09-09

    evaluating 18 mutants using either the A or B conformer is only r ≈ 0.2. Given the poor performance of approximating the observed experimental... unusually high thermal stability is explored by a combined computational and experimental study. Starting with the crystallographic structure

  15. Job Management and Task Bundling

    NASA Astrophysics Data System (ADS)

    Berkowitz, Evan; Jansen, Gustav R.; McElvain, Kenneth; Walker-Loud, André

    2018-03-01

    High Performance Computing is often performed on scarce and shared computing resources. To ensure computers are used to their full capacity, administrators often incentivize large workloads that are not possible on smaller systems. Measurements in Lattice QCD frequently do not scale to machine-size workloads. By bundling tasks together we can create large jobs suitable for gigantic partitions. We discuss METAQ and mpi_jm, software developed to dynamically group computational tasks together, that can intelligently backfill to consume idle time without substantial changes to users' current workflows or executables.
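
    A toy illustration of the bundling idea, packing many small independent tasks into one large allocation so idle workers can be backfilled, is shown below; it does not reproduce METAQ or mpi_jm, and the task list and runtimes are invented.

    ```python
    # Toy illustration of the bundling idea (many small, independent tasks packed into
    # one large allocation so idle workers can be backfilled); it does not reproduce the
    # METAQ/mpi_jm implementations, and the task list and runtimes are made up.
    from concurrent.futures import ProcessPoolExecutor
    import time

    def small_task(task_id, seconds):
        """Stand-in for one Lattice-QCD measurement that cannot use the whole machine."""
        time.sleep(seconds)
        return task_id, seconds

    def run_bundle(tasks, workers=4):
        """Run many small tasks inside one 'job', keeping all workers busy until done."""
        with ProcessPoolExecutor(max_workers=workers) as pool:
            futures = [pool.submit(small_task, i, s) for i, s in enumerate(tasks)]
            return [f.result() for f in futures]

    if __name__ == "__main__":
        # 12 short tasks of uneven length share 4 workers; the pool backfills workers
        # as they go idle instead of leaving them to wait for the longest task.
        durations = [0.2, 0.1, 0.3, 0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.2, 0.1, 0.3]
        start = time.time()
        run_bundle(durations)
        print(f"bundle finished in {time.time() - start:.2f}s")
    ```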

  16. Multicore Challenges and Benefits for High Performance Scientific Computing

    DOE PAGES

    Nielsen, Ida M. B.; Janssen, Curtis L.

    2008-01-01

    Until recently, performance gains in processors were achieved largely by improvements in clock speeds and instruction level parallelism. Thus, applications could obtain performance increases with relatively minor changes by upgrading to the latest generation of computing hardware. Currently, however, processor performance improvements are realized by using multicore technology and hardware support for multiple threads within each core, and taking full advantage of this technology to improve the performance of applications requires exposure of extreme levels of software parallelism. We will here discuss the architecture of parallel computers constructed from many multicore chips as well as techniques for managing the complexity of programming such computers, including the hybrid message-passing/multi-threading programming model. We will illustrate these ideas with a hybrid distributed memory matrix multiply and a quantum chemistry algorithm for energy computation using Møller–Plesset perturbation theory.
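
    As a single-node analogue of the hybrid model, the sketch below distributes row blocks of a matrix across processes (standing in for MPI ranks) while each local block product relies on NumPy's threaded BLAS (standing in for per-rank multithreading); it is an illustration of the idea, not the article's quantum chemistry code.

    ```python
    # Single-node analogue of the hybrid message-passing/multi-threading model: row
    # blocks of A are distributed across processes (standing in for MPI ranks), while
    # each local block product relies on NumPy's threaded BLAS (standing in for the
    # per-rank multithreading the article discusses).
    from concurrent.futures import ProcessPoolExecutor
    import numpy as np

    def local_block_multiply(args):
        a_block, b = args
        return a_block @ b          # threaded BLAS handles the within-block parallelism

    def hybrid_matmul(a, b, n_ranks=4):
        row_blocks = np.array_split(a, n_ranks, axis=0)     # "distributed memory" split
        with ProcessPoolExecutor(max_workers=n_ranks) as pool:
            c_blocks = list(pool.map(local_block_multiply, [(blk, b) for blk in row_blocks]))
        return np.vstack(c_blocks)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        a, b = rng.standard_normal((512, 256)), rng.standard_normal((256, 128))
        assert np.allclose(hybrid_matmul(a, b), a @ b)
        print("hybrid block multiply matches single-process result")
    ```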

  17. RFA Guardian: Comprehensive Simulation of Radiofrequency Ablation Treatment of Liver Tumors.

    PubMed

    Voglreiter, Philip; Mariappan, Panchatcharam; Pollari, Mika; Flanagan, Ronan; Blanco Sequeiros, Roberto; Portugaller, Rupert Horst; Fütterer, Jurgen; Schmalstieg, Dieter; Kolesnik, Marina; Moche, Michael

    2018-01-15

    The RFA Guardian is a comprehensive application for high-performance patient-specific simulation of radiofrequency ablation of liver tumors. We address a wide range of usage scenarios. These include pre-interventional planning, sampling of the parameter space for uncertainty estimation, treatment evaluation and, in the worst case, failure analysis. The RFA Guardian is the first of its kind that exhibits sufficient performance for simulating treatment outcomes during the intervention. We achieve this by combining a large number of high-performance image processing, biomechanical simulation and visualization techniques into a generalized technical workflow. Further, we wrap the feature set into a single, integrated application, which exploits all available resources of standard consumer hardware, including massively parallel computing on graphics processing units. This allows us to predict or reproduce treatment outcomes on a single personal computer with high computational performance and high accuracy. The resulting low demand for infrastructure enables easy and cost-efficient integration into the clinical routine. We present a number of evaluation cases from the clinical practice where users performed the whole technical workflow from patient-specific modeling to final validation and highlight the opportunities arising from our fast, accurate prediction techniques.

  18. A high performance linear equation solver on the VPP500 parallel supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nakanishi, Makoto; Ina, Hiroshi; Miura, Kenichi

    1994-12-31

    This paper describes the implementation of two high performance linear equation solvers developed for the Fujitsu VPP500, a distributed memory parallel supercomputer system. The solvers take advantage of the key architectural features of VPP500--(1) scalability for an arbitrary number of processors up to 222 processors, (2) flexible data transfer among processors provided by a crossbar interconnection network, (3) vector processing capability on each processor, and (4) overlapped computation and transfer. The general linear equation solver based on the blocked LU decomposition method achieves 120.0 GFLOPS performance with 100 processors in the LINPACK Highly Parallel Computing benchmark.
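
    A serial NumPy sketch of a right-looking blocked LU factorization (without pivoting) is included below only to illustrate the decomposition that the VPP500 solver distributes and vectorizes; the block size and the no-pivoting simplification are assumptions, not details of the Fujitsu implementation.

    ```python
    # Serial NumPy sketch of a right-looking blocked LU factorization (no pivoting),
    # included only to illustrate the decomposition the VPP500 solver distributes and
    # vectorizes; block size and the no-pivoting simplification are assumptions.
    import numpy as np

    def blocked_lu(a, block=64):
        """Return a copy of `a` overwritten with its LU factors (unit-lower / upper)."""
        a = a.astype(float).copy()
        n = a.shape[0]
        for k in range(0, n, block):
            e = min(k + block, n)
            # 1. Unblocked LU of the current panel (columns k:e, all rows below the diagonal).
            for j in range(k, e):
                a[j + 1:, j] /= a[j, j]
                a[j + 1:, j + 1:e] -= np.outer(a[j + 1:, j], a[j, j + 1:e])
            if e < n:
                # 2. Triangular solve for the block row of U: U12 = L11^{-1} A12.
                for j in range(k, e):
                    a[j + 1:e, e:] -= np.outer(a[j + 1:e, j], a[j, e:])
                # 3. Rank-`block` update of the trailing submatrix: A22 -= L21 @ U12.
                a[e:, e:] -= a[e:, k:e] @ a[k:e, e:]
        return a

    rng = np.random.default_rng(0)
    m = rng.standard_normal((300, 300)) + 300.0 * np.eye(300)  # diagonally dominant, safe without pivoting
    lu = blocked_lu(m)
    l, u = np.tril(lu, -1) + np.eye(300), np.triu(lu)
    print(np.allclose(l @ u, m))   # True
    ```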

  19. Integrated computational study of ultra-high heat flux cooling using cryogenic micro-solid nitrogen spray

    NASA Astrophysics Data System (ADS)

    Ishimoto, Jun; Oh, U.; Tan, Daisuke

    2012-10-01

    A new type of ultra-high heat flux cooling system using the atomized spray of cryogenic micro-solid nitrogen (SN2) particles produced by a superadiabatic two-fluid nozzle was developed and numerically investigated for application to next-generation supercomputer processor thermal management. The fundamental characteristics of heat transfer and cooling performance of micro-solid nitrogen particulate spray impinging on a heated substrate were numerically investigated and experimentally measured by a new type of integrated computational-experimental technique. The employed Computational Fluid Dynamics (CFD) analysis based on the Euler-Lagrange model is focused on the cryogenic spray behavior of atomized particulate micro-solid nitrogen and also on its ultra-high heat flux cooling characteristics. Based on the numerically predicted performance, a new type of cryogenic spray cooling technique for application to an ultra-high heat power density device was developed. In the present integrated computation, it is clarified that the cryogenic micro-solid spray cooling characteristics are affected by several factors of the heat transfer process of micro-solid spray which impinges on a heated surface as well as by the atomization behavior of micro-solid particles. When micro-SN2 spray cooling was used, an ultra-high cooling heat flux level was achieved during operation, a better cooling performance than that with liquid nitrogen (LN2) spray cooling. As micro-SN2 cooling has the advantage of direct latent heat transport which avoids the film boiling state, ultra-short time scale heat transfer in a thin boundary layer is more readily achieved than with LN2 spray. The present numerical prediction of the micro-SN2 spray cooling heat flux profile can reasonably reproduce the measurement results of cooling wall heat flux profiles. The application of micro-solid spray as a refrigerant for next generation computer processors is anticipated, and its ultra-high heat flux technology is expected to result in an extensive improvement in the effective cooling performance of large scale supercomputer systems.

  20. Rich client data exploration and research prototyping for NOAA

    NASA Astrophysics Data System (ADS)

    Grossberg, Michael; Gladkova, Irina; Guch, Ingrid; Alabi, Paul; Shahriar, Fazlul; Bonev, George; Aizenman, Hannah

    2009-08-01

    Data from satellites and model simulations is increasing exponentially as observations and model computing power improve rapidly. Not only is technology producing more data, but it often comes from sources all over the world. Researchers and scientists who must collaborate are also located globally. This work presents a software design and technologies which will make it possible for groups of researchers to explore large data sets visually together without the need to download these data sets locally. The design will also make it possible to exploit high performance computing remotely and transparently to analyze and explore large data sets. Computer power, high quality sensing, and data storage capacity have improved at a rate that outstrips our ability to develop software applications that exploit these resources. It is impractical for NOAA scientists to download all of the satellite and model data that may be relevant to a given problem and the computing environments available to a given researcher range from supercomputers to only a web browser. The size and volume of satellite and model data are increasing exponentially. There are at least 50 multisensor satellite platforms collecting Earth science data. On the ground and in the sea there are sensor networks, as well as networks of ground based radar stations, producing a rich real-time stream of data. This new wealth of data would have limited use were it not for the arrival of large-scale high-performance computation provided by parallel computers, clusters, grids, and clouds. With these computational resources and vast archives available, it is now possible to analyze subtle relationships which are global, multi-modal and cut across many data sources. Researchers, educators, and even the general public, need tools to access, discover, and use vast data center archives and high performance computing through a simple yet flexible interface.

  1. A High Performance SOAP Engine for Grid Computing

    NASA Astrophysics Data System (ADS)

    Wang, Ning; Welzl, Michael; Zhang, Liang

    Web Service technology still has many defects that make its usage for Grid computing problematic, most notably the low performance of the SOAP engine. In this paper, we develop a novel SOAP engine called SOAPExpress, which adopts two key techniques for improving processing performance: SCTP data transport and dynamic early binding based data mapping. Experimental results show a significant and consistent performance improvement of SOAPExpress over Apache Axis.

  2. The iPlant collaborative: cyberinfrastructure for enabling data to discovery for the life sciences

    USDA-ARS?s Scientific Manuscript database

    The iPlant Collaborative provides life science research communities access to comprehensive, scalable, and cohesive computational infrastructure for data management; identity management; collaboration tools; and cloud, high-performance, high-throughput computing. iPlant provides training, learning m...

  3. Specialized computer architectures for computational aerodynamics

    NASA Technical Reports Server (NTRS)

    Stevenson, D. K.

    1978-01-01

    In recent years, computational fluid dynamics has made significant progress in modelling aerodynamic phenomena. Currently, one of the major barriers to future development lies in the compute-intensive nature of the numerical formulations and the relative high cost of performing these computations on commercially available general purpose computers, a cost high with respect to dollar expenditure and/or elapsed time. Today's computing technology will support a program designed to create specialized computing facilities to be dedicated to the important problems of computational aerodynamics. One of the still unresolved questions is the organization of the computing components in such a facility. The characteristics of fluid dynamic problems which will have significant impact on the choice of computer architecture for a specialized facility are reviewed.

  4. Mass storage: The key to success in high performance computing

    NASA Technical Reports Server (NTRS)

    Lee, Richard R.

    1993-01-01

    There are numerous High Performance Computing & Communications Initiatives in the world today. All are determined to help solve some 'Grand Challenge' type of problem, but each appears to be dominated by the pursuit of higher and higher levels of CPU performance and interconnection bandwidth as the approach to success, without any regard to the impact of Mass Storage. My colleagues and I at Data Storage Technologies believe that all will have their performance against their goals ultimately measured by their ability to efficiently store and retrieve the 'deluge of data' created by end-users who will be using these systems to solve scientific Grand Challenge problems, and that the issue of Mass Storage will then become the determinant of success or failure in achieving each project's goals. In today's world of High Performance Computing and Communications (HPCC), the critical path to success in solving problems can only be traveled by designing and implementing Mass Storage Systems capable of storing and manipulating the truly 'massive' amounts of data associated with solving these challenges. Within my presentation I will explore this critical issue and hypothesize solutions to this problem.

  5. High Performance Computing Operations Review Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cupps, Kimberly C.

    2013-12-19

    The High Performance Computing Operations Review (HPCOR) meeting—requested by the ASC and ASCR program headquarters at DOE—was held November 5 and 6, 2013, at the Marriott Hotel in San Francisco, CA. The purpose of the review was to discuss the processes and practices for HPC integration and its related software and facilities. Experiences and lessons learned from the most recent systems deployed were covered in order to benefit the deployment of new systems.

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sewell, Christopher Meyer

    This is a set of slides from a guest lecture for a class at the University of Texas, El Paso on visualization and data analysis for high-performance computing. The topics covered are the following: trends in high-performance computing; scientific visualization, such as OpenGL, ray tracing and volume rendering, VTK, and ParaView; data science at scale, such as in-situ visualization, image databases, distributed memory parallelism, shared memory parallelism, VTK-m, "big data", and then an analysis example.

  7. Safety of High Speed Ground Transportation Systems : Analytical Methodology for Safety Validation of Computer Controlled Subsystems : Volume 2. Development of a Safety Validation Methodology

    DOT National Transportation Integrated Search

    1995-01-01

    This report describes the development of a methodology designed to assure that a sufficiently high level of safety is achieved and maintained in computer-based systems which perform safety critical functions in high-speed rail or magnetic levitation ...

  8. Scalable software-defined optical networking with high-performance routing and wavelength assignment algorithms.

    PubMed

    Lee, Chankyun; Cao, Xiaoyuan; Yoshikane, Noboru; Tsuritani, Takehiro; Rhee, June-Koo Kevin

    2015-10-19

    The feasibility of software-defined optical networking (SDON) for practical application critically depends on the scalability of centralized control performance. In this paper, highly scalable routing and wavelength assignment (RWA) algorithms are investigated on an OpenFlow-based SDON testbed for a proof-of-concept demonstration. Efficient RWA algorithms are proposed to achieve high network capacity with reduced computation cost, a significant attribute in a scalable centrally controlled SDON. The proposed heuristic RWA algorithms differ in the order in which requests are processed and in the procedures for routing table updates. Combined with a shortest-path-based routing algorithm, a hottest-request-first processing policy that considers demand intensity and end-to-end distance information offers both the highest network throughput and acceptable computation scalability. We further investigate the trade-off between network throughput and computation complexity in the routing table update procedure through a simulation study.
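
    A hedged sketch of such a heuristic is given below: requests are ordered hottest-first by demand intensity times shortest-path length, routed on shortest paths, and assigned wavelengths first-fit under the wavelength-continuity constraint; the topology, demands, wavelength count, and tie-breaking rules are illustrative and not taken from the paper's testbed.

    ```python
    # Hedged sketch of a shortest-path RWA heuristic with a hottest-request-first
    # ordering and first-fit wavelength assignment; the topology, demand weights, and
    # tie-breaking rules are illustrative and not taken from the paper's testbed.
    import networkx as nx

    def rwa(graph, requests, n_wavelengths=8):
        """Assign (path, wavelength) per request; wavelength continuity is enforced."""
        # usage[(u, v)] holds the set of wavelengths already occupied on that link.
        usage = {tuple(sorted(e)): set() for e in graph.edges()}
        # "Hottest first": order by demand intensity times shortest-path hop count.
        order = sorted(requests,
                       key=lambda r: r[2] * nx.shortest_path_length(graph, r[0], r[1]),
                       reverse=True)
        assignment = {}
        for src, dst, intensity in order:
            path = nx.shortest_path(graph, src, dst)
            links = [tuple(sorted(p)) for p in zip(path, path[1:])]
            # First-fit: pick the lowest-index wavelength free on every link of the path.
            free = next((w for w in range(n_wavelengths)
                         if all(w not in usage[l] for l in links)), None)
            if free is None:
                assignment[(src, dst, intensity)] = None          # blocked request
                continue
            for l in links:
                usage[l].add(free)
            assignment[(src, dst, intensity)] = (path, free)
        return assignment

    g = nx.cycle_graph(6)                      # toy 6-node ring topology
    demands = [(0, 3, 5), (1, 4, 2), (2, 5, 1)]
    for req, result in rwa(g, demands).items():
        print(req, "->", result)
    ```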

  9. Using high-performance networks to enable computational aerosciences applications

    NASA Technical Reports Server (NTRS)

    Johnson, Marjory J.

    1992-01-01

    One component of the U.S. Federal High Performance Computing and Communications Program (HPCCP) is the establishment of a gigabit network to provide a communications infrastructure for researchers across the nation. This gigabit network will provide new services and capabilities, in addition to increased bandwidth, to enable future applications. An understanding of these applications is necessary to guide the development of the gigabit network and other high-performance networks of the future. In this paper we focus on computational aerosciences applications run remotely using the Numerical Aerodynamic Simulation (NAS) facility located at NASA Ames Research Center. We characterize these applications in terms of network-related parameters and relate user experiences that reveal limitations imposed by the current wide-area networking infrastructure. Then we investigate how the development of a nationwide gigabit network would enable users of the NAS facility to work in new, more productive ways.

  10. GPU-accelerated FDTD modeling of radio-frequency field-tissue interactions in high-field MRI.

    PubMed

    Chi, Jieru; Liu, Feng; Weber, Ewald; Li, Yu; Crozier, Stuart

    2011-06-01

    The analysis of high-field RF field-tissue interactions requires high-performance finite-difference time-domain (FDTD) computing. Conventional CPU-based FDTD calculations offer limited computing performance in a PC environment. This study presents a graphics processing unit (GPU)-based parallel-computing framework, producing substantially boosted computing efficiency (a two-order-of-magnitude speedup) at a PC-level cost. Specific details of implementing the FDTD method on a GPU architecture have been presented and the new computational strategy has been successfully applied to the design of a novel 8-element transceive RF coil system at 9.4 T. Facilitated by the powerful GPU-FDTD computing, the new RF coil array offers optimized fields (averaging 25% improvement in sensitivity, and 20% reduction in loop coupling compared with conventional array structures of the same size) for small animal imaging with a robust RF configuration. The GPU-enabled acceleration paves the way for FDTD to be applied for both detailed forward modeling and inverse design of MRI coils, which were previously impractical.
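
    To make concrete the stencil that a GPU kernel would parallelize, a minimal 2-D FDTD (TMz) update loop in NumPy is sketched below; the grid size, normalized coefficients, and soft source are illustrative and unrelated to the 9.4 T coil model in the paper.

    ```python
    # Minimal 2-D FDTD (TMz) update loop in NumPy, shown only to make the stencil that
    # a GPU kernel would parallelize concrete; grid size, source, and material values
    # are illustrative and unrelated to the 9.4 T coil model in the paper.
    import numpy as np

    nx, ny, steps = 200, 200, 300
    c, dx = 3e8, 1e-3
    dt = dx / (c * np.sqrt(2.0))                 # 2-D Courant stability limit

    ez = np.zeros((nx, ny))
    hx = np.zeros((nx, ny - 1))
    hy = np.zeros((nx - 1, ny))
    coef_e = c * dt / dx                         # normalized update coefficients
    coef_h = c * dt / dx

    for n in range(steps):
        # Update magnetic fields from the curl of Ez (staggered Yee grid).
        hx -= coef_h * (ez[:, 1:] - ez[:, :-1])
        hy += coef_h * (ez[1:, :] - ez[:-1, :])
        # Update Ez in the interior from the curl of H.
        ez[1:-1, 1:-1] += coef_e * ((hy[1:, 1:-1] - hy[:-1, 1:-1])
                                    - (hx[1:-1, 1:] - hx[1:-1, :-1]))
        # Soft Gaussian-pulse source in the middle of the grid.
        ez[nx // 2, ny // 2] += np.exp(-0.5 * ((n - 40) / 12.0) ** 2)

    print("peak |Ez| after", steps, "steps:", float(np.abs(ez).max()))
    ```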

  11. University Students' Attainment and Perceptions of Computer Delivered Assessment; A Comparison between Computer-Based and Traditional Tests in a "High-Stakes" Examination

    ERIC Educational Resources Information Center

    Escudier, M. P.; Newton, T. J.; Cox, M. J.; Reynolds, P. A.; Odell, E. W.

    2011-01-01

    This study compared the performance of higher education dental undergraduate students in online assessments with their performance in traditional paper-based tests, investigated students' perceptions of the fairness and acceptability of online tests, and found performance on the two formats to be comparable. The project design involved two parallel cross-over trials, one in…

  12. Multicore Programming Challenges

    NASA Astrophysics Data System (ADS)

    Perrone, Michael

    The computer industry is facing fundamental challenges that are driving a major change in the design of computer processors. Due to restrictions imposed by quantum physics, one historical path to higher computer processor performance - by increased clock frequency - has come to an end. Increasing clock frequency now leads to power consumption costs that are too high to justify. As a result, we have seen in recent years that the processor frequencies have peaked and are receding from their high point. At the same time, competitive market conditions are giving business advantage to those companies that can field new streaming applications, handle larger data sets, and update their models to market conditions faster. The desire for newer, faster and larger is driving continued demand for higher computer performance.

  13. SANs and Large Scale Data Migration at the NASA Center for Computational Sciences

    NASA Technical Reports Server (NTRS)

    Salmon, Ellen M.

    2004-01-01

    Evolution and migration are a way of life for provisioners of high-performance mass storage systems that serve high-end computers used by climate and Earth and space science researchers: the compute engines come and go, but the data remains. At the NASA Center for Computational Sciences (NCCS), disk and tape SANs are deployed to provide high-speed I/O for the compute engines and the hierarchical storage management systems. Along with gigabit Ethernet, they also enable the NCCS's latest significant migration: the transparent transfer of 300 TB of legacy HSM data into the new Sun SAM-QFS cluster.

  14. Cloud object store for archive storage of high performance computing data using decoupling middleware

    DOEpatents

    Bent, John M.; Faibish, Sorin; Grider, Gary

    2015-06-30

    Cloud object storage is enabled for archived data, such as checkpoints and results, of high performance computing applications using a middleware process. A plurality of archived files, such as checkpoint files and results, generated by a plurality of processes in a parallel computing system are stored by obtaining the plurality of archived files from the parallel computing system; converting the plurality of archived files to objects using a log structured file system middleware process; and providing the objects for storage in a cloud object storage system. The plurality of processes may run, for example, on a plurality of compute nodes. The log structured file system middleware process may be embodied, for example, as a Parallel Log-Structured File System (PLFS). The log structured file system middleware process optionally executes on a burst buffer node.
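
    As a rough sketch of the decoupling idea in this record, the Python fragment below gathers checkpoint files written by a parallel job and hands each to a cloud object store as a named object. The put_object callable is a hypothetical stand-in for whatever object-store client is in use; the middleware described above sits in the I/O path (PLFS) rather than running as a post-processing script.

        import pathlib

        def archive_checkpoints_to_cloud(checkpoint_dir, bucket, put_object):
            """Convert checkpoint files into cloud objects (illustrative only).

            put_object -- hypothetical callable (bucket, key, data) supplied by an
                          object-store client such as an S3-compatible SDK.
            """
            for path in sorted(pathlib.Path(checkpoint_dir).glob("*.ckpt")):
                key = "checkpoints/" + path.name            # object name derived from file name
                put_object(bucket, key, path.read_bytes())  # one file becomes one object
                print("archived", path, "->", bucket + "/" + key)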

  15. Future computing platforms for science in a power constrained era

    DOE PAGES

    Abdurachmanov, David; Elmer, Peter; Eulisse, Giulio; ...

    2015-12-23

    Power consumption will be a key constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics (HEP). This makes performance-per-watt a crucial metric for selecting cost-efficient computing solutions. For this paper, we have done a wide survey of current and emerging architectures becoming available on the market, including x86-64 variants, ARMv7 32-bit, ARMv8 64-bit, Many-Core and GPU solutions, as well as newer System-on-Chip (SoC) solutions. We compare performance and energy efficiency using an evolving set of standardized HEP-related benchmarks and power measurement techniques we have been developing. In conclusion, we evaluate the potential for use of such computing solutions in the context of DHTC systems, such as the Worldwide LHC Computing Grid (WLCG).
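
    In its simplest form, the performance-per-watt comparison described above amounts to dividing a benchmark's throughput by the average power drawn while it runs; the short Python sketch below shows that calculation with made-up numbers (the input format is an assumption, not the authors' benchmark suite).

        def performance_per_watt(events_processed, elapsed_s, power_samples_w):
            """Benchmark events per joule for one run."""
            avg_power = sum(power_samples_w) / len(power_samples_w)  # watts
            throughput = events_processed / elapsed_s                # events per second
            return throughput / avg_power                            # events per joule

        # Example: 1.2 million events in 600 s at roughly 180 W average draw.
        print(performance_per_watt(1_200_000, 600.0, [178.0, 181.5, 180.2]))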

  16. birgHPC: creating instant computing clusters for bioinformatics and molecular dynamics.

    PubMed

    Chew, Teong Han; Joyce-Tan, Kwee Hong; Akma, Farizuwana; Shamsir, Mohd Shahir

    2011-05-01

    birgHPC, a bootable Linux Live CD has been developed to create high-performance clusters for bioinformatics and molecular dynamics studies using any Local Area Network (LAN)-networked computers. birgHPC features automated hardware and slots detection as well as provides a simple job submission interface. The latest versions of GROMACS, NAMD, mpiBLAST and ClustalW-MPI can be run in parallel by simply booting the birgHPC CD or flash drive from the head node, which immediately positions the rest of the PCs on the network as computing nodes. Thus, a temporary, affordable, scalable and high-performance computing environment can be built by non-computing-based researchers using low-cost commodity hardware. The birgHPC Live CD and relevant user guide are available for free at http://birg1.fbb.utm.my/birghpc.

  17. Center for Technology for Advanced Scientific Component Software (TASCS)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kostadin, Damevski

    A resounding success of the Scientific Discovery through Advanced Computing (SciDAC) program is that high-performance computational science is now universally recognized as a critical aspect of scientific discovery [71], complementing both theoretical and experimental research. As scientific communities prepare to exploit unprecedented computing capabilities of emerging leadership-class machines for multi-model simulations at the extreme scale [72], it is more important than ever to address the technical and social challenges of geographically distributed teams that combine expertise in domain science, applied mathematics, and computer science to build robust and flexible codes that can incorporate changes over time. The Center for Technology for Advanced Scientific Component Software (TASCS) tackles these issues by exploiting component-based software development to facilitate collaborative high-performance scientific computing.

  18. Computer architecture evaluation for structural dynamics computations: Project summary

    NASA Technical Reports Server (NTRS)

    Standley, Hilda M.

    1989-01-01

    The intent of the proposed effort is the examination of the impact of the elements of parallel architectures on the performance realized in a parallel computation. To this end, three major projects are developed: a language for the expression of high level parallelism, a statistical technique for the synthesis of multicomputer interconnection networks based upon performance prediction, and a queueing model for the analysis of shared memory hierarchies.

  19. Colt: an experiment in wormhole run-time reconfiguration

    NASA Astrophysics Data System (ADS)

    Bittner, Ray; Athanas, Peter M.; Musgrove, Mark

    1996-10-01

    Wormhole run-time reconfiguration (RTR) is an attempt to create a refined computing paradigm for high performance computational tasks. By combining concepts from field programmable gate array (FPGA) technologies with data flow computing, the Colt/Stallion architecture achieves high utilization of hardware resources, and facilitates rapid run-time reconfiguration. Targeted mainly at DSP-type operations, the Colt integrated circuit -- a prototype wormhole RTR device -- compares favorably to contemporary DSP alternatives in terms of silicon area consumed per unit computation and in computing performance. Although emphasis has been placed on signal processing applications, general purpose computation has not been overlooked. Colt is a prototype that defines an architecture not only at the chip level but also in terms of an overall system design. As this system is realized, the concept of wormhole RTR will be applied to numerical computation and DSP applications including those common to image processing, communications systems, digital filters, acoustic processing, real-time control systems and simulation acceleration.

  20. A parallel-processing approach to computing for the geographic sciences; applications and systems enhancements

    USGS Publications Warehouse

    Crane, Michael; Steinwand, Dan; Beckmann, Tim; Krpan, Greg; Liu, Shu-Guang; Nichols, Erin; Haga, Jim; Maddox, Brian; Bilderback, Chris; Feller, Mark; Homer, George

    2001-01-01

    The overarching goal of this project is to build a spatially distributed infrastructure for information science research by forming a team of information science researchers and providing them with similar hardware and software tools to perform collaborative research. Four geographically distributed Centers of the U.S. Geological Survey (USGS) are developing their own clusters of low-cost personal computers into parallel computing environments that provide a cost-effective way for the USGS to increase participation in the high-performance computing community. Referred to as Beowulf clusters, these hybrid systems provide the robust computing power required for conducting information science research into parallel computing systems and applications.

  1. A true minimally invasive approach for cochlear implantation: high accuracy in cranial base navigation through flat-panel-based volume computed tomography.

    PubMed

    Majdani, Omid; Bartling, Soenke H; Leinung, Martin; Stöver, Timo; Lenarz, Minoo; Dullin, Christian; Lenarz, Thomas

    2008-02-01

    High-precision intraoperative navigation using high-resolution flat-panel volume computed tomography makes feasible the possibility of minimally invasive cochlear implant surgery, including cochleostomy. Conventional cochlear implant surgery is typically performed via mastoidectomy with facial recess to identify and avoid damage to vital anatomic landmarks. To accomplish this procedure via a minimally invasive approach--without performing mastoidectomy--in a precise fashion, image-guided technology is necessary. With such an approach, surgical time and expertise may be reduced, and hearing preservation may be improved. Flat-panel volume computed tomography was used to scan 4 human temporal bones. A drilling channel was planned preoperatively from the mastoid surface to the round window niche, providing a margin of safety to all functionally important structures (e.g., facial nerve, chorda tympani, incus). Postoperatively, computed tomographic imaging and conventional surgical exploration of the drilled route to the cochlea were performed. All 4 specimens showed a cochleostomy located at the scala tympani anterior inferior to the round window. The chorda tympani was damaged in 1 specimen--this was preoperatively planned, as a narrow facial recess was encountered. Using flat-panel volume computed tomography for image-guided surgical navigation, we were able to perform minimally invasive cochlear implant surgery, defined as a narrow, single-channel mastoidotomy with cochleostomy. Although this finding is preliminary, it is technologically achievable.

  2. Computing Platforms for Big Biological Data Analytics: Perspectives and Challenges.

    PubMed

    Yin, Zekun; Lan, Haidong; Tan, Guangming; Lu, Mian; Vasilakos, Athanasios V; Liu, Weiguo

    2017-01-01

    The last decade has witnessed an explosion in the amount of available biological sequence data, due to the rapid progress of high-throughput sequencing projects. However, the biological data amount is becoming so great that traditional data analysis platforms and methods can no longer meet the need to rapidly perform data analysis tasks in life sciences. As a result, both biologists and computer scientists are facing the challenge of gaining a profound insight into the deepest biological functions from big biological data. This in turn requires massive computational resources. Therefore, high performance computing (HPC) platforms are highly needed as well as efficient and scalable algorithms that can take advantage of these platforms. In this paper, we survey the state-of-the-art HPC platforms for big biological data analytics. We first list the characteristics of big biological data and popular computing platforms. Then we provide a taxonomy of different biological data analysis applications and a survey of the way they have been mapped onto various computing platforms. After that, we present a case study to compare the efficiency of different computing platforms for handling the classical biological sequence alignment problem. At last we discuss the open issues in big biological data analytics.
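
    The survey's case study uses classical sequence alignment as its benchmark; as background, the sketch below shows the textbook Needleman-Wunsch dynamic-programming recurrence in plain Python. Production HPC aligners vectorize and parallelize this recurrence rather than fill the table cell by cell, but the underlying scoring is the same idea.

        def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
            """Global alignment score via the Needleman-Wunsch recurrence."""
            n, m = len(a), len(b)
            score = [[0] * (m + 1) for _ in range(n + 1)]  # score[i][j]: best score for a[:i] vs b[:j]
            for i in range(1, n + 1):
                score[i][0] = i * gap
            for j in range(1, m + 1):
                score[0][j] = j * gap
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
                    score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            return score[n][m]

        print(needleman_wunsch("GATTACA", "GCATGCU"))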

  3. Webinar: Delivering Transformational HPC Solutions to Industry

    ScienceCinema

    Streitz, Frederick

    2018-01-16

    Dr. Frederick Streitz, director of the High Performance Computing Innovation Center, discusses Lawrence Livermore National Laboratory computational capabilities and expertise available to industry in this webinar.

  4. Parallel high-performance grid computing: capabilities and opportunities of a novel demanding service and business class allowing highest resource efficiency.

    PubMed

    Kepper, Nick; Ettig, Ramona; Dickmann, Frank; Stehr, Rene; Grosveld, Frank G; Wedemann, Gero; Knoch, Tobias A

    2010-01-01

    Especially in the life-science and health-care sectors, huge IT requirements are imminent due to the large and complex systems to be analysed and simulated. Grid infrastructures play a rapidly increasing role here for research, diagnostics, and treatment, since they provide the necessary large-scale resources efficiently. Whereas grids were first used for huge number crunching of trivially parallelizable problems, parallel high-performance computing is increasingly required. Here, we show, for the prime example of molecular dynamics simulations, how the presence of large grid clusters with very fast network interconnects within grid infrastructures now allows efficient parallel high-performance grid computing and thus combines the benefits of dedicated super-computing centres and grid infrastructures. The demands of this service class are the highest, since the user group has very heterogeneous requirements: i) two to many thousands of CPUs, ii) different memory architectures, iii) huge storage capabilities, and iv) fast communication via network interconnects are all needed in different combinations and must be considered in a highly dedicated manner to reach the highest performance efficiency. Beyond this, advanced and dedicated i) interaction with users, ii) job management, iii) accounting, and iv) billing not only combine classic with parallel high-performance grid usage but, more importantly, can also increase the efficiency of IT resource providers. Consequently, the mere "yes-we-can" becomes a real opportunity for, e.g., the life-science and health-care sectors as well as for grid infrastructures, through a higher level of resource efficiency.

  5. Computed a multiple band metamaterial absorber and its application based on the figure of merit value

    NASA Astrophysics Data System (ADS)

    Chen, Chao; Sheng, Yuping; Jun, Wang

    2018-01-01

    A high-performance multiple-band metamaterial absorber is designed and computed with the Ansoft HFSS 10.0 software; it is constituted of two kinds of separated metal-particle sub-structures. The multiple-band absorption property of the metamaterial absorber is based on the resonance of localized surface plasmon (LSP) modes excited near the edges of the metal particles. The damping constant of the gold layer is optimized to obtain a near-perfect absorption rate. Four kinds of dielectric layers are computed to achieve the perfect absorption performance. The perfect absorption performance of the metamaterial absorber is enhanced by optimizing the structural parameters (R = 75 nm, w = 80 nm). Moreover, a perfect absorption band is achieved because of the plasmonic hybridization phenomenon between LSP modes. The designed metamaterial absorber shows high sensitivity to changes in the refractive index of the surrounding liquid. A liquid refractive-index sensor strategy is proposed based on the computed figure of merit (FOM) value of the metamaterial absorber. High FOM values (116, 111, and 108) are achieved with three liquids (methanol, carbon tetrachloride, and carbon disulfide).

  6. High-performance computational fluid dynamics: a custom-code approach

    NASA Astrophysics Data System (ADS)

    Fannon, James; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain; Náraigh, Lennon Ó.

    2016-07-01

    We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier-Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFD) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFD, while also providing insight for those interested in more general aspects of high-performance computing.

  7. An Interface for Biomedical Big Data Processing on the Tianhe-2 Supercomputer.

    PubMed

    Yang, Xi; Wu, Chengkun; Lu, Kai; Fang, Lin; Zhang, Yong; Li, Shengkang; Guo, Guixin; Du, YunFei

    2017-12-01

    Big data, cloud computing, and high-performance computing (HPC) are at the verge of convergence. Cloud computing is already playing an active part in big data processing with the help of big data frameworks like Hadoop and Spark. The recent upsurge of high-performance computing in China provides extra possibilities and capacity to address the challenges associated with big data. In this paper, we propose Orion-a big data interface on the Tianhe-2 supercomputer-to enable big data applications to run on Tianhe-2 via a single command or a shell script. Orion supports multiple users, and each user can launch multiple tasks. It minimizes the effort needed to initiate big data applications on the Tianhe-2 supercomputer via automated configuration. Orion follows the "allocate-when-needed" paradigm, and it avoids the idle occupation of computational resources. We tested the utility and performance of Orion using a big genomic dataset and achieved a satisfactory performance on Tianhe-2 with very few modifications to existing applications that were implemented in Hadoop/Spark. In summary, Orion provides a practical and economical interface for big data processing on Tianhe-2.
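
    A very rough Python sketch of the "allocate-when-needed" pattern described above: resources are requested only when a task arrives and released as soon as it finishes. The allocate_nodes, run_on_nodes and release_nodes callables are hypothetical stand-ins for the scheduler and Hadoop/Spark launcher hooks, not Orion's actual interface.

        def run_bigdata_task(task_cmd, num_nodes, allocate_nodes, run_on_nodes, release_nodes):
            """Allocate-when-needed execution of one big data task (illustrative)."""
            nodes = allocate_nodes(num_nodes)          # request nodes only when a task arrives
            try:
                return run_on_nodes(nodes, task_cmd)   # auto-configured Hadoop/Spark run
            finally:
                release_nodes(nodes)                   # never leave resources idly occupied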

  8. Security and Cloud Outsourcing Framework for Economic Dispatch

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sarker, Mushfiqur R.; Wang, Jianhui; Li, Zuyi

    The computational complexity and problem sizes of power grid applications have increased significantly with the advent of renewable resources and smart grid technologies. The current paradigm of solving these issues consists of in-house high-performance computing infrastructures, which have the drawbacks of high capital expenditure, maintenance, and limited scalability. Cloud computing is an ideal alternative due to its powerful computational capacity, rapid scalability, and high cost-effectiveness. A major challenge, however, remains: the highly confidential grid data is susceptible to potential cyberattacks when outsourced to the cloud. In this work, a security and cloud outsourcing framework is developed for the Economic Dispatch (ED) linear programming application. The security framework transforms the ED linear program into a confidentiality-preserving linear program that masks both the data and the problem structure, thus enabling secure outsourcing to the cloud. Results show that for large grid test cases the performance gains and costs outperform those of the in-house infrastructure.

  9. Security and Cloud Outsourcing Framework for Economic Dispatch

    DOE PAGES

    Sarker, Mushfiqur R.; Wang, Jianhui; Li, Zuyi; ...

    2017-04-24

    The computational complexity and problem sizes of power grid applications have increased significantly with the advent of renewable resources and smart grid technologies. The current paradigm of solving these issues consists of in-house high-performance computing infrastructures, which have the drawbacks of high capital expenditure, maintenance, and limited scalability. Cloud computing is an ideal alternative due to its powerful computational capacity, rapid scalability, and high cost-effectiveness. A major challenge, however, remains: the highly confidential grid data is susceptible to potential cyberattacks when outsourced to the cloud. In this work, a security and cloud outsourcing framework is developed for the Economic Dispatch (ED) linear programming application. The security framework transforms the ED linear program into a confidentiality-preserving linear program that masks both the data and the problem structure, thus enabling secure outsourcing to the cloud. Results show that for large grid test cases the performance gains and costs outperform those of the in-house infrastructure.
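
    One standard way to mask a linear program before outsourcing it, broadly consistent with (though not necessarily identical to) the transformation described in this record, is a random affine change of variables plus a positive row scaling, so the cloud solves an equivalent but disguised problem and only the client can unmask the answer. The Python/SciPy sketch below illustrates the idea for a generic LP of the form minimize c·x subject to A·x <= b.

        import numpy as np
        from scipy.optimize import linprog

        def mask_and_solve(c, A, b, seed=0):
            """Outsource min c@x s.t. A@x <= b as a masked, equivalent LP (sketch)."""
            rng = np.random.default_rng(seed)
            n, m = len(c), len(b)
            M = rng.standard_normal((n, n)) + n * np.eye(n)  # random invertible mixing matrix
            r = rng.standard_normal(n)                       # random shift: x = M @ y + r
            D = np.diag(rng.uniform(1.0, 2.0, size=m))       # positive row scaling

            c_mask = M.T @ c                                 # masked objective
            A_mask = D @ A @ M                               # masked constraint matrix
            b_mask = D @ (b - A @ r)                         # masked right-hand side

            # This call plays the role of the untrusted cloud solver.
            res = linprog(c_mask, A_ub=A_mask, b_ub=b_mask, bounds=(None, None))
            return M @ res.x + r                             # unmask the solution locally

        # Tiny example: min -x1 - x2  s.t.  x1 + x2 <= 4, x1 <= 3, x2 <= 3.
        x = mask_and_solve(np.array([-1.0, -1.0]),
                           np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]),
                           np.array([4.0, 3.0, 3.0]))
        print(x)  # some point on the optimal face x1 + x2 = 4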

  10. Building high-performance system for processing a daily large volume of Chinese satellites imagery

    NASA Astrophysics Data System (ADS)

    Deng, Huawu; Huang, Shicun; Wang, Qi; Pan, Zhiqiang; Xin, Yubin

    2014-10-01

    The number of Earth observation satellites from China has increased dramatically in recent years, and those satellites are acquiring a large volume of imagery daily. As the main portal for image processing and distribution from those Chinese satellites, the China Centre for Resources Satellite Data and Application (CRESDA) has been working with PCI Geomatics during the last three years to solve two issues in this regard: processing the large volume of data (about 1,500 scenes or 1 TB per day) in a timely manner and generating geometrically accurate orthorectified products. After three years of research and development, a high-performance system has been built and successfully delivered. The high-performance system has a service-oriented architecture and can be deployed to a cluster of computers that may be configured with high-end computing power. The high performance is gained through, first, parallelizing the image processing algorithms by using high-performance graphics processing unit (GPU) cards and multiple cores from multiple CPUs, and, second, distributing processing tasks to a cluster of computing nodes. While achieving performance up to thirty (and even more) times faster than the traditional practice, a particular methodology was developed to improve the geometric accuracy of images acquired from Chinese satellites (including HJ-1 A/B, ZY-1-02C, ZY-3, GF-1, etc.). The methodology consists of fully automatic collection of dense ground control points (GCPs) from various resources and the application of those points to improve the photogrammetric model of the images. The delivered system is up and running at CRESDA for pre-operational production and has been generating a good return on investment by eliminating a great amount of manual labor and increasing daily data throughput more than tenfold with fewer operators. Future work, such as development of more performance-optimized algorithms, robust image matching methods and application workflows, is identified to improve the system in the coming years.

  11. On the tip of the tongue: learning typing and pointing with an intra-oral computer interface.

    PubMed

    Caltenco, Héctor A; Breidegard, Björn; Struijk, Lotte N S Andreasen

    2014-07-01

    To evaluate typing and pointing performance and improvement over time of four able-bodied participants using an intra-oral tongue-computer interface for computer control. A physically disabled individual may lack the ability to efficiently control standard computer input devices. There have been several efforts to produce and evaluate interfaces that provide individuals with physical disabilities the possibility to control personal computers. Training with the intra-oral tongue-computer interface was performed by playing games over 18 sessions. Skill improvement was measured through typing and pointing exercises at the end of each training session. Typing throughput improved from averages of 2.36 to 5.43 correct words per minute. Pointing throughput improved from averages of 0.47 to 0.85 bits/s. Target tracking performance, measured as relative time on target, improved from averages of 36% to 47%. Path following throughput improved from averages of 0.31 to 0.83 bits/s and decreased to 0.53 bits/s with more difficult tasks. Learning curves support the notion that the tongue can rapidly learn novel motor tasks. Typing and pointing performance of the tongue-computer interface is comparable to performances of other proficient assistive devices, which makes the tongue a feasible input organ for computer control. Intra-oral computer interfaces could provide individuals with severe upper-limb mobility impairments the opportunity to control computers and automatic equipment. Typing and pointing performance of the tongue-computer interface is comparable to performances of other proficient assistive devices, but does not cause fatigue easily and might be invisible to other people, which is highly prioritized by assistive device users. Combination of visual and auditory feedback is vital for a good performance of an intra-oral computer interface and helps to reduce involuntary or erroneous activations.
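
    The pointing throughputs quoted above (in bits/s) follow the usual Fitts'-law convention of dividing an index of difficulty by movement time; the small Python sketch below shows that calculation with made-up target distances and widths, not the values used in the study.

        import math

        def fitts_throughput(distance, width, movement_time_s):
            """Fitts'-law throughput in bits/s (Shannon formulation of the index of difficulty)."""
            index_of_difficulty = math.log2(distance / width + 1.0)  # bits
            return index_of_difficulty / movement_time_s             # bits per second

        # Example: a 400-pixel movement to a 40-pixel target completed in 4.1 s.
        print(round(fitts_throughput(400, 40, 4.1), 2))  # about 0.84 bits/s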

  12. Cognitive Model Exploration and Optimization: A New Challenge for Computational Science

    DTIC Science & Technology

    2010-03-01

    the generation and analysis of computational cognitive models to explain various aspects of cognition. Typically the behavior of these models...computational scale of a workstation, so we have turned to high performance computing (HPC) clusters and volunteer computing for large-scale...computational resources. The majority of applications on the Department of Defense HPC clusters focus on solving partial differential equations (Post

  13. Implementing Molecular Dynamics on Hybrid High Performance Computers - Three-Body Potentials

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, W Michael; Yamada, Masako

    The use of coprocessors or accelerators such as graphics processing units (GPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high-performance computers, defined as machines with nodes containing more than one type of floating-point processor (e.g. CPU and GPU), are now becoming more prevalent due to these advantages. Although there has been extensive research into methods to efficiently use accelerators to improve the performance of molecular dynamics (MD) employing pairwise potential energy models, little is reported in the literature for models that include many-body effects. 3-body terms are required for many popular potentials such as MEAM, Tersoff, REBO, AIREBO, Stillinger-Weber, Bond-Order Potentials, and others. Because the per-atom simulation times are much higher for models incorporating 3-body terms, there is a clear need for efficient algorithms usable on hybrid high performance computers. Here, we report a shared-memory force-decomposition for 3-body potentials that avoids memory conflicts to allow for a deterministic code with substantial performance improvements on hybrid machines. We describe modifications necessary for use in distributed memory MD codes and show results for the simulation of water with Stillinger-Weber on the hybrid Titan supercomputer. We compare performance of the 3-body model to the SPC/E water model when using accelerators. Finally, we demonstrate that our approach can attain a speedup of 5.1 with acceleration on Titan for production simulations to study water droplet freezing on a surface.

  14. Computation Directorate 2008 Annual Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Crawford, D L

    2009-03-25

    Whether a computer is simulating the aging and performance of a nuclear weapon, the folding of a protein, or the probability of rainfall over a particular mountain range, the necessary calculations can be enormous. Our computers help researchers answer these and other complex problems, and each new generation of system hardware and software widens the realm of possibilities. Building on Livermore's historical excellence and leadership in high-performance computing, Computation added more than 331 trillion floating-point operations per second (teraFLOPS) of power to LLNL's computer room floors in 2008. In addition, Livermore's next big supercomputer, Sequoia, advanced ever closer to its 2011-2012 delivery date, as architecture plans and the procurement contract were finalized. Hyperion, an advanced technology cluster test bed that teams Livermore with 10 industry leaders, made a big splash when it was announced during Michael Dell's keynote speech at the 2008 Supercomputing Conference. The Wall Street Journal touted Hyperion as a 'bright spot amid turmoil' in the computer industry. Computation continues to measure and improve the costs of operating LLNL's high-performance computing systems by moving hardware support in-house, by measuring causes of outages to apply resources asymmetrically, and by automating most of the account and access authorization and management processes. These improvements enable more dollars to go toward fielding the best supercomputers for science, while operating them at less cost and greater responsiveness to the customers.

  15. Biocellion: accelerating computer simulation of multicellular biological system models.

    PubMed

    Kang, Seunghwa; Kahan, Simon; McDermott, Jason; Flann, Nicholas; Shmulevich, Ilya

    2014-11-01

    Biological system behaviors are often the outcome of complex interactions among a large number of cells and their biotic and abiotic environment. Computational biologists attempt to understand, predict and manipulate biological system behavior through mathematical modeling and computer simulation. Discrete agent-based modeling (in combination with high-resolution grids to model the extracellular environment) is a popular approach for building biological system models. However, the computational complexity of this approach forces computational biologists to resort to coarser resolution approaches to simulate large biological systems. High-performance parallel computers have the potential to address the computing challenge, but writing efficient software for parallel computers is difficult and time-consuming. We have developed Biocellion, a high-performance software framework, to solve this computing challenge using parallel computers. To support a wide range of multicellular biological system models, Biocellion asks users to provide their model specifics by filling the function body of pre-defined model routines. Using Biocellion, modelers without parallel computing expertise can efficiently exploit parallel computers with less effort than writing sequential programs from scratch. We simulate cell sorting, microbial patterning and a bacterial system in soil aggregate as case studies. Biocellion runs on x86 compatible systems with the 64 bit Linux operating system and is freely available for academic use. Visit http://biocellion.com for additional information. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
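
    Biocellion's "fill in the body of pre-defined model routines" design is essentially a callback (inversion-of-control) interface: the framework owns the parallel time loop and the modeler supplies only per-cell behaviour. The Python sketch below mimics that shape with entirely hypothetical routine names; it is not Biocellion's API.

        class CellModel:
            """User-supplied model routines; the framework calls them, not the other way round."""

            def init_cell(self, cell_id):
                # Initial state of one cell (hypothetical example: alternating types, zero age).
                return {"id": cell_id, "type": "A" if cell_id % 2 == 0 else "B", "age": 0}

            def update_cell(self, cell, neighbours, dt):
                # One time step of per-cell behaviour; here just ageing.
                cell["age"] += dt
                return cell

        def run_framework(model, num_cells=4, num_steps=3, dt=1.0):
            """Stand-in for the framework's (in reality parallel) time loop."""
            cells = [model.init_cell(i) for i in range(num_cells)]
            for _ in range(num_steps):
                cells = [model.update_cell(c, cells, dt) for c in cells]
            return cells

        print(run_framework(CellModel()))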

  16. Supercomputing '91; Proceedings of the 4th Annual Conference on High Performance Computing, Albuquerque, NM, Nov. 18-22, 1991

    NASA Technical Reports Server (NTRS)

    1991-01-01

    Various papers on supercomputing are presented. The general topics addressed include: program analysis/data dependence, memory access, distributed memory code generation, numerical algorithms, supercomputer benchmarks, latency tolerance, parallel programming, applications, processor design, networks, performance tools, mapping and scheduling, characterization affecting performance, parallelism packaging, computing climate change, combinatorial algorithms, hardware and software performance issues, system issues. (No individual items are abstracted in this volume)

  17. BEAGLE: an application programming interface and high-performance computing library for statistical phylogenetics.

    PubMed

    Ayres, Daniel L; Darling, Aaron; Zwickl, Derrick J; Beerli, Peter; Holder, Mark T; Lewis, Paul O; Huelsenbeck, John P; Ronquist, Fredrik; Swofford, David L; Cummings, Michael P; Rambaut, Andrew; Suchard, Marc A

    2012-01-01

    Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.

  18. BEAGLE: An Application Programming Interface and High-Performance Computing Library for Statistical Phylogenetics

    PubMed Central

    Ayres, Daniel L.; Darling, Aaron; Zwickl, Derrick J.; Beerli, Peter; Holder, Mark T.; Lewis, Paul O.; Huelsenbeck, John P.; Ronquist, Fredrik; Swofford, David L.; Cummings, Michael P.; Rambaut, Andrew; Suchard, Marc A.

    2012-01-01

    Abstract Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software. PMID:21963610

  19. Parallel-vector unsymmetric Eigen-Solver on high performance computers

    NASA Technical Reports Server (NTRS)

    Nguyen, Duc T.; Jiangning, Qin

    1993-01-01

    The popular QR algorithm for solving all eigenvalues of an unsymmetric matrix is reviewed. Among the basic components in the QR algorithm, it was concluded from this study, that the reduction of an unsymmetric matrix to a Hessenberg form (before applying the QR algorithm itself) can be done effectively by exploiting the vector speed and multiple processors offered by modern high-performance computers. Numerical examples of several test cases have indicated that the proposed parallel-vector algorithm for converting a given unsymmetric matrix to a Hessenberg form offers computational advantages over the existing algorithm. The time saving obtained by the proposed methods is increased as the problem size increased.
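
    As background to the two stages discussed above, the Python/SciPy sketch below reduces a small unsymmetric matrix to Hessenberg form and then runs the simplest, unshifted QR iteration on it. The test matrix is built with known, well-separated real eigenvalues so the unshifted iteration converges; production eigensolvers (and the parallel-vector variant in this record) use shifted, deflated QR.

        import numpy as np
        from scipy.linalg import hessenberg

        # Unsymmetric test matrix with known eigenvalues 5, 4, 3, 2, 1.
        rng = np.random.default_rng(1)
        S = rng.standard_normal((5, 5)) + 5 * np.eye(5)     # random, safely invertible transform
        A = S @ np.diag([5.0, 4.0, 3.0, 2.0, 1.0]) @ np.linalg.inv(S)

        H = hessenberg(A)        # stage 1: orthogonal reduction to upper Hessenberg form

        T = H.copy()             # stage 2: unshifted QR iteration (illustration only)
        for _ in range(300):
            Q, R = np.linalg.qr(T)
            T = R @ Q            # similarity transform, so eigenvalues are preserved

        print(np.round(np.sort(np.diag(T))[::-1], 3))  # approaches [5. 4. 3. 2. 1.]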

  20. Lawrence Livermore National Laboratories Perspective on Code Development and High Performance Computing Resources in Support of the National HED/ICF Effort

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Clouse, C. J.; Edwards, M. J.; McCoy, M. G.

    2015-07-07

    Through its Advanced Scientific Computing (ASC) and Inertial Confinement Fusion (ICF) code development efforts, Lawrence Livermore National Laboratory (LLNL) provides a world leading numerical simulation capability for the National HED/ICF program in support of the Stockpile Stewardship Program (SSP). In addition the ASC effort provides high performance computing platform capabilities upon which these codes are run. LLNL remains committed to, and will work with, the national HED/ICF program community to help insure numerical simulation needs are met and to make those capabilities available, consistent with programmatic priorities and available resources.

  1. High Performance Computing for Medical Image Interpretation

    DTIC Science & Technology

    1993-10-01

    programme for some key companies in the health care industries (Dumay et al., (1993)). With the modules developed for the "Smart Surgeon" an anatomical model... Interpretation System" (HIPMI 2S) gives FEL-TNO the opportunity to offer its expertise to the Dutch and European civil medical industry ... Spin-off; 3 High Performance Computing: 3.1 Introduction, 3.2 Parallel Processing, 3.3 Artificial Neural Networks, 3.4 European Industry

  2. SUPREM-DSMC: A New Scalable, Parallel, Reacting, Multidimensional Direct Simulation Monte Carlo Flow Code

    NASA Technical Reports Server (NTRS)

    Campbell, David; Wysong, Ingrid; Kaplan, Carolyn; Mott, David; Wadsworth, Dean; VanGilder, Douglas

    2000-01-01

    An AFRL/NRL team has recently been selected to develop a scalable, parallel, reacting, multidimensional (SUPREM) Direct Simulation Monte Carlo (DSMC) code for the DoD user community under the High Performance Computing Modernization Office (HPCMO) Common High Performance Computing Software Support Initiative (CHSSI). This paper will introduce the JANNAF Exhaust Plume community to this three-year development effort and present the overall goals, schedule, and current status of this new code.

  3. Remote control system for high-performance computer simulation of crystal growth by the PFC method

    NASA Astrophysics Data System (ADS)

    Pavlyuk, Evgeny; Starodumov, Ilya; Osipov, Sergei

    2017-04-01

    Modeling of the crystallization process by the phase field crystal (PFC) method is one of the important directions of modern computational materials science. In this paper, the practical side of computer simulation of the crystallization process by the PFC method is investigated. To solve problems using this method, it is necessary to use high-performance computing clusters, data storage systems and other, often expensive, complex computer systems. Access to such resources is often limited, unstable and accompanied by various administrative problems. In addition, the variety of software and settings of different computing clusters sometimes does not allow researchers to use unified program code. There is a need to adapt the program code for each configuration of the computer complex. The practical experience of the authors has shown that the creation of a special control system for computing, with the possibility of remote use, can greatly simplify the implementation of simulations and increase the performance of scientific research. In the current paper we show the principal idea of such a system and justify its efficiency.

  4. The TeraShake Computational Platform for Large-Scale Earthquake Simulations

    NASA Astrophysics Data System (ADS)

    Cui, Yifeng; Olsen, Kim; Chourasia, Amit; Moore, Reagan; Maechling, Philip; Jordan, Thomas

    Geoscientific and computer science researchers with the Southern California Earthquake Center (SCEC) are conducting a large-scale, physics-based, computationally demanding earthquake system science research program with the goal of developing predictive models of earthquake processes. The computational demands of this program continue to increase rapidly as these researchers seek to perform physics-based numerical simulations of earthquake processes for ever larger problems. To meet the needs of this research program, a multiple-institution team coordinated by SCEC has integrated several scientific codes into a numerical modeling-based research tool we call the TeraShake computational platform (TSCP). A central component in the TSCP is a highly scalable earthquake wave propagation simulation program called the TeraShake anelastic wave propagation (TS-AWP) code. In this chapter, we describe how we extended an existing, stand-alone, well-validated, finite-difference, anelastic wave propagation modeling code into the highly scalable and widely used TS-AWP and then integrated this code into the TeraShake computational platform that provides end-to-end (initialization to analysis) research capabilities. We also describe the techniques used to enhance the TS-AWP parallel performance on TeraGrid supercomputers, as well as the TeraShake simulation phases, including input preparation, run time, data archive management, and visualization. As a result of our efforts to improve its parallel efficiency, the TS-AWP has now shown highly efficient strong scaling on over 40K processors on IBM’s BlueGene/L Watson computer. In addition, the TSCP has developed into a computational system that is useful to many members of the SCEC community for performing large-scale earthquake simulations.

  5. Hypersonic Research Vehicle (HRV) real-time flight test support feasibility and requirements study. Part 2: Remote computation support for flight systems functions

    NASA Technical Reports Server (NTRS)

    Rediess, Herman A.; Hewett, M. D.

    1991-01-01

    The requirements are assessed for the use of remote computation to support HRV flight testing. First, remote computational requirements were developed to support functions that will eventually be performed onboard operational vehicles of this type. These are functions that either cannot be performed onboard in the time frame of initial HRV flight test programs, because the technology of airborne computers will not be sufficiently advanced to support the required computational loads, or that are not desirable to perform onboard in the flight test program for other reasons. Second, remote computational support either required or highly desirable for conducting the flight testing itself was addressed. The use of an Automated Flight Management System, which is described in conceptual detail, is proposed. Third, autonomous operations are discussed and, finally, unmanned operations.

  6. Using a Computer-based Messaging System at a High School To Increase School/Home Communication.

    ERIC Educational Resources Information Center

    Burden, Mitzi K.

    Minimal communication between school and home was found to contribute to low performance by students at McDuffie High School (South Carolina). This report describes the experience of establishing a computer-based telephone messaging system in the high school and involving parents, teachers, and students in its use. Additional strategies employed…

  7. Costa - Introduction to 2015 Annual Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Costa, James E.

    In parallel with Sandia National Laboratories having two major locations (NM and CA), along with a number of smaller facilities across the nation, so too is the distribution of scientific, engineering and computing resources. As a part of Sandia’s Institutional Computing Program, CA site-based Sandia computer scientists and engineers have been providing mission and research staff with local CA resident expertise on computing options while also focusing on two growing high performance computing research problems. The first is how to increase system resilience to failure, as machines grow larger, more complex and heterogeneous. The second is how to ensure that computer hardware and configurations are optimized for specialized data analytical mission needs within the overall Sandia computing environment, including the HPC subenvironment. All of these activities support the larger Sandia effort in accelerating development and integration of high performance computing into national security missions. Sandia continues to both promote national R&D objectives, including the recent Presidential Executive Order establishing the National Strategic Computing Initiative, and work to ensure that the full range of computing services and capabilities are available for all mission responsibilities, from national security to energy to homeland defense.

  8. A FRAMEWORK FOR FINE-SCALE COMPUTATIONAL FLUID DYNAMICS AIR QUALITY MODELING AND ANALYSIS

    EPA Science Inventory

    Fine-scale Computational Fluid Dynamics (CFD) simulation of pollutant concentrations within roadway and building microenvironments is feasible using high performance computing. Unlike currently used regulatory air quality models, fine-scale CFD simulations are able to account rig...

  9. WinSCP for Windows File Transfers | High-Performance Computing | NREL

    Science.gov Websites

    WinSCP for Windows File Transfers: WinSCP can be used to securely transfer files between your local computer running Microsoft Windows and a remote computer running Linux.

  10. A C++11 implementation of arbitrary-rank tensors for high-performance computing

    NASA Astrophysics Data System (ADS)

    Aragón, Alejandro M.

    2014-06-01

    This article discusses an efficient implementation of tensors of arbitrary rank by using some of the idioms introduced by the recently published C++ ISO Standard (C++11). With the aim of providing a basic building block for high-performance computing, a single Array class template is carefully crafted, from which vectors, matrices, and even higher-order tensors can be created. An expression template facility is also built around the array class template to provide convenient mathematical syntax. As a result, by using templates, an extra high-level layer is added to the C++ language when dealing with algebraic objects and their operations, without compromising performance. The implementation is tested running on both CPU and GPU.

  11. A C++11 implementation of arbitrary-rank tensors for high-performance computing

    NASA Astrophysics Data System (ADS)

    Aragón, Alejandro M.

    2014-11-01

    This article discusses an efficient implementation of tensors of arbitrary rank by using some of the idioms introduced by the recently published C++ ISO Standard (C++11). With the aim of providing a basic building block for high-performance computing, a single Array class template is carefully crafted, from which vectors, matrices, and even higher-order tensors can be created. An expression template facility is also built around the array class template to provide convenient mathematical syntax. As a result, by using templates, an extra high-level layer is added to the C++ language when dealing with algebraic objects and their operations, without compromising performance. The implementation is tested running on both CPU and GPU.

  12. High-End Computing Challenges in Aerospace Design and Engineering

    NASA Technical Reports Server (NTRS)

    Bailey, F. Ronald

    2004-01-01

    High-End Computing (HEC) has had significant impact on aerospace design and engineering and is poised to make even more in the future. In this paper we describe four aerospace design and engineering challenges: Digital Flight, Launch Simulation, Rocket Fuel System and Digital Astronaut. The paper discusses modeling capabilities needed for each challenge and presents projections of future near and far-term HEC computing requirements. NASA's HEC Project Columbia is described and programming strategies presented that are necessary to achieve high real performance.

  13. An XML-Based Protocol for Distributed Event Services

    NASA Technical Reports Server (NTRS)

    Smith, Warren; Gunter, Dan; Quesnel, Darcy; Biegel, Bryan (Technical Monitor)

    2001-01-01

    A recent trend in distributed computing is the construction of high-performance distributed systems called computational grids. One difficulty we have encountered is that there is no standard format for the representation of performance information and no standard protocol for transmitting this information. This limits the types of performance analysis that can be undertaken in complex distributed systems. To address this problem, we present an XML-based protocol for transmitting performance events in distributed systems and evaluate the performance of this protocol.
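
    As a minimal illustration of serializing a performance event as XML for transmission between grid components, the Python sketch below uses only the standard library; the element and attribute names are invented for the sketch and are not the schema defined in the paper.

        import xml.etree.ElementTree as ET
        from datetime import datetime, timezone

        def performance_event_xml(source, name, value, units):
            """Serialize one performance event as an XML string (illustrative schema)."""
            event = ET.Element("event", attrib={
                "source": source,
                "timestamp": datetime.now(timezone.utc).isoformat(),
            })
            metric = ET.SubElement(event, "metric", attrib={"name": name, "units": units})
            metric.text = str(value)
            return ET.tostring(event, encoding="unicode")

        print(performance_event_xml("nas.node42", "bandwidth", 812.5, "Mbit/s"))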

  14. plasmaFoam: An OpenFOAM framework for computational plasma physics and chemistry

    NASA Astrophysics Data System (ADS)

    Venkattraman, Ayyaswamy; Verma, Abhishek Kumar

    2016-09-01

    As emphasized in the 2012 Roadmap for low temperature plasmas (LTP), scientific computing has emerged as an essential tool for the investigation and prediction of the fundamental physical and chemical processes associated with these systems. While several in-house and commercial codes exist, each having its own advantages and disadvantages, a common framework that can be developed by researchers from all over the world will likely accelerate the impact of computational studies on advances in low-temperature plasma physics and chemistry. In this regard, we present a finite volume computational toolbox to perform high-fidelity simulations of LTP systems. This framework, primarily based on the OpenFOAM solver suite, allows us to enhance our understanding of multiscale plasma phenomena by performing massively parallel, three-dimensional simulations on unstructured meshes using well-established high performance computing tools that are widely used in the computational fluid dynamics community. In this talk, we will present preliminary results obtained using the OpenFOAM-based solver suite with benchmark three-dimensional simulations of microplasma devices including both dielectric and plasma regions. We will also discuss the future outlook for the solver suite.

  15. Analysis of Flowfields over Four-Engine DC-X Rockets

    NASA Technical Reports Server (NTRS)

    Wang, Ten-See; Cornelison, Joni

    1996-01-01

    The objective of this study is to validate a computational methodology for the aerodynamic performance of an advanced conical launch vehicle configuration. The computational methodology is based on a three-dimensional, viscous flow, pressure-based computational fluid dynamics formulation. Both wind-tunnel and ascent flight-test data are used for validation. Emphasis is placed on multiple-engine power-on effects. Computational characterization of the base drag in the critical subsonic regime is the focus of the validation effort; until recently, almost no multiple-engine data existed for a conical launch vehicle configuration. Parametric studies using high-order difference schemes are performed for the cold-flow tests, whereas grid studies are conducted for the flight tests. The computed vehicle axial force coefficients, forebody, aftbody, and base surface pressures compare favorably with those of tests. The results demonstrate that with adequate grid density and proper distribution, a high-order difference scheme, finite rate afterburning kinetics to model the plume chemistry, and a suitable turbulence model to describe separated flows, plume/air mixing, and boundary layers, computational fluid dynamics is a tool that can be used to predict the low-speed aerodynamic performance for rocket design and operations.

  16. Biological and Environmental Research Exascale Requirements Review. An Office of Science review sponsored jointly by Advanced Scientific Computing Research and Biological and Environmental Research, March 28-31, 2016, Rockville, Maryland

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Arkin, Adam; Bader, David C.; Coffey, Richard

    Understanding the fundamentals of genomic systems or the processes governing impactful weather patterns are examples of the types of simulation and modeling performed on the most advanced computing resources in America. High-performance computing and computational science together provide a necessary platform for the mission science conducted by the Biological and Environmental Research (BER) office at the U.S. Department of Energy (DOE). This report reviews BER’s computing needs and their importance for solving some of the toughest problems in BER’s portfolio. BER’s impact on science has been transformative. Mapping the human genome, including the U.S.-supported international Human Genome Project that DOE began in 1987, initiated the era of modern biotechnology and genomics-based systems biology. And since the 1950s, BER has been a core contributor to atmospheric, environmental, and climate science research, beginning with atmospheric circulation studies that were the forerunners of modern Earth system models (ESMs) and by pioneering the implementation of climate codes onto high-performance computers. See http://exascaleage.org/ber/ for more information.

  17. Biological modelling of a computational spiking neural network with neuronal avalanches.

    PubMed

    Li, Xiumin; Chen, Qing; Xue, Fangzheng

    2017-06-28

    In recent years, an increasing number of studies have demonstrated that networks in the brain can self-organize into a critical state where dynamics exhibit a mixture of ordered and disordered patterns. This critical branching phenomenon is termed neuronal avalanches. It has been hypothesized that the homeostatic level balanced between stability and plasticity of this critical state may be the optimal state for performing diverse neural computational tasks. However, the critical region for high performance is narrow and sensitive for spiking neural networks (SNNs). In this paper, we investigated the role of the critical state in neural computations based on liquid-state machines, a biologically plausible computational neural network model for real-time computing. The computational performance of an SNN when operating at the critical state and, in particular, with spike-timing-dependent plasticity for updating synaptic weights is investigated. The network is found to show the best computational performance when it is subjected to critical dynamic states. Moreover, the active-neuron-dominant structure refined from synaptic learning can remarkably enhance the robustness of the critical state and further improve computational accuracy. These results may have important implications in the modelling of spiking neural networks with optimal computational performance.This article is part of the themed issue 'Mathematical methods in medicine: neuroscience, cardiology and pathology'. © 2017 The Author(s).

  18. Biological modelling of a computational spiking neural network with neuronal avalanches

    NASA Astrophysics Data System (ADS)

    Li, Xiumin; Chen, Qing; Xue, Fangzheng

    2017-05-01

    In recent years, an increasing number of studies have demonstrated that networks in the brain can self-organize into a critical state where dynamics exhibit a mixture of ordered and disordered patterns. This critical branching phenomenon is termed neuronal avalanches. It has been hypothesized that the homeostatic level balanced between stability and plasticity of this critical state may be the optimal state for performing diverse neural computational tasks. However, the critical region for high performance is narrow and sensitive for spiking neural networks (SNNs). In this paper, we investigated the role of the critical state in neural computations based on liquid-state machines, a biologically plausible computational neural network model for real-time computing. The computational performance of an SNN when operating at the critical state and, in particular, with spike-timing-dependent plasticity for updating synaptic weights is investigated. The network is found to show the best computational performance when it is subjected to critical dynamic states. Moreover, the active-neuron-dominant structure refined from synaptic learning can remarkably enhance the robustness of the critical state and further improve computational accuracy. These results may have important implications in the modelling of spiking neural networks with optimal computational performance. This article is part of the themed issue `Mathematical methods in medicine: neuroscience, cardiology and pathology'.

  19. Reconfigurable Computing for Computational Science: A New Focus in High Performance Computing

    DTIC Science & Technology

    2006-11-01

    in the past decade. Researchers are regularly employing the power of large computing systems and parallel processing to tackle larger and more...complex problems in all of the physical sciences. For the past decade or so, most of this growth in computing power has been “free” with increased...the scientific computing community as a means to continued growth in computing capability. This paper offers a glimpse of the hardware and

  20. Coupled Aerodynamic and Structural Sensitivity Analysis of a High-Speed Civil Transport

    NASA Technical Reports Server (NTRS)

    Mason, B. H.; Walsh, J. L.

    2001-01-01

    An objective of the High Performance Computing and Communication Program at the NASA Langley Research Center is to demonstrate multidisciplinary shape and sizing optimization of a complete aerospace vehicle configuration by using high-fidelity, finite-element structural analysis and computational fluid dynamics aerodynamic analysis. In a previous study, a multidisciplinary analysis system for a high-speed civil transport was formulated to integrate a set of existing discipline analysis codes, some of them computationally intensive. This paper is an extension of the previous study, in which the sensitivity analysis for the coupled aerodynamic and structural analysis problem is formulated and implemented. Uncoupled stress sensitivities computed with a constant load vector in a commercial finite element analysis code are compared to coupled aeroelastic sensitivities computed by finite differences. The computational expense of these sensitivity calculation methods is discussed.

  1. A High Performance Cloud-Based Protein-Ligand Docking Prediction Algorithm

    PubMed Central

    Chen, Jui-Le; Yang, Chu-Sing

    2013-01-01

    The potential of predicting druggability for a particular disease by integrating biological and computer science technologies has seen success in recent years. Although computer science technologies can be used to reduce the costs of pharmaceutical research, the computation time of structure-based protein-ligand docking prediction remains unsatisfactory. Hence, in this paper, a novel docking prediction algorithm, named fast cloud-based protein-ligand docking prediction algorithm (FCPLDPA), is presented to accelerate docking prediction. The proposed algorithm works by leveraging two high-performance operators: (1) the novel migration (information exchange) operator is designed specially for cloud-based environments to reduce the computation time; (2) the efficient operator is aimed at filtering out the worst search directions. Our simulation results illustrate that the proposed method outperforms the other docking algorithms compared in this paper in terms of both the computation time and the quality of the end result. PMID:23762864

  2. High performance flight computer developed for deep space applications

    NASA Technical Reports Server (NTRS)

    Bunker, Robert L.

    1993-01-01

    The development of an advanced space flight computer for real-time embedded deep space applications, which embodies the lessons learned on Galileo and modern computer technology, is described. The requirements are listed and the design implementation that meets those requirements is described. The development of SPACE-16 (Spaceborne Advanced Computing Engine), where 16 designates the databus width, was initiated to support the MM2 (Mariner Mark 2) project. The computer is based on a radiation-hardened emulation of a modern 32-bit microprocessor and its family of support devices, including a high performance floating point accelerator. Additional custom devices, which include a coprocessor to improve input/output capabilities, a memory interface chip, and a support chip that manages all fault-tolerant features, are also described. Detailed supporting analyses and rationale that justify specific design and architectural decisions are provided. The six chip types were designed and fabricated. Testing and evaluation of a brassboard was initiated.

  3. DeepX: Deep Learning Accelerator for Restricted Boltzmann Machine Artificial Neural Networks.

    PubMed

    Kim, Lok-Won

    2018-05-01

    Although there have been many decades of research and commercial development of high-performance general-purpose processors, there are still many applications that require fully customized hardware architectures for further computational acceleration. Recently, deep learning has been used successfully in a wide variety of applications, but its heavy computational demand has considerably limited its practical deployment. This paper proposes a fully pipelined acceleration architecture to alleviate the high computational demand of restricted Boltzmann machine (RBM) artificial neural networks (ANNs). The implemented RBM ANN accelerator (integrating network size, using 128 input cases per batch, and running at a 303-MHz clock frequency) integrated in a state-of-the-art field-programmable gate array (FPGA) (Xilinx Virtex 7 XC7V-2000T) provides a computational performance of 301-billion connection-updates-per-second and about 193 times higher performance than a software solution running on general purpose processors. Most importantly, the architecture enables over 4 times (12 times in batch learning) higher performance compared with a previous work when both are implemented in an FPGA device (XC2VP70).
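
    The connection-updates-per-second figure above counts the multiply-accumulate work in the RBM training kernel. As a point of reference for what that kernel does, the sketch below implements one contrastive-divergence (CD-1) weight update for a binary RBM in NumPy; the layer sizes, learning rate, and random data are illustrative assumptions, and the code is not a model of the paper's FPGA pipeline.

        import numpy as np

        rng = np.random.default_rng(0)

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        def cd1_update(W, b_vis, b_hid, batch, lr=0.01):
            """One CD-1 weight update for a binary RBM on a mini-batch.

            batch: (n_cases, n_visible) binary data; W: (n_visible, n_hidden).
            """
            # Positive phase: sample hidden units from the data.
            p_h = sigmoid(batch @ W + b_hid)
            h = (rng.random(p_h.shape) < p_h).astype(float)
            # Negative phase: one Gibbs step back to the visible layer and up again.
            p_v = sigmoid(h @ W.T + b_vis)
            p_h_recon = sigmoid(p_v @ W + b_hid)
            n = batch.shape[0]
            # Connection updates: difference of data and reconstruction statistics.
            W += lr * (batch.T @ p_h - p_v.T @ p_h_recon) / n
            b_vis += lr * (batch - p_v).mean(axis=0)
            b_hid += lr * (p_h - p_h_recon).mean(axis=0)
            return W, b_vis, b_hid

        # Hypothetical sizes; an accelerator pipelines these dense products in hardware.
        n_vis, n_hid, n_cases = 256, 256, 128
        W = 0.01 * rng.standard_normal((n_vis, n_hid))
        b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)
        data = (rng.random((n_cases, n_vis)) < 0.5).astype(float)
        cd1_update(W, b_v, b_h, data)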

  4. Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

    NASA Astrophysics Data System (ADS)

    Shao, Meiyue; Aktulga, H. Metin; Yang, Chao; Ng, Esmond G.; Maris, Pieter; Vary, James P.

    2018-01-01

    We describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses for the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of the special structure of the nuclear configuration interaction problem, which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. We also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos-based algorithm for problems of moderate size on a Cray XC30 system.
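
    The same replace-Lanczos-with-a-preconditioned-block-solver idea can be tried with off-the-shelf tools. The sketch below runs SciPy's LOBPCG on a stand-in sparse symmetric matrix with a simple Jacobi (diagonal) preconditioner and a random starting block; the matrix, preconditioner, and block size are assumptions for illustration and do not reproduce the nuclear CI structure or the paper's tailored preconditioner.

        import numpy as np
        import scipy.sparse as sp
        from scipy.sparse.linalg import lobpcg

        rng = np.random.default_rng(1)

        # Stand-in sparse symmetric "Hamiltonian"; the real CI matrix has special
        # structure (exploited in the paper) that is not modeled here.
        n = 2000
        diag = np.arange(1, n + 1, dtype=float)
        offdiag = sp.random(n, n, density=1e-3, random_state=1)
        H = sp.diags(diag) + 0.5 * (offdiag + offdiag.T)

        # Jacobi (diagonal) preconditioner, a simple stand-in for the effective
        # preconditioner described in the paper.
        M = sp.diags(1.0 / H.diagonal())

        # Block of starting guesses; the paper constructs good guesses from
        # smaller model spaces, here we just use random vectors.
        k = 8
        X = rng.standard_normal((n, k))

        eigvals, eigvecs = lobpcg(H, X, M=M, largest=False, tol=1e-6, maxiter=500)
        print("lowest eigenvalues:", np.sort(eigvals))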

  5. HPC Annual Report 2017

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dennig, Yasmin

    Sandia National Laboratories has a long history of significant contributions to the high performance computing community and industry. Our innovative computer architectures allowed the United States to become the first to break the teraFLOP barrier—propelling us to the international spotlight. Our advanced simulation and modeling capabilities have been integral in high consequence US operations such as Operation Burnt Frost. Strong partnerships with industry leaders, such as Cray, Inc. and Goodyear, have enabled them to leverage our high performance computing (HPC) capabilities to gain a tremendous competitive edge in the marketplace. As part of our continuing commitment to providing modern computing infrastructure and systems in support of Sandia missions, we made a major investment in expanding Building 725 to serve as the new home of HPC systems at Sandia. Work is expected to be completed in 2018 and will result in a modern facility of approximately 15,000 square feet of computer center space. The facility will be ready to house the newest National Nuclear Security Administration/Advanced Simulation and Computing (NNSA/ASC) prototype platform being acquired by Sandia, with delivery in late 2019 or early 2020. This new system will enable continuing advances by Sandia science and engineering staff in the areas of operating system R&D, operational cost effectiveness (power and innovative cooling technologies), user environment, and application code performance.

  6. SISYPHUS: A high performance seismic inversion factory

    NASA Astrophysics Data System (ADS)

    Gokhberg, Alexey; Simutė, Saulė; Boehm, Christian; Fichtner, Andreas

    2016-04-01

    In recent years, massively parallel high performance computers have become the standard instruments for solving forward and inverse problems in seismology. The software packages dedicated to forward and inverse waveform modelling and specially designed for such computers (SPECFEM3D, SES3D) have become mature and widely available. These packages achieve significant computational performance and provide researchers with an opportunity to solve larger problems at higher resolution within a shorter time. However, a typical seismic inversion process contains various activities that are beyond the common solver functionality. They include management of information on seismic events and stations, 3D models, observed and synthetic seismograms, pre-processing of the observed signals, computation of misfits and adjoint sources, minimization of misfits, and process workflow management. These activities are time consuming, seldom sufficiently automated, and therefore represent a bottleneck that can substantially offset the performance benefits provided by even the most powerful modern supercomputers. Furthermore, the typical system architecture of modern supercomputing platforms is oriented towards maximum computational performance and provides limited standard facilities for automation of the supporting activities. We present a prototype solution that automates all aspects of the seismic inversion process and is tuned for modern massively parallel high performance computing systems. We address several major aspects of the solution architecture, which include (1) design of an inversion state database for tracing all relevant aspects of the entire solution process, (2) design of an extensible workflow management framework, (3) integration with wave propagation solvers, (4) integration with optimization packages, (5) computation of misfits and adjoint sources, and (6) process monitoring. The inversion state database is a hierarchical structure with branches for the static process setup, inversion iterations, and solver runs, each branch specifying information at the event, station, and channel levels. The workflow management framework is based on an embedded scripting engine that allows definition of various workflow scenarios using a high-level scripting language and provides access to all available inversion components represented as standard library functions. At present the SES3D wave propagation solver is integrated in the solution; work is in progress on interfacing with SPECFEM3D. A separate framework is designed for interoperability with an optimization module; the workflow manager and the optimization process run in parallel and cooperate by exchanging messages according to a specially designed protocol. A library of high-performance modules implementing signal pre-processing, misfit, and adjoint computations according to established good practices is included. Monitoring is based on information stored in the inversion state database and at present implements a command-line interface; design of a graphical user interface is in progress. The software design fits well into the common massively parallel system architecture featuring a large number of computational nodes running distributed applications under the control of batch-oriented resource managers. The solution prototype has been implemented on the "Piz Daint" supercomputer provided by the Swiss National Supercomputing Centre (CSCS).
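
    As a concrete illustration of the hierarchical inversion state database described above (static setup, then per-iteration and per-run records carrying event/station/channel information), the sketch below stores that hierarchy as directories of JSON files; the field names and layout are assumptions, not the SISYPHUS implementation.

        import json
        from pathlib import Path

        # Minimal file-based sketch of a hierarchical inversion-state layout:
        # static setup at the root, then per-iteration, per-run misfit records.
        STATE_ROOT = Path("inversion_state")

        def init_state(setup):
            STATE_ROOT.mkdir(exist_ok=True)
            (STATE_ROOT / "setup.json").write_text(json.dumps(setup, indent=2))

        def record_run(iteration, run_id, event, station_misfits):
            """Store one solver run: per-station, per-channel misfits for one event."""
            run_dir = STATE_ROOT / f"iter_{iteration:03d}" / f"run_{run_id:03d}"
            run_dir.mkdir(parents=True, exist_ok=True)
            record = {"event": event, "stations": station_misfits}
            (run_dir / "misfits.json").write_text(json.dumps(record, indent=2))

        # Hypothetical setup and misfit values, for illustration only.
        init_state({"model": "initial_3d_model", "solver": "SES3D"})
        record_run(1, 1, "event_2016_001",
                   {"STA01": {"Z": 0.42, "N": 0.37}, "STA02": {"Z": 0.51}})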

  7. Application of the Linux cluster for exhaustive window haplotype analysis using the FBAT and Unphased programs.

    PubMed

    Mishima, Hiroyuki; Lidral, Andrew C; Ni, Jun

    2008-05-28

    Genetic association studies have been used to map disease-causing genes. A newly introduced statistical method, called exhaustive haplotype association study, analyzes genetic information consisting of different numbers and combinations of DNA sequence variations along a chromosome. Such studies involve a large number of statistical calculations and consequently require high computing power. It is possible to develop parallel algorithms and codes to perform the calculations on a high performance computing (HPC) system. However, most existing, commonly used statistical packages for genetic studies are non-parallel versions. Alternatively, one may use grid computing technology and its packages to run non-parallel genetic statistical packages on a centralized HPC system or on distributed computing systems. In this paper, we report the use of a queuing scheduler built on the Grid Engine and run on a Rocks Linux cluster for our genetic statistical studies. Analysis of both consecutive and combinational window haplotypes was conducted with the FBAT (Laird et al., 2000) and Unphased (Dudbridge, 2003) programs. The dataset consisted of 26 loci from 277 extended families (1484 persons). Using the Rocks Linux cluster with 22 compute nodes, FBAT jobs ran about 14.4-15.9 times faster, while Unphased jobs ran 1.1-18.6 times faster, compared with the accumulated (serial) computation time. Execution of exhaustive haplotype analysis using non-parallel software packages on a Linux-based system is an effective and efficient approach in terms of cost and performance.
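
    The scheduling pattern described above, i.e. many independent single-window analyses farmed out through the Grid Engine, can be sketched as an array job in which each task handles one haplotype window. In the sketch below the window enumeration follows the 26-locus setup, but the run_fbat_window.sh wrapper and file names are placeholders rather than the actual FBAT or Unphased invocation.

        import subprocess
        from pathlib import Path

        # Enumerate consecutive window haplotypes over 26 loci (window sizes 2..5),
        # one analysis task per window, mirroring the exhaustive-window setup.
        N_LOCI = 26
        windows = [(start, start + size - 1)
                   for size in range(2, 6)
                   for start in range(1, N_LOCI - size + 2)]

        Path("windows.txt").write_text(
            "\n".join(f"{a} {b}" for a, b in windows) + "\n")

        # SGE array job: $SGE_TASK_ID selects one window per task.  The
        # "run_fbat_window.sh" wrapper is a placeholder for however FBAT is
        # actually driven on one window of the pedigree file.
        script = """#!/bin/bash
        #$ -S /bin/bash
        #$ -cwd
        #$ -N fbat_windows
        read START END < <(sed -n "${SGE_TASK_ID}p" windows.txt)
        ./run_fbat_window.sh pedigree.ped "$START" "$END" > "window_${START}_${END}.out"
        """
        Path("fbat_array.sh").write_text(script)

        # Submit one task per window to the Grid Engine queue.
        subprocess.run(["qsub", "-t", f"1-{len(windows)}", "fbat_array.sh"], check=True)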

  8. Application of the Linux cluster for exhaustive window haplotype analysis using the FBAT and Unphased programs

    PubMed Central

    Mishima, Hiroyuki; Lidral, Andrew C; Ni, Jun

    2008-01-01

    Background Genetic association studies have been used to map disease-causing genes. A newly introduced statistical method, called exhaustive haplotype association study, analyzes genetic information consisting of different numbers and combinations of DNA sequence variations along a chromosome. Such studies involve a large number of statistical calculations and consequently require high computing power. It is possible to develop parallel algorithms and codes to perform the calculations on a high performance computing (HPC) system. However, most existing, commonly used statistical packages for genetic studies are non-parallel versions. Alternatively, one may use grid computing technology and its packages to run non-parallel genetic statistical packages on a centralized HPC system or on distributed computing systems. In this paper, we report the use of a queuing scheduler built on the Grid Engine and run on a Rocks Linux cluster for our genetic statistical studies. Results Analysis of both consecutive and combinational window haplotypes was conducted with the FBAT (Laird et al., 2000) and Unphased (Dudbridge, 2003) programs. The dataset consisted of 26 loci from 277 extended families (1484 persons). Using the Rocks Linux cluster with 22 compute nodes, FBAT jobs ran about 14.4–15.9 times faster, while Unphased jobs ran 1.1–18.6 times faster, compared with the accumulated (serial) computation time. Conclusion Execution of exhaustive haplotype analysis using non-parallel software packages on a Linux-based system is an effective and efficient approach in terms of cost and performance. PMID:18541045

  9. Final Report for DOE Award ER25756

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kesselman, Carl

    2014-11-17

    The SciDAC-funded Center for Enabling Distributed Petascale Science (CEDPS) was established to address technical challenges that arise due to the frequent geographic distribution of data producers (in particular, supercomputers and scientific instruments) and data consumers (people and computers) within the DOE laboratory system. Its goal is to produce technical innovations that meet DOE end-user needs for (a) rapid and dependable placement of large quantities of data within a distributed high-performance environment, and (b) the convenient construction of scalable science services that provide for the reliable and high-performance processing of computation and data analysis requests from many remote clients. The Center is also addressing (c) the important problem of troubleshooting these and other related ultra-high-performance distributed activities from the perspective of both performance and functionality.

  10. Heterogeneous Distributed Computing for Computational Aerosciences

    NASA Technical Reports Server (NTRS)

    Sunderam, Vaidy S.

    1998-01-01

    The research supported under this award focuses on heterogeneous distributed computing for high-performance applications, with particular emphasis on computational aerosciences. The overall goal of this project was to investigate issues in, and develop solutions for, the efficient execution of computational aeroscience codes in heterogeneous concurrent computing environments. In particular, we worked in the context of the PVM[1] system and, subsequent to detailed conversion efforts and performance benchmarking, devised novel techniques to increase the efficacy of heterogeneous networked environments for computational aerosciences. Our work has been based upon the NAS Parallel Benchmark suite, but has also recently expanded in scope to include the NAS I/O benchmarks as specified in the NHT-1 document. In this report we summarize our research accomplishments under the auspices of the grant.

  11. Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

    NASA Technical Reports Server (NTRS)

    Sun, Xian-He

    1997-01-01

    Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as the Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is to 1) develop highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporate the newly developed algorithms into actual simulation packages. This work plan has been achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry that better describes the fluid, and (2) using a compact scheme to gain high-order accuracy in the numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.
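
    The partition-and-distribute issue mentioned above is easiest to see in a one-dimensional model problem. The sketch below is a minimal mpi4py Jacobi iteration for the Poisson equation with a 1-D domain decomposition and ghost-cell exchange; it illustrates only the basic partitioning and communication pattern, not the PDD/RPDD algorithms or compact-scheme solvers developed in the project.

        # Minimal 1-D domain-decomposed Jacobi sweep for -u'' = f on [0,1].
        # Run with e.g.: mpiexec -n 4 python jacobi1d.py
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        N = 1024                      # global interior points (assumed divisible by size)
        n = N // size                 # local interior points per rank
        h = 1.0 / (N + 1)
        f = np.ones(n)                # right-hand side f = 1
        u = np.zeros(n + 2)           # local solution with two ghost cells

        left = rank - 1 if rank > 0 else MPI.PROC_NULL
        right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

        for it in range(2000):
            # Exchange ghost cells with neighbours (boundary ranks talk to PROC_NULL,
            # so the zero Dirichlet boundary values are preserved).
            comm.Sendrecv(u[1:2], dest=left, recvbuf=u[n + 1:n + 2], source=right)
            comm.Sendrecv(u[n:n + 1], dest=right, recvbuf=u[0:1], source=left)
            # Jacobi update on interior points (RHS uses old values only).
            u[1:n + 1] = 0.5 * (u[0:n] + u[2:n + 2] + h * h * f)

        local_max = np.max(u[1:n + 1])
        global_max = comm.allreduce(local_max, op=MPI.MAX)
        if rank == 0:
            print("max u after 2000 Jacobi sweeps:", global_max)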

  12. Performance Evaluation of Communication Software Systems for Distributed Computing

    NASA Technical Reports Server (NTRS)

    Fatoohi, Rod

    1996-01-01

    In recent years there has been increasing interest in object-oriented distributed computing, since it is better equipped to deal with complex systems while providing extensibility, maintainability, and reusability. At the same time, several new high-speed network technologies have emerged for local and wide area networks. However, the performance of networking software is not improving as fast as the networking hardware and the workstation microprocessors. This paper gives an overview of and evaluates the performance of the Common Object Request Broker Architecture (CORBA) standard in a distributed computing environment at NASA Ames Research Center. The environment consists of two testbeds of SGI workstations connected by four networks: Ethernet, FDDI, HiPPI, and ATM. The performance results for three communication software systems are presented, analyzed, and compared. These systems are: the BSD socket programming interface; IONA's Orbix, an implementation of the CORBA specification; and the PVM message passing library. The results show that high-level communication interfaces, such as CORBA and PVM, can achieve reasonable performance under certain conditions.
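
    Comparisons such as the one above typically rest on simple ping-pong benchmarks that measure round-trip time as a function of message size. The sketch below implements such a benchmark with the plain BSD-style socket interface in Python; the host, port, message size, and iteration count are arbitrary choices, and the CORBA and PVM layers evaluated in the paper are not shown.

        import socket
        import time

        MSG_SIZE = 1024                    # bytes per message (vary to sweep latency/bandwidth)
        N_ITER = 1000
        HOST, PORT = "127.0.0.1", 50007    # replace with a remote testbed host

        def run_server():
            """Echo server: bounce each message straight back."""
            with socket.create_server((HOST, PORT)) as srv:
                conn, _ = srv.accept()
                with conn:
                    while True:
                        data = conn.recv(MSG_SIZE)
                        if not data:
                            break
                        conn.sendall(data)

        def run_client():
            """Measure mean round-trip time over N_ITER ping-pong exchanges."""
            payload = b"x" * MSG_SIZE
            with socket.create_connection((HOST, PORT)) as s:
                t0 = time.perf_counter()
                for _ in range(N_ITER):
                    s.sendall(payload)
                    received = 0
                    while received < MSG_SIZE:
                        received += len(s.recv(MSG_SIZE - received))
                elapsed = time.perf_counter() - t0
            print(f"mean round-trip: {1e6 * elapsed / N_ITER:.1f} us "
                  f"for {MSG_SIZE}-byte messages")

        if __name__ == "__main__":
            import sys
            run_server() if "--server" in sys.argv else run_client()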

  13. Optimization of tomographic reconstruction workflows on geographically distributed resources

    DOE PAGES

    Bicer, Tekin; Gursoy, Doga; Kettimuthu, Rajkumar; ...

    2016-01-01

    New technological advancements in synchrotron light sources enable data acquisition at unprecedented levels. This emergent trend affects not only the size of the generated data but also the need for larger computational resources. Although beamline scientists and users have access to local computational resources, these are typically limited and can result in extended execution times. Applications that are based on iterative processing, as in tomographic reconstruction methods, require high-performance compute clusters for timely analysis of data. Here, time-sensitive analysis and processing of Advanced Photon Source data on geographically distributed resources are focused on. Two main challenges are considered: (i) modeling of the performance of tomographic reconstruction workflows and (ii) transparent execution of these workflows on distributed resources. For the former, three main stages are considered: (i) data transfer between storage and computational resources, (ii) wait/queue time of reconstruction jobs at compute resources, and (iii) computation of reconstruction tasks. These performance models allow evaluation and estimation of the execution time of any given iterative tomographic reconstruction workflow that runs on geographically distributed resources. For the latter challenge, a workflow management system is built, which can automate the execution of workflows and minimize the user interaction with the underlying infrastructure. The system utilizes Globus to perform secure and efficient data transfer operations. The proposed models and the workflow management system are evaluated by using three high-performance computing and two storage resources, all of which are geographically distributed. Workflows were created with different computational requirements using two compute-intensive tomographic reconstruction algorithms. Experimental evaluation shows that the proposed models and system can be used for selecting the optimum resources, which in turn can provide up to 3.13× speedup (on the evaluated resources). Furthermore, the error rates of the models range between 2.1 and 23.3% (considering workflow execution times), where the accuracy of the model estimations increases with higher computational demands in reconstruction tasks.
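
    The three-stage performance model described above (data transfer, queue wait, computation) can be written down directly as a small cost function and used to rank candidate resources. The sketch below does exactly that with invented bandwidth, queue-wait, and reconstruction-throughput numbers; the real model is calibrated from measured transfer rates, scheduler statistics, and per-algorithm compute times.

        from dataclasses import dataclass

        @dataclass
        class Resource:
            name: str
            bandwidth_gbps: float      # storage -> compute transfer rate
            queue_wait_s: float        # expected scheduler wait time
            recon_rate_gbs: float      # data processed per second by the reconstruction

        def estimated_time(dataset_gb: float, r: Resource) -> float:
            """Execution-time model: transfer + queue wait + computation."""
            transfer = dataset_gb * 8.0 / r.bandwidth_gbps   # seconds
            compute = dataset_gb / r.recon_rate_gbs          # seconds
            return transfer + r.queue_wait_s + compute

        # Invented example resources, for illustration only.
        resources = [
            Resource("local_cluster", bandwidth_gbps=10.0, queue_wait_s=30.0, recon_rate_gbs=0.05),
            Resource("remote_hpc_a", bandwidth_gbps=40.0, queue_wait_s=600.0, recon_rate_gbs=0.50),
            Resource("remote_hpc_b", bandwidth_gbps=25.0, queue_wait_s=120.0, recon_rate_gbs=0.25),
        ]

        dataset_gb = 200.0
        best = min(resources, key=lambda r: estimated_time(dataset_gb, r))
        for r in resources:
            print(f"{r.name:14s} {estimated_time(dataset_gb, r):8.1f} s")
        print("selected resource:", best.name)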

  14. Optimization of tomographic reconstruction workflows on geographically distributed resources

    PubMed Central

    Bicer, Tekin; Gürsoy, Doǧa; Kettimuthu, Rajkumar; De Carlo, Francesco; Foster, Ian T.

    2016-01-01

    New technological advancements in synchrotron light sources enable data acquisition at unprecedented levels. This emergent trend affects not only the size of the generated data but also the need for larger computational resources. Although beamline scientists and users have access to local computational resources, these are typically limited and can result in extended execution times. Applications that are based on iterative processing, as in tomographic reconstruction methods, require high-performance compute clusters for timely analysis of data. Here, time-sensitive analysis and processing of Advanced Photon Source data on geographically distributed resources are focused on. Two main challenges are considered: (i) modeling of the performance of tomographic reconstruction workflows and (ii) transparent execution of these workflows on distributed resources. For the former, three main stages are considered: (i) data transfer between storage and computational resources, (ii) wait/queue time of reconstruction jobs at compute resources, and (iii) computation of reconstruction tasks. These performance models allow evaluation and estimation of the execution time of any given iterative tomographic reconstruction workflow that runs on geographically distributed resources. For the latter challenge, a workflow management system is built, which can automate the execution of workflows and minimize the user interaction with the underlying infrastructure. The system utilizes Globus to perform secure and efficient data transfer operations. The proposed models and the workflow management system are evaluated by using three high-performance computing and two storage resources, all of which are geographically distributed. Workflows were created with different computational requirements using two compute-intensive tomographic reconstruction algorithms. Experimental evaluation shows that the proposed models and system can be used for selecting the optimum resources, which in turn can provide up to 3.13× speedup (on the evaluated resources). Moreover, the error rates of the models range between 2.1 and 23.3% (considering workflow execution times), where the accuracy of the model estimations increases with higher computational demands in reconstruction tasks. PMID:27359149

  15. Optimization of tomographic reconstruction workflows on geographically distributed resources

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bicer, Tekin; Gursoy, Doga; Kettimuthu, Rajkumar

    New technological advancements in synchrotron light sources enable data acquisition at unprecedented levels. This emergent trend affects not only the size of the generated data but also the need for larger computational resources. Although beamline scientists and users have access to local computational resources, these are typically limited and can result in extended execution times. Applications that are based on iterative processing, as in tomographic reconstruction methods, require high-performance compute clusters for timely analysis of data. Here, time-sensitive analysis and processing of Advanced Photon Source data on geographically distributed resources are focused on. Two main challenges are considered: (i) modeling of the performance of tomographic reconstruction workflows and (ii) transparent execution of these workflows on distributed resources. For the former, three main stages are considered: (i) data transfer between storage and computational resources, (ii) wait/queue time of reconstruction jobs at compute resources, and (iii) computation of reconstruction tasks. These performance models allow evaluation and estimation of the execution time of any given iterative tomographic reconstruction workflow that runs on geographically distributed resources. For the latter challenge, a workflow management system is built, which can automate the execution of workflows and minimize the user interaction with the underlying infrastructure. The system utilizes Globus to perform secure and efficient data transfer operations. The proposed models and the workflow management system are evaluated by using three high-performance computing and two storage resources, all of which are geographically distributed. Workflows were created with different computational requirements using two compute-intensive tomographic reconstruction algorithms. Experimental evaluation shows that the proposed models and system can be used for selecting the optimum resources, which in turn can provide up to 3.13× speedup (on the evaluated resources). Furthermore, the error rates of the models range between 2.1 and 23.3% (considering workflow execution times), where the accuracy of the model estimations increases with higher computational demands in reconstruction tasks.

  16. Akuna: An Open Source User Environment for Managing Subsurface Simulation Workflows

    NASA Astrophysics Data System (ADS)

    Freedman, V. L.; Agarwal, D.; Bensema, K.; Finsterle, S.; Gable, C. W.; Keating, E. H.; Krishnan, H.; Lansing, C.; Moeglein, W.; Pau, G. S. H.; Porter, E.; Scheibe, T. D.

    2014-12-01

    The U.S. Department of Energy (DOE) is investing in development of a numerical modeling toolset called ASCEM (Advanced Simulation Capability for Environmental Management) to support modeling analyses at legacy waste sites. ASCEM is an open source and modular computing framework that incorporates new advances and tools for predicting contaminant fate and transport in natural and engineered systems. The ASCEM toolset includes both a Platform with Integrated Toolsets (called Akuna) and a High-Performance Computing multi-process simulator (called Amanzi). The focus of this presentation is on Akuna, an open-source user environment that manages subsurface simulation workflows and associated data and metadata. In this presentation, key elements of Akuna are demonstrated, which includes toolsets for model setup, database management, sensitivity analysis, parameter estimation, uncertainty quantification, and visualization of both model setup and simulation results. A key component of the workflow is in the automated job launching and monitoring capabilities, which allow a user to submit and monitor simulation runs on high-performance, parallel computers. Visualization of large outputs can also be performed without moving data back to local resources. These capabilities make high-performance computing accessible to the users who might not be familiar with batch queue systems and usage protocols on different supercomputers and clusters.

  17. Implementing Molecular Dynamics on Hybrid High Performance Computers - Particle-Particle Particle-Mesh

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, W Michael; Kohlmeyer, Axel; Plimpton, Steven J

    The use of accelerators such as graphics processing units (GPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high-performance computers, machines with nodes containing more than one type of floating-point processor (e.g. CPU and GPU), are now becoming more prevalent due to these advantages. In this paper, we present a continuation of previous work implementing algorithms for using accelerators into the LAMMPS molecular dynamics software for distributed memory parallel hybrid machines. In our previous work, we focused on acceleration for short-range models with an approach intended to harness the processing power of both the accelerator and (multi-core) CPUs. To augment the existing implementations, we present an efficient implementation of long-range electrostatic force calculation for molecular dynamics. Specifically, we present an implementation of the particle-particle particle-mesh method based on the work by Harvey and De Fabritiis. We present benchmark results on the Keeneland InfiniBand GPU cluster. We provide a performance comparison of the same kernels compiled with both CUDA and OpenCL. We discuss limitations to parallel efficiency and future directions for improving performance on hybrid or heterogeneous computers.
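
    For orientation, the mesh half of the particle-particle particle-mesh method spreads charges onto a grid, solves the Poisson equation in k-space, and reads the result back. The NumPy sketch below does this with nearest-grid-point charge assignment on a periodic box; the box size, mesh resolution, and charges are illustrative, and the short-range particle-particle sum, higher-order assignment, and GPU kernels of the actual implementation are omitted.

        import numpy as np

        rng = np.random.default_rng(2)

        L, M = 10.0, 32                   # box length and mesh points per dimension
        n_atoms = 100
        pos = rng.random((n_atoms, 3)) * L
        q = rng.choice([-1.0, 1.0], size=n_atoms)
        q -= q.mean()                     # enforce a charge-neutral system

        # 1) Charge assignment: nearest grid point (the real code uses higher order).
        rho = np.zeros((M, M, M))
        idx = np.floor(pos / L * M).astype(int) % M
        np.add.at(rho, (idx[:, 0], idx[:, 1], idx[:, 2]), q)
        rho /= (L / M) ** 3               # convert to charge density

        # 2) Poisson solve in k-space: phi_k = 4*pi*rho_k / k^2 (Gaussian units).
        k = 2.0 * np.pi * np.fft.fftfreq(M, d=L / M)
        kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
        k2 = kx**2 + ky**2 + kz**2
        rho_k = np.fft.fftn(rho)
        phi_k = np.zeros_like(rho_k)
        np.divide(4.0 * np.pi * rho_k, k2, out=phi_k, where=k2 > 0)   # drop k = 0 term
        phi = np.real(np.fft.ifftn(phi_k))

        # 3) Mesh contribution to the electrostatic energy.
        energy = 0.5 * np.sum(rho * phi) * (L / M) ** 3
        print("mesh (long-range) energy estimate:", energy)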

  18. Computer Facilitated Mathematical Methods in Chemical Engineering--Similarity Solution

    ERIC Educational Resources Information Center

    Subramanian, Venkat R.

    2006-01-01

    High-performance computers coupled with highly efficient numerical schemes and user-friendly software packages have helped instructors to teach numerical solutions and analysis of various nonlinear models more efficiently in the classroom. One of the main objectives of a model is to provide insight about the system of interest. Analytical…

  19. Hypergraph-Based Combinatorial Optimization of Matrix-Vector Multiplication

    ERIC Educational Resources Information Center

    Wolf, Michael Maclean

    2009-01-01

    Combinatorial scientific computing plays an important enabling role in computational science, particularly in high performance scientific computing. In this thesis, we will describe our work on optimizing matrix-vector multiplication using combinatorial techniques. Our research has focused on two different problems in combinatorial scientific…

  20. Big data computing: Building a vision for ARS information management

    USDA-ARS?s Scientific Manuscript database

    Improvements are needed within the ARS to increase scientific capacity and keep pace with new developments in computer technologies that support data acquisition and analysis. Enhancements in computing power and IT infrastructure are needed to provide scientists better access to high performance com...

  1. The Effects of Varied versus Constant High-, Medium-, and Low-Preference Stimuli on Performance

    ERIC Educational Resources Information Center

    Wine, Byron; Wilder, David A.

    2009-01-01

    The purpose of the current study was to compare the delivery of varied versus constant high-, medium-, and low-preference stimuli on performance of 2 adults on a computer-based task in an analogue employment setting. For both participants, constant delivery of the high-preference stimulus produced the greatest increases in performance over…

  2. Human and Robotic Space Mission Use Cases for High-Performance Spaceflight Computing

    NASA Technical Reports Server (NTRS)

    Some, Raphael; Doyle, Richard; Bergman, Larry; Whitaker, William; Powell, Wesley; Johnson, Michael; Goforth, Montgomery; Lowry, Michael

    2013-01-01

    Spaceflight computing is a key resource in NASA space missions and a core determining factor of spacecraft capability, with ripple effects throughout the spacecraft, end-to-end system, and mission. Onboard computing can be aptly viewed as a "technology multiplier" in that advances provide direct dramatic improvements in flight functions and capabilities across the NASA mission classes, and enable new flight capabilities and mission scenarios, increasing science and exploration return. Space-qualified computing technology, however, has not advanced significantly in well over ten years and the current state of the practice fails to meet the near- to mid-term needs of NASA missions. Recognizing this gap, the NASA Game Changing Development Program (GCDP), under the auspices of the NASA Space Technology Mission Directorate, commissioned a study on space-based computing needs, looking out 15-20 years. The study resulted in a recommendation to pursue high-performance spaceflight computing (HPSC) for next-generation missions, and a decision to partner with the Air Force Research Lab (AFRL) in this development.

  3. The iPlant Collaborative: Cyberinfrastructure for Enabling Data to Discovery for the Life Sciences.

    PubMed

    Merchant, Nirav; Lyons, Eric; Goff, Stephen; Vaughn, Matthew; Ware, Doreen; Micklos, David; Antin, Parker

    2016-01-01

    The iPlant Collaborative provides life science research communities access to comprehensive, scalable, and cohesive computational infrastructure for data management; identity management; collaboration tools; and cloud, high-performance, high-throughput computing. iPlant provides training, learning material, and best practice resources to help all researchers make the best use of their data, expand their computational skill set, and effectively manage their data and computation when working as distributed teams. iPlant's platform permits researchers to easily deposit and share their data and deploy new computational tools and analysis workflows, allowing the broader community to easily use and reuse those data and computational analyses.

  4. Ultrascale collaborative visualization using a display-rich global cyberinfrastructure.

    PubMed

    Jeong, Byungil; Leigh, Jason; Johnson, Andrew; Renambot, Luc; Brown, Maxine; Jagodic, Ratko; Nam, Sungwon; Hur, Hyejung

    2010-01-01

    The scalable adaptive graphics environment (SAGE) is high-performance graphics middleware for ultrascale collaborative visualization using a display-rich global cyberinfrastructure. Dozens of sites worldwide use this cyberinfrastructure middleware, which connects high-performance-computing resources over high-speed networks to distributed ultraresolution displays.

  5. Polymer waveguides for electro-optical integration in data centers and high-performance computers.

    PubMed

    Dangel, Roger; Hofrichter, Jens; Horst, Folkert; Jubin, Daniel; La Porta, Antonio; Meier, Norbert; Soganci, Ibrahim Murat; Weiss, Jonas; Offrein, Bert Jan

    2015-02-23

    To satisfy the intra- and inter-system bandwidth requirements of future data centers and high-performance computers, low-cost low-power high-throughput optical interconnects will become a key enabling technology. To tightly integrate optics with the computing hardware, particularly in the context of CMOS-compatible silicon photonics, optical printed circuit boards using polymer waveguides are considered as a formidable platform. IBM Research has already demonstrated the essential silicon photonics and interconnection building blocks. A remaining challenge is electro-optical packaging, i.e., the connection of the silicon photonics chips with the system. In this paper, we present a new single-mode polymer waveguide technology and a scalable method for building the optical interface between silicon photonics chips and single-mode polymer waveguides.

  6. Livermore Big Artificial Neural Network Toolkit

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Essen, Brian Van; Jacobs, Sam; Kim, Hyojin

    2016-07-01

    LBANN is a toolkit designed to train artificial neural networks efficiently on high performance computing architectures. It is optimized to take advantage of key high performance computing features to accelerate neural network training. Specifically, it is optimized for low-latency, high-bandwidth interconnects, node-local NVRAM, node-local GPU accelerators, and high-bandwidth parallel file systems. It is built on top of the open source Elemental distributed-memory dense and sparse-direct linear algebra and optimization library, which is released under the BSD license. The algorithms contained within LBANN are drawn from the academic literature and implemented to work within a distributed-memory framework.
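
    The low-latency interconnect matters because distributed training repeatedly averages gradients across nodes. The sketch below shows that core data-parallel pattern with mpi4py and a toy least-squares model: each rank computes a gradient on its local data shard and an Allreduce combines them. It is a generic illustration of the communication pattern, not LBANN's API or algorithms.

        # Data-parallel gradient averaging (a sketch of the pattern, not LBANN).
        # Run with e.g.: mpiexec -n 4 python dp_sgd.py
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()
        rng = np.random.default_rng(rank)

        # Toy linear model y = X w trained by least squares on a per-rank data shard.
        n_features, n_local = 16, 256
        X = rng.standard_normal((n_local, n_features))
        y = X @ np.ones(n_features) + 0.1 * rng.standard_normal(n_local)

        w = np.zeros(n_features)
        lr = 0.05
        for step in range(100):
            # Local gradient of 0.5*||Xw - y||^2 / n_local on this rank's shard.
            grad_local = X.T @ (X @ w - y) / n_local
            # Average gradients across all ranks (the latency-critical collective).
            grad = np.empty_like(grad_local)
            comm.Allreduce(grad_local, grad, op=MPI.SUM)
            grad /= size
            w -= lr * grad

        loss_local = 0.5 * np.mean((X @ w - y) ** 2)
        loss = comm.allreduce(loss_local, op=MPI.SUM) / size
        if rank == 0:
            print("final mean loss:", loss)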

  7. Java Performance for Scientific Applications on LLNL Computer Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kapfer, C; Wissink, A

    2002-05-10

    Languages in use for high performance computing at the laboratory--Fortran (f77 and f90), C, and C++--have many years of development behind them and are generally considered the fastest available. However, Fortran and C do not readily extend to object-oriented programming models, limiting their capability for very complex simulation software. C++ facilitates object-oriented programming but is a very complex and error-prone language. Java offers a number of capabilities that these other languages do not. For instance it implements cleaner (i.e., easier to use and less prone to errors) object-oriented models than C++. It also offers networking and security as part of the language standard, and cross-platform executables that make it architecture neutral, to name a few. These features have made Java very popular for industrial computing applications. The aim of this paper is to explain the trade-offs in using Java for large-scale scientific applications at LLNL. Despite its advantages, the computational science community has been reluctant to write large-scale computationally intensive applications in Java due to concerns over its poor performance. However, considerable progress has been made over the last several years. The Java Grande Forum [1] has been promoting the use of Java for large-scale computing. Members have introduced efficient array libraries, developed fast just-in-time (JIT) compilers, and built links to existing packages used in high performance parallel computing.

  8. Oversight Hearing on the High Performance Computing and Communications Program and Uses of the Information Highway. Hearing before the Subcommittee on Science, Technology, and Space of the Committee on Commerce, Science, and Transportation. United States Senate, One Hundred Fourth Congress, First Session.

    ERIC Educational Resources Information Center

    Congress of the U.S., Washington, DC. House Committee on Science, Space and Technology.

    This document presents witness testimony and supplemental materials from a Congressional hearing called to evaluate the progress of the High Performance Computing and Communications program in light of budget requests, to examine the appropriate role for the government in such a project, and to see demonstrations of the World Wide Web and related…

  9. Money for Research, Not for Energy Bills: Finding Energy and Cost Savings in High Performance Computer Facility Designs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Drewmark Communications; Sartor, Dale; Wilson, Mark

    2010-07-01

    High-performance computing facilities in the United States consume an enormous amount of electricity, cutting into research budgets and challenging public- and private-sector efforts to reduce energy consumption and meet environmental goals. However, these facilities can greatly reduce their energy demand through energy-efficient design of the facility itself. Using a case study of a facility under design, this article discusses strategies and technologies that can be used to help achieve energy reductions.

  10. High-Performance Computing and Four-Dimensional Data Assimilation: The Impact on Future and Current Problems

    NASA Technical Reports Server (NTRS)

    Makivic, Miloje S.

    1996-01-01

    This is the final technical report for the project entitled: "High-Performance Computing and Four-Dimensional Data Assimilation: The Impact on Future and Current Problems", funded at NPAC by the DAO at NASA/GSFC. First, the motivation for the project is given in the introductory section, followed by the executive summary of major accomplishments and the list of project-related publications. Detailed analysis and description of research results is given in subsequent chapters and in the Appendix.

  11. Fundamentals of Modeling, Data Assimilation, and High-performance Computing

    NASA Technical Reports Server (NTRS)

    Rood, Richard B.

    2005-01-01

    This lecture will introduce the concepts of modeling, data assimilation, and high-performance computing as they relate to the study of atmospheric composition. The lecture will work from basic definitions and will strive to provide a framework for thinking about development and application of models and data assimilation systems. It will not provide technical or algorithmic information, leaving that to textbooks, technical reports, and ultimately scientific journals. References to a number of textbooks and papers will be provided as a gateway to the literature.

  12. Maximum likelihood convolutional decoding (MCD) performance due to system losses

    NASA Technical Reports Server (NTRS)

    Webster, L.

    1976-01-01

    A model for predicting the computational performance of a maximum likelihood convolutional decoder (MCD) operating in a noisy carrier reference environment is described. This model is used to develop a subroutine that will be utilized by the Telemetry Analysis Program to compute the MCD bit error rate. When this computational model is averaged over noisy reference phase errors using a high-rate interpolation scheme, the results are found to agree quite favorably with experimental measurements.
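
    The key step in such a model is averaging a conditional error-rate curve over the carrier phase-error distribution. The sketch below performs that average numerically, using a Tikhonov phase-error density and a stand-in conditional bit-error-rate expression; the conditional-BER form, Eb/N0, and loop SNR values are illustrative assumptions rather than the report's actual decoder model.

        import numpy as np
        from scipy.special import erfc, i0

        # Averaging a conditional bit-error-rate curve over carrier phase error,
        # the step the report performs with a high-rate interpolation scheme.
        ebno_db = 5.0
        ebno = 10.0 ** (ebno_db / 10.0)
        rho_loop = 15.0                       # carrier loop SNR (assumed)

        def ber_given_phase(phi):
            """Stand-in conditional BER for a coherent link with phase error phi."""
            return 0.5 * erfc(np.sqrt(ebno) * np.cos(phi))

        def tikhonov_pdf(phi, rho):
            """Phase-error density for a first-order carrier tracking loop."""
            return np.exp(rho * np.cos(phi)) / (2.0 * np.pi * i0(rho))

        # Average BER = integral over [-pi, pi] of BER(phi) * p(phi) dphi.
        phi = np.linspace(-np.pi, np.pi, 4001)
        avg_ber = np.trapz(ber_given_phase(phi) * tikhonov_pdf(phi, rho_loop), phi)

        print(f"ideal-reference BER : {ber_given_phase(0.0):.3e}")
        print(f"noisy-reference BER : {avg_ber:.3e}")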

  13. Message Passing and Shared Address Space Parallelism on an SMP Cluster

    NASA Technical Reports Server (NTRS)

    Shan, Hongzhang; Singh, Jaswinder P.; Oliker, Leonid; Biswas, Rupak; Biegel, Bryan (Technical Monitor)

    2002-01-01

    Currently, message passing (MP) and shared address space (SAS) are the two leading parallel programming paradigms. MP has been standardized with MPI, and is the more common and mature approach; however, code development can be extremely difficult, especially for irregularly structured computations. SAS offers substantial ease of programming, but may suffer from performance limitations due to poor spatial locality and high protocol overhead. In this paper, we compare the performance of and the programming effort required for six applications under both programming models on a 32-processor PC-SMP cluster, a platform that is becoming increasingly attractive for high-end scientific computing. Our application suite consists of codes that typically do not exhibit scalable performance under shared-memory programming due to their high communication-to-computation ratios and/or complex communication patterns. Results indicate that SAS can achieve about half the parallel efficiency of MPI for most of our applications, while being competitive for the others. A hybrid MPI+SAS strategy shows only a small performance advantage over pure MPI in some cases. Finally, improved implementations of two MPI collective operations on PC-SMP clusters are presented.

  14. High-Performance I/O: HDF5 for Lattice QCD

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kurth, Thorsten; Pochinsky, Andrew; Sarje, Abhinav

    2015-01-01

    Practitioners of lattice QCD/QFT have been some of the primary pioneer users of the state-of-the-art high-performance-computing systems, and contribute towards the stress tests of such new machines as soon as they become available. As with all aspects of high-performance-computing, I/O is becoming an increasingly specialized component of these systems. In order to take advantage of the latest available high-performance I/O infrastructure, to ensure reliability and backwards compatibility of data files, and to help unify the data structures used in lattice codes, we have incorporated parallel HDF5 I/O into the SciDAC supported USQCD software stack. Here we present the design and implementation of this I/O framework. Our HDF5 implementation outperforms optimized QIO at the 10-20% level and leaves room for further improvement by utilizing appropriate dataset chunking.
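
    The dataset-chunking remark above is the main user-visible tuning knob in HDF5. The h5py sketch below writes a lattice-shaped dataset with an explicit chunk shape of one time slice per chunk; the lattice dimensions, group names, and attributes are illustrative, a genuinely parallel write additionally requires HDF5/h5py built with the MPI ("mpio") driver, and this is not the USQCD/QIO interface itself.

        import numpy as np
        import h5py

        # Illustrative lattice-sized field: complex 3x3 matrices on a 16^3 x 32 lattice
        # (dimensions chosen for illustration only).
        lattice = (32, 16, 16, 16)
        field = np.zeros(lattice + (3, 3), dtype=np.complex128)

        with h5py.File("gauge_field.h5", "w") as f:
            dset = f.create_dataset(
                "configurations/cfg_0001/U",
                shape=field.shape,
                dtype=field.dtype,
                # One time slice per chunk: the chunk shape strongly affects I/O
                # throughput, which is the tuning knob the abstract alludes to.
                chunks=(1,) + field.shape[1:],
            )
            dset[...] = field
            dset.attrs["beta"] = 6.0          # example metadata attribute

        # A truly parallel write would open the file with the MPI driver instead:
        #   h5py.File("gauge_field.h5", "w", driver="mpio", comm=MPI.COMM_WORLD)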

  15. High-Performance I/O: HDF5 for Lattice QCD

    DOE PAGES

    Kurth, Thorsten; Pochinsky, Andrew; Sarje, Abhinav; ...

    2017-05-09

    Practitioners of lattice QCD/QFT have been some of the primary pioneer users of the state-of-the-art high-performance-computing systems, and contribute towards the stress tests of such new machines as soon as they become available. As with all aspects of high-performance-computing, I/O is becoming an increasingly specialized component of these systems. In order to take advantage of the latest available high-performance I/O infrastructure, to ensure reliability and backwards compatibility of data files, and to help unify the data structures used in lattice codes, we have incorporated parallel HDF5 I/O into the SciDAC supported USQCD software stack. Here we present the design and implementation of this I/O framework. Our HDF5 implementation outperforms optimized QIO at the 10-20% level and leaves room for further improvement by utilizing appropriate dataset chunking.

  16. The Impact of Computer and Mathematics Software Usage on Performance of School Leavers in the Western Cape Province of South Africa: A Comparative Analysis

    ERIC Educational Resources Information Center

    Smith, Garth Spencer; Hardman, Joanne

    2014-01-01

    In this study, the impact of computer immersion on the performance of school leavers' Senior Certificate mathematics scores was investigated across 31 schools in the EMDC East education district of Cape Town, South Africa, by comparing performance between two groups: a control and an experimental group. The experimental group (14 high schools) had access…

  17. Airfoil Vibration Dampers program

    NASA Technical Reports Server (NTRS)

    Cook, Robert M.

    1991-01-01

    The Airfoil Vibration Damper program has consisted of an analysis phase and a testing phase. During the analysis phase, a state-of-the-art computer code was developed, which can be used to guide designers in the placement and sizing of friction dampers. The use of this computer code was demonstrated by performing representative analyses on turbine blades from the High Pressure Oxidizer Turbopump (HPOTP) and High Pressure Fuel Turbopump (HPFTP) of the Space Shuttle Main Engine (SSME). The testing phase of the program consisted of performing friction damping tests on two different cantilever beams. Data from these tests provided an empirical check on the accuracy of the computer code developed in the analysis phase. Results of the analysis and testing showed that the computer code can accurately predict the performance of friction dampers. In addition, a valuable set of friction damping data was generated, which can be used to aid in the design of friction dampers, as well as provide benchmark test cases for future code developers.

  18. A Full Navier-Stokes Analysis of Subsonic Diffuser of a Bifurcated 70/30 Supersonic Inlet for High Speed Civil Transport Application

    NASA Technical Reports Server (NTRS)

    Kapoor, Kamlesh; Anderson, Bernhard H.; Shaw, Robert J.

    1994-01-01

    A full Navier-Stokes analysis was performed to evaluate the performance of the subsonic diffuser of a NASA Lewis Research Center 70/30 mixed-compression bifurcated supersonic inlet for high speed civil transport application. The PARC3D code was used in the present study. The computations were also performed when approximately 2.5 percent of the engine mass flow was allowed to bypass through the engine bypass doors. The computational results were compared with the available experimental data which consisted of detailed Mach number and total pressure distribution along the entire length of the subsonic diffuser. The total pressure recovery, flow distortion, and crossflow velocity at the engine face were also calculated. The computed surface ramp and cowl pressure distributions were compared with experiments. Overall, the computational results compared well with experimental data. The present CFD analysis demonstrated that the bypass flow improves the total pressure recovery and lessens flow distortions at the engine face.

  19. A High Performance Computing Study of a Scalable FISST-Based Approach to Multi-Target, Multi-Sensor Tracking

    NASA Astrophysics Data System (ADS)

    Hussein, I.; Wilkins, M.; Roscoe, C.; Faber, W.; Chakravorty, S.; Schumacher, P.

    2016-09-01

    Finite Set Statistics (FISST) is a rigorous Bayesian multi-hypothesis management tool for the joint detection, classification and tracking of multi-sensor, multi-object systems. Implicit within the approach are solutions to the data association and target label-tracking problems. The full FISST filtering equations, however, are intractable. While FISST-based methods such as the PHD and CPHD filters are tractable, they require heavy moment approximations to the full FISST equations that result in a significant loss of information contained in the collected data. In this paper, we review Smart Sampling Markov Chain Monte Carlo (SSMCMC) that enables FISST to be tractable while avoiding moment approximations. We study the effect of tuning key SSMCMC parameters on tracking quality and computation time. The study is performed on a representative space object catalog with varying numbers of RSOs. The solution is implemented in the Scala computing language at the Maui High Performance Computing Center (MHPCC) facility.

  20. Characterisation of recycled acrylonitrile-butadiene-styrene and high-impact polystyrene from waste computer equipment in Brazil.

    PubMed

    Hirayama, Denise; Saron, Clodoaldo

    2015-06-01

    Polymeric materials constitute a considerable fraction of waste computer equipment, and the polymers acrylonitrile-butadiene-styrene and high-impact polystyrene are the main thermoplastic polymeric components found in waste computer equipment. Identification, separation, and characterisation of additives present in acrylonitrile-butadiene-styrene and high-impact polystyrene are fundamental procedures for the mechanical recycling of these polymers. The aim of this study was to evaluate methods for identification of acrylonitrile-butadiene-styrene and high-impact polystyrene from waste computer equipment in Brazil, as well as their potential for mechanical recycling. The imprecise use of symbols for identification of the polymers and the presence of additives containing toxic elements in certain computer devices are some of the difficulties found for recycling of acrylonitrile-butadiene-styrene and high-impact polystyrene from waste computer equipment. However, the good mechanical performance of the recycled acrylonitrile-butadiene-styrene and high-impact polystyrene compared with the virgin materials confirms the potential for mechanical recycling of these polymers. © The Author(s) 2015.

  1. Playable Serious Games for Studying and Programming Computational STEM and Informatics Applications of Distributed and Parallel Computer Architectures

    ERIC Educational Resources Information Center

    Amenyo, John-Thones

    2012-01-01

    Carefully engineered playable games can serve as vehicles for students and practitioners to learn and explore the programming of advanced computer architectures to execute applications, such as high performance computing (HPC) and complex, inter-networked, distributed systems. The article presents families of playable games that are grounded in…

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Klitsner, Tom

    The recent Executive Order creating the National Strategic Computing Initiative (NSCI) recognizes the value of high performance computing for economic competitiveness and scientific discovery and commits to accelerate delivery of exascale computing. The HPC programs at Sandia –the NNSA ASC program and Sandia’s Institutional HPC Program– are focused on ensuring that Sandia has the resources necessary to deliver computation in the national interest.

  3. Resilient and Robust High Performance Computing Platforms for Scientific Computing Integrity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jin, Yier

    As technology advances, computer systems are subject to increasingly sophisticated cyber-attacks that compromise both their security and integrity. High performance computing platforms used in commercial and scientific applications involving sensitive, or even classified data, are frequently targeted by powerful adversaries. This situation is made worse by a lack of fundamental security solutions that both perform efficiently and are effective at preventing threats. Current security solutions fail to address the threat landscape and ensure the integrity of sensitive data. As challenges rise, both private and public sectors will require robust technologies to protect their computing infrastructure. The research outcomes from this project aim to address these challenges. For example, we present LAZARUS, a novel technique to harden kernel Address Space Layout Randomization (KASLR) against paging-based side-channel attacks. In particular, our scheme allows for fine-grained protection of the virtual memory mappings that implement the randomization. We demonstrate the effectiveness of our approach by hardening a recent Linux kernel with LAZARUS, mitigating all of the previously presented side-channel attacks on KASLR. Our extensive evaluation shows that LAZARUS incurs only 0.943% overhead for standard benchmarks, and is therefore highly practical. We also introduced HA2lloc, a hardware-assisted allocator that is capable of leveraging an extended memory management unit to detect memory errors in the heap. We also perform testing using HA2lloc in a simulation environment and find that the approach is capable of preventing common memory vulnerabilities.

  4. Development of a SaaS application probe to the physical properties of the Earth's interior: An attempt at moving HPC to the cloud

    NASA Astrophysics Data System (ADS)

    Huang, Qian

    2014-09-01

    Scientific computing often requires the availability of a massive number of computers for performing large-scale simulations, and computing in mineral physics is no exception. In order to investigate the physical properties of minerals at extreme conditions in computational mineral physics, parallel computing technology is used to speed up performance by utilizing multiple computer resources to process a computational task simultaneously, thereby greatly reducing computation time. Traditionally, parallel computing has been addressed by using High Performance Computing (HPC) solutions and installed facilities such as clusters and supercomputers. Today, there is tremendous growth in cloud computing. Infrastructure as a Service (IaaS), the on-demand and pay-as-you-go model, creates a flexible and cost-effective means to access computing resources. In this paper, a feasibility report of HPC on a cloud infrastructure is presented. It is found that current cloud services in the IaaS layer still need to improve performance to be useful to research projects. On the other hand, Software as a Service (SaaS), another type of cloud computing, is introduced into an HPC system for computing in mineral physics, and an application of it is developed. In this paper, an overall description of this SaaS application is presented. This contribution can promote cloud application development in computational mineral physics, and cross-disciplinary studies.

  5. Roy Fraley | NREL

    Science.gov Websites

    Roy Fraley Roy Fraley Professional II-Engineer Roy.Fraley@nrel.gov | 303-384-6468 Roy Fraley is the high-performance computing (HPC) data center engineer with the Computational Science Center's HPC

  6. Large-scale high-throughput computer-aided discovery of advanced materials using cloud computing

    NASA Astrophysics Data System (ADS)

    Bazhirov, Timur; Mohammadi, Mohammad; Ding, Kevin; Barabash, Sergey

    Recent advances in cloud computing made it possible to access large-scale computational resources completely on-demand in a rapid and efficient manner. When combined with high fidelity simulations, they serve as an alternative pathway to enable computational discovery and design of new materials through large-scale high-throughput screening. Here, we present a case study for a cloud platform implemented at Exabyte Inc. We perform calculations to screen lightweight ternary alloys for thermodynamic stability. Due to the lack of experimental data for most such systems, we rely on theoretical approaches based on first-principle pseudopotential density functional theory. We calculate the formation energies for a set of ternary compounds approximated by special quasirandom structures. During an example run we were able to scale to 10,656 CPUs within 7 minutes from the start, and obtain results for 296 compounds within 38 hours. The results indicate that the ultimate formation enthalpy of ternary systems can be negative for some lightweight alloys, including Li and Mg compounds. We conclude that, compared to the traditional capital-intensive approach that requires on-premises hardware resources, cloud computing is agile and cost-effective, yet scalable and delivers similar performance.
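
    As a small illustration of the screening quantity involved, the sketch below computes a formation energy per atom from a compound's total energy and elemental reference energies. The numbers and compositions are invented for illustration only; the function is not part of the Exabyte platform described above.

```python
def formation_energy_per_atom(e_total, composition, elemental_ref):
    """Formation energy per atom of a compound.

    e_total       : total energy of the compound cell (eV), e.g. from DFT
    composition   : dict mapping element symbol -> number of atoms in the cell
    elemental_ref : dict mapping element symbol -> reference energy per atom
                    of the pure element (eV/atom)
    A negative value indicates stability against decomposition into the pure
    elements (ignoring competing phases).
    """
    n_atoms = sum(composition.values())
    e_ref = sum(n * elemental_ref[el] for el, n in composition.items())
    return (e_total - e_ref) / n_atoms

# hypothetical numbers for an Li-Mg-Al ternary cell (illustrative only)
print(formation_energy_per_atom(
    e_total=-42.7,
    composition={"Li": 4, "Mg": 2, "Al": 2},
    elemental_ref={"Li": -1.90, "Mg": -1.50, "Al": -3.75},
))
```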

  7. Peregrine System Configuration | High-Performance Computing | NREL

    Science.gov Websites

    Compute nodes and storage are connected by a high-speed InfiniBand network. Compute nodes are diskless; directories are mounted on all nodes, along with a file system dedicated to shared projects. Nodes use processors with 64 GB of memory.

  8. Matrix multiplication operations with data pre-conditioning in a high performance computing architecture

    DOEpatents

    Eichenberger, Alexandre E; Gschwind, Michael K; Gunnels, John A

    2013-11-05

    Mechanisms for performing matrix multiplication operations with data pre-conditioning in a high performance computing architecture are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation into a first target vector register. A load and splat operation is performed to load an element of a second vector operand and replicate the element to each of a plurality of elements of a second target vector register. A multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product of the matrix multiplication operation is accumulated with other partial products of the matrix multiplication operation.
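
    A rough NumPy emulation of the access pattern described in the abstract is sketched below: each scalar of the first operand is "splatted" (broadcast) against a vector load of the second operand and multiply-added into the accumulating result. This is only a scalar-level illustration of the data flow, not the patented vector-register mechanism itself.

```python
import numpy as np

def matmul_splat(A, B):
    """Multiply A (m x k) by B (k x n) in a load-and-splat style:
    each element of A is broadcast ("splatted") and multiply-added
    against a row vector of B, accumulating partial products in C."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((m, n), dtype=np.result_type(A, B))
    for i in range(m):
        for p in range(k):
            # "splat" the scalar A[i, p] across the row, then fused
            # multiply-add against the vector load of B[p, :]
            C[i, :] += A[i, p] * B[p, :]
    return C

# quick check against NumPy's reference matrix multiplication
A = np.random.rand(4, 3)
B = np.random.rand(3, 5)
assert np.allclose(matmul_splat(A, B), A @ B)
```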

  9. Deepthi Vaidhynathan | NREL

    Science.gov Websites

    Works with the Complex Systems Simulation and Optimization Group on performance analysis and benchmarking. Research interests: high-performance computing, embedded systems, microprocessors and microcontrollers.

  10. Simulating Hydrologic Flow and Reactive Transport with PFLOTRAN and PETSc on Emerging Fine-Grained Parallel Computer Architectures

    NASA Astrophysics Data System (ADS)

    Mills, R. T.; Rupp, K.; Smith, B. F.; Brown, J.; Knepley, M.; Zhang, H.; Adams, M.; Hammond, G. E.

    2017-12-01

    As the high-performance computing community pushes towards the exascale horizon, power and heat considerations have driven the increasing importance and prevalence of fine-grained parallelism in new computer architectures. High-performance computing centers have become increasingly reliant on GPGPU accelerators and "manycore" processors such as the Intel Xeon Phi line, and 512-bit SIMD registers have even been introduced in the latest generation of Intel's mainstream Xeon server processors. The high degree of fine-grained parallelism and more complicated memory hierarchy considerations of such "manycore" processors present several challenges to existing scientific software. Here, we consider how the massively parallel, open-source hydrologic flow and reactive transport code PFLOTRAN - and the underlying Portable, Extensible Toolkit for Scientific Computation (PETSc) library on which it is built - can best take advantage of such architectures. We will discuss some key features of these novel architectures and our code optimizations and algorithmic developments targeted at them, and present experiences drawn from working with a wide range of PFLOTRAN benchmark problems on these architectures.

  11. Assessing a mini-application as a performance proxy for a finite element method engineering application

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lin, Paul T.; Heroux, Michael A.; Barrett, Richard F.

    The performance of a large-scale, production-quality science and engineering application (‘app’) is often dominated by a small subset of the code. Even within that subset, computational and data access patterns are often repeated, so that an even smaller portion can represent the performance-impacting features. If application developers, parallel computing experts, and computer architects can together identify this representative subset and then develop a small mini-application (‘miniapp’) that can capture these primary performance characteristics, then this miniapp can be used to both improve the performance of the app as well as provide a tool for co-design for the high-performance computing community. However, a critical question is whether a miniapp can effectively capture key performance behavior of an app. This study provides a comparison of an implicit finite element semiconductor device modeling app on unstructured meshes with an implicit finite element miniapp on unstructured meshes. The goal is to assess whether the miniapp is predictive of the performance of the app. Finally, single compute node performance will be compared, as well as scaling up to 16,000 cores. Results indicate that the miniapp can be reasonably predictive of the performance characteristics of the app for a single iteration of the solver on a single compute node.

  12. Assessing a mini-application as a performance proxy for a finite element method engineering application

    DOE PAGES

    Lin, Paul T.; Heroux, Michael A.; Barrett, Richard F.; ...

    2015-07-30

    The performance of a large-scale, production-quality science and engineering application (‘app’) is often dominated by a small subset of the code. Even within that subset, computational and data access patterns are often repeated, so that an even smaller portion can represent the performance-impacting features. If application developers, parallel computing experts, and computer architects can together identify this representative subset and then develop a small mini-application (‘miniapp’) that can capture these primary performance characteristics, then this miniapp can be used to both improve the performance of the app as well as provide a tool for co-design for the high-performance computing community. However, a critical question is whether a miniapp can effectively capture key performance behavior of an app. This study provides a comparison of an implicit finite element semiconductor device modeling app on unstructured meshes with an implicit finite element miniapp on unstructured meshes. The goal is to assess whether the miniapp is predictive of the performance of the app. Finally, single compute node performance will be compared, as well as scaling up to 16,000 cores. Results indicate that the miniapp can be reasonably predictive of the performance characteristics of the app for a single iteration of the solver on a single compute node.

  13. Computational Investigation of the Performance and Back-Pressure Limits of a Hypersonic Inlet

    NASA Technical Reports Server (NTRS)

    Smart, Michael K.; White, Jeffery A.

    2002-01-01

    A computational analysis of Mach 6.2 operation of a hypersonic inlet with rectangular-to-elliptical shape transition has been performed. The results of the computations are compared with experimental data for cases with and without a manually imposed back-pressure. While the no-back-pressure numerical solutions match the general trends of the data, certain features observed in the experiments did not appear in the computational solutions. The reasons for these discrepancies are discussed and possible remedies are suggested. Most importantly, however, the computational analysis increased the understanding of the consequences of certain aspects of the inlet design. This will enable the performance of future inlets of this class to be improved. Computational solutions with back-pressure under-estimated the back-pressure limit observed in the experiments, but did supply significant insight into the character of highly back-pressured inlet flows.

  14. Integrated Computer-Aided Drafting Instruction (ICADI).

    ERIC Educational Resources Information Center

    Chen, C. Y.; McCampbell, David H.

    Until recently, computer-aided drafting and design (CAD) systems were almost exclusively operated on mainframes or minicomputers and their cost prohibited many schools from offering CAD instruction. Today, many powerful personal computers are capable of performing the high-speed calculation and analysis required by the CAD application; however,…

  15. Bilayer avalanche spin-diode logic

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Friedman, Joseph S., E-mail: joseph.friedman@u-psud.fr; Querlioz, Damien; Fadel, Eric R.

    2015-11-15

    A novel spintronic computing paradigm is proposed and analyzed in which InSb p-n bilayer avalanche spin-diodes are cascaded to efficiently perform complex logic operations. This spin-diode logic family uses control wires to generate magnetic fields that modulate the resistance of the spin-diodes, and currents through these devices control the resistance of cascaded devices. Electromagnetic simulations are performed to demonstrate the cascading mechanism, and guidelines are provided for the development of this innovative computing technology. This cascading scheme permits compact logic circuits with switching speeds determined by electromagnetic wave propagation rather than electron motion, enabling high-performance spintronic computing.

  16. High-performance equation solvers and their impact on finite element analysis

    NASA Technical Reports Server (NTRS)

    Poole, Eugene L.; Knight, Norman F., Jr.; Davis, D. Dale, Jr.

    1990-01-01

    The role of equation solvers in modern structural analysis software is described. Direct and iterative equation solvers which exploit vectorization on modern high-performance computer systems are described and compared. The direct solvers are two Cholesky factorization methods. The first method utilizes a novel variable-band data storage format to achieve very high computation rates and the second method uses a sparse data storage format designed to reduce the number of operations. The iterative solvers are preconditioned conjugate gradient methods. Two different preconditioners are included; the first uses a diagonal matrix storage scheme to achieve high computation rates and the second requires a sparse data storage scheme and converges to the solution in fewer iterations than the first. The impact of using all of the equation solvers in a common structural analysis software system is demonstrated by solving several representative structural analysis problems.
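
    Of the iterative solvers mentioned above, the diagonal-preconditioned variant is simple enough to sketch. The following NumPy routine is a generic Jacobi-preconditioned conjugate gradient solver, included as an illustration of the method rather than a reproduction of the authors' vectorized implementation.

```python
import numpy as np

def jacobi_pcg(A, b, tol=1e-8, max_iter=1000):
    """Conjugate gradient with a diagonal (Jacobi) preconditioner
    M = diag(A), for a symmetric positive-definite matrix A."""
    x = np.zeros_like(b)
    r = b - A @ x
    Minv = 1.0 / np.diag(A)          # diagonal matrix storage scheme
    z = Minv * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            break
        z = Minv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# example on a small symmetric positive-definite system
n = 50
A = np.diag(np.arange(2.0, n + 2)) + 0.5 * np.eye(n, k=1) + 0.5 * np.eye(n, k=-1)
b = np.ones(n)
x = jacobi_pcg(A, b)
assert np.allclose(A @ x, b, atol=1e-6)
```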

  17. High-performance equation solvers and their impact on finite element analysis

    NASA Technical Reports Server (NTRS)

    Poole, Eugene L.; Knight, Norman F., Jr.; Davis, D. D., Jr.

    1992-01-01

    The role of equation solvers in modern structural analysis software is described. Direct and iterative equation solvers which exploit vectorization on modern high-performance computer systems are described and compared. The direct solvers are two Cholesky factorization methods. The first method utilizes a novel variable-band data storage format to achieve very high computation rates and the second method uses a sparse data storage format designed to reduce the number of operations. The iterative solvers are preconditioned conjugate gradient methods. Two different preconditioners are included; the first uses a diagonal matrix storage scheme to achieve high computation rates and the second requires a sparse data storage scheme and converges to the solution in fewer iterations than the first. The impact of using all of the equation solvers in a common structural analysis software system is demonstrated by solving several representative structural analysis problems.

  18. High Performance Distributed Computing in a Supercomputer Environment: Computational Services and Applications Issues

    NASA Technical Reports Server (NTRS)

    Kramer, Williams T. C.; Simon, Horst D.

    1994-01-01

    This tutorial is intended as a practical guide for the uninitiated to the main topics and themes of high-performance computing (HPC), with particular emphasis on distributed computing. The intent is first to provide some guidance and directions in the rapidly growing field of scientific computing using both massively parallel and traditional supercomputers. Because of their considerable potential computational power, loosely or tightly coupled clusters of workstations are increasingly considered a third alternative to both the more conventional supercomputers based on a small number of powerful vector processors and to massively parallel processors. Even though many research issues concerning the effective use of workstation clusters and their integration into a large scale production facility are still unresolved, such clusters are already used for production computing. In this tutorial we will draw on the unique experience gained at the NAS facility at NASA Ames Research Center. Over the last five years at NAS, massively parallel supercomputers such as the Connection Machines CM-2 and CM-5 from Thinking Machines Corporation and the iPSC/860 (Touchstone Gamma Machine) and Paragon Machines from Intel were used in a production supercomputer center alongside traditional vector supercomputers such as the Cray Y-MP and C90.

  19. Critical thinking traits of top-tier experts and implications for computer science education

    NASA Astrophysics Data System (ADS)

    Bushey, Dean E.

    A documented shortage of technical leadership and top-tier performers in computer science jeopardizes the technological edge, security, and economic well-being of the nation. The 2005 President's Information and Technology Advisory Committee (PITAC) Report on competitiveness in computational sciences highlights the major impact of science, technology, and innovation in keeping America competitive in the global marketplace. It stresses the fact that the supply of science, technology, and engineering experts is at the core of America's technological edge, national competitiveness and security. However, recent data shows that both undergraduate and postgraduate production of computer scientists is falling. The decline is "a quiet crisis building in the United States," a crisis that, if allowed to continue unchecked, could endanger America's well-being and preeminence among the world's nations. Past research on expert performance has shown that the cognitive traits of critical thinking, creativity, and problem solving possessed by top-tier performers can be identified, observed and measured. The studies show that the identified attributes are applicable across many domains and disciplines. Companies have begun to realize that cognitive skills are important for high-level performance and are reevaluating the traditional academic standards they have used to predict success for their top-tier performers in computer science. Previous research in the computer science field has focused either on programming skills of its experts or has attempted to predict the academic success of students at the undergraduate level. This study, on the other hand, examines the critical-thinking skills found among experts in the computer science field in order to explore the questions, "What cognitive skills do outstanding performers possess that make them successful?" and "How do currently used measures of academic performance correlate to critical-thinking skills among students?" The results of this study suggest a need to examine how critical-thinking abilities are learned in the undergraduate computer science curriculum and the need to foster these abilities in order to produce the high-level, critical-thinking professionals necessary to fill the growing need for these experts. Due to the fact that current measures of academic performance do not adequately depict students' cognitive abilities, assessment of these skills must be incorporated into existing curricula.

  20. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aliaga, José I., E-mail: aliaga@uji.es; Alonso, Pedro; Badía, José M.

    We introduce a new iterative Krylov subspace-based eigensolver for the simulation of macromolecular motions on desktop multithreaded platforms equipped with multicore processors and, possibly, a graphics accelerator (GPU). The method consists of two stages, with the original problem first reduced into a simpler band-structured form by means of a high-performance compute-intensive procedure. This is followed by a memory-intensive but low-cost Krylov iteration, which is off-loaded to be computed on the GPU by means of an efficient data-parallel kernel. The experimental results reveal the performance of the new eigensolver. Concretely, when applied to the simulation of macromolecules with a few thousand degrees of freedom and the number of eigenpairs to be computed is small to moderate, the new solver outperforms other methods implemented as part of high-performance numerical linear algebra packages for multithreaded architectures.
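
    Only the second (Krylov) stage of such a method lends itself to a short illustration. The sketch below uses SciPy's Lanczos-based eigsh in shift-invert mode to extract a few low-lying eigenpairs of a stand-in sparse "Hessian"; the band-reduction first stage and the GPU off-load of the actual solver are not reproduced, and the matrix here is just a simple spring-chain model chosen for illustration.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import eigsh

# Stand-in for the stiffness/Hessian matrix of a coarse-grained macromolecular
# model: sparse, symmetric, with a few thousand degrees of freedom (here a
# simple one-dimensional chain of springs).
n = 3000
H = diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")

# Krylov (Lanczos) iteration for a small number of low-lying eigenpairs,
# the modes that dominate large-scale motions; shift-invert targets the
# eigenvalues nearest zero.
vals, vecs = eigsh(H, k=10, sigma=0.0, which="LM")
print("lowest eigenvalues:", vals)
```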

  1. High-Fidelity Simulations of Electromagnetic Propagation and RF Communication Systems

    DTIC Science & Technology

    2017-05-01

    In addition to high-fidelity RF propagation modeling, lower-fidelity models, which are less computationally burdensome, are available via a C++ API. ... expensive to perform, requiring roughly one hour of computer time with 36 available cores and ray tracing performed by a single high-end GPU.

  2. Libraries for Software Use on Peregrine | High-Performance Computing | NREL

    Science.gov Websites

    Libraries available on Peregrine include BLAS (Basic Linear Algebra Subroutines; libraries only), a library for managing hierarchically structured data, and LAPACK (the standard Netlib offering for computational linear algebra).

  3. High-Productivity Computing in Computational Physics Education

    NASA Astrophysics Data System (ADS)

    Tel-Zur, Guy

    2011-03-01

    We describe the development of a new course in Computational Physics at Ben-Gurion University. This elective course for 3rd-year undergraduates and MSc students is taught during one semester. Computational Physics is by now well accepted as the Third Pillar of Science. This paper's claim is that modern Computational Physics education should also deal with High-Productivity Computing. The traditional approach to teaching Computational Physics emphasizes "Correctness" and then "Accuracy," and we also add "Performance." Along with topics in Mathematical Methods and case studies in Physics, the course devotes a significant amount of time to "Mini-Courses" on topics such as: High-Throughput Computing - Condor, Parallel Programming - MPI and OpenMP, How to build a Beowulf, Visualization, and Grid and Cloud Computing. The course intends to teach neither new physics nor new mathematics; it is focused on an integrated approach to solving problems, starting from the physics problem, through the corresponding mathematical solution and the numerical scheme, to writing an efficient computer code and finally analysis and visualization.
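
    As a flavor of the "Parallel Programming - MPI" mini-course material mentioned above, a minimal mpi4py example of the classic Monte Carlo estimate of pi is sketched below. It is a generic teaching example under the assumption that mpi4py and an MPI runtime are installed, not material taken from the course itself.

```python
# Minimal mpi4py example of the kind a parallel-programming mini-course might
# start from: each rank samples points in the unit square and a reduction
# combines the hit counts.  Run with, e.g.:  mpiexec -n 4 python pi_mc.py
from mpi4py import MPI
import random

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_per_rank = 100_000
random.seed(rank)                          # different stream on each rank
hits = sum(1 for _ in range(n_per_rank)
           if random.random() ** 2 + random.random() ** 2 <= 1.0)

total = comm.reduce(hits, op=MPI.SUM, root=0)
if rank == 0:
    print("pi ~", 4.0 * total / (n_per_rank * size))
```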

  4. A Case Study on Neural Inspired Dynamic Memory Management Strategies for High Performance Computing.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vineyard, Craig Michael; Verzi, Stephen Joseph

    As high performance computing architectures pursue more computational power there is a need for increased memory capacity and bandwidth as well. A multi-level memory (MLM) architecture addresses this need by combining multiple memory types with different characteristics as varying levels of the same architecture. How to efficiently utilize this memory infrastructure is an unknown challenge, and in this research we sought to investigate whether neural inspired approaches can meaningfully help with memory management. In particular we explored neurogenesis inspired resource allocation, and were able to show a neural inspired mixed controller policy can beneficially impact how MLM architectures utilize memory.

  5. Non-volatile memory for checkpoint storage

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Blumrich, Matthias A.; Chen, Dong; Cipolla, Thomas M.

    A system, method and computer program product for supporting system initiated checkpoints in high performance parallel computing systems and storing of checkpoint data to a non-volatile memory storage device. The system and method generates selective control signals to perform checkpointing of system related data in presence of messaging activity associated with a user application running at the node. The checkpointing is initiated by the system such that checkpoint data of a plurality of network nodes may be obtained even in the presence of user applications running on highly parallel computers that include ongoing user messaging activity. In one embodiment, the non-volatile memory is a pluggable flash memory card.

  6. Software Tools on the Peregrine System | High-Performance Computing | NREL

    Science.gov Websites

    NREL provides a variety of software tools on the Peregrine system, including debugging and performance-analysis tools for understanding the behavior of MPI applications (such as Intel VTune), an environment for statistical computing and graphics, and VirtualGL/TurboVNC for remote visualization and analytics.

  7. High Performance Descriptive Semantic Analysis of Semantic Graph Databases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Joslyn, Cliff A.; Adolf, Robert D.; al-Saffar, Sinan

    As semantic graph database technology grows to address components ranging from extant large triple stores to SPARQL endpoints over SQL-structured relational databases, it will become increasingly important to be able to understand their inherent semantic structure, whether codified in explicit ontologies or not. Our group is researching novel methods for what we call descriptive semantic analysis of RDF triplestores, to serve purposes of analysis, interpretation, visualization, and optimization. But data size and computational complexity make it increasingly necessary to bring high performance computational resources to bear on this task. Our research group built a novel high performance hybrid system comprising computational capability for semantic graph database processing utilizing the large multi-threaded architecture of the Cray XMT platform, conventional servers, and large data stores. In this paper we describe that architecture and our methods, and present the results of our analyses of basic properties, connected components, namespace interaction, and typed paths for the Billion Triple Challenge 2010 dataset.

  8. Experimental Evaluation and Workload Characterization for High-Performance Computer Architectures

    NASA Technical Reports Server (NTRS)

    El-Ghazawi, Tarek A.

    1995-01-01

    This research is conducted in the context of the Joint NSF/NASA Initiative on Evaluation (JNNIE). JNNIE is an inter-agency research program that goes beyond typical benchmarking to provide in-depth evaluations and an understanding of the factors that limit the scalability of high-performance computing systems. Many NSF and NASA centers have participated in the effort. Our research effort was an integral part of implementing JNNIE in the NASA ESS grand challenge applications context. Our research work under this program was composed of three distinct, but related activities. They include the evaluation of NASA ESS high-performance computing testbeds using the wavelet decomposition application; evaluation of NASA ESS testbeds using astrophysical simulation applications; and developing an experimental model for workload characterization for understanding workload requirements. In this report, we provide a summary of findings that covers all three parts, a list of the publications that resulted from this effort, and three appendices with the details of each of the studies using a key publication developed under the respective work.

  9. Computational Environments and Analysis methods available on the NCI High Performance Computing (HPC) and High Performance Data (HPD) Platform

    NASA Astrophysics Data System (ADS)

    Evans, B. J. K.; Foster, C.; Minchin, S. A.; Pugh, T.; Lewis, A.; Wyborn, L. A.; Evans, B. J.; Uhlherr, A.

    2014-12-01

    The National Computational Infrastructure (NCI) has established a powerful in-situ computational environment to enable both high performance computing and data-intensive science across a wide spectrum of national environmental data collections - in particular climate, observational data and geoscientific assets. This paper examines 1) the computational environments that supports the modelling and data processing pipelines, 2) the analysis environments and methods to support data analysis, and 3) the progress in addressing harmonisation of the underlying data collections for future transdisciplinary research that enable accurate climate projections. NCI makes available 10+ PB major data collections from both the government and research sectors based on six themes: 1) weather, climate, and earth system science model simulations, 2) marine and earth observations, 3) geosciences, 4) terrestrial ecosystems, 5) water and hydrology, and 6) astronomy, social and biosciences. Collectively they span the lithosphere, crust, biosphere, hydrosphere, troposphere, and stratosphere. The data is largely sourced from NCI's partners (which include the custodians of many of the national scientific records), major research communities, and collaborating overseas organisations. The data is accessible within an integrated HPC-HPD environment - a 1.2 PFlop supercomputer (Raijin), a HPC class 3000 core OpenStack cloud system and several highly connected large scale and high-bandwidth Lustre filesystems. This computational environment supports a catalogue of integrated reusable software and workflows from earth system and ecosystem modelling, weather research, satellite and other observed data processing and analysis. To enable transdisciplinary research on this scale, data needs to be harmonised so that researchers can readily apply techniques and software across the corpus of data available and not be constrained to work within artificial disciplinary boundaries. Future challenges will involve the further integration and analysis of this data across the social sciences to facilitate the impacts across the societal domain, including timely analysis to more accurately predict and forecast future climate and environmental state.

  10. JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing.

    PubMed

    Brown, David K; Penkler, David L; Musyoka, Thommas M; Bishop, Özlem Tastan

    2015-01-01

    Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.
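
    The core glue such a system needs, rendering a batch script for one workflow stage and handing it to the cluster resource manager, can be sketched in a few lines. The example below uses Slurm's sbatch purely for illustration (JMS itself integrates with whatever resource manager the site runs), and the stage name and command shown are hypothetical.

```python
from pathlib import Path
import subprocess

def submit_stage(name, command, cores=1, walltime="01:00:00"):
    """Render a minimal batch script for one workflow stage and submit it to
    the resource manager (Slurm's sbatch used here only as an example)."""
    script = Path(f"{name}.sh")
    script.write_text(
        "#!/bin/bash\n"
        f"#SBATCH --job-name={name}\n"
        f"#SBATCH --ntasks={cores}\n"
        f"#SBATCH --time={walltime}\n"
        f"{command}\n"
    )
    out = subprocess.run(["sbatch", str(script)],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()   # e.g. "Submitted batch job 12345"

# hypothetical first stage of a pipeline (requires a Slurm cluster)
print(submit_stage("align", "bash run_alignment.sh input.fasta", cores=8))
```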

  11. JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing

    PubMed Central

    Brown, David K.; Penkler, David L.; Musyoka, Thommas M.; Bishop, Özlem Tastan

    2015-01-01

    Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS. PMID:26280450

  12. Using Modeling and Simulation to Complement Testing for Increased Understanding of Weapon Subassembly Response.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wong, Michael K.; Davidson, Megan

    As part of Sandia’s nuclear deterrence mission, the B61-12 Life Extension Program (LEP) aims to modernize the aging weapon system. Modernization requires requalification and Sandia is using high performance computing to perform advanced computational simulations to better understand, evaluate, and verify weapon system performance in conjunction with limited physical testing. The Nose Bomb Subassembly (NBSA) of the B61-12 is responsible for producing a fuzing signal upon ground impact. The fuzing signal is dependent upon electromechanical impact sensors producing valid electrical fuzing signals at impact. Computer generated models were used to assess the timing between the impact sensor’s response to the deceleration of impact and damage to major components and system subassemblies. The modeling and simulation team worked alongside the physical test team to design a large-scale reverse ballistic test to not only assess system performance, but to also validate their computational models. The reverse ballistic test conducted at Sandia’s sled test facility sent a rocket sled with a representative target into a stationary B61-12 (NBSA) to characterize the nose crush and functional response of NBSA components. Data obtained from data recorders and high-speed photometrics were integrated with previously generated computer models in order to refine and validate the model’s ability to reliably simulate real-world effects. Large-scale tests are impractical to conduct for every single impact scenario. By creating reliable computer models, we can perform simulations that identify trends and produce estimates of outcomes over the entire range of required impact conditions. Sandia’s HPCs enable geometric resolution that was unachievable before, allowing for more fidelity and detail, and creating simulations that can provide insight to support evaluation of requirements and performance margins. As computing resources continue to improve, researchers at Sandia are hoping to improve these simulations so they provide increasingly credible analysis of the system response and performance over the full range of conditions.

  13. Arctic Boreal Vulnerability Experiment (ABoVE) Science Cloud

    NASA Astrophysics Data System (ADS)

    Duffy, D.; Schnase, J. L.; McInerney, M.; Webster, W. P.; Sinno, S.; Thompson, J. H.; Griffith, P. C.; Hoy, E.; Carroll, M.

    2014-12-01

    The effects of climate change are being revealed at alarming rates in the Arctic and Boreal regions of the planet. NASA's Terrestrial Ecology Program has launched a major field campaign to study these effects over the next 5 to 8 years. The Arctic Boreal Vulnerability Experiment (ABoVE) will challenge scientists to take measurements in the field, study remote observations, and even run models to better understand the impacts of a rapidly changing climate for areas of Alaska and western Canada. The NASA Center for Climate Simulation (NCCS) at the Goddard Space Flight Center (GSFC) has partnered with the Terrestrial Ecology Program to create a science cloud designed for this field campaign - the ABoVE Science Cloud. The cloud combines traditional high performance computing with emerging technologies to create an environment specifically designed for large-scale climate analytics. The ABoVE Science Cloud utilizes (1) virtualized high-speed InfiniBand networks, (2) a combination of high-performance file systems and object storage, and (3) virtual system environments tailored for data intensive, science applications. At the center of the architecture is a large object storage environment, much like a traditional high-performance file system, that supports data proximal processing using technologies like MapReduce on a Hadoop Distributed File System (HDFS). Surrounding the storage is a cloud of high performance compute resources with many processing cores and large memory coupled to the storage through an InfiniBand network. Virtual systems can be tailored to a specific scientist and provisioned on the compute resources with extremely high-speed network connectivity to the storage and to other virtual systems. In this talk, we will present the architectural components of the science cloud and examples of how it is being used to meet the needs of the ABoVE campaign. In our experience, the science cloud approach significantly lowers the barriers and risks to organizations that require high performance computing solutions and provides the NCCS with the agility required to meet our customers' rapidly increasing and evolving requirements.

  14. Short-term Temperature Prediction Using Adaptive Computing on Dynamic Scales

    NASA Astrophysics Data System (ADS)

    Hu, W.; Cervone, G.; Jha, S.; Balasubramanian, V.; Turilli, M.

    2017-12-01

    When predicting temperature, there are specific places and times when high-accuracy predictions are harder to obtain. For example, not all the sub-regions in the domain require the same amount of computing resources to generate an accurate prediction. Plateau areas might require fewer computing resources than mountainous areas because of the steeper gradient of temperature change in the latter. However, it is difficult to estimate beforehand the optimal allocation of computational resources because several parameters, in addition to orography, play a role in determining the accuracy of the forecasts. The allocation of resources to perform simulations can become a bottleneck because it requires human intervention to stop jobs or start new ones. The goal of this project is to design and develop a dynamic approach to generating short-term temperature predictions that can automatically determine the required computing resources and the geographic scales of the predictions based on the spatial and temporal uncertainties. The predictions and the prediction quality metrics are computed using a numeric weather prediction model, Analog Ensemble (AnEn), and the parallelization on high performance computing systems is accomplished using Ensemble Toolkit, one component of the RADICAL-Cybertools family of tools. RADICAL-Cybertools decouple the science needs from the computational capabilities by building an intermediate layer to run general ensemble patterns, regardless of the science. In this research, we show how the ensemble toolkit allows generating high-resolution temperature forecasts at different spatial and temporal resolutions. The AnEn algorithm is run using NAM analysis and forecast data for the continental United States over a period of 2 years. AnEn results show that the temperature forecasts perform well according to different probabilistic and deterministic statistical tests.
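
    The Analog Ensemble idea is simple to sketch: the ensemble for a new forecast is the set of past observations whose accompanying forecasts most resemble the current one. The toy version below uses a plain Euclidean distance over the predictors and synthetic data, whereas the operational AnEn metric weights predictors over a short time window; it is an illustration of the concept, not the authors' implementation.

```python
import numpy as np

def analog_ensemble(hist_forecasts, hist_observations, current_forecast, k=20):
    """Toy Analog Ensemble (AnEn): find the k historical forecasts most
    similar to the current forecast and return the matching observations
    as an ensemble of equally likely outcomes.

    hist_forecasts    : (n_times, n_predictors) past model forecasts
    hist_observations : (n_times,) observed temperatures for those times
    current_forecast  : (n_predictors,) the forecast being post-processed
    """
    dist = np.linalg.norm(hist_forecasts - current_forecast, axis=1)
    analogs = np.argsort(dist)[:k]           # indices of the k best analogs
    ensemble = hist_observations[analogs]
    return ensemble.mean(), ensemble.std(), ensemble

# synthetic illustration: forecasts of (temperature, wind speed)
rng = np.random.default_rng(0)
f = rng.normal(size=(5000, 2))
obs = f[:, 0] + 0.3 * rng.normal(size=5000)   # "truth" correlated with forecast
mean, spread, members = analog_ensemble(f, obs, current_forecast=np.array([1.0, 0.2]))
print(f"AnEn mean = {mean:.2f}, spread = {spread:.2f}")
```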

  15. High throughput computing: a solution for scientific analysis

    USGS Publications Warehouse

    O'Donnell, M.

    2011-01-01

    handle job failures due to hardware, software, or network interruptions (obviating the need to manually resubmit the job after each stoppage); be affordable; and most importantly, allow us to complete very large, complex analyses that otherwise would not even be possible. In short, we envisioned a job-management system that would take advantage of unused FORT CPUs within a local area network (LAN) to effectively distribute and run highly complex analytical processes. What we found was a solution that uses High Throughput Computing (HTC) and High Performance Computing (HPC) systems to do exactly that (Figure 1).

  16. Applied Information Systems Research Program (AISRP). Workshop 2: Meeting Proceedings

    NASA Technical Reports Server (NTRS)

    1992-01-01

    The Earth and space science participants were able to see where the current research can be applied in their disciplines and computer science participants could see potential areas for future application of computer and information systems research. The Earth and Space Science research proposals for the High Performance Computing and Communications (HPCC) program were under evaluation. Therefore, this effort was not discussed at the AISRP Workshop. OSSA's other high priority area in computer science is scientific visualization, with the entire second day of the workshop devoted to it.

  17. DOE High Performance Computing Operational Review (HPCOR): Enabling Data-Driven Scientific Discovery at HPC Facilities

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gerber, Richard; Allcock, William; Beggio, Chris

    2014-10-17

    U.S. Department of Energy (DOE) High Performance Computing (HPC) facilities are on the verge of a paradigm shift in the way they deliver systems and services to science and engineering teams. Research projects are producing a wide variety of data at unprecedented scale and level of complexity, with community-specific services that are part of the data collection and analysis workflow. On June 18-19, 2014 representatives from six DOE HPC centers met in Oakland, CA at the DOE High Performance Operational Review (HPCOR) to discuss how they can best provide facilities and services to enable large-scale data-driven scientific discovery at the DOE national laboratories. The report contains findings from that review.

  18. High Performance Semantic Factoring of Giga-Scale Semantic Graph Databases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Joslyn, Cliff A.; Adolf, Robert D.; Al-Saffar, Sinan

    2010-10-04

    As semantic graph database technology grows to address components ranging from extant large triple stores to SPARQL endpoints over SQL-structured relational databases, it will become increasingly important to be able to bring high performance computational resources to bear on their analysis, interpretation, and visualization, especially with respect to their innate semantic structure. Our research group built a novel high performance hybrid system comprising computational capability for semantic graph database processing utilizing the large multithreaded architecture of the Cray XMT platform, conventional clusters, and large data stores. In this paper we describe that architecture, and present the results of our deploying that for the analysis of the Billion Triple dataset with respect to its semantic factors.

  19. Development of High-speed Visualization System of Hypocenter Data Using CUDA-based GPU computing

    NASA Astrophysics Data System (ADS)

    Kumagai, T.; Okubo, K.; Uchida, N.; Matsuzawa, T.; Kawada, N.; Takeuchi, N.

    2014-12-01

    After the Great East Japan Earthquake on March 11, 2011, intelligent visualization of seismic information has become important for understanding earthquake phenomena. At the same time, the quantity of seismic data has grown enormous with the progress of high-accuracy observation networks; many parameters (e.g., positional information, origin time, magnitude) must be handled to display the seismic information efficiently. Therefore, high-speed processing of data and image information is necessary to handle enormous amounts of seismic data. Recently, the GPU (Graphics Processing Unit) has been used as an acceleration tool for data processing and calculation in various fields of study, a movement called GPGPU (General-Purpose computing on GPUs). Over the last few years GPU performance has kept improving rapidly, and GPU computing now provides a high-performance computing environment at a lower cost than before. Moreover, GPUs have an advantage for visualization of processed data, because the GPU was originally designed as an architecture for graphics processing. In GPU computing, the processed data are always stored in video memory, so drawing information can be written directly to the VRAM on the video card by combining CUDA with a graphics API. In this study, we employ CUDA together with OpenGL and/or DirectX to realize a full-GPU implementation. This method makes it possible to write drawing information to the VRAM on the video card without PCIe bus data transfer, enabling high-speed processing of seismic data. The present study examines GPU-based high-speed visualization and the feasibility of a high-speed visualization system for hypocenter data.

  20. The performance of low-cost commercial cloud computing as an alternative in computational chemistry.

    PubMed

    Thackston, Russell; Fortenberry, Ryan C

    2015-05-05

    The growth of commercial cloud computing (CCC) as a viable means of computational infrastructure is largely unexplored for the purposes of quantum chemistry. In this work, the PSI4 suite of computational chemistry programs is installed on five different types of Amazon Web Services CCC platforms. The performance for a set of electronically excited state single-point energies is compared between these CCC platforms and typical, "in-house" physical machines. Further considerations are made for the number of cores or virtual CPUs (vCPUs, for the CCC platforms), but no considerations are made for full parallelization of the program (even though parallelization of the BLAS library is implemented), complete high-performance computing cluster utilization, or steal time. Even with this most pessimistic view of the computations, CCC resources are shown to be more cost effective for significant numbers of typical quantum chemistry computations. Large numbers of large computations are still best handled by more traditional means, but smaller-scale research may be more effectively undertaken through CCC services. © 2015 Wiley Periodicals, Inc.
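
    The cost-effectiveness comparison at the heart of such a study reduces to simple bookkeeping: pay-as-you-go core-hours versus the amortized cost of owned hardware. The sketch below illustrates that arithmetic with entirely hypothetical prices and utilization figures; it is not taken from the paper.

```python
def cloud_cost(price_per_core_hour, cores, wall_hours):
    """Pay-as-you-go cost of one computation on a cloud instance."""
    return price_per_core_hour * cores * wall_hours

def inhouse_cost(hardware_cost, lifetime_years, utilization, cores, wall_hours,
                 power_cost_per_core_hour=0.0):
    """Rough amortized cost of the same computation on owned hardware:
    spread the purchase price over the core-hours actually used."""
    core_hours_available = cores * lifetime_years * 365 * 24 * utilization
    capital_per_core_hour = hardware_cost / core_hours_available
    return (capital_per_core_hour + power_cost_per_core_hour) * cores * wall_hours

# hypothetical numbers, purely illustrative
print("cloud   :", cloud_cost(0.05, cores=8, wall_hours=3))
print("in-house:", inhouse_cost(12000, lifetime_years=4, utilization=0.5,
                                cores=8, wall_hours=3,
                                power_cost_per_core_hour=0.01))
```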

  1. Fog computing job scheduling optimization based on bees swarm

    NASA Astrophysics Data System (ADS)

    Bitam, Salim; Zeadally, Sherali; Mellouk, Abdelhamid

    2018-04-01

    Fog computing is a new computing architecture, composed of a set of near-user edge devices called fog nodes, which collaborate in order to perform computational services such as running applications, storing significant amounts of data, and transmitting messages. Fog computing extends cloud computing by deploying digital resources at the premises of mobile users. In this new paradigm, management and operating functions, such as job scheduling, aim at providing high-performance, cost-effective services requested by mobile users and executed by fog nodes. We propose a new bio-inspired optimization approach called the Bees Life Algorithm (BLA) aimed at addressing the job scheduling problem in the fog computing environment. Our proposed approach is based on the optimized distribution of a set of tasks among all the fog computing nodes. The objective is to find an optimal tradeoff between CPU execution time and allocated memory required by fog computing services established by mobile users. Our empirical performance evaluation results demonstrate that the proposed approach outperforms traditional particle swarm optimization and genetic algorithms in terms of CPU execution time and allocated memory.
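
    The Bees Life Algorithm itself is not reproduced here, but the objective it optimizes, assigning tasks to fog nodes so as to trade off CPU execution time against allocated memory, can be illustrated with a much simpler swarm-style random search. The sketch below uses made-up task and node parameters and a deliberately naive neighborhood search; it only shows the shape of the problem, not the BLA.

```python
import random

def schedule_cost(assignment, task_cycles, node_speed, task_mem,
                  w_time=0.7, w_mem=0.3):
    """Weighted cost combining worst-node execution time and worst-node
    allocated memory.  assignment[i] = fog node that runs task i."""
    node_time = [0.0] * len(node_speed)
    node_mem = [0.0] * len(node_speed)
    for t, n in enumerate(assignment):
        node_time[n] += task_cycles[t] / node_speed[n]
        node_mem[n] += task_mem[t]
    return w_time * max(node_time) + w_mem * max(node_mem)

def swarm_search(task_cycles, task_mem, node_speed, n_bees=30, n_iter=200, seed=1):
    """Very simplified swarm search: a population of candidate schedules,
    each iteration perturbing the best-so-far by reassigning one task."""
    rng = random.Random(seed)
    n_tasks, n_nodes = len(task_cycles), len(node_speed)
    best = [rng.randrange(n_nodes) for _ in range(n_tasks)]
    best_cost = schedule_cost(best, task_cycles, node_speed, task_mem)
    for _ in range(n_iter):
        for _ in range(n_bees):
            cand = best[:]
            cand[rng.randrange(n_tasks)] = rng.randrange(n_nodes)
            c = schedule_cost(cand, task_cycles, node_speed, task_mem)
            if c < best_cost:
                best, best_cost = cand, c
    return best, best_cost

tasks = [5, 8, 3, 9, 6, 2, 7]   # CPU cycles (arbitrary units)
mem = [1, 2, 1, 3, 2, 1, 2]     # memory demand (arbitrary units)
nodes = [1.0, 1.5, 2.0]         # relative node speeds
print(swarm_search(tasks, mem, nodes))
```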

  2. Computer Simulation Performed for Columbia Project Cooling System

    NASA Technical Reports Server (NTRS)

    Ahmad, Jasim

    2005-01-01

    This demo shows a high-fidelity simulation of the air flow in the main computer room housing the Columbia (10,024 Intel Itanium processors) system. The simulation assesses the performance of the cooling system, identifies deficiencies, and recommends modifications to eliminate them. It used two in-house software packages on NAS supercomputers: Chimera Grid Tools to generate a geometric model of the computer room and the OVERFLOW-2 code for fluid and thermal simulation. This state-of-the-art technology can be easily extended to provide a general capability for air flow analyses of any modern computer room.

  3. Application of Adjoint Method and Spectral-Element Method to Tomographic Inversion of Regional Seismological Structure Beneath Japanese Islands

    NASA Astrophysics Data System (ADS)

    Tsuboi, S.; Miyoshi, T.; Obayashi, M.; Tono, Y.; Ando, K.

    2014-12-01

    Recent progress in large-scale computing using waveform modeling techniques and high-performance computing facilities has demonstrated the possibility of performing full-waveform inversion of three-dimensional (3D) seismological structure inside the Earth. We apply the adjoint method (Liu and Tromp, 2006) to obtain 3D structure beneath the Japanese Islands. First we implemented the Spectral-Element Method on the K-computer in Kobe, Japan. We have optimized SPECFEM3D_GLOBE (Komatitsch and Tromp, 2002) by using OpenMP so that the code fits the hybrid architecture of the K-computer. We could then use 82,134 nodes of the K-computer (657,072 cores) to compute synthetic waveforms with about 1 sec accuracy for a realistic 3D Earth model, and its performance was 1.2 PFLOPS. We use this optimized SPECFEM3D_GLOBE code, take one chunk around the Japanese Islands from the global mesh, and compute synthetic seismograms with an accuracy of about 10 seconds. We use the GAP-P2 mantle tomography model (Obayashi et al., 2009) as an initial 3D model and use as many broadband seismic stations available in this region as possible to perform the inversion. We then use the time windows for body waves and surface waves to compute adjoint sources and calculate adjoint kernels for seismic structure. We have performed several iterations and obtained an improved 3D structure beneath the Japanese Islands. The result demonstrates that waveform misfits between observed and theoretical seismograms improve as the iterations proceed. We are now preparing to use much shorter periods in our synthetic waveform computation and to obtain seismic structure for basin-scale models, such as the Kanto basin, where there is a dense seismic network and high seismic activity. Acknowledgements: This research was partly supported by the MEXT Strategic Program for Innovative Research. We used F-net seismograms of the National Research Institute for Earth Science and Disaster Prevention.

  4. NASA Tech Briefs, December 1993. Volume 17, No. 12

    NASA Technical Reports Server (NTRS)

    1993-01-01

    Topics covered include: High-Performance Computing; Electronic Components and Circuits; Electronic Systems; Physical Sciences; Materials; Computer Programs; Mechanics; Machinery/Automation; Manufacturing/Fabrication; Mathematics and Information Sciences; Life Sciences; Books and Reports.

  5. Analytical methodology for safety validation of computer controlled subsystems. Volume 1 : state-of-the-art and assessment of safety verification/validation methodologies

    DOT National Transportation Integrated Search

    1995-09-01

    This report describes the development of a methodology designed to assure that a sufficiently high level of safety is achieved and maintained in computer-based systems which perform safety critical functions in high-speed rail or magnetic levitation ...

  6. GPU acceleration of a petascale application for turbulent mixing at high Schmidt number using OpenMP 4.5

    NASA Astrophysics Data System (ADS)

    Clay, M. P.; Buaria, D.; Yeung, P. K.; Gotoh, T.

    2018-07-01

    This paper reports on the successful implementation of a massively parallel GPU-accelerated algorithm for the direct numerical simulation of turbulent mixing at high Schmidt number. The work stems from a recent development (Comput. Phys. Commun., vol. 219, 2017, 313-328), in which a low-communication algorithm was shown to attain high degrees of scalability on the Cray XE6 architecture when overlapping communication and computation via dedicated communication threads. An even higher level of performance has now been achieved using OpenMP 4.5 on the Cray XK7 architecture, where on each node the 16 integer cores of an AMD Interlagos processor share a single Nvidia K20X GPU accelerator. In the new algorithm, data movements are minimized by performing virtually all of the intensive scalar field computations in the form of combined compact finite difference (CCD) operations on the GPUs. A memory layout in departure from usual practices is found to provide much better performance for a specific kernel required to apply the CCD scheme. Asynchronous execution enabled by adding the OpenMP 4.5 NOWAIT clause to TARGET constructs improves scalability when used to overlap computation on the GPUs with computation and communication on the CPUs. On the 27-petaflops supercomputer Titan at Oak Ridge National Laboratory, USA, a GPU-to-CPU speedup factor of approximately 5 is consistently observed at the largest problem size of 81923 grid points for the scalar field computed with 8192 XK7 nodes.
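
    The combined compact difference (CCD) kernels at the center of the algorithm amount to banded linear solves for the derivatives. As a simplified stand-in, the sketch below implements the classic fourth-order Padé compact first derivative on a periodic grid (a single cyclic tridiagonal solve), in Python rather than the paper's GPU/OpenMP implementation; the actual CCD scheme couples first and second derivatives and is not reproduced here.

```python
import numpy as np
from scipy.linalg import solve_circulant

def compact_first_derivative(f, h):
    """Fourth-order Padé compact finite difference on a periodic grid:
        (1/4) f'_{i-1} + f'_i + (1/4) f'_{i+1} = 3 (f_{i+1} - f_{i-1}) / (4h)
    The implicit left-hand side is a cyclic tridiagonal (circulant) system,
    the kind of banded solve that compact-difference kernels perform."""
    n = f.size
    rhs = 3.0 * (np.roll(f, -1) - np.roll(f, 1)) / (4.0 * h)
    col = np.zeros(n)
    col[0], col[1], col[-1] = 1.0, 0.25, 0.25   # first column of the circulant matrix
    return solve_circulant(col, rhs)

# verify on a sine wave, where the exact derivative is cos(x)
n = 128
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
df = compact_first_derivative(np.sin(x), x[1] - x[0])
print("max error:", np.max(np.abs(df - np.cos(x))))   # small, fourth-order accurate
```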

  7. Understanding and Improving High-Performance I/O Subsystems

    NASA Technical Reports Server (NTRS)

    El-Ghazawi, Tarek A.; Frieder, Gideon; Clark, A. James

    1996-01-01

    This research program was conducted in the framework of the NASA Earth and Space Science (ESS) evaluations led by Dr. Thomas Sterling. In addition to the many important research findings for NASA and the prestigious publications, the program helped orient the doctoral research of two students towards parallel input/output in high-performance computing. Further, the experimental results in the case of the MasPar were very useful to MasPar, with whose technical management the P.I. had many interactions. The contributions of this program are drawn from three experimental studies conducted on different high-performance computing testbeds/platforms, and are therefore presented in three segments as follows: 1. Evaluating the parallel input/output subsystems of NASA high-performance computing testbeds, namely the MasPar MP-1 and MP-2; 2. Characterizing the physical input/output request patterns of NASA ESS applications, using the Beowulf platform; and 3. Dynamic scheduling techniques for hiding I/O latency in parallel applications such as sparse matrix computations. This third study was conducted on the Intel Paragon and also provided an experimental evaluation of the Parallel File System (PFS) and parallel input/output on the Paragon. This report is organized as follows. The summary of findings discusses the results of each of the three studies. Three appendices, each containing a key scholarly research paper that details the work of one of the studies, are included.

  8. Implementing Simulation Design of Experiments and Remote Execution on a High Performance Computing Cluster

    DTIC Science & Technology

    2007-09-01

    example, an application developed in Sun’s Netbeans [2007] integrated development environment (IDE) uses Swing class object for graphical user... Netbeans Version 5.5.1 [Computer Software]. Santa Clara, CA: Sun Microsystems. Process Modeler Version 7.0 [Computer Software]. Santa Clara, Ca

  9. Tse computers. [ultrahigh speed optical processing for two dimensional binary image

    NASA Technical Reports Server (NTRS)

    Schaefer, D. H.; Strong, J. P., III

    1977-01-01

    An ultra-high-speed computer that utilizes binary images as its basic computational entity is being developed. The basic logic components perform thousands of operations simultaneously. Technologies of the fiber optics, display, thin film, and semiconductor industries are being utilized in the building of the hardware.

  10. Turbocharged molecular discovery of OLED emitters: from high-throughput quantum simulation to highly efficient TADF devices

    NASA Astrophysics Data System (ADS)

    Gómez-Bombarelli, Rafael; Aguilera-Iparraguirre, Jorge; Hirzel, Timothy D.; Ha, Dong-Gwang; Einzinger, Markus; Wu, Tony; Baldo, Marc A.; Aspuru-Guzik, Alán.

    2016-09-01

    Discovering new OLED emitters requires many experiments to synthesize candidates and test performance in devices. Large scale computer simulation can greatly speed this search process but the problem remains challenging enough that brute force application of massive computing power is not enough to successfully identify novel structures. We report a successful High Throughput Virtual Screening study that leveraged a range of methods to optimize the search process. The generation of candidate structures was constrained to contain combinatorial explosion. Simulations were tuned to the specific problem and calibrated with experimental results. Experimentalists and theorists actively collaborated such that experimental feedback was regularly utilized to update and shape the computational search. Supervised machine learning methods prioritized candidate structures prior to quantum chemistry simulation to prevent wasting compute on likely poor performers. With this combination of techniques, each multiplying the strength of the search, this effort managed to navigate an area of molecular space and identify hundreds of promising OLED candidate structures. An experimentally validated selection of this set shows emitters with external quantum efficiencies as high as 22%.

  11. Overview of High-Fidelity Modeling Activities in the Numerical Propulsion System Simulations (NPSS) Project

    NASA Technical Reports Server (NTRS)

    Veres, Joseph P.

    2002-01-01

    A high-fidelity simulation of a commercial turbofan engine has been created as part of the Numerical Propulsion System Simulation Project. The high-fidelity computer simulation utilizes computer models that were developed at NASA Glenn Research Center in cooperation with turbofan engine manufacturers. The average-passage (APNASA) Navier-Stokes based viscous flow computer code is used to simulate the 3D flow in the compressors and turbines of the advanced commercial turbofan engine. The 3D National Combustion Code (NCC) is used to simulate the flow and chemistry in the advanced aircraft combustor. The APNASA turbomachinery code and the NCC combustor code exchange boundary conditions at the interface planes at the combustor inlet and exit. This computer simulation technique can evaluate engine performance at steady operating conditions. The 3D flow models provide detailed knowledge of the airflow within the fan and compressor, the high and low pressure turbines, and the flow and chemistry within the combustor. The models simulate the performance of the engine at operating conditions that include sea level takeoff and the altitude cruise condition.

  12. Large-scale structural analysis: The structural analyst, the CSM Testbed and the NAS System

    NASA Technical Reports Server (NTRS)

    Knight, Norman F., Jr.; Mccleary, Susan L.; Macy, Steven C.; Aminpour, Mohammad A.

    1989-01-01

    The Computational Structural Mechanics (CSM) activity is developing advanced structural analysis and computational methods that exploit high-performance computers. Methods are developed in the framework of the CSM testbed software system and applied to representative complex structural analysis problems from the aerospace industry. An overview of the CSM testbed methods development environment is presented and some numerical methods developed on a CRAY-2 are described. Selected application studies performed on the NAS CRAY-2 are also summarized.

  13. Queueing Network Models for Parallel Processing of Task Systems: an Operational Approach

    NASA Technical Reports Server (NTRS)

    Mak, Victor W. K.

    1986-01-01

    Computer performance modeling of possibly complex computations running on highly concurrent systems is considered. Earlier works in this area either dealt with a very simple program structure or resulted in methods with exponential complexity. An efficient procedure is developed to compute the performance measures for series-parallel-reducible task systems using queueing network models. The procedure is based on the concept of hierarchical decomposition and a new operational approach. Numerical results for three test cases are presented and compared to those of simulations.

  14. High Order Semi-Lagrangian Advection Scheme

    NASA Astrophysics Data System (ADS)

    Malaga, Carlos; Mandujano, Francisco; Becerra, Julian

    2014-11-01

    In most fluid phenomena, advection plays an important role. A numerical scheme capable of making quantitative predictions and simulations must compute correctly the advection terms appearing in the equations governing fluid flow. Here we present a high order forward semi-Lagrangian numerical scheme specifically tailored to compute material derivatives. The scheme relies on the geometrical interpretation of material derivatives to compute the time evolution of fields on grids that deform with the material fluid domain, an interpolating procedure of arbitrary order that preserves the moments of the interpolated distributions, and a nonlinear mapping strategy to perform interpolations between undeformed and deformed grids. Additionally, a discontinuity criterion was implemented to deal with discontinuous fields and shocks. Tests of pure advection, shock formation and nonlinear phenomena are presented to show performance and convergence of the scheme. The high computational cost is considerably reduced when implemented on massively parallel architectures found in graphics cards. The authors acknowledge funding from Fondo Sectorial CONACYT-SENER Grant Number 42536 (DGAJ-SPI-34-170412-217).
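
    The record describes a high-order forward semi-Lagrangian scheme; as a much simpler illustration of the semi-Lagrangian idea (trace the characteristic back to its departure point and interpolate), here is a first-order, backward, linear-interpolation sketch in C for 1D advection at constant velocity on a periodic domain. It is not the authors' method.

      #include <stdio.h>
      #include <math.h>

      #define N 200

      int main(void) {
          const double L = 1.0, u = 0.5, dt = 0.004, dx = L / N;
          double f[N], fnew[N];

          /* Initial condition: a Gaussian pulse centered at x = 0.3. */
          for (int i = 0; i < N; ++i) {
              double x = i * dx;
              f[i] = exp(-200.0 * (x - 0.3) * (x - 0.3));
          }

          for (int step = 0; step < 250; ++step) {
              for (int i = 0; i < N; ++i) {
                  /* Departure point of the characteristic arriving at x_i. */
                  double xd = i * dx - u * dt;
                  xd -= L * floor(xd / L);                 /* periodic wrap  */
                  int j  = ((int)floor(xd / dx)) % N;
                  int jp = (j + 1) % N;
                  double w = xd / dx - j;                  /* linear weight  */
                  fnew[i] = (1.0 - w) * f[j] + w * f[jp];
              }
              for (int i = 0; i < N; ++i) f[i] = fnew[i];
          }

          /* After 250 steps the pulse has been advected a distance u*dt*250 = 0.5. */
          printf("f near x=0.8: %f\n", f[(int)(0.8 / dx)]);
          return 0;
      }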

  15. Understanding Portability of a High-Level Programming Model on Contemporary Heterogeneous Architectures

    DOE PAGES

    Sabne, Amit J.; Sakdhnagool, Putt; Lee, Seyong; ...

    2015-07-13

    Accelerator-based heterogeneous computing is gaining momentum in the high-performance computing arena. However, the increased complexity of heterogeneous architectures demands more generic, high-level programming models. OpenACC is one such attempt to tackle this problem. Although the abstraction provided by OpenACC offers productivity, it raises questions concerning both functional and performance portability. In this article, the authors propose HeteroIR, a high-level, architecture-independent intermediate representation, to map high-level programming models, such as OpenACC, to heterogeneous architectures. They present a compiler approach that translates OpenACC programs into HeteroIR and accelerator kernels to obtain OpenACC functional portability. They then evaluate the performance portability obtained by OpenACC with their approach on 12 OpenACC programs on Nvidia CUDA, AMD GCN, and Intel Xeon Phi architectures. They study the effects of various compiler optimizations and OpenACC program settings on these architectures to provide insights into the achieved performance portability.
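
    For readers unfamiliar with OpenACC, the directive-based model whose portability the paper studies, a minimal C SAXPY shows the style of program such a compiler would translate; the arrays and values here are arbitrary and the pragmas, not this particular kernel, are the point.

      #include <stdio.h>

      #define N (1 << 20)

      int main(void) {
          static float x[N], y[N];
          const float a = 2.0f;

          for (int i = 0; i < N; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

          /* OpenACC offloads this loop to whatever accelerator the compiler
             targets (GPU, Xeon Phi, or the multicore host as a fallback). */
          #pragma acc parallel loop copyin(x[0:N]) copy(y[0:N])
          for (int i = 0; i < N; ++i)
              y[i] = a * x[i] + y[i];

          printf("y[0] = %f\n", y[0]);
          return 0;
      }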

  16. Connectionist Models and Parallelism in High Level Vision.

    DTIC Science & Technology

    1985-01-01

    Jerome A. Feldman; Grant N00014-82-K-0193. Computer Science...Connectionist Models 2.1 Background and Overview: Computer science is just beginning to look seriously at parallel computation: it may turn out that...the chair. The program includes intermediate level networks that compute more complex joints and ones that compute parallelograms in the image. These

  17. A survey of CPU-GPU heterogeneous computing techniques

    DOE PAGES

    Mittal, Sparsh; Vetter, Jeffrey S.

    2015-07-04

    As both CPU and GPU become employed in a wide range of applications, it has been acknowledged that both of these processing units (PUs) have their unique features and strengths and hence, CPU-GPU collaboration is inevitable to achieve high-performance computing. This has motivated significant amount of research on heterogeneous computing techniques, along with the design of CPU-GPU fused chips and petascale heterogeneous supercomputers. In this paper, we survey heterogeneous computing techniques (HCTs) such as workload-partitioning which enable utilizing both CPU and GPU to improve performance and/or energy efficiency. We review heterogeneous computing approaches at runtime, algorithm, programming, compiler and application level. Further, we review both discrete and fused CPU-GPU systems; and discuss benchmark suites designed for evaluating heterogeneous computing systems (HCSs). Furthermore, we believe that this paper will provide insights into working and scope of applications of HCTs to researchers and motivate them to further harness the computational powers of CPUs and GPUs to achieve the goal of exascale performance.
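
    One of the surveyed techniques, workload partitioning, is often driven by a simple throughput-proportional split. The C sketch below illustrates that heuristic only; run_on_gpu and run_on_cpu are hypothetical stand-ins for a real device kernel launch and host loop, and the rates are invented.

      #include <stdio.h>
      #include <stddef.h>

      /* Hypothetical stand-ins for a device kernel launch and a host loop. */
      static void run_on_gpu(size_t lo, size_t hi) { printf("GPU: [%zu, %zu)\n", lo, hi); }
      static void run_on_cpu(size_t lo, size_t hi) { printf("CPU: [%zu, %zu)\n", lo, hi); }

      int main(void) {
          const size_t n = 10000000;

          /* Throughputs (elements/s) estimated from a short profiling run. */
          const double gpu_rate = 8.0e9, cpu_rate = 2.0e9;

          /* Give the GPU a fraction of the iterations proportional to its share
             of the combined throughput, so both sides finish at about the same time. */
          double alpha = gpu_rate / (gpu_rate + cpu_rate);
          size_t split = (size_t)(alpha * n);

          run_on_gpu(0, split);      /* in a real code these two calls would be   */
          run_on_cpu(split, n);      /* issued concurrently and synchronized later */
          return 0;
      }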

  18. A survey of CPU-GPU heterogeneous computing techniques

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mittal, Sparsh; Vetter, Jeffrey S.

    As both CPU and GPU become employed in a wide range of applications, it has been acknowledged that both of these processing units (PUs) have their unique features and strengths and hence, CPU-GPU collaboration is inevitable to achieve high-performance computing. This has motivated significant amount of research on heterogeneous computing techniques, along with the design of CPU-GPU fused chips and petascale heterogeneous supercomputers. In this paper, we survey heterogeneous computing techniques (HCTs) such as workload-partitioning which enable utilizing both CPU and GPU to improve performance and/or energy efficiency. We review heterogeneous computing approaches at runtime, algorithm, programming, compiler and application level. Further, we review both discrete and fused CPU-GPU systems; and discuss benchmark suites designed for evaluating heterogeneous computing systems (HCSs). Furthermore, we believe that this paper will provide insights into working and scope of applications of HCTs to researchers and motivate them to further harness the computational powers of CPUs and GPUs to achieve the goal of exascale performance.

  19. Using the Eclipse Parallel Tools Platform to Assist Earth Science Model Development and Optimization on High Performance Computers

    NASA Astrophysics Data System (ADS)

    Alameda, J. C.

    2011-12-01

    Development and optimization of computational science models, particularly on high performance computers, and with the advent of ubiquitous multicore processor systems, practically on every system, has been accomplished with basic software tools, typically, command-line based compilers, debuggers, performance tools that have not changed substantially from the days of serial and early vector computers. However, model complexity, including the complexity added by modern message passing libraries such as MPI, and the need for hybrid code models (such as openMP and MPI) to be able to take full advantage of high performance computers with an increasing core count per shared memory node, has made development and optimization of such codes an increasingly arduous task. Additional architectural developments, such as many-core processors, only complicate the situation further. In this paper, we describe how our NSF-funded project, "SI2-SSI: A Productive and Accessible Development Workbench for HPC Applications Using the Eclipse Parallel Tools Platform" (WHPC) seeks to improve the Eclipse Parallel Tools Platform, an environment designed to support scientific code development targeted at a diverse set of high performance computing systems. Our WHPC project to improve Eclipse PTP takes an application-centric view to improve PTP. We are using a set of scientific applications, each with a variety of challenges, and using PTP to drive further improvements to both the scientific application, as well as to understand shortcomings in Eclipse PTP from an application developer perspective, to drive our list of improvements we seek to make. We are also partnering with performance tool providers, to drive higher quality performance tool integration. We have partnered with the Cactus group at Louisiana State University to improve Eclipse's ability to work with computational frameworks and extremely complex build systems, as well as to develop educational materials to incorporate into computational science and engineering codes. Finally, we are partnering with the lead PTP developers at IBM, to ensure we are as effective as possible within the Eclipse community development. We are also conducting training and outreach to our user community, including conference BOF sessions, monthly user calls, and an annual user meeting, so that we can best inform the improvements we make to Eclipse PTP. With these activities we endeavor to encourage use of modern software engineering practices, as enabled through the Eclipse IDE, with computational science and engineering applications. These practices include proper use of source code repositories, tracking and rectifying issues, measuring and monitoring code performance changes against both optimizations as well as ever-changing software stacks and configurations on HPC systems, as well as ultimately encouraging development and maintenance of testing suites -- things that have become commonplace in many software endeavors, but have lagged in the development of science applications. We view that the challenge with the increased complexity of both HPC systems and science applications demands the use of better software engineering methods, preferably enabled by modern tools such as Eclipse PTP, to help the computational science community thrive as we evolve the HPC landscape.

  20. Computer versus paper--does it make any difference in test performance?

    PubMed

    Karay, Yassin; Schauber, Stefan K; Stosch, Christoph; Schüttpelz-Brauns, Katrin

    2015-01-01

    CONSTRUCT: In this study, we examine the differences in test performance between the paper-based and the computer-based version of the Berlin formative Progress Test. In this context it is the first study that allows controlling for students' prior performance. Computer-based tests make possible a more efficient examination procedure for test administration and review. Although university staff will benefit largely from computer-based tests, the question arises if computer-based tests influence students' test performance. A total of 266 German students from the 9th and 10th semester of medicine (comparable with the 4th-year North American medical school schedule) participated in the study (paper = 132, computer = 134). The allocation of the test format was conducted as a randomized matched-pair design in which students were first sorted according to their prior test results. The organizational procedure, the examination conditions, the room, and seating arrangements, as well as the order of questions and answers, were identical in both groups. The sociodemographic variables and pretest scores of both groups were comparable. The test results from the paper and computer versions did not differ. The groups remained within the allotted time, but students using the computer version (particularly the high performers) needed significantly less time to complete the test. In addition, we found significant differences in guessing behavior. Low performers using the computer version guess significantly more than low-performing students in the paper-pencil version. Participants in computer-based tests are not at a disadvantage in terms of their test results. The computer-based test required less processing time. The reason for the longer processing time when using the paper-pencil version might be due to the time needed to write the answer down, controlling for transferring the answer correctly. It is still not known why students using the computer version (particularly low-performing students) guess at a higher rate. Further studies are necessary to understand this finding.
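
    The allocation procedure described above (sort students by prior performance, then split each adjacent pair between the two test formats at random) can be sketched in a few lines of C; the scores below are invented and the routine is only a schematic of a randomized matched-pair design, not the study's actual software.

      #include <stdio.h>
      #include <stdlib.h>

      typedef struct { int id; double pretest; int use_computer; } Student;

      static int by_pretest(const void *a, const void *b) {
          double d = ((const Student *)a)->pretest - ((const Student *)b)->pretest;
          return (d > 0) - (d < 0);
      }

      int main(void) {
          Student s[8] = {
              {1, 62.5, 0}, {2, 81.0, 0}, {3, 55.0, 0}, {4, 90.5, 0},
              {5, 73.0, 0}, {6, 68.5, 0}, {7, 88.0, 0}, {8, 59.5, 0},
          };
          int n = 8;

          srand(42);
          qsort(s, n, sizeof s[0], by_pretest);     /* sort by prior performance */

          /* Adjacent students form a matched pair; flip a coin to decide which
             member of each pair takes the computer-based version. */
          for (int i = 0; i + 1 < n; i += 2) {
              int first_gets_computer = rand() % 2;
              s[i].use_computer     = first_gets_computer;
              s[i + 1].use_computer = !first_gets_computer;
          }

          for (int i = 0; i < n; ++i)
              printf("student %d (pretest %.1f): %s\n", s[i].id, s[i].pretest,
                     s[i].use_computer ? "computer" : "paper");
          return 0;
      }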

  1. Innovative architectures for dense multi-microprocessor computers

    NASA Technical Reports Server (NTRS)

    Larson, Robert E.

    1989-01-01

    The purpose is to summarize a Phase 1 SBIR project performed for the NASA/Langley Computational Structural Mechanics Group. The project was performed from February to August 1987. The main objectives of the project were to: (1) expand upon previous research into the application of chordal ring architectures to the general problem of designing multi-microcomputer architectures, (2) attempt to identify a family of chordal rings such that each chordal ring can be simply expanded to produce the next member of the family, (3) perform a preliminary, high-level design of an expandable multi-microprocessor computer based upon chordal rings, (4) analyze the potential use of chordal ring based multi-microprocessors for sparse matrix problems and other applications arising in computational structural mechanics.
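
    A chordal ring of the kind the project studied is simply a ring of N nodes augmented with chords of a fixed length w. The short C sketch below, with arbitrary N and w, lists each node's neighbours, which is all the basic topology amounts to.

      #include <stdio.h>

      int main(void) {
          const int N = 12;   /* number of processors in the ring (arbitrary) */
          const int w = 4;    /* chord length (arbitrary)                     */

          /* Node i is linked to its ring neighbours i-1 and i+1 and to the
             ends of its chord, i-w and i+w, all taken modulo N. */
          for (int i = 0; i < N; ++i)
              printf("node %2d: ring (%2d, %2d)  chord (%2d, %2d)\n",
                     i, (i + N - 1) % N, (i + 1) % N, (i + N - w) % N, (i + w) % N);
          return 0;
      }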

  2. GPUs: An Emerging Platform for General-Purpose Computation

    DTIC Science & Technology

    2007-08-01

    programming; real-time cinematic quality graphics Peak stream (26) License required (limited time no- cost evaluation program) Commercially...folding.stanford.edu (accessed 30 March 2007). 2. Fan, Z.; Qiu, F.; Kaufman, A.; Yoakum-Stover, S. GPU Cluster for High Performance Computing. ACM/IEEE...accessed 30 March 2007). 8. Goodnight, N.; Wang, R.; Humphreys, G. Computation on Programmable Graphics Hardware. IEEE Computer Graphics and

  3. Unstructured Grid Euler Method Assessment for Longitudinal and Lateral/Directional Aerodynamic Performance Analysis of the HSR Technology Concept Airplane at Supersonic Cruise Speed

    NASA Technical Reports Server (NTRS)

    Ghaffari, Farhad

    1999-01-01

    Unstructured grid Euler computations, performed at supersonic cruise speed, are presented for a High Speed Civil Transport (HSCT) configuration, designated as the Technology Concept Airplane (TCA) within the High Speed Research (HSR) Program. The numerical results are obtained for the complete TCA cruise configuration which includes the wing, fuselage, empennage, diverters, and flow through nacelles at M (sub infinity) = 2.4 for a range of angles-of-attack and sideslip. Although all the present computations are performed for the complete TCA configuration, appropriate assumptions derived from the fundamental supersonic aerodynamic principles have been made to extract aerodynamic predictions to complement the experimental data obtained from a 1.675%-scaled truncated (aft fuselage/empennage components removed) TCA model. The validity of the computational results, derived from the latter assumptions, are thoroughly addressed and discussed in detail. The computed surface and off-surface flow characteristics are analyzed and the pressure coefficient contours on the wing lower surface are shown to correlate reasonably well with the available pressure sensitive paint results, particularly, for the complex flow structures around the nacelles. The predicted longitudinal and lateral/directional performance characteristics for the truncated TCA configuration are shown to correlate very well with the corresponding wind-tunnel data across the examined range of angles-of-attack and sideslip. The complementary computational results for the longitudinal and lateral/directional performance characteristics for the complete TCA configuration are also presented along with the aerodynamic effects due to empennage components. Results are also presented to assess the computational method performance, solution sensitivity to grid refinement, and solution convergence characteristics.

  4. Advanced Dynamically Adaptive Algorithms for Stochastic Simulations on Extreme Scales

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xiu, Dongbin

    2017-03-03

    The focus of the project is the development of mathematical methods and high-performance computational tools for stochastic simulations, with a particular emphasis on computations on extreme scales. The core of the project revolves around the design of highly efficient and scalable numerical algorithms that can adaptively and accurately, in high dimensional spaces, resolve stochastic problems with limited smoothness, even containing discontinuities.

  5. The iPlant Collaborative: Cyberinfrastructure for Enabling Data to Discovery for the Life Sciences

    PubMed Central

    Merchant, Nirav; Lyons, Eric; Goff, Stephen; Vaughn, Matthew; Ware, Doreen; Micklos, David; Antin, Parker

    2016-01-01

    The iPlant Collaborative provides life science research communities access to comprehensive, scalable, and cohesive computational infrastructure for data management; identity management; collaboration tools; and cloud, high-performance, high-throughput computing. iPlant provides training, learning material, and best practice resources to help all researchers make the best use of their data, expand their computational skill set, and effectively manage their data and computation when working as distributed teams. iPlant’s platform permits researchers to easily deposit and share their data and deploy new computational tools and analysis workflows, allowing the broader community to easily use and reuse those data and computational analyses. PMID:26752627

  6. Symplectic molecular dynamics simulations on specially designed parallel computers.

    PubMed

    Borstnik, Urban; Janezic, Dusanka

    2005-01-01

    We have developed a computer program for molecular dynamics (MD) simulation that implements the Split Integration Symplectic Method (SISM) and is designed to run on specialized parallel computers. The MD integration is performed by the SISM, which analytically treats high-frequency vibrational motion and thus enables the use of longer simulation time steps. The low-frequency motion is treated numerically on specially designed parallel computers, which decreases the computational time of each simulation time step. The combination of these approaches means that less time is required and fewer steps are needed and so enables fast MD simulations. We study the computational performance of MD simulation of molecular systems on specialized computers and provide a comparison to standard personal computers. The combination of the SISM with two specialized parallel computers is an effective way to increase the speed of MD simulations up to 16-fold over a single PC processor.
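
    In broad strokes (the published SISM contains further detail), the split-integration idea is a Hamiltonian splitting H = H_0 + H_r, where H_0 collects the high-frequency harmonic vibrational terms that can be propagated analytically and H_r the remaining, slowly varying interactions that are integrated numerically, composed symmetrically to second order:

      e^{\Delta t\, \hat{L}_{H}} \;\approx\; e^{\frac{\Delta t}{2} \hat{L}_{H_0}} \; e^{\Delta t\, \hat{L}_{H_r}} \; e^{\frac{\Delta t}{2} \hat{L}_{H_0}} ,

    where \hat{L} denotes the Liouville operator of the corresponding Hamiltonian. Because the stiff H_0 half-steps are treated exactly, the time step \Delta t is limited by the slow forces in H_r rather than by the fastest bond vibrations, which is the source of the longer permissible time steps mentioned above.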

  7. DNS of Flow in a Low-Pressure Turbine Cascade Using a Discontinuous-Galerkin Spectral-Element Method

    NASA Technical Reports Server (NTRS)

    Garai, Anirban; Diosady, Laslo Tibor; Murman, Scott; Madavan, Nateri

    2015-01-01

    A new computational capability under development for accurate and efficient high-fidelity direct numerical simulation (DNS) and large eddy simulation (LES) of turbomachinery is described. This capability is based on an entropy-stable Discontinuous-Galerkin spectral-element approach that extends to arbitrarily high orders of spatial and temporal accuracy and is implemented in a computationally efficient manner on a modern high performance computer architecture. A validation study using this method to perform DNS of flow in a low-pressure turbine airfoil cascade is presented. Preliminary results indicate that the method captures the main features of the flow. Discrepancies between the predicted results and the experiments are likely due to the effects of freestream turbulence not being included in the simulation and will be addressed in the final paper.

  8. Mathematical modelling of Bit-Level Architecture using Reciprocal Quantum Logic

    NASA Astrophysics Data System (ADS)

    Narendran, S.; Selvakumar, J.

    2018-04-01

    High-performance computing demands both speed and energy efficiency. Reciprocal Quantum Logic (RQL) is one technology that offers high speed and zero static power dissipation. RQL uses an AC power supply as input rather than a DC input, and it has three basic gates. Series of reciprocal transmission lines are placed between the gates to avoid power loss and to achieve high speed. An analytical model of a bit-level architecture is developed using RQL. A major drawback of Reciprocal Quantum Logic is area: distributing the power supply properly requires splitters, which occupy a large area. Distributed arithmetic evaluates a vector-vector multiplication in which one vector is constant and the other holds signed variables; each word is treated as a binary number, and its bits are rearranged and combined to form the distributed system. Distributed arithmetic is widely used in convolution and in high-performance computational devices.
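
    The closing description of distributed arithmetic can be made concrete with a small C sketch: an inner product with fixed coefficients is evaluated bit-serially from a precomputed table of coefficient sums, which is the rearrangement the text alludes to. Coefficients and inputs here are arbitrary, and the code illustrates the general technique in software rather than an RQL circuit.

      #include <stdio.h>
      #include <stdint.h>

      #define N 4   /* number of taps                          */
      #define B 8   /* bits per input sample (two's complement) */

      /* Hypothetical fixed coefficients. */
      static const int32_t c[N] = {3, -1, 4, 2};

      int main(void) {
          /* Hypothetical signed 8-bit inputs. */
          int8_t x[N] = {10, -5, 7, 100};

          /* LUT: lut[m] = sum of c[i] over every i whose bit is set in m. */
          int32_t lut[1 << N];
          for (int m = 0; m < (1 << N); ++m) {
              int32_t s = 0;
              for (int i = 0; i < N; ++i)
                  if (m & (1 << i)) s += c[i];
              lut[m] = s;
          }

          /* Bit-serial accumulation: process bit b of every input at once. */
          int32_t acc = 0;
          for (int b = 0; b < B; ++b) {
              int m = 0;
              for (int i = 0; i < N; ++i)
                  if (((uint8_t)x[i] >> b) & 1) m |= (1 << i);
              int32_t partial = lut[m];
              /* The MSB of a two's-complement input carries negative weight. */
              if (b == B - 1) partial = -partial;
              acc += partial << b;
          }

          /* Direct computation for comparison. */
          int32_t ref = 0;
          for (int i = 0; i < N; ++i) ref += c[i] * x[i];

          printf("distributed arithmetic: %d, direct: %d\n", acc, ref);
          return 0;
      }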

  9. Acceleration of fluoro-CT reconstruction for a mobile C-Arm on GPU and FPGA hardware: a simulation study

    NASA Astrophysics Data System (ADS)

    Xue, Xinwei; Cheryauka, Arvi; Tubbs, David

    2006-03-01

    CT imaging in interventional and minimally-invasive surgery requires high-performance computing solutions that meet operating room demands, healthcare business requirements, and the constraints of a mobile C-arm system. The computational requirements of clinical procedures using CT-like data are increasing rapidly, mainly due to the need for rapid access to medical imagery during critical surgical procedures. The highly parallel nature of Radon transform and CT algorithms enables embedded computing solutions utilizing a parallel processing architecture to realize a significant gain in computational intensity at comparable hardware and program coding/testing expense. In this paper, using a sample 2D and 3D CT problem, we explore the programming challenges and the potential benefits of embedded computing using commodity hardware components. The accuracy and performance results obtained on three computational platforms (a single CPU, a single GPU, and a solution based on FPGA technology) have been analyzed. We have shown that hardware-accelerated CT image reconstruction can be achieved with similar levels of noise and clarity of feature when compared to program execution on a CPU, while gaining a performance increase of one or more orders of magnitude. 3D cone-beam or helical CT reconstruction and a variety of volumetric image processing applications will benefit from similar accelerations.

  10. Taming Pipelines, Users, and High Performance Computing with Rector

    NASA Astrophysics Data System (ADS)

    Estes, N. M.; Bowley, K. S.; Paris, K. N.; Silva, V. H.; Robinson, M. S.

    2018-04-01

    Rector is a high-performance job management system created by the LROC SOC team to enable processing of thousands of observations and ancillary data products as well as ad-hoc user jobs across a 634 CPU core processing cluster.

  11. Implementing Scientific Simulation Codes Highly Tailored for Vector Architectures Using Custom Configurable Computing Machines

    NASA Technical Reports Server (NTRS)

    Rutishauser, David

    2006-01-01

    The motivation for this work comes from an observation that amidst the push for Massively Parallel (MP) solutions to high-end computing problems such as numerical physical simulations, large amounts of legacy code exist that are highly optimized for vector supercomputers. Because re-hosting legacy code often requires a complete re-write of the original code, which can be a very long and expensive effort, this work examines the potential to exploit reconfigurable computing machines in place of a vector supercomputer to implement an essentially unmodified legacy source code. Custom and reconfigurable computing resources could be used to emulate an original application's target platform to the extent required to achieve high performance. To arrive at an architecture that delivers the desired performance subject to limited resources involves solving a multi-variable optimization problem with constraints. Prior research in the area of reconfigurable computing has demonstrated that designing an optimum hardware implementation of a given application under hardware resource constraints is an NP-complete problem. The premise of the approach is that the general issue of applying reconfigurable computing resources to the implementation of an application, maximizing the performance of the computation subject to physical resource constraints, can be made a tractable problem by assuming a computational paradigm, such as vector processing. This research contributes a formulation of the problem and a methodology to design a reconfigurable vector processing implementation of a given application that satisfies a performance metric. A generic, parametric, architectural framework for vector processing implemented in reconfigurable logic is developed as a target for a scheduling/mapping algorithm that maps an input computation to a given instance of the architecture. This algorithm is integrated with an optimization framework to arrive at a specification of the architecture parameters that attempts to minimize execution time, while staying within resource constraints. The flexibility of using a custom reconfigurable implementation is exploited in a unique manner to leverage the lessons learned in vector supercomputer development. The vector processing framework is tailored to the application, with variable parameters that are fixed in traditional vector processing. Benchmark data that demonstrates the functionality and utility of the approach is presented. The benchmark data includes an identified bottleneck in a real case study example vector code, the NASA Langley Terminal Area Simulation System (TASS) application.

  12. MultiPhyl: a high-throughput phylogenomics webserver using distributed computing

    PubMed Central

    Keane, Thomas M.; Naughton, Thomas J.; McInerney, James O.

    2007-01-01

    With the number of fully sequenced genomes increasing steadily, there is greater interest in performing large-scale phylogenomic analyses from large numbers of individual gene families. Maximum likelihood (ML) has been shown repeatedly to be one of the most accurate methods for phylogenetic construction. Recently, there have been a number of algorithmic improvements in maximum-likelihood-based tree search methods. However, it can still take a long time to analyse the evolutionary history of many gene families using a single computer. Distributed computing refers to a method of combining the computing power of multiple computers in order to perform some larger overall calculation. In this article, we present the first high-throughput implementation of a distributed phylogenetics platform, MultiPhyl, capable of using the idle computational resources of many heterogeneous non-dedicated machines to form a phylogenetics supercomputer. MultiPhyl allows a user to upload hundreds or thousands of amino acid or nucleotide alignments simultaneously and perform computationally intensive tasks such as model selection, tree searching and bootstrapping of each of the alignments using many desktop machines. The program implements a set of 88 amino acid models and 56 nucleotide maximum likelihood models and a variety of statistical methods for choosing between alternative models. A MultiPhyl webserver is available for public use at: http://www.cs.nuim.ie/distributed/multiphyl.php. PMID:17553837

  13. Computing Systems | High-Performance Computing | NREL

    Science.gov Websites

    investigate, build, and test models of complex phenomena or entire integrated systems that cannot be directly observed or manipulated in the lab, or would be too expensive or time consuming. Models and visualizations

  14. High-Performance Computing Data Center Efficiency Dashboard | Computational

    Science.gov Websites

    recovery water (ERW) loop Heat exchanger for energy recovery Thermosyphon Heat exchanger between ERW loop and cooling tower loop Evaporative cooling towers Learn more about our energy-efficient facility

  15. System Access | High-Performance Computing | NREL

    Science.gov Websites

    ) systems. Photo of man looking at a large computer monitor with a colorful, visual display of data. System secure shell gateway (SSH) or virtual private network (VPN). User Accounts Request a user account

  16. A Framework for Debugging Geoscience Projects in a High Performance Computing Environment

    NASA Astrophysics Data System (ADS)

    Baxter, C.; Matott, L.

    2012-12-01

    High performance computing (HPC) infrastructure has become ubiquitous in today's world with the emergence of commercial cloud computing and academic supercomputing centers. Teams of geoscientists, hydrologists and engineers can take advantage of this infrastructure to undertake large research projects - for example, linking one or more site-specific environmental models with soft computing algorithms, such as heuristic global search procedures, to perform parameter estimation and predictive uncertainty analysis, and/or design least-cost remediation systems. However, the size, complexity and distributed nature of these projects can make identifying failures in the associated numerical experiments using conventional ad-hoc approaches both time-consuming and ineffective. To address these problems a multi-tiered debugging framework has been developed. The framework allows for quickly isolating and remedying a number of potential experimental failures, including: failures in the HPC scheduler; bugs in the soft computing code; bugs in the modeling code; and permissions and access control errors. The utility of the framework is demonstrated via application to a series of over 200,000 numerical experiments involving a suite of 5 heuristic global search algorithms and 15 mathematical test functions serving as cheap analogues for the simulation-based optimization of pump-and-treat subsurface remediation systems.

  17. High performance in silico virtual drug screening on many-core processors.

    PubMed

    McIntosh-Smith, Simon; Price, James; Sessions, Richard B; Ibarra, Amaurys A

    2015-05-01

    Drug screening is an important part of the drug development pipeline for the pharmaceutical industry. Traditional, lab-based methods are increasingly being augmented with computational methods, ranging from simple molecular similarity searches through more complex pharmacophore matching to more computationally intensive approaches, such as molecular docking. The latter simulates the binding of drug molecules to their targets, typically protein molecules. In this work, we describe BUDE, the Bristol University Docking Engine, which has been ported to the OpenCL industry standard parallel programming language in order to exploit the performance of modern many-core processors. Our highly optimized OpenCL implementation of BUDE sustains 1.43 TFLOP/s on a single Nvidia GTX 680 GPU, or 46% of peak performance. BUDE also exploits OpenCL to deliver effective performance portability across a broad spectrum of different computer architectures from different vendors, including GPUs from Nvidia and AMD, Intel's Xeon Phi and multi-core CPUs with SIMD instruction sets.

  18. High performance in silico virtual drug screening on many-core processors

    PubMed Central

    Price, James; Sessions, Richard B; Ibarra, Amaurys A

    2015-01-01

    Drug screening is an important part of the drug development pipeline for the pharmaceutical industry. Traditional, lab-based methods are increasingly being augmented with computational methods, ranging from simple molecular similarity searches through more complex pharmacophore matching to more computationally intensive approaches, such as molecular docking. The latter simulates the binding of drug molecules to their targets, typically protein molecules. In this work, we describe BUDE, the Bristol University Docking Engine, which has been ported to the OpenCL industry standard parallel programming language in order to exploit the performance of modern many-core processors. Our highly optimized OpenCL implementation of BUDE sustains 1.43 TFLOP/s on a single Nvidia GTX 680 GPU, or 46% of peak performance. BUDE also exploits OpenCL to deliver effective performance portability across a broad spectrum of different computer architectures from different vendors, including GPUs from Nvidia and AMD, Intel’s Xeon Phi and multi-core CPUs with SIMD instruction sets. PMID:25972727

  19. Genetic algorithm based task reordering to improve the performance of batch scheduled massively parallel scientific applications

    DOE PAGES

    Sankaran, Ramanan; Angel, Jordan; Brown, W. Michael

    2015-04-08

    The growth in size of networked high performance computers along with novel accelerator-based node architectures has further emphasized the importance of communication efficiency in high performance computing. The world's largest high performance computers are usually operated as shared user facilities due to the costs of acquisition and operation. Applications are scheduled for execution in a shared environment and are placed on nodes that are not necessarily contiguous on the interconnect. Furthermore, the placement of tasks on the nodes allocated by the scheduler is sub-optimal, leading to performance loss and variability. Here, we investigate the impact of task placement on the performance of two massively parallel application codes on the Titan supercomputer, a turbulent combustion flow solver (S3D) and a molecular dynamics code (LAMMPS). Benchmark studies show a significant deviation from ideal weak scaling and variability in performance. The inter-task communication distance was determined to be one of the significant contributors to the performance degradation and variability. A genetic algorithm-based parallel optimization technique was used to optimize the task ordering. This technique provides an improved placement of the tasks on the nodes, taking into account the application's communication topology and the system interconnect topology. As a result, application benchmarks after task reordering through genetic algorithm show a significant improvement in performance and reduction in variability, therefore enabling the applications to achieve better time to solution and scalability on Titan during production.
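
    As a greatly simplified stand-in for the paper's genetic algorithm (a random pairwise-swap local search rather than a real GA), the C sketch below shows the core ingredient: a placement cost that sums hop distances between communicating tasks on a chain of nodes, which reordering then tries to reduce. The communication pattern, machine size, and cost model are invented.

      #include <stdio.h>
      #include <stdlib.h>

      #define T 64                 /* number of MPI tasks (hypothetical) */
      #define SLOTS_PER_NODE 4

      static int node_of(const int place[], int t) { return place[t] / SLOTS_PER_NODE; }

      /* Hop distance on a 1-D chain of nodes between the hosts of tasks a and b. */
      static int hops(const int place[], int a, int b) {
          int d = node_of(place, a) - node_of(place, b);
          return d < 0 ? -d : d;
      }

      /* Communication cost for a nearest-neighbour (stencil-like) pattern:
         each task exchanges data with ranks t-1 and t+1. */
      static long cost(const int place[]) {
          long c = 0;
          for (int t = 0; t + 1 < T; ++t)
              c += hops(place, t, t + 1);
          return c;
      }

      int main(void) {
          int place[T];
          srand(7);

          /* Identity placement, then shuffle to mimic a scattered allocation. */
          for (int t = 0; t < T; ++t) place[t] = t;
          for (int t = T - 1; t > 0; --t) {
              int j = rand() % (t + 1), tmp = place[t];
              place[t] = place[j];
              place[j] = tmp;
          }

          long cur = cost(place);
          printf("cost of scattered placement: %ld\n", cur);

          /* Random pairwise swaps, kept only if they do not increase the cost. */
          for (int it = 0; it < 200000; ++it) {
              int a = rand() % T, b = rand() % T;
              int tmp = place[a]; place[a] = place[b]; place[b] = tmp;
              long c = cost(place);
              if (c <= cur) {
                  cur = c;
              } else {
                  tmp = place[a]; place[a] = place[b]; place[b] = tmp;   /* revert */
              }
          }
          printf("cost after reordering:       %ld\n", cur);
          return 0;
      }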

  20. Accessible high performance computing solutions for near real-time image processing for time critical applications

    NASA Astrophysics Data System (ADS)

    Bielski, Conrad; Lemoine, Guido; Syryczynski, Jacek

    2009-09-01

    High Performance Computing (HPC) hardware solutions such as grid computing and General Processing on a Graphics Processing Unit (GPGPU) are now accessible to users with general computing needs. Grid computing infrastructures in the form of computing clusters or blades are becoming common place and GPGPU solutions that leverage the processing power of the video card are quickly being integrated into personal workstations. Our interest in these HPC technologies stems from the need to produce near real-time maps from a combination of pre- and post-event satellite imagery in support of post-disaster management. Faster processing provides a twofold gain in this situation: 1. critical information can be provided faster and 2. more elaborate automated processing can be performed prior to providing the critical information. In our particular case, we test the use of the PANTEX index which is based on analysis of image textural measures extracted using anisotropic, rotation-invariant GLCM statistics. The use of this index, applied in a moving window, has been shown to successfully identify built-up areas in remotely sensed imagery. Built-up index image masks are important input to the structuring of damage assessment interpretation because they help optimise the workload. The performance of computing the PANTEX workflow is compared on two different HPC hardware architectures: (1) a blade server with 4 blades, each having dual quad-core CPUs and (2) a CUDA enabled GPU workstation. The reference platform is a dual CPU-quad core workstation and the PANTEX workflow total computing time is measured. Furthermore, as part of a qualitative evaluation, the differences in setting up and configuring various hardware solutions and the related software coding effort is presented.
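
    PANTEX itself is more involved, but the GLCM texture measure at its core can be illustrated with a short C sketch: quantize a small image window to a few grey levels, count co-occurrences for one pixel offset, and derive a contrast statistic. The window values and parameters are invented.

      #include <stdio.h>

      #define W 8      /* window size      */
      #define G 4      /* grey-level bins  */

      int main(void) {
          /* Hypothetical 8x8 window of 8-bit grey values. */
          unsigned char img[W][W] = {
              { 10,  12,  40,  45, 200, 210,  60,  62},
              { 11,  13,  42,  44, 205, 208,  61,  65},
              { 90,  92, 100, 110, 120, 118,  70,  72},
              { 91,  95, 105, 112, 119, 121,  71,  74},
              {180, 182, 160, 150,  30,  32, 140, 142},
              {181, 185, 161, 152,  31,  35, 141, 144},
              { 20,  22, 220, 222,  50,  52, 130, 132},
              { 21,  25, 221, 225,  51,  55, 131, 134},
          };

          int glcm[G][G] = {{0}};
          int pairs = 0;

          /* Co-occurrence counts for the offset (dx, dy) = (1, 0). */
          for (int y = 0; y < W; ++y)
              for (int x = 0; x + 1 < W; ++x) {
                  int a = img[y][x]     * G / 256;
                  int b = img[y][x + 1] * G / 256;
                  ++glcm[a][b];
                  ++pairs;
              }

          /* Contrast: sum of P(i,j) * (i - j)^2 over the normalized GLCM. */
          double contrast = 0.0;
          for (int i = 0; i < G; ++i)
              for (int j = 0; j < G; ++j)
                  contrast += (double)glcm[i][j] / pairs * (i - j) * (i - j);

          printf("GLCM contrast of the window: %f\n", contrast);
          return 0;
      }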

  1. High performance pipelined multiplier with fast carry-save adder

    NASA Technical Reports Server (NTRS)

    Wu, Angus

    1990-01-01

    A high-performance pipelined multiplier is described. Its high performance results from the fast carry-save adder basic cell, which has a simple structure and is suitable for the Gate Forest semi-custom environment. The carry-save adder computes the sum and carry within two gate delays. Results show that the proposed adder can operate at 200 MHz for a 2-micron CMOS process; better performance is expected in a Gate Forest realization.
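
    The carry-save idea behind the cell described above is easy to state in software terms: three operands are reduced to a sum word and a carry word without propagating carries, and only one final conventional addition resolves them. The C sketch below models that arithmetic (not the CMOS cell) with arbitrary operands.

      #include <stdio.h>
      #include <stdint.h>

      int main(void) {
          uint32_t a = 0x1234, b = 0x00FF, c = 0x0F0F;

          /* Carry-save stage: per-bit full adders, no carry propagation.
             sum holds the XOR (sum bits); carry holds the majority bits,
             shifted left one position because their weight is doubled. */
          uint32_t sum   = a ^ b ^ c;
          uint32_t carry = ((a & b) | (a & c) | (b & c)) << 1;

          /* A single carry-propagating addition resolves the redundant form. */
          uint32_t result = sum + carry;

          printf("carry-save: %u, direct: %u\n", result, a + b + c);
          return 0;
      }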

  2. Image Processor Electronics (IPE): The High-Performance Computing System for NASA SWIFT Mission

    NASA Technical Reports Server (NTRS)

    Nguyen, Quang H.; Settles, Beverly A.

    2003-01-01

    Gamma Ray Bursts (GRBs) are believed to be the most powerful explosions that have occurred in the Universe since the Big Bang and are a mystery to the scientific community. Swift, a NASA mission that includes international participation, was designed and built in preparation for a 2003 launch to help to determine the origin of Gamma Ray Bursts. Locating the position in the sky where a burst originates requires intensive computing, because the duration of a GRB can range between a few milliseconds up to approximately a minute. The instrument data system must constantly accept multiple images representing large regions of the sky that are generated by sixteen gamma ray detectors operating in parallel. It then must process the received images very quickly in order to determine the existence of possible gamma ray bursts and their locations. The high-performance instrument data computing system that accomplishes this is called the Image Processor Electronics (IPE). The IPE was designed, built and tested by NASA Goddard Space Flight Center (GSFC) in order to meet these challenging requirements. The IPE is a small size, low power and high performing computing system for space applications. This paper addresses the system implementation and the system hardware architecture of the IPE. The paper concludes with the IPE system performance that was measured during end-to-end system testing.

  3. The Impact of Computer Simulations as Interactive Demonstration Tools on the Performance of Grade 11 Learners in Electromagnetism

    ERIC Educational Resources Information Center

    Kotoka, Jonas; Kriek, Jeanne

    2014-01-01

    The impact of computer simulations on the performance of 65 grade 11 learners in electromagnetism in a South African high school in the Mpumalanga province is investigated. Learners did not use the simulations individually, but teachers used them as an interactive demonstration tool. Basic concepts in electromagnetism are difficult to understand…

  4. Architectural Specialization for Inter-Iteration Loop Dependence Patterns

    DTIC Science & Technology

    2015-10-01

    Architectural Specialization for Inter-Iteration Loop Dependence Patterns. Christopher Batten, Computer Systems Laboratory, School of Electrical and... [Figure: Trends in Computer Architecture; transistors (thousands), frequency (MHz), and typical power (W) for the MIPS R2K, DEC Alpha 21264, and Intel P4; data collected by M...] [Figure: design performance versus energy efficiency (tasks per Joule), contrasting simple processor designs under a power constraint with high-performance and embedded architectures]

  5. Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

    DOE PAGES

    Shao, Meiyue; Aktulga, H.  Metin; Yang, Chao; ...

    2017-09-14

    In this paper, we describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of special structure of the nuclear configuration interaction problem which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation, and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. Finally, we also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos based algorithm for problems of moderate sizes on a Cray XC30 system.
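
    A generic preconditioned block iteration of the kind contrasted here with Lanczos (a sketch only, not necessarily the authors' exact scheme) updates a block X of k approximate eigenvectors with preconditioned residuals followed by a Rayleigh-Ritz projection:

      R = A X - X \Lambda, \qquad W = M^{-1} R, \qquad S = \operatorname{orth}\!\left( [\, X \;\; W \,] \right),
      \qquad (S^{\mathsf{T}} A S)\, Y = Y \Theta, \qquad X \leftarrow S\, Y_{k},

    where \Lambda holds the current Ritz values, M is the preconditioner, and Y_k contains the k Ritz vectors of smallest Ritz value. Good starting guesses and an effective M are exactly the ingredients the abstract credits for the rapid convergence.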

  6. Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shao, Meiyue; Aktulga, H.  Metin; Yang, Chao

    In this paper, we describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of special structure of the nuclear configuration interaction problem which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation, and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. Finally, we also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos based algorithm for problems of moderate sizes on a Cray XC30 system.

  7. A simple grid implementation with Berkeley Open Infrastructure for Network Computing using BLAST as a model

    PubMed Central

    Pinthong, Watthanai; Muangruen, Panya

    2016-01-01

    Development of high-throughput technologies, such as Next-generation sequencing, allows thousands of experiments to be performed simultaneously while reducing resource requirement. Consequently, a massive amount of experiment data is now rapidly generated. Nevertheless, the data are not readily usable or meaningful until they are further analysed and interpreted. Due to the size of the data, a high performance computer (HPC) is required for the analysis and interpretation. However, the HPC is expensive and difficult to access. Other means were developed to allow researchers to acquire the power of HPC without a need to purchase and maintain one such as cloud computing services and grid computing system. In this study, we implemented grid computing in a computer training center environment using Berkeley Open Infrastructure for Network Computing (BOINC) as a job distributor and data manager combining all desktop computers to virtualize the HPC. Fifty desktop computers were used for setting up a grid system during the off-hours. In order to test the performance of the grid system, we adapted the Basic Local Alignment Search Tools (BLAST) to the BOINC system. Sequencing results from Illumina platform were aligned to the human genome database by BLAST on the grid system. The result and processing time were compared to those from a single desktop computer and HPC. The estimated durations of BLAST analysis for 4 million sequence reads on a desktop PC, HPC and the grid system were 568, 24 and 5 days, respectively. Thus, the grid implementation of BLAST by BOINC is an efficient alternative to the HPC for sequence alignment. The grid implementation by BOINC also helped tap unused computing resources during the off-hours and could be easily modified for other available bioinformatics software. PMID:27547555

  8. New NAS Parallel Benchmarks Results

    NASA Technical Reports Server (NTRS)

    Yarrow, Maurice; Saphir, William; VanderWijngaart, Rob; Woo, Alex; Kutler, Paul (Technical Monitor)

    1997-01-01

    NPB2 (NAS (NASA Advanced Supercomputing) Parallel Benchmarks 2) is an implementation, based on Fortran and the MPI (message passing interface) message passing standard, of the original NAS Parallel Benchmark specifications. NPB2 programs are run with little or no tuning, in contrast to NPB vendor implementations, which are highly optimized for specific architectures. NPB2 results complement, rather than replace, NPB results. Because they have not been optimized by vendors, NPB2 implementations approximate the performance a typical user can expect for a portable parallel program on distributed memory parallel computers. Together these results provide an insightful comparison of the real-world performance of high-performance computers. New NPB2 features: New implementation (CG), new workstation class problem sizes, new serial sample versions, more performance statistics.

  9. High Performance Computing and Storage Requirements for Nuclear Physics: Target 2017

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gerber, Richard; Wasserman, Harvey

    2014-04-30

    In April 2014, NERSC, ASCR, and the DOE Office of Nuclear Physics (NP) held a review to characterize high performance computing (HPC) and storage requirements for NP research through 2017. This review is the 12th in a series of reviews held by NERSC and Office of Science program offices that began in 2009. It is the second for NP, and the final one in the second round of reviews that covered the six Office of Science program offices. This report is the result of that review.

  10. Strategic research in the social sciences

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bainbridge, W.S.

    1995-12-31

    The federal government has identified a number of multi-agency funding initiatives for science in strategic areas, such as the initiatives on global environmental change and high performance computing, that give some role to the social sciences. Seven strategic areas for social science research are given with potential for federal funding: (1) Democratization. (2) Human Capital. (3) Administrative Science. (4) Cognitive Science. (5) High Performance Computing and Digital Libraries. (6) Human Dimensions of Environmental Change. and (7) Human Genetic Diversity. The first two are addressed in detail and the remainder as a group. 10 refs.

  11. National High Performance Computer Technology Act of 1989. Hearings Before the Subcommittee on Science, Technology, and Space of the Committee on Commerce, Science, and Transportation. United States Senate, One Hundred First Congress, First Session (June 21, July 26, and September 15, 1989).

    ERIC Educational Resources Information Center

    Congress of the U.S., Washington, DC. Senate Committee on Commerce, Science, and Transportation.

    This collection of statements focuses on Title 2 of S. 1067, which calls for the National Science Foundation to establish a National Research and Education Network (NREN) by 1996. This is one of several titles in a bill to provide for a coordinated federal research program to ensure continued U.S. leadership in high performance computing. The…

  12. Evolving Storage and Cyber Infrastructure at the NASA Center for Climate Simulation

    NASA Technical Reports Server (NTRS)

    Salmon, Ellen; Duffy, Daniel; Spear, Carrie; Sinno, Scott; Vaughan, Garrison; Bowen, Michael

    2018-01-01

    This talk will describe recent developments at the NASA Center for Climate Simulation, which is funded by NASA's Science Mission Directorate and supports the specialized data storage and computational needs of weather, ocean, and climate researchers, as well as astrophysicists, heliophysicists, and planetary scientists. To meet requirements for higher-resolution, higher-fidelity simulations, the NCCS augments its High Performance Computing (HPC) and storage and retrieval environment. As the petabytes of model and observational data grow, the NCCS is broadening its data services offerings and deploying and expanding virtualization resources for high-performance analytics.

  13. The Roots of Beowulf

    NASA Technical Reports Server (NTRS)

    Fischer, James R.

    2014-01-01

    The first Beowulf Linux commodity cluster was constructed at NASA's Goddard Space Flight Center in 1994 and its origins are a part of the folklore of high-end computing. In fact, the conditions within Goddard that brought the idea into being were shaped by rich historical roots, strategic pressures brought on by the ramp up of the Federal High-Performance Computing and Communications Program, growth of the open software movement, microprocessor performance trends, and the vision of key technologists. This multifaceted story is told here for the first time from the point of view of NASA project management.

  14. NCI's High Performance Computing (HPC) and High Performance Data (HPD) Computing Platform for Environmental and Earth System Data Science

    NASA Astrophysics Data System (ADS)

    Evans, Ben; Allen, Chris; Antony, Joseph; Bastrakova, Irina; Gohar, Kashif; Porter, David; Pugh, Tim; Santana, Fabiana; Smillie, Jon; Trenham, Claire; Wang, Jingbo; Wyborn, Lesley

    2015-04-01

    The National Computational Infrastructure (NCI) has established a powerful and flexible in-situ petascale computational environment to enable both high performance computing and Data-intensive Science across a wide spectrum of national environmental and earth science data collections - in particular climate, observational data and geoscientific assets. This paper examines 1) the computational environments that support the modelling and data processing pipelines, 2) the analysis environments and methods to support data analysis, and 3) the progress so far to harmonise the underlying data collections for future interdisciplinary research across these large volume data collections. NCI has established 10+ PBytes of major national and international data collections from both the government and research sectors based on six themes: 1) weather, climate, and earth system science model simulations, 2) marine and earth observations, 3) geosciences, 4) terrestrial ecosystems, 5) water and hydrology, and 6) astronomy, social and biosciences. Collectively they span the lithosphere, crust, biosphere, hydrosphere, troposphere, and stratosphere. The data is largely sourced from NCI's partners (which include the custodians of many of the major Australian national-scale scientific collections), leading research communities, and collaborating overseas organisations. New infrastructures created at NCI mean the data collections are now accessible within an integrated High Performance Computing and Data (HPC-HPD) environment - a 1.2 PFlop supercomputer (Raijin), an HPC-class, 3000-core OpenStack cloud system, and several highly connected large-scale high-bandwidth Lustre filesystems. The hardware was designed at inception to ensure that it would allow the layered software environment to flexibly accommodate the advancement of future data science. New approaches to software technology and data models have also had to be developed to enable access to these large and exponentially increasing data volumes at NCI. Traditional HPC and data environments are still made available in a way that flexibly provides the tools, services and supporting software systems on these new petascale infrastructures. But to enable the research to take place at this scale, the data, metadata and software now need to evolve together - creating a new integrated high performance infrastructure. The new infrastructure at NCI currently supports a catalogue of integrated, reusable software and workflows from earth system and ecosystem modelling, weather research, satellite and other observed data processing and analysis. One of the challenges for NCI has been to support existing techniques and methods, while carefully preparing the underlying infrastructure for the transition needed for the next class of Data-intensive Science. In doing so, a flexible range of techniques and software can be made available for application across the available corpus of data collections, providing a new infrastructure for future interdisciplinary research.

  15. Large Scale Document Inversion using a Multi-threaded Computing System

    PubMed Central

    Jung, Sungbo; Chang, Dar-Jen; Park, Juw Won

    2018-01-01

    Current microprocessor architecture is moving towards multi-core/multi-threaded systems. This trend has led to a surge of interest in using multi-threaded computing devices, such as the Graphics Processing Unit (GPU), for general purpose computing. Because the GPU consists of multiple cores, it can be used in computation as a massively parallel coprocessor; it is also an affordable, attractive, and user-programmable commodity. Nowadays, a great deal of information around the world has flooded into the digital domain. Huge volumes of data, such as digital libraries, social networking services, and e-commerce product data and reviews, are produced or collected every moment and grow dramatically in size. Although the inverted index is a useful data structure for full-text search and document retrieval, indexing a large number of documents requires a tremendous amount of time. The performance of document inversion can be improved by multi-threaded, multi-core GPUs. Our approach is to implement a linear-time, hash-based, single program multiple data (SPMD) document inversion algorithm on the NVIDIA GPU/CUDA programming platform, utilizing the huge computational power of the GPU to develop high performance solutions for document indexing. Our proposed parallel document inversion system shows 2-3 times faster performance than a sequential system on two different test datasets from PubMed abstracts and e-commerce product reviews. CCS Concepts: •Information systems➝Information retrieval •Computing methodologies➝Massively parallel and high-performance simulations. PMID:29861701
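
    As a concrete illustration of the data structure at the heart of this record, the sketch below builds a hash-based inverted index sequentially in plain Python. It is not the authors' CUDA/SPMD implementation; the function and variable names are illustrative only, and the paper's contribution is to partition this per-document work across GPU threads rather than a single loop.

```python
from collections import defaultdict

def build_inverted_index(documents):
    """Map each term to a sorted list of the document IDs that contain it.

    Sequential, hash-based sketch; the paper distributes this work across
    GPU threads (SPMD) instead of iterating over documents one at a time.
    """
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

# Toy usage: two short "documents"
docs = [
    "GPU computing accelerates document indexing",
    "an inverted index supports full text document retrieval",
]
print(build_inverted_index(docs)["document"])  # -> [0, 1]
```

    The reported 2-3x speedup comes from running the tokenize-and-merge step for many documents in parallel on the GPU rather than from any change to the index structure itself.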

  16. Large Scale Document Inversion using a Multi-threaded Computing System.

    PubMed

    Jung, Sungbo; Chang, Dar-Jen; Park, Juw Won

    2017-06-01

    Current microprocessor architecture is moving towards multi-core/multi-threaded systems. This trend has led to a surge of interest in using multi-threaded computing devices, such as the Graphics Processing Unit (GPU), for general purpose computing. Because the GPU consists of multiple cores, it can be used in computation as a massively parallel coprocessor; it is also an affordable, attractive, and user-programmable commodity. Nowadays, a great deal of information around the world has flooded into the digital domain. Huge volumes of data, such as digital libraries, social networking services, and e-commerce product data and reviews, are produced or collected every moment and grow dramatically in size. Although the inverted index is a useful data structure for full-text search and document retrieval, indexing a large number of documents requires a tremendous amount of time. The performance of document inversion can be improved by multi-threaded, multi-core GPUs. Our approach is to implement a linear-time, hash-based, single program multiple data (SPMD) document inversion algorithm on the NVIDIA GPU/CUDA programming platform, utilizing the huge computational power of the GPU to develop high performance solutions for document indexing. Our proposed parallel document inversion system shows 2-3 times faster performance than a sequential system on two different test datasets from PubMed abstracts and e-commerce product reviews. •Information systems➝Information retrieval •Computing methodologies➝Massively parallel and high-performance simulations.

  17. Computational physics in RISC environments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rhoades, C.E. Jr.

    The new high performance Reduced Instruction Set Computers (RISC) promise near Cray-level performance at near personal-computer prices. This paper explores the performance, conversion, and compatibility issues associated with developing, testing, and using our traditional, large-scale simulation models in the RISC environments exemplified by the IBM RS6000 and MIPS R3000 machines. The questions of operating systems (CTSS versus UNIX), compilers (Fortran, C, pointers), and data are addressed in detail. Overall, it is concluded that the RISC environments are practical for a very wide range of computational physics activities. Indeed, all but the very largest two- and three-dimensional codes will work quite well, particularly in a single-user environment. Easily projected hardware-performance increases will revolutionize the field of computational physics. The way we do research will change profoundly in the next few years. There is, however, nothing more difficult to plan, nor more dangerous to manage, than the creation of this new world.

  18. Computational physics in RISC environments. Revision 1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rhoades, C.E. Jr.

    The new high performance Reduced Instruction Set Computers (RISC) promise near Cray-level performance at near personal-computer prices. This paper explores the performance, conversion, and compatibility issues associated with developing, testing, and using our traditional, large-scale simulation models in the RISC environments exemplified by the IBM RS6000 and MIPS R3000 machines. The questions of operating systems (CTSS versus UNIX), compilers (Fortran, C, pointers), and data are addressed in detail. Overall, it is concluded that the RISC environments are practical for a very wide range of computational physics activities. Indeed, all but the very largest two- and three-dimensional codes will work quite well, particularly in a single-user environment. Easily projected hardware-performance increases will revolutionize the field of computational physics. The way we do research will change profoundly in the next few years. There is, however, nothing more difficult to plan, nor more dangerous to manage, than the creation of this new world.

  19. QPSO-Based Adaptive DNA Computing Algorithm

    PubMed Central

    Karakose, Mehmet; Cigdem, Ugur

    2013-01-01

    DNA (deoxyribonucleic acid) computing, a computation model that uses DNA molecules for information storage, has been increasingly used for optimization and data analysis in recent years. However, the DNA computing algorithm has some limitations in terms of convergence speed, adaptability, and effectiveness. In this paper, a new approach for improving DNA computing is proposed. This approach aims to run the DNA computing algorithm with adaptive parameters, tuned towards the desired goal using quantum-behaved particle swarm optimization (QPSO). The contributions of the proposed QPSO-based adaptive DNA computing algorithm are as follows: (1) the population size, crossover rate, maximum number of operations, enzyme and virus mutation rates, and fitness function of the DNA computing algorithm are tuned simultaneously for the adaptive process; (2) the adaptive algorithm uses QPSO for goal-driven progress, faster operation, and flexibility in data; and (3) a numerical realization of the DNA computing algorithm with the proposed approach is implemented in system identification. Two experiments with different systems were carried out to evaluate the performance of the proposed approach, with comparative results. Experimental results obtained with Matlab and FPGA demonstrate effective optimization, considerable convergence speed, and high accuracy relative to the standard DNA computing algorithm. PMID:23935409
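
    The quantum-behaved particle swarm optimization step can be sketched independently of the DNA computing machinery it tunes in this paper. The minimal Python implementation below minimizes a generic objective; the swarm size, bounds, contraction factor, and test function are assumptions chosen for illustration and do not reproduce the authors' adaptive coupling to population size, crossover rate, and mutation rates.

```python
import math
import random

def qpso(objective, dim, n_particles=20, iters=100, beta=0.75):
    """Minimal quantum-behaved particle swarm optimization (QPSO) sketch.

    Minimizes `objective` over a real-valued vector of length `dim`.
    Bounds, swarm size, and the contraction factor `beta` are illustrative.
    """
    # Random initial positions in [-5, 5]^dim
    x = [[random.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [xi[:] for xi in x]                  # personal best positions
    pbest_val = [objective(xi) for xi in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        # Mean of all personal bests (the "mainstream thought point")
        mbest = [sum(p[d] for p in pbest) / n_particles for d in range(dim)]
        for i in range(n_particles):
            for d in range(dim):
                phi = random.random()
                # Local attractor between personal and global best
                attractor = phi * pbest[i][d] + (1.0 - phi) * gbest[d]
                u = 1.0 - random.random()        # u in (0, 1], keeps log finite
                step = beta * abs(mbest[d] - x[i][d]) * math.log(1.0 / u)
                x[i][d] = attractor + step if random.random() < 0.5 else attractor - step
            val = objective(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val

# Example: minimize the sphere function sum(v_d^2); the optimum is 0 at the origin.
best, best_val = qpso(lambda v: sum(c * c for c in v), dim=3)
print(best_val)
```

    In the paper's setting, the particle positions would encode the DNA computing parameters listed in contribution (1), and the objective would be the fitness used for system identification.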

  20. What Physicists Should Know About High Performance Computing - Circa 2002

    NASA Astrophysics Data System (ADS)

    Frederick, Donald

    2002-08-01

    High Performance Computing (HPC) is a dynamic, cross-disciplinary field that traditionally has involved applied mathematicians, computer scientists, and others, primarily from the disciplines that have been major users of HPC resources - physics, chemistry, and engineering - with increasing use by those in the life sciences. There is a technological dynamic powered by economic as well as technical innovations and developments. This talk will discuss practical ideas to consider when developing numerical applications for research purposes. Even with the rapid pace of development in the field, the author believes these concepts will not become obsolete for a while and will be of use to scientists who either are considering, or who have already started down, the HPC path. These principles will be applied in particular to current parallel HPC systems, but there will also be material of value to desktop users. The talk will cover topics such as computing hardware basics, single-CPU optimization, compilers, timing, numerical libraries, debugging and profiling tools, and the emergence of Computational Grids.
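
    The talk itself prescribes no code, but its points about timing and leaning on tuned numerical libraries can be made concrete with a small, hedged example. The snippet below, a Python/NumPy comparison chosen here purely for illustration, times an interpreted dot-product loop against the same reduction dispatched to an optimized library routine.

```python
import time
import numpy as np

def timed(fn, *args, repeats=3):
    """Return the best wall-clock time in seconds over a few repeats."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

n = 1_000_000
a = np.random.rand(n)
b = np.random.rand(n)

def dot_loop(x, y):
    # One interpreted multiply-add per iteration
    total = 0.0
    for i in range(len(x)):
        total += x[i] * y[i]
    return total

def dot_library(x, y):
    # Same reduction handed to an optimized numerical library (BLAS under NumPy)
    return float(np.dot(x, y))

print(f"loop    : {timed(dot_loop, a, b):.4f} s")
print(f"library : {timed(dot_library, a, b):.6f} s")
```

    Measuring before optimizing, and preferring library routines over hand-rolled loops, are the kind of principles the talk argues remain useful across hardware generations.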
