performance computing: Topics by Science.gov

Sample records for performance computing

High-Performance Computing and Visualization | Energy Systems Integration

Science.gov Websites

Facility | NREL High-Performance Computing and Visualization High-Performance Computing and Visualization High-performance computing (HPC) and visualization at NREL propel technology innovation as a . Capabilities High-Performance Computing NREL is home to Peregrine-the largest high-performance computing system
High-Performance Computing Data Center | Energy Systems Integration

Science.gov Websites

Facility | NREL High-Performance Computing Data Center High-Performance Computing Data Center The Energy Systems Integration Facility's High-Performance Computing Data Center is home to Peregrine -the largest high-performance computing system in the world exclusively dedicated to advancing
High-Performance Computing Systems and Operations | Computational Science |

Science.gov Websites

NREL Systems and Operations High-Performance Computing Systems and Operations NREL operates high-performance computing (HPC) systems dedicated to advancing energy efficiency and renewable energy technologies. Capabilities NREL's HPC capabilities include: High-Performance Computing Systems We operate
Identifying failure in a tree network of a parallel computer

DOEpatents

Archer, Charles J.; Pinnow, Kurt W.; Wallenfelt, Brian P.

2010-08-24

Methods, parallel computers, and products are provided for identifying failure in a tree network of a parallel computer. The parallel computer includes one or more processing sets including an I/O node and a plurality of compute nodes. For each processing set embodiments include selecting a set of test compute nodes, the test compute nodes being a subset of the compute nodes of the processing set; measuring the performance of the I/O node of the processing set; measuring the performance of the selected set of test compute nodes; calculating a current test value in dependence upon the measured performance of the I/O node of the processing set, the measured performance of the set of test compute nodes, and a predetermined value for I/O node performance; and comparing the current test value with a predetermined tree performance threshold. If the current test value is below the predetermined tree performance threshold, embodiments include selecting another set of test compute nodes. If the current test value is not below the predetermined tree performance threshold, embodiments include selecting from the test compute nodes one or more potential problem nodes and testing individually potential problem nodes and links to potential problem nodes.
Computer-Related Success and Failure: A Longitudinal Field Study of the Factors Influencing Computer-Related Performance.

ERIC Educational Resources Information Center

Rozell, E. J.; Gardner, W. L., III

1999-01-01

A model of the intrapersonal processes impacting computer-related performance was tested using data from 75 manufacturing employees in a computer training course. Gender, computer experience, and attributional style were predictive of computer attitudes, which were in turn related to computer efficacy, task-specific performance expectations, and…
High-Performance Computing User Facility | Computational Science | NREL

Science.gov Websites

User Facility High-Performance Computing User Facility The High-Performance Computing User Facility technologies. Photo of the Peregrine supercomputer The High Performance Computing (HPC) User Facility provides Gyrfalcon Mass Storage System. Access Our HPC User Facility Learn more about these systems and how to access
Performing an allreduce operation on a plurality of compute nodes of a parallel computer

DOEpatents

Faraj, Ahmad [Rochester, MN]

2012-04-17

Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer. Each compute node includes at least two processing cores. Each processing core has contribution data for the allreduce operation. Performing an allreduce operation on a plurality of compute nodes of a parallel computer includes: establishing one or more logical rings among the compute nodes, each logical ring including at least one processing core from each compute node; performing, for each logical ring, a global allreduce operation using the contribution data for the processing cores included in that logical ring, yielding a global allreduce result for each processing core included in that logical ring; and performing, for each compute node, a local allreduce operation using the global allreduce results for each processing core on that compute node.
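The three steps in the abstract above (logical rings, a global allreduce per ring, then a local allreduce per node) can be sketched as a small single-process simulation. This is only an illustration under stated assumptions, not the patented implementation: the names `ring_allreduce` and `hierarchical_allreduce` are hypothetical, ring `k` is assumed to contain core `k` of every node, the local step reuses the same ring routine for brevity, and the reduction operator is assumed to be associative and commutative (addition here).

```python
import operator


def ring_allreduce(values, op):
    """Simulate an allreduce over one logical ring.

    After len(values) - 1 passes around the ring, every member holds the
    reduction of all members' values (op must be associative/commutative).
    """
    n = len(values)
    acc = list(values)
    for step in range(1, n):
        # Member i combines its running result with the value that
        # originated `step` positions behind it on the ring.
        acc = [op(acc[i], values[(i - step) % n]) for i in range(n)]
    return acc


def hierarchical_allreduce(contrib, op=operator.add):
    """contrib[node][core] -> result[node][core] (hypothetical layout)."""
    num_nodes = len(contrib)
    num_cores = len(contrib[0])

    # Step 1: establish one logical ring per core index; ring k holds
    # core k from each compute node (an assumption of this sketch).
    rings = [[contrib[node][core] for node in range(num_nodes)]
             for core in range(num_cores)]

    # Step 2: global allreduce on each ring; ring_results[core][node].
    ring_results = [ring_allreduce(ring, op) for ring in rings]

    # Step 3: local allreduce on each node over its cores' global results.
    result = []
    for node in range(num_nodes):
        local = [ring_results[core][node] for core in range(num_cores)]
        result.append(ring_allreduce(local, op))
    return result


# Three nodes with two cores each; the contributions sum to 21.
final = hierarchical_allreduce([[1, 2], [3, 4], [5, 6]])
print(final)  # [[21, 21], [21, 21], [21, 21]]
```

Each ring reduces only one core's contribution per node, so the global step yields a different partial result per ring; the local step is what combines those partials into the full result on every core.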
Computational Science News | Computational Science | NREL

Science.gov Websites

-Cooled High-Performance Computing Technology at the ESIF February 28, 2018 NREL Launches New Website for High-Performance Computing System Users The National Renewable Energy Laboratory (NREL) Computational Science Center has launched a revamped website for users of the lab's high-performance computing (HPC
High performance computing and communications program

NASA Technical Reports Server (NTRS)

Holcomb, Lee

1992-01-01

A review of the High Performance Computing and Communications (HPCC) program is provided in vugraph format. The goals and objectives of this federal program are as follows: extend U.S. leadership in high performance computing and computer communications; disseminate the technologies to speed innovation and to serve national goals; and spur gains in industrial competitiveness by making high performance computing integral to design and production.
Performance Comparison of Mainframe, Workstations, Clusters, and Desktop Computers

NASA Technical Reports Server (NTRS)

Farley, Douglas L.

2005-01-01

A performance evaluation of a variety of computers frequently found in a scientific or engineering research environment was conducted using synthetic and application program benchmarks. From a performance perspective, emerging commodity processors have superior performance relative to legacy mainframe computers. In many cases, the PC clusters exhibited comparable performance with traditional mainframe hardware when 8-12 processors were used. The main advantage of the PC clusters was related to their cost. Regardless of whether the clusters were built from new computers or whether they were created from retired computers, their performance to cost ratio was superior to the legacy mainframe computers. Finally, the typical annual maintenance cost of legacy mainframe computers is several times the cost of new equipment such as multiprocessor PC workstations. The savings from eliminating the annual maintenance fee on legacy hardware can result in a yearly increase in total computational capability for an organization.
High Performance Computer Cluster for Theoretical Studies of Roaming in Chemical Reactions

DTIC Science & Technology

2016-08-30

High-performance Computer Cluster for Theoretical Studies of Roaming in Chemical Reactions A dedicated high-performance computer cluster was...SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS (ES) U.S. Army Research Office P.O. Box 12211 Research Triangle Park, NC 27709-2211 Computer cluster ...peer-reviewed journals: Final Report: High-performance Computer Cluster for Theoretical Studies of Roaming in Chemical Reactions Report Title A dedicated
Effect of computer game playing on baseline laparoscopic simulator skills.

PubMed

Halvorsen, Fredrik H; Cvancarova, Milada; Fosse, Erik; Mjåland, Odd

2013-08-01

Studies examining the possible association between computer game playing and laparoscopic performance in general have yielded conflicting results and neither has a relationship between computer game playing and baseline performance on laparoscopic simulators been established. The aim of this study was to examine the possible association between previous and present computer game playing and baseline performance on a virtual reality laparoscopic simulator in a sample of potential future medical students. The participating students completed a questionnaire covering the weekly amount and type of computer game playing activity during the previous year and 3 years ago. They then performed 2 repetitions of 2 tasks ("gallbladder dissection" and "traverse tube") on a virtual reality laparoscopic simulator. Performance on the simulator was then analyzed for association with their computer game experience. Local high school, Norway. Forty-eight students from 2 high school classes volunteered to participate in the study. No association between prior and present computer game playing and baseline performance was found. The results were similar both for prior and present action game playing and prior and present computer game playing in general. Our results indicate that prior and present computer game playing may not affect baseline performance in a virtual reality simulator.
The Relationship Between Computer Experience and Computerized Cognitive Test Performance Among Older Adults

PubMed Central

2013-01-01

Objective. This study compared the relationship between computer experience and performance on computerized cognitive tests and a traditional paper-and-pencil cognitive test in a sample of older adults (N = 634). Method. Participants completed computer experience and computer attitudes questionnaires, three computerized cognitive tests (Useful Field of View (UFOV) Test, Road Sign Test, and Stroop task) and a paper-and-pencil cognitive measure (Trail Making Test). Multivariate analysis of covariance was used to examine differences in cognitive performance across the four measures between those with and without computer experience after adjusting for confounding variables. Results. Although computer experience had a significant main effect across all cognitive measures, the effect sizes were similar. After controlling for computer attitudes, the relationship between computer experience and UFOV was fully attenuated. Discussion. Findings suggest that computer experience is not uniquely related to performance on computerized cognitive measures compared with paper-and-pencil measures. Because the relationship between computer experience and UFOV was fully attenuated by computer attitudes, this may imply that motivational factors are more influential to UFOV performance than computer experience. Our findings support the hypothesis that computer use is related to cognitive performance, and this relationship is not stronger for computerized cognitive measures. Implications and directions for future research are provided. PMID:22929395
High Performance Computing Meets Energy Efficiency - Continuum Magazine |

Science.gov Websites

NREL High Performance Computing Meets Energy Efficiency High Performance Computing Meets Energy turbines. Simulation by Patrick J. Moriarty and Matthew J. Churchfield, NREL The new High Performance Computing Data Center at the National Renewable Energy Laboratory (NREL) hosts high-speed, high-volume data
Bringing MapReduce Closer To Data With Active Drives

NASA Astrophysics Data System (ADS)

Golpayegani, N.; Prathapan, S.; Warmka, R.; Wyatt, B.; Halem, M.; Trantham, J. D.; Markey, C. A.

2017-12-01

Moving computation closer to the data location has been a much theorized improvement to computation for decades. The increase in processor performance, the decrease in processor size and power requirement combined with the increase in data intensive computing has created a push to move computation as close to data as possible. We will show the next logical step in this evolution in computing: moving computation directly to storage. Hypothetical systems, known as Active Drives, have been proposed as early as 1998. These Active Drives would have a general-purpose CPU on each disk allowing for computations to be performed on them without the need to transfer the data to the computer over the system bus or via a network. We will utilize Seagate's Active Drives to perform general purpose parallel computing using the MapReduce programming model directly on each drive. We will detail how the MapReduce programming model can be adapted to the Active Drive compute model to perform general purpose computing with comparable results to traditional MapReduce computations performed via Hadoop. We will show how an Active Drive based approach significantly reduces the amount of data leaving the drive when performing several common algorithms: subsetting and gridding. We will show that an Active Drive based design significantly improves data transfer speeds into and out of drives compared to Hadoop's HDFS while at the same time keeping comparable compute speeds as Hadoop.
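To make the data-movement argument above concrete, the following sketch runs the map phase (subsetting and gridding) "on each drive" and sends only small per-cell aggregates to the host for the reduce phase. Everything here is an assumption for illustration: plain Python lists stand in for drives, the record layout `(lat, lon, value)` and the function names `map_on_drive` and `reduce_on_host` are hypothetical, and no Seagate or Hadoop API is used.

```python
from collections import defaultdict


def map_on_drive(records, lat_range, cell_size):
    """Map step, imagined as running on the drive's own CPU.

    Subsets records by latitude, bins the survivors into grid cells, and
    pre-aggregates per cell so only (sum, count) pairs leave the drive.
    """
    lo, hi = lat_range
    partial = defaultdict(lambda: [0.0, 0])  # cell -> [sum, count]
    for lat, lon, value in records:
        if lo <= lat < hi:  # subsetting
            cell = (int(lat // cell_size), int(lon // cell_size))  # gridding
            partial[cell][0] += value
            partial[cell][1] += 1
    return dict(partial)


def reduce_on_host(partials):
    """Reduce step on the host: merge per-drive aggregates into cell means."""
    total = defaultdict(lambda: [0.0, 0])
    for part in partials:
        for cell, (s, c) in part.items():
            total[cell][0] += s
            total[cell][1] += c
    return {cell: s / c for cell, (s, c) in total.items()}


# Two hypothetical drives holding (lat, lon, value) records.
drives = [
    [(10.5, 20.5, 1.0), (10.7, 20.1, 3.0), (55.0, 20.0, 9.0)],
    [(10.2, 20.9, 5.0), (11.5, 20.5, 7.0)],
]
partials = [map_on_drive(d, lat_range=(10, 12), cell_size=1.0) for d in drives]
records_stored = sum(len(d) for d in drives)    # 5 records stay on the drives
items_sent = sum(len(p) for p in partials)      # 3 aggregates leave the drives
print(reduce_on_host(partials))  # {(10, 20): 3.0, (11, 20): 7.0}
```

In this toy case five records are stored but only three aggregates cross the drive boundary; the gap grows with the selectivity of the subset and the coarseness of the grid.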
Grand Challenges: High Performance Computing and Communications. The FY 1992 U.S. Research and Development Program.

ERIC Educational Resources Information Center

Federal Coordinating Council for Science, Engineering and Technology, Washington, DC.

This report presents a review of the High Performance Computing and Communications (HPCC) Program, which has as its goal the acceleration of the commercial availability and utilization of the next generation of high performance computers and networks in order to: (1) extend U.S. technological leadership in high performance computing and computer…
High-performance conjugate-gradient benchmark: A new metric for ranking high-performance computing systems

DOE PAGES

Dongarra, Jack; Heroux, Michael A.; Luszczek, Piotr

2015-08-17

Here, we describe a new high-performance conjugate-gradient (HPCG) benchmark. HPCG is composed of computations and data-access patterns commonly found in scientific applications. HPCG strives for a better correlation to existing codes from the computational science domain and to be representative of their performance. Furthermore, HPCG is meant to help drive the computer system design and implementation in directions that will better impact future performance improvement.
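For readers unfamiliar with the method the benchmark is named after, a plain conjugate-gradient iteration is sketched below. It is not the HPCG reference code and says nothing about how the benchmark is configured or scored; the 1D Laplacian test matrix, the function names, and the tolerance are assumptions chosen only to show the kernel mix involved (a sparse matrix-vector product, dot products, and vector updates).

```python
def laplacian_matvec(x):
    """y = A x for the 1D Laplacian tridiag(-1, 2, -1), stored implicitly."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        y[i] = 2.0 * x[i]
        if i > 0:
            y[i] -= x[i - 1]
        if i < n - 1:
            y[i] -= x[i + 1]
    return y


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A given only matvec.

    Returns the approximate solution and the number of iterations used.
    """
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with x = 0 initially
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for it in range(max_iter):
        if rs_old ** 0.5 < tol:
            return x, it
        Ap = matvec(p)                       # sparse matrix-vector product
        alpha = rs_old / dot(p, Ap)          # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old               # keeps directions A-conjugate
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, max_iter


# Build a right-hand side whose exact solution is the all-ones vector.
n = 20
b = laplacian_matvec([1.0] * n)
x, iters = conjugate_gradient(laplacian_matvec, b)
print(iters, max(abs(xi - 1.0) for xi in x))
```

The matrix-vector product touches memory irregularly in real sparse codes, which is the kind of data-access pattern the abstract refers to.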
Research Activity in Computational Physics utilizing High Performance Computing: Co-authorship Network Analysis

NASA Astrophysics Data System (ADS)

Ahn, Sul-Ah; Jung, Youngim

2016-10-01

The research activities of the computational physicists utilizing high performance computing are analyzed by bibliometric approaches. This study aims at providing the computational physicists utilizing high-performance computing and policy planners with useful bibliometric results for an assessment of research activities. In order to achieve this purpose, we carried out a co-authorship network analysis of journal articles to assess the research activities of researchers for high-performance computational physics as a case study. For this study, we used journal articles of the Scopus database from Elsevier covering the time period of 2004-2013. We extracted the author rank in the physics field utilizing high-performance computing by the number of papers published during ten years from 2004. Finally, we drew the co-authorship network for 45 top-authors and their coauthors, and described some features of the co-authorship network in relation to the author rank. Suggestions for further studies are discussed.
High-Performance Computing Data Center Warm-Water Liquid Cooling |

Science.gov Websites

Computational Science | NREL Warm-Water Liquid Cooling High-Performance Computing Data Center Warm-Water Liquid Cooling NREL's High-Performance Computing Data Center (HPC Data Center) is liquid water Liquid cooling technologies offer a more energy-efficient solution that also allows for effective
Development of a small-scale computer cluster

NASA Astrophysics Data System (ADS)

Wilhelm, Jay; Smith, Justin T.; Smith, James E.

2008-04-01

An increase in demand for computing power in academia has created a need for high performance machines. Computing power of a single processor has been steadily increasing, but lags behind the demand for fast simulations. Since a single processor has hard limits to its performance, a cluster of computers can have the ability to multiply the performance of a single computer with the proper software. Cluster computing has therefore become a much sought after technology. Typical desktop computers could be used for cluster computing, but are not intended for constant full speed operation and take up more space than rack mount servers. Specialty computers that are designed to be used in clusters meet high availability and space requirements, but can be costly. A market segment exists where custom built desktop computers can be arranged in a rack mount situation, gaining the space saving of traditional rack mount computers while remaining cost effective. To explore these possibilities, an experiment was performed to develop a computing cluster using desktop components for the purpose of decreasing computation time of advanced simulations. This study indicates that a small-scale cluster can be built from off-the-shelf components that multiplies the performance of a single desktop machine, while minimizing occupied space and still remaining cost effective.

On the tip of the tongue: learning typing and pointing with an intra-oral computer interface.

PubMed

Caltenco, Héctor A; Breidegard, Björn; Struijk, Lotte N S Andreasen

2014-07-01

To evaluate typing and pointing performance and improvement over time of four able-bodied participants using an intra-oral tongue-computer interface for computer control. A physically disabled individual may lack the ability to efficiently control standard computer input devices. There have been several efforts to produce and evaluate interfaces that provide individuals with physical disabilities the possibility to control personal computers. Training with the intra-oral tongue-computer interface was performed by playing games over 18 sessions. Skill improvement was measured through typing and pointing exercises at the end of each training session. Typing throughput improved from averages of 2.36 to 5.43 correct words per minute. Pointing throughput improved from averages of 0.47 to 0.85 bits/s. Target tracking performance, measured as relative time on target, improved from averages of 36% to 47%. Path following throughput improved from averages of 0.31 to 0.83 bits/s and decreased to 0.53 bits/s with more difficult tasks. Learning curves support the notion that the tongue can rapidly learn novel motor tasks. Typing and pointing performance of the tongue-computer interface is comparable to performances of other proficient assistive devices, which makes the tongue a feasible input organ for computer control. Intra-oral computer interfaces could provide individuals with severe upper-limb mobility impairments the opportunity to control computers and automatic equipment. Typing and pointing performance of the tongue-computer interface is comparable to performances of other proficient assistive devices, but does not cause fatigue easily and might be invisible to other people, which is highly prioritized by assistive device users. Combination of visual and auditory feedback is vital for a good performance of an intra-oral computer interface and helps to reduce involuntary or erroneous activations.
Asymmetric Core Computing for U.S. Army High-Performance Computing Applications

DTIC Science & Technology

2009-04-01

Playstation 4 (should one be announced). 8 4.2 FPGAs Reconfigurable computing refers to performing computations using Field Programmable Gate Arrays...2008 4 . TITLE AND SUBTITLE Asymmetric Core Computing for U.S. Army High-Performance Computing Applications 5a. CONTRACT NUMBER 5b. GRANT NUMBER...Acknowledgments vi 1. Introduction 1 2. Relevant Technologies 2 3. Technical Approach 5 4 . Research and Development Highlights 7 4.1 Cell
Computer Attitude of Teaching Faculty: Implications for Technology-Based Performance in Higher Education

ERIC Educational Resources Information Center

Larbi-Apau, Josephine A.; Moseley, James L.

2012-01-01

This study examined the validity of Selwyn's computer attitude scale (CAS) and its implication for technology-based performance of randomly sampled (n = 167) multidiscipline teaching faculty in higher education in Ghana. Considered, computer attitude is a critical function of computer attitude and potential performance. Composed of four…
High-Performance Computing Act of 1991. Report of the Senate Committee on Commerce, Science, and Transportation on S. 272. Senate, 102d Congress, 1st Session.

ERIC Educational Resources Information Center

Congress of the U.S., Washington, DC. Senate Committee on Commerce, Science, and Transportation.

This report discusses Senate Bill no. 272, which provides for a coordinated federal research and development program to ensure continued U.S. leadership in high-performance computing. High performance computing is defined as representing the leading edge of technological advancement in computing, i.e., the most sophisticated computer chips, the…
Reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application

DOEpatents

Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Peters, Amanda A [Rochester, MN; Ratterman, Joseph D [Rochester, MN; Smith, Brian E [Rochester, MN

2012-01-10

Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.
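The energy effect of the scheme described above can be estimated with simple arithmetic: a node that reaches the blocking operation early waits until the last node arrives, and during that wait it draws reduced rather than full power. The sketch below works through that estimate. The function name `barrier_energy`, the arrival times, and the power figures are all hypothetical; the patent abstract gives no numbers.

```python
def barrier_energy(arrival_times, full_power, reduced_power):
    """Energy spent waiting at a blocking operation, with and without
    the power reduction.

    arrival_times: when each node begins the blocking operation (seconds).
    full_power / reduced_power: assumed draw of the affected hardware
    components (watts) before and after power is reduced.
    Returns (baseline_joules, reduced_joules).
    """
    # Power is restored once all nodes have begun the blocking operation.
    release = max(arrival_times)
    waits = [release - t for t in arrival_times]
    baseline = sum(w * full_power for w in waits)
    reduced = sum(w * reduced_power for w in waits)
    return baseline, reduced


# Hypothetical example: three nodes arriving at 1 s, 3 s, and 4 s.
baseline, reduced = barrier_energy([1.0, 3.0, 4.0], full_power=100.0,
                                   reduced_power=40.0)
print(baseline, reduced)  # 400.0 160.0
```

The saving scales with how unevenly the nodes arrive: if all nodes reach the blocking operation at the same time, the waits are zero and nothing is saved.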
Reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application

DOEpatents

Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Peters, Amanda E [Cambridge, MA; Ratterman, Joseph D [Rochester, MN; Smith, Brian E [Rochester, MN

2012-04-17

Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.
Volume accumulator design analysis computer codes

NASA Technical Reports Server (NTRS)

Whitaker, W. D.; Shimazaki, T. T.

1973-01-01

The computer codes, VANEP and VANES, were written and used to aid in the design and performance calculation of the volume accumulator units (VAU) for the 5-kwe reactor thermoelectric system. VANEP computes the VAU design which meets the primary coolant loop VAU volume and pressure performance requirements. VANES computes the performance of the VAU design, determined from the VANEP code, at the conditions of the secondary coolant loop. The codes can also compute the performance characteristics of the VAU's under conditions of possible modes of failure which still permit continued system operation.
Ku-Band rendezvous radar performance computer simulation model

NASA Technical Reports Server (NTRS)

Magnusson, H. G.; Goff, M. F.

1984-01-01

All work performed on the Ku-band rendezvous radar performance computer simulation model program since the release of the preliminary final report is summarized. Developments on the program fall into three distinct categories: (1) modifications to the existing Ku-band radar tracking performance computer model; (2) the addition of a highly accurate, nonrealtime search and acquisition performance computer model to the total software package developed on this program; and (3) development of radar cross section (RCS) computation models for three additional satellites. All changes in the tracking model involved improvements in the automatic gain control (AGC) and the radar signal strength (RSS) computer models. Although the search and acquisition computer models were developed under the auspices of the Hughes Aircraft Company Ku-Band Integrated Radar and Communications Subsystem program office, they have been supplied to NASA as part of the Ku-band radar performance computer model package. Their purpose is to predict Ku-band acquisition performance for specific satellite targets on specific missions. The RCS models were developed for three satellites: the Long Duration Exposure Facility (LDEF) spacecraft, the Solar Maximum Mission (SMM) spacecraft, and the Space Telescopes.
Ku-Band rendezvous radar performance computer simulation model

NASA Astrophysics Data System (ADS)

Magnusson, H. G.; Goff, M. F.

1984-06-01

All work performed on the Ku-band rendezvous radar performance computer simulation model program since the release of the preliminary final report is summarized. Developments on the program fall into three distinct categories: (1) modifications to the existing Ku-band radar tracking performance computer model; (2) the addition of a highly accurate, nonrealtime search and acquisition performance computer model to the total software package developed on this program; and (3) development of radar cross section (RCS) computation models for three additional satellites. All changes in the tracking model involved improvements in the automatic gain control (AGC) and the radar signal strength (RSS) computer models. Although the search and acquisition computer models were developed under the auspices of the Hughes Aircraft Company Ku-Band Integrated Radar and Communications Subsystem program office, they have been supplied to NASA as part of the Ku-band radar performance computer model package. Their purpose is to predict Ku-band acquisition performance for specific satellite targets on specific missions. The RCS models were developed for three satellites: the Long Duration Exposure Facility (LDEF) spacecraft, the Solar Maximum Mission (SMM) spacecraft, and the Space Telescopes.
Implementing an Affordable High-Performance Computing for Teaching-Oriented Computer Science Curriculum

ERIC Educational Resources Information Center

Abuzaghleh, Omar; Goldschmidt, Kathleen; Elleithy, Yasser; Lee, Jeongkyu

2013-01-01

With the advances in computing power, high-performance computing (HPC) platforms have had an impact on not only scientific research in advanced organizations but also computer science curriculum in the educational community. For example, multicore programming and parallel systems are highly desired courses in the computer science major. However,…
Performance analysis of three dimensional integral equation computations on a massively parallel computer. M.S. Thesis

NASA Technical Reports Server (NTRS)

Logan, Terry G.

1994-01-01

The purpose of this study is to investigate the performance of the integral equation computations using the numerical source field-panel method in a massively parallel processing (MPP) environment. A comparative study of computational performance of the MPP CM-5 computer and conventional Cray-YMP supercomputer for a three-dimensional flow problem is made. A serial FORTRAN code is converted into a parallel CM-FORTRAN code. Some performance results are obtained on CM-5 with 32, 62, 128 nodes along with those on Cray-YMP with a single processor. The comparison of the performance indicates that the parallel CM-FORTRAN code approaches or outperforms the equivalent serial FORTRAN code for some cases.
Facilities | Integrated Energy Solutions | NREL

Science.gov Websites

strategies needed to optimize our entire energy system. A photo of the high-performance computer at NREL . High-Performance Computing Data Center High-performance computing facilities at NREL provide high-speed
A Software Rejuvenation Framework for Distributed Computing

NASA Technical Reports Server (NTRS)

Chau, Savio

2009-01-01

A performability-oriented conceptual framework for software rejuvenation has been constructed as a means of increasing levels of reliability and performance in distributed stateful computing. As used here, performability-oriented signifies that the construction of the framework is guided by the concept of analyzing the ability of a given computing system to deliver services with gracefully degradable performance. The framework is especially intended to support applications that involve stateful replicas of server computers.
Importance of balanced architectures in the design of high-performance imaging systems

NASA Astrophysics Data System (ADS)

Sgro, Joseph A.; Stanton, Paul C.

1999-03-01

Imaging systems employed in demanding military and industrial applications, such as automatic target recognition and computer vision, typically require real-time high-performance computing resources. While high-performance computing systems have traditionally relied on proprietary architectures and custom components, recent advances in high performance general-purpose microprocessor technology have produced an abundance of low cost components suitable for use in high-performance computing systems. A common pitfall in the design of high performance imaging systems, particularly systems employing scalable multiprocessor architectures, is the failure to balance computational and memory bandwidth. The performance of standard cluster designs, for example, in which several processors share a common memory bus, is typically constrained by memory bandwidth. The symptom characteristic of this problem is failure of the performance of the system to scale as more processors are added. The problem becomes exacerbated if I/O and memory functions share the same bus. The recent introduction of microprocessors with large internal caches and high performance external memory interfaces makes it practical to design high performance imaging systems with balanced computational and memory bandwidth. Real-world examples of such designs will be presented, along with a discussion of adapting algorithm design to best utilize available memory bandwidth.
A high performance scientific cloud computing environment for materials simulations

NASA Astrophysics Data System (ADS)

Jorissen, K.; Vila, F. D.; Rehr, J. J.

2012-09-01

We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.
The Interdependence of Computers, Robots, and People.

ERIC Educational Resources Information Center

Ludden, Laverne; And Others

Computers and robots are becoming increasingly more advanced, with smaller and cheaper computers now doing jobs once reserved for huge multimillion dollar computers and with robots performing feats such as painting cars and using television cameras to simulate vision as they perform factory tasks. Technicians expect computers to become even more…
Quantum Accelerators for High-performance Computing Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Humble, Travis S.; Britt, Keith A.; Mohiyaddin, Fahd A.

We define some of the programming and system-level challenges facing the application of quantum processing to high-performance computing. Alongside barriers to physical integration, prominent differences in the execution of quantum and conventional programs challenge the intersection of these computational models. Following a brief overview of the state of the art, we discuss recent advances in programming and execution models for hybrid quantum-classical computing. We discuss a novel quantum-accelerator framework that uses specialized kernels to offload select workloads while integrating with existing computing infrastructure. We elaborate on the role of the host operating system to manage these unique accelerator resources, the prospects for deploying quantum modules, and the requirements placed on the language hierarchy connecting these different system components. We draw on recent advances in the modeling and simulation of quantum computing systems with the development of architectures for hybrid high-performance computing systems and the realization of software stacks for controlling quantum devices. Finally, we present simulation results that describe the expected system-level behavior of high-performance computing systems composed from compute nodes with quantum processing units. We describe performance for these hybrid systems in terms of time-to-solution, accuracy, and energy consumption, and we use simple application examples to estimate the performance advantage of quantum acceleration.
Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing

NASA Astrophysics Data System (ADS)

Nishiura, Daisuke; Furuichi, Mikito; Sakaguchi, Hide

2015-09-01

The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.
On the "Exchangeability" of Hands-On and Computer-Simulated Science Performance Assessments. CSE Technical Report.

ERIC Educational Resources Information Center

Rosenquist, Anders; Shavelson, Richard J.; Ruiz-Primo, Maria Araceli

Inconsistencies in scores from computer-simulated and "hands-on" science performance assessments have led to questions about the exchangeability of these two methods in spite of the highly touted potential of computer-simulated performance assessment. This investigation considered possible explanations for students' inconsistent performances: (1)…
Perceptions and performance using computer-based testing: One institution's experience.

PubMed

Bloom, Timothy J; Rich, Wesley D; Olson, Stephanie M; Adams, Michael L

2018-02-01

The purpose of this study was to evaluate student and faculty perceptions of the transition to a required computer-based testing format and to identify any impact of this transition on student exam performance. Separate questionnaires sent to students and faculty asked about perceptions of and problems with computer-based testing. Exam results from program-required courses for two years prior to and two years following the adoption of computer-based testing were compared to determine if this testing format impacted student performance. Responses to Likert-type questions about perceived ease of use showed no difference between students with one and three semesters experience with computer-based testing. Of 223 student-reported problems, 23% related to faculty training with the testing software. Students most commonly reported improved feedback (46% of responses) and ease of exam-taking (17% of responses) as benefits to computer-based testing. Faculty-reported difficulties were most commonly related to problems with student computers during an exam (38% of responses) while the most commonly identified benefit was collecting assessment data (32% of responses). Neither faculty nor students perceived an impact on exam performance due to computer-based testing. An analysis of exam grades confirmed there was no consistent performance difference between the paper and computer-based formats. Both faculty and students rapidly adapted to using computer-based testing. There was no evidence that switching to computer-based testing had any impact on student exam performance. Copyright © 2017 Elsevier Inc. All rights reserved.

Computer versus paper--does it make any difference in test performance?

PubMed

Karay, Yassin; Schauber, Stefan K; Stosch, Christoph; Schüttpelz-Brauns, Katrin

2015-01-01

CONSTRUCT: In this study, we examine the differences in test performance between the paper-based and the computer-based version of the Berlin formative Progress Test. In this context it is the first study that allows controlling for students' prior performance. Computer-based tests make possible a more efficient examination procedure for test administration and review. Although university staff will benefit largely from computer-based tests, the question arises if computer-based tests influence students' test performance. A total of 266 German students from the 9th and 10th semester of medicine (comparable with the 4th-year North American medical school schedule) participated in the study (paper = 132, computer = 134). The allocation of the test format was conducted as a randomized matched-pair design in which students were first sorted according to their prior test results. The organizational procedure, the examination conditions, the room, and seating arrangements, as well as the order of questions and answers, were identical in both groups. The sociodemographic variables and pretest scores of both groups were comparable. The test results from the paper and computer versions did not differ. The groups remained within the allotted time, but students using the computer version (particularly the high performers) needed significantly less time to complete the test. In addition, we found significant differences in guessing behavior. Low performers using the computer version guess significantly more than low-performing students in the paper-pencil version. Participants in computer-based tests are not at a disadvantage in terms of their test results. The computer-based test required less processing time. The reason for the longer processing time when using the paper-pencil version might be due to the time needed to write the answer down, controlling for transferring the answer correctly. 
It is still not known why students using the computer version (particularly low-performing students) guess at a higher rate. Further studies are necessary to understand this finding.
Parallel computing works

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

An account of the Caltech Concurrent Computation Program (C³P), a five-year project that focused on answering the question: "Can parallel computers be used to do large-scale scientific computations?" As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C³P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C³P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.
Reducing power consumption while performing collective operations on a plurality of compute nodes

DOEpatents

Archer, Charles J. [Rochester, MN]; Blocksome, Michael A. [Rochester, MN]; Peters, Amanda E. [Rochester, MN]; Ratterman, Joseph D. [Rochester, MN]; Smith, Brian E. [Rochester, MN]

2011-10-18

Methods, apparatus, and products are disclosed for reducing power consumption while performing collective operations on a plurality of compute nodes that include: receiving, by each compute node, instructions to perform a type of collective operation; selecting, by each compute node from a plurality of collective operations for the collective operation type, a particular collective operation in dependence upon power consumption characteristics for each of the plurality of collective operations; and executing, by each compute node, the selected collective operation.
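The selection step in this abstract can be sketched as a lookup over candidate implementations of one collective type. The sketch below is an illustration only, not the patented method: the `CollectiveImpl` record, the wattage and duration figures, and the choice of estimated energy (watts times seconds) as the selection criterion are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectiveImpl:
    name: str           # hypothetical implementation label
    op_type: str        # collective type, e.g. "allreduce" or "broadcast"
    est_watts: float    # assumed average power draw while running
    est_seconds: float  # assumed running time

    @property
    def est_joules(self) -> float:
        # Energy is used here as the power consumption characteristic.
        return self.est_watts * self.est_seconds


def select_collective(op_type, candidates):
    """Pick the candidate of the requested type with the lowest estimated energy."""
    matching = [c for c in candidates if c.op_type == op_type]
    if not matching:
        raise ValueError(f"no implementation registered for {op_type!r}")
    return min(matching, key=lambda c: c.est_joules)


if __name__ == "__main__":
    table = [
        CollectiveImpl("ring", "allreduce", est_watts=50.0, est_seconds=2.0),
        CollectiveImpl("tree", "allreduce", est_watts=80.0, est_seconds=1.0),
        CollectiveImpl("flat", "broadcast", est_watts=40.0, est_seconds=1.5),
    ]
    print(select_collective("allreduce", table).name)
```

In a real system each compute node would run the same selection against the same table so that all nodes agree on the chosen implementation before executing it.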
[Design and study of parallel computing environment of Monte Carlo simulation for particle therapy planning using a public cloud-computing infrastructure].

PubMed

Yokohama, Noriya

2013-07-01

This report was aimed at structuring the design of architectures and studying performance measurement of a parallel computing environment using a Monte Carlo simulation for particle therapy using a high performance computing (HPC) instance within a public cloud-computing infrastructure. Performance measurements showed an approximately 28 times faster speed than seen with single-thread architecture, combined with improved stability. A study of methods of optimizing the system operations also indicated lower cost.
Data Storage and Transfer | High-Performance Computing | NREL

Science.gov Websites

High-Performance Computing (HPC) systems. Photo of computer server wiring and lights, blurred to show data. WinSCP for Windows File Transfers Use to transfer files from a local computer to a remote computer. Robinhood for File Management Use this tool to manage your data files on Peregrine. Best
Computer Assistance in Information Work. Part I: Conceptual Framework for Improving the Computer/User Interface in Information Work. Part II: Catalog of Acceleration, Augmentation, and Delegation Functions in Information Work.

ERIC Educational Resources Information Center

Paisley, William; Butler, Matilda

This study of the computer/user interface investigated the role of the computer in performing information tasks that users now perform without computer assistance. Users' perceptual/cognitive processes are to be accelerated or augmented by the computer; a long term goal is to delegate information tasks entirely to the computer. Cybernetic and…
Early experiences in developing and managing the neuroscience gateway.

PubMed

Sivagnanam, Subhashini; Majumdar, Amit; Yoshimoto, Kenneth; Astakhov, Vadim; Bandrowski, Anita; Martone, MaryAnn; Carnevale, Nicholas T

2015-02-01

The last few decades have seen the emergence of computational neuroscience as a mature field where researchers are interested in modeling complex and large neuronal systems and require access to high performance computing machines and associated cyber infrastructure to manage computational workflow and data. The neuronal simulation tools, used in this research field, are also implemented for parallel computers and suitable for high performance computing machines. But using these tools on complex high performance computing machines remains a challenge because of issues with acquiring computer time on these machines located at national supercomputer centers, dealing with complex user interface of these machines, dealing with data management and retrieval. The Neuroscience Gateway is being developed to alleviate and/or hide these barriers to entry for computational neuroscientists. It hides or eliminates, from the point of view of the users, all the administrative and technical barriers and makes parallel neuronal simulation tools easily available and accessible on complex high performance computing machines. It handles the running of jobs and data management and retrieval. This paper shares the early experiences in bringing up this gateway and describes the software architecture it is based on, how it is implemented, and how users can use this for computational neuroscience research using high performance computing at the back end. We also look at parallel scaling of some publicly available neuronal models and analyze the recent usage data of the neuroscience gateway.
Early experiences in developing and managing the neuroscience gateway

PubMed Central

Sivagnanam, Subhashini; Majumdar, Amit; Yoshimoto, Kenneth; Astakhov, Vadim; Bandrowski, Anita; Martone, MaryAnn; Carnevale, Nicholas. T.

2015-01-01

SUMMARY The last few decades have seen the emergence of computational neuroscience as a mature field where researchers are interested in modeling complex and large neuronal systems and require access to high performance computing machines and associated cyber infrastructure to manage computational workflow and data. The neuronal simulation tools, used in this research field, are also implemented for parallel computers and suitable for high performance computing machines. But using these tools on complex high performance computing machines remains a challenge because of issues with acquiring computer time on these machines located at national supercomputer centers, dealing with complex user interface of these machines, dealing with data management and retrieval. The Neuroscience Gateway is being developed to alleviate and/or hide these barriers to entry for computational neuroscientists. It hides or eliminates, from the point of view of the users, all the administrative and technical barriers and makes parallel neuronal simulation tools easily available and accessible on complex high performance computing machines. It handles the running of jobs and data management and retrieval. This paper shares the early experiences in bringing up this gateway and describes the software architecture it is based on, how it is implemented, and how users can use this for computational neuroscience research using high performance computing at the back end. We also look at parallel scaling of some publicly available neuronal models and analyze the recent usage data of the neuroscience gateway. PMID:26523124
Alliance for Computational Science Collaboration: HBCU Partnership at Alabama A&M University Continuing High Performance Computing Research and Education at AAMU

DOE Office of Scientific and Technical Information (OSTI.GOV)

Qian, Xiaoqing; Deng, Z. T.

2009-11-10

This is the final report for the Department of Energy (DOE) project DE-FG02-06ER25746, entitled "Continuing High Performance Computing Research and Education at AAMU". This three-year project started on August 15, 2006, and ended on August 14, 2009. The objective of this project was to enhance high performance computing research and education capabilities at Alabama A&M University (AAMU), and to train African-American and other minority students and scientists in the computational science field for eventual employment with DOE. AAMU has successfully completed all the proposed research and educational tasks. Through the support of DOE, AAMU was able to provide opportunities to minority students through summer internships and the DOE computational science scholarship program. In the past three years, AAMU: (1) supported three graduate research assistants in image processing for a hypersonic shockwave control experiment and in computational science related areas; (2) recruited and provided full financial support for six AAMU undergraduate summer research interns to participate in the Research Alliance in Math and Science (RAMS) program at Oak Ridge National Lab (ORNL); (3) awarded 30 highly competitive DOE High Performance Computing Scholarships ($1500 each) to qualified top AAMU undergraduate students in science and engineering majors; (4) improved the high performance computing laboratory at AAMU with the addition of three high performance Linux workstations; (5) conducted image analysis for an electromagnetic shockwave control experiment and computation of shockwave interactions to verify the design and operation of the AAMU supersonic wind tunnel. The high performance computing research and education activities at AAMU had a great impact on minority students.
As praised by the Accreditation Board for Engineering and Technology (ABET) in 2009, "The work on high performance computing that is funded by the Department of Energy provides scholarships to undergraduate students as computational science scholars. This is a wonderful opportunity to recruit under-represented students." Three ASEE papers were published in the 2007, 2008 and 2009 proceedings of the ASEE Annual Conferences, respectively. Presentations of these papers were also made at the ASEE Annual Conferences. It is critical to continue the research and education activities.
Marc Henry de Frahan | NREL

Science.gov Websites

Computing Project, Marc develops high-fidelity turbulence models to enhance simulation accuracy and efficient numerical algorithms for future high performance computing hardware architectures. Research Interests High performance computing High order numerical methods for computational fluid dynamics Fluid
Computer task performance by subjects with Duchenne muscular dystrophy.

PubMed

Malheiros, Silvia Regina Pinheiro; da Silva, Talita Dias; Favero, Francis Meire; de Abreu, Luiz Carlos; Fregni, Felipe; Ribeiro, Denise Cardoso; de Mello Monteiro, Carlos Bandeira

2016-01-01

Two specific objectives were established to quantify computer task performance among people with Duchenne muscular dystrophy (DMD). First, we compared simple computational task performance between subjects with DMD and age-matched typically developing (TD) subjects. Second, we examined correlations between the ability of subjects with DMD to learn the computational task and their motor functionality, age, and initial task performance. The study included 84 individuals (42 with DMD, mean age of 18±5.5 years, and 42 age-matched controls). They executed a computer maze task; all participants performed the acquisition (20 attempts) and retention (five attempts) phases, repeating the same maze. A different maze was used to verify transfer performance (five attempts). The Motor Function Measure Scale was applied, and the results were compared with maze task performance. In the acquisition phase, a significant decrease was found in movement time (MT) between the first and last acquisition block, but only for the DMD group. For the DMD group, MT during transfer was shorter than during the first acquisition block, indicating improvement from the first acquisition block to transfer. In addition, the TD group showed shorter MT than the DMD group across the study. DMD participants improved their performance after practicing a computational task; however, the difference in MT was present in all attempts among DMD and control subjects. Computational task improvement was positively influenced by the initial performance of individuals with DMD. In turn, the initial performance was influenced by their distal functionality but not their age or overall functionality.
Gender plays no role in student ability to perform on computer-based examinations.

PubMed

Kies, Susan M; Williams, Benjamin D; Freund, Gregory G

2006-11-28

To see if there is a difference in performance when students switch from traditional paper-and-pencil examinations to computer-based examinations, and to determine whether there are gender differences in student performance in these two examination formats. This study involved first year medical students at the University of Illinois at Urbana-Champaign over three Academic Years 2002-03/2003-04 and 2003-05. Comparisons of student performance by overall class and gender were made. Specific comparisons within courses that utilized both the paper-and-pencil and computer formats were analyzed. Overall performance scores for students among the various Academic Years revealed no differences between exams given in the traditional pen-and-paper and computer formats. Further, when we looked specifically for gender differences in performance between these two testing formats, we found none. The format for examinations in the courses analyzed does not affect student performance. We find no evidence for gender differences in performance on exams on pen-and-paper or computer-based exams.
Performance Analysis of Cloud Computing Architectures Using Discrete Event Simulation

NASA Technical Reports Server (NTRS)

Stocker, John C.; Golomb, Andrew M.

2011-01-01

Cloud computing offers the economic benefit of on-demand resource allocation to meet changing enterprise computing needs. However, the flexibility of cloud computing is disadvantaged when compared to traditional hosting in providing predictable application and service performance. Cloud computing relies on resource scheduling in a virtualized network-centric server environment, which makes static performance analysis infeasible. We developed a discrete event simulation model to evaluate the overall effectiveness of organizations in executing their workflow in traditional and cloud computing architectures. The two part model framework characterizes both the demand using a probability distribution for each type of service request as well as enterprise computing resource constraints. Our simulations provide quantitative analysis to design and provision computing architectures that maximize overall mission effectiveness. We share our analysis of key resource constraints in cloud computing architectures and findings on the appropriateness of cloud computing in various applications.
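A minimal event-driven queueing sketch can illustrate the kind of model described in this abstract. It is not the authors' simulation: the function name, the exponential arrival and service distributions, and the first-come-first-served policy over identical servers are all assumptions made for the example.

```python
import heapq
import random


def simulate(n_servers, arrival_rate, service_rate, n_requests, seed=0):
    """Return the mean queueing delay over n_requests (first come, first served)."""
    rng = random.Random(seed)
    # Min-heap holding the time at which each server next becomes free.
    free_at = [0.0] * n_servers
    heapq.heapify(free_at)
    clock = 0.0
    total_wait = 0.0
    for _ in range(n_requests):
        clock += rng.expovariate(arrival_rate)   # arrival event
        earliest = heapq.heappop(free_at)        # first server to become free
        start = max(clock, earliest)             # wait if every server is busy
        total_wait += start - clock
        # Departure event: the server is busy until the service completes.
        heapq.heappush(free_at, start + rng.expovariate(service_rate))
    return total_wait / n_requests


if __name__ == "__main__":
    # Hypothetical demand: 0.9 requests per time unit, 1.0 served per unit per server.
    for servers in (1, 2, 4):
        print(servers, simulate(servers, 0.9, 1.0, 10_000))
```

Because the seed fixes the arrival and service draws, runs with different server counts see the same demand, which makes it easy to compare provisioning options against one workload.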
15 CFR 743.2 - High performance computers: Post shipment verification reporting.

Code of Federal Regulations, 2012 CFR

2012-01-01

... 15 Commerce and Foreign Trade 2 2012-01-01 2012-01-01 false High performance computers: Post... Commerce and Foreign Trade (Continued) BUREAU OF INDUSTRY AND SECURITY, DEPARTMENT OF COMMERCE EXPORT ADMINISTRATION REGULATIONS SPECIAL REPORTING § 743.2 High performance computers: Post shipment verification...
15 CFR 743.2 - High performance computers: Post shipment verification reporting.

Code of Federal Regulations, 2011 CFR

2011-01-01

... 15 Commerce and Foreign Trade 2 2011-01-01 2011-01-01 false High performance computers: Post... Commerce and Foreign Trade (Continued) BUREAU OF INDUSTRY AND SECURITY, DEPARTMENT OF COMMERCE EXPORT ADMINISTRATION REGULATIONS SPECIAL REPORTING § 743.2 High performance computers: Post shipment verification...
15 CFR 743.2 - High performance computers: Post shipment verification reporting.

Code of Federal Regulations, 2010 CFR

2010-01-01

... 15 Commerce and Foreign Trade 2 2010-01-01 2010-01-01 false High performance computers: Post... Commerce and Foreign Trade (Continued) BUREAU OF INDUSTRY AND SECURITY, DEPARTMENT OF COMMERCE EXPORT ADMINISTRATION REGULATIONS SPECIAL REPORTING § 743.2 High performance computers: Post shipment verification...
15 CFR 743.2 - High performance computers: Post shipment verification reporting.

Code of Federal Regulations, 2013 CFR

2013-01-01

... 15 Commerce and Foreign Trade 2 2013-01-01 2013-01-01 false High performance computers: Post... Commerce and Foreign Trade (Continued) BUREAU OF INDUSTRY AND SECURITY, DEPARTMENT OF COMMERCE EXPORT ADMINISTRATION REGULATIONS SPECIAL REPORTING § 743.2 High performance computers: Post shipment verification...
Computer Self-Efficacy, Computer Anxiety, Performance and Personal Outcomes of Turkish Physical Education Teachers

ERIC Educational Resources Information Center

Aktag, Isil

2015-01-01

The purpose of this study is to determine the computer self-efficacy, performance outcome, personal outcome, and affect and anxiety level of physical education teachers. Influence of teaching experience, computer usage and participation of seminars or in-service programs on computer self-efficacy level were determined. The subjects of this study…
Study on the Computational Estimation Performance and Computational Estimation Attitude of Elementary School Fifth Graders in Taiwan

ERIC Educational Resources Information Center

Tsao, Yea-Ling; Pan, Ting-Rung

2011-01-01

The main purpose of this study is to investigate what level of computational estimation performance is possessed by fifth graders and to explore the computational estimation attitudes of fifth graders. Two hundred and thirty-five Grade-5 students from four elementary schools in Taipei City were selected for "Computational Estimation Test" and…
High-performance computing — an overview

NASA Astrophysics Data System (ADS)

Marksteiner, Peter

1996-08-01

An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, various parallel computers like symmetric multiprocessors, workstation clusters, massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, optimization for RISC processors; parallel programming techniques like shared-memory parallelism, message passing and data parallelism; and numerical libraries.

Grand Challenges 1993: High Performance Computing and Communications. A Report by the Committee on Physical, Mathematical, and Engineering Sciences. The FY 1993 U.S. Research and Development Program.

ERIC Educational Resources Information Center

Office of Science and Technology Policy, Washington, DC.

This report presents the United States research and development program for 1993 for high performance computing and computer communications (HPCC) networks. The first of four chapters presents the program goals and an overview of the federal government's emphasis on high performance computing as an important factor in the nation's scientific and…
Development of a SaaS application probe to the physical properties of the Earth's interior: An attempt at moving HPC to the cloud

NASA Astrophysics Data System (ADS)

Huang, Qian

2014-09-01

Scientific computing often requires the availability of a massive number of computers for performing large-scale simulations, and computing in mineral physics is no exception. In order to investigate physical properties of minerals at extreme conditions in computational mineral physics, parallel computing technology is used to speed up the performance by utilizing multiple computer resources to process a computational task simultaneously, thereby greatly reducing computation time. Traditionally, parallel computing has been addressed by using High Performance Computing (HPC) solutions and installed facilities such as clusters and supercomputers. Today, there is tremendous growth in cloud computing. Infrastructure as a Service (IaaS), the on-demand and pay-as-you-go model, creates a flexible and cost-effective means to access computing resources. In this paper, a feasibility report of HPC on a cloud infrastructure is presented. It is found that current cloud services in the IaaS layer still need to improve performance to be useful to research projects. On the other hand, Software as a Service (SaaS), another type of cloud computing, is introduced into an HPC system for computing in mineral physics, and an application based on it is developed. In this paper, an overall description of this SaaS application is presented. This contribution can promote cloud application development in computational mineral physics, and cross-disciplinary studies.
Computational Fluency Performance Profile of High School Students with Mathematics Disabilities

ERIC Educational Resources Information Center

Calhoon, Mary Beth; Emerson, Robert Wall; Flores, Margaret; Houchins, David E.

2007-01-01

The purpose of this descriptive study was to develop a computational fluency performance profile of 224 high school (Grades 9-12) students with mathematics disabilities (MD). Computational fluency performance was examined by grade-level expectancy (Grades 2-6) and skill area (whole numbers: addition, subtraction, multiplication, division;…
Computer Electromagnetics and Supercomputer Architecture

NASA Technical Reports Server (NTRS)

Cwik, Tom

1993-01-01

The dramatic increase in performance over the last decade for microprocessor computations is compared with that for supercomputer computations. This performance, the projected performance, and a number of other issues such as cost and the inherent physical limitations in current supercomputer technology have naturally led to parallel supercomputers and ensembles of interconnected microprocessors.
15 CFR 743.2 - High performance computers: Post shipment verification reporting.

Code of Federal Regulations, 2014 CFR

2014-01-01

... 15 Commerce and Foreign Trade 2 2014-01-01 2014-01-01 false High performance computers: Post... Commerce and Foreign Trade (Continued) BUREAU OF INDUSTRY AND SECURITY, DEPARTMENT OF COMMERCE EXPORT ADMINISTRATION REGULATIONS SPECIAL REPORTING AND NOTIFICATION § 743.2 High performance computers: Post shipment...
Support Expressed in Congress for U.S. High-Performance Computing

NASA Astrophysics Data System (ADS)

Showstack, Randy

2004-06-01

Advocates for a stronger U.S. position in high-performance computing, which could help with a number of grand challenges in the Earth sciences and other disciplines, hope that legislation recently introduced in the House of Representatives will help to revitalize U.S. efforts. The High-Performance Computing Revitalization Act of 2004 would amend the earlier High-Performance Computing Act of 1991 (Public Law 102-194), which is partially credited with helping to strengthen U.S. capabilities in this area. The bill has the support of the Bush administration.
Evaluation of Rankine cycle air conditioning system hardware by computer simulation

NASA Technical Reports Server (NTRS)

Healey, H. M.; Clark, D.

1978-01-01

A computer program for simulating the performance of a variety of solar powered Rankine cycle air conditioning system components (RCACS) has been developed. The computer program models actual equipment by developing performance maps from manufacturers' data and is capable of simulating off-design operation of the RCACS components. The program, designed to be a subroutine of the Marshall Space Flight Center (MSFC) Solar Energy System Analysis Computer Program 'SOLRAD', is a complete package suitable for use by an occasional computer user in developing performance maps of heating, ventilation and air conditioning components.
Current state and future direction of computer systems at NASA Langley Research Center

NASA Technical Reports Server (NTRS)

Rogers, James L. (Editor); Tucker, Jerry H. (Editor)

1992-01-01

Computer systems have advanced at a rate unmatched by any other area of technology. As performance has dramatically increased there has been an equally dramatic reduction in cost. This constant cost-performance improvement has precipitated the pervasiveness of computer systems into virtually all areas of technology. This improvement is due primarily to advances in microelectronics. Most people are now convinced that the new generation of supercomputers will be built using a large number (possibly thousands) of high performance microprocessors. Although the spectacular improvements in computer systems have come about because of these hardware advances, there has also been a steady improvement in software techniques. In an effort to understand how these hardware and software advances will affect research at NASA LaRC, the Computer Systems Technical Committee drafted this white paper to examine the current state and possible future directions of computer systems at the Center. This paper discusses selected important areas of computer systems including real-time systems, embedded systems, high performance computing, distributed computing networks, data acquisition systems, artificial intelligence, and visualization.
Uncertain behaviours of integrated circuits improve computational performance.

PubMed

Yoshimura, Chihiro; Yamaoka, Masanao; Hayashi, Masato; Okuyama, Takuya; Aoki, Hidetaka; Kawarabayashi, Ken-ichi; Mizuno, Hiroyuki

2015-11-20

Improvements to the performance of conventional computers have mainly been achieved through semiconductor scaling; however, scaling is reaching its limitations. Natural phenomena, such as quantum superposition and stochastic resonance, have been introduced into new computing paradigms to improve performance beyond these limitations. Here, we explain that the uncertain behaviours of devices due to semiconductor scaling can improve the performance of computers. We prototyped an integrated circuit by performing a ground-state search of the Ising model. The bit errors of memory cell devices holding the current state of search occur probabilistically by inserting fluctuations into dynamic device characteristics, which will be actualised in the future to the chip. As a result, we observed more improvements in solution accuracy than that without fluctuations. Although the uncertain behaviours of devices had been intended to be eliminated in conventional devices, we demonstrate that uncertain behaviours have become the key to improving computational performance.
Performance measurement and modeling of component applications in a high performance computing environment : a case study.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Armstrong, Robert C.; Ray, Jaideep; Malony, A.

2003-11-01

We present a case study of performance measurement and modeling of a CCA (Common Component Architecture) component-based application in a high performance computing environment. We explore issues peculiar to component-based HPC applications and propose a performance measurement infrastructure for HPC based loosely on recent work done for Grid environments. A prototypical implementation of the infrastructure is used to collect data for three components in a scientific application and construct performance models for two of them. Both computational and message-passing performance are addressed.
Biological modelling of a computational spiking neural network with neuronal avalanches.

PubMed

Li, Xiumin; Chen, Qing; Xue, Fangzheng

2017-06-28

In recent years, an increasing number of studies have demonstrated that networks in the brain can self-organize into a critical state where dynamics exhibit a mixture of ordered and disordered patterns. This critical branching phenomenon is termed neuronal avalanches. It has been hypothesized that the homeostatic level balanced between stability and plasticity of this critical state may be the optimal state for performing diverse neural computational tasks. However, the critical region for high performance is narrow and sensitive for spiking neural networks (SNNs). In this paper, we investigated the role of the critical state in neural computations based on liquid-state machines, a biologically plausible computational neural network model for real-time computing. The computational performance of an SNN when operating at the critical state and, in particular, with spike-timing-dependent plasticity for updating synaptic weights is investigated. The network is found to show the best computational performance when it is subjected to critical dynamic states. Moreover, the active-neuron-dominant structure refined from synaptic learning can remarkably enhance the robustness of the critical state and further improve computational accuracy. These results may have important implications in the modelling of spiking neural networks with optimal computational performance. This article is part of the themed issue 'Mathematical methods in medicine: neuroscience, cardiology and pathology'. © 2017 The Author(s).
Biological modelling of a computational spiking neural network with neuronal avalanches

NASA Astrophysics Data System (ADS)

Li, Xiumin; Chen, Qing; Xue, Fangzheng

2017-05-01

In recent years, an increasing number of studies have demonstrated that networks in the brain can self-organize into a critical state where dynamics exhibit a mixture of ordered and disordered patterns. This critical branching phenomenon is termed neuronal avalanches. It has been hypothesized that the homeostatic level balanced between stability and plasticity of this critical state may be the optimal state for performing diverse neural computational tasks. However, the critical region for high performance is narrow and sensitive for spiking neural networks (SNNs). In this paper, we investigated the role of the critical state in neural computations based on liquid-state machines, a biologically plausible computational neural network model for real-time computing. The computational performance of an SNN when operating at the critical state and, in particular, with spike-timing-dependent plasticity for updating synaptic weights is investigated. The network is found to show the best computational performance when it is subjected to critical dynamic states. Moreover, the active-neuron-dominant structure refined from synaptic learning can remarkably enhance the robustness of the critical state and further improve computational accuracy. These results may have important implications in the modelling of spiking neural networks with optimal computational performance. This article is part of the themed issue 'Mathematical methods in medicine: neuroscience, cardiology and pathology'.
Analysing the performance of personal computers based on Intel microprocessors for sequence aligning bioinformatics applications.

PubMed

Nair, Pradeep S; John, Eugene B

2007-01-01

Aligning specific sequences against a very large number of other sequences is a central aspect of bioinformatics. With the widespread availability of personal computers in biology laboratories, sequence alignment is now often performed locally. This makes it necessary to analyse the performance of personal computers for sequence aligning bioinformatics benchmarks. In this paper, we analyse the performance of a personal computer for the popular BLAST and FASTA sequence alignment suites. Results indicate that these benchmarks have a large number of recurring operations and use memory operations extensively. It seems that the performance can be improved with a bigger L1-cache.
Evaluating the Efficacy of the Cloud for Cluster Computation

NASA Technical Reports Server (NTRS)

Knight, David; Shams, Khawaja; Chang, George; Soderstrom, Tom

2012-01-01

Computing requirements vary by industry, and it follows that NASA and other research organizations have computing demands that fall outside the mainstream. While cloud computing made rapid inroads for tasks such as powering web applications, performance issues on highly distributed tasks hindered early adoption for scientific computation. One venture to address this problem is Nebula, NASA's homegrown cloud project tasked with delivering science-quality cloud computing resources. However, another industry development is Amazon's high-performance computing (HPC) instances on Elastic Compute Cloud (EC2), which promise improved performance for cluster computation. This paper presents results from a series of benchmarks run on Amazon EC2 and discusses the efficacy of current commercial cloud technology for running scientific applications across a cluster. In particular, a 240-core cluster of cloud instances achieved 2 TFLOPS on High-Performance Linpack (HPL) at 70% of theoretical computational performance. The cluster's local network also demonstrated sub-100 μs inter-process latency with sustained inter-node throughput in excess of 8 Gbps. Beyond HPL, a real-world Hadoop image processing task from NASA's Lunar Mapping and Modeling Project (LMMP) was run on a 29 instance cluster to process lunar and Martian surface images with sizes on the order of tens of gigapixels. These results demonstrate that while not a rival of dedicated supercomputing clusters, commercial cloud technology is now a feasible option for moderately demanding scientific workloads.
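The HPL figures quoted in this abstract imply a theoretical peak that can be recovered with simple arithmetic. The sketch below assumes that "70% of theoretical computational performance" means achieved rate divided by peak rate, and that the quoted 2 TFLOPS is a rounded value; the function and variable names are illustrative only.

```python
# Figures quoted in the abstract (2 TFLOPS is presumably rounded).
cores = 240
achieved_tflops = 2.0   # measured HPL result
efficiency = 0.70       # assumed to mean achieved / theoretical peak


def theoretical_peak(achieved, eff):
    """Peak rate implied by an achieved rate and an efficiency fraction."""
    return achieved / eff


peak_tflops = theoretical_peak(achieved_tflops, efficiency)
peak_gflops_per_core = peak_tflops * 1000 / cores

print(f"implied peak: {peak_tflops:.2f} TFLOPS")
print(f"implied peak per core: {peak_gflops_per_core:.1f} GFLOPS")
```

Under these assumptions the implied peak is a little under 3 TFLOPS for the whole cluster, or roughly 12 GFLOPS per core.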
Performance limits and trade-offs in entropy-driven biochemical computers.

PubMed

Chu, Dominique

2018-04-14

It is now widely accepted that biochemical reaction networks can perform computations. Examples are kinetic proofreading, gene regulation, or signalling networks. For many of these systems it was found that their computational performance is limited by a trade-off between the metabolic cost, the speed and the accuracy of the computation. In order to gain insight into the origins of these trade-offs, we consider entropy-driven computers as a model of biochemical computation. Using tools from stochastic thermodynamics, we show that entropy-driven computation is subject to a trade-off between accuracy and metabolic cost, but does not involve time trade-offs. Time trade-offs appear when it is taken into account that the result of the computation needs to be measured in order to be known. We argue that this measurement process, although usually ignored, is a major contributor to the cost of biochemical computation. Copyright © 2018 Elsevier Ltd. All rights reserved.
Sex differences on a computerized mental rotation task disappear with computer familiarization.

PubMed

Roberts, J E; Bell, M A

2000-12-01

The area of cognitive research that has produced the most consistent sex differences is spatial ability. Particularly, men consistently perform better on mental rotation tasks than do women. This study examined the effects of familiarization with a computer on performance of a computerized two-dimensional mental rotation task. Two groups of college students (N=44) performed the rotation task, with one group performing a color-matching task that allowed them to be familiarized with the computer prior to the rotation task. Among the participants who only performed the rotation task, the 11 men performed better than the 11 women. Among the participants who performed the computer familiarization task before the rotation task, however, there were no sex differences on the mental rotation task between the 10 men and 12 women. These data indicate that sex differences on this two-dimensional task may reflect familiarization with the computer, not the mental rotation component of the task. Further research with larger samples and increased range of task difficulty is encouraged.
Advanced Certification Program for Computer Graphic Specialists. Final Performance Report.

ERIC Educational Resources Information Center

Parkland Coll., Champaign, IL.

A pioneer program in computer graphics was implemented at Parkland College (Illinois) to meet the demand for specialized technicians to visualize data generated on high performance computers. In summer 1989, 23 students were accepted into the pilot program. Courses included C programming, calculus and analytic geometry, computer graphics, and…
CPMIP: measurements of real computational performance of Earth system models in CMIP6

NASA Astrophysics Data System (ADS)

Balaji, Venkatramani; Maisonnave, Eric; Zadeh, Niki; Lawrence, Bryan N.; Biercamp, Joachim; Fladrich, Uwe; Aloisio, Giovanni; Benson, Rusty; Caubel, Arnaud; Durachta, Jeffrey; Foujols, Marie-Alice; Lister, Grenville; Mocavero, Silvia; Underwood, Seth; Wright, Garrett

2017-01-01

A climate model represents a multitude of processes on a variety of timescales and space scales: a canonical example of multi-physics multi-scale modeling. The underlying climate system is physically characterized by sensitive dependence on initial conditions, and natural stochastic variability, so very long integrations are needed to extract signals of climate change. Algorithms generally possess weak scaling and can be I/O and/or memory-bound. Such weak-scaling, I/O, and memory-bound multi-physics codes present particular challenges to computational performance. Traditional metrics of computational efficiency such as performance counters and scaling curves do not tell us enough about real sustained performance from climate models on different machines. They also do not provide a satisfactory basis for comparative information across models. We introduce a set of metrics that can be used for the study of computational performance of climate (and Earth system) models. These measures do not require specialized software or specific hardware counters, and should be accessible to anyone. They are independent of platform and underlying parallel programming models. We show how these metrics can be used to measure actually attained performance of Earth system models on different machines, and identify the most fruitful areas of research and development for performance engineering. We present results for these measures for a diverse suite of models from several modeling centers, and propose to use these measures as a basis for a CPMIP, a computational performance model intercomparison project (MIP).
Assessing a mini-application as a performance proxy for a finite element method engineering application

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lin, Paul T.; Heroux, Michael A.; Barrett, Richard F.

The performance of a large-scale, production-quality science and engineering application (‘app’) is often dominated by a small subset of the code. Even within that subset, computational and data access patterns are often repeated, so that an even smaller portion can represent the performance-impacting features. If application developers, parallel computing experts, and computer architects can together identify this representative subset and then develop a small mini-application (‘miniapp’) that can capture these primary performance characteristics, then this miniapp can be used to both improve the performance of the app as well as provide a tool for co-design for the high-performance computing community. However, a critical question is whether a miniapp can effectively capture key performance behavior of an app. This study provides a comparison of an implicit finite element semiconductor device modeling app on unstructured meshes with an implicit finite element miniapp on unstructured meshes. The goal is to assess whether the miniapp is predictive of the performance of the app. Finally, single compute node performance will be compared, as well as scaling up to 16,000 cores. Results indicate that the miniapp can be reasonably predictive of the performance characteristics of the app for a single iteration of the solver on a single compute node.
Assessing a mini-application as a performance proxy for a finite element method engineering application

DOE PAGES

Lin, Paul T.; Heroux, Michael A.; Barrett, Richard F.; ...

2015-07-30

The performance of a large-scale, production-quality science and engineering application (‘app’) is often dominated by a small subset of the code. Even within that subset, computational and data access patterns are often repeated, so that an even smaller portion can represent the performance-impacting features. If application developers, parallel computing experts, and computer architects can together identify this representative subset and then develop a small mini-application (‘miniapp’) that can capture these primary performance characteristics, then this miniapp can be used to both improve the performance of the app as well as provide a tool for co-design for the high-performance computing community. However, a critical question is whether a miniapp can effectively capture key performance behavior of an app. This study provides a comparison of an implicit finite element semiconductor device modeling app on unstructured meshes with an implicit finite element miniapp on unstructured meshes. The goal is to assess whether the miniapp is predictive of the performance of the app. Finally, single compute node performance will be compared, as well as scaling up to 16,000 cores. Results indicate that the miniapp can be reasonably predictive of the performance characteristics of the app for a single iteration of the solver on a single compute node.

Efficient universal blind quantum computation.

PubMed

Giovannetti, Vittorio; Maccone, Lorenzo; Morimae, Tomoyuki; Rudolph, Terry G

2013-12-06

We give a cheat sensitive protocol for blind universal quantum computation that is efficient in terms of computational and communication resources: it allows one party to perform an arbitrary computation on a second party's quantum computer without revealing either which computation is performed, or its input and output. The first party's computational capabilities can be extremely limited: she must only be able to create and measure single-qubit superposition states. The second party is not required to use measurement-based quantum computation. The protocol requires the (optimal) exchange of O(J log2(N)) single-qubit states, where J is the computational depth and N is the number of qubits needed for the computation.
HPCCP/CAS Workshop Proceedings 1998

NASA Technical Reports Server (NTRS)

Schulbach, Catherine; Mata, Ellen (Editor); Schulbach, Catherine (Editor)

1999-01-01

This publication is a collection of extended abstracts of presentations given at the HPCCP/CAS (High Performance Computing and Communications Program/Computational Aerosciences Project) Workshop held on August 24-26, 1998, at NASA Ames Research Center, Moffett Field, California. The objective of the Workshop was to bring together the aerospace high performance computing community, consisting of airframe and propulsion companies, independent software vendors, university researchers, and government scientists and engineers. The Workshop was sponsored by the HPCCP Office at NASA Ames Research Center. The Workshop consisted of over 40 presentations, including an overview of NASA's High Performance Computing and Communications Program and the Computational Aerosciences Project; ten sessions of papers representative of the high performance computing research conducted within the Program by the aerospace industry, academia, NASA, and other government laboratories; two panel sessions; and a special presentation by Mr. James Bailey.
Improving Student Performance through Computer-Based Assessment: Insights from Recent Research.

ERIC Educational Resources Information Center

Ricketts, C.; Wilks, S. J.

2002-01-01

Compared student performance on computer-based assessment to machine-graded multiple choice tests. Found that performance improved dramatically on the computer-based assessment when students were not required to scroll through the question paper. Concluded that students may be disadvantaged by the introduction of online assessment unless care is…
Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

NASA Astrophysics Data System (ADS)

Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; Eulisse, Giulio; Knight, Robert; Muzaffar, Shahzad

2015-05-01

Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).
Optical interconnection networks for high-performance computing systems

NASA Astrophysics Data System (ADS)

Biberman, Aleksandr; Bergman, Keren

2012-04-01

Enabled by silicon photonic technology, optical interconnection networks have the potential to be a key disruptive technology in computing and communication industries. The enduring pursuit of performance gains in computing, combined with stringent power constraints, has fostered the ever-growing computational parallelism associated with chip multiprocessors, memory systems, high-performance computing systems and data centers. Sustaining these parallelism growths introduces unique challenges for on- and off-chip communications, shifting the focus toward novel and fundamentally different communication approaches. Chip-scale photonic interconnection networks, enabled by high-performance silicon photonic devices, offer unprecedented bandwidth scalability with reduced power consumption. We demonstrate that the silicon photonic platforms have already produced all the high-performance photonic devices required to realize these types of networks. Through extensive empirical characterization in much of our work, we demonstrate such feasibility of waveguides, modulators, switches and photodetectors. We also demonstrate systems that simultaneously combine many functionalities to achieve more complex building blocks. We propose novel silicon photonic devices, subsystems, network topologies and architectures to enable unprecedented performance of these photonic interconnection networks. Furthermore, the advantages of photonic interconnection networks extend far beyond the chip, offering advanced communication environments for memory systems, high-performance computing systems, and data centers.
Demonstration of Cost-Effective, High-Performance Computing at Performance and Reliability Levels Equivalent to a 1994 Vector Supercomputer

NASA Technical Reports Server (NTRS)

Babrauckas, Theresa

2000-01-01

The Affordable High Performance Computing (AHPC) project demonstrated that high-performance computing based on a distributed network of computer workstations is a cost-effective alternative to vector supercomputers for running CPU and memory intensive design and analysis tools. The AHPC project created an integrated system called a Network Supercomputer. By connecting computer workstations through a network and utilizing the workstations when they are idle, the resulting distributed-workstation environment has the same performance and reliability levels as the Cray C90 vector Supercomputer at less than 25 percent of the C90 cost. In fact, the cost comparison between a Cray C90 Supercomputer and Sun workstations showed that the number of distributed networked workstations equivalent to a C90 costs approximately 8 percent of the C90.
High-performance computing on GPUs for resistivity logging of oil and gas wells

NASA Astrophysics Data System (ADS)

Glinskikh, V.; Dudaev, A.; Nechaev, O.; Surodina, I.

2017-10-01

We developed and implemented in software an algorithm for high-performance simulation of electrical logs from oil and gas wells using high-performance heterogeneous computing. The numerical solution of the 2D forward problem is based on the finite-element method and the Cholesky decomposition for solving a system of linear algebraic equations (SLAE). Software implementations of the algorithm were built with NVIDIA CUDA technology and computing libraries, allowing us to decompose the SLAE and find its solution on the central processing unit (CPU) and the graphics processing unit (GPU). The calculation time is analyzed depending on the matrix size and the number of its non-zero elements. We estimated the computing speed on CPU and GPU, including high-performance heterogeneous CPU-GPU computing. Using the developed algorithm, we simulated resistivity data in realistic models.
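The Cholesky-based solution step named in this abstract can be illustrated on a tiny dense system. The sketch below is a plain-Python illustration of the technique only; it is not the authors' CUDA implementation, it ignores sparsity, and the matrix, right-hand side, and function names are made up for the example.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def solve(a, b):
    """Solve a x = b via Cholesky factorization and two triangular solves."""
    n = len(a)
    L = cholesky(a)
    # Forward substitution: L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Backward substitution: L^T x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


# Illustrative symmetric positive definite system (not from the paper).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = solve(A, b)
print(x)
```

For finite-element matrices of realistic size a sparse factorization would normally be used; the dense loops here are only meant to show the factor-then-substitute structure.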
Chemical calculations on Cray computers

NASA Technical Reports Server (NTRS)

Taylor, Peter R.; Bauschlicher, Charles W., Jr.; Schwenke, David W.

1989-01-01

The influence of recent developments in supercomputing on computational chemistry is discussed with particular reference to Cray computers and their pipelined vector/limited parallel architectures. After reviewing Cray hardware and software, the performance of different elementary program structures is examined, and effective methods for improving program performance are outlined. The computational strategies appropriate for obtaining optimum performance in applications to quantum chemistry and dynamics are discussed. Finally, some discussion is given of new developments and future hardware and software improvements.
Affordable Emerging Computer Hardware for Neuromorphic Computing Applications

DTIC Science & Technology

2011-09-01

...speedup over software [3, 4]. Table 1 shows a comparison of the computing performance, communication performance, power consumption...time is probably 5 frames per second, corresponding to 5 saccades. III. RESULTS AND DISCUSSION The use of IBM Cell-BE technology (Sony PlayStation
Gender plays no role in student ability to perform on computer-based examinations

PubMed Central

Kies, Susan M; Williams, Benjamin D; Freund, Gregory G

2006-01-01

Background: To see if there is a difference in performance when students switch from traditional paper-and-pencil examinations to computer-based examinations, and to determine whether there are gender differences in student performance in these two examination formats. Methods: This study involved first year medical students at the University of Illinois at Urbana-Champaign over three Academic Years 2002–03/2003–04 and 2003–05. Comparisons of student performance by overall class and gender were made. Specific comparisons within courses that utilized both the paper-and-pencil and computer formats were analyzed. Results: Overall performance scores for students among the various Academic Years revealed no differences between exams given in the traditional pen-and-paper and computer formats. Further, when we looked specifically for gender differences in performance between these two testing formats, we found none. Conclusion: The format for examinations in the courses analyzed does not affect student performance. We find no evidence for gender differences in performance on exams on pen-and-paper or computer-based exams. PMID:17132169
High Performance Computing and Communications Act of 1991. Hearing Before the Subcommittee on Science, Technology, and Space of the Committee on Commerce, Science, and Transportation. One Hundred Second Congress, First Session on S. 272 To Provide for a Coordinated Federal Research Program To Ensure Continued United States Leadership in High-Performance Computing.

ERIC Educational Resources Information Center

Congress of the U.S., Washington, DC. Senate Committee on Commerce, Science, and Transportation.

This hearing before the Senate Subcommittee on Science, Technology, and Space focuses on S. 272, the High-Performance Computing and Communications Act of 1991, a bill that provides for a coordinated federal research and development program to ensure continued U.S. leadership in this area. Performance computing is defined as representing the…
Analyses of ACPL thermal/fluid conditioning system

NASA Technical Reports Server (NTRS)

Stephen, L. A.; Usher, L. H.

1976-01-01

Results of engineering analyses are reported. Initial computations were made using a modified control transfer function where the systems performance was characterized parametrically using an analytical model. The analytical model was revised to represent the latest expansion chamber fluid manifold design, and systems performance predictions were made. Parameters which were independently varied in these computations are listed. Systems predictions which were used to characterize performance are primarily transient computer plots comparing the deviation between average chamber temperature and the chamber temperature requirement. Additional computer plots were prepared. Results of parametric computations with the latest fluid manifold design are included.
Performance of parallel computation using CUDA for solving the one-dimensional elasticity equations

NASA Astrophysics Data System (ADS)

Darmawan, J. B. B.; Mungkasi, S.

2017-01-01

In this paper, we investigate the performance of parallel computation in solving the one-dimensional elasticity equations. Elasticity equations are commonly used in engineering science, and solving them quickly and efficiently is desirable. Therefore, we propose the use of parallel computation. Our parallel computation uses NVIDIA's CUDA. Our results show that parallel computation using CUDA has a great advantage and is powerful when the computation is of large scale.
Tse computers. [Chinese pictograph character binary image processor design for high speed applications]

NASA Technical Reports Server (NTRS)

Strong, J. P., III

1973-01-01

Tse computers have the potential of operating four or five orders of magnitude faster than present digital computers. The computers of the new design use binary images as their basic computational entity. The word 'tse' is the transliteration of the Chinese word for 'pictograph character.' Tse computers are large collections of devices that perform logical operations on binary images. The operations on binary images are to be performed over the entire image simultaneously.
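The idea of applying a logical operation to an entire binary image at once can be mimicked in software by packing the image into a single integer, so that one bitwise operator touches every pixel. This is only a software analogy, not a model of the hardware devices described in the abstract; the `pack` and `unpack` helpers and the sample images are hypothetical.

```python
def pack(rows):
    """Pack a list of equal-length '0'/'1' strings into one integer bit plane."""
    return int("".join(rows), 2)


def unpack(bits, width, height):
    """Turn an integer bit plane back into a list of row strings."""
    s = format(bits, f"0{width * height}b")
    return [s[r * width:(r + 1) * width] for r in range(height)]


# Two illustrative 4x3 binary images.
a = ["1100",
     "1100",
     "0000"]
b = ["0110",
     "0110",
     "0110"]

A, B = pack(a), pack(b)

# One operator application acts on every pixel of the image at once.
and_img = unpack(A & B, 4, 3)
or_img = unpack(A | B, 4, 3)

print("\n".join(and_img))
print()
print("\n".join(or_img))
```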
Lanczos eigensolution method for high-performance computers

NASA Technical Reports Server (NTRS)

Bostic, Susan W.

1991-01-01

The theory, computational analysis, and applications of a Lanczos algorithm on high performance computers are presented. The computationally intensive steps of the algorithm are identified as the matrix factorization, the forward/backward equation solution, and the matrix-vector multiplies. These computational steps are optimized to exploit the vector and parallel capabilities of high performance computers. The savings in computational time from applying optimization techniques such as variable band and sparse data storage and access, loop unrolling, use of local memory, and compiler directives are presented. Two large scale structural analysis applications are described: the buckling of a composite blade stiffened panel with a cutout, and the vibration analysis of a high speed civil transport. The sequential computational time of 181.6 seconds for the panel problem executed on a CONVEX computer was decreased to 14.1 seconds with the optimized vector algorithm. The best computational time of 23 seconds for the transport problem with 17,000 degrees of freedom was on the Cray Y-MP using an average of 3.63 processors.
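For orientation, the core of a Lanczos method is a three-term recurrence that reduces a symmetric matrix to tridiagonal form using only matrix-vector products. The sketch below shows that basic recurrence in plain Python, without reorthogonalization; it is not the optimized vector/parallel implementation from the report, it omits the factorization and forward/backward solves the abstract mentions, and the test matrix and names are illustrative.

```python
import math


def matvec(a, v):
    """Dense matrix-vector product."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def lanczos(a, v0, m):
    """Run up to m Lanczos steps on symmetric matrix a.

    Returns (alphas, betas): the diagonal and off-diagonal of the
    tridiagonal matrix T produced by the recurrence.
    """
    norm = math.sqrt(dot(v0, v0))
    q = [x / norm for x in v0]
    q_prev = [0.0] * len(v0)
    beta = 0.0
    alphas, betas = [], []
    for step in range(m):
        w = matvec(a, q)
        alpha = dot(w, q)
        # Three-term recurrence: remove components along q and q_prev.
        w = [wi - alpha * qi - beta * pi for wi, qi, pi in zip(w, q, q_prev)]
        alphas.append(alpha)
        beta = math.sqrt(dot(w, w))
        if step == m - 1 or beta < 1e-12:
            break  # finished, or an invariant subspace was found
        betas.append(beta)
        q_prev, q = q, [wi / beta for wi in w]
    return alphas, betas


# Illustrative symmetric matrix (not from the report).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
alphas, betas = lanczos(A, [1.0, 1.0, 1.0], 3)
print(alphas, betas)
```

The eigenvalues of the small tridiagonal matrix approximate eigenvalues of the original matrix; production codes add reorthogonalization because the plain recurrence loses orthogonality in floating-point arithmetic.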
[Series: Medical Applications of the PHITS Code (2): Acceleration by Parallel Computing].

PubMed

Furuta, Takuya; Sato, Tatsuhiko

2015-01-01

Time-consuming Monte Carlo dose calculation has become feasible owing to the development of computer technology. However, the recent development is due to the emergence of multi-core high performance computers. Therefore, parallel computing is key to achieving good software performance. The Monte Carlo simulation code PHITS contains two parallel computing functions: distributed-memory parallelization using the Message Passing Interface (MPI) protocol and shared-memory parallelization using Open Multi-Processing (OpenMP) directives. Users can choose between the two functions according to their needs. This paper explains the two functions along with their advantages and disadvantages. Some test applications are also provided to show their performance on a typical multi-core high performance workstation.
Performing an allreduce operation using shared memory

DOEpatents

Archer, Charles J [Rochester, MN]; Dozsa, Gabor [Ardsley, NY]; Ratterman, Joseph D [Rochester, MN]; Smith, Brian E [Rochester, MN]

2012-04-17

Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.
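The work-unit scheme described in this abstract can be illustrated with a small sketch. The Python below is only an analogy and not the patented method: threads stand in for processing cores, and the names `JobStatus`, `next_unit`, `allreduce_sum`, `n_workers`, and `unit_size` are hypothetical. A shared job status object lists the work units, and each available worker claims the next unit until none remain.

```python
import threading


class JobStatus:
    """Hypothetical job status object: a list of work units plus a cursor."""

    def __init__(self, n_elements, unit_size):
        # Each work unit is a half-open index range [start, stop).
        self.units = [(s, min(s + unit_size, n_elements))
                      for s in range(0, n_elements, unit_size)]
        self._next = 0
        self._lock = threading.Lock()

    def next_unit(self):
        """Hand out the next unclaimed work unit, or None when all are taken."""
        with self._lock:
            if self._next >= len(self.units):
                return None
            unit = self.units[self._next]
            self._next += 1
            return unit


def allreduce_sum(buffers, n_workers=4, unit_size=2):
    """Element-wise sum over all buffers; every buffer receives the result."""
    n = len(buffers[0])
    result = [0] * n
    job = JobStatus(n, unit_size)

    def worker():
        # An "available core" repeatedly claims and performs the next unit.
        while True:
            unit = job.next_unit()
            if unit is None:
                return
            lo, hi = unit
            for i in range(lo, hi):  # units are disjoint, so writes never collide
                result[i] = sum(buf[i] for buf in buffers)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # "All" in allreduce: every participant ends up with the reduced values.
    for buf in buffers:
        buf[:] = result
    return result
```

Because each work unit covers a disjoint index range, only the cursor in the job status object needs a lock.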
Performing an allreduce operation using shared memory

DOEpatents

Archer, Charles J; Dozsa, Gabor; Ratterman, Joseph D; Smith, Brian E

2014-06-10

Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.
Performance Evaluation in Network-Based Parallel Computing

NASA Technical Reports Server (NTRS)

Dezhgosha, Kamyar

1996-01-01

Network-based parallel computing is emerging as a cost-effective alternative for solving many problems which require use of supercomputers or massively parallel computers. The primary objective of this project has been to conduct experimental research on performance evaluation for clustered parallel computing. First, a testbed was established by augmenting our existing SUNSPARCs' network with PVM (Parallel Virtual Machine), which is a software system for linking clusters of machines. Second, a set of three basic applications was selected. The applications consist of a parallel search, a parallel sort, and a parallel matrix multiplication. These application programs were implemented in the C programming language under PVM. Third, we conducted performance evaluation under various configurations and problem sizes. Alternative parallel computing models and workload allocations for application programs were explored. The performance metric was limited to elapsed time or response time, which in the context of parallel computing can be expressed in terms of speedup. The results reveal that the overhead of communication latency between processes is in many cases the restricting factor on performance. That is, coarse-grain parallelism, which requires less frequent communication between processes, will result in higher performance in network-based computing. Finally, we are in the final stages of installing an Asynchronous Transfer Mode (ATM) switch and four ATM interfaces (each 155 Mbps), which will allow us to extend our study to newer applications, performance metrics, and configurations.
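The speedup metric mentioned in this abstract is conventionally the ratio of single-process elapsed time to parallel elapsed time, and dividing it by the process count gives parallel efficiency. A minimal Python sketch follows; the timings are invented for illustration and are not results from the report.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T_1 / T_p, computed from elapsed (wall-clock) times."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("elapsed times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Parallel efficiency E = S / p; 1.0 would be ideal linear scaling."""
    return speedup(t_serial, t_parallel) / n_procs


if __name__ == "__main__":
    # Hypothetical elapsed times in seconds, keyed by process count.
    timings = {1: 120.0, 2: 66.0, 4: 40.0, 8: 30.0}
    t1 = timings[1]
    for p, tp in timings.items():
        print(f"p={p}: speedup={speedup(t1, tp):.2f}, "
              f"efficiency={efficiency(t1, tp, p):.2f}")
```

Efficiency that falls as the process count grows is the usual sign of the communication overhead the abstract describes.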
Combining high performance simulation, data acquisition, and graphics display computers

NASA Technical Reports Server (NTRS)

Hickman, Robert J.

1989-01-01

Issues involved in the continuing development of an advanced simulation complex are discussed. This approach provides the capability to perform the majority of tests on advanced systems non-destructively. The controlled test environments can be replicated to examine the response of the systems under test to alternative treatments of the system control design, or to test the function and qualification of specific hardware. Field tests verify that the elements simulated in the laboratories are sufficient. The digital computer is hosted by a Digital Equipment Corp. MicroVAX computer with an Aptec Computer Systems Model 24 I/O computer performing the communication function. An Applied Dynamics International AD100 performs the high speed simulation computing and an Evans and Sutherland PS350 performs on-line graphics display. A Scientific Computer Systems SCS40 acts as a high performance FORTRAN program processor to support the complex by generating numerous large files from programs coded in FORTRAN that are required for the real time processing. Four programming languages are involved in the process: FORTRAN, ADSIM, ADRIO, and STAPLE. FORTRAN is employed on the MicroVAX host to initialize and terminate the simulation runs on the system. The generation of the data files on the SCS40 is also performed with FORTRAN programs. ADSIM and ADRIO are used to program the processing elements of the AD100 and its IOCP processor. STAPLE is used to program the Aptec DIP and DIA processors.

User's guide to the NOZL3D and NOZLIC computer programs

NASA Technical Reports Server (NTRS)

Thomas, P. D.

1980-01-01

Complete FORTRAN listings and running instructions are given for a set of computer programs that perform an implicit numerical solution to the unsteady Navier-Stokes equations to predict the flow characteristics and performance of nonaxisymmetric nozzles. The set includes the NOZL3D program, which performs the flow computations; the NOZLIC program, which sets up the flow field initial conditions for general nozzle configurations, and also generates the computational grid for simple two dimensional and axisymmetric configurations; and the RGRIDD program, which generates the computational grid for complicated three dimensional configurations. The programs are designed specifically for the NASA-Langley CYBER 175 computer, and employ auxiliary disk files for primary data storage. Input instructions and computed results are given for four test cases that include two dimensional, three dimensional, and axisymmetric configurations.
Gender Differences in Attitudes toward Computers and Performance in the Accounting Information Systems Class

ERIC Educational Resources Information Center

Lenard, Mary Jane; Wessels, Susan; Khanlarian, Cindi

2010-01-01

Using a model developed by Young (2000), this paper explores the relationship between performance in the Accounting Information Systems course, self-assessed computer skills, and attitudes toward computers. Results show that after taking the AIS course, students experience a change in perception about their use of computers. Females'…
Distributed Accounting on the Grid

NASA Technical Reports Server (NTRS)

Thigpen, William; Hacker, Thomas J.; McGinnis, Laura F.; Athey, Brian D.

2001-01-01

By the late 1990s, the Internet was adequately equipped to move vast amounts of data between HPC (High Performance Computing) systems, and efforts were initiated to link the national infrastructure of high performance computational and data storage resources together into a general computational utility 'grid', analogous to the national electrical power grid infrastructure. The purpose of the Computational grid is to provide dependable, consistent, pervasive, and inexpensive access to computational resources for the computing community in the form of a computing utility. This paper presents a fully distributed view of Grid usage accounting and a methodology for allocating Grid computational resources for use on a Grid computing system.
HPC on Competitive Cloud Resources

NASA Astrophysics Data System (ADS)

Bientinesi, Paolo; Iakymchuk, Roman; Napper, Jeff

Computing as a utility has reached the mainstream. Scientists can now easily rent time on large commercial clusters that can be expanded and reduced on-demand in real-time. However, current commercial cloud computing performance falls short of systems specifically designed for scientific applications. Scientific computing needs are quite different from those of the web applications that have been the focus of cloud computing vendors. In this chapter we demonstrate through empirical evaluation the computational efficiency of high-performance numerical applications in a commercial cloud environment when resources are shared under high contention. Using the Linpack benchmark as a case study, we show that cache utilization becomes highly unpredictable and similarly affects computation time. For some problems, not only is it more efficient to underutilize resources, but the solution can be reached sooner in real time (wall-time). We also show that the smallest, cheapest (64-bit) instance on the studied environment is the best for price to performance ratio. In light of the high contention we witness, we believe that alternative definitions of efficiency for commercial cloud environments should be introduced where strong performance guarantees do not exist. Concepts like average, expected performance and execution time, expected cost to completion, and variance measures--traditionally ignored in the high-performance computing context--now should complement or even substitute the standard definitions of efficiency.
Performance of the MIR Cooperative Solar Array After 2.5 Years in Orbit

NASA Technical Reports Server (NTRS)

Kerslake, Thomas W.; Hoffman, David J.

1999-01-01

The Mir Cooperative Solar Array (MCSA) was developed jointly by the United States and Russia to produce 6 kW of power for the Russian space station Mir. Four multi-orbit test sequences were executed between June 1996 and December 1998 to measure MCSA electrical performance. A dedicated Fortran computer code was developed to analyze the detailed thermal-electrical performance of the MCSA. The computational performance results compared very favorably with the measured flight data in most cases. Minor performance degradation was detected in one current generating section of the MCSA. Yet overall, the flight data indicated the MCSA was meeting and exceeding performance expectations. There was no precipitous performance loss due to contamination or other causes after 2.5 years of operation. In this paper, we review the MCSA flight electrical performance tests, data and computational modeling and discuss findings from data comparisons with the computational results.
Generic Divide and Conquer Internet-Based Computing

NASA Technical Reports Server (NTRS)

Follen, Gregory J. (Technical Monitor); Radenski, Atanas

2003-01-01

The growth of Internet-based applications and the proliferation of networking technologies have been transforming traditional commercial application areas as well as computer and computational sciences and engineering. This growth stimulates the exploration of Peer to Peer (P2P) software technologies that can open new research and application opportunities not only for the commercial world, but also for the scientific and high-performance computing applications community. The general goal of this project is to achieve better understanding of the transition to Internet-based high-performance computing and to develop solutions for some of the technical challenges of this transition. In particular, we are interested in creating long-term motivation for end users to provide their idle processor time to support computationally intensive tasks. We believe that a practical P2P architecture should provide useful service to both clients with high-performance computing needs and contributors of lower-end computing resources. To achieve this, we are designing a dual-service architecture for P2P high-performance divide-and-conquer computing; we are also experimenting with a prototype implementation. Our proposed architecture incorporates a master server, utilizes dual satellite servers, and operates on the Internet in a dynamically changing large configuration of lower-end nodes provided by volunteer contributors. A dual satellite server comprises a high-performance computing engine and a lower-end contributor service engine. The computing engine provides generic support for divide and conquer computations. The service engine is intended to provide free useful HTTP-based services to contributors of lower-end computing resources. Our proposed architecture is complementary to and accessible from computational grids, such as Globus, Legion, and Condor.
Grids provide remote access to existing higher-end computing resources; in contrast, our goal is to utilize idle processor time of lower-end Internet nodes. Our project is focused on a generic divide and conquer paradigm and on mobile applications of this paradigm that can operate on a loose and ever changing pool of lower-end Internet nodes.
The dynamics of discrete-time computation, with application to recurrent neural networks and finite state machine extraction.

PubMed

Casey, M

1996-08-15

Recurrent neural networks (RNNs) can learn to perform finite state computations. It is shown that an RNN performing a finite state computation must organize its state space to mimic the states in the minimal deterministic finite state machine that can perform that computation, and a precise description of the attractor structure of such systems is given. This knowledge effectively predicts activation space dynamics, which allows one to understand RNN computation dynamics in spite of complexity in activation dynamics. This theory provides a theoretical framework for understanding finite state machine (FSM) extraction techniques and can be used to improve training methods for RNNs performing FSM computations. This provides an example of a successful approach to understanding a general class of complex systems that has not been explicitly designed, e.g., systems that have evolved or learned their internal structure.
Evaluating the generalization of math fact fluency gains across paper and computer performance modalities.

PubMed

Duhon, Gary J; House, Sara H; Stinnett, Terry A

2012-06-01

Computer-based interventions are being used more in the classroom. Student responses to these interventions often contribute to decision making regarding important outcomes. It is important to understand the effect of these interventions within the context of the intervention as well as across related contexts. The current study examined the generalization of math fact fluency gains resulting from a computer-based intervention to paper-and-pencil performance. A total of 31 second grade students completed fluency drills on the computer or with paper and pencil. Pretest-posttest performance on both computer and paper and pencil for all students was evaluated using a doubly multivariate repeated measure ANOVA. Results indicated that gains achieved on the computer did not generalize to paper-and-pencil performance. Copyright © 2012 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
The DoD's High Performance Computing Modernization Program - Ensuring the National Earth Systems Prediction Capability Becomes Operational

NASA Astrophysics Data System (ADS)

Burnett, W.

2016-12-01

The Department of Defense's (DoD) High Performance Computing Modernization Program (HPCMP) provides high performance computing to address the most significant challenges in computational resources, software application support and nationwide research and engineering networks. Today, the HPCMP has a critical role in ensuring the National Earth System Prediction Capability (N-ESPC) achieves initial operational status in 2019. A 2015 study commissioned by the HPCMP found that N-ESPC computational requirements will exceed interconnect bandwidth capacity due to the additional load from data assimilation and passing connecting data between ensemble codes. Memory bandwidth and I/O bandwidth will continue to be significant bottlenecks for the Navy's Hybrid Coordinate Ocean Model (HYCOM) scalability - by far the major driver of computing resource requirements in the N-ESPC. The study also found that few of the N-ESPC model developers have detailed plans to ensure their respective codes scale through 2024. Three HPCMP initiatives are designed to directly address and support these issues: Productivity Enhancement, Technology Transfer and Training (PETTT), the HPCMP Applications Software Initiative (HASI), and Frontier Projects. PETTT supports code conversion by providing assistance, expertise and training in scalable and high-end computing architectures. HASI addresses the continuing need for modern application software that executes effectively and efficiently on next-generation high-performance computers. Frontier Projects enable research and development that could not be achieved using typical HPCMP resources by providing multi-disciplinary teams access to exceptional amounts of high performance computing resources. Finally, the Navy's DoD Supercomputing Resource Center (DSRC) currently operates a 6 Petabyte system, of which Naval Oceanography receives 15% of operational computational system use, or approximately 1 Petabyte of the processing capability.
The DSRC will provide the DoD with future computing assets to initially operate the N-ESPC in 2019. This talk will further describe how DoD's HPCMP will ensure N-ESPC becomes operational, efficiently and effectively, using next-generation high performance computing.
Verification of a VRF Heat Pump Computer Model in EnergyPlus

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nigusse, Bereket; Raustad, Richard

2013-06-15

This paper provides verification results of the EnergyPlus variable refrigerant flow (VRF) heat pump computer model using manufacturer's performance data. The paper provides an overview of the VRF model, presents the verification methodology, and discusses the results. The verification provides quantitative comparison of full and part-load performance to manufacturer's data in cooling-only and heating-only modes of operation. The VRF heat pump computer model uses dual range bi-quadratic performance curves to represent capacity and Energy Input Ratio (EIR) as a function of indoor and outdoor air temperatures, and dual range quadratic performance curves as a function of part-load-ratio for modeling part-load performance. These performance curves are generated directly from manufacturer's published performance data. The verification compared the simulation output directly to manufacturer's performance data, and found that the dual range equation fit VRF heat pump computer model predicts the manufacturer's performance data very well over a wide range of indoor and outdoor temperatures and part-load conditions. The predicted capacity and electric power deviations are comparable to equation-fit HVAC computer models commonly used for packaged and split unitary HVAC equipment.
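For readers unfamiliar with the curve forms named in this abstract, a bi-quadratic performance curve is a second-order polynomial in two variables, and a quadratic curve is a second-order polynomial in one. The Python sketch below evaluates both forms. The coefficient values are placeholders chosen for illustration, not manufacturer data, and the dual range selection between low- and high-temperature coefficient sets is not modeled.

```python
def biquadratic(x, y, coeffs):
    """z = c1 + c2*x + c3*x^2 + c4*y + c5*y^2 + c6*x*y."""
    c1, c2, c3, c4, c5, c6 = coeffs
    return c1 + c2 * x + c3 * x * x + c4 * y + c5 * y * y + c6 * x * y


def quadratic(x, coeffs):
    """z = c1 + c2*x + c3*x^2, e.g. a modifier as a function of part-load ratio."""
    c1, c2, c3 = coeffs
    return c1 + c2 * x + c3 * x * x


if __name__ == "__main__":
    # Placeholder coefficients (assumed, for illustration only).
    cap_coeffs = (0.9, 0.01, 0.0002, -0.004, -0.00005, 0.0001)
    t_indoor_wb, t_outdoor_db = 19.4, 35.0  # example temperatures in deg C
    print("capacity modifier:", biquadratic(t_indoor_wb, t_outdoor_db, cap_coeffs))
    print("part-load modifier:", quadratic(0.5, (0.1, 0.6, 0.3)))
```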
Heterogeneous high throughput scientific computing with APM X-Gene and Intel Xeon Phi

DOE PAGES

Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; ...

2015-05-22

Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. As a result, we report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).
Computer software.

PubMed

Rosenthal, L E

1986-10-01

Software is the component in a computer system that permits the hardware to perform the various functions that a computer system is capable of doing. The history of software and its development can be traced to the early nineteenth century. All computer systems are designed to utilize the "stored program concept" as first developed by Charles Babbage in the 1850s. The concept was lost until the mid-1940s, when modern computers made their appearance. Today, because of the complex and myriad tasks that a computer system can perform, there has been a differentiation of types of software. There is software designed to perform specific business applications. There is software that controls the overall operation of a computer system. And there is software that is designed to carry out specialized tasks. Regardless of types, software is the most critical component of any computer system. Without it, all one has is a collection of circuits, transistors, and silicon chips.
Parameters that affect parallel processing for computational electromagnetic simulation codes on high performance computing clusters

NASA Astrophysics Data System (ADS)

Moon, Hongsik

What is the impact of multicore and associated advanced technologies on computational software for science? Most researchers and students have multicore laptops or desktops for their research and they need computing power to run computational software packages. Computing power was initially derived from Central Processing Unit (CPU) clock speed. That changed when increases in clock speed became constrained by power requirements. Chip manufacturers turned to multicore CPU architectures and associated technological advancements to create the CPUs for the future. Most software applications benefited from the increased computing power in the same way that increases in clock speed helped applications run faster. However, for Computational ElectroMagnetics (CEM) software developers, this change was not an obvious benefit - it appeared to be a detriment. Developers were challenged to find a way to correctly utilize the advancements in hardware so that their codes could benefit. The solution was parallelization and this dissertation details the investigation to address these challenges. Prior to multicore CPUs, advanced computer technologies were compared by their performance on benchmark software, and the metric was FLoating-point Operations Per Second (FLOPS), which indicates system performance for scientific applications that make heavy use of floating-point calculations. Is FLOPS an effective metric for parallelized CEM simulation tools on new multicore systems? Parallel CEM software needs to be benchmarked not only by FLOPS but also by the performance of other parameters related to type and utilization of the hardware, such as CPU, Random Access Memory (RAM), hard disk, network, etc. The codes need to be optimized for more than just FLOPS and new parameters must be included in benchmarking. In this dissertation, the parallel CEM software named High Order Basis Based Integral Equation Solver (HOBBIES) is introduced.
This code was developed to address the needs of the changing computer hardware platforms in order to provide fast, accurate and efficient solutions to large, complex electromagnetic problems. The research in this dissertation proves that the performance of parallel code is intimately related to the configuration of the computer hardware and can be maximized for different hardware platforms. To benchmark and optimize the performance of parallel CEM software, a variety of large, complex projects are created and executed on a variety of computer platforms. The computer platforms used in this research are detailed in this dissertation. The projects run as benchmarks are also described in detail and results are presented. The parameters that affect parallel CEM software on High Performance Computing Clusters (HPCC) are investigated. This research demonstrates methods to maximize the performance of parallel CEM software code.
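As background to the FLOPS metric questioned in this abstract, a system's theoretical peak is commonly estimated as the product of node count, cores per node, clock rate, and floating-point operations per core per cycle. The Python sketch below applies that textbook estimate. The hardware figures are hypothetical and do not describe the platforms used in the dissertation.

```python
def peak_flops(nodes, cores_per_node, clock_hz, flops_per_cycle):
    """Theoretical peak = nodes * cores * clock rate * FLOPs per core per cycle."""
    return nodes * cores_per_node * clock_hz * flops_per_cycle


def fraction_of_peak(achieved_flops, peak):
    """Share of the theoretical peak that a benchmark actually sustained."""
    return achieved_flops / peak


if __name__ == "__main__":
    # Hypothetical cluster: 16 nodes, 8 cores each, 2.5 GHz, 8 FLOPs per cycle.
    peak = peak_flops(16, 8, 2.5e9, 8)
    print(f"peak: {peak / 1e12:.2f} TFLOPS")
    # A made-up sustained rate, to show how far a real code can sit from peak.
    print(f"fraction of peak: {fraction_of_peak(0.9e12, peak):.0%}")
```

A low fraction of peak often points to memory, disk, or network limits rather than arithmetic throughput, which is the kind of parameter the abstract argues should be benchmarked as well.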
Accelerating Large Scale Image Analyses on Parallel, CPU-GPU Equipped Systems

PubMed Central

Teodoro, George; Kurc, Tahsin M.; Pan, Tony; Cooper, Lee A.D.; Kong, Jun; Widener, Patrick; Saltz, Joel H.

2014-01-01

The past decade has witnessed a major paradigm shift in high performance computing with the introduction of accelerators as general purpose processors. These computing devices make available very high parallel computing power at low cost and power consumption, transforming current high performance platforms into heterogeneous CPU-GPU equipped systems. Although the theoretical performance achieved by these hybrid systems is impressive, taking practical advantage of this computing power remains a very challenging problem. Most applications are still deployed to either GPU or CPU, leaving the other resource under- or un-utilized. In this paper, we propose, implement, and evaluate a performance aware scheduling technique along with optimizations to make efficient collaborative use of CPUs and GPUs on a parallel system. In the context of feature computations in large scale image analysis applications, our evaluations show that intelligently co-scheduling CPUs and GPUs can significantly improve performance over GPU-only or multi-core CPU-only approaches. PMID:25419545
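One simple way to picture performance-aware co-scheduling is a greedy rule: whenever a device becomes idle, the GPU takes the pending task with the highest estimated GPU speedup and the CPU takes the one with the lowest. The Python sketch below simulates that rule. It is an illustration under assumed task timings and is not claimed to be the authors' algorithm; `co_schedule` and the task tuples are hypothetical.

```python
def co_schedule(tasks):
    """Greedy CPU/GPU co-scheduling simulation.

    tasks: list of (name, cpu_time, gpu_time) with estimated run times.
    Returns (assignment dict, makespan).
    """
    # Order pending tasks by estimated GPU speedup (cpu_time / gpu_time), ascending.
    queue = sorted(tasks, key=lambda t: t[1] / t[2])
    free_at = {"cpu": 0.0, "gpu": 0.0}  # time at which each device becomes idle
    assignment = {}
    while queue:
        # The device that becomes idle first picks next; ties go to the GPU.
        device = "gpu" if free_at["gpu"] <= free_at["cpu"] else "cpu"
        if device == "gpu":
            name, cpu_t, gpu_t = queue.pop()    # highest estimated speedup
            free_at["gpu"] += gpu_t
        else:
            name, cpu_t, gpu_t = queue.pop(0)   # lowest estimated speedup
            free_at["cpu"] += cpu_t
        assignment[name] = device
    return assignment, max(free_at.values())


if __name__ == "__main__":
    # Hypothetical tasks: (name, CPU seconds, GPU seconds).
    tasks = [("a", 10, 1), ("b", 4, 2), ("c", 3, 3)]
    plan, makespan = co_schedule(tasks)
    print(plan, makespan)
```

In this toy case the GPU-only schedule would take 6 time units (1 + 2 + 3), while the co-scheduled plan finishes in 3, because the task that gains nothing from the GPU runs on the CPU in parallel.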
Multicore Challenges and Benefits for High Performance Scientific Computing

DOE PAGES

Nielsen, Ida M. B.; Janssen, Curtis L.

2008-01-01

Until recently, performance gains in processors were achieved largely by improvements in clock speeds and instruction level parallelism. Thus, applications could obtain performance increases with relatively minor changes by upgrading to the latest generation of computing hardware. Currently, however, processor performance improvements are realized by using multicore technology and hardware support for multiple threads within each core, and taking full advantage of this technology to improve the performance of applications requires exposure of extreme levels of software parallelism. We will here discuss the architecture of parallel computers constructed from many multicore chips as well as techniques for managing the complexity of programming such computers, including the hybrid message-passing/multi-threading programming model. We will illustrate these ideas with a hybrid distributed memory matrix multiply and a quantum chemistry algorithm for energy computation using Møller–Plesset perturbation theory.
Relationship between quality of care and choice of clinical computing system: retrospective analysis of family practice performance under the UK's quality and outcomes framework.

PubMed

Kontopantelis, Evangelos; Buchan, Iain; Reeves, David; Checkland, Kath; Doran, Tim

2013-08-02

To investigate the relationship between performance on the UK Quality and Outcomes Framework pay-for-performance scheme and choice of clinical computer system. Retrospective longitudinal study. Data for 2007-2008 to 2010-2011, extracted from the clinical computer systems of general practices in England. All English practices participating in the pay-for-performance scheme: average 8257 each year, covering over 99% of the English population registered with a general practice. Levels of achievement on 62 quality-of-care indicators, measured as: reported achievement (levels of care after excluding inappropriate patients); population achievement (levels of care for all patients with the relevant condition) and percentage of available quality points attained. Multilevel mixed effects multiple linear regression models were used to identify population, practice and clinical computing system predictors of achievement. Seven clinical computer systems were consistently active in the study period, collectively holding approximately 99% of the market share. Of all population and practice characteristics assessed, choice of clinical computing system was the strongest predictor of performance across all three outcome measures. Differences between systems were greatest for intermediate outcomes indicators (eg, control of cholesterol levels). Under the UK's pay-for-performance scheme, differences in practice performance were associated with the choice of clinical computing system. This raises the question of whether particular system characteristics facilitate higher quality of care, better data recording or both. Inconsistencies across systems need to be understood and addressed, and researchers need to be cautious when generalising findings from samples of providers using a single computing system.
Computer system performance measurement techniques for ARTS III computer systems.

DOT National Transportation Integrated Search

1973-12-01

Direct measurement of computer systems is of vital importance in: a) developing an intelligent grasp of the variables which affect overall performance; b) tuning the system for optimum benefit; c) determining under what conditions saturation thresholds...
Dense, Efficient Chip-to-Chip Communication at the Extremes of Computing

ERIC Educational Resources Information Center

Loh, Matthew

2013-01-01

The scalability of CMOS technology has driven computation into a diverse range of applications across the power consumption, performance and size spectra. Communication is a necessary adjunct to computation, and whether this is to push data from node-to-node in a high-performance computing cluster or from the receiver of a wireless link to a neural…
D-region blunt probe data analysis using hybrid computer techniques

NASA Technical Reports Server (NTRS)

Burkhard, W. J.

1973-01-01

The feasibility of performing data reduction techniques with a hybrid computer was studied. The data was obtained from the flight of a parachute-borne probe through the D-region of the ionosphere. A presentation of the theory of blunt probe operation is included with emphasis on the equations necessary to perform the analysis. This is followed by a discussion of computer program development. Included in this discussion is a comparison of computer and hand reduction results for the blunt probe launched on 31 January 1972. The comparison showed that it was both feasible and desirable to use the computer for data reduction. The results of computer data reduction performed on flight data acquired from five blunt probes are also presented.
Method and apparatus for managing access to a memory

DOE Office of Scientific and Technical Information (OSTI.GOV)

DeBenedictis, Erik

A method and apparatus for managing access to a memory of a computing system. A controller transforms a plurality of operations that represent a computing job into an operational memory layout that reduces a size of a selected portion of the memory that needs to be accessed to perform the computing job. The controller stores the operational memory layout in a plurality of memory cells within the selected portion of the memory. The controller controls a sequence by which a processor in the computing system accesses the memory to perform the computing job using the operational memory layout. The operational memory layout reduces an amount of energy consumed by the processor to perform the computing job.

National High-Performance Computing and Networking Act. Report To Accompany S. 343, Senate, 102d Congress, 1st Session.

ERIC Educational Resources Information Center

Congress of the U.S., Washington, DC. Senate Committee on Energy and Natural Resources.

The purpose of the bill (S. 343), as reported by the Senate Committee on Energy and Natural Resources, is to establish a federal commitment to the advancement of high-performance computing, improve interagency planning and coordination of federal high-performance computing and networking activities, authorize a national high-speed computer…
GRAPE project

NASA Astrophysics Data System (ADS)

Makino, Junichiro

2002-12-01

We overview our GRAvity PipE (GRAPE) project to develop special-purpose computers for astrophysical N-body simulations. The basic idea of GRAPE is to attach a custom-built computer dedicated to the calculation of gravitational interaction between particles to a general-purpose programmable computer. By this hybrid architecture, we can achieve both a wide range of applications and very high peak performance. Our newest machine, GRAPE-6, achieved a peak speed of 32 Tflops, and sustained performance of 11.55 Tflops, for a total budget of about 4 million USD. We also discuss relative advantages of special-purpose and general-purpose computers and the future of high-performance computing for science and technology.
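The calculation that GRAPE offloads to dedicated hardware is the pairwise gravitational interaction between particles. The following is only a minimal sketch of that direct-summation kernel in plain Python, not the GRAPE implementation: the function name `accelerations`, the softening length, and the unit choice `G = 1` are illustrative assumptions that do not come from the abstract.

```python
import math


def accelerations(positions, masses, G=1.0, softening=1e-3):
    """Direct-summation gravitational acceleration on every particle.

    positions: list of (x, y, z) tuples; masses: list of floats.
    The softening length (an assumption of this sketch) avoids a
    division by zero when two particles come very close.
    Cost is O(N^2) pair interactions.
    """
    n = len(positions)
    eps2 = softening * softening
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        xi, yi, zi = positions[i]
        for j in range(n):
            if i == j:
                continue
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            dz = positions[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + eps2
            # G * m_j / r^3, applied along the separation vector
            s = G * masses[j] / (r2 * math.sqrt(r2))
            acc[i][0] += s * dx
            acc[i][1] += s * dy
            acc[i][2] += s * dz
    return acc
```

The doubly nested loop is the part a special-purpose pipeline would evaluate; in the hybrid arrangement the abstract describes, the general-purpose host would keep everything else, such as time integration and I/O. From the figures quoted in the abstract, the sustained speed of GRAPE-6 is 11.55 / 32, or about 36% of its peak.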
Performance Models for Split-execution Computing Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Humble, Travis S; McCaskey, Alex; Schrock, Jonathan

Split-execution computing leverages the capabilities of multiple computational models to solve problems, but splitting program execution across different computational models incurs costs associated with the translation between domains. We analyze the performance of a split-execution computing system developed from conventional and quantum processing units (QPUs) by using behavioral models that track resource usage. We focus on asymmetric processing models built using conventional CPUs and a family of special-purpose QPUs that employ quantum computing principles. Our performance models account for the translation of a classical optimization problem into the physical representation required by the quantum processor while also accounting for hardware limitations and conventional processor speed and memory. We conclude that the bottleneck in this split-execution computing system lies at the quantum-classical interface and that the primary time cost is independent of quantum processor behavior.
Cell-NPE (Numerical Performance Evaluation): Programming the IBM Cell Broadband Engine -- A General Parallelization Strategy

DTIC Science & Technology

2008-04-01

Space GmbH as follows: B. TECHNICAL PROPOSAL/DESCRIPTION OF WORK Cell: A Revolutionary High Performance Computing Platform On 29 June 2005 [1...IBM has announced that it has partnered with Mercury Computer Systems, a maker of specialized computers. The Cell chip provides massive floating-point...the computing industry away from the traditional processor technology dominated by Intel. While in the past, the development of computing power has
Design Trade-off Between Performance and Fault-Tolerance of Space Onboard Computers

NASA Astrophysics Data System (ADS)

Gorbunov, M. S.; Antonov, A. A.

2017-01-01

It is well known that there is a trade-off between performance and power consumption in onboard computers. The fault-tolerance is another important factor affecting performance, chip area and power consumption. Involving special SRAM cells and error-correcting codes is often too expensive with relation to the performance needed. We discuss the possibility of finding the optimal solutions for modern onboard computer for scientific apparatus focusing on multi-level cache memory design.
Hyperswitch Communication Network Computer

NASA Technical Reports Server (NTRS)

Peterson, John C.; Chow, Edward T.; Priel, Moshe; Upchurch, Edwin T.

1993-01-01

Hyperswitch Communications Network (HCN) computer is prototype multiple-processor computer being developed. Incorporates improved version of hyperswitch communication network described in "Hyperswitch Network For Hypercube Computer" (NPO-16905). Designed to support high-level software and expansion of itself. HCN computer is message-passing, multiple-instruction/multiple-data computer offering significant advantages over older single-processor and bus-based multiple-processor computers, with respect to price/performance ratio, reliability, availability, and manufacturing. Design of HCN operating-system software provides flexible computing environment accommodating both parallel and distributed processing. Also achieves balance among following competing factors: performance in processing and communications, ease of use, and tolerance of (and recovery from) faults.
Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations

NASA Technical Reports Server (NTRS)

Oliker, Leonid; Carter, Jonathan; Shalf, John; Skinner, David; Ethier, Stephane; Biswas, Rupak; Djomehri, Jahed; VanderWijngaart, Rob

2003-01-01

The growing gap between sustained and peak performance for scientific applications has become a well-known problem in high performance computing. The recent development of parallel vector systems offers the potential to bridge this gap for a significant number of computational science codes and deliver a substantial increase in computing capabilities. This paper examines the intranode performance of the NEC SX6 vector processor and the cache-based IBM Power3/4 superscalar architectures across a number of key scientific computing areas. First, we present the performance of a microbenchmark suite that examines a full spectrum of low-level machine characteristics. Next, we study the behavior of the NAS Parallel Benchmarks using some simple optimizations. Finally, we evaluate the performance of several numerical codes from key scientific computing domains. Overall results demonstrate that the SX6 achieves high performance on a large fraction of our application suite and in many cases significantly outperforms the RISC-based architectures. However, certain classes of applications are not easily amenable to vectorization and would likely require extensive reengineering of both algorithm and implementation to utilize the SX6 effectively.
All-optical reservoir computing.

PubMed

Duport, François; Schneider, Bendix; Smerieri, Anteo; Haelterman, Marc; Massar, Serge

2012-09-24

Reservoir Computing is a novel computing paradigm that uses a nonlinear recurrent dynamical system to carry out information processing. Recent electronic and optoelectronic Reservoir Computers based on an architecture with a single nonlinear node and a delay loop have shown performance on standardized tasks comparable to state-of-the-art digital implementations. Here we report an all-optical implementation of a Reservoir Computer, made of off-the-shelf components for optical telecommunications. It uses the saturation of a semiconductor optical amplifier as nonlinearity. The present work shows that, within the Reservoir Computing paradigm, all-optical computing with state-of-the-art performance is possible.
Shaded-Color Picture Generation of Computer-Defined Arbitrary Shapes

NASA Technical Reports Server (NTRS)

Cozzolongo, J. V.; Hermstad, D. L.; Mccoy, D. S.; Clark, J.

1986-01-01

SHADE computer program generates realistic color-shaded pictures from computer-defined arbitrary shapes. Objects defined for computer representation displayed as smooth, color-shaded surfaces, including varying degrees of transparency. Results also used for presentation of computational results. By performing color mapping, SHADE colors model surface to display analysis results as pressures, stresses, and temperatures. NASA has used SHADE extensively in design and analysis of high-performance aircraft. Industry should find applications for SHADE in computer-aided design and computer-aided manufacturing. SHADE written in VAX FORTRAN and MACRO Assembler for either interactive or batch execution.
Computational aeroelasticity using a pressure-based solver

NASA Astrophysics Data System (ADS)

Kamakoti, Ramji

A computational methodology for performing fluid-structure interaction computations for three-dimensional elastic wing geometries is presented. The flow solver used is based on an unsteady Reynolds-Averaged Navier-Stokes (RANS) model. A well validated k-ε turbulence model with wall function treatment for near wall region was used to perform turbulent flow calculations. Relative merits of alternative flow solvers were investigated. The predictor-corrector-based Pressure Implicit Splitting of Operators (PISO) algorithm was found to be computationally economic for unsteady flow computations. Wing structure was modeled using Bernoulli-Euler beam theory. A fully implicit time-marching scheme (using the Newmark integration method) was used to integrate the equations of motion for structure. Bilinear interpolation and linear extrapolation techniques were used to transfer necessary information between fluid and structure solvers. Geometry deformation was accounted for by using a moving boundary module. The moving grid capability was based on a master/slave concept and transfinite interpolation techniques. Since computations were performed on a moving mesh system, the geometric conservation law must be preserved. This is achieved by appropriately evaluating the Jacobian values associated with each cell. Accurate computation of contravariant velocities for unsteady flows using the momentum interpolation method on collocated, curvilinear grids was also addressed. Flutter computations were performed for the AGARD 445.6 wing at subsonic, transonic and supersonic Mach numbers. Unsteady computations were performed at various dynamic pressures to predict the flutter boundary. Results showed favorable agreement of experiment and previous numerical results. The computational methodology exhibited capabilities to predict both qualitative and quantitative features of aeroelasticity.
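The structural time-marching step mentioned above can be illustrated with a small sketch of the Newmark method. This is an assumption-laden example, not the author's code: it integrates a single-degree-of-freedom system m·x'' + c·x' + k·x = f(t) rather than the Bernoulli-Euler beam model of the abstract, and it uses the average-acceleration parameters (beta = 1/4, gamma = 1/2), which the abstract does not specify. The function name `newmark_sdof` and all parameter names are hypothetical.

```python
def newmark_sdof(m, c, k, force, x0, v0, dt, steps, beta=0.25, gamma=0.5):
    """Implicit Newmark time marching for m*x'' + c*x' + k*x = force(t).

    Returns a list of (x, v) pairs at t = 0, dt, ..., steps*dt.
    beta=1/4, gamma=1/2 is the unconditionally stable
    average-acceleration variant (an assumption of this sketch).
    """
    x, v = x0, v0
    a = (force(0.0) - c * v - k * x) / m  # initial acceleration from equilibrium
    # Effective stiffness is constant for a linear system
    k_eff = k + gamma / (beta * dt) * c + m / (beta * dt * dt)
    history = [(x, v)]
    for n in range(1, steps + 1):
        t_new = n * dt
        rhs = (force(t_new)
               + m * (x / (beta * dt * dt) + v / (beta * dt)
                      + (0.5 / beta - 1.0) * a)
               + c * (gamma / (beta * dt) * x + (gamma / beta - 1.0) * v
                      + dt * (0.5 * gamma / beta - 1.0) * a))
        x_new = rhs / k_eff  # implicit solve (scalar here, a linear system in general)
        a_new = ((x_new - x) / (beta * dt * dt) - v / (beta * dt)
                 - (0.5 / beta - 1.0) * a)
        v_new = v + dt * ((1.0 - gamma) * a + gamma * a_new)
        x, v, a = x_new, v_new, a_new
        history.append((x, v))
    return history
```

For a multi-degree-of-freedom structure the scalar division by `k_eff` would become a linear solve with the effective stiffness matrix, and in a coupled aeroelastic computation `force` would be supplied by the fluid solver at each time step.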
Developments in REDES: The rocket engine design expert system

NASA Technical Reports Server (NTRS)

Davidian, Kenneth O.

1990-01-01

The Rocket Engine Design Expert System (REDES) is being developed at the NASA-Lewis to collect, automate, and perpetuate the existing expertise of performing a comprehensive rocket engine analysis and design. Currently, REDES uses the rigorous JANNAF methodology to analyze the performance of the thrust chamber and perform computational studies of liquid rocket engine problems. The following computer codes were included in REDES: a gas properties program named GASP, a nozzle design program named RAO, a regenerative cooling channel performance evaluation code named RTE, and the JANNAF standard liquid rocket engine performance prediction code TDK (including performance evaluation modules ODE, ODK, TDE, TDK, and BLM). Computational analyses are being conducted by REDES to provide solutions to liquid rocket engine thrust chamber problems. REDES is built in the Knowledge Engineering Environment (KEE) expert system shell and runs on a Sun 4/110 computer.
Developments in REDES: The Rocket Engine Design Expert System

NASA Technical Reports Server (NTRS)

Davidian, Kenneth O.

1990-01-01

The Rocket Engine Design Expert System (REDES) was developed at NASA-Lewis to collect, automate, and perpetuate the existing expertise of performing a comprehensive rocket engine analysis and design. Currently, REDES uses the rigorous JANNAF methodology to analyze the performance of the thrust chamber and perform computational studies of liquid rocket engine problems. The following computer codes were included in REDES: a gas properties program named GASP; a nozzle design program named RAO; a regenerative cooling channel performance evaluation code named RTE; and the JANNAF standard liquid rocket engine performance prediction code TDK (including performance evaluation modules ODE, ODK, TDE, TDK, and BLM). Computational analyses are being conducted by REDES to provide solutions to liquid rocket engine thrust chamber problems. REDES was built in the Knowledge Engineering Environment (KEE) expert system shell and runs on a Sun 4/110 computer.
Computers for Cognitive Development in Early Childhood--The Teacher's Role in the Computer Learning Environment

ERIC Educational Resources Information Center

Nir-Gal, Ofra; Klein, Pnina S.

2004-01-01

This study was designed to examine the effect of different kinds of adult mediation on the cognitive performance of young children who used computers. The study sample included 150 kindergarten children aged 5-6. The findings indicate that children who engaged in adult-mediated computer activity improved the level of their cognitive performance on…
Home Computer Use and Academic Performance of Nine-Year-Olds

ERIC Educational Resources Information Center

Casey, Alice; Layte, Richard; Lyons, Sean; Silles, Mary

2012-01-01

A recent rise in home computer ownership has seen a growing number of children using computers and accessing the internet from a younger age. This paper examines the link between children's home computing and their academic performance in the areas of reading and mathematics. Data from the nine-year-old cohort of the Growing Up in Ireland survey…
Using High Performance Computing to Examine the Processes of Neurogenesis Underlying Pattern Separation/Completion of Episodic Information.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aimone, James Bradley; Betty, Rita

Using High Performance Computing to Examine the Processes of Neurogenesis Underlying Pattern Separation/Completion of Episodic Information - Sandia researchers developed novel methods and metrics for studying the computational function of neurogenesis, thus generating substantial impact to the neuroscience and neural computing communities. This work could benefit applications in machine learning and other analysis activities.
Templet Web: the use of volunteer computing approach in PaaS-style cloud

NASA Astrophysics Data System (ADS)

Vostokin, Sergei; Artamonov, Yuriy; Tsarev, Daniil

2018-03-01

This article presents the Templet Web cloud service. The service is designed for high-performance scientific computing automation. The use of high-performance technology is specifically required by new fields of computational science such as data mining, artificial intelligence, machine learning, and others. Cloud technologies provide a significant cost reduction for high-performance scientific applications. The main objectives to achieve this cost reduction in the Templet Web service design are: (a) the implementation of "on-demand" access; (b) source code deployment management; (c) high-performance computing programs development automation. The distinctive feature of the service is the approach mainly used in the field of volunteer computing, when a person who has access to a computer system delegates his access rights to the requesting user. We developed an access procedure, algorithms, and software for utilization of free computational resources of the academic cluster system in line with the methods of volunteer computing. The Templet Web service has been in operation for five years. It has been successfully used for conducting laboratory workshops and solving research problems, some of which are considered in this article. The article also provides an overview of research directions related to service development.
Versatile analog pulse height computer performs real-time arithmetic operations

NASA Technical Reports Server (NTRS)

Brenner, R.; Strauss, M. G.

1967-01-01

Multipurpose analog pulse height computer performs real-time arithmetic operations on relatively fast pulses. This computer can be used for identification of charged particles, pulse shape discrimination, division of signals from position sensitive detectors, and other on-line data reduction techniques.
Non-unitary probabilistic quantum computing circuit and method

NASA Technical Reports Server (NTRS)

Williams, Colin P. (Inventor); Gingrich, Robert M. (Inventor)

2009-01-01

A quantum circuit performing quantum computation in a quantum computer. A chosen transformation of an initial n-qubit state is probabilistically obtained. The circuit comprises a unitary quantum operator obtained from a non-unitary quantum operator, operating on an n-qubit state and an ancilla state. When operation on the ancilla state provides a success condition, computation is stopped. When operation on the ancilla state provides a failure condition, computation is performed again on the ancilla state and the n-qubit state obtained in the previous computation, until a success condition is obtained.
Expanding the Scope of High-Performance Computing Facilities

DOE Office of Scientific and Technical Information (OSTI.GOV)

Uram, Thomas D.; Papka, Michael E.

The high-performance computing centers of the future will expand their roles as service providers, and as the machines scale up, so should the sizes of the communities they serve. National facilities must cultivate their users as much as they focus on operating machines reliably. The authors present five interrelated topic areas that are essential to expanding the value provided to those performing computational science.
X-33 Computational Aeroheating/Aerodynamic Predictions and Comparisons With Experimental Data

NASA Technical Reports Server (NTRS)

Hollis, Brian R.; Thompson, Richard A.; Berry, Scott A.; Horvath, Thomas J.; Murphy, Kelly J.; Nowak, Robert J.; Alter, Stephen J.

2003-01-01

This report details a computational fluid dynamics study conducted in support of the phase II development of the X-33 vehicle. Aerodynamic and aeroheating predictions were generated for the X-33 vehicle at both flight and wind-tunnel test conditions using two finite-volume, Navier-Stokes solvers. Aerodynamic computations were performed at Mach 6 and Mach 10 wind-tunnel conditions for angles of attack from 10° to 50° with body-flap deflections of 0° to 20°. Additional aerodynamic computations were performed over a parametric range of free-stream conditions at Mach numbers of 4 to 10 and angles of attack from 10° to 50°. Laminar and turbulent wind-tunnel aeroheating computations were performed at Mach 6 for angles of attack of 20° to 40° with body-flap deflections of 0° to 20°. Aeroheating computations were performed at four flight conditions with Mach numbers of 6.6 to 8.9 and angles of attack of 10° to 40°. Surface heating and pressure distributions, surface streamlines, flow field information, and aerodynamic coefficients from these computations are presented, and comparisons are made with wind-tunnel data.

A National Study of the Relationship between Home Access to a Computer and Academic Performance Scores of Grade 12 U.S. Science Students: An Analysis of the 2009 NAEP Data

NASA Astrophysics Data System (ADS)

Coffman, Mitchell Ward

The purpose of this dissertation was to examine the relationship between student access to a computer at home and academic achievement. The 2009 National Assessment of Educational Progress (NAEP) dataset was probed using the National Data Explorer (NDE) to investigate correlations in the subsets of SES, Parental Education, Race, and Gender as they relate to access to a home computer and improved performance scores for U.S. public school grade 12 science students. A causal-comparative approach was employed seeking clarity on the relationship between home access and performance scores. The influence of home access cannot overcome the challenges students of lower SES face. The achievement gap, or a second digital divide, for underprivileged classes of students, including minorities, does not appear to contract via student access to a home computer. Nonetheless, in tests for significance, statistically significant improvement in science performance scores was reported for those having access to a computer at home compared to those not having access. Additionally, regression models reported evidence of correlations between and among subsets of controls for the demographic factors gender, race, and socioeconomic status. Variability in these correlations was high, suggesting influence from unobserved factors may have more impact upon the dependent variable. Having access to a computer at home increases performance scores for grade 12 general science students of all races, genders and socioeconomic levels. However, the performance gap is roughly equivalent to the existing performance gap of the national average for science scores, suggesting little influence from access to a computer on academic achievement. The variability of scores reported in the regression analysis models reflects a moderate to low effect, suggesting an absence of causation. These statistical results are accurate and confirm the literature review, whereby having access to a computer at home and the predictor variables were found to have a significant impact on performance scores, although the data presented suggest computer access at home is less influential upon performance scores than poverty and its correlates.
A performance comparison of scalar, vector, and concurrent vector computers including supercomputers for modeling transport of reactive contaminants in groundwater

NASA Astrophysics Data System (ADS)

Tripathi, Vijay S.; Yeh, G. T.

1993-06-01

Sophisticated and highly computation-intensive models of transport of reactive contaminants in groundwater have been developed in recent years. Application of such models to real-world contaminant transport problems, e.g., simulation of groundwater transport of 10-15 chemically reactive elements (e.g., toxic metals) and relevant complexes and minerals in two and three dimensions over a distance of several hundred meters, requires high-performance computers including supercomputers. Although not widely recognized as such, the computational complexity and demand of these models compare with well-known computation-intensive applications including weather forecasting and quantum chemical calculations. A survey of the performance of a variety of available hardware, as measured by the run times for a reactive transport model HYDROGEOCHEM, showed that while supercomputers provide the fastest execution times for such problems, relatively low-cost reduced instruction set computer (RISC) based scalar computers provide the best performance-to-price ratio. Because supercomputers like the Cray X-MP are inherently multiuser resources, often the RISC computers also provide much better turnaround times. Furthermore, RISC-based workstations provide the best platforms for "visualization" of groundwater flow and contaminant plumes. The most notable result, however, is that current workstations costing less than $10,000 provide performance within a factor of 5 of a Cray X-MP.
Relationship between quality of care and choice of clinical computing system: retrospective analysis of family practice performance under the UK's quality and outcomes framework

PubMed Central

Kontopantelis, Evangelos; Buchan, Iain; Reeves, David; Checkland, Kath; Doran, Tim

2013-01-01

Objectives To investigate the relationship between performance on the UK Quality and Outcomes Framework pay-for-performance scheme and choice of clinical computer system. Design Retrospective longitudinal study. Setting Data for 2007–2008 to 2010–2011, extracted from the clinical computer systems of general practices in England. Participants All English practices participating in the pay-for-performance scheme: average 8257 each year, covering over 99% of the English population registered with a general practice. Main outcome measures Levels of achievement on 62 quality-of-care indicators, measured as: reported achievement (levels of care after excluding inappropriate patients); population achievement (levels of care for all patients with the relevant condition) and percentage of available quality points attained. Multilevel mixed effects multiple linear regression models were used to identify population, practice and clinical computing system predictors of achievement. Results Seven clinical computer systems were consistently active in the study period, collectively holding approximately 99% of the market share. Of all population and practice characteristics assessed, choice of clinical computing system was the strongest predictor of performance across all three outcome measures. Differences between systems were greatest for intermediate outcomes indicators (eg, control of cholesterol levels). Conclusions Under the UK's pay-for-performance scheme, differences in practice performance were associated with the choice of clinical computing system. This raises the question of whether particular system characteristics facilitate higher quality of care, better data recording or both. Inconsistencies across systems need to be understood and addressed, and researchers need to be cautious when generalising findings from samples of providers using a single computing system. PMID:23913774
High Performance Computing (HPC)-Enabled Computational Study on the Feasibility of using Shape Memory Alloys for Gas Turbine Blade Actuation

DTIC Science & Technology

2016-11-01

Feasibility of using Shape Memory Alloys for Gas Turbine Blade Actuation by Kathryn Esham, Luis Bravo, Anindya Ghoshal, Muthuvel Murugan, and Michael...Computational Study on the Feasibility of using Shape Memory Alloys for Gas Turbine Blade Actuation by Luis Bravo, Anindya Ghoshal, Muthuvel...High Performance Computing (HPC)-Enabled Computational Study on the Feasibility of using Shape Memory Alloys for Gas Turbine Blade Actuation 5a
The effect of psychosocial stress on muscle activity during computer work: Comparative study between desktop computer and mobile computing products.

PubMed

Taib, Mohd Firdaus Mohd; Bahn, Sangwoo; Yun, Myung Hwan

2016-06-27

The popularity of mobile computing products is well known. Thus, it is crucial to evaluate their contribution to musculoskeletal disorders during computer usage under both comfortable and stressful environments. This study explores the effect of different computer products' usages with different tasks used to induce psychosocial stress on muscle activity. Fourteen male subjects performed computer tasks: sixteen combinations of four different computer products with four different tasks used to induce stress. Electromyography for four muscles on the forearm, shoulder and neck regions and task performances were recorded. The increment of trapezius muscle activity was dependent on the task used to induce the stress where a higher level of stress made a greater increment. However, this relationship was not found in the other three muscles. Besides that, compared to desktop and laptop use, the lowest activity for all muscles was obtained during the use of a tablet or smart phone. The best net performance was obtained in a comfortable environment. However, during stressful conditions, the best performance can be obtained using the device that a user is most comfortable with or has the most experience with. Different computer products and different levels of stress play a big role in muscle activity during computer work. Both of these factors must be taken into account in order to reduce the occurrence of musculoskeletal disorders or problems.
Challenges and opportunities of cloud computing for atmospheric sciences

NASA Astrophysics Data System (ADS)

Pérez Montes, Diego A.; Añel, Juan A.; Pena, Tomás F.; Wallom, David C. H.

2016-04-01

Cloud computing is an emerging technological solution widely used in many fields. Initially developed as a flexible way of managing peak demand, it has begun to make its way into scientific research. One of the greatest advantages of cloud computing for scientific research is that funding or performing a research project no longer depends on having access to a large cyberinfrastructure. Cloud computing can avoid maintenance expenses for large supercomputers and has the potential to 'democratize' the access to high-performance computing, giving flexibility to funding bodies for allocating budgets for the computational costs associated with a project. Two of the most challenging problems in atmospheric sciences are computational cost and uncertainty in meteorological forecasting and climate projections. Both problems are closely related. Usually uncertainty can be reduced with the availability of computational resources to better reproduce a phenomenon or to perform a larger number of experiments. Here we present results of the application of cloud computing resources for climate modeling using cloud computing infrastructures of three major vendors and two climate models. We show how the cloud infrastructure compares in performance to traditional supercomputers and how it provides the capability to complete experiments in shorter periods of time. The monetary cost associated is also analyzed. Finally we discuss the future potential of this technology for meteorological and climatological applications, both from the point of view of operational use and research.
Comparing student performance on paper- and computer-based math curriculum-based measures.

PubMed

Hensley, Kiersten; Rankin, Angelica; Hosp, John

2017-01-01

As the number of computerized curriculum-based measurement (CBM) tools increases, it is necessary to examine whether or not student performance can generalize across a variety of test administration modes (i.e., paper or computer). The purpose of this study is to compare math fact fluency on paper versus computer for 197 upper elementary students. Students completed identical sets of probes on paper and on the computer, which were then scored for digits correct, problems correct, and accuracy. Results showed a significant difference in performance between the two sets of probes, with higher fluency rates on the paper probes. Because decisions about levels of student support and interventions often rely on measures such as these, more research in this area is needed to examine the potential differences in student performance between paper-based and computer-based CBMs.
THC-MP: High performance numerical simulation of reactive transport and multiphase flow in porous media

NASA Astrophysics Data System (ADS)

Wei, Xiaohui; Li, Weishan; Tian, Hailong; Li, Hongliang; Xu, Haixiao; Xu, Tianfu

2015-07-01

The numerical simulation of multiphase flow and reactive transport in porous media for complex subsurface problems is a computationally intensive application. To meet the increasing computational requirements, this paper presents a parallel computing method and architecture. Starting from TOUGHREACT, a well-established code for simulating subsurface multiphase flow and reactive transport problems, we developed THC-MP, a high-performance code for massively parallel computers that greatly extends the computational capability of the original code. The domain decomposition method was applied to the coupled numerical computing procedure in THC-MP. We designed the distributed data structure, implemented the data initialization and exchange between the computing nodes and the core solving module using the hybrid parallel iterative and direct solver. Numerical accuracy of THC-MP was verified through a CO2 injection-induced reactive transport problem by comparing the results obtained from the parallel computing and sequential computing (original code). Execution efficiency and code scalability were examined through field-scale carbon sequestration applications on a multicore cluster. The results demonstrate the enhanced performance of THC-MP on parallel computing facilities.
GPU Accelerated Prognostics

NASA Technical Reports Server (NTRS)

Gorospe, George E., Jr.; Daigle, Matthew J.; Sankararaman, Shankar; Kulkarni, Chetan S.; Ng, Eley

2017-01-01

Prognostic methods enable operators and maintainers to predict the future performance for critical systems. However, these methods can be computationally expensive and may need to be performed each time new information about the system becomes available. In light of these computational requirements, we have investigated the application of graphics processing units (GPUs) as a computational platform for real-time prognostics. Recent advances in GPU technology have reduced cost and increased the computational capability of these highly parallel processing units, making them more attractive for the deployment of prognostic software. We present a survey of model-based prognostic algorithms with considerations for leveraging the parallel architecture of the GPU and a case study of GPU-accelerated battery prognostics with computational performance results.
Iterative-method performance evaluation for multiple vectors associated with a large-scale sparse matrix

NASA Astrophysics Data System (ADS)

Imamura, Seigo; Ono, Kenji; Yokokawa, Mitsuo

2016-07-01

Ensemble computing, which is an instance of capacity computing, is an effective computing scenario for exascale parallel supercomputers. In ensemble computing, there are multiple linear systems associated with a common coefficient matrix. We improve the performance of iterative solvers for multiple vectors by solving them at the same time, that is, by solving for the product of the matrices. We implemented several iterative methods and compared their performance. The maximum performance on SPARC64 VIIIfx was 7.6 times higher than that of a naïve implementation. Finally, to deal with the different convergence processes of linear systems, we introduced a control method to eliminate the calculation of already converged vectors.
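The idea of solving several systems that share one coefficient matrix, while skipping vectors that have already converged, can be illustrated with a minimal sketch. This is not the authors' implementation: the Jacobi method, the dense list-of-rows matrix, and the names `solve_many`, `tol`, and `max_iter` are assumptions chosen for brevity.

```python
def solve_many(A, B, tol=1e-10, max_iter=500):
    """Jacobi iteration for A x_k = b_k, k = 0..m-1, with one shared matrix A.

    A is a dense list of rows (an assumption for brevity; a large-scale code
    would use a sparse format). B is a list of right-hand-side vectors.
    Returns (solutions, iterations_used_per_vector).
    """
    n, m = len(A), len(B)
    X = [[0.0] * n for _ in range(m)]
    iters = [0] * m
    active = set(range(m))                 # vectors that have not converged yet
    for _ in range(max_iter):
        if not active:
            break
        new = {k: [0.0] * n for k in active}
        for i, row in enumerate(A):        # each matrix row is visited once ...
            for k in active:               # ... and reused for every active vector
                s = sum(row[j] * X[k][j] for j in range(n) if j != i)
                new[k][i] = (B[k][i] - s) / row[i]
        for k in list(active):
            change = max(abs(a - b) for a, b in zip(new[k], X[k]))
            X[k] = new[k]
            iters[k] += 1
            if change < tol:
                active.discard(k)          # no further work on converged vectors
    return X, iters


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    X, iters = solve_many(A, [[1.0, 2.0], [0.0, 0.0]])
    print(X[0], iters)
```

Looping over the vectors inside the loop over matrix rows is what lets one pass over the matrix serve all systems; the `active` set plays the role of the control method that drops converged vectors.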
Distributed computation of graphics primitives on a transputer network

NASA Technical Reports Server (NTRS)

Ellis, Graham K.

1988-01-01

A method is developed for distributing the computation of graphics primitives on a parallel processing network. Off-the-shelf transputer boards are used to perform the graphics transformations and scan-conversion tasks that would normally be assigned to a single transputer based display processor. Each node in the network performs a single graphics primitive computation. Frequently requested tasks can be duplicated on several nodes. The results indicate that the current distribution of commands on the graphics network shows a performance degradation when compared to the graphics display board alone. A change to more computation per node for every communication (perform more complex tasks on each node) may cause the desired increase in throughput.
Distributed Computing Architecture for Image-Based Wavefront Sensing and 2 D FFTs

NASA Technical Reports Server (NTRS)

Smith, Jeffrey S.; Dean, Bruce H.; Haghani, Shadan

2006-01-01

Image-based wavefront sensing (WFS) provides significant advantages over interferometric-based wavefront sensors, such as optical design simplicity and stability. However, the image-based approach is computationally intensive, and therefore specialized high-performance computing architectures are required in applications utilizing the image-based approach. The development and testing of these high-performance computing architectures are essential to such missions as James Webb Space Telescope (JWST), Terrestrial Planet Finder-Coronagraph (TPF-C and CorSpec), and Spherical Primary Optical Telescope (SPOT). The development of these specialized computing architectures requires numerous two-dimensional Fourier transforms, which necessitate an all-to-all communication when applied on a distributed computational architecture. Several solutions for distributed computing are presented, with an emphasis on a 64-node cluster of DSPs, multiple DSP FPGAs, and an application of low-diameter graph theory. Timing results and performance analysis will be presented. The solutions offered could be applied to other all-to-all communication and scientifically computationally complex problems.
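Why a distributed two-dimensional Fourier transform needs an all-to-all exchange can be seen from the standard row-column decomposition, sketched below on a single process. This is an illustrative sketch only, not the architecture described in the report: the function names `fft`, `transpose`, and `fft2` are assumptions, and the transpose stands in for the communication step in which every node would send part of its rows to every other node.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def transpose(rows):
    """Stand-in for the all-to-all exchange between nodes that each own some rows."""
    return [list(col) for col in zip(*rows)]


def fft2(image):
    """2-D FFT as: row FFTs, transpose, row FFTs, transpose back."""
    step1 = [fft(row) for row in image]             # each node transforms its own rows
    step2 = [fft(row) for row in transpose(step1)]  # columns, now stored as rows
    return transpose(step2)


if __name__ == "__main__":
    print(fft2([[1, 2], [3, 4]]))
```

Each row FFT is local to one node, but the column FFTs need data from every row, which is why the transpose touches all nodes at once.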
Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers

NASA Technical Reports Server (NTRS)

Morgan, Philip E.

2004-01-01

This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers" and "Develop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications." The discussion of Scalable High Performance Computing reports on three objectives: validate, assess scalability of, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulation (DNS) and Large Eddy Simulation (LES) problems; and investigate and develop a high-order Reynolds-averaged Navier-Stokes turbulence model. The discussion of High-Performance Time-Domain Computational Electromagnetics reports on five objectives: enhance an electromagnetics code (CHARGE) so that it can effectively model antenna problems; apply lessons learned in high-order/spectral solution of swirling 3D jets to the electromagnetics project; transition a high-order fluids code, FDL3DI, to solve Maxwell's equations using compact differencing; develop and demonstrate improved radiation-absorbing boundary conditions for high-order CEM; and extend the high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.
PDS: A Performance Database Server

DOE PAGES

Berry, Michael W.; Dongarra, Jack J.; Larose, Brian H.; ...

1994-01-01

The process of gathering, archiving, and distributing computer benchmark data is a cumbersome task usually performed by computer users and vendors with little coordination. Most important, there is no publicly available central depository of performance data for all ranges of machines from personal computers to supercomputers. We present an Internet-accessible performance database server (PDS) that can be used to extract current benchmark data and literature. As an extension to the X-Windows-based user interface (Xnetlib) to the Netlib archival system, PDS provides an on-line catalog of public domain computer benchmarks such as the LINPACK benchmark, Perfect benchmarks, and the NAS parallel benchmarks. PDS does not reformat or present the benchmark data in any way that conflicts with the original methodology of any particular benchmark; it is thereby devoid of any subjective interpretations of machine performance. We believe that all branches (research laboratories, academia, and industry) of the general computing community can use this facility to archive performance metrics and make them readily available to the public. PDS can provide a more manageable approach to the development and support of a large dynamic database of published performance metrics.
Cloud Computing for Complex Performance Codes.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Appel, Gordon John; Hadgu, Teklu; Klein, Brandon Thorin

This report describes the use of cloud computing services for running complex public domain performance assessment problems. The work consisted of two phases: Phase 1 was to demonstrate that complex codes, on several differently configured servers, could run and compute trivial small-scale problems in a commercial cloud infrastructure. Phase 2 focused on proving that non-trivial large-scale problems could be computed in the commercial cloud environment. The cloud computing effort was successfully applied using codes of interest to the geohydrology and nuclear waste disposal modeling community.
Role of HPC in Advancing Computational Aeroelasticity

NASA Technical Reports Server (NTRS)

Guruswamy, Guru P.

2004-01-01

On behalf of the High Performance Computing and Modernization Program (HPCMP) and NASA Advanced Supercomputing Division (NAS) a study is conducted to assess the role of supercomputers on computational aeroelasticity of aerospace vehicles. The study is mostly based on the responses to a web based questionnaire that was designed to capture the nuances of high performance computational aeroelasticity, particularly on parallel computers. A procedure is presented to assign a fidelity-complexity index to each application. Case studies based on major applications using HPCMP resources are presented.
CPE--A New Perspective: The Impact of the Technology Revolution. Proceedings of the Computer Performance Evaluation Users Group Meeting (19th, San Francisco, California, October 25-28, 1983). Final Report. Reports on Computer Science and Technology.

ERIC Educational Resources Information Center

Mobray, Deborah, Ed.

Papers on local area networks (LANs), modelling techniques, software improvement, capacity planning, software engineering, microcomputers and end user computing, cost accounting and chargeback, configuration and performance management, and benchmarking presented at this conference include: (1) "Theoretical Performance Analysis of Virtual…
Performance of Fourth-Grade Students in the 2012 NAEP Computer-Based Writing Pilot Assessment: Scores, Text Length, and Use of Editing Tools. Working Paper Series. NCES 2015-119

ERIC Educational Resources Information Center

White, Sheida; Kim, Young Yee; Chen, Jing; Liu, Fei

2015-01-01

This study examined whether or not fourth-graders could fully demonstrate their writing skills on the computer and factors associated with their performance on the National Assessment of Educational Progress (NAEP) computer-based writing assessment. The results suggest that high-performing fourth-graders (those who scored in the upper 20 percent…
Intelligent Computer Assisted Instruction (ICAI): Formative Evaluation of Two Systems

DTIC Science & Technology

1986-03-01

... appreciation for the power of computer technology. Interpretation: Yale students are a strikingly high performing group by traditional academic ... Intelligent Computer Assisted Instruction (ICAI): Formative Evaluation of Two Systems, April 1984 - August 1985 ... Performing organization: Jet Propulsion Laboratory
Design and Implementation of High-Performance GIS Dynamic Objects Rendering Engine

NASA Astrophysics Data System (ADS)

Zhong, Y.; Wang, S.; Li, R.; Yun, W.; Song, G.

2017-12-01

Spatio-temporal dynamic visualization is more vivid than static visualization. It is important to use dynamic visualization techniques to reveal the variation process and trend of geographical phenomena vividly and comprehensively. The challenges of dynamically visualizing both 2D and 3D spatial dynamic targets, especially across different spatial data types, call for a high-performance GIS dynamic objects rendering engine. The main approach to improving a rendering engine that handles vast numbers of dynamic targets relies on key technologies of high-performance GIS, including memory computing, parallel computing, GPU computing, and high-performance algorithms. In this study, a high-performance GIS dynamic objects rendering engine is designed and implemented to solve this problem based on hybrid acceleration techniques. The rendering engine combines GPU computing, OpenGL technology, and high-performance algorithms with the advantage of 64-bit memory computing. It processes 2D and 3D dynamic target data efficiently and runs smoothly with vast dynamic target data. The prototype system of the rendering engine is developed based on SuperMap GIS iObjects. Experiments designed for large-scale spatial data visualization showed that the rendering engine has the advantage of high performance: rendering two-dimensional and three-dimensional dynamic objects is 20 times faster on the GPU than on the CPU.

Comparison of computer-assisted instruction (CAI) versus traditional textbook methods for training in abdominal examination (Japanese experience).

PubMed

Qayumi, A K; Kurihara, Y; Imai, M; Pachev, G; Seo, H; Hoshino, Y; Cheifetz, R; Matsuura, K; Momoi, M; Saleem, M; Lara-Guerra, H; Miki, Y; Kariya, Y

2004-10-01

This study aimed to compare the effects of computer-assisted, text-based and computer-and-text learning conditions on the performances of 3 groups of medical students in the pre-clinical years of their programme, taking into account their academic achievement to date. A fourth group of students served as a control (no-study) group. Participants were recruited from the pre-clinical years of the training programmes in 2 medical schools in Japan, Jichi Medical School near Tokyo and Kochi Medical School near Osaka. Participants were randomly assigned to 4 learning conditions and tested before and after the study on their knowledge of and skill in performing an abdominal examination, in a multiple-choice test and an objective structured clinical examination (OSCE), respectively. Information about performance in the programme was collected from school records and students were classified as average, good or excellent. Student and faculty evaluations of their experience in the study were explored by means of a short evaluation survey. Compared to the control group, all 3 study groups exhibited significant gains in performance on knowledge and performance measures. For the knowledge measure, the gains of the computer-assisted and computer-assisted plus text-based learning groups were significantly greater than the gains of the text-based learning group. The performances of the 3 groups did not differ on the OSCE measure. Analyses of gains by performance level revealed that high achieving students' learning was independent of study method. Lower achieving students performed better after using computer-based learning methods. The results suggest that computer-assisted learning methods will be of greater help to students who do not find the traditional methods effective. Explorations of the factors behind this are a matter for future research.
The performance of low-cost commercial cloud computing as an alternative in computational chemistry.

PubMed

Thackston, Russell; Fortenberry, Ryan C

2015-05-05

The growth of commercial cloud computing (CCC) as a viable means of computational infrastructure is largely unexplored for the purposes of quantum chemistry. In this work, the PSI4 suite of computational chemistry programs is installed on five different types of Amazon Web Services CCC platforms. The performance for a set of electronically excited state single-point energies is compared between these CCC platforms and typical, "in-house" physical machines. Further considerations are made for the number of cores or virtual CPUs (vCPUs, for the CCC platforms), but no considerations are made for full parallelization of the program (even though parallelization of the BLAS library is implemented), complete high-performance computing cluster utilization, or steal time. Even with this most pessimistic view of the computations, CCC resources are shown to be more cost effective for significant numbers of typical quantum chemistry computations. Large numbers of large computations are still best handled by more traditional means, but smaller-scale research may be more effectively undertaken through CCC services. © 2015 Wiley Periodicals, Inc.
Fog computing job scheduling optimization based on bees swarm

NASA Astrophysics Data System (ADS)

Bitam, Salim; Zeadally, Sherali; Mellouk, Abdelhamid

2018-04-01

Fog computing is a new computing architecture, composed of a set of near-user edge devices called fog nodes, which collaborate in order to perform computational services such as running applications, storing large amounts of data, and transmitting messages. Fog computing extends cloud computing by deploying digital resources at the premises of mobile users. In this new paradigm, management and operating functions, such as job scheduling, aim at providing high-performance, cost-effective services requested by mobile users and executed by fog nodes. We propose a new bio-inspired optimization approach called the Bees Life Algorithm (BLA), aimed at addressing the job scheduling problem in the fog computing environment. Our proposed approach is based on the optimized distribution of a set of tasks among all the fog computing nodes. The objective is to find an optimal tradeoff between CPU execution time and allocated memory required by fog computing services established by mobile users. Our empirical performance evaluation results demonstrate that the proposal outperforms traditional particle swarm optimization and the genetic algorithm in terms of CPU execution time and allocated memory.
Impact of IQ, computer-gaming skills, general dexterity, and laparoscopic experience on performance with the da Vinci surgical system.

PubMed

Hagen, Monika E; Wagner, Oliver J; Inan, Ihsan; Morel, Philippe

2009-09-01

Due to improved ergonomics and dexterity, robotic surgery is promoted as being easily performed by surgeons with no special skills necessary. We tested this hypothesis by measuring IQ elements, computer gaming skills, general dexterity with chopsticks, and evaluating laparoscopic experience in correlation to performance ability with the da Vinci robot. Thirty-four individuals were tested for robotic dexterity, IQ elements, computer-gaming skills and general dexterity. Eighteen surgically inexperienced and 16 laparoscopically trained surgeons were included. Each individual performed three different tasks with the da Vinci surgical system and their times were recorded. An IQ test (elements: logical thinking, 3D imagination and technical understanding) was completed by each participant. Computer skills were tested with a simple computer game (hand-eye coordination) and general dexterity was evaluated by the ability to use chopsticks. We found no correlation between logical thinking, 3D imagination and robotic skills. Both computer gaming and general dexterity showed a slight but non-significant improvement in performance with the da Vinci robot (p > 0.05). A significant correlation between robotic skills, technical understanding and laparoscopic experience was observed (p < 0.05). The data support the conclusion that there are no significant correlations between robotic performance and logical thinking, 3D understanding, computer gaming skills and general dexterity. A correlation between robotic skills and technical understanding may exist. Laparoscopic experience seems to be the strongest predictor of performance with the da Vinci surgical system. Generally, it appears difficult to determine non-surgical predictors for robotic surgery.
A Modular Environment for Geophysical Inversion and Run-time Autotuning using Heterogeneous Computing Systems

NASA Astrophysics Data System (ADS)

Myre, Joseph M.

Heterogeneous computing systems have recently come to the forefront of the High-Performance Computing (HPC) community's interest. HPC computer systems that incorporate special-purpose accelerators, such as Graphics Processing Units (GPUs), are said to be heterogeneous. Large-scale heterogeneous computing systems have consistently ranked highly on the Top500 list since the beginning of the heterogeneous computing trend. By using heterogeneous computing systems that consist of both general-purpose processors and special-purpose accelerators, the speed and problem size of many simulations could be dramatically increased. Ultimately this results in enhanced simulation capabilities that allow, in some cases for the first time, the execution of parameter space and uncertainty analyses, model optimizations, and other inverse modeling techniques that are critical for scientific discovery and engineering analysis. However, simplifying the usage and optimization of codes for heterogeneous computing systems remains a challenge. This is particularly true for scientists and engineers for whom understanding HPC architectures and undertaking performance analysis may not be primary research objectives. To enable scientists and engineers to remain focused on their primary research objectives, a modular environment for geophysical inversion and run-time autotuning on heterogeneous computing systems is presented. This environment is composed of three major components: 1) CUSH, a framework for reducing the complexity of programming heterogeneous computer systems, 2) geophysical inversion routines which can be used to characterize physical systems, and 3) run-time autotuning routines designed to determine configurations of heterogeneous computing systems in an attempt to maximize the performance of scientific and engineering codes.
Using three case studies, a lattice-Boltzmann method, a non-negative least squares inversion, and a finite-difference fluid flow method, it is shown that this environment provides scientists and engineers with means to reduce the programmatic complexity of their applications, to perform geophysical inversions for characterizing physical systems, and to determine high-performing run-time configurations of heterogeneous computing systems using a run-time autotuner.
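Run-time autotuning, in its simplest form, times a kernel under each candidate configuration and keeps the fastest. The sketch below shows only that general pattern; it is not the CUSH framework or the autotuner from this work, and the names `autotune`, `configs`, `repeats`, and the `block` parameter in the usage example are hypothetical.

```python
import time


def autotune(kernel, configs, repeats=3, timer=time.perf_counter):
    """Time `kernel(**config)` for each config and return (best_config, best_time).

    The best of `repeats` timings is used per configuration to reduce noise.
    `timer` is injectable so the search can be exercised deterministically.
    """
    best_config, best_time = None, float("inf")
    for config in configs:
        timings = []
        for _ in range(repeats):
            start = timer()
            kernel(**config)
            timings.append(timer() - start)
        fastest = min(timings)
        if fastest < best_time:
            best_config, best_time = config, fastest
    return best_config, best_time


if __name__ == "__main__":
    def blocked_sum(block):
        # Toy kernel: sum a list in chunks of `block` elements.
        data = list(range(100_000))
        return sum(sum(data[i:i + block]) for i in range(0, len(data), block))

    candidates = [{"block": b} for b in (16, 256, 4096)]
    print(autotune(blocked_sum, candidates))
```

An exhaustive search like this is only practical for small configuration spaces; larger spaces would need a search strategy, which is outside the scope of this sketch.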
The new landscape of parallel computer architecture

NASA Astrophysics Data System (ADS)

Shalf, John

2007-07-01

The past few years have seen a sea change in computer architecture that will impact every facet of our society, as every electronic device from cell phone to supercomputer will need to confront parallelism of unprecedented scale. Whereas the conventional multicore approach (2, 4, and even 8 cores) adopted by the computing industry will eventually hit a performance plateau, the highest performance per watt and per chip area is achieved using manycore technology (hundreds or even thousands of cores). However, fully unleashing the potential of the manycore approach to ensure future advances in sustained computational performance will require fundamental advances in computer architecture and programming models that are nothing short of reinventing computing. In this paper we examine the reasons behind the movement to exponentially increasing parallelism, and its ramifications for system design, applications and programming models.
Hypersonic Research Vehicle (HRV) real-time flight test support feasibility and requirements study. Part 2: Remote computation support for flight systems functions

NASA Technical Reports Server (NTRS)

Rediess, Herman A.; Hewett, M. D.

1991-01-01

The requirements for using remote computation to support HRV flight testing are assessed. First, remote computational requirements were developed to support functions that will eventually be performed onboard operational vehicles of this type. These are functions that either cannot be performed onboard in the time frame of initial HRV flight test programs, because the technology of airborne computers will not be sufficiently advanced to support the required computational loads, or that are undesirable to perform onboard in the flight test program for other reasons. Second, remote computational support that is either required or highly desirable for conducting the flight testing itself was addressed. The use of an Automated Flight Management System is proposed and described in conceptual detail. Third, autonomous operations are discussed, and finally, unmanned operations.
High-Performance Computing for the Electromagnetic Modeling and Simulation of Interconnects

NASA Technical Reports Server (NTRS)

Schutt-Aine, Jose E.

1996-01-01

The electromagnetic modeling of packages and interconnects plays a very important role in the design of high-speed digital circuits, and is most efficiently performed by using computer-aided design algorithms. In recent years, packaging has become a critical area in the design of high-speed communication systems and fast computers, and the importance of the software support for their development has increased accordingly. Throughout this project, our efforts have focused on the development of modeling and simulation techniques and algorithms that permit the fast computation of the electrical parameters of interconnects and the efficient simulation of their electrical performance.
Method and system for benchmarking computers

DOEpatents

Gustafson, John L.

1993-09-14

A testing system and method for benchmarking computer systems. The system includes a store containing a scalable set of tasks to be performed to produce a solution in ever-increasing degrees of resolution as a larger number of the tasks are performed. A timing and control module allots to each computer a fixed benchmarking interval in which to perform the stored tasks. Means are provided for determining, after completion of the benchmarking interval, the degree of progress through the scalable set of tasks and for producing a benchmarking rating relating to the degree of progress for each computer.
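The fixed-interval idea in this abstract (give every machine the same time budget and rate it by how far it gets) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented system: `fixed_time_benchmark`, `interval_s`, and the injectable `clock` are hypothetical names, and the rating here is simply the count of completed tasks.

```python
import time


def fixed_time_benchmark(tasks, interval_s, clock=time.perf_counter):
    """Run tasks in order until the interval expires; return how many completed.

    `tasks` is an iterable of zero-argument callables, ordered so that each
    additional task refines the solution further. A task that starts before
    the deadline is allowed to finish.
    """
    deadline = clock() + interval_s
    completed = 0
    for task in tasks:
        if clock() >= deadline:
            break
        task()
        completed += 1
    return completed


if __name__ == "__main__":
    def refine():
        # Toy unit of work standing in for one refinement step.
        sum(i * i for i in range(10_000))

    rating = fixed_time_benchmark([refine] * 1_000_000, interval_s=0.1)
    print("tasks completed in 0.1 s:", rating)
```

Because the interval is fixed and the workload scales, a faster machine reports a larger rating rather than a shorter run time.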
Methods for operating parallel computing systems employing sequenced communications

DOEpatents

Benner, Robert E.; Gustafson, John L.; Montry, Gary R.

1999-01-01

A parallel computing system and method having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes within the computing system.
High-performance scientific computing in the cloud

NASA Astrophysics Data System (ADS)

Jorissen, Kevin; Vila, Fernando; Rehr, John

2011-03-01

Cloud computing has the potential to open up high-performance computational science to a much broader class of researchers, owing to its ability to provide on-demand, virtualized computational resources. However, before such approaches can become commonplace, user-friendly tools must be developed that hide the unfamiliar cloud environment and streamline the management of cloud resources for many scientific applications. We have recently shown that high-performance cloud computing is feasible for parallelized x-ray spectroscopy calculations. We now present benchmark results for a wider selection of scientific applications focusing on electronic structure and spectroscopic simulation software in condensed matter physics. These applications are driven by an improved portable interface that can manage virtual clusters and run various applications in the cloud. We also describe a next generation of cluster tools, aimed at improved performance and a more robust cluster deployment. Supported by NSF grant OCI-1048052.
Manycore Performance-Portability: Kokkos Multidimensional Array Library

DOE PAGES

Edwards, H. Carter; Sunderland, Daniel; Porter, Vicki; ...

2012-01-01

Large, complex scientific and engineering application codes have a significant investment in computational kernels to implement their mathematical models. Porting these computational kernels to the collection of modern manycore accelerator devices is a major challenge in that these devices have diverse programming models, application programming interfaces (APIs), and performance requirements. The Kokkos Array programming model provides a library-based approach to implementing computational kernels that are performance-portable to CPU-multicore and GPGPU accelerator devices. This programming model is based upon three fundamental concepts: (1) manycore compute devices each with its own memory space, (2) data parallel kernels and (3) multidimensional arrays. Kernel execution performance is, especially for NVIDIA® devices, extremely dependent on data access patterns. The optimal data access pattern can be different for different manycore devices, potentially leading to different implementations of computational kernels specialized for different devices. The Kokkos Array programming model supports performance-portable kernels by (1) separating data access patterns from computational kernels through a multidimensional array API and (2) introducing device-specific data access mappings when a kernel is compiled. An implementation of Kokkos Array is available through Trilinos [Trilinos website, http://trilinos.sandia.gov/, August 2011].
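The separation of data access pattern from kernel code can be illustrated with a small sketch in which a kernel indexes an array by (i, j) while the array object decides how those indices map to memory. This is not the Kokkos Array API; the class `Array2D`, its `layout` argument, and the kernel `scale_rows` are hypothetical names, and the layout is chosen at construction time here rather than at compile time.

```python
class Array2D:
    """2-D array over flat storage; `layout` selects the index-to-offset mapping."""

    def __init__(self, rows, cols, layout="right"):
        if layout not in ("left", "right"):
            raise ValueError("layout must be 'left' or 'right'")
        self.rows, self.cols, self.layout = rows, cols, layout
        self.data = [0.0] * (rows * cols)

    def offset(self, i, j):
        # "right": row-major, j is contiguous. "left": column-major, i is contiguous.
        if self.layout == "right":
            return i * self.cols + j
        return j * self.rows + i

    def __getitem__(self, ij):
        return self.data[self.offset(*ij)]

    def __setitem__(self, ij, value):
        self.data[self.offset(*ij)] = value


def scale_rows(a, factors):
    """Kernel written once against (i, j) indices, independent of the layout."""
    for i in range(a.rows):
        for j in range(a.cols):
            a[i, j] *= factors[i]


if __name__ == "__main__":
    for layout in ("right", "left"):
        a = Array2D(2, 3, layout)
        for i in range(2):
            for j in range(3):
                a[i, j] = i * 3 + j + 1
        scale_rows(a, [10, 100])
        print(layout, a.data)
```

The kernel produces the same logical result for both layouts while the flat storage order differs, which is the property that lets a device-specific mapping be swapped in without rewriting the kernel.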
Scientific Inquiry Self-Efficacy and Computer Game Self-Efficacy as Predictors and Outcomes of Middle School Boys' and Girls' Performance in a Science Assessment in a Virtual Environment

NASA Astrophysics Data System (ADS)

Bergey, Bradley W.; Ketelhut, Diane Jass; Liang, Senfeng; Natarajan, Uma; Karakus, Melissa

2015-10-01

The primary aim of the study was to examine whether performance on a science assessment in an immersive virtual environment was associated with changes in scientific inquiry self-efficacy. A secondary aim of the study was to examine whether performance on the science assessment was equitable for students with different levels of computer game self-efficacy, including whether gender differences were observed. We examined 407 middle school students' scientific inquiry self-efficacy and computer game self-efficacy before and after completing a computer game-like assessment about a science mystery. Results from path analyses indicated that prior scientific inquiry self-efficacy predicted achievement on end-of-module questions, which in turn predicted change in scientific inquiry self-efficacy. By contrast, computer game self-efficacy was neither predictive of nor predicted by performance on the science assessment. While boys had higher computer game self-efficacy compared to girls, multi-group analyses suggested only minor gender differences in how efficacy beliefs related to performance. Implications for assessments with virtual environments and future design and research are discussed.
QPSO-Based Adaptive DNA Computing Algorithm

PubMed Central

Karakose, Mehmet; Cigdem, Ugur

2013-01-01

DNA (deoxyribonucleic acid) computing, a new computation model based on DNA molecules for information storage, has been increasingly used for optimization and data analysis in recent years. However, the DNA computing algorithm has some limitations in terms of convergence speed, adaptability, and effectiveness. In this paper, a new approach for improving DNA computing is proposed. This new approach aims to run the DNA computing algorithm with adaptive parameters towards the desired goal using quantum-behaved particle swarm optimization (QPSO). Some contributions of the proposed QPSO-based adaptive DNA computing algorithm are as follows: (1) the population size, crossover rate, maximum number of operations, enzyme and virus mutation rates, and fitness function of the DNA computing algorithm are simultaneously tuned for the adaptive process, (2) the adaptive algorithm is performed using the QPSO algorithm for goal-driven progress, faster operation, and flexibility in data, and (3) a numerical realization of the DNA computing algorithm with the proposed approach is implemented in system identification. Two experiments with different systems were carried out to evaluate the performance of the proposed approach with comparative results. Experimental results obtained with Matlab and FPGA demonstrate the ability to provide effective optimization, considerable convergence speed, and high accuracy relative to the DNA computing algorithm. PMID:23935409
Scaling predictive modeling in drug development with cloud computing.

PubMed

Moghadam, Behrooz Torabi; Alvarsson, Jonathan; Holm, Marcus; Eklund, Martin; Carlsson, Lars; Spjuth, Ola

2015-01-26

Growing data sets and the increased time needed to analyze them are hampering predictive modeling in drug discovery. Model building can be carried out on high-performance computer clusters, but these can be expensive to purchase and maintain. We have evaluated ligand-based modeling on cloud computing resources where computations are parallelized and run on the Amazon Elastic Cloud. We trained models on open data sets of varying sizes for the end points logP and Ames mutagenicity and compared them with model building parallelized on a traditional high-performance computing cluster. We show that while high-performance computing results in faster model building, the use of cloud computing resources is feasible for large data sets and scales well within cloud instances. An additional advantage of cloud computing is that the costs of predictive models can be easily quantified, and a choice can be made between speed and economy. The easy access to computational resources with no up-front investments makes cloud computing an attractive alternative for scientists, especially for those without access to a supercomputer, and our study shows that it enables cost-efficient modeling of large data sets on demand within reasonable time.
Matching and correlation computations in stereoscopic depth perception.

PubMed

Doi, Takahiro; Tanabe, Seiji; Fujita, Ichiro

2011-03-02

A fundamental task of the visual system is to infer depth by using binocular disparity. To encode binocular disparity, the visual cortex performs two distinct computations: one detects matched patterns in paired images (matching computation); the other constructs the cross-correlation between the images (correlation computation). How the two computations are used in stereoscopic perception is unclear. We dissociated their contributions in near/far discrimination by varying the magnitude of the disparity across separate sessions. For small disparity (0.03°), subjects performed at chance level to a binocularly opposite-contrast (anti-correlated) random-dot stereogram (RDS) but improved their performance with the proportion of contrast-matched (correlated) dots. For large disparity (0.48°), the direction of perceived depth reversed with an anti-correlated RDS relative to that for a correlated one. Neither reversed nor normal depth was perceived when anti-correlation was applied to half of the dots. We explain the decision process as a weighted average of the two computations, with the relative weight of the correlation computation increasing with the disparity magnitude. We conclude that matching computation dominates fine depth perception, while both computations contribute to coarser depth perception. Thus, stereoscopic depth perception recruits different computations depending on the disparity magnitude.
An ergonomic evaluation comparing desktop, notebook, and subnotebook computers.

PubMed

Szeto, Grace P; Lee, Raymond

2002-04-01

To evaluate and compare the postures and movements of the cervical and upper thoracic spine, the typing performance, and workstation ergonomic factors when using desktop, notebook, and subnotebook computers. Repeated-measures design. A motion analysis laboratory with an electromagnetic tracking device. A convenience sample of 21 university students between ages 20 and 24 years with no history of neck or shoulder discomfort. Each subject performed a standardized typing task by using each of the 3 computers. Measurements during the typing task were taken at set intervals. Cervical and thoracic spines adopted a more flexed posture in using the smaller-sized computers. There were significantly greater neck movements in using desktop computers when compared with the notebook and subnotebook computers. The viewing distances adopted by the subjects decreased as the computer size decreased. Typing performance and subjective rating of difficulty in using the keyboards were also significantly different among the 3 types of computers. Computer users need to consider the posture of the spine and potential risk of developing musculoskeletal discomfort in choosing computers.
System Resource Allocations | High-Performance Computing | NREL

Science.gov Websites

Allocations System Resource Allocations To use NREL's high-performance computing (HPC) resources: Compute hours on NREL HPC Systems including Peregrine and Eagle Storage space (in Terabytes) on Peregrine, Eagle and Gyrfalcon. Allocations are principally done in response to an annual call for allocation
Performance of a computer-based assessment of cognitive function measures in two cohorts of seniors

USDA-ARS's Scientific Manuscript database

Computer-administered assessment of cognitive function is being increasingly incorporated in clinical trials; however, its performance in these settings has not been systematically evaluated. The Seniors Health and Activity Research Program (SHARP) pilot trial (N=73) developed a computer-based tool f...
47 CFR 73.151 - Field strength measurements to establish performance of directional antennas.

Code of Federal Regulations, 2010 CFR

2010-10-01

... verified either by field strength measurement or by computer modeling and sampling system verification. (a... specifically identified by the Commission. (c) Computer modeling and sample system verification of modeled... performance verified by computer modeling and sample system verification. (1) A matrix of impedance...

Achieving High Performance with FPGA-Based Computing

PubMed Central

Herbordt, Martin C.; VanCourt, Tom; Gu, Yongfeng; Sukhwani, Bharat; Conti, Al; Model, Josh; DiSabello, Doug

2011-01-01

Numerous application areas, including bioinformatics and computational biology, demand increasing amounts of processing capability. In many cases, the computation cores and data types are suited to field-programmable gate arrays. The challenge is identifying the design techniques that can extract high performance potential from the FPGA fabric. PMID:21603088
Promoting High-Performance Computing and Communications. A CBO Study.

ERIC Educational Resources Information Center

Webre, Philip

In 1991 the Federal Government initiated the multiagency High Performance Computing and Communications program (HPCC) to further the development of U.S. supercomputer technology and high-speed computer network technology. This overview by the Congressional Budget Office (CBO) concentrates on obstacles that might prevent the growth of the…
Debugging a high performance computing program

DOEpatents

Gooding, Thomas M.

2014-08-19

Methods, apparatus, and computer program products are disclosed for debugging a high performance computing program by gathering lists of addresses of calling instructions for a plurality of threads of execution of the program, assigning the threads to groups in dependence upon the addresses, and displaying the groups to identify defective threads.
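
The grouping step in this abstract can be illustrated with a minimal Python sketch. It assumes each thread's calling-instruction addresses have already been gathered into a tuple; the function name `group_threads_by_callers` and the sample addresses are hypothetical and are not taken from the patent. Sorting the smallest groups first is one possible way to make outlier threads stand out when the groups are displayed.

```python
from collections import defaultdict


def group_threads_by_callers(stacks):
    """Group thread ids by their tuple of calling-instruction addresses.

    stacks: dict mapping thread id -> tuple of addresses (hypothetical input).
    Returns a list of (addresses, [thread ids]) pairs, smallest group first,
    so that threads with an unusual call path are listed at the top.
    """
    groups = defaultdict(list)
    for thread_id, addresses in stacks.items():
        groups[tuple(addresses)].append(thread_id)
    return sorted(groups.items(), key=lambda item: (len(item[1]), item[0]))


# Hypothetical sample: thread 2 is waiting at a different call site.
sample_stacks = {
    0: (0x4005D0, 0x400A10),
    1: (0x4005D0, 0x400A10),
    2: (0x4005D0, 0x400B44),
    3: (0x4005D0, 0x400A10),
}

for addresses, threads in group_threads_by_callers(sample_stacks):
    print([hex(a) for a in addresses], threads)
```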
Debugging a high performance computing program

DOEpatents

Gooding, Thomas M.

2013-08-20

Methods, apparatus, and computer program products are disclosed for debugging a high performance computing program by gathering lists of addresses of calling instructions for a plurality of threads of execution of the program, assigning the threads to groups in dependence upon the addresses, and displaying the groups to identify defective threads.
Computer program for assessing the theoretical performance of a three dimensional inlet

NASA Technical Reports Server (NTRS)

Agnone, A. M.; Kung, F.

1972-01-01

A computer program for determining the theoretical performance of a three dimensional inlet is presented. An analysis for determining the capture area, ram force, spillage force, and surface pressure force is presented, along with the necessary computer program. A sample calculation is also included.
Accelerating epistasis analysis in human genetics with consumer graphics hardware.

PubMed

Sinnott-Armstrong, Nicholas A; Greene, Casey S; Cancare, Fabio; Moore, Jason H

2009-07-24

Human geneticists are now capable of measuring more than one million DNA sequence variations from across the human genome. The new challenge is to develop computationally feasible methods capable of analyzing these data for associations with common human disease, particularly in the context of epistasis. Epistasis describes the situation where multiple genes interact in a complex non-linear manner to determine an individual's disease risk and is thought to be ubiquitous for common diseases. Multifactor Dimensionality Reduction (MDR) is an algorithm capable of detecting epistasis. An exhaustive analysis with MDR is often computationally expensive, particularly for high order interactions. This challenge has previously been met with parallel computation and expensive hardware. The option we examine here exploits commodity hardware designed for computer graphics. In modern computers Graphics Processing Units (GPUs) have more memory bandwidth and computational capability than Central Processing Units (CPUs) and are well suited to this problem. Advances in the video game industry have led to an economy of scale creating a situation where these powerful components are readily available at very low cost. Here we implement and evaluate the performance of the MDR algorithm on GPUs. Of primary interest are the time required for an epistasis analysis and the price to performance ratio of available solutions. We found that using MDR on GPUs consistently increased performance per machine over both a feature rich Java software package and a C++ cluster implementation. The performance of a GPU workstation running a GPU implementation reduces computation time by a factor of 160 compared to an 8-core workstation running the Java implementation on CPUs. This GPU workstation performs similarly to 150 cores running an optimized C++ implementation on a Beowulf cluster. 
Furthermore this GPU system provides extremely cost effective performance while leaving the CPU available for other tasks. The GPU workstation containing three GPUs costs $2000 while obtaining similar performance on a Beowulf cluster requires 150 CPU cores which, including the added infrastructure and support cost of the cluster system, cost approximately $82,500. Graphics hardware based computing provides a cost effective means to perform genetic analysis of epistasis using MDR on large datasets without the infrastructure of a computing cluster.
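
The cost comparison in this abstract reduces to simple arithmetic, sketched below in Python. The dollar figures and the core count are the ones quoted in the abstract; the variable and function names are illustrative only, and "cost per equivalent core" is an assumed way of expressing the comparison, not a metric the authors report.

```python
GPU_WORKSTATION_COST = 2_000    # USD, three-GPU workstation (from the abstract)
BEOWULF_CLUSTER_COST = 82_500   # USD, 150 cores plus infrastructure (from the abstract)
EQUIVALENT_CORES = 150          # cluster cores with similar performance


def cost_ratio(cluster_cost, workstation_cost):
    """How many times more the cluster costs for similar performance."""
    return cluster_cost / workstation_cost


ratio = cost_ratio(BEOWULF_CLUSTER_COST, GPU_WORKSTATION_COST)
cluster_cost_per_core = BEOWULF_CLUSTER_COST / EQUIVALENT_CORES
workstation_cost_per_core = GPU_WORKSTATION_COST / EQUIVALENT_CORES

print(f"cluster / workstation cost ratio: {ratio:.2f}")
print(f"cluster cost per core: ${cluster_cost_per_core:.2f}")
print(f"workstation cost per equivalent core: ${workstation_cost_per_core:.2f}")
```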
Development of small scale cluster computer for numerical analysis

NASA Astrophysics Data System (ADS)

Zulkifli, N. H. N.; Sapit, A.; Mohammed, A. N.

2017-09-01

In this study, two personal computers were successfully networked together to form a small-scale cluster. Each computer has a multicore processor with four cores, giving the cluster eight processors in total. The cluster runs Ubuntu 14.04 Linux with an MPI implementation (MPICH2). Two main tests were conducted on the cluster: a communication test and a performance test. The communication test, which verified that the computers can pass the required information without any problem, used a simple MPI Hello program written in C. In addition, a performance test was carried out to show that the calculation performance of the cluster is much better than that of a single-CPU computer. In this performance test, the same code was run in four configurations: a single node, 2 processors, 4 processors, and 8 processors. The results show that the time required to solve the problem decreases as processors are added, and that the calculation time is halved when the number of processors is doubled. To conclude, we successfully developed a small-scale cluster computer from common hardware that delivers higher computing power than a single-CPU processor. This can be beneficial for research that requires high computing power, especially numerical analysis such as finite element analysis, computational fluid dynamics, and computational physics analysis.
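
The scaling behaviour reported in this abstract (run time halving each time the processor count doubles) corresponds to ideal speedup and a parallel efficiency of 1. The Python sketch below shows how those two quantities are usually computed. The timings are hypothetical placeholders chosen to follow the halving pattern, since the abstract gives no measured values, and the single-node run is assumed to use one processor.

```python
def speedup_and_efficiency(timings):
    """Compute speedup and parallel efficiency from wall-clock times.

    timings: dict mapping processor count -> run time in seconds.
             Must contain an entry for 1 processor as the baseline.
    Returns a dict mapping processor count -> (speedup, efficiency).
    """
    baseline = timings[1]
    results = {}
    for processors, seconds in sorted(timings.items()):
        speedup = baseline / seconds
        results[processors] = (speedup, speedup / processors)
    return results


# Hypothetical timings that halve whenever the processor count doubles.
hypothetical_timings = {1: 80.0, 2: 40.0, 4: 20.0, 8: 10.0}

for processors, (speedup, efficiency) in speedup_and_efficiency(
    hypothetical_timings
).items():
    print(f"{processors} processors: speedup {speedup:.1f}, efficiency {efficiency:.2f}")
```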
Spatiotemporal Dynamics and Reliable Computations in Recurrent Spiking Neural Networks

NASA Astrophysics Data System (ADS)

Pyle, Ryan; Rosenbaum, Robert

2017-01-01

Randomly connected networks of excitatory and inhibitory spiking neurons provide a parsimonious model of neural variability, but are notoriously unreliable for performing computations. We show that this difficulty is overcome by incorporating the well-documented dependence of connection probability on distance. Spatially extended spiking networks exhibit symmetry-breaking bifurcations and generate spatiotemporal patterns that can be trained to perform dynamical computations under a reservoir computing framework.
IUE Data Analysis Software for Personal Computers

NASA Technical Reports Server (NTRS)

Thompson, R.; Caplinger, J.; Taylor, L.; Lawton, P.

1996-01-01

This report summarizes the work performed for the program titled, "IUE Data Analysis Software for Personal Computers" awarded under Astrophysics Data Program NRA 92-OSSA-15. The work performed was completed over a 2-year period starting in April 1994. As a result of the project, 450 IDL routines and eight database tables are now available for distribution for Power Macintosh computers and Personal Computers running Windows 3.1.
FPGA-Based High-Performance Embedded Systems for Adaptive Edge Computing in Cyber-Physical Systems: The ARTICo³ Framework.

PubMed

Rodríguez, Alfonso; Valverde, Juan; Portilla, Jorge; Otero, Andrés; Riesgo, Teresa; de la Torre, Eduardo

2018-06-08

Cyber-Physical Systems are experiencing a paradigm shift in which processing has been relocated to the distributed sensing layer and is no longer performed in a centralized manner. This approach, usually referred to as Edge Computing, demands the use of hardware platforms that are able to manage the steadily increasing requirements in computing performance, while keeping energy efficiency and the adaptability imposed by the interaction with the physical world. In this context, SRAM-based FPGAs and their inherent run-time reconfigurability, when coupled with smart power management strategies, are a suitable solution. However, they usually fail in user accessibility and ease of development. In this paper, an integrated framework to develop FPGA-based high-performance embedded systems for Edge Computing in Cyber-Physical Systems is presented. This framework provides a hardware-based processing architecture, an automated toolchain, and a runtime to transparently generate and manage reconfigurable systems from high-level system descriptions without additional user intervention. Moreover, it provides users with support for dynamically adapting the available computing resources to switch the working point of the architecture in a solution space defined by computing performance, energy consumption and fault tolerance. Results show that it is indeed possible to explore this solution space at run time and prove that the proposed framework is a competitive alternative to software-based edge computing platforms, being able to provide not only faster solutions, but also higher energy efficiency for computing-intensive algorithms with significant levels of data-level parallelism.
Models for evaluating the performability of degradable computing systems

NASA Technical Reports Server (NTRS)

Wu, L. T.

1982-01-01

Recent advances in multiprocessor technology established the need for unified methods to evaluate computing systems performance and reliability. In response to this modeling need, a general modeling framework that permits the modeling, analysis and evaluation of degradable computing systems is considered. Within this framework, several user oriented performance variables are identified and shown to be proper generalizations of the traditional notions of system performance and reliability. Furthermore, a time varying version of the model is developed to generalize the traditional fault tree reliability evaluation methods of phased missions.
Software Systems for High-performance Quantum Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Humble, Travis S; Britt, Keith A

Quantum computing promises new opportunities for solving hard computational problems, but harnessing this novelty requires breakthrough concepts in the design, operation, and application of computing systems. We define some of the challenges facing the development of quantum computing systems as well as software-based approaches that can be used to overcome these challenges. Following a brief overview of the state of the art, we present models for the quantum programming and execution models, the development of architectures for hybrid high-performance computing systems, and the realization of software stacks for quantum networking. This leads to a discussion of the role that conventional computing plays in the quantum paradigm and how some of the current challenges for exascale computing overlap with those facing quantum computing.
Accelerated Application Development: The ORNL Titan Experience

DOE PAGES

Joubert, Wayne; Archibald, Richard K.; Berrill, Mark A.; ...

2015-05-09

The use of computational accelerators such as NVIDIA GPUs and Intel Xeon Phi processors is now widespread in the high performance computing community, with many applications delivering impressive performance gains. However, programming these systems for high performance, performance portability and software maintainability has been a challenge. In this paper we discuss experiences porting applications to the Titan system. Titan, which began planning in 2009 and was deployed for general use in 2013, was the first multi-petaflop system based on accelerator hardware. To ready applications for accelerated computing, a preparedness effort was undertaken prior to delivery of Titan. In this paper we report experiences and lessons learned from this process and describe how users are currently making use of computational accelerators on Titan.
Accelerated application development: The ORNL Titan experience

DOE Office of Scientific and Technical Information (OSTI.GOV)

Joubert, Wayne; Archibald, Rick; Berrill, Mark

2015-08-01

The use of computational accelerators such as NVIDIA GPUs and Intel Xeon Phi processors is now widespread in the high performance computing community, with many applications delivering impressive performance gains. However, programming these systems for high performance, performance portability and software maintainability has been a challenge. In this paper we discuss experiences porting applications to the Titan system. Titan, which began planning in 2009 and was deployed for general use in 2013, was the first multi-petaflop system based on accelerator hardware. To ready applications for accelerated computing, a preparedness effort was undertaken prior to delivery of Titan. In this paper we report experiences and lessons learned from this process and describe how users are currently making use of computational accelerators on Titan.
Job Superscheduler Architecture and Performance in Computational Grid Environments

NASA Technical Reports Server (NTRS)

Shan, Hongzhang; Oliker, Leonid; Biswas, Rupak

2003-01-01

Computational grids hold great promise in utilizing geographically separated heterogeneous resources to solve large-scale complex scientific problems. However, a number of major technical hurdles, including distributed resource management and effective job scheduling, stand in the way of realizing these gains. In this paper, we propose a novel grid superscheduler architecture and three distributed job migration algorithms. We also model the critical interaction between the superscheduler and autonomous local schedulers. Extensive performance comparisons with ideal, central, and local schemes using real workloads from leading computational centers are conducted in a simulation environment. Additionally, synthetic workloads are used to perform a detailed sensitivity analysis of our superscheduler. Several key metrics demonstrate that substantial performance gains can be achieved via smart superscheduling in distributed computational grids.
Some system considerations in configuring a digital flight control - navigation system

NASA Technical Reports Server (NTRS)

Boone, J. H.; Flynn, G. R.

1976-01-01

A trade study was conducted with the objective of providing a technical guideline for selection of the most appropriate computer technology for the automatic flight control system of a civil subsonic jet transport. The trade study considers aspects of using either an analog, incremental type special purpose computer or a general purpose computer to perform critical autopilot computation functions. It also considers aspects of integration of noncritical autopilot and autothrottle modes into the computer performing the critical autoland functions, as compared to the federation of the noncritical modes into either a separate computer or with a R-Nav computer. The study is accomplished by establishing the relative advantages and/or risks associated with each of the computer configurations.
Performing process migration with allreduce operations

DOEpatents

Archer, Charles Jens; Peters, Amanda; Wallenfelt, Brian Paul

2010-12-14

Compute nodes perform allreduce operations that swap processes at nodes. A first allreduce operation generates a first result and uses a first process from a first compute node, a second process from a second compute node, and zeros from other compute nodes. The first compute node replaces the first process with the first result. A second allreduce operation generates a second result and uses the first result from the first compute node, the second process from the second compute node, and zeros from others. The second compute node replaces the second process with the second result, which is the first process. A third allreduce operation generates a third result and uses the first result from first compute node, the second result from the second compute node, and zeros from others. The first compute node replaces the first result with the third result, which is the second process.
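
The three-step exchange in this abstract behaves like the classic XOR swap carried out through collective operations. The Python sketch below simulates it on a single machine. It assumes the allreduce operator is bitwise XOR, which the abstract does not state but which reproduces the results it describes, and it represents each process image as a plain integer. The function names are hypothetical, and no real MPI library is used.

```python
from functools import reduce
from operator import xor


def allreduce_xor(contributions):
    """Simulated allreduce: XOR of every node's contribution."""
    return reduce(xor, contributions, 0)


def swap_processes(images, first, second):
    """Swap the process images held by two nodes using three allreduces.

    images: list of ints, one simulated process image per compute node.
    first, second: indices of the two (distinct) nodes whose processes swap.
    All other nodes contribute zeros, as in the abstract.
    """
    images = list(images)

    def contributions():
        return [img if i in (first, second) else 0 for i, img in enumerate(images)]

    # Step 1: first node stores (process A XOR process B).
    images[first] = allreduce_xor(contributions())
    # Step 2: second node stores (A XOR B) XOR B, which is process A.
    images[second] = allreduce_xor(contributions())
    # Step 3: first node stores (A XOR B) XOR A, which is process B.
    images[first] = allreduce_xor(contributions())
    return images


before = [0b1010, 0b0110, 7, 9]
after = swap_processes(before, first=0, second=1)
print(before, "->", after)
```

The appeal of this pattern is that neither node needs a temporary buffer large enough to hold a second process image; each allreduce result overwrites a value that is no longer needed.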
Computational Particle Dynamic Simulations on Multicore Processors (CPDMu) Final Report Phase I

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schmalz, Mark S

2011-07-24

Statement of Problem - Department of Energy has many legacy codes for simulation of computational particle dynamics and computational fluid dynamics applications that are designed to run on sequential processors and are not easily parallelized. Emerging high-performance computing architectures employ massively parallel multicore architectures (e.g., graphics processing units) to increase throughput. Parallelization of legacy simulation codes is a high priority, to achieve compatibility, efficiency, accuracy, and extensibility. General Statement of Solution - A legacy simulation application designed for implementation on mainly-sequential processors has been represented as a graph G. Mathematical transformations, applied to G, produce a graph representation $\underline{G}$ for a high-performance architecture. Key computational and data movement kernels of the application were analyzed/optimized for parallel execution using the mapping $G \rightarrow \underline{G}$, which can be performed semi-automatically. This approach is widely applicable to many types of high-performance computing systems, such as graphics processing units or clusters comprised of nodes that contain one or more such units. Phase I Accomplishments - Phase I research decomposed/profiled computational particle dynamics simulation code for rocket fuel combustion into low and high computational cost regions (respectively, mainly sequential and mainly parallel kernels), with analysis of space and time complexity. Using the research team's expertise in algorithm-to-architecture mappings, the high-cost kernels were transformed, parallelized, and implemented on Nvidia Fermi GPUs. Measured speedups (GPU with respect to single-core CPU) were approximately 20-32X for realistic model parameters, without final optimization. Error analysis showed no loss of computational accuracy.
Commercial Applications and Other Benefits - The proposed research will constitute a breakthrough in solution of problems related to efficient parallel computation of particle and fluid dynamics simulations. These problems occur throughout DOE, military and commercial sectors: the potential payoff is high. We plan to license or sell the solution to contractors for military and domestic applications such as disaster simulation (aerodynamic and hydrodynamic), Government agencies (hydrological and environmental simulations), and medical applications (e.g., in tomographic image reconstruction). Keywords - High-performance Computing, Graphic Processing Unit, Fluid/Particle Simulation. Summary for Members of Congress - Department of Energy has many simulation codes that must compute faster, to be effective. The Phase I research parallelized particle/fluid simulations for rocket combustion, for high-performance computing systems.
Relativistic Zeroth-Order Regular Approximation Combined with Nonhybrid and Hybrid Density Functional Theory: Performance for NMR Indirect Nuclear Spin-Spin Coupling in Heavy Metal Compounds.

PubMed

Moncho, Salvador; Autschbach, Jochen

2010-01-12

A benchmark study for relativistic density functional calculations of NMR spin-spin coupling constants has been performed. The test set contained 47 complexes with heavy metal atoms (W, Pt, Hg, Tl, Pb) with a total of 88 coupling constants involving one or two heavy metal atoms. One-, two-, three-, and four-bond spin-spin couplings have been computed at different levels of theory (nonhybrid vs hybrid DFT, scalar vs two-component relativistic). The computational model was based on geometries fully optimized at the BP/TZP scalar relativistic zeroth-order regular approximation (ZORA) and the conductor-like screening model (COSMO) to include solvent effects. The NMR computations also employed the continuum solvent model. Computations in the gas phase were performed in order to assess the importance of the solvation model. The relative median deviations between various computational models and experiment were found to range between 13% and 21%, with the highest-level computational model (hybrid density functional computations including scalar plus spin-orbit relativistic effects, the COSMO solvent model, and a Gaussian finite-nucleus model) performing best.
LDPC decoder with a limited-precision FPGA-based floating-point multiplication coprocessor

NASA Astrophysics Data System (ADS)

Moberly, Raymond; O'Sullivan, Michael; Waheed, Khurram

2007-09-01

Implementing the sum-product algorithm, in an FPGA with an embedded processor, invites us to consider a tradeoff between computational precision and computational speed. The algorithm, known outside of the signal processing community as Pearl's belief propagation, is used for iterative soft-decision decoding of LDPC codes. We determined the feasibility of a coprocessor that will perform product computations. Our FPGA-based coprocessor (design) performs computer algebra with significantly less precision than the standard (e.g. integer, floating-point) operations of general purpose processors. Using synthesis, targeting a 3,168 LUT Xilinx FPGA, we show that key components of a decoder are feasible and that the full single-precision decoder could be constructed using a larger part. Soft-decision decoding by the iterative belief propagation algorithm is impacted both positively and negatively by a reduction in the precision of the computation. Reducing precision reduces the coding gain, but the limited-precision computation can operate faster. A proposed solution offers custom logic to perform computations with less precision, yet uses the floating-point format to interface with the software. Simulation results show the achievable coding gain. Synthesis results help theorize the full capacity and performance of an FPGA-based coprocessor.
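
The precision-versus-speed tradeoff in this abstract can be explored in software with a rough Python sketch that keeps the IEEE 754 single-precision format at the interface but discards low-order mantissa bits before and after each multiplication. This is an assumed model of limited-precision arithmetic, not the coprocessor design from the paper; the function names and the choice of truncation rather than rounding are illustrative.

```python
import struct


def truncate_mantissa(value, keep_bits):
    """Return value as a float32 with only the top keep_bits mantissa bits kept.

    The sign and exponent fields are untouched, so the result is still a
    valid single-precision number (the interface format stays the same).
    """
    if not 0 <= keep_bits <= 23:
        raise ValueError("keep_bits must be between 0 and 23")
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    mask = 0xFFFFFFFF ^ ((1 << (23 - keep_bits)) - 1)
    return struct.unpack(">f", struct.pack(">I", bits & mask))[0]


def limited_multiply(a, b, keep_bits):
    """Multiply two numbers using reduced-precision operands and result."""
    product = truncate_mantissa(a, keep_bits) * truncate_mantissa(b, keep_bits)
    return truncate_mantissa(product, keep_bits)


exact = 1.1 * 1.3
for keep_bits in (4, 8, 16, 23):
    approx = limited_multiply(1.1, 1.3, keep_bits)
    print(f"{keep_bits:2d} mantissa bits: {approx:.7f} (error {exact - approx:.7f})")
```

Sweeping `keep_bits` in a decoder simulation is one way to estimate how much coding gain is lost at each precision level before committing to hardware.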

Developing a system for computing and reporting MAP-21 and other freight performance measures.

DOT National Transportation Integrated Search

2015-07-01

This report documents the use of the National Performance Monitoring Research Data Set (NPMRDS) for the computation of freight performance measures on Interstate highways in Washington state. The report documents the data availability and specifi...
David Sickinger | NREL

Science.gov Websites

Sickinger David Sickinger Researcher III-High Performance Computing David.Sickinger@nrel.gov | 303 -275-3724 David Sickinger works with NREL's High Performance Computing Systems & Operations group
Impact of Classroom Computer Use on Computer Anxiety.

ERIC Educational Resources Information Center

Lambert, Matthew E.; And Others

Increasing use of computer programs for undergraduate psychology education has raised concern over the impact of computer anxiety on educational performance. Additionally, some researchers have indicated that classroom computer use can exacerbate pre-existing computer anxiety. To evaluate the relationship between in-class computer use and computer…
Hadoop-MCC: Efficient Multiple Compound Comparison Algorithm Using Hadoop.

PubMed

Hua, Guan-Jie; Hung, Che-Lun; Tang, Chuan Yi

2018-01-01

In the past decade, drug design technologies have improved enormously. Computer-aided drug design (CADD) has played an important role in analysis and prediction in drug development, making the procedure more economical and efficient. However, computation with big data, such as ZINC, which contains data on more than 60 million compounds, and GDB-13, with more than 930 million small molecules, is notably time-consuming. Therefore, we propose a novel heterogeneous high performance computing method, named Hadoop-MCC, integrating Hadoop and GPU, to cope with big chemical structure data efficiently. Hadoop-MCC gains high availability and fault tolerance from Hadoop, as Hadoop is used to scatter input data to GPU devices and gather the results from GPU devices. The Hadoop framework adopts the mapper/reducer computation model. In the proposed method, mappers are responsible for fetching SMILES data segments and performing the LINGO method on the GPU, then reducers collect all comparison results produced by the mappers. Due to the high availability of Hadoop, all of the LINGO computational jobs on mappers can be completed, even if some of the mappers encounter problems. A LINGO comparison is performed on each GPU device in parallel. According to the experimental results, the proposed method on multiple GPU devices can achieve better computational performance than CUDA-MCC on a single GPU device. Hadoop-MCC is able to achieve the scalability, high availability, and fault tolerance granted by Hadoop, as well as high performance, by integrating the computational power of both Hadoop and GPU. It has been shown that using a heterogeneous architecture such as Hadoop-MCC can achieve better computational performance than a single GPU device.
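
The mapper/reducer division of labour described in this abstract can be sketched in plain Python, without Hadoop or a GPU. The sketch assumes LINGOs are overlapping length-4 substrings of a SMILES string and scores two compounds with a Tanimoto coefficient over the substring multisets; that scoring rule is a simplification chosen for illustration and may differ from the exact LINGO formula used by the authors. The names `lingos`, `similarity`, `mapper`, and `reducer` are hypothetical.

```python
from collections import Counter


def lingos(smiles, q=4):
    """Multiset of overlapping length-q substrings of a SMILES string."""
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def similarity(smiles_a, smiles_b, q=4):
    """Tanimoto coefficient on LINGO multisets (illustrative scoring rule)."""
    a, b = lingos(smiles_a, q), lingos(smiles_b, q)
    union = sum((a | b).values())
    if union == 0:
        return 0.0
    return sum((a & b).values()) / union


def mapper(queries, segment):
    """Compare every query against one data segment (one mapper's share)."""
    return {(query, compound): similarity(query, compound)
            for query in queries for compound in segment}


def reducer(mapper_outputs):
    """Collect the comparison results produced by all mappers."""
    collected = {}
    for output in mapper_outputs:
        collected.update(output)
    return collected


queries = ["CCCCO"]
segments = [["CCCCO", "CCCCN"], ["OCCCC"]]   # two hypothetical data segments
results = reducer(mapper(queries, segment) for segment in segments)
for pair, score in sorted(results.items()):
    print(pair, round(score, 3))
```

Because each mapper call depends only on its own segment, a failed segment can be rerun in isolation, which is the property the abstract attributes to Hadoop's fault tolerance.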
RGCA: A Reliable GPU Cluster Architecture for Large-Scale Internet of Things Computing Based on Effective Performance-Energy Optimization

PubMed Central

Chen, Qingkui; Zhao, Deyu; Wang, Jingjuan

2017-01-01

This paper aims to develop a low-cost, high-performance and high-reliability computing system to process large-scale data using common data mining algorithms in the Internet of Things (IoT) computing environment. Considering the characteristics of IoT data processing, similar to mainstream high performance computing, we use a GPU (Graphics Processing Unit) cluster to achieve better IoT services. Firstly, we present an energy consumption calculation method (ECCM) based on WSNs. Then, using the CUDA (Compute Unified Device Architecture) Programming model, we propose a Two-level Parallel Optimization Model (TLPOM) which exploits reasonable resource planning and common compiler optimization techniques to obtain the best blocks and threads configuration considering the resource constraints of each node. The key to this part is dynamic coupling Thread-Level Parallelism (TLP) and Instruction-Level Parallelism (ILP) to improve the performance of the algorithms without additional energy consumption. Finally, combining the ECCM and the TLPOM, we use the Reliable GPU Cluster Architecture (RGCA) to obtain a high-reliability computing system considering the nodes’ diversity, algorithm characteristics, etc. The results show that the performance of the algorithms significantly increased by 34.1%, 33.96% and 24.07% for Fermi, Kepler and Maxwell on average with TLPOM and the RGCA ensures that our IoT computing system provides low-cost and high-reliability services. PMID:28777325
RGCA: A Reliable GPU Cluster Architecture for Large-Scale Internet of Things Computing Based on Effective Performance-Energy Optimization.

PubMed

Fang, Yuling; Chen, Qingkui; Xiong, Neal N; Zhao, Deyu; Wang, Jingjuan

2017-08-04

This paper aims to develop a low-cost, high-performance and high-reliability computing system to process large-scale data using common data mining algorithms in the Internet of Things (IoT) computing environment. Considering the characteristics of IoT data processing, similar to mainstream high performance computing, we use a GPU (Graphics Processing Unit) cluster to achieve better IoT services. First, we present an energy consumption calculation method (ECCM) based on WSNs. Then, using the CUDA (Compute Unified Device Architecture) programming model, we propose a Two-level Parallel Optimization Model (TLPOM), which exploits reasonable resource planning and common compiler optimization techniques to obtain the best block and thread configuration under the resource constraints of each node. The key to this part is dynamically coupling Thread-Level Parallelism (TLP) and Instruction-Level Parallelism (ILP) to improve the performance of the algorithms without additional energy consumption. Finally, combining the ECCM and the TLPOM, we use the Reliable GPU Cluster Architecture (RGCA) to obtain a high-reliability computing system that accounts for the nodes' diversity, algorithm characteristics, etc. The results show that, with TLPOM, the performance of the algorithms increased on average by 34.1%, 33.96% and 24.07% for Fermi, Kepler and Maxwell, respectively, and that the RGCA ensures that our IoT computing system provides low-cost and high-reliability services.
Micromagnetics on high-performance workstation and mobile computational platforms

NASA Astrophysics Data System (ADS)

Fu, S.; Chang, R.; Couture, S.; Menarini, M.; Escobar, M. A.; Kuteifan, M.; Lubarda, M.; Gabay, D.; Lomakin, V.

2015-05-01

The feasibility of using high-performance desktop and embedded mobile computational platforms is presented, including a multi-core Intel central processing unit, Nvidia desktop graphics processing units, and the Nvidia Jetson TK1 platform. The FastMag finite element method-based micromagnetic simulator is used as a testbed, showing high efficiency on all the platforms. Optimization aspects of improving the performance of the mobile systems are discussed. The high performance, low cost, low power consumption, and rapid performance increase of the embedded mobile systems make them a promising candidate for micromagnetic simulations. Such architectures can be used as standalone systems or can be built as low-power computing clusters.
A survey of CPU-GPU heterogeneous computing techniques

DOE PAGES

Mittal, Sparsh; Vetter, Jeffrey S.

2015-07-04

As both CPUs and GPUs become employed in a wide range of applications, it has been acknowledged that both of these processing units (PUs) have their unique features and strengths and hence, CPU-GPU collaboration is inevitable to achieve high-performance computing. This has motivated a significant amount of research on heterogeneous computing techniques, along with the design of CPU-GPU fused chips and petascale heterogeneous supercomputers. In this paper, we survey heterogeneous computing techniques (HCTs) such as workload partitioning, which enable utilizing both CPU and GPU to improve performance and/or energy efficiency. We review heterogeneous computing approaches at the runtime, algorithm, programming, compiler and application levels. Further, we review both discrete and fused CPU-GPU systems and discuss benchmark suites designed for evaluating heterogeneous computing systems (HCSs). We believe that this paper will provide researchers with insights into the working and scope of applications of HCTs and motivate them to further harness the computational powers of CPUs and GPUs to achieve the goal of exascale performance.
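Workload partitioning, named above as one heterogeneous computing technique, can be sketched as a static split in proportion to each device's measured throughput. This is only one simple scheme among those such a survey would cover; the `partition` function and the throughput figures are hypothetical.

```python
def partition(n_items, throughput):
    """Split n_items across devices in proportion to their throughput.

    throughput maps a device name to items processed per second,
    e.g. as measured on a short profiling run (hypothetical values here).
    """
    total = sum(throughput.values())
    shares = {dev: int(n_items * rate / total) for dev, rate in throughput.items()}
    # Rounding down can leave a few items unassigned; give them to the fastest device.
    leftover = n_items - sum(shares.values())
    fastest = max(throughput, key=throughput.get)
    shares[fastest] += leftover
    return shares


if __name__ == "__main__":
    # A GPU measured at four times the CPU's rate receives four fifths of the work.
    print(partition(1000, {"cpu": 1.0, "gpu": 4.0}))
```

With this split both devices are expected to finish at about the same time, provided the measured rates hold for the full workload.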
Exploring Cloud Computing for Large-scale Scientific Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lin, Guang; Han, Binh; Yin, Jian

This paper explores cloud computing for large-scale data-intensive scientific applications. Cloud computing is attractive because it provides hardware and software resources on demand, which relieves the burden of acquiring and maintaining a huge amount of resources that may be used only once by a scientific application. However, unlike typical commercial applications that often require just a moderate amount of ordinary resources, large-scale scientific applications often need to process enormous amounts of data in the terabyte or even petabyte range and require special high performance hardware with low latency connections to complete computation in a reasonable amount of time. To address these challenges, we build an infrastructure that can dynamically select high performance computing hardware across institutions and dynamically adapt the computation to the selected resources to achieve high performance. We have also demonstrated the effectiveness of our infrastructure by building a system biology application and an uncertainty quantification application for carbon sequestration, which can efficiently utilize data and computation resources across several institutions.
HEPCloud, a New Paradigm for HEP Facilities: CMS Amazon Web Services Investigation

DOE PAGES

Holzman, Burt; Bauerdick, Lothar A. T.; Bockelman, Brian; ...

2017-09-29

Historically, high energy physics computing has been performed on large purpose-built computing systems. These began as single-site compute facilities, but have evolved into the distributed computing grids used today. Recently, there has been an exponential increase in the capacity and capability of commercial clouds. Cloud resources are highly virtualized and intended to be able to be flexibly deployed for a variety of computing tasks. There is a growing interest among the cloud providers to demonstrate the capability to perform large-scale scientific computing. In this paper, we discuss results from the CMS experiment using the Fermilab HEPCloud facility, which utilized both local Fermilab resources and virtual machines in the Amazon Web Services Elastic Compute Cloud. We discuss the planning, technical challenges, and lessons learned involved in performing physics workflows on a large-scale set of virtualized resources. Additionally, we will discuss the economics and operational efficiencies when executing workflows both in the cloud and on dedicated resources.
Failure detection and isolation investigation for strapdown skew redundant tetrad laser gyro inertial sensor arrays

NASA Technical Reports Server (NTRS)

Eberlein, A. J.; Lahm, T. G.

1976-01-01

The degree to which flight-critical failures in a strapdown laser gyro tetrad sensor assembly can be isolated in short-haul aircraft after a failure occurrence has been detected by the skewed sensor failure-detection voting logic is investigated along with the degree to which a failure in the tetrad computer can be detected and isolated at the computer level, assuming a dual-redundant computer configuration. The tetrad system was mechanized with two two-axis inertial navigation channels (INCs), each containing two gyro/accelerometer axes, computer, control circuitry, and input/output circuitry. Gyro/accelerometer data is crossfed between the two INCs to enable each computer to independently perform the navigation task. Computer calculations are synchronized between the computers so that calculated quantities are identical and may be compared. Fail-safe performance (identification of the first failure) is accomplished with a probability approaching 100 percent of the time, while fail-operational performance (identification and isolation of the first failure) is achieved 93 to 96 percent of the time.
SCCT guidelines for the performance and acquisition of coronary computed tomographic angiography: A report of the society of Cardiovascular Computed Tomography Guidelines Committee: Endorsed by the North American Society for Cardiovascular Imaging (NASCI).

PubMed

Abbara, Suhny; Blanke, Philipp; Maroules, Christopher D; Cheezum, Michael; Choi, Andrew D; Han, B Kelly; Marwan, Mohamed; Naoum, Chris; Norgaard, Bjarne L; Rubinshtein, Ronen; Schoenhagen, Paul; Villines, Todd; Leipsic, Jonathon

In response to recent technological advancements in acquisition techniques as well as a growing body of evidence regarding the optimal performance of coronary computed tomography angiography (coronary CTA), the Society of Cardiovascular Computed Tomography Guidelines Committee has produced this update to its previously established 2009 "Guidelines for the Performance of Coronary CTA" (1). The purpose of this document is to provide standards meant to ensure reliable practice methods and quality outcomes based on the best available data in order to improve the diagnostic care of patients. Society of Cardiovascular Computed Tomography Guidelines for the Interpretation is published separately (2). The Society of Cardiovascular Computed Tomography Guidelines Committee ensures compliance with all existing standards for the declaration of conflict of interest by all authors and reviewers for the purpose of clarity and transparency. Copyright © 2016 Society of Cardiovascular Computed Tomography. All rights reserved.
Towards fully analog hardware reservoir computing for speech recognition

NASA Astrophysics Data System (ADS)

Smerieri, Anteo; Duport, François; Paquot, Yvan; Haelterman, Marc; Schrauwen, Benjamin; Massar, Serge

2012-09-01

Reservoir computing is a very recent, neural network inspired unconventional computation technique, where a recurrent nonlinear system is used in conjunction with a linear readout to perform complex calculations, leveraging its inherent internal dynamics. In this paper we show the operation of an optoelectronic reservoir computer in which both the nonlinear recurrent part and the readout layer are implemented in hardware for a speech recognition application. The performance obtained is close to that of state-of-the-art digital reservoirs, while the analog architecture opens the way to ultrafast computation.
The Use of High Performance Computing (HPC) to Strengthen the Development of Army Systems

DTIC Science & Technology

2011-11-01

accurately predicting the supersonic Magnus effect about spinning cones, ogive-cylinders, and boat-tailed afterbodies. This work led to the successful...successful computer model of the proposed product or system, one can then build prototypes on the computer and study the effects on the performance of...needed. The NRC report discusses the requirements for effective use of such computing power. One needs “models, algorithms, software, hardware
Update of aircraft profile data for the Integrated Noise Model computer program, vol. 3 : appendix B aircraft performance coefficients

DOT National Transportation Integrated Search

1992-03-01

This report provides aircraft takeoff and landing profiles, aircraft aerodynamic performance coefficients and engine performance coefficients for the aircraft data base (Database 9) in the Integrated Noise Model (INM) computer program. Flight...
Design of a Performance-Responsive Drill and Practice Algorithm for Computer-Based Training.

ERIC Educational Resources Information Center

Vazquez-Abad, Jesus; LaFleur, Marc

1990-01-01

Reviews criticisms of the use of drill and practice programs in educational computing and describes potentials for its use in instruction. Topics discussed include guidelines for developing computer-based drill and practice; scripted training courseware; item format design; item bank design; and a performance-responsive algorithm for item…
Staff | Computational Science | NREL

Science.gov Websites

develops and leads laboratory-wide efforts in high-performance computing and energy-efficient data centers Professional IV-High Perf Computing Jim.Albin@nrel.gov 303-275-4069 Ananthan, Shreyas Senior Scientist - High-Performance Algorithms and Modeling Shreyas.Ananthan@nrel.gov 303-275-4807 Bendl, Kurt IT Professional IV-High
An Intervention Study on Mental Computation for Second Graders in Taiwan

ERIC Educational Resources Information Center

Yang, Der-Ching; Huang, Ke-Lun

2014-01-01

The authors compared the mental computation performance and mental strategies used by an experimental Grade 2 class and a control Grade 2 class before and after instructional intervention. Results indicate that students in the experimental group had better performance on mental computation. The use of mental strategies (counting, separation,…

Optimising the Parallelisation of OpenFOAM Simulations

DTIC Science & Technology

2014-06-01

Shannon Keough, Maritime Division, Defence...Science and Technology Organisation, DSTO-TR-2987. Abstract: The OpenFOAM computational fluid dynamics toolbox allows parallel computation of...performance of a given high performance computing cluster with several OpenFOAM cases, running using a combination of MPI libraries and corresponding MPI
A Research and Development Strategy for High Performance Computing.

ERIC Educational Resources Information Center

Office of Science and Technology Policy, Washington, DC.

This report is the result of a systematic review of the status and directions of high performance computing and its relationship to federal research and development. Conducted by the Federal Coordinating Council for Science, Engineering, and Technology (FCCSET), the review involved a series of workshops attended by numerous computer scientists and…
Comparability of Computer-Based and Paper-Based Science Assessments

ERIC Educational Resources Information Center

Herrmann-Abell, Cari F.; Hardcastle, Joseph; DeBoer, George E.

2018-01-01

We compared students' performance on a paper-based test (PBT) and three computer-based tests (CBTs). The three computer-based tests used different test navigation and answer selection features, allowing us to examine how these features affect student performance. The study sample consisted of 9,698 fourth through twelfth grade students from across…
Models and techniques for evaluating the effectiveness of aircraft computing systems

NASA Technical Reports Server (NTRS)

Meyer, J. F.

1978-01-01

Progress in the development of system models and techniques for the formulation and evaluation of aircraft computer system effectiveness is reported. Topics covered include: analysis of functional dependence: a prototype software package, METAPHOR, developed to aid the evaluation of performability; and a comprehensive performability modeling and evaluation exercise involving the SIFT computer.
Personalized Computer-Assisted Mathematics Problem-Solving Program and Its Impact on Taiwanese Students

ERIC Educational Resources Information Center

Chen, Chiu-Jung; Liu, Pei-Lin

2007-01-01

This study evaluated the effects of a personalized computer-assisted mathematics problem-solving program on the performance and attitude of Taiwanese fourth grade students. The purpose of this study was to determine whether the personalized computer-assisted program improved student performance and attitude over the nonpersonalized program.…
DCL System Using Deep Learning Approaches for Land-Based or Ship-Based Real Time Recognition and Localization of Marine Mammals

DTIC Science & Technology

2015-09-30

Clark (2014), "Using High Performance Computing to Explore Large Complex Bioacoustic Soundscapes: Case Study for Right Whale Acoustics," Procedia..."Using High Performance Computing to Explore Large Complex Bioacoustic Soundscapes: Case Study for Right Whale Acoustics," Procedia Computer Science 20
Robust tuning of robot control systems

NASA Technical Reports Server (NTRS)

Minis, I.; Uebel, M.

1992-01-01

The computed torque control problem is examined for a robot arm with flexible, geared joint drive systems that are typical of many industrial robots. The standard computed torque algorithm is not directly applicable to this class of manipulators because of the dynamics introduced by the joint drive system. The proposed approach to computed torque control combines a computed torque algorithm with a torque controller at each joint. Three such control schemes are proposed. The first scheme uses the joint torque control system currently implemented on the robot arm and a novel form of the computed torque algorithm. The other two use the standard computed torque algorithm and a novel torque control system based on model following techniques. Standard tasks and performance indices are used to evaluate the performance of the controllers. Both numerical simulations and experiments are used in the evaluation. The study shows that all three proposed systems lead to improved tracking performance over a conventional PD controller.
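For orientation, the standard computed torque law that this abstract takes as its starting point can be written for a single rigid link. The sketch below assumes a pendulum model with mass `m`, length `l` and viscous friction `b`, all hypothetical values; it does not include the flexible, geared joint drive dynamics or the joint torque controllers that the paper adds.

```python
import math


def computed_torque(q, qd, q_des, qd_des, qdd_des, kp, kd,
                    m=1.0, l=1.0, b=0.1, g=9.81):
    """Standard computed torque law for one rigid link (pendulum model).

    Assumed dynamics: m*l^2 * qdd + b * qd + m*g*l*sin(q) = tau.
    The law cancels the modelled dynamics and imposes PD error dynamics.
    """
    inertia = m * l * l
    error = q_des - q
    error_rate = qd_des - qd
    # Commanded acceleration: feedforward term plus PD correction.
    accel_cmd = qdd_des + kd * error_rate + kp * error
    return inertia * accel_cmd + b * qd + m * g * l * math.sin(q)


if __name__ == "__main__":
    # Holding the link horizontal with zero error needs only the gravity torque.
    print(computed_torque(math.pi / 2, 0.0, math.pi / 2, 0.0, 0.0, kp=100.0, kd=20.0))
```

When the model is exact, the tracking error obeys a linear second-order equation set by `kp` and `kd`. Unmodelled drive dynamics break that cancellation, which is the difficulty the abstract describes.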
Computational Investigation of the Performance and Back-Pressure Limits of a Hypersonic Inlet

NASA Technical Reports Server (NTRS)

Smart, Michael K.; White, Jeffery A.

2002-01-01

A computational analysis of Mach 6.2 operation of a hypersonic inlet with rectangular-to-elliptical shape transition has been performed. The results of the computations are compared with experimental data for cases with and without a manually imposed back-pressure. While the no-back-pressure numerical solutions match the general trends of the data, certain features observed in the experiments did not appear in the computational solutions. The reasons for these discrepancies are discussed and possible remedies are suggested. Most importantly, however, the computational analysis increased the understanding of the consequences of certain aspects of the inlet design. This will enable the performance of future inlets of this class to be improved. Computational solutions with back-pressure under-estimated the back-pressure limit observed in the experiments, but did supply significant insight into the character of highly back-pressured inlet flows.
Unobtrusive monitoring of computer interactions to detect cognitive status in elders.

PubMed

Jimison, Holly; Pavel, Misha; McKanna, James; Pavel, Jesse

2004-09-01

The U.S. has experienced a rapid growth in the use of computers by elders. E-mail, Web browsing, and computer games are among the most common routine activities for this group of users. In this paper, we describe techniques for unobtrusively monitoring naturally occurring computer interactions to detect sustained changes in cognitive performance. Researchers have demonstrated the importance of the early detection of cognitive decline. Users over the age of 75 are at risk for medically related cognitive problems and confusion, and early detection allows for more effective clinical intervention. In this paper, we present algorithms for inferring a user's cognitive performance using monitoring data from computer games and psychomotor measurements associated with keyboard entry and mouse movement. The inferences are then used to classify significant performance changes, and additionally, to adapt computer interfaces with tailored hints and assistance when needed. These methods were tested in a group of elders in a residential facility.
Computational Aerodynamic Simulations of a Spacecraft Cabin Ventilation Fan Design

NASA Technical Reports Server (NTRS)

Tweedt, Daniel L.

2010-01-01

Quieter working environments for astronauts are needed if future long-duration space exploration missions are to be safe and productive. Ventilation and payload cooling fans are known to be dominant sources of noise, with the International Space Station being a good case in point. To address this issue cost effectively, early attention to fan design, selection, and installation has been recommended, leading to an effort by NASA to examine the potential for small-fan noise reduction by improving fan aerodynamic design. As a preliminary part of that effort, the aerodynamics of a cabin ventilation fan designed by Hamilton Sundstrand has been simulated using computational fluid dynamics codes, and the computed solutions analyzed to quantify various aspects of the fan aerodynamics and performance. Four simulations were performed at the design rotational speed: two at the design flow rate and two at off-design flow rates. Following a brief discussion of the computational codes, various aerodynamic- and performance-related quantities derived from the computed flow fields are presented along with relevant flow field details. The results show that the computed fan performance is in generally good agreement with stated design goals.
Dynamic remapping of parallel computations with varying resource demands

NASA Technical Reports Server (NTRS)

Nicol, D. M.; Saltz, J. H.

1986-01-01

A large class of computational problems is characterized by frequent synchronization and by computational requirements which change as a function of time. When such a problem must be solved on a message passing multiprocessor machine, the combination of these characteristics leads to system performance which decreases in time. Performance can be improved with periodic redistribution of computational load; however, redistribution can exact a sometimes large delay cost. We study the issue of deciding when to invoke a global load remapping mechanism. Such a decision policy must effectively weigh the costs of remapping against the performance benefits. We treat this problem by constructing two analytic models which exhibit stochastically decreasing performance. One model is quite tractable; we are able to describe the optimal remapping algorithm, and the optimal decision policy governing when to invoke that algorithm. However, computational complexity prohibits the use of the optimal remapping decision policy. We then study the performance of a general remapping policy on both analytic models. This policy attempts to minimize a statistic W(n) which measures the system degradation (including the cost of remapping) per computation step over a period of n steps. We show that as a function of time, the expected value of W(n) has at most one minimum, and that when this minimum exists it defines the optimal fixed-interval remapping policy. Our decision policy appeals to this result by remapping when it estimates that W(n) is minimized. Our performance data suggest that this policy effectively finds the natural frequency of remapping. We also use the analytic models to express the relationship between performance and remapping cost, number of processors, and the computation's stochastic activity.
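The W(n) policy described above can be sketched as follows. The abstract does not give the exact form of the statistic, so this example assumes W(n) is the remapping cost plus the degradation accumulated since the last remap, divided by n. The function names and the linear degradation trace are hypothetical.

```python
def w_statistic(n, degradation, remap_cost):
    """Assumed form of W(n): (remap cost + degradation over n steps) / n."""
    return (remap_cost + sum(degradation[:n])) / n


def remap_step(degradation, remap_cost):
    """Return the step n at which W(n) stops decreasing, or None.

    degradation[i] is the performance loss observed at step i + 1 since the
    last remap. Because the expected W(n) has at most one minimum, the first
    increase marks that minimum.
    """
    total = 0.0
    previous = None
    for n, loss in enumerate(degradation, start=1):
        total += loss
        current = (remap_cost + total) / n
        if previous is not None and current > previous:
            return n - 1
        previous = current
    return None


if __name__ == "__main__":
    # Hypothetical trace: the loss grows by one unit each step and a remap costs 50.
    trace = list(range(1, 31))
    print(remap_step(trace, remap_cost=50))
```

For this trace W(n) = 50/n + (n + 1)/2, which reaches its minimum at n = 10. A cheaper remap moves the minimum earlier, and a trace with no degradation never triggers a remap.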
Computational Predictions of the Performance of Wright 'Bent End' Propellers

NASA Technical Reports Server (NTRS)

Wang, Xiang-Yu; Ash, Robert L.; Bobbitt, Percy J.; Prior, Edwin (Technical Monitor)

2002-01-01

Computational analysis of two 1911 Wright brothers 'Bent End' wooden propeller reproductions has been performed and compared with experimental test results from the Langley Full Scale Wind Tunnel. The purpose of the analysis was to check the consistency of the experimental results and to validate the reliability of the tests. This report is one part of the project on the propeller performance research of the Wright 'Bent End' propellers, intended to document the Wright brothers' pioneering propeller design contributions. Two computer codes were used in the computational predictions. The FLO-MG Navier-Stokes code is a CFD (Computational Fluid Dynamics) code based on the Navier-Stokes equations. It is mainly used to compute the lift coefficient and the drag coefficient at specified angles of attack at different radii. Those calculated data are the intermediate results of the computation and a part of the necessary input for the Propeller Design Analysis Code (based on the Adkins and Liebeck method), which is a propeller design code used to compute the propeller thrust coefficient, the propeller power coefficient and the propeller propulsive efficiency.
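The three output quantities named at the end of this abstract have standard non-dimensional definitions, sketched below. This is not the Propeller Design Analysis Code. It assumes the usual convention with rotational speed `n` in revolutions per second, and the numbers in the example are hypothetical rather than Wright propeller data.

```python
def propeller_coefficients(thrust, power, rho, n, diameter, airspeed):
    """Standard propeller coefficients (SI units, n in revolutions per second).

    J   = V / (n D)            advance ratio
    C_T = T / (rho n^2 D^4)    thrust coefficient
    C_P = P / (rho n^3 D^5)    power coefficient
    eta = J C_T / C_P          propulsive efficiency (equals T V / P)
    """
    advance_ratio = airspeed / (n * diameter)
    c_t = thrust / (rho * n ** 2 * diameter ** 4)
    c_p = power / (rho * n ** 3 * diameter ** 5)
    efficiency = advance_ratio * c_t / c_p
    return {"J": advance_ratio, "C_T": c_t, "C_P": c_p, "eta": efficiency}


if __name__ == "__main__":
    # Hypothetical operating point chosen for round numbers.
    print(propeller_coefficients(thrust=1600.0, power=64000.0, rho=1.0,
                                 n=10.0, diameter=2.0, airspeed=20.0))
```

The efficiency reduces to thrust times airspeed divided by shaft power, which gives a quick consistency check on computed coefficients.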
A cross-sectional study of the effects of load carriage on running characteristics and tibial mechanical stress: implications for stress fracture injuries in women

DTIC Science & Technology

2017-03-23

performance computing resources made available by the US Department of Defense High Performance Computing Modernization Program at the Air Force...Department of Defense Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, United...States Army Medical Research and Materiel Command, Fort Detrick, Maryland, USA Full list of author information is available at the end of the article
Psychology of computer use: XXXII. Computer screen-savers as distractors.

PubMed

Volk, F A; Halcomb, C G

1994-12-01

The differences in performance of 16 male and 16 female undergraduates on three cognitive tasks were investigated in the presence of visual distractors (computer-generated dynamic graphic images). These tasks included skilled and unskilled proofreading and listening comprehension. The visually demanding task of proofreading (skilled and unskilled) showed no significant decreases in performance in the distractor conditions. Results showed significant decrements, however, in performance on listening comprehension in at least one of the distractor conditions.
Understanding the Performance and Potential of Cloud Computing for Scientific Applications

DOE PAGES

Sadooghi, Iman; Martin, Jesus Hernandez; Li, Tonglin; ...

2015-02-19

Commercial clouds bring a great opportunity to the scientific computing area. Scientific applications usually require significant resources; however, not all scientists have access to sufficient high-end computing systems, many of which can be found in the Top500 list. Cloud computing has gained the attention of scientists as a competitive resource to run HPC applications at a potentially lower cost. But as a different infrastructure, it is unclear whether clouds are capable of running scientific applications with a reasonable performance per money spent. This work studies the performance of public clouds and places this performance in context to price. We evaluate the raw performance of different services of the AWS cloud in terms of the basic resources, such as compute, memory, network and I/O. We also evaluate the performance of the scientific applications running in the cloud. This paper aims to assess the ability of the cloud to perform well, as well as to evaluate the cost of the cloud running scientific applications. We developed a full set of metrics and conducted a comprehensive performance evaluation over the Amazon cloud. We evaluated EC2, S3, EBS and DynamoDB among the many Amazon AWS services. We evaluated the memory sub-system performance with CacheBench, the network performance with iperf, processor and network performance with the HPL benchmark application, and shared storage with NFS and PVFS in addition to S3. We also evaluated a real scientific computing application through the Swift parallel scripting system at scale. Armed with both detailed benchmarks to gauge expected performance and a detailed monetary cost analysis, we expect this paper will be a recipe cookbook for scientists to help them decide where to deploy and run their scientific applications between public clouds, private clouds, or hybrid clouds.
V/STOLAND digital avionics system for UH-1H

NASA Technical Reports Server (NTRS)

Liden, S.

1978-01-01

A hardware and software system for the Bell UH-1H helicopter was developed that provides sophisticated navigation, guidance, control, display, and data acquisition capabilities for performing terminal area navigation, guidance and control research. Two Sperry 1819B general purpose digital computers were used. One contains the development software that performs all the specified system flight computations. The second computer is available to NASA for experimental programs that run simultaneously with the other computer programs and which may, at the push of a button, replace selected computer computations. Other features that provide research flexibility include keyboard selectable gains and parameters and software generated alphanumeric and CRT displays.
Scalable and massively parallel Monte Carlo photon transport simulations for heterogeneous computing platforms

NASA Astrophysics Data System (ADS)

Yu, Leiming; Nina-Paravecino, Fanny; Kaeli, David; Fang, Qianqian

2018-01-01

We present a highly scalable Monte Carlo (MC) three-dimensional photon transport simulation platform designed for heterogeneous computing systems. Through the development of a massively parallel MC algorithm using the Open Computing Language framework, this research extends our existing graphics processing unit (GPU)-accelerated MC technique to a highly scalable vendor-independent heterogeneous computing environment, achieving significantly improved performance and software portability. A number of parallel computing techniques are investigated to achieve portable performance over a wide range of computing hardware. Furthermore, multiple thread-level and device-level load-balancing strategies are developed to obtain efficient simulations using multiple central processing units and GPUs.
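A device-level load-balancing strategy of the kind mentioned above can be sketched as a proportional split of the photon budget. This is an illustration under stated assumptions, not the authors' algorithm: it assumes each device's throughput (photons per unit time) has already been measured, and `partition_photons` is a hypothetical helper name.

```python
def partition_photons(total, throughputs):
    """Split `total` photons among devices in proportion to throughput.

    `throughputs` holds one positive number per device (assumed to come
    from a short calibration run). Returns one integer count per device;
    the counts always sum to `total`.
    """
    if total < 0 or not throughputs or any(t <= 0 for t in throughputs):
        raise ValueError("need total >= 0 and positive throughputs")
    scale = sum(throughputs)
    # Round every share down first ...
    shares = [int(total * t / scale) for t in throughputs]
    # ... then hand the leftover photons to the fastest devices first.
    remainder = total - sum(shares)
    order = sorted(range(len(throughputs)),
                   key=lambda i: throughputs[i], reverse=True)
    for k in range(remainder):
        shares[order[k % len(order)]] += 1
    return shares


# One CPU and one GPU where the GPU is three times faster.
print(partition_photons(1000, [1, 3]))
```

A static split like this only balances well if the throughput estimates are accurate; dynamic schemes that hand out work in smaller batches trade extra scheduling overhead for robustness.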
A parallel-processing approach to computing for the geographic sciences; applications and systems enhancements

USGS Publications Warehouse

Crane, Michael; Steinwand, Dan; Beckmann, Tim; Krpan, Greg; Liu, Shu-Guang; Nichols, Erin; Haga, Jim; Maddox, Brian; Bilderback, Chris; Feller, Mark; Homer, George

2001-01-01

The overarching goal of this project is to build a spatially distributed infrastructure for information science research by forming a team of information science researchers and providing them with similar hardware and software tools to perform collaborative research. Four geographically distributed Centers of the U.S. Geological Survey (USGS) are developing their own clusters of low-cost personal computers into parallel computing environments that provide a cost-effective way for the USGS to increase participation in the high-performance computing community. Referred to as Beowulf clusters, these hybrid systems provide the robust computing power required for conducting information science research into parallel computing systems and applications.
High performance network and channel-based storage

NASA Technical Reports Server (NTRS)

Katz, Randy H.

1991-01-01

In the traditional mainframe-centered view of a computer system, storage devices are coupled to the system through complex hardware subsystems called input/output (I/O) channels. With the dramatic shift towards workstation-based computing, and its associated client/server model of computation, storage facilities are now found attached to file servers and distributed throughout the network. We discuss the underlying technology trends that are leading to high performance network-based storage, namely advances in networks, storage devices, and I/O controller and server architectures. We review several commercial systems and research prototypes that are leading to a new approach to high performance computing based on network-attached storage.

Performability evaluation of the SIFT computer

NASA Technical Reports Server (NTRS)

Meyer, J. F.; Furchtgott, D. G.; Wu, L. T.

1979-01-01

Performability modeling and evaluation techniques are applied to the SIFT computer as it might operate in the computational environment of an air transport mission. User-visible performance of the total system (SIFT plus its environment) is modeled as a random variable taking values in a set of levels of accomplishment. These levels are defined in terms of four attributes of total system behavior: safety, no change in mission profile, no operational penalties, and no economic process whose states describe the internal structure of SIFT as well as relevant conditions of the environment. Base model state trajectories are related to accomplishment levels via a capability function which is formulated in terms of a 3-level model hierarchy. Performability evaluation algorithms are then applied to determine the performability of the total system for various choices of computer and environment parameter values. Numerical results of those evaluations are presented and, in conclusion, some implications of this effort are discussed.
Performing an allreduce operation on a plurality of compute nodes of a parallel computer

DOEpatents

Faraj, Ahmad

2013-02-12

Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer, each node including at least two processing cores, that include: performing, for each node, a local reduction operation using allreduce contribution data for the cores of that node, yielding, for each node, a local reduction result for one or more representative cores for that node; establishing one or more logical rings among the nodes, each logical ring including only one of the representative cores from each node; performing, for each logical ring, a global allreduce operation using the local reduction result for the representative cores included in that logical ring, yielding a global allreduce result for each representative core included in that logical ring; and performing, for each node, a local broadcast operation using the global allreduce results for each representative core on that node.
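The three phases described above (local reduction, ring allreduce among representatives, local broadcast) can be simulated in a few lines. This is a simplified sketch, not the patented method: it assumes one representative core per node and a single logical ring, models message passing with plain Python lists, and uses the hypothetical name `hierarchical_allreduce`.

```python
from functools import reduce
import operator


def hierarchical_allreduce(contributions, op=operator.add):
    """Simulate a node-aware allreduce.

    `contributions[n][c]` is the value contributed by core `c` of node `n`.
    Returns a structure of the same shape in which every core holds the
    global reduction result.
    """
    # Phase 1: each node reduces its cores' contributions locally;
    # the result lives on that node's representative core.
    local = [reduce(op, cores) for cores in contributions]

    # Phase 2: the representatives form a logical ring. A partial result
    # travels once around the ring, accumulating each node's local value;
    # a second lap would deliver the final value to every representative.
    partial = local[0]
    for node in range(1, len(local)):
        partial = op(partial, local[node])
    global_result = [partial] * len(local)

    # Phase 3: each representative broadcasts the result to its own cores.
    return [[global_result[n]] * len(cores)
            for n, cores in enumerate(contributions)]


# Three nodes with two cores each.
print(hierarchical_allreduce([[1, 2], [3, 4], [5, 6]]))
```

Reducing within a node first means only one value per node crosses the network in the ring phase, which is the motivation for splitting the operation this way.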
Systems, methods and computer-readable media for modeling cell performance fade of rechargeable electrochemical devices

DOEpatents

Gering, Kevin L

2013-08-27

A system includes an electrochemical cell, monitoring hardware, and a computing system. The monitoring hardware periodically samples performance characteristics of the electrochemical cell. The computing system determines cell information from the performance characteristics of the electrochemical cell. The computing system also develops a mechanistic level model of the electrochemical cell to determine performance fade characteristics of the electrochemical cell and analyzes the mechanistic level model to estimate performance fade characteristics over aging of a similar electrochemical cell. The mechanistic level model uses first constant-current pulses applied to the electrochemical cell at a first aging period and at three or more current values bracketing a first exchange current density. The mechanistic level model also is based on second constant-current pulses applied to the electrochemical cell at a second aging period and at three or more current values bracketing the second exchange current density.
Numerical Stability and Control Analysis Towards Falling-Leaf Prediction Capabilities of Splitflow for Two Generic High-Performance Aircraft Models

NASA Technical Reports Server (NTRS)

Charlton, Eric F.

1998-01-01

Aerodynamic analyses are performed using the Lockheed-Martin Tactical Aircraft Systems (LMTAS) Splitflow computational fluid dynamics code to investigate the computational prediction capabilities for vortex-dominated flow fields of two different tailless aircraft models at large angles of attack and sideslip. These computations are performed with the goal of providing useful stability and control data to designers of high performance aircraft. Appropriate metrics for accuracy, time, and ease of use are determined in consultations with both the LMTAS Advanced Design and Stability and Control groups. Results are obtained and compared to wind-tunnel data for all six components of forces and moments. Moment data is combined to form a "falling leaf" stability analysis. Finally, a handful of viscous simulations were also performed to further investigate nonlinearities and possible viscous effects in the differences between the accumulated inviscid computational and experimental data.
An Approach to Experimental Design for the Computer Analysis of Complex Phenomenon

NASA Technical Reports Server (NTRS)

Rutherford, Brian

2000-01-01

The ability to make credible system assessments, predictions and design decisions related to engineered systems and other complex phenomena is key to a successful program for many large-scale investigations in government and industry. Recently, many of these large-scale analyses have turned to computational simulation to provide much of the required information. Addressing specific goals in the computer analysis of these complex phenomena is often accomplished through the use of performance measures that are based on system response models. The response models are constructed using computer-generated responses together with physical test results where possible. They are often based on probabilistically defined inputs and generally require estimation of a set of response modeling parameters. As a consequence, the performance measures are themselves distributed quantities reflecting these variabilities and uncertainties. Uncertainty in the values of the performance measures leads to uncertainties in predicted performance and can cloud the decisions required of the analysis. A specific goal of this research has been to develop methodology that will reduce this uncertainty in an analysis environment where limited resources and system complexity together restrict the number of simulations that can be performed. An approach has been developed that is based on evaluation of the potential information provided for each "intelligently selected" candidate set of computer runs. Each candidate is evaluated by partitioning the performance measure uncertainty into two components - one component that could be explained through the additional computational simulation runs and a second that would remain uncertain. The portion explained is estimated using a probabilistic evaluation of likely results for the additional computational analyses based on what is currently known about the system. The set of runs indicating the largest potential reduction in uncertainty is then selected and the computational simulations are performed. Examples are provided to demonstrate this approach on small scale problems. These examples give encouraging results. Directions for further research are indicated.
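One standard way to formalize a split of uncertainty into an explainable part and a residual part is the law of total variance. The abstract does not say which decomposition the author uses, so the following is an illustration only; the symbols $M$ (the performance measure) and $R$ (the as-yet-unknown results of a candidate set of runs) are notation introduced here.

```latex
\operatorname{Var}(M)
  = \underbrace{\operatorname{Var}\!\bigl(\mathbb{E}[M \mid R]\bigr)}_{\text{explained by the additional runs}}
  + \underbrace{\mathbb{E}\!\bigl[\operatorname{Var}(M \mid R)\bigr]}_{\text{remains uncertain}}
```

Under this reading, the candidate set of runs to prefer is the one with the largest first term, estimated from what is currently known about the system.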
Blind Quantum Signature with Blind Quantum Computation

NASA Astrophysics Data System (ADS)

Li, Wei; Shi, Ronghua; Guo, Ying

2017-04-01

Blind quantum computation allows a client without quantum abilities to interact with a quantum server to perform an unconditionally secure computing protocol while protecting the client's privacy. Motivated by the confidentiality of blind quantum computation, a blind quantum signature scheme is designed with a laconic structure. Unlike traditional signature schemes, the signing and verifying operations are performed through measurement-based quantum computation. Inputs of blind quantum computation are securely controlled with multi-qubit entangled states. The unique signature of the transmitted message is generated by the signer without leaking information in imperfect channels. The receiver, in turn, can verify the validity of the signature using the quantum matching algorithm. The security is guaranteed by the entanglement of the quantum system for blind quantum computation. The scheme provides a potential practical application for e-commerce in cloud computing and first-generation quantum computation.
Desktop Computing Integration Project

NASA Technical Reports Server (NTRS)

Tureman, Robert L., Jr.

1992-01-01

The Desktop Computing Integration Project for the Human Resources Management Division (HRMD) of LaRC was designed to help division personnel use personal computing resources to perform job tasks. The three goals of the project were to involve HRMD personnel in desktop computing, link mainframe data to desktop capabilities, and estimate training needs for the division. The project resulted in increased usage of personal computers by Awards specialists, an increased awareness of LaRC resources to help perform tasks, and personal computer output that was used in presentation of information to center personnel. In addition, the necessary skills for HRMD personal computer users were identified. The Awards Office was chosen for the project because of the consistency of their data requests and the desire of employees in that area to use the personal computer.
Systems, methods and computer-readable media to model kinetic performance of rechargeable electrochemical devices

DOEpatents

Gering, Kevin L.

2013-01-01

A system includes an electrochemical cell, monitoring hardware, and a computing system. The monitoring hardware samples performance characteristics of the electrochemical cell. The computing system determines cell information from the performance characteristics. The computing system also analyzes the cell information of the electrochemical cell with a Butler-Volmer (BV) expression modified to determine exchange current density of the electrochemical cell by including kinetic performance information related to pulse-time dependence, electrode surface availability, or a combination thereof. A set of sigmoid-based expressions may be included with the modified-BV expression to determine kinetic performance as a function of pulse time. The determined exchange current density may be used with the modified-BV expression, with or without the sigmoid expressions, to analyze other characteristics of the electrochemical cell. Model parameters can be defined in terms of cell aging, making the overall kinetics model amenable to predictive estimates of cell kinetic performance along the aging timeline.
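For orientation, the textbook (unmodified) Butler-Volmer relation between current density and overpotential is shown below. The patent's modifications for pulse-time dependence and electrode surface availability are not reproduced here, since the abstract does not give them.

```latex
i = i_0 \left[ \exp\!\left( \frac{\alpha_a F \eta}{R T} \right)
             - \exp\!\left( -\frac{\alpha_c F \eta}{R T} \right) \right]
```

Here $i$ is the current density, $i_0$ the exchange current density, $\eta$ the overpotential, $\alpha_a$ and $\alpha_c$ the anodic and cathodic charge-transfer coefficients, $F$ the Faraday constant, $R$ the gas constant, and $T$ the absolute temperature.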
Liquid rocket performance computer model with distributed energy release

NASA Technical Reports Server (NTRS)

Combs, L. P.

1972-01-01

Development of a computer program for analyzing the effects of bipropellant spray combustion processes on liquid rocket performance is described and discussed. The distributed energy release (DER) computer program was designed to become part of the JANNAF liquid rocket performance evaluation methodology and to account for performance losses associated with the propellant combustion processes, e.g., incomplete spray gasification, imperfect mixing between sprays and their reacting vapors, residual mixture ratio striations in the flow, and two-phase flow effects. The DER computer program begins by initializing the combustion field at the injection end of a conventional liquid rocket engine, based on injector and chamber design detail, and on propellant and combustion gas properties. It analyzes bipropellant combustion, proceeding stepwise down the chamber from those initial conditions through the nozzle throat.
Finite difference methods for reducing numerical diffusion in TEACH-type calculations. [Teaching Elliptic Axisymmetric Characteristics Heuristically

NASA Technical Reports Server (NTRS)

Syed, S. A.; Chiappetta, L. M.

1985-01-01

A methodological evaluation of two finite-differencing schemes for computer-aided gas turbine design is presented. The two computational schemes are a Bounded Skew Upwind Differencing Scheme (BSUDS) and a Quadratic Upwind Differencing Scheme (QUDS). In the evaluation, the derivations of the schemes were incorporated into two-dimensional and three-dimensional versions of the Teaching Axisymmetric Characteristics Heuristically (TEACH) computer code. Assessments were made according to performance criteria for the solution of problems of turbulent, laminar, and coannular turbulent flow. The specific performance criteria used in the evaluation were simplicity, accuracy, and computational economy. It is found that the BSUDS scheme performed better with respect to the criteria than the QUDS. Some of the reasons for the more successful performance of BSUDS are discussed.
High Performance Computing Software Applications for Space Situational Awareness

NASA Astrophysics Data System (ADS)

Giuliano, C.; Schumacher, P.; Matson, C.; Chun, F.; Duncan, B.; Borelli, K.; Desonia, R.; Gusciora, G.; Roe, K.

The High Performance Computing Software Applications Institute for Space Situational Awareness (HSAI-SSA) has completed its first full year of applications development. The emphasis of our work in this first year was on improving space surveillance sensor models and image enhancement software. These applications are the Space Surveillance Network Analysis Model (SSNAM), the Air Force Space Fence simulation (SimFence), and the physically constrained iterative de-convolution (PCID) image enhancement software tool. Specifically, we have demonstrated order-of-magnitude speed-ups in those codes running on the latest Cray XD-1 Linux supercomputer (Hoku) at the Maui High Performance Computing Center. The software application improvements that HSAI-SSA has made have had a significant impact on the warfighter and have fundamentally changed the role of high performance computing in SSA.
Workload Characterization of CFD Applications Using Partial Differential Equation Solvers

NASA Technical Reports Server (NTRS)

Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)

1998-01-01

Workload characterization is used for modeling and evaluating computing systems at different levels of detail. We present workload characterization for a class of Computational Fluid Dynamics (CFD) applications that solve Partial Differential Equations (PDEs). This workload characterization focuses on three high performance computing platforms: SGI Origin2000, IBM SP-2, and a cluster of Intel Pentium Pro-based PCs. We execute extensive measurement-based experiments on these platforms to gather statistics of system resource usage, which results in workload characterization. Our workload characterization approach yields a coarse-grain resource utilization behavior that is being applied for performance modeling and evaluation of distributed high performance metacomputing systems. In addition, this study enhances our understanding of interactions between PDE solver workloads and high performance computing platforms and is useful for tuning these applications.
Computer program for design and performance analysis of navigation-aid power systems. Program documentation. Volume 1: Software requirements document

NASA Technical Reports Server (NTRS)

Goltz, G.; Kaiser, L. M.; Weiner, H.

1977-01-01

A computer program has been developed for designing and analyzing the performance of solar array/battery power systems for the U.S. Coast Guard Navigational Aids. This program is called the Design Synthesis/Performance Analysis (DSPA) Computer Program. The basic function of the Design Synthesis portion of the DSPA program is to evaluate functional and economic criteria to provide specifications for viable solar array/battery power systems. The basic function of the Performance Analysis portion of the DSPA program is to simulate the operation of solar array/battery power systems under specific loads and environmental conditions. This document establishes the software requirements for the DSPA computer program, discusses the processing that occurs within the program, and defines the necessary interfaces for operation.
Generic Divide and Conquer Internet-Based Computing

NASA Technical Reports Server (NTRS)

Radenski, Atanas; Follen, Gregory J. (Technical Monitor)

2001-01-01

The rapid growth of internet-based applications and the proliferation of networking technologies have been transforming traditional commercial application areas as well as computer and computational sciences and engineering. This growth stimulates the exploration of new, internet-oriented software technologies that can open new research and application opportunities not only for the commercial world, but also for the scientific and high-performance computing applications community. The general goal of this research project is to contribute to better understanding of the transition to internet-based high-performance computing and to develop solutions for some of the difficulties of this transition. More specifically, our goal is to design an architecture for generic divide and conquer internet-based computing, to develop a portable implementation of this architecture, to create an example library of high-performance divide-and-conquer computing agents that run on top of this architecture, and to evaluate the performance of these agents. We have been designing an architecture that incorporates a master task-pool server and utilizes satellite computational servers that operate on the Internet in a dynamically changing large configuration of lower-end nodes provided by volunteer contributors. Our designed architecture is intended to be complementary to and accessible from computational grids such as Globus, Legion, and Condor. Grids provide remote access to existing high-end computing resources; in contrast, our goal is to utilize idle processor time of lower-end internet nodes. Our project is focused on a generic divide-and-conquer paradigm and its applications that operate on a loose and ever changing pool of lower-end internet nodes.
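The master task-pool pattern described above can be sketched on a single machine. This is a toy stand-in, not the project's architecture: a local thread pool plays the role of the satellite computational servers, summing a list is the example problem, and the names `divide`, `solve_leaf`, and `divide_and_conquer_sum` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def divide(data, threshold):
    """Recursively split `data` until every piece has at most `threshold` items."""
    if len(data) <= threshold:
        return [data]
    mid = len(data) // 2
    return divide(data[:mid], threshold) + divide(data[mid:], threshold)


def solve_leaf(chunk):
    """Work done by one worker on one small subproblem."""
    return sum(chunk)


def divide_and_conquer_sum(data, threshold=4, workers=3):
    # The "master" builds the pool of independent tasks ...
    tasks = divide(data, threshold)
    # ... the workers (stand-ins for satellite servers) pull and solve them ...
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(solve_leaf, tasks))
    # ... and the master combines the partial results.
    return sum(partials)


print(divide_and_conquer_sum(list(range(100))))
```

A real volunteer-node deployment would also need to handle workers that disappear mid-task, for example by re-queuing tasks that time out; that is omitted here.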
Development and Validation of a Fast, Accurate and Cost-Effective Aeroservoelastic Method on Advanced Parallel Computing Systems

NASA Technical Reports Server (NTRS)

Goodwin, Sabine A.; Raj, P.

1999-01-01

Progress to date towards the development and validation of a fast, accurate and cost-effective aeroelastic method for advanced parallel computing platforms such as the IBM SP2 and the SGI Origin 2000 is presented in this paper. The ENSAERO code, developed at the NASA-Ames Research Center, has been selected for this effort. The code allows for the computation of aeroelastic responses by simultaneously integrating the Euler or Navier-Stokes equations and the modal structural equations of motion. To assess the computational performance and accuracy of the ENSAERO code, this paper reports the results of the Navier-Stokes simulations of the transonic flow over a flexible aeroelastic wing body configuration. In addition, a forced harmonic oscillation analysis in the frequency domain and an analysis in the time domain are done on a wing undergoing a rigid pitch and plunge motion. Finally, to demonstrate the ENSAERO flutter-analysis capability, aeroelastic Euler and Navier-Stokes computations on an L-1011 wind tunnel model including pylon, nacelle and empennage are underway. All computational solutions are compared with experimental data to assess the level of accuracy of ENSAERO. As the computations described above are performed, a meticulous log of computational performance in terms of wall clock time, execution speed, memory and disk storage is kept. Code scalability is also demonstrated by studying the impact of varying the number of processors on computational performance on the IBM SP2 and the Origin 2000 systems.
Analytical modeling and feasibility study of a multi-GPU cloud-based server (MGCS) framework for non-voxel-based dose calculations.

PubMed

Neylon, J; Min, Y; Kupelian, P; Low, D A; Santhanam, A

2017-04-01

In this paper, a multi-GPU cloud-based server (MGCS) framework is presented for dose calculations, exploring the feasibility of remote computing power for parallelization and acceleration of computationally and time intensive radiotherapy tasks in moving toward online adaptive therapies. An analytical model was developed to estimate theoretical MGCS performance acceleration and intelligently determine workload distribution. Numerical studies were performed with a computing setup of 14 GPUs distributed over 4 servers interconnected by a 1 gigabit per second (Gbps) network. Inter-process communication methods were optimized to facilitate resource distribution and minimize data transfers over the server interconnect. The analytically predicted computation time matched experimental observations to within 1-5%. MGCS performance approached a theoretical limit of acceleration proportional to the number of GPUs utilized when computational tasks far outweighed memory operations. The MGCS implementation reproduced ground-truth dose computations with negligible differences by distributing the work among several processes and implementing optimization strategies. The results showed that a cloud-based computation engine was a feasible solution for enabling clinics to make use of fast dose calculations for advanced treatment planning and adaptive radiotherapy. The cloud-based system was able to exceed the performance of a local machine even for optimized calculations, and provided significant acceleration for computationally intensive tasks. Such a framework can provide access to advanced technology and computational methods to many clinics, providing an avenue for standardization across institutions without the requirements of purchasing, maintaining, and continually updating hardware.
Systems cost/performance analysis (study 2.3). Volume 3: Programmer's manual and user's guide. [for unmanned spacecraft

NASA Technical Reports Server (NTRS)

Janz, R. F.

1974-01-01

The systems cost/performance model was implemented as a digital computer program to perform initial program planning, cost/performance tradeoffs, and sensitivity analyses. The computer is described along with the operating environment in which the program was written and checked, the program specifications such as discussions of logic and computational flow, the different subsystem models involved in the design of the spacecraft, and routines involved in the nondesign area such as costing and scheduling of the design. Preliminary results for the DSCS-II design are also included.
Study to design and develop remote manipulator system. [computer simulation of human performance

NASA Technical Reports Server (NTRS)

Hill, J. W.; Mcgovern, D. E.; Sword, A. J.

1974-01-01

Modeling of human performance in remote manipulation tasks is reported by automated procedures using computers to analyze and count motions during a manipulation task. Performance is monitored by an on-line computer capable of measuring the joint angles of both master and slave and in some cases the trajectory and velocity of the hand itself. In this way the operator's strategies with different transmission delays, displays, tasks, and manipulators can be analyzed in detail for comparison. Some progress is described in obtaining a set of standard tasks and difficulty measures for evaluating manipulator performance.
Human performance models for computer-aided engineering

NASA Technical Reports Server (NTRS)

Elkind, Jerome I. (Editor); Card, Stuart K. (Editor); Hochberg, Julian (Editor); Huey, Beverly Messick (Editor)

1989-01-01

This report discusses a topic important to the field of computational human factors: models of human performance and their use in computer-based engineering facilities for the design of complex systems. It focuses on a particular human factors design problem -- the design of cockpit systems for advanced helicopters -- and on a particular aspect of human performance -- vision and related cognitive functions. By focusing in this way, the authors were able to address the selected topics in some depth and develop findings and recommendations that they believe have application to many other aspects of human performance and to other design domains.
Interfacing Computer Aided Parallelization and Performance Analysis

NASA Technical Reports Server (NTRS)

Jost, Gabriele; Jin, Haoqiang; Labarta, Jesus; Gimenez, Judit; Biegel, Bryan A. (Technical Monitor)

2003-01-01

When porting sequential applications to parallel computer architectures, the program developer will typically go through several cycles of source code optimization and performance analysis. We have started a project to develop an environment where the user can jointly navigate through program structure and performance data information in order to make efficient optimization decisions. In a prototype implementation we have interfaced the CAPO computer aided parallelization tool with the Paraver performance analysis tool. We describe both tools and their interface and give an example of how the interface helps within the program development cycle of a benchmark code.

Conversion of cardiac performance data in analog form for digital computer entry

NASA Technical Reports Server (NTRS)

Miller, R. L.

1972-01-01

A system is presented which will reduce analog cardiac performance data and convert the results to digital form for direct entry into a commercial time-shared computer. Circuits are discussed which perform the measurement and digital conversion of instantaneous systolic and diastolic parameters from the analog blood pressure waveform. Digital averaging over a selected number of heart cycles is performed on these measurements, as well as those of flow and heart rate. The determination of average cardiac output and peripheral resistance, including trends, is the end result after processing by digital computer.
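The digital side of such a pipeline (averaging per-beat measurements, then deriving peripheral resistance) can be sketched as follows. The formulas are common physiological approximations assumed here for illustration, not taken from the report: mean arterial pressure is estimated as diastolic plus one third of the pulse pressure, and resistance as mean pressure divided by cardiac output with venous pressure neglected. All function names are hypothetical.

```python
def average_over_cycles(beats):
    """Average (systolic, diastolic) pressure pairs over a set of heart cycles."""
    n = len(beats)
    if n == 0:
        raise ValueError("need at least one heart cycle")
    systolic = sum(s for s, _ in beats) / n
    diastolic = sum(d for _, d in beats) / n
    return systolic, diastolic


def mean_arterial_pressure(systolic, diastolic):
    """Assumed approximation: diastolic plus one third of the pulse pressure (mmHg)."""
    return diastolic + (systolic - diastolic) / 3.0


def peripheral_resistance(map_mmhg, cardiac_output_l_min):
    """Assumed approximation: MAP / cardiac output, in mmHg*min/L."""
    return map_mmhg / cardiac_output_l_min


# Illustrative values only: two beats, cardiac output of 5 L/min.
sys_avg, dia_avg = average_over_cycles([(128.0, 68.0), (132.0, 72.0)])
map_est = mean_arterial_pressure(sys_avg, dia_avg)
print(map_est, peripheral_resistance(map_est, 5.0))
```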
VASCOMP 2. The V/STOL aircraft sizing and performance computer program. Volume 6: User's manual, revision 3

NASA Technical Reports Server (NTRS)

Schoen, A. H.; Rosenstein, H.; Stanzione, K.; Wisniewski, J. S.

1980-01-01

This report describes the use of the V/STOL Aircraft Sizing and Performance Computer Program (VASCOMP II). The program is useful in performing aircraft parametric studies in a quick and cost-efficient manner. Problem formulation and data development were performed by the Boeing Vertol Company and reflect the present preliminary design technology. The computer program, written in FORTRAN IV, has a broad range of input parameters to enable investigation of a wide variety of aircraft. User-oriented features of the program include minimized input requirements, diagnostic capabilities, and various options for program flexibility.
The Effect of Instructional Method on Cardiopulmonary Resuscitation Skill Performance: A Comparison Between Instructor-Led Basic Life Support and Computer-Based Basic Life Support With Voice-Activated Manikin.

PubMed

Wilson-Sands, Cathy; Brahn, Pamela; Graves, Kristal

2015-01-01

Validating participants' ability to correctly perform cardiopulmonary resuscitation (CPR) skills during basic life support courses can be a challenge for nursing professional development specialists. This study compares two methods of basic life support training, instructor-led and computer-based learning with voice-activated manikins, to identify if one method is more effective for performance of CPR skills. The findings suggest that a computer-based learning course with voice-activated manikins is a more effective method of training for improved CPR performance.
Computer-Mediated Training Tools to Enhance Joint Task Force Cognitive Leadership Skills

DTIC Science & Technology

2007-04-01

University); and 5d. TASK NUMBER Barclay Lewis (American Systems) 5e. WORK UNIT NUMBER 7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) 8. PERFORMING ...ple Gaming Platform Decisive Action for Training ... 43 6. Performance Metrics...Figure 15: Automated Performance Measurement System ... 48 iv COMPUTER-MEDIATED TRAINING
Toward real-time Monte Carlo simulation using a commercial cloud computing infrastructure.

PubMed

Wang, Henry; Ma, Yunzhi; Pratx, Guillem; Xing, Lei

2011-09-07

Monte Carlo (MC) methods are the gold standard for modeling photon and electron transport in a heterogeneous medium; however, their computational cost prohibits their routine use in the clinic. Cloud computing, wherein computing resources are allocated on-demand from a third party, is a new approach for high performance computing and is implemented to perform ultra-fast MC calculation in radiation therapy. We deployed the EGS5 MC package in a commercial cloud environment. Launched from a single local computer with Internet access, a Python script allocates a remote virtual cluster. A handshaking protocol designates master and worker nodes. The EGS5 binaries and the simulation data are initially loaded onto the master node. The simulation is then distributed among independent worker nodes via the message passing interface, and the results aggregated on the local computer for display and data analysis. The described approach is evaluated for pencil beams and broad beams of high-energy electrons and photons. The output of cloud-based MC simulation is identical to that produced by single-threaded implementation. For 1 million electrons, a simulation that takes 2.58 h on a local computer can be executed in 3.3 min on the cloud with 100 nodes, a 47× speed-up. Simulation time scales inversely with the number of parallel nodes. The parallelization overhead is also negligible for large simulations. Cloud computing represents one of the most important recent advances in supercomputing technology and provides a promising platform for substantially improved MC simulation. In addition to the significant speed up, cloud computing builds a layer of abstraction for high performance parallel computing, which may change the way dose calculations are performed and radiation treatment plans are completed.
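The speed-up arithmetic quoted in this abstract can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the inverse-scaling helper assumes the idealized relation T(n) = T(1)/n that the abstract describes for large simulations, with no overhead term.

```python
def speedup(serial_seconds, parallel_seconds):
    """Ratio of the single-machine runtime to the parallel runtime."""
    return serial_seconds / parallel_seconds


def ideal_runtime(serial_seconds, n_nodes):
    """Runtime under ideal inverse scaling, T(n) = T(1) / n (assumed, no overhead)."""
    return serial_seconds / n_nodes


# Figures quoted in the abstract: 2.58 h locally, 3.3 min on 100 cloud nodes.
local_seconds = 2.58 * 3600
cloud_seconds = 3.3 * 60
observed = speedup(local_seconds, cloud_seconds)
print(f"observed speed-up: {observed:.1f}x")
```

The ratio comes out just under 47, matching the reported 47× figure. The local computer and the cloud nodes are different machines, so this ratio should not be read as a parallel efficiency.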
48 CFR 227.7202-4 - Contract clause.

Code of Federal Regulations, 2012 CFR

2012-10-01

... Software and Computer Software Documentation 227.7202-4 Contract clause. A specific contract clause governing the Government's rights in commercial computer software or commercial computer software..., release, perform, display, or disclose computer software or computer software documentation shall be...
48 CFR 227.7202-4 - Contract clause.

Code of Federal Regulations, 2011 CFR

2011-10-01

... Software and Computer Software Documentation 227.7202-4 Contract clause. A specific contract clause governing the Government's rights in commercial computer software or commercial computer software..., release, perform, display, or disclose computer software or computer software documentation shall be...
48 CFR 227.7202-4 - Contract clause.

Code of Federal Regulations, 2014 CFR

2014-10-01

... Software and Computer Software Documentation 227.7202-4 Contract clause. A specific contract clause governing the Government's rights in commercial computer software or commercial computer software..., release, perform, display, or disclose computer software or computer software documentation shall be...
48 CFR 227.7202-4 - Contract clause.

Code of Federal Regulations, 2013 CFR

2013-10-01

... Software and Computer Software Documentation 227.7202-4 Contract clause. A specific contract clause governing the Government's rights in commercial computer software or commercial computer software..., release, perform, display, or disclose computer software or computer software documentation shall be...
48 CFR 227.7202-4 - Contract clause.

Code of Federal Regulations, 2010 CFR

2010-10-01

... Software and Computer Software Documentation 227.7202-4 Contract clause. A specific contract clause governing the Government's rights in commercial computer software or commercial computer software..., release, perform, display, or disclose computer software or computer software documentation shall be...
Unified, Cross-Platform, Open-Source Library Package for High-Performance Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kozacik, Stephen

Compute power is continually increasing, but this increased performance is largely found in sophisticated computing devices and supercomputer resources that are difficult to use, resulting in under-utilization. We developed a unified set of programming tools that will allow users to take full advantage of the new technology by allowing them to work at a level abstracted away from the platform specifics, encouraging the use of modern computing systems, including government-funded supercomputer facilities.
Coal-seismic, desktop computer programs in BASIC; Part 5, Perform X-square T-square analysis and plot normal moveout lines on seismogram overlay

USGS Publications Warehouse

Hasbrouck, W.P.

1983-01-01

Processing of data taken with the U.S. Geological Survey's coal-seismic system is done with a desktop, stand-alone computer. Programs for this computer are written in the extended BASIC language used by the Tektronix 4051 Graphic System. This report presents computer programs to perform X-square/T-square analyses and to plot normal moveout lines on a seismogram overlay.
Final Report Extreme Computing and U.S. Competitiveness DOE Award. DE-FG02-11ER26087/DE-SC0008764

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mustain, Christopher J.

The Council has acted on each of the grant deliverables during the funding period. The deliverables are: (1) convening the Council’s High Performance Computing Advisory Committee (HPCAC) on a bi-annual basis; (2) broadening public awareness of high performance computing (HPC) and exascale developments; (3) assessing the industrial applications of extreme computing; and (4) establishing a policy and business case for an exascale economy.
Computer systems performance measurement techniques.

DOT National Transportation Integrated Search

1971-06-01

Computer system performance measurement techniques, tools, and approaches are presented as a foundation for future recommendations regarding the instrumentation of the ARTS ATC data processing subsystem for purposes of measurement and evaluation.
Analytical Cost Metrics : Days of Future Past

DOE Office of Scientific and Technical Information (OSTI.GOV)

Prajapati, Nirmal; Rajopadhye, Sanjay; Djidjev, Hristo Nikolov

As we move towards the exascale era, the new architectures must be capable of running the massive computational problems efficiently. Scientists and researchers are continuously investing in tuning the performance of extreme-scale computational problems. These problems arise in almost all areas of computing, ranging from big data analytics, artificial intelligence, search, machine learning, virtual/augmented reality, computer vision, image/signal processing to computational science and bioinformatics. With Moore’s law driving the evolution of hardware platforms towards exascale, the dominant performance metric (time efficiency) has now expanded to also incorporate power/energy efficiency. Therefore the major challenge that we face in computing systems research is: “how to solve massive-scale computational problems in the most time/power/energy efficient manner?”
Heterogeneous Distributed Computing for Computational Aerosciences

NASA Technical Reports Server (NTRS)

Sunderam, Vaidy S.

1998-01-01

The research supported under this award focuses on heterogeneous distributed computing for high-performance applications, with particular emphasis on computational aerosciences. The overall goal of this project was to investigate issues in, and develop solutions to, efficient execution of computational aeroscience codes in heterogeneous concurrent computing environments. In particular, we worked in the context of the PVM[1] system and, subsequent to detailed conversion efforts and performance benchmarking, devised novel techniques to increase the efficacy of heterogeneous networked environments for computational aerosciences. Our work has been based upon the NAS Parallel Benchmark suite, but has also recently expanded in scope to include the NAS I/O benchmarks as specified in the NHT-1 document. In this report we summarize our research accomplishments under the auspices of the grant.
Modeling Improvements and Users Manual for Axial-flow Turbine Off-design Computer Code AXOD

NASA Technical Reports Server (NTRS)

Glassman, Arthur J.

1994-01-01

An axial-flow turbine off-design performance computer code used for preliminary studies of gas turbine systems was modified and calibrated based on the experimental performance of large aircraft-type turbines. The flow- and loss-model modifications and calibrations are presented in this report. Comparisons are made between computed performances and experimental data for seven turbines over wide ranges of speed and pressure ratio. This report also serves as the users manual for the revised code, which is named AXOD.
Design geometry and design/off-design performance computer codes for compressors and turbines

NASA Technical Reports Server (NTRS)

Glassman, Arthur J.

1995-01-01

This report summarizes some NASA Lewis (i.e., government owned) computer codes capable of being used for airbreathing propulsion system studies to determine the design geometry and to predict the design/off-design performance of compressors and turbines. These are not CFD codes; velocity-diagram energy and continuity computations are performed fore and aft of the blade rows using meanline, spanline, or streamline analyses. Losses are provided by empirical methods. Both axial-flow and radial-flow configurations are included.
A High Performance Computing Approach to the Simulation of Fluid Solid Interaction Problems with Rigid and Flexible Components (Open Access Publisher’s Version)

DTIC Science & Technology

2014-08-01

performance computing, smoothed particle hydrodynamics, rigid body dynamics, flexible body dynamics ARMAN PAZOUKI ∗, RADU SERBAN ∗, DAN NEGRUT ∗ A...HIGH PERFORMANCE COMPUTING APPROACH TO THE SIMULATION OF FLUID-SOLID INTERACTION PROBLEMS WITH RIGID AND FLEXIBLE COMPONENTS This work outlines a unified...are implemented to model rigid and flexible multibody dynamics. The two-way coupling of the fluid and solid phases is supported through use of
A Comparative Study of Multi-material Data Structures for Computational Physics Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Garimella, Rao Veerabhadra; Robey, Robert W.

The data structures used to represent the multi-material state of a computational physics application can have a drastic impact on the performance of the application. We look at efficient data structures for sparse applications where there may be many materials, but only one or few in most computational cells. We develop simple performance models for use in selecting possible data structures and programming patterns. We verify the analytic models of performance through a small test program of the representative cases.
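To make the sparsity argument concrete, the sketch below contrasts a dense cell-by-material array with a compact, cell-centric layout that stores only the materials actually present in each cell. It is a generic CSR-style illustration, not the data structures evaluated in the report; the class, method names, and the tiny example mesh are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CompactCellCentric:
    """CSR-like layout: each cell owns a slice of (material id, volume fraction) pairs."""
    offsets: list = field(default_factory=lambda: [0])
    material_ids: list = field(default_factory=list)
    volume_fractions: list = field(default_factory=list)

    def add_cell(self, materials):
        """Append one cell; `materials` maps material id -> volume fraction."""
        for mat_id, fraction in sorted(materials.items()):
            self.material_ids.append(mat_id)
            self.volume_fractions.append(fraction)
        self.offsets.append(len(self.material_ids))

    def materials_in_cell(self, cell):
        """Return the (material id, volume fraction) pairs stored for one cell."""
        start, end = self.offsets[cell], self.offsets[cell + 1]
        return list(zip(self.material_ids[start:end], self.volume_fractions[start:end]))

    def stored_entries(self):
        return len(self.material_ids)


def dense_entries(n_cells, n_materials):
    """Entries needed by a full cell x material array."""
    return n_cells * n_materials


# Hypothetical mesh: 4 cells, 5 possible materials, only one mixed cell.
mesh = CompactCellCentric()
for cell_materials in ({0: 1.0}, {0: 0.6, 3: 0.4}, {3: 1.0}, {4: 1.0}):
    mesh.add_cell(cell_materials)

print(mesh.stored_entries(), "compact entries vs", dense_entries(4, 5), "dense entries")
```

In this toy case the compact layout stores 5 entries where the dense array would hold 20. The price is indirect access through the offsets array, which is the kind of trade-off a performance model has to weigh.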

A Perspective on Computational Human Performance Models as Design Tools

NASA Technical Reports Server (NTRS)

Jones, Patricia M.

2010-01-01

The design of interactive systems, including levels of automation, displays, and controls, is usually based on design guidelines and iterative empirical prototyping. A complementary approach is to use computational human performance models to evaluate designs. An integrated strategy of model-based and empirical test and evaluation activities is particularly attractive as a methodology for verification and validation of human-rated systems for commercial space. This talk will review several computational human performance modeling approaches and their applicability to design of display and control requirements.
Effectiveness of an Electronic Performance Support System on Computer Ethics and Ethical Decision-Making Education

ERIC Educational Resources Information Center

Kert, Serhat Bahadir; Uz, Cigdem; Gecu, Zeynep

2014-01-01

This study examined the effectiveness of an electronic performance support system (EPSS) on computer ethics education and the ethical decision-making processes. There were five different phases to this ten month study: (1) Writing computer ethics scenarios, (2) Designing a decision-making framework (3) Developing EPSS software (4) Using EPSS in a…
Computer-Assisted Performance Evaluation for Navy Anti-Air Warfare Training: Concepts, Methods, and Constraints.

ERIC Educational Resources Information Center

Chesler, David J.

An improved general methodological approach for the development of computer-assisted evaluation of trainee performance in the computer-based simulation environment is formulated in this report. The report focuses on the Tactical Advanced Combat Direction and Electronic Warfare system (TACDEW) at the Fleet Anti-Air Warfare Training Center at San…
The Effect of Using Computer Skills on Teachers' Perceived Self-Efficacy Beliefs towards Technology Integration, Attitudes and Performance

ERIC Educational Resources Information Center

EL-Daou, Badrie Mohammad Nour

2016-01-01

The current study analyzes the relationship between the apparent teacher's self-efficacy and attitudes towards integrating technology into classroom teaching, self-evaluation reports and computer performance results. Pre-post measurement of the Computer Technology Integration Survey (CTIS) (Wang et al, 2004) was used to determine the confidence…
The Effect of Using in Computer Skills on Teachers' Perceived Self-Efficacy Beliefs towards Technology Integration, Attitudes and Performance

ERIC Educational Resources Information Center

EL-Daou, Badrie Mohammad Nour

2016-01-01

The current study analyzes the relationship between the apparent teacher's self-efficacy and attitudes towards integrating technology into classroom teaching, self-evaluation reports and computer performance results. Pre-post measurement of the Computer Technology Integration Survey (CTIS) (Wang et al, 2004) was used to determine the confidence…
The Impact of a Professional Learning Intervention Designed to Enhance Year Six Students' Computational Estimation Performance

ERIC Educational Resources Information Center

Mildenhall, Paula; Hackling, Mark

2012-01-01

This paper reports on the analysis of a study of a professional learning intervention focussing on computational estimation. Using a multiple case study design it was possible to describe the impact of the intervention on students' beliefs and computational estimation performance. The study revealed some noteworthy impacts on computational…
Petascale supercomputing to accelerate the design of high-temperature alloys

DOE PAGES

Shin, Dongwon; Lee, Sangkeun; Shyam, Amit; ...

2017-10-25

Recent progress in high-performance computing and data informatics has opened up numerous opportunities to aid the design of advanced materials. Herein, we demonstrate a computational workflow that includes rapid population of high-fidelity materials datasets via petascale computing and subsequent analyses with modern data science techniques. We use a first-principles approach based on density functional theory to derive the segregation energies of 34 microalloying elements at the coherent and semi-coherent interfaces between the aluminium matrix and the θ'-Al2Cu precipitate, which requires several hundred supercell calculations. We also perform extensive correlation analyses to identify materials descriptors that affect the segregation behaviour of solutes at the interfaces. Finally, we show an example of leveraging machine learning techniques to predict segregation energies without performing computationally expensive physics-based simulations. As a result, the approach demonstrated in the present work can be applied to any high-temperature alloy system for which key materials data can be obtained using high-performance computing.
Computational physics in RISC environments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rhoades, C.E. Jr.

The new high performance Reduced Instruction Set Computers (RISC) promise near Cray-level performance at near personal-computer prices. This paper explores the performance, conversion and compatibility issues associated with developing, testing and using our traditional, large-scale simulation models in the RISC environments exemplified by the IBM RS6000 and MIPS R3000 machines. The questions of operating systems (CTSS versus UNIX), compilers (Fortran, C, pointers) and data are addressed in detail. Overall, it is concluded that the RISC environments are practical for a very wide range of computational physics activities. Indeed, all but the very largest two- and three-dimensional codes will work quite well, particularly in a single user environment. Easily projected hardware-performance increases will revolutionize the field of computational physics. The way we do research will change profoundly in the next few years. There is, however, nothing more difficult to plan, nor more dangerous to manage than the creation of this new world.
Computational physics in RISC environments. Revision 1

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rhoades, C.E. Jr.

The new high performance Reduced Instruction Set Computers (RISC) promise near Cray-level performance at near personal-computer prices. This paper explores the performance, conversion and compatibility issues associated with developing, testing and using our traditional, large-scale simulation models in the RISC environments exemplified by the IBM RS6000 and MIPS R3000 machines. The questions of operating systems (CTSS versus UNIX), compilers (Fortran, C, pointers) and data are addressed in detail. Overall, it is concluded that the RISC environments are practical for a very wide range of computational physics activities. Indeed, all but the very largest two- and three-dimensional codes will work quite well, particularly in a single user environment. Easily projected hardware-performance increases will revolutionize the field of computational physics. The way we do research will change profoundly in the next few years. There is, however, nothing more difficult to plan, nor more dangerous to manage than the creation of this new world.
Petascale supercomputing to accelerate the design of high-temperature alloys

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shin, Dongwon; Lee, Sangkeun; Shyam, Amit

Recent progress in high-performance computing and data informatics has opened up numerous opportunities to aid the design of advanced materials. Herein, we demonstrate a computational workflow that includes rapid population of high-fidelity materials datasets via petascale computing and subsequent analyses with modern data science techniques. We use a first-principles approach based on density functional theory to derive the segregation energies of 34 microalloying elements at the coherent and semi-coherent interfaces between the aluminium matrix and the θ'-Al2Cu precipitate, which requires several hundred supercell calculations. We also perform extensive correlation analyses to identify materials descriptors that affect the segregation behaviour of solutes at the interfaces. Finally, we show an example of leveraging machine learning techniques to predict segregation energies without performing computationally expensive physics-based simulations. As a result, the approach demonstrated in the present work can be applied to any high-temperature alloy system for which key materials data can be obtained using high-performance computing.
DISCRETE EVENT SIMULATION OF OPTICAL SWITCH MATRIX PERFORMANCE IN COMPUTER NETWORKS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Imam, Neena; Poole, Stephen W

2013-01-01

In this paper, we present an application of a Discrete Event Simulator (DES) for performance modeling of optical switching devices in computer networks. Network simulators are valuable tools in situations where one cannot investigate the system directly. This situation may arise if the system under study does not exist yet or the cost of studying the system directly is prohibitive. Most available network simulators are based on the paradigm of discrete-event-based simulation. As computer networks become increasingly larger and more complex, sophisticated DES tool chains have become available for both commercial and academic research. Some well-known simulators are NS2, NS3, OPNET, and OMNEST. For this research, we have applied OMNEST for the purpose of simulating multi-wavelength performance of optical switch matrices in computer interconnection networks. Our results suggest that the application of DES to computer interconnection networks provides valuable insight into device performance and aids in topology and system optimization.
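The discrete-event paradigm mentioned in this abstract can be illustrated with a very small event loop. The sketch below is not OMNEST and does not model an optical switch matrix; it assumes a single FIFO output port with a fixed service time, and the function name and inputs are hypothetical. It shows only the core idea: a time-ordered event queue from which the simulator repeatedly pops the earliest event.

```python
import heapq


def simulate_switch_port(arrival_times, service_time):
    """Return each packet's departure time from one FIFO port with fixed service time."""
    # Events are (time, packet index, kind); the heap always yields the earliest one.
    events = [(t, i, "arrival") for i, t in enumerate(arrival_times)]
    heapq.heapify(events)
    port_free_at = 0.0
    departures = {}
    while events:
        now, packet, kind = heapq.heappop(events)
        if kind == "arrival":
            # Service starts when the packet has arrived and the port is idle.
            start = max(now, port_free_at)
            port_free_at = start + service_time
            heapq.heappush(events, (port_free_at, packet, "departure"))
        else:
            departures[packet] = now
    return [departures[i] for i in range(len(arrival_times))]


# Hypothetical workload: three packets, each needing 2.0 time units of service.
print(simulate_switch_port([0.0, 1.0, 1.5], service_time=2.0))
```

Simulated time jumps from event to event rather than advancing in fixed steps, which is what keeps this style of simulation cheap for large networks with sparse activity.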
Petascale supercomputing to accelerate the design of high-temperature alloys

NASA Astrophysics Data System (ADS)

Shin, Dongwon; Lee, Sangkeun; Shyam, Amit; Haynes, J. Allen

2017-12-01

Recent progress in high-performance computing and data informatics has opened up numerous opportunities to aid the design of advanced materials. Herein, we demonstrate a computational workflow that includes rapid population of high-fidelity materials datasets via petascale computing and subsequent analyses with modern data science techniques. We use a first-principles approach based on density functional theory to derive the segregation energies of 34 microalloying elements at the coherent and semi-coherent interfaces between the aluminium matrix and the θ′-Al2Cu precipitate, which requires several hundred supercell calculations. We also perform extensive correlation analyses to identify materials descriptors that affect the segregation behaviour of solutes at the interfaces. Finally, we show an example of leveraging machine learning techniques to predict segregation energies without performing computationally expensive physics-based simulations. The approach demonstrated in the present work can be applied to any high-temperature alloy system for which key materials data can be obtained using high-performance computing.
Results of comparative RBMK neutron computation using VNIIEF codes (cell computation, 3D statics, 3D kinetics). Final report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grebennikov, A.N.; Zhitnik, A.K.; Zvenigorodskaya, O.A.

1995-12-31

In conformity with the protocol of the Workshop under Contract "Assessment of RBMK reactor safety using modern Western Codes", VNIIEF performed a neutronics computation series to compare western and VNIIEF codes and assess whether VNIIEF codes are suitable for RBMK type reactor safety assessment computation. The work was carried out in close collaboration with M.I. Rozhdestvensky and L.M. Podlazov, NIKIET employees. The effort involved: (1) cell computations with the WIMS, EKRAN codes (improved modification of the LOMA code) and the S-90 code (VNIIEF Monte Carlo), covering cell, polycell, and burnup computation; (2) 3D computation of static states with the KORAT-3D and NEU codes and comparison with results of computation with the NESTLE code (USA). The computations were performed in the geometry and using the neutron constants presented by the American party; (3) 3D computation of neutron kinetics with the KORAT-3D and NEU codes. These computations were performed in two formulations, both being developed in collaboration with NIKIET. Formulation of the first problem agrees as closely as possible with one of the NESTLE problems and imitates gas bubble travel through a core. The second problem is a model of the RBMK as a whole with imitation of the movement of control and protection system (CPS) controls in a core.
Critical thinking traits of top-tier experts and implications for computer science education

NASA Astrophysics Data System (ADS)

Bushey, Dean E.

A documented shortage of technical leadership and top-tier performers in computer science jeopardizes the technological edge, security, and economic well-being of the nation. The 2005 President's Information and Technology Advisory Committee (PITAC) Report on competitiveness in computational sciences highlights the major impact of science, technology, and innovation in keeping America competitive in the global marketplace. It stresses the fact that the supply of science, technology, and engineering experts is at the core of America's technological edge, national competitiveness and security. However, recent data shows that both undergraduate and postgraduate production of computer scientists is falling. The decline is "a quiet crisis building in the United States," a crisis that, if allowed to continue unchecked, could endanger America's well-being and preeminence among the world's nations. Past research on expert performance has shown that the cognitive traits of critical thinking, creativity, and problem solving possessed by top-tier performers can be identified, observed and measured. The studies show that the identified attributes are applicable across many domains and disciplines. Companies have begun to realize that cognitive skills are important for high-level performance and are reevaluating the traditional academic standards they have used to predict success for their top-tier performers in computer science. Previous research in the computer science field has focused either on programming skills of its experts or has attempted to predict the academic success of students at the undergraduate level. This study, on the other hand, examines the critical-thinking skills found among experts in the computer science field in order to explore the questions, "What cognitive skills do outstanding performers possess that make them successful?" and "How do currently used measures of academic performance correlate to critical-thinking skills among students?" 
The results of this study suggest a need to examine how critical-thinking abilities are learned in the undergraduate computer science curriculum and the need to foster these abilities in order to produce the high-level, critical-thinking professionals necessary to fill the growing need for these experts. Due to the fact that current measures of academic performance do not adequately depict students' cognitive abilities, assessment of these skills must be incorporated into existing curricula.
P2P Technology for High-Performance Computing: An Overview

NASA Technical Reports Server (NTRS)

Follen, Gregory J. (Technical Monitor); Berry, Jason

2003-01-01

The transition from cluster computing to peer-to-peer (P2P) high-performance computing has recently attracted the attention of the computer science community. It has been recognized that existing local networks and dedicated clusters of headless workstations can serve as inexpensive yet powerful virtual supercomputers. It has also been recognized that the vast number of lower-end computers connected to the Internet stay idle for as long as 90% of the time. The growing speed of Internet connections and the high availability of free CPU time encourage exploration of the possibility to use the whole Internet rather than local clusters as a massively parallel yet almost freely available P2P supercomputer. As a part of a larger project on P2P high-performance computing, it has been my goal to compile an overview of the P2P paradigm. I have studied various P2P platforms and I have compiled systematic brief descriptions of their most important characteristics. I have also experimented and obtained hands-on experience with selected P2P platforms focusing on those that seem promising with respect to P2P high-performance computing. I have also compiled relevant literature and web references. I have prepared a draft technical report and I have summarized my findings in a poster paper.
Optimization of tomographic reconstruction workflows on geographically distributed resources

DOE PAGES

Bicer, Tekin; Gursoy, Doga; Kettimuthu, Rajkumar; ...

2016-01-01

New technological advancements in synchrotron light sources enable data acquisitions at unprecedented levels. This emergent trend affects not only the size of the generated data but also the need for larger computational resources. Although beamline scientists and users have access to local computational resources, these are typically limited and can result in extended execution times. Applications that are based on iterative processing as in tomographic reconstruction methods require high-performance compute clusters for timely analysis of data. Here, time-sensitive analysis and processing of Advanced Photon Source data on geographically distributed resources are focused on. Two main challenges are considered: (i) modeling of the performance of tomographic reconstruction workflows and (ii) transparent execution of these workflows on distributed resources. For the former, three main stages are considered: (i) data transfer between storage and computational resources, (ii) wait/queue time of reconstruction jobs at compute resources, and (iii) computation of reconstruction tasks. These performance models allow evaluation and estimation of the execution time of any given iterative tomographic reconstruction workflow that runs on geographically distributed resources. For the latter challenge, a workflow management system is built, which can automate the execution of workflows and minimize the user interaction with the underlying infrastructure. The system utilizes Globus to perform secure and efficient data transfer operations. The proposed models and the workflow management system are evaluated by using three high-performance computing and two storage resources, all of which are geographically distributed. Workflows were created with different computational requirements using two compute-intensive tomographic reconstruction algorithms. 
Experimental evaluation shows that the proposed models and system can be used for selecting the optimum resources, which in turn can provide up to 3.13× speedup (on experimented resources). Furthermore, the error rates of the models range between 2.1 and 23.3% (considering workflow execution times), where the accuracy of the model estimations increases with higher computational demands in reconstruction tasks.
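The three-stage view of a workflow (data transfer, queue wait, computation) lends itself to a simple resource-selection sketch. The abstract does not give the authors' per-stage models, so the code below assumes the estimated execution time is just the sum of the three stage estimates; the class, the resource names, and all timing values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ResourceEstimate:
    """Per-resource stage estimates, in seconds, for one reconstruction workflow."""
    name: str
    transfer_s: float  # data transfer between storage and compute
    queue_s: float     # wait/queue time of the job at the compute resource
    compute_s: float   # computation of the reconstruction tasks

    def total_s(self):
        # Assumption: the stages run back to back, so the estimates simply add.
        return self.transfer_s + self.queue_s + self.compute_s


def pick_fastest(estimates):
    """Choose the resource with the smallest estimated end-to-end time."""
    return min(estimates, key=ResourceEstimate.total_s)


# Hypothetical candidates: a nearby busy cluster and a remote idle one.
candidates = [
    ResourceEstimate("cluster_a", transfer_s=120.0, queue_s=600.0, compute_s=900.0),
    ResourceEstimate("cluster_b", transfer_s=300.0, queue_s=60.0, compute_s=400.0),
]
best = pick_fastest(candidates)
print(best.name, best.total_s())
```

With these made-up numbers the remote cluster wins despite its slower transfer, because its queue and compute estimates are much lower. A comparison of this kind is what lets a model-driven scheduler pick a resource before any job is submitted.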
Optimization of tomographic reconstruction workflows on geographically distributed resources

PubMed Central

Bicer, Tekin; Gürsoy, Doǧa; Kettimuthu, Rajkumar; De Carlo, Francesco; Foster, Ian T.

2016-01-01

New technological advancements in synchrotron light sources enable data acquisitions at unprecedented levels. This emergent trend affects not only the size of the generated data but also the need for larger computational resources. Although beamline scientists and users have access to local computational resources, these are typically limited and can result in extended execution times. Applications that are based on iterative processing as in tomographic reconstruction methods require high-performance compute clusters for timely analysis of data. Here, time-sensitive analysis and processing of Advanced Photon Source data on geographically distributed resources are focused on. Two main challenges are considered: (i) modeling of the performance of tomographic reconstruction workflows and (ii) transparent execution of these workflows on distributed resources. For the former, three main stages are considered: (i) data transfer between storage and computational resources, (ii) wait/queue time of reconstruction jobs at compute resources, and (iii) computation of reconstruction tasks. These performance models allow evaluation and estimation of the execution time of any given iterative tomographic reconstruction workflow that runs on geographically distributed resources. For the latter challenge, a workflow management system is built, which can automate the execution of workflows and minimize the user interaction with the underlying infrastructure. The system utilizes Globus to perform secure and efficient data transfer operations. The proposed models and the workflow management system are evaluated by using three high-performance computing and two storage resources, all of which are geographically distributed. Workflows were created with different computational requirements using two compute-intensive tomographic reconstruction algorithms. 
Experimental evaluation shows that the proposed models and system can be used for selecting the optimum resources, which in turn can provide up to 3.13× speedup (on experimented resources). Moreover, the error rates of the models range between 2.1 and 23.3% (considering workflow execution times), where the accuracy of the model estimations increases with higher computational demands in reconstruction tasks. PMID:27359149
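The three-stage model described in this abstract (data transfer, queue wait, computation) can be illustrated with a minimal sketch. This is not the authors' model: the `Resource` fields, the linear cost terms, and all numbers are hypothetical assumptions chosen only to show how a summed estimate can drive resource selection.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical description of one compute site (all fields are assumptions)."""
    name: str
    bandwidth_mb_s: float     # assumed sustained transfer rate to the site
    queue_wait_s: float       # assumed expected wait in the job queue
    throughput_iter_s: float  # assumed reconstruction iterations per second


def estimate_time(res: Resource, data_mb: float, iterations: int) -> float:
    """Estimated workflow time = transfer + queue wait + computation (seconds)."""
    transfer = data_mb / res.bandwidth_mb_s
    compute = iterations / res.throughput_iter_s
    return transfer + res.queue_wait_s + compute


def pick_resource(resources, data_mb: float, iterations: int) -> Resource:
    """Return the resource with the smallest estimated workflow time."""
    return min(resources, key=lambda r: estimate_time(r, data_mb, iterations))


# Illustrative sites: a small local machine and a faster but remote cluster.
local = Resource("local", bandwidth_mb_s=1000.0, queue_wait_s=0.0, throughput_iter_s=1.0)
remote = Resource("remote", bandwidth_mb_s=100.0, queue_wait_s=60.0, throughput_iter_s=10.0)
```

Under these assumed numbers, a compute-heavy workflow favors the remote cluster despite its transfer and queue overhead, while a light workflow stays local, which mirrors the abstract's observation that resource choice depends on computational demand.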
Optimization of tomographic reconstruction workflows on geographically distributed resources

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bicer, Tekin; Gursoy, Doga; Kettimuthu, Rajkumar

New technological advancements in synchrotron light sources enable data acquisitions at unprecedented levels. This emergent trend affects not only the size of the generated data but also the need for larger computational resources. Although beamline scientists and users have access to local computational resources, these are typically limited and can result in extended execution times. Applications that are based on iterative processing as in tomographic reconstruction methods require high-performance compute clusters for timely analysis of data. This work focuses on time-sensitive analysis and processing of Advanced Photon Source data on geographically distributed resources. Two main challenges are considered: (i) modeling of the performance of tomographic reconstruction workflows and (ii) transparent execution of these workflows on distributed resources. For the former, three main stages are considered: (i) data transfer between storage and computational resources, (ii) wait/queue time of reconstruction jobs at compute resources, and (iii) computation of reconstruction tasks. These performance models allow evaluation and estimation of the execution time of any given iterative tomographic reconstruction workflow that runs on geographically distributed resources. For the latter challenge, a workflow management system is built, which can automate the execution of workflows and minimize the user interaction with the underlying infrastructure. The system utilizes Globus to perform secure and efficient data transfer operations. The proposed models and the workflow management system are evaluated by using three high-performance computing and two storage resources, all of which are geographically distributed. Workflows were created with different computational requirements using two compute-intensive tomographic reconstruction algorithms.
Experimental evaluation shows that the proposed models and system can be used for selecting the optimum resources, which in turn can provide up to 3.13× speedup (on experimented resources). Furthermore, the error rates of the models range between 2.1 and 23.3% (considering workflow execution times), where the accuracy of the model estimations increases with higher computational demands in reconstruction tasks.
Computational nuclear quantum many-body problem: The UNEDF project

NASA Astrophysics Data System (ADS)

Bogner, S.; Bulgac, A.; Carlson, J.; Engel, J.; Fann, G.; Furnstahl, R. J.; Gandolfi, S.; Hagen, G.; Horoi, M.; Johnson, C.; Kortelainen, M.; Lusk, E.; Maris, P.; Nam, H.; Navratil, P.; Nazarewicz, W.; Ng, E.; Nobre, G. P. A.; Ormand, E.; Papenbrock, T.; Pei, J.; Pieper, S. C.; Quaglioni, S.; Roche, K. J.; Sarich, J.; Schunck, N.; Sosonkina, M.; Terasaki, J.; Thompson, I.; Vary, J. P.; Wild, S. M.

2013-10-01

The UNEDF project was a large-scale collaborative effort that applied high-performance computing to the nuclear quantum many-body problem. The primary focus of the project was on constructing, validating, and applying an optimized nuclear energy density functional, which entailed a wide range of pioneering developments in microscopic nuclear structure and reactions, algorithms, high-performance computing, and uncertainty quantification. UNEDF demonstrated that close associations among nuclear physicists, mathematicians, and computer scientists can lead to novel physics outcomes built on algorithmic innovations and computational developments. This review showcases a wide range of UNEDF science results to illustrate this interplay.
An automated procedure for developing hybrid computer simulations of turbofan engines

NASA Technical Reports Server (NTRS)

Szuch, J. R.; Krosel, S. M.

1980-01-01

A systematic, computer-aided, self-documenting methodology for developing hybrid computer simulations of turbofan engines is presented. The methodology makes use of a host program that can run on a large digital computer and a machine-dependent target (hybrid) program. The host program performs all of the calculations and data manipulations needed to transform user-supplied engine design information to a form suitable for the hybrid computer. The host program also trims the self-contained engine model to match specified design point information. A test case is described and comparisons between hybrid simulation and specified engine performance data are presented.

Reservoir computing with a slowly modulated mask signal for preprocessing using a mutually coupled optoelectronic system

NASA Astrophysics Data System (ADS)

Tezuka, Miwa; Kanno, Kazutaka; Bunsen, Masatoshi

2016-08-01

Reservoir computing is a machine-learning paradigm based on information processing in the human brain. We numerically demonstrate reservoir computing with a slowly modulated mask signal for preprocessing by using a mutually coupled optoelectronic system. The performance of our system is quantitatively evaluated by a chaotic time series prediction task. Our system achieves performance comparable to that of reservoir computing with a single feedback system and a fast modulated mask signal. We show that it is possible to slow down the modulation speed of the mask signal by using the mutually coupled system in reservoir computing.
Symplectic molecular dynamics simulations on specially designed parallel computers.

PubMed

Borstnik, Urban; Janezic, Dusanka

2005-01-01

We have developed a computer program for molecular dynamics (MD) simulation that implements the Split Integration Symplectic Method (SISM) and is designed to run on specialized parallel computers. The MD integration is performed by the SISM, which analytically treats high-frequency vibrational motion and thus enables the use of longer simulation time steps. The low-frequency motion is treated numerically on specially designed parallel computers, which decreases the computational time of each simulation time step. The combination of these approaches means that less time is required and fewer steps are needed and so enables fast MD simulations. We study the computational performance of MD simulation of molecular systems on specialized computers and provide a comparison to standard personal computers. The combination of the SISM with two specialized parallel computers is an effective way to increase the speed of MD simulations up to 16-fold over a single PC processor.
Implementation of Multispectral Image Classification on a Remote Adaptive Computer

NASA Technical Reports Server (NTRS)

Figueiredo, Marco A.; Gloster, Clay S.; Stephens, Mark; Graves, Corey A.; Nakkar, Mouna

1999-01-01

As the demand for higher performance computers for the processing of remote sensing science algorithms increases, the need to investigate new computing paradigms is justified. Field Programmable Gate Arrays enable the implementation of algorithms at the hardware gate level, leading to orders of magnitude performance increase over microprocessor based systems. The automatic classification of spaceborne multispectral images is an example of a computation intensive application that can benefit from implementation on an FPGA-based custom computing machine (adaptive or reconfigurable computer). A probabilistic neural network is used here to classify pixels of a multispectral LANDSAT-2 image. The implementation described utilizes Java client/server application programs to access the adaptive computer from a remote site. Results verify that a remote hardware version of the algorithm (implemented on an adaptive computer) is significantly faster than a local software version of the same algorithm (implemented on a typical general-purpose computer).
Redirecting Under-Utilised Computer Laboratories into Cluster Computing Facilities

ERIC Educational Resources Information Center

Atkinson, John S.; Spenneman, Dirk H. R.; Cornforth, David

2005-01-01

Purpose: To provide administrators at an Australian university with data on the feasibility of redirecting under-utilised computer laboratories facilities into a distributed high performance computing facility. Design/methodology/approach: The individual log-in records for each computer located in the computer laboratories at the university were…
Mir Cooperative Solar Array Flight Performance Data and Computational Analysis

NASA Technical Reports Server (NTRS)

Kerslake, Thomas W.; Hoffman, David J.

1997-01-01

The Mir Cooperative Solar Array (MCSA) was developed jointly by the United States (US) and Russia to provide approximately 6 kW of photovoltaic power to the Russian space station Mir. The MCSA was launched to Mir in November 1995 and installed on the Kvant-1 module in May 1996. Since the MCSA photovoltaic panel modules (PPMs) are nearly identical to those of the International Space Station (ISS) photovoltaic arrays, MCSA operation offered an opportunity to gather multi-year performance data on this technology prior to its implementation on ISS. Two specially designed test sequences were executed in June and December 1996 to measure MCSA performance. Each test period encompassed 3 orbital revolutions whereby the current produced by the MCSA channels was measured. The temperature of MCSA PPMs was also measured. To better interpret the MCSA flight data, a dedicated FORTRAN computer code was developed to predict the detailed thermal-electrical performance of the MCSA. Flight data compared very favorably with computational performance predictions. This indicated that the MCSA electrical performance was fully meeting pre-flight expectations. There were no measurable indications of unexpected or precipitous MCSA performance degradation due to contamination or other causes after 7 months of operation on orbit. Power delivered to the Mir bus was lower than desired as a consequence of the retrofitted power distribution cabling. The strong correlation of experimental and computational results further bolsters the confidence level of performance codes used in critical ISS electric power forecasting. In this paper, MCSA flight performance tests are described as well as the computational modeling behind the performance predictions.
Play for Performance: Using Computer Games to Improve Motivation and Test-Taking Performance

ERIC Educational Resources Information Center

Dennis, Alan R.; Bhagwatwar, Akshay; Minas, Randall K.

2013-01-01

The importance of testing, especially certification and high-stakes testing, has increased substantially over the past decade. Building on the "serious gaming" literature and the psychology "priming" literature, we developed a computer game designed to improve test-taking performance using psychological priming. The game primed…
Connecting Performance Analysis and Visualization to Advance Extreme Scale Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bremer, Peer-Timo; Mohr, Bernd; Schulz, Martin

2015-07-29

The characterization, modeling, analysis, and tuning of software performance has been a central topic in High Performance Computing (HPC) since its early beginnings. The overall goal is to make HPC software run faster on particular hardware, either through better scheduling, on-node resource utilization, or more efficient distributed communication.
High-Performance Java Codes for Computational Fluid Dynamics

NASA Technical Reports Server (NTRS)

Riley, Christopher; Chatterjee, Siddhartha; Biswas, Rupak; Biegel, Bryan (Technical Monitor)

2001-01-01

The computational science community is reluctant to write large-scale computationally intensive applications in Java due to concerns over Java's poor performance, despite the claimed software engineering advantages of its object-oriented features. Naive Java implementations of numerical algorithms can perform poorly compared to corresponding Fortran or C implementations. To achieve high performance, Java applications must be designed with good performance as a primary goal. This paper presents the object-oriented design and implementation of two real-world applications from the field of Computational Fluid Dynamics (CFD): a finite-volume fluid flow solver (LAURA, from NASA Langley Research Center), and an unstructured mesh adaptation algorithm (2D_TAG, from NASA Ames Research Center). This work builds on our previous experience with the design of high-performance numerical libraries in Java. We examine the performance of the applications using the currently available Java infrastructure and show that the Java version of the flow solver LAURA performs almost within a factor of 2 of the original procedural version. Our Java version of the mesh adaptation algorithm 2D_TAG performs within a factor of 1.5 of its original procedural version on certain platforms. Our results demonstrate that object-oriented software design principles are not necessarily inimical to high performance.
Classical multiparty computation using quantum resources

NASA Astrophysics Data System (ADS)

Clementi, Marco; Pappa, Anna; Eckstein, Andreas; Walmsley, Ian A.; Kashefi, Elham; Barz, Stefanie

2017-12-01

In this work, we demonstrate a way to perform classical multiparty computing among parties with limited computational resources. Our method harnesses quantum resources to increase the computational power of the individual parties. We show how a set of clients restricted to linear classical processing are able to jointly compute a nonlinear multivariable function that lies beyond their individual capabilities. The clients are only allowed to perform classical XOR gates and single-qubit gates on quantum states. We also examine the type of security that can be achieved in this limited setting. Finally, we provide a proof-of-concept implementation using photonic qubits that allows four clients to compute a specific example of a multiparty function, the pairwise AND.
Linear static structural and vibration analysis on high-performance computers

NASA Technical Reports Server (NTRS)

Baddourah, M. A.; Storaasli, O. O.; Bostic, S. W.

1993-01-01

Parallel computers offer the opportunity to significantly reduce the computation time necessary to analyze large-scale aerospace structures. This paper presents algorithms developed for and implemented on massively parallel computers, hereafter referred to as Scalable High-Performance Computers (SHPC), for the most computationally intensive tasks involved in structural analysis, namely, generation and assembly of system matrices, solution of systems of equations, and calculation of the eigenvalues and eigenvectors. Results on SHPC are presented for large-scale structural problems (i.e., models for High-Speed Civil Transport). The goal of this research is to develop a new, efficient technique which extends structural analysis to SHPC and makes large-scale structural analyses tractable.
Earth Science Technology Office's Computational Technologies Project

NASA Technical Reports Server (NTRS)

Fischer, James (Technical Monitor); Merkey, Phillip

2005-01-01

This grant supported the effort to characterize the problem domain of the Earth Science Technology Office's Computational Technologies Project, to engage the Beowulf Cluster Computing Community as well as the High Performance Computing Research Community so that we can predict the applicability of said technologies to the scientific community represented by the CT project and formulate long term strategies to provide the computational resources necessary to attain the anticipated scientific objectives of the CT project. Specifically, the goal of the evaluation effort is to use the information gathered over the course of the Round-3 investigations to quantify the trends in scientific expectations, the algorithmic requirements and capabilities of high-performance computers to satisfy this anticipated need.
Job Management and Task Bundling

NASA Astrophysics Data System (ADS)

Berkowitz, Evan; Jansen, Gustav R.; McElvain, Kenneth; Walker-Loud, André

2018-03-01

High Performance Computing is often performed on scarce and shared computing resources. To ensure computers are used to their full capacity, administrators often incentivize large workloads that are not possible on smaller systems. Measurements in Lattice QCD frequently do not scale to machine-size workloads. By bundling tasks together we can create large jobs suitable for gigantic partitions. We discuss METAQ and mpi_jm, software developed to dynamically group computational tasks together, that can intelligently backfill to consume idle time without substantial changes to users' current workflows or executables.
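The idea of grouping small tasks into one large job can be sketched with a simple packing heuristic. This is only an illustration and not the algorithm used by METAQ or mpi_jm, which bundle and backfill dynamically at run time; the function name `bundle_tasks`, the static first-fit-decreasing strategy, and the node counts are assumptions.

```python
def bundle_tasks(task_nodes, job_nodes):
    """Pack tasks (given as node counts) into jobs of `job_nodes` nodes each.

    Uses first-fit decreasing: the largest tasks are placed first, and each
    task goes into the first bundle that still has enough free nodes.
    Returns a list of bundles, each a list of task node counts.
    """
    bundles = []  # each entry: {"free": remaining nodes, "tasks": [node counts]}
    for need in sorted(task_nodes, reverse=True):
        if need > job_nodes:
            raise ValueError(f"task needs {need} nodes but a job has only {job_nodes}")
        for bundle in bundles:
            if bundle["free"] >= need:
                bundle["tasks"].append(need)
                bundle["free"] -= need
                break
        else:
            # No existing bundle has room, so open a new job.
            bundles.append({"free": job_nodes - need, "tasks": [need]})
    return [bundle["tasks"] for bundle in bundles]
```

For example, `bundle_tasks([4, 8, 2, 2, 8, 4, 4], 16)` fills two 16-node jobs completely. A static packing like this ignores task run times, which is one reason run-time backfilling of idle nodes, as the abstract describes, can use a machine more fully.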
New VHP-Female v. 2.0 full-body computational phantom and its performance metrics using FEM simulator ANSYS HFSS.

PubMed

Yanamadala, Janakinadh; Noetscher, Gregory M; Rathi, Vishal K; Maliye, Saili; Win, Htay A; Tran, Anh L; Jackson, Xavier J; Htet, Aung T; Kozlov, Mikhail; Nazarian, Ara; Louie, Sara; Makarov, Sergey N

2015-01-01

Simulation of the electromagnetic response of the human body relies heavily upon efficient computational models or phantoms. The first objective of this paper is to present a new platform-independent full-body electromagnetic computational model (computational phantom), the Visible Human Project® (VHP)-Female v. 2.0 and to describe its distinct features. The second objective is to report phantom simulation performance metrics using the commercial FEM electromagnetic solver ANSYS HFSS.
DoD High Performance Computing Modernization Program Users Group Conference (HPCMP UGC 2011) Held in Portland, Oregon on June 20-23, 2011

DTIC Science & Technology

2011-06-01

4. Conclusion The Web-based AGeS system described in this paper is a computationally-efficient and scalable system for high-throughput genome...method for protecting web services involves making them more resilient to attack using autonomic computing techniques. This paper presents our initial...20–23, 2011 2011 DoD High Performance Computing Modernization Program Users Group Conference HPCMP UGC 2011 The papers in this book comprise the
Some Aspects of Parallel Implementation of the Finite Element Method on Message Passing Architectures

DTIC Science & Technology

1988-05-01

for Advanced Computer Studies and Department of Computer Science, University of Maryland, College Park, MD 20742. ABSTRACT: We discuss some aspects of...Computer Studies and Technology & Dept. of Computer Science...We study the performance of CG and PCG by examining its performance for u E (0,1), for solving the two model problems with an accuracy
Computational Analysis of a Prototype Martian Rotorcraft Experiment

NASA Technical Reports Server (NTRS)

Corfeld, Kelly J.; Strawn, Roger C.; Long, Lyle N.

2002-01-01

This paper presents Reynolds-averaged Navier-Stokes calculations for a prototype Martian rotorcraft. The computations are intended for comparison with an ongoing Mars rotor hover test at NASA Ames Research Center. These computational simulations present a new and challenging problem, since rotors that operate on Mars will experience a unique low Reynolds number and high Mach number environment. Computed results for the 3-D rotor differ substantially from 2-D sectional computations in that the 3-D results exhibit a stall delay phenomenon caused by rotational forces along the blade span. Computational results have yet to be compared to experimental data, but computed performance predictions match the experimental design goals fairly well. In addition, the computed results provide a high level of detail in the rotor wake and blade surface aerodynamics. These details provide an important supplement to the expected experimental performance data.
Heterogeneous concurrent computing with exportable services

NASA Technical Reports Server (NTRS)

Sunderam, Vaidy

1995-01-01

Heterogeneous concurrent computing, based on the traditional process-oriented model, is approaching its functionality and performance limits. An alternative paradigm, based on the concept of services, supporting data driven computation, and built on a lightweight process infrastructure, is proposed to enhance the functional capabilities and the operational efficiency of heterogeneous network-based concurrent computing. TPVM is an experimental prototype system supporting exportable services, thread-based computation, and remote memory operations that is built as an extension of and an enhancement to the PVM concurrent computing system. TPVM offers a significantly different computing paradigm for network-based computing, while maintaining a close resemblance to the conventional PVM model in the interest of compatibility and ease of transition. Preliminary experiences have demonstrated that the TPVM framework presents a natural yet powerful concurrent programming interface, while being capable of delivering performance improvements of up to thirty percent.
Understanding and enhancing user acceptance of computer technology

NASA Technical Reports Server (NTRS)

Rouse, William B.; Morris, Nancy M.

1986-01-01

Technology-driven efforts to implement computer technology often encounter problems due to lack of acceptance or begrudging acceptance of the personnel involved. It is argued that individuals' acceptance of automation, in terms of either computerization or computer aiding, is heavily influenced by their perceptions of the impact of the automation on their discretion in performing their jobs. It is suggested that desired levels of discretion reflect needs to feel in control and achieve self-satisfaction in task performance, as well as perceptions of inadequacies of computer technology. Discussion of these factors leads to a structured set of considerations for performing front-end analysis, deciding what to automate, and implementing the resulting changes.
Ground temperature measurement by PRT-5 for maps experiment

NASA Technical Reports Server (NTRS)

Gupta, S. K.; Tiwari, S. N.

1978-01-01

A simple algorithm and computer program were developed for determining the actual surface temperature from the effective brightness temperature as measured remotely by a radiation thermometer called PRT-5. This procedure allows the computation of atmospheric correction to the effective brightness temperature without performing detailed radiative transfer calculations. Model radiative transfer calculations were performed to compute atmospheric corrections for several values of the surface and atmospheric parameters individually and in combination. Polynomial regressions were performed between the magnitudes or deviations of these parameters and the corresponding computed corrections to establish simple analytical relations between them. Analytical relations were also developed to represent combined correction for simultaneous variation of parameters in terms of their individual corrections.
An Approach to Integrate a Space-Time GIS Data Model with High Performance Computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Dali; Zhao, Ziliang; Shaw, Shih-Lung

2011-01-01

In this paper, we describe an approach to integrate a Space-Time GIS data model on a high performance computing platform. The Space-Time GIS data model has been developed on a desktop computing environment. We use the Space-Time GIS data model to generate a GIS module, which organizes a series of remote sensing data. We are in the process of porting the GIS module into an HPC environment, in which the GIS modules handle large datasets directly via a parallel file system. Although it is an ongoing project, the authors hope this effort can inspire further discussions on the integration of GIS on high performance computing platforms.

Teaching Musical Expression: Effects of Production and Delivery of Feedback by Teacher vs. Computer on Rated Feedback Quality

ERIC Educational Resources Information Center

Karlsson, Jessika; Liljestrom, Simon; Juslin, Patrik N.

2009-01-01

Previous research has shown that a computer program may improve performers' abilities to express emotions through their performance. Yet, performers seem reluctant to embrace this novel technology. In this study we explored possible reasons for these negative impressions. Eighty guitarists performed a piece of music to express various emotions,…
Computational Model-Based Prediction of Human Episodic Memory Performance Based on Eye Movements

NASA Astrophysics Data System (ADS)

Sato, Naoyuki; Yamaguchi, Yoko

Subjects' episodic memory performance is not simply reflected by eye movements. We use a ‘theta phase coding’ model of the hippocampus to predict subjects' memory performance from their eye movements. Results demonstrate the ability of the model to predict subjects' memory performance. These studies provide a novel approach to computational modeling in the human-machine interface.
Integrated multi sensors and camera video sequence application for performance monitoring in archery

NASA Astrophysics Data System (ADS)

Taha, Zahari; Arif Mat-Jizat, Jessnor; Amirul Abdullah, Muhammad; Muazu Musa, Rabiu; Razali Abdullah, Mohamad; Fauzi Ibrahim, Mohamad; Hanafiah Shaharudin, Mohd Ali

2018-03-01

This paper explains the development of comprehensive archery performance monitoring software consisting of three camera views and five body sensors. The five body sensors evaluate biomechanics-related variables of flexor and extensor muscle activity, heart rate, postural sway and bow movement during archery performance. The three camera views and the five body sensors are integrated into a single computer application which enables the user to view all the data in a single user interface. The five body sensors’ data are displayed in numerical and graphical form in real time. The information transmitted by the body sensors is processed by an embedded algorithm that automatically summarizes the athlete’s biomechanical performance and displays it in the application interface. This performance is later compared with the pre-computed psycho-fitness performance derived from data prefilled into the application. All the data (camera views, body sensor readings, and performance computations) are recorded for further analysis by a sports scientist. The application serves as a tool for helping coaches and athletes observe and identify any incorrect technique employed during training, which gives room for correction and re-evaluation to improve overall performance in the sport of archery.
Towards Scalable Graph Computation on Mobile Devices.

PubMed

Chen, Yiqi; Lin, Zhiyuan; Pienta, Robert; Kahng, Minsuk; Chau, Duen Horng

2014-10-01

Mobile devices have become increasingly central to our everyday activities, due to their portability, multi-touch capabilities, and ever-improving computational power. Such attractive features have spurred research interest in leveraging mobile devices for computation. We explore a novel approach that aims to use a single mobile device to perform scalable graph computation on large graphs that do not fit in the device's limited main memory, opening up the possibility of performing on-device analysis of large datasets, without relying on the cloud. Based on the familiar memory mapping capability provided by today's mobile operating systems, our approach to scale up computation is powerful and intentionally kept simple to maximize its applicability across the iOS and Android platforms. Our experiments demonstrate that an iPad mini can perform fast computation on large real graphs with as many as 272 million edges (Google+ social graph), at a speed that is only a few times slower than a 13″ Macbook Pro. Through creating a real world iOS app with this technique, we demonstrate the strong potential application for scalable graph computation on a single mobile device using our approach.
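The memory-mapping idea in this abstract can be illustrated with a short sketch: the edge list stays on disk, the operating system pages it in on demand, and the computation makes a sequential pass over it. This is not the authors' iOS or Android implementation; the binary layout (two little-endian 32-bit integers per edge), the helper names, and the choice of out-degree counting as the computation are all assumptions made for illustration.

```python
import mmap
import struct
from collections import Counter

# Assumed on-disk layout: each edge is two little-endian unsigned 32-bit ints.
EDGE = struct.Struct("<II")


def write_edges(path, edges):
    """Write (source, target) pairs to a binary edge-list file."""
    with open(path, "wb") as f:
        for u, v in edges:
            f.write(EDGE.pack(u, v))


def out_degrees(path):
    """Count out-degrees by scanning a memory-mapped, non-empty edge file.

    Only the pages being read need to be resident, so the file can be
    larger than available main memory.
    """
    degrees = Counter()
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for offset in range(0, len(mm), EDGE.size):
                u, _v = EDGE.unpack_from(mm, offset)
                degrees[u] += 1
    return degrees
```

Note that only the edge data is kept out of memory here; the `Counter` of per-vertex results still lives in RAM, so this sketch scales with the number of edges rather than the number of vertices.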
High-performance computing for airborne applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Quinn, Heather M; Manuzzato, Andrea; Fairbanks, Tom

2010-06-28

Recently, there have been attempts to move common satellite tasks to unmanned aerial vehicles (UAVs). UAVs are significantly cheaper to buy than satellites and easier to deploy on an as-needed basis. The more benign radiation environment also allows for an aggressive adoption of state-of-the-art commercial computational devices, which increases the amount of data that can be collected. There are a number of commercial computing devices currently available that are well-suited to high-performance computing. These devices range from specialized computational devices, such as field-programmable gate arrays (FPGAs) and digital signal processors (DSPs), to traditional computing platforms, such as microprocessors. Even though the radiation environment is relatively benign, these devices could be susceptible to single-event effects. In this paper, we will present radiation data for high-performance computing devices in an accelerated neutron environment. These devices include a multi-core digital signal processor, two field-programmable gate arrays, and a microprocessor. From these results, we found that all of these devices are suitable for many airplane environments without reliability problems.
Towards Scalable Graph Computation on Mobile Devices

PubMed Central

Chen, Yiqi; Lin, Zhiyuan; Pienta, Robert; Kahng, Minsuk; Chau, Duen Horng

2015-01-01

Mobile devices have become increasingly central to our everyday activities, due to their portability, multi-touch capabilities, and ever-improving computational power. Such attractive features have spurred research interest in leveraging mobile devices for computation. We explore a novel approach that aims to use a single mobile device to perform scalable graph computation on large graphs that do not fit in the device's limited main memory, opening up the possibility of performing on-device analysis of large datasets, without relying on the cloud. Based on the familiar memory mapping capability provided by today's mobile operating systems, our approach to scale up computation is powerful and intentionally kept simple to maximize its applicability across the iOS and Android platforms. Our experiments demonstrate that an iPad mini can perform fast computation on large real graphs with as many as 272 million edges (Google+ social graph), at a speed that is only a few times slower than a 13″ Macbook Pro. Through creating a real world iOS app with this technique, we demonstrate the strong potential application for scalable graph computation on a single mobile device using our approach. PMID:25859564
Radio Synthesis Imaging - A High Performance Computing and Communications Project

NASA Astrophysics Data System (ADS)

Crutcher, Richard M.

The National Science Foundation has funded a five-year High Performance Computing and Communications project at the National Center for Supercomputing Applications (NCSA) for the direct implementation of several of the computing recommendations of the Astronomy and Astrophysics Survey Committee (the "Bahcall report"). This paper is a summary of the project goals and a progress report. The project will implement a prototype of the next generation of astronomical telescope systems - remotely located telescopes connected by high-speed networks to very high performance, scalable architecture computers and on-line data archives, which are accessed by astronomers over Gbit/sec networks. Specifically, a data link has been installed between the BIMA millimeter-wave synthesis array at Hat Creek, California and NCSA at Urbana, Illinois for real-time transmission of data to NCSA. Data are automatically archived, and may be browsed and retrieved by astronomers using the NCSA Mosaic software. In addition, an on-line digital library of processed images will be established. BIMA data will be processed on a very high performance distributed computing system, with I/O, user interface, and most of the software system running on the NCSA Convex C3880 supercomputer or Silicon Graphics Onyx workstations connected by HiPPI to the high performance, massively parallel Thinking Machines Corporation CM-5. The very computationally intensive algorithms for calibration and imaging of radio synthesis array observations will be optimized for the CM-5 and new algorithms which utilize the massively parallel architecture will be developed. Code running simultaneously on the distributed computers will communicate using the Data Transport Mechanism developed by NCSA. 
The project will also use the BLANCA Gbit/s testbed network between Urbana and Madison, Wisconsin to connect an Onyx workstation in the University of Wisconsin Astronomy Department to the NCSA CM-5, for development of long-distance distributed computing. Finally, the project is developing 2D and 3D visualization software as part of the international AIPS++ project. This research and development project is being carried out by a team of experts in radio astronomy, algorithm development for massively parallel architectures, high-speed networking, database management, and Thinking Machines Corporation personnel. The development of this complete software, distributed computing, and data archive and library solution to the radio astronomy computing problem will advance our expertise in high performance computing and communications technology and the application of these techniques to astronomical data processing.
High-Performance Computing Act of 1990: Report of the Senate Committee on Commerce, Science, and Transportation on S. 1067.

ERIC Educational Resources Information Center

Congress of the U.S., Washington, DC. Senate Committee on Commerce, Science, and Transportation.

This committee report is intended to accompany S. 1067, a bill designed to provide for a coordinated federal research program in high-performance computing (HPC). The primary objective of the legislation is given as the acceleration of research, development, and application of the most advanced computing technology in research, education, and…
Effects of Learning Style and Training Method on Computer Attitude and Performance in World Wide Web Page Design Training.

ERIC Educational Resources Information Center

Chou, Huey-Wen; Wang, Yu-Fang

1999-01-01

Compares the effects of two training methods on computer attitude and performance in a World Wide Web page design program in a field experiment with high school students in Taiwan. Discusses individual differences, Kolb's Experiential Learning Theory and Learning Style Inventory, Computer Attitude Scale, and results of statistical analyses.…
Performance of High-Reliability Space-Qualified Processors Implementing Software Defined Radios

DTIC Science & Technology

2014-03-01

ADDRESS(ES) AND ADDRESS(ES) Naval Postgraduate School, Department of Electrical and Computer Engineering, 833 Dyer Road, Monterey, CA 93943-5121 8...Chairman Jeffrey D. Paduan Electrical and Computer Engineering Dean of Research iii THIS PAGE...capability. Radiation in space poses a considerable threat to modern microelectronic devices, in particular to the high-performance low-cost computing
Analyzing Log Files to Predict Students' Problem Solving Performance in a Computer-Based Physics Tutor

ERIC Educational Resources Information Center

Lee, Young-Jin

2015-01-01

This study investigates whether information saved in the log files of a computer-based tutor can be used to predict the problem solving performance of students. The log files of a computer-based physics tutoring environment called Andes Physics Tutor were analyzed to build a logistic regression model that predicted success and failure of students'…
Multivariant function model generation

NASA Technical Reports Server (NTRS)

1974-01-01

The development of computer programs applicable to space vehicle guidance was conducted. The subjects discussed are as follows: (1) determination of optimum reentry trajectories, (2) development of equations for performance of trajectory computation, (3) vehicle control for fuel optimization, (4) development of equations for performance trajectory computations, (5) applications and solution of Hamilton-Jacobi equation, and (6) stresses in dome shaped shells with discontinuities at the apex.
Process for selecting NEAMS applications for access to Idaho National Laboratory high performance computing resources

DOE Office of Scientific and Technical Information (OSTI.GOV)

Michael Pernice

2010-09-01

INL has agreed to provide participants in the Nuclear Energy Advanced Modeling and Simulation (NEAMS) program with access to its high performance computing (HPC) resources under sponsorship of the Enabling Computational Technologies (ECT) program element. This report documents the process used to select applications and the software stack in place at INL.
Fault-Tolerant Computing: An Overview

DTIC Science & Technology

1991-06-01

Addison Wesley:, Reading, MA) 1984. [8] J. Wakerly, Error Detecting Codes, Self-Checking Circuits and Applications, (Elsevier North Holland, Inc.- New York... applicable to bit-sliced organizations of hardware. In the first time step, the normal computation is performed on the operands and the results...for error detection and fault tolerance in parallel processor systems while performing specific computation-intensive applications [111. Contrary to
High Performance Computing at NASA

NASA Technical Reports Server (NTRS)

Bailey, David H.; Cooper, D. M. (Technical Monitor)

1994-01-01

The speaker will give an overview of high performance computing in the U.S. in general and within NASA in particular, including a description of the recently signed NASA-IBM cooperative agreement. The latest performance figures of various parallel systems on the NAS Parallel Benchmarks will be presented. The speaker was one of the authors of the NAS (Numerical Aerodynamic Simulation) Parallel Benchmarks, which are now widely cited in the industry as a measure of sustained performance on realistic high-end scientific applications. It will be shown that significant progress has been made by the highly parallel supercomputer industry during the past year or so, with several new systems, based on high-performance RISC processors, that now deliver superior performance per dollar compared to conventional supercomputers. Various pitfalls in reporting performance will be discussed. The speaker will then conclude by assessing the general state of the high performance computing field.
Custom Sky-Image Mosaics from NASA's Information Power Grid

NASA Technical Reports Server (NTRS)

Jacob, Joseph; Collier, James; Craymer, Loring; Curkendall, David

2005-01-01

yourSkyG is the second generation of the software described in yourSky: Custom Sky-Image Mosaics via the Internet (NPO-30556), NASA Tech Briefs, Vol. 27, No. 6 (June 2003), page 45. Like its predecessor, yourSkyG supplies custom astronomical image mosaics of sky regions specified by requesters using client computers connected to the Internet. Whereas yourSky constructs mosaics on a local multiprocessor system, yourSkyG performs the computations on NASA's Information Power Grid (IPG), which is capable of performing much larger mosaicking tasks. (The IPG is a high-performance computation and data grid that integrates geographically distributed computers, databases, and instruments.) A user of yourSkyG can specify parameters describing a mosaic to be constructed. yourSkyG then constructs the mosaic on the IPG and makes it available for downloading by the user. The complexities of determining which input images are required to construct a mosaic, retrieving the required input images from remote sky-survey archives, uploading the images to the computers on the IPG, performing the computations remotely on the Grid, and downloading the resulting mosaic from the Grid are all transparent to the user.
Advances in Engineering Software for Lift Transportation Systems

NASA Astrophysics Data System (ADS)

Kazakoff, Alexander Borisoff

2012-03-01

This paper attempts computer modelling of ropeway ski lift systems. The logic in these systems is based on a travel form between the two terminals, which operates with high capacity cabins, chairs, gondolas or draw-bars. Computer codes AUTOCAD, MATLAB and Compaq Visual Fortran (version 6.6) are used in the computer modelling. The rope systems computer modelling is organized in two stages in this paper. The first stage is organization of the ground relief profile and a design of the lift system as a whole, according to the terrain profile and the climatic and atmospheric conditions. The ground profile is prepared by the geodesists and is presented in an AUTOCAD view. The next step is the design of the lift itself, which is performed by programmes using the computer code MATLAB. The second stage of the computer modelling is performed after the optimization of the co-ordinates and the lift profile using the computer code MATLAB. Then the co-ordinates and the parameters are inserted into a program written in Compaq Visual Fortran (version 6.6), which calculates 171 lift parameters, organized in 42 tables. The objective of the work presented in this paper is an attempt at computer modelling of the design and parameters derivation of the ropeway systems and their computer variation and optimization.
GPU and APU computations of Finite Time Lyapunov Exponent fields

NASA Astrophysics Data System (ADS)

Conti, Christian; Rossinelli, Diego; Koumoutsakos, Petros

2012-03-01

We present GPU and APU accelerated computations of Finite-Time Lyapunov Exponent (FTLE) fields. The calculation of FTLEs is a computationally intensive process, as in order to obtain the sharp ridges associated with the Lagrangian Coherent Structures an extensive resampling of the flow field is required. The computational performance of this resampling is limited by the memory bandwidth of the underlying computer architecture. The present technique harnesses data-parallel execution of many-core architectures and relies on fast and accurate evaluations of moment conserving functions for the mesh to particle interpolations. We demonstrate how the computation of FTLEs can be efficiently performed on a GPU and on an APU through OpenCL and we report over one order of magnitude improvements over multi-threaded executions in FTLE computations of bluff body flows.
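For readers unfamiliar with the quantity being computed, the following is a minimal sketch of the standard FTLE definition, sigma = ln(sqrt(lambda_max(F^T F))) / |T|, where F is the gradient of the flow map over integration time T. It is plain single-point Python, not the OpenCL implementation described in the abstract; the function name `ftle_2d`, the step size `h`, and the analytic saddle flow used as input are assumptions chosen for illustration.

```python
import math


def ftle_2d(flow_map, x, y, T, h=1e-4):
    """Finite-time Lyapunov exponent at (x, y) for integration time T.

    flow_map(x, y) returns the position at time T of the particle that
    started at (x, y). Its gradient F is approximated by central differences.
    """
    xr, yr = flow_map(x + h, y)
    xl, yl = flow_map(x - h, y)
    xu, yu = flow_map(x, y + h)
    xd, yd = flow_map(x, y - h)
    f11, f12 = (xr - xl) / (2 * h), (xu - xd) / (2 * h)
    f21, f22 = (yr - yl) / (2 * h), (yu - yd) / (2 * h)
    # Cauchy-Green deformation tensor C = F^T F (symmetric 2x2).
    c11 = f11 * f11 + f21 * f21
    c12 = f11 * f12 + f21 * f22
    c22 = f12 * f12 + f22 * f22
    # Largest eigenvalue of a symmetric 2x2 matrix in closed form.
    lam_max = 0.5 * (c11 + c22) + math.sqrt((0.5 * (c11 - c22)) ** 2 + c12 * c12)
    return math.log(math.sqrt(lam_max)) / abs(T)


T_DEMO = 2.0


def saddle_flow_map(x, y):
    """Exact flow map of the linear saddle u = (x, -y) after time T_DEMO."""
    return x * math.exp(T_DEMO), y * math.exp(-T_DEMO)


print(ftle_2d(saddle_flow_map, 0.3, -0.7, T_DEMO))  # close to 1.0 for this flow
```

Each FTLE value needs four extra flow-map evaluations around the point, which hints at why the resampling step dominates the cost when the flow map itself comes from interpolating a stored velocity field.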
Impact of singular excessive computer game and television exposure on sleep patterns and memory performance of school-aged children.

PubMed

Dworak, Markus; Schierl, Thomas; Bruns, Thomas; Strüder, Heiko Klaus

2007-11-01

Television and computer game consumption are a powerful influence in the lives of most children. Previous evidence has supported the notion that media exposure could impair a variety of behavioral characteristics. Excessive television viewing and computer game playing have been associated with many psychiatric symptoms, especially emotional and behavioral symptoms, somatic complaints, attention problems such as hyperactivity, and family interaction problems. Nevertheless, there is insufficient knowledge about the relationship between singular excessive media consumption and sleep patterns, and about the linked implications for children. The aim of this study was to investigate the effects of singular excessive television and computer game consumption on sleep patterns and memory performance of children. Eleven school-aged children were recruited for this polysomnographic study. Children were exposed to voluntary excessive television and computer game consumption. In the subsequent night, polysomnographic measurements were conducted to measure sleep-architecture and sleep-continuity parameters. In addition, a visual and verbal memory test was conducted before media stimulation and after the subsequent sleeping period to determine visuospatial and verbal memory performance. Only computer game playing resulted in significantly reduced amounts of slow-wave sleep as well as significant declines in verbal memory performance. Prolonged sleep-onset latency and more stage 2 sleep were also detected after previous computer game consumption. No effects on rapid eye movement sleep were observed. Television viewing reduced sleep efficiency significantly but did not affect sleep patterns. The results suggest that television and computer game exposure affect children's sleep and deteriorate verbal cognitive performance, which supports the hypothesis of the negative influence of media consumption on children's sleep, learning, and memory.
Computational strategies for three-dimensional flow simulations on distributed computer systems. Ph.D. Thesis Semiannual Status Report, 15 Aug. 1993 - 15 Feb. 1994

NASA Technical Reports Server (NTRS)

Weed, Richard Allen; Sankar, L. N.

1994-01-01

An increasing amount of research activity in computational fluid dynamics has been devoted to the development of efficient algorithms for parallel computing systems. The increasing performance to price ratio of engineering workstations has led to research to develop procedures for implementing a parallel computing system composed of distributed workstations. This thesis proposal outlines an ongoing research program to develop efficient strategies for performing three-dimensional flow analysis on distributed computing systems. The PVM parallel programming interface was used to modify an existing three-dimensional flow solver, the TEAM code developed by Lockheed for the Air Force, to function as a parallel flow solver on clusters of workstations. Steady flow solutions were generated for three different wing and body geometries to validate the code and evaluate code performance. The proposed research will extend the parallel code development to determine the most efficient strategies for unsteady flow simulations.

PHoToNs–A parallel heterogeneous and threads oriented code for cosmological N-body simulation

NASA Astrophysics Data System (ADS)

Wang, Qiao; Cao, Zong-Yan; Gao, Liang; Chi, Xue-Bin; Meng, Chen; Wang, Jie; Wang, Long

2018-06-01

We introduce a new code for cosmological simulations, PHoToNs, which incorporates features for performing massive cosmological simulations on heterogeneous high performance computer (HPC) systems and threads oriented programming. PHoToNs adopts a hybrid scheme to compute gravitational force, with the conventional Particle-Mesh (PM) algorithm to compute the long-range force, the Tree algorithm to compute the short range force and the direct summation Particle-Particle (PP) algorithm to compute gravity from very close particles. A self-similar space-filling Peano-Hilbert curve is used to decompose the computing domain. Threads programming is advantageously used to more flexibly manage the domain communication, PM calculation and synchronization, as well as Dual Tree Traversal on the CPU+MIC platform. PHoToNs scales well and efficiency of the PP kernel achieves 68.6% of peak performance on MIC and 74.4% on CPU platforms. We also test the accuracy of the code against the widely used Gadget-2 code and find excellent agreement.
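The domain decomposition mentioned in this abstract can be sketched as follows. This is an illustration only: PHoToNs works in three dimensions and its actual decomposition code is not shown in the source, so the 2D grid, the function names `hilbert_index` and `decompose`, and the equal-count split are assumptions. The index computation is the well-known iterative mapping from grid coordinates to a position along a Hilbert curve.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a Hilbert curve on an n x n grid.

    n must be a power of two and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def decompose(points, n, num_ranks):
    """Split points into num_ranks contiguous segments of the curve.

    Cells that are close along the curve are close in space, so each
    rank receives a spatially compact set of points.
    """
    ordered = sorted(points, key=lambda p: hilbert_index(n, p[0], p[1]))
    base, extra = divmod(len(ordered), num_ranks)
    chunks, start = [], 0
    for rank in range(num_ranks):
        size = base + (1 if rank < extra else 0)
        chunks.append(ordered[start:start + size])
        start += size
    return chunks


cells = [(x, y) for x in range(4) for y in range(4)]
for rank, chunk in enumerate(decompose(cells, 4, 4)):
    print(rank, chunk)
```

With 16 cells and 4 ranks, each rank ends up with one 2 x 2 quadrant of the grid, which is the locality property that makes space-filling curves attractive for load balancing.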
Computerized systems analysis and optimization of aircraft engine performance, weight, and life cycle costs

NASA Technical Reports Server (NTRS)

Fishbach, L. H.

1979-01-01

The computational techniques utilized to determine the optimum propulsion systems for future aircraft applications and to identify system tradeoffs and technology requirements are described. The characteristics and use of the following computer codes are discussed: (1) NNEP - a very general cycle analysis code that can assemble an arbitrary matrix of fans, turbines, ducts, shafts, etc., into a complete gas turbine engine and compute on- and off-design thermodynamic performance; (2) WATE - a preliminary design procedure for calculating engine weight using the component characteristics determined by NNEP; (3) POD DRG - a table look-up program to calculate wave and friction drag of nacelles; (4) LIFCYC - a computer code developed to calculate life cycle costs of engines based on the output from WATE; and (5) INSTAL - a computer code developed to calculate installation effects, inlet performance and inlet weight. Examples are given to illustrate how these computer techniques can be applied to analyze and optimize propulsion system fuel consumption, weight, and cost for representative types of aircraft and missions.
NASA's Participation in the National Computational Grid

NASA Technical Reports Server (NTRS)

Feiereisen, William J.; Zornetzer, Steve F. (Technical Monitor)

1998-01-01

Over the last several years it has become evident that the character of NASA's supercomputing needs has changed. One of the major missions of the agency is to support the design and manufacture of aero- and space-vehicles with technologies that will significantly reduce their cost. It is becoming clear that improvements in the process of aerospace design and manufacturing will require a high performance information infrastructure that allows geographically dispersed teams to draw upon resources that are broader than traditional supercomputing. A computational grid draws together our information resources into one system. We can foresee the time when a Grid will allow engineers and scientists to use the tools of supercomputers, databases and on line experimental devices in a virtual environment to collaborate with distant colleagues. The concept of a computational grid has been spoken of for many years, but several events in recent times are conspiring to allow us to actually build one. In late 1997 the National Science Foundation initiated the Partnerships for Advanced Computational Infrastructure (PACI), which is built around the idea of distributed high performance computing. The Alliance, led by the National Computational Science Alliance (NCSA), and the National Partnership for Advanced Computational Infrastructure (NPACI), led by the San Diego Supercomputing Center, have been instrumental in drawing together the "Grid Community" to identify the technology bottlenecks and propose a research agenda to address them. During the same period NASA has begun to reformulate parts of two major high performance computing research programs to concentrate on distributed high performance computing and has banded together with the PACI centers to address the research agenda in common.
Computational structural mechanics engine structures computational simulator

NASA Technical Reports Server (NTRS)

Chamis, C. C.

1989-01-01

The Computational Structural Mechanics (CSM) program at Lewis encompasses: (1) fundamental aspects for formulating and solving structural mechanics problems, and (2) development of integrated software systems to computationally simulate the performance/durability/life of engine structures.
Tools for 3D scientific visualization in computational aerodynamics

NASA Technical Reports Server (NTRS)

Bancroft, Gordon; Plessel, Todd; Merritt, Fergus; Watson, Val

1989-01-01

The purpose is to describe the tools and techniques in use at the NASA Ames Research Center for performing visualization of computational aerodynamics, for example visualization of flow fields from computer simulations of fluid dynamics about vehicles such as the Space Shuttle. The hardware used for visualization is a high-performance graphics workstation connected to a supercomputer with a high speed channel. At present, the workstation is a Silicon Graphics IRIS 3130, the supercomputer is a CRAY-2, and the high speed channel is a hyperchannel. The three techniques used for visualization are post-processing, tracking, and steering. Post-processing analysis is done after the simulation. Tracking analysis is done during a simulation but is not interactive, whereas steering analysis involves modifying the simulation interactively during the simulation. Using post-processing methods, a flow simulation is executed on a supercomputer and, after the simulation is complete, the results of the simulation are processed for viewing. The software in use and under development at NASA Ames Research Center for performing these types of tasks in computational aerodynamics is described. Workstation performance issues, benchmarking, and high-performance networks for this purpose are also discussed as well as descriptions of other hardware for digital video and film recording.
Large-scale optimization-based non-negative computational framework for diffusion equations: Parallel implementation and performance studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chang, Justin; Karra, Satish; Nakshatrala, Kalyana B.

It is well-known that the standard Galerkin formulation, which is often the formulation of choice under the finite element method for solving self-adjoint diffusion equations, does not meet maximum principles and the non-negative constraint for anisotropic diffusion equations. Recently, optimization-based methodologies that satisfy maximum principles and the non-negative constraint for steady-state and transient diffusion-type equations have been proposed. To date, these methodologies have been tested only on small-scale academic problems. The purpose of this paper is to systematically study the performance of the non-negative methodology in the context of high performance computing (HPC). PETSc and TAO libraries are, respectively, used for the parallel environment and optimization solvers. For large-scale problems, it is important for computational scientists to understand the computational performance of current algorithms available in these scientific libraries. The numerical experiments are conducted on the state-of-the-art HPC systems, and a single-core performance model is used to better characterize the efficiency of the solvers. Furthermore, our studies indicate that the proposed non-negative computational framework for diffusion-type equations exhibits excellent strong scaling for real-world large-scale problems.
Large-scale optimization-based non-negative computational framework for diffusion equations: Parallel implementation and performance studies

DOE PAGES

Chang, Justin; Karra, Satish; Nakshatrala, Kalyana B.

2016-07-26

It is well-known that the standard Galerkin formulation, which is often the formulation of choice under the finite element method for solving self-adjoint diffusion equations, does not meet maximum principles and the non-negative constraint for anisotropic diffusion equations. Recently, optimization-based methodologies that satisfy maximum principles and the non-negative constraint for steady-state and transient diffusion-type equations have been proposed. To date, these methodologies have been tested only on small-scale academic problems. The purpose of this paper is to systematically study the performance of the non-negative methodology in the context of high performance computing (HPC). PETSc and TAO libraries are, respectively, used for the parallel environment and optimization solvers. For large-scale problems, it is important for computational scientists to understand the computational performance of current algorithms available in these scientific libraries. The numerical experiments are conducted on the state-of-the-art HPC systems, and a single-core performance model is used to better characterize the efficiency of the solvers. Furthermore, our studies indicate that the proposed non-negative computational framework for diffusion-type equations exhibits excellent strong scaling for real-world large-scale problems.
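The abstract reports "excellent strong scaling" without giving numbers in this listing. As a reminder of how that property is usually quantified, here is a small sketch that computes speedup and parallel efficiency for a fixed problem size. The function name `strong_scaling` and the timings are hypothetical and are not taken from the paper.

```python
def strong_scaling(timings):
    """Speedup and efficiency relative to the smallest process count.

    timings maps process count -> wall-clock seconds for the SAME
    problem size. Returns a list of (processes, speedup, efficiency).
    """
    base_p = min(timings)
    base_t = timings[base_p]
    rows = []
    for p in sorted(timings):
        speedup = base_t / timings[p]
        efficiency = speedup * base_p / p  # 1.0 means ideal scaling
        rows.append((p, speedup, efficiency))
    return rows


# Made-up wall-clock times, for illustration only.
example = {1: 100.0, 2: 52.0, 4: 27.0, 8: 15.0}
for p, speedup, efficiency in strong_scaling(example):
    print(f"{p:3d} procs  speedup {speedup:5.2f}  efficiency {efficiency:4.2f}")
```

Efficiency that stays near 1.0 as the process count grows is what "excellent strong scaling" typically means.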
Geocomputation over Hybrid Computer Architecture and Systems: Prior Works and On-going Initiatives at UARK

NASA Astrophysics Data System (ADS)

Shi, X.

2015-12-01

As NSF indicated: "Theory and experimentation have for centuries been regarded as two fundamental pillars of science. It is now widely recognized that computational and data-enabled science forms a critical third pillar." Geocomputation is the third pillar of GIScience and geosciences. With the exponential growth of geodata, the challenge of scalable and high-performance computing for big data analytics has become urgent, because many research activities are constrained by software or tools that cannot even complete the computation. Heterogeneous geodata integration and analytics obviously magnify the complexity and operational time frame. Many large-scale geospatial problems may not be processable at all if the computer system does not have sufficient memory or computational power. Emerging computer architectures, such as Intel's Many Integrated Core (MIC) Architecture and Graphics Processing Unit (GPU), and advanced computing technologies provide promising solutions to employ massive parallelism and hardware resources to achieve scalability and high performance for data intensive computing over large spatiotemporal and social media data. Exploring novel algorithms and deploying the solutions in massively parallel computing environments to achieve scalable data processing and analytics over large-scale, complex, and heterogeneous geodata with consistent quality and high performance has been the central theme of our research team in the Department of Geosciences at the University of Arkansas (UARK). New multi-core architectures combined with application accelerators hold the promise to achieve scalability and high performance by exploiting task and data levels of parallelism that are not supported by the conventional computing systems.
Such a parallel or distributed computing environment is particularly suitable for large-scale geocomputation over big data as proved by our prior works, while the potential of such advanced infrastructure remains unexplored in this domain. Within this presentation, our prior and on-going initiatives will be summarized to exemplify how we exploit multicore CPUs, GPUs, and MICs, and clusters of CPUs, GPUs and MICs, to accelerate geocomputation in different applications.
Engineering and Computing Portal to Solve Environmental Problems

NASA Astrophysics Data System (ADS)

Gudov, A. M.; Zavozkin, S. Y.; Sotnikov, I. Y.

2018-01-01

This paper describes the architecture and services of the Engineering and Computing Portal, a comprehensive solution that provides access to high-performance computing resources and enables users to carry out computational experiments, teach parallel technologies, and solve computing tasks, including technogenic safety ones.
Effects of portable computing devices on posture, muscle activation levels and efficiency.

PubMed

Werth, Abigail; Babski-Reeves, Kari

2014-11-01

Very little research exists on ergonomic exposures when using portable computing devices. This study quantified muscle activity (forearm and neck), posture (wrist, forearm and neck), and performance (gross typing speed and error rates) differences across three portable computing devices (laptop, netbook, and slate computer) and two work settings (desk and sofa) during data entry tasks. Twelve participants completed test sessions on a single computer using a test-rest-test protocol (30 min of work at one work setting, 15 min of rest, 30 min of work at the other work setting). The slate computer resulted in significantly more non-neutral wrist, elbow and neck postures, particularly when working on the sofa. Performance on the slate computer was four times less than that of the other computers, though lower muscle activity levels were also found. Potential for injury or illness may be elevated when working on smaller, portable computers in non-traditional work settings. Copyright © 2014 Elsevier Ltd and The Ergonomics Society. All rights reserved.
Running Jobs on the Peregrine System | High-Performance Computing | NREL

Science.gov Websites

on the Peregrine high-performance computing (HPC) system. Running Different Types of Jobs Batch jobs scheduling policies - queue names, limits, etc. Requesting different node types Sample batch scripts
Silicon photonics for high-performance interconnection networks

NASA Astrophysics Data System (ADS)

Biberman, Aleksandr

2011-12-01

We assert in the course of this work that silicon photonics has the potential to be a key disruptive technology in computing and communication industries. The enduring pursuit of performance gains in computing, combined with stringent power constraints, has fostered the ever-growing computational parallelism associated with chip multiprocessors, memory systems, high-performance computing systems, and data centers. Sustaining these parallelism growths introduces unique challenges for on- and off-chip communications, shifting the focus toward novel and fundamentally different communication approaches. This work showcases that chip-scale photonic interconnection networks, enabled by high-performance silicon photonic devices, enable unprecedented bandwidth scalability with reduced power consumption. We demonstrate that the silicon photonic platforms have already produced all the high-performance photonic devices required to realize these types of networks. Through extensive empirical characterization in much of this work, we demonstrate such feasibility of waveguides, modulators, switches, and photodetectors. We also demonstrate systems that simultaneously combine many functionalities to achieve more complex building blocks. Furthermore, we leverage the unique properties of available silicon photonic materials to create novel silicon photonic devices, subsystems, network topologies, and architectures to enable unprecedented performance of these photonic interconnection networks and computing systems. We show that the advantages of photonic interconnection networks extend far beyond the chip, offering advanced communication environments for memory systems, high-performance computing systems, and data centers. Furthermore, we explore the immense potential of all-optical functionalities implemented using parametric processing in the silicon platform, demonstrating unique methods that have the ability to revolutionize computation and communication. 
Silicon photonics enables new sets of opportunities that we can leverage for performance gains, as well as new sets of challenges that we must solve. Leveraging its inherent compatibility with standard fabrication techniques of the semiconductor industry, combined with its capability of dense integration with advanced microelectronics, silicon photonics also offers a clear path toward commercialization through low-cost mass-volume production. Combining empirical validations of feasibility, demonstrations of massive performance gains in large-scale systems, and the potential for commercial penetration of silicon photonics, the impact of this work will become evident in the many decades that follow.
Gigaflop performance on a CRAY-2: Multitasking a computational fluid dynamics application

NASA Technical Reports Server (NTRS)

Tennille, Geoffrey M.; Overman, Andrea L.; Lambiotte, Jules J.; Streett, Craig L.

1991-01-01

The methodology is described for converting a large, long-running applications code that executed on a single processor of a CRAY-2 supercomputer to a version that executed efficiently on multiple processors. Although the conversion of every application is different, a discussion of the types of modification used to achieve gigaflop performance is included to assist others in the parallelization of applications for CRAY computers, especially those that were developed for other computers. An existing application, from the discipline of computational fluid dynamics, that had utilized over 2000 hrs of CPU time on CRAY-2 during the previous year was chosen as a test case to study the effectiveness of multitasking on a CRAY-2. The nature of dominant calculations within the application indicated that a sustained computational rate of 1 billion floating-point operations per second, or 1 gigaflop, might be achieved. The code was first analyzed and modified for optimal performance on a single processor in a batch environment. After optimal performance on a single CPU was achieved, the code was modified to use multiple processors in a dedicated environment. The results of these two efforts were merged into a single code that had a sustained computational rate of over 1 gigaflop on a CRAY-2. Timings and analysis of performance are given for both single- and multiple-processor runs.
NASA HPCC Technology for Aerospace Analysis and Design

NASA Technical Reports Server (NTRS)

Schulbach, Catherine H.

1999-01-01

The Computational Aerosciences (CAS) Project is part of NASA's High Performance Computing and Communications Program. Its primary goal is to accelerate the availability of high-performance computing technology to the US aerospace community, thus providing that community with key tools necessary to reduce design cycle times and increase fidelity in order to improve safety, efficiency and capability of future aerospace vehicles. A complementary goal is to hasten the emergence of a viable commercial market within the aerospace community for the advantage of the domestic computer hardware and software industry. The CAS Project selects representative aerospace problems (especially design) and uses them to focus efforts on advancing aerospace algorithms and applications, systems software, and computing machinery to demonstrate vast improvements in system performance and capability over the life of the program. Recent demonstrations have served to assess the benefits of possible performance improvements while reducing the risk of adopting high-performance computing technology. This talk will discuss past accomplishments in providing technology to the aerospace community, present efforts, and future goals. For example, the times to do full combustor and compressor simulations (of aircraft engines) have been reduced by factors of 320:1 and 400:1 respectively. While this has enabled new capabilities in engine simulation, the goal of an overnight, dynamic, multi-disciplinary, 3-dimensional simulation of an aircraft engine is still years away and will require new generations of high-end technology.
Organization of the secure distributed computing based on multi-agent system

NASA Astrophysics Data System (ADS)

Khovanskov, Sergey; Rumyantsev, Konstantin; Khovanskova, Vera

2018-04-01

Methods for distributed computing are currently receiving much attention. One such method is the use of multi-agent systems. Distributed computing organized on conventional networked computers can be exposed to security threats posed by computational processes. The authors have developed a unified agent algorithm for a system that controls the operation of computing network nodes. Network PCs are used as computing nodes. The proposed multi-agent control system makes it possible to quickly harness the processing power of the computers on any existing network to solve large tasks by creating a distributed computing system. Agents deployed on a computer network can configure a distributed computing system, distribute the computational load among the computers operated by agents, and optimize the distributed computing system according to the computing power of the computers on the network. The number of computers connected to the network can be increased by connecting computers to the new computer system, which leads to an increase in overall processing power. Adding a central agent to the multi-agent system increases the security of distributed computing. This organization of the distributed computing system reduces problem-solving time and increases the fault tolerance (vitality) of computing processes in a changing computing environment (dynamic change of the number of computers on the network). The developed multi-agent system detects cases of falsification of the results of the distributed system, which may lead to wrong decisions. In addition, the system checks and corrects wrong results.
Analysis and selection of optimal function implementations in massively parallel computer

DOEpatents

Archer, Charles Jens [Rochester, MN]; Peters, Amanda [Rochester, MN]; Ratterman, Joseph D [Rochester, MN]

2011-05-31

An apparatus, program product and method optimize the operation of a parallel computer system by, in part, collecting performance data for a set of implementations of a function capable of being executed on the parallel computer system based upon the execution of the set of implementations under varying input parameters in a plurality of input dimensions. The collected performance data may be used to generate selection program code that is configured to call selected implementations of the function in response to a call to the function under varying input parameters. The collected performance data may be used to perform more detailed analysis to ascertain the comparative performance of the set of implementations of the function under the varying input parameters.
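The idea in this abstract, profiling several implementations of one function and then dispatching to the best one for a given input, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the two sort routines, the single input dimension (list length), and the helper names `profile` and `make_selector` are hypothetical, and a closure stands in for the generated selection program code the patent describes.

```python
import time

def insertion_sort(values):
    """Simple O(n^2) sort; often competitive on very short inputs."""
    out = list(values)
    for i in range(1, len(out)):
        item = out[i]
        j = i - 1
        while j >= 0 and out[j] > item:
            out[j + 1] = out[j]
            j -= 1
        out[j + 1] = item
    return out

def builtin_sort(values):
    """Wrapper around Python's built-in sort."""
    return sorted(values)

def profile(implementations, sample_inputs, repeats=3):
    """Time every implementation on every sample; return {input_size: fastest_name}."""
    table = {}
    for data in sample_inputs:
        timings = {}
        for name, func in implementations.items():
            best = float("inf")
            for _ in range(repeats):
                start = time.perf_counter()
                func(list(data))
                best = min(best, time.perf_counter() - start)
            timings[name] = best
        table[len(data)] = min(timings, key=timings.get)
    return table

def make_selector(implementations, table):
    """Build a dispatcher that calls the implementation profiled as fastest."""
    sizes = sorted(table)
    def selected(data):
        # Use the profiled input size closest to this call's input size.
        nearest = min(sizes, key=lambda s: abs(s - len(data)))
        return implementations[table[nearest]](data)
    return selected

implementations = {"insertion": insertion_sort, "builtin": builtin_sort}
samples = [list(range(n, 0, -1)) for n in (8, 64, 512)]
selection_table = profile(implementations, samples)
sort_any = make_selector(implementations, selection_table)
sorted_result = sort_any([3, 1, 2])
```

Which implementation wins at each size depends on the machine, so the contents of `selection_table` are not predictable in advance; only the result of the dispatched call is.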
DET/MPS - The GSFC Energy Balance Programs

NASA Technical Reports Server (NTRS)

Jagielski, J. M.

1994-01-01

Direct Energy Transfer (DET) and MultiMission Spacecraft Modular Power System (MPS) computer programs perform mathematical modeling and simulation to aid in design and analysis of DET and MPS spacecraft power system performance in order to determine energy balance of subsystem. DET spacecraft power system feeds output of solar photovoltaic array and nickel cadmium batteries directly to spacecraft bus. In MPS system, Standard Power Regulator Unit (SPRU) utilized to operate array at array's peak power point. DET and MPS perform minute-by-minute simulation of performance of power system. Results of simulation focus mainly on output of solar array and characteristics of batteries. Although both packages limited in terms of orbital mechanics, they have sufficient capability to calculate data on eclipses and performance of arrays for circular or near-circular orbits. DET and MPS written in FORTRAN-77 with some VAX FORTRAN-type extensions. Both available in three versions: GSC-13374, for DEC VAX-series computers running VMS; GSC-13443, for UNIX-based computers; and GSC-13444, for Apple Macintosh computers.
A Computational Framework for Efficient Low Temperature Plasma Simulations

NASA Astrophysics Data System (ADS)

Verma, Abhishek Kumar; Venkattraman, Ayyaswamy

2016-10-01

Over the past years, scientific computing has emerged as an essential tool for the investigation and prediction of low temperature plasma (LTP) applications, which include electronics, nanomaterial synthesis, metamaterials, etc. To further explore LTP behavior with greater fidelity, we present a computational toolbox developed to perform LTP simulations. This framework will allow us to enhance our understanding of multiscale plasma phenomena using high performance computing tools mainly based on the OpenFOAM FVM distribution. Although aimed at microplasma simulations, the modular framework is able to perform multiscale, multiphysics simulations of physical systems comprising LTP. Salient features include the capability to perform parallel, 3D simulations of LTP applications on unstructured meshes. Performance of the solver is tested based on numerical results assessing accuracy and efficiency of benchmarks for problems in microdischarge devices. Numerical simulation of a microplasma reactor at atmospheric pressure with hemispherical dielectric coated electrodes will be discussed to provide an overview of the applicability and future scope of this framework.
High-performance computing with quantum processing units

DOE PAGES

Britt, Keith A.; Oak Ridge National Lab.; Humble, Travis S.; ...

2017-03-01

The prospects of quantum computing have driven efforts to realize fully functional quantum processing units (QPUs). Recent success in developing proof-of-principle QPUs has prompted the question of how to integrate these emerging processors into modern high-performance computing (HPC) systems. We examine how QPUs can be integrated into current and future HPC system architectures by accounting for functional and physical design requirements. We identify two integration pathways that are differentiated by infrastructure constraints on the QPU and the use cases expected for the HPC system. This includes a tight integration that assumes infrastructure bottlenecks can be overcome as well as a loose integration that assumes they cannot. We find that the performance of both approaches is likely to depend on the quantum interconnect that serves to entangle multiple QPUs. As a result, we also identify several challenges in assessing QPU performance for HPC, and we consider new metrics that capture the interplay between system architecture and the quantum parallelism underlying computational performance.
Performance Evaluation of Counter-Based Dynamic Load Balancing Schemes for Massive Contingency Analysis with Different Computing Environments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Yousu; Huang, Zhenyu; Chavarría-Miranda, Daniel

Contingency analysis is a key function in the Energy Management System (EMS) to assess the impact of various combinations of power system component failures based on state estimation. Contingency analysis is also extensively used in power market operation for feasibility test of market solutions. High performance computing holds the promise of faster analysis of more contingency cases for the purpose of safe and reliable operation of today’s power grids with less operating margin and more intermittent renewable energy sources. This paper evaluates the performance of counter-based dynamic load balancing schemes for massive contingency analysis under different computing environments. Insights from the performance evaluation can be used as guidance for users to select suitable schemes in the application of massive contingency analysis. Case studies, as well as MATLAB simulations, of massive contingency cases using the Western Electricity Coordinating Council power grid model are presented to illustrate the application of high performance computing with counter-based dynamic load balancing schemes.
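As a rough illustration of counter-based dynamic load balancing in general, the sketch below has worker threads claim the next unprocessed case by incrementing a shared counter, so faster workers naturally take more cases. It is not the scheme evaluated in the paper: the thread-based setup, the function name `run_with_shared_counter`, and the stand-in `toy_analysis` are all assumptions made for illustration.

```python
import threading

def run_with_shared_counter(cases, analyze, num_workers=4):
    """Process every case exactly once; workers claim work via a shared counter."""
    results = [None] * len(cases)
    next_index = 0               # the shared task counter
    lock = threading.Lock()      # makes fetch-and-increment atomic

    def worker():
        nonlocal next_index
        while True:
            with lock:           # claim the next unprocessed case
                i = next_index
                next_index += 1
            if i >= len(cases):
                return           # counter ran past the end: no work left
            results[i] = analyze(cases[i])

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

def toy_analysis(outage_id):
    """Hypothetical stand-in for analyzing one contingency case."""
    return outage_id * outage_id

squares = run_with_shared_counter(list(range(10)), toy_analysis, num_workers=3)
```

Because each index is claimed under the lock, no case is analyzed twice, and results land in input order regardless of which worker handled them.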

High-performance computing with quantum processing units

DOE Office of Scientific and Technical Information (OSTI.GOV)

Britt, Keith A.; Oak Ridge National Lab.; Humble, Travis S.

The prospects of quantum computing have driven efforts to realize fully functional quantum processing units (QPUs). Recent success in developing proof-of-principle QPUs has prompted the question of how to integrate these emerging processors into modern high-performance computing (HPC) systems. We examine how QPUs can be integrated into current and future HPC system architectures by accounting for functional and physical design requirements. We identify two integration pathways that are differentiated by infrastructure constraints on the QPU and the use cases expected for the HPC system. This includes a tight integration that assumes infrastructure bottlenecks can be overcome as well as a loose integration that assumes they cannot. We find that the performance of both approaches is likely to depend on the quantum interconnect that serves to entangle multiple QPUs. As a result, we also identify several challenges in assessing QPU performance for HPC, and we consider new metrics that capture the interplay between system architecture and the quantum parallelism underlying computational performance.
Scout: high-performance heterogeneous computing made simple

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jablin, James; Mc Cormick, Patrick; Herlihy, Maurice

2011-01-26

Researchers must often write their own simulation and analysis software. During this process they simultaneously confront both computational and scientific problems. Current strategies for aiding the generation of performance-oriented programs do not abstract the software development from the science. Furthermore, the problem is becoming increasingly complex and pressing with the continued development of many-core and heterogeneous (CPU-GPU) architectures. To achieve high performance, scientists must expertly navigate both software and hardware. Co-design between computer scientists and research scientists can alleviate but not solve this problem. The science community requires better tools for developing, optimizing, and future-proofing codes, allowing scientists to focus on their research while still achieving high computational performance. Scout is a parallel programming language and extensible compiler framework targeting heterogeneous architectures. It provides the abstraction required to buffer scientists from the constantly-shifting details of hardware while still realizing high-performance by encapsulating software and hardware optimization within a compiler framework.
H.R. 656--The High Performance Computer Technology Act of 1991. Hearing before the Subcommittee on Science, and the Subcommittee on Technology and Competitiveness of the Committee on Science, Space, and Technology. U.S. House of Representatives, One Hundred Second Congress, First Session.

ERIC Educational Resources Information Center

Congress of the U.S., Washington, DC. House Committee on Science, Space and Technology.

This hearing focused on H. R. 656, companion bill of S. 272, which calls for high performance computing legislation. This is one of several initiatives to provide for a coordinated federal research program to ensure continued U.S. leadership in high performance computing. The bill authorizes the development of a National Research and Education…
Nuclear Analysis

NASA Technical Reports Server (NTRS)

Clement, J. D.; Kirby, K. D.

1973-01-01

Exploratory calculations were performed for several gas core breeder reactor configurations. The computational method involved the use of the MACH-1 one dimensional diffusion theory code and the THERMOS integral transport theory code for thermal cross sections. Computations were performed to analyze thermal breeder concepts and nonbreeder concepts. Analysis of breeders was restricted to the (U-233)-Th breeding cycle, and computations were performed to examine a range of parameters. These parameters include U-233 to hydrogen atom ratio in the gaseous cavity, carbon to thorium atom ratio in the breeding blanket, cavity size, and blanket size.
Modeling Students' Problem Solving Performance in the Computer-Based Mathematics Learning Environment

ERIC Educational Resources Information Center

Lee, Young-Jin

2017-01-01

Purpose: The purpose of this paper is to develop a quantitative model of problem solving performance of students in the computer-based mathematics learning environment. Design/methodology/approach: Regularized logistic regression was used to create a quantitative model of problem solving performance of students that predicts whether students can…
21 CFR 1271.160 - Establishment and maintenance of a quality program.

Code of Federal Regulations, 2014 CFR

2014-04-01

... perform for management review a quality audit, as defined in § 1271.3(gg), of activities related to core CGTP requirements. (d) Computers. You must validate the performance of computer software for the intended use, and the performance of any changes to that software for the intended use, if you rely upon...
21 CFR 1271.160 - Establishment and maintenance of a quality program.

Code of Federal Regulations, 2012 CFR

2012-04-01

... perform for management review a quality audit, as defined in § 1271.3(gg), of activities related to core CGTP requirements. (d) Computers. You must validate the performance of computer software for the intended use, and the performance of any changes to that software for the intended use, if you rely upon...
21 CFR 1271.160 - Establishment and maintenance of a quality program.

Code of Federal Regulations, 2011 CFR

2011-04-01

... perform for management review a quality audit, as defined in § 1271.3(gg), of activities related to core CGTP requirements. (d) Computers. You must validate the performance of computer software for the intended use, and the performance of any changes to that software for the intended use, if you rely upon...
21 CFR 1271.160 - Establishment and maintenance of a quality program.

Code of Federal Regulations, 2013 CFR

2013-04-01

... perform for management review a quality audit, as defined in § 1271.3(gg), of activities related to core CGTP requirements. (d) Computers. You must validate the performance of computer software for the intended use, and the performance of any changes to that software for the intended use, if you rely upon...
Comparing Student Performance on Paper-and-Pencil and Computer-Based-Tests

ERIC Educational Resources Information Center

Hardcastle, Joseph; Herrmann-Abell, Cari F.; DeBoer, George E.

2017-01-01

Can student performance on computer-based tests (CBT) and paper-and-pencil tests (PPT) be considered equivalent measures of student knowledge? States and school districts are grappling with this question, and although studies addressing this question are growing, additional research is needed. We report on the performance of students who took…
ASIC For Complex Fixed-Point Arithmetic

NASA Technical Reports Server (NTRS)

Petilli, Stephen G.; Grimm, Michael J.; Olson, Erlend M.

1995-01-01

Application-specific integrated circuit (ASIC) performs 24-bit, fixed-point arithmetic operations on arrays of complex-valued input data. High-performance, wide-band arithmetic logic unit (ALU) designed for use in computing fast Fourier transforms (FFTs) and for performing digital filtering functions. Other applications include general computations involved in analysis of spectra and digital signal processing.
Elastic Cloud Computing Architecture and System for Heterogeneous Spatiotemporal Computing

NASA Astrophysics Data System (ADS)

Shi, X.

2017-10-01

Spatiotemporal computation implements a variety of different algorithms. When big data are involved, a desktop computer or standalone application may not be able to complete the computation task due to limited memory and computing power. Now that a variety of hardware accelerators and computing platforms are available to improve the performance of geocomputation, different algorithms may behave differently on different computing infrastructure and platforms. Some are well suited to implementation on a cluster of graphics processing units (GPUs), while GPUs may not be useful for certain kinds of spatiotemporal computation. The same holds when utilizing a cluster of Intel's many-integrated-core (MIC) or Xeon Phi processors, as well as Hadoop or Spark platforms, to handle big spatiotemporal data. Furthermore, considering the energy efficiency requirement in general computation, a Field Programmable Gate Array (FPGA) may be a better solution for energy efficiency when its computational performance is similar to or better than that of GPUs and MICs. It is expected that an elastic cloud computing architecture and system that integrates GPUs, MICs, and FPGAs could be developed and deployed to support spatiotemporal computing over heterogeneous data types and computational problems.
Benchmarking Memory Performance with the Data Cube Operator

NASA Technical Reports Server (NTRS)

Frumkin, Michael A.; Shabanov, Leonid V.

2004-01-01

Data movement across a computer memory hierarchy and across computational grids is known to be a limiting factor for applications processing large data sets. We use the Data Cube Operator on an Arithmetic Data Set, called ADC, to benchmark capabilities of computers and of computational grids to handle large distributed data sets. We present a prototype implementation of a parallel algorithm for computation of the operator. The algorithm follows a known approach for computing views from the smallest parent. The ADC stresses all levels of grid memory and storage by producing some of 2^d views of an Arithmetic Data Set of d-tuples described by a small number of integers. We control data intensity of the ADC by selecting the tuple parameters, the sizes of the views, and the number of realized views. Benchmarking results of memory performance of a number of computer architectures and of a small computational grid are presented.
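The "smallest parent" approach mentioned in this abstract can be sketched in a few lines: each view (a group-by over a subset of the d dimensions) is aggregated from the smallest already-computed view that has one more dimension, rather than from the raw data. This is a serial toy under stated assumptions, not the benchmark's parallel implementation: the function name `data_cube`, the sum aggregate, and the sample tuples are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def data_cube(tuples, d):
    """Return {dims: {key: total}} for all 2**d subsets of the d dimensions.

    Each input tuple is (attr_0, ..., attr_{d-1}, measure); the aggregate is a sum.
    """
    full = tuple(range(d))
    base = defaultdict(int)
    for *attrs, measure in tuples:
        base[tuple(attrs)] += measure
    views = {full: dict(base)}

    # Walk from most to fewest dimensions so parents exist before their children.
    for k in range(d - 1, -1, -1):
        for dims in combinations(full, k):
            # Candidate parents are computed views with exactly one extra dimension.
            parents = [p for p in views if len(p) == k + 1 and set(dims) <= set(p)]
            parent = min(parents, key=lambda p: len(views[p]))  # smallest parent
            keep = [parent.index(x) for x in dims]  # positions of dims in parent keys
            view = defaultdict(int)
            for key, total in views[parent].items():
                view[tuple(key[j] for j in keep)] += total
            views[dims] = dict(view)
    return views

# Hypothetical 2-dimensional data set: (region, product, amount).
sales = [("a", "x", 1), ("a", "y", 2), ("b", "x", 3)]
cube = data_cube(sales, d=2)
```

Choosing the parent with the fewest rows reduces the number of tuples scanned per view, which is the point of the heuristic when views are large.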
Characterization of real-time computers

NASA Technical Reports Server (NTRS)

Shin, K. G.; Krishna, C. M.

1984-01-01

A real-time system consists of a computer controller and controlled processes. Despite the synergistic relationship between these two components, they have been traditionally designed and analyzed independently of and separately from each other; namely, computer controllers by computer scientists/engineers and controlled processes by control scientists. As a remedy for this problem, in this report real-time computers are characterized by performance measures based on computer controller response time that are: (1) congruent to the real-time applications, (2) able to offer an objective comparison of rival computer systems, and (3) experimentally measurable/determinable. These measures, unlike others, provide the real-time computer controller with a natural link to controlled processes. In order to demonstrate their utility and power, these measures are first determined for example controlled processes on the basis of control performance functionals. They are then used for two important real-time multiprocessor design applications - the number-power tradeoff and fault-masking and synchronization.
Using Modeling and Simulation to Complement Testing for Increased Understanding of Weapon Subassembly Response.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wong, Michael K.; Davidson, Megan

As part of Sandia’s nuclear deterrence mission, the B61-12 Life Extension Program (LEP) aims to modernize the aging weapon system. Modernization requires requalification and Sandia is using high performance computing to perform advanced computational simulations to better understand, evaluate, and verify weapon system performance in conjunction with limited physical testing. The Nose Bomb Subassembly (NBSA) of the B61-12 is responsible for producing a fuzing signal upon ground impact. The fuzing signal is dependent upon electromechanical impact sensors producing valid electrical fuzing signals at impact. Computer generated models were used to assess the timing between the impact sensor’s response to the deceleration of impact and damage to major components and system subassemblies. The modeling and simulation team worked alongside the physical test team to design a large-scale reverse ballistic test to not only assess system performance, but to also validate their computational models. The reverse ballistic test conducted at Sandia’s sled test facility sent a rocket sled with a representative target into a stationary B61-12 (NBSA) to characterize the nose crush and functional response of NBSA components. Data obtained from data recorders and high-speed photometrics were integrated with previously generated computer models in order to refine and validate the model’s ability to reliably simulate real-world effects. Large-scale tests are impractical to conduct for every single impact scenario. By creating reliable computer models, we can perform simulations that identify trends and produce estimates of outcomes over the entire range of required impact conditions. Sandia’s HPCs enable geometric resolution that was unachievable before, allowing for more fidelity and detail, and creating simulations that can provide insight to support evaluation of requirements and performance margins.
As computing resources continue to improve, researchers at Sandia are hoping to improve these simulations so they provide increasingly credible analysis of the system response and performance over the full range of conditions.
Exascale computing and big data

DOE PAGES

Reed, Daniel A.; Dongarra, Jack

2015-06-25

Scientific discovery and engineering innovation require unifying traditionally separated high-performance computing and big data analytics. The tools and cultures of high-performance computing and big data analytics have diverged, to the detriment of both; unification is essential to address a spectrum of major research domains. The challenges of scale tax our ability to transmit data, compute complicated functions on that data, or store a substantial part of it; new approaches are required to meet these challenges. Finally, the international nature of science demands further development of advanced computer architectures and global standards for processing data, even as international competition complicates the openness of the scientific process.
Exascale computing and big data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Reed, Daniel A.; Dongarra, Jack

Scientific discovery and engineering innovation require unifying traditionally separated high-performance computing and big data analytics. The tools and cultures of high-performance computing and big data analytics have diverged, to the detriment of both; unification is essential to address a spectrum of major research domains. The challenges of scale tax our ability to transmit data, compute complicated functions on that data, or store a substantial part of it; new approaches are required to meet these challenges. Finally, the international nature of science demands further development of advanced computer architectures and global standards for processing data, even as international competition complicates the openness of the scientific process.
Embedded assessment algorithms within home-based cognitive computer game exercises for elders.

PubMed

Jimison, Holly; Pavel, Misha

2006-01-01

With the recent consumer interest in computer-based activities designed to improve cognitive performance, there is a growing need for scientific assessment algorithms to validate the potential contributions of cognitive exercises. In this paper, we present a novel methodology for incorporating dynamic cognitive assessment algorithms within computer games designed to enhance cognitive performance. We describe how this approach works for a variety of computer applications and describe cognitive monitoring results for one of the computer game exercises. The real-time cognitive assessments also provide a control signal for adapting the difficulty of the game exercises and providing tailored help for elders of varying abilities.
The Performance Improvement of the Lagrangian Particle Dispersion Model (LPDM) Using Graphics Processing Unit (GPU) Computing

DTIC Science & Technology

2017-08-01

... access to the GPU for general purpose processing. CUDA is designed to work easily with multiple programming languages, including Fortran. CUDA is a ... By Leelinda P Dawson. Approved for public release; distribution unlimited.
DURIP: High Performance Computing in Biomathematics Applications

DTIC Science & Technology

2017-05-10

The goal of this award was to enhance the capabilities of the Department of Applied Mathematics and Statistics (AMS) at the University of California, Santa Cruz (UCSC) to conduct research and research-related education in areas of ...

High performance computing for advanced modeling and simulation of materials

NASA Astrophysics Data System (ADS)

Wang, Jue; Gao, Fei; Vazquez-Poletti, Jose Luis; Li, Jianjiang

2017-02-01

The First International Workshop on High Performance Computing for Advanced Modeling and Simulation of Materials (HPCMS2015) was held in Austin, Texas, USA, Nov. 18, 2015. HPCMS 2015 was organized by Computer Network Information Center (Chinese Academy of Sciences), University of Michigan, Universidad Complutense de Madrid, University of Science and Technology Beijing, Pittsburgh Supercomputing Center, China Institute of Atomic Energy, and Ames Laboratory.
On the rational design of compressible flow ejectors

NASA Technical Reports Server (NTRS)

Ortwerth, P. J.

1979-01-01

A fluid mechanics review of chemical laser ejectors is presented. The characteristics of ejectors with single and multiple driver nozzles are discussed. Methods to compute an optimized performance map in which secondary Mach number and performance are computed versus mass ratio, to compute the flow distortion at each optimized condition, and to determine the thrust area for the design point to match diffuser impedance are examined.
Graphics Processing Unit Assisted Thermographic Compositing

NASA Technical Reports Server (NTRS)

Ragasa, Scott; Russell, Samuel S.

2012-01-01

Objective: Develop a software application utilizing high performance computing techniques, including general purpose graphics processing units (GPGPUs), for the analysis and visualization of large thermographic data sets. Over the past several years, an increasing effort among scientists and engineers to utilize graphics processing units (GPUs) in a more general purpose fashion has allowed for previously unobtainable levels of computation by individual workstations. As data sets grow, the methods to work with them grow at an equal, and often greater, pace. Certain common computations can take advantage of the massively parallel and optimized hardware constructs of the GPU, which yield significant increases in performance. These common computations have high degrees of data parallelism; that is, they are the same computation applied to a large set of data where the result does not depend on other data elements. Image processing is one area where GPUs are being used to greatly increase the performance of certain analysis and visualization techniques.
Computing a Comprehensible Model for Spam Filtering

NASA Astrophysics Data System (ADS)

Ruiz-Sepúlveda, Amparo; Triviño-Rodriguez, José L.; Morales-Bueno, Rafael

In this paper, we describe the application of the Decision Tree Boosting (DTB) learning model to spam email filtering. This classification task implies learning in a high dimensional feature space, so it is an example of how the DTB algorithm performs in such feature space problems. In [1], it has been shown that hypotheses computed by the DTB model are more comprehensible than those computed by other ensemble methods. Hence, this paper tries to show that the DTB algorithm maintains the same comprehensibility of hypothesis in high dimensional feature space problems while achieving the performance of other ensemble methods. Four traditional evaluation measures (precision, recall, F1 and accuracy) have been considered for performance comparison between DTB and other models usually applied to spam email filtering. The size of the hypothesis computed by DTB is smaller and more comprehensible than the hypothesis computed by Adaboost and Naïve Bayes.
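For readers unfamiliar with the four evaluation measures named in this abstract, the sketch below computes them from binary labels using their standard definitions. The function name `spam_metrics` and the example labels are hypothetical and are not data from the paper; 1 is assumed to mean spam and 0 legitimate mail.

```python
def spam_metrics(y_true, y_pred):
    """Compute precision, recall, F1 and accuracy for binary labels (1 = spam)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # spam flagged as spam
    fp = sum(1 for t, p in pairs if not t and p)      # legitimate mail flagged as spam
    fn = sum(1 for t, p in pairs if t and not p)      # spam that slipped through
    tn = sum(1 for t, p in pairs if not t and not p)  # legitimate mail passed

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Made-up labels for illustration only.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 1, 1]
scores = spam_metrics(truth, predicted)
```

In spam filtering, precision is often weighted heavily because a false positive means a legitimate message is lost, whereas a false negative only lets one unwanted message through.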
Experimental and analytical comparison of flowfields in a 110 N (25 lbf) H2/O2 rocket

NASA Technical Reports Server (NTRS)

Reed, Brian D.; Penko, Paul F.; Schneider, Steven J.; Kim, Suk C.

1991-01-01

A gaseous hydrogen/gaseous oxygen 110 N (25 lbf) rocket was examined through the RPLUS code using the full Navier-Stokes equations with finite rate chemistry. Performance tests were conducted on the rocket in an altitude test facility. Preliminary parametric analyses were performed for a range of mixture ratios and fuel film cooling percentages. It is shown that the computed values of specific impulse and characteristic exhaust velocity follow the trend of the experimental data. Specific impulse computed by the code is lower than the comparable test values by about two to three percent. The computed characteristic exhaust velocity values are lower than the comparable test values by three to four percent. Thrust coefficients computed by the code are found to be within two percent of the measured values. It is concluded that the discrepancy between computed and experimental performance values could not be attributed to experimental uncertainty.
A Decentralized Eigenvalue Computation Method for Spectrum Sensing Based on Average Consensus

NASA Astrophysics Data System (ADS)

Mohammadi, Jafar; Limmer, Steffen; Stańczak, Sławomir

2016-07-01

This paper considers eigenvalue estimation for the decentralized inference problem for spectrum sensing. We propose a decentralized eigenvalue computation algorithm based on the power method, referred to as the generalized power method (GPM); it is capable of estimating the eigenvalues of a given covariance matrix under certain conditions. Furthermore, we have developed a decentralized implementation of GPM by splitting the iterative operations into local and global computation tasks. The global tasks require data exchange among the nodes. For these tasks, we apply an average consensus algorithm to efficiently perform the global computations. As a special case, we consider a structured graph that is a tree with clusters of nodes at its leaves. For an accelerated distributed implementation, we propose to use computation over multiple access channel (CoMAC) as a building block of the algorithm. Numerical simulations are provided to illustrate the performance of the two algorithms.
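The two building blocks named in this abstract can be illustrated with a minimal sketch. It shows the classical power method and a basic average consensus iteration, not the authors' GPM or their CoMAC scheme. The function names, the step size, and the 4-node path graph are assumptions chosen only for illustration. In a decentralized setting, a global quantity such as a sum over nodes could be recovered as the number of nodes times the consensus average.

```python
import math

def power_method(matrix, iterations=200):
    """Classical power iteration: estimates the dominant eigenvalue of a symmetric matrix."""
    n = len(matrix)
    v = [1.0] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        # Multiply, then normalize to keep the vector bounded.
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
        # Rayleigh quotient of the normalized vector.
        mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(v[i] * mv[i] for i in range(n))
    return eigenvalue, v

def average_consensus(values, neighbors, step=0.3, iterations=200):
    """Each node repeatedly moves toward its neighbors; all nodes approach the global mean.

    Assumes an undirected, connected graph and step < 1 / (maximum node degree).
    """
    x = list(values)
    for _ in range(iterations):
        x = [x[i] + step * sum(x[j] - x[i] for j in neighbors[i]) for i in range(len(x))]
    return x

# Hypothetical example data: a 2x2 covariance-like matrix and a 4-node path graph.
dominant, _ = power_method([[2.0, 1.0], [1.0, 2.0]])
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
consensus = average_consensus([1.0, 2.0, 3.0, 6.0], path_graph)
```

Here the matrix has eigenvalues 1 and 3, so `dominant` approaches 3, and every entry of `consensus` approaches the mean of the initial values, 3.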
Performance comparison of heuristic algorithms for task scheduling in IaaS cloud computing environment.

PubMed

Madni, Syed Hamid Hussain; Abd Latiff, Muhammad Shafie; Abdullahi, Mohammed; Abdulhamid, Shafi'i Muhammad; Usman, Mohammed Joda

2017-01-01

Cloud computing infrastructure is suitable for meeting computational needs of large task sizes. Optimal scheduling of tasks in cloud computing environment has been proved to be an NP-complete problem, hence the need for the application of heuristic methods. Several heuristic algorithms have been developed and used in addressing this problem, but choosing the appropriate algorithm for solving task assignment problem of a particular nature is difficult since the methods are developed under different assumptions. Therefore, six rule based heuristic algorithms are implemented and used to schedule autonomous tasks in homogeneous and heterogeneous environments with the aim of comparing their performance in terms of cost, degree of imbalance, makespan and throughput. First Come First Serve (FCFS), Minimum Completion Time (MCT), Minimum Execution Time (MET), Max-min, Min-min and Sufferage are the heuristic algorithms considered for the performance comparison and analysis of task scheduling in cloud computing.
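As an illustration of one of the six heuristics listed, the following is a minimal sketch of Min-min scheduling for independent tasks. It is a textbook formulation, not the authors' implementation. The `etc` matrix (expected time to compute, indexed by task and then machine) and its values are hypothetical.

```python
def min_min(etc):
    """Min-min heuristic: repeatedly schedule the task with the smallest
    achievable completion time on the machine that achieves it.

    etc[t][m] is the assumed execution time of task t on machine m.
    Returns (task -> machine mapping, makespan).
    """
    n_machines = len(etc[0])
    ready = [0.0] * n_machines          # time at which each machine becomes free
    unscheduled = set(range(len(etc)))
    schedule = {}
    while unscheduled:
        best = None
        for t in sorted(unscheduled):
            for m in range(n_machines):
                completion = ready[m] + etc[t][m]
                if best is None or completion < best[0]:
                    best = (completion, t, m)
        completion, t, m = best
        schedule[t] = m
        ready[m] = completion
        unscheduled.remove(t)
    return schedule, max(ready)

# Hypothetical 3-task, 2-machine example.
schedule, makespan = min_min([[3, 5], [2, 4], [6, 1]])
```

Max-min differs only in the selection step: among the per-task minimum completion times, it picks the task whose minimum is largest.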
Performance comparison of heuristic algorithms for task scheduling in IaaS cloud computing environment

PubMed Central

Madni, Syed Hamid Hussain; Abd Latiff, Muhammad Shafie; Abdullahi, Mohammed; Usman, Mohammed Joda

2017-01-01

Cloud computing infrastructure is suitable for meeting computational needs of large task sizes. Optimal scheduling of tasks in cloud computing environment has been proved to be an NP-complete problem, hence the need for the application of heuristic methods. Several heuristic algorithms have been developed and used in addressing this problem, but choosing the appropriate algorithm for solving task assignment problem of a particular nature is difficult since the methods are developed under different assumptions. Therefore, six rule based heuristic algorithms are implemented and used to schedule autonomous tasks in homogeneous and heterogeneous environments with the aim of comparing their performance in terms of cost, degree of imbalance, makespan and throughput. First Come First Serve (FCFS), Minimum Completion Time (MCT), Minimum Execution Time (MET), Max-min, Min-min and Sufferage are the heuristic algorithms considered for the performance comparison and analysis of task scheduling in cloud computing. PMID:28467505
DMG-α--a computational geometry library for multimolecular systems.

PubMed

Szczelina, Robert; Murzyn, Krzysztof

2014-11-24

The DMG-α library grants researchers in the fields of computational biology, chemistry, and biophysics access to open-source, easy-to-use, and intuitive software for performing fine-grained geometric analysis of molecular systems. The library is capable of computing power diagrams (weighted Voronoi diagrams) in three dimensions with 3D periodic boundary conditions, computing approximate projective 2D Voronoi diagrams on arbitrarily defined surfaces, performing shape property recognition using α-shape theory, and performing exact Solvent Accessible Surface Area (SASA) computation. The software is written mainly as a template-based C++ library for greater performance, but a rich Python interface (pydmga) is provided as a convenient way to manipulate the DMG-α routines. To illustrate possible applications of the DMG-α library, we present results of sample analyses which allowed us to determine nontrivial geometric properties of two Escherichia coli-specific lipids as emerging from molecular dynamics simulations of relevant model bilayers.
Advanced Architectures for Astrophysical Supercomputing

NASA Astrophysics Data System (ADS)

Barsdell, B. R.; Barnes, D. G.; Fluke, C. J.

2010-12-01

Astronomers have come to rely on the increasing performance of computers to reduce, analyze, simulate and visualize their data. In this environment, faster computation can mean more science outcomes or the opening up of new parameter spaces for investigation. If we are to avoid major issues when implementing codes on advanced architectures, it is important that we have a solid understanding of our algorithms. A recent addition to the high-performance computing scene that highlights this point is the graphics processing unit (GPU). The hardware originally designed for speeding-up graphics rendering in video games is now achieving speed-ups of O(100×) in general-purpose computation - performance that cannot be ignored. We are using a generalized approach, based on the analysis of astronomy algorithms, to identify the optimal problem-types and techniques for taking advantage of both current GPU hardware and future developments in computing architectures.
Does familiarity with computers affect computerized neuropsychological test performance?

PubMed

Iverson, Grant L; Brooks, Brian L; Ashton, V Lynn; Johnson, Lynda G; Gualtieri, C Thomas

2009-07-01

The purpose of this study was to determine whether self-reported computer familiarity is related to performance on computerized neurocognitive testing. Participants were 130 healthy adults who self-reported whether their computer use was "some" (n = 65) or "frequent" (n = 65). The two groups were individually matched on age, education, sex, and race. All completed the CNS Vital Signs (Gualtieri & Johnson, 2006b) computerized neurocognitive battery. There were significant differences on 6 of the 23 scores, including scores derived from the Symbol-Digit Coding Test, Stroop Test, and the Shifting Attention Test. The two groups were also significantly different on the Psychomotor Speed (Cohen's d = 0.37), Reaction Time (d = 0.68), Complex Attention (d = 0.40), and Cognitive Flexibility (d = 0.64) domain scores. People with "frequent" computer use performed better than people with "some" computer use on some tests requiring rapid visual scanning and keyboard work.
GPU-based High-Performance Computing for Radiation Therapy

PubMed Central

Jia, Xun; Ziegenhein, Peter; Jiang, Steve B.

2014-01-01

Recent developments in radiotherapy demand high computational power to solve challenging problems in a timely fashion in a clinical environment. The graphics processing unit (GPU), as an emerging high-performance computing platform, has been introduced to radiotherapy. It is particularly attractive due to its high computational power, small size, and low cost for facility deployment and maintenance. Over the past few years, GPU-based high-performance computing in radiotherapy has experienced rapid developments. A tremendous number of studies have been conducted, in which large acceleration factors compared with the conventional CPU platform have been observed. In this article, we will first give a brief introduction to the GPU hardware structure and programming model. We will then review the current applications of GPU in major imaging-related and therapy-related problems encountered in radiotherapy. A comparison of GPU with other platforms will also be presented. PMID:24486639
OpenACC performance for simulating 2D radial dambreak using FVM HLLE flux

NASA Astrophysics Data System (ADS)

Gunawan, P. H.; Pahlevi, M. R.

2018-03-01

The aim of this paper is to investigate the performance of the OpenACC platform for computing a 2D radial dambreak. Here, the shallow water equations are used to describe and simulate the 2D radial dambreak with the finite volume method (FVM) using the HLLE flux. OpenACC is a parallel computing platform based on GPU cores. In this research, the platform is used to minimize the computational time of the numerical scheme. The results show that, using OpenACC, the computational time is reduced. For the dry and wet radial dambreak simulations using 2048 grids, the parallel computational times are 575.984 s and 584.830 s, respectively. These results show the success of OpenACC when compared with the serial times of the dry and wet radial dambreak simulations, which are 28047.500 s and 29269.40 s, respectively.
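The reported timings imply speedups of roughly 49x (dry) and 50x (wet). The short sketch below does the arithmetic using only the four times quoted in the abstract. The helper name `speedup` is an assumption for illustration.

```python
def speedup(serial_seconds, parallel_seconds):
    """Speedup is the serial run time divided by the parallel run time."""
    return serial_seconds / parallel_seconds

# Times as quoted in the abstract (seconds).
dry = speedup(28047.500, 575.984)
wet = speedup(29269.40, 584.830)
print(f"dry: {dry:.1f}x, wet: {wet:.1f}x")
```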
Visual Form Perception Can Be a Cognitive Correlate of Lower Level Math Categories for Teenagers.

PubMed

Cui, Jiaxin; Zhang, Yiyun; Cheng, Dazhi; Li, Dawei; Zhou, Xinlin

2017-01-01

Numerous studies have assessed the cognitive correlates of performance in mathematics, but little research has been conducted to systematically examine the relations between visual perception as the starting point of visuospatial processing and typical mathematical performance. In the current study, we recruited 223 seventh graders to perform a visual form perception task (figure matching), numerosity comparison, digit comparison, exact computation, approximate computation, and curriculum-based mathematical achievement tests. Results showed that, after controlling for gender, age, and five general cognitive processes (choice reaction time, visual tracing, mental rotation, spatial working memory, and non-verbal matrices reasoning), visual form perception had unique contributions to numerosity comparison, digit comparison, and exact computation, but had no significant relation with approximate computation or curriculum-based mathematical achievement. These results suggest that visual form perception is an important independent cognitive correlate of lower level math categories, including the approximate number system, digit comparison, and exact computation.
National research and education network

NASA Technical Reports Server (NTRS)

Villasenor, Tony

1991-01-01

Some goals of this network are as follows: Extend U.S. technological leadership in high performance computing and computer communications; Provide wide dissemination and application of the technologies both to the speed and the pace of innovation and to serve the national economy, national security, education, and the global environment; and Spur gains in the U.S. productivity and industrial competitiveness by making high performance computing and networking technologies an integral part of the design and production process. Strategies for achieving these goals are as follows: Support solutions to important scientific and technical challenges through a vigorous R and D effort; Reduce the uncertainties to industry for R and D and use of this technology through increased cooperation between government, industry, and universities and by the continued use of government and government funded facilities as a prototype user for early commercial HPCC products; and Support underlying research, network, and computational infrastructures on which U.S. high performance computing technology is based.
ARTS III Computer Systems Performance Measurement Prototype Implementation

DOT National Transportation Integrated Search

1974-04-01

Direct measurement of computer systems is of vital importance in: a) developing an intelligent grasp of the variables which affect overall performance; b) tuning the system for optimum benefit; c) determining under what conditions saturation threshold...
EVALUATION OF VENTILATION PERFORMANCE FOR INDOOR SPACE

EPA Science Inventory

The paper discusses a personal-computer-based application of computational fluid dynamics that can be used to determine the turbulent flow field and time-dependent/steady-state contaminant concentration distributions within isothermal indoor space. (NOTE: Ventilation performance ...
Computer System Performance Measurement Techniques for ARTS III Computer Systems

DOT National Transportation Integrated Search

1973-12-01

The potential contribution of direct system measurement in the evolving ARTS 3 Program is discussed and software performance measurement techniques are comparatively assessed in terms of credibility of results, ease of implementation, volume of data,...
Evaluation of Emerging Energy-Efficient Heterogeneous Computing Platforms for Biomolecular and Cellular Simulation Workloads.

PubMed

Stone, John E; Hallock, Michael J; Phillips, James C; Peterson, Joseph R; Luthey-Schulten, Zaida; Schulten, Klaus

2016-05-01

Many of the continuing scientific advances achieved through computational biology are predicated on the availability of ongoing increases in computational power required for detailed simulation and analysis of cellular processes on biologically-relevant timescales. A critical challenge facing the development of future exascale supercomputer systems is the development of new computing hardware and associated scientific applications that dramatically improve upon the energy efficiency of existing solutions, while providing increased simulation, analysis, and visualization performance. Mobile computing platforms have recently become powerful enough to support interactive molecular visualization tasks that were previously only possible on laptops and workstations, creating future opportunities for their convenient use for meetings, remote collaboration, and as head mounted displays for immersive stereoscopic viewing. We describe early experiences adapting several biomolecular simulation and analysis applications for emerging heterogeneous computing platforms that combine power-efficient system-on-chip multi-core CPUs with high-performance massively parallel GPUs. We present low-cost power monitoring instrumentation that provides sufficient temporal resolution to evaluate the power consumption of individual CPU algorithms and GPU kernels. We compare the performance and energy efficiency of scientific applications running on emerging platforms with results obtained on traditional platforms, identify hardware and algorithmic performance bottlenecks that affect the usability of these platforms, and describe avenues for improving both the hardware and applications in pursuit of the needs of molecular modeling tasks on mobile devices and future exascale computers.
Computational Issues in Damping Identification for Large Scale Problems

NASA Technical Reports Server (NTRS)

Pilkey, Deborah L.; Roe, Kevin P.; Inman, Daniel J.

1997-01-01

Two damping identification methods are tested for efficiency in large-scale applications. One is an iterative routine, and the other a least squares method. Numerical simulations have been performed on multiple degree-of-freedom models to test the effectiveness of the algorithm and the usefulness of parallel computation for the problems. High Performance Fortran is used to parallelize the algorithm. Tests were performed using the IBM-SP2 at NASA Ames Research Center. The least squares method tested incurs high communication costs, which reduces the benefit of high performance computing. This method's memory requirement grows at a very rapid rate, meaning that larger problems can quickly exceed available computer memory. The iterative method's memory requirement grows at a much slower pace and is able to handle problems with 500+ degrees of freedom on a single processor. This method benefits from parallelization, and significant speedup can be seen for problems of 100+ degrees of freedom.

Performing an allreduce operation on a plurality of compute nodes of a parallel computer

DOEpatents

Faraj, Ahmad

2013-07-09

Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer, each node including at least two processing cores, that include: establishing, for each node, a plurality of logical rings, each ring including a different set of at least one core on that node, each ring including the cores on at least two of the nodes; iteratively for each node: assigning each core of that node to one of the rings established for that node to which the core has not previously been assigned, and performing, for each ring for that node, a global allreduce operation using contribution data for the cores assigned to that ring or any global allreduce results from previous global allreduce operations, yielding current global allreduce results for each core; and performing, for each node, a local allreduce operation using the global allreduce results.
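For readers unfamiliar with the operation itself, the sketch below simulates a generic single-ring allreduce (a reduce-scatter phase followed by an allgather phase) with summation. It is the common textbook algorithm, not the multi-ring, per-node scheme claimed in the patent. The function name and sample data are hypothetical, and each participant is assumed to hold exactly one value per chunk.

```python
def ring_allreduce(data):
    """Simulate a sum-allreduce over a logical ring.

    data[r] is the vector held by rank r, with exactly len(data) entries
    (one entry per chunk). Returns the buffers of all ranks after the operation.
    """
    p = len(data)
    buffers = [list(row) for row in data]

    # Reduce-scatter: after p - 1 steps, rank r holds the full sum of chunk (r + 1) % p.
    for step in range(p - 1):
        # Collect all messages first so that sends within a step are simultaneous.
        sends = [(r, (r - step) % p, buffers[r][(r - step) % p]) for r in range(p)]
        for r, chunk, value in sends:
            buffers[(r + 1) % p][chunk] += value

    # Allgather: circulate the completed chunks around the ring.
    for step in range(p - 1):
        sends = [(r, (r + 1 - step) % p, buffers[r][(r + 1 - step) % p]) for r in range(p)]
        for r, chunk, value in sends:
            buffers[(r + 1) % p][chunk] = value

    return buffers

# Hypothetical contribution data for three ranks.
result = ring_allreduce([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
```

After the call, every rank holds the element-wise sum of all contributions, which is the defining property of an allreduce.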
Mobile clusters of single board computers: an option for providing resources to student projects and researchers.

PubMed

Baun, Christian

2016-01-01

Clusters usually consist of servers, workstations or personal computers as nodes. But especially for academic purposes like student projects or scientific projects, the cost of purchase and operation can be a challenge. Single board computers cannot compete with the performance or energy-efficiency of higher-value systems, but they are an option for building inexpensive cluster systems. Because of the compact design and modest energy consumption, it is possible to build clusters of single board computers in a way that they are mobile and can be easily transported by the users. This paper describes the construction of such a cluster, useful applications and the performance of the single nodes. Furthermore, the cluster's performance and energy-efficiency are analyzed by executing the High Performance Linpack benchmark with different numbers of nodes and different proportions of the system's total main memory utilized.
Performance and economics of residential solar space heating

NASA Astrophysics Data System (ADS)

Zehr, F. J.; Vineyard, T. A.; Barnes, R. W.; Oneal, D. L.

1982-11-01

The performance and economics of residential solar space heating were studied for various locations in the contiguous United States. Common types of active and passive solar heating systems were analyzed with respect to an average-size, single-family house designed to meet or exceed the thermal requirements of the Department of Housing and Urban Development Minimum Property Standards (HUD-MPS). The solar systems were evaluated in seventeen cities to provide a broad range of climatic conditions. Active systems evaluated consist of air and liquid flat plate collectors with single- and double-glazing; passive systems include Trombe wall, water wall, direct gain, and sunspace systems. The active system solar heating performance was computed using the University of Wisconsin's F-CHART computer program. The Los Alamos Scientific Laboratory's Solar Load Ratio (SLR) method was employed to compute solar heating performance for the passive systems. Heating costs were computed with gas, oil, and electricity as backups and as conventional heating system fuels.
Benefits of computer screen-based simulation in learning cardiac arrest procedures.

PubMed

Bonnetain, Elodie; Boucheix, Jean-Michel; Hamet, Maël; Freysz, Marc

2010-07-01

What is the best way to train medical students early so that they acquire basic skills in cardiopulmonary resuscitation as effectively as possible? Studies have shown the benefits of high-fidelity patient simulators, but have also demonstrated their limits. New computer screen-based multimedia simulators have fewer constraints than high-fidelity patient simulators. In this area, as yet, there has been no research on the effectiveness of transfer of learning from a computer screen-based simulator to more realistic situations such as those encountered with high-fidelity patient simulators. We tested the benefits of learning cardiac arrest procedures using a multimedia computer screen-based simulator in 28 Year 2 medical students. Just before the end of the traditional resuscitation course, we compared two groups. An experiment group (EG) was first asked to learn to perform the appropriate procedures in a cardiac arrest scenario (CA1) in the computer screen-based learning environment and was then tested on a high-fidelity patient simulator in another cardiac arrest simulation (CA2). While the EG was learning to perform CA1 procedures in the computer screen-based learning environment, a control group (CG) actively continued to learn cardiac arrest procedures using practical exercises in a traditional class environment. Both groups were given the same amount of practice, exercises and trials. The CG was then also tested on the high-fidelity patient simulator for CA2, after which it was asked to perform CA1 using the computer screen-based simulator. Performances with both simulators were scored on a precise 23-point scale. On the test on a high-fidelity patient simulator, the EG trained with a multimedia computer screen-based simulator performed significantly better than the CG trained with traditional exercises and practice (16.21 versus 11.13 of 23 possible points, respectively; p<0.001). 
Computer screen-based simulation appears to be effective in preparing learners to use high-fidelity patient simulators, which present simulations that are closer to real-life situations.
Application of Adjoint Method and Spectral-Element Method to Tomographic Inversion of Regional Seismological Structure Beneath Japanese Islands

NASA Astrophysics Data System (ADS)

Tsuboi, S.; Miyoshi, T.; Obayashi, M.; Tono, Y.; Ando, K.

2014-12-01

Recent progress in large-scale computing using waveform modeling techniques and high-performance computing facilities has demonstrated the possibility of performing full-waveform inversion of the three-dimensional (3D) seismological structure inside the Earth. We apply the adjoint method (Liu and Tromp, 2006) to obtain the 3D structure beneath the Japanese Islands. First, we implemented the Spectral-Element Method on the K-computer in Kobe, Japan. We have optimized SPECFEM3D_GLOBE (Komatitsch and Tromp, 2002) by using OpenMP so that the code fits the hybrid architecture of the K-computer. We can now use 82,134 nodes of the K-computer (657,072 cores) to compute synthetic waveforms with about 1 second accuracy for a realistic 3D Earth model, and its performance was 1.2 PFLOPS. We use this optimized SPECFEM3D_GLOBE code, take one chunk around the Japanese Islands from the global mesh, and compute synthetic seismograms with an accuracy of about 10 seconds. We use the GAP-P2 mantle tomography model (Obayashi et al., 2009) as an initial 3D model and use as many of the broadband seismic stations available in this region as possible to perform the inversion. We then use the time windows for body waves and surface waves to compute adjoint sources and calculate adjoint kernels for seismic structure. We have performed several iterations and obtained an improved 3D structure beneath the Japanese Islands. The result demonstrates that waveform misfits between observed and theoretical seismograms improve as the iteration proceeds. We are now preparing to use much shorter periods in our synthetic waveform computation and to try to obtain seismic structure for basin-scale models, such as the Kanto basin, where there are dense seismic networks and high seismic activity. Acknowledgements: This research was partly supported by MEXT Strategic Program for Innovative Research. We used F-net seismograms of the National Research Institute for Earth Science and Disaster Prevention.
An Application-Based Performance Evaluation of NASA's Nebula Cloud Computing Platform

NASA Technical Reports Server (NTRS)

Saini, Subhash; Heistand, Steve; Jin, Haoqiang; Chang, Johnny; Hood, Robert T.; Mehrotra, Piyush; Biswas, Rupak

2012-01-01

The high performance computing (HPC) community has shown tremendous interest in exploring cloud computing as it promises high potential. In this paper, we examine the feasibility, performance, and scalability of production quality scientific and engineering applications of interest to NASA on NASA's cloud computing platform, called Nebula, hosted at Ames Research Center. This work represents a comprehensive evaluation of Nebula using NUTTCP, HPCC, NPB, I/O, and MPI function benchmarks as well as four applications representative of the NASA HPC workload. Specifically, we compare Nebula performance on some of these benchmarks and applications to that of NASA's Pleiades supercomputer, a traditional HPC system. We also investigate the impact of virtIO and jumbo frames on interconnect performance. Overall results indicate that on Nebula (i) virtIO and jumbo frames improve network bandwidth by a factor of 5x, (ii) there is a significant virtualization layer overhead of about 10% to 25%, (iii) write performance is lower by a factor of 25x, (iv) latency for short MPI messages is very high, and (v) overall performance is 15% to 48% lower than that on Pleiades for NASA HPC applications. We also comment on the usability of the cloud platform.
Compute Server Performance Results

NASA Technical Reports Server (NTRS)

Stockdale, I. E.; Barton, John; Woodrow, Thomas (Technical Monitor)

1994-01-01

Parallel-vector supercomputers have been the workhorses of high performance computing. As expectations of future computing needs have risen faster than projected vector supercomputer performance, much work has been done investigating the feasibility of using Massively Parallel Processor systems as supercomputers. An even more recent development is the availability of high performance workstations which have the potential, when clustered together, to replace parallel-vector systems. We present a systematic comparison of floating point performance and price-performance for various compute server systems. A suite of highly vectorized programs was run on systems including traditional vector systems such as the Cray C90, and RISC workstations such as the IBM RS/6000 590 and the SGI R8000. The C90 system delivers 460 million floating point operations per second (FLOPS), the highest single processor rate of any vendor. However, if the price-performance ratio (PPR) is considered to be most important, then the IBM and SGI processors are superior to the C90 processors. Even without code tuning, the IBM and SGI PPRs of 260 and 220 FLOPS per dollar exceed the C90 PPR of 160 FLOPS per dollar when running our highly vectorized suite.
Innovative architectures for dense multi-microprocessor computers

NASA Technical Reports Server (NTRS)

Larson, Robert E.

1989-01-01

The purpose is to summarize a Phase 1 SBIR project performed for the NASA/Langley Computational Structural Mechanics Group. The project was performed from February to August 1987. The main objectives of the project were to: (1) expand upon previous research into the application of chordal ring architectures to the general problem of designing multi-microcomputer architectures, (2) attempt to identify a family of chordal rings such that each chordal ring can be simply expanded to produce the next member of the family, (3) perform a preliminary, high-level design of an expandable multi-microprocessor computer based upon chordal rings, (4) analyze the potential use of chordal ring based multi-microprocessors for sparse matrix problems and other applications arising in computational structural mechanics.
Extreme Scale Computing to Secure the Nation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, D L; McGraw, J R; Johnson, J R

2009-11-10

Since the dawn of modern electronic computing in the mid 1940's, U.S. national security programs have been dominant users of every new generation of high-performance computer. Indeed, the first general-purpose electronic computer, ENIAC (the Electronic Numerical Integrator and Computer), was used to calculate the expected explosive yield of early thermonuclear weapons designs. Even the U.S. numerical weather prediction program, another early application for high-performance computing, was initially funded jointly by sponsors that included the U.S. Air Force and Navy, agencies interested in accurate weather predictions to support U.S. military operations. For the decades of the cold war, national security requirements continued to drive the development of high performance computing (HPC), including advancement of the computing hardware and development of sophisticated simulation codes to support weapons and military aircraft design, numerical weather prediction as well as data-intensive applications such as cryptography and cybersecurity. U.S. national security concerns continue to drive the development of high-performance computers and software in the U.S. and in fact, events following the end of the cold war have driven an increase in the growth rate of computer performance at the high-end of the market. This mainly derives from our nation's observance of a moratorium on underground nuclear testing beginning in 1992, followed by our voluntary adherence to the Comprehensive Test Ban Treaty (CTBT) beginning in 1995. The CTBT prohibits further underground nuclear tests, which in the past had been a key component of the nation's science-based program for assuring the reliability, performance and safety of U.S. nuclear weapons. In response to this change, the U.S. 
Department of Energy (DOE) initiated the Science-Based Stockpile Stewardship (SBSS) program in response to the Fiscal Year 1994 National Defense Authorization Act, which requires, 'in the absence of nuclear testing, a program to: (1) Support a focused, multifaceted program to increase the understanding of the enduring stockpile; (2) Predict, detect, and evaluate potential problems of the aging of the stockpile; (3) Refurbish and re-manufacture weapons and components, as required; and (4) Maintain the science and engineering institutions needed to support the nation's nuclear deterrent, now and in the future'. This program continues to fulfill its national security mission by adding significant new capabilities for producing scientific results through large-scale computational simulation coupled with careful experimentation, including sub-critical nuclear experiments permitted under the CTBT. To develop the computational science and the computational horsepower needed to support its mission, SBSS initiated the Accelerated Strategic Computing Initiative, later renamed the Advanced Simulation & Computing (ASC) program (sidebar: 'History of ASC Computing Program Computing Capability'). The modern 3D computational simulation capability of the ASC program supports the assessment and certification of the current nuclear stockpile through calibration with past underground test (UGT) data. While an impressive accomplishment, continued evolution of national security mission requirements will demand computing resources at a significantly greater scale than we have today. In particular, continued observance and potential Senate confirmation of the Comprehensive Test Ban Treaty (CTBT) together with the U.S. administration's promise for a significant reduction in the size of the stockpile and the inexorable aging and consequent refurbishment of the stockpile all demand increasing refinement of our computational simulation capabilities. 
Assessment of the present and future stockpile with increased confidence of the safety and reliability without reliance upon calibration with past or future test data is a long-term goal of the ASC program. This will be accomplished through significant increases in the scientific bases that underlie the computational tools. Computer codes must be developed that replace phenomenology with increased levels of scientific understanding together with an accompanying quantification of uncertainty. These advanced codes will place significantly higher demands on the computing infrastructure than do the current 3D ASC codes. This article discusses not only the need for a future computing capability at the exascale for the SBSS program, but also considers high performance computing requirements for broader national security questions. For example, the increasing concern over potential nuclear terrorist threats demands a capability to assess threats and potential disablement technologies as well as a rapid forensic capability for determining a nuclear weapons design from post-detonation evidence (nuclear counterterrorism).
Studying an Eulerian Computer Model on Different High-performance Computer Platforms and Some Applications

NASA Astrophysics Data System (ADS)

Georgiev, K.; Zlatev, Z.

2010-11-01

The Danish Eulerian Model (DEM) is an Eulerian model for studying the transport of air pollutants on a large scale. The model was originally developed at the National Environmental Research Institute of Denmark. Its computational domain covers Europe and some neighbouring parts of the Atlantic Ocean, Asia, and Africa. If DEM is applied on fine grids, its discretization leads to a huge computational problem, which implies that such a model must be run on high-performance computer architectures. Implementing and tuning such a complex large-scale model on each different computer is a non-trivial task. Here, comparison results are presented from running the model on different kinds of vector computers (CRAY C92A, Fujitsu, etc.), parallel computers with distributed memory (IBM SP, CRAY T3E, Beowulf clusters, Macintosh G4 clusters, etc.), parallel computers with shared memory (SGI Origin, SUN, etc.), and parallel computers with two levels of parallelism (IBM SMP, IBM BlueGene/P, clusters of multiprocessor nodes, etc.). The main idea in the parallel version of DEM is the domain partitioning approach. The effective use of the cache and hierarchical memories of modern computers is discussed, as well as the performance, speed-ups, and efficiency achieved. The parallel code of DEM, created using the MPI standard library, appears to be highly portable and shows good efficiency and scalability on different kinds of vector and parallel computers. Some important applications of the computer model output are presented in brief.
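The domain partitioning idea mentioned in this abstract can be illustrated with a small sketch. It is not the DEM code: the function name `partition_rows` and the choice of a one-dimensional split into contiguous row blocks are assumptions made purely for illustration, and no MPI calls are used.

```python
def partition_rows(n_rows, n_ranks):
    """Split n_rows grid rows into n_ranks contiguous, near-equal blocks.

    Returns a list of (start, stop) half-open row ranges, one per rank.
    Hypothetical helper: a real model would also exchange boundary rows
    between neighbouring blocks, which is omitted here.
    """
    base, extra = divmod(n_rows, n_ranks)
    blocks = []
    start = 0
    for rank in range(n_ranks):
        # The first `extra` ranks take one additional row each.
        size = base + (1 if rank < extra else 0)
        blocks.append((start, start + size))
        start += size
    return blocks


if __name__ == "__main__":
    for rank, (lo, hi) in enumerate(partition_rows(10, 3)):
        print(f"rank {rank}: rows [{lo}, {hi})")
```

Each process would then work only on its own block, which is what keeps the per-process working set small enough to benefit from cache and hierarchical memory.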
Routing performance analysis and optimization within a massively parallel computer

DOEpatents

Archer, Charles Jens; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen

2013-04-16

An apparatus, program product and method optimize the operation of a massively parallel computer system by, in part, receiving actual performance data concerning an application executed by the plurality of interconnected nodes, and analyzing the actual performance data to identify an actual performance pattern. A desired performance pattern may be determined for the application, and an algorithm may be selected from among a plurality of algorithms stored within a memory, the algorithm being configured to achieve the desired performance pattern based on the actual performance data.
Multidisciplinary Design Optimization of a Full Vehicle with High Performance Computing

NASA Technical Reports Server (NTRS)

Yang, R. J.; Gu, L.; Tho, C. H.; Sobieszczanski-Sobieski, Jaroslaw

2001-01-01

Multidisciplinary design optimization (MDO) of a full vehicle under the constraints of crashworthiness, NVH (Noise, Vibration and Harshness), durability, and other performance attributes is one of the imperative goals for the automotive industry. However, it is often infeasible due to the lack of computational resources, robust simulation capabilities, and efficient optimization methodologies. This paper intends to move closer towards that goal by using parallel computers for the intensive computation and combining different approximations for dissimilar analyses in the MDO process. The MDO process presented in this paper is an extension of the previous work reported by Sobieski et al. In addition to the roof crush, two full vehicle crash modes are added: full frontal impact and 50% frontal offset crash. Instead of using an adaptive polynomial response surface method, this paper employs a DOE/RSM method for exploring the design space and constructing highly nonlinear crash functions. Two MDO strategies are used and results are compared. This paper demonstrates that with high performance computing, a conventionally intractable real world full vehicle multidisciplinary optimization problem considering all performance attributes with a large number of design variables becomes feasible.
Empirical Performance Model-Driven Data Layout Optimization and Library Call Selection for Tensor Contraction Expressions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lu, Qingda; Gao, Xiaoyang; Krishnamoorthy, Sriram

Empirical optimizers like ATLAS have been very effective in optimizing computational kernels in libraries. The best choice of parameters such as tile size and degree of loop unrolling is determined by executing different versions of the computation. In contrast, optimizing compilers use a model-driven approach to program transformation. While the model-driven approach of optimizing compilers is generally orders of magnitude faster than ATLAS-like library generators, its effectiveness can be limited by the accuracy of the performance models used. In this paper, we describe an approach where a class of computations is modeled in terms of constituent operations that are empirically measured, thereby allowing modeling of the overall execution time. The performance model with empirically determined cost components is used to perform data layout optimization together with the selection of library calls and layout transformations in the context of the Tensor Contraction Engine, a compiler for a high-level domain-specific language for expressing computational models in quantum chemistry. The effectiveness of the approach is demonstrated through experimental measurements on representative computations from quantum chemistry.
Turbulence modeling of free shear layers for high-performance aircraft

NASA Technical Reports Server (NTRS)

Sondak, Douglas L.

1993-01-01

The High Performance Aircraft (HPA) Grand Challenge of the High Performance Computing and Communications (HPCC) program involves the computation of the flow over a high performance aircraft. A variety of free shear layers, including mixing layers over cavities, impinging jets, blown flaps, and exhaust plumes, may be encountered in such flowfields. Since these free shear layers are usually turbulent, appropriate turbulence models must be utilized in computations in order to accurately simulate these flow features. The HPCC program is relying heavily on parallel computers. A Navier-Stokes solver (POVERFLOW) utilizing the Baldwin-Lomax algebraic turbulence model was developed and tested on a 128-node Intel iPSC/860. Algebraic turbulence models run very fast, and give good results for many flowfields. For complex flowfields such as those mentioned above, however, they are often inadequate. It was therefore deemed that a two-equation turbulence model will be required for the HPA computations. The k-epsilon two-equation turbulence model was implemented on the Intel iPSC/860. Both the Chien low-Reynolds-number model and a generalized wall-function formulation were included.
Gyrokinetic micro-turbulence simulations on the NERSC 16-way SMP IBM SP computer: experiences and performance results

NASA Astrophysics Data System (ADS)

Ethier, Stephane; Lin, Zhihong

2001-10-01

Earlier this year, the National Energy Research Scientific Computing center (NERSC) took delivery of the second most powerful computer in the world. With its 2,528 processors running at a peak performance of 1.5 GFlops, this IBM SP machine has a theoretical performance of almost 3.8 TFlops. To efficiently harness such computing power in one single code is not an easy task and requires a good knowledge of the computer's architecture. Here we present the steps that we followed to improve our gyrokinetic micro-turbulence code GTC in order to take advantage of the new 16-way shared memory nodes of the NERSC IBM SP. Performance results are shown as well as details about the improved mixed-mode MPI-OpenMP model that we use. The enhancements to the code allowed us to tackle much bigger problem sizes, getting closer to our goal of simulating an ITER-size tokamak with both kinetic ions and electrons. (This work is supported by DOE Contract No. DE-AC02-76CH03073 (PPPL), and in part by the DOE Fusion SciDAC Project.)
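The "almost 3.8 TFlops" figure follows directly from the two numbers quoted in the abstract. A minimal arithmetic check, where the helper name `theoretical_peak_tflops` is hypothetical:

```python
def theoretical_peak_tflops(n_processors, gflops_per_processor):
    """Aggregate peak in TFlops: processor count times per-processor peak."""
    return n_processors * gflops_per_processor / 1000.0


# Values taken from the abstract: 2,528 processors at 1.5 GFlops each.
peak = theoretical_peak_tflops(2528, 1.5)

if __name__ == "__main__":
    print(f"theoretical peak: {peak:.3f} TFlops")
```

This is a theoretical ceiling only; the abstract's point is that approaching it in a single code requires restructuring for the 16-way shared-memory nodes.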
Fast algorithms for computing phylogenetic divergence time.

PubMed

Crosby, Ralph W; Williams, Tiffani L

2017-12-06

The inference of species divergence time is a key step in most phylogenetic studies. Methods have been available for the last ten years to perform the inference, but the performance of the methods does not yet scale well to studies with hundreds of taxa and thousands of DNA base pairs. For example, a study of 349 primate taxa was estimated to require over 9 months of processing time. In this work, we present a new algorithm, AncestralAge, that significantly improves the performance of the divergence time process. As part of AncestralAge, we demonstrate a new method for the computation of phylogenetic likelihood, and our experiments show a 90% improvement in likelihood computation time on the aforementioned dataset of 349 primate taxa with over 60,000 DNA base pairs. Additionally, we show that our new method for the computation of the Bayesian prior on node ages reduces the running time for this computation on the 349 taxa dataset by 99%. Through the use of these new algorithms we open up the ability to perform divergence time inference on large phylogenetic studies.
Computational structural mechanics for engine structures

NASA Technical Reports Server (NTRS)

Chamis, Christos C.

1988-01-01

The computational structural mechanics (CSM) program at Lewis encompasses the formulation and solution of structural mechanics problems and the development of integrated software systems to computationally simulate the performance, durability, and life of engine structures. It is structured to supplement, complement, and, whenever possible, replace costly experimental efforts. Specific objectives are to investigate unique advantages of parallel and multiprocessing for reformulating and solving structural mechanics and formulating and solving multidisciplinary mechanics and to develop integrated structural system computational simulators for predicting structural performance, evaluating newly developed methods, and identifying and prioritizing improved or missing methods.
Computational structural mechanics for engine structures

NASA Technical Reports Server (NTRS)

Chamis, Christos C.

1989-01-01

The computational structural mechanics (CSM) program at Lewis encompasses the formulation and solution of structural mechanics problems and the development of integrated software systems to computationally simulate the performance, durability, and life of engine structures. It is structured to supplement, complement, and, whenever possible, replace costly experimental efforts. Specific objectives are to investigate unique advantages of parallel and multiprocessing for reformulating and solving structural mechanics and formulating and solving multidisciplinary mechanics and to develop integrated structural system computational simulators for predicting structural performance, evaluating newly developed methods, and identifying and prioritizing improved or missing methods.
Student Achievement in Computer Programming: Lecture vs Computer-Aided Instruction

ERIC Educational Resources Information Center

Tsai, San-Yun W.; Pohl, Norval F.

1978-01-01

This paper discusses a study of the differences in student learning achievement, as measured by four different types of common performance evaluation techniques, in a college-level computer programming course under three teaching/learning environments: lecture, computer-aided instruction, and lecture supplemented with computer-aided instruction.…
Computers for Your Classroom: CAI and CMI.

ERIC Educational Resources Information Center

Thomas, David B.; Bozeman, William C.

1981-01-01

The availability of compact, low-cost computer systems provides a means of assisting classroom teachers in the performance of their duties. Computer-assisted instruction (CAI) and computer-managed instruction (CMI) are two applications of computer technology with which school administrators should become familiar. CAI is a teaching medium in which…

Research | Computational Science | NREL

Science.gov Websites

Research Research NREL's computational science experts use advanced high-performance computing (HPC) technologies, thereby accelerating the transformation of our nation's energy system. Enabling High-Impact Research NREL's computational science capabilities enable high-impact research. Some recent examples
Comparative Implementation of High Performance Computing for Power System Dynamic Simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Shuangshuang; Huang, Zhenyu; Diao, Ruisheng

Dynamic simulation for transient stability assessment is one of the most important, but intensive, computations for power system planning and operation. Present commercial software is mainly designed for sequential computation to run a single simulation, which is very time consuming with a single processor. The application of High Performance Computing (HPC) to dynamic simulations is very promising in accelerating the computing process by parallelizing its kernel algorithms while maintaining the same level of computation accuracy. This paper describes the comparative implementation of four parallel dynamic simulation schemes in two state-of-the-art HPC environments: Message Passing Interface (MPI) and Open Multi-Processing (OpenMP). These implementations serve to match the application with dedicated multi-processor computing hardware and maximize the utilization and benefits of HPC during the development process.
Application of Blind Quantum Computation to Two-Party Quantum Computation

NASA Astrophysics Data System (ADS)

Sun, Zhiyuan; Li, Qin; Yu, Fang; Chan, Wai Hong

2018-06-01

Blind quantum computation (BQC) allows a client who has only limited quantum power to achieve quantum computation with the help of a remote quantum server and still keep the client's input, output, and algorithm private. Recently, Kashefi and Wallden extended BQC to achieve two-party quantum computation which allows two parties Alice and Bob to perform a joint unitary transform upon their inputs. However, in their protocol Alice has to prepare rotated single qubits and perform Pauli operations, and Bob needs to have a powerful quantum computer. In this work, we also utilize the idea of BQC to put forward an improved two-party quantum computation protocol in which the operations of both Alice and Bob are simplified since Alice only needs to apply Pauli operations and Bob is just required to prepare and encrypt his input qubits.
Application of Blind Quantum Computation to Two-Party Quantum Computation

NASA Astrophysics Data System (ADS)

Sun, Zhiyuan; Li, Qin; Yu, Fang; Chan, Wai Hong

2018-03-01

Blind quantum computation (BQC) allows a client who has only limited quantum power to achieve quantum computation with the help of a remote quantum server and still keep the client's input, output, and algorithm private. Recently, Kashefi and Wallden extended BQC to achieve two-party quantum computation which allows two parties Alice and Bob to perform a joint unitary transform upon their inputs. However, in their protocol Alice has to prepare rotated single qubits and perform Pauli operations, and Bob needs to have a powerful quantum computer. In this work, we also utilize the idea of BQC to put forward an improved two-party quantum computation protocol in which the operations of both Alice and Bob are simplified since Alice only needs to apply Pauli operations and Bob is just required to prepare and encrypt his input qubits.
Computer Music

NASA Astrophysics Data System (ADS)

Cook, Perry R.

This chapter covers algorithms, technologies, computer languages, and systems for computer music. Computer music involves the application of computers and other digital/electronic technologies to music composition, performance, theory, history, and the study of perception. The field combines digital signal processing, computational algorithms, computer languages, hardware and software systems, acoustics, psychoacoustics (low-level perception of sounds from the raw acoustic signal), and music cognition (higher-level perception of musical style, form, emotion, etc.).
Toward real-time Monte Carlo simulation using a commercial cloud computing infrastructure

NASA Astrophysics Data System (ADS)

Wang, Henry; Ma, Yunzhi; Pratx, Guillem; Xing, Lei

2011-09-01

Monte Carlo (MC) methods are the gold standard for modeling photon and electron transport in a heterogeneous medium; however, their computational cost prohibits their routine use in the clinic. Cloud computing, wherein computing resources are allocated on-demand from a third party, is a new approach for high performance computing and is implemented to perform ultra-fast MC calculation in radiation therapy. We deployed the EGS5 MC package in a commercial cloud environment. Launched from a single local computer with Internet access, a Python script allocates a remote virtual cluster. A handshaking protocol designates master and worker nodes. The EGS5 binaries and the simulation data are initially loaded onto the master node. The simulation is then distributed among independent worker nodes via the message passing interface, and the results aggregated on the local computer for display and data analysis. The described approach is evaluated for pencil beams and broad beams of high-energy electrons and photons. The output of cloud-based MC simulation is identical to that produced by single-threaded implementation. For 1 million electrons, a simulation that takes 2.58 h on a local computer can be executed in 3.3 min on the cloud with 100 nodes, a 47× speed-up. Simulation time scales inversely with the number of parallel nodes. The parallelization overhead is also negligible for large simulations. Cloud computing represents one of the most important recent advances in supercomputing technology and provides a promising platform for substantially improved MC simulation. In addition to the significant speed up, cloud computing builds a layer of abstraction for high performance parallel computing, which may change the way dose calculations are performed and radiation treatment plans are completed. This work was presented in part at the 2010 Annual Meeting of the American Association of Physicists in Medicine (AAPM), Philadelphia, PA.
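The reported 47× speed-up can be reproduced from the two timings in the abstract. The sketch below also derives a per-node ratio; the function names are hypothetical, and that ratio is only indicative because the local computer and a single cloud node are not necessarily equally fast.

```python
def speedup(t_reference, t_parallel):
    """Ratio of reference run time to parallel run time (same time units)."""
    return t_reference / t_parallel


def per_node_ratio(s, n_nodes):
    """Speed-up divided by node count (parallel efficiency if nodes match the reference)."""
    return s / n_nodes


# Timings from the abstract: 2.58 h locally, 3.3 min on 100 cloud nodes.
s = speedup(2.58 * 60.0, 3.3)
e = per_node_ratio(s, 100)

if __name__ == "__main__":
    print(f"speed-up: {s:.1f}x, per-node ratio: {e:.2f}")
```

A per-node ratio below 1 is consistent with the abstract's claim of inverse scaling with node count if each cloud node is slower than the local machine, since inverse scaling describes how cloud time changes with the number of nodes rather than the comparison with local hardware.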
MPI implementation of PHOENICS: A general purpose computational fluid dynamics code

NASA Astrophysics Data System (ADS)

Simunovic, S.; Zacharia, T.; Baltas, N.; Spalding, D. B.

1995-03-01

PHOENICS is a suite of computational analysis programs that are used for simulation of fluid flow, heat transfer, and dynamical reaction processes. The parallel version of the solver EARTH for the Computational Fluid Dynamics (CFD) program PHOENICS has been implemented using Message Passing Interface (MPI) standard. Implementation of MPI version of PHOENICS makes this computational tool portable to a wide range of parallel machines and enables the use of high performance computing for large scale computational simulations. MPI libraries are available on several parallel architectures making the program usable across different architectures as well as on heterogeneous computer networks. The Intel Paragon NX and MPI versions of the program have been developed and tested on massively parallel supercomputers Intel Paragon XP/S 5, XP/S 35, and Kendall Square Research, and on the multiprocessor SGI Onyx computer at Oak Ridge National Laboratory. The preliminary testing results of the developed program have shown scalable performance for reasonably sized computational domains.
MPI implementation of PHOENICS: A general purpose computational fluid dynamics code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Simunovic, S.; Zacharia, T.; Baltas, N.

1995-04-01

PHOENICS is a suite of computational analysis programs that are used for simulation of fluid flow, heat transfer, and dynamical reaction processes. The parallel version of the solver EARTH for the Computational Fluid Dynamics (CFD) program PHOENICS has been implemented using Message Passing Interface (MPI) standard. Implementation of MPI version of PHOENICS makes this computational tool portable to a wide range of parallel machines and enables the use of high performance computing for large scale computational simulations. MPI libraries are available on several parallel architectures making the program usable across different architectures as well as on heterogeneous computer networks. The Intel Paragon NX and MPI versions of the program have been developed and tested on massively parallel supercomputers Intel Paragon XP/S 5, XP/S 35, and Kendall Square Research, and on the multiprocessor SGI Onyx computer at Oak Ridge National Laboratory. The preliminary testing results of the developed program have shown scalable performance for reasonably sized computational domains.
Shor's factoring algorithm and modern cryptography. An illustration of the capabilities inherent in quantum computers

NASA Astrophysics Data System (ADS)

Gerjuoy, Edward

2005-06-01

The security of messages encoded via the widely used RSA public key encryption system rests on the enormous computational effort required to find the prime factors of a large number N using classical (conventional) computers. In 1994 Peter Shor showed that for sufficiently large N, a quantum computer could perform the factoring with much less computational effort. This paper endeavors to explain, in a fashion comprehensible to the nonexpert, the RSA encryption protocol; the various quantum computer manipulations constituting the Shor algorithm; how the Shor algorithm performs the factoring; and the precise sense in which a quantum computer employing Shor's algorithm can be said to accomplish the factoring of very large numbers with less computational effort than a classical computer. It is made apparent that factoring N generally requires many successive runs of the algorithm. Our analysis reveals that the probability of achieving a successful factorization on a single run is about twice as large as commonly quoted in the literature.
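The classical part of the procedure described in this abstract can be sketched as follows. This is an illustration, not the paper's presentation: `find_period` is a brute-force stand-in for the quantum period-finding step, and both function names are hypothetical. The sketch shows why a single run can fail and several runs may be needed.

```python
from math import gcd


def find_period(a, n):
    """Smallest r > 0 with a**r % n == 1; requires gcd(a, n) == 1.

    Brute-force placeholder for the step a quantum computer would accelerate.
    """
    r, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        r += 1
    return r


def factor_from_base(a, n):
    """Try to split n using base a; return (p, q) or None if this run fails."""
    g = gcd(a, n)
    if g != 1:
        return g, n // g          # the base already shares a factor with n
    r = find_period(a, n)
    if r % 2 == 1:
        return None               # odd period: this run fails
    y = pow(a, r // 2, n)
    if y == n - 1:
        return None               # a**(r/2) is -1 mod n: this run fails
    p = gcd(y - 1, n)             # nontrivial because y*y % n == 1 and y is not 1 or n-1
    return p, n // p


if __name__ == "__main__":
    for base in (7, 14):
        print(base, factor_from_base(base, 15))
```

For N = 15, base 7 has period 4 and yields the factors 3 and 5, while base 14 has period 2 with 14 ≡ -1 (mod 15), so that run fails and another base must be tried.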
Cogeneration Technology Alternatives Study (CTAS). Volume 6: Computer data. Part 2: Residual-fired nocogeneration process boiler

NASA Technical Reports Server (NTRS)

Knightly, W. F.

1980-01-01

Computer generated data on the performance of the cogeneration energy conversion system are presented. Performance parameters included fuel consumption and savings, capital costs, economics, and emissions of residual fired process boilers.
High performance systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vigil, M.B.

1995-03-01

This document provides a written compilation of the presentations and viewgraphs from the 1994 Conference on High Speed Computing given at the High Speed Computing Conference, "High Performance Systems," held at Gleneden Beach, Oregon, on April 18 through 21, 1994.
Modeling and analysis to quantify MSE wall behavior and performance.

DOT National Transportation Integrated Search

2009-08-01

To better understand potential sources of adverse performance of mechanically stabilized earth (MSE) walls, a suite of analytical models was studied using the computer program FLAC, a numerical modeling computer program widely used in geotechnical en...
Personal Computer Price and Performance.

ERIC Educational Resources Information Center

Crawford, Walt

1993-01-01

Discusses personal computer price trends since 1986; describes offerings and prices for four direct-market suppliers, i.e., Dell, CompuAdd, PC Brand, and Gateway 2000; and discusses overall value and price/performance ratios. Tables and graphs chart value over time. (EA)
Long term pavement performance computed parameter : moisture content

DOT National Transportation Integrated Search

2008-01-01

A study was conducted to compute in situ soil parameters based on time domain reflectometry (TDR) traces obtained from Long Term Pavement Performance (LTPP) test sections instrumented for the seasonal monitoring program (SMP). Ten TDR sensors were in...
OAO battery data analysis

NASA Technical Reports Server (NTRS)

Gaston, S.; Wertheim, M.; Orourke, J. A.

1973-01-01

Summary, consolidation and analysis of specifications, manufacturing process and test controls, and performance results for OAO-2 and OAO-3 lot 20 Amp-Hr sealed nickel cadmium cells and batteries are reported. Correlation of improvements in control requirements with performance is a key feature. Updates for a cell/battery computer model to improve performance prediction capability are included. Applicability of regression analysis computer techniques to relate process controls to performance is checked.
Correlation tracking study for meter-class solar telescope on space shuttle. [solar granulation

NASA Technical Reports Server (NTRS)

Smithson, R. C.; Tarbell, T. D.

1977-01-01

The theory and expected performance level of correlation trackers used to control the pointing of a solar telescope in space using white light granulation as a target were studied. Three specific trackers were modeled and their performance levels predicted for telescopes of various apertures. The performance of the computer model trackers on computer enhanced granulation photographs was evaluated. Parametric equations for predicting tracker performance are presented.
A High Performance SOAP Engine for Grid Computing

NASA Astrophysics Data System (ADS)

Wang, Ning; Welzl, Michael; Zhang, Liang

Web Service technology still has many defects that make its usage for Grid computing problematic, most notably the low performance of the SOAP engine. In this paper, we develop a novel SOAP engine called SOAPExpress, which adopts two key techniques for improving processing performance: SCTP data transport and dynamic early binding based data mapping. Experimental results show a significant and consistent performance improvement of SOAPExpress over Apache Axis.
Performance Analysis, Modeling and Scaling of HPC Applications and Tools

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bhatele, Abhinav

2016-01-13

Efficient use of supercomputers at DOE centers is vital for maximizing system throughput, minimizing energy costs and enabling science breakthroughs faster. This requires complementary efforts along several directions to optimize the performance of scientific simulation codes and the underlying runtimes and software stacks. This in turn requires providing scalable performance analysis tools and modeling techniques that can provide feedback to physicists and computer scientists developing the simulation codes and runtimes respectively. The PAMS project is using time allocations on supercomputers at ALCF, NERSC and OLCF to further the goals described above by performing research along the following fronts: 1. Scaling Study of HPC applications; 2. Evaluation of Programming Models; 3. Hardening of Performance Tools; 4. Performance Modeling of Irregular Codes; and 5. Statistical Analysis of Historical Performance Data. We are a team of computer and computational scientists funded by both DOE/NNSA and DOE/ASCR programs such as ECRP, XStack (Traleika Glacier, PIPER), ExaOSR (ARGO), SDMAV II (MONA) and PSAAP II (XPACC). This allocation will enable us to study big data issues when analyzing performance on leadership computing class systems and to assist the HPC community in making the most effective use of these resources.
Analysis of scalability of high-performance 3D image processing platform for virtual colonoscopy

NASA Astrophysics Data System (ADS)

Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli

2014-03-01

One of the key challenges in three-dimensional (3D) medical imaging is to enable the fast turn-around time that is often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that need to be processed. For this purpose, we previously developed a software platform for high-performance 3D medical image processing, called the HPC 3D-MIP platform, which employs increasingly available and affordable commodity computing systems such as multicore, cluster, and cloud computing systems. To achieve scalable high-performance computing, the platform employed size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D-MIP algorithms, supported task scheduling for efficient load distribution and balancing, and consisted of layered parallel software libraries that allow image processing applications to share common functionalities. We evaluated the performance of the HPC 3D-MIP platform by applying it to computationally intensive processes in virtual colonoscopy. Experimental results showed a 12-fold performance improvement on a workstation with 12-core CPUs over the original sequential implementation of the processes, indicating the efficiency of the platform. Analysis of performance scalability based on Amdahl's law for symmetric multicore chips showed the potential for high performance scalability of the HPC 3D-MIP platform when a larger number of cores is available.
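The scalability analysis mentioned here rests on Amdahl's law. The sketch below gives the classic form and, as an assumption about which variant is meant, the symmetric-multicore form of Hill and Marty, in which a chip with a budget of n base-core equivalents is built from cores of r equivalents each and per-core performance is modelled as sqrt(r). The function names and the sample parallel fraction are illustrative and not taken from the paper.

```python
from math import sqrt


def amdahl_speedup(parallel_fraction, n_cores):
    """Classic Amdahl's law: speed-up with a fraction f of the work parallelised."""
    f = parallel_fraction
    return 1.0 / ((1.0 - f) + f / n_cores)


def symmetric_multicore_speedup(parallel_fraction, n_bce, r):
    """Hill-Marty symmetric multicore model (assumed variant).

    n_bce: chip budget in base-core equivalents; r: equivalents spent per core.
    Per-core performance is modelled as sqrt(r), which is a modelling assumption.
    """
    f = parallel_fraction
    perf = sqrt(r)
    return 1.0 / ((1.0 - f) / perf + f * r / (perf * n_bce))


if __name__ == "__main__":
    for cores in (12, 24, 48, 96):
        print(cores, round(amdahl_speedup(0.99, cores), 2))
```

With r = 1 the symmetric form reduces to the classic one, so the two functions agree for chips built entirely from base cores.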
Portability and Cross-Platform Performance of an MPI-Based Parallel Polygon Renderer

NASA Technical Reports Server (NTRS)

Crockett, Thomas W.

1999-01-01

Visualizing the results of computations performed on large-scale parallel computers is a challenging problem, due to the size of the datasets involved. One approach is to perform the visualization and graphics operations in place, exploiting the available parallelism to obtain the necessary rendering performance. Over the past several years, we have been developing algorithms and software to support visualization applications on NASA's parallel supercomputers. Our results have been incorporated into a parallel polygon rendering system called PGL. PGL was initially developed on tightly-coupled distributed-memory message-passing systems, including Intel's iPSC/860 and Paragon, and IBM's SP2. Over the past year, we have ported it to a variety of additional platforms, including the HP Exemplar, SGI Origin2000, Cray T3E, and clusters of Sun workstations. In implementing PGL, we have had two primary goals: cross-platform portability and high performance. Portability is important because (1) our manpower resources are limited, making it difficult to develop and maintain multiple versions of the code, and (2) NASA's complement of parallel computing platforms is diverse and subject to frequent change. Performance is important in delivering adequate rendering rates for complex scenes and ensuring that parallel computing resources are used effectively. Unfortunately, these two goals are often at odds. In this paper we report on our experiences with portability and performance of the PGL polygon renderer across a range of parallel computing platforms.

Comparing Postsecondary Marketing Student Performance on Computer-Based and Handwritten Essay Tests

ERIC Educational Resources Information Center

Truell, Allen D.; Alexander, Melody W.; Davis, Rodney E.

2004-01-01

The purpose of this study was to determine if there were differences in postsecondary marketing student performance on essay tests based on test format (i.e., computer-based or handwritten). Specifically, the variables of performance, test completion time, and gender were explored for differences based on essay test format. Results of the study…
Effects of Computer Skill on Mouse Move and Click Performance

ERIC Educational Resources Information Center

Panagiotakopoulos, Chris; Sarris, Menelaos

2008-01-01

This study focuses on the use of computers in the field of education. It reports a series of experimental mouse move and click tasks on constant and moving stimuli. These experiments attempt to explore the efficiency with which individuals of different skill level and age group perform using a mouse. Differences in performance between high-skill…
Virtual reality computer simulation.

PubMed

Grantcharov, T P; Rosenberg, J; Pahle, E; Funch-Jensen, P

2001-03-01

Objective assessment of psychomotor skills should be an essential component of a modern surgical training program. There are computer systems that can be used for this purpose, but their wide application is not yet generally accepted. The aim of this study was to validate the role of virtual reality computer simulation as a method for evaluating surgical laparoscopic skills. The study included 14 surgical residents. On day 1, they performed two runs of all six tasks on the Minimally Invasive Surgical Trainer, Virtual Reality (MIST VR). On day 2, they performed a laparoscopic cholecystectomy on living pigs; afterward, they were tested again on the MIST VR. A group of experienced surgeons evaluated the trainees' performance on the animal operation, giving scores for total performance error and economy of motion. During the tasks on the MIST VR, errors and noneconomy of movements for the left and right hand were also recorded. There were significant correlations between error scores in vivo and three of the six in vitro tasks (p < 0.05). In vivo economy scores correlated significantly with non-economy right-hand scores for five of the six tasks and with non-economy left-hand scores for one of the six tasks (p < 0.05). In this study, laparoscopic performance in the animal model correlated significantly with performance on the computer simulator. Thus, the computer model seems to be a promising objective method for the assessment of laparoscopic psychomotor skills.
Hybrid, experimental and computational, investigation of mechanical components

NASA Astrophysics Data System (ADS)

Furlong, Cosme; Pryputniewicz, Ryszard J.

1996-07-01

Computational and experimental methodologies have unique features for the analysis and solution of a wide variety of engineering problems. Computations provide results that depend on selection of input parameters such as geometry, material constants, and boundary conditions which, for correct modeling purposes, have to be appropriately chosen. In addition, it is relatively easy to modify the input parameters in order to computationally investigate different conditions. Experiments provide solutions which characterize the actual behavior of the object of interest subjected to specific operating conditions. However, it is impractical to experimentally perform parametric investigations. This paper discusses the use of a hybrid, computational and experimental, approach for study and optimization of mechanical components. Computational techniques are used for modeling the behavior of the object of interest while it is experimentally tested using noninvasive optical techniques. Comparisons are performed through a fringe predictor program used to facilitate the correlation between both techniques. In addition, experimentally obtained quantitative information, such as displacements and shape, can be applied in the computational model in order to improve this correlation. The result is a validated computational model that can be used for performing quantitative analyses and structural optimization. Practical application of the hybrid approach is illustrated with a representative example which demonstrates the viability of the approach as an engineering tool for structural analysis and optimization.
Visuospatial skills and computer game experience influence the performance of virtual endoscopy.

PubMed

Enochsson, Lars; Isaksson, Bengt; Tour, René; Kjellin, Ann; Hedman, Leif; Wredmark, Torsten; Tsai-Felländer, Li

2004-11-01

Advanced medical simulators have been introduced to facilitate surgical and endoscopic training and thereby improve patient safety. Residents trained in the Procedicus Minimally Invasive Surgical Trainer-Virtual Reality (MIST-VR) laparoscopic simulator perform laparoscopic cholecystectomy more safely and faster than a control group. Little has been reported regarding whether factors like gender, computer experience, and visuospatial tests can predict the performance with a medical simulator. Our aim was to investigate whether such factors influence the performance of simulated gastroscopy. Seventeen medical students were asked about computer gaming experiences. Before virtual endoscopy, they performed the visuospatial test PicSOr, which discriminates the ability of the tested person to create a three-dimensional image from a two-dimensional presentation. Each student performed one gastroscopy (level 1, case 1) in the GI Mentor II, Simbionix, and several variables related to performance were registered. Percentage of time spent with a clear view in the endoscope correlated well with the performance on the PicSOr test (r = 0.56, P < 0.001). Efficiency of screening also correlated with PicSOr (r = 0.23, P < 0.05). In students with computer gaming experience, the efficiency of screening increased (33.6% +/- 3.1% versus 22.6% +/- 2.8%, P < 0.05) and the duration of the examination decreased by 1.5 minutes (P < 0.05). A similar trend was seen in men compared with women. The visuospatial test PicSOr predicts the results with the endoscopic simulator GI Mentor II. Two-dimensional image experience, as in computer games, also seems to affect the outcome.
Transient Three-Dimensional Side Load Analysis of a Film Cooled Nozzle

NASA Technical Reports Server (NTRS)

Wang, Ten-See; Guidos, Mike

2008-01-01

Transient three-dimensional numerical investigations of the side load physics for an engine encompassing a film cooled nozzle extension and a regeneratively cooled thrust chamber were performed. The objectives of this study are to identify the three-dimensional side load physics and to compute the associated aerodynamic side load using an anchored computational methodology. The computational methodology is based on an unstructured-grid, pressure-based computational fluid dynamics formulation, and a transient inlet history based on an engine system simulation. Ultimately, the computational results will be provided to the nozzle designers for estimating the effect of the peak side load on the nozzle structure. Computations simulating engine startup at ambient pressures corresponding to sea level and three high altitudes were performed. In addition, computations for both engine startup and shutdown transients were also performed for a stub nozzle operating at sea level. For the engine with the full nozzle extension, the computational results show that, when starting up at sea level, the peak side load occurs when the lambda shock steps into the turbine exhaust flow, while the side load caused by the transition from free-shock separation to restricted-shock separation comes second; the side loads decrease rapidly and progressively as the ambient pressure decreases. For the stub nozzle operating at sea level, the computed side loads during both startup and shutdown become very small due to the much reduced flow area.
eXascale PRogramming Environment and System Software (XPRESS)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chapman, Barbara; Gabriel, Edgar

Exascale systems, with a thousand times the compute capacity of today’s leading edge petascale computers, are expected to emerge during the next decade. Their software systems will need to facilitate the exploitation of exceptional amounts of concurrency in applications, and ensure that jobs continue to run despite the occurrence of system failures and other kinds of hard and soft errors. Adapting computations at runtime to cope with changes in the execution environment, as well as to improve power and performance characteristics, is likely to become the norm. As a result, considerable innovation is required to develop system support to meet the needs of future computing platforms. The XPRESS project aims to develop and prototype a revolutionary software system for extreme-scale computing for both exascale and strong-scaled problems. The XPRESS collaborative research project will advance the state-of-the-art in high performance computing and enable exascale computing for current and future DOE mission-critical applications and supporting systems. The goals of the XPRESS research project are to: A. enable exascale performance capability for DOE applications, both current and future, B. develop and deliver a practical computing system software X-stack, OpenX, for future practical DOE exascale computing systems, and C. provide programming methods and environments for effective means of expressing application and system software for portable exascale system execution.
Computation Directorate 2008 Annual Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Crawford, D L

2009-03-25

Whether a computer is simulating the aging and performance of a nuclear weapon, the folding of a protein, or the probability of rainfall over a particular mountain range, the necessary calculations can be enormous. Our computers help researchers answer these and other complex problems, and each new generation of system hardware and software widens the realm of possibilities. Building on Livermore's historical excellence and leadership in high-performance computing, Computation added more than 331 trillion floating-point operations per second (teraFLOPS) of power to LLNL's computer room floors in 2008. In addition, Livermore's next big supercomputer, Sequoia, advanced ever closer to its 2011-2012 delivery date, as architecture plans and the procurement contract were finalized. Hyperion, an advanced technology cluster test bed that teams Livermore with 10 industry leaders, made a big splash when it was announced during Michael Dell's keynote speech at the 2008 Supercomputing Conference. The Wall Street Journal touted Hyperion as a 'bright spot amid turmoil' in the computer industry. Computation continues to measure and improve the costs of operating LLNL's high-performance computing systems by moving hardware support in-house, by measuring causes of outages to apply resources asymmetrically, and by automating most of the account and access authorization and management processes. These improvements enable more dollars to go toward fielding the best supercomputers for science, while operating them at less cost and greater responsiveness to the customers.
Towards reversible basic linear algebra subprograms: A performance study

DOE PAGES

Perumalla, Kalyan S.; Yoginath, Srikanth B.

2014-12-06

Problems such as fault tolerance and scalable synchronization can be efficiently solved using reversibility of applications. Making applications reversible by relying on computation rather than on memory is ideal for large scale parallel computing, especially for the next generation of supercomputers in which memory is expensive in terms of latency, energy, and price. In this direction, a case study is presented here in reversing a computational core, namely, Basic Linear Algebra Subprograms, which is widely used in scientific applications. A new Reversible BLAS (RBLAS) library interface has been designed, and a prototype has been implemented with two modes: (1) a memory-mode in which reversibility is obtained by checkpointing to memory in forward and restoring from memory in reverse, and (2) a computational-mode in which nothing is saved in the forward, but restoration is done entirely via inverse computation in reverse. The article is focused on detailed performance benchmarking to evaluate the runtime dynamics and performance effects, comparing reversible computation with checkpointing on both traditional CPU platforms and recent GPU accelerator platforms. For BLAS Level-1 subprograms, data indicates over an order of magnitude better speed of reversible computation compared to checkpointing. For BLAS Level-2 and Level-3, a more complex tradeoff is observed between reversible computation and checkpointing, depending on computational and memory complexities of the subprograms.
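The two modes described in this abstract can be illustrated with a minimal Python sketch on a Level-1 style update, y <- a*x + y. The function names below are hypothetical and are not the RBLAS interface; the sketch only contrasts saving a checkpoint with undoing the update by inverse computation. Exact rational arithmetic (`fractions.Fraction`) is used as an assumption so that the inverse is exact; with floating-point values, round-off can prevent bit-exact restoration.

```python
from fractions import Fraction


def axpy_forward_memory(a, x, y):
    """Memory mode (illustrative): checkpoint y, then update y in place."""
    checkpoint = list(y)  # extra storage proportional to len(y)
    for i in range(len(y)):
        y[i] += a * x[i]
    return checkpoint


def axpy_reverse_memory(y, checkpoint):
    """Memory mode (illustrative): restore y from the saved checkpoint."""
    y[:] = checkpoint


def axpy_forward_compute(a, x, y):
    """Computational mode (illustrative): update y in place, save nothing."""
    for i in range(len(y)):
        y[i] += a * x[i]


def axpy_reverse_compute(a, x, y):
    """Computational mode (illustrative): undo the update by subtracting a*x."""
    for i in range(len(y)):
        y[i] -= a * x[i]


# Hypothetical example data, chosen only for illustration.
a = Fraction(3, 2)
x = [Fraction(1, 3), Fraction(2), Fraction(-5, 7)]
y_original = [Fraction(1), Fraction(1, 2), Fraction(4)]

y_mem = list(y_original)
saved = axpy_forward_memory(a, x, y_mem)
axpy_reverse_memory(y_mem, saved)

y_cmp = list(y_original)
axpy_forward_compute(a, x, y_cmp)
axpy_reverse_compute(a, x, y_cmp)
```

Both paths return `y` to its original contents; the memory mode pays with a copy of `y`, while the computational mode pays with a second pass of arithmetic.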
Performance/Design Requirements and Detailed Technical Description for a Computer-Directed Training Subsystem for Integration into the Air Force Phase II Base Level System.

ERIC Educational Resources Information Center

Butler, A. K.; And Others

The performance/design requirements and a detailed technical description for a Computer-Directed Training Subsystem to be integrated into the Air Force Phase II Base Level System are described. The subsystem may be used for computer-assisted lesson construction and has presentation capability for on-the-job training for data automation, staff, and…
A stirling engine computer model for performance calculations

NASA Technical Reports Server (NTRS)

Tew, R.; Jefferies, K.; Miao, D.

1978-01-01

To support the development of the Stirling engine as a possible alternative to the automobile spark-ignition engine, the thermodynamic characteristics of the Stirling engine were analyzed and modeled on a computer. The modeling techniques used are presented. The performance of an existing rhombic-drive Stirling engine was simulated by use of this computer program, and some typical results are presented. Engine tests are planned in order to evaluate this model.
Achieving supercomputer performance for neural net simulation with an array of digital signal processors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Muller, U.A.; Baumle, B.; Kohler, P.

1992-10-01

Music, a DSP-based system with a parallel distributed-memory architecture, provides enormous computing power yet retains the flexibility of a general-purpose computer. Reaching a peak performance of 2.7 Gflops at a significantly lower cost, power consumption, and space requirement than conventional supercomputers, Music is well suited to computationally intensive applications such as neural network simulation. 12 refs., 9 figs., 2 tabs.
Applications of CFD and visualization techniques

NASA Technical Reports Server (NTRS)

Saunders, James H.; Brown, Susan T.; Crisafulli, Jeffrey J.; Southern, Leslie A.

1992-01-01

In this paper, three applications are presented to illustrate current techniques for flow calculation and visualization. The first two applications use a commercial computational fluid dynamics (CFD) code, FLUENT, performed on a Cray Y-MP. The results are animated with the aid of data visualization software, apE. The third application simulates a particulate deposition pattern using techniques inspired by developments in nonlinear dynamical systems. These computations were performed on personal computers.
Computer architecture evaluation for structural dynamics computations: Project summary

NASA Technical Reports Server (NTRS)

Standley, Hilda M.

1989-01-01

The intent of the proposed effort is the examination of the impact of the elements of parallel architectures on the performance realized in a parallel computation. To this end, three major projects are developed: a language for the expression of high level parallelism, a statistical technique for the synthesis of multicomputer interconnection networks based upon performance prediction, and a queueing model for the analysis of shared memory hierarchies.
THREED: A computer program for three dimensional transformation of coordinates. [in lunar photo triangulation mapping

NASA Technical Reports Server (NTRS)

Wong, K. W.

1974-01-01

Program THREED was developed for the purpose of a research study on the treatment of control data in lunar phototriangulation. THREED is the code name of a computer program for performing absolute orientation by the method of three-dimensional projective transformation. It has the capability of performing complete error analysis on the computed transformation parameters as well as the transformed coordinates.
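As a rough illustration of what a three-dimensional projective transformation of coordinates involves, the following Python sketch applies a 4x4 homogeneous matrix to a point and divides by the resulting homogeneous coordinate. This is the general textbook form, written under the assumption that it matches the kind of transformation the abstract refers to; it is not taken from THREED, and the function name and matrix values are hypothetical.

```python
def projective_transform(H, point):
    """Apply a 4x4 homogeneous matrix H (list of 4 rows) to a 3D point."""
    x, y, z = point
    hom = (x, y, z, 1.0)
    # Each output coordinate is the dot product of one matrix row with (x, y, z, 1).
    xh, yh, zh, w = (sum(h * c for h, c in zip(row, hom)) for row in H)
    if w == 0:
        raise ValueError("point maps to infinity under this transformation")
    # Dividing by w is what makes the mapping projective rather than affine.
    return (xh / w, yh / w, zh / w)


# Hypothetical matrix: a translation by (1, 2, 3) with a uniform divisor of 2.
H_example = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 2.0],
]

transformed = projective_transform(H_example, (1.0, 1.0, 1.0))
```

Because the matrix is defined only up to scale, such a transformation has 15 independent parameters, which is why an error analysis on the estimated parameters, as the abstract mentions, is a natural companion to the fit.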
Optimal pre-scheduling of problem remappings

NASA Technical Reports Server (NTRS)

Nicol, David M.; Saltz, Joel H.

1987-01-01

A large class of scientific computational problems can be characterized as a sequence of steps where a significant amount of computation occurs each step, but the work performed at each step is not necessarily identical. Two good examples of this type of computation are: (1) regridding methods which change the problem discretization during the course of the computation, and (2) methods for solving sparse triangular systems of linear equations. Recent work has investigated a means of mapping such computations onto parallel processors; the method defines a family of static mappings with differing degrees of importance placed on the conflicting goals of good load balance and low communication/synchronization overhead. The performance tradeoffs are controllable by adjusting the parameters of the mapping method. To achieve good performance it may be necessary to dynamically change these parameters at run-time, but such changes can impose additional costs. If the computation's behavior can be determined prior to its execution, it can be possible to construct an optimal parameter schedule using a low-order-polynomial-time dynamic programming algorithm. Since the latter can be expensive, the performance is studied of the effect of a linear-time scheduling heuristic on one of the model problems, and it is shown to be effective and nearly optimal.
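The idea of building a parameter schedule by dynamic programming can be sketched in Python under a deliberately simple cost model, which is an assumption and not the model used in the paper: each step has a known cost under each candidate parameter setting, and changing the setting between consecutive steps incurs a fixed penalty. The names `optimal_schedule`, `step_cost`, and `change_cost` are hypothetical.

```python
def optimal_schedule(step_cost, change_cost):
    """Return (total_cost, schedule) minimizing step costs plus change penalties.

    step_cost[t][k] is the assumed cost of running step t with setting k.
    change_cost is the assumed penalty for switching settings between steps.
    Runs in O(T * K^2) time for T steps and K settings.
    """
    num_settings = len(step_cost[0])
    best = list(step_cost[0])  # best[k]: cheapest cost so far ending in setting k
    back = []                  # back[t-1][k]: predecessor setting chosen at step t

    for t in range(1, len(step_cost)):
        new_best, choices = [], []
        for k in range(num_settings):
            def arrive(j, k=k):
                # Cost of being in setting j at the previous step and moving to k.
                return best[j] + (0 if j == k else change_cost)
            prev = min(range(num_settings), key=arrive)
            new_best.append(arrive(prev) + step_cost[t][k])
            choices.append(prev)
        best = new_best
        back.append(choices)

    # Trace the cheapest final setting back to the first step.
    k = min(range(num_settings), key=lambda j: best[j])
    total = best[k]
    schedule = [k]
    for choices in reversed(back):
        k = choices[k]
        schedule.append(k)
    schedule.reverse()
    return total, schedule


# Hypothetical costs: setting 0 is cheap early, setting 1 is cheap late.
costs = [[1, 5], [1, 5], [6, 1], [6, 1]]
cheap_switch = optimal_schedule(costs, change_cost=2)
costly_switch = optimal_schedule(costs, change_cost=100)
```

With a small change penalty the schedule switches settings once; with a large penalty it stays with a single setting throughout, which mirrors the tradeoff between adapting parameters and paying for the change.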
Evaluating the Theoretic Adequacy and Applied Potential of Computational Models of the Spacing Effect.

PubMed

Walsh, Matthew M; Gluck, Kevin A; Gunzelmann, Glenn; Jastrzembski, Tiffany; Krusmark, Michael

2018-06-01

The spacing effect is among the most widely replicated empirical phenomena in the learning sciences, and its relevance to education and training is readily apparent. Yet successful applications of spacing effect research to education and training are rare. Computational modeling can provide the crucial link between a century of accumulated experimental data on the spacing effect and the emerging interest in using that research to enable adaptive instruction. In this paper, we review relevant literature and identify 10 criteria for rigorously evaluating computational models of the spacing effect. Five relate to evaluating the theoretic adequacy of a model, and five relate to evaluating its application potential. We use these criteria to evaluate a novel computational model of the spacing effect called the Predictive Performance Equation (PPE). Predictive Performance Equation combines elements of earlier models of learning and memory including the General Performance Equation, Adaptive Control of Thought-Rational, and the New Theory of Disuse, giving rise to a novel computational account of the spacing effect that performs favorably across the complete sets of theoretic and applied criteria. We implemented two other previously published computational models of the spacing effect and compare them to PPE using the theoretic and applied criteria as guides. Copyright © 2018 Cognitive Science Society, Inc.
CUDA Optimization Strategies for Compute- and Memory-Bound Neuroimaging Algorithms

PubMed Central

Lee, Daren; Dinov, Ivo; Dong, Bin; Gutman, Boris; Yanovsky, Igor; Toga, Arthur W.

2011-01-01

As neuroimaging algorithms and technology continue to grow faster than CPU performance in complexity and image resolution, data-parallel computing methods will be increasingly important. The high performance, data-parallel architecture of modern graphical processing units (GPUs) can reduce computational times by orders of magnitude. However, its massively threaded architecture introduces challenges when GPU resources are exceeded. This paper presents optimization strategies for compute- and memory-bound algorithms for the CUDA architecture. For compute-bound algorithms, the registers are reduced through variable reuse via shared memory and the data throughput is increased through heavier thread workloads and maximizing the thread configuration for a single thread block per multiprocessor. For memory-bound algorithms, fitting the data into the fast but limited GPU resources is achieved through reorganizing the data into self-contained structures and employing a multi-pass approach. Memory latencies are reduced by selecting memory resources whose cache performance is optimized for the algorithm's access patterns. We demonstrate the strategies on two computationally expensive algorithms and achieve optimized GPU implementations that perform up to 6× faster than unoptimized ones. Compared to CPU implementations, we achieve peak GPU speedups of 129× for the 3D unbiased nonlinear image registration technique and 93× for the non-local means surface denoising algorithm. PMID:21159404
Exploiting GPUs in Virtual Machine for BioCloud

PubMed Central

Jo, Heeseung; Jeong, Jinkyu; Lee, Myoungho; Choi, Dong Hoon

2013-01-01

Recently, biological applications have begun to be reimplemented to exploit the many cores of GPUs for better computation performance. Therefore, by providing virtualized GPUs to VMs in a cloud computing environment, many biological applications will readily move into the cloud environment to enhance their computation performance and utilize infinite cloud computing resources while reducing expenses for computations. In this paper, we propose a BioCloud system architecture that enables VMs to use GPUs in a cloud environment. Because much of the previous research has focused on mechanisms for sharing GPUs among VMs, it cannot achieve enough performance for biological applications, for which computation throughput is more crucial than sharing. The proposed system exploits the pass-through mode of the PCI Express (PCI-E) channel. By allowing each VM to access the underlying GPUs directly, applications can show almost the same performance as in a native environment. In addition, our scheme multiplexes GPUs by using the hot plug-in/out device features of the PCI-E channel. By adding or removing GPUs in each VM on demand, VMs on the same physical host can time-share their GPUs. We implemented the proposed system using the Xen VMM and NVIDIA GPUs and showed that our prototype is highly effective for biological GPU applications in a cloud environment. PMID:23710465
Multiphysics Computational Analysis of a Solid-Core Nuclear Thermal Engine Thrust Chamber

NASA Technical Reports Server (NTRS)

Wang, Ten-See; Canabal, Francisco; Cheng, Gary; Chen, Yen-Sen

2007-01-01

The objective of this effort is to develop an efficient and accurate computational heat transfer methodology to predict thermal, fluid, and hydrogen environments for a hypothetical solid-core, nuclear thermal engine - the Small Engine. In addition, the effects of power profile and hydrogen conversion on heat transfer efficiency and thrust performance were also investigated. The computational methodology is based on an unstructured-grid, pressure-based, all speeds, chemically reacting, computational fluid dynamics platform, while formulations of conjugate heat transfer were implemented to describe the heat transfer from solid to hydrogen inside the solid-core reactor. The computational domain covers the entire thrust chamber so that the afore-mentioned heat transfer effects impact the thrust performance directly. The result shows that the computed core-exit gas temperature, specific impulse, and core pressure drop agree well with those of design data for the Small Engine. Finite-rate chemistry is very important in predicting the proper energy balance as naturally occurring hydrogen decomposition is endothermic. Locally strong hydrogen conversion associated with centralized power profile gives poor heat transfer efficiency and lower thrust performance. On the other hand, uniform hydrogen conversion associated with a more uniform radial power profile achieves higher heat transfer efficiency, and higher thrust performance.

Exploiting GPUs in virtual machine for BioCloud.

PubMed

Jo, Heeseung; Jeong, Jinkyu; Lee, Myoungho; Choi, Dong Hoon

2013-01-01

Recently, biological applications have begun to be reimplemented to exploit the many cores of GPUs for better computation performance. Therefore, by providing virtualized GPUs to VMs in a cloud computing environment, many biological applications will readily move into the cloud environment to enhance their computation performance and utilize infinite cloud computing resources while reducing expenses for computations. In this paper, we propose a BioCloud system architecture that enables VMs to use GPUs in a cloud environment. Because much of the previous research has focused on mechanisms for sharing GPUs among VMs, it cannot achieve enough performance for biological applications, for which computation throughput is more crucial than sharing. The proposed system exploits the pass-through mode of the PCI Express (PCI-E) channel. By allowing each VM to access the underlying GPUs directly, applications can show almost the same performance as in a native environment. In addition, our scheme multiplexes GPUs by using the hot plug-in/out device features of the PCI-E channel. By adding or removing GPUs in each VM on demand, VMs on the same physical host can time-share their GPUs. We implemented the proposed system using the Xen VMM and NVIDIA GPUs and showed that our prototype is highly effective for biological GPU applications in a cloud environment.
CUDA optimization strategies for compute- and memory-bound neuroimaging algorithms.

PubMed

Lee, Daren; Dinov, Ivo; Dong, Bin; Gutman, Boris; Yanovsky, Igor; Toga, Arthur W

2012-06-01

As neuroimaging algorithms and technology continue to grow faster than CPU performance in complexity and image resolution, data-parallel computing methods will be increasingly important. The high performance, data-parallel architecture of modern graphical processing units (GPUs) can reduce computational times by orders of magnitude. However, its massively threaded architecture introduces challenges when GPU resources are exceeded. This paper presents optimization strategies for compute- and memory-bound algorithms for the CUDA architecture. For compute-bound algorithms, the registers are reduced through variable reuse via shared memory and the data throughput is increased through heavier thread workloads and maximizing the thread configuration for a single thread block per multiprocessor. For memory-bound algorithms, fitting the data into the fast but limited GPU resources is achieved through reorganizing the data into self-contained structures and employing a multi-pass approach. Memory latencies are reduced by selecting memory resources whose cache performance is optimized for the algorithm's access patterns. We demonstrate the strategies on two computationally expensive algorithms and achieve optimized GPU implementations that perform up to 6× faster than unoptimized ones. Compared to CPU implementations, we achieve peak GPU speedups of 129× for the 3D unbiased nonlinear image registration technique and 93× for the non-local means surface denoising algorithm. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
A Computer Program for Crystal Drawing.

ERIC Educational Resources Information Center

Dutch, Steven I.

1981-01-01

Described is a computer program which accepts face data, performs all necessary symmetry operations, and produces a drawing of the resulting crystal. The program shortens computing time to make it suitable for online teaching use or for use in small computers. (Author/DC)
Effects on Training Using Illumination in Virtual Environments

NASA Technical Reports Server (NTRS)

Maida, James C.; Novak, M. S. Jennifer; Mueller, Kristian

1999-01-01

Camera based tasks are commonly performed during orbital operations, and orbital lighting conditions, such as high contrast shadowing and glare, are a factor in performance. Computer based training using virtual environments is a common tool used to make and keep crew members proficient. If computer based training included some of these harsh lighting conditions, would the crew increase their proficiency? The project goal was to determine whether computer based training increases proficiency if one trains for a camera based task using computer generated virtual environments with enhanced lighting conditions such as shadows and glare rather than color shaded computer images normally used in simulators. Previous experiments were conducted using a two degree of freedom docking system. Test subjects had to align a boresight camera using a hand controller with one axis of rotation and one axis of rotation. Two sets of subjects were trained on two computer simulations using computer generated virtual environments, one with lighting, and one without. Results revealed that when subjects were constrained by time and accuracy, those who trained with simulated lighting conditions performed significantly better than those who did not. To reinforce these results for speed and accuracy, the task complexity was increased.
Welcome to the NASA High Performance Computing and Communications Computational Aerosciences (CAS) Workshop 2000

NASA Technical Reports Server (NTRS)

Schulbach, Catherine H. (Editor)

2000-01-01

The purpose of the CAS workshop is to bring together NASA's scientists and engineers and their counterparts in industry, other government agencies, and academia working in the Computational Aerosciences and related fields. This workshop is part of the technology transfer plan of the NASA High Performance Computing and Communications (HPCC) Program. Specific objectives of the CAS workshop are to: (1) communicate the goals and objectives of HPCC and CAS, (2) promote and disseminate CAS technology within the appropriate technical communities, including NASA, industry, academia, and other government labs, (3) help promote synergy among CAS and other HPCC scientists, and (4) permit feedback from peer researchers on issues facing High Performance Computing in general and the CAS project in particular. This year we had a number of exciting presentations in the traditional aeronautics, aerospace sciences, and high-end computing areas and in the less familiar (to many of us affiliated with CAS) earth science, space science, and revolutionary computing areas. Presentations of more than 40 high quality papers were organized into ten sessions and presented over the three-day workshop. The proceedings are organized here for easy access: by author, title and topic.
Re-Computation of Numerical Results Contained in NACA Report No. 496

NASA Technical Reports Server (NTRS)

Perry, Boyd, III

2015-01-01

An extensive examination of NACA Report No. 496 (NACA 496), "General Theory of Aerodynamic Instability and the Mechanism of Flutter," by Theodore Theodorsen, is described. The examination included checking equations and solution methods and re-computing interim quantities and all numerical examples in NACA 496. The checks revealed that NACA 496 contains computational shortcuts (time- and effort-saving devices for engineers of the time) and clever artifices (employed in its solution methods), but, unfortunately, also contains numerous tripping points (aspects of NACA 496 that have the potential to cause confusion) and some errors. The re-computations were performed employing the methods and procedures described in NACA 496, but using modern computational tools. With some exceptions, the magnitudes and trends of the original results were in fair-to-very-good agreement with the re-computed results. The exceptions included what are speculated to be computational errors in the original in some instances and transcription errors in the original in others. Independent flutter calculations were performed and, in all cases, including those where the original and re-computed results differed significantly, were in excellent agreement with the re-computed results. Appendix A contains NACA 496; Appendix B contains a MATLAB® program that performs the re-computation of results; Appendix C presents three alternate solution methods, with examples, for the two-degree-of-freedom solution method of NACA 496; Appendix D contains the three-degree-of-freedom solution method (outlined in NACA 496 but never implemented), with examples.
The role of the host in a cooperating mainframe and workstation environment, volumes 1 and 2

NASA Technical Reports Server (NTRS)

Kusmanoff, Antone; Martin, Nancy L.

1989-01-01

In recent years, advancements made in computer systems have prompted a move from centralized computing based on timesharing a large mainframe computer to distributed computing based on a connected set of engineering workstations. A major factor in this advancement is the increased performance and lower cost of engineering workstations. The shift to distributed computing from centralized computing has led to challenges associated with the residency of application programs within the system. In a combined system of multiple engineering workstations attached to a mainframe host, the question arises as to how a system designer should assign applications between the larger mainframe host and the smaller, yet powerful, workstation. The concepts related to real time data processing are analyzed and systems are displayed which use a host mainframe and a number of engineering workstations interconnected by a local area network. In most cases, distributed systems can be classified as having a single function or multiple functions and as executing programs in real time or nonreal time. In a system of multiple computers, the degree of autonomy of the computers is important; a system with one master control computer generally differs in reliability, performance, and complexity from a system in which all computers share the control. This research is concerned with generating general criteria and principles for software residency decisions (host or workstation) for a diverse yet coupled group of users (the clustered workstations) which may need the use of a shared resource (the mainframe) to perform their functions.
Efficient mapping algorithms for scheduling robot inverse dynamics computation on a multiprocessor system

NASA Technical Reports Server (NTRS)

Lee, C. S. G.; Chen, C. L.

1989-01-01

Two efficient mapping algorithms for scheduling the robot inverse dynamics computation consisting of m computational modules with precedence relationship to be executed on a multiprocessor system consisting of p identical homogeneous processors with processor and communication costs to achieve minimum computation time are presented. An objective function is defined in terms of the sum of the processor finishing time and the interprocessor communication time. The minimax optimization is performed on the objective function to obtain the best mapping. This mapping problem can be formulated as a combination of the graph partitioning and the scheduling problems; both have been known to be NP-complete. Thus, to speed up the searching for a solution, two heuristic algorithms were proposed to obtain fast but suboptimal mapping solutions. The first algorithm utilizes the level and the communication intensity of the task modules to construct an ordered priority list of ready modules and the module assignment is performed by a weighted bipartite matching algorithm. For a near-optimal mapping solution, the problem can be solved by the heuristic algorithm with simulated annealing. These proposed optimization algorithms can solve various large-scale problems within a reasonable time. Computer simulations were performed to evaluate and verify the performance and the validity of the proposed mapping algorithms. Finally, experiments for computing the inverse dynamics of a six-jointed PUMA-like manipulator based on the Newton-Euler dynamic equations were implemented on an NCUBE/ten hypercube computer to verify the proposed mapping algorithms. Computer simulation and experimental results are compared and discussed.
Construction of Blaze at the University of Illinois at Chicago: A Shared, High-Performance, Visual Computer for Next-Generation Cyberinfrastructure-Accelerated Scientific, Engineering, Medical and Public Policy Research

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, Maxine D.; Leigh, Jason

2014-02-17

The Blaze high-performance visual computing system serves the high-performance computing research and education needs of the University of Illinois at Chicago (UIC). Blaze consists of a state-of-the-art, networked, computer cluster and ultra-high-resolution visualization system called CAVE2(TM) that is currently not available anywhere in Illinois. This system is connected via a high-speed 100-Gigabit network to the State of Illinois' I-WIRE optical network, as well as to national and international high speed networks, such as the Internet2, and the Global Lambda Integrated Facility. This enables Blaze to serve as an on-ramp to national cyberinfrastructure, such as the National Science Foundation’s Blue Waters petascale computer at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign and the Department of Energy’s Argonne Leadership Computing Facility (ALCF) at Argonne National Laboratory. DOE award # DE-SC005067, leveraged with NSF award #CNS-0959053 for “Development of the Next-Generation CAVE Virtual Environment (NG-CAVE),” enabled us to create a first-of-its-kind high-performance visual computing system. The UIC Electronic Visualization Laboratory (EVL) worked with two U.S. companies to advance their commercial products and maintain U.S. leadership in the global information technology economy. New applications are being enabled with the CAVE2/Blaze visual computing system that is advancing scientific research and education in the U.S. and globally, and helping train the next-generation workforce.
Bridging Social and Semantic Computing - Design and Evaluation of User Interfaces for Hybrid Systems

ERIC Educational Resources Information Center

Bostandjiev, Svetlin Alex I.

2012-01-01

The evolution of the Web brought new interesting problems to computer scientists that we loosely classify in the fields of social and semantic computing. Social computing is related to two major paradigms: computations carried out by a large number of people in a collective intelligence fashion (e.g., wikis), and performing computations on social…
Optical high-performance computing: introduction to the JOSA A and Applied Optics feature.

PubMed

Caulfield, H John; Dolev, Shlomi; Green, William M J

2009-08-01

The feature issues in both Applied Optics and the Journal of the Optical Society of America A focus on topics of immediate relevance to the community working in the area of optical high-performance computing.
Update of aircraft profile data for the Integrated Noise Model computer program, vol 1: final report

DOT National Transportation Integrated Search

1992-03-01

This report provides aircraft takeoff and landing profiles, aircraft aerodynamic performance coefficients and engine performance coefficients for the aircraft data base (Database 9) in the Integrated Noise Model (INM) computer program. Flight profile...
Bilayer avalanche spin-diode logic

DOE Office of Scientific and Technical Information (OSTI.GOV)

Friedman, Joseph S.; Querlioz, Damien; Fadel, Eric R.

2015-11-15

A novel spintronic computing paradigm is proposed and analyzed in which InSb p-n bilayer avalanche spin-diodes are cascaded to efficiently perform complex logic operations. This spin-diode logic family uses control wires to generate magnetic fields that modulate the resistance of the spin-diodes, and currents through these devices control the resistance of cascaded devices. Electromagnetic simulations are performed to demonstrate the cascading mechanism, and guidelines are provided for the development of this innovative computing technology. This cascading scheme permits compact logic circuits with switching speeds determined by electromagnetic wave propagation rather than electron motion, enabling high-performance spintronic computing.
Source Listings for Computer Code SPIRALI Incompressible, Turbulent Spiral Grooved Cylindrical and Face Seals

NASA Technical Reports Server (NTRS)

Walowit, Jed A.; Shapiro, Wilbur

2005-01-01

This is the source listing of the computer code SPIRALI which predicts the performance characteristics of incompressible cylindrical and face seals with or without the inclusion of spiral grooves. Performance characteristics include load capacity (for face seals), leakage flow, power requirements and dynamic characteristics in the form of stiffness, damping and apparent mass coefficients in 4 degrees of freedom for cylindrical seals and 3 degrees of freedom for face seals. These performance characteristics are computed as functions of seal and groove geometry, load or film thickness, running and disturbance speeds, fluid viscosity, and boundary pressures.
Validation of MCNP6 Version 1.0 with the ENDF/B-VII.1 Cross Section Library for Plutonium Metals, Oxides, and Solutions on the High Performance Computing Platform Moonlight

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chapman, Bryan Scott; Gough, Sean T.

This report documents a validation of the MCNP6 Version 1.0 computer code on the high performance computing platform Moonlight, for operations at Los Alamos National Laboratory (LANL) that involve plutonium metals, oxides, and solutions. The validation is conducted using the ENDF/B-VII.1 continuous energy group cross section library at room temperature. The results are for use by nuclear criticality safety personnel in performing analysis and evaluation of various facility activities involving plutonium materials.
Performance of the Widely-Used CFD Code OVERFLOW on the Pleiades Supercomputer

NASA Technical Reports Server (NTRS)

Guruswamy, Guru P.

2017-01-01

Computational performance studies were made for NASA's widely used Computational Fluid Dynamics code OVERFLOW on the Pleiades Supercomputer. Two test cases were considered: a full launch vehicle with a grid of 286 million points and a full rotorcraft model with a grid of 614 million points. Computations using up to 8000 cores were run on Sandy Bridge and Ivy Bridge nodes. Performance was monitored using times reported in the day files from the Portable Batch System utility. Results for two grid topologies are presented and compared in detail. Observations and suggestions for future work are made.
CFD Prediction for Spin Rate of Fixed Canards on a Spinning Projectile

NASA Astrophysics Data System (ADS)

Ji, X. L.; Jia, Ch. Y.; Jiang, T. Y.

2011-09-01

A computational study of the spin rate of fixed canards on a spinning projectile is presented in this paper. The canard configurations pose challenges in determining the aerodynamic forces and moments and the flow field changes, which could have a significant effect on the stability, performance, and corrected round accuracy. Advanced time-accurate Navier-Stokes computations have been performed to compute the spin rate associated with the spinning motion of the canard configurations at supersonic speed. The results show that the roll-damping moment of the canards varies linearly with the spin rate at supersonic velocity.
Method for simultaneous overlapped communications between neighboring processors in a multiple

DOEpatents

Benner, Robert E.; Gustafson, John L.; Montry, Gary R.

1991-01-01

A parallel computing system and method having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system.
An XML-Based Protocol for Distributed Event Services

NASA Technical Reports Server (NTRS)

Smith, Warren; Gunter, Dan; Quesnel, Darcy; Biegel, Bryan (Technical Monitor)

2001-01-01

A recent trend in distributed computing is the construction of high-performance distributed systems called computational grids. One difficulty we have encountered is that there is no standard format for the representation of performance information and no standard protocol for transmitting this information. This limits the types of performance analysis that can be undertaken in complex distributed systems. To address this problem, we present an XML-based protocol for transmitting performance events in distributed systems and evaluate the performance of this protocol.
Application of the Linux cluster for exhaustive window haplotype analysis using the FBAT and Unphased programs.

PubMed

Mishima, Hiroyuki; Lidral, Andrew C; Ni, Jun

2008-05-28

Genetic association studies have been used to map disease-causing genes. A newly introduced statistical method, called exhaustive haplotype association study, analyzes genetic information consisting of different numbers and combinations of DNA sequence variations along a chromosome. Such studies involve a large number of statistical calculations and subsequently high computing power. It is possible to develop parallel algorithms and codes to perform the calculations on a high performance computing (HPC) system. However, most existing commonly-used statistic packages for genetic studies are non-parallel versions. Alternatively, one may use the cutting-edge technology of grid computing and its packages to conduct non-parallel genetic statistical packages on a centralized HPC system or distributed computing systems. In this paper, we report the utilization of a queuing scheduler built on the Grid Engine and run on a Rocks Linux cluster for our genetic statistical studies. Analysis of both consecutive and combinational window haplotypes was conducted by the FBAT (Laird et al., 2000) and Unphased (Dudbridge, 2003) programs. The dataset consisted of 26 loci from 277 extended families (1484 persons). Using the Rocks Linux cluster with 22 compute-nodes, FBAT jobs performed about 14.4-15.9 times faster, while Unphased jobs performed 1.1-18.6 times faster compared to the accumulated computation duration. Execution of exhaustive haplotype analysis using non-parallel software packages on a Linux-based system is an effective and efficient approach in terms of cost and performance.
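
The reported speedups can be put in context with a parallel-efficiency calculation. The sketch below is not from the paper: it assumes that "times faster compared to the accumulated computation duration" means accumulated serial duration divided by wall-clock time on the cluster, and it uses the hypothetical helper name `parallel_efficiency` to divide each quoted speedup by the 22 compute nodes.

```python
NODES = 22  # compute nodes reported in the abstract


def parallel_efficiency(speedup, nodes=NODES):
    """Fraction of ideal linear speedup achieved (assumed definition: speedup / nodes)."""
    return speedup / nodes


# Speedup ranges quoted in the abstract; the efficiencies are derived here, not reported.
reported = {
    "FBAT": (14.4, 15.9),
    "Unphased": (1.1, 18.6),
}

for program, (low, high) in reported.items():
    print(f"{program}: efficiency {parallel_efficiency(low):.2f} to {parallel_efficiency(high):.2f}")
```

Under that assumed definition, an efficiency near 1.0 would mean the scheduler kept all nodes busy for the whole run, and a value near 0 would mean most nodes sat idle.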

Application of the Linux cluster for exhaustive window haplotype analysis using the FBAT and Unphased programs

PubMed Central

Mishima, Hiroyuki; Lidral, Andrew C; Ni, Jun

2008-01-01

Background: Genetic association studies have been used to map disease-causing genes. A newly introduced statistical method, called exhaustive haplotype association study, analyzes genetic information consisting of different numbers and combinations of DNA sequence variations along a chromosome. Such studies involve a large number of statistical calculations and subsequently high computing power. It is possible to develop parallel algorithms and codes to perform the calculations on a high performance computing (HPC) system. However, most existing commonly-used statistic packages for genetic studies are non-parallel versions. Alternatively, one may use the cutting-edge technology of grid computing and its packages to conduct non-parallel genetic statistical packages on a centralized HPC system or distributed computing systems. In this paper, we report the utilization of a queuing scheduler built on the Grid Engine and run on a Rocks Linux cluster for our genetic statistical studies. Results: Analysis of both consecutive and combinational window haplotypes was conducted by the FBAT (Laird et al., 2000) and Unphased (Dudbridge, 2003) programs. The dataset consisted of 26 loci from 277 extended families (1484 persons). Using the Rocks Linux cluster with 22 compute-nodes, FBAT jobs performed about 14.4–15.9 times faster, while Unphased jobs performed 1.1–18.6 times faster compared to the accumulated computation duration. Conclusion: Execution of exhaustive haplotype analysis using non-parallel software packages on a Linux-based system is an effective and efficient approach in terms of cost and performance. PMID: 18541045
Evaluating open-source cloud computing solutions for geosciences

NASA Astrophysics Data System (ADS)

Huang, Qunying; Yang, Chaowei; Liu, Kai; Xia, Jizhe; Xu, Chen; Li, Jing; Gui, Zhipeng; Sun, Min; Li, Zhenglong

2013-09-01

Many organizations are starting to adopt cloud computing to better utilize computing resources by taking advantage of its scalability, cost reduction, and ease of access. Many private or community cloud computing platforms are being built using open-source cloud solutions. However, little has been done to systematically compare and evaluate the features and performance of open-source solutions in supporting Geosciences. This paper provides a comprehensive study of three open-source cloud solutions: OpenNebula, Eucalyptus, and CloudStack. We compared a variety of features, capabilities, technologies and performances including: (1) general features and supported services for cloud resource creation and management, (2) advanced capabilities for networking and security, and (3) the performance of the cloud solutions in provisioning and operating the cloud resources as well as the performance of virtual machines initiated and managed by the cloud solutions in supporting selected geoscience applications. Our study found that: (1) there are no significant performance differences in central processing unit (CPU), memory and I/O of virtual machines created and managed by different solutions, (2) OpenNebula has the fastest internal network while both Eucalyptus and CloudStack have better virtual machine isolation and security strategies, (3) CloudStack has the fastest operations in handling virtual machines, images, snapshots, volumes and networking, followed by OpenNebula, and (4) the selected cloud computing solutions are capable of supporting concurrent intensive web applications, computing intensive applications, and small-scale model simulations without intensive data communication.
Automated procedure for developing hybrid computer simulations of turbofan engines. Part 1: General description

NASA Technical Reports Server (NTRS)

Szuch, J. R.; Krosel, S. M.; Bruton, W. M.

1982-01-01

A systematic, computer-aided, self-documenting methodology for developing hybrid computer simulations of turbofan engines is presented. The methodology that is presented makes use of a host program that can run on a large digital computer and a machine-dependent target (hybrid) program. The host program performs all the calculations and data manipulations that are needed to transform user-supplied engine design information to a form suitable for the hybrid computer. The host program also trims the self-contained engine model to match specified design-point information. Part I contains a general discussion of the methodology, describes a test case, and presents comparisons between hybrid simulation and specified engine performance data. Part II, a companion document, contains documentation, in the form of computer printouts, for the test case.
PPC750 Performance Monitor

NASA Technical Reports Server (NTRS)

Meyer, Donald; Uchenik, Igor

2007-01-01

The PPC750 Performance Monitor (Perfmon) is a computer program that helps the user to assess the performance characteristics of application programs running under the Wind River VxWorks real-time operating system on a PPC750 computer. Perfmon generates a user-friendly interface and collects performance data by use of performance registers provided by the PPC750 architecture. It processes and presents run-time statistics on a per-task basis over a repeating time interval (typically, several seconds or minutes) specified by the user. When the Perfmon software module is loaded with the user's software modules, it is available for use through Perfmon commands, without any modification of the user's code and at negligible performance penalty. Per-task run-time performance data made available by Perfmon include percentage time, number of instructions executed per unit time, dispatch ratio, stack high water mark, and level-1 instruction and data cache miss rates. The performance data are written to a file specified by the user or to the serial port of the computer.
Application Performance Analysis and Efficient Execution on Systems with multi-core CPUs, GPUs and MICs: A Case Study with Microscopy Image Analysis

PubMed Central

Teodoro, George; Kurc, Tahsin; Andrade, Guilherme; Kong, Jun; Ferreira, Renato; Saltz, Joel

2015-01-01

We carry out a comparative performance study of multi-core CPUs, GPUs and Intel Xeon Phi (Many Integrated Core-MIC) with a microscopy image analysis application. We experimentally evaluate the performance of computing devices on core operations of the application. We correlate the observed performance with the characteristics of computing devices and data access patterns, computation complexities, and parallelization forms of the operations. The results show a significant variability in the performance of operations with respect to the device used. The performances of operations with regular data access are comparable or sometimes better on a MIC than that on a GPU. GPUs are more efficient than MICs for operations that access data irregularly, because of the lower bandwidth of the MIC for random data accesses. We propose new performance-aware scheduling strategies that consider variabilities in operation speedups. Our scheduling strategies significantly improve application performance compared to classic strategies in hybrid configurations. PMID:28239253
Thrust chamber performance using Navier-Stokes solution. [space shuttle main engine viscous nozzle calculation]

NASA Technical Reports Server (NTRS)

Chan, J. S.; Freeman, J. A.

1984-01-01

The viscous, axisymmetric flow in the thrust chamber of the space shuttle main engine (SSME) was computed on the CRAY 205 computer using the general interpolants method (GIM) code. Results show that the Navier-Stokes codes can be used for these flows to study trends and viscous effects as well as determine flow patterns; but further research and development is needed before they can be used as production tools for nozzle performance calculations. The GIM formulation, numerical scheme, and computer code are described. The actual SSME nozzle computation showing grid points, flow contours, and flow parameter plots is discussed. The computer system and run times/costs are detailed.
Improved neutron activation prediction code system development

NASA Technical Reports Server (NTRS)

Saqui, R. M.

1971-01-01

Two integrated neutron activation prediction code systems have been developed by modifying and integrating existing computer programs to perform the necessary computations to determine neutron induced activation gamma ray doses and dose rates in complex geometries. Each of the two systems is comprised of three computational modules. The first program module computes the spatial and energy distribution of the neutron flux from an input source and prepares input data for the second program which performs the reaction rate, decay chain and activation gamma source calculations. A third module then accepts input prepared by the second program to compute the cumulative gamma doses and/or dose rates at specified detector locations in complex, three-dimensional geometries.
Sort computation

NASA Technical Reports Server (NTRS)

Dorband, John E.

1988-01-01

Sorting has long been used to organize data in preparation for further computation, but sort computation allows some types of computation to be performed during the sort. Sort aggregation and sort distribution are the two basic forms of sort computation. Sort aggregation generates an accumulative or aggregate result for each group of records and places this result in one of the records. An aggregate operation can be any operation that is both associative and commutative, i.e., any operation whose result does not depend on the order of the operands or the order in which the operations are performed. Sort distribution copies the value from a field of a specific record in a group into that field in every record of that group.
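
The two forms described above can be illustrated with a short sequential sketch. This is not the report's algorithm (which performs the computation during the sort itself); it sorts first and then processes each group, and the function names, the dictionary record layout, and the choice of the last record in each group to hold the aggregate are all assumptions made for illustration.

```python
from itertools import groupby
from operator import add, itemgetter


def sort_aggregate(records, key, field, op=add):
    """Sort by `key`, then fold `field` over each group with `op`.

    `op` must be associative and commutative. The aggregate is written into
    the last record of each group (an arbitrary choice for this sketch).
    """
    ordered = sorted((dict(r) for r in records), key=itemgetter(key))
    for _, group in groupby(ordered, key=itemgetter(key)):
        members = list(group)
        total = members[0][field]
        for rec in members[1:]:
            total = op(total, rec[field])
        members[-1][field] = total
    return ordered


def sort_distribute(records, key, field, is_source):
    """Sort by `key`, then copy `field` from the record selected by
    `is_source` into every record of the same group."""
    ordered = sorted((dict(r) for r in records), key=itemgetter(key))
    for _, group in groupby(ordered, key=itemgetter(key)):
        members = list(group)
        source = next((r for r in members if is_source(r)), None)
        if source is None:
            continue  # no designated source record in this group
        for rec in members:
            rec[field] = source[field]
    return ordered


readings = [
    {"g": "b", "v": 2}, {"g": "a", "v": 1},
    {"g": "b", "v": 5}, {"g": "a", "v": 10},
]
print(sort_aggregate(readings, "g", "v"))
```

Because the operation is associative and commutative, the aggregate does not depend on the order in which records of a group are combined, which is what would let a parallel sort combine partial results in whatever order they meet.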
Jungle Computing: Distributed Supercomputing Beyond Clusters, Grids, and Clouds

NASA Astrophysics Data System (ADS)

Seinstra, Frank J.; Maassen, Jason; van Nieuwpoort, Rob V.; Drost, Niels; van Kessel, Timo; van Werkhoven, Ben; Urbani, Jacopo; Jacobs, Ceriel; Kielmann, Thilo; Bal, Henri E.

In recent years, the application of high-performance and distributed computing in scientific practice has become increasingly widespread. Among the most widely available platforms to scientists are clusters, grids, and cloud systems. Such infrastructures currently are undergoing revolutionary change due to the integration of many-core technologies, providing orders-of-magnitude speed improvements for selected compute kernels. With high-performance and distributed computing systems thus becoming more heterogeneous and hierarchical, programming complexity is vastly increased. Further complexities arise because the urgent desire for scalability and issues including data distribution, software heterogeneity, and ad hoc hardware availability commonly force scientists into simultaneous use of multiple platforms (e.g., clusters, grids, and clouds used concurrently). A true computing jungle.
[Earth and Space Sciences Project Services for NASA HPCC]

NASA Technical Reports Server (NTRS)

Merkey, Phillip

2002-01-01

This grant supported the effort to characterize the problem domain of the Earth Science Technology Office's Computational Technologies Project, to engage the Beowulf Cluster Computing Community as well as the High Performance Computing Research Community so that we can predict the applicability of said technologies to the scientific community represented by the CT project and formulate long term strategies to provide the computational resources necessary to attain the anticipated scientific objectives of the CT project. Specifically, the goal of the evaluation effort is to use the information gathered over the course of the Round-3 investigations to quantify the trends in scientific expectations, the algorithmic requirements and capabilities of high-performance computers to satisfy this anticipated need.
A Study of Complex Deep Learning Networks on High Performance, Neuromorphic, and Quantum Computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Potok, Thomas E; Schuman, Catherine D; Young, Steven R

Current Deep Learning models use highly optimized convolutional neural networks (CNN) trained on large graphical processing units (GPU)-based computers with a fairly simple layered network topology, i.e., highly connected layers, without intra-layer connections. Complex topologies have been proposed, but are intractable to train on current systems. Building the topologies of the deep learning network requires hand tuning, and implementing the network in hardware is expensive in both cost and power. In this paper, we evaluate deep learning models using three different computing architectures to address these problems: quantum computing to train complex topologies, high performance computing (HPC) to automatically determine network topology, and neuromorphic computing for a low-power hardware implementation. Due to input size limitations of current quantum computers we use the MNIST dataset for our evaluation. The results show the possibility of using the three architectures in tandem to explore complex deep learning networks that are untrainable using a von Neumann architecture. We show that a quantum computer can find high quality values of intra-layer connections and weights, while yielding a tractable time result as the complexity of the network increases; a high performance computer can find optimal layer-based topologies; and a neuromorphic computer can represent the complex topology and weights derived from the other architectures in low power memristive hardware. This represents a new capability that is not feasible with current von Neumann architecture. It potentially enables the ability to solve very complicated problems unsolvable with current computing technologies.
Allocating Tactical High-Performance Computer (HPC) Resources to Offloaded Computation in Battlefield Scenarios

DTIC Science & Technology

2013-12-01

...authors present a Computing on Dissemination with predictable contacts (pCoD) algorithm, since it is impossible to reserve task execution time in advance... Terms from the report's acronym list: Computing While Charging; DAG, Directed Acyclic Graph; TTL, Time-to-live; pCoD, Predictable contacts; CoD, Computing on Dissemination; upCoD, Unpredictable...
Argonne Out Loud: Computation, Big Data, and the Future of Cities

ScienceCinema

Catlett, Charlie

2018-01-16

Charlie Catlett, a Senior Computer Scientist at Argonne and Director of the Urban Center for Computation and Data at the Computation Institute of the University of Chicago and Argonne, talks about how he and his colleagues are using high-performance computing, data analytics, and embedded systems to better understand and design cities.
Performance evaluation of the Engineering Analysis and Data Systems (EADS) 2

NASA Technical Reports Server (NTRS)

Debrunner, Linda S.

1994-01-01

The Engineering Analysis and Data System (EADS) II was installed in March 1993 to provide high performance computing for science and engineering at Marshall Space Flight Center (MSFC). EADS II increased the computing capabilities over the existing EADS facility in the areas of throughput and mass storage. EADS II includes a Vector Processor Compute System (VPCS), a Virtual Memory Compute System (CFS), a Common Output System (COS), as well as an Image Processing Station, Mini Super Computers, and Intelligent Workstations. These facilities are interconnected by a sophisticated network system. This work considers only the performance of the VPCS and the CFS. The VPCS is a Cray YMP. The CFS is implemented on an RS 6000 using the UniTree Mass Storage System. To better meet the science and engineering computing requirements, EADS II must be monitored, its performance analyzed, and appropriate modifications for performance improvement made. Implementing this approach requires tools to assist in performance monitoring and analysis. In Spring 1994, PerfStat 2.0 was purchased to meet these needs for the VPCS and the CFS. PerfStat is a set of tools that can be used to analyze both historical and real-time performance data. Its flexible design allows significant user customization. The user identifies what data is collected, how it is classified, and how it is displayed for evaluation. Both graphical and tabular displays are supported. The capability of the PerfStat tool was evaluated, appropriate modifications to EADS II to optimize throughput and enhance productivity were suggested and implemented, and the effects of these modifications on system performance were observed. In this paper, the PerfStat tool is described, then its use with EADS II is outlined briefly. Next, the evaluation of the VPCS, as well as the modifications made to the system, are described. Finally, conclusions are drawn and recommendations for future work are outlined.
Transformation of OODT CAS to Perform Larger Tasks

NASA Technical Reports Server (NTRS)

Mattmann, Chris; Freeborn, Dana; Crichton, Daniel; Hughes, John; Ramirez, Paul; Hardman, Sean; Woollard, David; Kelly, Sean

2008-01-01

A computer program denoted OODT CAS has been transformed to enable performance of larger tasks that involve greatly increased data volumes and increasingly intensive processing of data on heterogeneous, geographically dispersed computers. Prior to the transformation, OODT CAS (also alternatively denoted, simply, 'CAS') [wherein 'OODT' signifies 'Object-Oriented Data Technology' and 'CAS' signifies 'Catalog and Archive Service'] was a proven software component used to manage scientific data from spaceflight missions. In the transformation, CAS was split into two separate components representing its canonical capabilities: file management and workflow management. In addition, CAS was augmented by addition of a resource-management component. This third component enables CAS to manage heterogeneous computing by use of diverse resources, including high-performance clusters of computers, commodity computing hardware, and grid computing infrastructures. CAS is now more easily maintainable, evolvable, and reusable. These components can be used separately or, taking advantage of synergies, can be used together. Other elements of the transformation included addition of a separate Web presentation layer that supports distribution of data products via Really Simple Syndication (RSS) feeds, and provision for full Resource Description Framework (RDF) exports of metadata.
Do disk drives dream of buffer cache hits?

DOE Office of Scientific and Technical Information (OSTI.GOV)

Holt, A.

1994-12-31

G.E. Moore, in his book Principia Ethica, examines the popular view of ethics that deals with "what we ought to do" as well as using ethics to cover the general inquiry: "what is good?" This paper utilises Moore's view of ethics to examine computer systems performance. Moore asserts that "good" in itself is indefinable. It is argued in this report that, although we describe computer systems as good (or bad), a computer system cannot be good in itself, rather a means to good! In terms of "what we ought to do" this paper looks at what actions (would) bring about good computer system performance according to computer science and engineering literature. In particular we look at duties, responsibilities and "to do what is right" in terms of system administration, design and usage. We further argue that making ethical observations with respect to computer system performance, and then applying them, requires technical knowledge which is typically limited to industry specialists and experts.
Real-time performance monitoring and management system

DOEpatents

Budhraja, Vikram S [Los Angeles, CA]; Dyer, James D [La Mirada, CA]; Martinez Morales, Carlos A [Upland, CA]

2007-06-19

A real-time performance monitoring system for monitoring an electric power grid. The electric power grid has a plurality of grid portions, each grid portion corresponding to one of a plurality of control areas. The real-time performance monitoring system includes a monitor computer for monitoring at least one of reliability metrics, generation metrics, transmission metrics, suppliers metrics, grid infrastructure security metrics, and markets metrics for the electric power grid. The data for metrics being monitored by the monitor computer are stored in a database, and a visualization of the metrics is displayed on at least one display computer having a monitor. The at least one display computer in one said control area enables an operator to monitor the grid portion corresponding to a different said control area.
48 CFR 227.7205 - Contracts for special works.

Code of Federal Regulations, 2014 CFR

2014-10-01

... Computer Software and Computer Software Documentation 227.7205 Contracts for special works. (a) Use the... a specific need to control the distribution of computer software or computer software documentation..., modification, reproduction, release, performance, display, or disclosure of such software or documentation. Use...
Inquiring Minds

Science.gov Websites

Proposed Projects and Experiments Fermilab's Tevatron Questions for the Universe Theory Computing High-performance Computing Grid Computing Networking Mass Storage Plan for the Future State of the Laboratory Homeland Security Industry Computing Sciences Workforce Development A Growing List Historic Results
48 CFR 227.7205 - Contracts for special works.

Code of Federal Regulations, 2010 CFR

2010-10-01

... Computer Software and Computer Software Documentation 227.7205 Contracts for special works. (a) Use the... a specific need to control the distribution of computer software or computer software documentation..., modification, reproduction, release, performance, display, or disclosure of such software or documentation. Use...
