Image-Processing Software For A Hypercube Computer
NASA Technical Reports Server (NTRS)
Lee, Meemong; Mazer, Alan S.; Groom, Steven L.; Williams, Winifred I.
1992-01-01
Concurrent Image Processing Executive (CIPE) is software system intended to develop and use image-processing application programs on concurrent computing environment. Designed to shield programmer from complexities of concurrent-system architecture, it provides interactive image-processing environment for end user. CIPE utilizes architectural characteristics of particular concurrent system to maximize efficiency while preserving architectural independence from user and programmer. CIPE runs on Mark-IIIfp 8-node hypercube computer and associated SUN-4 host computer.
Parallel Algorithm Solves Coupled Differential Equations
NASA Technical Reports Server (NTRS)
Hayashi, A.
1987-01-01
Numerical methods adapted to concurrent processing. Algorithm solves set of coupled partial differential equations by numerical integration. Adapted to run on hypercube computer, algorithm separates problem into smaller problems solved concurrently. Increase in computing speed with concurrent processing over that achievable with conventional sequential processing appreciable, especially for large problems.
Preventing Run-Time Bugs at Compile-Time Using Advanced C++
DOE Office of Scientific and Technical Information (OSTI.GOV)
Neswold, Richard
When writing software, we develop algorithms that tell the computer what to do at run-time. Our solutions are easier to understand and debug when they are properly modeled using class hierarchies, enumerations, and a well-factored API. Unfortunately, even with these design tools, we end up having to debug our programs at run-time. Worse still, debugging an embedded system changes its dynamics, making it tough to find and fix concurrency issues. This paper describes techniques using C++ to detect run-time bugs *at compile time*. A concurrency library, developed at Fermilab, is used for examples in illustrating these techniques.
The Caltech Concurrent Computation Program - Project description
NASA Technical Reports Server (NTRS)
Fox, G.; Otto, S.; Lyzenga, G.; Rogstad, D.
1985-01-01
The Caltech Concurrent Computation Program wwhich studies basic issues in computational science is described. The research builds on initial work where novel concurrent hardware, the necessary systems software to use it and twenty significant scientific implementations running on the initial 32, 64, and 128 node hypercube machines have been constructed. A major goal of the program will be to extend this work into new disciplines and more complex algorithms including general packages that decompose arbitrary problems in major application areas. New high-performance concurrent processors with up to 1024-nodes, over a gigabyte of memory and multigigaflop performance are being constructed. The implementations cover a wide range of problems in areas such as high energy and astrophysics, condensed matter, chemical reactions, plasma physics, applied mathematics, geophysics, simulation, CAD for VLSI, graphics and image processing. The products of the research program include the concurrent algorithms, hardware, systems software, and complete program implementations.
NASA Technical Reports Server (NTRS)
Eberhardt, D. S.; Baganoff, D.; Stevens, K.
1984-01-01
Implicit approximate-factored algorithms have certain properties that are suitable for parallel processing. A particular computational fluid dynamics (CFD) code, using this algorithm, is mapped onto a multiple-instruction/multiple-data-stream (MIMD) computer architecture. An explanation of this mapping procedure is presented, as well as some of the difficulties encountered when trying to run the code concurrently. Timing results are given for runs on the Ames Research Center's MIMD test facility which consists of two VAX 11/780's with a common MA780 multi-ported memory. Speedups exceeding 1.9 for characteristic CFD runs were indicated by the timing results.
NASA Technical Reports Server (NTRS)
Muhsin, Mansour; Walters, Ian
2004-01-01
The Document Concurrence System is a combination of software modules for routing users expressions of concurrence with documents. This system enables determination of the current status of concurrences and eliminates the need for the prior practice of manually delivering paper documents to all persons whose approvals were required. This system runs on a server, and participants gain access via personal computers equipped with Web-browser and electronic-mail software. A user can begin a concurrence routing process by logging onto an administration module, naming the approvers and stating the sequence for routing among them, and attaching documents. The server then sends a message to the first person on the list. Upon concurrence by the first person, the system sends a message to the second person, and so forth. A person on the list indicates approval, places the documents on hold, or indicates disapproval, via a Web-based module. When the last person on the list has concurred, a message is sent to the initiator, who can then finalize the process through the administration module. A background process running on the server identifies concurrence processes that are overdue and sends reminders to the appropriate persons.
NASA Technical Reports Server (NTRS)
Springer, P.
1993-01-01
This paper discusses the method in which the Cascade-Correlation algorithm was parallelized in such a way that it could be run using the Time Warp Operating System (TWOS). TWOS is a special purpose operating system designed to run parellel discrete event simulations with maximum efficiency on parallel or distributed computers.
Queueing Network Models for Parallel Processing of Task Systems: an Operational Approach
NASA Technical Reports Server (NTRS)
Mak, Victor W. K.
1986-01-01
Computer performance modeling of possibly complex computations running on highly concurrent systems is considered. Earlier works in this area either dealt with a very simple program structure or resulted in methods with exponential complexity. An efficient procedure is developed to compute the performance measures for series-parallel-reducible task systems using queueing network models. The procedure is based on the concept of hierarchical decomposition and a new operational approach. Numerical results for three test cases are presented and compared to those of simulations.
Application configuration selection for energy-efficient execution on multicore systems
Wang, Shinan; Luo, Bing; Shi, Weisong; ...
2015-09-21
Balanced performance and energy consumption are incorporated in the design of modern computer systems. Several runtime factors, such as concurrency levels, thread mapping strategies, and dynamic voltage and frequency scaling (DVFS) should be considered in order to achieve optimal energy efficiency fora workload. Selecting appropriate run-time factors, however, is one of the most challenging tasks because the run-time factors are architecture-specific and workload-specific. And while most existing works concentrate on either static analysis of the workload or run-time prediction results, we present a hybrid two-step method that utilizes concurrency levels and DVFS settings to achieve the energy efficiency configuration formore » a worldoad. The experimental results based on a Xeon E5620 server with NPB and PARSEC benchmark suites show that the model is able to predict the energy efficient configuration accurately. On average, an additional 10% EDP (Energy Delay Product) saving is obtained by using run-time DVFS for the entire system. An off-line optimal solution is used to compare with the proposed scheme. Finally, the experimental results show that the average extra EDP saved by the optimal solution is within 5% on selective parallel benchmarks.« less
Methods for operating parallel computing systems employing sequenced communications
Benner, R.E.; Gustafson, J.L.; Montry, G.R.
1999-08-10
A parallel computing system and method are disclosed having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system. 15 figs.
Methods for operating parallel computing systems employing sequenced communications
Benner, Robert E.; Gustafson, John L.; Montry, Gary R.
1999-01-01
A parallel computing system and method having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system.
Performance Modeling in CUDA Streams - A Means for High-Throughput Data Processing.
Li, Hao; Yu, Di; Kumar, Anand; Tu, Yi-Cheng
2014-10-01
Push-based database management system (DBMS) is a new type of data processing software that streams large volume of data to concurrent query operators. The high data rate of such systems requires large computing power provided by the query engine. In our previous work, we built a push-based DBMS named G-SDMS to harness the unrivaled computational capabilities of modern GPUs. A major design goal of G-SDMS is to support concurrent processing of heterogenous query processing operations and enable resource allocation among such operations. Understanding the performance of operations as a result of resource consumption is thus a premise in the design of G-SDMS. With NVIDIA's CUDA framework as the system implementation platform, we present our recent work on performance modeling of CUDA kernels running concurrently under a runtime mechanism named CUDA stream . Specifically, we explore the connection between performance and resource occupancy of compute-bound kernels and develop a model that can predict the performance of such kernels. Furthermore, we provide an in-depth anatomy of the CUDA stream mechanism and summarize the main kernel scheduling disciplines in it. Our models and derived scheduling disciplines are verified by extensive experiments using synthetic and real-world CUDA kernels.
Gómez-Molina, Josué; Ogueta-Alday, Ana; Camara, Jesus; Stickley, Christopher; García-López, Juan
2018-03-01
Concurrent plyometric and running training has the potential to improve running economy (RE) and performance through increasing muscle strength and power, but the possible effect on spatiotemporal parameters of running has not been studied yet. The aim of this study was to compare the effect of 8 weeks of concurrent plyometric and running training on spatiotemporal parameters and physiological variables of novice runners. Twenty-five male participants were randomly assigned into two training groups; running group (RG) (n = 11) and running + plyometric group (RPG) (n = 14). Both groups performed 8 weeks of running training programme, and only the RPG performed a concurrent plyometric training programme (two sessions per week). Anthropometric, physiological (VO 2max , heart rate and RE) and spatiotemporal variables (contact and flight times, step rate and length) were registered before and after the intervention. In comparison to RG, the RPG reduced step rate and increased flight times at the same running speeds (P < .05) while contact times remained constant. Significant increases in pre- and post-training (P < .05) were found in RPG for squat jump and 5 bound test, while RG remained unchanged. Peak speed, ventilatory threshold (VT) speed and respiratory compensation threshold (RCT) speed increased (P < .05) for both groups, although peak speed and VO 2max increased more in the RPG than in the RG. In conclusion, concurrent plyometric and running training entails a reduction in step rate, as well as increases in VT speed, RCT speed, peak speed and VO 2max . Athletes could benefit from plyometric training in order to improve their strength, which would contribute to them attaining higher running speeds.
Parallel processors and nonlinear structural dynamics algorithms and software
NASA Technical Reports Server (NTRS)
Belytschko, Ted
1990-01-01
Techniques are discussed for the implementation and improvement of vectorization and concurrency in nonlinear explicit structural finite element codes. In explicit integration methods, the computation of the element internal force vector consumes the bulk of the computer time. The program can be efficiently vectorized by subdividing the elements into blocks and executing all computations in vector mode. The structuring of elements into blocks also provides a convenient way to implement concurrency by creating tasks which can be assigned to available processors for evaluation. The techniques were implemented in a 3-D nonlinear program with one-point quadrature shell elements. Concurrency and vectorization were first implemented in a single time step version of the program. Techniques were developed to minimize processor idle time and to select the optimal vector length. A comparison of run times between the program executed in scalar, serial mode and the fully vectorized code executed concurrently using eight processors shows speed-ups of over 25. Conjugate gradient methods for solving nonlinear algebraic equations are also readily adapted to a parallel environment. A new technique for improving convergence properties of conjugate gradients in nonlinear problems is developed in conjunction with other techniques such as diagonal scaling. A significant reduction in the number of iterations required for convergence is shown for a statically loaded rigid bar suspended by three equally spaced springs.
Method for simultaneous overlapped communications between neighboring processors in a multiple
Benner, Robert E.; Gustafson, John L.; Montry, Gary R.
1991-01-01
A parallel computing system and method having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system.
ERIC Educational Resources Information Center
Belke, Terry W.
2010-01-01
Previous research suggested that allocation of responses on concurrent schedules of wheel-running reinforcement was less sensitive to schedule differences than typically observed with more conventional reinforcers. To assess this possibility, 16 female Long Evans rats were exposed to concurrent FR FR schedules of reinforcement and the schedule…
ELAS - SCIENCE & TECHNOLOGY LABORATORY APPLICATIONS SOFTWARE (CONCURRENT VERSION)
NASA Technical Reports Server (NTRS)
Pearson, R. W.
1994-01-01
The Science and Technology Laboratory Applications Software (ELAS) was originally designed to analyze and process digital imagery data, specifically remotely-sensed scanner data. This capability includes the processing of Landsat multispectral data; aircraft-acquired scanner data; digitized topographic data; and numerous other ancillary data, such as soil types and rainfall information, that can be stored in digitized form. ELAS has the subsequent capability to geographically reference this data to dozens of standard, as well as user created projections. As an integrated image processing system, ELAS offers the user of remotely-sensed data a wide range of capabilities in the areas of land cover analysis and general purpose image analysis. ELAS is designed for flexible use and operation and includes its own FORTRAN operating subsystem and an expandable set of FORTRAN application modules. Because all of ELAS resides in one "logical" FORTRAN program, data inputs and outputs, directives, and module switching are convenient for the user. There are over 230 modules presently available to aid the user in performing a wide range of land cover analyses and manipulation. The file management modules enable the user to allocate, define, access, and specify usage for all types of files (ELAS files, subfiles, external files etc.). Various other modules convert specific types of satellite, aircraft, and vector-polygon data into files that can be used by other ELAS modules. The user also has many module options which aid in displaying image data, such as magnification/reduction of the display; true color display; and several memory functions. Additional modules allow for the building and manipulation of polygonal areas of the image data. Finally, there are modules which allow the user to select and classify the image data. An important feature of the ELAS subsystem is that its structure allows new applications modules to be easily integrated in the future. ELAS has as a standard the flexibility to process data elements exceeding 8 bits in length, including floating point (noninteger) elements and 16 or 32 bit integers. Thus it is able to analyze and process "non-standard" nonimage data. The VAX (ERL-10017) and Concurrent (ERL-10013) versions of ELAS 9.0 are written in FORTRAN and ASSEMBLER for DEC VAX series computers running VMS and Concurrent computers running MTM. The Sun (SSC-00019), Masscomp (SSC-00020), and Silicon Graphics (SSC-00021) versions of ELAS 9.0 are written in FORTRAN 77 and C-LANGUAGE for Sun4 series computers running SunOS, Masscomp computers running UNIX, and Silicon Graphics IRIS computers running IRIX. The Concurrent version requires at least 15 bit addressing and a direct memory access channel. The VAX and Concurrent versions of ELAS both require floating-point hardware, at least 1Mb of RAM, and approximately 70Mb of disk space. Both versions also require a COMTAL display device in order to display images. For the Sun, Masscomp, and Silicon Graphics versions of ELAS, the disk storage required is approximately 115Mb, and a minimum of 8Mb of RAM is required for execution. The Sun version of ELAS requires either the X-Window System Version 11 Revision 4 or Sun OpenWindows Version 2. The Masscomp version requires a GA1000 display device and the associated "gp" library. The Silicon Graphics version requires Silicon Graphics' GL library. ELAS display functions will not work with a monochrome monitor. The standard distribution medium for the VAX version (ERL10017) is a set of two 9-track 1600 BPI magnetic tapes in DEC VAX BACKUP format. This version is also available on a TK50 tape cartridge in DEC VAX BACKUP format. The standard distribution medium for the Concurrent version (ERL-10013) is a set of two 9-track 1600 BPI magnetic tapes in Concurrent BACKUP format. The standard distribution medium for the Sun version (SSC-00019) is a .25 inch streaming magnetic tape cartridge in UNIX tar format. The standard distribution medium for the Masscomp version, (SSC-00020) is a .25 inch streaming magnetic tape cartridge in UNIX tar format. The standard distribution medium for the Silicon Graphics version (SSC-00021) is a .25 inch streaming magnetic IRIS tape cartridge in UNIX tar format. Version 9.0 was released in 1991. Sun4, SunOS, and Open Windows are trademarks of Sun Microsystems, Inc. MIT X Window System is licensed by Massachusetts Institute of Technology.
Application of the actor model to large scale NDE data analysis
NASA Astrophysics Data System (ADS)
Coughlin, Chris
2018-03-01
The Actor model of concurrent computation discretizes a problem into a series of independent units or actors that interact only through the exchange of messages. Without direct coupling between individual components, an Actor-based system is inherently concurrent and fault-tolerant. These traits lend themselves to so-called "Big Data" applications in which the volume of data to analyze requires a distributed multi-system design. For a practical demonstration of the Actor computational model, a system was developed to assist with the automated analysis of Nondestructive Evaluation (NDE) datasets using the open source Myriad Data Reduction Framework. A machine learning model trained to detect damage in two-dimensional slices of C-Scan data was deployed in a streaming data processing pipeline. To demonstrate the flexibility of the Actor model, the pipeline was deployed on a local system and re-deployed as a distributed system without recompiling, reconfiguring, or restarting the running application.
Performance Modeling in CUDA Streams - A Means for High-Throughput Data Processing
Li, Hao; Yu, Di; Kumar, Anand; Tu, Yi-Cheng
2015-01-01
Push-based database management system (DBMS) is a new type of data processing software that streams large volume of data to concurrent query operators. The high data rate of such systems requires large computing power provided by the query engine. In our previous work, we built a push-based DBMS named G-SDMS to harness the unrivaled computational capabilities of modern GPUs. A major design goal of G-SDMS is to support concurrent processing of heterogenous query processing operations and enable resource allocation among such operations. Understanding the performance of operations as a result of resource consumption is thus a premise in the design of G-SDMS. With NVIDIA’s CUDA framework as the system implementation platform, we present our recent work on performance modeling of CUDA kernels running concurrently under a runtime mechanism named CUDA stream. Specifically, we explore the connection between performance and resource occupancy of compute-bound kernels and develop a model that can predict the performance of such kernels. Furthermore, we provide an in-depth anatomy of the CUDA stream mechanism and summarize the main kernel scheduling disciplines in it. Our models and derived scheduling disciplines are verified by extensive experiments using synthetic and real-world CUDA kernels. PMID:26566545
A Framework for Load Balancing of Tensor Contraction Expressions via Dynamic Task Partitioning
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lai, Pai-Wei; Stock, Kevin; Rajbhandari, Samyam
In this paper, we introduce the Dynamic Load-balanced Tensor Contractions (DLTC), a domain-specific library for efficient task parallel execution of tensor contraction expressions, a class of computation encountered in quantum chemistry and physics. Our framework decomposes each contraction into smaller unit of tasks, represented by an abstraction referred to as iterators. We exploit an extra level of parallelism by having tasks across independent contractions executed concurrently through a dynamic load balancing run- time. We demonstrate the improved performance, scalability, and flexibility for the computation of tensor contraction expressions on parallel computers using examples from coupled cluster methods.
NASA Astrophysics Data System (ADS)
Balaji, V.; Benson, Rusty; Wyman, Bruce; Held, Isaac
2016-10-01
Climate models represent a large variety of processes on a variety of timescales and space scales, a canonical example of multi-physics multi-scale modeling. Current hardware trends, such as Graphical Processing Units (GPUs) and Many Integrated Core (MIC) chips, are based on, at best, marginal increases in clock speed, coupled with vast increases in concurrency, particularly at the fine grain. Multi-physics codes face particular challenges in achieving fine-grained concurrency, as different physics and dynamics components have different computational profiles, and universal solutions are hard to come by. We propose here one approach for multi-physics codes. These codes are typically structured as components interacting via software frameworks. The component structure of a typical Earth system model consists of a hierarchical and recursive tree of components, each representing a different climate process or dynamical system. This recursive structure generally encompasses a modest level of concurrency at the highest level (e.g., atmosphere and ocean on different processor sets) with serial organization underneath. We propose to extend concurrency much further by running more and more lower- and higher-level components in parallel with each other. Each component can further be parallelized on the fine grain, potentially offering a major increase in the scalability of Earth system models. We present here first results from this approach, called coarse-grained component concurrency, or CCC. Within the Geophysical Fluid Dynamics Laboratory (GFDL) Flexible Modeling System (FMS), the atmospheric radiative transfer component has been configured to run in parallel with a composite component consisting of every other atmospheric component, including the atmospheric dynamics and all other atmospheric physics components. We will explore the algorithmic challenges involved in such an approach, and present results from such simulations. Plans to achieve even greater levels of coarse-grained concurrency by extending this approach within other components, such as the ocean, will be discussed.
Plancton: an opportunistic distributed computing project based on Docker containers
NASA Astrophysics Data System (ADS)
Concas, Matteo; Berzano, Dario; Bagnasco, Stefano; Lusso, Stefano; Masera, Massimo; Puccio, Maximiliano; Vallero, Sara
2017-10-01
The computing power of most modern commodity computers is far from being fully exploited by standard usage patterns. In this work we describe the development and setup of a virtual computing cluster based on Docker containers used as worker nodes. The facility is based on Plancton: a lightweight fire-and-forget background service. Plancton spawns and controls a local pool of Docker containers on a host with free resources, by constantly monitoring its CPU utilisation. It is designed to release the resources allocated opportunistically, whenever another demanding task is run by the host user, according to configurable policies. This is attained by killing a number of running containers. One of the advantages of a thin virtualization layer such as Linux containers is that they can be started almost instantly upon request. We will show how fast the start-up and disposal of containers eventually enables us to implement an opportunistic cluster based on Plancton daemons without a central control node, where the spawned Docker containers behave as job pilots. Finally, we will show how Plancton was configured to run up to 10 000 concurrent opportunistic jobs on the ALICE High-Level Trigger facility, by giving a considerable advantage in terms of management compared to virtual machines.
Belke, Terry W
2006-02-01
How do animals choose between opportunities to run of different durations? Are longer durations preferred over shorter durations because they permit a greater number of revolutions? Are shorter durations preferred because they engender higher rates of running? Will longer durations be chosen because running is less constrained? The present study reports on three experiments that attempted to address these questions. In the first experiment, five male Wistar rats chose between 10-sec and 50-sec opportunities to run on modified concurrent variable-interval (VI) schedules. Across conditions, the durations associated with the alternatives were reversed. Response, time, and reinforcer proportions did not vary from indifference. In a second experiment, eight female Long-Evans rats chose between opportunities to run of equal (30 sec) and unequal durations (10 sec and 50 sec) on concurrent variable-ratio (VR) schedules. As in Experiment 1, between presentations of equal duration conditions, 10-sec and 50-sec durations were reversed. Results showed that response, time, and reinforcer proportions on an alternative did not vary with reinforcer duration. In a third experiment, using concurrent VR schedules, durations were systematically varied to decrease the shorter duration toward 0 sec. As the shorter duration decreased, response, time, and reinforcer proportions shifted toward the longer duration. In summary, differences in durations of opportunities to run did not affect choice behavior in a manner consistent with the assumption that a longer reinforcer is a larger reinforcer.
40 CFR 1506.10 - Timing of agency action.
Code of Federal Regulations, 2010 CFR
2010-07-01
... period prescribed in paragraph (b)(2) of this section may run concurrently. In such cases the... health or safety, may waive the time period in paragraph (b)(2) of this section and publish a decision on...) day period may run concurrently. However, subject to paragraph (d) of this section agencies shall...
40 CFR 1506.10 - Timing of agency action.
Code of Federal Regulations, 2011 CFR
2011-07-01
... period prescribed in paragraph (b)(2) of this section may run concurrently. In such cases the... health or safety, may waive the time period in paragraph (b)(2) of this section and publish a decision on...) day period may run concurrently. However, subject to paragraph (d) of this section agencies shall...
ELAS - SCIENCE & TECHNOLOGY LABORATORY APPLICATIONS SOFTWARE (SILICON GRAPHICS VERSION)
NASA Technical Reports Server (NTRS)
Walters, D.
1994-01-01
The Science and Technology Laboratory Applications Software (ELAS) was originally designed to analyze and process digital imagery data, specifically remotely-sensed scanner data. This capability includes the processing of Landsat multispectral data; aircraft-acquired scanner data; digitized topographic data; and numerous other ancillary data, such as soil types and rainfall information, that can be stored in digitized form. ELAS has the subsequent capability to geographically reference this data to dozens of standard, as well as user created projections. As an integrated image processing system, ELAS offers the user of remotely-sensed data a wide range of capabilities in the areas of land cover analysis and general purpose image analysis. ELAS is designed for flexible use and operation and includes its own FORTRAN operating subsystem and an expandable set of FORTRAN application modules. Because all of ELAS resides in one "logical" FORTRAN program, data inputs and outputs, directives, and module switching are convenient for the user. There are over 230 modules presently available to aid the user in performing a wide range of land cover analyses and manipulation. The file management modules enable the user to allocate, define, access, and specify usage for all types of files (ELAS files, subfiles, external files etc.). Various other modules convert specific types of satellite, aircraft, and vector-polygon data into files that can be used by other ELAS modules. The user also has many module options which aid in displaying image data, such as magnification/reduction of the display; true color display; and several memory functions. Additional modules allow for the building and manipulation of polygonal areas of the image data. Finally, there are modules which allow the user to select and classify the image data. An important feature of the ELAS subsystem is that its structure allows new applications modules to be easily integrated in the future. ELAS has as a standard the flexibility to process data elements exceeding 8 bits in length, including floating point (noninteger) elements and 16 or 32 bit integers. Thus it is able to analyze and process "non-standard" nonimage data. The VAX (ERL-10017) and Concurrent (ERL-10013) versions of ELAS 9.0 are written in FORTRAN and ASSEMBLER for DEC VAX series computers running VMS and Concurrent computers running MTM. The Sun (SSC-00019), Masscomp (SSC-00020), and Silicon Graphics (SSC-00021) versions of ELAS 9.0 are written in FORTRAN 77 and C-LANGUAGE for Sun4 series computers running SunOS, Masscomp computers running UNIX, and Silicon Graphics IRIS computers running IRIX. The Concurrent version requires at least 15 bit addressing and a direct memory access channel. The VAX and Concurrent versions of ELAS both require floating-point hardware, at least 1Mb of RAM, and approximately 70Mb of disk space. Both versions also require a COMTAL display device in order to display images. For the Sun, Masscomp, and Silicon Graphics versions of ELAS, the disk storage required is approximately 115Mb, and a minimum of 8Mb of RAM is required for execution. The Sun version of ELAS requires either the X-Window System Version 11 Revision 4 or Sun OpenWindows Version 2. The Masscomp version requires a GA1000 display device and the associated "gp" library. The Silicon Graphics version requires Silicon Graphics' GL library. ELAS display functions will not work with a monochrome monitor. The standard distribution medium for the VAX version (ERL10017) is a set of two 9-track 1600 BPI magnetic tapes in DEC VAX BACKUP format. This version is also available on a TK50 tape cartridge in DEC VAX BACKUP format. The standard distribution medium for the Concurrent version (ERL-10013) is a set of two 9-track 1600 BPI magnetic tapes in Concurrent BACKUP format. The standard distribution medium for the Sun version (SSC-00019) is a .25 inch streaming magnetic tape cartridge in UNIX tar format. The standard distribution medium for the Masscomp version, (SSC-00020) is a .25 inch streaming magnetic tape cartridge in UNIX tar format. The standard distribution medium for the Silicon Graphics version (SSC-00021) is a .25 inch streaming magnetic IRIS tape cartridge in UNIX tar format. Version 9.0 was released in 1991. Sun4, SunOS, and Open Windows are trademarks of Sun Microsystems, Inc. MIT X Window System is licensed by Massachusetts Institute of Technology.
ELAS - SCIENCE & TECHNOLOGY LABORATORY APPLICATIONS SOFTWARE (SUN VERSION)
NASA Technical Reports Server (NTRS)
Walters, D.
1994-01-01
The Science and Technology Laboratory Applications Software (ELAS) was originally designed to analyze and process digital imagery data, specifically remotely-sensed scanner data. This capability includes the processing of Landsat multispectral data; aircraft-acquired scanner data; digitized topographic data; and numerous other ancillary data, such as soil types and rainfall information, that can be stored in digitized form. ELAS has the subsequent capability to geographically reference this data to dozens of standard, as well as user created projections. As an integrated image processing system, ELAS offers the user of remotely-sensed data a wide range of capabilities in the areas of land cover analysis and general purpose image analysis. ELAS is designed for flexible use and operation and includes its own FORTRAN operating subsystem and an expandable set of FORTRAN application modules. Because all of ELAS resides in one "logical" FORTRAN program, data inputs and outputs, directives, and module switching are convenient for the user. There are over 230 modules presently available to aid the user in performing a wide range of land cover analyses and manipulation. The file management modules enable the user to allocate, define, access, and specify usage for all types of files (ELAS files, subfiles, external files etc.). Various other modules convert specific types of satellite, aircraft, and vector-polygon data into files that can be used by other ELAS modules. The user also has many module options which aid in displaying image data, such as magnification/reduction of the display; true color display; and several memory functions. Additional modules allow for the building and manipulation of polygonal areas of the image data. Finally, there are modules which allow the user to select and classify the image data. An important feature of the ELAS subsystem is that its structure allows new applications modules to be easily integrated in the future. ELAS has as a standard the flexibility to process data elements exceeding 8 bits in length, including floating point (noninteger) elements and 16 or 32 bit integers. Thus it is able to analyze and process "non-standard" nonimage data. The VAX (ERL-10017) and Concurrent (ERL-10013) versions of ELAS 9.0 are written in FORTRAN and ASSEMBLER for DEC VAX series computers running VMS and Concurrent computers running MTM. The Sun (SSC-00019), Masscomp (SSC-00020), and Silicon Graphics (SSC-00021) versions of ELAS 9.0 are written in FORTRAN 77 and C-LANGUAGE for Sun4 series computers running SunOS, Masscomp computers running UNIX, and Silicon Graphics IRIS computers running IRIX. The Concurrent version requires at least 15 bit addressing and a direct memory access channel. The VAX and Concurrent versions of ELAS both require floating-point hardware, at least 1Mb of RAM, and approximately 70Mb of disk space. Both versions also require a COMTAL display device in order to display images. For the Sun, Masscomp, and Silicon Graphics versions of ELAS, the disk storage required is approximately 115Mb, and a minimum of 8Mb of RAM is required for execution. The Sun version of ELAS requires either the X-Window System Version 11 Revision 4 or Sun OpenWindows Version 2. The Masscomp version requires a GA1000 display device and the associated "gp" library. The Silicon Graphics version requires Silicon Graphics' GL library. ELAS display functions will not work with a monochrome monitor. The standard distribution medium for the VAX version (ERL10017) is a set of two 9-track 1600 BPI magnetic tapes in DEC VAX BACKUP format. This version is also available on a TK50 tape cartridge in DEC VAX BACKUP format. The standard distribution medium for the Concurrent version (ERL-10013) is a set of two 9-track 1600 BPI magnetic tapes in Concurrent BACKUP format. The standard distribution medium for the Sun version (SSC-00019) is a .25 inch streaming magnetic tape cartridge in UNIX tar format. The standard distribution medium for the Masscomp version, (SSC-00020) is a .25 inch streaming magnetic tape cartridge in UNIX tar format. The standard distribution medium for the Silicon Graphics version (SSC-00021) is a .25 inch streaming magnetic IRIS tape cartridge in UNIX tar format. Version 9.0 was released in 1991. Sun4, SunOS, and Open Windows are trademarks of Sun Microsystems, Inc. MIT X Window System is licensed by Massachusetts Institute of Technology.
ELAS - SCIENCE & TECHNOLOGY LABORATORY APPLICATIONS SOFTWARE (MASSCOMP VERSION)
NASA Technical Reports Server (NTRS)
Walters, D.
1994-01-01
The Science and Technology Laboratory Applications Software (ELAS) was originally designed to analyze and process digital imagery data, specifically remotely-sensed scanner data. This capability includes the processing of Landsat multispectral data; aircraft-acquired scanner data; digitized topographic data; and numerous other ancillary data, such as soil types and rainfall information, that can be stored in digitized form. ELAS has the subsequent capability to geographically reference this data to dozens of standard, as well as user created projections. As an integrated image processing system, ELAS offers the user of remotely-sensed data a wide range of capabilities in the areas of land cover analysis and general purpose image analysis. ELAS is designed for flexible use and operation and includes its own FORTRAN operating subsystem and an expandable set of FORTRAN application modules. Because all of ELAS resides in one "logical" FORTRAN program, data inputs and outputs, directives, and module switching are convenient for the user. There are over 230 modules presently available to aid the user in performing a wide range of land cover analyses and manipulation. The file management modules enable the user to allocate, define, access, and specify usage for all types of files (ELAS files, subfiles, external files etc.). Various other modules convert specific types of satellite, aircraft, and vector-polygon data into files that can be used by other ELAS modules. The user also has many module options which aid in displaying image data, such as magnification/reduction of the display; true color display; and several memory functions. Additional modules allow for the building and manipulation of polygonal areas of the image data. Finally, there are modules which allow the user to select and classify the image data. An important feature of the ELAS subsystem is that its structure allows new applications modules to be easily integrated in the future. ELAS has as a standard the flexibility to process data elements exceeding 8 bits in length, including floating point (noninteger) elements and 16 or 32 bit integers. Thus it is able to analyze and process "non-standard" nonimage data. The VAX (ERL-10017) and Concurrent (ERL-10013) versions of ELAS 9.0 are written in FORTRAN and ASSEMBLER for DEC VAX series computers running VMS and Concurrent computers running MTM. The Sun (SSC-00019), Masscomp (SSC-00020), and Silicon Graphics (SSC-00021) versions of ELAS 9.0 are written in FORTRAN 77 and C-LANGUAGE for Sun4 series computers running SunOS, Masscomp computers running UNIX, and Silicon Graphics IRIS computers running IRIX. The Concurrent version requires at least 15 bit addressing and a direct memory access channel. The VAX and Concurrent versions of ELAS both require floating-point hardware, at least 1Mb of RAM, and approximately 70Mb of disk space. Both versions also require a COMTAL display device in order to display images. For the Sun, Masscomp, and Silicon Graphics versions of ELAS, the disk storage required is approximately 115Mb, and a minimum of 8Mb of RAM is required for execution. The Sun version of ELAS requires either the X-Window System Version 11 Revision 4 or Sun OpenWindows Version 2. The Masscomp version requires a GA1000 display device and the associated "gp" library. The Silicon Graphics version requires Silicon Graphics' GL library. ELAS display functions will not work with a monochrome monitor. The standard distribution medium for the VAX version (ERL10017) is a set of two 9-track 1600 BPI magnetic tapes in DEC VAX BACKUP format. This version is also available on a TK50 tape cartridge in DEC VAX BACKUP format. The standard distribution medium for the Concurrent version (ERL-10013) is a set of two 9-track 1600 BPI magnetic tapes in Concurrent BACKUP format. The standard distribution medium for the Sun version (SSC-00019) is a .25 inch streaming magnetic tape cartridge in UNIX tar format. The standard distribution medium for the Masscomp version, (SSC-00020) is a .25 inch streaming magnetic tape cartridge in UNIX tar format. The standard distribution medium for the Silicon Graphics version (SSC-00021) is a .25 inch streaming magnetic IRIS tape cartridge in UNIX tar format. Version 9.0 was released in 1991. Sun4, SunOS, and Open Windows are trademarks of Sun Microsystems, Inc. MIT X Window System is licensed by Massachusetts Institute of Technology.
ELAS - SCIENCE & TECHNOLOGY LABORATORY APPLICATIONS SOFTWARE (DEC VAX VERSION)
NASA Technical Reports Server (NTRS)
Junkin, B. G.
1994-01-01
The Science and Technology Laboratory Applications Software (ELAS) was originally designed to analyze and process digital imagery data, specifically remotely-sensed scanner data. This capability includes the processing of Landsat multispectral data; aircraft-acquired scanner data; digitized topographic data; and numerous other ancillary data, such as soil types and rainfall information, that can be stored in digitized form. ELAS has the subsequent capability to geographically reference this data to dozens of standard, as well as user created projections. As an integrated image processing system, ELAS offers the user of remotely-sensed data a wide range of capabilities in the areas of land cover analysis and general purpose image analysis. ELAS is designed for flexible use and operation and includes its own FORTRAN operating subsystem and an expandable set of FORTRAN application modules. Because all of ELAS resides in one "logical" FORTRAN program, data inputs and outputs, directives, and module switching are convenient for the user. There are over 230 modules presently available to aid the user in performing a wide range of land cover analyses and manipulation. The file management modules enable the user to allocate, define, access, and specify usage for all types of files (ELAS files, subfiles, external files etc.). Various other modules convert specific types of satellite, aircraft, and vector-polygon data into files that can be used by other ELAS modules. The user also has many module options which aid in displaying image data, such as magnification/reduction of the display; true color display; and several memory functions. Additional modules allow for the building and manipulation of polygonal areas of the image data. Finally, there are modules which allow the user to select and classify the image data. An important feature of the ELAS subsystem is that its structure allows new applications modules to be easily integrated in the future. ELAS has as a standard the flexibility to process data elements exceeding 8 bits in length, including floating point (noninteger) elements and 16 or 32 bit integers. Thus it is able to analyze and process "non-standard" nonimage data. The VAX (ERL-10017) and Concurrent (ERL-10013) versions of ELAS 9.0 are written in FORTRAN and ASSEMBLER for DEC VAX series computers running VMS and Concurrent computers running MTM. The Sun (SSC-00019), Masscomp (SSC-00020), and Silicon Graphics (SSC-00021) versions of ELAS 9.0 are written in FORTRAN 77 and C-LANGUAGE for Sun4 series computers running SunOS, Masscomp computers running UNIX, and Silicon Graphics IRIS computers running IRIX. The Concurrent version requires at least 15 bit addressing and a direct memory access channel. The VAX and Concurrent versions of ELAS both require floating-point hardware, at least 1Mb of RAM, and approximately 70Mb of disk space. Both versions also require a COMTAL display device in order to display images. For the Sun, Masscomp, and Silicon Graphics versions of ELAS, the disk storage required is approximately 115Mb, and a minimum of 8Mb of RAM is required for execution. The Sun version of ELAS requires either the X-Window System Version 11 Revision 4 or Sun OpenWindows Version 2. The Masscomp version requires a GA1000 display device and the associated "gp" library. The Silicon Graphics version requires Silicon Graphics' GL library. ELAS display functions will not work with a monochrome monitor. The standard distribution medium for the VAX version (ERL10017) is a set of two 9-track 1600 BPI magnetic tapes in DEC VAX BACKUP format. This version is also available on a TK50 tape cartridge in DEC VAX BACKUP format. The standard distribution medium for the Concurrent version (ERL-10013) is a set of two 9-track 1600 BPI magnetic tapes in Concurrent BACKUP format. The standard distribution medium for the Sun version (SSC-00019) is a .25 inch streaming magnetic tape cartridge in UNIX tar format. The standard distribution medium for the Masscomp version, (SSC-00020) is a .25 inch streaming magnetic tape cartridge in UNIX tar format. The standard distribution medium for the Silicon Graphics version (SSC-00021) is a .25 inch streaming magnetic IRIS tape cartridge in UNIX tar format. Version 9.0 was released in 1991. Sun4, SunOS, and Open Windows are trademarks of Sun Microsystems, Inc. MIT X Window System is licensed by Massachusetts Institute of Technology.
The Acute Effect of Concurrent Training on Running Performance over 6 Days
ERIC Educational Resources Information Center
Doma, Kenji; Deakin, Glen
2015-01-01
Purpose: This study examined the effects of strength training on alternating days and endurance training on consecutive days on running performance for 6 days. Methods: Sixteen male and 8 female moderately trained individuals were evenly assigned into concurrent-training (CCT) and strength-training (ST) groups. The CCT group undertook strength…
34 CFR 674.52 - Cancellation procedures.
Code of Federal Regulations, 2010 CFR
2010-07-01
... borrower's loan deferment under § 674.34(c) to run concurrently with any period for which cancellation... run concurrently with any period for which a cancellation under §§ 674.53, 674.56, 674.57, or 674.58 is granted. (e) National community service. No borrower who has received a benefit under subtitle D...
Exercise to reduce the escalation of cocaine self-administration in adolescent and adult rats.
Zlebnik, Natalie E; Anker, Justin J; Carroll, Marilyn E
2012-12-01
Concurrent access to an exercise wheel decreases cocaine self-administration under short access (5 h/day for 5 days) conditions and suppresses cocaine-primed reinstatement in adult rats. The effect of exercise (wheel running) on the escalation of cocaine intake during long access (LgA, 6 h/day for 26 days) conditions was evaluated. Adolescent and adult female rats acquired wheel running, and behavior was allowed to stabilize for 3 days. They were then implanted with an iv catheter and allowed to self-administer cocaine (0.4 mg/kg, iv) during 6-h daily sessions for 16 days with concurrent access to either an unlocked or a locked running wheel. Subsequently, for ten additional sessions, wheel access conditions during cocaine self-administration sessions were reversed (i.e., locked wheels became unlocked and vice versa). In the adolescents, concurrent access to the unlocked exercise wheel decreased responding for cocaine and attenuated escalation of cocaine intake irrespective of whether the locked or unlocked condition came first. However, cocaine intake increased when the wheel was subsequently locked for the adolescents that had initial access to an unlocked wheel. Concurrent wheel access either before or after the locked wheel access did not reduce cocaine intake in adults. Wheel running reduced cocaine intake during LgA conditions in adolescent but not adult rats, and concurrent access to the running wheel was necessary. These results suggest that exercise prevents cocaine seeking and that this effect is more pronounced in adolescents than adults.
Communications oriented programming of parallel iterative solutions of sparse linear systems
NASA Technical Reports Server (NTRS)
Patrick, M. L.; Pratt, T. W.
1986-01-01
Parallel algorithms are developed for a class of scientific computational problems by partitioning the problems into smaller problems which may be solved concurrently. The effectiveness of the resulting parallel solutions is determined by the amount and frequency of communication and synchronization and the extent to which communication can be overlapped with computation. Three different parallel algorithms for solving the same class of problems are presented, and their effectiveness is analyzed from this point of view. The algorithms are programmed using a new programming environment. Run-time statistics and experience obtained from the execution of these programs assist in measuring the effectiveness of these algorithms.
Sabag, Angelo; Najafi, Abdolrahman; Michael, Scott; Esgin, Tuguy; Halaki, Mark; Hackett, Daniel
2018-04-16
The purpose of this systematic review and meta-analysis is to assess the effect of concurrent high intensity interval training (HIIT) and resistance training (RT) on strength and hypertrophy. Five electronic databases were searched using terms related to HIIT, RT, and concurrent training. Effect size (ES), calculated as standardised differences in the means, were used to examine the effect of concurrent HIIT and RT compared to RT alone on muscle strength and hypertrophy. Sub-analyses were performed to assess region-specific strength and hypertrophy, HIIT modality (cycling versus running), and inter-modal rest responses. Compared to RT alone, concurrent HIIT and RT led to similar changes in muscle hypertrophy and upper body strength. Concurrent HIIT and RT resulted in a lower increase in lower body strength compared to RT alone (ES = -0.248, p = 0.049). Sub analyses showed a trend for lower body strength to be negatively affected by cycling HIIT (ES = -0.377, p = 0.074) and not running (ES = -0.176, p = 0.261). Data suggests concurrent HIIT and RT does not negatively impact hypertrophy or upper body strength, and that any possible negative effect on lower body strength may be ameliorated by incorporating running based HIIT and longer inter-modal rest periods.
Integrated System-Level Optimization for Concurrent Engineering With Parametric Subsystem Modeling
NASA Technical Reports Server (NTRS)
Schuman, Todd; DeWeck, Oliver L.; Sobieski, Jaroslaw
2005-01-01
The introduction of concurrent design practices to the aerospace industry has greatly increased the productivity of engineers and teams during design sessions as demonstrated by JPL's Team X. Simultaneously, advances in computing power have given rise to a host of potent numerical optimization methods capable of solving complex multidisciplinary optimization problems containing hundreds of variables, constraints, and governing equations. Unfortunately, such methods are tedious to set up and require significant amounts of time and processor power to execute, thus making them unsuitable for rapid concurrent engineering use. This paper proposes a framework for Integration of System-Level Optimization with Concurrent Engineering (ISLOCE). It uses parametric neural-network approximations of the subsystem models. These approximations are then linked to a system-level optimizer that is capable of reaching a solution quickly due to the reduced complexity of the approximations. The integration structure is described in detail and applied to the multiobjective design of a simplified Space Shuttle external fuel tank model. Further, a comparison is made between the new framework and traditional concurrent engineering (without system optimization) through an experimental trial with two groups of engineers. Each method is evaluated in terms of optimizer accuracy, time to solution, and ease of use. The results suggest that system-level optimization, running as a background process during integrated concurrent engineering sessions, is potentially advantageous as long as it is judiciously implemented.
Production experience with the ATLAS Event Service
NASA Astrophysics Data System (ADS)
Benjamin, D.; Calafiura, P.; Childers, T.; De, K.; Guan, W.; Maeno, T.; Nilsson, P.; Tsulaia, V.; Van Gemmeren, P.; Wenaus, T.; ATLAS Collaboration
2017-10-01
The ATLAS Event Service (AES) has been designed and implemented for efficient running of ATLAS production workflows on a variety of computing platforms, ranging from conventional Grid sites to opportunistic, often short-lived resources, such as spot market commercial clouds, supercomputers and volunteer computing. The Event Service architecture allows real time delivery of fine grained workloads to running payload applications which process dispatched events or event ranges and immediately stream the outputs to highly scalable Object Stores. Thanks to its agile and flexible architecture the AES is currently being used by grid sites for assigning low priority workloads to otherwise idle computing resources; similarly harvesting HPC resources in an efficient back-fill mode; and massively scaling out to the 50-100k concurrent core level on the Amazon spot market to efficiently utilize those transient resources for peak production needs. Platform ports in development include ATLAS@Home (BOINC) and the Google Compute Engine, and a growing number of HPC platforms. After briefly reviewing the concept and the architecture of the Event Service, we will report the status and experience gained in AES commissioning and production operations on supercomputers, and our plans for extending ES application beyond Geant4 simulation to other workflows, such as reconstruction and data analysis.
Highlights of X-Stack ExM Deliverable Swift/T
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wozniak, Justin M.
Swift/T is a key success from the ExM: System support for extreme-scale, many-task applications1 X-Stack project, which proposed to use concurrent dataflow as an innovative programming model to exploit extreme parallelism in exascale computers. The Swift/T component of the project reimplemented the Swift language from scratch to allow applications that compose scientific modules together to be build and run on available petascale computers (Blue Gene, Cray). Swift/T does this via a new compiler and runtime that generates and executes the application as an MPI program. We assume that mission-critical emerging exascale applications will be composed as scalable applications using existingmore » software components, connected by data dependencies. Developers wrap native code fragments using a higherlevel language, then build composite applications to form a computational experiment. This exemplifies hierarchical concurrency: lower-level messaging libraries are used for fine-grained parallelism; highlevel control is used for inter-task coordination. These patterns are best expressed with dataflow, but static DAGs (i.e., other workflow languages) limit the applications that can be built; they do not provide the expressiveness of Swift, such as conditional execution, iteration, and recursive functions.« less
Estimation Accuracy on Execution Time of Run-Time Tasks in a Heterogeneous Distributed Environment.
Liu, Qi; Cai, Weidong; Jin, Dandan; Shen, Jian; Fu, Zhangjie; Liu, Xiaodong; Linge, Nigel
2016-08-30
Distributed Computing has achieved tremendous development since cloud computing was proposed in 2006, and played a vital role promoting rapid growth of data collecting and analysis models, e.g., Internet of things, Cyber-Physical Systems, Big Data Analytics, etc. Hadoop has become a data convergence platform for sensor networks. As one of the core components, MapReduce facilitates allocating, processing and mining of collected large-scale data, where speculative execution strategies help solve straggler problems. However, there is still no efficient solution for accurate estimation on execution time of run-time tasks, which can affect task allocation and distribution in MapReduce. In this paper, task execution data have been collected and employed for the estimation. A two-phase regression (TPR) method is proposed to predict the finishing time of each task accurately. Detailed data of each task have drawn interests with detailed analysis report being made. According to the results, the prediction accuracy of concurrent tasks' execution time can be improved, in particular for some regular jobs.
Parallel Signal Processing and System Simulation using aCe
NASA Technical Reports Server (NTRS)
Dorband, John E.; Aburdene, Maurice F.
2003-01-01
Recently, networked and cluster computation have become very popular for both signal processing and system simulation. A new language is ideally suited for parallel signal processing applications and system simulation since it allows the programmer to explicitly express the computations that can be performed concurrently. In addition, the new C based parallel language (ace C) for architecture-adaptive programming allows programmers to implement algorithms and system simulation applications on parallel architectures by providing them with the assurance that future parallel architectures will be able to run their applications with a minimum of modification. In this paper, we will focus on some fundamental features of ace C and present a signal processing application (FFT).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Medin, Stanislav A.; Basko, Mikhail M.; Orlov, Yurii N.
2012-07-11
Radiation hydrodynamics 1D simulations were performed with two concurrent codes, DEIRA and RAMPHY. The DEIRA code was used for DT capsule implosion and burn, and the RAMPHY code was used for computation of X-ray and fast ions deposition in the first wall liquid film of the reactor chamber. The simulations were run for 740 MJ direct drive DT capsule and Pb thin liquid wall reactor chamber of 10 m diameter. Temporal profiles for DT capsule leaking power of X-rays, neutrons and fast {sup 4}He ions were obtained and spatial profiles of the liquid film flow parameter were computed and analyzed.
Cooperating reduction machines
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kluge, W.E.
1983-11-01
This paper presents a concept and a system architecture for the concurrent execution of program expressions of a concrete reduction language based on lamda-expressions. If formulated appropriately, these expressions are well-suited for concurrent execution, following a demand-driven model of computation. In particular, recursive program expressions with nonlinear expansion may, at run time, recursively be partitioned into a hierarchy of independent subexpressions which can be reduced by a corresponding hierarchy of virtual reduction machines. This hierarchy unfolds and collapses dynamically, with virtual machines recursively assuming the role of masters that create and eventually terminate, or synchronize with, slaves. The paper alsomore » proposes a nonhierarchically organized system of reduction machines, each featuring a stack architecture, that effectively supports the allocation of virtual machines to the real machines of the system in compliance with their hierarchical order of creation and termination. 25 references.« less
Belke, Terry W
2010-01-01
Previous research suggested that allocation of responses on concurrent schedules of wheel-running reinforcement was less sensitive to schedule differences than typically observed with more conventional reinforcers. To assess this possibility, 16 female Long Evans rats were exposed to concurrent FR FR schedules of reinforcement and the schedule value on one alternative was systematically increased. In one condition, the reinforcer on both alternatives was .1 ml of 7.5% sucrose solution; in the other, it was a 30-s opportunity to run in a wheel. Results showed that the average ratio at which greater than 90% of responses were allocated to the unchanged alternative was higher with wheel-running reinforcement. As the ratio requirement was initially increased, responding strongly shifted toward the unchanged alternative with sucrose, but not with wheel running. Instead, responding initially increased on both alternatives, then subsequently shifted toward the unchanged alternative. Furthermore, changeover responses as a percentage of total responses decreased with sucrose, but not wheel-running reinforcement. Finally, for some animals, responding on the increasing ratio alternative decreased as the ratio requirement increased, but then stopped and did not decline with further increments. The implications of these results for theories of choice are discussed. PMID:21451744
Belke, Terry W
2010-09-01
Previous research suggested that allocation of responses on concurrent schedules of wheel-running reinforcement was less sensitive to schedule differences than typically observed with more conventional reinforcers. To assess this possibility, 16 female Long Evans rats were exposed to concurrent FR FR schedules of reinforcement and the schedule value on one alternative was systematically increased. In one condition, the reinforcer on both alternatives was .1 ml of 7.5% sucrose solution; in the other, it was a 30-s opportunity to run in a wheel. Results showed that the average ratio at which greater than 90% of responses were allocated to the unchanged alternative was higher with wheel-running reinforcement. As the ratio requirement was initially increased, responding strongly shifted toward the unchanged alternative with sucrose, but not with wheel running. Instead, responding initially increased on both alternatives, then subsequently shifted toward the unchanged alternative. Furthermore, changeover responses as a percentage of total responses decreased with sucrose, but not wheel-running reinforcement. Finally, for some animals, responding on the increasing ratio alternative decreased as the ratio requirement increased, but then stopped and did not decline with further increments. The implications of these results for theories of choice are discussed.
An Authentication Protocol for Future Sensor Networks.
Bilal, Muhammad; Kang, Shin-Gak
2017-04-28
Authentication is one of the essential security services in Wireless Sensor Networks (WSNs) for ensuring secure data sessions. Sensor node authentication ensures the confidentiality and validity of data collected by the sensor node, whereas user authentication guarantees that only legitimate users can access the sensor data. In a mobile WSN, sensor and user nodes move across the network and exchange data with multiple nodes, thus experiencing the authentication process multiple times. The integration of WSNs with Internet of Things (IoT) brings forth a new kind of WSN architecture along with stricter security requirements; for instance, a sensor node or a user node may need to establish multiple concurrent secure data sessions. With concurrent data sessions, the frequency of the re-authentication process increases in proportion to the number of concurrent connections. Moreover, to establish multiple data sessions, it is essential that a protocol participant have the capability of running multiple instances of the protocol run, which makes the security issue even more challenging. The currently available authentication protocols were designed for the autonomous WSN and do not account for the above requirements. Hence, ensuring a lightweight and efficient authentication protocol has become more crucial. In this paper, we present a novel, lightweight and efficient key exchange and authentication protocol suite called the Secure Mobile Sensor Network (SMSN) Authentication Protocol. In the SMSN a mobile node goes through an initial authentication procedure and receives a re-authentication ticket from the base station. Later a mobile node can use this re-authentication ticket when establishing multiple data exchange sessions and/or when moving across the network. This scheme reduces the communication and computational complexity of the authentication process. We proved the strength of our protocol with rigorous security analysis (including formal analysis using the BAN-logic) and simulated the SMSN and previously proposed schemes in an automated protocol verifier tool. Finally, we compared the computational complexity and communication cost against well-known authentication protocols.
An Authentication Protocol for Future Sensor Networks
Bilal, Muhammad; Kang, Shin-Gak
2017-01-01
Authentication is one of the essential security services in Wireless Sensor Networks (WSNs) for ensuring secure data sessions. Sensor node authentication ensures the confidentiality and validity of data collected by the sensor node, whereas user authentication guarantees that only legitimate users can access the sensor data. In a mobile WSN, sensor and user nodes move across the network and exchange data with multiple nodes, thus experiencing the authentication process multiple times. The integration of WSNs with Internet of Things (IoT) brings forth a new kind of WSN architecture along with stricter security requirements; for instance, a sensor node or a user node may need to establish multiple concurrent secure data sessions. With concurrent data sessions, the frequency of the re-authentication process increases in proportion to the number of concurrent connections. Moreover, to establish multiple data sessions, it is essential that a protocol participant have the capability of running multiple instances of the protocol run, which makes the security issue even more challenging. The currently available authentication protocols were designed for the autonomous WSN and do not account for the above requirements. Hence, ensuring a lightweight and efficient authentication protocol has become more crucial. In this paper, we present a novel, lightweight and efficient key exchange and authentication protocol suite called the Secure Mobile Sensor Network (SMSN) Authentication Protocol. In the SMSN a mobile node goes through an initial authentication procedure and receives a re-authentication ticket from the base station. Later a mobile node can use this re-authentication ticket when establishing multiple data exchange sessions and/or when moving across the network. This scheme reduces the communication and computational complexity of the authentication process. We proved the strength of our protocol with rigorous security analysis (including formal analysis using the BAN-logic) and simulated the SMSN and previously proposed schemes in an automated protocol verifier tool. Finally, we compared the computational complexity and communication cost against well-known authentication protocols. PMID:28452937
Belke, Terry W
2012-07-01
Belke (2010) showed that on concurrent ratio schedules, the difference in ratio requirements required to produce near exclusive preference for the lower ratio alternative was substantively greater when the reinforcer was wheel running than when it was sucrose. The current study replicated this finding and showed that this choice behavior can be described by the matching law and the contingency discriminability model. Eight female Long Evans rats were exposed to concurrent VR schedules of wheel-running reinforcement (30s) and the schedule value of the initially preferred alternative was systematically increased. Two rats rapidly developed exclusive preference for the lower ratio alternative, but the majority did not - even when ratios differed by 20:1. Analysis showed that estimates of slopes from the matching law and the proportion of reinforcers misattributed from the contingency discriminability model were related to the ratios at which near exclusive preference developed. The fit of these models would be consistent with misattribution of reinforcers or poor discrimination between alternatives due to the long duration of wheel running. Copyright © 2012 Elsevier B.V. All rights reserved.
Belke, Terry W; Pierce, W David; Duncan, Ian D
2006-09-01
Choice between sucrose and wheel-running reinforcement was assessed in two experiments. In the first experiment, ten male Wistar rats were exposed to concurrent VI 30 s VI 30 s schedules of wheel-running and sucrose reinforcement. Sucrose concentration varied across concentrations of 2.5, 7.5, and 12.5%. As concentration increased, more behavior was allocated to sucrose and more reinforcements were obtained from that alternative. Allocation of behavior to wheel running decreased, but obtained wheel-running reinforcement did not change. Overall, the results suggested that food-deprived rats were sensitive to qualitative changes in food supply (sucrose concentration) while continuing to defend a level of physical activity (wheel running). In the second study, 15 female Long Evans rats were exposed to concurrent variable ratio schedules of sucrose and wheel-running, wheel-running and wheel-running, and sucrose and sucrose reinforcement. For each pair of reinforcers, substitutability was assessed by the effect of income-compensated price changes on consumption of the two reinforcers. Results showed that, as expected, sucrose substituted for sucrose and wheel running substituted for wheel running. Wheel running, however, did not substitute for sucrose; but sucrose partially substituted for wheel running. We address the implications of the interrelationships of sucrose and wheel running for an understanding of activity anorexia.
Belke, Terry W; Duncan, Ian D; David Pierce, W
2006-01-01
Choice between sucrose and wheel-running reinforcement was assessed in two experiments. In the first experiment, ten male Wistar rats were exposed to concurrent VI 30 s VI 30 s schedules of wheel-running and sucrose reinforcement. Sucrose concentration varied across concentrations of 2.5, 7.5, and 12.5%. As concentration increased, more behavior was allocated to sucrose and more reinforcements were obtained from that alternative. Allocation of behavior to wheel running decreased, but obtained wheel-running reinforcement did not change. Overall, the results suggested that food-deprived rats were sensitive to qualitative changes in food supply (sucrose concentration) while continuing to defend a level of physical activity (wheel running). In the second study, 15 female Long Evans rats were exposed to concurrent variable ratio schedules of sucrose and wheel-running, wheel-running and wheel-running, and sucrose and sucrose reinforcement. For each pair of reinforcers, substitutability was assessed by the effect of income-compensated price changes on consumption of the two reinforcers. Results showed that, as expected, sucrose substituted for sucrose and wheel running substituted for wheel running. Wheel running, however, did not substitute for sucrose; but sucrose partially substituted for wheel running. We address the implications of the interrelationships of sucrose and wheel running for an understanding of activity anorexia. PMID:17002224
Parallel processing for scientific computations
NASA Technical Reports Server (NTRS)
Alkhatib, Hasan S.
1991-01-01
The main contribution of the effort in the last two years is the introduction of the MOPPS system. After doing extensive literature search, we introduced the system which is described next. MOPPS employs a new solution to the problem of managing programs which solve scientific and engineering applications on a distributed processing environment. Autonomous computers cooperate efficiently in solving large scientific problems with this solution. MOPPS has the advantage of not assuming the presence of any particular network topology or configuration, computer architecture, or operating system. It imposes little overhead on network and processor resources while efficiently managing programs concurrently. The core of MOPPS is an intelligent program manager that builds a knowledge base of the execution performance of the parallel programs it is managing under various conditions. The manager applies this knowledge to improve the performance of future runs. The program manager learns from experience.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahn, Surl-Hee; Grate, Jay W.; Darve, Eric F.
Molecular dynamics (MD) simulations are useful in obtaining thermodynamic and kinetic properties of bio-molecules but are limited by the timescale barrier, i.e., we may be unable to efficiently obtain properties because we need to run microseconds or longer simulations using femtoseconds time steps. While there are several existing methods to overcome this timescale barrier and efficiently sample thermodynamic and/or kinetic properties, problems remain in regard to being able to sample un- known systems, deal with high-dimensional space of collective variables, and focus the computational effort on slow timescales. Hence, a new sampling method, called the “Concurrent Adaptive Sampling (CAS) algorithm,”more » has been developed to tackle these three issues and efficiently obtain conformations and pathways. The method is not constrained to use only one or two collective variables, unlike most reaction coordinate-dependent methods. Instead, it can use a large number of collective vari- ables and uses macrostates (a partition of the collective variable space) to enhance the sampling. The exploration is done by running a large number of short simula- tions, and a clustering technique is used to accelerate the sampling. In this paper, we introduce the new methodology and show results from two-dimensional models and bio-molecules, such as penta-alanine and triazine polymer« less
Estimation Accuracy on Execution Time of Run-Time Tasks in a Heterogeneous Distributed Environment
Liu, Qi; Cai, Weidong; Jin, Dandan; Shen, Jian; Fu, Zhangjie; Liu, Xiaodong; Linge, Nigel
2016-01-01
Distributed Computing has achieved tremendous development since cloud computing was proposed in 2006, and played a vital role promoting rapid growth of data collecting and analysis models, e.g., Internet of things, Cyber-Physical Systems, Big Data Analytics, etc. Hadoop has become a data convergence platform for sensor networks. As one of the core components, MapReduce facilitates allocating, processing and mining of collected large-scale data, where speculative execution strategies help solve straggler problems. However, there is still no efficient solution for accurate estimation on execution time of run-time tasks, which can affect task allocation and distribution in MapReduce. In this paper, task execution data have been collected and employed for the estimation. A two-phase regression (TPR) method is proposed to predict the finishing time of each task accurately. Detailed data of each task have drawn interests with detailed analysis report being made. According to the results, the prediction accuracy of concurrent tasks’ execution time can be improved, in particular for some regular jobs. PMID:27589753
Bio-Docklets: virtualization containers for single-step execution of NGS pipelines.
Kim, Baekdoo; Ali, Thahmina; Lijeron, Carlos; Afgan, Enis; Krampis, Konstantinos
2017-08-01
Processing of next-generation sequencing (NGS) data requires significant technical skills, involving installation, configuration, and execution of bioinformatics data pipelines, in addition to specialized postanalysis visualization and data mining software. In order to address some of these challenges, developers have leveraged virtualization containers toward seamless deployment of preconfigured bioinformatics software and pipelines on any computational platform. We present an approach for abstracting the complex data operations of multistep, bioinformatics pipelines for NGS data analysis. As examples, we have deployed 2 pipelines for RNA sequencing and chromatin immunoprecipitation sequencing, preconfigured within Docker virtualization containers we call Bio-Docklets. Each Bio-Docklet exposes a single data input and output endpoint and from a user perspective, running the pipelines as simply as running a single bioinformatics tool. This is achieved using a "meta-script" that automatically starts the Bio-Docklets and controls the pipeline execution through the BioBlend software library and the Galaxy Application Programming Interface. The pipeline output is postprocessed by integration with the Visual Omics Explorer framework, providing interactive data visualizations that users can access through a web browser. Our goal is to enable easy access to NGS data analysis pipelines for nonbioinformatics experts on any computing environment, whether a laboratory workstation, university computer cluster, or a cloud service provider. Beyond end users, the Bio-Docklets also enables developers to programmatically deploy and run a large number of pipeline instances for concurrent analysis of multiple datasets. © The Authors 2017. Published by Oxford University Press.
Bio-Docklets: virtualization containers for single-step execution of NGS pipelines
Kim, Baekdoo; Ali, Thahmina; Lijeron, Carlos; Afgan, Enis
2017-01-01
Abstract Processing of next-generation sequencing (NGS) data requires significant technical skills, involving installation, configuration, and execution of bioinformatics data pipelines, in addition to specialized postanalysis visualization and data mining software. In order to address some of these challenges, developers have leveraged virtualization containers toward seamless deployment of preconfigured bioinformatics software and pipelines on any computational platform. We present an approach for abstracting the complex data operations of multistep, bioinformatics pipelines for NGS data analysis. As examples, we have deployed 2 pipelines for RNA sequencing and chromatin immunoprecipitation sequencing, preconfigured within Docker virtualization containers we call Bio-Docklets. Each Bio-Docklet exposes a single data input and output endpoint and from a user perspective, running the pipelines as simply as running a single bioinformatics tool. This is achieved using a “meta-script” that automatically starts the Bio-Docklets and controls the pipeline execution through the BioBlend software library and the Galaxy Application Programming Interface. The pipeline output is postprocessed by integration with the Visual Omics Explorer framework, providing interactive data visualizations that users can access through a web browser. Our goal is to enable easy access to NGS data analysis pipelines for nonbioinformatics experts on any computing environment, whether a laboratory workstation, university computer cluster, or a cloud service provider. Beyond end users, the Bio-Docklets also enables developers to programmatically deploy and run a large number of pipeline instances for concurrent analysis of multiple datasets. PMID:28854616
Heterogeneous concurrent computing with exportable services
NASA Technical Reports Server (NTRS)
Sunderam, Vaidy
1995-01-01
Heterogeneous concurrent computing, based on the traditional process-oriented model, is approaching its functionality and performance limits. An alternative paradigm, based on the concept of services, supporting data driven computation, and built on a lightweight process infrastructure, is proposed to enhance the functional capabilities and the operational efficiency of heterogeneous network-based concurrent computing. TPVM is an experimental prototype system supporting exportable services, thread-based computation, and remote memory operations that is built as an extension of and an enhancement to the PVM concurrent computing system. TPVM offers a significantly different computing paradigm for network-based computing, while maintaining a close resemblance to the conventional PVM model in the interest of compatibility and ease of transition Preliminary experiences have demonstrated that the TPVM framework presents a natural yet powerful concurrent programming interface, while being capable of delivering performance improvements of upto thirty percent.
NASA Technical Reports Server (NTRS)
Schwan, Karsten
1994-01-01
Atmospheric modeling is a grand challenge problem for several reasons, including its inordinate computational requirements and its generation of large amounts of data concurrent with its use of very large data sets derived from measurement instruments like satellites. In addition, atmospheric models are typically run several times, on new data sets or to reprocess existing data sets, to investigate or reinvestigate specific chemical or physical processes occurring in the earth's atmosphere, to understand model fidelity with respect to observational data, or simply to experiment with specific model parameters or components.
Early MIMD experience on the CRAY X-MP
NASA Astrophysics Data System (ADS)
Rhoades, Clifford E.; Stevens, K. G.
1985-07-01
This paper describes some early experience with converting four physics simulation programs to the CRAY X-MP, a current Multiple Instruction, Multiple Data (MIMD) computer consisting of two processors each with an architecture similar to that of the CRAY-1. As a multi-processor, the CRAY X-MP together with the high speed Solid-state Storage Device (SSD) in an ideal machine upon which to study MIMD algorithms for solving the equations of mathematical physics because it is fast enough to run real problems. The computer programs used in this study are all FORTRAN versions of original production codes. They range in sophistication from a one-dimensional numerical simulation of collisionless plasma to a two-dimensional hydrodynamics code with heat flow to a couple of three-dimensional fluid dynamics codes with varying degrees of viscous modeling. Early research with a dual processor configuration has shown speed-ups ranging from 1.55 to 1.98. It has been observed that a few simple extensions to FORTRAN allow a typical programmer to achieve a remarkable level of efficiency. These extensions involve the concept of memory local to a concurrent subprogram and memory common to all concurrent subprograms.
Creating Web-Based Scientific Applications Using Java Servlets
NASA Technical Reports Server (NTRS)
Palmer, Grant; Arnold, James O. (Technical Monitor)
2001-01-01
There are many advantages to developing web-based scientific applications. Any number of people can access the application concurrently. The application can be accessed from a remote location. The application becomes essentially platform-independent because it can be run from any machine that has internet access and can run a web browser. Maintenance and upgrades to the application are simplified since only one copy of the application exists in a centralized location. This paper details the creation of web-based applications using Java servlets. Java is a powerful, versatile programming language that is well suited to developing web-based programs. A Java servlet provides the interface between the central server and the remote client machines. The servlet accepts input data from the client, runs the application on the server, and sends the output back to the client machine. The type of servlet that supports the HTTP protocol will be discussed in depth. Among the topics the paper will discuss are how to write an http servlet, how the servlet can run applications written in Java and other languages, and how to set up a Java web server. The entire process will be demonstrated by building a web-based application to compute stagnation point heat transfer.
Design and Implementation of a Threaded Search Engine for Tour Recommendation Systems
NASA Astrophysics Data System (ADS)
Lee, Junghoon; Park, Gyung-Leen; Ko, Jin-Hee; Shin, In-Hye; Kang, Mikyung
This paper implements a threaded scan engine for the O(n!) search space and measures its performance, aiming at providing a responsive tour recommendation and scheduling service. As a preliminary step of integrating POI ontology, mobile object database, and personalization profile for the development of new vehicular telematics services, this implementation can give a useful guideline to design a challenging and computation-intensive vehicular telematics service. The implemented engine allocates the subtree to the respective threads and makes them run concurrently exploiting the primitives provided by the operating system and the underlying multiprocessor architecture. It also makes it easy to add a variety of constraints, for example, the search tree is pruned if the cost of partial allocation already exceeds the current best. The performance measurement result shows that the service can run even in the low-power telematics device when the number of destinations does not exceed 15, with an appropriate constraint processing.
BESIU Physical Analysis on Hadoop Platform
NASA Astrophysics Data System (ADS)
Huo, Jing; Zang, Dongsong; Lei, Xiaofeng; Li, Qiang; Sun, Gongxing
2014-06-01
In the past 20 years, computing cluster has been widely used for High Energy Physics data processing. The jobs running on the traditional cluster with a Data-to-Computing structure, have to read large volumes of data via the network to the computing nodes for analysis, thereby making the I/O latency become a bottleneck of the whole system. The new distributed computing technology based on the MapReduce programming model has many advantages, such as high concurrency, high scalability and high fault tolerance, and it can benefit us in dealing with Big Data. This paper brings the idea of using MapReduce model to do BESIII physical analysis, and presents a new data analysis system structure based on Hadoop platform, which not only greatly improve the efficiency of data analysis, but also reduces the cost of system building. Moreover, this paper establishes an event pre-selection system based on the event level metadata(TAGs) database to optimize the data analyzing procedure.
NSTX-U Advances in Real-Time C++11 on Linux
NASA Astrophysics Data System (ADS)
Erickson, Keith G.
2015-08-01
Programming languages like C and Ada combined with proprietary embedded operating systems have dominated the real-time application space for decades. The new C++11 standard includes native, language-level support for concurrency, a required feature for any nontrivial event-oriented real-time software. Threads, Locks, and Atomics now exist to provide the necessary tools to build the structures that make up the foundation of a complex real-time system. The National Spherical Torus Experiment Upgrade (NSTX-U) at the Princeton Plasma Physics Laboratory (PPPL) is breaking new ground with the language as applied to the needs of fusion devices. A new Digital Coil Protection System (DCPS) will serve as the main protection mechanism for the magnetic coils, and it is written entirely in C++11 running on Concurrent Computer Corporation's real-time operating system, RedHawk Linux. It runs over 600 algorithms in a 5 kHz control loop that determine whether or not to shut down operations before physical damage occurs. To accomplish this, NSTX-U engineers developed software tools that do not currently exist elsewhere, including real-time atomic synchronization, real-time containers, and a real-time logging framework. Together with a recent (and carefully configured) version of the GCC compiler, these tools enable data acquisition, processing, and output using a conventional operating system to meet a hard real-time deadline (that is, missing one periodic is a failure) of 200 microseconds.
NASA Astrophysics Data System (ADS)
Delipetrev, Blagoj
2016-04-01
Presently, most of the existing software is desktop-based, designed to work on a single computer, which represents a major limitation in many ways, starting from limited computer processing, storage power, accessibility, availability, etc. The only feasible solution lies in the web and cloud. This abstract presents research and development of a cloud computing geospatial application for water resources based on free and open source software and open standards using hybrid deployment model of public - private cloud, running on two separate virtual machines (VMs). The first one (VM1) is running on Amazon web services (AWS) and the second one (VM2) is running on a Xen cloud platform. The presented cloud application is developed using free and open source software, open standards and prototype code. The cloud application presents a framework how to develop specialized cloud geospatial application that needs only a web browser to be used. This cloud application is the ultimate collaboration geospatial platform because multiple users across the globe with internet connection and browser can jointly model geospatial objects, enter attribute data and information, execute algorithms, and visualize results. The presented cloud application is: available all the time, accessible from everywhere, it is scalable, works in a distributed computer environment, it creates a real-time multiuser collaboration platform, the programing languages code and components are interoperable, and it is flexible in including additional components. The cloud geospatial application is implemented as a specialized water resources application with three web services for 1) data infrastructure (DI), 2) support for water resources modelling (WRM), 3) user management. The web services are running on two VMs that are communicating over the internet providing services to users. The application was tested on the Zletovica river basin case study with concurrent multiple users. The application is a state-of-the-art cloud geospatial collaboration platform. The presented solution is a prototype and can be used as a foundation for developing of any specialized cloud geospatial applications. Further research will be focused on distributing the cloud application on additional VMs, testing the scalability and availability of services.
77 FR 59014 - Sunshine Act Meeting; Notice
Federal Register 2010, 2011, 2012, 2013, 2014
2012-09-25
... the meetings of the Institutional Advancement Committee and the Audit Committee, which will run... Eastern Daylight Time. ** The meeting of the Institutional Advancement Committee will run concurrently.... Institutional Advancement Committee** 5. Audit Committee Monday, October 1, 2012 1. Promotion & Provision for...
C formal verification with unix communication and concurrency
NASA Technical Reports Server (NTRS)
Hoover, Doug N.
1990-01-01
The results of a NASA SBIR project are presented in which CSP-Ariel, a verification system for C programs which use Unix system calls for concurrent programming, interprocess communication, and file input and output, was developed. This project builds on ORA's Ariel C verification system by using the system of Hoare's book, Communicating Sequential Processes, to model concurrency and communication. The system runs in ORA's Clio theorem proving environment. The use of CSP to model Unix concurrency and sketch the CSP semantics of a simple concurrent program is outlined. Plans for further development of CSP-Ariel are discussed. This paper is presented in viewgraph form.
Computational simulation of concurrent engineering for aerospace propulsion systems
NASA Technical Reports Server (NTRS)
Chamis, C. C.; Singhal, S. N.
1992-01-01
Results are summarized of an investigation to assess the infrastructure available and the technology readiness in order to develop computational simulation methods/software for concurrent engineering. These results demonstrate that development of computational simulations methods for concurrent engineering is timely. Extensive infrastructure, in terms of multi-discipline simulation, component-specific simulation, system simulators, fabrication process simulation, and simulation of uncertainties - fundamental in developing such methods, is available. An approach is recommended which can be used to develop computational simulation methods for concurrent engineering for propulsion systems and systems in general. Benefits and facets needing early attention in the development are outlined.
Computational simulation for concurrent engineering of aerospace propulsion systems
NASA Technical Reports Server (NTRS)
Chamis, C. C.; Singhal, S. N.
1993-01-01
Results are summarized for an investigation to assess the infrastructure available and the technology readiness in order to develop computational simulation methods/software for concurrent engineering. These results demonstrate that development of computational simulation methods for concurrent engineering is timely. Extensive infrastructure, in terms of multi-discipline simulation, component-specific simulation, system simulators, fabrication process simulation, and simulation of uncertainties--fundamental to develop such methods, is available. An approach is recommended which can be used to develop computational simulation methods for concurrent engineering of propulsion systems and systems in general. Benefits and issues needing early attention in the development are outlined.
Computational simulation for concurrent engineering of aerospace propulsion systems
NASA Astrophysics Data System (ADS)
Chamis, C. C.; Singhal, S. N.
1993-02-01
Results are summarized for an investigation to assess the infrastructure available and the technology readiness in order to develop computational simulation methods/software for concurrent engineering. These results demonstrate that development of computational simulation methods for concurrent engineering is timely. Extensive infrastructure, in terms of multi-discipline simulation, component-specific simulation, system simulators, fabrication process simulation, and simulation of uncertainties--fundamental to develop such methods, is available. An approach is recommended which can be used to develop computational simulation methods for concurrent engineering of propulsion systems and systems in general. Benefits and issues needing early attention in the development are outlined.
ERIC Educational Resources Information Center
Belke, Terry W.; Duncan, Ian D.; Pierce, W. David
2006-01-01
Choice between sucrose and wheel-running reinforcement was assessed in two experiments. In the first experiment, ten male Wistar rats were exposed to concurrent VI 30 s VI 30 s schedules of wheel-running and sucrose reinforcement. Sucrose concentration varied across concentrations of 2.5, 7.5, and 12.5%. As concentration increased, more behavior…
32 CFR 634.43 - Driving records.
Code of Federal Regulations, 2010 CFR
2010-07-01
... the commission of a felony. Fleeing the scene of an accident involving death or personal injury (hit and run). E. Perjury or making a false statement or affidavit under oath to responsible officials... year for intoxicated driving, revocations may run consecutively (total of 24 months) or concurrently...
Belke, T W; Belliveau, J
2001-05-01
Six male Wistar rats were exposed to concurrent variable-interval schedules of wheel-running reinforcement. The reinforcer associated with each alternative was the opportunity to run for 15 s, and the duration of the changeover delay was 1 s. Results suggested that time allocation was more sensitive to relative reinforcement rate than was response allocation. For time allocation, the mean slopes and intercepts were 0.82 and 0.008, respectively. In contrast, for response allocation, mean slopes and intercepts were 0.60 and 0.03, respectively. Correction for low response rates and high rates of changing over, however, increased slopes for response allocation to about equal those for time allocation. The results of the present study suggest that the two-operant form of the matching law can be extended to wheel-running reinforcement. 'I'he effects of a low overall response rate, a short Changeover delay, and long postreinforcement pausing on the assessment of matching in the present study are discussed.
An Adaptive Flow Solver for Air-Borne Vehicles Undergoing Time-Dependent Motions/Deformations
NASA Technical Reports Server (NTRS)
Singh, Jatinder; Taylor, Stephen
1997-01-01
This report describes a concurrent Euler flow solver for flows around complex 3-D bodies. The solver is based on a cell-centered finite volume methodology on 3-D unstructured tetrahedral grids. In this algorithm, spatial discretization for the inviscid convective term is accomplished using an upwind scheme. A localized reconstruction is done for flow variables which is second order accurate. Evolution in time is accomplished using an explicit three-stage Runge-Kutta method which has second order temporal accuracy. This is adapted for concurrent execution using another proven methodology based on concurrent graph abstraction. This solver operates on heterogeneous network architectures. These architectures may include a broad variety of UNIX workstations and PCs running Windows NT, symmetric multiprocessors and distributed-memory multi-computers. The unstructured grid is generated using commercial grid generation tools. The grid is automatically partitioned using a concurrent algorithm based on heat diffusion. This results in memory requirements that are inversely proportional to the number of processors. The solver uses automatic granularity control and resource management techniques both to balance load and communication requirements, and deal with differing memory constraints. These ideas are again based on heat diffusion. Results are subsequently combined for visualization and analysis using commercial CFD tools. Flow simulation results are demonstrated for a constant section wing at subsonic, transonic, and a supersonic case. These results are compared with experimental data and numerical results of other researchers. Performance results are under way for a variety of network topologies.
GaAs Supercomputing: Architecture, Language, And Algorithms For Image Processing
NASA Astrophysics Data System (ADS)
Johl, John T.; Baker, Nick C.
1988-10-01
The application of high-speed GaAs processors in a parallel system matches the demanding computational requirements of image processing. The architecture of the McDonnell Douglas Astronautics Company (MDAC) vector processor is described along with the algorithms and language translator. Most image and signal processing algorithms can utilize parallel processing and show a significant performance improvement over sequential versions. The parallelization performed by this system is within each vector instruction. Since each vector has many elements, each requiring some computation, useful concurrent arithmetic operations can easily be performed. Balancing the memory bandwidth with the computation rate of the processors is an important design consideration for high efficiency and utilization. The architecture features a bus-based execution unit consisting of four to eight 32-bit GaAs RISC microprocessors running at a 200 MHz clock rate for a peak performance of 1.6 BOPS. The execution unit is connected to a vector memory with three buses capable of transferring two input words and one output word every 10 nsec. The address generators inside the vector memory perform different vector addressing modes and feed the data to the execution unit. The functions discussed in this paper include basic MATRIX OPERATIONS, 2-D SPATIAL CONVOLUTION, HISTOGRAM, and FFT. For each of these algorithms, assembly language programs were run on a behavioral model of the system to obtain performance figures.
SiMon: Simulation Monitor for Computational Astrophysics
NASA Astrophysics Data System (ADS)
Xuran Qian, Penny; Cai, Maxwell Xu; Portegies Zwart, Simon; Zhu, Ming
2017-09-01
Scientific discovery via numerical simulations is important in modern astrophysics. This relatively new branch of astrophysics has become possible due to the development of reliable numerical algorithms and the high performance of modern computing technologies. These enable the analysis of large collections of observational data and the acquisition of new data via simulations at unprecedented accuracy and resolution. Ideally, simulations run until they reach some pre-determined termination condition, but often other factors cause extensive numerical approaches to break down at an earlier stage. In those cases, processes tend to be interrupted due to unexpected events in the software or the hardware. In those cases, the scientist handles the interrupt manually, which is time-consuming and prone to errors. We present the Simulation Monitor (SiMon) to automatize the farming of large and extensive simulation processes. Our method is light-weight, it fully automates the entire workflow management, operates concurrently across multiple platforms and can be installed in user space. Inspired by the process of crop farming, we perceive each simulation as a crop in the field and running simulation becomes analogous to growing crops. With the development of SiMon we relax the technical aspects of simulation management. The initial package was developed for extensive parameter searchers in numerical simulations, but it turns out to work equally well for automating the computational processing and reduction of observational data reduction.
Methodologies and systems for heterogeneous concurrent computing
NASA Technical Reports Server (NTRS)
Sunderam, V. S.
1994-01-01
Heterogeneous concurrent computing is gaining increasing acceptance as an alternative or complementary paradigm to multiprocessor-based parallel processing as well as to conventional supercomputing. While algorithmic and programming aspects of heterogeneous concurrent computing are similar to their parallel processing counterparts, system issues, partitioning and scheduling, and performance aspects are significantly different. In this paper, we discuss critical design and implementation issues in heterogeneous concurrent computing, and describe techniques for enhancing its effectiveness. In particular, we highlight the system level infrastructures that are required, aspects of parallel algorithm development that most affect performance, system capabilities and limitations, and tools and methodologies for effective computing in heterogeneous networked environments. We also present recent developments and experiences in the context of the PVM system and comment on ongoing and future work.
Cloudweaver: Adaptive and Data-Driven Workload Manager for Generic Clouds
NASA Astrophysics Data System (ADS)
Li, Rui; Chen, Lei; Li, Wen-Syan
Cloud computing denotes the latest trend in application development for parallel computing on massive data volumes. It relies on clouds of servers to handle tasks that used to be managed by an individual server. With cloud computing, software vendors can provide business intelligence and data analytic services for internet scale data sets. Many open source projects, such as Hadoop, offer various software components that are essential for building a cloud infrastructure. Current Hadoop (and many others) requires users to configure cloud infrastructures via programs and APIs and such configuration is fixed during the runtime. In this chapter, we propose a workload manager (WLM), called CloudWeaver, which provides automated configuration of a cloud infrastructure for runtime execution. The workload management is data-driven and can adapt to dynamic nature of operator throughput during different execution phases. CloudWeaver works for a single job and a workload consisting of multiple jobs running concurrently, which aims at maximum throughput using a minimum set of processors.
Ferrauti, Alexander; Bergermann, Matthias; Fernandez-Fernandez, Jaime
2010-10-01
The purpose of this study was to investigate the effects of a concurrent strength and endurance training program on running performance and running economy of middle-aged runners during their marathon preparation. Twenty-two (8 women and 14 men) recreational runners (mean ± SD: age 40.0 ± 11.7 years; body mass index 22.6 ± 2.1 kg·m⁻²) were separated into 2 groups (n = 11; combined endurance running and strength training program [ES]: 9 men, 2 women and endurance running [E]: 7 men, and 4 women). Both completed an 8-week intervention period that consisted of either endurance training (E: 276 ± 108 minute running per week) or a combined endurance and strength training program (ES: 240 ± 121-minute running plus 2 strength training sessions per week [120 minutes]). Strength training was focused on trunk (strength endurance program) and leg muscles (high-intensity program). Before and after the intervention, subjects completed an incremental treadmill run and maximal isometric strength tests. The initial values for VO2peak (ES: 52.0 ± 6.1 vs. E: 51.1 ± 7.5 ml·kg⁻¹·min⁻¹) and anaerobic threshold (ES: 3.5 ± 0.4 vs. E: 3.4 ± 0.5 m·s⁻¹) were identical in both groups. A significant time × intervention effect was found for maximal isometric force of knee extension (ES: from 4.6 ± 1.4 to 6.2 ± 1.0 N·kg⁻¹, p < 0.01), whereas no changes in body mass occurred. No significant differences between the groups and no significant interaction (time × intervention) were found for VO2 (absolute and relative to VO2peak) at defined marathon running velocities (2.4 and 2.8 m·s⁻¹) and submaximal blood lactate thresholds (2.0, 3.0, and 4.0 mmol·L⁻¹). Stride length and stride frequency also remained unchanged. The results suggest no benefits of an 8-week concurrent strength training for running economy and coordination of recreational marathon runners despite a clear improvement in leg strength, maybe because of an insufficient sample size or a short intervention period.
NASA Astrophysics Data System (ADS)
Anantharaj, V. G.; Venzke, J.; Lingerfelt, E.; Messer, B.
2015-12-01
Climate model simulations are used to understand the evolution and variability of earth's climate. Unfortunately, high-resolution multi-decadal climate simulations can take days to weeks to complete. Typically, the simulation results are not analyzed until the model runs have ended. During the course of the simulation, the output may be processed periodically to ensure that the model is preforming as expected. However, most of the data analytics and visualization are not performed until the simulation is finished. The lengthy time period needed for the completion of the simulation constrains the productivity of climate scientists. Our implementation of near real-time data visualization analytics capabilities allows scientists to monitor the progress of their simulations while the model is running. Our analytics software executes concurrently in a co-scheduling mode, monitoring data production. When new data are generated by the simulation, a co-scheduled data analytics job is submitted to render visualization artifacts of the latest results. These visualization output are automatically transferred to Bellerophon's data server located at ORNL's Compute and Data Environment for Science (CADES) where they are processed and archived into Bellerophon's database. During the course of the experiment, climate scientists can then use Bellerophon's graphical user interface to view animated plots and their associated metadata. The quick turnaround from the start of the simulation until the data are analyzed permits research decisions and projections to be made days or sometimes even weeks sooner than otherwise possible! The supercomputer resources used to run the simulation are unaffected by co-scheduling the data visualization jobs, so the model runs continuously while the data are visualized. Our just-in-time data visualization software looks to increase climate scientists' productivity as climate modeling moves into exascale era of computing.
Performance Studies on Distributed Virtual Screening
Krüger, Jens; de la Garza, Luis; Kohlbacher, Oliver; Nagel, Wolfgang E.
2014-01-01
Virtual high-throughput screening (vHTS) is an invaluable method in modern drug discovery. It permits screening large datasets or databases of chemical structures for those structures binding possibly to a drug target. Virtual screening is typically performed by docking code, which often runs sequentially. Processing of huge vHTS datasets can be parallelized by chunking the data because individual docking runs are independent of each other. The goal of this work is to find an optimal splitting maximizing the speedup while considering overhead and available cores on Distributed Computing Infrastructures (DCIs). We have conducted thorough performance studies accounting not only for the runtime of the docking itself, but also for structure preparation. Performance studies were conducted via the workflow-enabled science gateway MoSGrid (Molecular Simulation Grid). As input we used benchmark datasets for protein kinases. Our performance studies show that docking workflows can be made to scale almost linearly up to 500 concurrent processes distributed even over large DCIs, thus accelerating vHTS campaigns significantly. PMID:25032219
Weather Research and Forecasting Model with Vertical Nesting Capability
DOE Office of Scientific and Technical Information (OSTI.GOV)
2014-08-01
The Weather Research and Forecasting (WRF) model with vertical nesting capability is an extension of the WRF model, which is available in the public domain, from www.wrf-model.org. The new code modifies the nesting procedure, which passes lateral boundary conditions between computational domains in the WRF model. Previously, the same vertical grid was required on all domains, while the new code allows different vertical grids to be used on concurrently run domains. This new functionality improves WRF's ability to produce high-resolution simulations of the atmosphere by allowing a wider range of scales to be efficiently resolved and more accurate lateral boundarymore » conditions to be provided through the nesting procedure.« less
Numerical studies of electron dynamics in oblique quasi-perpendicular collisionless shock waves
NASA Technical Reports Server (NTRS)
Liewer, P. C.; Decyk, V. K.; Dawson, J. M.; Lembege, B.
1991-01-01
Linear and nonlinear electron damping of the whistler precursor wave train to low Mach number quasi-perpendicular oblique shocks is studied using a one-dimensional electromagnetic plasma simulation code with particle electrons and ions. In some parameter regimes, electrons are observed to trap along the magnetic field lines in the potential of the whistler precursor wave train. This trapping can lead to significant electron heating in front of the shock for low beta(e). Use of a 64-processor hypercube concurrent computer has enabled long runs using realistic mass ratios in the full particle in-cell code and thus simulate shock parameter regimes and phenomena not previously studied numerically.
NSTX-U Advances in Real-Time C++11 on Linux
DOE Office of Scientific and Technical Information (OSTI.GOV)
Erickson, Keith G.
Programming languages like C and Ada combined with proprietary embedded operating systems have dominated the real-time application space for decades. The new C++11standard includes native, language-level support for concurrency, a required feature for any nontrivial event-oriented real-time software. Threads, Locks, and Atomics now exist to provide the necessary tools to build the structures that make up the foundation of a complex real-time system. The National Spherical Torus Experiment Upgrade (NSTX-U) at the Princeton Plasma Physics Laboratory (PPPL) is breaking new ground with the language as applied to the needs of fusion devices. A new Digital Coil Protection System (DCPS) willmore » serve as the main protection mechanism for the magnetic coils, and it is written entirely in C++11 running on Concurrent Computer Corporation's real-time operating system, RedHawk Linux. It runs over 600 algorithms in a 5 kHz control loop that determine whether or not to shut down operations before physical damage occurs. To accomplish this, NSTX-U engineers developed software tools that do not currently exist elsewhere, including real-time atomic synchronization, real-time containers, and a real-time logging framework. Together with a recent (and carefully configured) version of the GCC compiler, these tools enable data acquisition, processing, and output using a conventional operating system to meet a hard real-time deadline (that is, missing one periodic is a failure) of 200 microseconds.« less
NSTX-U Advances in Real-Time C++11 on Linux
Erickson, Keith G.
2015-08-14
Programming languages like C and Ada combined with proprietary embedded operating systems have dominated the real-time application space for decades. The new C++11standard includes native, language-level support for concurrency, a required feature for any nontrivial event-oriented real-time software. Threads, Locks, and Atomics now exist to provide the necessary tools to build the structures that make up the foundation of a complex real-time system. The National Spherical Torus Experiment Upgrade (NSTX-U) at the Princeton Plasma Physics Laboratory (PPPL) is breaking new ground with the language as applied to the needs of fusion devices. A new Digital Coil Protection System (DCPS) willmore » serve as the main protection mechanism for the magnetic coils, and it is written entirely in C++11 running on Concurrent Computer Corporation's real-time operating system, RedHawk Linux. It runs over 600 algorithms in a 5 kHz control loop that determine whether or not to shut down operations before physical damage occurs. To accomplish this, NSTX-U engineers developed software tools that do not currently exist elsewhere, including real-time atomic synchronization, real-time containers, and a real-time logging framework. Together with a recent (and carefully configured) version of the GCC compiler, these tools enable data acquisition, processing, and output using a conventional operating system to meet a hard real-time deadline (that is, missing one periodic is a failure) of 200 microseconds.« less
Finite Element Analysis in Concurrent Processing: Computational Issues
NASA Technical Reports Server (NTRS)
Sobieszczanski-Sobieski, Jaroslaw; Watson, Brian; Vanderplaats, Garrett
2004-01-01
The purpose of this research is to investigate the potential application of new methods for solving large-scale static structural problems on concurrent computers. It is well known that traditional single-processor computational speed will be limited by inherent physical limits. The only path to achieve higher computational speeds lies through concurrent processing. Traditional factorization solution methods for sparse matrices are ill suited for concurrent processing because the null entries get filled, leading to high communication and memory requirements. The research reported herein investigates alternatives to factorization that promise a greater potential to achieve high concurrent computing efficiency. Two methods, and their variants, based on direct energy minimization are studied: a) minimization of the strain energy using the displacement method formulation; b) constrained minimization of the complementary strain energy using the force method formulation. Initial results indicated that in the context of the direct energy minimization the displacement formulation experienced convergence and accuracy difficulties while the force formulation showed promising potential.
Paradigms and strategies for scientific computing on distributed memory concurrent computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Foster, I.T.; Walker, D.W.
1994-06-01
In this work we examine recent advances in parallel languages and abstractions that have the potential for improving the programmability and maintainability of large-scale, parallel, scientific applications running on high performance architectures and networks. This paper focuses on Fortran M, a set of extensions to Fortran 77 that supports the modular design of message-passing programs. We describe the Fortran M implementation of a particle-in-cell (PIC) plasma simulation application, and discuss issues in the optimization of the code. The use of two other methodologies for parallelizing the PIC application are considered. The first is based on the shared object abstraction asmore » embodied in the Orca language. The second approach is the Split-C language. In Fortran M, Orca, and Split-C the ability of the programmer to control the granularity of communication is important is designing an efficient implementation.« less
Method for routing events from key strokes in a multi-processing computer systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rhodes, D.A.; Rustici, E.; Carter, K.H.
1990-01-23
The patent describes a method of routing user input in a computer system which concurrently runs a plurality of processes. It comprises: generating keycodes representative of keys typed by a user; distinguishing generated keycodes by looking up each keycode in a routing table which assigns each possible keycode to an individual assigned process of the plurality of processes, one of which processes being a supervisory process; then, sending each keycode to its assigned process until a keycode assigned to the supervisory process is received; sending keycodes received subsequent to the keycode assigned to the supervisory process to a buffer; next,more » providing additional keycodes to the supervisory process from the buffer until the supervisory process has completed operation; and sending keycodes stored in the buffer to processes assigned therewith after the supervisory process has completedoperation.« less
A C++ Thread Package for Concurrent and Parallel Programming
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jie Chen; William Watson
1999-11-01
Recently thread libraries have become a common entity on various operating systems such as Unix, Windows NT and VxWorks. Those thread libraries offer significant performance enhancement by allowing applications to use multiple threads running either concurrently or in parallel on multiprocessors. However, the incompatibilities between native libraries introduces challenges for those who wish to develop portable applications.
1990-04-23
developed Ada Real - Time Operating System (ARTOS) for bare machine environments(Target), ACW 1.1I0. " ; - -M.UIECTTERMS Ada programming language, Ada...configuration) Operating System: CSC developed Ada Real - Time Operating System (ARTOS) for bare machine environments Memory Size: 4MB 2.2...Test Method Testing of the MC Ado V1.2.beta/ Concurrent Computer Corporation compiler and the CSC developed Ada Real - Time Operating System (ARTOS) for
Generalized concurrence in boson sampling.
Chin, Seungbeom; Huh, Joonsuk
2018-04-17
A fundamental question in linear optical quantum computing is to understand the origin of the quantum supremacy in the physical system. It is found that the multimode linear optical transition amplitudes are calculated through the permanents of transition operator matrices, which is a hard problem for classical simulations (boson sampling problem). We can understand this problem by considering a quantum measure that directly determines the runtime for computing the transition amplitudes. In this paper, we suggest a quantum measure named "Fock state concurrence sum" C S , which is the summation over all the members of "the generalized Fock state concurrence" (a measure analogous to the generalized concurrences of entanglement and coherence). By introducing generalized algorithms for computing the transition amplitudes of the Fock state boson sampling with an arbitrary number of photons per mode, we show that the minimal classical runtime for all the known algorithms directly depends on C S . Therefore, we can state that the Fock state concurrence sum C S behaves as a collective measure that controls the computational complexity of Fock state BS. We expect that our observation on the role of the Fock state concurrence in the generalized algorithm for permanents would provide a unified viewpoint to interpret the quantum computing power of linear optics.
Hypercube matrix computation task
NASA Technical Reports Server (NTRS)
Calalo, R.; Imbriale, W.; Liewer, P.; Lyons, J.; Manshadi, F.; Patterson, J.
1987-01-01
The Hypercube Matrix Computation (Year 1986-1987) task investigated the applicability of a parallel computing architecture to the solution of large scale electromagnetic scattering problems. Two existing electromagnetic scattering codes were selected for conversion to the Mark III Hypercube concurrent computing environment. They were selected so that the underlying numerical algorithms utilized would be different thereby providing a more thorough evaluation of the appropriateness of the parallel environment for these types of problems. The first code was a frequency domain method of moments solution, NEC-2, developed at Lawrence Livermore National Laboratory. The second code was a time domain finite difference solution of Maxwell's equations to solve for the scattered fields. Once the codes were implemented on the hypercube and verified to obtain correct solutions by comparing the results with those from sequential runs, several measures were used to evaluate the performance of the two codes. First, a comparison was provided of the problem size possible on the hypercube with 128 megabytes of memory for a 32-node configuration with that available in a typical sequential user environment of 4 to 8 megabytes. Then, the performance of the codes was anlyzed for the computational speedup attained by the parallel architecture.
Dynamic partial reconfiguration of logic controllers implemented in FPGAs
NASA Astrophysics Data System (ADS)
Bazydło, Grzegorz; Wiśniewski, Remigiusz
2016-09-01
Technological progress in recent years benefits in digital circuits containing millions of logic gates with the capability for reprogramming and reconfiguring. On the one hand it provides the unprecedented computational power, but on the other hand the modelled systems are becoming increasingly complex, hierarchical and concurrent. Therefore, abstract modelling supported by the Computer Aided Design tools becomes a very important task. Even the higher consumption of the basic electronic components seems to be acceptable because chip manufacturing costs tend to fall over the time. The paper presents a modelling approach for logic controllers with the use of Unified Modelling Language (UML). Thanks to the Model Driven Development approach, starting with a UML state machine model, through the construction of an intermediate Hierarchical Concurrent Finite State Machine model, a collection of Verilog files is created. The system description generated in hardware description language can be synthesized and implemented in reconfigurable devices, such as FPGAs. Modular specification of the prototyped controller permits for further dynamic partial reconfiguration of the prototyped system. The idea bases on the exchanging of the functionality of the already implemented controller without stopping of the FPGA device. It means, that a part (for example a single module) of the logic controller is replaced by other version (called context), while the rest of the system is still running. The method is illustrated by a practical example by an exemplary Home Area Network system.
A Pervasive Parallel Processing Framework for Data Visualization and Analysis at Extreme Scale
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ma, Kwan-Liu
Most of today’s visualization libraries and applications are based off of what is known today as the visualization pipeline. In the visualization pipeline model, algorithms are encapsulated as “filtering” components with inputs and outputs. These components can be combined by connecting the outputs of one filter to the inputs of another filter. The visualization pipeline model is popular because it provides a convenient abstraction that allows users to combine algorithms in powerful ways. Unfortunately, the visualization pipeline cannot run effectively on exascale computers. Experts agree that the exascale machine will comprise processors that contain many cores. Furthermore, physical limitations willmore » prevent data movement in and out of the chip (that is, between main memory and the processing cores) from keeping pace with improvements in overall compute performance. To use these processors to their fullest capability, it is essential to carefully consider memory access. This is where the visualization pipeline fails. Each filtering component in the visualization library is expected to take a data set in its entirety, perform some computation across all of the elements, and output the complete results. The process of iterating over all elements must be repeated in each filter, which is one of the worst possible ways to traverse memory when trying to maximize the number of executions per memory access. This project investigates a new type of visualization framework that exhibits a pervasive parallelism necessary to run on exascale machines. Our framework achieves this by defining algorithms in terms of functors, which are localized, stateless operations. Functors can be composited in much the same way as filters in the visualization pipeline. But, functors’ design allows them to be concurrently running on massive amounts of lightweight threads. Only with such fine-grained parallelism can we hope to fill the billions of threads we expect will be necessary for efficient computation on an exascale computer. This project concludes with a functional prototype containing pervasively parallel algorithms that perform demonstratively well on many-core processors. These algorithms are fundamental for performing data analysis and visualization at extreme scale.« less
Software fault-tolerance by design diversity DEDIX: A tool for experiments
NASA Technical Reports Server (NTRS)
Avizienis, A.; Gunningberg, P.; Kelly, J. P. J.; Lyu, R. T.; Strigini, L.; Traverse, P. J.; Tso, K. S.; Voges, U.
1986-01-01
The use of multiple versions of a computer program, independently designed from a common specification, to reduce the effects of an error is discussed. If these versions are designed by independent programming teams, it is expected that a fault in one version will not have the same behavior as any fault in the other versions. Since the errors in the output of the versions are different and uncorrelated, it is possible to run the versions concurrently, cross-check their results at prespecified points, and mask errors. A DEsign DIversity eXperiments (DEDIX) testbed was implemented to study the influence of common mode errors which can result in a failure of the entire system. The layered design of DEDIX and its decision algorithm are described.
Probabilistic simulation of concurrent engineering of propulsion systems
NASA Technical Reports Server (NTRS)
Chamis, C. C.; Singhal, S. N.
1993-01-01
Technology readiness and the available infrastructure is assessed for timely computational simulation of concurrent engineering for propulsion systems. Results for initial coupled multidisciplinary, fabrication-process, and system simulators are presented including uncertainties inherent in various facets of engineering processes. An approach is outlined for computationally formalizing the concurrent engineering process from cradle-to-grave via discipline dedicated workstations linked with a common database.
eXascale PRogramming Environment and System Software (XPRESS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chapman, Barbara; Gabriel, Edgar
Exascale systems, with a thousand times the compute capacity of today’s leading edge petascale computers, are expected to emerge during the next decade. Their software systems will need to facilitate the exploitation of exceptional amounts of concurrency in applications, and ensure that jobs continue to run despite the occurrence of system failures and other kinds of hard and soft errors. Adapting computations at runtime to cope with changes in the execution environment, as well as to improve power and performance characteristics, is likely to become the norm. As a result, considerable innovation is required to develop system support to meetmore » the needs of future computing platforms. The XPRESS project aims to develop and prototype a revolutionary software system for extreme-scale computing for both exascale and strongscaled problems. The XPRESS collaborative research project will advance the state-of-the-art in high performance computing and enable exascale computing for current and future DOE mission-critical applications and supporting systems. The goals of the XPRESS research project are to: A. enable exascale performance capability for DOE applications, both current and future, B. develop and deliver a practical computing system software X-stack, OpenX, for future practical DOE exascale computing systems, and C. provide programming methods and environments for effective means of expressing application and system software for portable exascale system execution.« less
Automated Concurrent Blackboard System Generation in C++
NASA Technical Reports Server (NTRS)
Kaplan, J. A.; McManus, J. W.; Bynum, W. L.
1999-01-01
In his 1992 Ph.D. thesis, "Design and Analysis Techniques for Concurrent Blackboard Systems", John McManus defined several performance metrics for concurrent blackboard systems and developed a suite of tools for creating and analyzing such systems. These tools allow a user to analyze a concurrent blackboard system design and predict the performance of the system before any code is written. The design can be modified until simulated performance is satisfactory. Then, the code generator can be invoked to generate automatically all of the code required for the concurrent blackboard system except for the code implementing the functionality of each knowledge source. We have completed the port of the source code generator and a simulator for a concurrent blackboard system. The source code generator generates the necessary C++ source code to implement the concurrent blackboard system using Parallel Virtual Machine (PVM) running on a heterogeneous network of UNIX(trademark) workstations. The concurrent blackboard simulator uses the blackboard specification file to predict the performance of the concurrent blackboard design. The only part of the source code for the concurrent blackboard system that the user must supply is the code implementing the functionality of the knowledge sources.
AEROELASTIC SIMULATION TOOL FOR INFLATABLE BALLUTE AEROCAPTURE
NASA Technical Reports Server (NTRS)
Liever, P. A.; Sheta, E. F.; Habchi, S. D.
2006-01-01
A multidisciplinary analysis tool is under development for predicting the impact of aeroelastic effects on the functionality of inflatable ballute aeroassist vehicles in both the continuum and rarefied flow regimes. High-fidelity modules for continuum and rarefied aerodynamics, structural dynamics, heat transfer, and computational grid deformation are coupled in an integrated multi-physics, multi-disciplinary computing environment. This flexible and extensible approach allows the integration of state-of-the-art, stand-alone NASA and industry leading continuum and rarefied flow solvers and structural analysis codes into a computing environment in which the modules can run concurrently with synchronized data transfer. Coupled fluid-structure continuum flow demonstrations were conducted on a clamped ballute configuration. The feasibility of implementing a DSMC flow solver in the simulation framework was demonstrated, and loosely coupled rarefied flow aeroelastic demonstrations were performed. A NASA and industry technology survey identified CFD, DSMC and structural analysis codes capable of modeling non-linear shape and material response of thin-film inflated aeroshells. The simulation technology will find direct and immediate applications with NASA and industry in ongoing aerocapture technology development programs.
Multi-mode sensor processing on a dynamically reconfigurable massively parallel processor array
NASA Astrophysics Data System (ADS)
Chen, Paul; Butts, Mike; Budlong, Brad; Wasson, Paul
2008-04-01
This paper introduces a novel computing architecture that can be reconfigured in real time to adapt on demand to multi-mode sensor platforms' dynamic computational and functional requirements. This 1 teraOPS reconfigurable Massively Parallel Processor Array (MPPA) has 336 32-bit processors. The programmable 32-bit communication fabric provides streamlined inter-processor connections with deterministically high performance. Software programmability, scalability, ease of use, and fast reconfiguration time (ranging from microseconds to milliseconds) are the most significant advantages over FPGAs and DSPs. This paper introduces the MPPA architecture, its programming model, and methods of reconfigurability. An MPPA platform for reconfigurable computing is based on a structural object programming model. Objects are software programs running concurrently on hundreds of 32-bit RISC processors and memories. They exchange data and control through a network of self-synchronizing channels. A common application design pattern on this platform, called a work farm, is a parallel set of worker objects, with one input and one output stream. Statically configured work farms with homogeneous and heterogeneous sets of workers have been used in video compression and decompression, network processing, and graphics applications.
NASA Technical Reports Server (NTRS)
Gilbertsen, Noreen D.; Belytschko, Ted
1990-01-01
The implementation of a nonlinear explicit program on a vectorized, concurrent computer with shared memory is described and studied. The conflict between vectorization and concurrency is described and some guidelines are given for optimal block sizes. Several example problems are summarized to illustrate the types of speed-ups which can be achieved by reprogramming as compared to compiler optimization.
NASA Astrophysics Data System (ADS)
Tripathi, Vijay S.; Yeh, G. T.
1993-06-01
Sophisticated and highly computation-intensive models of transport of reactive contaminants in groundwater have been developed in recent years. Application of such models to real-world contaminant transport problems, e.g., simulation of groundwater transport of 10-15 chemically reactive elements (e.g., toxic metals) and relevant complexes and minerals in two and three dimensions over a distance of several hundred meters, requires high-performance computers including supercomputers. Although not widely recognized as such, the computational complexity and demand of these models compare with well-known computation-intensive applications including weather forecasting and quantum chemical calculations. A survey of the performance of a variety of available hardware, as measured by the run times for a reactive transport model HYDROGEOCHEM, showed that while supercomputers provide the fastest execution times for such problems, relatively low-cost reduced instruction set computer (RISC) based scalar computers provide the best performance-to-price ratio. Because supercomputers like the Cray X-MP are inherently multiuser resources, often the RISC computers also provide much better turnaround times. Furthermore, RISC-based workstations provide the best platforms for "visualization" of groundwater flow and contaminant plumes. The most notable result, however, is that current workstations costing less than $10,000 provide performance within a factor of 5 of a Cray X-MP.
Optimizing legacy molecular dynamics software with directive-based offload
NASA Astrophysics Data System (ADS)
Michael Brown, W.; Carrillo, Jan-Michael Y.; Gavhane, Nitin; Thakkar, Foram M.; Plimpton, Steven J.
2015-10-01
Directive-based programming models are one solution for exploiting many-core coprocessors to increase simulation rates in molecular dynamics. They offer the potential to reduce code complexity with offload models that can selectively target computations to run on the CPU, the coprocessor, or both. In this paper, we describe modifications to the LAMMPS molecular dynamics code to enable concurrent calculations on a CPU and coprocessor. We demonstrate that standard molecular dynamics algorithms can run efficiently on both the CPU and an x86-based coprocessor using the same subroutines. As a consequence, we demonstrate that code optimizations for the coprocessor also result in speedups on the CPU; in extreme cases up to 4.7X. We provide results for LAMMPS benchmarks and for production molecular dynamics simulations using the Stampede hybrid supercomputer with both Intel® Xeon Phi™ coprocessors and NVIDIA GPUs. The optimizations presented have increased simulation rates by over 2X for organic molecules and over 7X for liquid crystals on Stampede. The optimizations are available as part of the "Intel package" supplied with LAMMPS.
Block-Parallel Data Analysis with DIY2
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morozov, Dmitriy; Peterka, Tom
DIY2 is a programming model and runtime for block-parallel analytics on distributed-memory machines. Its main abstraction is block-structured data parallelism: data are decomposed into blocks; blocks are assigned to processing elements (processes or threads); computation is described as iterations over these blocks, and communication between blocks is defined by reusable patterns. By expressing computation in this general form, the DIY2 runtime is free to optimize the movement of blocks between slow and fast memories (disk and flash vs. DRAM) and to concurrently execute blocks residing in memory with multiple threads. This enables the same program to execute in-core, out-of-core, serial,more » parallel, single-threaded, multithreaded, or combinations thereof. This paper describes the implementation of the main features of the DIY2 programming model and optimizations to improve performance. DIY2 is evaluated on benchmark test cases to establish baseline performance for several common patterns and on larger complete analysis codes running on large-scale HPC machines.« less
MaMR: High-performance MapReduce programming model for material cloud applications
NASA Astrophysics Data System (ADS)
Jing, Weipeng; Tong, Danyu; Wang, Yangang; Wang, Jingyuan; Liu, Yaqiu; Zhao, Peng
2017-02-01
With the increasing data size in materials science, existing programming models no longer satisfy the application requirements. MapReduce is a programming model that enables the easy development of scalable parallel applications to process big data on cloud computing systems. However, this model does not directly support the processing of multiple related data, and the processing performance does not reflect the advantages of cloud computing. To enhance the capability of workflow applications in material data processing, we defined a programming model for material cloud applications that supports multiple different Map and Reduce functions running concurrently based on hybrid share-memory BSP called MaMR. An optimized data sharing strategy to supply the shared data to the different Map and Reduce stages was also designed. We added a new merge phase to MapReduce that can efficiently merge data from the map and reduce modules. Experiments showed that the model and framework present effective performance improvements compared to previous work.
Reliability models for dataflow computer systems
NASA Technical Reports Server (NTRS)
Kavi, K. M.; Buckles, B. P.
1985-01-01
The demands for concurrent operation within a computer system and the representation of parallelism in programming languages have yielded a new form of program representation known as data flow (DENN 74, DENN 75, TREL 82a). A new model based on data flow principles for parallel computations and parallel computer systems is presented. Necessary conditions for liveness and deadlock freeness in data flow graphs are derived. The data flow graph is used as a model to represent asynchronous concurrent computer architectures including data flow computers.
Parallel Application Performance on Two Generations of Intel Xeon HPC Platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Christopher H.; Long, Hai; Sides, Scott
2015-10-15
Two next-generation node configurations hosting the Haswell microarchitecture were tested with a suite of microbenchmarks and application examples, and compared with a current Ivy Bridge production node on NREL" tm s Peregrine high-performance computing cluster. A primary conclusion from this study is that the additional cores are of little value to individual task performance--limitations to application parallelism, or resource contention among concurrently running but independent tasks, limits effective utilization of these added cores. Hyperthreading generally impacts throughput negatively, but can improve performance in the absence of detailed attention to runtime workflow configuration. The observations offer some guidance to procurement ofmore » future HPC systems at NREL. First, raw core count must be balanced with available resources, particularly memory bandwidth. Balance-of-system will determine value more than processor capability alone. Second, hyperthreading continues to be largely irrelevant to the workloads that are commonly seen, and were tested here, at NREL. Finally, perhaps the most impactful enhancement to productivity might occur through enabling multiple concurrent jobs per node. Given the right type and size of workload, more may be achieved by doing many slow things at once, than fast things in order.« less
Accelerating Wright–Fisher Forward Simulations on the Graphics Processing Unit
Lawrie, David S.
2017-01-01
Forward Wright–Fisher simulations are powerful in their ability to model complex demography and selection scenarios, but suffer from slow execution on the Central Processor Unit (CPU), thus limiting their usefulness. However, the single-locus Wright–Fisher forward algorithm is exceedingly parallelizable, with many steps that are so-called “embarrassingly parallel,” consisting of a vast number of individual computations that are all independent of each other and thus capable of being performed concurrently. The rise of modern Graphics Processing Units (GPUs) and programming languages designed to leverage the inherent parallel nature of these processors have allowed researchers to dramatically speed up many programs that have such high arithmetic intensity and intrinsic concurrency. The presented GPU Optimized Wright–Fisher simulation, or “GO Fish” for short, can be used to simulate arbitrary selection and demographic scenarios while running over 250-fold faster than its serial counterpart on the CPU. Even modest GPU hardware can achieve an impressive speedup of over two orders of magnitude. With simulations so accelerated, one can not only do quick parametric bootstrapping of previously estimated parameters, but also use simulated results to calculate the likelihoods and summary statistics of demographic and selection models against real polymorphism data, all without restricting the demographic and selection scenarios that can be modeled or requiring approximations to the single-locus forward algorithm for efficiency. Further, as many of the parallel programming techniques used in this simulation can be applied to other computationally intensive algorithms important in population genetics, GO Fish serves as an exciting template for future research into accelerating computation in evolution. GO Fish is part of the Parallel PopGen Package available at: http://dl42.github.io/ParallelPopGen/. PMID:28768689
Baker, Nancy A; Cook, James R; Redfern, Mark S
2009-01-01
This paper describes the inter-rater and intra-rater reliability, and the concurrent validity of an observational instrument, the Keyboard Personal Computer Style instrument (K-PeCS), which assesses stereotypical postures and movements associated with computer keyboard use. Three trained raters independently rated the video clips of 45 computer keyboard users to ascertain inter-rater reliability, and then re-rated a sub-sample of 15 video clips to ascertain intra-rater reliability. Concurrent validity was assessed by comparing the ratings obtained using the K-PeCS to scores developed from a 3D motion analysis system. The overall K-PeCS had excellent reliability [inter-rater: intra-class correlation coefficients (ICC)=.90; intra-rater: ICC=.92]. Most individual items on the K-PeCS had from good to excellent reliability, although six items fell below ICC=.75. Those K-PeCS items that were assessed for concurrent validity compared favorably to the motion analysis data for all but two items. These results suggest that most items on the K-PeCS can be used to reliably document computer keyboarding style.
49 CFR 1152.28 - Public use procedures.
Code of Federal Regulations, 2010 CFR
2010-10-01
... for Public Use, and Trail Use § 1152.28 Public use procedures. (a)(1) If the Board finds that the... for public purposes. This period will run concurrently with any other postponements. Jurisdiction to...
49 CFR 1152.28 - Public use procedures.
Code of Federal Regulations, 2011 CFR
2011-10-01
... for Public Use, and Trail Use § 1152.28 Public use procedures. (a)(1) If the Board finds that the... for public purposes. This period will run concurrently with any other postponements. Jurisdiction to...
47 CFR 74.15 - Station license period.
Code of Federal Regulations, 2011 CFR
2011-10-01
... broadcast booster station or a TV broadcast booster station will be issued for a period running concurrently... broadcast station, FM translator or FM broadcast booster, TV translator or TV broadcast booster, or low...
47 CFR 74.15 - Station license period.
Code of Federal Regulations, 2012 CFR
2012-10-01
... broadcast booster station or a TV broadcast booster station will be issued for a period running concurrently... broadcast station, FM translator or FM broadcast booster, TV translator or TV broadcast booster, or low...
The Use of a UNIX-Based Workstation in the Information Systems Laboratory
1989-03-01
system. The conclusions of the research and the resulting recommendations are presented in Chapter III. These recommendations include how to manage...required to run the program on a new system, these should not be significant changes. 2. Processing Environment The UNIX processing environment is...interactive with multi-tasking and multi-user capabilities. Multi-tasking refers to the fact that many programs can be run concurrently. This capability
Ferraro Petrillo, Umberto; Roscigno, Gianluca; Cattaneo, Giuseppe; Giancarlo, Raffaele
2018-06-01
Information theoretic and compositional/linguistic analysis of genomes have a central role in bioinformatics, even more so since the associated methodologies are becoming very valuable also for epigenomic and meta-genomic studies. The kernel of those methods is based on the collection of k-mer statistics, i.e. how many times each k-mer in {A,C,G,T}k occurs in a DNA sequence. Although this problem is computationally very simple and efficiently solvable on a conventional computer, the sheer amount of data available now in applications demands to resort to parallel and distributed computing. Indeed, those type of algorithms have been developed to collect k-mer statistics in the realm of genome assembly. However, they are so specialized to this domain that they do not extend easily to the computation of informational and linguistic indices, concurrently on sets of genomes. Following the well-established approach in many disciplines, and with a growing success also in bioinformatics, to resort to MapReduce and Hadoop to deal with 'Big Data' problems, we present KCH, the first set of MapReduce algorithms able to perform concurrently informational and linguistic analysis of large collections of genomic sequences on a Hadoop cluster. The benchmarking of KCH that we provide indicates that it is quite effective and versatile. It is also competitive with respect to the parallel and distributed algorithms highly specialized to k-mer statistics collection for genome assembly problems. In conclusion, KCH is a much needed addition to the growing number of algorithms and tools that use MapReduce for bioinformatics core applications. The software, including instructions for running it over Amazon AWS, as well as the datasets are available at http://www.di-srv.unisa.it/KCH. umberto.ferraro@uniroma1.it. Supplementary data are available at Bioinformatics online.
Certification of computational results
NASA Technical Reports Server (NTRS)
Sullivan, Gregory F.; Wilson, Dwight S.; Masson, Gerald M.
1993-01-01
A conceptually novel and powerful technique to achieve fault detection and fault tolerance in hardware and software systems is described. When used for software fault detection, this new technique uses time and software redundancy and can be outlined as follows. In the initial phase, a program is run to solve a problem and store the result. In addition, this program leaves behind a trail of data called a certification trail. In the second phase, another program is run which solves the original problem again. This program, however, has access to the certification trail left by the first program. Because of the availability of the certification trail, the second phase can be performed by a less complex program and can execute more quickly. In the final phase, the two results are compared and if they agree the results are accepted as correct; otherwise an error is indicated. An essential aspect of this approach is that the second program must always generate either an error indication or a correct output even when the certification trail it receives from the first program is incorrect. The certification trail approach to fault tolerance is formalized and realizations of it are illustrated by considering algorithms for the following problems: convex hull, sorting, and shortest path. Cases in which the second phase can be run concurrently with the first and act as a monitor are discussed. The certification trail approach are compared to other approaches to fault tolerance.
PC-CUBE: A Personal Computer Based Hypercube
NASA Technical Reports Server (NTRS)
Ho, Alex; Fox, Geoffrey; Walker, David; Snyder, Scott; Chang, Douglas; Chen, Stanley; Breaden, Matt; Cole, Terry
1988-01-01
PC-CUBE is an ensemble of IBM PCs or close compatibles connected in the hypercube topology with ordinary computer cables. Communication occurs at the rate of 115.2 K-band via the RS-232 serial links. Available for PC-CUBE is the Crystalline Operating System III (CrOS III), Mercury Operating System, CUBIX and PLOTIX which are parallel I/O and graphics libraries. A CrOS performance monitor was developed to facilitate the measurement of communication and computation time of a program and their effects on performance. Also available are CXLISP, a parallel version of the XLISP interpreter; GRAFIX, some graphics routines for the EGA and CGA; and a general execution profiler for determining execution time spent by program subroutines. PC-CUBE provides a programming environment similar to all hypercube systems running CrOS III, Mercury and CUBIX. In addition, every node (personal computer) has its own graphics display monitor and storage devices. These allow data to be displayed or stored at every processor, which has much instructional value and enables easier debugging of applications. Some application programs which are taken from the book Solving Problems on Concurrent Processors (Fox 88) were implemented with graphics enhancement on PC-CUBE. The applications range from solving the Mandelbrot set, Laplace equation, wave equation, long range force interaction, to WaTor, an ecological simulation.
Wheel-running attenuates intravenous cocaine self-administration in rats: sex differences.
Cosgrove, Kelly P; Hunter, Robb G; Carroll, Marilyn E
2002-10-01
This experiment examines the effect of access to a running-wheel on intravenous cocaine self-administration in male and female rats. Rats maintained at 85% of their free-feeding body weight were first exposed to the running-wheel alone during the 6-h sessions until behavior stabilized for 14 days. Intravenous cannulae were then implanted, and the rats were trained to self-administer a low dose of cocaine (0.2 mg/kg) under a fixed-ratio (FR 1) schedule during the 6-h sessions, while the wheel remained inactive and cocaine self-administration stabilized (cocaine-only condition). Next, the wheel access and cocaine self-administration were concurrently available followed by a period of cocaine-only. Behavior was allowed to stabilize for 10 days at each phase. During wheel access, cocaine infusions decreased by 21.9% in males and 70.6% in females compared to the cocaine-only condition; the effect was statistically significant in females. Infusions increased to baseline levels when wheel access was terminated. When cocaine infusions were concurrently available, wheel revolutions were reduced by 63.7% and 61.5% in males and females, respectively, compared to the wheel-only condition. This result did not differ due to sex, but it was statistically significant when data from males and females were combined. These results indicate that wheel-running activity had a greater suppressant effect on cocaine self-administration in females than in males, and in females, wheel-running and cocaine self-administration are substitutable as reinforcers.
Actors: A Model of Concurrent Computation in Distributed Systems.
1985-06-01
Artificial Intelligence Labora- tory of the Massachusetts Institute of Technology. Support for the labora- tory’s aritificial intelligence research is...RD-A157 917 ACTORS: A MODEL OF CONCURRENT COMPUTATION IN 1/3- DISTRIBUTED SYTEMS(U) MASSACHUSETTS INST OF TECH CRMBRIDGE ARTIFICIAL INTELLIGENCE ...Computation In Distributed Systems Gui A. Aghai MIT Artificial Intelligence Laboratory Thsdocument ha. been cipp-oved I= pblicrelease and sale; itsI
HammerCloud: A Stress Testing System for Distributed Analysis
NASA Astrophysics Data System (ADS)
van der Ster, Daniel C.; Elmsheuser, Johannes; Úbeda García, Mario; Paladin, Massimo
2011-12-01
Distributed analysis of LHC data is an I/O-intensive activity which places large demands on the internal network, storage, and local disks at remote computing facilities. Commissioning and maintaining a site to provide an efficient distributed analysis service is therefore a challenge which can be aided by tools to help evaluate a variety of infrastructure designs and configurations. HammerCloud is one such tool; it is a stress testing service which is used by central operations teams, regional coordinators, and local site admins to (a) submit arbitrary number of analysis jobs to a number of sites, (b) maintain at a steady-state a predefined number of jobs running at the sites under test, (c) produce web-based reports summarizing the efficiency and performance of the sites under test, and (d) present a web-interface for historical test results to both evaluate progress and compare sites. HammerCloud was built around the distributed analysis framework Ganga, exploiting its API for grid job management. HammerCloud has been employed by the ATLAS experiment for continuous testing of many sites worldwide, and also during large scale computing challenges such as STEP'09 and UAT'09, where the scale of the tests exceeded 10,000 concurrently running and 1,000,000 total jobs over multi-day periods. In addition, HammerCloud is being adopted by the CMS experiment; the plugin structure of HammerCloud allows the execution of CMS jobs using their official tool (CRAB).
A new parallelization scheme for adaptive mesh refinement
Loffler, Frank; Cao, Zhoujian; Brandt, Steven R.; ...
2016-05-06
Here, we present a new method for parallelization of adaptive mesh refinement called Concurrent Structured Adaptive Mesh Refinement (CSAMR). This new method offers the lower computational cost (i.e. wall time x processor count) of subcycling in time, but with the runtime performance (i.e. smaller wall time) of evolving all levels at once using the time step of the finest level (which does more work than subcycling but has less parallelism). We demonstrate our algorithm's effectiveness using an adaptive mesh refinement code, AMSS-NCKU, and show performance on Blue Waters and other high performance clusters. For the class of problem considered inmore » this paper, our algorithm achieves a speedup of 1.7-1.9 when the processor count for a given AMR run is doubled, consistent with our theoretical predictions.« less
A new parallelization scheme for adaptive mesh refinement
DOE Office of Scientific and Technical Information (OSTI.GOV)
Loffler, Frank; Cao, Zhoujian; Brandt, Steven R.
Here, we present a new method for parallelization of adaptive mesh refinement called Concurrent Structured Adaptive Mesh Refinement (CSAMR). This new method offers the lower computational cost (i.e. wall time x processor count) of subcycling in time, but with the runtime performance (i.e. smaller wall time) of evolving all levels at once using the time step of the finest level (which does more work than subcycling but has less parallelism). We demonstrate our algorithm's effectiveness using an adaptive mesh refinement code, AMSS-NCKU, and show performance on Blue Waters and other high performance clusters. For the class of problem considered inmore » this paper, our algorithm achieves a speedup of 1.7-1.9 when the processor count for a given AMR run is doubled, consistent with our theoretical predictions.« less
Optimizing legacy molecular dynamics software with directive-based offload
Michael Brown, W.; Carrillo, Jan-Michael Y.; Gavhane, Nitin; ...
2015-05-14
The directive-based programming models are one solution for exploiting many-core coprocessors to increase simulation rates in molecular dynamics. They offer the potential to reduce code complexity with offload models that can selectively target computations to run on the CPU, the coprocessor, or both. In our paper, we describe modifications to the LAMMPS molecular dynamics code to enable concurrent calculations on a CPU and coprocessor. We also demonstrate that standard molecular dynamics algorithms can run efficiently on both the CPU and an x86-based coprocessor using the same subroutines. As a consequence, we demonstrate that code optimizations for the coprocessor also resultmore » in speedups on the CPU; in extreme cases up to 4.7X. We provide results for LAMMAS benchmarks and for production molecular dynamics simulations using the Stampede hybrid supercomputer with both Intel (R) Xeon Phi (TM) coprocessors and NVIDIA GPUs: The optimizations presented have increased simulation rates by over 2X for organic molecules and over 7X for liquid crystals on Stampede. The optimizations are available as part of the "Intel package" supplied with LAMMPS. (C) 2015 Elsevier B.V. All rights reserved.« less
Advanced technologies for scalable ATLAS conditions database access on the grid
NASA Astrophysics Data System (ADS)
Basset, R.; Canali, L.; Dimitrov, G.; Girone, M.; Hawkings, R.; Nevski, P.; Valassi, A.; Vaniachine, A.; Viegas, F.; Walker, R.; Wong, A.
2010-04-01
During massive data reprocessing operations an ATLAS Conditions Database application must support concurrent access from numerous ATLAS data processing jobs running on the Grid. By simulating realistic work-flow, ATLAS database scalability tests provided feedback for Conditions Db software optimization and allowed precise determination of required distributed database resources. In distributed data processing one must take into account the chaotic nature of Grid computing characterized by peak loads, which can be much higher than average access rates. To validate database performance at peak loads, we tested database scalability at very high concurrent jobs rates. This has been achieved through coordinated database stress tests performed in series of ATLAS reprocessing exercises at the Tier-1 sites. The goal of database stress tests is to detect scalability limits of the hardware deployed at the Tier-1 sites, so that the server overload conditions can be safely avoided in a production environment. Our analysis of server performance under stress tests indicates that Conditions Db data access is limited by the disk I/O throughput. An unacceptable side-effect of the disk I/O saturation is a degradation of the WLCG 3D Services that update Conditions Db data at all ten ATLAS Tier-1 sites using the technology of Oracle Streams. To avoid such bottlenecks we prototyped and tested a novel approach for database peak load avoidance in Grid computing. Our approach is based upon the proven idea of pilot job submission on the Grid: instead of the actual query, an ATLAS utility library sends to the database server a pilot query first.
Karsina, Allen; Thompson, Rachel H; Rodriguez, Nicole M; Vanselow, Nicholas R
2012-01-01
We evaluated the effects of differential reinforcement and accurate verbal rules with feedback on the preference for choice and the verbal reports of 6 adults. Participants earned points on a probabilistic schedule by completing the terminal links of a concurrent-chains arrangement in a computer-based game of chance. In free-choice terminal links, participants selected 3 numbers from an 8-number array; in restricted-choice terminal links participants selected the order of 3 numbers preselected by a computer program. A pop-up box then informed the participants if the numbers they selected or ordered matched or did not match numbers generated by the computer but not displayed; matching in a trial resulted in one point earned. In baseline sessions, schedules of reinforcement were equal across free- and restricted-choice arrangements and a running tally of points earned was shown each trial. The effects of differentially reinforcing restricted-choice selections were evaluated using a reversal design. For 4 participants, the effects of providing a running tally of points won by arrangement and verbal rules regarding the schedule of reinforcement were also evaluated using a nonconcurrent multiple-baseline-across-participants design. Results varied across participants but generally demonstrated that (a) preference for choice corresponded more closely to verbal reports of the odds of winning than to reinforcement schedules, (b) rules and feedback were correlated with more accurate verbal reports, and (c) preference for choice corresponded more highly to the relative number of reinforcements obtained across free- and restricted-choice arrangements in a session than to the obtained probability of reinforcement or to verbal reports of the odds of winning. PMID:22754103
Karsina, Allen; Thompson, Rachel H; Rodriguez, Nicole M; Vanselow, Nicholas R
2012-01-01
We evaluated the effects of differential reinforcement and accurate verbal rules with feedback on the preference for choice and the verbal reports of 6 adults. Participants earned points on a probabilistic schedule by completing the terminal links of a concurrent-chains arrangement in a computer-based game of chance. In free-choice terminal links, participants selected 3 numbers from an 8-number array; in restricted-choice terminal links participants selected the order of 3 numbers preselected by a computer program. A pop-up box then informed the participants if the numbers they selected or ordered matched or did not match numbers generated by the computer but not displayed; matching in a trial resulted in one point earned. In baseline sessions, schedules of reinforcement were equal across free- and restricted-choice arrangements and a running tally of points earned was shown each trial. The effects of differentially reinforcing restricted-choice selections were evaluated using a reversal design. For 4 participants, the effects of providing a running tally of points won by arrangement and verbal rules regarding the schedule of reinforcement were also evaluated using a nonconcurrent multiple-baseline-across-participants design. Results varied across participants but generally demonstrated that (a) preference for choice corresponded more closely to verbal reports of the odds of winning than to reinforcement schedules, (b) rules and feedback were correlated with more accurate verbal reports, and (c) preference for choice corresponded more highly to the relative number of reinforcements obtained across free- and restricted-choice arrangements in a session than to the obtained probability of reinforcement or to verbal reports of the odds of winning.
Fast prediction of RNA-RNA interaction using heuristic algorithm.
Montaseri, Soheila
2015-01-01
Interaction between two RNA molecules plays a crucial role in many medical and biological processes such as gene expression regulation. In this process, an RNA molecule prohibits the translation of another RNA molecule by establishing stable interactions with it. Some algorithms have been formed to predict the structure of the RNA-RNA interaction. High computational time is a common challenge in most of the presented algorithms. In this context, a heuristic method is introduced to accurately predict the interaction between two RNAs based on minimum free energy (MFE). This algorithm uses a few dot matrices for finding the secondary structure of each RNA and binding sites between two RNAs. Furthermore, a parallel version of this method is presented. We describe the algorithm's concurrency and parallelism for a multicore chip. The proposed algorithm has been performed on some datasets including CopA-CopT, R1inv-R2inv, Tar-Tar*, DIS-DIS, and IncRNA54-RepZ in Escherichia coli bacteria. The method has high validity and efficiency, and it is run in low computational time in comparison to other approaches.
NASA Astrophysics Data System (ADS)
Rowe, Robert
2002-05-01
The training of musicians begins by teaching basic musical concepts, a collection of knowledge commonly known as musicianship. Computer programs designed to implement musical skills (e.g., to make sense of what they hear, perform music expressively, or compose convincing pieces) can similarly benefit from access to a fundamental level of musicianship. Recent research in music cognition, artificial intelligence, and music theory has produced a repertoire of techniques that can make the behavior of computer programs more musical. Many of these were presented in a recently published book/CD-ROM entitled Machine Musicianship. For use in interactive music systems, we are interested in those which are fast enough to run in real time and that need only make reference to the material as it appears in sequence. This talk will review several applications that are able to identify the tonal center of musical material during performance. Beyond this specific task, the design of real-time algorithmic listening through the concurrent operation of several connected analyzers is examined. The presentation includes discussion of a library of C++ objects that can be combined to perform interactive listening and a demonstration of their capability.
Study of Solid State Drives performance in PROOF distributed analysis system
NASA Astrophysics Data System (ADS)
Panitkin, S. Y.; Ernst, M.; Petkus, R.; Rind, O.; Wenaus, T.
2010-04-01
Solid State Drives (SSD) is a promising storage technology for High Energy Physics parallel analysis farms. Its combination of low random access time and relatively high read speed is very well suited for situations where multiple jobs concurrently access data located on the same drive. It also has lower energy consumption and higher vibration tolerance than Hard Disk Drive (HDD) which makes it an attractive choice in many applications raging from personal laptops to large analysis farms. The Parallel ROOT Facility - PROOF is a distributed analysis system which allows to exploit inherent event level parallelism of high energy physics data. PROOF is especially efficient together with distributed local storage systems like Xrootd, when data are distributed over computing nodes. In such an architecture the local disk subsystem I/O performance becomes a critical factor, especially when computing nodes use multi-core CPUs. We will discuss our experience with SSDs in PROOF environment. We will compare performance of HDD with SSD in I/O intensive analysis scenarios. In particular we will discuss PROOF system performance scaling with a number of simultaneously running analysis jobs.
Multi-jagged: A scalable parallel spatial partitioning algorithm
Deveci, Mehmet; Rajamanickam, Sivasankaran; Devine, Karen D.; ...
2015-03-18
Geometric partitioning is fast and effective for load-balancing dynamic applications, particularly those requiring geometric locality of data (particle methods, crash simulations). We present, to our knowledge, the first parallel implementation of a multidimensional-jagged geometric partitioner. In contrast to the traditional recursive coordinate bisection algorithm (RCB), which recursively bisects subdomains perpendicular to their longest dimension until the desired number of parts is obtained, our algorithm does recursive multi-section with a given number of parts in each dimension. By computing multiple cut lines concurrently and intelligently deciding when to migrate data while computing the partition, we minimize data movement compared to efficientmore » implementations of recursive bisection. We demonstrate the algorithm's scalability and quality relative to the RCB implementation in Zoltan on both real and synthetic datasets. Our experiments show that the proposed algorithm performs and scales better than RCB in terms of run-time without degrading the load balance. Lastly, our implementation partitions 24 billion points into 65,536 parts within a few seconds and exhibits near perfect weak scaling up to 6K cores.« less
Marshall, Harry H; Griffiths, David J; Mwanguhya, Francis; Businge, Robert; Griffiths, Amber G F; Kyabulima, Solomon; Mwesige, Kenneth; Sanderson, Jennifer L; Thompson, Faye J; Vitikainen, Emma I K; Cant, Michael A
2018-01-01
Studying ecological and evolutionary processes in the natural world often requires research projects to follow multiple individuals in the wild over many years. These projects have provided significant advances but may also be hampered by needing to accurately and efficiently collect and store multiple streams of the data from multiple individuals concurrently. The increase in the availability and sophistication of portable computers (smartphones and tablets) and the applications that run on them has the potential to address many of these data collection and storage issues. In this paper we describe the challenges faced by one such long-term, individual-based research project: the Banded Mongoose Research Project in Uganda. We describe a system we have developed called Mongoose 2000 that utilises the potential of apps and portable computers to meet these challenges. We discuss the benefits and limitations of employing such a system in a long-term research project. The app and source code for the Mongoose 2000 system are freely available and we detail how it might be used to aid data collection and storage in other long-term individual-based projects.
Artificial Intelligence (AI) Based Tactical Guidance for Fighter Aircraft
NASA Technical Reports Server (NTRS)
McManus, John W.; Goodrich, Kenneth H.
1990-01-01
A research program investigating the use of Artificial Intelligence (AI) techniques to aid in the development of a Tactical Decision Generator (TDG) for Within Visual Range (WVR) air combat engagements is discussed. The application of AI programming and problem solving methods in the development and implementation of the Computerized Logic For Air-to-Air Warfare Simulations (CLAWS), a second generation TDG, is presented. The Knowledge-Based Systems used by CLAWS to aid in the tactical decision-making process are outlined in detail, and the results of tests to evaluate the performance of CLAWS versus a baseline TDG developed in FORTRAN to run in real-time in the Langley Differential Maneuvering Simulator (DMS), are presented. To date, these test results have shown significant performance gains with respect to the TDG baseline in one-versus-one air combat engagements, and the AI-based TDG software has proven to be much easier to modify and maintain than the baseline FORTRAN TDG programs. Alternate computing environments and programming approaches, including the use of parallel algorithms and heterogeneous computer networks are discussed, and the design and performance of a prototype concurrent TDG system are presented.
Another Program For Generating Interactive Graphics
NASA Technical Reports Server (NTRS)
Costenbader, Jay; Moleski, Walt; Szczur, Martha; Howell, David; Engelberg, Norm; Li, Tin P.; Misra, Dharitri; Miller, Philip; Neve, Leif; Wolf, Karl;
1991-01-01
VAX/Ultrix version of Transportable Applications Environment Plus (TAE+) computer program provides integrated, portable software environment for developing and running interactive window, text, and graphical-object-based application software systems. Enables programmer or nonprogrammer to construct easily custom software interface between user and application program and to move resulting interface program and its application program to different computers. When used throughout company for wide range of applications, makes both application program and computer seem transparent, with noticeable improvements in learning curve. Available in form suitable for following six different groups of computers: DEC VAX station and other VMS VAX computers, Macintosh II computers running AUX, Apollo Domain Series 3000, DEC VAX and reduced-instruction-set-computer workstations running Ultrix, Sun 3- and 4-series workstations running Sun OS and IBM RT/PC's and PS/2 computers running AIX, and HP 9000 S
Size and emotion averaging: costs of dividing attention after all.
Brand, John; Oriet, Chris; Tottenham, Laurie Sykes
2012-03-01
Perceptual averaging is a process by which sets of similar items are represented by summary statistics such as their average size, luminance, or orientation. Researchers have argued that this process is automatic, able to be carried out without interference from concurrent processing. Here, we challenge this conclusion and demonstrate a reliable cost of computing the mean size of circles distinguished by colour (Experiments 1 and 2) and the mean emotionality of faces distinguished by sex (Experiment 3). We also test the viability of two strategies that could have allowed observers to guess the correct response without computing the average size or emotionality of both sets concurrently. We conclude that although two means can be computed concurrently, doing so incurs a cost of dividing attention.
The VLBA correlator: Real-time in the distributed era
NASA Technical Reports Server (NTRS)
Wells, D. C.
1992-01-01
The correlator is the signal processing engine of the Very Long Baseline Array (VLBA). Radio signals are recorded on special wideband (128 Mb/s) digital recorders at the 10 telescopes, with sampling times controlled by hydrogen maser clocks. The magnetic tapes are shipped to the Array Operations Center in Socorro, New Mexico, where they are played back simultaneously into the correlator. Real-time software and firmware controls the playback drives to achieve synchronization, compute models of the wavefront delay, control the numerous modules of the correlator, and record FITS files of the fringe visibilities at the back-end of the correlator. In addition to the more than 3000 custom VLSI chips which handle the massive data flow of the signal processing, the correlator contains a total of more than 100 programmable computers, 8-, 16- and 32-bit CPUs. Code is downloaded into front-end CPU's dependent on operating mode. Low-level code is assembly language, high-level code is C running under a RT OS. We use VxWorks on Motorola MVME147 CPU's. Code development is on a complex of SPARC workstations connected to the RT CPU's by Ethernet. The overall management of the correlation process is dependent on a database management system. We use Ingres running on a Sparcstation-2. We transfer logging information from the database of the VLBA Monitor and Control System to our database using Ingres/NET. Job scripts are computed and are transferred to the real-time computers using NFS, and correlation job execution logs and status flow back by the route. Operator status and control displays use windows on workstations, interfaced to the real-time processes by network protocols. The extensive network protocol support provided by VxWorks is invaluable. The VLBA Correlator's dependence on network protocols is an example of the radical transformation of the real-time world over the past five years. Real-time is becoming more like conventional computing. Paradoxically, 'conventional' computing is also adopting practices from the real-time world: semaphores, shared memory, light-weight threads, and concurrency. This appears to be a convergence of thinking.
PanDA for ATLAS distributed computing in the next decade
NASA Astrophysics Data System (ADS)
Barreiro Megino, F. H.; De, K.; Klimentov, A.; Maeno, T.; Nilsson, P.; Oleynik, D.; Padolski, S.; Panitkin, S.; Wenaus, T.; ATLAS Collaboration
2017-10-01
The Production and Distributed Analysis (PanDA) system has been developed to meet ATLAS production and analysis requirements for a data-driven workload management system capable of operating at the Large Hadron Collider (LHC) data processing scale. Heterogeneous resources used by the ATLAS experiment are distributed worldwide at hundreds of sites, thousands of physicists analyse the data remotely, the volume of processed data is beyond the exabyte scale, dozens of scientific applications are supported, while data processing requires more than a few billion hours of computing usage per year. PanDA performed very well over the last decade including the LHC Run 1 data taking period. However, it was decided to upgrade the whole system concurrently with the LHC’s first long shutdown in order to cope with rapidly changing computing infrastructure. After two years of reengineering efforts, PanDA has embedded capabilities for fully dynamic and flexible workload management. The static batch job paradigm was discarded in favor of a more automated and scalable model. Workloads are dynamically tailored for optimal usage of resources, with the brokerage taking network traffic and forecasts into account. Computing resources are partitioned based on dynamic knowledge of their status and characteristics. The pilot has been re-factored around a plugin structure for easier development and deployment. Bookkeeping is handled with both coarse and fine granularities for efficient utilization of pledged or opportunistic resources. An in-house security mechanism authenticates the pilot and data management services in off-grid environments such as volunteer computing and private local clusters. The PanDA monitor has been extensively optimized for performance and extended with analytics to provide aggregated summaries of the system as well as drill-down to operational details. There are as well many other challenges planned or recently implemented, and adoption by non-LHC experiments such as bioinformatics groups successfully running Paleomix (microbial genome and metagenomes) payload on supercomputers. In this paper we will focus on the new and planned features that are most important to the next decade of distributed computing workload management.
AthenaMT: upgrading the ATLAS software framework for the many-core world with multi-threading
NASA Astrophysics Data System (ADS)
Leggett, Charles; Baines, John; Bold, Tomasz; Calafiura, Paolo; Farrell, Steven; van Gemmeren, Peter; Malon, David; Ritsch, Elmar; Stewart, Graeme; Snyder, Scott; Tsulaia, Vakhtang; Wynne, Benjamin; ATLAS Collaboration
2017-10-01
ATLAS’s current software framework, Gaudi/Athena, has been very successful for the experiment in LHC Runs 1 and 2. However, its single threaded design has been recognized for some time to be increasingly problematic as CPUs have increased core counts and decreased available memory per core. Even the multi-process version of Athena, AthenaMP, will not scale to the range of architectures we expect to use beyond Run2. After concluding a rigorous requirements phase, where many design components were examined in detail, ATLAS has begun the migration to a new data-flow driven, multi-threaded framework, which enables the simultaneous processing of singleton, thread unsafe legacy Algorithms, cloned Algorithms that execute concurrently in their own threads with different Event contexts, and fully re-entrant, thread safe Algorithms. In this paper we report on the process of modifying the framework to safely process multiple concurrent events in different threads, which entails significant changes in the underlying handling of features such as event and time dependent data, asynchronous callbacks, metadata, integration with the online High Level Trigger for partial processing in certain regions of interest, concurrent I/O, as well as ensuring thread safety of core services. We also report on upgrading the framework to handle Algorithms that are fully re-entrant.
Pressure Ratio to Thermal Environments
NASA Technical Reports Server (NTRS)
Lopez, Pedro; Wang, Winston
2012-01-01
A pressure ratio to thermal environments (PRatTlE.pl) program is a Perl language code that estimates heating at requested body point locations by scaling the heating at a reference location times a pressure ratio factor. The pressure ratio factor is the ratio of the local pressure at the reference point and the requested point from CFD (computational fluid dynamics) solutions. This innovation provides pressure ratio-based thermal environments in an automated and traceable method. Previously, the pressure ratio methodology was implemented via a Microsoft Excel spreadsheet and macro scripts. PRatTlE is able to calculate heating environments for 150 body points in less than two minutes. PRatTlE is coded in Perl programming language, is command-line-driven, and has been successfully executed on both the HP and Linux platforms. It supports multiple concurrent runs. PRatTlE contains error trapping and input file format verification, which allows clear visibility into the input data structure and intermediate calculations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garzoglio, Gabriele
The Fermilab Grid and Cloud Computing Department and the KISTI Global Science experimental Data hub Center are working on a multi-year Collaborative Research and Development Agreement.With the knowledge developed in the first year on how to provision and manage a federation of virtual machines through Cloud management systems. In this second year, we expanded the work on provisioning and federation, increasing both scale and diversity of solutions, and we started to build on-demand services on the established fabric, introducing the paradigm of Platform as a Service to assist with the execution of scientific workflows. We have enabled scientific workflows ofmore » stakeholders to run on multiple cloud resources at the scale of 1,000 concurrent machines. The demonstrations have been in the areas of (a) Virtual Infrastructure Automation and Provisioning, (b) Interoperability and Federation of Cloud Resources, and (c) On-demand Services for ScientificWorkflows.« less
Challenges and Opportunities in Interdisciplinary Materials Research Experiences for Undergraduates
NASA Astrophysics Data System (ADS)
Vohra, Yogesh; Nordlund, Thomas
2009-03-01
The University of Alabama at Birmingham (UAB) offer a broad range of interdisciplinary materials research experiences to undergraduate students with diverse backgrounds in physics, chemistry, applied mathematics, and engineering. The research projects offered cover a broad range of topics including high pressure physics, microelectronic materials, nano-materials, laser materials, bioceramics and biopolymers, cell-biomaterials interactions, planetary materials, and computer simulation of materials. The students welcome the opportunity to work with an interdisciplinary team of basic science, engineering, and biomedical faculty but the challenge is in learning the key vocabulary for interdisciplinary collaborations, experimental tools, and working in an independent capacity. The career development workshops dealing with the graduate school application process and the entrepreneurial business activities were found to be most effective. The interdisciplinary university wide poster session helped student broaden their horizons in research careers. The synergy of the REU program with other concurrently running high school summer programs on UAB campus will also be discussed.
An efficient parallel termination detection algorithm
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, A. H.; Crivelli, S.; Jessup, E. R.
2004-05-27
Information local to any one processor is insufficient to monitor the overall progress of most distributed computations. Typically, a second distributed computation for detecting termination of the main computation is necessary. In order to be a useful computational tool, the termination detection routine must operate concurrently with the main computation, adding minimal overhead, and it must promptly and correctly detect termination when it occurs. In this paper, we present a new algorithm for detecting the termination of a parallel computation on distributed-memory MIMD computers that satisfies all of those criteria. A variety of termination detection algorithms have been devised. Ofmore » these, the algorithm presented by Sinha, Kale, and Ramkumar (henceforth, the SKR algorithm) is unique in its ability to adapt to the load conditions of the system on which it runs, thereby minimizing the impact of termination detection on performance. Because their algorithm also detects termination quickly, we consider it to be the most efficient practical algorithm presently available. The termination detection algorithm presented here was developed for use in the PMESC programming library for distributed-memory MIMD computers. Like the SKR algorithm, our algorithm adapts to system loads and imposes little overhead. Also like the SKR algorithm, ours is tree-based, and it does not depend on any assumptions about the physical interconnection topology of the processors or the specifics of the distributed computation. In addition, our algorithm is easier to implement and requires only half as many tree traverses as does the SKR algorithm. This paper is organized as follows. In section 2, we define our computational model. In section 3, we review the SKR algorithm. We introduce our new algorithm in section 4, and prove its correctness in section 5. We discuss its efficiency and present experimental results in section 6.« less
ExM:System Support for Extreme-Scale, Many-Task Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Katz, Daniel S
The ever-increasing power of supercomputer systems is both driving and enabling the emergence of new problem-solving methods that require the effi cient execution of many concurrent and interacting tasks. Methodologies such as rational design (e.g., in materials science), uncertainty quanti fication (e.g., in engineering), parameter estimation (e.g., for chemical and nuclear potential functions, and in economic energy systems modeling), massive dynamic graph pruning (e.g., in phylogenetic searches), Monte-Carlo- based iterative fi xing (e.g., in protein structure prediction), and inverse modeling (e.g., in reservoir simulation) all have these requirements. These many-task applications frequently have aggregate computing needs that demand the fastestmore » computers. For example, proposed next-generation climate model ensemble studies will involve 1,000 or more runs, each requiring 10,000 cores for a week, to characterize model sensitivity to initial condition and parameter uncertainty. The goal of the ExM project is to achieve the technical advances required to execute such many-task applications efficiently, reliably, and easily on petascale and exascale computers. In this way, we will open up extreme-scale computing to new problem solving methods and application classes. In this document, we report on combined technical progress of the collaborative ExM project, and the institutional financial status of the portion of the project at University of Chicago, over the rst 8 months (through April 30, 2011)« less
Efficiency of parallel direct optimization
NASA Technical Reports Server (NTRS)
Janies, D. A.; Wheeler, W. C.
2001-01-01
Tremendous progress has been made at the level of sequential computation in phylogenetics. However, little attention has been paid to parallel computation. Parallel computing is particularly suited to phylogenetics because of the many ways large computational problems can be broken into parts that can be analyzed concurrently. In this paper, we investigate the scaling factors and efficiency of random addition and tree refinement strategies using the direct optimization software, POY, on a small (10 slave processors) and a large (256 slave processors) cluster of networked PCs running LINUX. These algorithms were tested on several data sets composed of DNA and morphology ranging from 40 to 500 taxa. Various algorithms in POY show fundamentally different properties within and between clusters. All algorithms are efficient on the small cluster for the 40-taxon data set. On the large cluster, multibuilding exhibits excellent parallel efficiency, whereas parallel building is inefficient. These results are independent of data set size. Branch swapping in parallel shows excellent speed-up for 16 slave processors on the large cluster. However, there is no appreciable speed-up for branch swapping with the further addition of slave processors (>16). This result is independent of data set size. Ratcheting in parallel is efficient with the addition of up to 32 processors in the large cluster. This result is independent of data set size. c2001 The Willi Hennig Society.
Methods for design and evaluation of integrated hardware-software systems for concurrent computation
NASA Technical Reports Server (NTRS)
Pratt, T. W.
1985-01-01
Research activities and publications are briefly summarized. The major tasks reviewed are: (1) VAX implementation of the PISCES parallel programming environment; (2) Apollo workstation network implementation of the PISCES environment; (3) FLEX implementation of the PISCES environment; (4) sparse matrix iterative solver in PSICES Fortran; (5) image processing application of PISCES; and (6) a formal model of concurrent computation being developed.
ERIC Educational Resources Information Center
Paney, Andrew S.; Kay, Ann C.
2015-01-01
The purpose of this study was to measure the effect of concurrent visual feedback on pitch-matching skill development in third-grade students. Participants played a computer game, "SingingCoach," which scored the accuracy of their singing of the song "America." They followed the contour of the melody on the screen as the…
Run-time parallelization and scheduling of loops
NASA Technical Reports Server (NTRS)
Saltz, Joel H.; Mirchandaney, Ravi; Crowley, Kay
1991-01-01
Run-time methods are studied to automatically parallelize and schedule iterations of a do loop in certain cases where compile-time information is inadequate. The methods presented involve execution time preprocessing of the loop. At compile-time, these methods set up the framework for performing a loop dependency analysis. At run-time, wavefronts of concurrently executable loop iterations are identified. Using this wavefront information, loop iterations are reordered for increased parallelism. Symbolic transformation rules are used to produce: inspector procedures that perform execution time preprocessing, and executors or transformed versions of source code loop structures. These transformed loop structures carry out the calculations planned in the inspector procedures. Performance results are presented from experiments conducted on the Encore Multimax. These results illustrate that run-time reordering of loop indexes can have a significant impact on performance.
Accelerating large-scale protein structure alignments with graphics processing units
2012-01-01
Background Large-scale protein structure alignment, an indispensable tool to structural bioinformatics, poses a tremendous challenge on computational resources. To ensure structure alignment accuracy and efficiency, efforts have been made to parallelize traditional alignment algorithms in grid environments. However, these solutions are costly and of limited accessibility. Others trade alignment quality for speedup by using high-level characteristics of structure fragments for structure comparisons. Findings We present ppsAlign, a parallel protein structure Alignment framework designed and optimized to exploit the parallelism of Graphics Processing Units (GPUs). As a general-purpose GPU platform, ppsAlign could take many concurrent methods, such as TM-align and Fr-TM-align, into the parallelized algorithm design. We evaluated ppsAlign on an NVIDIA Tesla C2050 GPU card, and compared it with existing software solutions running on an AMD dual-core CPU. We observed a 36-fold speedup over TM-align, a 65-fold speedup over Fr-TM-align, and a 40-fold speedup over MAMMOTH. Conclusions ppsAlign is a high-performance protein structure alignment tool designed to tackle the computational complexity issues from protein structural data. The solution presented in this paper allows large-scale structure comparisons to be performed using massive parallel computing power of GPU. PMID:22357132
The Weekly Fab Five: Things You Should Do Every Week To Keep Your Computer Running in Tip-Top Shape.
ERIC Educational Resources Information Center
Crispen, Patrick
2001-01-01
Describes five steps that school librarians should follow every week to keep their computers running at top efficiency. Explains how to update virus definitions; run Windows update; run ScanDisk to repair errors on the hard drive; run a disk defragmenter; and backup all data. (LRW)
NASA Astrophysics Data System (ADS)
Boardsen, S. A.; Adrian, M. L.; Pfaff, R.; Menietti, J. D.
2014-10-01
Direct measurement of low < 1 eV electron temperature is difficult to make in the Earth's inner magnetosphere for electron densities (Ne) < 3 × 102 cm-3. We compute these quantities by solving current balance equations in low-density regions. Concurrent measurements from the Polar spacecraft of the relative potential (VS - VP), between the spacecraft body and the electric field probe, and the electron density (Ne), derived from upper hybrid frequency (fUHR), were used in the current balance equations to solve for the electron temperature (Te), Vs, and Vp. Where VP is the probe potential and VS is the spacecraft potential relative to the nearby plasma. The assumption that the bulk plasma electrons are Maxwellian is used in the computations. Our data set covered 1.5 years of measurements when fUHR was detectable (L < 10). The following "averaged" Te versus L relation for 3 < L < 5 was obtained: Te = 0.58 + 0.49 (L - 3) eV. This expression is in reasonable agreement with extrapolations of ionospheric Te measurements by Akebono at lower altitudes. However, the solution is sensitive to the photoemission coefficients, substituting those of Scudder et al. (2000) with those of Escoubet et al. (1997), the Te curve shifted upward by ~1 eV. Also, the solution is sensitive to measurement error of VS - VP, applying a voltage shift of ±0.1 and ±0.2 V to VS - VP, the relative median error for our data set was computed to be 0.27 and 1.04, respectively. We believe that our Te values computed outside the plasmasphere are unrealistically low. We conclude that this method shows promise inside the plasmasphere but should be used with caution. We also quantified the Ne versus VS - VP relationship. The running median Ne versus VS - VP curve shows no significant variation over the 1.5 year period of the data set, suggesting that the photoemission coefficients did not change significantly over this time span. The Scudder et al. (2000) Ne model, based on only one Polar orbit, is in reasonable agreement (within a factor of 2) with our results.
1989-07-11
applicable because this implementation does not support temporary files with names. ag . EE2401D is inapplicable because this implementation does not...buffer. No spanned records with ASCII.NUL are output. A line terminator followed by a page terminator may be represented as: ASC::. CR ASCU :.FF ASCII.CR if
Use of statecharts in the modelling of dynamic behaviour in the ATLAS DAQ prototype-1
NASA Astrophysics Data System (ADS)
Croll, P.; Duval, P.-Y.; Jones, R.; Kolos, S.; Sari, R. F.; Wheeler, S.
1998-08-01
Many applications within the ATLAS DAQ prototype-1 system have complicated dynamic behaviour which can be successfully modelled in terms of states and transitions between states. Previously, state diagrams implemented as finite-state machines have been used. Although effective, they become ungainly as system size increases. Harel statecharts address this problem by implementing additional features such as hierarchy and concurrency. The CHSM object-oriented language system is freeware which implements Harel statecharts as concurrent, hierarchical, finite-state machines (CHSMs). An evaluation of this language system by the ATLAS DAQ group has shown it to be suitable for describing the dynamic behaviour of typical DAQ applications. The language is currently being used to model the dynamic behaviour of the prototype-1 run-control system. The design is specified by means of a CHSM description file, and C++ code is obtained by running the CHSM compiler on the file. In parallel with the modelling work, a code generator has been developed which translates statecharts, drawn using the StP CASE tool, into the CHSM language. C++ code, describing the dynamic behaviour of the run-control system, has been successfully generated directly from StP statecharts using the CHSM generator and compiler. The validity of the design was tested using the simulation features of the Statemate CASE tool.
NASA Technical Reports Server (NTRS)
Mielke, Roland V. (Inventor); Stoughton, John W. (Inventor)
1990-01-01
Computationally complex primitive operations of an algorithm are executed concurrently in a plurality of functional units under the control of an assignment manager. The algorithm is preferably defined as a computationally marked graph contianing data status edges (paths) corresponding to each of the data flow edges. The assignment manager assigns primitive operations to the functional units and monitors completion of the primitive operations to determine data availability using the computational marked graph of the algorithm. All data accessing of the primitive operations is performed by the functional units independently of the assignment manager.
Program For Generating Interactive Displays
NASA Technical Reports Server (NTRS)
Costenbader, Jay; Moleski, Walt; Szczur, Martha; Howell, David; Engelberg, Norm; Li, Tin P.; Misra, Dharitri; Miller, Philip; Neve, Leif; Wolf, Karl;
1991-01-01
Sun/Unix version of Transportable Applications Environment Plus (TAE+) computer program provides integrated, portable software environment for developing and running interactive window, text, and graphical-object-based application software systems. Enables programmer or nonprogrammer to construct easily custom software interface between user and application program and to move resulting interface program and its application program to different computers. Plus viewed as productivity tool for application developers and application end users, who benefit from resultant consistent and well-designed user interface sheltering them from intricacies of computer. Available in form suitable for following six different groups of computers: DEC VAX station and other VMS VAX computers, Macintosh II computers running AUX, Apollo Domain Series 3000, DEC VAX and reduced-instruction-set-computer workstations running Ultrix, Sun 3- and 4-series workstations running Sun OS and IBM RT/PC and PS/2 compute
A Web-based Distributed Voluntary Computing Platform for Large Scale Hydrological Computations
NASA Astrophysics Data System (ADS)
Demir, I.; Agliamzanov, R.
2014-12-01
Distributed volunteer computing can enable researchers and scientist to form large parallel computing environments to utilize the computing power of the millions of computers on the Internet, and use them towards running large scale environmental simulations and models to serve the common good of local communities and the world. Recent developments in web technologies and standards allow client-side scripting languages to run at speeds close to native application, and utilize the power of Graphics Processing Units (GPU). Using a client-side scripting language like JavaScript, we have developed an open distributed computing framework that makes it easy for researchers to write their own hydrologic models, and run them on volunteer computers. Users will easily enable their websites for visitors to volunteer sharing their computer resources to contribute running advanced hydrological models and simulations. Using a web-based system allows users to start volunteering their computational resources within seconds without installing any software. The framework distributes the model simulation to thousands of nodes in small spatial and computational sizes. A relational database system is utilized for managing data connections and queue management for the distributed computing nodes. In this paper, we present a web-based distributed volunteer computing platform to enable large scale hydrological simulations and model runs in an open and integrated environment.
Using ADA Tasks to Simulate Operating Equipment
NASA Technical Reports Server (NTRS)
DeAcetis, Louis A.; Schmidt, Oron; Krishen, Kumar
1990-01-01
A method of simulating equipment using ADA tasks is discussed. Individual units of equipment are coded as concurrently running tasks that monitor and respond to input signals. This technique has been used in a simulation of the space-to-ground Communications and Tracking subsystem of Space Station Freedom.
Using Ada tasks to simulate operating equipment
NASA Technical Reports Server (NTRS)
Deacetis, Louis A.; Schmidt, Oron; Krishen, Kumar
1990-01-01
A method of simulating equipment using Ada tasks is discussed. Individual units of equipment are coded as concurrently running tasks that monitor and respond to input signals. This technique has been used in a simulation of the space-to-ground Communications and Tracking subsystem of Space Station Freedom.
GPU-Accelerated Voxelwise Hepatic Perfusion Quantification
Wang, H; Cao, Y
2012-01-01
Voxelwise quantification of hepatic perfusion parameters from dynamic contrast enhanced (DCE) imaging greatly contributes to assessment of liver function in response to radiation therapy. However, the efficiency of the estimation of hepatic perfusion parameters voxel-by-voxel in the whole liver using a dual-input single-compartment model requires substantial improvement for routine clinical applications. In this paper, we utilize the parallel computation power of a graphics processing unit (GPU) to accelerate the computation, while maintaining the same accuracy as the conventional method. Using CUDA-GPU, the hepatic perfusion computations over multiple voxels are run across the GPU blocks concurrently but independently. At each voxel, non-linear least squares fitting the time series of the liver DCE data to the compartmental model is distributed to multiple threads in a block, and the computations of different time points are performed simultaneously and synchronically. An efficient fast Fourier transform in a block is also developed for the convolution computation in the model. The GPU computations of the voxel-by-voxel hepatic perfusion images are compared with ones by the CPU using the simulated DCE data and the experimental DCE MR images from patients. The computation speed is improved by 30 times using a NVIDIA Tesla C2050 GPU compared to a 2.67 GHz Intel Xeon CPU processor. To obtain liver perfusion maps with 626400 voxels in a patient’s liver, it takes 0.9 min with the GPU-accelerated voxelwise computation, compared to 110 min with the CPU, while both methods result in perfusion parameters differences less than 10−6. The method will be useful for generating liver perfusion images in clinical settings. PMID:22892645
Time takes space: selective effects of multitasking on concurrent spatial processing.
Mäntylä, Timo; Coni, Valentina; Kubik, Veit; Todorov, Ivo; Del Missier, Fabio
2017-08-01
Many everyday activities require coordination and monitoring of complex relations of future goals and deadlines. Cognitive offloading may provide an efficient strategy for reducing control demands by representing future goals and deadlines as a pattern of spatial relations. We tested the hypothesis that multiple-task monitoring involves time-to-space transformational processes, and that these spatial effects are selective with greater demands on coordinate (metric) than categorical (nonmetric) spatial relation processing. Participants completed a multitasking session in which they monitored four series of deadlines, running on different time scales, while making concurrent coordinate or categorical spatial judgments. We expected and found that multitasking taxes concurrent coordinate, but not categorical, spatial processing. Furthermore, males showed a better multitasking performance than females. These findings provide novel experimental evidence for the hypothesis that efficient multitasking involves metric relational processing.
Loureiro, Luiz de França Bahia; de Freitas, Paulo Barbosa
2016-04-01
Badminton requires open and fast actions toward the shuttlecock, but there is no specific agility test for badminton players with specific movements. To develop an agility test that simultaneously assesses perception and motor capacity and examine the test's concurrent and construct validity and its test-retest reliability. The Badcamp agility test consists of running as fast as possible to 6 targets placed on the corners and middle points of a rectangular area (5.6 × 4.2 m) from the start position located in the center of it, following visual stimuli presented in a luminous panel. The authors recruited 43 badminton players (17-32 y old) to evaluate concurrent (with shuttle-run agility test--SRAT) and construct validity and test-retest reliability. Results revealed that Badcamp presents concurrent and construct validity, as its performance is strongly related to SRAT (ρ = 0.83, P < .001), with performance of experts being better than nonexpert players (P < .01). In addition, Badcamp is reliable, as no difference (P = .07) and a high intraclass correlation (ICC = .93) were found in the performance of the players on 2 different occasions. The findings indicate that Badcamp is an effective, valid, and reliable tool to measure agility, allowing coaches and athletic trainers to evaluate players' athletic condition and training effectiveness and possibly detect talented individuals in this sport.
76 FR 63660 - Sunshine Act Meetings; Notice
Federal Register 2010, 2011, 2012, 2013, 2014
2011-10-13
... Daylight Time. ** The meeting of the Institutional Advancement Committee will run concurrently with the...** 2:15 p.m. 3. Institutional Advancement Committee** 3:00 p.m. 4. Audit Committee** 3:30 p.m. Tuesday... benefits plan document, to consider and act on the report of the Institutional Advancement Committee...
Formal Specification and Verification of Concurrent Programs
1993-02-01
of examples from the emerging theory of This book describes operating systems in general programming languages. via the construction of MINIX , a UNIX...look-alike that runs on IBM-PC compatibles. The book con- Wegner72 tains a complete MINIX manual and a complete Wegnerflisting of its C codie. egner
ERIC Educational Resources Information Center
Harvey, Carl A., II; Kaplan, Allison G.
2007-01-01
This article highlights some of the exciting learning opportunities at the conference in Reno, Nevada. From an author strand running throughout the program to sessions on technology, collaboration, and general best practice, the concurrent programs promise something for everyone. There are exhibits that will showcase the latest in furniture,…
Code of Federal Regulations, 2010 CFR
2010-07-01
..., regulations, and public laws of the United States in accordance with the policies set forth in the Act and in... procedures required by law or by agency practice so that all such procedures run concurrently rather than... human environment. (e) Use the NEPA process to identify and assess the reasonable alternatives to...
Griffiths, David J.; Mwanguhya, Francis; Businge, Robert; Griffiths, Amber G. F.; Kyabulima, Solomon; Mwesige, Kenneth; Sanderson, Jennifer L.; Thompson, Faye J.; Vitikainen, Emma I. K.; Cant, Michael A.
2018-01-01
Studying ecological and evolutionary processes in the natural world often requires research projects to follow multiple individuals in the wild over many years. These projects have provided significant advances but may also be hampered by needing to accurately and efficiently collect and store multiple streams of the data from multiple individuals concurrently. The increase in the availability and sophistication of portable computers (smartphones and tablets) and the applications that run on them has the potential to address many of these data collection and storage issues. In this paper we describe the challenges faced by one such long-term, individual-based research project: the Banded Mongoose Research Project in Uganda. We describe a system we have developed called Mongoose 2000 that utilises the potential of apps and portable computers to meet these challenges. We discuss the benefits and limitations of employing such a system in a long-term research project. The app and source code for the Mongoose 2000 system are freely available and we detail how it might be used to aid data collection and storage in other long-term individual-based projects. PMID:29315317
Application Reuse Library for Software, Requirements, and Guidelines
NASA Technical Reports Server (NTRS)
Malin, Jane T.; Thronesbery, Carroll
1994-01-01
Better designs are needed for expert systems and other operations automation software, for more reliable, usable and effective human support. A prototype computer-aided Application Reuse Library shows feasibility of supporting concurrent development and improvement of advanced software by users, analysts, software developers, and human-computer interaction experts. Such a library expedites development of quality software, by providing working, documented examples, which support understanding, modification and reuse of requirements as well as code. It explicitly documents and implicitly embodies design guidelines, standards and conventions. The Application Reuse Library provides application modules with Demo-and-Tester elements. Developers and users can evaluate applicability of a library module and test modifications, by running it interactively. Sub-modules provide application code and displays and controls. The library supports software modification and reuse, by providing alternative versions of application and display functionality. Information about human support and display requirements is provided, so that modifications will conform to guidelines. The library supports entry of new application modules from developers throughout an organization. Example library modules include a timer, some buttons and special fonts, and a real-time data interface program. The library prototype is implemented in the object-oriented G2 environment for developing real-time expert systems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bootsma, G. J., E-mail: Gregory.Bootsma@rmp.uhn.on.ca; Verhaegen, F.; Medical Physics Unit, Department of Oncology, McGill University, Montreal, Quebec H3G 1A4
2015-01-15
Purpose: X-ray scatter is a significant impediment to image quality improvements in cone-beam CT (CBCT). The authors present and demonstrate a novel scatter correction algorithm using a scatter estimation method that simultaneously combines multiple Monte Carlo (MC) CBCT simulations through the use of a concurrently evaluated fitting function, referred to as concurrent MC fitting (CMCF). Methods: The CMCF method uses concurrently run MC CBCT scatter projection simulations that are a subset of the projection angles used in the projection set, P, to be corrected. The scattered photons reaching the detector in each MC simulation are simultaneously aggregated by an algorithmmore » which computes the scatter detector response, S{sub MC}. S{sub MC} is fit to a function, S{sub F}, and if the fit of S{sub F} is within a specified goodness of fit (GOF), the simulations are terminated. The fit, S{sub F}, is then used to interpolate the scatter distribution over all pixel locations for every projection angle in the set P. The CMCF algorithm was tested using a frequency limited sum of sines and cosines as the fitting function on both simulated and measured data. The simulated data consisted of an anthropomorphic head and a pelvis phantom created from CT data, simulated with and without the use of a compensator. The measured data were a pelvis scan of a phantom and patient taken on an Elekta Synergy platform. The simulated data were used to evaluate various GOF metrics as well as determine a suitable fitness value. The simulated data were also used to quantitatively evaluate the image quality improvements provided by the CMCF method. A qualitative analysis was performed on the measured data by comparing the CMCF scatter corrected reconstruction to the original uncorrected and corrected by a constant scatter correction reconstruction, as well as a reconstruction created using a set of projections taken with a small cone angle. Results: Pearson’s correlation, r, proved to be a suitable GOF metric with strong correlation with the actual error of the scatter fit, S{sub F}. Fitting the scatter distribution to a limited sum of sine and cosine functions using a low-pass filtered fast Fourier transform provided a computationally efficient and accurate fit. The CMCF algorithm reduces the number of photon histories required by over four orders of magnitude. The simulated experiments showed that using a compensator reduced the computational time by a factor between 1.5 and 1.75. The scatter estimates for the simulated and measured data were computed between 35–93 s and 114–122 s, respectively, using 16 Intel Xeon cores (3.0 GHz). The CMCF scatter correction improved the contrast-to-noise ratio by 10%–50% and reduced the reconstruction error to under 3% for the simulated phantoms. Conclusions: The novel CMCF algorithm significantly reduces the computation time required to estimate the scatter distribution by reducing the statistical noise in the MC scatter estimate and limiting the number of projection angles that must be simulated. Using the scatter estimate provided by the CMCF algorithm to correct both simulated and real projection data showed improved reconstruction image quality.« less
Belke, Terry W; Pierce, W David
2009-02-01
Twelve female Long-Evans rats were exposed to concurrent variable (VR) ratio schedules of sucrose and wheel-running reinforcement (Sucrose VR 10 Wheel VR 10; Sucrose VR 5 Wheel VR 20; Sucrose VR 20 Wheel VR 5) with predetermined budgets (number of responses). The allocation of lever pressing to the sucrose and wheel-running alternatives was assessed at high and low body weights. Results showed that wheel-running rate and lever-pressing rates for sucrose and wheel running increased, but the choice of wheel running decreased at the low body weight. A regression analysis of relative consumption as a function of relative price showed that consumption shifted toward sucrose and interacted with price differences in a manner consistent with increased substitutability. Demand curves showed that demand for sucrose became less elastic while demand for wheel running became more elastic at the low body weight. These findings reflect an increase in the difference in relative value of sucrose and wheel running as body weight decreased. Discussion focuses on the limitations of response rates as measures of reinforcement value. In addition, we address the commonalities between matching and demand curve equations for the analysis of changes in relative reinforcement value.
Finite elements and the method of conjugate gradients on a concurrent processor
NASA Technical Reports Server (NTRS)
Lyzenga, G. A.; Raefsky, A.; Hager, G. H.
1985-01-01
An algorithm for the iterative solution of finite element problems on a concurrent processor is presented. The method of conjugate gradients is used to solve the system of matrix equations, which is distributed among the processors of a MIMD computer according to an element-based spatial decomposition. This algorithm is implemented in a two-dimensional elastostatics program on the Caltech Hypercube concurrent processor. The results of tests on up to 32 processors show nearly linear concurrent speedup, with efficiencies over 90 percent for sufficiently large problems.
A Concurrent Distributed System for Aircraft Tactical Decision Generation
NASA Technical Reports Server (NTRS)
McManus, John W.
1990-01-01
A research program investigating the use of artificial intelligence (AI) techniques to aid in the development of a Tactical Decision Generator (TDG) for Within Visual Range (WVR) air combat engagements is discussed. The application of AI programming and problem solving methods in the development and implementation of a concurrent version of the Computerized Logic For Air-to-Air Warfare Simulations (CLAWS) program, a second generation TDG, is presented. Concurrent computing environments and programming approaches are discussed and the design and performance of a prototype concurrent TDG system are presented.
Finite elements and the method of conjugate gradients on a concurrent processor
NASA Technical Reports Server (NTRS)
Lyzenga, G. A.; Raefsky, A.; Hager, B. H.
1984-01-01
An algorithm for the iterative solution of finite element problems on a concurrent processor is presented. The method of conjugate gradients is used to solve the system of matrix equations, which is distributed among the processors of a MIMD computer according to an element-based spatial decomposition. This algorithm is implemented in a two-dimensional elastostatics program on the Caltech Hypercube concurrent processor. The results of tests on up to 32 processors show nearly linear concurrent speedup, with efficiencies over 90% for sufficiently large problems.
The Enhancement of Concurrent Processing through Functional Programming Languages.
1984-06-01
ta * functional programming languages allow us to harness the pro- cessing power of computers with hundreds or even thousands of DD I 1473 EDITION OF...that it might be the best way to make imperative library", programs into functional ones which are well suited to concurrent processing. Accession For...statements in their code. We assert that functional programming languajes allok us to harness the processing power of computers with hundre4s or even
Numericware i: Identical by State Matrix Calculator
Kim, Bongsong; Beavis, William D
2017-01-01
We introduce software, Numericware i, to compute identical by state (IBS) matrix based on genotypic data. Calculating an IBS matrix with a large dataset requires large computer memory and takes lengthy processing time. Numericware i addresses these challenges with 2 algorithmic methods: multithreading and forward chopping. The multithreading allows computational routines to concurrently run on multiple central processing unit (CPU) processors. The forward chopping addresses memory limitation by dividing a dataset into appropriately sized subsets. Numericware i allows calculation of the IBS matrix for a large genotypic dataset using a laptop or a desktop computer. For comparison with different software, we calculated genetic relationship matrices using Numericware i, SPAGeDi, and TASSEL with the same genotypic dataset. Numericware i calculates IBS coefficients between 0 and 2, whereas SPAGeDi and TASSEL produce different ranges of values including negative values. The Pearson correlation coefficient between the matrices from Numericware i and TASSEL was high at .9972, whereas SPAGeDi showed low correlation with Numericware i (.0505) and TASSEL (.0587). With a high-dimensional dataset of 500 entities by 10 000 000 SNPs, Numericware i spent 382 minutes using 19 CPU threads and 64 GB memory by dividing the dataset into 3 pieces, whereas SPAGeDi and TASSEL failed with the same dataset. Numericware i is freely available for Windows and Linux under CC-BY 4.0 license at https://figshare.com/s/f100f33a8857131eb2db. PMID:28469375
Run-time parallelization and scheduling of loops
NASA Technical Reports Server (NTRS)
Saltz, Joel H.; Mirchandaney, Ravi; Crowley, Kay
1990-01-01
Run time methods are studied to automatically parallelize and schedule iterations of a do loop in certain cases, where compile-time information is inadequate. The methods presented involve execution time preprocessing of the loop. At compile-time, these methods set up the framework for performing a loop dependency analysis. At run time, wave fronts of concurrently executable loop iterations are identified. Using this wavefront information, loop iterations are reordered for increased parallelism. Symbolic transformation rules are used to produce: inspector procedures that perform execution time preprocessing and executors or transformed versions of source code loop structures. These transformed loop structures carry out the calculations planned in the inspector procedures. Performance results are presented from experiments conducted on the Encore Multimax. These results illustrate that run time reordering of loop indices can have a significant impact on performance. Furthermore, the overheads associated with this type of reordering are amortized when the loop is executed several times with the same dependency structure.
Single-task fMRI overlap predicts concurrent multitasking interference.
Nijboer, Menno; Borst, Jelmer; van Rijn, Hedderik; Taatgen, Niels
2014-10-15
There is no consensus regarding the origin of behavioral interference that occurs during concurrent multitasking. Some evidence points toward a multitasking locus in the brain, while other results imply that interference is the consequence of task interactions in several brain regions. To investigate this issue, we conducted a functional MRI (fMRI) study consisting of three component tasks, which were performed both separately and in combination. The results indicated that no specific multitasking area exists. Instead, different patterns of activation across conditions could be explained by assuming that the interference is a result of task interactions. Additionally, similarity in single-task activation patterns correlated with a decrease in accuracy during dual-task conditions. Taken together, these results support the view that multitasking interference is not due to a bottleneck in a single "multitasking" brain region, but is a result of interactions between concurrently running processes. Copyright © 2014 Elsevier Inc. All rights reserved.
TWOS - TIME WARP OPERATING SYSTEM, VERSION 2.5.1
NASA Technical Reports Server (NTRS)
Bellenot, S. F.
1994-01-01
The Time Warp Operating System (TWOS) is a special-purpose operating system designed to support parallel discrete-event simulation. TWOS is a complete implementation of the Time Warp mechanism, a distributed protocol for virtual time synchronization based on process rollback and message annihilation. Version 2.5.1 supports simulations and other computations using both virtual time and dynamic load balancing; it does not support general time-sharing or multi-process jobs using conventional message synchronization and communication. The program utilizes the underlying operating system's resources. TWOS runs a single simulation at a time, executing it concurrently on as many processors of a distributed system as are allocated. The simulation needs only to be decomposed into objects (logical processes) that interact through time-stamped messages. TWOS provides transparent synchronization. The user does not have to add any more special logic to aid in synchronization, nor give any synchronization advice, nor even understand much about how the Time Warp mechanism works. The Time Warp Simulator (TWSIM) subdirectory contains a sequential simulation engine that is interface compatible with TWOS. This means that an application designer and programmer who wish to use TWOS can prototype code on TWSIM on a single processor and/or workstation before having to deal with the complexity of working on a distributed system. TWSIM also provides statistics about the application which may be helpful for determining the correctness of an application and for achieving good performance on TWOS. Version 2.5.1 has an updated interface that is not compatible with 2.0. The program's user manual assists the simulation programmer in the design, coding, and implementation of discrete-event simulations running on TWOS. The manual also includes a practical user's guide to the TWOS application benchmark, Colliding Pucks. TWOS supports simulations written in the C programming language. It is designed to run on the Sun3/Sun4 series computers and the BBN "Butterfly" GP-1000 computer. The standard distribution medium for this package is a .25 inch tape cartridge in TAR format. TWOS was developed in 1989 and updated in 1991. This program is a copyrighted work with all copyright vested in NASA. Sun3 and Sun4 are trademarks of Sun Microsystems, Inc.
NASA Astrophysics Data System (ADS)
Kazarov, A.; Lehmann Miotto, G.; Magnoni, L.
2012-06-01
The Trigger and Data Acquisition (TDAQ) system of the ATLAS experiment at CERN is the infrastructure responsible for collecting and transferring ATLAS experimental data from detectors to the mass storage system. It relies on a large, distributed computing environment, including thousands of computing nodes with thousands of application running concurrently. In such a complex environment, information analysis is fundamental for controlling applications behavior, error reporting and operational monitoring. During data taking runs, streams of messages sent by applications via the message reporting system together with data published from applications via information services are the main sources of knowledge about correctness of running operations. The flow of data produced (with an average rate of O(1-10KHz)) is constantly monitored by experts to detect problem or misbehavior. This requires strong competence and experience in understanding and discovering problems and root causes, and often the meaningful information is not in the single message or update, but in the aggregated behavior in a certain time-line. The AAL project is meant at reducing the man power needs and at assuring a constant high quality of problem detection by automating most of the monitoring tasks and providing real-time correlation of data-taking and system metrics. This project combines technologies coming from different disciplines, in particular it leverages on an Event Driven Architecture to unify the flow of data from the ATLAS infrastructure, on a Complex Event Processing (CEP) engine for correlation of events and on a message oriented architecture for components integration. The project is composed of 2 main components: a core processing engine, responsible for correlation of events through expert-defined queries and a web based front-end to present real-time information and interact with the system. All components works in a loose-coupled event based architecture, with a message broker to centralize all communication between modules. The result is an intelligent system able to extract and compute relevant information from the flow of operational data to provide real-time feedback to human experts who can promptly react when needed. The paper presents the design and implementation of the AAL project, together with the results of its usage as automated monitoring assistant for the ATLAS data taking infrastructure.
Extensive multi-generational microplate culturing (copepod hatching stage through two broods) experiments were completed with the POPs lindane, DDD and fipronil sulfide. Identical tandem microplate experiments were run concurrently to yield sufficient copepod biomass for li...
29 CFR 825.207 - Substitution of paid leave.
Code of Federal Regulations, 2010 CFR
2010-07-01
... employer, will run concurrently with the unpaid FMLA leave. Accordingly, the employee receives pay pursuant... which is not a serious health condition or serious injury or illness does not count against the employee... condition may result from injury to the employee “on or off” the job. If the employer designates the leave...
DACS II - A distributed thermal/mechanical loads data acquisition and control system
NASA Technical Reports Server (NTRS)
Zamanzadeh, Behzad; Trover, William F.; Anderson, Karl F.
1987-01-01
A distributed data acquisition and control system has been developed for the NASA Flight Loads Research Facility. The DACS II system is composed of seven computer systems and four array processors configured as a main computer system, three satellite computer systems, and 13 analog input/output systems interconnected through three independent data networks. Up to three independent heating and loading tests can be run concurrently on different test articles or the entire system can be used on a single large test such as a full scale hypersonic aircraft. Thermal tests can include up to 512 independent adaptive closed loop control channels. The control system can apply up to 20 MW of heating to a test specimen while simultaneously applying independent mechanical loads. Each thermal control loop is capable of heating a structure at rates of up to 150 F per second over a temperature range of -300 to +2500 F. Up to 64 independent mechanical load profiles can be commanded along with thermal control. Up to 1280 analog inputs monitor temperature, load, displacement and strain on the test specimens with real time data displayed on up to 15 terminals as color plots and tabular data displays. System setup and operation is accomplished with interactive menu-driver displays with extensive facilities to assist the users in all phases of system operation.
System and method for controlling power consumption in a computer system based on user satisfaction
Yang, Lei; Dick, Robert P; Chen, Xi; Memik, Gokhan; Dinda, Peter A; Shy, Alex; Ozisikyilmaz, Berkin; Mallik, Arindam; Choudhary, Alok
2014-04-22
Systems and methods for controlling power consumption in a computer system. For each of a plurality of interactive applications, the method changes a frequency at which a processor of the computer system runs, receives an indication of user satisfaction, determines a relationship between the changed frequency and the user satisfaction of the interactive application, and stores the determined relationship information. The determined relationship can distinguish between different users and different interactive applications. A frequency may be selected from the discrete frequencies at which the processor of the computer system runs based on the determined relationship information for a particular user and a particular interactive application running on the processor of the computer system. The processor may be adapted to run at the selected frequency.
Design of testbed and emulation tools
NASA Technical Reports Server (NTRS)
Lundstrom, S. F.; Flynn, M. J.
1986-01-01
The research summarized was concerned with the design of testbed and emulation tools suitable to assist in projecting, with reasonable accuracy, the expected performance of highly concurrent computing systems on large, complete applications. Such testbed and emulation tools are intended for the eventual use of those exploring new concurrent system architectures and organizations, either as users or as designers of such systems. While a range of alternatives was considered, a software based set of hierarchical tools was chosen to provide maximum flexibility, to ease in moving to new computers as technology improves and to take advantage of the inherent reliability and availability of commercially available computing systems.
NASA Astrophysics Data System (ADS)
Qi, Xianfei; Gao, Ting; Yan, Fengli
2017-01-01
Concurrence, as one of the entanglement measures, is a useful tool to characterize quantum entanglement in various quantum systems. However, the computation of the concurrence involves difficult optimizations and only for the case of two qubits, an exact formula was found. We investigate the concurrence of four-qubit quantum states and derive analytical lower bound of concurrence using the multiqubit monogamy inequality. It is shown that this lower bound is able to improve the existing bounds. This approach can be generalized to arbitrary qubit systems. We present an exact formula of concurrence for some mixed quantum states. For even-qubit states, we derive an improved lower bound of concurrence using a monogamy equality for qubit systems. At the same time, we show that a multipartite state is k-nonseparable if the multipartite concurrence is larger than a constant related to the value of k, the qudit number and the dimension of the subsystems. Our results can be applied to detect the multipartite k-nonseparable states.
Concurrent Training for Sports Performance: The Two Sides of the Medal.
Berryman, Nicolas; Mujika, Inigo; Bosquet, Laurent
2018-05-29
The classical work by Robert C. Hickson showed in 1980 that the addition of a resistance training protocol to a predominantly aerobic program could lead to impaired leg strength adaptations in comparison to a resistance-only training regimen. This interference phenomenon was later highlighted in many reports, including a meta-analysis. However, it seems that the interference effect has not been consistently reported, probably because of the complex interactions between training variables and methodological issues. On the other side of the medal, Dr Hickson and colleagues subsequently (1986) reported that a strength training mesocycle could be beneficial for endurance performance in running and cycling. In recent meta-analyses and review articles, it was demonstrated that such a training strategy could improve middle- and long-distance performance in many disciplines (running, cycling, cross-country skiing and swimming). Interestingly, it appears that improvements in the energy cost of locomotion could be associated with these performance enhancements. Despite these benefits, it was also reported that strength training could represent a detrimental stimulus for endurance performance if an inappropriate training plan has been prepared. Taken together, these observations suggest that coaches and athletes should be careful when concurrent training seems imperative in order to meet the complex physiological requirements of their sport. Therefore, this brief review will present a practical appraisal of concurrent training for sports performance. In addition, recommendations will be provided so that practitioners could adapt their interventions based on the training objectives.
Dobbin, Nick; Highton, Jamie; Moss, Samantha L; Hunwicks, Richard; Twist, Craig
2018-06-01
Dobbin, N, Highton, J, Moss, SL, Hunwicks, R, and Twist, C. Concurrent validity of a rugby-specific Yo-Yo intermittent recovery test (level 1) for assessing match-related running performance. J Strength Cond Res XX(X): 000-000, 2018-This study investigated the concurrent validity of a rugby-specific high-intensity intermittent running test against the internal, external, and perceptual responses to simulated match play. Thirty-six rugby league players (age 18.5 ± 1.8 years; stature 181.4 ± 7.6 cm; body mass 83.5 ± 9.8 kg) completed the prone Yo-Yo Intermittent Recovery Test (Yo-Yo IR1), of which 16 also completed the Yo-Yo IR1, and 2 × ∼20 minute bouts of a simulated match play (rugby league match simulation protocol for interchange players [RLMSP-i]). Most likely reductions in relative total, low-speed and high-speed distance, mean speed, and time above 20 W·kg (high metabolic power [HMP]) were observed between bouts of the RLMSP-i. Likewise, rating of perceived exertion (RPE) and percentage of peak heart rate (%HRpeak) were very likely and likely higher during the second bout. Pearson's correlations revealed a large relationship for the change in relative distance (r = 0.57-0.61) between bouts with both Yo-Yo IR1 tests. The prone Yo-Yo IR1 was more strongly related to the RLMSP-i for change in repeated sprint speed (r = 0.78 cf. 0.56), mean speed (r = 0.64 cf. 0.36), HMP (r = 0.48 cf. 0.25), fatigue index (r = 0.71 cf. 0.63), %HRpeak (r = -0.56 cf. -0.35), RPEbout1 (r = -0.44 cf. -0.14), and RPEbout2 (r = -0.68 cf. -0.41) than the Yo-Yo IR1, but not for blood lactate concentration (r = -0.20 to -0.28 cf. -0.35 to -0.49). The relationships between prone Yo-Yo IR1 distance and measures of load during the RLMSP-i suggest that it possesses concurrent validity and is more strongly associated with measures of training or match load than the Yo-Yo IR1 using rugby league players.
A wirelessly programmable actuation and sensing system for structural health monitoring
NASA Astrophysics Data System (ADS)
Long, James; Büyüköztürk, Oral
2016-04-01
Wireless sensor networks promise to deliver low cost, low power and massively distributed systems for structural health monitoring. A key component of these systems, particularly when sampling rates are high, is the capability to process data within the network. Although progress has been made towards this vision, it remains a difficult task to develop and program 'smart' wireless sensing applications. In this paper we present a system which allows data acquisition and computational tasks to be specified in Python, a high level programming language, and executed within the sensor network. Key features of this system include the ability to execute custom application code without firmware updates, to run multiple users' requests concurrently and to conserve power through adjustable sleep settings. Specific examples of sensor node tasks are given to demonstrate the features of this system in the context of structural health monitoring. The system comprises of individual firmware for nodes in the wireless sensor network, and a gateway server and web application through which users can remotely submit their requests.
MacDonall, James S
2017-09-01
Some have reported changing the schedule at one alternative of a concurrent schedule changed responding at the other alternative (Catania, 1969), which seems odd because no contingencies were changed there. When concurrent schedules are programmed using two schedules, one associated with each alternative that operate continuously, changing the schedule at one alternative also changes the switch schedule at the other alternative. Thus, changes in responding at the constant alternative could be due to the change in the switch schedule. To assess this possibility, six rats were exposed to a series of conditions that alternated between pairs of interval schedules at both alternatives and a pair of interval schedules at one, constant, alternative and a pair of extinction schedules at the other alternative. Comparing run lengths, visit durations and response rates at the constant alternative in the alternating conditions did not show consistent increases and decreases when a strict criterion for changes was used. Using a less stringent definition (any change in mean values) showed changes. The stay/switch analysis suggests it may be inaccurate to apply behavioral contrast to procedures that change from concurrent variable-interval variable-interval schedules to concurrent variable-interval extinction schedules because the contingencies in neither alternative are constant. © 2017 Society for the Experimental Analysis of Behavior.
Ebacher, G; Besner, M C; Clément, B; Prévost, M
2012-09-01
Intrusion events caused by transient low pressures may result in the contamination of a water distribution system (DS). This work aims at estimating the range of potential intrusion volumes that could result from a real downsurge event caused by a momentary pump shutdown. A model calibrated with transient low pressure recordings was used to simulate total intrusion volumes through leakage orifices and submerged air vacuum valves (AVVs). Four critical factors influencing intrusion volumes were varied: the external head of (untreated) water on leakage orifices, the external head of (untreated) water on submerged air vacuum valves, the leakage rate, and the diameter of AVVs' outlet orifice (represented by a multiplicative factor). Leakage orifices' head and AVVs' orifice head levels were assessed through fieldwork. Two sets of runs were generated as part of two statistically designed experiments. A first set of 81 runs was based on a complete factorial design in which each factor was varied over 3 levels. A second set of 40 runs was based on a latin hypercube design, better suited for experimental runs on a computer model. The simulations were conducted using commercially available transient analysis software. Responses, measured by total intrusion volumes, ranged from 10 to 366 L. A second degree polynomial was used to analyze the total intrusion volumes. Sensitivity analyses of both designs revealed that the relationship between the total intrusion volume and the four contributing factors is not monotonic, with the AVVs' orifice head being the most influential factor. When intrusion through both pathways occurs concurrently, interactions between the intrusion flows through leakage orifices and submerged AVVs influence intrusion volumes. When only intrusion through leakage orifices is considered, the total intrusion volume is more largely influenced by the leakage rate than by the leakage orifices' head. The latter mainly impacts the extent of the area affected by intrusion. Copyright © 2012 Elsevier Ltd. All rights reserved.
Bruening, Dustin A; Pohl, Michael B; Takahashi, Kota Z; Barrios, Joaquin A
2018-05-17
Changes in running strike pattern affect ankle and knee mechanics, but little is known about the influence of strike pattern on the joints distal to the ankle. The purpose of this study was to explore the effects of forefoot strike (FFS) and rearfoot strike (RFS) running patterns on foot kinematics and kinetics, from the perspectives of the midtarsal locking theory and the windlass mechanism. Per the midtarsal locking theory, we hypothesized that the ankle would be more inverted in early stance when using a FFS, resulting in decreased midtarsal joint excursions and increased dynamic stiffness. Associated with a more engaged windlass mechanism, we hypothesized that a FFS would elicit increased metatarsophalangeal joint excursions and negative work in late stance. Eighteen healthy female runners ran overground with both FFS and RFS patterns. Instrumented motion capture and a validated multi-segment foot model were used to analyze midtarsal and metatarsophalangeal joint kinematics and kinetics. During early stance in FFS the ankle was more inverted, with concurrently decreased midtarsal eversion (p < 0.001) and abduction excursions (p = 0.003) but increased dorsiflexion excursion (p = 0.005). Dynamic midtarsal stiffness did not differ (p = 0.761). During late stance in FFS, metatarsophalangeal extension was increased (p = 0.009), with concurrently increased negative work (p < 0.001). In addition, there was simultaneously increased midtarsal positive work (p < 0.001), suggesting enhanced power transfer in FFS. Clear evidence for the presence of midtarsal locking was not observed in either strike pattern during running. However, the windlass mechanism appeared to be engaged to a greater extent during FFS. Copyright © 2018 Elsevier Ltd. All rights reserved.
Scalable Performance Measurement and Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gamblin, Todd
2009-01-01
Concurrency levels in large-scale, distributed-memory supercomputers are rising exponentially. Modern machines may contain 100,000 or more microprocessor cores, and the largest of these, IBM's Blue Gene/L, contains over 200,000 cores. Future systems are expected to support millions of concurrent tasks. In this dissertation, we focus on efficient techniques for measuring and analyzing the performance of applications running on very large parallel machines. Tuning the performance of large-scale applications can be a subtle and time-consuming task because application developers must measure and interpret data from many independent processes. While the volume of the raw data scales linearly with the number ofmore » tasks in the running system, the number of tasks is growing exponentially, and data for even small systems quickly becomes unmanageable. Transporting performance data from so many processes over a network can perturb application performance and make measurements inaccurate, and storing such data would require a prohibitive amount of space. Moreover, even if it were stored, analyzing the data would be extremely time-consuming. In this dissertation, we present novel methods for reducing performance data volume. The first draws on multi-scale wavelet techniques from signal processing to compress systemwide, time-varying load-balance data. The second uses statistical sampling to select a small subset of running processes to generate low-volume traces. A third approach combines sampling and wavelet compression to stratify performance data adaptively at run-time and to reduce further the cost of sampled tracing. We have integrated these approaches into Libra, a toolset for scalable load-balance analysis. We present Libra and show how it can be used to analyze data from large scientific applications scalably.« less
Simulating three dimensional wave run-up over breakwaters covered by antifer units
NASA Astrophysics Data System (ADS)
Najafi-Jilani, A.; Niri, M. Zakiri; Naderi, Nader
2014-06-01
The paper presents the numerical analysis of wave run-up over rubble-mound breakwaters covered by antifer units using a technique integrating Computer-Aided Design (CAD) and Computational Fluid Dynamics (CFD) software. Direct application of Navier-Stokes equations within armour blocks, is used to provide a more reliable approach to simulate wave run-up over breakwaters. A well-tested Reynolds-averaged Navier-Stokes (RANS) Volume of Fluid (VOF) code (Flow-3D) was adopted for CFD computations. The computed results were compared with experimental data to check the validity of the model. Numerical results showed that the direct three dimensional (3D) simulation method can deliver accurate results for wave run-up over rubble mound breakwaters. The results showed that the placement pattern of antifer units had a great impact on values of wave run-up so that by changing the placement pattern from regular to double pyramid can reduce the wave run-up by approximately 30%. Analysis was done to investigate the influences of surface roughness, energy dissipation in the pores of the armour layer and reduced wave run-up due to inflow into the armour and stone layer.
Translations on Narcotics and Dangerous Drug No. 273
1976-11-26
at Fair Park here at 5 a.m. on March 27, last year. The court’s president, Encik Annuar Bin Dato Zainal Abidin, sentenced him to five years’ jail...and six strokes of the rotan on each charge. The sentences are however to run concurrently. In passing sentence, Encik Annuar said that Yong had
2012-09-30
results of running it on the GLINT10 AUV Unicorn dataset is given in Fig. 3. This figure is a histogram which shows the number of messages from the...as well as Non- Government Institutions such as Battelle Memorial Institute. PUBLICATIONS 1. T. Schneider and H. Schmidt, "Goby-Acomms
WinSCP for Windows File Transfers | High-Performance Computing | NREL
WinSCP for Windows File Transfers WinSCP for Windows File Transfers WinSCP for can used to securely transfer files between your local computer running Microsoft Windows and a remote computer running Linux
NASA Astrophysics Data System (ADS)
Evans, J. D.; Hao, W.; Chettri, S.
2013-12-01
The cloud is proving to be a uniquely promising platform for scientific computing. Our experience with processing satellite data using Amazon Web Services highlights several opportunities for enhanced performance, flexibility, and cost effectiveness in the cloud relative to traditional computing -- for example: - Direct readout from a polar-orbiting satellite such as the Suomi National Polar-Orbiting Partnership (S-NPP) requires bursts of processing a few times a day, separated by quiet periods when the satellite is out of receiving range. In the cloud, by starting and stopping virtual machines in minutes, we can marshal significant computing resources quickly when needed, but not pay for them when not needed. To take advantage of this capability, we are automating a data-driven approach to the management of cloud computing resources, in which new data availability triggers the creation of new virtual machines (of variable size and processing power) which last only until the processing workflow is complete. - 'Spot instances' are virtual machines that run as long as one's asking price is higher than the provider's variable spot price. Spot instances can greatly reduce the cost of computing -- for software systems that are engineered to withstand unpredictable interruptions in service (as occurs when a spot price exceeds the asking price). We are implementing an approach to workflow management that allows data processing workflows to resume with minimal delays after temporary spot price spikes. This will allow systems to take full advantage of variably-priced 'utility computing.' - Thanks to virtual machine images, we can easily launch multiple, identical machines differentiated only by 'user data' containing individualized instructions (e.g., to fetch particular datasets or to perform certain workflows or algorithms) This is particularly useful when (as is the case with S-NPP data) we need to launch many very similar machines to process an unpredictable number of data files concurrently. Our experience shows the viability and flexibility of this approach to workflow management for scientific data processing. - Finally, cloud computing is a promising platform for distributed volunteer ('interstitial') computing, via mechanisms such as the Berkeley Open Infrastructure for Network Computing (BOINC) popularized with the SETI@Home project and others such as ClimatePrediction.net and NASA's Climate@Home. Interstitial computing faces significant challenges as commodity computing shifts from (always on) desktop computers towards smartphones and tablets (untethered and running on scarce battery power); but cloud computing offers significant slack capacity. This capacity includes virtual machines with unused RAM or underused CPUs; virtual storage volumes allocated (& paid for) but not full; and virtual machines that are paid up for the current hour but whose work is complete. We are devising ways to facilitate the reuse of these resources (i.e., cloud-based interstitial computing) for satellite data processing and related analyses. We will present our findings and research directions on these and related topics.
Schroeder, Mark D.; Greer, Christina; Gaul, Ulrike
2011-01-01
The generation of metameric body plans is a key process in development. In Drosophila segmentation, periodicity is established rapidly through the complex transcriptional regulation of the pair-rule genes. The ‘primary’ pair-rule genes generate their 7-stripe expression through stripe-specific cis-regulatory elements controlled by the preceding non-periodic maternal and gap gene patterns, whereas ‘secondary’ pair-rule genes are thought to rely on 7-stripe elements that read off the already periodic primary pair-rule patterns. Using a combination of computational and experimental approaches, we have conducted a comprehensive systems-level examination of the regulatory architecture underlying pair-rule stripe formation. We find that runt (run), fushi tarazu (ftz) and odd skipped (odd) establish most of their pattern through stripe-specific elements, arguing for a reclassification of ftz and odd as primary pair-rule genes. In the case of run, we observe long-range cis-regulation across multiple intervening genes. The 7-stripe elements of run, ftz and odd are active concurrently with the stripe-specific elements, indicating that maternal/gap-mediated control and pair-rule gene cross-regulation are closely integrated. Stripe-specific elements fall into three distinct classes based on their principal repressive gap factor input; stripe positions along the gap gradients correlate with the strength of predicted input. The prevalence of cis-elements that generate two stripes and their genomic organization suggest that single-stripe elements arose by splitting and subfunctionalization of ancestral dual-stripe elements. Overall, our study provides a greatly improved understanding of how periodic patterns are established in the Drosophila embryo. PMID:21693522
Modeling and optimum time performance for concurrent processing
NASA Technical Reports Server (NTRS)
Mielke, Roland R.; Stoughton, John W.; Som, Sukhamoy
1988-01-01
The development of a new graph theoretic model for describing the relation between a decomposed algorithm and its execution in a data flow environment is presented. Called ATAMM, the model consists of a set of Petri net marked graphs useful for representing decision-free algorithms having large-grained, computationally complex primitive operations. Performance time measures which determine computing speed and throughput capacity are defined, and the ATAMM model is used to develop lower bounds for these times. A concurrent processing operating strategy for achieving optimum time performance is presented and illustrated by example.
Parallel Algorithms and Patterns
DOE Office of Scientific and Technical Information (OSTI.GOV)
Robey, Robert W.
2016-06-16
This is a powerpoint presentation on parallel algorithms and patterns. A parallel algorithm is a well-defined, step-by-step computational procedure that emphasizes concurrency to solve a problem. Examples of problems include: Sorting, searching, optimization, matrix operations. A parallel pattern is a computational step in a sequence of independent, potentially concurrent operations that occurs in diverse scenarios with some frequency. Examples are: Reductions, prefix scans, ghost cell updates. We only touch on parallel patterns in this presentation. It really deserves its own detailed discussion which Gabe Rockefeller would like to develop.
Parallel scheduling of recursively defined arrays
NASA Technical Reports Server (NTRS)
Myers, T. J.; Gokhale, M. B.
1986-01-01
A new method of automatic generation of concurrent programs which constructs arrays defined by sets of recursive equations is described. It is assumed that the time of computation of an array element is a linear combination of its indices, and integer programming is used to seek a succession of hyperplanes along which array elements can be computed concurrently. The method can be used to schedule equations involving variable length dependency vectors and mutually recursive arrays. Portions of the work reported here have been implemented in the PS automatic program generation system.
1986-06-11
been specified, then the amount specified is returned. Otherwise the current amount allocated is returned. T’STORAGESIZE for task types or objects is...hrs DURATION’LAST 131071.99993896484375 36 hrs F.A Address Clauses Address clauses are implemented for objects. No storage is allocated for objects...it is ignored. at Allocation . An integer in the range 1..2,147,483,647. For CONTIGUOUS files, it specifies the number of 256 byte sectors. For ITAM
RAPPORT: running scientific high-performance computing applications on the cloud.
Cohen, Jeremy; Filippis, Ioannis; Woodbridge, Mark; Bauer, Daniela; Hong, Neil Chue; Jackson, Mike; Butcher, Sarah; Colling, David; Darlington, John; Fuchs, Brian; Harvey, Matt
2013-01-28
Cloud computing infrastructure is now widely used in many domains, but one area where there has been more limited adoption is research computing, in particular for running scientific high-performance computing (HPC) software. The Robust Application Porting for HPC in the Cloud (RAPPORT) project took advantage of existing links between computing researchers and application scientists in the fields of bioinformatics, high-energy physics (HEP) and digital humanities, to investigate running a set of scientific HPC applications from these domains on cloud infrastructure. In this paper, we focus on the bioinformatics and HEP domains, describing the applications and target cloud platforms. We conclude that, while there are many factors that need consideration, there is no fundamental impediment to the use of cloud infrastructure for running many types of HPC applications and, in some cases, there is potential for researchers to benefit significantly from the flexibility offered by cloud platforms.
GEANT4 distributed computing for compact clusters
NASA Astrophysics Data System (ADS)
Harrawood, Brian P.; Agasthya, Greeshma A.; Lakshmanan, Manu N.; Raterman, Gretchen; Kapadia, Anuj J.
2014-11-01
A new technique for distribution of GEANT4 processes is introduced to simplify running a simulation in a parallel environment such as a tightly coupled computer cluster. Using a new C++ class derived from the GEANT4 toolkit, multiple runs forming a single simulation are managed across a local network of computers with a simple inter-node communication protocol. The class is integrated with the GEANT4 toolkit and is designed to scale from a single symmetric multiprocessing (SMP) machine to compact clusters ranging in size from tens to thousands of nodes. User designed 'work tickets' are distributed to clients using a client-server work flow model to specify the parameters for each individual run of the simulation. The new g4DistributedRunManager class was developed and well tested in the course of our Neutron Stimulated Emission Computed Tomography (NSECT) experiments. It will be useful for anyone running GEANT4 for large discrete data sets such as covering a range of angles in computed tomography, calculating dose delivery with multiple fractions or simply speeding the through-put of a single model.
NASA Technical Reports Server (NTRS)
Schumacher, W.; Geiser, G.
1978-01-01
The basic concepts of Petri nets are reviewed as well as their application as the fundamental model of technical systems with concurrent discrete events such as hardware systems and software models of computers. The use of Petri nets is proposed for modeling the human operator dealing with concurrent discrete tasks. Their properties useful in modeling the human operator are discussed and practical examples are given. By means of and experimental investigation of binary concurrent tasks which are presented in a serial manner, the representation of human behavior by Petri nets is demonstrated.
25 CFR 542.10 - What are the minimum internal control standards for keno?
Code of Federal Regulations, 2012 CFR
2012-04-01
... keno? (a) Computer applications. For any computer applications utilized, alternate documentation and/or... restricted transaction log or computer storage media concurrently with the generation of the ticket. (3) Keno personnel shall be precluded from having access to the restricted transaction log or computer storage media...
25 CFR 542.10 - What are the minimum internal control standards for keno?
Code of Federal Regulations, 2013 CFR
2013-04-01
... keno? (a) Computer applications. For any computer applications utilized, alternate documentation and/or... restricted transaction log or computer storage media concurrently with the generation of the ticket. (3) Keno personnel shall be precluded from having access to the restricted transaction log or computer storage media...
Computational Methods for Feedback Controllers for Aerodynamics Flow Applications
2007-08-15
Iteration #, and y-translation by: »> Fy=[unf(:,8);runA(:,8);runB(:,8);runC(:,8);runD(:,S); runE (:,8)]; >> Oy-[unf(:,23) ;runA(:,23) ;runB(:,23) ;runC(:,23...runD(:,23) ; runE (:,23)]; >> Iter-[unf(:,1);runA(U ,l);runB(:,l);runC(:,l) ;runD(:,l); runE (:,l)]; >> plot(Fy) Cobalt version 4.0 €blso!,,tic,,. ř-21
Proposal for grid computing for nuclear applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Idris, Faridah Mohamad; Ismail, Saaidi; Haris, Mohd Fauzi B.
2014-02-12
The use of computer clusters for computational sciences including computational physics is vital as it provides computing power to crunch big numbers at a faster rate. In compute intensive applications that requires high resolution such as Monte Carlo simulation, the use of computer clusters in a grid form that supplies computational power to any nodes within the grid that needs computing power, has now become a necessity. In this paper, we described how the clusters running on a specific application could use resources within the grid, to run the applications to speed up the computing process.
El-Zawawy, Mohamed A.
2014-01-01
This paper introduces new approaches for the analysis of frequent statement and dereference elimination for imperative and object-oriented distributed programs running on parallel machines equipped with hierarchical memories. The paper uses languages whose address spaces are globally partitioned. Distributed programs allow defining data layout and threads writing to and reading from other thread memories. Three type systems (for imperative distributed programs) are the tools of the proposed techniques. The first type system defines for every program point a set of calculated (ready) statements and memory accesses. The second type system uses an enriched version of types of the first type system and determines which of the ready statements and memory accesses are used later in the program. The third type system uses the information gather so far to eliminate unnecessary statement computations and memory accesses (the analysis of frequent statement and dereference elimination). Extensions to these type systems are also presented to cover object-oriented distributed programs. Two advantages of our work over related work are the following. The hierarchical style of concurrent parallel computers is similar to the memory model used in this paper. In our approach, each analysis result is assigned a type derivation (serves as a correctness proof). PMID:24892098
ERIC Educational Resources Information Center
Mayer, Richard E.; Sims, Valerie K.
1994-01-01
In 2 experiments, 162 high- and low-spatial ability students viewed a computer-generated animation and heard a concurrent or successive explanation. The concurrent group generated more creative solutions to transfer problems and demonstrated a contiguity effect consistent with dual-coding theory. (SLD)
Wireless Computing Architecture III
2013-09-01
MIMO Multiple-Input and Multiple-Output MIMO /CON MIMO with concurrent hannel access and estimation MU- MIMO Multiuser MIMO OFDM Orthogonal...compressive sensing \\; a design for concurrent channel estimation in scalable multiuser MIMO networking; and novel networking protocols based on machine...Network, Antenna Arrays, UAV networking, Angle of Arrival, Localization MIMO , Access Point, Channel State Information, Compressive Sensing 16
NASA Technical Reports Server (NTRS)
Kemeny, Sabrina E.
1994-01-01
Electronic and optoelectronic hardware implementations of highly parallel computing architectures address several ill-defined and/or computation-intensive problems not easily solved by conventional computing techniques. The concurrent processing architectures developed are derived from a variety of advanced computing paradigms including neural network models, fuzzy logic, and cellular automata. Hardware implementation technologies range from state-of-the-art digital/analog custom-VLSI to advanced optoelectronic devices such as computer-generated holograms and e-beam fabricated Dammann gratings. JPL's concurrent processing devices group has developed a broad technology base in hardware implementable parallel algorithms, low-power and high-speed VLSI designs and building block VLSI chips, leading to application-specific high-performance embeddable processors. Application areas include high throughput map-data classification using feedforward neural networks, terrain based tactical movement planner using cellular automata, resource optimization (weapon-target assignment) using a multidimensional feedback network with lateral inhibition, and classification of rocks using an inner-product scheme on thematic mapper data. In addition to addressing specific functional needs of DOD and NASA, the JPL-developed concurrent processing device technology is also being customized for a variety of commercial applications (in collaboration with industrial partners), and is being transferred to U.S. industries. This viewgraph p resentation focuses on two application-specific processors which solve the computation intensive tasks of resource allocation (weapon-target assignment) and terrain based tactical movement planning using two extremely different topologies. Resource allocation is implemented as an asynchronous analog competitive assignment architecture inspired by the Hopfield network. Hardware realization leads to a two to four order of magnitude speed-up over conventional techniques and enables multiple assignments, (many to many), not achievable with standard statistical approaches. Tactical movement planning (finding the best path from A to B) is accomplished with a digital two-dimensional concurrent processor array. By exploiting the natural parallel decomposition of the problem in silicon, a four order of magnitude speed-up over optimized software approaches has been demonstrated.
The internal gravity wave spectrum in two high-resolution global ocean models
NASA Astrophysics Data System (ADS)
Arbic, B. K.; Ansong, J. K.; Buijsman, M. C.; Kunze, E. L.; Menemenlis, D.; Müller, M.; Richman, J. G.; Savage, A.; Shriver, J. F.; Wallcraft, A. J.; Zamudio, L.
2016-02-01
We examine the internal gravity wave (IGW) spectrum in two sets of high-resolution global ocean simulations that are forced concurrently by atmospheric fields and the astronomical tidal potential. We analyze global 1/12th and 1/25th degree HYCOM simulations, and global 1/12th, 1/24th, and 1/48th degree simulations of the MITgcm. We are motivated by the central role that IGWs play in ocean mixing, by operational considerations of the US Navy, which runs HYCOM as an ocean forecast model, and by the impact of the IGW continuum on the sea surface height (SSH) measurements that will be taken by the planned NASA/CNES SWOT wide-swath altimeter mission. We (1) compute the IGW horizontal wavenumber-frequency spectrum of kinetic energy, and interpret the results with linear dispersion relations computed from the IGW Sturm-Liouville problem, (2) compute and similarly interpret nonlinear spectral kinetic energy transfers in the IGW band, (3) compute and similarly interpret IGW contributions to SSH variance, (4) perform comparisons of modeled IGW kinetic energy frequency spectra with moored current meter observations, and (5) perform comparisons of modeled IGW kinetic energy vertical wavenumber-frequency spectra with moored observations. This presentation builds upon our work in Muller et al. (2015, GRL), who performed tasks (1), (2), and (4) in 1/12th and 1/25th degree HYCOM simulations, for one region of the North Pacific. New for this presentation are tasks (3) and (5), the inclusion of MITgcm solutions, and the analysis of additional ocean regions.
SSL - THE SIMPLE SOCKETS LIBRARY
NASA Technical Reports Server (NTRS)
Campbell, C. E.
1994-01-01
The Simple Sockets Library (SSL) allows C programmers to develop systems of cooperating programs using Berkeley streaming Sockets running under the TCP/IP protocol over Ethernet. The SSL provides a simple way to move information between programs running on the same or different machines and does so with little overhead. The SSL can create three types of Sockets: namely a server, a client, and an accept Socket. The SSL's Sockets are designed to be used in a fashion reminiscent of the use of FILE pointers so that a C programmer who is familiar with reading and writing files will immediately feel comfortable with reading and writing with Sockets. The SSL consists of three parts: the library, PortMaster, and utilities. The user of the SSL accesses it by linking programs to the SSL library. The PortMaster initializes connections between clients and servers. The PortMaster also supports a "firewall" facility to keep out socket requests from unapproved machines. The "firewall" is a file which contains Internet addresses for all approved machines. There are three utilities provided with the SSL. SKTDBG can be used to debug programs that make use of the SSL. SPMTABLE lists the servers and port numbers on requested machine(s). SRMSRVR tells the PortMaster to forcibly remove a server name from its list. The package also includes two example programs: multiskt.c, which makes multiple accepts on one server, and sktpoll.c, which repeatedly attempts to connect a client to some server at one second intervals. SSL is a machine independent library written in the C-language for computers connected via Ethernet using the TCP/IP protocol. It has been successfully compiled and implemented on a variety of platforms, including Sun series computers running SunOS, DEC VAX series computers running VMS, SGI computers running IRIX, DECstations running ULTRIX, DEC alpha AXPs running OSF/1, IBM RS/6000 computers running AIX, IBM PC and compatibles running BSD/386 UNIX and HP Apollo 3000/4000/9000/400T computers running HP-UX. SSL requires 45K of RAM to run under SunOS and 80K of RAM to run under VMS. For use on IBM PC series computers and compatibles running DOS, SSL requires Microsoft C 6.0 and the Wollongong TCP/IP package. Source code for sample programs and debugging tools are provided. The documentation is available on the distribution medium in TeX and PostScript formats. The standard distribution medium for SSL is a .25 inch streaming magnetic tape cartridge (QIC-24) in UNIX tar format. It is also available on a 3.5 inch diskette in UNIX tar format and a 5.25 inch 360K MS-DOS format diskette. The SSL was developed in 1992 and was updated in 1993.
Design of object-oriented distributed simulation classes
NASA Technical Reports Server (NTRS)
Schoeffler, James D. (Principal Investigator)
1995-01-01
Distributed simulation of aircraft engines as part of a computer aided design package is being developed by NASA Lewis Research Center for the aircraft industry. The project is called NPSS, an acronym for 'Numerical Propulsion Simulation System'. NPSS is a flexible object-oriented simulation of aircraft engines requiring high computing speed. It is desirable to run the simulation on a distributed computer system with multiple processors executing portions of the simulation in parallel. The purpose of this research was to investigate object-oriented structures such that individual objects could be distributed. The set of classes used in the simulation must be designed to facilitate parallel computation. Since the portions of the simulation carried out in parallel are not independent of one another, there is the need for communication among the parallel executing processors which in turn implies need for their synchronization. Communication and synchronization can lead to decreased throughput as parallel processors wait for data or synchronization signals from other processors. As a result of this research, the following have been accomplished. The design and implementation of a set of simulation classes which result in a distributed simulation control program have been completed. The design is based upon MIT 'Actor' model of a concurrent object and uses 'connectors' to structure dynamic connections between simulation components. Connectors may be dynamically created according to the distribution of objects among machines at execution time without any programming changes. Measurements of the basic performance have been carried out with the result that communication overhead of the distributed design is swamped by the computation time of modules unless modules have very short execution times per iteration or time step. An analytical performance model based upon queuing network theory has been designed and implemented. Its application to realistic configurations has not been carried out.
Design of Object-Oriented Distributed Simulation Classes
NASA Technical Reports Server (NTRS)
Schoeffler, James D.
1995-01-01
Distributed simulation of aircraft engines as part of a computer aided design package being developed by NASA Lewis Research Center for the aircraft industry. The project is called NPSS, an acronym for "Numerical Propulsion Simulation System". NPSS is a flexible object-oriented simulation of aircraft engines requiring high computing speed. It is desirable to run the simulation on a distributed computer system with multiple processors executing portions of the simulation in parallel. The purpose of this research was to investigate object-oriented structures such that individual objects could be distributed. The set of classes used in the simulation must be designed to facilitate parallel computation. Since the portions of the simulation carried out in parallel are not independent of one another, there is the need for communication among the parallel executing processors which in turn implies need for their synchronization. Communication and synchronization can lead to decreased throughput as parallel processors wait for data or synchronization signals from other processors. As a result of this research, the following have been accomplished. The design and implementation of a set of simulation classes which result in a distributed simulation control program have been completed. The design is based upon MIT "Actor" model of a concurrent object and uses "connectors" to structure dynamic connections between simulation components. Connectors may be dynamically created according to the distribution of objects among machines at execution time without any programming changes. Measurements of the basic performance have been carried out with the result that communication overhead of the distributed design is swamped by the computation time of modules unless modules have very short execution times per iteration or time step. An analytical performance model based upon queuing network theory has been designed and implemented. Its application to realistic configurations has not been carried out.
Evolution of the ATLAS distributed computing system during the LHC long shutdown
NASA Astrophysics Data System (ADS)
Campana, S.; Atlas Collaboration
2014-06-01
The ATLAS Distributed Computing project (ADC) was established in 2007 to develop and operate a framework, following the ATLAS computing model, to enable data storage, processing and bookkeeping on top of the Worldwide LHC Computing Grid (WLCG) distributed infrastructure. ADC development has always been driven by operations and this contributed to its success. The system has fulfilled the demanding requirements of ATLAS, daily consolidating worldwide up to 1 PB of data and running more than 1.5 million payloads distributed globally, supporting almost one thousand concurrent distributed analysis users. Comprehensive automation and monitoring minimized the operational manpower required. The flexibility of the system to adjust to operational needs has been important to the success of the ATLAS physics program. The LHC shutdown in 2013-2015 affords an opportunity to improve the system in light of operational experience and scale it to cope with the demanding requirements of 2015 and beyond, most notably a much higher trigger rate and event pileup. We will describe the evolution of the ADC software foreseen during this period. This includes consolidating the existing Production and Distributed Analysis framework (PanDA) and ATLAS Grid Information System (AGIS), together with the development and commissioning of next generation systems for distributed data management (DDM/Rucio) and production (Prodsys-2). We will explain how new technologies such as Cloud Computing and NoSQL databases, which ATLAS investigated as R&D projects in past years, will be integrated in production. Finally, we will describe more fundamental developments such as breaking job-to-data locality by exploiting storage federations and caches, and event level (rather than file or dataset level) workload engines.
Taylor, Stuart A; Charman, Susan C; Lefere, Philippe; McFarland, Elizabeth G; Paulson, Erik K; Yee, Judy; Aslam, Rizwan; Barlow, John M; Gupta, Arun; Kim, David H; Miller, Chad M; Halligan, Steve
2008-02-01
To prospectively compare the diagnostic performance and time efficiency of both second and concurrent computer-aided detection (CAD) reading paradigms for retrospectively obtained computed tomographic (CT) colonography data sets by using consensus reading (three radiologists) of colonoscopic findings as a reference standard. Ethical permission, HIPAA compliance (for U.S. institutions), and patient consent were obtained from all institutions for use of CT colonography data sets in this study. Ten radiologists each read 25 CT colonography data sets (12 men, 13 women; mean age, 61 years) containing 69 polyps (28 were 1-5 mm, 41 were >or=6 mm) by using workstations integrated with CAD software. Reading was randomized to either "second read" CAD (applied only after initial unassisted assessment) or "concurrent read" CAD (applied at the start of assessment). Data sets were reread 6 weeks later by using the opposing paradigm. Polyp sensitivity and reading times were compared by using multilevel logistic and linear regression, respectively. Receiver operating characteristic (ROC) curves were generated. Compared with the unassisted read, odds of improved polyp (>or=6 mm) detection were 1.5 (95% confidence interval [CI]: 1.0, 2.2) and 1.3 (95% CI: 0.9, 1.9) by using CAD as second and concurrent reader, respectively. Detection odds by using CAD concurrently were 0.87 (95% CI: 0.59, 1.3) and 0.76 (95% CI: 0.57, 1.01) those of second read CAD, excluding and including polyps 1-5 mm, respectively. The concurrent read took 2.9 minutes (95% CI: -3.8, -1.9) less than did second read. The mean areas under the ROC curve (95% CI) for the unassisted read, second read CAD, and concurrent read CAD were 0.83 (95% CI: 0.78, 0.87), 0.86 (95% CI: 0.82, 0.90), and 0.88 (95% CI: 0.83, 0.92), respectively. CAD is more time efficient when used concurrently than when used as a second reader, with similar sensitivity for polyps 6 mm or larger. However, use of second read CAD maximizes sensitivity, particularly for smaller lesions. (c) RSNA, 2007.
Where to cut, where to run : prospects for U.S. South softwood timber supplies and prices
Henry Spelter
1999-01-01
A review of market history shows that southern pine sawtimber stumpage prices have increased by over 150 percent in this decade (Timber Mart South). Concurrently, some (i.e. Cubbage and Abt (1996) Nilsson et al (1999)) have questioned the adequacy of southern timber supplies to meet projected demands, which are projected to increase by...
Weidemann, Gabrielle; Tangen, Jason M; Lovibond, Peter F; Mitchell, Christopher J
2009-04-01
P. Perruchet (1985b) showed a double dissociation of conditioned responses (CRs) and expectancy for an airpuff unconditioned stimulus (US) in a 50% partial reinforcement schedule in human eyeblink conditioning. In the Perruchet effect, participants show an increase in CRs and a concurrent decrease in expectancy for the airpuff across runs of reinforced trials; conversely, participants show a decrease in CRs and a concurrent increase in expectancy for the airpuff across runs of nonreinforced trials. Three eyeblink conditioning experiments investigated whether the linear trend in eyeblink CRs in the Perruchet effect is a result of changes in associative strength of the conditioned stimulus (CS), US sensitization, or learning the precise timing of the US. Experiments 1 and 2 demonstrated that the linear trend in eyeblink CRs is not the result of US sensitization. Experiment 3 showed that the linear trend in eyeblink CRs is present with both a fixed and a variable CS-US interval and so is not the result of learning the precise timing of the US. The results are difficult to reconcile with a single learning process model of associative learning in which expectancy mediates CRs. Copyright (c) 2009 APA, all rights reserved.
Using Histories to Implement Atomic Objects
NASA Technical Reports Server (NTRS)
Ng, Pui
1987-01-01
In this paper we describe an approach of implementing atomicity. Atomicity requires that computations appear to be all-or-nothing and executed in a serialization order. The approach we describe has three characteristics. First, it utilizes the semantics of an application to improve concurrency. Second, it reduces the complexity of application-dependent synchronization code by analyzing the process of writing it. In fact, the process can be automated with logic programming. Third, our approach hides the protocol used to arrive at a serialization order from the applications. As a result, different protocols can be used without affecting the applications. Our approach uses a history tree abstraction. The history tree captures the ordering relationship among concurrent computations. By determining what types of computations exist in the history tree and their parameters, a computation can determine whether it can proceed.
ERIC Educational Resources Information Center
Fahy, Patrick J.
Computer-assisted learning (CAL) can be used for adults functioning at any academic or grade level. In adult basic education (ABE), CAL can promote greater learning effectiveness and faster progress, concurrent learning and experience with computer literacy skills, privacy, and motivation. Adults who face barriers (financial, geographic, personal,…
NASA Technical Reports Server (NTRS)
Hale, Mark A.; Craig, James I.; Mistree, Farrokh; Schrage, Daniel P.
1995-01-01
Computing architectures are being assembled that extend concurrent engineering practices by providing more efficient execution and collaboration on distributed, heterogeneous computing networks. Built on the successes of initial architectures, requirements for a next-generation design computing infrastructure can be developed. These requirements concentrate on those needed by a designer in decision-making processes from product conception to recycling and can be categorized in two areas: design process and design information management. A designer both designs and executes design processes throughout design time to achieve better product and process capabilities while expanding fewer resources. In order to accomplish this, information, or more appropriately design knowledge, needs to be adequately managed during product and process decomposition as well as recomposition. A foundation has been laid that captures these requirements in a design architecture called DREAMS (Developing Robust Engineering Analysis Models and Specifications). In addition, a computing infrastructure, called IMAGE (Intelligent Multidisciplinary Aircraft Generation Environment), is being developed that satisfies design requirements defined in DREAMS and incorporates enabling computational technologies.
Moving Object Detection in Heterogeneous Conditions in Embedded Systems.
Garbo, Alessandro; Quer, Stefano
2017-07-01
This paper presents a system for moving object exposure, focusing on pedestrian detection, in external, unfriendly, and heterogeneous environments. The system manipulates and accurately merges information coming from subsequent video frames, making small computational efforts in each single frame. Its main characterizing feature is to combine several well-known movement detection and tracking techniques, and to orchestrate them in a smart way to obtain good results in diversified scenarios. It uses dynamically adjusted thresholds to characterize different regions of interest, and it also adopts techniques to efficiently track movements, and detect and correct false positives. Accuracy and reliability mainly depend on the overall receipt, i.e., on how the software system is designed and implemented, on how the different algorithmic phases communicate information and collaborate with each other, and on how concurrency is organized. The application is specifically designed to work with inexpensive hardware devices, such as off-the-shelf video cameras and small embedded computational units, eventually forming an intelligent urban grid. As a matter of fact, the major contribution of the paper is the presentation of a tool for real-time applications in embedded devices with finite computational (time and memory) resources. We run experimental results on several video sequences (both home-made and publicly available), showing the robustness and accuracy of the overall detection strategy. Comparisons with state-of-the-art strategies show that our application has similar tracking accuracy but much higher frame-per-second rates.
Computing Maximum Cardinality Matchings in Parallel on Bipartite Graphs via Tree-Grafting
DOE Office of Scientific and Technical Information (OSTI.GOV)
Azad, Ariful; Buluc, Aydn; Pothen, Alex
It is difficult to obtain high performance when computing matchings on parallel processors because matching algorithms explicitly or implicitly search for paths in the graph, and when these paths become long, there is little concurrency. In spite of this limitation, we present a new algorithm and its shared-memory parallelization that achieves good performance and scalability in computing maximum cardinality matchings in bipartite graphs. This algorithm searches for augmenting paths via specialized breadth-first searches (BFS) from multiple source vertices, hence creating more parallelism than single source algorithms. Algorithms that employ multiple-source searches cannot discard a search tree once no augmenting pathmore » is discovered from the tree, unlike algorithms that rely on single-source searches. We describe a novel tree-grafting method that eliminates most of the redundant edge traversals resulting from this property of multiple-source searches. We also employ the recent direction-optimizing BFS algorithm as a subroutine to discover augmenting paths faster. Our algorithm compares favorably with the current best algorithms in terms of the number of edges traversed, the average augmenting path length, and the number of iterations. Here, we provide a proof of correctness for our algorithm. Our NUMA-aware implementation is scalable to 80 threads of an Intel multiprocessor and to 240 threads on an Intel Knights Corner coprocessor. On average, our parallel algorithm runs an order of magnitude faster than the fastest algorithms available. The performance improvement is more significant on graphs with small matching number.« less
Computing Maximum Cardinality Matchings in Parallel on Bipartite Graphs via Tree-Grafting
Azad, Ariful; Buluc, Aydn; Pothen, Alex
2016-03-24
It is difficult to obtain high performance when computing matchings on parallel processors because matching algorithms explicitly or implicitly search for paths in the graph, and when these paths become long, there is little concurrency. In spite of this limitation, we present a new algorithm and its shared-memory parallelization that achieves good performance and scalability in computing maximum cardinality matchings in bipartite graphs. This algorithm searches for augmenting paths via specialized breadth-first searches (BFS) from multiple source vertices, hence creating more parallelism than single source algorithms. Algorithms that employ multiple-source searches cannot discard a search tree once no augmenting pathmore » is discovered from the tree, unlike algorithms that rely on single-source searches. We describe a novel tree-grafting method that eliminates most of the redundant edge traversals resulting from this property of multiple-source searches. We also employ the recent direction-optimizing BFS algorithm as a subroutine to discover augmenting paths faster. Our algorithm compares favorably with the current best algorithms in terms of the number of edges traversed, the average augmenting path length, and the number of iterations. Here, we provide a proof of correctness for our algorithm. Our NUMA-aware implementation is scalable to 80 threads of an Intel multiprocessor and to 240 threads on an Intel Knights Corner coprocessor. On average, our parallel algorithm runs an order of magnitude faster than the fastest algorithms available. The performance improvement is more significant on graphs with small matching number.« less
Moving Object Detection in Heterogeneous Conditions in Embedded Systems
Garbo, Alessandro
2017-01-01
This paper presents a system for moving object exposure, focusing on pedestrian detection, in external, unfriendly, and heterogeneous environments. The system manipulates and accurately merges information coming from subsequent video frames, making small computational efforts in each single frame. Its main characterizing feature is to combine several well-known movement detection and tracking techniques, and to orchestrate them in a smart way to obtain good results in diversified scenarios. It uses dynamically adjusted thresholds to characterize different regions of interest, and it also adopts techniques to efficiently track movements, and detect and correct false positives. Accuracy and reliability mainly depend on the overall receipt, i.e., on how the software system is designed and implemented, on how the different algorithmic phases communicate information and collaborate with each other, and on how concurrency is organized. The application is specifically designed to work with inexpensive hardware devices, such as off-the-shelf video cameras and small embedded computational units, eventually forming an intelligent urban grid. As a matter of fact, the major contribution of the paper is the presentation of a tool for real-time applications in embedded devices with finite computational (time and memory) resources. We run experimental results on several video sequences (both home-made and publicly available), showing the robustness and accuracy of the overall detection strategy. Comparisons with state-of-the-art strategies show that our application has similar tracking accuracy but much higher frame-per-second rates. PMID:28671582
Concurrent negotiation and coordination for grid resource coallocation.
Sim, Kwang Mong; Shi, Benyun
2010-06-01
Bolstering resource coallocation is essential for realizing the Grid vision, because computationally intensive applications often require multiple computing resources from different administrative domains. Given that resource providers and consumers may have different requirements, successfully obtaining commitments through concurrent negotiations with multiple resource providers to simultaneously access several resources is a very challenging task for consumers. The impetus of this paper is that it is one of the earliest works that consider a concurrent negotiation mechanism for Grid resource coallocation. The concurrent negotiation mechanism is designed for 1) managing (de)commitment of contracts through one-to-many negotiations and 2) coordination of multiple concurrent one-to-many negotiations between a consumer and multiple resource providers. The novel contributions of this paper are devising 1) a utility-oriented coordination (UOC) strategy, 2) three classes of commitment management strategies (CMSs) for concurrent negotiation, and 3) the negotiation protocols of consumers and providers. Implementing these ideas in a testbed, three series of experiments were carried out in a variety of settings to compare the following: 1) the CMSs in this paper with the work of others in a single one-to-many negotiation environment for one resource where decommitment is allowed for both provider and consumer agents; 2) the performance of the three classes of CMSs in different resource market types; and 3) the UOC strategy with the work of others [e.g., the patient coordination strategy (PCS )] for coordinating multiple concurrent negotiations. Empirical results show the following: 1) the UOC strategy achieved higher utility, faster negotiation speed, and higher success rates than PCS for different resource market types; and 2) the CMS in this paper achieved higher final utility than the CMS in other works. Additionally, the properties of the three classes of CMSs in different kinds of resource markets are also verified.
Simulation of LHC events on a millions threads
NASA Astrophysics Data System (ADS)
Childers, J. T.; Uram, T. D.; LeCompte, T. J.; Papka, M. E.; Benjamin, D. P.
2015-12-01
Demand for Grid resources is expected to double during LHC Run II as compared to Run I; the capacity of the Grid, however, will not double. The HEP community must consider how to bridge this computing gap by targeting larger compute resources and using the available compute resources as efficiently as possible. Argonne's Mira, the fifth fastest supercomputer in the world, can run roughly five times the number of parallel processes that the ATLAS experiment typically uses on the Grid. We ported Alpgen, a serial x86 code, to run as a parallel application under MPI on the Blue Gene/Q architecture. By analysis of the Alpgen code, we reduced the memory footprint to allow running 64 threads per node, utilizing the four hardware threads available per core on the PowerPC A2 processor. Event generation and unweighting, typically run as independent serial phases, are coupled together in a single job in this scenario, reducing intermediate writes to the filesystem. By these optimizations, we have successfully run LHC proton-proton physics event generation at the scale of a million threads, filling two-thirds of Mira.
Concurrent approach for evolving compact decision rule sets
NASA Astrophysics Data System (ADS)
Marmelstein, Robert E.; Hammack, Lonnie P.; Lamont, Gary B.
1999-02-01
The induction of decision rules from data is important to many disciplines, including artificial intelligence and pattern recognition. To improve the state of the art in this area, we introduced the genetic rule and classifier construction environment (GRaCCE). It was previously shown that GRaCCE consistently evolved decision rule sets from data, which were significantly more compact than those produced by other methods (such as decision tree algorithms). The primary disadvantage of GRaCCe, however, is its relatively poor run-time execution performance. In this paper, a concurrent version of the GRaCCE architecture is introduced, which improves the efficiency of the original algorithm. A prototype of the algorithm is tested on an in- house parallel processor configuration and the results are discussed.
Running Jobs on the Peregrine System | High-Performance Computing | NREL
on the Peregrine high-performance computing (HPC) system. Running Different Types of Jobs Batch jobs scheduling policies - queue names, limits, etc. Requesting different node types Sample batch scripts
The NASA computer science research program plan
NASA Technical Reports Server (NTRS)
1983-01-01
A taxonomy of computer science is included, one state of the art of each of the major computer science categories is summarized. A functional breakdown of NASA programs under Aeronautics R and D, space R and T, and institutional support is also included. These areas were assessed against the computer science categories. Concurrent processing, highly reliable computing, and information management are identified.
Comparing Computer-Adaptive and Curriculum-Based Measurement Methods of Assessment
ERIC Educational Resources Information Center
Shapiro, Edward S.; Gebhardt, Sarah N.
2012-01-01
This article reported the concurrent, predictive, and diagnostic accuracy of a computer-adaptive test (CAT) and curriculum-based measurements (CBM; both computation and concepts/application measures) for universal screening in mathematics among students in first through fourth grade. Correlational analyses indicated moderate to strong…
NASA Workshop on Computational Structural Mechanics 1987, part 3
NASA Technical Reports Server (NTRS)
Sykes, Nancy P. (Editor)
1989-01-01
Computational Structural Mechanics (CSM) topics are explored. Algorithms and software for nonlinear structural dynamics, concurrent algorithms for transient finite element analysis, computational methods and software systems for dynamics and control of large space structures, and the use of multi-grid for structural analysis are discussed.
ERIC Educational Resources Information Center
Bertozzi, N.; Hebert, C.; Rought, J.; Staniunas, C.
2007-01-01
Over the past decade the software products available for solid modeling, dynamic, stress, thermal, and flow analysis, and computer-aiding manufacturing (CAM) have become more powerful, affordable, and easier to use. At the same time it has become increasingly important for students to gain concurrent engineering design and systems integration…
Functional language and data flow architectures
NASA Technical Reports Server (NTRS)
Ercegovac, M. D.; Patel, D. R.; Lang, T.
1983-01-01
This is a tutorial article about language and architecture approaches for highly concurrent computer systems based on the functional style of programming. The discussion concentrates on the basic aspects of functional languages, and sequencing models such as data-flow, demand-driven and reduction which are essential at the machine organization level. Several examples of highly concurrent machines are described.
How to Quickly Import CAD Geometry into Thermal Desktop
NASA Technical Reports Server (NTRS)
Wright, Shonte; Beltran, Emilio
2002-01-01
There are several groups at JPL (Jet Propulsion Laboratory) that are committed to concurrent design efforts, two are featured here. Center for Space Mission Architecture and Design (CSMAD) enables the practical application of advanced process technologies in JPL's mission architecture process. Team I functions as an incubator for projects that are in the Discovery, and even pre-Discovery proposal stages. JPL's concurrent design environment is to a large extent centered on the CAD (Computer Aided Design) file. During concurrent design sessions CAD geometry is ported to other more specialized engineering design packages.
Computational Approaches to Simulation and Optimization of Global Aircraft Trajectories
NASA Technical Reports Server (NTRS)
Ng, Hok Kwan; Sridhar, Banavar
2016-01-01
This study examines three possible approaches to improving the speed in generating wind-optimal routes for air traffic at the national or global level. They are: (a) using the resources of a supercomputer, (b) running the computations on multiple commercially available computers and (c) implementing those same algorithms into NASAs Future ATM Concepts Evaluation Tool (FACET) and compares those to a standard implementation run on a single CPU. Wind-optimal aircraft trajectories are computed using global air traffic schedules. The run time and wait time on the supercomputer for trajectory optimization using various numbers of CPUs ranging from 80 to 10,240 units are compared with the total computational time for running the same computation on a single desktop computer and on multiple commercially available computers for potential computational enhancement through parallel processing on the computer clusters. This study also re-implements the trajectory optimization algorithm for further reduction of computational time through algorithm modifications and integrates that with FACET to facilitate the use of the new features which calculate time-optimal routes between worldwide airport pairs in a wind field for use with existing FACET applications. The implementations of trajectory optimization algorithms use MATLAB, Python, and Java programming languages. The performance evaluations are done by comparing their computational efficiencies and based on the potential application of optimized trajectories. The paper shows that in the absence of special privileges on a supercomputer, a cluster of commercially available computers provides a feasible approach for national and global air traffic system studies.
WinHPC System | High-Performance Computing | NREL
System WinHPC System NREL's WinHPC system is a computing cluster running the Microsoft Windows operating system. It allows users to run jobs requiring a Windows environment such as ANSYS and MATLAB
Overview of Computer Simulation Modeling Approaches and Methods
Robert E. Manning; Robert M. Itami; David N. Cole; Randy Gimblett
2005-01-01
The field of simulation modeling has grown greatly with recent advances in computer hardware and software. Much of this work has involved large scientific and industrial applications for which substantial financial resources are available. However, advances in object-oriented programming and simulation methodology, concurrent with dramatic increases in computer...
Analyzing Spacecraft Telecommunication Systems
NASA Technical Reports Server (NTRS)
Kordon, Mark; Hanks, David; Gladden, Roy; Wood, Eric
2004-01-01
Multi-Mission Telecom Analysis Tool (MMTAT) is a C-language computer program for analyzing proposed spacecraft telecommunication systems. MMTAT utilizes parameterized input and computational models that can be run on standard desktop computers to perform fast and accurate analyses of telecommunication links. MMTAT is easy to use and can easily be integrated with other software applications and run as part of almost any computational simulation. It is distributed as either a stand-alone application program with a graphical user interface or a linkable library with a well-defined set of application programming interface (API) calls. As a stand-alone program, MMTAT provides both textual and graphical output. The graphs make it possible to understand, quickly and easily, how telecommunication performance varies with variations in input parameters. A delimited text file that can be read by any spreadsheet program is generated at the end of each run. The API in the linkable-library form of MMTAT enables the user to control simulation software and to change parameters during a simulation run. Results can be retrieved either at the end of a run or by use of a function call at any time step.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Janjusic, Tommy; Kartsaklis, Christos
Application analysis is facilitated through a number of program profiling tools. The tools vary in their complexity, ease of deployment, design, and profiling detail. Specifically, understand- ing, analyzing, and optimizing is of particular importance for scientific applications where minor changes in code paths and data-structure layout can have profound effects. Understanding how intricate data-structures are accessed and how a given memory system responds is a complex task. In this paper we describe a trace profiling tool, Glprof, specifically aimed to lessen the burden of the programmer to pin-point heavily involved data-structures during an application's run-time, and understand data-structure run-time usage.more » Moreover, we showcase the tool's modularity using additional cache simulation components. We elaborate on the tool's design, and features. Finally we demonstrate the application of our tool in the context of Spec bench- marks using the Glprof profiler and two concurrently running cache simulators, PPC440 and AMD Interlagos.« less
Belke, Terry W
2007-05-01
Rats were exposed to a fixed interval 30 s schedule that produced opportunities to run of equal or unequal durations to assess the effect of differences in duration on responding. Each duration was signaled by a different stimulus. Wheel-running reinforcer duration pairs were 30 s 30 s, 50 s 10 s, and 55 s 5 s. An analysis of median postreinforcement pause duration and mean local lever-pressing rates broken down by previous reinforcer duration and duration of signaled upcoming reinforcer showed that postreinforcement pause duration was affected by the duration of the previous reinforcer but not by the stimulus signaling the duration of the upcoming reinforcer. Local lever-pressing rates were not affected by either previous or upcoming reinforcer duration. In general, the results are consistent with indifference between these durations obtained using a concurrent choice procedure.
Multi-GPU Accelerated Admittance Method for High-Resolution Human Exposure Evaluation.
Xiong, Zubiao; Feng, Shi; Kautz, Richard; Chandra, Sandeep; Altunyurt, Nevin; Chen, Ji
2015-12-01
A multi-graphics processing unit (GPU) accelerated admittance method solver is presented for solving the induced electric field in high-resolution anatomical models of human body when exposed to external low-frequency magnetic fields. In the solver, the anatomical model is discretized as a three-dimensional network of admittances. The conjugate orthogonal conjugate gradient (COCG) iterative algorithm is employed to take advantage of the symmetric property of the complex-valued linear system of equations. Compared against the widely used biconjugate gradient stabilized method, the COCG algorithm can reduce the solving time by 3.5 times and reduce the storage requirement by about 40%. The iterative algorithm is then accelerated further by using multiple NVIDIA GPUs. The computations and data transfers between GPUs are overlapped in time by using asynchronous concurrent execution design. The communication overhead is well hidden so that the acceleration is nearly linear with the number of GPU cards. Numerical examples show that our GPU implementation running on four NVIDIA Tesla K20c cards can reach 90 times faster than the CPU implementation running on eight CPU cores (two Intel Xeon E5-2603 processors). The implemented solver is able to solve large dimensional problems efficiently. A whole adult body discretized in 1-mm resolution can be solved in just several minutes. The high efficiency achieved makes it practical to investigate human exposure involving a large number of cases with a high resolution that meets the requirements of international dosimetry guidelines.
2011-08-01
5 Figure 4 Architetural diagram of running Blender on Amazon EC2 through Nimbis...classification of streaming data. Example input images (top left). All digit prototypes (cluster centers) found, with size proportional to frequency (top...Figure 4 Architetural diagram of running Blender on Amazon EC2 through Nimbis 1 http
Characterizing parallel file-access patterns on a large-scale multiprocessor
NASA Technical Reports Server (NTRS)
Purakayastha, Apratim; Ellis, Carla Schlatter; Kotz, David; Nieuwejaar, Nils; Best, Michael
1994-01-01
Rapid increases in the computational speeds of multiprocessors have not been matched by corresponding performance enhancements in the I/O subsystem. To satisfy the large and growing I/O requirements of some parallel scientific applications, we need parallel file systems that can provide high-bandwidth and high-volume data transfer between the I/O subsystem and thousands of processors. Design of such high-performance parallel file systems depends on a thorough grasp of the expected workload. So far there have been no comprehensive usage studies of multiprocessor file systems. Our CHARISMA project intends to fill this void. The first results from our study involve an iPSC/860 at NASA Ames. This paper presents results from a different platform, the CM-5 at the National Center for Supercomputing Applications. The CHARISMA studies are unique because we collect information about every individual read and write request and about the entire mix of applications running on the machines. The results of our trace analysis lead to recommendations for parallel file system design. First the file system should support efficient concurrent access to many files, and I/O requests from many jobs under varying load conditions. Second, it must efficiently manage large files kept open for long periods. Third, it should expect to see small requests predominantly sequential access patterns, application-wide synchronous access, no concurrent file-sharing between jobs appreciable byte and block sharing between processes within jobs, and strong interprocess locality. Finally, the trace data suggest that node-level write caches and collective I/O request interfaces may be useful in certain environments.
Design for Run-Time Monitor on Cloud Computing
NASA Astrophysics Data System (ADS)
Kang, Mikyung; Kang, Dong-In; Yun, Mira; Park, Gyung-Leen; Lee, Junghoon
Cloud computing is a new information technology trend that moves computing and data away from desktops and portable PCs into large data centers. The basic principle of cloud computing is to deliver applications as services over the Internet as well as infrastructure. A cloud is the type of a parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources. The large-scale distributed applications on a cloud require adaptive service-based software, which has the capability of monitoring the system status change, analyzing the monitored information, and adapting its service configuration while considering tradeoffs among multiple QoS features simultaneously. In this paper, we design Run-Time Monitor (RTM) which is a system software to monitor the application behavior at run-time, analyze the collected information, and optimize resources on cloud computing. RTM monitors application software through library instrumentation as well as underlying hardware through performance counter optimizing its computing configuration based on the analyzed data.
Cloud Computing for Complex Performance Codes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Appel, Gordon John; Hadgu, Teklu; Klein, Brandon Thorin
This report describes the use of cloud computing services for running complex public domain performance assessment problems. The work consisted of two phases: Phase 1 was to demonstrate complex codes, on several differently configured servers, could run and compute trivial small scale problems in a commercial cloud infrastructure. Phase 2 focused on proving non-trivial large scale problems could be computed in the commercial cloud environment. The cloud computing effort was successfully applied using codes of interest to the geohydrology and nuclear waste disposal modeling community.
Advanced Collaborative Environments Supporting Systems Integration and Design
2003-03-01
concurrently view a virtual system or product model while maintaining natural, human communication . These virtual systems operate within a computer-generated...These environments allow multiple individuals to concurrently view a virtual system or product model while simultaneously maintaining natural, human ... communication . As a result, TARDEC researchers and system developers are using this advanced high-end visualization technology to develop future
Nonrecursive formulations of multibody dynamics and concurrent multiprocessing
NASA Technical Reports Server (NTRS)
Kurdila, Andrew J.; Menon, Ramesh
1993-01-01
Since the late 1980's, research in recursive formulations of multibody dynamics has flourished. Historically, much of this research can be traced to applications of low dimensionality in mechanism and vehicle dynamics. Indeed, there is little doubt that recursive order N methods are the method of choice for this class of systems. This approach has the advantage that a minimal number of coordinates are utilized, parallelism can be induced for certain system topologies, and the method is of order N computational cost for systems of N rigid bodies. Despite the fact that many authors have dismissed redundant coordinate formulations as being of order N(exp 3), and hence less attractive than recursive formulations, we present recent research that demonstrates that at least three distinct classes of redundant, nonrecursive multibody formulations consistently achieve order N computational cost for systems of rigid and/or flexible bodies. These formulations are as follows: (1) the preconditioned range space formulation; (2) penalty methods; and (3) augmented Lagrangian methods for nonlinear multibody dynamics. The first method can be traced to its foundation in equality constrained quadratic optimization, while the last two methods have been studied extensively in the context of coercive variational boundary value problems in computational mechanics. Until recently, however, they have not been investigated in the context of multibody simulation, and present theoretical questions unique to nonlinear dynamics. All of these nonrecursive methods have additional advantages with respect to recursive order N methods: (1) the formalisms retain the highly desirable order N computational cost; (2) the techniques are amenable to concurrent simulation strategies; (3) the approaches do not depend upon system topology to induce concurrency; and (4) the methods can be derived to balance the computational load automatically on concurrent multiprocessors. In addition to the presentation of the fundamental formulations, this paper presents new theoretical results regarding the rate of convergence of order N constraint stabilization schemes associated with the newly introduced class of methods.
NASA Astrophysics Data System (ADS)
Myre, Joseph M.
Heterogeneous computing systems have recently come to the forefront of the High-Performance Computing (HPC) community's interest. HPC computer systems that incorporate special purpose accelerators, such as Graphics Processing Units (GPUs), are said to be heterogeneous. Large scale heterogeneous computing systems have consistently ranked highly on the Top500 list since the beginning of the heterogeneous computing trend. By using heterogeneous computing systems that consist of both general purpose processors and special- purpose accelerators, the speed and problem size of many simulations could be dramatically increased. Ultimately this results in enhanced simulation capabilities that allows, in some cases for the first time, the execution of parameter space and uncertainty analyses, model optimizations, and other inverse modeling techniques that are critical for scientific discovery and engineering analysis. However, simplifying the usage and optimization of codes for heterogeneous computing systems remains a challenge. This is particularly true for scientists and engineers for whom understanding HPC architectures and undertaking performance analysis may not be primary research objectives. To enable scientists and engineers to remain focused on their primary research objectives, a modular environment for geophysical inversion and run-time autotuning on heterogeneous computing systems is presented. This environment is composed of three major components: 1) CUSH---a framework for reducing the complexity of programming heterogeneous computer systems, 2) geophysical inversion routines which can be used to characterize physical systems, and 3) run-time autotuning routines designed to determine configurations of heterogeneous computing systems in an attempt to maximize the performance of scientific and engineering codes. Using three case studies, a lattice-Boltzmann method, a non-negative least squares inversion, and a finite-difference fluid flow method, it is shown that this environment provides scientists and engineers with means to reduce the programmatic complexity of their applications, to perform geophysical inversions for characterizing physical systems, and to determine high-performing run-time configurations of heterogeneous computing systems using a run-time autotuner.
Linking consistency with object/thread semantics - An approach to robust computation
NASA Technical Reports Server (NTRS)
Chen, Raymond C.; Dasgupta, Partha
1989-01-01
This paper presents an object/thread based paradigm that links data consistency with object/thread semantics. The paradigm can be used to achieve a wide range of consistency semantics from strict atomic transactions to standard process semantics. The paradigm supports three types of data consistency. Object programmers indicate the type of consistency desired on a per-operation basis and the system performs automatic concurrency control and recovery management to ensure that those consistency requirements are met. This allows programmers to customize consistency and recovery on a per-application basis without having to supply complicated, custom recovery management schemes. The paradigm allows robust and nonrobust computation to operate concurrently on the same data in a well defined manner. The operating system needs to support only one vehicle of computation - the thread.
Fox, Annie Beth; Rosen, Jonathan; Crawford, Mary
2009-02-01
Instant messaging (IM) has become one of the most popular forms of computer-mediated communication (CMC) and is especially prevalent on college campuses. Previous research suggests that IM users often multitask while conversing online. To date, no one has yet examined the cognitive effect of concurrent IM use. Participants in the present study (N = 69) completed a reading comprehension task uninterrupted or while concurrently holding an IM conversation. Participants who IMed while performing the reading task took significantly longer to complete the task, indicating that concurrent IM use negatively affects efficiency. Concurrent IM use did not affect reading comprehension scores. Additional analyses revealed that the more time participants reported spending on IM, the lower their reading comprehension scores. Finally, we found that the more time participants reported spending on IM, the lower their self-reported GPA. Implications and future directions are discussed.
Nonlinear Analysis of a Bolted Marine Riser Connector Using NASTRAN Substructuring
NASA Technical Reports Server (NTRS)
Fox, G. L.
1984-01-01
Results of an investigation of the behavior of a bolted, flange type marine riser connector is reported. The method used to account for the nonlinear effect of connector separation due to bolt preload and axial tension load is described. The automated multilevel substructing capability of COSMIC/NASTRAN was employed at considerable savings in computer run time. Simplified formulas for computer resources, i.e., computer run times for modules SDCOMP, FBS, and MPYAD, as well as disk storage space, are presented. Actual run time data on a VAX-11/780 is compared with the formulas presented.
Scalable computing for evolutionary genomics.
Prins, Pjotr; Belhachemi, Dominique; Möller, Steffen; Smant, Geert
2012-01-01
Genomic data analysis in evolutionary biology is becoming so computationally intensive that analysis of multiple hypotheses and scenarios takes too long on a single desktop computer. In this chapter, we discuss techniques for scaling computations through parallelization of calculations, after giving a quick overview of advanced programming techniques. Unfortunately, parallel programming is difficult and requires special software design. The alternative, especially attractive for legacy software, is to introduce poor man's parallelization by running whole programs in parallel as separate processes, using job schedulers. Such pipelines are often deployed on bioinformatics computer clusters. Recent advances in PC virtualization have made it possible to run a full computer operating system, with all of its installed software, on top of another operating system, inside a "box," or virtual machine (VM). Such a VM can flexibly be deployed on multiple computers, in a local network, e.g., on existing desktop PCs, and even in the Cloud, to create a "virtual" computer cluster. Many bioinformatics applications in evolutionary biology can be run in parallel, running processes in one or more VMs. Here, we show how a ready-made bioinformatics VM image, named BioNode, effectively creates a computing cluster, and pipeline, in a few steps. This allows researchers to scale-up computations from their desktop, using available hardware, anytime it is required. BioNode is based on Debian Linux and can run on networked PCs and in the Cloud. Over 200 bioinformatics and statistical software packages, of interest to evolutionary biology, are included, such as PAML, Muscle, MAFFT, MrBayes, and BLAST. Most of these software packages are maintained through the Debian Med project. In addition, BioNode contains convenient configuration scripts for parallelizing bioinformatics software. Where Debian Med encourages packaging free and open source bioinformatics software through one central project, BioNode encourages creating free and open source VM images, for multiple targets, through one central project. BioNode can be deployed on Windows, OSX, Linux, and in the Cloud. Next to the downloadable BioNode images, we provide tutorials online, which empower bioinformaticians to install and run BioNode in different environments, as well as information for future initiatives, on creating and building such images.
NASA Technical Reports Server (NTRS)
Jiang, Ching-Biau; T'ien, James S.
1994-01-01
Excerpts from a paper describing the numerical examination of concurrent-flow flame spread over a thin solid in purely forced flow with gas-phase radiation are presented. The computational model solves the two-dimensional, elliptic, steady, and laminar conservation equations for mass, momentum, energy, and chemical species. Gas-phase combustion is modeled via a one-step, second order finite rate Arrhenius reaction. Gas-phase radiation considering gray non-scattering medium is solved by a S-N discrete ordinates method. A simplified solid phase treatment assumes a zeroth order pyrolysis relation and includes radiative interaction between the surface and the gas phase.
Modeling of dialogue regimes of distance robot control
NASA Astrophysics Data System (ADS)
Larkin, E. V.; Privalov, A. N.
2017-02-01
Process of distance control of mobile robots is investigated. Petri-Markov net for modeling of dialogue regime is worked out. It is shown, that sequence of operations of next subjects: a human operator, a dialogue computer and an onboard computer may be simulated with use the theory of semi-Markov processes. From the semi-Markov process of the general form Markov process was obtained, which includes only states of transaction generation. It is shown, that a real transaction flow is the result of «concurrency» in states of Markov process. Iteration procedure for evaluation of transaction flow parameters, which takes into account effect of «concurrency», is proposed.
Fingerprinting Communication and Computation on HPC Machines
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peisert, Sean
2010-06-02
How do we identify what is actually running on high-performance computing systems? Names of binaries, dynamic libraries loaded, or other elements in a submission to a batch queue can give clues, but binary names can be changed, and libraries provide limited insight and resolution on the code being run. In this paper, we present a method for"fingerprinting" code running on HPC machines using elements of communication and computation. We then discuss how that fingerprint can be used to determine if the code is consistent with certain other types of codes, what a user usually runs, or what the user requestedmore » an allocation to do. In some cases, our techniques enable us to fingerprint HPC codes using runtime MPI data with a high degree of accuracy.« less
Concurrent processing simulation of the space station
NASA Technical Reports Server (NTRS)
Gluck, R.; Hale, A. L.; Sunkel, John W.
1989-01-01
The development of a new capability for the time-domain simulation of multibody dynamic systems and its application to the study of a large angle rotational maneuvers of the Space Station is described. The effort was divided into three sequential tasks, which required significant advancements of the state-of-the art to accomplish. These were: (1) the development of an explicit mathematical model via symbol manipulation of a flexible, multibody dynamic system; (2) the development of a methodology for balancing the computational load of an explicit mathematical model for concurrent processing; and (3) the implementation and successful simulation of the above on a prototype Custom Architectured Parallel Processing System (CAPPS) containing eight processors. The throughput rate achieved by the CAPPS operating at only 70 percent efficiency, was 3.9 times greater than that obtained sequentially by the IBM 3090 supercomputer simulating the same problem. More significantly, analysis of the results leads to the conclusion that the relative cost effectiveness of concurrent vs. sequential digital computation will grow substantially as the computational load is increased. This is a welcomed development in an era when very complex and cumbersome mathematical models of large space vehicles must be used as substitutes for full scale testing which has become impractical.
Experimental evaluation of the impact of packet capturing tools for web services.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Choe, Yung Ryn; Mohapatra, Prasant; Chuah, Chen-Nee
Network measurement is a discipline that provides the techniques to collect data that are fundamental to many branches of computer science. While many capturing tools and comparisons have made available in the literature and elsewhere, the impact of these packet capturing tools on existing processes have not been thoroughly studied. While not a concern for collection methods in which dedicated servers are used, many usage scenarios of packet capturing now requires the packet capturing tool to run concurrently with operational processes. In this work we perform experimental evaluations of the performance impact that packet capturing process have on web-based services;more » in particular, we observe the impact on web servers. We find that packet capturing processes indeed impact the performance of web servers, but on a multi-core system the impact varies depending on whether the packet capturing and web hosting processes are co-located or not. In addition, the architecture and behavior of the web server and process scheduling is coupled with the behavior of the packet capturing process, which in turn also affect the web server's performance.« less
Low-cost USB interface for operant research using Arduino and Visual Basic.
Escobar, Rogelio; Pérez-Herrera, Carlos A
2015-03-01
This note describes the design of a low-cost interface using Arduino microcontroller boards and Visual Basic programming for operant conditioning research. The board executes one program in Arduino programming language that polls the state of the inputs and generates outputs in an operant chamber. This program communicates through a USB port with another program written in Visual Basic 2010 Express Edition running on a laptop, desktop, netbook computer, or even a tablet equipped with Windows operating system. The Visual Basic program controls schedules of reinforcement and records real-time data. A single Arduino board can be used to control a total of 52 inputs/output lines, and multiple Arduino boards can be used to control multiple operant chambers. An external power supply and a series of micro relays are required to control 28-V DC devices commonly used in operant chambers. Instructions for downloading and using the programs to generate simple and concurrent schedules of reinforcement are provided. Testing suggests that the interface is reliable, accurate, and could serve as an inexpensive alternative to commercial equipment. © Society for the Experimental Analysis of Behavior.
The Educational Value of Microcomputers: Perceptions among Parents of Young Gifted Children.
ERIC Educational Resources Information Center
Johnson, Lawrence J.; Lewman, Beverly S.
1986-01-01
Parents of 62 children enrolled in a private school for young gifted students completed a questionnaire designed to assess home use of computers, as well as parental concerns and expectations for appropriate concurrent and future computer use in educational settings. Familiarity with computers increased perceptions of their beneficial educational…
Robot computer problem solving system
NASA Technical Reports Server (NTRS)
Becker, J. D.; Merriam, E. W.
1974-01-01
The conceptual, experimental, and practical phases of developing a robot computer problem solving system are outlined. Robot intelligence, conversion of the programming language SAIL to run under the THNEX monitor, and the use of the network to run several cooperating jobs at different sites are discussed.
Using Block-local Atomicity to Detect Stale-value Concurrency Errors
NASA Technical Reports Server (NTRS)
Artho, Cyrille; Havelund, Klaus; Biere, Armin
2004-01-01
Data races do not cover all kinds of concurrency errors. This paper presents a data-flow-based technique to find stale-value errors, which are not found by low-level and high-level data race algorithms. Stale values denote copies of shared data where the copy is no longer synchronized. The algorithm to detect such values works as a consistency check that does not require any assumptions or annotations of the program. It has been implemented as a static analysis in JNuke. The analysis is sound and requires only a single execution trace if implemented as a run-time checking algorithm. Being based on an analysis of Java bytecode, it encompasses the full program semantics, including arbitrarily complex expressions. Related techniques are more complex and more prone to over-reporting.
Active Nodal Task Seeking for High-Performance, Ultra-Dependable Computing
1994-07-01
implementation. Figure 1 shows a hardware organization of ANTS: stand-alone computing nodes inter - connected by buses. 2.1 Run Time Partitioning The...nodes in 14 respond to changing loads [27] or system reconfiguration [26]. Existing techniques are all source-initiated or server-initiated [27]. 5.1...short-running task segments. The task segments must be short-running in order that processors will become avalable often enough to satisfy changing
NASA Technical Reports Server (NTRS)
Lyster, P. M.; Liewer, P. C.; Decyk, V. K.; Ferraro, R. D.
1995-01-01
A three-dimensional electrostatic particle-in-cell (PIC) plasma simulation code has been developed on coarse-grain distributed-memory massively parallel computers with message passing communications. Our implementation is the generalization to three-dimensions of the general concurrent particle-in-cell (GCPIC) algorithm. In the GCPIC algorithm, the particle computation is divided among the processors using a domain decomposition of the simulation domain. In a three-dimensional simulation, the domain can be partitioned into one-, two-, or three-dimensional subdomains ("slabs," "rods," or "cubes") and we investigate the efficiency of the parallel implementation of the push for all three choices. The present implementation runs on the Intel Touchstone Delta machine at Caltech; a multiple-instruction-multiple-data (MIMD) parallel computer with 512 nodes. We find that the parallel efficiency of the push is very high, with the ratio of communication to computation time in the range 0.3%-10.0%. The highest efficiency (> 99%) occurs for a large, scaled problem with 64(sup 3) particles per processing node (approximately 134 million particles of 512 nodes) which has a push time of about 250 ns per particle per time step. We have also developed expressions for the timing of the code which are a function of both code parameters (number of grid points, particles, etc.) and machine-dependent parameters (effective FLOP rate, and the effective interprocessor bandwidths for the communication of particles and grid points). These expressions can be used to estimate the performance of scaled problems--including those with inhomogeneous plasmas--to other parallel machines once the machine-dependent parameters are known.
Parallel computing for automated model calibration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Burke, John S.; Danielson, Gary R.; Schulz, Douglas A.
2002-07-29
Natural resources model calibration is a significant burden on computing and staff resources in modeling efforts. Most assessments must consider multiple calibration objectives (for example magnitude and timing of stream flow peak). An automated calibration process that allows real time updating of data/models, allowing scientists to focus effort on improving models is needed. We are in the process of building a fully featured multi objective calibration tool capable of processing multiple models cheaply and efficiently using null cycle computing. Our parallel processing and calibration software routines have been generically, but our focus has been on natural resources model calibration. Somore » far, the natural resources models have been friendly to parallel calibration efforts in that they require no inter-process communication, only need a small amount of input data and only output a small amount of statistical information for each calibration run. A typical auto calibration run might involve running a model 10,000 times with a variety of input parameters and summary statistical output. In the past model calibration has been done against individual models for each data set. The individual model runs are relatively fast, ranging from seconds to minutes. The process was run on a single computer using a simple iterative process. We have completed two Auto Calibration prototypes and are currently designing a more feature rich tool. Our prototypes have focused on running the calibration in a distributed computing cross platform environment. They allow incorporation of?smart? calibration parameter generation (using artificial intelligence processing techniques). Null cycle computing similar to SETI@Home has also been a focus of our efforts. This paper details the design of the latest prototype and discusses our plans for the next revision of the software.« less
The Impact and Promise of Open-Source Computational Material for Physics Teaching
NASA Astrophysics Data System (ADS)
Christian, Wolfgang
2017-01-01
A computer-based modeling approach to teaching must be flexible because students and teachers have different skills and varying levels of preparation. Learning how to run the ``software du jour'' is not the objective for integrating computational physics material into the curriculum. Learning computational thinking, how to use computation and computer-based visualization to communicate ideas, how to design and build models, and how to use ready-to-run models to foster critical thinking is the objective. Our computational modeling approach to teaching is a research-proven pedagogy that predates computers. It attempts to enhance student achievement through the Modeling Cycle. This approach was pioneered by Robert Karplus and the SCIS Project in the 1960s and 70s and later extended by the Modeling Instruction Program led by Jane Jackson and David Hestenes at Arizona State University. This talk describes a no-cost open-source computational approach aligned with a Modeling Cycle pedagogy. Our tools, curricular material, and ready-to-run examples are freely available from the Open Source Physics Collection hosted on the AAPT-ComPADRE digital library. Examples will be presented.
Colt: an experiment in wormhole run-time reconfiguration
NASA Astrophysics Data System (ADS)
Bittner, Ray; Athanas, Peter M.; Musgrove, Mark
1996-10-01
Wormhole run-time reconfiguration (RTR) is an attempt to create a refined computing paradigm for high performance computational tasks. By combining concepts from field programmable gate array (FPGA) technologies with data flow computing, the Colt/Stallion architecture achieves high utilization of hardware resources, and facilitates rapid run-time reconfiguration. Targeted mainly at DSP-type operations, the Colt integrated circuit -- a prototype wormhole RTR device -- compares favorably to contemporary DSP alternatives in terms of silicon area consumed per unit computation and in computing performance. Although emphasis has been placed on signal processing applications, general purpose computation has not been overlooked. Colt is a prototype that defines an architecture not only at the chip level but also in terms of an overall system design. As this system is realized, the concept of wormhole RTR will be applied to numerical computation and DSP applications including those common to image processing, communications systems, digital filters, acoustic processing, real-time control systems and simulation acceleration.
On Why It Is Impossible to Prove that the BDX90 Dispatcher Implements a Time-sharing System
NASA Technical Reports Server (NTRS)
Boyer, R. S.; Moore, J. S.
1983-01-01
The Software Implemented Fault Tolerance SIFT system, is written in PASCAL except for about a page of machine code. The SIFT system implements a small time sharing system in which PASCAL programs for separate application tasks are executed according to a schedule with real time constraints. The PASCAL language has no provision for handling the notion of an interrupt such as the B930 clock interrupt. The PASCAL language also lacks the notion of running a PASCAL subroutine for a given amount of time, suspending it, saving away the suspension, and later activating the suspension. Machine code was used to overcome these inadequacies of PASCAL. Code which handles clock interrupts and suspends processes is called a dispatcher. The time sharing/virtual machine idea is completely destroyed by the reconfiguration task. After termination of the reconfiguration task, the tasks run by the dispatcher have no relation to those run before reconfiguration. It is impossible to view the dispatcher as a time-sharing system implementing virtual BDX930s running concurrently when one process can wipe out the others.
Virtualization and cloud computing in dentistry.
Chow, Frank; Muftu, Ali; Shorter, Richard
2014-01-01
The use of virtualization and cloud computing has changed the way we use computers. Virtualization is a method of placing software called a hypervisor on the hardware of a computer or a host operating system. It allows a guest operating system to run on top of the physical computer with a virtual machine (i.e., virtual computer). Virtualization allows multiple virtual computers to run on top of one physical computer and to share its hardware resources, such as printers, scanners, and modems. This increases the efficient use of the computer by decreasing costs (e.g., hardware, electricity administration, and management) since only one physical computer is needed and running. This virtualization platform is the basis for cloud computing. It has expanded into areas of server and storage virtualization. One of the commonly used dental storage systems is cloud storage. Patient information is encrypted as required by the Health Insurance Portability and Accountability Act (HIPAA) and stored on off-site private cloud services for a monthly service fee. As computer costs continue to increase, so too will the need for more storage and processing power. Virtual and cloud computing will be a method for dentists to minimize costs and maximize computer efficiency in the near future. This article will provide some useful information on current uses of cloud computing.
Formal Semanol Specification of Ada.
1980-09-01
concurrent task modeling involved very little change to the SEMANOL metalanguage. A primitive capable of initiating concurrent SEMANOL task processors...i.e., #CO-COMPUTE) and two primitivc-; corresponding to integer semaphores (i.c., #P and #V) were all that were required. In addition, these changes... synchronization techniques and choice of correct unblocking alternatives. We should note that it had been our original intention to use the Ada Translator program
Framework for architecture-independent run-time reconfigurable applications
NASA Astrophysics Data System (ADS)
Lehn, David I.; Hudson, Rhett D.; Athanas, Peter M.
2000-10-01
Configurable Computing Machines (CCMs) have emerged as a technology with the computational benefits of custom ASICs as well as the flexibility and reconfigurability of general-purpose microprocessors. Significant effort from the research community has focused on techniques to move this reconfigurability from a rapid application development tool to a run-time tool. This requires the ability to change the hardware design while the application is executing and is known as Run-Time Reconfiguration (RTR). Widespread acceptance of run-time reconfigurable custom computing depends upon the existence of high-level automated design tools. Such tools must reduce the designers effort to port applications between different platforms as the architecture, hardware, and software evolves. A Java implementation of a high-level application framework, called Janus, is presented here. In this environment, developers create Java classes that describe the structural behavior of an application. The framework allows hardware and software modules to be freely mixed and interchanged. A compilation phase of the development process analyzes the structure of the application and adapts it to the target platform. Janus is capable of structuring the run-time behavior of an application to take advantage of the memory and computational resources available.
Karpievitch, Yuliya V; Almeida, Jonas S
2006-01-01
Background Matlab, a powerful and productive language that allows for rapid prototyping, modeling and simulation, is widely used in computational biology. Modeling and simulation of large biological systems often require more computational resources then are available on a single computer. Existing distributed computing environments like the Distributed Computing Toolbox, MatlabMPI, Matlab*G and others allow for the remote (and possibly parallel) execution of Matlab commands with varying support for features like an easy-to-use application programming interface, load-balanced utilization of resources, extensibility over the wide area network, and minimal system administration skill requirements. However, all of these environments require some level of access to participating machines to manually distribute the user-defined libraries that the remote call may invoke. Results mGrid augments the usual process distribution seen in other similar distributed systems by adding facilities for user code distribution. mGrid's client-side interface is an easy-to-use native Matlab toolbox that transparently executes user-defined code on remote machines (i.e. the user is unaware that the code is executing somewhere else). Run-time variables are automatically packed and distributed with the user-defined code and automated load-balancing of remote resources enables smooth concurrent execution. mGrid is an open source environment. Apart from the programming language itself, all other components are also open source, freely available tools: light-weight PHP scripts and the Apache web server. Conclusion Transparent, load-balanced distribution of user-defined Matlab toolboxes and rapid prototyping of many simple parallel applications can now be done with a single easy-to-use Matlab command. Because mGrid utilizes only Matlab, light-weight PHP scripts and the Apache web server, installation and configuration are very simple. Moreover, the web-based infrastructure of mGrid allows for it to be easily extensible over the Internet. PMID:16539707
Karpievitch, Yuliya V; Almeida, Jonas S
2006-03-15
Matlab, a powerful and productive language that allows for rapid prototyping, modeling and simulation, is widely used in computational biology. Modeling and simulation of large biological systems often require more computational resources then are available on a single computer. Existing distributed computing environments like the Distributed Computing Toolbox, MatlabMPI, Matlab*G and others allow for the remote (and possibly parallel) execution of Matlab commands with varying support for features like an easy-to-use application programming interface, load-balanced utilization of resources, extensibility over the wide area network, and minimal system administration skill requirements. However, all of these environments require some level of access to participating machines to manually distribute the user-defined libraries that the remote call may invoke. mGrid augments the usual process distribution seen in other similar distributed systems by adding facilities for user code distribution. mGrid's client-side interface is an easy-to-use native Matlab toolbox that transparently executes user-defined code on remote machines (i.e. the user is unaware that the code is executing somewhere else). Run-time variables are automatically packed and distributed with the user-defined code and automated load-balancing of remote resources enables smooth concurrent execution. mGrid is an open source environment. Apart from the programming language itself, all other components are also open source, freely available tools: light-weight PHP scripts and the Apache web server. Transparent, load-balanced distribution of user-defined Matlab toolboxes and rapid prototyping of many simple parallel applications can now be done with a single easy-to-use Matlab command. Because mGrid utilizes only Matlab, light-weight PHP scripts and the Apache web server, installation and configuration are very simple. Moreover, the web-based infrastructure of mGrid allows for it to be easily extensible over the Internet.
NASA Astrophysics Data System (ADS)
Brzuszek, Marcin; Daniluk, Andrzej
2006-11-01
Writing a concurrent program can be more difficult than writing a sequential program. Programmer needs to think about synchronisation, race conditions and shared variables. Transactions help reduce the inconvenience of using threads. A transaction is an abstraction, which allows programmers to group a sequence of actions on the program into a logical, higher-level computation unit. This paper presents multithreaded versions of the GROWTH program, which allow to calculate the layer coverages during the growth of thin epitaxial films and the corresponding RHEED intensities according to the kinematical approximation. The presented programs also contain graphical user interfaces, which enable displaying program data at run-time. New version program summaryTitles of programs:GROWTHGr, GROWTH06 Catalogue identifier:ADVL_v2_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/ADVL_v2_0 Program obtainable from:CPC Program Library, Queen's University of Belfast, N. Ireland Catalogue identifier of previous version:ADVL Does the new version supersede the original program:No Computer for which the new version is designed and others on which it has been tested: Pentium-based PC Operating systems or monitors under which the new version has been tested: Windows 9x, XP, NT Programming language used:Object Pascal Memory required to execute with typical data:More than 1 MB Number of bits in a word:64 bits Number of processors used:1 No. of lines in distributed program, including test data, etc.:20 931 Number of bytes in distributed program, including test data, etc.: 1 311 268 Distribution format:tar.gz Nature of physical problem: The programs compute the RHEED intensities during the growth of thin epitaxial structures prepared using the molecular beam epitaxy (MBE). The computations are based on the use of kinematical diffraction theory [P.I. Cohen, G.S. Petrich, P.R. Pukite, G.J. Whaley, A.S. Arrott, Surf. Sci. 216 (1989) 222. [1
NASA Astrophysics Data System (ADS)
Bootsma, Gregory J.
X-ray scatter in cone-beam computed tomography (CBCT) is known to reduce image quality by introducing image artifacts, reducing contrast, and limiting computed tomography (CT) number accuracy. The extent of the effect of x-ray scatter on CBCT image quality is determined by the shape and magnitude of the scatter distribution in the projections. A method to allay the effects of scatter is imperative to enable application of CBCT to solve a wider domain of clinical problems. The work contained herein proposes such a method. A characterization of the scatter distribution through the use of a validated Monte Carlo (MC) model is carried out. The effects of imaging parameters and compensators on the scatter distribution are investigated. The spectral frequency components of the scatter distribution in CBCT projection sets are analyzed using Fourier analysis and found to reside predominately in the low frequency domain. The exact frequency extents of the scatter distribution are explored for different imaging configurations and patient geometries. Based on the Fourier analysis it is hypothesized the scatter distribution can be represented by a finite sum of sine and cosine functions. The fitting of MC scatter distribution estimates enables the reduction of the MC computation time by diminishing the number of photon tracks required by over three orders of magnitude. The fitting method is incorporated into a novel scatter correction method using an algorithm that simultaneously combines multiple MC scatter simulations. Running concurrent MC simulations while simultaneously fitting the results allows for the physical accuracy and flexibility of MC methods to be maintained while enhancing the overall efficiency. CBCT projection set scatter estimates, using the algorithm, are computed on the order of 1--2 minutes instead of hours or days. Resulting scatter corrected reconstructions show a reduction in artifacts and improvement in tissue contrast and voxel value accuracy.
A new approach for data acquisition at the JPL space simulators
NASA Technical Reports Server (NTRS)
Fisher, Terry C.
1992-01-01
In 1990, a personal computer based data acquisition system was put into service for the Space Simulators and Environmental Test Laboratory at the Jet Propulsion Laboratory (JPL) in Pasadena, California. The new system replaced an outdated minicomputer system which had been in use since 1980. This new data acquisition system was designed and built by JPL for the specific task of acquiring thermal test data in support of space simulation and thermal vacuum testing at JPL. The data acquisition system was designed using powerful personal computers and local-area-network (LAN) technology. Reliability, expandability, and maintainability were some of the most important criteria in the design of the data system and in the selection of hardware and software components. The data acquisition system is used to record both test chamber operational data and thermal data from the unit under test. Tests are conducted in numerous small thermal vacuum chambers and in the large solar simulator and range in size from individual components using only 2 or 3 thermocouples to entire planetary spacecraft requiring in excess of 1200 channels of test data. The system supports several of these tests running concurrently. The previous data system is described along with reasons for its replacement, the types of data acquired, the new data system, and the benefits obtained from the new system including information on tests performed to date.
Load Balancing Scientific Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pearce, Olga Tkachyshyn
2014-12-01
The largest supercomputers have millions of independent processors, and concurrency levels are rapidly increasing. For ideal efficiency, developers of the simulations that run on these machines must ensure that computational work is evenly balanced among processors. Assigning work evenly is challenging because many large modern parallel codes simulate behavior of physical systems that evolve over time, and their workloads change over time. Furthermore, the cost of imbalanced load increases with scale because most large-scale scientific simulations today use a Single Program Multiple Data (SPMD) parallel programming model, and an increasing number of processors will wait for the slowest one atmore » the synchronization points. To address load imbalance, many large-scale parallel applications use dynamic load balance algorithms to redistribute work evenly. The research objective of this dissertation is to develop methods to decide when and how to load balance the application, and to balance it effectively and affordably. We measure and evaluate the computational load of the application, and develop strategies to decide when and how to correct the imbalance. Depending on the simulation, a fast, local load balance algorithm may be suitable, or a more sophisticated and expensive algorithm may be required. We developed a model for comparison of load balance algorithms for a specific state of the simulation that enables the selection of a balancing algorithm that will minimize overall runtime.« less
Progress in Machine Learning Studies for the CMS Computing Infrastructure
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bonacorsi, Daniele; Kuznetsov, Valentin; Magini, Nicolo
Here, computing systems for LHC experiments developed together with Grids worldwide. While a complete description of the original Grid-based infrastructure and services for LHC experiments and its recent evolutions can be found elsewhere, it is worth to mention here the scale of the computing resources needed to fulfill the needs of LHC experiments in Run-1 and Run-2 so far.
Progress in Machine Learning Studies for the CMS Computing Infrastructure
Bonacorsi, Daniele; Kuznetsov, Valentin; Magini, Nicolo; ...
2017-12-06
Here, computing systems for LHC experiments developed together with Grids worldwide. While a complete description of the original Grid-based infrastructure and services for LHC experiments and its recent evolutions can be found elsewhere, it is worth to mention here the scale of the computing resources needed to fulfill the needs of LHC experiments in Run-1 and Run-2 so far.
Kimhy, David; Delespaul, Philippe; Ahn, Hongshik; Cai, Shengnan; Shikhman, Marina; Lieberman, Jeffrey A; Malaspina, Dolores; Sloan, Richard P
2010-11-01
Psychosis has been repeatedly suggested to be affected by increases in stress and arousal. However, there is a dearth of evidence supporting the temporal link between stress, arousal, and psychosis during "real-world" functioning. This paucity of evidence may stem from limitations of current research methodologies. Our aim is to the test the feasibility and validity of a novel methodology designed to measure concurrent stress and arousal in individuals with psychosis during "real-world" daily functioning. Twenty patients with psychosis completed a 36-hour ambulatory assessment of stress and arousal. We used experience sampling method with palm computers to assess stress (10 times per day, 10 AM → 10 PM) along with concurrent ambulatory measurement of cardiac autonomic regulation using a Holter monitor. The clocks of the palm computer and Holter monitor were synchronized, allowing the temporal linking of the stress and arousal data. We used power spectral analysis to determine the parasympathetic contributions to autonomic regulation and sympathovagal balance during 5 minutes before and after each experience sample. Patients completed 79% of the experience samples (75% with a valid concurrent arousal data). Momentary increases in stress had inverse correlation with concurrent parasympathetic activity (ρ = -.27, P < .0001) and positive correlation with sympathovagal balance (ρ = .19, P = .0008). Stress and heart rate were not significantly related (ρ = -.05, P = .3875). The findings support the feasibility and validity of our methodology in individuals with psychosis. The methodology offers a novel way to study in high time resolution the concurrent, "real-world" interactions between stress, arousal, and psychosis. The authors discuss the methodology's potential applications and future research directions.
Expected Reachability-Time Games
NASA Astrophysics Data System (ADS)
Forejt, Vojtěch; Kwiatkowska, Marta; Norman, Gethin; Trivedi, Ashutosh
In an expected reachability-time game (ERTG) two players, Min and Max, move a token along the transitions of a probabilistic timed automaton, so as to minimise and maximise, respectively, the expected time to reach a target. These games are concurrent since at each step of the game both players choose a timed move (a time delay and action under their control), and the transition of the game is determined by the timed move of the player who proposes the shorter delay. A game is turn-based if at any step of the game, all available actions are under the control of precisely one player. We show that while concurrent ERTGs are not always determined, turn-based ERTGs are positionally determined. Using the boundary region graph abstraction, and a generalisation of Asarin and Maler's simple function, we show that the decision problems related to computing the upper/lower values of concurrent ERTGs, and computing the value of turn-based ERTGs are decidable and their complexity is in NEXPTIME ∩ co-NEXPTIME.
Multitasking the code ARC3D. [for computational fluid dynamics
NASA Technical Reports Server (NTRS)
Barton, John T.; Hsiung, Christopher C.
1986-01-01
The CRAY multitasking system was developed in order to utilize all four processors and sharply reduce the wall clock run time. This paper describes the techniques used to modify the computational fluid dynamics code ARC3D for this run and analyzes the achieved speedup. The ARC3D code solves either the Euler or thin-layer N-S equations using an implicit approximate factorization scheme. Results indicate that multitask processing can be used to achieve wall clock speedup factors of over three times, depending on the nature of the program code being used. Multitasking appears to be particularly advantageous for large-memory problems running on multiple CPU computers.
Design and Development of a Run-Time Monitor for Multi-Core Architectures in Cloud Computing
Kang, Mikyung; Kang, Dong-In; Crago, Stephen P.; Park, Gyung-Leen; Lee, Junghoon
2011-01-01
Cloud computing is a new information technology trend that moves computing and data away from desktops and portable PCs into large data centers. The basic principle of cloud computing is to deliver applications as services over the Internet as well as infrastructure. A cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources. The large-scale distributed applications on a cloud require adaptive service-based software, which has the capability of monitoring system status changes, analyzing the monitored information, and adapting its service configuration while considering tradeoffs among multiple QoS features simultaneously. In this paper, we design and develop a Run-Time Monitor (RTM) which is a system software to monitor the application behavior at run-time, analyze the collected information, and optimize cloud computing resources for multi-core architectures. RTM monitors application software through library instrumentation as well as underlying hardware through a performance counter optimizing its computing configuration based on the analyzed data. PMID:22163811
Design and development of a run-time monitor for multi-core architectures in cloud computing.
Kang, Mikyung; Kang, Dong-In; Crago, Stephen P; Park, Gyung-Leen; Lee, Junghoon
2011-01-01
Cloud computing is a new information technology trend that moves computing and data away from desktops and portable PCs into large data centers. The basic principle of cloud computing is to deliver applications as services over the Internet as well as infrastructure. A cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources. The large-scale distributed applications on a cloud require adaptive service-based software, which has the capability of monitoring system status changes, analyzing the monitored information, and adapting its service configuration while considering tradeoffs among multiple QoS features simultaneously. In this paper, we design and develop a Run-Time Monitor (RTM) which is a system software to monitor the application behavior at run-time, analyze the collected information, and optimize cloud computing resources for multi-core architectures. RTM monitors application software through library instrumentation as well as underlying hardware through a performance counter optimizing its computing configuration based on the analyzed data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hong, Tianzhen; Buhl, Fred; Haves, Philip
2008-09-20
EnergyPlus is a new generation building performance simulation program offering many new modeling capabilities and more accurate performance calculations integrating building components in sub-hourly time steps. However, EnergyPlus runs much slower than the current generation simulation programs. This has become a major barrier to its widespread adoption by the industry. This paper analyzed EnergyPlus run time from comprehensive perspectives to identify key issues and challenges of speeding up EnergyPlus: studying the historical trends of EnergyPlus run time based on the advancement of computers and code improvements to EnergyPlus, comparing EnergyPlus with DOE-2 to understand and quantify the run time differences,more » identifying key simulation settings and model features that have significant impacts on run time, and performing code profiling to identify which EnergyPlus subroutines consume the most amount of run time. This paper provides recommendations to improve EnergyPlus run time from the modeler?s perspective and adequate computing platforms. Suggestions of software code and architecture changes to improve EnergyPlus run time based on the code profiling results are also discussed.« less
Self-efficacy pathways to childhood depression.
Bandura, A; Pastorelli, C; Barbaranelli, C; Caprara, G V
1999-02-01
This prospective research analyzed how different facets of perceived self-efficacy operate in concert within a network of sociocognitive influences in childhood depression. Perceived social and academic inefficacy contributed to concurrent and subsequent depression both directly and through their impact on academic achievement, prosocialness, and problem behaviors. In the shorter run, children were depressed over beliefs in their academic inefficacy rather than over their actual academic performances. In the longer run, the impact of a low sense of academic efficacy on depression was mediated through academic achievement, problem behavior, and prior depression. Perceived social inefficacy had a heavier impact on depression in girls than in boys in the longer term. Depression was also more strongly linked over time for girls than for boys.
Compressed quantum computation using a remote five-qubit quantum computer
NASA Astrophysics Data System (ADS)
Hebenstreit, M.; Alsina, D.; Latorre, J. I.; Kraus, B.
2017-05-01
The notion of compressed quantum computation is employed to simulate the Ising interaction of a one-dimensional chain consisting of n qubits using the universal IBM cloud quantum computer running on log2(n ) qubits. The external field parameter that controls the quantum phase transition of this model translates into particular settings of the quantum gates that generate the circuit. We measure the magnetization, which displays the quantum phase transition, on a two-qubit system, which simulates a four-qubit Ising chain, and show its agreement with the theoretical prediction within a certain error. We also discuss the relevant point of how to assess errors when using a cloud quantum computer with a limited amount of runs. As a solution, we propose to use validating circuits, that is, to run independent controlled quantum circuits of similar complexity to the circuit of interest.
Self-Scheduling Parallel Methods for Multiple Serial Codes with Application to WOPWOP
NASA Technical Reports Server (NTRS)
Long, Lyle N.; Brentner, Kenneth S.
2000-01-01
This paper presents a scheme for efficiently running a large number of serial jobs on parallel computers. Two examples are given of computer programs that run relatively quickly, but often they must be run numerous times to obtain all the results needed. It is very common in science and engineering to have codes that are not massive computing challenges in themselves, but due to the number of instances that must be run, they do become large-scale computing problems. The two examples given here represent common problems in aerospace engineering: aerodynamic panel methods and aeroacoustic integral methods. The first example simply solves many systems of linear equations. This is representative of an aerodynamic panel code where someone would like to solve for numerous angles of attack. The complete code for this first example is included in the appendix so that it can be readily used by others as a template. The second example is an aeroacoustics code (WOPWOP) that solves the Ffowcs Williams Hawkings equation to predict the far-field sound due to rotating blades. In this example, one quite often needs to compute the sound at numerous observer locations, hence parallelization is utilized to automate the noise computation for a large number of observers.
Petri net model for analysis of concurrently processed complex algorithms
NASA Technical Reports Server (NTRS)
Stoughton, John W.; Mielke, Roland R.
1986-01-01
This paper presents a Petri-net model suitable for analyzing the concurrent processing of computationally complex algorithms. The decomposed operations are to be processed in a multiple processor, data driven architecture. Of particular interest is the application of the model to both the description of the data/control flow of a particular algorithm, and to the general specification of the data driven architecture. A candidate architecture is also presented.
van Gog, Tamara; Paas, Fred; van Merriënboer, Jeroen J G; Witte, Puk
2005-12-01
This study investigated the amounts of problem-solving process information ("action," "why," "how," and "metacognitive") elicited by means of concurrent, retrospective, and cued retrospective reporting. In a within-participants design, 26 participants completed electrical circuit troubleshooting tasks under different reporting conditions. The method of cued retrospective reporting used the original computer-based task and a superimposed record of the participant's eye fixations and mouse-keyboard operations as a cue for retrospection. Cued retrospective reporting (with the exception of why information) and concurrent reporting (with the exception of metacognitive information) resulted in a higher number of codes on the different types of information than did retrospective reporting.
Preference pulses and the win-stay, fix-and-sample model of choice.
Hachiga, Yosuke; Sakagami, Takayuki; Silberberg, Alan
2015-11-01
Two groups of six rats each were trained to respond to two levers for a food reinforcer. One group was trained on concurrent variable-ratio 20 extinction schedules of reinforcement. The second group was trained on a concurrent variable-interval 27-s extinction schedule. In both groups, lever-schedule assignments changed randomly following reinforcement; a light cued the lever providing the next reinforcer. In the next condition, the light cue was removed and reinforcer assignment strictly alternated between levers. The next two conditions redetermined, in order, the first two conditions. Preference pulses, defined as a tendency for relative response rate to decline to the just-reinforced alternative with time since reinforcement, only appeared during the extinction schedule. Although the pulse's functional form was well described by a reinforcer-induction equation, there was a large residual between actual data and a pulse-as-artifact simulation (McLean, Grace, Pitts, & Hughes, 2014) used to discern reinforcer-dependent contributions to pulsing. However, if that simulation was modified to include a win-stay tendency (a propensity to stay on the just-reinforced alternative), the residual was greatly reduced. Additional modifications of the parameter values of the pulse-as-artifact simulation enabled it to accommodate the present results as well as those it originally accommodated. In its revised form, this simulation was used to create a model that describes response runs to the preferred alternative as terminating probabilistically, and runs to the unpreferred alternative as punctate with occasional perseverative response runs. After reinforcement, choices are modeled as returning briefly to the lever location that had been just reinforced. This win-stay propensity is hypothesized as due to reinforcer induction. © Society for the Experimental Analysis of Behavior.
2014-08-12
Nolan Warner, Mubarak Shah. Tracking in Dense Crowds Using Prominenceand Neighborhood Motion Concurrence, IEEE Transactions on Pattern Analysis...of computer vision, computer graphics and evacuation dynamics by providing a common platform, and provides...areas that includes Computer Vision, Computer Graphics , and Pedestrian Evacuation Dynamics. Despite the
Counterfactual quantum computation through quantum interrogation
NASA Astrophysics Data System (ADS)
Hosten, Onur; Rakher, Matthew T.; Barreiro, Julio T.; Peters, Nicholas A.; Kwiat, Paul G.
2006-02-01
The logic underlying the coherent nature of quantum information processing often deviates from intuitive reasoning, leading to surprising effects. Counterfactual computation constitutes a striking example: the potential outcome of a quantum computation can be inferred, even if the computer is not run. Relying on similar arguments to interaction-free measurements (or quantum interrogation), counterfactual computation is accomplished by putting the computer in a superposition of `running' and `not running' states, and then interfering the two histories. Conditional on the as-yet-unknown outcome of the computation, it is sometimes possible to counterfactually infer information about the solution. Here we demonstrate counterfactual computation, implementing Grover's search algorithm with an all-optical approach. It was believed that the overall probability of such counterfactual inference is intrinsically limited, so that it could not perform better on average than random guesses. However, using a novel `chained' version of the quantum Zeno effect, we show how to boost the counterfactual inference probability to unity, thereby beating the random guessing limit. Our methods are general and apply to any physical system, as illustrated by a discussion of trapped-ion systems. Finally, we briefly show that, in certain circumstances, counterfactual computation can eliminate errors induced by decoherence.
4273π: Bioinformatics education on low cost ARM hardware
2013-01-01
Background Teaching bioinformatics at universities is complicated by typical computer classroom settings. As well as running software locally and online, students should gain experience of systems administration. For a future career in biology or bioinformatics, the installation of software is a useful skill. We propose that this may be taught by running the course on GNU/Linux running on inexpensive Raspberry Pi computer hardware, for which students may be granted full administrator access. Results We release 4273π, an operating system image for Raspberry Pi based on Raspbian Linux. This includes minor customisations for classroom use and includes our Open Access bioinformatics course, 4273π Bioinformatics for Biologists. This is based on the final-year undergraduate module BL4273, run on Raspberry Pi computers at the University of St Andrews, Semester 1, academic year 2012–2013. Conclusions 4273π is a means to teach bioinformatics, including systems administration tasks, to undergraduates at low cost. PMID:23937194
4273π: bioinformatics education on low cost ARM hardware.
Barker, Daniel; Ferrier, David Ek; Holland, Peter Wh; Mitchell, John Bo; Plaisier, Heleen; Ritchie, Michael G; Smart, Steven D
2013-08-12
Teaching bioinformatics at universities is complicated by typical computer classroom settings. As well as running software locally and online, students should gain experience of systems administration. For a future career in biology or bioinformatics, the installation of software is a useful skill. We propose that this may be taught by running the course on GNU/Linux running on inexpensive Raspberry Pi computer hardware, for which students may be granted full administrator access. We release 4273π, an operating system image for Raspberry Pi based on Raspbian Linux. This includes minor customisations for classroom use and includes our Open Access bioinformatics course, 4273π Bioinformatics for Biologists. This is based on the final-year undergraduate module BL4273, run on Raspberry Pi computers at the University of St Andrews, Semester 1, academic year 2012-2013. 4273π is a means to teach bioinformatics, including systems administration tasks, to undergraduates at low cost.
NASA Technical Reports Server (NTRS)
Jefferson, David; Beckman, Brian
1986-01-01
This paper describes the concept of virtual time and its implementation in the Time Warp Operating System at the Jet Propulsion Laboratory. Virtual time is a distributed synchronization paradigm that is appropriate for distributed simulation, database concurrency control, real time systems, and coordination of replicated processes. The Time Warp Operating System is targeted toward the distributed simulation application and runs on a 32-node JPL Mark II Hypercube.
Concurrent Image Processing Executive (CIPE). Volume 3: User's guide
NASA Technical Reports Server (NTRS)
Lee, Meemong; Cooper, Gregory T.; Groom, Steven L.; Mazer, Alan S.; Williams, Winifred I.; Kong, Mih-Seh
1990-01-01
CIPE (the Concurrent Image Processing Executive) is both an executive which organizes the parameter inputs for hypercube applications and an environment which provides temporary data workspace and simple real-time function definition facilities for image analysis. CIPE provides two types of user interface. The Command Line Interface (CLI) provides a simple command-driven environment allowing interactive function definition and evaluation of algebraic expressions. The menu interface employs a hierarchical screen-oriented menu system where the user is led through a menu tree to any specific application and then given a formatted panel screen for parameter entry. How to initialize the system through the setup function, how to read data into CIPE symbols, how to manipulate and display data through the use of executive functions, and how to run an application in either user interface mode, are described.
Statistical fingerprinting for malware detection and classification
Prowell, Stacy J.; Rathgeb, Christopher T.
2015-09-15
A system detects malware in a computing architecture with an unknown pedigree. The system includes a first computing device having a known pedigree and operating free of malware. The first computing device executes a series of instrumented functions that, when executed, provide a statistical baseline that is representative of the time it takes the software application to run on a computing device having a known pedigree. A second computing device executes a second series of instrumented functions that, when executed, provides an actual time that is representative of the time the known software application runs on the second computing device. The system detects malware when there is a difference in execution times between the first and the second computing devices.
A Comparison of PETSC Library and HPF Implementations of an Archetypal PDE Computation
NASA Technical Reports Server (NTRS)
Hayder, M. Ehtesham; Keyes, David E.; Mehrotra, Piyush
1997-01-01
Two paradigms for distributed-memory parallel computation that free the application programmer from the details of message passing are compared for an archetypal structured scientific computation a nonlinear, structured-grid partial differential equation boundary value problem using the same algorithm on the same hardware. Both paradigms, parallel libraries represented by Argonne's PETSC, and parallel languages represented by the Portland Group's HPF, are found to be easy to use for this problem class, and both are reasonably effective in exploiting concurrency after a short learning curve. The level of involvement required by the application programmer under either paradigm includes specification of the data partitioning (corresponding to a geometrically simple decomposition of the domain of the PDE). Programming in SPAM style for the PETSC library requires writing the routines that discretize the PDE and its Jacobian, managing subdomain-to-processor mappings (affine global- to-local index mappings), and interfacing to library solver routines. Programming for HPF requires a complete sequential implementation of the same algorithm, introducing concurrency through subdomain blocking (an effort similar to the index mapping), and modest experimentation with rewriting loops to elucidate to the compiler the latent concurrency. Correctness and scalability are cross-validated on up to 32 nodes of an IBM SP2.
Group implicit concurrent algorithms in nonlinear structural dynamics
NASA Technical Reports Server (NTRS)
Ortiz, M.; Sotelino, E. D.
1989-01-01
During the 70's and 80's, considerable effort was devoted to developing efficient and reliable time stepping procedures for transient structural analysis. Mathematically, the equations governing this type of problems are generally stiff, i.e., they exhibit a wide spectrum in the linear range. The algorithms best suited to this type of applications are those which accurately integrate the low frequency content of the response without necessitating the resolution of the high frequency modes. This means that the algorithms must be unconditionally stable, which in turn rules out explicit integration. The most exciting possibility in the algorithms development area in recent years has been the advent of parallel computers with multiprocessing capabilities. So, this work is mainly concerned with the development of parallel algorithms in the area of structural dynamics. A primary objective is to devise unconditionally stable and accurate time stepping procedures which lend themselves to an efficient implementation in concurrent machines. Some features of the new computer architecture are summarized. A brief survey of current efforts in the area is presented. A new class of concurrent procedures, or Group Implicit algorithms is introduced and analyzed. The numerical simulation shows that GI algorithms hold considerable promise for application in coarse grain as well as medium grain parallel computers.
1986-12-31
synthesize synchronization skeletons"Science of Computer Programming 2, 1982, pp. 241-266 [Gel85] Gelernter, David, "Generative communication in...effective computation based on given primitives . An architecture is an abstract object-type, whose instances are computing systems. By a parallel computing...explaining the language primitives on this basis. We explain how such a basis can be "simpler" than a general-purpose manual-programming language such as
Jones, Thomas W; Howatson, Glyn; Russell, Mark; French, Duncan N
2016-03-01
The present study examined functional strength and endocrine responses to varying ratios of strength and endurance training in a concurrent training regimen. Thirty resistance trained men completed 6 weeks of 3 d·wk of (a) strength training (ST), (b) concurrent strength and endurance training ratio 3:1 (CT3), (c) concurrent strength and endurance training ratio 1:1 (CT1), or (d) no training (CON). Strength training was conducted using whole-body multijoint exercises, whereas endurance training consisted of treadmill running. Assessments of maximal strength, lower-body power, and endocrine factors were conducted pretraining and after 3 and 6 weeks. After the intervention, ST and CT3 elicited similar increases in lower-body strength; furthermore, ST resulted in greater increases than CT1 and CON (all p ≤ 0.05). All training conditions resulted in similar increases in upper-body strength after training. The ST group observed greater increases in lower-body power than all other conditions (all p ≤ 0.05). After the final training session, CT1 elicited greater increases in cortisol than ST (p = 0.008). When implemented as part of a concurrent training regimen, higher volumes of endurance training result in the inhibition of lower-body strength, whereas low volumes do not. Lower-body power was attenuated by high and low frequencies of endurance training. Higher frequencies of endurance training resulted in increased cortisol responses to training. These data suggest that if strength development is the primary focus of a training intervention, frequency of endurance training should remain low.
HEP Computing Tools, Grid and Supercomputers for Genome Sequencing Studies
NASA Astrophysics Data System (ADS)
De, K.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Novikov, A.; Poyda, A.; Tertychnyy, I.; Wenaus, T.
2017-10-01
PanDA - Production and Distributed Analysis Workload Management System has been developed to address ATLAS experiment at LHC data processing and analysis challenges. Recently PanDA has been extended to run HEP scientific applications on Leadership Class Facilities and supercomputers. The success of the projects to use PanDA beyond HEP and Grid has drawn attention from other compute intensive sciences such as bioinformatics. Recent advances of Next Generation Genome Sequencing (NGS) technology led to increasing streams of sequencing data that need to be processed, analysed and made available for bioinformaticians worldwide. Analysis of genomes sequencing data using popular software pipeline PALEOMIX can take a month even running it on the powerful computer resource. In this paper we will describe the adaptation the PALEOMIX pipeline to run it on a distributed computing environment powered by PanDA. To run pipeline we split input files into chunks which are run separately on different nodes as separate inputs for PALEOMIX and finally merge output file, it is very similar to what it done by ATLAS to process and to simulate data. We dramatically decreased the total walltime because of jobs (re)submission automation and brokering within PanDA. Using software tools developed initially for HEP and Grid can reduce payload execution time for Mammoths DNA samples from weeks to days.
Synthesis of Efficient Structures for Concurrent Computation.
1983-10-01
formal presentation of these techniques, called virtualisation and aggregation, can be found n [King-83$. 113.2 Census Functions Trees perform broadcast... Functions .. .. .. .. ... .... ... ... .... ... ... ....... 6 4 User-Assisted Aggregation .. .. .. .. ... ... ... .... ... .. .......... 6 5 Parallel...6. Simple Parallel Structure for Broadcasting .. .. .. .. .. . ... .. . .. . .... 4 Figure 7. Internal Structure of a Prefix Computation Network
Strategies for concurrent processing of complex algorithms in data driven architectures
NASA Technical Reports Server (NTRS)
Stoughton, John W.; Mielke, Roland R.
1988-01-01
The purpose is to document research to develop strategies for concurrent processing of complex algorithms in data driven architectures. The problem domain consists of decision-free algorithms having large-grained, computationally complex primitive operations. Such are often found in signal processing and control applications. The anticipated multiprocessor environment is a data flow architecture containing between two and twenty computing elements. Each computing element is a processor having local program memory, and which communicates with a common global data memory. A new graph theoretic model called ATAMM which establishes rules for relating a decomposed algorithm to its execution in a data flow architecture is presented. The ATAMM model is used to determine strategies to achieve optimum time performance and to develop a system diagnostic software tool. In addition, preliminary work on a new multiprocessor operating system based on the ATAMM specifications is described.
Running R Statistical Computing Environment Software on the Peregrine
for the development of new statistical methodologies and enjoys a large user base. Please consult the distribution details. Natural language support but running in an English locale R is a collaborative project programming paradigms to better leverage modern HPC systems. The CRAN task view for High Performance Computing
Experience in Grid Site Testing for ATLAS, CMS and LHCb with HammerCloud
NASA Astrophysics Data System (ADS)
Elmsheuser, Johannes; Medrano Llamas, Ramón; Legger, Federica; Sciabà, Andrea; Sciacca, Gianfranco; Úbeda García, Mario; van der Ster, Daniel
2012-12-01
Frequent validation and stress testing of the network, storage and CPU resources of a grid site is essential to achieve high performance and reliability. HammerCloud was previously introduced with the goals of enabling VO- and site-administrators to run such tests in an automated or on-demand manner. The ATLAS, CMS and LHCb experiments have all developed VO plugins for the service and have successfully integrated it into their grid operations infrastructures. This work will present the experience in running HammerCloud at full scale for more than 3 years and present solutions to the scalability issues faced by the service. First, we will show the particular challenges faced when integrating with CMS and LHCb offline computing, including customized dashboards to show site validation reports for the VOs and a new API to tightly integrate with the LHCbDIRAC Resource Status System. Next, a study of the automatic site exclusion component used by ATLAS will be presented along with results for tuning the exclusion policies. A study of the historical test results for ATLAS, CMS and LHCb will be presented, including comparisons between the experiments’ grid availabilities and a search for site-based or temporal failure correlations. Finally, we will look to future plans that will allow users to gain new insights into the test results; these include developments to allow increased testing concurrency, increased scale in the number of metrics recorded per test job (up to hundreds), and increased scale in the historical job information (up to many millions of jobs per VO).
NASA Astrophysics Data System (ADS)
Ramos, Antonio L. L.; Shao, Zhili; Holthe, Aleksander; Sandli, Mathias F.
2017-05-01
The introduction of the System-on-Chip (SoC) technology has brought exciting new opportunities for the development of smart low cost embedded systems spanning a wide range of applications. Currently available SoC devices are capable of performing high speed digital signal processing tasks in software while featuring relatively low development costs and reduced time-to-market. Unmanned aerial vehicles (UAV) are an application example that has shown tremendous potential in an increasing number of scenarios, ranging from leisure to surveillance as well as in search and rescue missions. Video capturing from UAV platforms is a relatively straightforward task that requires almost no preprocessing. However, that does not apply to audio signals, especially in cases where the data is to be used to support real-time decision making. In fact, the enormous amount of acoustic interference from the surroundings, including the noise from the UAVs propellers, becomes a huge problem. This paper discusses a real-time implementation of the NLMS adaptive filtering algorithm applied to enhancing acoustic signals captured from UAV platforms. The model relies on a combination of acoustic sensors and a computational inexpensive algorithm running on a digital signal processor. Given its simplicity, this solution can be incorporated into the main processing system of an UAV using the SoC technology, and run concurrently with other required tasks, such as flight control and communications. Simulations and real-time DSP-based implementations have shown significant signal enhancement results by efficiently mitigating the interference from the noise generated by the UAVs propellers as well as from other external noise sources.
High Resolution Nature Runs and the Big Data Challenge
NASA Technical Reports Server (NTRS)
Webster, W. Phillip; Duffy, Daniel Q.
2015-01-01
NASA's Global Modeling and Assimilation Office at Goddard Space Flight Center is undertaking a series of very computationally intensive Nature Runs and a downscaled reanalysis. The nature runs use the GEOS-5 as an Atmospheric General Circulation Model (AGCM) while the reanalysis uses the GEOS-5 in Data Assimilation mode. This paper will present computational challenges from three runs, two of which are AGCM and one is downscaled reanalysis using the full DAS. The nature runs will be completed at two surface grid resolutions, 7 and 3 kilometers and 72 vertical levels. The 7 km run spanned 2 years (2005-2006) and produced 4 PB of data while the 3 km run will span one year and generate 4 BP of data. The downscaled reanalysis (MERRA-II Modern-Era Reanalysis for Research and Applications) will cover 15 years and generate 1 PB of data. Our efforts to address the big data challenges of climate science, we are moving toward a notion of Climate Analytics-as-a-Service (CAaaS), a specialization of the concept of business process-as-a-service that is an evolving extension of IaaS, PaaS, and SaaS enabled by cloud computing. In this presentation, we will describe two projects that demonstrate this shift. MERRA Analytic Services (MERRA/AS) is an example of cloud-enabled CAaaS. MERRA/AS enables MapReduce analytics over MERRA reanalysis data collection by bringing together the high-performance computing, scalable data management, and a domain-specific climate data services API. NASA's High-Performance Science Cloud (HPSC) is an example of the type of compute-storage fabric required to support CAaaS. The HPSC comprises a high speed Infinib and network, high performance file systems and object storage, and a virtual system environments specific for data intensive, science applications. These technologies are providing a new tier in the data and analytic services stack that helps connect earthbound, enterprise-level data and computational resources to new customers and new mobility-driven applications and modes of work. In our experience, CAaaS lowers the barriers and risk to organizational change, fosters innovation and experimentation, and provides the agility required to meet our customers' increasing and changing needs
Schütte, Kurt H; Seerden, Stefan; Venter, Rachel; Vanwanseele, Benedicte
2018-01-01
Medial tibial stress syndrome (MTSS) is a common overuse running injury with pathomechanics likely to be exaggerated by fatigue. Wearable accelerometry provides a novel alternative to assess biomechanical parameters continuously while running in more ecologically valid settings. The purpose of this study was to determine the influence of outdoor running fatigue and MTSS on both dynamic loading and dynamic stability derived from trunk and tibial accelerometery. Runners with (n=14) and without (n=16) history of MTSS performed an outdoor fatigue run of 3200m. Accelerometer-based measures averaged per lap included dynamic loading of the trunk and tibia (i.e. axial peak positive acceleration, signal power magnitude, and shock attenuation) as well as dynamic trunk stability (i.e. tri-axial root mean square ratio, step and stride regularity, and sample entropy). Regression coefficients from generalised estimating equations were used to evaluate group by fatigue interactions. No evidence could be found for dynamic loading being higher with fatigue in runners with MTSS history (all measures p>0.05). One significant group by running fatigue interaction effect was detected for dynamic stability. Specifically, in MTSS only, decreases mediolateral sample entropy i.e. loss of complexity was associated with running fatigue (p<0.01). The current results indicate that entire acceleration waveform signals reflecting mediolateral trunk control is related to MTSS history, a compensation that went undetected in the non-fatigued running state. We suggest that a practical outdoor running fatigue protocol that concurrently captures trunk accelerometry-based movement complexity warrants further prospective investigation as an in-situ screening tool for MTSS individuals. Copyright © 2017 Elsevier B.V. All rights reserved.
Reduction of extinction and reinstatement of cocaine seeking by wheel running in female rats.
Zlebnik, Natalie E; Anker, Justin J; Gliddon, Luke A; Carroll, Marilyn E
2010-03-01
Previous work has shown that wheel running reduced the maintenance of cocaine self-administration in rats. In the present study, the effect of wheel running on extinction and reinstatement of cocaine seeking was examined. Female rats were trained to run in a wheel during 6-h sessions, and they were then catheterized and placed in an operant conditioning chamber where they did not have access to the wheel but were allowed to self-administer iv cocaine. Subsequently, rats were divided into four groups and were tested on the extinction and reinstatement of cocaine seeking while they had varying access to a wheel in an adjoining compartment. The four groups were assigned to the following wheel access conditions: (1) wheel running during extinction and reinstatement (WER), (2) wheel running during extinction and a locked wheel during reinstatement (WE), (3) locked wheel during extinction and wheel running during reinstatement (WR), and (4) locked wheel during extinction and reinstatement (WL). WE and WR were retested later to examine the effect of one session of wheel access on cocaine-primed reinstatement. There were no group differences in wheel revolutions, in rate of acquisition of cocaine self-administration, or in responding during maintenance when there was no wheel access. However, during extinction, WE and WER responded less than WR and WL. WR and WER had lower cocaine-primed reinstatement than WE and WL. One session of wheel exposure in WE also suppressed cocaine-primed reinstatement. Wheel running immediately and effectively reduced cocaine-seeking behavior, but concurrent access to running was necessary. Thus, exercise is a useful and self-sustaining intervention to reduce cocaine-seeking behavior.
Mount, D W; Conrad, B
1986-01-01
We have previously described programs for a variety of types of sequence analysis (1-4). These programs have now been integrated into a single package. They are written in the standard C programming language and run on virtually any computer system with a C compiler, such as the IBM/PC and other computers running under the MS/DOS and UNIX operating systems. The programs are widely distributed and may be obtained from the authors as described below. PMID:3753780
NASA Astrophysics Data System (ADS)
Decyk, Viktor K.; Dauger, Dean E.
We have constructed a parallel cluster consisting of Apple Macintosh G4 computers running both Classic Mac OS as well as the Unix-based Mac OS X, and have achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. Unlike other Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. This enables us to move parallel computing from the realm of experts to the mainstream of computing.
Exploiting loop level parallelism in nonprocedural dataflow programs
NASA Technical Reports Server (NTRS)
Gokhale, Maya B.
1987-01-01
Discussed are how loop level parallelism is detected in a nonprocedural dataflow program, and how a procedural program with concurrent loops is scheduled. Also discussed is a program restructuring technique which may be applied to recursive equations so that concurrent loops may be generated for a seemingly iterative computation. A compiler which generates C code for the language described below has been implemented. The scheduling component of the compiler and the restructuring transformation are described.
Numerical algorithms for finite element computations on concurrent processors
NASA Technical Reports Server (NTRS)
Ortega, J. M.
1986-01-01
The work of several graduate students which relate to the NASA grant is briefly summarized. One student has worked on a detailed analysis of the so-called ijk forms of Gaussian elemination and Cholesky factorization on concurrent processors. Another student has worked on the vectorization of the incomplete Cholesky conjugate method on the CYBER 205. Two more students implemented various versions of Gaussian elimination and Cholesky factorization on the FLEX/32.
High-throughput landslide modelling using computational grids
NASA Astrophysics Data System (ADS)
Wallace, M.; Metson, S.; Holcombe, L.; Anderson, M.; Newbold, D.; Brook, N.
2012-04-01
Landslides are an increasing problem in developing countries. Multiple landslides can be triggered by heavy rainfall resulting in loss of life, homes and critical infrastructure. Through computer simulation of individual slopes it is possible to predict the causes, timing and magnitude of landslides and estimate the potential physical impact. Geographical scientists at the University of Bristol have developed software that integrates a physically-based slope hydrology and stability model (CHASM) with an econometric model (QUESTA) in order to predict landslide risk over time. These models allow multiple scenarios to be evaluated for each slope, accounting for data uncertainties, different engineering interventions, risk management approaches and rainfall patterns. Individual scenarios can be computationally intensive, however each scenario is independent and so multiple scenarios can be executed in parallel. As more simulations are carried out the overhead involved in managing input and output data becomes significant. This is a greater problem if multiple slopes are considered concurrently, as is required both for landslide research and for effective disaster planning at national levels. There are two critical factors in this context: generated data volumes can be in the order of tens of terabytes, and greater numbers of simulations result in long total runtimes. Users of such models, in both the research community and in developing countries, need to develop a means for handling the generation and submission of landside modelling experiments, and the storage and analysis of the resulting datasets. Additionally, governments in developing countries typically lack the necessary computing resources and infrastructure. Consequently, knowledge that could be gained by aggregating simulation results from many different scenarios across many different slopes remains hidden within the data. To address these data and workload management issues, University of Bristol particle physicists and geographical scientists are collaborating to develop methods for providing simple and effective access to landslide models and associated simulation data. Particle physicists have valuable experience in dealing with data complexity and management due to the scale of data generated by particle accelerators such as the Large Hadron Collider (LHC). The LHC generates tens of petabytes of data every year which is stored and analysed using the Worldwide LHC Computing Grid (WLCG). Tools and concepts from the WLCG are being used to drive the development of a Software-as-a-Service (SaaS) platform to provide access to hosted landslide simulation software and data. It contains advanced data management features and allows landslide simulations to be run on the WLCG, dramatically reducing simulation runtimes by parallel execution. The simulations are accessed using a web page through which users can enter and browse input data, submit jobs and visualise results. Replication of the data ensures a local copy can be accessed should a connection to the platform be unavailable. The platform does not know the details of the simulation software it runs, so it is therefore possible to use it to run alternative models at similar scales. This creates the opportunity for activities such as model sensitivity analysis and performance comparison at scales that are impractical using standalone software.
Density-based parallel skin lesion border detection with webCL
2015-01-01
Background Dermoscopy is a highly effective and noninvasive imaging technique used in diagnosis of melanoma and other pigmented skin lesions. Many aspects of the lesion under consideration are defined in relation to the lesion border. This makes border detection one of the most important steps in dermoscopic image analysis. In current practice, dermatologists often delineate borders through a hand drawn representation based upon visual inspection. Due to the subjective nature of this technique, intra- and inter-observer variations are common. Because of this, the automated assessment of lesion borders in dermoscopic images has become an important area of study. Methods Fast density based skin lesion border detection method has been implemented in parallel with a new parallel technology called WebCL. WebCL utilizes client side computing capabilities to use available hardware resources such as multi cores and GPUs. Developed WebCL-parallel density based skin lesion border detection method runs efficiently from internet browsers. Results Previous research indicates that one of the highest accuracy rates can be achieved using density based clustering techniques for skin lesion border detection. While these algorithms do have unfavorable time complexities, this effect could be mitigated when implemented in parallel. In this study, density based clustering technique for skin lesion border detection is parallelized and redesigned to run very efficiently on the heterogeneous platforms (e.g. tablets, SmartPhones, multi-core CPUs, GPUs, and fully-integrated Accelerated Processing Units) by transforming the technique into a series of independent concurrent operations. Heterogeneous computing is adopted to support accessibility, portability and multi-device use in the clinical settings. For this, we used WebCL, an emerging technology that enables a HTML5 Web browser to execute code in parallel for heterogeneous platforms. We depicted WebCL and our parallel algorithm design. In addition, we tested parallel code on 100 dermoscopy images and showed the execution speedups with respect to the serial version. Results indicate that parallel (WebCL) version and serial version of density based lesion border detection methods generate the same accuracy rates for 100 dermoscopy images, in which mean of border error is 6.94%, mean of recall is 76.66%, and mean of precision is 99.29% respectively. Moreover, WebCL version's speedup factor for 100 dermoscopy images' lesion border detection averages around ~491.2. Conclusions When large amount of high resolution dermoscopy images considered in a usual clinical setting along with the critical importance of early detection and diagnosis of melanoma before metastasis, the importance of fast processing dermoscopy images become obvious. In this paper, we introduce WebCL and the use of it for biomedical image processing applications. WebCL is a javascript binding of OpenCL, which takes advantage of GPU computing from a web browser. Therefore, WebCL parallel version of density based skin lesion border detection introduced in this study can supplement expert dermatologist, and aid them in early diagnosis of skin lesions. While WebCL is currently an emerging technology, a full adoption of WebCL into the HTML5 standard would allow for this implementation to run on a very large set of hardware and software systems. WebCL takes full advantage of parallel computational resources including multi-cores and GPUs on a local machine, and allows for compiled code to run directly from the Web Browser. PMID:26423836
Density-based parallel skin lesion border detection with webCL.
Lemon, James; Kockara, Sinan; Halic, Tansel; Mete, Mutlu
2015-01-01
Dermoscopy is a highly effective and noninvasive imaging technique used in diagnosis of melanoma and other pigmented skin lesions. Many aspects of the lesion under consideration are defined in relation to the lesion border. This makes border detection one of the most important steps in dermoscopic image analysis. In current practice, dermatologists often delineate borders through a hand drawn representation based upon visual inspection. Due to the subjective nature of this technique, intra- and inter-observer variations are common. Because of this, the automated assessment of lesion borders in dermoscopic images has become an important area of study. Fast density based skin lesion border detection method has been implemented in parallel with a new parallel technology called WebCL. WebCL utilizes client side computing capabilities to use available hardware resources such as multi cores and GPUs. Developed WebCL-parallel density based skin lesion border detection method runs efficiently from internet browsers. Previous research indicates that one of the highest accuracy rates can be achieved using density based clustering techniques for skin lesion border detection. While these algorithms do have unfavorable time complexities, this effect could be mitigated when implemented in parallel. In this study, density based clustering technique for skin lesion border detection is parallelized and redesigned to run very efficiently on the heterogeneous platforms (e.g. tablets, SmartPhones, multi-core CPUs, GPUs, and fully-integrated Accelerated Processing Units) by transforming the technique into a series of independent concurrent operations. Heterogeneous computing is adopted to support accessibility, portability and multi-device use in the clinical settings. For this, we used WebCL, an emerging technology that enables a HTML5 Web browser to execute code in parallel for heterogeneous platforms. We depicted WebCL and our parallel algorithm design. In addition, we tested parallel code on 100 dermoscopy images and showed the execution speedups with respect to the serial version. Results indicate that parallel (WebCL) version and serial version of density based lesion border detection methods generate the same accuracy rates for 100 dermoscopy images, in which mean of border error is 6.94%, mean of recall is 76.66%, and mean of precision is 99.29% respectively. Moreover, WebCL version's speedup factor for 100 dermoscopy images' lesion border detection averages around ~491.2. When large amount of high resolution dermoscopy images considered in a usual clinical setting along with the critical importance of early detection and diagnosis of melanoma before metastasis, the importance of fast processing dermoscopy images become obvious. In this paper, we introduce WebCL and the use of it for biomedical image processing applications. WebCL is a javascript binding of OpenCL, which takes advantage of GPU computing from a web browser. Therefore, WebCL parallel version of density based skin lesion border detection introduced in this study can supplement expert dermatologist, and aid them in early diagnosis of skin lesions. While WebCL is currently an emerging technology, a full adoption of WebCL into the HTML5 standard would allow for this implementation to run on a very large set of hardware and software systems. WebCL takes full advantage of parallel computational resources including multi-cores and GPUs on a local machine, and allows for compiled code to run directly from the Web Browser.
Evaluating Preclinical Medical Students by Using Computer-Based Problem-Solving Examinations.
ERIC Educational Resources Information Center
Stevens, Ronald H.; And Others
1989-01-01
A study to determine the feasibility of creating and administering computer-based problem-solving examinations for evaluating second-year medical students in immunology and to determine how students would perform on these tests relative to their performances on concurrently administered objective and essay examinations is described. (Author/MLW)
The Concept of Nondeterminism: Its Development and Implications for Teaching
ERIC Educational Resources Information Center
Armoni, Michal; Ben-Ari, Mordechai
2009-01-01
Nondeterminism is a fundamental concept in computer science that appears in various contexts such as automata theory, algorithms and concurrent computation. We present a taxonomy of the different ways that nondeterminism can be defined and used; the categories of the taxonomy are domain, nature, implementation, consistency, execution and…
User's instructions for the cardiovascular Walters model
NASA Technical Reports Server (NTRS)
Croston, R. C.
1973-01-01
The model is a combined, steady-state cardiovascular and thermal model. It was originally developed for interactive use, but was converted to batch mode simulation for the Sigma 3 computer. The model has the purpose to compute steady-state circulatory and thermal variables in response to exercise work loads and environmental factors. During a computer simulation run, several selected variables are printed at each time step. End conditions are also printed at the completion of the run.
A Quantum Computing Approach to Model Checking for Advanced Manufacturing Problems
2014-07-01
amount of time. In summary, the tool we developed succeeded in allowing us to produce good solutions for optimization problems that did not fit ...We compared the value of the objective obtained in each run with the known optimal value, and used this information to compute the probability of ...success for each given instance. Then we used this information to compute the expected number of repetitions (or runs) needed to obtain the optimal
NASA Astrophysics Data System (ADS)
Chen, Xiuhong; Huang, Xianglei; Jiao, Chaoyi; Flanner, Mark G.; Raeker, Todd; Palen, Brock
2017-01-01
The suites of numerical models used for simulating climate of our planet are usually run on dedicated high-performance computing (HPC) resources. This study investigates an alternative to the usual approach, i.e. carrying out climate model simulations on commercially available cloud computing environment. We test the performance and reliability of running the CESM (Community Earth System Model), a flagship climate model in the United States developed by the National Center for Atmospheric Research (NCAR), on Amazon Web Service (AWS) EC2, the cloud computing environment by Amazon.com, Inc. StarCluster is used to create virtual computing cluster on the AWS EC2 for the CESM simulations. The wall-clock time for one year of CESM simulation on the AWS EC2 virtual cluster is comparable to the time spent for the same simulation on a local dedicated high-performance computing cluster with InfiniBand connections. The CESM simulation can be efficiently scaled with the number of CPU cores on the AWS EC2 virtual cluster environment up to 64 cores. For the standard configuration of the CESM at a spatial resolution of 1.9° latitude by 2.5° longitude, increasing the number of cores from 16 to 64 reduces the wall-clock running time by more than 50% and the scaling is nearly linear. Beyond 64 cores, the communication latency starts to outweigh the benefit of distributed computing and the parallel speedup becomes nearly unchanged.
Assessment of a new web-based sexual concurrency measurement tool for men who have sex with men.
Rosenberg, Eli S; Rothenberg, Richard B; Kleinbaum, David G; Stephenson, Rob B; Sullivan, Patrick S
2014-11-10
Men who have sex with men (MSM) are the most affected risk group in the United States' human immunodeficiency virus (HIV) epidemic. Sexual concurrency, the overlapping of partnerships in time, accelerates HIV transmission in populations and has been documented at high levels among MSM. However, concurrency is challenging to measure empirically and variations in assessment techniques used (primarily the date overlap and direct question approaches) and the outcomes derived from them have led to heterogeneity and questionable validity of estimates among MSM and other populations. The aim was to evaluate a novel Web-based and interactive partnership-timing module designed for measuring concurrency among MSM, and to compare outcomes measured by the partnership-timing module to those of typical approaches in an online study of MSM. In an online study of MSM aged ≥18 years, we assessed concurrency by using the direct question method and by gathering the dates of first and last sex, with enhanced programming logic, for each reported partner in the previous 6 months. From these methods, we computed multiple concurrency cumulative prevalence outcomes: direct question, day resolution / date overlap, and month resolution / date overlap including both 1-month ties and excluding ties. We additionally computed variants of the UNAIDS point prevalence outcome. The partnership-timing module was also administered. It uses an interactive month resolution calendar to improve recall and follow-up questions to resolve temporal ambiguities, combines elements of the direct question and date overlap approaches. The agreement between the partnership-timing module and other concurrency outcomes was assessed with percent agreement, kappa statistic (κ), and matched odds ratios at the individual, dyad, and triad levels of analysis. Among 2737 MSM who completed the partnership section of the partnership-timing module, 41.07% (1124/2737) of individuals had concurrent partners in the previous 6 months. The partnership-timing module had the highest degree of agreement with the direct question. Agreement was lower with date overlap outcomes (agreement range 79%-81%, κ range .55-.59) and lowest with the UNAIDS outcome at 5 months before interview (65% agreement, κ=.14, 95% CI .12-.16). All agreements declined after excluding individuals with 1 sex partner (always classified as not engaging in concurrency), although the highest agreement was still observed with the direct question technique (81% agreement, κ=.59, 95% CI .55-.63). Similar patterns in agreement were observed with dyad- and triad-level outcomes. The partnership-timing module showed strong concurrency detection ability and agreement with previous measures. These levels of agreement were greater than others have reported among previous measures. The partnership-timing module may be well suited to quantifying concurrency among MSM at multiple levels of analysis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Galassi, Mark C.
Diorama is written as a collection of modules that can run in separate threads or in separate processes. This defines a clear interface between the modules and also allows concurrent processing of different parts of the pipeline. The pipeline is determined by a description in a scenario file[Norman and Tornga, 2012, Tornga and Norman, 2014]. The scenario manager parses the XML scenario and sets up the sequence of modules which will generate an event, propagate the signal to a set of sensors, and then run processing modules on the results provided by those sensor simulations. During a run a varietymore » of “observer” and “processor” modules can be invoked to do interim analysis of results. Observers do not modify the simulation results, while processors may affect the final result. At the end of a run results are collated and final reports are put out. A detailed description of the scenario file and how it puts together a simulation are given in [Tornga and Norman, 2014]. The processing pipeline and how to program it with the Diorama API is described in Tornga et al. [2015] and Tornga and Wakeford [2015]. In this report I describe the communications infrastructure that is used.« less
The Katydid system for compiling KEE applications to Ada
NASA Technical Reports Server (NTRS)
Filman, Robert E.; Bock, Conrad; Feldman, Roy
1990-01-01
Components of a system known as Katydid are developed in an effort to compile knowledge-based systems developed in a multimechanism integrated environment (KEE) to Ada. The Katydid core is an Ada library supporting KEE object functionality, and the other elements include a rule compiler, a LISP-to-Ada translator, and a knowledge-base dumper. Katydid employs translation mechanisms that convert LISP knowledge structures and rules to Ada and utilizes basic prototypes of a run-time KEE object-structure library module for Ada. Preliminary results include the semiautomatic compilation of portions of a simple expert system to run in an Ada environment with the described algorithms. It is suggested that Ada can be employed for AI programming and implementation, and the Katydid system is being developed to include concurrency and synchronization mechanisms.
Running Batch Jobs on Peregrine | High-Performance Computing | NREL
Using Resource Feature to Request Different Node Types Peregrine has several types of compute nodes incompatibility and get the job running. More information about requesting different node types in Peregrine is available. Queues In order to meet the needs of different types of jobs, nodes on Peregrine are available
Host-Nation Operations: Soldier Training on Governance (HOST-G) Training Support Package
2011-07-01
restricted this webpage from running scripts or ActiveX controls that could access your computer. Click here for options…” • If this occurs, select that...scripts and ActiveX controls can be useful, but active content might also harm your computer. Are you sure you want to let this file run active
24 CFR 15.110 - What fees will HUD charge?
Code of Federal Regulations, 2013 CFR
2013-04-01
... duplicating machinery. The computer run time includes the cost of operating a central processing unit for that... Applies. (6) Computer run time (includes only mainframe search time not printing) The direct cost of... estimated fee is more than $250.00 or you have a history of failing to pay FOIA fees to HUD in a timely...
NASA Technical Reports Server (NTRS)
Roberts, Floyd E., III
1994-01-01
Software provides for control and acquisition of data from optical pyrometer. There are six individual programs in PYROLASER package. Provides quick and easy way to set up, control, and program standard Pyrolaser. Temperature and emisivity measurements either collected as if Pyrolaser in manual operating mode or displayed on real-time strip charts and stored in standard spreadsheet format for posttest analysis. Shell supplied to allow macros, which are test-specific, added to system easily. Written using Labview software for use on Macintosh-series computers running System 6.0.3 or later, Sun Sparc-series computers running Open-Windows 3.0 or MIT's X Window System (X11R4 or X11R5), and IBM PC or compatible computers running Microsoft Windows 3.1 or later.
NASA Technical Reports Server (NTRS)
1972-01-01
The IDAPS (Image Data Processing System) is a user-oriented, computer-based, language and control system, which provides a framework or standard for implementing image data processing applications, simplifies set-up of image processing runs so that the system may be used without a working knowledge of computer programming or operation, streamlines operation of the image processing facility, and allows multiple applications to be run in sequence without operator interaction. The control system loads the operators, interprets the input, constructs the necessary parameters for each application, and cells the application. The overlay feature of the IBSYS loader (IBLDR) provides the means of running multiple operators which would otherwise overflow core storage.
Research on Synthesis of Concurrent Computing Systems.
1982-09-01
20 1.5.1 An Informal Description of the Techniques ....... ..................... 20 1.5 2 Formal Definitions of Aggregation and Virtualisation ...sparsely interconnected networks . We have also developed techniques to create Kung’s systolic array parallel structure from a specification of matrix...resufts of the computation of that element. For example, if A,j is computed using a single enumeration, then virtualisation would produce a three
Concurrent Simulation of the Eddying General Circulation and Tides in a Global Ocean Model
2010-01-01
Eddying General Circulation and Tides in a Global Ocean Model 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 0602435N 6...STATEMENT Approved for public release, distribution is unlimited. 13. SUPPLEMENTARY NOTES 14. ABSTRACT This paper presents a five-year global ...running 25-h average to approximately separate tidal and non-tidal components of the near-bottom flow. In contrast to earlier high-resolution global
Identification of Program Signatures from Cloud Computing System Telemetry Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nichols, Nicole M.; Greaves, Mark T.; Smith, William P.
Malicious cloud computing activity can take many forms, including running unauthorized programs in a virtual environment. Detection of these malicious activities while preserving the privacy of the user is an important research challenge. Prior work has shown the potential viability of using cloud service billing metrics as a mechanism for proxy identification of malicious programs. Previously this novel detection method has been evaluated in a synthetic and isolated computational environment. In this paper we demonstrate the ability of billing metrics to identify programs, in an active cloud computing environment, including multiple virtual machines running on the same hypervisor. The openmore » source cloud computing platform OpenStack, is used for private cloud management at Pacific Northwest National Laboratory. OpenStack provides a billing tool (Ceilometer) to collect system telemetry measurements. We identify four different programs running on four virtual machines under the same cloud user account. Programs were identified with up to 95% accuracy. This accuracy is dependent on the distinctiveness of telemetry measurements for the specific programs we tested. Future work will examine the scalability of this approach for a larger selection of programs to better understand the uniqueness needed to identify a program. Additionally, future work should address the separation of signatures when multiple programs are running on the same virtual machine.« less
Evaluation of Methods for Multidisciplinary Design Optimization (MDO). Part 2
NASA Technical Reports Server (NTRS)
Kodiyalam, Srinivas; Yuan, Charles; Sobieski, Jaroslaw (Technical Monitor)
2000-01-01
A new MDO method, BLISS, and two different variants of the method, BLISS/RS and BLISS/S, have been implemented using iSIGHT's scripting language and evaluated in this report on multidisciplinary problems. All of these methods are based on decomposing a modular system optimization system into several subtasks optimization, that may be executed concurrently, and the system optimization that coordinates the subtasks optimization. The BLISS method and its variants are well suited for exploiting the concurrent processing capabilities in a multiprocessor machine. Several steps, including the local sensitivity analysis, local optimization, response surfaces construction and updates are all ideally suited for concurrent processing. Needless to mention, such algorithms that can effectively exploit the concurrent processing capabilities of the compute servers will be a key requirement for solving large-scale industrial design problems, such as the automotive vehicle problem detailed in Section 3.4.
Whitehead, RA; Lam, NL; Sun, MS; Sanchez, JJ; Noor, S; Vanderwall, AG; Petersen, TR; Martin, HB
2016-01-01
BACKGROUND Animal models of peripheral neuropathy produced by a number of manipulations are assessed for the presence of pathological pain states such as allodynia. While stimulus-induced behavioral assays are frequently used and important to examine allodynia (i.e. sensitivity to light mechanical touch; von Frey fiber test) other measures of behavior that reflect overall function are not only complementary to stimulus-induced responsive measures, but are also critical to gain a complete understanding of the effects of the pain model on quality of life, a clinically relevant aspect of pain on general function. Voluntary wheel running activity in rodent models of inflammatory and muscle pain is emerging as a reliable index of general function that extends beyond stimulus-induced behavioral assays. Clinically, reports of increased pain intensity occur at night, a period typically characterized with reduced activity during the diurnal cycle. We therefore examined in rats whether alterations in wheel running activity were more robust during the inactive phase compared to the active phase of their diurnal cycle in a widely used rodent model of chronic peripheral neuropathic pain, the sciatic nerve chronic constriction injury (CCI) model. METHODS In adult male Sprague Dawley rats, baseline (BL) hindpaw threshold responses to light mechanical touch were assessed using the von Frey test prior to measuring BL activity levels using freely accessible running wheels (1 hr/day for 7 sequential days) to quantify distance traveled. Running wheel activity BL values are expressed as total distance traveled (m). The overall experimental design was: following BL measures, rats underwent either sham or CCI surgery followed by repeated behavioral re-assessment of hindpaw thresholds and wheel running activity levels for up to 18 days after surgery. Specifically, separate groups of rats were assessed for wheel running activity levels (1 hr total/trial) during the onset (within first 2 hrs) of either the (1) inactive (n=8/gp) or (2) active (n = 8/gp) phase of the diurnal cycle. An additional group of CCI-treated rats (n = 8/gp) were exposed to a locked running wheel to control for the potential effects of wheel running exercise on allodynia. The 1-hr running wheel trial period was further examined at discrete 20-min intervals to identify possible pattern differences in activity during the first, middle and last portion of the 1-hr trial. The effect of neuropathy on activity levels were assessed by measuring the change from their respective BLs to distance traveled in the running wheels. RESULTS While wheel running distances between groups were not different at BL from rats examined during either the inactive phase of the diurnal cycle or active phase of the diurnal cycle, sciatic nerve CCI reduced running wheel activity levels compared to sham-operated controls during the inactive phase. Additionally, compared to sham controls, bilateral low threshold mechanical allodynia was observed at all time-points after surgical induction of neuropathy in rats with free-wheel and locked-wheel access. Allodynia in CCI compared to shams was replicated in rats whose running wheel activity was examined during the active phase of the diurnal cycle. Conversely, no significant reduction in wheel running activity was observed in CCI-treated rats compared to sham controls at any timepoint when activity levels were examined during the active diurnal phase. Lastly, running wheel activity patterns within the 1 hr trial period during the inactive phase of the diurnal cycle were relatively consistent throughout each 20 min phase. CONCLUSIONS Compared to non-neuropathic sham controls, a profound and stable reduction of running wheel activity was observed in CCI rats during the inactive phase of the diurnal cycle. A concurrent robust allodynia persisted in all rats regardless of when wheel running activity was examined or whether they ran on wheels, suggesting that acute wheel running activity does not alter chronic low intensity mechanical allodynia as measured using the von Frey fiber test. Overall, these data support that acute wheel running exercise with limited repeated exposures does not itself alter allodynia and offers a behavioral assay complementary to stimulus-induced measures of neuropathic pain. PMID:27782944
Providing Assistive Technology Applications as a Service Through Cloud Computing.
Mulfari, Davide; Celesti, Antonio; Villari, Massimo; Puliafito, Antonio
2015-01-01
Users with disabilities interact with Personal Computers (PCs) using Assistive Technology (AT) software solutions. Such applications run on a PC that a person with a disability commonly uses. However the configuration of AT applications is not trivial at all, especially whenever the user needs to work on a PC that does not allow him/her to rely on his / her AT tools (e.g., at work, at university, in an Internet point). In this paper, we discuss how cloud computing provides a valid technological solution to enhance such a scenario.With the emergence of cloud computing, many applications are executed on top of virtual machines (VMs). Virtualization allows us to achieve a software implementation of a real computer able to execute a standard operating system and any kind of application. In this paper we propose to build personalized VMs running AT programs and settings. By using the remote desktop technology, our solution enables users to control their customized virtual desktop environment by means of an HTML5-based web interface running on any computer equipped with a browser, whenever they are.
29 CFR 541.106 - Concurrent duties.
Code of Federal Regulations, 2010 CFR
2010-07-01
... DELIMITING THE EXEMPTIONS FOR EXECUTIVE, ADMINISTRATIVE, PROFESSIONAL, COMPUTER AND OUTSIDE SALES EMPLOYEES..., cooking food, stocking shelves and cleaning the establishment, but performance of such nonexempt work does...
Yang, Hui-Wen; Chou, Jian-Liang; Chen, Lin-Yu; Yeh, Chia-Ming; Chen, Yu-Hsin; Lin, Ru-Inn; Su, Her-Young; Chen, Gary CW; Deatherage, Daniel E; Huang, Yi-Wen; Yan, Pearlly S; Lin, Huey-Jen; Nephew, Kenneth P; Huang, Tim H-M; Lai, Hung-Cheng
2011-01-01
Aberrant TGFβ signaling pathway may alter the expression of down-stream targets and promotes ovarian carcinogenesis. However, the mechanism of this impairment is not fully understood. Our previous study identified RunX1T1 as a putative SMAD4 target in an immortalized ovarian surface epithelial cell line, IOSE. In this study, we report that transcription of RunX1T1 was confirmed to be positively regulated by SMAD4 in IOSE cells and epigenetically silenced in a panel of ovarian cancer cell lines by promoter hypermethylation and histone methylation at H3 lysine 9. SMAD4 depletion increased repressive histone modifications of RunX1T1 promoter without affecting promoter methylation in IOSE cells. Epigenetic treatment can restore RunX1T1 expression by reversing its epigenetic status in MCP 3 ovarian cancer cells. When transiently treated with a demethylating agent, the expression of RunX1T1 was partially restored in MCP 3 cells, but gradual re-silencing through promoter re-methylation was observed after the treatment. Interestingly, SMAD4 knockdown accelerated this re-silencing process, suggesting that normal TGFβ signaling is essential for the maintenance of RunX1T1 expression. In vivo analysis confirmed that hypermethylation of RunX1T1 was detected in 35.7% (34/95) of ovarian tumors with high clinical stages (p = 0.035) and in 83% (5/6) of primary ovarian cancer-initiating cells. Additionally, concurrent methylation of RunX1T1 and another SMAD4 target, FBXO32 which was previously found to be hypermethylated in ovarian cancer was observed in this same sample cohort (p < 0.05). Restoration of RunX1T1 inhibited cancer cell growth. Taken together, dysregulated TGFβ/SMAD4 signaling may lead to epigenetic silencing of a putative tumor suppressor, RunX1T1, during ovarian carcinogenesis. PMID:21540640
MATH77 - A LIBRARY OF MATHEMATICAL SUBPROGRAMS FOR FORTRAN 77, RELEASE 4.0
NASA Technical Reports Server (NTRS)
Lawson, C. L.
1994-01-01
MATH77 is a high quality library of ANSI FORTRAN 77 subprograms implementing contemporary algorithms for the basic computational processes of science and engineering. The portability of MATH77 meets the needs of present-day scientists and engineers who typically use a variety of computing environments. Release 4.0 of MATH77 contains 454 user-callable and 136 lower-level subprograms. Usage of the user-callable subprograms is described in 69 sections of the 416 page users' manual. The topics covered by MATH77 are indicated by the following list of chapter titles in the users' manual: Mathematical Functions, Pseudo-random Number Generation, Linear Systems of Equations and Linear Least Squares, Matrix Eigenvalues and Eigenvectors, Matrix Vector Utilities, Nonlinear Equation Solving, Curve Fitting, Table Look-Up and Interpolation, Definite Integrals (Quadrature), Ordinary Differential Equations, Minimization, Polynomial Rootfinding, Finite Fourier Transforms, Special Arithmetic , Sorting, Library Utilities, Character-based Graphics, and Statistics. Besides subprograms that are adaptations of public domain software, MATH77 contains a number of unique packages developed by the authors of MATH77. Instances of the latter type include (1) adaptive quadrature, allowing for exceptional generality in multidimensional cases, (2) the ordinary differential equations solver used in spacecraft trajectory computation for JPL missions, (3) univariate and multivariate table look-up and interpolation, allowing for "ragged" tables, and providing error estimates, and (4) univariate and multivariate derivative-propagation arithmetic. MATH77 release 4.0 is a subroutine library which has been carefully designed to be usable on any computer system that supports the full ANSI standard FORTRAN 77 language. It has been successfully implemented on a CRAY Y/MP computer running UNICOS, a UNISYS 1100 computer running EXEC 8, a DEC VAX series computer running VMS, a Sun4 series computer running SunOS, a Hewlett-Packard 720 computer running HP-UX, a Macintosh computer running MacOS, and an IBM PC compatible computer running MS-DOS. Accompanying the library is a set of 196 "demo" drivers that exercise all of the user-callable subprograms. The FORTRAN source code for MATH77 comprises 109K lines of code in 375 files with a total size of 4.5Mb. The demo drivers comprise 11K lines of code and 418K. Forty-four percent of the lines of the library code and 29% of those in the demo code are comment lines. The standard distribution medium for MATH77 is a .25 inch streaming magnetic tape cartridge in UNIX tar format. It is also available on a 9track 1600 BPI magnetic tape in VAX BACKUP format and a TK50 tape cartridge in VAX BACKUP format. An electronic copy of the documentation is included on the distribution media. Previous releases of MATH77 have been used over a number of years in a variety of JPL applications. MATH77 Release 4.0 was completed in 1992. MATH77 is a copyrighted work with all copyright vested in NASA.
Computer Simulation of Great Lakes-St. Lawrence Seaway Icebreaker Requirements.
1980-01-01
of Run No. 1 for Taconite Task Command ... ....... 6-41 6.22d Results of Run No. I for Oil Can Task Command ........ ... 6-42 6.22e Results of Run No...Port and Period for Run No. 2 ... .. ... ... 6-47 6.23c Results of Run No. 2 for Taconite Task Command ... ....... 6-48 6.23d Results of Run No. 2 for...6-53 6.24b Predicted Icebreaker Fleet by Home Port and Period for Run No. 3 6-54 6.24c Results of Run No. 3 for Taconite Task Command. ....... 6
Software For Drawing Design Details Concurrently
NASA Technical Reports Server (NTRS)
Crosby, Dewey C., III
1990-01-01
Software system containing five computer-aided-design programs enables more than one designer to work on same part or assembly at same time. Reduces time necessary to produce design by implementing concept of parallel or concurrent detailing, in which all detail drawings documenting three-dimensional model of part or assembly produced simultaneously, rather than sequentially. Keeps various detail drawings consistent with each other and with overall design by distributing changes in each detail to all other affected details.
The rid-redundant procedure in C-Prolog
NASA Technical Reports Server (NTRS)
Chen, Huo-Yan; Wah, Benjamin W.
1987-01-01
C-Prolog can conveniently be used for logical inferences on knowledge bases. However, as similar to many search methods using backward chaining, a large number of redundant computation may be produced in recursive calls. To overcome this problem, the 'rid-redundant' procedure was designed to rid all redundant computations in running multi-recursive procedures. Experimental results obtained for C-Prolog on the Vax 11/780 computer show that there is an order of magnitude improvement in the running time and solvable problem size.
An Upgrade of the Aeroheating Software ''MINIVER''
NASA Technical Reports Server (NTRS)
Louderback, Pierce
2013-01-01
Detailed computational modeling: CFO often used to create and execute computational domains. Increasing complexity when moving from 20 to 30 geometries. Computational time increased as finer grids are used (accuracy). Strong tool, but takes time to set up and run. MINIVER: Uses theoretical and empirical correlations. Orders of magnitude faster to set up and run. Not as accurate as CFO, but gives reasonable estimations. MINIVER's Drawbacks: Rigid command-line interface. Lackluster, unorganized documentation. No central control; multiple versions exist and have diverged.
A Functional Description of the Geophysical Data Acquisition System
1990-08-10
less than 50 SPS nor greater than 250 SPS 3.0 SENSORS/TRANSDUCERS 3.1 CHAPTER OVERVIEW Most of the research supported by GDAS has primarily involved two...signal for the computer. The SRUN signal from the computer is fed to a retriggerable oneshot multivibrator on the board. SRUN consists of a pulse train...that is present when the computer is running. The oneshot output drives the RUN lamp on the front panel. Finally, one pin on the board edge connector is
Network support for system initiated checkpoints
Chen, Dong; Heidelberger, Philip
2013-01-29
A system, method and computer program product for supporting system initiated checkpoints in parallel computing systems. The system and method generates selective control signals to perform checkpointing of system related data in presence of messaging activity associated with a user application running at the node. The checkpointing is initiated by the system such that checkpoint data of a plurality of network nodes may be obtained even in the presence of user applications running on highly parallel computers that include ongoing user messaging activity.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eyler, L L; Trent, D S; Budden, M J
During the course of the TEMPEST computer code development a concurrent effort was conducted to assess the code's performance and the validity of computed results. The results of this work are presented in this document. The principal objective of this effort was to assure the code's computational correctness for a wide range of hydrothermal phenomena typical of fast breeder reactor application. 47 refs., 94 figs., 6 tabs.
Convergence properties of simple genetic algorithms
NASA Technical Reports Server (NTRS)
Bethke, A. D.; Zeigler, B. P.; Strauss, D. M.
1974-01-01
The essential parameters determining the behaviour of genetic algorithms were investigated. Computer runs were made while systematically varying the parameter values. Results based on the progress curves obtained from these runs are presented along with results based on the variability of the population as the run progresses.
Guanche, Carlos A; Sikka, Robby S
2005-05-01
The use of hip arthroscopy has helped delineate intra-articular pathology and has enabled clinicians to further elucidate the factors responsible for injuries, such as running. The subtle development of degenerative changes may be a result of repetitive impact loading associated with this sport. This study presents a population of runners with common pathologic acetabular changes. Case series. Eight high-level runners with an average age of 36 years (range, 19 to 45 years) were seen for complaints of increasing hip pain with running without any history of macrotrauma. All of the patients had either run several marathons (4), were triathletes (1), Olympic middle distance runners (1), or had run more than 10 miles per week for longer than 5 years (2). Plain radiographic analysis revealed no degenerative changes and an average center-edge (CE) angle of 36.7 degrees (range, 28 degrees to 44 degrees). All patients underwent hip arthroscopy with labral debridement. In 6 patients (75%), a chondral injury of the acetabular cartilage underlying the labral tear was noted. In addition, 3 patients had ligamentum teres disruptions. It is possible that the development of these tears is the result of subtle instability, which may be exacerbated by running, eventually leading to labral tearing and possible ligamentum teres disruption. While perhaps concurrently, subtle acetabular dysplasia may play a role. Although this study does not confirm an association between running and the development of labral tears or chondral lesions in the hip, it certainly questions whether there is an injury pattern common to this population, a "runner's hip." Level IV.
Modeling Subsurface Reactive Flows Using Leadership-Class Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mills, Richard T; Hammond, Glenn; Lichtner, Peter
2009-01-01
We describe our experiences running PFLOTRAN - a code for simulation of coupled hydro-thermal-chemical processes in variably saturated, non-isothermal, porous media - on leadership-class supercomputers, including initial experiences running on the petaflop incarnation of Jaguar, the Cray XT5 at the National Center for Computational Sciences at Oak Ridge National Laboratory. PFLOTRAN utilizes fully implicit time-stepping and is built on top of the Portable, Extensible Toolkit for Scientific Computation (PETSc). We discuss some of the hurdles to 'at scale' performance with PFLOTRAN and the progress we have made in overcoming them on leadership-class computer architectures.
A Study of Shared-Memory Mutual Exclusion Protocols Using CADP
NASA Astrophysics Data System (ADS)
Mateescu, Radu; Serwe, Wendelin
Mutual exclusion protocols are an essential building block of concurrent systems: indeed, such a protocol is required whenever a shared resource has to be protected against concurrent non-atomic accesses. Hence, many variants of mutual exclusion protocols exist in the shared-memory setting, such as Peterson's or Dekker's well-known protocols. Although the functional correctness of these protocols has been studied extensively, relatively little attention has been paid to their non-functional aspects, such as their performance in the long run. In this paper, we report on experiments with the performance evaluation of mutual exclusion protocols using Interactive Markov Chains. Steady-state analysis provides an additional criterion for comparing protocols, which complements the verification of their functional properties. We also carefully re-examined the functional properties, whose accurate formulation as temporal logic formulas in the action-based setting turns out to be quite involved.
NASA Technical Reports Server (NTRS)
Collier, Mark D.; Killough, Ronnie; Martin, Nancy L.
1990-01-01
NASA is currently using a set of applications called the Display Builder and Display Manager. They run on Concurrent systems and heavily depend on the Graphic Kernel System (GKS). At this time however, these two applications would more appropriately be developed in X Windows, in which a low X is used for all actual text and graphics display and a standard widget set (such as Motif) is used for the user interface. Use of the X Windows will increase performance, improve the user interface, enhance portability, and improve reliability. Prototype of X Window/Motif based Display Manager provides the following advantages over a GKS based application: improved performance by using a low level X Windows, display of graphic and text will be more efficient; improved user interface by using Motif; Improved portability by operating on both Concurrent and Sun workstations; and Improved reliability.
Grotzkyj Giorgi, Margherita; Howland, Kevin; Martin, Colin; Bonner, Adrian B.
2012-01-01
An HPLC method was developed and validated for the concurrent detection and quantitation of seven water-soluble vitamins (C, B1, B2, B5, B6, B9, B12) in biological matrices (plasma and urine). Separation was achieved at 30°C on a reversed-phase C18-A column using combined isocratic and linear gradient elution with a mobile phase consisting of 0.01% TFA aqueous and 100% methanol. Total run time was 35 minutes. Detection was performed with diode array set at 280 nm. Each vitamin was quantitatively determined at its maximum wavelength. Spectral comparison was used for peak identification in real samples (24 plasma and urine samples from abstinent alcohol-dependent males). Interday and intraday precision were <4% and <7%, respectively, for all vitamins. Recovery percentages ranged from 93% to 100%. PMID:22536136
NASA Technical Reports Server (NTRS)
Wilson, Edward (Inventor)
2006-01-01
The present invention is a method for identifying unknown parameters in a system having a set of governing equations describing its behavior that cannot be put into regression form with the unknown parameters linearly represented. In this method, the vector of unknown parameters is segmented into a plurality of groups where each individual group of unknown parameters may be isolated linearly by manipulation of said equations. Multiple concurrent and independent recursive least squares identification of each said group run, treating other unknown parameters appearing in their regression equation as if they were known perfectly, with said values provided by recursive least squares estimation from the other groups, thereby enabling the use of fast, compact, efficient linear algorithms to solve problems that would otherwise require nonlinear solution approaches. This invention is presented with application to identification of mass and thruster properties for a thruster-controlled spacecraft.
Giorgi, Margherita Grotzkyj; Howland, Kevin; Martin, Colin; Bonner, Adrian B
2012-01-01
An HPLC method was developed and validated for the concurrent detection and quantitation of seven water-soluble vitamins (C, B(1), B(2), B(5), B(6), B(9), B(12)) in biological matrices (plasma and urine). Separation was achieved at 30°C on a reversed-phase C18-A column using combined isocratic and linear gradient elution with a mobile phase consisting of 0.01% TFA aqueous and 100% methanol. Total run time was 35 minutes. Detection was performed with diode array set at 280 nm. Each vitamin was quantitatively determined at its maximum wavelength. Spectral comparison was used for peak identification in real samples (24 plasma and urine samples from abstinent alcohol-dependent males). Interday and intraday precision were <4% and <7%, respectively, for all vitamins. Recovery percentages ranged from 93% to 100%.
Williams, Paul T
2012-01-01
Current physical activity recommendations assume that different activities can be exchanged to produce the same weight-control benefits so long as total energy expended remains the same (exchangeability premise). To this end, they recommend calculating energy expenditure as the product of the time spent performing each activity and the activity's metabolic equivalents (MET), which may be summed to achieve target levels. The validity of the exchangeability premise was assessed using data from the National Runners' Health Study. Physical activity dose was compared to body mass index (BMI) and body circumferences in 33,374 runners who reported usual distance run and pace, and usual times spent running and other exercises per week. MET hours per day (METhr/d) from running was computed from: a) time and intensity, and b) reported distance run (1.02 MET • hours per km). When computed from time and intensity, the declines (slope±SE) per METhr/d were significantly greater (P<10(-15)) for running than non-running exercise for BMI (slopes±SE, male: -0.12 ± 0.00 vs. 0.00±0.00; female: -0.12 ± 0.00 vs. -0.01 ± 0.01 kg/m(2) per METhr/d) and waist circumference (male: -0.28 ± 0.01 vs. -0.07±0.01; female: -0. 31±0.01 vs. -0.05 ± 0.01 cm per METhr/d). Reported METhr/d of running was 38% to 43% greater when calculated from time and intensity than distance. Moreover, the declines per METhr/d run were significantly greater when estimated from reported distance for BMI (males: -0.29 ± 0.01; females: -0.27 ± 0.01 kg/m(2) per METhr/d) and waist circumference (males: -0.67 ± 0.02; females: -0.69 ± 0.02 cm per METhr/d) than when computed from time and intensity (cited above). The exchangeability premise was not supported for running vs. non-running exercise. Moreover, distance-based running prescriptions may provide better weight control than time-based prescriptions for running or other activities. Additional longitudinal studies and randomized clinical trials are required to verify these results prospectively.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arvind; Gostelow, K.P.
1982-02-01
The author argues that by giving a unique name to every activity generated during a computation, the u-interpreter can provide greater concurrency in the interpretation of data flow graphs. 19 references.
A PICKSC Science Gateway for enabling the common plasma physicist to run kinetic software
NASA Astrophysics Data System (ADS)
Hu, Q.; Winjum, B. J.; Zonca, A.; Youn, C.; Tsung, F. S.; Mori, W. B.
2017-10-01
Computer simulations offer tremendous opportunities for studying plasmas, ranging from simulations for students that illuminate fundamental educational concepts to research-level simulations that advance scientific knowledge. Nevertheless, there is a significant hurdle to using simulation tools. Users must navigate codes and software libraries, determine how to wrangle output into meaningful plots, and oftentimes confront a significant cyberinfrastructure with powerful computational resources. Science gateways offer a Web-based environment to run simulations without needing to learn or manage the underlying software and computing cyberinfrastructure. We discuss our progress on creating a Science Gateway for the Particle-in-Cell and Kinetic Simulation Software Center that enables users to easily run and analyze kinetic simulations with our software. We envision that this technology could benefit a wide range of plasma physicists, both in the use of our simulation tools as well as in its adaptation for running other plasma simulation software. Supported by NSF under Grant ACI-1339893 and by the UCLA Institute for Digital Research and Education.
The Design of a High Performance Earth Imagery and Raster Data Management and Processing Platform
NASA Astrophysics Data System (ADS)
Xie, Qingyun
2016-06-01
This paper summarizes the general requirements and specific characteristics of both geospatial raster database management system and raster data processing platform from a domain-specific perspective as well as from a computing point of view. It also discusses the need of tight integration between the database system and the processing system. These requirements resulted in Oracle Spatial GeoRaster, a global scale and high performance earth imagery and raster data management and processing platform. The rationale, design, implementation, and benefits of Oracle Spatial GeoRaster are described. Basically, as a database management system, GeoRaster defines an integrated raster data model, supports image compression, data manipulation, general and spatial indices, content and context based queries and updates, versioning, concurrency, security, replication, standby, backup and recovery, multitenancy, and ETL. It provides high scalability using computer and storage clustering. As a raster data processing platform, GeoRaster provides basic operations, image processing, raster analytics, and data distribution featuring high performance computing (HPC). Specifically, HPC features include locality computing, concurrent processing, parallel processing, and in-memory computing. In addition, the APIs and the plug-in architecture are discussed.
Creating a Parallel Version of VisIt for Microsoft Windows
DOE Office of Scientific and Technical Information (OSTI.GOV)
Whitlock, B J; Biagas, K S; Rawson, P L
2011-12-07
VisIt is a popular, free interactive parallel visualization and analysis tool for scientific data. Users can quickly generate visualizations from their data, animate them through time, manipulate them, and save the resulting images or movies for presentations. VisIt was designed from the ground up to work on many scales of computers from modest desktops up to massively parallel clusters. VisIt is comprised of a set of cooperating programs. All programs can be run locally or in client/server mode in which some run locally and some run remotely on compute clusters. The VisIt program most able to harness today's computing powermore » is the VisIt compute engine. The compute engine is responsible for reading simulation data from disk, processing it, and sending results or images back to the VisIt viewer program. In a parallel environment, the compute engine runs several processes, coordinating using the Message Passing Interface (MPI) library. Each MPI process reads some subset of the scientific data and filters the data in various ways to create useful visualizations. By using MPI, VisIt has been able to scale well into the thousands of processors on large computers such as dawn and graph at LLNL. The advent of multicore CPU's has made parallelism the 'new' way to achieve increasing performance. With today's computers having at least 2 cores and in many cases up to 8 and beyond, it is more important than ever to deploy parallel software that can use that computing power not only on clusters but also on the desktop. We have created a parallel version of VisIt for Windows that uses Microsoft's MPI implementation (MSMPI) to process data in parallel on the Windows desktop as well as on a Windows HPC cluster running Microsoft Windows Server 2008. Initial desktop parallel support for Windows was deployed in VisIt 2.4.0. Windows HPC cluster support has been completed and will appear in the VisIt 2.5.0 release. We plan to continue supporting parallel VisIt on Windows so our users will be able to take full advantage of their multicore resources.« less
Performing concurrent operations in academic vascular neurosurgery does not affect patient outcomes.
Zygourakis, Corinna C; Lee, Janelle; Barba, Julio; Lobo, Errol; Lawton, Michael T
2017-11-01
OBJECTIVE Concurrent surgeries, also known as "running two rooms" or simultaneous/overlapping operations, have recently come under intense scrutiny. The goal of this study was to evaluate the operative time and outcomes of concurrent versus nonconcurrent vascular neurosurgical procedures. METHODS The authors retrospectively reviewed 1219 procedures performed by 1 vascular neurosurgeon from 2012 to 2015 at the University of California, San Francisco. Data were collected on patient age, sex, severity of illness, risk of mortality, American Society of Anesthesiologists (ASA) status, procedure type, admission type, insurance, transfer source, procedure time, presence of resident or fellow in operating room (OR), number of co-surgeons, estimated blood loss (EBL), concurrent vs nonconcurrent case, severe sepsis, acute respiratory failure, postoperative stroke causing neurological deficit, unplanned return to OR, 30-day mortality, and 30-day unplanned readmission. For aneurysm clipping cases, data were also obtained on intraoperative aneurysm rupture and postoperative residual aneurysm. Chi-square and t-tests were performed to compare concurrent versus nonconcurrent cases, and then mixed-effects models were created to adjust for different procedure types, patient demographics, and clinical indicators between the 2 groups. RESULTS There was a significant difference in procedure type for concurrent (n = 828) versus nonconcurrent (n = 391) cases. Concurrent cases were more likely to be routine/elective admissions (53% vs 35%, p < 0.001) and physician referrals (59% vs 38%, p < 0.001). This difference in patient/case type was also reflected in the lower severity of illness, risk of death, and ASA class in the concurrent versus nonconcurrent cases (p < 0.01). Concurrent cases had significantly longer procedural times (243 vs 213 minutes) and more unplanned 30-day readmissions (5.7% vs 3.1%), but shorter mean length of hospital stay (11.2 vs 13.7 days), higher rates of discharge to home (66% vs 51%), lower 30-day mortality rates (3.1% vs 6.1%), lower rates of acute respiratory failure (4.3% vs 8.2%), and decreased 30-day unplanned returns to the OR (3.3% vs 6.9%; all p < 0.05). Rates of severe sepsis, postoperative stroke, intraoperative aneurysm rupture, and postoperative aneurysm residual were equivalent between the concurrent and nonconcurrent groups (all p values nonsignificant). Mixed-effects models showed that after controlling for procedure type, patient demographics, and clinical indicators, there was no significant difference in acute respiratory failure, severe sepsis, 30-day readmission, postoperative stroke, EBL, length of stay, discharge status, or intraoperative aneurysm rupture between concurrent and nonconcurrent cases. Unplanned return to the OR and 30-day mortality were significantly lower in concurrent cases (odds ratio 0.55, 95% confidence interval 0.31-0.98, p = 0.0431, and odds ratio 0.81, p < 0.001, respectively), but concurrent cases had significantly longer procedure durations (odds ratio 21.73; p < 0.001). CONCLUSIONS Overall, there was a significant difference in the types of concurrent versus nonconcurrent cases, with more routine/elective cases for less sick patients scheduled in an overlapping fashion. After adjusting for patient demographics, procedure type, and clinical indicators, concurrent cases had longer procedure times, but equivalent patient outcomes, as compared with nonconcurrent vascular neurosurgical procedures.
Percent Grammatical Responses as a General Outcome Measure: Initial Validity
ERIC Educational Resources Information Center
Eisenberg, Sarita L.; Guo, Ling-Yu
2018-01-01
Purpose: This report investigated the validity of using percent grammatical responses (PGR) as a measure for assessing grammaticality. To establish construct validity, we computed the correlation of PGR with another measure of grammar skills and with an unrelated skill area. To establish concurrent validity for PGR, we computed the correlation of…
Quizzing and Feedback in Computer-Based and Book-Based Training for Workplace Safety and Health
ERIC Educational Resources Information Center
Rohlman, Diane S.; Eckerman, David A.; Ammerman, Tammara A.; Fercho, Heather L.; Lundeen, Christine A.; Blomquist, Carrie; Anger, W. Kent
2005-01-01
Participants received different amounts of information in either a cTRAIN computer-based instruction (CBI) program or in a booklet format, presented before or concurrently with interactive questions about the information. An interactive CBI presentation that required an overt response during training produced equivalent acquisition and retention…
NASA Astrophysics Data System (ADS)
Varela Rodriguez, F.
2011-12-01
The control system of each of the four major Experiments at the CERN Large Hadron Collider (LHC) is distributed over up to 160 computers running either Linux or Microsoft Windows. A quick response to abnormal situations of the computer infrastructure is crucial to maximize the physics usage. For this reason, a tool was developed to supervise, identify errors and troubleshoot such a large system. Although the monitoring of the performance of the Linux computers and their processes was available since the first versions of the tool, it is only recently that the software package has been extended to provide similar functionality for the nodes running Microsoft Windows as this platform is the most commonly used in the LHC detector control systems. In this paper, the architecture and the functionality of the Windows Management Instrumentation (WMI) client developed to provide centralized monitoring of the nodes running different flavour of the Microsoft platform, as well as the interface to the SCADA software of the control systems are presented. The tool is currently being commissioned by the Experiments and it has already proven to be very efficient optimize the running systems and to detect misbehaving processes or nodes.
Computational steering of GEM based detector simulations
NASA Astrophysics Data System (ADS)
Sheharyar, Ali; Bouhali, Othmane
2017-10-01
Gas based detector R&D relies heavily on full simulation of detectors and their optimization before final prototypes can be built and tested. These simulations in particular those with complex scenarios such as those involving high detector voltages or gas with larger gains are computationally intensive may take several days or weeks to complete. These long-running simulations usually run on the high-performance computers in batch mode. If the results lead to unexpected behavior, then the simulation might be rerun with different parameters. However, the simulations (or jobs) may have to wait in a queue until they get a chance to run again because the supercomputer is a shared resource that maintains a queue of other user programs as well and executes them as time and priorities permit. It may result in inefficient resource utilization and increase in the turnaround time for the scientific experiment. To overcome this issue, the monitoring of the behavior of a simulation, while it is running (or live), is essential. In this work, we employ the computational steering technique by coupling the detector simulations with a visualization package named VisIt to enable the exploration of the live data as it is produced by the simulation.
NSTX-U Control System Upgrades
Erickson, K. G.; Gates, D. A.; Gerhardt, S. P.; ...
2014-06-01
The National Spherical Tokamak Experiment (NSTX) is undergoing a wealth of upgrades (NSTX-U). These upgrades, especially including an elongated pulse length, require broad changes to the control system that has served NSTX well. A new fiber serial Front Panel Data Port input and output (I/O) stream will supersede the aging copper parallel version. Driver support for the new I/O and cyber security concerns require updating the operating system from Redhat Enterprise Linux (RHEL) v4 to RedHawk (based on RHEL) v6. While the basic control system continues to use the General Atomics Plasma Control System (GA PCS), the effort to forwardmore » port the entire software package to run under 64-bit Linux instead of 32-bit Linux included PCS modifications subsequently shared with GA and other PCS users. Software updates focused on three key areas: (1) code modernization through coding standards (C99/C11), (2) code portability and maintainability through use of the GA PCS code generator, and (3) support of 64-bit platforms. Central to the control system upgrade is the use of a complete real time (RT) Linux platform provided by Concurrent Computer Corporation, consisting of a computer (iHawk), an operating system and drivers (RedHawk), and RT tools (NightStar). Strong vendor support coupled with an extensive RT toolset influenced this decision. The new real-time Linux platform, I/O, and software engineering will foster enhanced capability and performance for NSTX-U plasma control.« less
Shibahara, Motoi; Ohnishi, Yasuo; Honda, Eisaburo; Matsuda, Dean K; Uchida, Soshi
2017-07-01
This report describes a case of nonunion of an anterior inferior iliac spine (AIIS) apophyseal avulsion fracture with resultant subspine impingement combined with symptomatic femoroacetabular impingement (FAI). A 16-year-old male soccer player presented with a 6-month history of right groin pain exacerbated by kicking and running. The patient was diagnosed with a displaced nonunion of the AIIS apophysis avulsion fracture causing secondary extra-articular impingement beyond cam-type FAI by physical examination and radiological findings. The authors performed arthroscopic AIIS decompression, with concurrent FAI correction and labral repair and capsular closure. At 4 months after surgery, a radiograph and a computed tomography scan showed complete bony union of the AIIS apophyseal nonunion. Modified Harris Hip Sore and Nonarthritic Hip Score improved from 74.8 and 61, respectively, to 100 for both at final follow-up. The effectiveness of arthroscopic decompression of the AIIS as part of a comprehensive minimally invasive surgery including FAI correction and labral repair resulted in complete union of the AIIS and pain-free return to sport and bony union. [Orthopedics. 2017; 40(4):e725-e728.]. Copyright 2017, SLACK Incorporated.
Quantitative Analyses of the Modes of Deformation in Engineering Thermoplastics
NASA Astrophysics Data System (ADS)
Landes, B. G.; Bubeck, R. A.; Scott, R. L.; Heaney, M. D.
1998-03-01
Synchrotron-based real-time small-angle X-ray scattering (RTSAXS) studies have been performed on rubber-toughened engineering thermoplastics with amorphous and semi-crystalline matrices. Scattering patterns are measured at successive time intervals of 3 ms were analyzed to determine the plastic strain due to crazing. Simultaneous measurements of the absorption of the primary beam by the sample permits the total plastic strain to be concurrently computed. The plastic strain due to other deformation mechanisms (e.g., particle cavitation and macroscopic shear yield can be determined from the difference between the total and craze-derived plastic strains. The contribution from macroscopic shear deformation can be determined from video-based optical data measured simultaneously with the X-ray data. These types of time-resolved experiments result in the generation of prodigious quantities of data, the analysis of which can considerably delay the determination of key results. A newly developed software package that runs in WINDOWSa 95 permits the rapid analysis of the relative contributions of the deformation modes from these time-resolved experiments. Examples of using these techniques on ABS-type and QUESTRAa syndiotactic polystyrene type engineering resins will be given.
Magnetohydrodynamics with GAMER
NASA Astrophysics Data System (ADS)
Zhang, Ui-Han; Schive, Hsi-Yu; Chiueh, Tzihong
2018-06-01
GAMER, a parallel Graphic-processing-unit-accelerated Adaptive-MEsh-Refinement (AMR) hydrodynamic code, has been extended to support magnetohydrodynamics (MHD) with both the corner-transport-upwind and MUSCL-Hancock schemes and the constraint transport technique. The divergent preserving operator for AMR has been applied to reinforce the divergence-free constraint on the magnetic field. GAMER-MHD has fully exploited the concurrent executions between the graphic process unit (GPU) MHD solver and other central processing unit computation pertinent to AMR. We perform various standard tests to demonstrate that GAMER-MHD is both second-order accurate and robust, producing results as accurate as those given by high-resolution uniform-grid runs. We also explore a new 3D MHD test, where the magnetic field assumes the Arnold–Beltrami–Childress configuration, temporarily becomes turbulent with current sheets, and finally settles to a lowest-energy equilibrium state. This 3D problem is adopted for the performance test of GAMER-MHD. The single-GPU performance reaches 1.2 × 108 and 5.5 × 107 cell updates per second for the single- and double-precision calculations, respectively, on Tesla P100. We also demonstrate a parallel efficiency of ∼70% for both weak and strong scaling using 1024 XK nodes on the Blue Waters supercomputers.
Discrete event performance prediction of speculatively parallel temperature-accelerated dynamics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zamora, Richard James; Voter, Arthur F.; Perez, Danny
Due to its unrivaled ability to predict the dynamical evolution of interacting atoms, molecular dynamics (MD) is a widely used computational method in theoretical chemistry, physics, biology, and engineering. Despite its success, MD is only capable of modeling time scales within several orders of magnitude of thermal vibrations, leaving out many important phenomena that occur at slower rates. The Temperature Accelerated Dynamics (TAD) method overcomes this limitation by thermally accelerating the state-to-state evolution captured by MD. Due to the algorithmically complex nature of the serial TAD procedure, implementations have yet to improve performance by parallelizing the concurrent exploration of multiplemore » states. Here we utilize a discrete event-based application simulator to introduce and explore a new Speculatively Parallel TAD (SpecTAD) method. We investigate the SpecTAD algorithm, without a full-scale implementation, by constructing an application simulator proxy (SpecTADSim). Finally, following this method, we discover that a nontrivial relationship exists between the optimal SpecTAD parameter set and the number of CPU cores available at run-time. Furthermore, we find that a majority of the available SpecTAD boost can be achieved within an existing TAD application using relatively simple algorithm modifications.« less
Discrete event performance prediction of speculatively parallel temperature-accelerated dynamics
Zamora, Richard James; Voter, Arthur F.; Perez, Danny; ...
2016-12-01
Due to its unrivaled ability to predict the dynamical evolution of interacting atoms, molecular dynamics (MD) is a widely used computational method in theoretical chemistry, physics, biology, and engineering. Despite its success, MD is only capable of modeling time scales within several orders of magnitude of thermal vibrations, leaving out many important phenomena that occur at slower rates. The Temperature Accelerated Dynamics (TAD) method overcomes this limitation by thermally accelerating the state-to-state evolution captured by MD. Due to the algorithmically complex nature of the serial TAD procedure, implementations have yet to improve performance by parallelizing the concurrent exploration of multiplemore » states. Here we utilize a discrete event-based application simulator to introduce and explore a new Speculatively Parallel TAD (SpecTAD) method. We investigate the SpecTAD algorithm, without a full-scale implementation, by constructing an application simulator proxy (SpecTADSim). Finally, following this method, we discover that a nontrivial relationship exists between the optimal SpecTAD parameter set and the number of CPU cores available at run-time. Furthermore, we find that a majority of the available SpecTAD boost can be achieved within an existing TAD application using relatively simple algorithm modifications.« less
CERN openlab: Engaging industry for innovation in the LHC Run 3-4 R&D programme
NASA Astrophysics Data System (ADS)
Girone, M.; Purcell, A.; Di Meglio, A.; Rademakers, F.; Gunne, K.; Pachou, M.; Pavlou, S.
2017-10-01
LHC Run3 and Run4 represent an unprecedented challenge for HEP computing in terms of both data volume and complexity. New approaches are needed for how data is collected and filtered, processed, moved, stored and analysed if these challenges are to be met with a realistic budget. To develop innovative techniques we are fostering relationships with industry leaders. CERN openlab is a unique resource for public-private partnership between CERN and leading Information Communication and Technology (ICT) companies. Its mission is to accelerate the development of cutting-edge solutions to be used by the worldwide HEP community. In 2015, CERN openlab started its phase V with a strong focus on tackling the upcoming LHC challenges. Several R&D programs are ongoing in the areas of data acquisition, networks and connectivity, data storage architectures, computing provisioning, computing platforms and code optimisation and data analytics. This paper gives an overview of the various innovative technologies that are currently being explored by CERN openlab V and discusses the long-term strategies that are pursued by the LHC communities with the help of industry in closing the technological gap in processing and storage needs expected in Run3 and Run4.
NASA Technical Reports Server (NTRS)
Yang, Guowei; Pasareanu, Corina S.; Khurshid, Sarfraz
2012-01-01
This paper introduces memoized symbolic execution (Memoise), a novel approach for more efficient application of forward symbolic execution, which is a well-studied technique for systematic exploration of program behaviors based on bounded execution paths. Our key insight is that application of symbolic execution often requires several successive runs of the technique on largely similar underlying problems, e.g., running it once to check a program to find a bug, fixing the bug, and running it again to check the modified program. Memoise introduces a trie-based data structure that stores the key elements of a run of symbolic execution. Maintenance of the trie during successive runs allows re-use of previously computed results of symbolic execution without the need for re-computing them as is traditionally done. Experiments using our prototype embodiment of Memoise show the benefits it holds in various standard scenarios of using symbolic execution, e.g., with iterative deepening of exploration depth, to perform regression analysis, or to enhance coverage.
Benefits Analysis of Multi-Center Dynamic Weather Routes
NASA Technical Reports Server (NTRS)
Sheth, Kapil; McNally, David; Morando, Alexander; Clymer, Alexis; Lock, Jennifer; Petersen, Julien
2014-01-01
Dynamic weather routes are flight plan corrections that can provide airborne flights more than user-specified minutes of flying-time savings, compared to their current flight plan. These routes are computed from the aircraft's current location to a flight plan fix downstream (within a predefined limit region), while avoiding forecasted convective weather regions. The Dynamic Weather Routes automation has been continuously running with live air traffic data for a field evaluation at the American Airlines Integrated Operations Center in Fort Worth, TX since July 31, 2012, where flights within the Fort Worth Air Route Traffic Control Center are evaluated for time savings. This paper extends the methodology to all Centers in United States and presents benefits analysis of Dynamic Weather Routes automation, if it was implemented in multiple airspace Centers individually and concurrently. The current computation of dynamic weather routes requires a limit rectangle so that a downstream capture fix can be selected, preventing very large route changes spanning several Centers. In this paper, first, a method of computing a limit polygon (as opposed to a rectangle used for Fort Worth Center) is described for each of the 20 Centers in the National Airspace System. The Future ATM Concepts Evaluation Tool, a nationwide simulation and analysis tool, is used for this purpose. After a comparison of results with the Center-based Dynamic Weather Routes automation in Fort Worth Center, results are presented for 11 Centers in the contiguous United States. These Centers are generally most impacted by convective weather. A breakdown of individual Center and airline savings is presented and the results indicate an overall average savings of about 10 minutes of flying time are obtained per flight.
Simple, efficient allocation of modelling runs on heterogeneous clusters with MPI
Donato, David I.
2017-01-01
In scientific modelling and computation, the choice of an appropriate method for allocating tasks for parallel processing depends on the computational setting and on the nature of the computation. The allocation of independent but similar computational tasks, such as modelling runs or Monte Carlo trials, among the nodes of a heterogeneous computational cluster is a special case that has not been specifically evaluated previously. A simulation study shows that a method of on-demand (that is, worker-initiated) pulling from a bag of tasks in this case leads to reliably short makespans for computational jobs despite heterogeneity both within and between cluster nodes. A simple reference implementation in the C programming language with the Message Passing Interface (MPI) is provided.
Evaluating the Efficacy of the Cloud for Cluster Computation
NASA Technical Reports Server (NTRS)
Knight, David; Shams, Khawaja; Chang, George; Soderstrom, Tom
2012-01-01
Computing requirements vary by industry, and it follows that NASA and other research organizations have computing demands that fall outside the mainstream. While cloud computing made rapid inroads for tasks such as powering web applications, performance issues on highly distributed tasks hindered early adoption for scientific computation. One venture to address this problem is Nebula, NASA's homegrown cloud project tasked with delivering science-quality cloud computing resources. However, another industry development is Amazon's high-performance computing (HPC) instances on Elastic Cloud Compute (EC2) that promises improved performance for cluster computation. This paper presents results from a series of benchmarks run on Amazon EC2 and discusses the efficacy of current commercial cloud technology for running scientific applications across a cluster. In particular, a 240-core cluster of cloud instances achieved 2 TFLOPS on High-Performance Linpack (HPL) at 70% of theoretical computational performance. The cluster's local network also demonstrated sub-100 ?s inter-process latency with sustained inter-node throughput in excess of 8 Gbps. Beyond HPL, a real-world Hadoop image processing task from NASA's Lunar Mapping and Modeling Project (LMMP) was run on a 29 instance cluster to process lunar and Martian surface images with sizes on the order of tens of gigapixels. These results demonstrate that while not a rival of dedicated supercomputing clusters, commercial cloud technology is now a feasible option for moderately demanding scientific workloads.
Controlling Laboratory Processes From A Personal Computer
NASA Technical Reports Server (NTRS)
Will, H.; Mackin, M. A.
1991-01-01
Computer program provides natural-language process control from IBM PC or compatible computer. Sets up process-control system that either runs without operator or run by workers who have limited programming skills. Includes three smaller programs. Two of them, written in FORTRAN 77, record data and control research processes. Third program, written in Pascal, generates FORTRAN subroutines used by other two programs to identify user commands with device-driving routines written by user. Also includes set of input data allowing user to define user commands to be executed by computer. Requires personal computer operating under MS-DOS with suitable hardware interfaces to all controlled devices. Also requires FORTRAN 77 compiler and device drivers written by user.
Concurrent ultrasonic weld evaluation system
Hood, Donald W.; Johnson, John A.; Smartt, Herschel B.
1987-01-01
A system for concurrent, non-destructive evaluation of partially completed welds for use in conjunction with an automated welder. The system utilizes real time, automated ultrasonic inspection of a welding operation as the welds are being made by providing a transducer which follows a short distance behind the welding head. Reflected ultrasonic signals are analyzed utilizing computer based digital pattern recognition techniques to discriminate between good and flawed welds on a pass by pass basis. The system also distinguishes between types of weld flaws.
Constant-Round Concurrent Zero Knowledge From Falsifiable Assumptions
2013-01-01
assumptions (e.g., [DS98, Dam00, CGGM00, Gol02, PTV12, GJO+12]), or in alternative models (e.g., super -polynomial-time simulation [Pas03b, PV10]). In the...T (·)-time computations, where T (·) is some “nice” (slightly) super -polynomial function (e.g., T (n) = nlog log logn). We refer to such proof...put a cap on both using a (slightly) super -polynomial function, and thus to guarantee soundness of the concurrent zero-knowledge protocol, we need
Concurrent ultrasonic weld evaluation system
Hood, D.W.; Johnson, J.A.; Smartt, H.B.
1985-09-04
A system for concurrent, non-destructive evaluation of partially completed welds for use in conjunction with an automated welder. The system utilizes real time, automated ultrasonic inspection of a welding operation as the welds are being made by providing a transducer which follows a short distance behind the welding head. Reflected ultrasonic signals are analyzed utilizing computer based digital pattern recognition techniques to discriminate between good and flawed welds on a pass by pass basis. The system also distinguishes between types of weld flaws.
Concurrent ultrasonic weld evaluation system
Hood, D.W.; Johnson, J.A.; Smartt, H.B.
1987-12-15
A system for concurrent, non-destructive evaluation of partially completed welds for use in conjunction with an automated welder is disclosed. The system utilizes real time, automated ultrasonic inspection of a welding operation as the welds are being made by providing a transducer which follows a short distance behind the welding head. Reflected ultrasonic signals are analyzed utilizing computer based digital pattern recognition techniques to discriminate between good and flawed welds on a pass by pass basis. The system also distinguishes between types of weld flaws. 5 figs.
1983-10-01
Concurrency Control Algorithms Computer Corporation of America Wente K. Lin, Philip A. Bernstein, Nathan Goodman and Jerry Nolte APPROVED FOR PUBLIC ...84 03 IZ 004 ’KV This report has been reviewed by the RADC Public Affairs Office (PA) an is releasable to the National Technical Information Service...NTIS). At NTIS it will be releasable to the general public , including foreign na~ions. RADC-TR-83-226, Vol II (of three) has been reviewed and is
Optimization of Car Body under Constraints of Noise, Vibration, and Harshness (NVH), and Crash
NASA Technical Reports Server (NTRS)
Kodiyalam, Srinivas; Yang, Ren-Jye; Sobieszczanski-Sobieski, Jaroslaw (Editor)
2000-01-01
To be competitive on the today's market, cars have to be as light as possible while meeting the Noise, Vibration, and Harshness (NVH) requirements and conforming to Government-man dated crash survival regulations. The latter are difficult to meet because they involve very compute-intensive, nonlinear analysis, e.g., the code RADIOSS capable of simulation of the dynamics, and the geometrical and material nonlinearities of a thin-walled car structure in crash, would require over 12 days of elapsed time for a single design of a 390K elastic degrees of freedom model, if executed on a single processor of the state-of-the-art SGI Origin2000 computer. Of course, in optimization that crash analysis would have to be invoked many times. Needless to say, that has rendered such optimization intractable until now. The car finite element model is shown. The advent of computers that comprise large numbers of concurrently operating processors has created a new environment wherein the above optimization, and other engineering problems heretofore regarded as intractable may be solved. The procedure, shown, is a piecewise approximation based method and involves using a sensitivity based Taylor series approximation model for NVH and a polynomial response surface model for Crash. In that method the NVH constraints are evaluated using a finite element code (MSC/NASTRAN) that yields the constraint values and their derivatives with respect to design variables. The crash constraints are evaluated using the explicit code RADIOSS on the Origin 2000 operating on 256 processors simultaneously to generate data for a polynomial response surface in the design variable domain. The NVH constraints and their derivatives combined with the response surface for the crash constraints form an approximation to the system analysis (surrogate analysis) that enables a cycle of multidisciplinary optimization within move limits. In the inner loop, the NVH sensitivities are recomputed to update the NVH approximation model while keeping the Crash response surface constant. In every outer loop, the Crash response surface approximation is updated, including a gradual increase in the order of the response surface and the response surface extension in the direction of the search. In this optimization task, the NVH discipline has 30 design variables while the crash discipline has 20 design variables. A subset of these design variables (10) are common to both the NVH and crash disciplines. In order to construct a linear response surface for the Crash discipline constraints, a minimum of 21 design points would have to be analyzed using the RADIOSS code. On a single processor in Origin 2000 that amount of computing would require over 9 months! In this work, these runs were carried out concurrently on the Origin 2000 using multiple processors, ranging from 8 to 16, for each crash (RADIOSS) analysis. Another figure shows the wall time required for a single RADIOSS analysis using varying number of processors, as well as provides a comparison of 2 different common data placement procedures within the allotted memories for each analysis. The initial design is an infeasible design with NVH discipline Static Torsion constraint violations of over 10%. The final optimized design is a feasible design with a weight reduction of 15 kg compared to the initial design. This work demonstrates how advanced methodology for optimization combined with the technology of concurrent processing enables applications that until now were out of reach because of very long time-to-solution.
WinHPC System Programming | High-Performance Computing | NREL
Programming WinHPC System Programming Learn how to build and run an MPI (message passing interface (mpi.h) and library (msmpi.lib) are. To build from the command line, run... Start > Intel Software Development Tools > Intel C++ Compiler Professional... > C++ Build Environment for applications running
Computer-based testing of the modified essay question: the Singapore experience.
Lim, Erle Chuen-Hian; Seet, Raymond Chee-Seong; Oh, Vernon M S; Chia, Boon-Lock; Aw, Marion; Quak, Seng-Hock; Ong, Benjamin K C
2007-11-01
The modified essay question (MEQ), featuring an evolving case scenario, tests a candidate's problem-solving and reasoning ability, rather than mere factual recall. Although it is traditionally conducted as a pen-and-paper examination, our university has run the MEQ using computer-based testing (CBT) since 2003. We describe our experience with running the MEQ examination using the IVLE, or integrated virtual learning environment (https://ivle.nus.edu.sg), provide a blueprint for universities intending to conduct computer-based testing of the MEQ, and detail how our MEQ examination has evolved since its inception. An MEQ committee, comprising specialists in key disciplines from the departments of Medicine and Paediatrics, was formed. We utilized the IVLE, developed for our university in 1998, as the online platform on which we ran the MEQ. We calculated the number of man-hours (academic and support staff) required to run the MEQ examination, using either a computer-based or pen-and-paper format. With the support of our university's information technology (IT) specialists, we have successfully run the MEQ examination online, twice a year, since 2003. Initially, we conducted the examination with short-answer questions only, but have since expanded the MEQ examination to include multiple-choice and extended matching questions. A total of 1268 man-hours was spent in preparing for, and running, the MEQ examination using CBT, compared to 236.5 man-hours to run it using a pen-and-paper format. Despite being more labour-intensive, our students and staff prefer CBT to the pen-and-paper format. The MEQ can be conducted using a computer-based testing scenario, which offers several advantages over a pen-and-paper format. We hope to increase the number of questions and incorporate audio and video files, featuring clinical vignettes, to the MEQ examination in the near future.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kopp, H.J.; Mortensen, G.A.
1978-04-01
Approximately 60% of the full CDC 6600/7600 Datatran 2.0 capability was made operational on IBM 360/370 equipment. Sufficient capability was made operational to demonstrate adequate performance for modular program linking applications. Also demonstrated were the basic capabilities and performance required to support moderate-sized data base applications and moderately active scratch input/output applications. Approximately one to two calendar years are required to develop DATATRAN 2.0 capabilities fully for the entire spectrum of applications proposed. Included in the next stage of conversion should be syntax checking and syntax conversion features that would foster greater FORTRAN compatibility between IBM and CDC developed modules.more » The batch portion of the JOSHUA Modular System, which was developed by Savannah River Laboratory to run on an IBM computer, was examined for the feasibility of conversion to run on a Control Data Corporation (CDC) computer. Portions of the JOSHUA Precompiler were changed so as to be operable on the CDC computer. The Data Manager and Batch Monitor were also examined for conversion feasibility, but no changes were made in them. It appears to be feasible to convert the batch portion of the JOSHUA Modular System to run on a CDC computer with an estimated additional two to three man-years of effort. 9 tables.« less
Adaptive subdomain modeling: A multi-analysis technique for ocean circulation models
NASA Astrophysics Data System (ADS)
Altuntas, Alper; Baugh, John
2017-07-01
Many coastal and ocean processes of interest operate over large temporal and geographical scales and require a substantial amount of computational resources, particularly when engineering design and failure scenarios are also considered. This study presents an adaptive multi-analysis technique that improves the efficiency of these computations when multiple alternatives are being simulated. The technique, called adaptive subdomain modeling, concurrently analyzes any number of child domains, with each instance corresponding to a unique design or failure scenario, in addition to a full-scale parent domain providing the boundary conditions for its children. To contain the altered hydrodynamics originating from the modifications, the spatial extent of each child domain is adaptively adjusted during runtime depending on the response of the model. The technique is incorporated in ADCIRC++, a re-implementation of the popular ADCIRC ocean circulation model with an updated software architecture designed to facilitate this adaptive behavior and to utilize concurrent executions of multiple domains. The results of our case studies confirm that the method substantially reduces computational effort while maintaining accuracy.
Identifying the impact of G-quadruplexes on Affymetrix 3' arrays using cloud computing.
Memon, Farhat N; Owen, Anne M; Sanchez-Graillet, Olivia; Upton, Graham J G; Harrison, Andrew P
2010-01-15
A tetramer quadruplex structure is formed by four parallel strands of DNA/ RNA containing runs of guanine. These quadruplexes are able to form because guanine can Hoogsteen hydrogen bond to other guanines, and a tetrad of guanines can form a stable arrangement. Recently we have discovered that probes on Affymetrix GeneChips that contain runs of guanine do not measure gene expression reliably. We associate this finding with the likelihood that quadruplexes are forming on the surface of GeneChips. In order to cope with the rapidly expanding size of GeneChip array datasets in the public domain, we are exploring the use of cloud computing to replicate our experiments on 3' arrays to look at the effect of the location of G-spots (runs of guanines). Cloud computing is a recently introduced high-performance solution that takes advantage of the computational infrastructure of large organisations such as Amazon and Google. We expect that cloud computing will become widely adopted because it enables bioinformaticians to avoid capital expenditure on expensive computing resources and to only pay a cloud computing provider for what is used. Moreover, as well as financial efficiency, cloud computing is an ecologically-friendly technology, it enables efficient data-sharing and we expect it to be faster for development purposes. Here we propose the advantageous use of cloud computing to perform a large data-mining analysis of public domain 3' arrays.
Katz, Jonathan E
2017-01-01
Laboratories tend to be amenable environments for long-term reliable operation of scientific measurement equipment. Indeed, it is not uncommon to find equipment 5, 10, or even 20+ years old still being routinely used in labs. Unfortunately, the Achilles heel for many of these devices is the control/data acquisition computer. Often these computers run older operating systems (e.g., Windows XP) and, while they might only use standard network, USB or serial ports, they require proprietary software to be installed. Even if the original installation disks can be found, it is a burdensome process to reinstall and is fraught with "gotchas" that can derail the process-lost license keys, incompatible hardware, forgotten configuration settings, etc. If you have running legacy instrumentation, the computer is the ticking time bomb waiting to put a halt to your operation.In this chapter, I describe how to virtualize your currently running control computer. This virtualized computer "image" is easy to maintain, easy to back up and easy to redeploy. I have used this multiple times in my own lab to greatly improve the robustness of my legacy devices.After completing the steps in this chapter, you will have your original control computer as well as a virtual instance of that computer with all the software installed ready to control your hardware should your original computer ever be decommissioned.
NASA's computer science research program
NASA Technical Reports Server (NTRS)
Larsen, R. L.
1983-01-01
Following a major assessment of NASA's computing technology needs, a new program of computer science research has been initiated by the Agency. The program includes work in concurrent processing, management of large scale scientific databases, software engineering, reliable computing, and artificial intelligence. The program is driven by applications requirements in computational fluid dynamics, image processing, sensor data management, real-time mission control and autonomous systems. It consists of university research, in-house NASA research, and NASA's Research Institute for Advanced Computer Science (RIACS) and Institute for Computer Applications in Science and Engineering (ICASE). The overall goal is to provide the technical foundation within NASA to exploit advancing computing technology in aerospace applications.
CMS event processing multi-core efficiency status
NASA Astrophysics Data System (ADS)
Jones, C. D.; CMS Collaboration
2017-10-01
In 2015, CMS was the first LHC experiment to begin using a multi-threaded framework for doing event processing. This new framework utilizes Intel’s Thread Building Block library to manage concurrency via a task based processing model. During the 2015 LHC run period, CMS only ran reconstruction jobs using multiple threads because only those jobs were sufficiently thread efficient. Recent work now allows simulation and digitization to be thread efficient. In addition, during 2015 the multi-threaded framework could run events in parallel but could only use one thread per event. Work done in 2016 now allows multiple threads to be used while processing one event. In this presentation we will show how these recent changes have improved CMS’s overall threading and memory efficiency and we will discuss work to be done to further increase those efficiencies.
Resource Efficient Hardware Architecture for Fast Computation of Running Max/Min Filters
Torres-Huitzil, Cesar
2013-01-01
Running max/min filters on rectangular kernels are widely used in many digital signal and image processing applications. Filtering with a k × k kernel requires of k 2 − 1 comparisons per sample for a direct implementation; thus, performance scales expensively with the kernel size k. Faster computations can be achieved by kernel decomposition and using constant time one-dimensional algorithms on custom hardware. This paper presents a hardware architecture for real-time computation of running max/min filters based on the van Herk/Gil-Werman (HGW) algorithm. The proposed architecture design uses less computation and memory resources than previously reported architectures when targeted to Field Programmable Gate Array (FPGA) devices. Implementation results show that the architecture is able to compute max/min filters, on 1024 × 1024 images with up to 255 × 255 kernels, in around 8.4 milliseconds, 120 frames per second, at a clock frequency of 250 MHz. The implementation is highly scalable for the kernel size with good performance/area tradeoff suitable for embedded applications. The applicability of the architecture is shown for local adaptive image thresholding. PMID:24288456
NASA Astrophysics Data System (ADS)
Beyer, F.; Zierott, L.; Fallenberg, E. M.; Juergens, K.; Stoeckel, J.; Heindel, W.; Wormanns, D.
2006-03-01
Purpose: To compare sensitivity and reading time when using CAD as second reader resp. concurrent reader. Materials and Methods: Fifty chest MDCT scans due to clinical indication were analysed independently by four radiologists two times: First with CAD as concurrent reader (display of CAD results simultaneously to the primary reading by the radiologist); then after a median of 14 weeks with CAD as second reader (CAD results were shown after completion of a reading session without CAD). A prototype version of Siemens LungCAD (Siemens,Malvern,USA) was used. Sensitivities and reading times for detecting nodules >=4mm of concurrent reading, reading without CAD and second reading were recorded. In a consensus conference false positive findings were eliminated. Student's T-Test was used to compare sensitivities and reading times. Results: 108 true positive nodules were found. Mean sensitivity was .68 for reading without CAD, .68 for concurrent reading and .75 for second reading. Differences of sensitivities were significant between concurrent and second reading (p<.001) resp. reading without CAD and second reading (p=.001). Mean reading time for concurrent reading was significant shorter (274s) compared to reading without CAD (294s;p=.04) and second reading (337sp<.001). New work to be presented: To our knowledge this is the first study that compares sensitivities and reading times between use of CAD as concurrent resp. second reader. Conclusion: CAD can either be used to speed up reading of chest CT cases for pulmonary nodules without loss of sensitivity as concurrent reader -OR (and not AND) to increase sensitivity and reading time as second reader.
Energy Frontier Research With ATLAS: Final Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Butler, John; Black, Kevin; Ahlen, Steve
2016-06-14
The Boston University (BU) group is playing key roles across the ATLAS experiment: in detector operations, the online trigger, the upgrade, computing, and physics analysis. Our team has been critical to the maintenance and operations of the muon system since its installation. During Run 1 we led the muon trigger group and that responsibility continues into Run 2. BU maintains and operates the ATLAS Northeast Tier 2 computing center. We are actively engaged in the analysis of ATLAS data from Run 1 and Run 2. Physics analyses we have contributed to include Standard Model measurements (W and Z cross sections,more » t\\bar{t} differential cross sections, WWW^* production), evidence for the Higgs decaying to \\tau^+\\tau^-, and searches for new phenomena (technicolor, Z' and W', vector-like quarks, dark matter).« less
Automatic Data Filter Customization Using a Genetic Algorithm
NASA Technical Reports Server (NTRS)
Mandrake, Lukas
2013-01-01
This work predicts whether a retrieval algorithm will usefully determine CO2 concentration from an input spectrum of GOSAT (Greenhouse Gases Observing Satellite). This was done to eliminate needless runtime on atmospheric soundings that would never yield useful results. A space of 50 dimensions was examined for predictive power on the final CO2 results. Retrieval algorithms are frequently expensive to run, and wasted effort defeats requirements and expends needless resources. This algorithm could be used to help predict and filter unneeded runs in any computationally expensive regime. Traditional methods such as the Fischer discriminant analysis and decision trees can attempt to predict whether a sounding will be properly processed. However, this work sought to detect a subsection of the dimensional space that can be simply filtered out to eliminate unwanted runs. LDAs (linear discriminant analyses) and other systems examine the entire data and judge a "best fit," giving equal weight to complex and problematic regions as well as simple, clear-cut regions. In this implementation, a genetic space of "left" and "right" thresholds outside of which all data are rejected was defined. These left/right pairs are created for each of the 50 input dimensions. A genetic algorithm then runs through countless potential filter settings using a JPL computer cluster, optimizing the tossed-out data s yield (proper vs. improper run removal) and number of points tossed. This solution is robust to an arbitrary decision boundary within the data and avoids the global optimization problem of whole-dataset fitting using LDA or decision trees. It filters out runs that would not have produced useful CO2 values to save needless computation. This would be an algorithmic preprocessing improvement to any computationally expensive system.
Open-source meteor detection software for low-cost single-board computers
NASA Astrophysics Data System (ADS)
Vida, D.; Zubović, D.; Šegon, D.; Gural, P.; Cupec, R.
2016-01-01
This work aims to overcome the current price threshold of meteor stations which can sometimes deter meteor enthusiasts from owning one. In recent years small card-sized computers became widely available and are used for numerous applications. To utilize such computers for meteor work, software which can run on them is needed. In this paper we present a detailed description of newly-developed open-source software for fireball and meteor detection optimized for running on low-cost single board computers. Furthermore, an update on the development of automated open-source software which will handle video capture, fireball and meteor detection, astrometry and photometry is given.
How to Build an AppleSeed: A Parallel Macintosh Cluster for Numerically Intensive Computing
NASA Astrophysics Data System (ADS)
Decyk, V. K.; Dauger, D. E.
We have constructed a parallel cluster consisting of a mixture of Apple Macintosh G3 and G4 computers running the Mac OS, and have achieved very good performance on numerically intensive, parallel plasma particle-incell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. This enables us to move parallel computing from the realm of experts to the main stream of computing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Venkata, Manjunath Gorentla; Aderholdt, William F
The pre-exascale systems are expected to have a significant amount of hierarchical and heterogeneous on-node memory, and this trend of system architecture in extreme-scale systems is expected to continue into the exascale era. along with hierarchical-heterogeneous memory, the system typically has a high-performing network ad a compute accelerator. This system architecture is not only effective for running traditional High Performance Computing (HPC) applications (Big-Compute), but also for running data-intensive HPC applications and Big-Data applications. As a consequence, there is a growing desire to have a single system serve the needs of both Big-Compute and Big-Data applications. Though the system architecturemore » supports the convergence of the Big-Compute and Big-Data, the programming models and software layer have yet to evolve to support either hierarchical-heterogeneous memory systems or the convergence. A programming abstraction to address this problem. The programming abstraction is implemented as a software library and runs on pre-exascale and exascale systems supporting current and emerging system architecture. Using distributed data-structures as a central concept, it provides (1) a simple, usable, and portable abstraction for hierarchical-heterogeneous memory and (2) a unified programming abstraction for Big-Compute and Big-Data applications.« less
User's guide to the Octopus computer network (the SHOC manual)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schneider, C.; Thompson, D.; Whitten, G.
1977-07-18
This guide explains how to enter, run, and debug programs on the Octopus network. It briefly describes the network's operation, and directs the reader to other documents for further information. It stresses those service programs that will be most useful in the long run; ''quick'' methods that have little flexibility are not discussed. The Octopus timesharing network gives the user access to four CDC 7600 computers, two CDC STAR computers, and a broad array of peripheral equipment, from any of 800 or so remote terminals. 16 figures, 7 tables.
User's guide to the Octopus computer network (the SHOC manual)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schneider, C.; Thompson, D.; Whitten, G.
1976-10-07
This guide explains how to enter, run, and debug programs on the Octopus network. It briefly describes the network's operation, and directs the reader to other documents for further information. It stresses those service programs that will be most useful in the long run; ''quick'' methods that have little flexibility are not discussed. The Octopus timesharing network gives the user access to four CDC 7600 computers, two CDC STAR computers, and a broad array of peripheral equipment, from any of 800 or so remote terminals. 8 figures, 4 tables.
User's guide to the Octopus computer network (the SHOC manual)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schneider, C.; Thompson, D.; Whitten, G.
1975-06-02
This guide explains how to enter, run, and debug programs on the Octopus network. It briefly describes the network's operation, and directs the reader to other documents for further information. It stresses those service programs that will be most useful in the long run; ''quick'' methods that have little flexibility are not discussed. The Octopus timesharing network gives the user access to four CDC 7600 computers and a broad array of peripheral equipment, from any of 800 remote terminals. Octopus will soon include the Laboratory's STAR-100 computers. 9 figures, 5 tables. (auth)
ERIC Educational Resources Information Center
Perilli, Viviana; Lancioni, Giulio E.; Singh, Nirbhay N.; O'Reilly, Mark F.; Sigafoos, Jeff; Cassano, Germana; Cordiano, Noemi; Pinto, Katia; Minervini, Mauro G.; Oliva, Doretta
2012-01-01
This study assessed whether four patients with a diagnosis of Alzheimer's disease could make independent phone calls via a computer-aided telephone system. The study was carried out according to a non-concurrent multiple baseline design across participants. All participants started with baseline during which the telephone system was not available,…
Concurrent EEG And NIRS Tomographic Imaging Based on Wearable Electro-Optodes
2014-04-13
Interfaces ( BCIs ), and other systems in the same computational framework. Figure 11 below shows...Improving Brain-‐Computer Interfaces Using Independent Component Analysis, In: Towards Future BCIs , 2012
Computer Program Re-layers Engineering Drawings
NASA Technical Reports Server (NTRS)
Crosby, Dewey C., III
1990-01-01
RULCHK computer program aids in structuring layers of information pertaining to part or assembly designed with software described in article "Software for Drawing Design Details Concurrently" (MFS-28444). Checks and optionally updates structure of layers for part. Enables designer to construct model and annotate its documentation without burden of manually layering part to conform to standards at design time.
ERIC Educational Resources Information Center
Cazzell, Samantha; Skinner, Christopher H.; Ciancio, Dennis; Aspiranti, Kathleen; Watson, Tiffany; Taylor, Kala; McCurdy, Merilee; Skinner, Amy
2017-01-01
A concurrent multiple-baseline across-tasks design was used to evaluate the effectiveness of a computer flash-card sight-word recognition intervention with elementary-school students with intellectual disability. This intervention allowed the participants to self-determine each response interval and resulted in both participants acquiring…
Low Proficiency Learners in Synchronous Computer-Assisted and Face-to-Face Interactions
ERIC Educational Resources Information Center
Tam, Shu Sim; Kan, Ngat Har; Ng, Lee Luan
2010-01-01
This experimental study offers empirical evidence of the effect of the computer-mediated environment on the linguistic output of low proficiency learners. The subjects were 32 female undergraduates with high and low proficiency in ESL. A within-subject repeated measures concurrent nested QUAN-qual (Creswell, 2003) mixed methods approach was used.…
Massively parallel quantum computer simulator
NASA Astrophysics Data System (ADS)
De Raedt, K.; Michielsen, K.; De Raedt, H.; Trieu, B.; Arnold, G.; Richter, M.; Lippert, Th.; Watanabe, H.; Ito, N.
2007-01-01
We describe portable software to simulate universal quantum computers on massive parallel computers. We illustrate the use of the simulation software by running various quantum algorithms on different computer architectures, such as a IBM BlueGene/L, a IBM Regatta p690+, a Hitachi SR11000/J1, a Cray X1E, a SGI Altix 3700 and clusters of PCs running Windows XP. We study the performance of the software by simulating quantum computers containing up to 36 qubits, using up to 4096 processors and up to 1 TB of memory. Our results demonstrate that the simulator exhibits nearly ideal scaling as a function of the number of processors and suggest that the simulation software described in this paper may also serve as benchmark for testing high-end parallel computers.
Computer Sciences and Data Systems, volume 1
NASA Technical Reports Server (NTRS)
1987-01-01
Topics addressed include: software engineering; university grants; institutes; concurrent processing; sparse distributed memory; distributed operating systems; intelligent data management processes; expert system for image analysis; fault tolerant software; and architecture research.
NASA Technical Reports Server (NTRS)
Moss, Thomas; Nurge, Mark; Perusich, Stephen
2011-01-01
The In-Situ Resource Utilization (ISRU) Regolith & Environmental Science and Oxygen & Lunar Volatiles Extraction (RESOLVE) software provides operation of the physical plant from a remote location with a high-level interface that can access and control the data from external software applications of other subsystems. This software allows autonomous control over the entire system with manual computer control of individual system/process components. It gives non-programmer operators the capability to easily modify the high-level autonomous sequencing while the software is in operation, as well as the ability to modify the low-level, file-based sequences prior to the system operation. Local automated control in a distributed system is also enabled where component control is maintained during the loss of network connectivity with the remote workstation. This innovation also minimizes network traffic. The software architecture commands and controls the latest generation of RESOLVE processes used to obtain, process, and quantify lunar regolith. The system is grouped into six sub-processes: Drill, Crush, Reactor, Lunar Water Resource Demonstration (LWRD), Regolith Volatiles Characterization (RVC) (see example), and Regolith Oxygen Extraction (ROE). Some processes are independent, some are dependent on other processes, and some are independent but run concurrently with other processes. The first goal is to analyze the volatiles emanating from lunar regolith, such as water, carbon monoxide, carbon dioxide, ammonia, hydrogen, and others. This is done by heating the soil and analyzing and capturing the volatilized product. The second goal is to produce water by reducing the soil at high temperatures with hydrogen. This is done by raising the reactor temperature in the range of 800 to 900 C, causing the reaction to progress by adding hydrogen, and then capturing the water product in a desiccant bed. The software needs to run the entire unit and all sub-processes; however, throughout testing, many variables and parameters need to be changed as more is learned about the system operation. The Master Events Controller (MEC) is run on a standard laptop PC using Windows XP. This PC runs in parallel to another laptop that monitors the GC, and a third PC that monitors the drilling/ crushing operation. These three PCs interface to the process through a CompactRIO, OPC Servers, and modems.
JAX Colony Management System (JCMS): an extensible colony and phenotype data management system.
Donnelly, Chuck J; McFarland, Mike; Ames, Abigail; Sundberg, Beth; Springer, Dave; Blauth, Peter; Bult, Carol J
2010-04-01
The Jackson Laboratory Colony Management System (JCMS) is a software application for managing data and information related to research mouse colonies, associated biospecimens, and experimental protocols. JCMS runs directly on computers that run one of the PC Windows operating systems, but can be accessed via web browser interfaces from any computer running a Windows, Macintosh, or Linux operating system. JCMS can be configured for a single user or multiple users in small- to medium-size work groups. The target audience for JCMS includes laboratory technicians, animal colony managers, and principal investigators. The application provides operational support for colony management and experimental workflows, sample and data tracking through transaction-based data entry forms, and date-driven work reports. Flexible query forms allow researchers to retrieve database records based on user-defined criteria. Recent advances in handheld computers with integrated barcode readers, middleware technologies, web browsers, and wireless networks add to the utility of JCMS by allowing real-time access to the database from any networked computer.
The NEST Dry-Run Mode: Efficient Dynamic Analysis of Neuronal Network Simulation Code.
Kunkel, Susanne; Schenck, Wolfram
2017-01-01
NEST is a simulator for spiking neuronal networks that commits to a general purpose approach: It allows for high flexibility in the design of network models, and its applications range from small-scale simulations on laptops to brain-scale simulations on supercomputers. Hence, developers need to test their code for various use cases and ensure that changes to code do not impair scalability. However, running a full set of benchmarks on a supercomputer takes up precious compute-time resources and can entail long queuing times. Here, we present the NEST dry-run mode, which enables comprehensive dynamic code analysis without requiring access to high-performance computing facilities. A dry-run simulation is carried out by a single process, which performs all simulation steps except communication as if it was part of a parallel environment with many processes. We show that measurements of memory usage and runtime of neuronal network simulations closely match the corresponding dry-run data. Furthermore, we demonstrate the successful application of the dry-run mode in the areas of profiling and performance modeling.
The NEST Dry-Run Mode: Efficient Dynamic Analysis of Neuronal Network Simulation Code
Kunkel, Susanne; Schenck, Wolfram
2017-01-01
NEST is a simulator for spiking neuronal networks that commits to a general purpose approach: It allows for high flexibility in the design of network models, and its applications range from small-scale simulations on laptops to brain-scale simulations on supercomputers. Hence, developers need to test their code for various use cases and ensure that changes to code do not impair scalability. However, running a full set of benchmarks on a supercomputer takes up precious compute-time resources and can entail long queuing times. Here, we present the NEST dry-run mode, which enables comprehensive dynamic code analysis without requiring access to high-performance computing facilities. A dry-run simulation is carried out by a single process, which performs all simulation steps except communication as if it was part of a parallel environment with many processes. We show that measurements of memory usage and runtime of neuronal network simulations closely match the corresponding dry-run data. Furthermore, we demonstrate the successful application of the dry-run mode in the areas of profiling and performance modeling. PMID:28701946
ATLAS@Home: Harnessing Volunteer Computing for HEP
NASA Astrophysics Data System (ADS)
Adam-Bourdarios, C.; Cameron, D.; Filipčič, A.; Lancon, E.; Wu, W.; ATLAS Collaboration
2015-12-01
A recent common theme among HEP computing is exploitation of opportunistic resources in order to provide the maximum statistics possible for Monte Carlo simulation. Volunteer computing has been used over the last few years in many other scientific fields and by CERN itself to run simulations of the LHC beams. The ATLAS@Home project was started to allow volunteers to run simulations of collisions in the ATLAS detector. So far many thousands of members of the public have signed up to contribute their spare CPU cycles for ATLAS, and there is potential for volunteer computing to provide a significant fraction of ATLAS computing resources. Here we describe the design of the project, the lessons learned so far and the future plans.
Alleman, Coleman N.; Foulk, James W.; Mota, Alejandro; ...
2017-11-06
The heterogeneity in mechanical fields introduced by microstructure plays a critical role in the localization of deformation. In order to resolve this incipient stage of failure, it is therefore necessary to incorporate microstructure with sufficient resolution. On the other hand, computational limitations make it infeasible to represent the microstructure in the entire domain at the component scale. Here, the authors demonstrate the use of concurrent multiscale modeling to incorporate explicit, finely resolved microstructure in a critical region while resolving the smoother mechanical fields outside this region with a coarser discretization to limit computational cost. The microstructural physics is modeled withmore » a high-fidelity model that incorporates anisotropic crystal elasticity and rate-dependent crystal plasticity to simulate the behavior of a stainless steel alloy. The component-scale material behavior is treated with a lower fidelity model incorporating isotropic linear elasticity and rate-independent J 2 plasticity. The microstructural and component scale subdomains are modeled concurrently, with coupling via the Schwarz alternating method, which solves boundary-value problems in each subdomain separately and transfers solution information between subdomains via Dirichlet boundary conditions. In this study, the framework is applied to model incipient localization in tensile specimens during necking.« less
A Response Surface Methodology for Bi-Level Integrated System Synthesis (BLISS)
NASA Technical Reports Server (NTRS)
Altus, Troy David; Sobieski, Jaroslaw (Technical Monitor)
2002-01-01
The report describes a new method for optimization of engineering systems such as aerospace vehicles whose design must harmonize a number of subsystems and various physical phenomena, each represented by a separate computer code, e.g., aerodynamics, structures, propulsion, performance, etc. To represent the system internal couplings, the codes receive output from other codes as part of their inputs. The system analysis and optimization task is decomposed into subtasks that can be executed concurrently, each subtask conducted using local state and design variables and holding constant a set of the system-level design variables. The subtasks results are stored in form of the Response Surfaces (RS) fitted in the space of the system-level variables to be used as the subtask surrogates in a system-level optimization whose purpose is to optimize the system objective(s) and to reconcile the system internal couplings. By virtue of decomposition and execution concurrency, the method enables a broad workfront in organization of an engineering project involving a number of specialty groups that might be geographically dispersed, and it exploits the contemporary computing technology of massively concurrent and distributed processing. The report includes a demonstration test case of supersonic business jet design.
NASA Astrophysics Data System (ADS)
Alleman, Coleman N.; Foulk, James W.; Mota, Alejandro; Lim, Hojun; Littlewood, David J.
2018-02-01
The heterogeneity in mechanical fields introduced by microstructure plays a critical role in the localization of deformation. To resolve this incipient stage of failure, it is therefore necessary to incorporate microstructure with sufficient resolution. On the other hand, computational limitations make it infeasible to represent the microstructure in the entire domain at the component scale. In this study, the authors demonstrate the use of concurrent multiscale modeling to incorporate explicit, finely resolved microstructure in a critical region while resolving the smoother mechanical fields outside this region with a coarser discretization to limit computational cost. The microstructural physics is modeled with a high-fidelity model that incorporates anisotropic crystal elasticity and rate-dependent crystal plasticity to simulate the behavior of a stainless steel alloy. The component-scale material behavior is treated with a lower fidelity model incorporating isotropic linear elasticity and rate-independent J2 plasticity. The microstructural and component scale subdomains are modeled concurrently, with coupling via the Schwarz alternating method, which solves boundary-value problems in each subdomain separately and transfers solution information between subdomains via Dirichlet boundary conditions. In this study, the framework is applied to model incipient localization in tensile specimens during necking.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Alleman, Coleman N.; Foulk, James W.; Mota, Alejandro
The heterogeneity in mechanical fields introduced by microstructure plays a critical role in the localization of deformation. In order to resolve this incipient stage of failure, it is therefore necessary to incorporate microstructure with sufficient resolution. On the other hand, computational limitations make it infeasible to represent the microstructure in the entire domain at the component scale. Here, the authors demonstrate the use of concurrent multiscale modeling to incorporate explicit, finely resolved microstructure in a critical region while resolving the smoother mechanical fields outside this region with a coarser discretization to limit computational cost. The microstructural physics is modeled withmore » a high-fidelity model that incorporates anisotropic crystal elasticity and rate-dependent crystal plasticity to simulate the behavior of a stainless steel alloy. The component-scale material behavior is treated with a lower fidelity model incorporating isotropic linear elasticity and rate-independent J 2 plasticity. The microstructural and component scale subdomains are modeled concurrently, with coupling via the Schwarz alternating method, which solves boundary-value problems in each subdomain separately and transfers solution information between subdomains via Dirichlet boundary conditions. In this study, the framework is applied to model incipient localization in tensile specimens during necking.« less
Improving overlay control through proper use of multilevel query APC
NASA Astrophysics Data System (ADS)
Conway, Timothy H.; Carlson, Alan; Crow, David A.
2003-06-01
Many state-of-the-art fabs are operating with increasingly diversified product mixes. For example, at Cypress Semiconductor, it is not unusual to be concurrently running multiple technologies and many devices within each technology. This diverse product mix significantly increases the difficulty of manually controlling overlay process corrections. As a result, automated run-to-run feedforward-feedback control has become a necessary and vital component of manufacturing. However, traditional run-to-run controllers rely on highly correlated historical events to forecast process corrections. For example, the historical process events typically are constrained to match the current event for exposure tool, device, process level and reticle ID. This narrowly defined process stream can result in insufficient data when applied to lowvolume or new-release devices. The run-to-run controller implemented at Cypress utilizes a multi-level query (Level-N) correlation algorithm, where each subsequent level widens the search criteria for available historical data. The paper discusses how best to widen the search criteria and how to determine and apply a known bias to account for tool-to-tool and device-to-device differences. Specific applications include offloading lots from one tool to another when the first tool is down for preventive maintenance, utilizing related devices to determine a default feedback vector for new-release devices, and applying bias values to account for known reticle-to-reticle differences. In this study, we will show how historical data can be leveraged from related devices or tools to overcome the limitations of narrow process streams. In particular, this paper discusses how effectively handling narrow process streams allows Cypress to offload lots from a baseline tool to an alternate tool.
EPAs Virtual Embryo: Modeling Developmental Toxicity
Embryogenesis is regulated by concurrent activities of signaling pathways organized into networks that control spatial patterning, molecular clocks, morphogenetic rearrangements and cell differentiation. Quantitative mathematical and computational models are needed to better unde...
Understanding the Performance and Potential of Cloud Computing for Scientific Applications
Sadooghi, Iman; Martin, Jesus Hernandez; Li, Tonglin; ...
2015-02-19
In this paper, commercial clouds bring a great opportunity to the scientific computing area. Scientific applications usually require significant resources, however not all scientists have access to sufficient high-end computing systems, may of which can be found in the Top500 list. Cloud Computing has gained the attention of scientists as a competitive resource to run HPC applications at a potentially lower cost. But as a different infrastructure, it is unclear whether clouds are capable of running scientific applications with a reasonable performance per money spent. This work studies the performance of public clouds and places this performance in context tomore » price. We evaluate the raw performance of different services of AWS cloud in terms of the basic resources, such as compute, memory, network and I/O. We also evaluate the performance of the scientific applications running in the cloud. This paper aims to assess the ability of the cloud to perform well, as well as to evaluate the cost of the cloud running scientific applications. We developed a full set of metrics and conducted a comprehensive performance evlauation over the Amazon cloud. We evaluated EC2, S3, EBS and DynamoDB among the many Amazon AWS services. We evaluated the memory sub-system performance with CacheBench, the network performance with iperf, processor and network performance with the HPL benchmark application, and shared storage with NFS and PVFS in addition to S3. We also evaluated a real scientific computing application through the Swift parallel scripting system at scale. Armed with both detailed benchmarks to gauge expected performance and a detailed monetary cost analysis, we expect this paper will be a recipe cookbook for scientists to help them decide where to deploy and run their scientific applications between public clouds, private clouds, or hybrid clouds.« less
Understanding the Performance and Potential of Cloud Computing for Scientific Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sadooghi, Iman; Martin, Jesus Hernandez; Li, Tonglin
In this paper, commercial clouds bring a great opportunity to the scientific computing area. Scientific applications usually require significant resources, however not all scientists have access to sufficient high-end computing systems, may of which can be found in the Top500 list. Cloud Computing has gained the attention of scientists as a competitive resource to run HPC applications at a potentially lower cost. But as a different infrastructure, it is unclear whether clouds are capable of running scientific applications with a reasonable performance per money spent. This work studies the performance of public clouds and places this performance in context tomore » price. We evaluate the raw performance of different services of AWS cloud in terms of the basic resources, such as compute, memory, network and I/O. We also evaluate the performance of the scientific applications running in the cloud. This paper aims to assess the ability of the cloud to perform well, as well as to evaluate the cost of the cloud running scientific applications. We developed a full set of metrics and conducted a comprehensive performance evlauation over the Amazon cloud. We evaluated EC2, S3, EBS and DynamoDB among the many Amazon AWS services. We evaluated the memory sub-system performance with CacheBench, the network performance with iperf, processor and network performance with the HPL benchmark application, and shared storage with NFS and PVFS in addition to S3. We also evaluated a real scientific computing application through the Swift parallel scripting system at scale. Armed with both detailed benchmarks to gauge expected performance and a detailed monetary cost analysis, we expect this paper will be a recipe cookbook for scientists to help them decide where to deploy and run their scientific applications between public clouds, private clouds, or hybrid clouds.« less
Multigrid methods with space–time concurrency
Falgout, R. D.; Friedhoff, S.; Kolev, Tz. V.; ...
2017-10-06
Here, we consider the comparison of multigrid methods for parabolic partial differential equations that allow space–time concurrency. With current trends in computer architectures leading towards systems with more, but not faster, processors, space–time concurrency is crucial for speeding up time-integration simulations. In contrast, traditional time-integration techniques impose serious limitations on parallel performance due to the sequential nature of the time-stepping approach, allowing spatial concurrency only. This paper considers the three basic options of multigrid algorithms on space–time grids that allow parallelism in space and time: coarsening in space and time, semicoarsening in the spatial dimensions, and semicoarsening in the temporalmore » dimension. We develop parallel software and performance models to study the three methods at scales of up to 16K cores and introduce an extension of one of them for handling multistep time integration. We then discuss advantages and disadvantages of the different approaches and their benefit compared to traditional space-parallel algorithms with sequential time stepping on modern architectures.« less
A concurrent distributed system for aircraft tactical decision generation
NASA Technical Reports Server (NTRS)
Mcmanus, John W.
1990-01-01
A research program investigating the use of AI techniques to aid in the development of a tactical decision generator (TDG) for within visual range (WVR) air combat engagements is discussed. The application of AI programming and problem-solving methods in the development and implementation of a concurrent version of the computerized logic for air-to-air warfare simulations (CLAWS) program, a second-generation TDG, is presented. Concurrent computing environments and programming approaches are discussed, and the design and performance of prototype concurrent TDG system (Cube CLAWS) are presented. It is concluded that the Cube CLAWS has provided a useful testbed to evaluate the development of a distributed blackboard system. The project has shown that the complexity of developing specialized software on a distributed, message-passing architecture such as the Hypercube is not overwhelming, and that reasonable speedups and processor efficiency can be achieved by a distributed blackboard system. The project has also highlighted some of the costs of using a distributed approach to designing a blackboard system.
Multigrid methods with space–time concurrency
DOE Office of Scientific and Technical Information (OSTI.GOV)
Falgout, R. D.; Friedhoff, S.; Kolev, Tz. V.
Here, we consider the comparison of multigrid methods for parabolic partial differential equations that allow space–time concurrency. With current trends in computer architectures leading towards systems with more, but not faster, processors, space–time concurrency is crucial for speeding up time-integration simulations. In contrast, traditional time-integration techniques impose serious limitations on parallel performance due to the sequential nature of the time-stepping approach, allowing spatial concurrency only. This paper considers the three basic options of multigrid algorithms on space–time grids that allow parallelism in space and time: coarsening in space and time, semicoarsening in the spatial dimensions, and semicoarsening in the temporalmore » dimension. We develop parallel software and performance models to study the three methods at scales of up to 16K cores and introduce an extension of one of them for handling multistep time integration. We then discuss advantages and disadvantages of the different approaches and their benefit compared to traditional space-parallel algorithms with sequential time stepping on modern architectures.« less
RSTensorFlow: GPU Enabled TensorFlow for Deep Learning on Commodity Android Devices
Alzantot, Moustafa; Wang, Yingnan; Ren, Zhengshuang; Srivastava, Mani B.
2018-01-01
Mobile devices have become an essential part of our daily lives. By virtue of both their increasing computing power and the recent progress made in AI, mobile devices evolved to act as intelligent assistants in many tasks rather than a mere way of making phone calls. However, popular and commonly used tools and frameworks for machine intelligence are still lacking the ability to make proper use of the available heterogeneous computing resources on mobile devices. In this paper, we study the benefits of utilizing the heterogeneous (CPU and GPU) computing resources available on commodity android devices while running deep learning models. We leveraged the heterogeneous computing framework RenderScript to accelerate the execution of deep learning models on commodity Android devices. Our system is implemented as an extension to the popular open-source framework TensorFlow. By integrating our acceleration framework tightly into TensorFlow, machine learning engineers can now easily make benefit of the heterogeneous computing resources on mobile devices without the need of any extra tools. We evaluate our system on different android phones models to study the trade-offs of running different neural network operations on the GPU. We also compare the performance of running different models architectures such as convolutional and recurrent neural networks on CPU only vs using heterogeneous computing resources. Our result shows that although GPUs on the phones are capable of offering substantial performance gain in matrix multiplication on mobile devices. Therefore, models that involve multiplication of large matrices can run much faster (approx. 3 times faster in our experiments) due to GPU support. PMID:29629431
Multidisciplinary Simulation Acceleration using Multiple Shared-Memory Graphical Processing Units
NASA Astrophysics Data System (ADS)
Kemal, Jonathan Yashar
For purposes of optimizing and analyzing turbomachinery and other designs, the unsteady Favre-averaged flow-field differential equations for an ideal compressible gas can be solved in conjunction with the heat conduction equation. We solve all equations using the finite-volume multiple-grid numerical technique, with the dual time-step scheme used for unsteady simulations. Our numerical solver code targets CUDA-capable Graphical Processing Units (GPUs) produced by NVIDIA. Making use of MPI, our solver can run across networked compute notes, where each MPI process can use either a GPU or a Central Processing Unit (CPU) core for primary solver calculations. We use NVIDIA Tesla C2050/C2070 GPUs based on the Fermi architecture, and compare our resulting performance against Intel Zeon X5690 CPUs. Solver routines converted to CUDA typically run about 10 times faster on a GPU for sufficiently dense computational grids. We used a conjugate cylinder computational grid and ran a turbulent steady flow simulation using 4 increasingly dense computational grids. Our densest computational grid is divided into 13 blocks each containing 1033x1033 grid points, for a total of 13.87 million grid points or 1.07 million grid points per domain block. To obtain overall speedups, we compare the execution time of the solver's iteration loop, including all resource intensive GPU-related memory copies. Comparing the performance of 8 GPUs to that of 8 CPUs, we obtain an overall speedup of about 6.0 when using our densest computational grid. This amounts to an 8-GPU simulation running about 39.5 times faster than running than a single-CPU simulation.
Computing shifts to monitor ATLAS distributed computing infrastructure and operations
NASA Astrophysics Data System (ADS)
Adam, C.; Barberis, D.; Crépé-Renaudin, S.; De, K.; Fassi, F.; Stradling, A.; Svatos, M.; Vartapetian, A.; Wolters, H.
2017-10-01
The ATLAS Distributed Computing (ADC) group established a new Computing Run Coordinator (CRC) shift at the start of LHC Run 2 in 2015. The main goal was to rely on a person with a good overview of the ADC activities to ease the ADC experts’ workload. The CRC shifter keeps track of ADC tasks related to their fields of expertise and responsibility. At the same time, the shifter maintains a global view of the day-to-day operations of the ADC system. During Run 1, this task was accomplished by a person of the expert team called the ADC Manager on Duty (AMOD), a position that was removed during the shutdown period due to the reduced number and availability of ADC experts foreseen for Run 2. The CRC position was proposed to cover some of the AMODs former functions, while allowing more people involved in computing to participate. In this way, CRC shifters help with the training of future ADC experts. The CRC shifters coordinate daily ADC shift operations, including tracking open issues, reporting, and representing ADC in relevant meetings. The CRC also facilitates communication between the ADC experts team and the other ADC shifters. These include the Distributed Analysis Support Team (DAST), which is the first point of contact for addressing all distributed analysis questions, and the ATLAS Distributed Computing Shifters (ADCoS), which check and report problems in central services, sites, Tier-0 export, data transfers and production tasks. Finally, the CRC looks at the level of ADC activities on a weekly or monthly timescale to ensure that ADC resources are used efficiently.
RSTensorFlow: GPU Enabled TensorFlow for Deep Learning on Commodity Android Devices.
Alzantot, Moustafa; Wang, Yingnan; Ren, Zhengshuang; Srivastava, Mani B
2017-06-01
Mobile devices have become an essential part of our daily lives. By virtue of both their increasing computing power and the recent progress made in AI, mobile devices evolved to act as intelligent assistants in many tasks rather than a mere way of making phone calls. However, popular and commonly used tools and frameworks for machine intelligence are still lacking the ability to make proper use of the available heterogeneous computing resources on mobile devices. In this paper, we study the benefits of utilizing the heterogeneous (CPU and GPU) computing resources available on commodity android devices while running deep learning models. We leveraged the heterogeneous computing framework RenderScript to accelerate the execution of deep learning models on commodity Android devices. Our system is implemented as an extension to the popular open-source framework TensorFlow. By integrating our acceleration framework tightly into TensorFlow, machine learning engineers can now easily make benefit of the heterogeneous computing resources on mobile devices without the need of any extra tools. We evaluate our system on different android phones models to study the trade-offs of running different neural network operations on the GPU. We also compare the performance of running different models architectures such as convolutional and recurrent neural networks on CPU only vs using heterogeneous computing resources. Our result shows that although GPUs on the phones are capable of offering substantial performance gain in matrix multiplication on mobile devices. Therefore, models that involve multiplication of large matrices can run much faster (approx. 3 times faster in our experiments) due to GPU support.
NASA Technical Reports Server (NTRS)
Chawner, David M.; Gomez, Ray J.
2010-01-01
In the Applied Aerosciences and CFD branch at Johnson Space Center, computational simulations are run that face many challenges. Two of which are the ability to customize software for specialized needs and the need to run simulations as fast as possible. There are many different tools that are used for running these simulations and each one has its own pros and cons. Once these simulations are run, there needs to be software capable of visualizing the results in an appealing manner. Some of this software is called open source, meaning that anyone can edit the source code to make modifications and distribute it to all other users in a future release. This is very useful, especially in this branch where many different tools are being used. File readers can be written to load any file format into a program, to ease the bridging from one tool to another. Programming such a reader requires knowledge of the file format that is being read as well as the equations necessary to obtain the derived values after loading. When running these CFD simulations, extremely large files are being loaded and having values being calculated. These simulations usually take a few hours to complete, even on the fastest machines. Graphics processing units (GPUs) are usually used to load the graphics for computers; however, in recent years, GPUs are being used for more generic applications because of the speed of these processors. Applications run on GPUs have been known to run up to forty times faster than they would on normal central processing units (CPUs). If these CFD programs are extended to run on GPUs, the amount of time they would require to complete would be much less. This would allow more simulations to be run in the same amount of time and possibly perform more complex computations.
Maintaining and Enhancing Diversity of Sampled Protein Conformations in Robotics-Inspired Methods.
Abella, Jayvee R; Moll, Mark; Kavraki, Lydia E
2018-01-01
The ability to efficiently sample structurally diverse protein conformations allows one to gain a high-level view of a protein's energy landscape. Algorithms from robot motion planning have been used for conformational sampling, and several of these algorithms promote diversity by keeping track of "coverage" in conformational space based on the local sampling density. However, large proteins present special challenges. In particular, larger systems require running many concurrent instances of these algorithms, but these algorithms can quickly become memory intensive because they typically keep previously sampled conformations in memory to maintain coverage estimates. In addition, robotics-inspired algorithms depend on defining useful perturbation strategies for exploring the conformational space, which is a difficult task for large proteins because such systems are typically more constrained and exhibit complex motions. In this article, we introduce two methodologies for maintaining and enhancing diversity in robotics-inspired conformational sampling. The first method addresses algorithms based on coverage estimates and leverages the use of a low-dimensional projection to define a global coverage grid that maintains coverage across concurrent runs of sampling. The second method is an automatic definition of a perturbation strategy through readily available flexibility information derived from B-factors, secondary structure, and rigidity analysis. Our results show a significant increase in the diversity of the conformations sampled for proteins consisting of up to 500 residues when applied to a specific robotics-inspired algorithm for conformational sampling. The methodologies presented in this article may be vital components for the scalability of robotics-inspired approaches.
Toward an in-situ analytics and diagnostics framework for earth system models
NASA Astrophysics Data System (ADS)
Anantharaj, Valentine; Wolf, Matthew; Rasch, Philip; Klasky, Scott; Williams, Dean; Jacob, Rob; Ma, Po-Lun; Kuo, Kwo-Sen
2017-04-01
The development roadmaps for many earth system models (ESM) aim for a globally cloud-resolving model targeting the pre-exascale and exascale systems of the future. The ESMs will also incorporate more complex physics, chemistry and biology - thereby vastly increasing the fidelity of the information content simulated by the model. We will then be faced with an unprecedented volume of simulation output that would need to be processed and analyzed concurrently in order to derive the valuable scientific results. We are already at this threshold with our current generation of ESMs at higher resolution simulations. Currently, the nominal I/O throughput in the Community Earth System Model (CESM) via Parallel IO (PIO) library is around 100 MB/s. If we look at the high frequency I/O requirements, it would require an additional 1 GB / simulated hour, translating to roughly 4 mins wallclock / simulated-day => 24.33 wallclock hours / simulated-model-year => 1,752,000 core-hours of charge per simulated-model-year on the Titan supercomputer at the Oak Ridge Leadership Computing Facility. There is also a pending need for 3X more volume of simulation output . Meanwhile, many ESMs use instrument simulators to run forward models to compare model simulations against satellite and ground-based instruments, such as radars and radiometers. The CFMIP Observation Simulator Package (COSP) is used in CESM as well as the Accelerated Climate Model for Energy (ACME), one of the ESMs specifically targeting current and emerging leadership-class computing platforms These simulators can be computationally expensive, accounting for as much as 30% of the computational cost. Hence the data are often written to output files that are then used for offline calculations. Again, the I/O bottleneck becomes a limitation. Detection and attribution studies also use large volume of data for pattern recognition and feature extraction to analyze weather and climate phenomenon such as tropical cyclones, atmospheric rivers, blizzards, etc. It is evident that ESMs need an in-situ framework to decouple the diagnostics and analytics from the prognostics and physics computations of the models so that the diagnostic computations could be performed concurrently without limiting model throughput. We are designing a science-driven online analytics framework for earth system models. Our approach is to adopt several data workflow technologies, such as the Adaptable IO System (ADIOS), being developed under the U.S. Exascale Computing Project (ECP) and integrate these to allow for extreme performance IO, in situ workflow integration, science-driven analytics and visualization all in a easy to use computational framework. This will allow science teams to write data 100-1000 times faster and seamlessly move from post processing the output for validation and verification purposes to performing these calculations in situ. We can easily and knowledgeably envision a near-term future where earth system models like ACME and CESM will have to address not only the challenges of the volume of data but also need to consider the velocity of the data. The earth system model of the future in the exascale era, as they incorporate more complex physics at higher resolutions, will be able to analyze more simulation content without having to compromise targeted model throughput.
Multiple grid problems on concurrent-processing computers
NASA Technical Reports Server (NTRS)
Eberhardt, D. S.; Baganoff, D.
1986-01-01
Three computer codes were studied which make use of concurrent processing computer architectures in computational fluid dynamics (CFD). The three parallel codes were tested on a two processor multiple-instruction/multiple-data (MIMD) facility at NASA Ames Research Center, and are suggested for efficient parallel computations. The first code is a well-known program which makes use of the Beam and Warming, implicit, approximate factored algorithm. This study demonstrates the parallelism found in a well-known scheme and it achieved speedups exceeding 1.9 on the two processor MIMD test facility. The second code studied made use of an embedded grid scheme which is used to solve problems having complex geometries. The particular application for this study considered an airfoil/flap geometry in an incompressible flow. The scheme eliminates some of the inherent difficulties found in adapting approximate factorization techniques onto MIMD machines and allows the use of chaotic relaxation and asynchronous iteration techniques. The third code studied is an application of overset grids to a supersonic blunt body problem. The code addresses the difficulties encountered when using embedded grids on a compressible, and therefore nonlinear, problem. The complex numerical boundary system associated with overset grids is discussed and several boundary schemes are suggested. A boundary scheme based on the method of characteristics achieved the best results.
Soule, Pat LeRoy
1978-01-01
Water-surface profiles of the 25-, 50-, and 100-year recurrence interval discharges have been computed for all streams and reaches of channels in Fairfax County, Virginia, having a drainage area greater than 1 square mile except for Dogue Creek, Little Hunting Creek, and that portion of Cameron Run above Lake Barcroft. Maps having a 2-foot contour interval and a horizontal scale of 1 inch equals 100 feet were used for base on which flood boundaries were delineated for 25-, 50-, and 100-year floods to be expected in each basin under ultimate development conditions. This report is one of a series and presents a discussion of techniques employed in computing discharges and profiles as well as the flood profiles and maps on which flood boundaries have been delineated for the Occoquan River and its tributaries within Fairfax County and those streams on Mason Neck within Fairfax County tributary to the Potomac River. (Woodard-USGS)
The implementation and use of Ada on distributed systems with high reliability requirements
NASA Technical Reports Server (NTRS)
Knight, J. C.
1986-01-01
The general inadequacy of Ada for programming systems that must survive processor loss was shown. A solution to the problem was proposed in which there are no syntatic changes to Ada. The approach was evaluated using a full-scale, realistic application. The application used was the Advanced Transport Operating System (ATOPS), an experimental computer control system developed for a modified Boeing 737 aircraft. The ATOPS system is a full authority, real-time avionics system providing a large variety of advanced features. Methods of building fault tolerance into concurrent systems were explored. A set of criteria by which the proposed method will be judged was examined. Extensive interaction with personnel from Computer Sciences Corporation and NASA Langley occurred to determine the requirements of the ATOPS software. Backward error recovery in concurrent systems was assessed.
Measuring quality indicators in the operating room: cleaning and turnover time.
Jericó, Marli de Carvalho; Perroca, Márcia Galan; da Penha, Vivian Colombo
2011-01-01
This exploratory-descriptive study was carried out in the Surgical Center Unit of a university hospital aiming to measure time spent with concurrent cleaning performed by the cleaning service and turnover time and also investigated potential associations between cleaning time and the surgery's magnitude and specialty, period of the day and the room's size. The sample consisted of 101 surgeries, computing cleaning time and 60 surgeries, computing turnover time. The Kaplan-Meier method was used to analyze time and Pearson's correlation to study potential correlations. The time spent in concurrent cleaning was 7.1 minutes and turnover time was 35.6 minutes. No association between cleaning time and the other variables was found. These findings can support nurses in the efficient use of resources thereby speeding up the work process in the operating room.
Locality Aware Concurrent Start for Stencil Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shrestha, Sunil; Gao, Guang R.; Manzano Franco, Joseph B.
Stencil computations are at the heart of many physical simulations used in scientific codes. Thus, there exists a plethora of optimization efforts for this family of computations. Among these techniques, tiling techniques that allow concurrent start have proven to be very efficient in providing better performance for these critical kernels. Nevertheless, with many core designs being the norm, these optimization techniques might not be able to fully exploit locality (both spatial and temporal) on multiple levels of the memory hierarchy without compromising parallelism. It is no longer true that the machine can be seen as a homogeneous collection of nodesmore » with caches, main memory and an interconnect network. New architectural designs exhibit complex grouping of nodes, cores, threads, caches and memory connected by an ever evolving network-on-chip design. These new designs may benefit greatly from carefully crafted schedules and groupings that encourage parallel actors (i.e. threads, cores or nodes) to be aware of the computational history of other actors in close proximity. In this paper, we provide an efficient tiling technique that allows hierarchical concurrent start for memory hierarchy aware tile groups. Each execution schedule and tile shape exploit the available parallelism, load balance and locality present in the given applications. We demonstrate our technique on the Intel Xeon Phi architecture with selected and representative stencil kernels. We show improvement ranging from 5.58% to 31.17% over existing state-of-the-art techniques.« less
ACON: a multipurpose production controller for plasma physics codes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Snell, C.
1983-01-01
ACON is a BCON controller designed to run large production codes on the CTSS Cray-1 or the LTSS 7600 computers. ACON can also be operated interactively, with input from the user's terminal. The controller can run one code or a sequence of up to ten codes during the same job. Options are available to get and save Mass storage files, to perform Historian file updating operations, to compile and load source files, and to send out print and film files. Special features include ability to retry after Mass failures, backup options for saving files, startup messages for the various codes,more » and ability to reserve specified amounts of computer time after successive code runs. ACON's flexibility and power make it useful for running a number of different production codes.« less
Evolution of the ATLAS Software Framework towards Concurrency
NASA Astrophysics Data System (ADS)
Jones, R. W. L.; Stewart, G. A.; Leggett, C.; Wynne, B. M.
2015-05-01
The ATLAS experiment has successfully used its Gaudi/Athena software framework for data taking and analysis during the first LHC run, with billions of events successfully processed. However, the design of Gaudi/Athena dates from early 2000 and the software and the physics code has been written using a single threaded, serial design. This programming model has increasing difficulty in exploiting the potential of current CPUs, which offer their best performance only through taking full advantage of multiple cores and wide vector registers. Future CPU evolution will intensify this trend, with core counts increasing and memory per core falling. Maximising performance per watt will be a key metric, so all of these cores must be used as efficiently as possible. In order to address the deficiencies of the current framework, ATLAS has embarked upon two projects: first, a practical demonstration of the use of multi-threading in our reconstruction software, using the GaudiHive framework; second, an exercise to gather requirements for an updated framework, going back to the first principles of how event processing occurs. In this paper we report on both these aspects of our work. For the hive based demonstrators, we discuss what changes were necessary in order to allow the serially designed ATLAS code to run, both to the framework and to the tools and algorithms used. We report on what general lessons were learned about the code patterns that had been employed in the software and which patterns were identified as particularly problematic for multi-threading. These lessons were fed into our considerations of a new framework and we present preliminary conclusions on this work. In particular we identify areas where the framework can be simplified in order to aid the implementation of a concurrent event processing scheme. Finally, we discuss the practical difficulties involved in migrating a large established code base to a multi-threaded framework and how this can be achieved for LHC Run 3.
ERIC Educational Resources Information Center
Perilli, Viviana; Lancioni, Giulio E.; Laporta, Dominga; Paparella, Adele; Caffo, Alessandro O.; Singh, Nirbhay N.; O'Reilly, Mark F.; Sigafoos, Jeff; Oliva, Doretta
2013-01-01
This study extended the assessment of a computer-aided telephone system to enable five patients with a diagnosis of Alzheimer's disease to make phone calls independently. The patients were divided into two groups and exposed to intervention according to a non-concurrent multiple baseline design across groups. All patients started with baseline in…
ERIC Educational Resources Information Center
National Science Foundation, Washington, DC.
This report addresses an opportunity to accelerate progress in virtually every branch of science and engineering concurrently, while also boosting the American economy as business firms also learn to exploit these new capabilities. The successful rapid advancement in both science and technology creates its own challenges, four of which are…
MIT Laboratory for Computer Science Progress Report 27
1990-06-01
because of the natural, yet unexploited, concurrence that characterizes contemporary and prospective applications from business to sensory computing...432. 14 Advanced Network Architecture Academic Staff D. Clark, Group Leader D. Tennenhouse J. Saltzer Research Staff J. Davin K. Sollins Graduate...Murray Hill, NJ, July 1989. 23 24 Clinical Decision Making Academic Staff R. Patil P. Szolovits, Group Leader G. Rennels Collaborating Investigators M
NASA Astrophysics Data System (ADS)
Daniluk, Andrzej
2011-06-01
A computational model is a computer program, which attempts to simulate an abstract model of a particular system. Computational models use enormous calculations and often require supercomputer speed. As personal computers are becoming more and more powerful, more laboratory experiments can be converted into computer models that can be interactively examined by scientists and students without the risk and cost of the actual experiments. The future of programming is concurrent programming. The threaded programming model provides application programmers with a useful abstraction of concurrent execution of multiple tasks. The objective of this release is to address the design of architecture for scientific application, which may execute as multiple threads execution, as well as implementations of the related shared data structures. New version program summaryProgram title: GrowthCP Catalogue identifier: ADVL_v4_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/ADVL_v4_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 32 269 No. of bytes in distributed program, including test data, etc.: 8 234 229 Distribution format: tar.gz Programming language: Free Object Pascal Computer: multi-core x64-based PC Operating system: Windows XP, Vista, 7 Has the code been vectorised or parallelized?: No RAM: More than 1 GB. The program requires a 32-bit or 64-bit processor to run the generated code. Memory is addressed using 32-bit (on 32-bit processors) or 64-bit (on 64-bit processors with 64-bit addressing) pointers. The amount of addressed memory is limited only by the available amount of virtual memory. Supplementary material: The figures mentioned in the "Summary of revisions" section can be obtained here. Classification: 4.3, 7.2, 6.2, 8, 14 External routines: Lazarus [1] Catalogue identifier of previous version: ADVL_v3_0 Journal reference of previous version: Comput. Phys. Comm. 181 (2010) 709 Does the new version supersede the previous version?: Yes Nature of problem: Reflection high-energy electron diffraction (RHEED) is an important in-situ analysis technique, which is capable of giving quantitative information about the growth process of thin layers and its control. It can be used to calibrate growth rate, analyze surface morphology, calibrate surface temperature, monitor the arrangement of the surface atoms, and provide information about growth kinetics. Such control allows the development of structures where the electrons can be confined in space, giving quantum wells or even quantum dots. In order to determine the atomic positions of atoms in the first few layers, the RHEED intensity must be measured as a function of the scattering angles and then compared with dynamic calculations. The objective of this release is to address the design of architecture for application that simulates the rocking curves RHEED intensities during hetero-epitaxial growth process of thin films. Solution method: The GrowthCP is a complex numerical model that uses multiple threads for simulation of epitaxial growth of thin layers. This model consists of two transactional parts. The first part is a mathematical model being based on the Runge-Kutta method with adaptive step-size control. The second part represents first-principles of the one-dimensional RHEED computational model. This model is based on solving a one-dimensional Schrödinger equation. Several problems can arise when applications contain a mixture of data access code, numerical code, and presentation code. Such applications are difficult to maintain, because interdependencies between all the components cause strong ripple effects whenever a change is made anywhere. Adding new data views often requires reimplementing a numerical code, which then requires maintenance in multiple places. In order to solve problems of this type, the computational and threading layers of the project have been implemented in the form of one design pattern as a part of Model-View-Controller architecture. Reasons for new version: Responding to the users' feedback the Growth09 project has been upgraded to a standard that allows the carrying out of sample computations of the RHEED intensities for a disordered surface for a wide range of single- and epitaxial hetero-structures. The design pattern on which the project is based has also been improved. It is shown that this model can be effectively used for multithreaded growth simulations of thin epitaxial layers and corresponding RHEED intensities for a wide range of single- and hetero-structures. Responding to the users' feedback the present release has been implemented using a well-documented free compiler [1] not requiring the special configuration and installation additional libraries. Summary of revisions: The logical structure of the Growth09 program has been modified according to the scheme showed in Fig. 1. The class diagram in Fig. 1 is a static view of the main platform-specific elements of the GrowthCP architecture. Fig. 2 provides a dynamic view by showing the creation and destruction simplistic sequence diagram for the process. The program requires the user to provide the appropriate parameters in the form of a knowledge base for the crystal structures under investigation. These parameters are loaded from the parameters. ini files at run-time. Instructions to prepare the .ini files can be found in the new distribution. The program enables carrying out different growth models and one-dimensional dynamical RHEED calculations for the fcc lattice with basis of three-atoms, fcc lattice with basis of two-atoms, fcc lattice with single atom basis, Zinc-Blende, Sodium Chloride, and Wurtzite crystalline structures and hetero-structures, but yet the Fourier component of the scattering potential in the TRHEEDCalculations.crystPotUgXXX() procedure can be modified and implemented according to users' specific application requirements. The Fourier component of the scattering potential of the whole crystalline hetero-structures can be determined as a sum of contributions coming from all thin slices of individual atomic layers. To carry out one-dimensional calculations of the scattering potentials, the program uses properly constructed self-consistent procedures. Each component of the system shown in Figs. 1 and 2 is fully extendable and can easily be adapted to new changeable requirements. Two essential logical elements of the system, i.e. TGrowthTransaction and TRHEEDCalculations classes, were designed and implemented in this way for them to pass the information to themselves without the need to use the data-exchange files given. In consequence each of them can be independently modified and/or extended. Implementing other types of differential equations and the different algorithm for solving them in the TGrowthTransaction class does not require another implementation of the TRHEEDCalculations class. Similarly, implementing other forms of scattering potential and different algorithm for RHEED calculation stays without the influence on the TGrowthTransaction class construction. Unusual features: The program is distributed in the form of main project GrowthCP.lpr, with associated files, and should be compiled using Lazarus IDE. The program should be compiled with English/USA regional and language options. Running time: The typical running time is machine and user-parameters dependent. References: http://sourceforge.net/projects/lazarus/files/.
Experimental Realization of High-Efficiency Counterfactual Computation.
Kong, Fei; Ju, Chenyong; Huang, Pu; Wang, Pengfei; Kong, Xi; Shi, Fazhan; Jiang, Liang; Du, Jiangfeng
2015-08-21
Counterfactual computation (CFC) exemplifies the fascinating quantum process by which the result of a computation may be learned without actually running the computer. In previous experimental studies, the counterfactual efficiency is limited to below 50%. Here we report an experimental realization of the generalized CFC protocol, in which the counterfactual efficiency can break the 50% limit and even approach unity in principle. The experiment is performed with the spins of a negatively charged nitrogen-vacancy color center in diamond. Taking advantage of the quantum Zeno effect, the computer can remain in the not-running subspace due to the frequent projection by the environment, while the computation result can be revealed by final detection. The counterfactual efficiency up to 85% has been demonstrated in our experiment, which opens the possibility of many exciting applications of CFC, such as high-efficiency quantum integration and imaging.
Experimental Realization of High-Efficiency Counterfactual Computation
NASA Astrophysics Data System (ADS)
Kong, Fei; Ju, Chenyong; Huang, Pu; Wang, Pengfei; Kong, Xi; Shi, Fazhan; Jiang, Liang; Du, Jiangfeng
2015-08-01
Counterfactual computation (CFC) exemplifies the fascinating quantum process by which the result of a computation may be learned without actually running the computer. In previous experimental studies, the counterfactual efficiency is limited to below 50%. Here we report an experimental realization of the generalized CFC protocol, in which the counterfactual efficiency can break the 50% limit and even approach unity in principle. The experiment is performed with the spins of a negatively charged nitrogen-vacancy color center in diamond. Taking advantage of the quantum Zeno effect, the computer can remain in the not-running subspace due to the frequent projection by the environment, while the computation result can be revealed by final detection. The counterfactual efficiency up to 85% has been demonstrated in our experiment, which opens the possibility of many exciting applications of CFC, such as high-efficiency quantum integration and imaging.
Evolution of a standard microprocessor-based space computer
NASA Technical Reports Server (NTRS)
Fernandez, M.
1980-01-01
An existing in inventory computer hardware/software package (B-1 RFS/ECM) was repackaged and applied to multiple missile/space programs. Concurrent with the application efforts, low risk modifications were made to the computer from program to program to take advantage of newer, advanced technology and to meet increasingly more demanding requirements (computational and memory capabilities, longer life, and fault tolerant autonomy). It is concluded that microprocessors hold promise in a number of critical areas for future space computer applications. However, the benefits of the DoD VHSIC Program are required and the old proliferation problem must be revised.
1983-11-15
Concurrent Algorithms", A. Cremers , Dortmund University, West Germany, and T. Hibbard, JPL, Pasadena, CA 64 "An Overview of Signal Representations in...n O f\\ n O P- A -> Problem-oriented specification of concurrent algorithms Armin B. Cremers and Thomas N. Hibbard Preliminary version September...1982 s* Armin B. Cremers Computer Science Department University of Dortmund P.O. Box 50 05 00 D-4600 Dortmund 50 Fed. Rep. Germany
Running of scalar spectral index in multi-field inflation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gong, Jinn-Ouk, E-mail: jinn-ouk.gong@apctp.org
We compute the running of the scalar spectral index in general multi-field slow-roll inflation. By incorporating explicit momentum dependence at the moment of horizon crossing, we can find the running straightforwardly. At the same time, we can distinguish the contributions from the quasi de Sitter background and the super-horizon evolution of the field fluctuations.
NASA Astrophysics Data System (ADS)
Hill, M. C.; Jakeman, J.; Razavi, S.; Tolson, B.
2015-12-01
For many environmental systems model runtimes have remained very long as more capable computers have been used to add more processes and more time and space discretization. Scientists have also added more parameters and kinds of observations, and many model runs are needed to explore the models. Computational demand equals run time multiplied by number of model runs divided by parallelization opportunities. Model exploration is conducted using sensitivity analysis, optimization, and uncertainty quantification. Sensitivity analysis is used to reveal consequences of what may be very complex simulated relations, optimization is used to identify parameter values that fit the data best, or at least better, and uncertainty quantification is used to evaluate the precision of simulated results. The long execution times make such analyses a challenge. Methods for addressing this challenges include computationally frugal analysis of the demanding original model and a number of ingenious surrogate modeling methods. Both commonly use about 50-100 runs of the demanding original model. In this talk we consider the tradeoffs between (1) original model development decisions, (2) computationally frugal analysis of the original model, and (3) using many model runs of the fast surrogate model. Some questions of interest are as follows. If the added processes and discretization invested in (1) are compared with the restrictions and approximations in model analysis produced by long model execution times, is there a net benefit related of the goals of the model? Are there changes to the numerical methods that could reduce the computational demands while giving up less fidelity than is compromised by using computationally frugal methods or surrogate models for model analysis? Both the computationally frugal methods and surrogate models require that the solution of interest be a smooth function of the parameters or interest. How does the information obtained from the local methods typical of (2) and the global averaged methods typical of (3) compare for typical systems? The discussion will use examples of response of the Greenland glacier to global warming and surface and groundwater modeling.
Program Processes Thermocouple Readings
NASA Technical Reports Server (NTRS)
Quave, Christine A.; Nail, William, III
1995-01-01
Digital Signal Processor for Thermocouples (DART) computer program implements precise and fast method of converting voltage to temperature for large-temperature-range thermocouple applications. Written using LabVIEW software. DART available only as object code for use on Macintosh II FX or higher-series computers running System 7.0 or later and IBM PC-series and compatible computers running Microsoft Windows 3.1. Macintosh version of DART (SSC-00032) requires LabVIEW 2.2.1 or 3.0 for execution. IBM PC version (SSC-00031) requires LabVIEW 3.0 for Windows 3.1. LabVIEW software product of National Instruments and not included with program.
NASA Technical Reports Server (NTRS)
Mcenulty, R. E.
1977-01-01
The G189A simulation of the Shuttle Orbiter ECLSS was upgraded. All simulation library versions and simulation models were converted from the EXEC2 to the EXEC8 computer system and a new program, G189PL, was added to the combination master program library. The program permits the post-plotting of up to 100 frames of plot data over any time interval of a G189 simulation run. The overlay structure of the G189A simulations were restructured for the purpose of conserving computer core requirements and minimizing run time requirements.
INHYD: Computer code for intraply hybrid composite design. A users manual
NASA Technical Reports Server (NTRS)
Chamis, C. C.; Sinclair, J. H.
1983-01-01
A computer program (INHYD) was developed for intraply hybrid composite design. A users manual for INHYD is presented. In INHYD embodies several composite micromechanics theories, intraply hybrid composite theories, and an integrated hygrothermomechanical theory. The INHYD can be run in both interactive and batch modes. It has considerable flexibility and capability, which the user can exercise through several options. These options are demonstrated through appropriate INHYD runs in the manual.
Topology Optimization for Reducing Additive Manufacturing Processing Distortions
2017-12-01
features that curl or warp under thermal load and are subsequently struck by the recoater blade /roller. Support structures act to wick heat away and...was run for 150 iterations. The material properties for all examples were Young’s modulus E = 1 GPa, Poisson’s ratio ν = 0.25, and thermal expansion...the element-birth model is significantly more computationally expensive for a full op- timization run . Consider, the computational complexity of a
MindModeling@Home . . . and Anywhere Else You Have Idle Processors
2009-12-01
was SETI @Home. It was established in 1999 for the purpose of demonstrating the utility of “distributed grid computing” by providing a mechanism for...the public imagination, and SETI @Home remains the longest running and one of the most popular volunteer computing projects in the world. This...pursuits. Most of them, including SETI @Home, run on a software architecture called the Berkeley Open Infrastructure for Network Computing (BOINC). Some of
NASA Astrophysics Data System (ADS)
Zhiying, Chen; Ping, Zhou
2017-11-01
Considering the robust optimization computational precision and efficiency for complex mechanical assembly relationship like turbine blade-tip radial running clearance, a hierarchically response surface robust optimization algorithm is proposed. The distribute collaborative response surface method is used to generate assembly system level approximation model of overall parameters and blade-tip clearance, and then a set samples of design parameters and objective response mean and/or standard deviation is generated by using system approximation model and design of experiment method. Finally, a new response surface approximation model is constructed by using those samples, and this approximation model is used for robust optimization process. The analyses results demonstrate the proposed method can dramatic reduce the computational cost and ensure the computational precision. The presented research offers an effective way for the robust optimization design of turbine blade-tip radial running clearance.
Implementing Parquet equations using HPX
NASA Astrophysics Data System (ADS)
Kellar, Samuel; Wagle, Bibek; Yang, Shuxiang; Tam, Ka-Ming; Kaiser, Hartmut; Moreno, Juana; Jarrell, Mark
A new C++ runtime system (HPX) enables simulations of complex systems to run more efficiently on parallel and heterogeneous systems. This increased efficiency allows for solutions to larger simulations of the parquet approximation for a system with impurities. The relevancy of the parquet equations depends upon the ability to solve systems which require long runs and large amounts of memory. These limitations, in addition to numerical complications arising from stability of the solutions, necessitate running on large distributed systems. As the computational resources trend towards the exascale and the limitations arising from computational resources vanish efficiency of large scale simulations becomes a focus. HPX facilitates efficient simulations through intelligent overlapping of computation and communication. Simulations such as the parquet equations which require the transfer of large amounts of data should benefit from HPX implementations. Supported by the the NSF EPSCoR Cooperative Agreement No. EPS-1003897 with additional support from the Louisiana Board of Regents.
DualSPHysics: A numerical tool to simulate real breakwaters
NASA Astrophysics Data System (ADS)
Zhang, Feng; Crespo, Alejandro; Altomare, Corrado; Domínguez, José; Marzeddu, Andrea; Shang, Shao-ping; Gómez-Gesteira, Moncho
2018-02-01
The open-source code DualSPHysics is used in this work to compute the wave run-up in an existing dike in the Chinese coast using realistic dimensions, bathymetry and wave conditions. The GPU computing power of the DualSPHysics allows simulating real-engineering problems that involve complex geometries with a high resolution in a reasonable computational time. The code is first validated by comparing the numerical free-surface elevation, the wave orbital velocities and the time series of the run-up with physical data in a wave flume. Those experiments include a smooth dike and an armored dike with two layers of cubic blocks. After validation, the code is applied to a real case to obtain the wave run-up under different incident wave conditions. In order to simulate the real open sea, the spurious reflections from the wavemaker are removed by using an active wave absorption technique.
Prediction of sound radiated from different practical jet engine inlets
NASA Technical Reports Server (NTRS)
Zinn, B. T.; Meyer, W. L.
1980-01-01
Existing computer codes for calculating the far field radiation patterns surrounding various practical jet engine inlet configurations under different excitation conditions were upgraded. The computer codes were refined and expanded so that they are now more efficient computationally by a factor of about three and they are now capable of producing accurate results up to nondimensional wave numbers of twenty. Computer programs were also developed to help generate accurate geometrical representations of the inlets to be investigated. This data is required as input for the computer programs which calculate the sound fields. This new geometry generating computer program considerably reduces the time required to generate the input data which was one of the most time consuming steps in the process. The results of sample runs using the NASA-Lewis QCSEE inlet are presented and comparison of run times and accuracy are made between the old and upgraded computer codes. The overall accuracy of the computations is determined by comparison of the results of the computations with simple source solutions.
NASA Astrophysics Data System (ADS)
Gerjuoy, Edward
2005-06-01
The security of messages encoded via the widely used RSA public key encryption system rests on the enormous computational effort required to find the prime factors of a large number N using classical (conventional) computers. In 1994 Peter Shor showed that for sufficiently large N, a quantum computer could perform the factoring with much less computational effort. This paper endeavors to explain, in a fashion comprehensible to the nonexpert, the RSA encryption protocol; the various quantum computer manipulations constituting the Shor algorithm; how the Shor algorithm performs the factoring; and the precise sense in which a quantum computer employing Shor's algorithm can be said to accomplish the factoring of very large numbers with less computational effort than a classical computer. It is made apparent that factoring N generally requires many successive runs of the algorithm. Our analysis reveals that the probability of achieving a successful factorization on a single run is about twice as large as commonly quoted in the literature.
Programming the social computer.
Robertson, David; Giunchiglia, Fausto
2013-03-28
The aim of 'programming the global computer' was identified by Milner and others as one of the grand challenges of computing research. At the time this phrase was coined, it was natural to assume that this objective might be achieved primarily through extending programming and specification languages. The Internet, however, has brought with it a different style of computation that (although harnessing variants of traditional programming languages) operates in a style different to those with which we are familiar. The 'computer' on which we are running these computations is a social computer in the sense that many of the elementary functions of the computations it runs are performed by humans, and successful execution of a program often depends on properties of the human society over which the program operates. These sorts of programs are not programmed in a traditional way and may have to be understood in a way that is different from the traditional view of programming. This shift in perspective raises new challenges for the science of the Web and for computing in general.
Exascale computing and what it means for shock physics
NASA Astrophysics Data System (ADS)
Germann, Timothy
2015-06-01
The U.S. Department of Energy is preparing to launch an Exascale Computing Initiative, to address the myriad challenges required to deploy and effectively utilize an exascale-class supercomputer (i.e., one capable of performing 1018 operations per second) in the 2023 timeframe. Since physical (power dissipation) requirements limit clock rates to at most a few GHz, this will necessitate the coordination of on the order of a billion concurrent operations, requiring sophisticated system and application software, and underlying mathematical algorithms, that may differ radically from traditional approaches. Even at the smaller workstation or cluster level of computation, the massive concurrency and heterogeneity within each processor will impact computational scientists. Through the multi-institutional, multi-disciplinary Exascale Co-design Center for Materials in Extreme Environments (ExMatEx), we have initiated an early and deep collaboration between domain (computational materials) scientists, applied mathematicians, computer scientists, and hardware architects, in order to establish the relationships between algorithms, software stacks, and architectures needed to enable exascale-ready materials science application codes within the next decade. In my talk, I will discuss these challenges, and what it will mean for exascale-era electronic structure, molecular dynamics, and engineering-scale simulations of shock-compressed condensed matter. In particular, we anticipate that the emerging hierarchical, heterogeneous architectures can be exploited to achieve higher physical fidelity simulations using adaptive physics refinement. This work is supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research.
Myers, E W; Mount, D W
1986-01-01
We describe a program which may be used to find approximate matches to a short predefined DNA sequence in a larger target DNA sequence. The program predicts the usefulness of specific DNA probes and sequencing primers and finds nearly identical sequences that might represent the same regulatory signal. The program is written in the C programming language and will run on virtually any computer system with a C compiler, such as the IBM/PC and other computers running under the MS/DOS and UNIX operating systems. The program has been integrated into an existing software package for the IBM personal computer (see article by Mount and Conrad, this volume). Some examples of its use are given. PMID:3753785
AGIS: Evolution of Distributed Computing information system for ATLAS
NASA Astrophysics Data System (ADS)
Anisenkov, A.; Di Girolamo, A.; Alandes, M.; Karavakis, E.
2015-12-01
ATLAS, a particle physics experiment at the Large Hadron Collider at CERN, produces petabytes of data annually through simulation production and tens of petabytes of data per year from the detector itself. The ATLAS computing model embraces the Grid paradigm and a high degree of decentralization of computing resources in order to meet the ATLAS requirements of petabytes scale data operations. It has been evolved after the first period of LHC data taking (Run-1) in order to cope with new challenges of the upcoming Run- 2. In this paper we describe the evolution and recent developments of the ATLAS Grid Information System (AGIS), developed in order to integrate configuration and status information about resources, services and topology of the computing infrastructure used by the ATLAS Distributed Computing applications and services.
Jaschob, Daniel; Riffle, Michael
2012-07-30
Laboratories engaged in computational biology or bioinformatics frequently need to run lengthy, multistep, and user-driven computational jobs. Each job can tie up a computer for a few minutes to several days, and many laboratories lack the expertise or resources to build and maintain a dedicated computer cluster. JobCenter is a client-server application and framework for job management and distributed job execution. The client and server components are both written in Java and are cross-platform and relatively easy to install. All communication with the server is client-driven, which allows worker nodes to run anywhere (even behind external firewalls or "in the cloud") and provides inherent load balancing. Adding a worker node to the worker pool is as simple as dropping the JobCenter client files onto any computer and performing basic configuration, which provides tremendous ease-of-use, flexibility, and limitless horizontal scalability. Each worker installation may be independently configured, including the types of jobs it is able to run. Executed jobs may be written in any language and may include multistep workflows. JobCenter is a versatile and scalable distributed job management system that allows laboratories to very efficiently distribute all computational work among available resources. JobCenter is freely available at http://code.google.com/p/jobcenter/.
CPU SIM: A Computer Simulator for Use in an Introductory Computer Organization-Architecture Class.
ERIC Educational Resources Information Center
Skrein, Dale
1994-01-01
CPU SIM, an interactive low-level computer simulation package that runs on the Macintosh computer, is described. The program is designed for instructional use in the first or second year of undergraduate computer science, to teach various features of typical computer organization through hands-on exercises. (MSE)
Flame-Vortex Studies to Quantify Markstein Numbers Needed to Model Flame Extinction Limits
NASA Technical Reports Server (NTRS)
Driscoll, James F.; Feikema, Douglas A.
2003-01-01
This has quantified a database of Markstein numbers for unsteady flames; future work will quantify a database of flame extinction limits for unsteady conditions. Unsteady extinction limits have not been documented previously; both a stretch rate and a residence time must be measured, since extinction requires that the stretch rate be sufficiently large for a sufficiently long residence time. Ma was measured for an inwardly-propagating flame (IPF) that is negatively-stretched under microgravity conditions. Computations also were performed using RUN-1DL to explain the measurements. The Markstein number of an inwardly-propagating flame, for both the microgravity experiment and the computations, is significantly larger than that of an outwardy-propagating flame. The computed profiles of the various species within the flame suggest reasons. Computed hydrogen concentrations build up ahead of the IPF but not the OPF. Understanding was gained by running the computations for both simplified and full-chemistry conditions. Numerical Simulations. To explain the experimental findings, numerical simulations of both inwardly and outwardly propagating spherical flames (with complex chemistry) were generated using the RUN-1DL code, which includes 16 species and 46 reactions.
Design and performance of the virtualization platform for offline computing on the ATLAS TDAQ Farm
NASA Astrophysics Data System (ADS)
Ballestrero, S.; Batraneanu, S. M.; Brasolin, F.; Contescu, C.; Di Girolamo, A.; Lee, C. J.; Pozo Astigarraga, M. E.; Scannicchio, D. A.; Twomey, M. S.; Zaytsev, A.
2014-06-01
With the LHC collider at CERN currently going through the period of Long Shutdown 1 there is an opportunity to use the computing resources of the experiments' large trigger farms for other data processing activities. In the case of the ATLAS experiment, the TDAQ farm, consisting of more than 1500 compute nodes, is suitable for running Monte Carlo (MC) production jobs that are mostly CPU and not I/O bound. This contribution gives a thorough review of the design and deployment of a virtualized platform running on this computing resource and of its use to run large groups of CernVM based virtual machines operating as a single CERN-P1 WLCG site. This platform has been designed to guarantee the security and the usability of the ATLAS private network, and to minimize interference with TDAQ's usage of the farm. Openstack has been chosen to provide a cloud management layer. The experience gained in the last 3.5 months shows that the use of the TDAQ farm for the MC simulation contributes to the ATLAS data processing at the level of a large Tier-1 WLCG site, despite the opportunistic nature of the underlying computing resources being used.
Deriving Tools from Real-Time Runs: A New CCMC Support for SEC and AFWA
NASA Technical Reports Server (NTRS)
Hesse, Michael; Rastatter, Lutz; MacNeice, Peter; Kuznetsova, Masha
2007-01-01
The Community Coordinated Modeling Center (CCMC) is a US inter-agency activity aiming at research in support of the generation of advanced space weather models. As one of its main functions, the CCMC provides to researchers the use of space science models, even if they are not model owners themselves. In particular, the CCMC provides to the research community the execution of "runs-on-request" for specific events of interest to space science researchers. Through this activity and the concurrent development of advanced visualization tools, CCMC provides, to the general science community, unprecedented access to a large number of state-of-the-art research models. CCMC houses models that cover the entire domain from the Sun to the Earth. In this presentation, we will provide an overview of CCMC modeling services that are available to support activities at the Space Environment Center, or at the Air Force Weather Agency.
Scalable load balancing for massively parallel distributed Monte Carlo particle transport
DOE Office of Scientific and Technical Information (OSTI.GOV)
O'Brien, M. J.; Brantley, P. S.; Joy, K. I.
2013-07-01
In order to run computer simulations efficiently on massively parallel computers with hundreds of thousands or millions of processors, care must be taken that the calculation is load balanced across the processors. Examining the workload of every processor leads to an unscalable algorithm, with run time at least as large as O(N), where N is the number of processors. We present a scalable load balancing algorithm, with run time 0(log(N)), that involves iterated processor-pair-wise balancing steps, ultimately leading to a globally balanced workload. We demonstrate scalability of the algorithm up to 2 million processors on the Sequoia supercomputer at Lawrencemore » Livermore National Laboratory. (authors)« less
Performance of a supercharged direct-injection stratified-charge rotary combustion engine
NASA Technical Reports Server (NTRS)
Bartrand, Timothy A.; Willis, Edward A.
1990-01-01
A zero-dimensional thermodynamic performance computer model for direct-injection stratified-charge rotary combustion engines was modified and run for a single rotor supercharged engine. Operating conditions for the computer runs were a single boost pressure and a matrix of speeds, loads and engine materials. A representative engine map is presented showing the predicted range of efficient operation. After discussion of the engine map, a number of engine features are analyzed individually. These features are: heat transfer and the influence insulating materials have on engine performance and exhaust energy; intake manifold pressure oscillations and interactions with the combustion chamber; and performance losses and seal friction. Finally, code running times and convergence data are presented.
Decrease in Ground-Run Distance of Small Airplanes by Applying Electrically-Driven Wheels
NASA Astrophysics Data System (ADS)
Kobayashi, Hiroshi; Nishizawa, Akira
A new takeoff method for small airplanes was proposed. Ground-roll performance of an airplane driven by electrically-powered wheels was experimentally and computationally studied. The experiments verified that the ground-run distance was decreased by half with a combination of the powered driven wheels and propeller without increase of energy consumption during the ground-roll. The computational analysis showed the ground-run distance of the wheel-driven aircraft was independent of the motor power when the motor capability exceeded the friction between tires and ground. Furthermore, the distance was minimized when the angle of attack was set to the value so that the wing generated negative lift.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Z.; Bessa, M. A.; Liu, W.K.
A predictive computational theory is shown for modeling complex, hierarchical materials ranging from metal alloys to polymer nanocomposites. The theory can capture complex mechanisms such as plasticity and failure that span across multiple length scales. This general multiscale material modeling theory relies on sound principles of mathematics and mechanics, and a cutting-edge reduced order modeling method named self-consistent clustering analysis (SCA) [Zeliang Liu, M.A. Bessa, Wing Kam Liu, “Self-consistent clustering analysis: An efficient multi-scale scheme for inelastic heterogeneous materials,” Comput. Methods Appl. Mech. Engrg. 306 (2016) 319–341]. SCA reduces by several orders of magnitude the computational cost of micromechanical andmore » concurrent multiscale simulations, while retaining the microstructure information. This remarkable increase in efficiency is achieved with a data-driven clustering method. Computationally expensive operations are performed in the so-called offline stage, where degrees of freedom (DOFs) are agglomerated into clusters. The interaction tensor of these clusters is computed. In the online or predictive stage, the Lippmann-Schwinger integral equation is solved cluster-wise using a self-consistent scheme to ensure solution accuracy and avoid path dependence. To construct a concurrent multiscale model, this scheme is applied at each material point in a macroscale structure, replacing a conventional constitutive model with the average response computed from the microscale model using just the SCA online stage. A regularized damage theory is incorporated in the microscale that avoids the mesh and RVE size dependence that commonly plagues microscale damage calculations. The SCA method is illustrated with two cases: a carbon fiber reinforced polymer (CFRP) structure with the concurrent multiscale model and an application to fatigue prediction for additively manufactured metals. For the CFRP problem, a speed up estimated to be about 43,000 is achieved by using the SCA method, as opposed to FE2, enabling the solution of an otherwise computationally intractable problem. The second example uses a crystal plasticity constitutive law and computes the fatigue potency of extrinsic microscale features such as voids. This shows that local stress and strain are capture sufficiently well by SCA. This model has been incorporated in a process-structure-properties prediction framework for process design in additive manufacturing.« less
NASADIG - NASA DEVICE INDEPENDENT GRAPHICS LIBRARY (AMDAHL VERSION)
NASA Technical Reports Server (NTRS)
Rogers, J. E.
1994-01-01
The NASA Device Independent Graphics Library, NASADIG, can be used with many computer-based engineering and management applications. The library gives the user the opportunity to translate data into effective graphic displays for presentation. The software offers many features which allow the user flexibility in creating graphics. These include two-dimensional plots, subplot projections in 3D-space, surface contour line plots, and surface contour color-shaded plots. Routines for three-dimensional plotting, wireframe surface plots, surface plots with hidden line removal, and surface contour line plots are provided. Other features include polar and spherical coordinate plotting, world map plotting utilizing either cylindrical equidistant or Lambert equal area projection, plot translation, plot rotation, plot blowup, splines and polynomial interpolation, area blanking control, multiple log/linear axes, legends and text control, curve thickness control, and multiple text fonts (18 regular, 4 bold). NASADIG contains several groups of subroutines. Included are subroutines for plot area and axis definition; text set-up and display; area blanking; line style set-up, interpolation, and plotting; color shading and pattern control; legend, text block, and character control; device initialization; mixed alphabets setting; and other useful functions. The usefulness of many routines is dependent on the prior definition of basic parameters. The program's control structure uses a serial-level construct with each routine restricted for activation at some prescribed level(s) of problem definition. NASADIG provides the following output device drivers: Selanar 100XL, VECTOR Move/Draw ASCII and PostScript files, Tektronix 40xx, 41xx, and 4510 Rasterizer, DEC VT-240 (4014 mode), IBM AT/PC compatible with SmartTerm 240 emulator, HP Lasergrafix Film Recorder, QMS 800/1200, DEC LN03+ Laserprinters, and HP LaserJet (Series III). NASADIG is written in FORTRAN and is available for several platforms. NASADIG 5.7 is available for DEC VAX series computers running VMS 5.0 or later (MSC-21801), Cray X-MP and Y-MP series computers running UNICOS (COS-10049), and Amdahl 5990 mainframe computers running UTS (COS-10050). NASADIG 5.1 is available for UNIX-based operating systems (MSC-22001). The UNIX version has been successfully implemented on Sun4 series computers running SunOS, SGI IRIS computers running IRIX, Hewlett Packard 9000 computers running HP-UX, and Convex computers running Convex OS (MSC-22001). The standard distribution medium for MSC-21801 is a set of two 6250 BPI 9-track magnetic tapes in DEC VAX BACKUP format. It is also available on a set of two TK50 tape cartridges in DEC VAX BACKUP format. The standard distribution medium for COS-10049 and COS-10050 is a 6250 BPI 9-track magnetic tape in UNIX tar format. Other distribution media and formats may be available upon request. The standard distribution medium for MSC-22001 is a .25 inch streaming magnetic tape cartridge (Sun QIC-24) in UNIX tar format. Alternate distribution media and formats are available upon request. With minor modification, the UNIX source code can be ported to other platforms including IBM PC/AT series computers and compatibles. NASADIG is also available bundled with TRASYS, the Thermal Radiation Analysis System (COS-10026, DEC VAX version; COS-10040, CRAY version).
NASADIG - NASA DEVICE INDEPENDENT GRAPHICS LIBRARY (UNIX VERSION)
NASA Technical Reports Server (NTRS)
Rogers, J. E.
1994-01-01
The NASA Device Independent Graphics Library, NASADIG, can be used with many computer-based engineering and management applications. The library gives the user the opportunity to translate data into effective graphic displays for presentation. The software offers many features which allow the user flexibility in creating graphics. These include two-dimensional plots, subplot projections in 3D-space, surface contour line plots, and surface contour color-shaded plots. Routines for three-dimensional plotting, wireframe surface plots, surface plots with hidden line removal, and surface contour line plots are provided. Other features include polar and spherical coordinate plotting, world map plotting utilizing either cylindrical equidistant or Lambert equal area projection, plot translation, plot rotation, plot blowup, splines and polynomial interpolation, area blanking control, multiple log/linear axes, legends and text control, curve thickness control, and multiple text fonts (18 regular, 4 bold). NASADIG contains several groups of subroutines. Included are subroutines for plot area and axis definition; text set-up and display; area blanking; line style set-up, interpolation, and plotting; color shading and pattern control; legend, text block, and character control; device initialization; mixed alphabets setting; and other useful functions. The usefulness of many routines is dependent on the prior definition of basic parameters. The program's control structure uses a serial-level construct with each routine restricted for activation at some prescribed level(s) of problem definition. NASADIG provides the following output device drivers: Selanar 100XL, VECTOR Move/Draw ASCII and PostScript files, Tektronix 40xx, 41xx, and 4510 Rasterizer, DEC VT-240 (4014 mode), IBM AT/PC compatible with SmartTerm 240 emulator, HP Lasergrafix Film Recorder, QMS 800/1200, DEC LN03+ Laserprinters, and HP LaserJet (Series III). NASADIG is written in FORTRAN and is available for several platforms. NASADIG 5.7 is available for DEC VAX series computers running VMS 5.0 or later (MSC-21801), Cray X-MP and Y-MP series computers running UNICOS (COS-10049), and Amdahl 5990 mainframe computers running UTS (COS-10050). NASADIG 5.1 is available for UNIX-based operating systems (MSC-22001). The UNIX version has been successfully implemented on Sun4 series computers running SunOS, SGI IRIS computers running IRIX, Hewlett Packard 9000 computers running HP-UX, and Convex computers running Convex OS (MSC-22001). The standard distribution medium for MSC-21801 is a set of two 6250 BPI 9-track magnetic tapes in DEC VAX BACKUP format. It is also available on a set of two TK50 tape cartridges in DEC VAX BACKUP format. The standard distribution medium for COS-10049 and COS-10050 is a 6250 BPI 9-track magnetic tape in UNIX tar format. Other distribution media and formats may be available upon request. The standard distribution medium for MSC-22001 is a .25 inch streaming magnetic tape cartridge (Sun QIC-24) in UNIX tar format. Alternate distribution media and formats are available upon request. With minor modification, the UNIX source code can be ported to other platforms including IBM PC/AT series computers and compatibles. NASADIG is also available bundled with TRASYS, the Thermal Radiation Analysis System (COS-10026, DEC VAX version; COS-10040, CRAY version).
Fargier, Raphaël; Laganaro, Marina
2016-01-01
Running a concurrent task while speaking clearly interferes with speech planning, but whether verbal vs. non-verbal tasks interfere with the same processes is virtually unknown. We investigated the neural dynamics of dual-task interference on word production using event-related potentials (ERPs) with either tones or syllables as concurrent stimuli. Participants produced words from pictures in three conditions: without distractors, while passively listening to distractors and during a distractor detection task. Production latencies increased for tasks with higher attentional demand and were longer for syllables relative to tones. ERP analyses revealed common modulations by dual-task for verbal and non-verbal stimuli around 240 ms, likely corresponding to lexical selection. Modulations starting around 350 ms prior to vocal onset were only observed when verbal stimuli were involved. These later modulations, likely reflecting interference with phonological-phonetic encoding, were observed only when overlap between tasks was maximal and the same underlying neural circuits were engaged (cross-talk).
Multiple running speed signals in medial entorhinal cortex
Hinman, James R.; Brandon, Mark P.; Climer, Jason R.; Chapman, G. William; Hasselmo, Michael E.
2016-01-01
Grid cells in medial entorhinal cortex (MEC) can be modeled using oscillatory interference or attractor dynamic mechanisms that perform path integration, a computation requiring information about running direction and speed. The two classes of computational models often use either an oscillatory frequency or a firing rate that increases as a function of running speed. Yet it is currently not known whether these are two manifestations of the same speed signal or dissociable signals with potentially different anatomical substrates. We examined coding of running speed in MEC and identified these two speed signals to be independent of each other within individual neurons. The medial septum (MS) is strongly linked to locomotor behavior and removal of MS input resulted in strengthening of the firing rate speed signal, while decreasing the strength of the oscillatory speed signal. Thus two speed signals are present in MEC that are differentially affected by disrupted MS input. PMID:27427460
Running SINDA '85/FLUINT interactive on the VAX
NASA Technical Reports Server (NTRS)
Simmonds, Boris
1992-01-01
Computer software as engineering tools are typically run in three modes: Batch, Demand, and Interactive. The first two are the most popular in the SINDA world. The third one is not so popular, due probably to the users inaccessibility to the command procedure files for running SINDA '85, or lack of familiarity with the SINDA '85 execution processes (pre-processor, processor, compilation, linking, execution and all of the file assignment, creation, deletions and de-assignments). Interactive is the mode that makes thermal analysis with SINDA '85 a real-time design tool. This paper explains a command procedure sufficient (the minimum modifications required in an existing demand command procedure) to run SINDA '85 on the VAX in an interactive mode. To exercise the procedure a sample problem is presented exemplifying the mode, plus additional programming capabilities available in SINDA '85. Following the same guidelines the process can be extended to other SINDA '85 residence computer platforms.
Multi-GPGPU Tsunami simulation at Toyama-bay
NASA Astrophysics Data System (ADS)
Furuyama, Shoichi; Ueda, Yuki
2017-07-01
Accelerated multi General Purpose Graphics Processing Unit (GPGPU) calculation for Tsunami run-up simulation was achieved at the wide area (whole Toyama-bay in Japan) by faster computation technique. Toyama-bay has active-faults at the sea-bed. It has a high possibility to occur earthquakes and Tsunami waves in the case of the huge earthquake, that's why to predict the area of Tsunami run-up is important for decreasing damages to residents by the disaster. However it is very hard task to achieve the simulation by the computer resources problem. A several meter's order of the high resolution calculation is required for the running-up Tsunami simulation because artificial structures on the ground such as roads, buildings, and houses are very small. On the other hand the huge area simulation is also required. In the Toyama-bay case the area is 42 [km] × 15 [km]. When 5 [m] × 5 [m] size computational cells are used for the simulation, over 26,000,000 computational cells are generated. To calculate the simulation, a normal CPU desktop computer took about 10 hours for the calculation. An improvement of calculation time is important problem for the immediate prediction system of Tsunami running-up, as a result it will contribute to protect a lot of residents around the coastal region. The study tried to decrease this calculation time by using multi GPGPU system which is equipped with six NVIDIA TESLA K20xs, InfiniBand network connection between computer nodes by MVAPICH library. As a result 5.16 times faster calculation was achieved on six GPUs than one GPU case and it was 86% parallel efficiency to the linear speed up.
SOI layout decomposition for double patterning lithography on high-performance computer platforms
NASA Astrophysics Data System (ADS)
Verstov, Vladimir; Zinchenko, Lyudmila; Makarchuk, Vladimir
2014-12-01
In the paper silicon on insulator layout decomposition algorithms for the double patterning lithography on high performance computing platforms are discussed. Our approach is based on the use of a contradiction graph and a modified concurrent breadth-first search algorithm. We evaluate our technique on 45 nm Nangate Open Cell Library including non-Manhattan geometry. Experimental results show that our soft computing algorithms decompose layout successfully and a minimal distance between polygons in layout is increased.
SEI Report on Graduate Software Engineering Education for 1991
1991-04-01
12, 12 (Dec. 1979), 85-94. Andrews83 Andrews, Gregory R . and Schneider, Fred B. “Concepts and Notations for Concurrent Programming.” ACM Computing...Barringer87 Barringer , H. “Up and Down the Temporal Way.” Computer J. 30, 2 (Apr. 1987), 134-148. Bjørner78 The Vienna Development Method: The Meta-Language...Lecture Notes in Computer Science. Bruns86 Bruns, Glenn R . Technology Assessment: PAISLEY. Tech. Rep. MCC TR STP-296-86, MCC, Austin, Texas, Sept
NASA Technical Reports Server (NTRS)
Gilbert, Percy; Jones, Robert E.; Kramarchuk, Ihor; Williams, Wallace D.; Pouch, John J.
1987-01-01
Using a recently developed technology called thermal-wave microscopy, NASA Lewis Research Center has developed a computer controlled submicron thermal-wave microscope for the purpose of investigating III-V compound semiconductor devices and materials. This paper describes the system's design and configuration and discusses the hardware and software capabilities. Knowledge of the Concurrent 3200 series computers is needed for a complete understanding of the material presented. However, concepts and procedures are of general interest.
An analysis of running skyline load path.
Ward W. Carson; Charles N. Mann
1971-01-01
This paper is intended for those who wish to prepare an algorithm to determine the load path of a running skyline. The mathematics of a simplified approach to this running skyline design problem are presented. The approach employs assumptions which reduce the complexity of the problem to the point where it can be solved on desk-top computers of limited capacities. The...
Job Priorities on Peregrine | High-Performance Computing | NREL
allocation when run with qos=high. Requesting a Node Reservation If you are doing work that requires real scheduler more efficiently plan resources for larger jobs. When projects reach their allocation limit, jobs associated with those projects will run at very low priority, which will ensure that these jobs run only when
Running High-Throughput Jobs on Peregrine | High-Performance Computing |
unique name (using "name=") and usse the task name to create a unique output file name. For runs on and how many tasks to give to each worker at a time using the NITRO_COORD_OPTIONS environment . Finally, you start Nitro by executing launch_nitro.sh. Sample Nitro job script To run a job using the
a Fast and Flexible Method for Meta-Map Building for Icp Based Slam
NASA Astrophysics Data System (ADS)
Kurian, A.; Morin, K. W.
2016-06-01
Recent developments in LiDAR sensors make mobile mapping fast and cost effective. These sensors generate a large amount of data which in turn improves the coverage and details of the map. Due to the limited range of the sensor, one has to collect a series of scans to build the entire map of the environment. If we have good GNSS coverage, building a map is a well addressed problem. But in an indoor environment, we have limited GNSS reception and an inertial solution, if available, can quickly diverge. In such situations, simultaneous localization and mapping (SLAM) is used to generate a navigation solution and map concurrently. SLAM using point clouds possesses a number of computational challenges even with modern hardware due to the shear amount of data. In this paper, we propose two strategies for minimizing the cost of computation and storage when a 3D point cloud is used for navigation and real-time map building. We have used the 3D point cloud generated by Leica Geosystems's Pegasus Backpack which is equipped with Velodyne VLP-16 LiDARs scanners. To improve the speed of the conventional iterative closest point (ICP) algorithm, we propose a point cloud sub-sampling strategy which does not throw away any key features and yet significantly reduces the number of points that needs to be processed and stored. In order to speed up the correspondence finding step, a dual kd-tree and circular buffer architecture is proposed. We have shown that the proposed method can run in real time and has excellent navigation accuracy characteristics.
Computer Modeling to Evaluate the Impact of Technology Changes on Resident Procedural Volume.
Grenda, Tyler R; Ballard, Tiffany N S; Obi, Andrea T; Pozehl, William; Seagull, F Jacob; Chen, Ryan; Cohn, Amy M; Daskin, Mark S; Reddy, Rishindra M
2016-12-01
As resident "index" procedures change in volume due to advances in technology or reliance on simulation, it may be difficult to ensure trainees meet case requirements. Training programs are in need of metrics to determine how many residents their institutional volume can support. As a case study of how such metrics can be applied, we evaluated a case distribution simulation model to examine program-level mediastinoscopy and endobronchial ultrasound (EBUS) volumes needed to train thoracic surgery residents. A computer model was created to simulate case distribution based on annual case volume, number of trainees, and rotation length. Single institutional case volume data (2011-2013) were applied, and 10 000 simulation years were run to predict the likelihood (95% confidence interval) of all residents (4 trainees) achieving board requirements for operative volume during a 2-year program. The mean annual mediastinoscopy volume was 43. In a simulation of pre-2012 board requirements (thoracic pathway, 25; cardiac pathway, 10), there was a 6% probability of all 4 residents meeting requirements. Under post-2012 requirements (thoracic, 15; cardiac, 10), however, the likelihood increased to 88%. When EBUS volume (mean 19 cases per year) was concurrently evaluated in the post-2012 era (thoracic, 10; cardiac, 0), the likelihood of all 4 residents meeting case requirements was only 23%. This model provides a metric to predict the probability of residents meeting case requirements in an era of changing volume by accounting for unpredictable and inequitable case distribution. It could be applied across operations, procedures, or disease diagnoses and may be particularly useful in developing resident curricula and schedules.
AlgoRun: a Docker-based packaging system for platform-agnostic implemented algorithms.
Hosny, Abdelrahman; Vera-Licona, Paola; Laubenbacher, Reinhard; Favre, Thibauld
2016-08-01
There is a growing need in bioinformatics for easy-to-use software implementations of algorithms that are usable across platforms. At the same time, reproducibility of computational results is critical and often a challenge due to source code changes over time and dependencies. The approach introduced in this paper addresses both of these needs with AlgoRun, a dedicated packaging system for implemented algorithms, using Docker technology. Implemented algorithms, packaged with AlgoRun, can be executed through a user-friendly interface directly from a web browser or via a standardized RESTful web API to allow easy integration into more complex workflows. The packaged algorithm includes the entire software execution environment, thereby eliminating the common problem of software dependencies and the irreproducibility of computations over time. AlgoRun-packaged algorithms can be published on http://algorun.org, a centralized searchable directory to find existing AlgoRun-packaged algorithms. AlgoRun is available at http://algorun.org and the source code under GPL license is available at https://github.com/algorun laubenbacher@uchc.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Greenough, Jeffrey A.; de Supinski, Bronis R.; Yates, Robert K.
2005-04-25
We describe the performance of the block-structured Adaptive Mesh Refinement (AMR) code Raptor on the 32k node IBM BlueGene/L computer. This machine represents a significant step forward towards petascale computing. As such, it presents Raptor with many challenges for utilizing the hardware efficiently. In terms of performance, Raptor shows excellent weak and strong scaling when running in single level mode (no adaptivity). Hardware performance monitors show Raptor achieves an aggregate performance of 3:0 Tflops in the main integration kernel on the 32k system. Results from preliminary AMR runs on a prototype astrophysical problem demonstrate the efficiency of the current softwaremore » when running at large scale. The BG/L system is enabling a physics problem to be considered that represents a factor of 64 increase in overall size compared to the largest ones of this type computed to date. Finally, we provide a description of the development work currently underway to address our inefficiencies.« less
ASDIR-II. Volume I. User Manual
1975-12-01
normally the most significant part of the overall aircraft IR signature. The 4 radiance is directly dependent upon the geometric view factors , a set...tactors as punched card output in. a view factor computer run. For the view factor computer run IB49 through 53 and all IDS input A, from IDS-2 to IDS-6...may be excluded from the input string if the * program execution is requested to stop after punching the viewv factors . Inputs required for punching
Feasibility of Virtual Machine and Cloud Computing Technologies for High Performance Computing
2014-05-01
Hat Enterprise Linux SaaS software as a service VM virtual machine vNUMA virtual non-uniform memory access WRF weather research and forecasting...previously mentioned in Chapter I Section B1 of this paper, which is used to run the weather research and forecasting ( WRF ) model in their experiments...against a VMware virtualization solution of WRF . The experiment consisted of running WRF in a standard configuration between the D-VTM and VMware while
The Air Force Geophysics Laboratory Standalone Data Acquisition System: A Functional Description.
1980-10-09
the board are a buffer for the RUN/HALT front panel switch and a retriggerable oneshot multivibrator. This latter circuit senses the SRUN pulse train...recording on the data tapes, and providing the master timing source for data acquisition. An Electronic Research Company (ERC) model 2446 digital...the computer is fed to a retriggerable oneshot multivibrator on the board. (SRUN consists of a pulse train that is present when the computer is running
Improved Algorithms Speed It Up for Codes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hazi, A
2005-09-20
Huge computers, huge codes, complex problems to solve. The longer it takes to run a code, the more it costs. One way to speed things up and save time and money is through hardware improvements--faster processors, different system designs, bigger computers. But another side of supercomputing can reap savings in time and speed: software improvements to make codes--particularly the mathematical algorithms that form them--run faster and more efficiently. Speed up math? Is that really possible? According to Livermore physicist Eugene Brooks, the answer is a resounding yes. ''Sure, you get great speed-ups by improving hardware,'' says Brooks, the deputy leadermore » for Computational Physics in N Division, which is part of Livermore's Physics and Advanced Technologies (PAT) Directorate. ''But the real bonus comes on the software side, where improvements in software can lead to orders of magnitude improvement in run times.'' Brooks knows whereof he speaks. Working with Laboratory physicist Abraham Szoeke and others, he has been instrumental in devising ways to shrink the running time of what has, historically, been a tough computational nut to crack: radiation transport codes based on the statistical or Monte Carlo method of calculation. And Brooks is not the only one. Others around the Laboratory, including physicists Andrew Williamson, Randolph Hood, and Jeff Grossman, have come up with innovative ways to speed up Monte Carlo calculations using pure mathematics.« less
Hari, Pradip; Ko, Kevin; Koukoumidis, Emmanouil; Kremer, Ulrich; Martonosi, Margaret; Ottoni, Desiree; Peh, Li-Shiuan; Zhang, Pei
2008-10-28
Increasingly, spatial awareness plays a central role in many distributed and mobile computing applications. Spatially aware applications rely on information about the geographical position of compute devices and their supported services in order to support novel functionality. While many spatial application drivers already exist in mobile and distributed computing, very little systems research has explored how best to program these applications, to express their spatial and temporal constraints, and to allow efficient implementations on highly dynamic real-world platforms. This paper proposes the SARANA system architecture, which includes language and run-time system support for spatially aware and resource-aware applications. SARANA allows users to express spatial regions of interest, as well as trade-offs between quality of result (QoR), latency and cost. The goal is to produce applications that use resources efficiently and that can be run on diverse resource-constrained platforms ranging from laptops to personal digital assistants and to smart phones. SARANA's run-time system manages QoR and cost trade-offs dynamically by tracking resource availability and locations, brokering usage/pricing agreements and migrating programs to nodes accordingly. A resource cost model permeates the SARANA system layers, permitting users to express their resource needs and QoR expectations in units that make sense to them. Although we are still early in the system development, initial versions have been demonstrated on a nine-node system prototype.
Multithreaded transactions in scientific computing. The Growth06_v2 program
NASA Astrophysics Data System (ADS)
Daniluk, Andrzej
2009-07-01
Writing a concurrent program can be more difficult than writing a sequential program. Programmer needs to think about synchronization, race conditions and shared variables. Transactions help reduce the inconvenience of using threads. A transaction is an abstraction, which allows programmers to group a sequence of actions on the program into a logical, higher-level computation unit. This paper presents a new version of the GROWTHGr and GROWTH06 programs. New version program summaryProgram title: GROWTH06_v2 Catalogue identifier: ADVL_v2_1 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/ADVL_v2_1.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 65 255 No. of bytes in distributed program, including test data, etc.: 865 985 Distribution format: tar.gz Programming language: Object Pascal Computer: Pentium-based PC Operating system: Windows 9x, XP, NT, Vista RAM: more than 1 MB Classification: 4.3, 7.2, 6.2, 8, 14 Catalogue identifier of previous version: ADVL_v2_0 Journal reference of previous version: Comput. Phys. Comm. 175 (2006) 678 Does the new version supersede the previous version?: Yes Nature of problem: The programs compute the RHEED intensities during the growth of thin epitaxial structures prepared using the molecular beam epitaxy (MBE). The computations are based on the use of kinematical diffraction theory. Solution method: Epitaxial growth of thin films is modelled by a set of non-linear differential equations [1]. The Runge-Kutta method with adaptive stepsize control was used for solving initial value problem for non-linear differential equations [2]. Reasons for new version: According to the users' suggestions functionality of the program has been improved. Moreover, new use cases have been added which make the handling of the program easier and more efficient than the previous ones [3]. Summary of revisions:The design pattern (See Fig. 2 of Ref. [3]) has been modified according to the scheme shown on Fig. 1. A graphical user interface (GUI) for the program has been reconstructed. Fig. 2 presents a hybrid diagram of a GUI that shows how onscreen objects connect to use cases. The program has been compiled with English/USA regional and language options. Note: The figures mentioned above are contained in the program distribution file. Unusual features: The program is distributed in the form of source project GROWTH06_v2.dpr with associated files, and should be compiled using Borland Delphi compilers versions 6 or latter (including Borland Developer Studio 2006 and Code Gear compilers for Delphi). Additional comments: Two figures are included in the program distribution file. These are captioned Static classes model for Transaction design pattern. A model of a window that shows how onscreen objects connect to use cases. Running time: The typical running time is machine and user-parameters dependent. References: [1] A. Daniluk, Comput. Phys. Comm. 170 (2005) 265. [2] W.H. Press, B.P. Flannery, S.A. Teukolsky, W.T. Vetterling, Numerical Recipes in Pascal: The Art of Scientific Computing, first ed., Cambridge University Press, 1989. [3] M. Brzuszek, A. Daniluk, Comput. Phys. Comm. 175 (2006) 678.
A hardware/software environment to support R D in intelligent machines and mobile robotic systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mann, R.C.
1990-01-01
The Center for Engineering Systems Advanced Research (CESAR) serves as a focal point at the Oak Ridge National Laboratory (ORNL) for basic and applied research in intelligent machines. R D at CESAR addresses issues related to autonomous systems, unstructured (i.e. incompletely known) operational environments, and multiple performing agents. Two mobile robot prototypes (HERMIES-IIB and HERMIES-III) are being used to test new developments in several robot component technologies. This paper briefly introduces the computing environment at CESAR which includes three hypercube concurrent computers (two on-board the mobile robots), a graphics workstation, VAX, and multiple VME-based systems (several on-board the mobile robots).more » The current software environment at CESAR is intended to satisfy several goals, e.g.: code portability, re-usability in different experimental scenarios, modularity, concurrent computer hardware transparent to applications programmer, future support for multiple mobile robots, support human-machine interface modules, and support for integration of software from other, geographically disparate laboratories with different hardware set-ups. 6 refs., 1 fig.« less
Lee, Jae H.; Yao, Yushu; Shrestha, Uttam; Gullberg, Grant T.; Seo, Youngho
2014-01-01
The primary goal of this project is to implement the iterative statistical image reconstruction algorithm, in this case maximum likelihood expectation maximum (MLEM) used for dynamic cardiac single photon emission computed tomography, on Spark/GraphX. This involves porting the algorithm to run on large-scale parallel computing systems. Spark is an easy-to- program software platform that can handle large amounts of data in parallel. GraphX is a graph analytic system running on top of Spark to handle graph and sparse linear algebra operations in parallel. The main advantage of implementing MLEM algorithm in Spark/GraphX is that it allows users to parallelize such computation without any expertise in parallel computing or prior knowledge in computer science. In this paper we demonstrate a successful implementation of MLEM in Spark/GraphX and present the performance gains with the goal to eventually make it useable in clinical setting. PMID:27081299
Lee, Jae H; Yao, Yushu; Shrestha, Uttam; Gullberg, Grant T; Seo, Youngho
2014-11-01
The primary goal of this project is to implement the iterative statistical image reconstruction algorithm, in this case maximum likelihood expectation maximum (MLEM) used for dynamic cardiac single photon emission computed tomography, on Spark/GraphX. This involves porting the algorithm to run on large-scale parallel computing systems. Spark is an easy-to- program software platform that can handle large amounts of data in parallel. GraphX is a graph analytic system running on top of Spark to handle graph and sparse linear algebra operations in parallel. The main advantage of implementing MLEM algorithm in Spark/GraphX is that it allows users to parallelize such computation without any expertise in parallel computing or prior knowledge in computer science. In this paper we demonstrate a successful implementation of MLEM in Spark/GraphX and present the performance gains with the goal to eventually make it useable in clinical setting.
EPA'S TOXICOGENOMICS PARTNERSHIPS ACROSS GOVERNMENT, ACADEMIA AND INDUSTRY
Genomics, proteomics and metabonomics technologies are transforming the science of toxicology, and concurrent advances in computing and informatics are providing management and analysis solutions for this onslaught of toxicogenomic data. EPA has been actively developing an intra...
Power module Data Management System (DMS) study
NASA Technical Reports Server (NTRS)
1978-01-01
Computer trades and analyses of selected Power Module Data Management Subsystem issues to support concurrent inhouse MSFC Power Study are provided. The charts which summarize and describe the results are presented. Software requirements and definitions are included.
Methods and devices for determining quality of services of storage systems
Seelam, Seetharami R [Yorktown Heights, NY; Teller, Patricia J [Las Cruces, NM
2012-01-17
Methods and systems for allowing access to computer storage systems. Multiple requests from multiple applications can be received and processed efficiently to allow traffic from multiple customers to access the storage system concurrently.
NASA Astrophysics Data System (ADS)
Sapra, Karan; Gupta, Saurabh; Atchley, Scott; Anantharaj, Valentine; Miller, Ross; Vazhkudai, Sudharshan
2016-04-01
Efficient resource utilization is critical for improved end-to-end computing and workflow of scientific applications. Heterogeneous node architectures, such as the GPU-enabled Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), present us with further challenges. In many HPC applications on Titan, the accelerators are the primary compute engines while the CPUs orchestrate the offloading of work onto the accelerators, and moving the output back to the main memory. On the other hand, applications that do not exploit GPUs, the CPU usage is dominant while the GPUs idle. We utilized Heterogenous Functional Partitioning (HFP) runtime framework that can optimize usage of resources on a compute node to expedite an application's end-to-end workflow. This approach is different from existing techniques for in-situ analyses in that it provides a framework for on-the-fly analysis on-node by dynamically exploiting under-utilized resources therein. We have implemented in the Community Earth System Model (CESM) a new concurrent diagnostic processing capability enabled by the HFP framework. Various single variate statistics, such as means and distributions, are computed in-situ by launching HFP tasks on the GPU via the node local HFP daemon. Since our current configuration of CESM does not use GPU resources heavily, we can move these tasks to GPU using the HFP framework. Each rank running the atmospheric model in CESM pushes the variables of of interest via HFP function calls to the HFP daemon. This node local daemon is responsible for receiving the data from main program and launching the designated analytics tasks on the GPU. We have implemented these analytics tasks in C and use OpenACC directives to enable GPU acceleration. This methodology is also advantageous while executing GPU-enabled configurations of CESM when the CPUs will be idle during portions of the runtime. In our implementation results, we demonstrate that it is more efficient to use HFP framework to offload the tasks to GPUs instead of doing it in the main application. We observe increased resource utilization and overall productivity in this approach by using HFP framework for end-to-end workflow.
Algorithms and software for nonlinear structural dynamics
NASA Technical Reports Server (NTRS)
Belytschko, Ted; Gilbertsen, Noreen D.; Neal, Mark O.
1989-01-01
The objective of this research is to develop efficient methods for explicit time integration in nonlinear structural dynamics for computers which utilize both concurrency and vectorization. As a framework for these studies, the program WHAMS, which is described in Explicit Algorithms for the Nonlinear Dynamics of Shells (T. Belytschko, J. I. Lin, and C.-S. Tsay, Computer Methods in Applied Mechanics and Engineering, Vol. 42, 1984, pp 225 to 251), is used. There are two factors which make the development of efficient concurrent explicit time integration programs a challenge in a structural dynamics program: (1) the need for a variety of element types, which complicates the scheduling-allocation problem; and (2) the need for different time steps in different parts of the mesh, which is here called mixed delta t integration, so that a few stiff elements do not reduce the time steps throughout the mesh.
NASA Technical Reports Server (NTRS)
Menon, R. G.; Kurdila, A. J.
1992-01-01
This paper presents a concurrent methodology to simulate the dynamics of flexible multibody systems with a large number of degrees of freedom. A general class of open-loop structures is treated and a redundant coordinate formulation is adopted. A range space method is used in which the constraint forces are calculated using a preconditioned conjugate gradient method. By using a preconditioner motivated by the regular ordering of the directed graph of the structures, it is shown that the method is order N in the total number of coordinates of the system. The overall formulation has the advantage that it permits fine parallelization and does not rely on system topology to induce concurrency. It can be efficiently implemented on the present generation of parallel computers with a large number of processors. Validation of the method is presented via numerical simulations of space structures incorporating large number of flexible degrees of freedom.
Strategies for concurrent processing of complex algorithms in data driven architectures
NASA Technical Reports Server (NTRS)
Stoughton, John W.; Mielke, Roland R.
1987-01-01
The results of ongoing research directed at developing a graph theoretical model for describing data and control flow associated with the execution of large grained algorithms in a spatial distributed computer environment is presented. This model is identified by the acronym ATAMM (Algorithm/Architecture Mapping Model). The purpose of such a model is to provide a basis for establishing rules for relating an algorithm to its execution in a multiprocessor environment. Specifications derived from the model lead directly to the description of a data flow architecture which is a consequence of the inherent behavior of the data and control flow described by the model. The purpose of the ATAMM based architecture is to optimize computational concurrency in the multiprocessor environment and to provide an analytical basis for performance evaluation. The ATAMM model and architecture specifications are demonstrated on a prototype system for concept validation.
The engineering design integration (EDIN) system. [digital computer program complex
NASA Technical Reports Server (NTRS)
Glatt, C. R.; Hirsch, G. N.; Alford, G. E.; Colquitt, W. N.; Reiners, S. J.
1974-01-01
A digital computer program complex for the evaluation of aerospace vehicle preliminary designs is described. The system consists of a Univac 1100 series computer and peripherals using the Exec 8 operating system, a set of demand access terminals of the alphanumeric and graphics types, and a library of independent computer programs. Modification of the partial run streams, data base maintenance and construction, and control of program sequencing are provided by a data manipulation program called the DLG processor. The executive control of library program execution is performed by the Univac Exec 8 operating system through a user established run stream. A combination of demand and batch operations is employed in the evaluation of preliminary designs. Applications accomplished with the EDIN system are described.