Using TM for high-performance Discrete-Event Simulation on multi-core architectures
Olivier Dalle
EuroTM 2013
Parallel Discrete Event Simulation (PDES) at ORNL
Kalyan S. Perumalla, Ph.D.
Presented by UT-Battelle for the U.S. Department of Energy. Applications at ORNL span system design, application instrumentation, performance tuning, debugging, and cyber infrastructure.
The effects of parallel processing architectures on discrete event simulation
NASA Astrophysics Data System (ADS)
Cave, William; Slatt, Edward; Wassmer, Robert E.
2005-05-01
As systems become more complex, particularly those containing embedded decision algorithms, mathematical modeling presents a rigid framework that often impedes representation to a sufficient level of detail. Using discrete event simulation, one can build models that more closely represent physical reality, with actual algorithms incorporated in the simulations. Higher levels of detail increase simulation run time. Hardware designers have succeeded in producing parallel and distributed processor computers with theoretical speeds well into the teraflop range. However, the practical use of these machines on all but some very special problems is extremely limited. The inability to use this power is due to great difficulties encountered when trying to translate real world problems into software that makes effective use of highly parallel machines. This paper addresses the application of parallel processing to simulations of real world systems of varying inherent parallelism. It provides a brief background in modeling and simulation validity and describes a parameter that can be used in discrete event simulation to vary opportunities for parallel processing at the expense of absolute time synchronization, constrained by validity. It focuses on the effects of model architecture, run-time software architecture, and parallel processor architecture on speed, while providing an environment where modelers can achieve sufficient model accuracy to produce valid simulation results. It describes an approach to simulation development that captures subject area expert knowledge to leverage inherent parallelism in systems in the following ways:
* Data structures are separated from instructions to track which instruction sets share what data. This is used to determine independence and thus the potential for concurrent processing at run-time.
* Model connectivity (independence) can be inspected visually to determine whether the inherent parallelism of a physical system is properly represented. Models need not be changed to move from a single processor to parallel processor hardware architectures.
* Knowledge of the architectural parallelism is stored within the system and used during run time to allocate processors to processes in a maximally efficient way.
Parallel discrete-event simulation of FCFS stochastic queueing networks
NASA Technical Reports Server (NTRS)
Nicol, David M.
1988-01-01
Physical systems are inherently parallel. Intuition suggests that simulations of these systems may be amenable to parallel execution. The parallel execution of a discrete-event simulation requires careful synchronization of processes in order to ensure the execution's correctness; this synchronization can degrade performance. Largely negative results were recently reported in a study which used a well-known synchronization method on queueing network simulations. Discussed here is a synchronization method (appointments) which has proven effective on simulations of FCFS queueing networks. The key concept behind appointments is the provision of lookahead. Lookahead is a prediction of a processor's future behavior, based on an analysis of the processor's simulation state. It is shown how lookahead can be computed for FCFS queueing network simulations; performance data are given that demonstrate the method's effectiveness under moderate to heavy loads, and performance tradeoffs between the quality of lookahead and the cost of computing it are discussed.
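The lookahead idea behind appointments can be sketched in a few lines. The function below is an illustrative reading of the abstract, not code from the paper; the name `fcfs_lookahead` and its arguments are our assumptions.

```python
def fcfs_lookahead(now, busy_until, min_service_time):
    """Earliest time this FCFS server could possibly emit a departure.

    A downstream neighbor may safely simulate all events with timestamps
    strictly below this bound (an "appointment") without risking a
    causality error.
    """
    if busy_until > now:
        # A job is in service: nothing can depart before it finishes.
        return busy_until
    # Server idle: even a job arriving right now needs min_service_time.
    return now + min_service_time
```

The further the bound can be pushed ahead of the local clock, the less often neighbors must block, which is the quality-versus-cost tradeoff the abstract mentions.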
Parallel Discrete Event Simulation
Benno Overeinder; Bob Hertzberger; Peter Sloot
...smoothly and continuously. A model is a discrete time model if time flows in jumps of some specified time. Continuous time models can be further divided into differential equation and discrete event classes. Modelling and simulation can be characterized
The cost of conservative synchronization in parallel discrete event simulations
NASA Technical Reports Server (NTRS)
Nicol, David M.
1990-01-01
The performance of a synchronous conservative parallel discrete-event simulation protocol is analyzed. The class of simulation models considered is oriented around a physical domain and possesses a limited ability to predict future behavior. A stochastic model is used to show that as the volume of simulation activity in the model increases relative to a fixed architecture, the complexity of the average per-event overhead due to synchronization, event list manipulation, lookahead calculations, and processor idle time approaches the complexity of the average per-event overhead of a serial simulation. The method is therefore within a constant factor of optimal. The analysis demonstrates that on large problems--those for which parallel processing is ideally suited--there is often enough parallel workload so that processors are not usually idle. The viability of the method is also demonstrated empirically, showing how good performance is achieved on large problems using a thirty-two node Intel iPSC/2 distributed memory multiprocessor.
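One synchronous conservative round of the kind analyzed above can be sketched minimally: each logical process computes a safe bound from its neighbors' clocks plus lookahead, processes everything below it, then all processes barrier. Names and the bare-timestamp event representation are our simplifications, not the paper's.

```python
def safe_bound(neighbor_clocks, lookahead):
    """No neighbor can send a message earlier than its clock + lookahead,
    so events below this bound are safe to process this round."""
    return min(c + lookahead for c in neighbor_clocks)

def process_round(event_list, neighbor_clocks, lookahead):
    """One synchronous round: split pending events into those safe to
    process now and those deferred to a later round (after a barrier)."""
    bound = safe_bound(neighbor_clocks, lookahead)
    safe = [e for e in event_list if e < bound]
    remaining = [e for e in event_list if e >= bound]
    return sorted(safe), remaining
```

The per-round overhead (the bound computation, the barrier, and any idle time waiting for it) is exactly the synchronization cost whose asymptotic behavior the paper bounds.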
Synchronous parallel system for emulation and discrete event simulation
NASA Technical Reports Server (NTRS)
Steinman, Jeffrey S. (inventor)
1992-01-01
A synchronous parallel system for emulation and discrete event simulation having parallel nodes responds to received messages at each node by generating event objects having individual time stamps, stores only the changes to state variables of the simulation object attributable to the event object, and produces corresponding messages. The system refrains from transmitting the messages and changing the state variables while it determines whether the changes are superseded, and then stores the unchanged state variables in the event object for later restoral to the simulation object if called for. This determination preferably includes sensing the time stamp of each new event object and determining which new event object has the earliest time stamp as the local event horizon, determining the earliest local event horizon of the nodes as the global event horizon, and ignoring the events whose time stamps are less than the global event horizon. Host processing between the system and external terminals enables such a terminal to query, monitor, command or participate with a simulation object during the simulation process.
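The local/global event horizon computation described in the patent can be sketched as follows. This is a simplified reading for illustration: events are reduced to bare timestamps, and commit semantics are abbreviated.

```python
def global_event_horizon(new_events_per_node):
    """Each node's local horizon is the earliest timestamp among its newly
    generated events; the global horizon is the minimum over all nodes."""
    local_horizons = [min(ts) for ts in new_events_per_node if ts]
    return min(local_horizons)

def commit_safe_events(pending, horizon):
    """Events at or before the global horizon cannot be invalidated by any
    message still to be sent, so their state changes may be committed;
    later events stay pending for the next cycle."""
    committed = [e for e in pending if e <= horizon]
    deferred = [e for e in pending if e > horizon]
    return sorted(committed), deferred
```

Deferring both message transmission and state-variable updates until this determination is made is what lets the system discard superseded changes cheaply.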
Parallel discrete-event simulation schemes with heterogeneous processing elements
NASA Astrophysics Data System (ADS)
Kim, Yup; Kwon, Ikhyun; Chae, Huiseung; Yook, Soon-Hyung
2014-07-01
To understand the effects of nonidentical processing elements (PEs) on parallel discrete-event simulation (PDES) schemes, two stochastic growth models, the restricted solid-on-solid (RSOS) model and the Family model, are investigated by simulations. The RSOS model is the model for the PDES scheme governed by the Kardar-Parisi-Zhang equation (KPZ scheme). The Family model is the model for the scheme governed by the Edwards-Wilkinson equation (EW scheme). Two kinds of distributions for nonidentical PEs are considered. In the first kind, the computing capacities of the PEs are not very different, whereas in the second kind the capacities are extremely widespread. The KPZ scheme on complex networks shows synchronizability and scalability regardless of the kind of PEs. The EW scheme never shows synchronizability for a random configuration of PEs of the first kind. However, by regularizing the arrangement of PEs of the first kind, the EW scheme can be made to show synchronizability. In contrast, the EW scheme never shows synchronizability for any configuration of PEs of the second kind.
Towards Automated Performance Prediction in Bulk-Synchronous Parallel Discrete-Event Simulation
Mauricio Marín
1999-01-01
This paper discusses the running time cost of performing discrete-event simulation on the bulk-synchronous parallel (BSP) model of computing. The BSP model provides a general purpose framework for parallel computing which is independent of the architecture of the computer and thereby it enables the development of portable software. In addition, the structure of BSP computations allows the accurate determination of
Application of Parallel Discrete Event Simulation to the Space Surveillance Network
NASA Astrophysics Data System (ADS)
Jefferson, D.; Leek, J.
2010-09-01
In this paper we describe how and why we chose parallel discrete event simulation (PDES) as the paradigm for modeling the Space Surveillance Network (SSN) in our modeling framework, TESSA (Testbed Environment for Space Situational Awareness). DES is a simulation paradigm appropriate for systems dominated by discontinuous state changes at times that must be calculated dynamically. It is used primarily for complex man-made systems like telecommunications, vehicular traffic, computer networks, and economic models, although it is also useful for natural systems that are not described by equations, such as particle systems, population dynamics, epidemics, and combat models. It is much less well known than simple time-stepped simulation methods, but has the great advantage of being time scale independent, so that one can freely mix processes that operate at time scales over many orders of magnitude with no runtime performance penalty. In simulating the SSN we model in some detail: (a) the orbital dynamics of up to 10^5 objects, (b) their reflective properties, (c) the ground- and space-based sensor systems in the SSN, (d) the recognition of orbiting objects and determination of their orbits, (e) the cueing and scheduling of sensor observations, (f) the 3-d structure of satellites, and (g) the generation of collision debris. TESSA is thus a mixed continuous-discrete model. But because many different types of discrete objects are involved with such a wide variation in time scale (milliseconds for collisions, hours for orbital periods) it is suitably described using discrete events. The PDES paradigm is surprising and unusual. In any instantaneous runtime snapshot some parts may be far ahead in simulation time while others lag behind, yet the required causal relationships are always maintained and synchronized correctly, exactly as if the simulation were executed sequentially.
The TESSA simulator is custom-built, conservatively synchronized, and designed to scale to thousands of nodes. There are many PDES platforms we might have used, but two requirements led us to build our own. First, the parallel components of our SSN simulation are coded and maintained by separate teams, so TESSA is designed to support transparent coupling and interoperation of separately compiled components written in any of six programming languages. Second, conventional PDES simulators are designed so that while the parallel components run concurrently, each of them is internally sequential, whereas for TESSA we needed to support MPI-based parallelism within each component. The TESSA simulator is still a work in progress and currently has some significant limitations. The paper describes those as well.
Trends in Discrete Event Simulations
Eduard Babulak
2008-01-01
Discrete event simulation technologies have risen and fallen as global manufacturing industries went through radical changes. The changes have created new problems, challenges and opportunities for discrete event simulation. In manufacturing applications, it is no longer an isolated model but distributed modeling and simulation along the supply chain. In order to study hybrid manufacturing systems, it
Thulasidasan, Sunil [Los Alamos National Laboratory; Kasiviswanathan, Shiva [Los Alamos National Laboratory; Eidenbenz, Stephan [Los Alamos National Laboratory; Romero, Philip [Los Alamos National Laboratory
2010-01-01
We re-examine the problem of load balancing in conservatively synchronized parallel, discrete-event simulations executed on high-performance computing clusters, focusing on simulations where computational and messaging load tend to be spatially clustered. Such domains are frequently characterized by the presence of geographic 'hot-spots' - regions that generate significantly more simulation events than others. Examples of such domains include simulations of urban regions, transportation networks and networks where interaction between entities is often constrained by physical proximity. Noting that in conservatively synchronized parallel simulations the speed of execution is determined by the slowest (i.e., most heavily loaded) simulation process, we study different partitioning strategies for achieving equitable processor-load distribution in domains with spatially clustered load. In particular, we study the effectiveness of partitioning via spatial scattering to achieve optimal load balance. In this partitioning technique, nearby entities are explicitly assigned to different processors, thereby scattering the load across the cluster. This is motivated by two observations, namely, (i) since load is spatially clustered, spatial scattering should, intuitively, spread the load across the compute cluster, and (ii) in parallel simulations, equitable distribution of CPU load is a greater determinant of execution speed than message passing overhead. Through large-scale simulation experiments - both of abstracted and real simulation models - we observe that scatter partitioning, even with its greatly increased messaging overhead, significantly outperforms more conventional spatial partitioning techniques that seek to reduce messaging overhead.
Further, even if hot-spots change over the course of the simulation, if the underlying feature of spatial clustering is retained, load continues to be balanced with spatial scattering leading us to the observation that spatial scattering can often obviate the need for dynamic load balancing.
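The contrast between conventional spatial (block) partitioning and scatter partitioning can be shown with a toy sketch; the function names are illustrative, not the authors'.

```python
def block_partition(entities, n_procs):
    """Contiguous blocks: spatially nearby entities land on the same
    processor, so a geographic hot-spot overloads one processor."""
    size = (len(entities) + n_procs - 1) // n_procs
    return [entities[i * size:(i + 1) * size] for i in range(n_procs)]

def scatter_partition(entities, n_procs):
    """Round-robin scatter: nearby entities are deliberately spread out,
    so a hot-spot's load is shared by every processor."""
    return [entities[i::n_procs] for i in range(n_procs)]
```

With a hot-spot confined to, say, the first third of a spatially ordered entity list, block partitioning concentrates the whole hot-spot on one processor while scatter partitioning hands each processor an equal share of it, at the price of more cross-processor messages.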
Optimized Hypervisor Scheduler for Parallel Discrete Event Simulations on Virtual Machine Platforms
Yoginath, Srikanth B [ORNL; Perumalla, Kalyan S [ORNL
2013-01-01
With the advent of virtual machine (VM)-based platforms for parallel computing, it is now possible to execute parallel discrete event simulations (PDES) over multiple virtual machines, in contrast to executing in native mode directly over hardware as has traditionally been done over the past decades. While mature VM-based parallel systems now offer new, compelling benefits such as serviceability, dynamic reconfigurability and overall cost effectiveness, the runtime performance of parallel applications can be significantly affected. In particular, most VM-based platforms are optimized for general workloads, but PDES execution exhibits unique dynamics significantly different from other workloads. Here we first present results from experiments that highlight the gross deterioration of the runtime performance of VM-based PDES simulations when executed using traditional VM schedulers, quantitatively showing the poor scaling properties of the scheduler as the number of VMs is increased. The mismatch is fundamental in nature in the sense that any fairness-based VM scheduler implementation would exhibit it with PDES runs. We also present a new scheduler optimized specifically for PDES applications, and describe its design and implementation. Experimental results obtained from running PDES benchmarks (PHOLD and vehicular traffic simulations) over VMs show over an order of magnitude improvement in the run time under the PDES-optimized scheduler relative to the regular VM scheduler, with over a 20x reduction in run time of simulations using up to 64 VMs. The observations and results are timely in the context of emerging systems such as cloud platforms and VM-based high performance computing installations, highlighting to the community the need for PDES-specific support, and the feasibility of significantly reducing the runtime overhead for scalable PDES on VM platforms.
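The core intuition of a PDES-aware scheduler can be caricatured as a priority rule: dispatch the VM hosting the logical process that is furthest behind in virtual time, since the slowest process gates global simulation progress, whereas a fairness-based scheduler ignores virtual time entirely. This is a deliberate simplification for illustration, not the scheduler's actual design, and the dictionary layout is our assumption.

```python
def pick_next_vm(vms):
    """PDES-aware heuristic: run the VM whose logical process has the
    lowest local virtual time. CPU time given to a far-ahead VM is wasted,
    because the simulation cannot commit past its slowest process."""
    return min(vms, key=lambda vm: vm["virtual_time"])
```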
NASA Technical Reports Server (NTRS)
Steinman, Jeffrey S. (Inventor)
1998-01-01
The present invention is embodied in a method of performing object-oriented simulation and a system having inter-connected processor nodes operating in parallel to simulate mutual interactions of a set of discrete simulation objects distributed among the nodes as a sequence of discrete events changing state variables of respective simulation objects so as to generate new event-defining messages addressed to respective ones of the nodes. The object-oriented simulation is performed at each one of the nodes by assigning passive self-contained simulation objects to each one of the nodes, responding to messages received at one node by generating corresponding active event objects having user-defined inherent capabilities and individual time stamps and corresponding to respective events affecting one of the passive self-contained simulation objects of the one node, restricting the respective passive self-contained simulation objects to only providing and receiving information from the respective active event objects, requesting information and changing variables within a passive self-contained simulation object by the active event object, and producing corresponding messages specifying events resulting therefrom by the active event objects.
Writing parallel, discrete-event simulations in ModSim: Insight and experience
Rich, D.O.; Michelsen, R.E.
1989-09-11
The Time Warp Operating System (TWOS) has been the focus of much research in parallel simulation. A new language, called ModSim, has been developed for use in conjunction with TWOS. The coupling of ModSim and TWOS provides a tool to construct large, complex simulation models that will run on several parallel and distributed computer systems. As part of the "Griffin Project" underway here at Los Alamos National Laboratory, there is strong interest in assessing the coupling of ModSim and TWOS from an application-oriented perspective. To this end, a key component of the Eagle combat simulation has been implemented in ModSim for execution on TWOS. In this paper brief overviews of ModSim and TWOS are presented. Finally, the compatibility of the computational models presented by the language and the operating system is examined in light of experience gained to date. 18 refs., 4 figs.
A discrete event method for wave simulation
Nutaro, James J [ORNL
2006-01-01
This article describes a discrete event interpretation of the finite difference time domain (FDTD) and digital wave guide network (DWN) wave simulation schemes. The discrete event method is formalized using the discrete event system specification (DEVS). The scheme is shown to have errors that are proportional to the resolution of the spatial grid. A numerical example demonstrates the relative efficiency of the scheme with respect to FDTD and DWN schemes. The potential for the discrete event scheme to reduce numerical dispersion and attenuation errors is discussed.
Miguel-Alonso, José
...alternatives. A set of experiments is carried out to understand how model parameters influence simulator performance; the threshold is influenced by some parameters of the simulated model (size of the network, load level) ... an inefficient implementation of the algorithm. After this preliminary evaluation, performed with a toy model, ...
Distributed discrete event simulation. Final report
De Vries, R.C. [Univ. of New Mexico, Albuquerque, NM (United States). EECE Dept.
1988-02-01
The presentation given here is restricted to discrete event simulation. The complexity of and time required for many present and potential discrete simulations exceeds the reasonable capacity of most present serial computers. The desire, then, is to implement the simulations on a parallel machine. However, certain problems arise in an effort to program the simulation on a parallel machine. In one category of methods, deadlock can arise, and some method is required either to detect deadlock and recover from it or to avoid deadlock through information passing. In the second category of methods, potentially incorrect simulations are allowed to proceed. If the situation is later determined to be incorrect, recovery from the error must be initiated. In either case, computation and information passing are required which would not be required in a serial implementation. The net effect is that the parallel simulation may not be much better than a serial simulation. In an effort to determine alternate approaches, important papers in the area were reviewed. As part of that review process, each of the papers was summarized. The summary of each paper is presented in this report in the hope that those doing future work in the area will be able to gain insight that might not otherwise be available, and to aid in deciding which papers would be most beneficial to pursue in more detail. The papers are broken down into categories and then by author. Conclusions reached after examining the papers and other material, such as direct talks with an author, are presented in the last section. Also presented there are some ideas that surfaced late in the research effort. These promise to be of some benefit in limiting information which must be passed between processes and in better understanding the structure of a distributed simulation. Pursuit of these ideas seems appropriate.
Discrete event simulation of continuous systems
Nutaro, James J [ORNL
2007-01-01
Computer simulation of a system described by differential equations requires that some element of the system be approximated by discrete quantities. There are two system aspects that can be made discrete; time and state. When time is discrete, the differential equation is approximated by a difference equation (i.e., a discrete time system), and the solution is calculated at fixed points in time. When the state is discrete, the differential equation is approximated by a discrete event system. Events correspond to jumps through the discrete state space of the approximation.
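The state-quantization idea can be made concrete on the test equation dx/dt = -x: instead of stepping time by a fixed amount, each event advances the state by one quantum and the time of the next event is derived from the current derivative. This is a minimal first-order sketch of the general technique, with names of our choosing.

```python
def quantized_decay(x0, quantum, t_end):
    """First-order quantized-state integration of dx/dt = -x.

    Each event moves x by exactly one quantum toward its trend; the
    inter-event time is the time needed to traverse that quantum at the
    current rate, so the solver takes large steps where x changes slowly.
    """
    t, x = 0.0, x0
    trajectory = [(t, x)]
    while t < t_end and abs(x) > quantum:
        rate = -x                       # dx/dt at the current state
        dt = quantum / abs(rate)        # time to traverse one quantum
        t += dt
        x += quantum if rate > 0 else -quantum
        trajectory.append((t, x))
    return trajectory
```

Note how the event times stretch out as x decays: the jumps through the discrete state space are uniform in state, not in time.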
Regenerative Steady-State Simulation of Discrete-Event Systems
Henderson, Shane
Regenerative Steady-State Simulation of Discrete-Event Systems Shane G. Henderson University of Michigan, and Peter W. Glynn Stanford University The regenerative method possesses certain asymptotic. Therefore, applying the regenerative method to steady-state discrete-event system simulations is of great
A Parallel Discrete Event IP Network Emulator Russell Bradford y , Rob Simmonds and Brian Unger
Bradford, Russell
A Parallel Discrete Event IP Network Emulator Russell Bradford y , Rob Simmonds #3; and Brian Unger that can act as a realtime network emulator. Real Internet Protocol (IP) traffic generated by application, RealTime Simulation, In ternet Protocol (IP). 1 Introduction This paper describes the Internet
Discrete Event Simulation Modeling of Radiation Medicine Delivery Methods
Paul M. Lewis; Dennis I. Serig; Rick Archer
1998-12-31
The primary objective of this work was to evaluate the feasibility of using discrete event simulation (DES) modeling to estimate the effects on system performance of changes in the human, hardware, and software elements of radiation medicine delivery methods.
Input modeling: input modeling techniques for discrete-event simulations
Lawrence Leemis
2001-01-01
Most discrete-event simulation models have stochastic elements that mimic the probabilistic nature of the system under consideration. A close match between the input model and the true underlying probabilistic mechanism associated with the system is required for successful input modeling. The general question considered here is how to model an element (e.g., arrival process, service times) in a discrete-event simulation
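As a concrete instance of modeling one such element, fitting an exponential input model to observed inter-arrival times and then sampling from it by inverse-transform sampling might look like this. This is a textbook sketch, not code from the paper.

```python
import math
import random

def fit_exponential_rate(interarrivals):
    """Maximum-likelihood estimate of an exponential input model's rate:
    the reciprocal of the sample mean of the inter-arrival times."""
    return len(interarrivals) / sum(interarrivals)

def sample_interarrival(rate, rng):
    """Draw one inter-arrival time by inverse-transform sampling:
    if U ~ Uniform(0,1), then -ln(1-U)/rate ~ Exponential(rate)."""
    return -math.log(1.0 - rng.random()) / rate
```

In practice the fitted model should also be checked for goodness of fit before it drives the arrival process of a simulation.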
Influence Diagrams in Analysis of Discrete Event Simulation Data
Jirka Poropudas; Kai Matti Virtanen
2009-01-01
In this paper, influence diagrams (IDs) are used as simulation metamodels to aid simulation based decision making. A decision problem under consideration is studied using discrete event simulation with decision alternatives as simulation parameters. The simulation data are used to construct an ID that presents the changes in simulation state with chance nodes. The decision alternatives and objectives of the
Applications of discrete event simulation modeling to military problems
Raymond R. Hill; J. O. Miller; Gregory A. McIntyre
2001-01-01
The military is a big user of discrete event simulation models. The use of these models ranges from training and wargaming to their constructive use in important military analyses. In this paper we discuss the uses of military simulation and the issues associated with it, including categorizations of various types of military simulation. We then discuss three particular simulation studies
Enhancing ERP system's functionality with discrete event simulation
Young B. Moon; Dinar Phatak
2005-01-01
Purpose – To develop a methodology to augment enterprise resource planning (ERP) systems with the discrete event simulation's inherent ability to handle uncertainties. Design/methodology/approach – The ERP system still contains and uses the material requirements planning (MRP) logic as its central planning function. As a result, the ERP system inherits a number of shortcomings associated with the MRP system,
Optimization of Operations Resources via Discrete Event Simulation Modeling
NASA Technical Reports Server (NTRS)
Joshi, B.; Morris, D.; White, N.; Unal, R.
1996-01-01
The resource levels required for operation and support of reusable launch vehicles are typically defined through discrete event simulation modeling. Minimizing these resources constitutes an optimization problem involving discrete variables and simulation. Conventional approaches to solve such optimization problems involving integer valued decision variables are the pattern search and statistical methods. However, in a simulation environment that is characterized by search spaces of unknown topology and stochastic measures, these optimization approaches often prove inadequate. In this paper, we have explored the applicability of genetic algorithms to the simulation domain. Genetic algorithms provide a robust search strategy that does not require continuity and differentiability of the problem domain. The genetic algorithm successfully minimized the operation and support activities for a space vehicle, through a discrete event simulation model. The practical issues associated with simulation optimization, such as stochastic variables and constraints, were also taken into consideration.
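A stripped-down integer-coded genetic algorithm of the kind described, with truncation selection, uniform crossover, point mutation, and elitism, could be sketched as follows. All operator and parameter choices are illustrative, and the deterministic cost function in the usage below stands in for the stochastic simulation response the paper optimizes.

```python
import random

def genetic_search(cost, bounds, pop_size=20, generations=30, seed=1):
    """Minimize cost over integer vectors, where bounds[i] = (lo, hi)
    gives the inclusive range of decision variable i. No continuity or
    differentiability of cost is assumed."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.randint(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=cost)          # best (lowest cost) first
        nxt = [row[:] for row in scored[:2]]    # elitism: keep the two best
        while len(nxt) < pop_size:
            a, b = rng.sample(scored[:pop_size // 2], 2)  # truncation selection
            child = [a[i] if rng.random() < 0.5 else b[i] for i in range(dim)]
            if rng.random() < 0.2:              # occasional point mutation
                i = rng.randrange(dim)
                child[i] = rng.randint(*bounds[i])
            nxt.append(child)
        pop = nxt
    return min(pop, key=cost)
```

For a stochastic objective, cost would be estimated by replicated simulation runs, which is where the practical issues the paper raises (noise, constraints) enter.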
Synchronization of autonomous objects in discrete event simulation
NASA Technical Reports Server (NTRS)
Rogers, Ralph V.
1990-01-01
Autonomous objects in event-driven discrete event simulation offer the potential to combine the freedom of unrestricted movement and positional accuracy through Euclidean space of time-driven models with the computational efficiency of event-driven simulation. The principal challenge to autonomous object implementation is object synchronization. The concept of a spatial blackboard is offered as a potential methodology for synchronization. The issues facing implementation of a spatial blackboard are outlined and discussed.
Reversible Parallel Discrete Event Formulation of a TLM-based Radio Signal Propagation Model
Seal, Sudip K [ORNL; Perumalla, Kalyan S [ORNL
2011-01-01
Radio signal strength estimation is essential in many applications, including the design of military radio communications and industrial wireless installations. For scenarios with large or richly-featured geographical volumes, parallel processing is required to meet the memory and computation time demands. Here, we present a scalable and efficient parallel execution of the sequential model for radio signal propagation recently developed by Nutaro et al. Starting with that model, we (a) provide a vector-based reformulation that has significantly lower computational overhead for event handling, (b) develop a parallel decomposition approach that is amenable to reversibility with minimal computational overheads, (c) present a framework for transparently mapping the conservative time-stepped model into an optimistic parallel discrete event execution, (d) present a new reversible method, along with its analysis and implementation, for inverting the vector-based event model to be executed in an optimistic parallel style of execution, and (e) present performance results from implementation on Cray XT platforms. We demonstrate scalability, with the largest runs tested on up to 127,500 cores of a Cray XT5, enabling simulation of larger scenarios and with faster execution than reported before on the radio propagation model. This also represents the first successful demonstration of the ability to efficiently map a conservative time-stepped model to an optimistic discrete-event execution.
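The reverse-computation idea underlying item (d) can be illustrated on a trivially invertible vector event: if an event's state update can be undone arithmetically, optimistic rollback needs no saved state copy. This is a schematic of the general technique, not the authors' event model.

```python
def apply_event(state, index, delta):
    """Forward event: pure update of one component of the state vector."""
    return state[:index] + [state[index] + delta] + state[index + 1:]

def reverse_event(state, index, delta):
    """Exact inverse of apply_event, invoked on rollback in optimistic
    execution; no state saving is needed because the update is
    arithmetically invertible."""
    return apply_event(state, index, -delta)
```

Avoiding per-event state copies is what keeps the memory overhead of optimistic execution low enough to scale to the core counts reported above.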
Advances in Discrete-Event Simulation for MSL Command Validation
NASA Technical Reports Server (NTRS)
Patrikalakis, Alexander; O'Reilly, Taifun
2013-01-01
In the last five years, the discrete event simulator, SEQuence GENerator (SEQGEN), developed at the Jet Propulsion Laboratory to plan deep-space missions, has greatly increased uplink operations capacity to deal with increasingly complicated missions. In this paper, we describe how the Mars Science Laboratory (MSL) project makes full use of an interpreted environment to simulate change in more than fifty thousand flight software parameters and conditional command sequences to predict the result of executing a conditional branch in a command sequence, and enable the ability to warn users whenever one or more simulated spacecraft states change in an unexpected manner. Using these new SEQGEN features, operators plan more activities in one sol than ever before.
A survey of recent advances in discrete input parameter discrete-event simulation optimization
JAMES R. SWISHER; PAUL D. HYDEN; SHELDON H. JACOBSON; LEE W. SCHRUBEN
2004-01-01
Discrete-event simulation optimization is a problem of significant interest to practitioners interested in extracting useful information about an actual (or yet to be designed) system that can be modeled using discrete-event simulation. This paper presents a survey of the literature on discrete-event simulation optimization published in recent years (1988 to the present), with a particular focus on discrete input parameter
Metrics for Availability Analysis Using a Discrete Event Simulation Method
Schryver, Jack C. [ORNL]; Nutaro, James J. [ORNL]; Haire, Marvin Jonathan [ORNL]
2012-01-01
The system performance metric 'availability' is a central concept with respect to the concerns of a plant's operators and owners, yet it can be abstract enough to resist explanation at system levels. Hence, there is a need for a system-level metric more closely aligned with a plant's (or, more generally, a system's) raison d'être. Historically, availability of repairable systems - intrinsic, operational, or otherwise - has been defined as a ratio of times. This paper introduces a new concept of availability, called endogenous availability, defined in terms of a ratio of quantities of product yield. Endogenous availability can be evaluated using a discrete event simulation analysis methodology. A simulation example shows that endogenous availability reduces to conventional availability in a simple series system with different processing rates and without intermediate storage capacity, but diverges from conventional availability when storage capacity is progressively increased. It is shown that conventional availability tends to be conservative when a design includes features, such as in-process storage, that partially decouple the components of a larger system.
Enhancing Complex System Performance Using Discrete-Event Simulation
Allgood, Glenn O. [ORNL]; Olama, Mohammed M. [ORNL]; Lake, Joe E. [ORNL]
2010-01-01
In this paper, we utilize discrete-event simulation (DES) merged with human factors analysis to provide the venue within which the separation and deconfliction of the system/human operating principles can occur. A concrete example is presented to illustrate the performance enhancement gains for an aviation cargo flow and security inspection system achieved through the development and use of a process DES. The overall performance of the system is computed, analyzed, and optimized for the different system dynamics. Various performance measures are considered such as system capacity, residual capacity, and total number of pallets waiting for inspection in the queue. These metrics are performance indicators of the system's ability to service current needs and respond to additional requests. We studied and analyzed different scenarios by changing various model parameters such as the number of pieces per pallet ratio, number of inspectors and cargo handling personnel, number of forklifts, number and types of detection systems, inspection modality distribution, alarm rate, and cargo closeout time. The increased physical understanding resulting from execution of the queuing model utilizing these vetted performance measures identified effective ways to meet inspection requirements while maintaining or reducing overall operational cost and eliminating any shipping delays associated with any proposed changes in inspection requirements. With this understanding effective operational strategies can be developed to optimally use personnel while still maintaining plant efficiency, reducing process interruptions, and holding or reducing costs.
Simulation analysis: applications of discrete event simulation modeling to military problems
Raymond R. Hill; John O. Miller; Gregory A. McIntyre
2001-01-01
The military is a big user of discrete event simulation models. The use of these models ranges from training and wargaming to their constructive use in important military analyses. In this paper we discuss the uses of military simulation and the issues associated with military simulation, including categorizations of various types of military simulation. We then discuss three particular simulation studies
Parallel discrete event simulation with predictors
Gummadi, Vidya
1995-01-01
The motivation for this research has been its applicability in sequence checking in a spacecraft's control commands. Spacecraft are controlled by sequences of time-tagged control commands, which are essentially onboard computer programs. The...
Parallel Discrete Event Simulation of Lyme Disease
Varela, Carlos
Immature ticks (larvae and nymphs) usually feed on the white-footed mouse Peromyscus leucopus. However, immature ticks also may bite a variety of mammals and birds. Humans inadvertently bitten by an infectious nymph may ... The following spring these nymphs quest for a second blood meal. Those nymphs successfully
Parallel discrete event simulation with predictors
Gummadi, Vidya
1995-01-01
The sequence of control commands needs to be verified for correct execution before the commands are executed on the spacecraft to avoid the catastrophic effects of incorrect execution. The sequence checking employed (presently) is inherently sequential...
Discrete event simulation of continuous systems
Nutaro, James
...by a difference equation (i.e., a discrete time system), and the solution is calculated at fixed points in time. The approximating discrete event system is a function from a continuous time set to a discrete state set. (Figure 1 of the paper: time and state discretizations of a system.)
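The quantized-state idea above, representing a continuous trajectory by quantum-crossing events rather than time steps, can be sketched as a simplified, hypothetical first-order scheme (`qss1` is an illustrative helper, not Nutaro's full method):

```python
def qss1(f, x0, quantum, t_end):
    """Simplified first-order quantized-state integration of dx/dt = f(x):
    instead of stepping time, emit an event whenever the state crosses
    the next quantum boundary (hysteresis and higher orders omitted)."""
    t, x = 0.0, x0
    events = [(t, x)]
    while t < t_end:
        dx = f(x)
        if abs(dx) < 1e-12:     # derivative ~0: no further crossings
            break
        t += quantum / abs(dx)  # time until the next quantum crossing
        x += quantum if dx > 0 else -quantum
        events.append((t, x))
    return events

# dx/dt = -x, x(0) = 1: the state steps down one quantum per event,
# with the inter-event spacing growing as the trajectory flattens
events = qss1(lambda x: -x, 1.0, 0.1, t_end=5.0)
```

Note how event times are irregular and state values are discrete: exactly the "function from a continuous time set to a discrete state set" described in the abstract.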
Discrete event simulation and production system design for Rockwell hardness test blocks
Scheinman, David Eliot
2009-01-01
The research focuses on increasing production volume and decreasing costs at a hardness test block manufacturer. A discrete event simulation model is created to investigate potential system wide improvements. Using the ...
Discrete-event simulation of fluid stochastic Petri nets
Ciardo, Gianfranco
Gianfranco Ciardo (ciardo@cs.wm.edu), David Nicol (nicol@cs.dartmouth.edu), Kishor S. Trivedi (kst@egr.duke.edu)
A methodology for unit testing actors in proprietary discrete event based simulations
Mark E. Coyne; Scott R. Graham; Kenneth M. Hopkinson; Stuart H. Kurkowski
2008-01-01
This paper presents a dependency-injection-based unit testing methodology for testing components, or actors, involved in discrete event based computer network simulation via an xUnit testing framework. The fundamental purpose of discrete event based computer network simulation is verification of networking protocols used in physical, not simulated, networks. Thus, use of rigorous unit testing and test
Lloyd G. Connelly; Aaron E. Bair
2004-01-01
Objectives: This article explores the potential of discrete event simulation (DES) methods to advance system-level investigation of emergency department (ED) operations. To this end, the authors describe the development and operation of Emergency Department SIMulation (EDSIM), a new platform for computer simulation of ED activity at a Level 1 trauma center. The authors also demonstrate one potential application of EDSIM
Integrating Discrete Event and Process-Level Simulation for Training in the I-X Framework
Wickler, G; Tate, Austin; Potter, S
The aim of this paper is to describe I-Sim, a simulation tool that is a fully integrated part of the underlying agent framework, I-X. I-Sim controls a discrete event simulator, based on the same activity model that is ...
A Comparison of Discrete Event Simulation and System Dynamics for Modelling Healthcare Systems
...and its use of graphical interfaces to facilitate communication with, and comprehension by, health care ... There are identifiable features of certain systems that make one methodology superior to the other. We illustrate the use
Angel A. Juan; Arai Monteforte; Albert Ferrer; Carles Serrat; Javier Faulin
2009-01-01
This paper discusses the convenience of predicting, quantitatively, time-dependent reliability and availability levels associated with most building or civil engineering structures. Then, the paper reviews different approaches to these problems and proposes the use of discrete-event simulation as the most realistic way to deal with them, especially during the design stage. The paper also reviews previous work on the
Dessert, an Open-Source .NET Framework for Process-Based Discrete-Event Simulation
Robbiano, Lorenzo
This paper presents Dessert, a process-based DES framework for .NET, explaining the rationale behind its design and discussing how it brings process-based modelling into the strongly-typed .NET environment. Both frameworks build domain-specific languages...
Performance estimation of distributed real-time embedded systems by discrete event simulations
Gabor Madl; Nikil Dutt; Sherif Abdelwahed
2007-01-01
Key challenges in the performance estimation of distributed real-time embedded (DRE) systems include the systematic measurement of coverage by simulations, and the automated generation of directed test vectors. This paper investigates how DRE systems can be represented as discrete event systems (DES) in continuous time, and proposes an automated method for the performance evaluation of such systems. The proposed
Discrete-event simulation on the World Wide Web using Java
Arnold H. Buss; Kirk A. Stork
1996-01-01
This paper introduces Simkit, a small set of Java classes for creating discrete event simulation models. Simkit may be used to either implement stand-alone models or Web page applets. Exploiting network capabilities of Java, the lingua franca of the World Wide Web (WWW), Simkit models can easily be implemented as applets and executed in a Web browser. Java's graphical capabilities
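The event-scheduling pattern that libraries like Simkit package for modelers can be sketched in a few lines. This is a minimal, hypothetical kernel (in Python rather than Simkit's Java) showing the clock-plus-priority-queue core that such toolkits wrap:

```python
import heapq

class Simulator:
    """Minimal event-scheduling DES kernel: a simulation clock plus a
    time-ordered pending-event list (the pattern DES libraries wrap)."""
    def __init__(self):
        self.now = 0.0
        self._queue = []   # heap of (time, seq, handler, args)
        self._seq = 0      # tie-breaker for simultaneous events

    def schedule(self, delay, handler, *args):
        heapq.heappush(self._queue, (self.now + delay, self._seq, handler, args))
        self._seq += 1

    def run(self):
        while self._queue:
            self.now, _, handler, args = heapq.heappop(self._queue)
            handler(*args)

sim = Simulator()
log = []
def arrive(i):
    log.append((sim.now, i))   # record the event
    if i < 3:
        sim.schedule(1.0, arrive, i + 1)   # schedule the next arrival

sim.schedule(0.0, arrive, 1)
sim.run()
# log now holds arrivals at t = 0.0, 1.0, 2.0
```

The sequence counter guarantees a deterministic order for events scheduled at the same simulated time, a detail production kernels also have to settle.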
Koala: A Discrete-Event Simulation Model of Infrastructure Clouds
Koala is a discrete-event simulator that can model infrastructure-as-a-service (IaaS) clouds of up to O(10^5) nodes. The model, written in SLX, facilitates investigation of global behavior throughout a single IaaS cloud.
Discrete-Event Simulation of Fluid Stochastic Petri Nets
Gianfranco Ciardo; David M. Nicol; Kishor S. Trivedi
1999-01-01
The purpose of this paper is to describe a method for the simulation of the recently introduced fluid stochastic Petri nets. Since such nets result in a rather complex system of partial differential equations, numerical solution becomes a formidable task. Because of a mixed (discrete and continuous) state space, simulative solution also poses some interesting challenges, which
Discrete-event simulation of fluid stochastic Petri nets
Gianfranco Ciardo; D. Nicol; Kishor S. Trivedi
1997-01-01
The purpose of the paper is to describe a method for the simulation of recently introduced fluid stochastic Petri nets. Since such nets result in a rather complex set of partial differential equations, numerical solution becomes a formidable task. Because of a mixed, discrete and continuous state space, simulative solution also poses some interesting challenges, which are addressed in the
A graphical, intelligent interface for discrete-event simulations
Michelsen, C.; Dreicer, J.; Morgeson, D.
1988-01-01
This paper presents a prototype of an engagement analysis simulation tool. This simulation environment assists a user (analyst) in performing sensitivity analysis via the repeated execution of user-specified engagement scenarios. This analysis tool provides an intelligent front-end which is easy to use and modify. The intelligent front-end provides capabilities to assist the user in the selection of appropriate scenario values. The incorporated graphics capabilities also provide additional insight into the simulation events as they are "unfolding." 4 refs., 4 figs.
DISCRETE EVENT SIMULATION OF OPTICAL SWITCH MATRIX PERFORMANCE IN COMPUTER NETWORKS
Imam, Neena [ORNL]; Poole, Stephen W. [ORNL]
2013-01-01
In this paper, we present application of a Discrete Event Simulator (DES) for performance modeling of optical switching devices in computer networks. Network simulators are valuable tools in situations where one cannot investigate the system directly. This situation may arise if the system under study does not exist yet or the cost of studying the system directly is prohibitive. Most available network simulators are based on the paradigm of discrete-event-based simulation. As computer networks become increasingly larger and more complex, sophisticated DES tool chains have become available for both commercial and academic research. Some well-known simulators are NS2, NS3, OPNET, and OMNEST. For this research, we have applied OMNEST for the purpose of simulating multi-wavelength performance of optical switch matrices in computer interconnection networks. Our results suggest that the application of DES to computer interconnection networks provides valuable insight in device performance and aids in topology and system optimization.
A Framework for the Optimization of Discrete-Event Simulation Models
NASA Technical Reports Server (NTRS)
Joshi, B. D.; Unal, R.; White, N. H.; Morris, W. D.
1996-01-01
With the growing use of computer modeling and simulation in all aspects of engineering, the scope of traditional optimization has to be extended to include simulation models. Some unique aspects have to be addressed while optimizing via stochastic simulation models. The optimization procedure has to explicitly account for the randomness inherent in the stochastic measures predicted by the model. This paper outlines a general purpose framework for optimization of terminating discrete-event simulation models. The methodology combines a chance constraint approach for problem formulation, together with standard statistical estimation and analysis techniques. The applicability of the optimization framework is illustrated by minimizing the operation and support resources of a launch vehicle, through a simulation model.
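The replication-based statistical machinery such a framework rests on can be sketched briefly. This assumes a hypothetical `simulate` callable and uses a simplified upper-confidence-bound test as a stand-in for the paper's chance-constraint formulation:

```python
import random
import statistics

def chance_constraint_ok(simulate, threshold, reps=200):
    """Run independent replications of a terminating simulation and test,
    conservatively, whether the mean response stays below `threshold`.
    Simplified stand-in for a chance-constraint feasibility check."""
    samples = [simulate() for _ in range(reps)]
    mean = statistics.mean(samples)
    half = 1.96 * statistics.stdev(samples) / reps ** 0.5  # ~95% CI half-width
    return mean + half <= threshold   # feasible if the upper CI bound is below

random.seed(42)
# Hypothetical terminating model: one resource-usage draw, Exponential(mean 2.0)
usage_model = lambda: random.expovariate(1 / 2.0)
feasible = chance_constraint_ok(usage_model, threshold=3.0)
```

Because each replication is an independent random draw, the confidence interval, not the point estimate, is what the optimizer should compare against the constraint.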
Discrete-event simulation for the design and evaluation of physical protection systems
Jordan, S.E.; Snell, M.K.; Madsen, M.M. [Sandia National Labs., Albuquerque, NM (United States); Smith, J.S.; Peters, B.A. [Texas A and M Univ., College Station, TX (United States). Industrial Engineering Dept.
1998-08-01
This paper explores the use of discrete-event simulation for the design and control of physical protection systems for fixed-site facilities housing items of significant value. It begins by discussing several modeling and simulation activities currently performed in designing and analyzing these protection systems and then discusses capabilities that design/analysis tools should have. The remainder of the article then discusses in detail how some of these new capabilities have been implemented in software to achieve a prototype design and analysis tool. The simulation software technology provides a communications mechanism between a running simulation and one or more external programs. In the prototype security analysis tool, these capabilities are used to facilitate human-in-the-loop interaction and to support a real-time connection to a virtual reality (VR) model of the facility being analyzed. This simulation tool can be used for both training (in real-time mode) and facility analysis and design (in fast mode).
APEX - a Petri net process modeling tool built on a discrete-event simulation system
Gish, J.W. [GTE Labs. Inc., Waltham, MA (United States)
1996-12-31
APEX, the Animated Process Experimentation tool, provides a capability for defining, simulating and animating process models. Primarily constructed for the modeling and analysis of software process models, we have found that APEX is much more broadly applicable and is suitable for process modeling tasks outside the domain of software processes. APEX has been constructed as a library of simulation blocks that implement timed hierarchical colored Petri Nets. These Petri Net blocks operate in conjunction with EXTEND, a general purpose continuous and discrete-event simulation tool. EXTEND provides a flexible, powerful and extensible environment with features particularly suitable for the modeling of complex processes. APEX's Petri Net block additions to EXTEND provide an inexpensive capability with well-defined and easily understood semantics that is a powerful, easy to use, flexible means to engage in process modeling and evaluation. The vast majority of software process research has focused on the enactment of software processes. Little has been said about the actual creation and evaluation of software process models necessary to support enactment. APEX has been built by the Software Engineering Process Technology Project at GTE Laboratories which has been focusing on this neglected area of process model definition and analysis. We have constructed high-level software lifecycle models, a set of models that demonstrate differences between four levels of the SEI Capability Maturity Model (CMM), customer care process models, as well as models involving more traditional synchronization and coordination problems such as producer-consumer and 2-phase commit. APEX offers a unique blend of technology from two different disciplines: discrete-event simulation and Petri Net modeling. Petri Nets provide a well-defined and rich semantics in a simple, easy to understand notation. The simulation framework allows for execution, animation, and measurement of the resultant models.
The effects of indoor environmental exposures on pediatric asthma: a discrete event simulation model
2012-01-01
Background: In the United States, asthma is the most common chronic disease of childhood across all socioeconomic classes and is the most frequent cause of hospitalization among children. Asthma exacerbations have been associated with exposure to residential indoor environmental stressors such as allergens and air pollutants as well as numerous additional factors. Simulation modeling is a valuable tool that can be used to evaluate interventions for complex multifactorial diseases such as asthma but in spite of its flexibility and applicability, modeling applications in either environmental exposures or asthma have been limited to date. Methods: We designed a discrete event simulation model to study the effect of environmental factors on asthma exacerbations in school-age children living in low-income multi-family housing. Model outcomes include asthma symptoms, medication use, hospitalizations, and emergency room visits. Environmental factors were linked to percent predicted forced expiratory volume in 1 second (FEV1%), which in turn was linked to risk equations for each outcome. Exposures affecting FEV1% included indoor and outdoor sources of NO2 and PM2.5, cockroach allergen, and dampness as a proxy for mold. Results: Model design parameters and equations are described in detail. We evaluated the model by simulating 50,000 children over 10 years and showed that pollutant concentrations and health outcome rates are comparable to values reported in the literature. In an application example, we simulated what would happen if the kitchen and bathroom exhaust fans were improved for the entire cohort, and showed reductions in pollutant concentrations and healthcare utilization rates. Conclusions: We describe the design and evaluation of a discrete event simulation model of pediatric asthma for children living in low-income multi-family housing.
Our model simulates the effect of environmental factors (combustion pollutants and allergens), medication compliance, seasonality, and medical history on asthma outcomes (symptom-days, medication use, hospitalizations, and emergency room visits). The model can be used to evaluate building interventions and green building construction practices on pollutant concentrations, energy savings, and asthma healthcare utilization costs, and demonstrates the value of a simulation approach for studying complex diseases such as asthma. PMID:22989068
Cloning Parallel Simulations
Hybinette, Maria
...management, among others. The goal of this research is to provide a simulation-based decision aid for managers. The decision aid includes three components: (1) a situation database built from live data feeds ... a shared-memory multiprocessor. A running parallel discrete event simulation is dynamically cloned at decision points...
Discrete event simulation tool for analysis of qualitative models of continuous processing systems
NASA Technical Reports Server (NTRS)
Malin, Jane T. (inventor); Basham, Bryan D. (inventor); Harris, Richard A. (inventor)
1990-01-01
An artificial intelligence design and qualitative modeling tool is disclosed for creating computer models and simulating continuous activities, functions, and/or behavior using developed discrete event techniques. Conveniently, the tool is organized in four modules: library design module, model construction module, simulation module, and experimentation and analysis. The library design module supports the building of library knowledge including component classes and elements pertinent to a particular domain of continuous activities, functions, and behavior being modeled. The continuous behavior is defined discretely with respect to invocation statements, effect statements, and time delays. The functionality of the components is defined in terms of variable cluster instances, independent processes, and modes, further defined in terms of mode transition processes and mode dependent processes. Model construction utilizes the hierarchy of libraries and connects them with appropriate relations. The simulation executes a specialized initialization routine and executes events in a manner that includes selective inherency of characteristics through a time and event schema until the event queue in the simulator is emptied. The experimentation and analysis module supports analysis through the generation of appropriate log files and graphics developments and includes the ability of log file comparisons.
Capacity Planning for Maternal-Fetal Medicine Using Discrete Event Simulation.
Ferraro, Nicole M; Reamer, Courtney B; Reynolds, Thomas A; Howell, Lori J; Moldenhauer, Julie S; Day, Theodore Eugene
2015-07-01
Background: Maternal-fetal medicine is a rapidly growing field requiring collaboration from many subspecialties. We provide an evidence-based estimate of capacity needs for our clinic, as well as demonstrate how simulation can aid in capacity planning in similar environments. Methods: A Discrete Event Simulation of the Center for Fetal Diagnosis and Treatment and Special Delivery Unit at The Children's Hospital of Philadelphia was designed and validated. This model was then used to determine the time until demand overwhelms inpatient bed availability under increasing capacity. Findings: No significant deviation was found between historical inpatient censuses and simulated censuses for the validation phase (p = 0.889). Prospectively increasing capacity was found to delay time to balk (the inability of the center to provide bed space for a patient in need of admission). With current capacity, the model predicts a mean time to balk of 276 days. Adding three beds delays mean time to first balk to 762 days; an additional six beds, to 1,335 days. Conclusion: Providing sufficient access is a patient safety issue, and good planning is crucial for targeting infrastructure investments appropriately. Computer-simulated analysis can provide an evidence base for both medical and administrative decision making in a complex clinical environment. PMID:25519198
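A toy version of the "time to first balk" computation can be sketched with a simple capacity model. All parameters and names here are illustrative, not those of the validated hospital model:

```python
import heapq
import random

def time_to_first_balk(beds, arrival_rate, mean_stay, seed=1):
    """Toy census model: Poisson admissions, exponential lengths of stay,
    fixed bed capacity. Returns the time of the first balked admission."""
    rng = random.Random(seed)
    t, occupied, discharges = 0.0, 0, []   # discharges: heap of times
    while True:
        t += rng.expovariate(arrival_rate)          # next admission request
        while discharges and discharges[0] <= t:    # free beds vacated by now
            heapq.heappop(discharges)
            occupied -= 1
        if occupied == beds:
            return t                                # no bed available: balk
        occupied += 1
        heapq.heappush(discharges, t + rng.expovariate(1.0 / mean_stay))

# Same seed, more beds: the first balk can only come later
t5 = time_to_first_balk(beds=5, arrival_rate=1.0, mean_stay=4.0)
t8 = time_to_first_balk(beds=8, arrival_rate=1.0, mean_stay=4.0)
```

With a common random-number seed the two runs coincide until the smaller unit fills, so adding beds strictly delays the first balk, mirroring the qualitative finding above.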
Haitham M. Al-Deek; Ayman A. Mohamed; Linda Malone
2005-01-01
This article presents a discrete-event stochastic microscopic simulation model specifically developed to evaluate the operational performance of toll plazas. The model has been calibrated, validated, and applied to toll plazas equipped with Electronic Toll Collection (ETC) in Orlando, Florida. Traffic behavior is represented using a set of mathematical and logic algorithms that control the conflicts among vehicles within the toll
Discrete-event simulation of a wide-area health care network.
McDaniel, J G
1995-01-01
OBJECTIVE: Predict the behavior and estimate the telecommunication cost of a wide-area message store-and-forward network for health care providers that uses the telephone system. DESIGN: A tool with which to perform large-scale discrete-event simulations was developed. Network models for star and mesh topologies were constructed to analyze the differences in performances and telecommunication costs. The distribution of nodes in the network models approximates the distribution of physicians, hospitals, medical labs, and insurers in the Province of Saskatchewan, Canada. Modeling parameters were based on measurements taken from a prototype telephone network and a survey conducted at two medical clinics. Simulation studies were conducted for both topologies. RESULTS: For either topology, the telecommunication cost of a network in Saskatchewan is projected to be less than $100 (Canadian) per month per node. The estimated telecommunication cost of the star topology is approximately half that of the mesh. Simulations predict that a mean end-to-end message delivery time of two hours or less is achievable at this cost. A doubling of the data volume results in an increase of less than 50% in the mean end-to-end message transfer time. CONCLUSION: The simulation models provided an estimate of network performance and telecommunication cost in a specific Canadian province. At the expected operating point, network performance appeared to be relatively insensitive to increases in data volume. Similar results might be anticipated in other rural states and provinces in North America where a telephone-based network is desired. PMID:7583646
Angel A. Juan; Albert Ferrer; Carles Serrat; Javier Faulin; Gleb Beliakov; Joshua Hester
This chapter discusses and illustrates some potential applications of discrete-event simulation (DES) techniques in structural reliability and availability analysis, emphasizing the convenience of using probabilistic approaches in modern building and civil engineering practices. After reviewing existing literature on the topic, some advantages of probabilistic techniques over analytical ones are highlighted. Then, we introduce a general framework for performing structural reliability
Modelling the treated course of schizophrenia: development of a discrete event simulation model.
Heeg, Bart; Buskens, Erik; Knapp, Martin; van Aalst, Gerda; Dries, Pieter J T; de Haan, Lieuwe; van Hout, Ben A
2005-01-01
In schizophrenia, modelling techniques may be needed to estimate the long-term costs and effects of new interventions. However, it seems that a simple direct link between symptoms and costs does not exist. Decisions about whether a patient will be hospitalized or admitted to a different healthcare setting are based not only on symptoms but also on social and environmental factors. This paper describes the development of a model to assess the dependencies between a broad range of parameters in the treatment of schizophrenia. In particular, the model attempts to incorporate social and environmental factors into the decision-making process for the prescription of new drugs to patients. The model was used to analyse the potential benefits of improving compliance with medication by 20% in patients in the UK. A discrete event simulation (DES) model was developed, to describe a cohort of schizophrenia patients with multiple psychotic episodes. The model takes into account the patient's sex, disease severity, potential risk of harm to self and society, and social and environmental factors. Other variables that change over time include the number of psychiatric consultations, the presence of psychotic episodes, symptoms, treatments, compliance, side-effects, the lack of ability to take care of him/herself, care setting and risk of harm. Outcomes are costs, psychotic episodes and symptoms. Univariate and multivariate sensitivity analyses were performed. Direct medical costs were considered (year of costing 2002), applying a 6.0% discount rate for costs and a 1.5% discount rate for outcome. The timeframe of the model is 5 years. When 50% of the decisions about the patient care setting are based on symptoms, a 20% increase in compliance was estimated to save 16,147 pounds and to avoid 0.55 psychotic episodes per patient over 5 years. Sensitivity analysis showed that the costs savings associated with increased compliance are robust over a range of variations in parameters. 
DES offers a flexible structure for modelling a disease, taking into account how a patient's history affects the course of the disease over time. This approach is particularly pertinent to schizophrenia, in which treatment decisions are complex. The model shows that better compliance increases the time between relapses, decreases the symptom score, and reduces the requirement for treatment in an intensive patient care setting, leading to cost savings. The extent of the cost savings depends on the relative importance of symptoms and of social and environmental factors in these decisions. PMID:16416759
Flexi-Cluster: A Simulator for a Single Compute Cluster
Flexi-Cluster is a flexible, discrete-event simulation model for a single compute cluster, such as might be deployed within a compute grid. The model simulates management and scheduling within a single compute cluster. The key innovation in the model is to permit users
Oregon, University of
turning to individual-based models [1][3]. Rather than summarizing average-case behavior with systems of differential equations, an individual-based model uses a representation of each individual plant or animal ... that represent cells. A straightforward approach to distributed simulation is to partition the physical territory
Parallel simulation of timed Petri-nets
David M. Nicol; Subhas C. Roy
1991-01-01
This paper considers the problem of using a parallel computer to execute discrete-event simulations of timed Petri-nets. We first develop synchronization and simulation algorithms for this task, and discuss a parallelized Petri-net simulator which has been implemented on an Intel iPSC/2 distributed memory multiprocessor. Next we describe a graphics-based frontend for the simulator, used to build timed Petri-net models. Finally,
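The sequential core of such a timed Petri-net simulator can be sketched in a few lines. This is a minimal, hypothetical single-server-semantics version, not the parallel iPSC/2 implementation:

```python
import heapq

def simulate_timed_pn(marking, transitions, t_end):
    """Tiny timed Petri-net simulator (single-server firing semantics):
    an enabled transition fires `delay` time units after it is scheduled,
    consuming one token per input place and producing one per output place.
    `transitions` maps name -> (input_places, output_places, delay)."""
    clock, pending = 0.0, []   # pending: heap of (fire_time, name)

    def enabled(name):
        return all(marking[p] > 0 for p in transitions[name][0])

    def schedule_enabled():
        for name in transitions:
            if enabled(name) and all(n != name for _, n in pending):
                heapq.heappush(pending, (clock + transitions[name][2], name))

    schedule_enabled()
    while pending and clock < t_end:
        clock, name = heapq.heappop(pending)
        if not enabled(name):   # tokens consumed by an earlier firing
            continue
        ins, outs, _ = transitions[name]
        for p in ins:
            marking[p] -= 1
        for p in outs:
            marking[p] += 1
        schedule_enabled()
    return marking

# Two tokens flow a -> b -> c through two unit-delay transitions
m = simulate_timed_pn({"a": 2, "b": 0, "c": 0},
                      {"t1": (["a"], ["b"], 1.0), "t2": (["b"], ["c"], 1.0)},
                      t_end=10.0)
```

The re-check of `enabled` after popping handles the conflict case where an earlier firing stole the tokens, which is exactly the kind of dependency a parallel version must synchronize on.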
On Parallel Stochastic Simulation of Diffusive Systems
Lorenzo Dematté; Tommaso Mazza
2008-01-01
The parallel simulation of biochemical reactions is a very interesting problem: biochemical systems are inherently parallel, yet the majority of the algorithms to simulate them, including the well-known and widespread Gillespie SSA, are strictly sequential. Here we investigate, in a general way, how to characterize the simulation of biochemical systems in terms of Discrete Event Simulation. We dissect their inherent
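The Gillespie direct method mentioned above maps naturally onto discrete-event terms: each reaction firing is an event, and the inter-event time is exponential with rate equal to the total propensity. A minimal sketch, with an illustrative single-reaction example:

```python
import random

def ssa(state, reactions, t_end, seed=0):
    """Gillespie's direct method viewed as a discrete-event simulation.
    `reactions` is a list of (propensity_fn, update_fn) pairs."""
    rng = random.Random(seed)
    t = 0.0
    while True:
        props = [p(state) for p, _ in reactions]
        total = sum(props)
        if total == 0.0:
            return t, state              # nothing can fire: stop
        t += rng.expovariate(total)      # exponential time to next event
        if t >= t_end:
            return t_end, state
        r, acc = rng.uniform(0.0, total), 0.0
        for p, (_, update) in zip(props, reactions):
            acc += p
            if r < acc:
                update(state)            # fire the chosen reaction
                break

# Pure decay A -> 0 at unit rate: all molecules are eventually consumed
state = {"A": 50}
decay = (lambda s: 1.0 * s["A"], lambda s: s.update(A=s["A"] - 1))
t_stop, final = ssa(state, [decay], t_end=1e9)
```

The strict sequentiality the abstract points out is visible here: each event's time and identity depend on the state left by the previous event, which is what makes parallelization nontrivial.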
NASA Technical Reports Server (NTRS)
Dubos, Gregory F.; Cornford, Steven
2012-01-01
While the ability to model the state of a space system over time is essential during spacecraft operations, the use of time-based simulations remains rare in preliminary design. The absence of the time dimension in most traditional early design tools can however become a hurdle when designing complex systems whose development and operations can be disrupted by various events, such as delays or failures. As the value delivered by a space system is highly affected by such events, exploring the trade space for designs that yield the maximum value calls for the explicit modeling of time. This paper discusses the use of discrete-event models to simulate spacecraft development schedule as well as operational scenarios and on-orbit resources in the presence of uncertainty. It illustrates how such simulations can be utilized to support trade studies, through the example of a tool developed for DARPA's F6 program to assist the design of "fractionated spacecraft".
Using machine learning techniques to interpret results from discrete event simulation
Mladenic, Dunja
The results of two simulators were processed as machine learning problems ... discovered. Key words: discrete event simulation, machine learning, artificial intelligence
Pan, Chong; Zhang, Dali; Kon, Audrey Wan Mei; Wai, Charity Sue Lea; Ang, Woo Boon
2015-06-01
Continuous improvement in process efficiency for specialist outpatient clinic (SOC) systems is increasingly being demanded due to the growth of the patient population in Singapore. In this paper, we propose a discrete event simulation (DES) model to represent the patient and information flow in an ophthalmic SOC system in the Singapore National Eye Centre (SNEC). Different improvement strategies to reduce the turnaround time for patients in the SOC were proposed and evaluated with the aid of the DES model and the Design of Experiment (DOE). Two strategies for better patient appointment scheduling and one strategy for dilation-free examination are estimated to have a significant impact on turnaround time for patients. One of the improvement strategies has been implemented in the actual SOC system in the SNEC with promising improvement reported. PMID:25012400
Parallelized direct execution simulation of message-passing parallel programs
NASA Technical Reports Server (NTRS)
Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.
1994-01-01
As massively parallel computers proliferate, there is growing interest in finding ways by which the performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing compilers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution where one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine, such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization, specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, the Large Application Parallel Simulation Environment (LAPSE), which we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well, typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.
TPASS: dynamic, discrete-event simulation and animation of a Toll Plaza
Robert T. Redding; Andrew J. Junga
1992-01-01
This paper describes the development of a software package that simulates and animates the operation of a toll plaza. The Toll Plaza Animation/Simulation System (TPASS) gives transportation authorities the ability to experiment with various toll plaza configurations and traffic characteristics in order to determine the resulting queuing, wait times, and toll revenue. TPASS was designed to consider all of the
...explained by climate change or human-caused degradation of the landscape. Finally, by placing model villages ... (...specific time frames) as the core of its contents. The simulation models for environmental changes ... simulation model is being developed to meet the project goals. The human and landscape models serve as both
Hessam S. Sarjoughian; Dongping Huang; Gary W. Godding; Karl G. Kempf; Wenlin Wang; Daniel E. Rivera; Hans D. Mittelmann
2005-01-01
Simulation modeling combined with decision control can offer important benefits for analysis, design, and operation of semiconductor supply-chain network systems. Detailed simulation of physical processes provides information for its controller to account for (expected) stochasticity present in the manufacturing processes. In turn, the controller can provide (near) optimal decisions for the operation of the processes and thus handle uncertainty in
Using split event sets to form and schedule event combinations in discrete event simulation
N. Manjikian; W. M. Loucks
1992-01-01
Examines the operational characteristics of event set implementations in the presence of a large number of scheduled events. The authors examine a technique to reduce the number of items (i.e., events) to be scheduled by combining all the events to be processed by the same part of the simulation (referred to as a logical process) at the same simulation time.
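The combination technique described in the abstract above can be sketched concretely. The sketch below is illustrative only (the class and method names are ours, not Manjikian and Loucks'): events destined for the same logical process at the same simulation time are merged into a single schedulable item, reducing the number of entries the event set must order.

```python
import heapq
from collections import defaultdict

class CombiningEventSet:
    """Schedules one combined entry per (time, logical process) key
    rather than one entry per event (illustrative sketch)."""
    def __init__(self):
        self.heap = []                   # (time, lp) keys in time order
        self.groups = defaultdict(list)  # (time, lp) -> list of events

    def schedule(self, time, lp, event):
        key = (time, lp)
        if key not in self.groups:       # first event for this key
            heapq.heappush(self.heap, key)
        self.groups[key].append(event)

    def next_combination(self):
        """Pop all events for the earliest (time, lp) pair at once."""
        if not self.heap:
            return None
        key = heapq.heappop(self.heap)
        return key, self.groups.pop(key)

es = CombiningEventSet()
es.schedule(5.0, "lp1", "arrive")
es.schedule(5.0, "lp1", "depart")
es.schedule(3.0, "lp2", "arrive")
print(es.next_combination())  # ((3.0, 'lp2'), ['arrive'])
print(es.next_combination())  # ((5.0, 'lp1'), ['arrive', 'depart'])
```

With many events per logical process per time step, the heap holds one key per combination instead of one per event, which is the reduction the paper targets.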
General Methodology for Metabolic Pathways Modeling and Simulation using Discrete Event
Boyer, Edmond
T. Antoine-Santoni, F. Bernardi, F. Giamarchi (University of Corsica). Example on the glycolysis of yeast: the paper applies a DEVS-based methodology to the modeling and simulation of a metabolic pathway, the glycolysis of the yeast. Index terms: DEVS, methodology, bioinformatics, glycolysis.
Forest biomass supply logistics for a power plant using the discrete-event simulation approach
Mobini, Mahdi [University of British Columbia, Vancouver]; Sowlati, T. [University of British Columbia, Vancouver]; Sokhansanj, Shahabaddine [ORNL]
2011-04-01
This study investigates the logistics of supplying forest biomass to a potential power plant. Due to the complexities in such a supply logistics system, a simulation model based on the framework of Integrated Biomass Supply Analysis and Logistics (IBSAL) is developed in this study to evaluate the cost of delivered forest biomass, the equilibrium moisture content, and carbon emissions from the logistics operations. The model is applied to a proposed case of a 300 MW power plant in Quesnel, BC, Canada. The results show that the biomass demand of the power plant would not be met every year. The weighted average cost of delivered biomass to the gate of the power plant is about C$90 per dry tonne. Estimates of the equilibrium moisture content of delivered biomass and the CO2 emissions resulting from the processes are also provided.
Distributed state reconstruction for discrete event systems
Eric Fabre; Albert Benveniste; Claude Jard; Laurie Ricker; Mark Smith
2000-01-01
We consider the state estimation problem for stochastic discrete event dynamic system (DEDS) obtained by the parallel composition of several subsystems. A distributed inference algorithm is developed in the case of distributed observations. It is composed of asynchronous agents that only have a local view of the model and of observations. This algorithm only handles local states of subsystems, which
Simulating Billion-Task Parallel Programs
Perumalla, Kalyan S. [ORNL]; Park, Alfred J. [ORNL]
2014-01-01
In simulating large parallel systems, bottom-up approaches exercise detailed hardware models with effects from simplified software models or traces, whereas top-down approaches evaluate the timing and functionality of detailed software models over coarse hardware models. Here, we focus on the top-down approach and significantly advance the scale of the simulated parallel programs. Via the direct execution technique combined with parallel discrete event simulation, we stretch the limits of the top-down approach by simulating message passing interface (MPI) programs with millions of tasks. Using a timing-validated benchmark application, a proof-of-concept scaling level is achieved to over 0.22 billion virtual MPI processes on 216,000 cores of a Cray XT5 supercomputer, representing one of the largest direct execution simulations to date, combined with a multiplexing ratio of 1024 simulated tasks per real task.
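The multiplexing of many simulated tasks onto one real task, as described above, can be illustrated with a small cooperative scheduler. This is a hypothetical sketch, not the actual direct-execution engine: each virtual task is a generator that yields the virtual-time cost of its next step, and the scheduler always resumes the task whose virtual clock is smallest.

```python
import heapq

def virtual_task(send_delay):
    """A simulated MPI task: three compute steps, each followed by a
    modeled message send; yields the virtual-time cost of each step.
    The delay values are hypothetical, for illustration only."""
    for _ in range(3):
        yield 1.0          # compute step costs 1.0 virtual time units
        yield send_delay   # modeled send latency

def run_multiplexed(tasks):
    """Multiplex many virtual tasks inside one real process, always
    resuming the task with the smallest virtual clock (sketch)."""
    clocks = [(0.0, i) for i in range(len(tasks))]
    heapq.heapify(clocks)
    finish = {}
    while clocks:
        now, i = heapq.heappop(clocks)
        try:
            cost = next(tasks[i])
            heapq.heappush(clocks, (now + cost, i))
        except StopIteration:
            finish[i] = now    # virtual completion time of task i
    return finish

# Four virtual tasks multiplexed onto this single real process:
print(run_multiplexed([virtual_task(0.5 * (i + 1)) for i in range(4)]))
# {0: 4.5, 1: 6.0, 2: 7.5, 3: 9.0}
```

At a multiplexing ratio of 1024, the same loop would carry 1024 such generators per real MPI rank; the scheduling principle is unchanged.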
On extending parallelism to serial simulators
NASA Technical Reports Server (NTRS)
Nicol, David; Heidelberger, Philip
1994-01-01
This paper describes an approach to discrete event simulation modeling that appears to be effective for developing portable and efficient parallel execution of models of large distributed systems and communication networks. In this approach, the modeler develops submodels using an existing sequential simulation modeling tool, using the full expressive power of the tool. A set of modeling language extensions permit automatically synchronized communication between submodels; however, the automation requires that any such communication must take a nonzero amount of simulation time. Within this modeling paradigm, a variety of conservative synchronization protocols can transparently support conservative execution of submodels on potentially different processors. A specific implementation of this approach, U.P.S. (Utilitarian Parallel Simulator), is described, along with performance results on the Intel Paragon.
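The requirement that inter-submodel communication take nonzero simulation time is what makes conservative synchronization possible: it gives each submodel a lookahead from which a safe processing horizon can be derived. A minimal sketch of that idea (function names are ours, not from U.P.S.):

```python
import heapq

def safe_time(channel_clocks, lookahead):
    """A submodel may safely process any event with timestamp below
    min over input channels of (channel clock + lookahead); the
    nonzero lookahead is what lets simulation time advance."""
    return min(t + lookahead for t in channel_clocks.values())

def process_safe(event_heap, channel_clocks, lookahead, handler):
    """Process every pending event strictly below the safe horizon."""
    horizon = safe_time(channel_clocks, lookahead)
    while event_heap and event_heap[0][0] < horizon:
        ts, ev = heapq.heappop(event_heap)
        handler(ts, ev)
    return horizon

# Latest timestamps seen on two input channels are 10.0 and 12.0;
# the minimum inter-submodel delay (lookahead) is 2.0 time units.
done = []
events = [(9.0, "a"), (11.5, "b"), (13.0, "c")]
heapq.heapify(events)
horizon = process_safe(events, {"chanA": 10.0, "chanB": 12.0}, 2.0,
                       lambda ts, ev: done.append(ev))
print(horizon, done)  # 12.0 ['a', 'b']
```

With zero lookahead the horizon would never exceed the slowest channel clock and no submodel could run ahead, which is why the extensions insist on nonzero delay.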
External Adjustment of Runtime Parameters in Time Warp Synchronized Parallel Simulators
Radharamanan Radhakrishnan; Lantz Moore; Philip A. Wilsey
1997-01-01
Several optimizations to the Time Warp synchronization protocol for parallel discrete event simulation have been proposed and studied. Many of these optimizations have included some form of dynamic adjustment (or control) of the operating parameters of the simulation (e.g., checkpoint interval, cancellation strategy). Traditionally, dynamic parameter adjustment has been performed at the simulation
Concurrency and discrete event control
NASA Technical Reports Server (NTRS)
Heymann, Michael
1990-01-01
Much of discrete event control theory has been developed within the framework of automata and formal languages. An alternative approach inspired by the theories of process-algebra as developed in the computer science literature is presented. The framework, which rests on a new formalism of concurrency, can adequately handle nondeterminism and can be used for analysis of a wide range of discrete event phenomena.
Guo, Shien; Getsios, Denis; Hernandez, Luis; Cho, Kelly; Lawler, Elizabeth; Altincatal, Arman; Lanes, Stephan; Blankenburg, Michael
2012-01-01
The growing understanding of the use of biomarkers in Alzheimer's disease (AD) may enable physicians to make more accurate and timely diagnoses. Florbetaben, a beta-amyloid tracer used with positron emission tomography (PET), is one of these diagnostic biomarkers. This analysis was undertaken to explore the potential value of florbetaben PET in the diagnosis of AD among patients with suspected dementia and to identify key data that are needed to further substantiate its value. A discrete event simulation was developed to conduct exploratory analyses from both US payer and societal perspectives. The model simulates the lifetime course of disease progression for individuals, evaluating the impact of their patient management from initial diagnostic work-up to final diagnosis. Model inputs were obtained from specific analyses of a large longitudinal dataset from the New England Veterans Healthcare System and supplemented with data from public data sources and assumptions. The analyses indicate that florbetaben PET has the potential to improve patient outcomes and reduce costs under certain scenarios. Key data on the use of florbetaben PET, such as its influence on time to confirmation of final diagnosis, treatment uptake, and treatment persistency, are unavailable and would be required to confirm its value. PMID:23326754
2011-01-01
Background Recent reforms in Portugal aimed at strengthening the role of the primary care system, in order to improve the quality of the health care system. Since 2006 new policies aiming to change the organization, incentive structures and funding of the primary health care sector were designed, promoting the evolution of traditional primary health care centres (PHCCs) into a new type of organizational unit - family health units (FHUs). This study aimed to compare performances of PHCC and FHU organizational models and to assess the potential gains from converting PHCCs into FHUs. Methods Stochastic discrete event simulation models for the two types of organizational models were designed and implemented using Simul8 software. These models were applied to data from nineteen primary care units in three municipalities of the Greater Lisbon area. Results The conversion of PHCCs into FHUs seems to have the potential to generate substantial improvements in productivity and accessibility, while not having a significant impact on costs. This conversion might entail a 45% reduction in the average number of days required to obtain a medical appointment and a 7% and 9% increase in the average number of medical and nursing consultations, respectively. Conclusions Reorganization of PHCC into FHUs might increase accessibility of patients to services and efficiency in the provision of primary care services. PMID:21999336
Nutaro, James J. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Kuruganti, Phani Teja [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Protopopescu, Vladimir A. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Shankar, Mallikarjun [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
2012-02-08
The efficient and accurate management of time in simulations of hybrid models is an outstanding engineering problem. General a priori knowledge about the dynamic behavior of the hybrid system (i.e. essentially continuous, essentially discrete, or 'truly hybrid') facilitates this task. Indeed, for essentially discrete and essentially continuous systems, existing software packages can be conveniently used to perform quite sophisticated and satisfactory simulations. The situation is different for 'truly hybrid' systems, for which direct application of existing software packages results in a lengthy design process, cumbersome software assemblies, inaccurate results, or some combination of these, independent of the designer's a priori knowledge about the system's structure and behavior. The main goal of this paper is to provide a methodology whereby simulation designers can use a priori knowledge about the hybrid model's structure to build a straightforward, efficient, and accurate simulator with existing software packages. The proposed methodology is based on a formal decomposition and re-articulation of the hybrid system; this is the main theoretical result of the paper. To set the result in the right perspective, we briefly review the essentially continuous and essentially discrete approaches, which are illustrated with typical examples. Then we present our new, split system approach, first in a general formal context, then in three more specific guises that reflect the viewpoints of three main communities of hybrid system researchers and practitioners. For each of these variants we indicate an implementation path. Our approach is illustrated with an archetypal problem of power grid control.
Khalid, Ruzelan; M. Nawawi, Mohd Kamal; Kawsar, Luthful A.; Ghani, Noraida A.; Kamil, Anton A.; Mustafa, Adli
2013-01-01
M/G/C/C state dependent queuing networks consider service rates as a function of the number of residing entities (e.g., pedestrians, vehicles, and products). However, modeling such dynamic rates is not supported in modern discrete event simulation (DES) software. We designed an approach to cater for this limitation and used it to construct the M/G/C/C state-dependent queuing model in Arena software. Using the model, we have evaluated and analyzed the impacts of various arrival rates on the throughput, the blocking probability, the expected service time and the expected number of entities in a complex network topology. Results indicated that there is a range of arrival rates for each network where the simulation results fluctuate drastically across replications, and this causes the simulation results and analytical results to exhibit discrepancies. Detailed results showing how closely the simulation results tally with the analytical results, in both abstract and graphical forms, together with some scientific justifications, have been documented and discussed. PMID:23560037
S. D. SMITH
1996-01-01
Earthmoving operations are a major part of many civil engineering works and as such they present an ideal opportunity for improved estimation forecasts and efficiency. This paper presents a computer simulation model which is based on the assumption that an earthmoving cycle can be broken down into its component parts which can be represented by Erlang probability distributions. Such a
Krukenberg, Harry J.
1996-01-01
framework is based on the theoretical concepts of systems as developed by Zeigler (1976). The model component provides a description of the physical elements of the system and their interrelationships. The experiment component is used to set... ideal conditions and a series of time distributions for various problems which may lengthen the activity completion time. Halpin (1977) developed the CYCLONE (CYCLic Operations NEtwork system) and later MicroCYCLONE simulation languages for modeling...
Diagnosable discrete event systems design
Yuanlin Wen; Pei-shu Fan; Muder Jeng
2007-01-01
This paper presents an approach using Petri nets for designing diagnosable discrete event systems such as complex semiconductor manufacturing machines. The concept is based on diagnosability analysis and enhancement. In this paper, we interpret and formulate the diagnosability problem as a binary integer linear programming problem that may have a feasible solution. If the system is predicted to be non-diagnosable,
Parallel Atomistic Simulations
HEFFELFINGER,GRANT S.
2000-01-18
Algorithms developed to enable the use of atomistic molecular simulation methods with parallel computers are reviewed. Methods appropriate for bonded as well as non-bonded (and charged) interactions are included. While strategies for obtaining parallel molecular simulations have been developed for the full variety of atomistic simulation methods, molecular dynamics and Monte Carlo have received the most attention. Three main types of parallel molecular dynamics simulations have been developed: the replicated data decomposition, the spatial decomposition, and the force decomposition. For Monte Carlo simulations, parallel algorithms have been developed which can be divided into two categories, those which require a modified Markov chain and those which do not. Parallel algorithms developed for other simulation methods such as Gibbs ensemble Monte Carlo, grand canonical molecular dynamics, and Monte Carlo methods for protein structure determination are also reviewed, and issues such as how to measure parallel efficiency, especially in the case of parallel Monte Carlo algorithms with modified Markov chains, are discussed.
Weening, J.S.
1988-05-01
CSIM is a simulator for parallel Lisp, based on a continuation passing interpreter. It models a shared-memory multiprocessor executing programs written in Common Lisp, extended with several primitives for creating and controlling processes. This paper describes the structure of the simulator, measures its performance, and gives an example of its use with a parallel Lisp program.
Scaling Time Warp-based Discrete Event Execution to 10^4 Processors on Blue Gene Supercomputer
Perumalla, Kalyan S [ORNL
2007-01-01
Lately, important large-scale simulation applications, such as emergency/event planning and response, are emerging that are based on discrete event models. The applications are characterized by their scale (several millions of simulated entities), their fine-grained nature of computation (microseconds per event), and their highly dynamic inter-entity event interactions. The desired scale and speed together call for highly scalable parallel discrete event simulation (PDES) engines. However, few such parallel engines have been designed or tested on platforms with thousands of processors. Here an overview is given of a unique PDES engine that has been designed to support Time Warp-style optimistic parallel execution as well as a more generalized mixed, optimistic-conservative synchronization. The engine is designed to run on massively parallel architectures with minimal overheads. A performance study of the engine is presented, including the first results to date of PDES benchmarks demonstrating scalability to as many as 16,384 processors, on an IBM Blue Gene supercomputer. The results show, for the first time, the promise of effectively sustaining very large scale discrete event execution on up to 10^{4} processors.
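The Time Warp-style optimistic execution mentioned above can be illustrated at the level of a single logical process. This is a deliberately minimal sketch, not the engine described in the abstract: state is saved before each event so that a straggler (an event arriving with a timestamp in the local past) triggers rollback and replay.

```python
class TimeWarpLP:
    """Minimal optimistic logical process: saves state before each event
    so a straggler can trigger rollback and replay (illustrative)."""
    def __init__(self):
        self.clock = 0.0
        self.state = 0
        self.log = []   # (event_ts, event_inc, state_before_event)

    def process(self, ts, inc):
        self.log.append((ts, inc, self.state))
        self.clock = ts
        self.state += inc          # the event computation itself

    def receive(self, ts, inc):
        if ts >= self.clock:       # in order: process optimistically
            self.process(ts, inc)
            return
        # Straggler: undo every event at or after ts, then replay
        # the straggler and the undone events in timestamp order.
        undone = []
        while self.log and self.log[-1][0] >= ts:
            ev_ts, ev_inc, before = self.log.pop()
            undone.append((ev_ts, ev_inc))
            self.state = before
        self.clock = self.log[-1][0] if self.log else 0.0
        for ev_ts, ev_inc in sorted([(ts, inc)] + undone):
            self.process(ev_ts, ev_inc)

lp = TimeWarpLP()
lp.receive(1.0, 10)
lp.receive(3.0, 100)
lp.receive(2.0, 5)          # straggler: rolls back t=3.0, replays it
print(lp.clock, lp.state)   # 3.0 115
```

A real engine would also send anti-messages to cancel outputs of rolled-back events and reclaim old log entries below GVT; both are omitted here.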
DISCRETE EVENT MODELING IN PTOLEMY II
Lukito Muliadi
Abstract This report describes the discrete-event semantics and its implementation in the Ptolemy II software architecture. The discrete-event system representation is appropriate for time-oriented systems such as queueing systems, communication networks, and hardware systems. A key strength in our discrete-event implementation is that simultaneous events are handled systematically and deterministically. A formal and rigorous treatment of this property is given. One
Xyce parallel electronic simulator.
Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Rankin, Eric Lamont; Schiek, Richard Louis; Thornquist, Heidi K.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Santarelli, Keith R.
2010-05-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide. The focus of this document is (to the extent possible) to exhaustively list device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide.
FINDING LOOKAHEAD IN WARGAMING: A REQUIREMENT FOR SCALEABLE PARALLEL SIMULATION
Emmet Beeker; John Chludzinski
Military simulations generally fall into two main categories: time-stepped and discrete-event simulation (DES). Where it can be used, DES provides performance advantages and can reduce temporal distortions imposed upon time-stepped simulations. In fact, some time-stepped simulations, such as the OneSAF Testbed (OTB), use some discrete-event techniques to mitigate these problems. The main difficulty with DES is scaling beyond a single
Optimal Discrete Event Supervisory Control of Aircraft Gas Turbine Engines
NASA Technical Reports Server (NTRS)
Litt, Jonathan (Technical Monitor); Ray, Asok
2004-01-01
This report presents an application of the recently developed theory of optimal Discrete Event Supervisory (DES) control that is based on a signed real measure of regular languages. The DES control techniques are validated on an aircraft gas turbine engine simulation test bed. The test bed is implemented on a networked computer system in which two computers operate in the client-server mode. Several DES controllers have been tested for engine performance and reliability.
Stabilization of discrete-event processes
NASA Technical Reports Server (NTRS)
Brave, Y.; Heymann, M.
1990-01-01
Discrete-event processes are modeled by state-machines in the Ramadge-Wonham framework with control by a feedback event disablement mechanism. In this paper, concepts of stabilization of discrete-event processes are defined and investigated. The possibility of driving a process (under control) from arbitrary initial states to a prescribed subset of the state set and then keeping it there indefinitely is examined. This stabilization property is studied also with respect to 'open-loop' processes and their asymptotic behavior is characterized. Polynomial time algorithms are presented for verifying various types of attraction and for the synthesis of attractors.
Chow's Team Petri Net Models discrete event
Kaber, David B.
Chow's Team Petri Net Models: discrete event stochastic models (set fixed time interval updates). Outputs include: operation times, plate location & time, plate characteristics, machine/process states: errors, schedule optimization, and event times. GOMSL model to represent monitoring task: monitoring
Diagnosability Enhancement of Discrete Event Systems
YuanLin Wen; ChunHsi Li; MuDer Jeng
2006-01-01
This paper presents an iterative systematic methodology for enhancing diagnosability of discrete event systems by adding sensors. The methodology consists of the following steps. First, Petri nets are used to model the target system. Then, an algorithm of polynomial complexity is adopted to analyze a sufficient condition of diagnosability of the modeled system. Here, diagnosability is defined in the context
Discrete Event Execution with One-Sided and Two-Sided GVT Algorithms on 216,000 Processor Cores
Perumalla, Kalyan S. [ORNL]; Park, Alfred J. [ORNL]; Tipparaju, Vinod [ORNL]
2014-01-01
Global virtual time (GVT) computation is a key determinant of the efficiency and runtime dynamics of parallel discrete event simulations (PDES), especially on large-scale parallel platforms. Here, three execution modes of a generalized GVT computation algorithm are studied on high-performance parallel computing systems: (1) a synchronous GVT algorithm that affords ease of implementation, (2) an asynchronous GVT algorithm that is more complex to implement but can relieve blocking latencies, and (3) a variant of the asynchronous GVT algorithm to exploit one-sided communication in extant supercomputing platforms. Performance results are presented of implementations of these algorithms on up to 216,000 cores of a Cray XT5 system, exercised on a range of parameters: optimistic and conservative synchronization, fine- to medium-grained event computation, synthetic and non-synthetic applications, and different lookahead values. Performance of up to 54 billion events executed per second is registered. Detailed PDES-specific runtime metrics are presented to further the understanding of tightly-coupled discrete event dynamics on massively parallel platforms.
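At its core, the synchronous variant described above reduces to a global minimum over all local virtual times and all in-transit message timestamps. A toy sketch of that reduction (ignoring the distributed communication and acknowledgment machinery the paper actually studies):

```python
def compute_gvt(local_clocks, in_transit_timestamps):
    """Synchronous GVT sketch: a barrier-style reduction over every
    LP's local virtual time and every unacknowledged message timestamp.
    Events below GVT can never be rolled back, so their state
    snapshots and logs can be reclaimed (fossil collection)."""
    candidates = list(local_clocks) + list(in_transit_timestamps)
    return min(candidates) if candidates else float("inf")

# Three LPs at virtual times 12, 15, and 9, with one message still
# in flight carrying timestamp 11:
print(compute_gvt([12.0, 15.0, 9.0], [11.0]))  # 9.0
```

The asynchronous variants in the paper compute the same quantity without a global blocking step, which is where the one-sided communication becomes attractive.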
Introduction to parallel programming
Brawer, S. (Encore Computer Corp., Marlborough, MA (US))
1989-01-01
This book describes parallel programming and all the basic concepts illustrated by examples in a simplified FORTRAN. Concepts covered include: The parallel programming model; The creation of multiple processes; Memory sharing; Scheduling; Data dependencies. In addition, a number of parallelized applications are presented, including a discrete-time, discrete-event simulator, numerical integration, Gaussian elimination, and parallelized versions of the traveling salesman problem and the exploration of a maze.
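The discrete-event simulator mentioned among the book's example applications rests on a simple sequential core that parallelization then targets: a clock plus a time-ordered pending-event queue. A minimal sketch, in Python rather than the book's simplified FORTRAN:

```python
import heapq
import itertools

class Simulator:
    """A minimal sequential discrete-event simulator: a clock plus a
    time-ordered pending-event heap (counter breaks timestamp ties)."""
    def __init__(self):
        self.now = 0.0
        self.pending = []
        self.counter = itertools.count()

    def schedule(self, delay, action):
        heapq.heappush(self.pending,
                       (self.now + delay, next(self.counter), action))

    def run(self):
        trace = []
        while self.pending:
            self.now, _, action = heapq.heappop(self.pending)
            trace.append((self.now, action.__name__))
            action(self)   # handlers may schedule further events
        return trace

# Hypothetical two-event model: an arrival schedules a departure.
def arrive(sim):
    sim.schedule(2.0, depart)   # the entity leaves 2.0 units later

def depart(sim):
    pass

sim = Simulator()
sim.schedule(1.0, arrive)
print(sim.run())  # [(1.0, 'arrive'), (3.0, 'depart')]
```

Parallelizing this loop means partitioning the event queue across processes, which immediately raises the synchronization questions the surrounding papers address.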
Multiple Autonomous Discrete Event Controllers for Constellations
NASA Technical Reports Server (NTRS)
Esposito, Timothy C.
2003-01-01
The Multiple Autonomous Discrete Event Controllers for Constellations (MADECC) project is an effort within the National Aeronautics and Space Administration Goddard Space Flight Center's (NASA/GSFC) Information Systems Division to develop autonomous positioning and attitude control for constellation satellites. It will be accomplished using traditional control theory and advanced coordination algorithms developed by the Johns Hopkins University Applied Physics Laboratory (JHU/APL). This capability will be demonstrated in the discrete event control test-bed located at JHU/APL. This project will be modeled for the Leonardo constellation mission, but is intended to be adaptable to any constellation mission. To develop a common software architecture, the controllers will only model very high-level responses. For instance, after determining that a maneuver must be made, the MADECC system will output a ΔV (velocity change) value. Lower level systems must then decide which thrusters to fire and for how long to achieve that ΔV.
ZAMBEZI: a parallel pattern parallel fault sequential circuit fault simulator
Minesh B. Amin; Bapiraju Vinnakota
1996-01-01
Sequential circuit fault simulators use the multiple bits in a computer data word to accelerate simulation. We introduce, and implement, a new sequential circuit fault simulator, a parallel pattern parallel fault simulator, ZAMBEZI, which simultaneously simulates multiple faults with multiple vectors in one data word. ZAMBEZI is developed by enhancing the control flow of existing parallel pattern algorithms. For a
A Discrete Event Controller Using Petri Nets Applied To Assembly
B. J. McCarragher; H. Asacla
1992-01-01
This paper takes a new approach to robotic assembly, treating assembly as a discrete event system. A discrete event in assembly is defined as a change in contact state reflecting a change in a geometric constraint. The discrete event modelling is accomplished using Petri nets. The problems of task-level planning and synthesis are addressed. Using the Petri net modelling,
HTDD based parallel fault simulator
Joanna Sapiecha; Krzysztof Sapiecha; Stanislaw Deniziak
1998-01-01
In this paper a new efficient approach to bit-parallel fault simulation for sequential circuits is introduced and evaluated with the help of ISCAS89 benchmarks. Digital systems are modelled using Hierarchical Ternary Decision Diagrams (HTDDs). It leads to substantial reduction of both the number of simulated faults and calculations needed for simulation. Moreover, an approach presented in this paper is able
F. Ris; P. M. Kogge
1989-01-01
Various papers on parallel processing are presented. The general topics addressed include: sorting and searching, image processing, algorithms, physical applications, graphs and trees, synchronization and scheduling, numerical linear algebra, program transformations, theoretical framework, and discrete event simulation.
Simulation Environment Configuration for Parallel Simulation of Multicore Embedded Systems
Ha, Soonhoi
...turnaround time due to the growing demand of simulation time. Parallel simulation aims to accelerate the simulation speed by running component simulators concurrently, but extra overhead of communication
Automated Parallelization of Timed Petri-Net Simulations
David M. Nicol; Weizhen Mao
1995-01-01
Timed Petri-nets are used to model numerous types of large complex systems, especially computer architectures and communication networks. While formal analysis of such models is sometimes possible, discrete-event simulation remains the most general technique available for assessing the model's behavior. However, simulation's computational requirements can be massive, especially on the large complex models that defeat analytic methods. One way of meeting these requirements is by
Parallel Simulation of Multicomponent Systems
Michael T. Heath; Xiangmin Jiao
2004-01-01
Simulation of multicomponent systems poses many critical challenges in science and engineering. We overview some software and algorithmic issues in developing high-performance simulation tools for such systems, based on our experience in developing a large-scale, fully-coupled code for detailed simulation of solid propellant rockets. We briefly sketch some of our solutions to these issues, with focus on parallel and performance
Parallelizing Timed Petri Net simulations
NASA Technical Reports Server (NTRS)
Nicol, David M.
1993-01-01
The possibility of using parallel processing to accelerate the simulation of Timed Petri Nets (TPN's) was studied. It was recognized that complex system development tools often transform system descriptions into TPN's or TPN-like models, which are then simulated to obtain information about system behavior. Viewed this way, it was important that the parallelization of TPN's be as automatic as possible, to admit the possibility of the parallelization being embedded in the system design tool. Later years of the grant were devoted to examining the problem of joint performance and reliability analysis, to explore whether both types of analysis could be accomplished within a single framework. In this final report, the results of our studies are summarized. We believe that the problem of parallelizing TPN's automatically for MIMD architectures has been almost completely solved for a large and important class of problems. Our initial investigations into joint performance/reliability analysis are two-fold; it was shown that Monte Carlo simulation, with importance sampling, offers promise of joint analysis in the context of a single tool, and methods for the parallel simulation of general Continuous Time Markov Chains, a model framework within which joint performance/reliability models can be cast, were developed. However, very much more work is needed to determine the scope and generality of these approaches. The results obtained in our two studies, future directions for this type of work, and a list of publications are included.
An assessment of the ModSim/TWOS parallel simulation environment
Rich, D.O.; Michelsen, R.E.
1991-01-01
The Time Warp Operating System (TWOS) has been the focus of significant research in parallel, discrete-event simulation (PDES). A new language, ModSim, has been developed for use in conjunction with TWOS. The coupling of ModSim and TWOS is an attempt to address the development of large-scale, complex, discrete-event simulation models for parallel execution. The approach, simply stated, is to provide a high-level simulation language that embodies well-known software engineering principles combined with a high-performance parallel execution environment. The inherent difficulty with this approach is the mapping of the simulation application to the parallel run-time environment. To use TWOS, Time Warp applications are currently developed in C and must be tailored according to a set of constraints and conventions. C/TWOS applications are carefully developed using explicit calls to the Time Warp primitives; thus, the mapping of application to parallel run-time environment is done by the application developer. The disadvantage of this approach is questionable scalability to larger software efforts; the obvious advantage is the degree of control over managing the efficient execution of the application. The ModSim/TWOS system provides an automatic mapping from a ModSim application to an equivalent C/TWOS application. The major flaw with the ModSim/TWOS system as it currently exists is that there is no compiler support for mapping a ModSim application into an efficient C/TWOS application. Moreover, the ModSim language as currently defined does not provide explicit hooks into the Time Warp Operating System, and hence the developer is unable to tailor a ModSim application in the same fashion that a C application can be tailored. Without sufficient compiler support, there is a mismatch between ModSim's object-oriented, process-based execution model and the Time Warp execution model.
Analytic Perturbation Analysis of Discrete Event Dynamic Systems
Uryasev, S.
1994-09-01
This paper considers a new Analytic Perturbation Analysis (APA) approach for Discrete Event Dynamic Systems (DEDS) with discontinuous sample-path functions with respect to control parameters. The performance functions for DEDS usually are formulated as mathematical expectations, which can be calculated only numerically. APA is based on new analytic formulas for the gradients of expectations of indicator functions; therefore, it is called an analytic perturbation analysis. The gradient of performance function may not coincide with the expectation of a gradient of sample-path function (i.e., the interchange formula for the gradient and expectation sign may not be valid). Estimates of gradients can be obtained with one simulation run of the models.
Decentralized Modular Control of Concurrent Discrete Event Systems
Changyan Zhou; Ratnesh Kumar; Ramavarapu S. Sreenivas
The paper studies decentralized modular control of concurrent discrete event systems that are composed of multiple interacting modules. A modular supervisor consists of a set
Automated Control Synthesis for an Assembly Line using Discrete Event System Control Theory
Kumar, Ratnesh
systems, an educational test-bed that simulates an automated car assembly line has been built using LEGO blocks. Finite state machines (FSMs) are used for modeling operations of the assembly line
DEVS Today: Recent Advances in Discrete Event-Based Information Technology
We review the basic DEVS formalism within a larger framework for modeling and simulation, which provides a means of specifying a mathematical object called a system [3,4,5]. Basically, a system specification includes discrete event system parameters.
Xyce parallel electronic simulator design.
Thornquist, Heidi K.; Rankin, Eric Lamont; Mei, Ting; Schiek, Richard Louis; Keiter, Eric Richard; Russo, Thomas V.
2010-09-01
This document is the Xyce Circuit Simulator developer guide. Xyce has been designed from the 'ground up' to be a SPICE-compatible, distributed-memory parallel circuit simulator. While it is in many respects a research code, Xyce is intended to be a production simulator. As such, having software quality engineering (SQE) procedures in place to ensure a high level of code quality and robustness is essential. Version control, issue tracking, customer support, C++ style guidelines, and the Xyce release process are all described. The Xyce Parallel Electronic Simulator has been under development at Sandia since 1999. Historically, Xyce has mostly been funded by ASC, and the original focus of Xyce development has primarily been circuits for nuclear weapons. However, this has not been the only focus, and it is expected that the project will diversify. Like many ASC projects, Xyce is a group development effort involving a number of researchers, engineers, scientists, mathematicians, and computer scientists. In addition to diversity of background, a certain amount of staff turnover is to be expected on long-term projects, as people move on to different projects. As a result, it is very important that the project maintain high software quality standards. The point of this document is to formally document a number of the software quality practices followed by the Xyce team in one place. Also, it is hoped that this document will be a good source of information for new developers.
Parallel Network Simulations with NEURON
Migliore, M.; Cannia, C.; Lytton, W.W; Markram, Henry; Hines, M. L.
2009-01-01
The NEURON simulation environment has been extended to support parallel network simulations. Each processor integrates the equations for its subnet over an interval equal to the minimum (interprocessor) presynaptic spike generation to postsynaptic spike delivery connection delay. The performance of three published network models with very different spike patterns exhibits superlinear speedup on Beowulf clusters and demonstrates that spike communication overhead is often less than the benefit of an increased fraction of the entire problem fitting into high speed cache. On the EPFL IBM Blue Gene, almost linear speedup was obtained up to 100 processors. Increasing one model from 500 to 40,000 realistic cells exhibited almost linear speedup on 2000 processors, with an integration time of 9.8 seconds and communication time of 1.3 seconds. The potential for speed-ups of several orders of magnitude makes practical the running of large network simulations that could otherwise not be explored. PMID:16732488
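The synchronization scheme this abstract describes (each process integrates its subnet independently for an interval equal to the minimum interprocessor spike delivery delay, then exchanges spikes) can be sketched as follows. The `Subnet` class and its fields are illustrative assumptions, not NEURON's actual API.

```python
# Sketch of the min-delay synchronization described above. Because a spike
# generated during one interval cannot arrive before `min_delay`, each
# worker may integrate a full interval before exchanging messages.
from dataclasses import dataclass, field

@dataclass
class Subnet:                      # hypothetical stand-in for one subnet
    name: str
    t: float = 0.0                 # local simulation time
    outbox: list = field(default_factory=list)  # (spike_time, target_name)

    def integrate(self, interval, spikes_in):
        # real membrane-equation integration omitted; only the clock moves
        self.t += interval

def run(subnets, min_delay, t_stop):
    """Bulk-synchronous loop: integrate one interval, route spikes, repeat."""
    mail = {s.name: [] for s in subnets}
    while subnets[0].t < t_stop:
        for s in subnets:
            s.integrate(min_delay, mail[s.name])
        mail = {s.name: [] for s in subnets}
        for s in subnets:
            for spike_t, target in s.outbox:
                mail[target].append(spike_t)
            s.outbox.clear()
    return [s.t for s in subnets]
```

In a real implementation each `Subnet` would run on its own processor and the mailbox exchange would be an MPI all-to-all; here the loop is sequential to keep the communication pattern visible.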
Data parallel sequential circuit fault simulation
Minesh B. Amin; Bapiraju Vinnakota
1996-01-01
Sequential circuit fault simulation is a compute-intensive problem. Parallel simulation is one method to reduce fault simulation time. In this paper, we discuss a novel technique to partition the fault set for the fault-parallel simulation of sequential circuits on multiple processors. When applied statically, the technique can scale well for up to thirty-two processors on an Ethernet. The
Parallel simulation of the Sharks World problem
Rajive L. Bagrodia; Wen-Toh Liao
1990-01-01
The Sharks World problem has been suggested as a suitable application to evaluate the effectiveness of parallel simulation algorithms. This paper develops a simulation model in Maisie, a C-based simulation language. With minor modifications, a Maisie program may be executed using either sequential or parallel simulation algorithms. The paper presents the results of executing the Maisie model on a multicomputer
Modelling machine ensembles with discrete event dynamical system theory
NASA Technical Reports Server (NTRS)
Hunter, Dan
1990-01-01
Discrete Event Dynamical System (DEDS) theory can be utilized as a control strategy for future complex machine ensembles that will be required for in-space construction. The control strategy involves orchestrating a set of interactive submachines to perform a set of tasks for a given set of constraints such as minimum time, minimum energy, or maximum machine utilization. Machine ensembles can be hierarchically modeled as a global model that combines the operations of the individual submachines. These submachines are represented in the global model as local models. Local models, from the perspective of DEDS theory, are described by the following: a set of system and transition states, an event alphabet that portrays actions that take a submachine from one state to another, an initial system state, a partial function that maps the current state and event alphabet to the next state, and the time required for the event to occur. Each submachine in the machine ensemble is represented by a unique local model. The global model combines the local models such that the local models can operate in parallel under the additional logistic and physical constraints due to submachine interactions. The global model is constructed from the states, events, event functions, and timing requirements of the local models. Supervisory control can be implemented in the global model by various methods such as task scheduling (open-loop control) or implementing a feedback DEDS controller (closed-loop control).
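The local-model ingredients listed in this abstract (a state set, an event alphabet, an initial state, a partial transition function, and per-event durations) map directly onto a small data structure. A minimal sketch, with a hypothetical robot-arm submachine as the example; the state and event names are not from the paper:

```python
# A local model in the DEDS sense: states, event alphabet, initial state,
# partial transition function delta, and per-event durations.
class LocalModel:
    def __init__(self, states, events, init, delta, durations):
        self.states, self.events = states, events
        self.state, self.clock = init, 0.0
        self.delta = delta          # partial map: (state, event) -> state
        self.durations = durations  # event -> time required for the event

    def fire(self, event):
        """Apply `event` if defined in the current state; advance time."""
        key = (self.state, event)
        if key not in self.delta:
            raise ValueError(f"event {event!r} undefined in state {self.state!r}")
        self.state = self.delta[key]
        self.clock += self.durations[event]
        return self.state

# Hypothetical robot-arm submachine used only for illustration
arm = LocalModel(
    states={"idle", "moving", "holding"},
    events={"move", "grasp", "release"},
    init="idle",
    delta={("idle", "move"): "moving",
           ("moving", "grasp"): "holding",
           ("holding", "release"): "idle"},
    durations={"move": 2.0, "grasp": 0.5, "release": 0.5},
)
```

A global model, in these terms, would compose several such `LocalModel` instances and restrict which joint event sequences are admissible.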
A New Conservative Synchronization Protocol for Dynamic Wargame Simulation
Phuong T. Bui; Sheau-Dong Lang; David A. Workman
Parallel processing of discrete events requires that causality be retained, even when events are processed out of strict time ordering. The two approaches to parallel simulation are optimistic and conservative. Optimistic simulation allows processors to independently simulate events assuming they are temporally correct. When it is discovered that there is a temporal discrepancy, the simulation is
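The optimistic approach summarized here can be illustrated with a toy logical process that saves state before each event and rolls back when a straggler (an event with an earlier timestamp) arrives. This is a deliberately simplified sketch: real Time Warp systems also re-execute rolled-back events and send anti-messages, both omitted below.

```python
# Toy optimistic logical process: events execute speculatively; a straggler
# forces a rollback to the last state consistent with its timestamp.
class OptimisticLP:
    def __init__(self):
        self.state, self.lvt = 0, 0.0   # lvt = local virtual time
        self.log = []                   # (timestamp, state_before_event)

    def handle(self, ts, delta):
        if ts < self.lvt:               # causality violated: roll back
            self.rollback(ts)
        self.log.append((ts, self.state))   # checkpoint for future rollbacks
        self.state += delta
        self.lvt = ts

    def rollback(self, ts):
        """Undo every event whose timestamp is >= ts."""
        while self.log and self.log[-1][0] >= ts:
            _, before = self.log.pop()
            self.state = before
        self.lvt = self.log[-1][0] if self.log else 0.0
```

A conservative process, by contrast, would simply refuse to execute an event until it could prove no earlier-timestamped event can still arrive.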
Modular supervisory control of discrete-event systems
W. M. Wonhamt; P. J. Ramadge
1988-01-01
A modular approach to the supervisory control of a class of discrete-event systems is formulated, and illustrated with an example. Discrete-event systems are modeled by automata together with a mechanism for enabling and disabling a subset of state transitions. The basic problem of interest is to ensure by appropriate supervision that the closed loop behavior of the system lies within
Yoginath, Srikanth B [ORNL; Perumalla, Kalyan S [ORNL
2013-01-01
Virtual machine (VM) technologies, especially those offered via Cloud platforms, present new dimensions with respect to performance and cost in executing parallel discrete event simulation (PDES) applications. Due to the introduction of overall cost as a metric, the choice of the highest-end computing configuration is no longer the most economical one. Moreover, runtime dynamics unique to VM platforms introduce new performance characteristics, and the variety of possible VM configurations give rise to a range of choices for hosting a PDES run. Here, an empirical study of these issues is undertaken to guide an understanding of the dynamics, trends and trade-offs in executing PDES on VM/Cloud platforms. Performance results and cost measures are obtained from actual execution of a range of scenarios in two PDES benchmark applications on the Amazon Cloud offerings and on a high-end VM host machine. The data reveals interesting insights into the new VM-PDES dynamics that come into play and also leads to counter-intuitive guidelines with respect to choosing the best and second-best configurations when overall cost of execution is considered. In particular, it is found that choosing the highest-end VM configuration guarantees neither the best runtime nor the least cost. Interestingly, choosing a (suitably scaled) low-end VM configuration provides the least overall cost without adversely affecting the total runtime.
Parallel PIC plasma simulation through particle decomposition techniques
Vlad, Gregorio
B. Di Martino
2000
Parallelization of a particle-in-cell (PIC) code has been accomplished through a particle decomposition technique, which requires a moderate effort in porting the code to parallel form and results in intrinsic load
Hybrid parallel tempering and simulated annealing method
Yaohang Li; Vladimir A. Protopopescu; Nikita Arnold; Xinyu Zhang; Andrey Gorin
2009-01-01
In this paper, we propose a new hybrid scheme of parallel tempering and simulated annealing (hybrid PT/SA). Within the hybrid PT/SA scheme, a composite system with multiple conformations is evolving in parallel on a temperature ladder with various transition step sizes. The simulated annealing (SA) process uses a cooling scheme to decrease the temperature values in the temperature ladder
Hierarchical Discrete Event Supervisory Control of Aircraft Propulsion Systems
NASA Technical Reports Server (NTRS)
Yasar, Murat; Tolani, Devendra; Ray, Asok; Shah, Neerav; Litt, Jonathan S.
2004-01-01
This paper presents a hierarchical application of Discrete Event Supervisory (DES) control theory for intelligent decision and control of a twin-engine aircraft propulsion system. A dual layer hierarchical DES controller is designed to supervise and coordinate the operation of two engines of the propulsion system. The two engines are individually controlled to achieve enhanced performance and reliability, necessary for fulfilling the mission objectives. Each engine is operated under a continuously varying control system that maintains the specified performance and a local discrete-event supervisor for condition monitoring and life extending control. A global upper level DES controller is designed for load balancing and overall health management of the propulsion system.
Discrete Event Multilevel Models for Systems Biology
Adelinde M. Uhrmacher; Daniela Degenring; Bernard P. Zeigler
2005-01-01
Diverse modeling and simulation methods are being applied in the area of Systems Biology. Most models in Systems Biology can easily be located within the space that is spanned by three dimensions of modeling: continuous and discrete; quantitative and qualitative; stochastic and deterministic. These dimensions are not entirely independent nor are they exclusive. Many modeling approaches are hybrid as they
Tropper, Carl
Conservative Synchronization of Large-Scale Network Simulations
Alfred Park; Richard M. Fujimoto
Parallel discrete event simulation techniques have enabled synchronous and asynchronous algorithms for conservative parallel network simulation. We develop an analytical model
Diagnosis of asynchronous discrete event systems, a net unfolding approach
Albert Benveniste; Eric Fabre; Claude Jard; Stefan Haar
2002-01-01
This paper studies the diagnosis of asynchronous discrete event systems. We follow a so-called true concurrency approach, in which neither the global state nor global time are available. Instead, we use only local states in combination with a partial order model of time; our basic mathematical tool is that of Petri net unfoldings. This study was motivated by the problem
Optimal sensor activation for diagnosing discrete event systems
Weilin Wang; Stéphane Lafortune; Anouck R. Girard; Feng Lin
2010-01-01
The problem of dynamic sensor activation for event diagnosis in partially observed discrete event systems is considered. Diagnostic agents are able to activate sensors dynamically during the evolution of the system. Sensor activation policies for diagnostic agents are functions that determine which sensors are to be activated after the occurrence of a trace of events. The sensor activation policy must
Fluidization and fluid views of discrete event systems
Manuel Silva; Cristian Mahulea
2011-01-01
Fluidization is an efficient relaxation technique to tackle classical state-explosion problems in discrete event systems (DES); it consists of approximating the discrete states by continuous or hybrid ones. This is not a technical work in the most classical sense; its purpose is more epistemological and methodological: to overview fluidization in several well-known modeling paradigms for DES. As a
Monitoring and Active Diagnosis for Discrete-Event Systems
Pencolé, Yannick
Keywords: fault diagnosis, discrete-event systems, finite state machines, decision trees, planning. Section 3 presents a formal background on fault diagnosis in a DES; Section 4 addresses fault detection and isolation and on-line planning/replanning (Chanthery et al. (2005)).
Data Parallel Switch-Level Simulation
Bryant, Randal E.
Carnegie Mellon University. Data parallel simulation involves simulating the behavior of a circuit over many runs on a massively parallel SIMD machine, with each processor simulating the circuit behavior for one run. Most other approaches to parallelism in simulation utilize circuit parallelism; in this mode, the simulator extracts parallelism from
Dingwei Wang
2009-01-01
Simulation is an effective means when solving the design and operation problems in modern container terminals. Applying object-oriented discrete event system simulation method, and analyzing the technological process of loading and unloading as well as the operational management in modern container terminals, a simulation model was constructed to describe the whole operation system of container terminals, in which container ships,
Parallel methods for the flight simulation model
Wei Zhong Xiong; C. Swietlik
1994-01-01
The Advanced Computer Applications Center (ACAC) has been involved in evaluating advanced parallel-architecture computers and the applicability of these machines to computer simulation models. The advanced systems investigated include parallel machines with shared-memory and distributed architectures, consisting of an eight-processor Alliant FX/8, a twenty-four-processor Sequent Symmetry, Cray XMP, IBM RISC 6000 model 550, and
Hierarchical, modular discrete-event modelling in an object-oriented environment
Zeigler, B.P.
1987-11-01
Hierarchical, modular specification of discrete-event models offers a basis for reusable model bases and hence for enhanced simulation of truly varied design alternatives. The authors describe an environment which realizes the DEVS formalism developed for hierarchical, modular models. It is implemented in PC-Scheme, a powerful Lisp dialect for microcomputers containing an object-oriented programming subsystem. Since both the implementation and the underlying language are accessible to the user, the result is a capable medium for combining simulation modelling and artificial intelligence techniques.
PARALLEL IMPLEMENTATION OF VLSI HED CIRCUIT SIMULATION
Silc, Jurij
Informatica 2/91. Jurij Silc and Marjan Spegel, Jozef Stefan Institute, Ljubljana, Slovenia. The importance of circuit simulation in the design of VLSI circuits has channelised research work in the direction of finding methods
Parallel Circuit Simulation Using Hierarchical Relaxation
Gih-guang Hung; Yen-cheng Wen; Kyle Gallivan; Resve A. Saleh
1990-01-01
This paper describes a class of parallel algorithms for circuit simulation based on hierarchical relaxation that has been implemented on the Cedar multiprocessor. The Cedar machine is a reconfigurable, general-purpose supercomputer that was designed and implemented at the University of Illinois. A hierarchical circuit simulation scheme was developed to exploit the hierarchical organization of Cedar. The new algorithm and a
Simulating the scheduling of parallel supercomputer applications
Seager, M.K.; Stichnoth, J.M.
1989-09-19
An Event Driven Simulator for Evaluating Multiprocessing Scheduling (EDSEMS) disciplines is presented. The simulator is made up of three components: machine model; parallel workload characterization; and scheduling disciplines for mapping parallel applications (many processes cooperating on the same computation) onto processors. A detailed description of how the simulator is constructed, how to use it, and how to interpret the output is also given. Initial results are presented from the simulation of parallel supercomputer workloads using "Dog-Eat-Dog," "Family," and "Gang" scheduling disciplines. These results indicate that Gang scheduling is far better at giving the number of processors that a job requests than Dog-Eat-Dog or Family scheduling. In addition, the system throughput and turnaround time are not adversely affected by this strategy. 10 refs., 8 figs., 1 tab.
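The "Gang" discipline the abstract compares can be sketched as a scheduler that co-schedules a job only when its full processor request fits in the current time slice. The function below is an illustrative simplification, not the EDSEMS implementation:

```python
# Illustrative gang scheduler: a job is placed in a time slice only if its
# full processor request fits; otherwise it waits for a later slice.
def gang_schedule(jobs, total_procs):
    """jobs: list of (name, procs_requested) pairs.
    Returns time slices, each a list of co-scheduled job names."""
    assert all(need <= total_procs for _, need in jobs)
    slices, pending = [], list(jobs)
    while pending:
        free, running, waiting = total_procs, [], []
        for name, need in pending:
            if need <= free:            # full allocation is available now
                running.append(name)
                free -= need
            else:                       # never run on a partial allocation
                waiting.append((name, need))
        slices.append(running)
        pending = waiting
    return slices
```

This is what gives gang scheduling the property the abstract reports: every job eventually receives exactly the number of processors it requested.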
SYNTHESIS OF SUPERVISORS FOR TIME-VARYING DISCRETE EVENT SYSTEMS
Eduard Montgomery; Meira Costa; Antonio Marcus; Nogueira Limay
2004-01-01
We introduce a time-varying automaton to model discrete event systems. The structure of this time-varying automaton is very similar to that of a (max,+) automaton, but it allows variable event lifetimes. Based on this time-varying automaton, the design of timed supervisors is obtained by using the dioid algebra, where the languages used to describe the discrete event system as well as the
An application of discrete-event theory to truck dispatching
Stephane Blouin; Martin Guay; Karen Rudie
2007-01-01
This article focuses on the dispatching problem of an oilsand excavation process subject to production objectives and specifications. Herein, we cast the truck dispatching task in a decision-making framework for determining solutions and helping a dispatcher to make decisions. In this paper, we apply the discrete-event formalism to investigate the dispatching of a large truck fleet. For this purpose, we
Sample-path analysis of stochastic discrete-event systems
Muhammad El-Taha; Shaler Stidham
1993-01-01
This paper presents a unified sample-path approach for deriving distribution-free relations between performance measures for stochastic discrete-event systems, extending previous results for discrete-state processes to processes with a general state space. A unique feature of our approach is that all our results are shown to follow from a single fundamental theorem: the sample-path version of the renewal-reward theorem (Y = λX). As
Visualization and Tracking of Parallel CFD Simulations
NASA Technical Reports Server (NTRS)
Vaziri, Arsi; Kremenetsky, Mark
1995-01-01
We describe a system for interactive visualization and tracking of a 3-D unsteady computational fluid dynamics (CFD) simulation on a parallel computer. CM/AVS, a distributed, parallel implementation of a visualization environment (AVS) runs on the CM-5 parallel supercomputer. A CFD solver is run as a CM/AVS module on the CM-5. Data communication between the solver, other parallel visualization modules, and a graphics workstation, which is running AVS, are handled by CM/AVS. Partitioning of the visualization task, between CM-5 and the workstation, can be done interactively in the visual programming environment provided by AVS. Flow solver parameters can also be altered by programmable interactive widgets. This system partially removes the requirement of storing large solution files at frequent time steps, a characteristic of the traditional 'simulate (yields) store (yields) visualize' post-processing approach.
Parallel processing of a rotating shaft simulation
NASA Technical Reports Server (NTRS)
Arpasi, Dale J.
1989-01-01
A FORTRAN program describing the vibration modes of a rotor-bearing system is analyzed for parallelism using a Pascal-like structured language. Potential vector operations are also identified. A critical path through the simulation is identified and used in conjunction with somewhat fictitious processor characteristics to determine the time to calculate the problem on a parallel processing system having those characteristics. A parallel processing overhead time is included as a parameter for proper evaluation of the gain over serial calculation. The serial calculation time is determined for the same fictitious system. An improvement of up to 640 percent is possible depending on the value of the overhead time. Based on the analysis, certain conclusions are drawn pertaining to the development needs of parallel processing technology, and to the specification of parallel processing systems to meet computational needs.
Timed Residuals for Fault Detection and Isolation in Discrete Event Systems
Paris-Sud XI, Université de
An approach to fault detection and isolation in discrete event systems is proposed. An identified model constitutes a timed automaton; the approach is evaluated on a virtual production plant with an external controller. Keywords: Discrete Event System; Timed Automata
Supervisory control of real-time discrete event systems under bounded time constraints
S.-J. Park; Jong-Tae Lim
Supervisory control of real-time discrete event systems (DESs) under bounded time constraints is presented. The behaviour of real-time DESs depends not only on their logical correctness but also
NASA Technical Reports Server (NTRS)
Greenberg, Albert G.; Lubachevsky, Boris D.; Nicol, David M.; Wright, Paul E.
1994-01-01
Fast, efficient parallel algorithms are presented for discrete event simulations of dynamic channel assignment schemes for wireless cellular communication networks. The driving events are call arrivals and departures, in continuous time, to cells geographically distributed across the service area. A dynamic channel assignment scheme decides which call arrivals to accept, and which channels to allocate to the accepted calls, attempting to minimize call blocking while ensuring co-channel interference is tolerably low. Specifically, the scheme ensures that the same channel is used concurrently at different cells only if the pairwise distances between those cells are sufficiently large. Much of the complexity of the system comes from ensuring this separation. The network is modeled as a system of interacting continuous time automata, each corresponding to a cell. To simulate the model, conservative methods are used; i.e., methods in which no errors occur in the course of the simulation and so no rollback or relaxation is needed. Implemented on a 16K processor MasPar MP-1, an elegant and simple technique provides speedups of about 15 times over an optimized serial simulation running on a high speed workstation. A drawback of this technique, typical of conservative methods, is that processor utilization is rather low. To overcome this, new methods were developed that exploit slackness in event dependencies over short intervals of time, thereby raising the utilization to above 50 percent and the speedup over the optimized serial code to about 120 times.
Simulation of a master-slave event set processor
J. C. Comfort
1984-01-01
Event set manipulation may consume a considerable amount of the computation time spent in performing a discrete-event simulation. One way of minimizing this time is to allow event set processing to proceed in parallel with the remainder of the simulation computation. The paper describes a multiprocessor simulation computer, in which all non-event set processing is performed by the principal processor
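The event set whose manipulation this paper offloads to a dedicated processor is, in sequential simulators, typically a priority queue ordered by timestamp. A minimal heap-based sketch (the class and method names are illustrative, not from the paper):

```python
# Minimal timestamp-ordered event set backed by a binary heap; a counter
# breaks ties so simultaneous events come out in insertion order.
import heapq
import itertools

class EventSet:
    def __init__(self):
        self._heap = []
        self._tick = itertools.count()

    def schedule(self, time, event):
        heapq.heappush(self._heap, (time, next(self._tick), event))

    def next_event(self):
        """Remove and return the (time, event) pair with the smallest time."""
        time, _, event = heapq.heappop(self._heap)
        return time, event

es = EventSet()
es.schedule(5.0, "depart")
es.schedule(2.0, "arrive")
```

Because `schedule` and `next_event` touch only this structure, they are natural candidates for running on a separate processor concurrently with the rest of the simulation, as the abstract describes.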
Discrete-Event Execution Alternatives on General Purpose Graphical Processing Units
Perumalla, Kalyan S [ORNL
2006-01-01
Graphics cards, traditionally designed as accelerators for computer graphics, have evolved to support more general-purpose computation. General Purpose Graphical Processing Units (GPGPUs) are now being used as highly efficient, cost-effective platforms for executing certain simulation applications. While most of these applications belong to the category of time-stepped simulations, little is known about the applicability of GPGPUs to discrete event simulation (DES). Here, we identify some of the issues & challenges that the GPGPU stream-based interface raises for DES, and present some possible approaches to moving DES to GPGPUs. Initial performance results on simulation of a diffusion process show that DES-style execution on GPGPU runs faster than DES on CPU and also significantly faster than time-stepped simulations on either CPU or GPGPU.
Performance limitations in parallel processor simulations
NASA Technical Reports Server (NTRS)
O'Grady, E. Pearse; Wang, Chung-Hsien
1987-01-01
A jet-engine model is partitioned and simulated on a parallel processor system consisting of five 8086/8087 floating-point computers. The simulation uses Heun's integration method. A near-optimal parallel simulation (in the sense of minimum execution time) achieves speedup of only 2.13 and efficiency of 42.6 percent, in effect wasting 57.4 percent of the available processing power. A detailed analysis identifies and graphically demonstrates why the system fails to achieve ideal performance (viz., speedup of 5 and efficiency of 100 percent). Inherent characteristics of the problem equations and solution algorithm account for the loss of nearly half of the available processing power. Overheads associated with interprocessor communication and processor synchronization account for only a small fraction of the lost processing power. The effects of these and other factors which limit parallel processor performance are illustrated through real-time timing-analyzer traces describing the run/idle status of the parallel processors during the simulation.
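The quoted figures follow from the standard definitions: with N processors, speedup S = T_serial / T_parallel and efficiency E = S / N, so a speedup of 2.13 on 5 processors gives 2.13 / 5 = 0.426, i.e. 42.6 percent:

```python
# Standard parallel-performance metrics used in the abstract above.
def speedup(t_serial, t_parallel):
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_procs):
    return speedup(t_serial, t_parallel) / n_procs
```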
Parallel Agent Based Simulation on PC Cluster
Haridi, Seif
Parallel Agent Based Simulation on PC Cluster. Seif Haridi, Konstantin Popov, Mahmoud Rafea, Fredrik. Simulated agents: participate in "local" social networks; maintain a portfolio of frequently visited sites; have memory about sites from their portfolio; surf along the links of the already visited sites; replace a site
Control of discrete event systems modeled as hierarchical state machines
NASA Technical Reports Server (NTRS)
Brave, Y.; Heymann, M.
1991-01-01
The authors examine a class of discrete event systems (DESs) modeled as asynchronous hierarchical state machines (AHSMs). For this class of DESs, they provide an efficient method for testing reachability, which is an essential step in many control synthesis procedures. This method utilizes the asynchronous nature and hierarchical structure of AHSMs, thereby illustrating the advantage of the AHSM representation as compared with its equivalent (flat) state machine representation. An application of the method is presented where an online minimally restrictive solution is proposed for the problem of maintaining a controlled AHSM within prescribed legal bounds.
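Reachability testing on the equivalent flat state machine, the baseline that the AHSM method improves upon, is ordinary graph search. A minimal sketch (illustrative only, not the authors' algorithm; event labels are elided):

```python
from collections import deque

def reachable(transitions, start):
    """Return the set of states reachable from `start` via breadth-first
    search. transitions: state -> iterable of successor states."""
    seen = {start}
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        for nxt in transitions.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

# Toy machine: s3 can reach every state, but no state reaches s3.
fsm = {"s0": ["s1"], "s1": ["s2"], "s3": ["s0"]}
```

The cost of this flat search grows with the full product state space, which is why exploiting the asynchronous, hierarchical structure of AHSMs pays off.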
Parallel and Distributed System Simulation
NASA Technical Reports Server (NTRS)
Dongarra, Jack
1998-01-01
This exploratory study initiated our research into the software infrastructure necessary to support the modeling and simulation techniques that are most appropriate for the Information Power Grid. Such computational power grids will use high-performance networking to connect hardware, software, instruments, databases, and people into a seamless web that supports a new generation of computation-rich problem solving environments for scientists and engineers. In this context we looked at evaluating the NetSolve software environment for network computing that leverages the potential of such systems while addressing their complexities. NetSolve's main purpose is to enable the creation of complex applications that harness the immense power of the grid, yet are simple to use and easy to deploy. NetSolve uses a modular, client-agent-server architecture to create a system that is very easy to use. Moreover, it is designed to be highly composable in that it readily permits new resources to be added by anyone willing to do so. In these respects NetSolve is to the Grid what the World Wide Web is to the Internet. But like the Web, the design that makes these wonderful features possible can also impose significant limitations on the performance and robustness of a NetSolve system. This project explored the design innovations that push the performance and robustness of the NetSolve paradigm as far as possible without sacrificing the Web-like ease of use and composability that make it so powerful.
AGN variability time scales and the discrete-event model
P. Favre; T. J. -L. Courvoisier; S. Paltani
2005-08-29
We analyse the ultraviolet variability time scales in a sample of 15 Type 1 Active Galactic Nuclei (AGN) observed by IUE. Using a structure function analysis, we demonstrate the existence in most objects of a maximum variability time scale of the order of 0.02-1.00 year. We do not find any significant dependence of these maximum variability time scales on the wavelength, but we observe a weak correlation with the average luminosity of the objects. We also observe in several objects the existence of long-term variability, which seems decoupled from the short-term one. We interpret the existence of a maximum variability time scale as possible evidence that the light curves of Type 1 AGN are the result of the superimposition of independent events. In the framework of the so-called discrete-event model, we study the event energy and event rate as a function of the object properties. We compare our results with predictions from existing models based on discrete events. We show that models based on a fixed event energy, like supernova explosions, can be ruled out. In their present form, models based on magnetic blobs are also unable to account for the observed relations. Stellar collision models, while not completely satisfactory, cannot be excluded.
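A structure function measures variability power as a function of time lag; a maximum variability time scale appears as the lag beyond which the structure function stops growing. A toy sketch for an evenly sampled light curve (the actual analysis handles unevenly sampled IUE data, so this is only a simplified illustration):

```python
def structure_function(values, lag):
    """First-order structure function at an integer lag: the mean squared
    difference between samples separated by `lag` steps."""
    diffs = [(values[i + lag] - values[i]) ** 2
             for i in range(len(values) - lag)]
    return sum(diffs) / len(diffs)

# A strictly periodic toy signal: full power at lag 1, none at lag 2.
signal = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
```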
Xyce parallel electronic simulator : reference guide.
Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Warrender, Christina E.; Keiter, Eric Richard; Pawlowski, Roger Patrick
2011-05-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list, as exhaustively as possible, device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide. The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. It is targeted specifically to run on large-scale parallel computing platforms but also runs well on a variety of architectures including single processor workstations. It also aims to support a variety of devices and models specific to Sandia needs. This document is intended to complement the Xyce Users Guide. It contains comprehensive, detailed information about a number of topics pertinent to the usage of Xyce. Included in this document is a netlist reference for the input-file commands and elements supported within Xyce; a command line reference, which describes the available command line arguments for Xyce; and quick-references for users of other circuit codes, such as Orcad's PSpice and Sandia's ChileSPICE.
Parallel Monte Carlo Ion Recombination Simulation in Orca
Seinstra, Frank J.
Parallel Monte Carlo Ion Recombination Simulation in Orca. Frank J. Seinstra. This report describes the implementation in Orca of a realistic Monte Carlo simulation of ion recombinations. Keywords: parallel computing, Orca, Ethernet, Myrinet, Monte Carlo simulation, ion recombination
Parallel-distributed mobile robot simulator
NASA Astrophysics Data System (ADS)
Okada, Hiroyuki; Sekiguchi, Minoru; Watanabe, Nobuo
1996-06-01
The aim of this project is to achieve an autonomous learning and growth function based on active interaction with the real world. It should also be able to autonomously acquire knowledge about the context in which jobs take place, and how the jobs are executed. This article describes a parallel distributed movable robot system simulator with an autonomous learning and growth function. The autonomous learning and growth function which we are proposing is characterized by its ability to learn and grow through interaction with the real world. When the movable robot interacts with the real world, the system compares the virtual environment simulation with the interaction result in the real world. The system then improves the virtual environment to match the real-world result more closely. In this way the system learns and grows. It is very important that such a simulation is time-realistic. The parallel distributed movable robot simulator was developed to simulate the space of a movable robot system with an autonomous learning and growth function. The simulator constructs a virtual space faithful to the real world and also integrates the interfaces between the user, the actual movable robot and the virtual movable robot. Using an ultrafast CG (computer graphics) system (FUJITSU AG series), time-realistic 3D CG is displayed.
Cong, Jason "Jingsheng"
Parallel Logic Level Simulation of VLSI Circuits. Abstract: In this paper, we study parallel logic simulation and evaluate the impact on parallel circuit simulation of different numbers of partitions with different. Few statistics have been published to exploit the parallelism and analyze performance in circuit
Parallel Molecular Dynamics Simulation on Elastic Properties of Solid Argon
Futoshi Shimizu; Hajime Kimizuka; Hideo Kaburaki; Ju Li; Sidney Yip
2000-01-01
Parallel Molecular Dynamics Stencil has been developed to execute effectively large-scale parallel molecular dynamics simulations. The Stencil is adapted to varieties of molecular dynamics simulations without special attention to parallelization techniques. As an example of large-scale simulation using this Stencil, the adiabatic elastic constants of solid argon in crystalline and amorphous states have been evaluated over the temperature
MPI Parallelization of PIC Simulation with Adaptive Mesh Refinement
Tatsuki Matsui; Hideyuki Usui; Toseo Moritaka; Masanori Nunami
2011-01-01
With the prevalence of massively parallel computer architecture, MPI parallelization of existing simulation codes for stand-alone systems and their parallel optimization to achieve feasible scalability are in critical need. Of many numerical approaches, adaptive mesh refinement (AMR) is known to be one of the particular cases in which MPI parallelization is challenging. In this manuscript, our ongoing project and
Parallel beam dynamics simulation of linear accelerators
Qiang, Ji; Ryne, Robert D.
2002-01-31
In this paper we describe parallel particle-in-cell methods for the large scale simulation of beam dynamics in linear accelerators. These techniques have been implemented in the IMPACT (Integrated Map and Particle Accelerator Tracking) code. IMPACT is being used to study the behavior of intense charged particle beams and as a tool for the design of next-generation linear accelerators. As examples, we present applications of the code to the study of emittance exchange in high intensity beams and to the study of beam transport in a proposed accelerator for the development of accelerator-driven waste transmutation technologies.
Parallel gate-level circuit simulation on shared memory architectures
Rajive Bagrodia; Yu-an Chen; Vikas Jha; Nicki Sonpar
1995-01-01
This paper presents the results of an experimental study to evaluate the effectiveness of parallel simulation in reducing the execution time of gate-level models of VLSI circuits. Specific contributions of this paper include (i) the design of a gate-level parallel simulator that can be executed, without any changes, on both distributed-memory and shared-memory parallel architectures, and (ii) demonstrated speedups
Massively Parallel Simulations of Solar Flares and Plasma Turbulence
Grauer, Rainer
Massively Parallel Simulations of Solar Flares and Plasma Turbulence. Lukas Arnold, Christoph Beetz. These simulations need an efficient parallel implementation. We will describe the physics behind these problems and present the numerical frameworks for solving them on massively parallel computers.
Distributed Event-Driven Simulation of VHDL-SPICE Mixed-Signal Circuits
Dragos Lungeanu; C.-J. Richard Shi
2001-01-01
Presents a new framework and its prototype implementation for the distributed simulation of a mixed-signal system where some parts are modeled by differential and algebraic equations (as in SPICE) and other parts are modeled by discrete events (as in VHDL or Verilog). The work is built on top of a general-purpose framework of parallel and distributed simulation that combines both
A Survey of Petri Net Methods for Controlled Discrete Event Systems
L. E. HOLLOWAY; B. H. KROGH; A. GIUA
1997-01-01
This paper surveys recent research on the application of Petri net models to the analysis and synthesis of controllers for discrete event systems. Petri nets have been used extensively in applications such as automated manufacturing, and there exists a large body of tools for qualitative and quantitative analysis of Petri nets. The goal of Petri net research in discrete event
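The Petri net semantics that the surveyed analysis and synthesis methods build on is the token-game firing rule: a transition is enabled when each of its input places holds enough tokens, and firing consumes input tokens and produces output tokens. A minimal sketch (markings as dicts; the place and weight names are illustrative):

```python
def enabled(marking, pre):
    """A transition is enabled iff every input place holds enough tokens."""
    return all(marking.get(p, 0) >= w for p, w in pre.items())

def fire(marking, pre, post):
    """Fire an enabled transition: consume input tokens, produce outputs."""
    if not enabled(marking, pre):
        raise ValueError("transition not enabled")
    m = dict(marking)
    for p, w in pre.items():
        m[p] -= w
    for p, w in post.items():
        m[p] = m.get(p, 0) + w
    return m

# Toy manufacturing cell: one idle machine starts a job.
m0 = {"idle": 1, "busy": 0}
m1 = fire(m0, pre={"idle": 1}, post={"busy": 1})
```

A supervisory controller for such a model restricts which enabled transitions may actually fire, which is the control problem the survey addresses.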
Improving the Teaching of Discrete-Event Control Systems Using a LEGO Manufacturing Prototype
ERIC Educational Resources Information Center
Sanchez, A.; Bucio, J.
2012-01-01
This paper discusses the usefulness of employing LEGO as a teaching-learning aid in a post-graduate-level first course on the control of discrete-event systems (DESs). The final assignment of the course is presented, which asks students to design and implement a modular hierarchical discrete-event supervisor for the coordination layer of a…
Integrating discrete events and continuous head movements for video-based interaction techniques
Tatiana V. Evreinova; Grigori Evreinov; Roope Raisamo
2009-01-01
Human head gestures can potentially trigger different commands from the list of available options in graphical user interfaces or in virtual and smart environments. However, continuous tracking techniques are limited in generating discrete events which could be used to execute a predefined set of commands. In this article, we discuss a possibility to encode a set of discrete events by
Parallel and Distributed Multi-Algorithm Circuit Simulation
Dai, Ruicheng
2012-10-19
With the proliferation of parallel computing, parallel computer-aided design (CAD) has received significant research interests. Transient transistor-level circuit simulation plays an important role in digital/analog circuit design and verification...
Parallel Algorithms for Time and Frequency Domain Circuit Simulation
Dong, Wei
2010-10-12
simulation are the main focuses of our research, and correspondingly, the parallel simulation techniques in the following chapters are discussed and proposed based on these two analysis methods.
Decision Making in Fuzzy Discrete Event Systems
Lin, F.; Ying, H.; MacArthur, R. D.; Cohn, J.A.; Barth-Jones, D.; Crane, L.R.
2009-01-01
The primary goal of the study presented in this paper is to develop a novel and comprehensive approach to decision making using fuzzy discrete event systems (FDES) and to apply such an approach to real-world problems. At the theoretical front, we develop a new control architecture of FDES as a way of decision making, which includes a FDES decision model, a fuzzy objective generator for generating optimal control objectives, and a control scheme using both disablement and enforcement. We develop an online approach to dealing with the optimal control problem efficiently. As an application, we apply the approach to HIV/AIDS treatment planning, a technical challenge since AIDS is one of the most complex diseases to treat. We build a FDES decision model for HIV/AIDS treatment based on experts' knowledge, treatment guidelines, clinical trials, patient database statistics, and other available information. Our preliminary retrospective evaluation shows that the approach is capable of generating optimal control objectives for real patients in our AIDS clinic database and is able to apply our online approach to deciding an optimal treatment regimen for each patient. In the process, we have developed methods to resolve the following two new theoretical issues that have not been addressed in the literature: (1) the optimal control problem has state dependent performance index and hence it is not monotonic, (2) the state space of a FDES is infinite. PMID:19562097
State-feedback control of fuzzy discrete-event systems.
Lin, Feng; Ying, Hao
2010-06-01
In a 2002 paper, we combined fuzzy logic with discrete-event systems (DESs) and established an automaton model of fuzzy DESs (FDESs). The model can effectively represent deterministic uncertainties and vagueness, as well as human subjective observation and judgment inherent to many real-world problems, particularly those in biomedicine. We also investigated optimal control of FDESs and applied the results to optimize HIV/AIDS treatments for individual patients. Since then, other researchers have investigated supervisory control problems in FDESs, and several results have been obtained. These results are mostly derived by extending the traditional supervisory control of (crisp) DESs, which are string based. In this paper, we develop state-feedback control of FDESs that is different from the supervisory control extensions. We use state space to describe the system behaviors and use state feedback in control. Both disablement and enforcement are allowed. Furthermore, we study controllability based on the state space and prove that a controller exists if and only if the controlled system behavior is (state-based) controllable. We discuss various properties of the state-based controllability. Aside from novelty, the proposed new framework has the advantages of being able to address a wide range of practical problems that cannot be effectively dealt with by existing approaches. We use the diabetes treatment as an example to illustrate some key aspects of our theoretical results. PMID:19884087
Parallel Finite Element Simulation of Tracer Injection in Oil Reservoirs
Coutinho, Alvaro L. G. A.
Parallel Finite Element Simulation of Tracer Injection in Oil Reservoirs. Alvaro L.G.A. Coutinho. In this work, parallel finite element techniques for the simulation of tracer injection in oil reservoirs are presented. There is a renewed interest in the utilization of finite element approximations in reservoir simulations, mainly
Empirical study of parallel LRU simulation algorithms
NASA Technical Reports Server (NTRS)
Carr, Eric; Nicol, David M.
1994-01-01
This paper reports on the performance of five parallel algorithms for simulating a fully associative cache operating under the LRU (Least-Recently-Used) replacement policy. Three of the algorithms are SIMD, and are implemented on the MasPar MP-2 architecture. Two other algorithms are parallelizations of an efficient serial algorithm on the Intel Paragon. One SIMD algorithm is quite simple, but its cost is linear in the cache size. The two other SIMD algorithms are more complex, but have costs that are independent of the cache size. Both the second and third SIMD algorithms compute all stack distances; the second SIMD algorithm is completely general, whereas the third SIMD algorithm presumes and takes advantage of bounds on the range of reference tags. Both MIMD algorithms implemented on the Paragon are general and compute all stack distances; they differ in one step that may affect their respective scalability. We assess the strengths and weaknesses of these algorithms as a function of problem size and characteristics, and compare their performance on traces derived from execution of three SPEC benchmark programs.
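The stack distances these algorithms compute determine hit rates for every cache size at once: a reference hits in an LRU cache of capacity C exactly when its stack depth is at most C. A simple serial sketch of the underlying computation (not one of the paper's five parallel algorithms):

```python
def stack_distances(trace):
    """LRU stack distance of each reference: its depth in the LRU stack
    (1 = most recently used) at reuse time, or None on first use."""
    stack = []   # most recently used address at the front
    dists = []
    for addr in trace:
        if addr in stack:
            dists.append(stack.index(addr) + 1)
            stack.remove(addr)
        else:
            dists.append(None)   # cold miss: infinite stack distance
        stack.insert(0, addr)
    return dists

# Cyclic trace over three addresses: every reuse occurs at depth 3,
# so any LRU cache holding at least 3 lines captures all reuses.
dists = stack_distances(list("abcabc"))
```

This naive version costs O(cache size) per reference, which mirrors the paper's first SIMD algorithm; the more complex algorithms avoid that linear cost.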
Parallel Numerical Simulations of Water Reservoirs
NASA Astrophysics Data System (ADS)
Torres, Pedro; Mangiavacchi, Norberto
2010-11-01
The study of the water flow and scalar transport in water reservoirs is important for the determination of the water quality during the initial stages of the reservoir filling and during the life of the reservoir. For this scope, a parallel 2D finite element code for solving the incompressible Navier-Stokes equations coupled with scalar transport was implemented using the message-passing programming model, in order to perform simulations of hydropower water reservoirs in a computer cluster environment. The spatial discretization is based on the MINI element that satisfies the Babuska-Brezzi (BB) condition, which provides sufficient conditions for a stable mixed formulation. All the distributed data structures needed in the different stages of the code, such as preprocessing, solving and post processing, were implemented using the PETSc library. The resulting linear systems for the velocity and the pressure fields were solved using the projection method, implemented by an approximate block LU factorization. In order to increase the parallel performance in the solution of the linear systems, we employ the static condensation method for solving the intermediate velocity at vertex and centroid nodes separately. We compare performance results of the static condensation method with the approach of solving the complete system. In our tests the static condensation method shows better performance for large problems, at the cost of an increased memory usage. Performance results for other intensive parts of the code in a computer cluster are also presented.
A polymorphic reconfigurable emulator for parallel simulation
NASA Technical Reports Server (NTRS)
Parrish, E. A., Jr.; Mcvey, E. S.; Cook, G.
1980-01-01
Microprocessor and arithmetic support chip technology was applied to the design of a reconfigurable emulator for real time flight simulation. The system developed consists of a master control system to perform all man machine interactions and to configure the hardware to emulate a given aircraft, and numerous slave compute modules (SCMs) which comprise the parallel computational units. It is shown that all parts of the state equations can be worked on simultaneously but that the algebraic equations cannot (unless they are slowly varying). Attempts to obtain algorithms that will allow parallel updates are reported. The word length and step size to be used in the SCMs is determined and the architecture of the hardware and software is described.
Parallel multiscale simulations of a brain aneurysm
Grinberg, Leopold; Fedosov, Dmitry A.; Karniadakis, George Em
2012-01-01
Cardiovascular pathologies, such as a brain aneurysm, are affected by the global blood circulation as well as by the local microrheology. Hence, developing computational models for such cases requires the coupling of disparate spatial and temporal scales often governed by diverse mathematical descriptions, e.g., by partial differential equations (continuum) and ordinary differential equations for discrete particles (atomistic). However, interfacing atomistic-based with continuum-based domain discretizations is a challenging problem that requires both mathematical and computational advances. We present here a hybrid methodology that enabled us to perform the first multi-scale simulations of platelet depositions on the wall of a brain aneurysm. The large scale flow features in the intracranial network are accurately resolved by using the high-order spectral element Navier-Stokes solver NekTar. The blood rheology inside the aneurysm is modeled using a coarse-grained stochastic molecular dynamics approach (the dissipative particle dynamics method) implemented in the parallel code LAMMPS. The continuum and atomistic domains overlap with interface conditions provided by effective forces computed adaptively to ensure continuity of states across the interface boundary. A two-way interaction is allowed with the time-evolving boundary of the (deposited) platelet clusters tracked by an immersed boundary method. The corresponding heterogeneous solvers (NekTar and LAMMPS) are linked together by a computational multilevel message passing interface that facilitates modularity and high parallel efficiency. Results of multiscale simulations of clot formation inside the aneurysm in a patient-specific arterial tree are presented. We also discuss the computational challenges involved and present scalability results of our coupled solver on up to 300K computer processors. Validation of such coupled atomistic-continuum models is a main open issue that has to be addressed in future work. 
PMID:23734066
Parallel multiscale simulations of a brain aneurysm
NASA Astrophysics Data System (ADS)
Grinberg, Leopold; Fedosov, Dmitry A.; Karniadakis, George Em
2013-07-01
Cardiovascular pathologies, such as a brain aneurysm, are affected by the global blood circulation as well as by the local microrheology. Hence, developing computational models for such cases requires the coupling of disparate spatial and temporal scales often governed by diverse mathematical descriptions, e.g., by partial differential equations (continuum) and ordinary differential equations for discrete particles (atomistic). However, interfacing atomistic-based with continuum-based domain discretizations is a challenging problem that requires both mathematical and computational advances. We present here a hybrid methodology that enabled us to perform the first multiscale simulations of platelet depositions on the wall of a brain aneurysm. The large scale flow features in the intracranial network are accurately resolved by using the high-order spectral element Navier-Stokes solver NekTar. The blood rheology inside the aneurysm is modeled using a coarse-grained stochastic molecular dynamics approach (the dissipative particle dynamics method) implemented in the parallel code LAMMPS. The continuum and atomistic domains overlap with interface conditions provided by effective forces computed adaptively to ensure continuity of states across the interface boundary. A two-way interaction is allowed with the time-evolving boundary of the (deposited) platelet clusters tracked by an immersed boundary method. The corresponding heterogeneous solvers (NekTar and LAMMPS) are linked together by a computational multilevel message passing interface that facilitates modularity and high parallel efficiency. Results of multiscale simulations of clot formation inside the aneurysm in a patient-specific arterial tree are presented. We also discuss the computational challenges involved and present scalability results of our coupled solver on up to 300K computer processors. Validation of such coupled atomistic-continuum models is a main open issue that has to be addressed in future work.
PARALLEL MOLECULAR DYNAMICS TECHNIQUES FOR THE SIMULATION OF ANISOTROPIC SYSTEMS
Wilson, Mark R.
PARALLEL MOLECULAR DYNAMICS TECHNIQUES FOR THE SIMULATION OF ANISOTROPIC SYSTEMS. M. R. WILSON. 1.1. THE NEED FOR PARALLELISM. With the rapid increases in the speed of modern computers over the past few years, many people pose the question, "why bother with parallel computing?" For many researchers
PARALLEL COMPUTER SIMULATION TECHNIQUES FOR THE STUDY OF MACROMOLECULES
Wilson, Mark R.
PARALLEL COMPUTER SIMULATION TECHNIQUES FOR THE STUDY OF MACROMOLECULES. Mark R. Wilson and Jaroslav for molecular dynamics, involving replicated data techniques; and go on to show how parallel performance can be achieved with even the slowest of communication links (Ethernet). Finally, parallel techniques for conducting Monte
Improving the performance of parallel relaxation-based circuit simulators
Gih-guang Hung; Yen-cheng Wen; Kyle A. Gallivan; Resve A. Saleh
1993-01-01
Describes methods of increasing parallelism, and thereby improving the performance, of waveform-relaxation-based parallel circuit simulators. The key contribution is the use of parallel nonlinear relaxation and parallel model evaluation to solve large subcircuits that may lead to load-balancing problems. These large subcircuits are further partitioned and solved on clusters of tightly coupled multiprocessors. This paper describes a general hybrid/hierarchical approach
MAPS: multi-algorithm parallel circuit simulation
Xiaoji Ye; Wei Dong; Peng Li; Sani R. Nassif
2008-01-01
The emergence of multi-core and many-core processors has introduced new opportunities and challenges to EDA research and development. While the availability of increasing parallel computing power holds new promise to address many computing challenges in CAD, the leverage of hardware parallelism can only be possible with a new generation of parallel CAD applications. In this paper, we propose a novel
PARASPICE: A Parallel Circuit Simulator for Shared-Memory Multiprocessors
Gung-chung Yang
1990-01-01
This paper presents a general approach to parallelizing direct method circuit simulation. The approach extracts parallel tasks at the algorithmic level for each compute-intensive module and therefore is suitable for a wide range of shared-memory multiprocessors. The implementation of the approach in SPICE2 resulted in a portable parallel direct circuit simulator, PARASPICE. The superior performance of PARASPICE is demonstrated on
Partitioning strategies for parallel KIVA-4 engine simulations
Torres, D J [Los Alamos National Laboratory; Kong, S C [IOWA STATE UNIV
2008-01-01
Parallel KIVA-4 is described and demonstrated in four different engine geometries. The Message Passing Interface (MPI) was used to parallelize KIVA-4. Partitioning strategies are assessed in light of the fact that cells can become deactivated and activated during the course of an engine simulation, which affects the load balance between processors.
libFAUDES — An open source C++ library for discrete event systems
Thomas Moor; Klaus Schmidt; Sebastian Perk
2008-01-01
The libFAUDES (Friedrich-Alexander University Discrete Event Systems) library is an open source C++ software library for discrete event systems (DES) developed at the University of Erlangen-Nuremberg. The core library supports methods for DES analysis and supervisor synthesis, while a built-in plugin mechanism allows for specialized library extensions. In this paper, we evaluate libFAUDES according to the benchmark
Parallel methods for dynamic simulation of multiple manipulator systems
NASA Technical Reports Server (NTRS)
Mcmillan, Scott; Sadayappan, P.; Orin, David E.
1993-01-01
In this paper, efficient dynamic simulation algorithms for a system of m manipulators, cooperating to manipulate a large load, are developed; their performance, using two possible forms of parallelism on a general-purpose parallel computer, is investigated. One form, temporal parallelism, is obtained with the use of parallel numerical integration methods. A speedup of 3.78 on four processors of CRAY Y-MP8 was achieved with a parallel four-point block predictor-corrector method for the simulation of a four manipulator system. These multi-point methods suffer from reduced accuracy, and when comparing these runs with a serial integration method, the speedup can be as low as 1.83 for simulations with the same accuracy. To regain the performance lost due to accuracy problems, a second form of parallelism is employed. Spatial parallelism allows most of the dynamics of each manipulator chain to be computed simultaneously. Used exclusively in the four processor case, this form of parallelism in conjunction with a serial integration method results in a speedup of 3.1 on four processors over the best serial method. In cases where there are either more processors available or fewer chains in the system, the multi-point parallel integration methods are still advantageous despite the reduced accuracy because both forms of parallelism can then combine to generate more parallel tasks and achieve greater effective speedups. This paper also includes results for these cases.
Parallel filtering in global gyrokinetic simulations
NASA Astrophysics Data System (ADS)
Jolliet, S.; McMillan, B. F.; Villard, L.; Vernay, T.; Angelino, P.; Tran, T. M.; Brunner, S.; Bottino, A.; Idomura, Y.
2012-02-01
In this work, a Fourier solver [B.F. McMillan, S. Jolliet, A. Bottino, P. Angelino, T.M. Tran, L. Villard, Comp. Phys. Commun. 181 (2010) 715] is implemented in the global Eulerian gyrokinetic code GT5D [Y. Idomura, H. Urano, N. Aiba, S. Tokuda, Nucl. Fusion 49 (2009) 065029] and in the global Particle-In-Cell code ORB5 [S. Jolliet, A. Bottino, P. Angelino, R. Hatzky, T.M. Tran, B.F. McMillan, O. Sauter, K. Appert, Y. Idomura, L. Villard, Comp. Phys. Commun. 177 (2007) 409] in order to reduce the memory of the matrix associated with the field equation. This scheme is verified with linear and nonlinear simulations of turbulence. It is demonstrated that the straight-field-line angle is the coordinate that optimizes the Fourier solver, that both linear and nonlinear turbulent states are unaffected by the parallel filtering, and that the k∥ spectrum is independent of plasma size at fixed normalized poloidal wave number.
Parallel magnetic field perturbations in gyrokinetic simulations
Joiner, N.; Hirose, A. [Department of Physics and Engineering Physics, University of Saskatchewan, Saskatoon, Saskatchewan S7N 5E2 (Canada); Dorland, W. [University of Maryland, College Park, Maryland 20742 (United States)
2010-07-15
At low β it is common to neglect parallel magnetic field perturbations on the basis that they are of order β². This is only true if effects of order β are canceled by a term in the ∇B drift also of order β [H. L. Berk and R. R. Dominguez, J. Plasma Phys. 18, 31 (1977)]. To our knowledge this has not been rigorously tested with modern gyrokinetic codes. In this work we use the gyrokinetic code GS2 [Kotschenreuther et al., Comput. Phys. Commun. 88, 128 (1995)] to investigate whether the compressional magnetic field perturbation B∥ is required for accurate gyrokinetic simulations at low β for microinstabilities commonly found in tokamaks. The kinetic ballooning mode (KBM) demonstrates the principle described by Berk and Dominguez strongly, as does the trapped electron mode, in a less dramatic way. The ion and electron temperature gradient (ETG) driven modes do not typically exhibit this behavior; the effects of B∥ are found to depend on the pressure gradients. The terms which are seen to cancel at long wavelength in KBM calculations can be cumulative in the ion temperature gradient case and increase with η_e. The effect of B∥ on the ETG instability is shown to depend on the normalized pressure gradient β′ at constant β.
A Parallel and Accelerated Circuit Simulator with Precise Accuracy
Peter M. Lee; Shinji Ito; Takeaki Hashimoto; Tomomasa Touma; Junji Sato; Goichi Yokomizo; Ic
2002-01-01
We have developed a highly parallel and accelerated circuit simulator which produces precise results for large-scale simulation. We incorporated multithreading in both the model and matrix calculations to achieve not only a factor of 10 acceleration compared to the de facto standard circuit simulator used worldwide, but also to equal or exceed the performance of timing-based event-driven simulators with
DCCB and SCC Based Fast Circuit Partition Algorithm For Parallel SPICE Simulation
Zhou, Xiaowei; Wang, Yu
This paper presents an efficient circuit partition algorithm specially designed for VLSI circuit partition and parallel simulation.
Parallel transistor level circuit simulation using domain decomposition methods
He Peng; Chung-kuan Cheng
2009-01-01
This paper presents an efficient parallel transistor-level full-chip circuit simulation tool with SPICE accuracy. The new approach partitions the circuit into a linear domain and several nonlinear domains based on circuit nonlinearity and connectivity. The linear domain is solved by a parallel fast linear solver, while the nonlinear domains are distributed across different processors and solved by a direct solver. Parallel domain
Development of parallelism for circuit simulation by tearing
H. Onozuka; M. Kanoh; C. Mizuta; T. Nakata; N. Tanabe
1993-01-01
A hierarchical clustering with min-cut exchange method for parallel circuit simulation is presented. Partitioning into subcircuits is near optimum in terms of distribution of computational cost and does not sacrifice the sparsity of the entire matrix. In order to compute the arising dense interconnection matrix in parallel, multilevel and distributed row-based dissection algorithms are used. A processing speedup of
Use of Parallel Tempering for the Simulation of Polymer Melts
Alex Bunker; Burkhard Duenweg; Doros Theodorou
2000-01-01
The parallel tempering algorithm (C. J. Geyer, Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface, 156 (1991)) is based on simulating several systems in parallel, each of which has a slightly different Hamiltonian. The systems are put in equilibrium with each other by stochastic swaps between neighboring Hamiltonians. Previous implementations have mainly focused on the temperature as
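The replica-exchange scheme described in this abstract can be sketched as follows. The double-well energy function, proposal width, and temperature ladder below are illustrative assumptions, not taken from the paper; only the structure (independent Metropolis chains plus stochastic swaps between neighboring replicas) reflects the algorithm.

```python
import math
import random

def parallel_tempering(energy, betas, steps, seed=0):
    """Toy parallel-tempering sketch: one chain per inverse temperature,
    with Metropolis moves plus stochastic swaps between neighbors."""
    rng = random.Random(seed)
    states = [rng.uniform(-2, 2) for _ in betas]   # one replica per beta
    for _ in range(steps):
        # Ordinary Metropolis update within each replica.
        for i, beta in enumerate(betas):
            proposal = states[i] + rng.uniform(-0.5, 0.5)
            dE = energy(proposal) - energy(states[i])
            if dE <= 0 or rng.random() < math.exp(-beta * dE):
                states[i] = proposal
        # Attempt a swap between a random pair of neighboring replicas,
        # accepted with probability min(1, exp((beta_i - beta_j)(E_i - E_j))).
        i = rng.randrange(len(betas) - 1)
        dE = energy(states[i]) - energy(states[i + 1])
        dBeta = betas[i] - betas[i + 1]
        if rng.random() < min(1.0, math.exp(dBeta * dE)):
            states[i], states[i + 1] = states[i + 1], states[i]
    return states

# Double-well energy: hot replicas hop between wells and feed
# configurations down to the cold replica via swaps.
final = parallel_tempering(lambda x: (x * x - 1) ** 2, [0.5, 1.0, 2.0, 4.0], 2000)
print(len(final))  # 4 replicas
```

Swapping whole configurations rather than restarting chains is what lets the cold replica escape local minima, which is the point of the method for polymer melts.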
Development of a Massively-Parallel, Biological Circuit Simulator
Richard L. Schiek; Elebeoba E. May
2003-01-01
Genetic expression and control pathways can be successfully modeled as electrical circuits. Given the vast quantity of genomic data, very large and complex genetic circuits can be constructed. To tackle such problems, the massively-parallel, electronic circuit simulator, Xyce™, is being adapted to address biological problems. Unique to this biocircuit simulator is the ability to simulate not just one or a
Parallelizing Circuit Simulation - A Combined Algorithmic And Specialized Hardware Approach
Jacob White; Nicholas Weiner
Accurate performance estimation of high-density integrated circuits requires the kind of detailed numerical simulation performed in programs like ASTAP [1] and SPICE [2]. Because of the large computation time required for such programs when applied to large circuits, accelerating numerical simulation is an important problem. Parallel processing promises to be a viable approach to accelerating the simulation of large circuits. This paper
New Iterative Linear Solvers For Parallel Circuit Simulation
Reiji Suda
1996-01-01
This thesis discusses iterative linear solvers for parallel transient analysis of large-scale logic circuits. The increasing importance of large-scale circuit simulation is the driving force of the research on efficient parallel circuit simulation. The most time-consuming part of circuit transient analysis is the model evaluation, and the next is the linear solver, which takes about 1/5 of simulation time. Although
Parallel Simulation of a Stochastic Agent/Environment Interaction Model
Vialle, Stéphane
A parallel simulation of a stochastic agent/environment interaction model is presented, of interest for simulating complex systems (such as social ones, e.g. [14, 12], and biological ones, e.g. [30]), applied here to a set of mobile autonomous robots evolving in a structured environment.
Large nonadiabatic quantum molecular dynamics simulations on parallel computers
NASA Astrophysics Data System (ADS)
Shimojo, Fuyuki; Ohmura, Satoshi; Mou, Weiwei; Kalia, Rajiv K.; Nakano, Aiichiro; Vashishta, Priya
2013-01-01
We have implemented a quantum molecular dynamics simulation incorporating nonadiabatic electronic transitions on massively parallel computers to study photoexcitation dynamics of electrons and ions. The nonadiabatic quantum molecular dynamics (NAQMD) simulation is based on Casida's linear response time-dependent density functional theory to describe electronic excited states and Tully's fewest-switches surface hopping approach to describe nonadiabatic electron-ion dynamics. To enable large NAQMD simulations, a series of techniques are employed for efficiently calculating long-range exact exchange correction and excited-state forces. The simulation program is parallelized using hybrid spatial and band decomposition, and is tested for various materials.
Aerodynamic simulation on massively parallel systems
NASA Technical Reports Server (NTRS)
Haeuser, Jochem; Simon, Horst D.
1992-01-01
This paper briefly addresses the computational requirements for the analysis of complete configurations of aircraft and spacecraft currently under design to be used for advanced transportation in commercial applications as well as in space flight. The discussion clearly shows that massively parallel systems are the only alternative which is both cost effective and capable of providing the necessary TeraFlops needed to satisfy the narrow design margins of modern vehicles. It is assumed that the solution of the governing physical equations, i.e., the Navier-Stokes equations, which may be complemented by chemistry and turbulence models, is done on multiblock grids. This technique is situated between the fully structured approach of classical boundary-fitted grids and the fully unstructured tetrahedral grids. A fully structured grid best represents the flow physics, while the unstructured grid gives the best geometrical flexibility. The multiblock grid employed is structured within a block, but completely unstructured on the block level. While a completely unstructured grid is not straightforward to parallelize, the above-mentioned multiblock grid is inherently parallel, in particular for multiple instruction multiple datastream (MIMD) machines. In this paper guidelines are provided for setting up or modifying an existing sequential code so that a direct parallelization on a massively parallel system is possible. Results are presented for three parallel systems, namely the Intel hypercube, the Ncube hypercube, and the FPS 500 system. Some preliminary results for an 8K CM2 machine will also be mentioned. The code run is the two-dimensional grid generation module of Grid, which is a general two-dimensional and three-dimensional grid generation code for complex geometries. A system of nonlinear Poisson equations is solved. This code is also a good test case for complex fluid dynamics codes, since the same data structures are used.
All systems provided good speedups, but message passing MIMD systems seem to be best suited for large multiblock applications.
A conservative approach to parallelizing the Sharks World simulation
NASA Technical Reports Server (NTRS)
Nicol, David M.; Riffe, Scott E.
1990-01-01
Parallelizing a benchmark problem for parallel simulation, the Sharks World, is described. The solution described is conservative, in the sense that no state information is saved and no 'rollbacks' occur. The approach used illustrates both the principal advantage and the principal disadvantage of conservative parallel simulation. The advantage is that by exploiting lookahead an approach was found that dramatically improves the serial execution time and also achieves excellent speedups. The disadvantage is that if the model rules are changed in such a way that the lookahead is destroyed, it is difficult to modify the solution to accommodate the changes.
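The lookahead rule that conservative simulators of this kind rely on can be sketched as follows. The `LogicalProcess` class, the event labels, and the lookahead values are hypothetical illustrations, not the paper's implementation; the sketch only shows why an event is safe to process without rollback once no neighbor can still send an earlier one.

```python
import heapq

class LogicalProcess:
    """Minimal conservative-simulation sketch: an LP may only process an
    event whose timestamp does not exceed the lower bound guaranteed by
    its neighbors' clocks plus their lookahead (so no rollbacks occur)."""
    def __init__(self, name, lookahead):
        self.name = name
        self.lookahead = lookahead   # min delay before this LP can affect others
        self.clock = 0.0
        self.events = []             # local pending events: (time, label)

    def schedule(self, time, label):
        heapq.heappush(self.events, (time, label))

    def safe_bound(self, neighbors):
        # Earliest time at which any neighbor could still send us an event.
        return min(n.clock + n.lookahead for n in neighbors)

    def process_safe(self, neighbors):
        bound = self.safe_bound(neighbors)
        done = []
        while self.events and self.events[0][0] <= bound:
            time, label = heapq.heappop(self.events)
            self.clock = max(self.clock, time)
            done.append((time, label))
        return done

a = LogicalProcess("A", lookahead=5.0)
b = LogicalProcess("B", lookahead=5.0)
b.schedule(3.0, "shark moves")
b.schedule(9.0, "shark eats")
# B can safely process events up to A.clock + A.lookahead = 5.0.
print(b.process_safe([a]))  # [(3.0, 'shark moves')]
```

The event at time 9.0 stays queued until A's clock advances; shrinking the lookahead shrinks the safe window, which is exactly the fragility the abstract's disadvantage describes.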
Analysis of Bounded Time Warp and Comparison with YAWNS
Dickens, Phillip M.
This paper studies an analytic model of parallel discrete-event simulation, comparing the YAWNS conservative synchronization protocol with Bounded
A survey of simulation optimization techniques and procedures
James R. Swisher; Paul D. Hyden; S. H. Jacobson; L. W. Schruben
2000-01-01
Discrete event simulation optimization is a problem of significant interest to practitioners interested in extracting useful information about an actual (or yet to be designed) system that can be modeled using discrete event simulation. This paper presents a brief survey of the literature on discrete event simulation optimization over the past decade (1988 to the present). Swisher et al. (2000)
Parallel Signal Processing and System Simulation using aCe
NASA Technical Reports Server (NTRS)
Dorband, John E.; Aburdene, Maurice F.
2003-01-01
Recently, networked and cluster computation have become very popular for both signal processing and system simulation. A new language is ideally suited for parallel signal processing applications and system simulation since it allows the programmer to explicitly express the computations that can be performed concurrently. In addition, the new C-based parallel language for architecture-adaptive programming, aCe, allows programmers to implement algorithms and system simulation applications on parallel architectures by providing them with the assurance that future parallel architectures will be able to run their applications with a minimum of modification. In this paper, we will focus on some fundamental features of aCe and present a signal processing application (FFT).
Cellular Automata + Parallel Computing = Computational Simulation
Domenico Talia
In recent years a novel method has been added in science to theory and laboratory experiments for studying and solving scientific problems. That method can be defined as computational simulation. Computational simulation is based on the use of computers for modeling and simulation of complex systems in science and engineering. According to this approach, a computer equipped with
SPINET: A Parallel Computing Approach to Spine Simulations
Schneider, Jean-Guy
The approach combines symbolic and modern functional programming. The target application is the human spine. Simulations of the spine help to investigate and better understand the mechanisms of back pain and spinal injury. Two
Coarse Grain Parallel Finite Element Simulations for Incompressible Flows
Grant, P. W.
Parallel simulation of incompressible fluid flows is considered, based on a code previously developed in sequential form for the simulation of incompressible Newtonian and non-Newtonian flows [1, 2].
HOPE: an efficient parallel fault simulator for synchronous sequential circuits
Hyung Ki Lee; Dong Sam Ha
1992-01-01
In this paper, we present HOPE, an efficient parallel fault simulator for synchronous sequential circuits which simulates 32 faults at a time. The key idea incorporated in HOPE is to screen out faults with short propagation paths through single fault propagation. A systematic method of identifying faults with short propagation paths is presented. The proposed method substantially reduces the total number
Exploiting model independence for parallel PCS network simulation
Azzedine Boukerche; Sajal K. Das; Alessandro Fabbri; Oktay Yildiz
1999-01-01
In this paper, we present a parallel simulator (SWiMNet) for PCS networks using a combination of optimistic and conservative paradigms. The proposed methodology exploits event precomputation permitted by model independence within the PCS components. The low percentage of blocked calls is exploited in the channel allocation simulation of precomputed events by means of an optimistic approach.
Parallel Performance of a Combustion Chemistry Simulation
Skinner, Gregg; Eigenmann, Rudolf; Padua, David
Numerical simulations of reactive flow are widely used.
The 3-D massively parallel impact simulations using PCTH
NASA Astrophysics Data System (ADS)
Fang, H. E.; Robinson, A. C.
Simulations of hypervelocity impact problems are performed frequently by government laboratories and contractors for armor/anti-armor applications. These simulations need to deal with shock wave physics phenomena, large material deformation, motion of debris particles, and complex geometries. As a result, memory and processing time requirements are large for detailed, three-dimensional calculations. The large massively parallel supercomputing systems of the future will provide the power necessary to greatly reduce simulation times currently required by shared-memory, vector supercomputers. This paper gives an introduction to PCTH, a next-generation shock wave physics code which is being built at Sandia National Laboratories for massively parallel supercomputers, and demonstrates that massively parallel hydrocodes, such as PCTH, can provide highly-detailed, three-dimensional simulations of armor/anti-armor systems.
Evaluating Contention Management Using Discrete Event Simulation
Demsky, Brian; Dash, Alokika
It can be useful to estimate the potential benefits of a contention manager before implementation. The simulator presented allows researchers to study the behavior and estimate the potential benefits of contention managers.
MODEL-BASED CONTROL SYNTHESIS FOR DISCRETE EVENT SYSTEMS
Mourad Chouikha; Eckehard Schnieder; Bernhard Ober; Siemens AG
In control engineering, models of the controlled systems are the basis for controller synthesis as well as for analytical or simulative examination of open- or closed-loop behaviour. This model-based methodology is being transferred into automation engineering by means of a development environment for the programming of logical controllers. Petri net models of the controlled system
Activity regions for the specification of discrete event systems
Alexandre Muzy; Luc Touraille; Hans Vangheluwe; Olivier Michel; Mamadou Kaba Traoré; David R. C. Hill
2010-01-01
The common view on modeling and simulation of dynamic systems is to focus on the specification of the state of the system and its transition function. Although some interesting challenges remain to efficiently and elegantly support this view, we consider in this paper that this problem is solved. Instead, we propose here to focus on a new point of view
Parallel Computing Environments and Methods for Power Distribution System Simulation
Ning Lu; Z. Todd Taylor; David P. Chassin; Ross T. Guttromson; R. Scott Studham
2004-01-01
The development of cost-effective high-performance parallel computing on multi-processor supercomputers makes it attractive to port excessively time-consuming simulation software from personal computers (PCs) to supercomputers. The power distribution system simulator (PDSS) takes a bottom-up approach and simulates load at the appliance level, where detailed thermal models for appliances are used. This approach works well for a small
Simulation of a word recognition system on two parallel architectures
Yoder, M.A.; Jamieson, L.H. (Purdue Univ., Lafayette, IN (USA). Dept. of Electrical Engineering)
1989-09-01
When designing a parallel architecture it is advantageous to consider the applications for which the architecture will be used. This paper examines the use of two parallel architectures, a single instruction stream multiple data stream (SIMD) machine and a VLSI processor array, to implement an isolated word recognition system. SIMD and VLSI processor array algorithms were written for each of the components of the recognition system. The component parallel algorithms were simulated along with two complete recognition systems, one composed of SIMD algorithms and the other composed of VLSI processor array algorithms.
Predictability Analysis of Distributed Discrete Event Systems
Ye, Lina; Dague, Philippe; Nouioua, Farid
Predictability is an important system property that determines with certainty the future occurrence of a fault based on a model of the system and a sequence of observations. The existing
Wai Kin Chan; Lee W. Schruben
2006-01-01
This paper illustrates the use of mathematical programming in computing gradient estimators. The consistency of these estimators is established under the usual assumptions for IPA gradient estimator consistency. A finite-difference tolerance limit is introduced. For complex discrete-event systems, more concise linear programming representations are developed. These new representations provide a direct way of calculating gradient estimates.
Determination of Timed Transitions in Identified Discrete-Event Models for Fault Detection
Boyer, Edmond
The aim is to model large-scale Discrete Event Systems (DESs) with little a priori knowledge. For this class, known approaches are knowledge based and therefore comprehensive a priori information is required. A generic method for time guard determination of large-scale DESs affected by disturbances is presented.
Discrete-event requirements model for sensor fusion to provide real-time diagnostic feedback
NASA Astrophysics Data System (ADS)
Rokonuzzaman, Mohd; Gosine, Raymond G.
1998-06-01
Minimally-invasive surgical techniques reduce the size of the access corridor and affected zones resulting in limited real-time perceptual information available to the practitioners. A real-time feedback system is required to offset deficiencies in perceptual information. This feedback system acquires data from multiple sensors and fuses these data to extract pertinent information within defined time windows. To perform this task, a set of computing components interact with each other resulting in a discrete event dynamic system. In this work, a new discrete event requirements model for sensor fusion has been proposed to ensure logical and temporal correctness of the operation of the real-time diagnostic feedback system. This proposed scheme models system requirements as a Petri net based discrete event dynamic machine. The graphical representation and quantitative analysis of this model has been developed. Having a natural graphical property, this Petri net based model enables the requirements engineer to communicate intuitively with the client to avoid faults in the early phase of the development process. The quantitative analysis helps justify the logical and temporal correctness of the operation of the system. It has been shown that this model can be analyzed to check the presence of deadlock, reachability, and repetitiveness of the operation of the sensor fusion system. This proposed novel technique to model the requirements of sensor fusion as a discrete event dynamic system has the potential to realize highly reliable real-time diagnostic feedback system for many applications, such as minimally invasive instrumentation.
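The deadlock and reachability analysis mentioned in this abstract can be sketched, for a small bounded Petri net, as a breadth-first search over markings. The two-place net and tuple encoding below are illustrative assumptions, not the authors' sensor-fusion model; they only show the kind of check a Petri net requirements model enables.

```python
from collections import deque

def reachable_markings(transitions, initial):
    """Enumerate the reachable markings of a (bounded) Petri net by BFS
    and flag deadlocks, i.e. markings where no transition is enabled.
    A transition is a pair (consume, produce) of per-place token counts."""
    seen = {initial}
    deadlocks = []
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        enabled = False
        for consume, produce in transitions:
            # Enabled iff every input place holds enough tokens.
            if all(marking[p] >= n for p, n in enumerate(consume)):
                enabled = True
                nxt = tuple(m - c + pr
                            for m, c, pr in zip(marking, consume, produce))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        if not enabled:
            deadlocks.append(marking)
    return seen, deadlocks

# Two places; t0 moves a token from place 0 to place 1.
transitions = [((1, 0), (0, 1))]
markings, deadlocks = reachable_markings(transitions, (2, 0))
print(sorted(markings))   # [(0, 2), (1, 1), (2, 0)]
print(deadlocks)          # [(0, 2)]
```

For a requirements model, the same search answers the abstract's three questions at once: which states are reachable, whether any of them deadlock, and (by checking for cycles among the reachable markings) whether the operation is repetitive.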
On the Development and Implementation of a Matrix-Based Discrete Event Controller
José Mireles; Frank Lewis
2001-01-01
A supervisory controller for Discrete Event (DE) Systems is presented that uses a novel matrix formulation. This matrix formulation makes it possible to directly write down the DE controller from standard manufacturing tools such as the Bill of Materials or the assembly tree. The matrices also make it straightforward to actually implement the DE controller on a manufacturing workcell for sequencing
Diagnosis of asynchronous discrete-event systems: a net unfolding approach
ALBERT BENVENISTE; ERIC FABRE; Stefan Haar; Claude Jard
2003-01-01
In this paper, we consider the diagnosis of asynchronous discrete event systems. We follow a so-called true concurrency approach, in which no global state and no global time is available. Instead, we use only local states in combination with a partial order model of time. Our basic mathematical tool is that of net unfoldings originating from the Petri net research
Diagnosability Analysis and Sensor Selection in Discrete-Event Systems with Permanent Failures
J. Pan; S. Hashtrudi-Zad
2007-01-01
In this paper, the problems of failure diagnosability and sensor selection for failure detection and isolation in discrete-event systems are studied. The system could operate in normal condition, or in a set of faulty conditions each corresponding to a combination of failure modes of the system. A polynomial algorithm is proposed that verifies diagnosability by examining the distinguishability of two
The aim is to motivate possibilities of modeling nosocomial infection dynamics in the context of a hospital. A model is formulated to estimate key population-level nosocomial transmission parameters and to inform isolation procedures and model development. Key words: delay equations, discrete events, nosocomial infection dynamics, surveillance.
Fault Detection and Isolation in Manufacturing Systems with an Identified Discrete Event Model
Paris-Sud XI, Université de
In this paper a generic method for fault detection and isolation (FDI) in manufacturing systems is considered, with the controller built on the basis of observed fault-free system behavior. An identification algorithm known from
Joon Sung Hong; Hae-Sang Song; Tag Gon Kim; Kyu Ho Park
1997-01-01
We present a time-domain extension of the hierarchical and modular Discrete Event System Specification (DEVS) formalism. This extension is important for establishing a seamless real-time software development framework. Formalisms help describe a system unambiguously. If formal models are implemented without any consistent framework, however, it is hard to guarantee that there is no semantic gap between models and code. Real-Time
Modeling a drilling control system, as a Discrete-Event-System
Nejm Saadallah; Hein Meling; Benoit Daireaux
2011-01-01
The application of Discrete Event Systems to model a drilling control system is investigated in this paper. Issues in the drilling process include enhancing drilling performance and reducing the risk of accidents. The drilling control systems in use today were designed to meet earlier requirements which unfortunately no longer hold. Control system engineers meet great difficulties
UML2ALLOY: A TOOL FOR LIGHTWEIGHT MODELLING OF DISCRETE EVENT SYSTEMS
Bordbar, Behzad
Alloy allows lightweight modelling and automatic analysis of a wide variety of systems. On the other hand, the Unified Modelling Language (UML) is in widespread use. UML2Alloy combines the UML and Alloy into a single CASE tool, which aims to take advantage of the positive aspects of both.
Centralized and decentralized asynchronous optimization of stochastic discrete-event systems
F. J. Vazquez-Abad; C. G. Cassandras; V. Julka
1998-01-01
We propose and analyze centralized and decentralized asynchronous control structures for the parametric optimization of stochastic discrete-event systems (DES) consisting of K distributed components. We use a stochastic approximation type of optimization scheme driven by gradient estimates of a global performance measure with respect to local control parameters. The estimates are obtained in distributed and asynchronous fashion at the K
A hybrid parallel framework for the cellular Potts model simulations
Jiang, Yi [Los Alamos National Laboratory; He, Kejing [SOUTH CHINA UNIV; Dong, Shoubin [SOUTH CHINA UNIV
2009-01-01
The Cellular Potts Model (CPM) has been widely used for biological simulations. However, most current implementations are either sequential or approximate, and cannot be used for large-scale complex 3D simulations. In this paper we present a hybrid parallel framework for CPM simulations. The time-consuming PDE solving, cell division, and cell reaction operations are distributed to clusters using the Message Passing Interface (MPI). The Monte Carlo lattice update is parallelized on shared-memory SMP systems using OpenMP. Because the Monte Carlo lattice update is much faster than the PDE solving and SMP systems are more and more common, this hybrid approach achieves good performance and high accuracy at the same time. Based on the parallel Cellular Potts Model, we studied avascular tumor growth using a multiscale model. The application and performance analysis show that the hybrid parallel framework is quite efficient. The hybrid parallel CPM can be used for the large-scale simulation (~10^8 sites) of complex collective behavior of numerous cells (~10^6).
Parallel Monte Carlo Simulation for control system design
NASA Technical Reports Server (NTRS)
Schubert, Wolfgang M.
1995-01-01
The research during the 1993/94 academic year addressed the design of parallel algorithms for stochastic robustness synthesis (SRS). SRS uses Monte Carlo simulation to compute probabilities of system instability and other design-metric violations. The probabilities form a cost function which is used by a genetic algorithm (GA). The GA searches for the stochastic optimal controller. The existing sequential algorithm was analyzed and modified to execute in a distributed environment. For this, parallel approaches to Monte Carlo simulation and genetic algorithms were investigated. Initial empirical results are available for the KSR1.
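The Monte Carlo step described in this abstract, estimating a probability of instability by sampling uncertain parameters, can be sketched as follows. The second-order plant and the parameter ranges are hypothetical illustrations, not taken from the report; only the sampling-and-counting structure reflects the method.

```python
import random

def instability_probability(samples=100_000, seed=1):
    """Sketch of the Monte Carlo step in stochastic robustness analysis:
    sample uncertain plant parameters and estimate the probability that
    the closed loop is unstable. Plant and ranges are illustrative."""
    rng = random.Random(seed)
    unstable = 0
    for _ in range(samples):
        # Characteristic polynomial s^2 + a*s + b with uncertain a, b.
        a = rng.uniform(-0.2, 1.0)   # damping coefficient, may go negative
        b = rng.uniform(0.5, 1.5)    # stiffness coefficient
        # Routh-Hurwitz for second order: stable iff a > 0 and b > 0.
        if not (a > 0 and b > 0):
            unstable += 1
    return unstable / samples

p = instability_probability()
print(p)  # close to P(a <= 0) = 0.2/1.2, i.e. about 0.167
```

In the report's setting this estimate is the cost a genetic algorithm minimizes over controller parameters, and each sample is independent, which is what makes the step easy to distribute across processors.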
Fault Diagnosis of Continuous Systems Using Discrete-Event Methods
Daigle, Matthew; Koutsoukos, Xenofon; Biswas, Gautam
Fault diagnosis is crucial for ensuring the safe operation of complex engineering systems. This paper presents a novel discrete-event approach to fault isolation in systems with complex continuous dynamics.
Coglio, Alessandro
We propose a formalism used in the manufacturing field for discrete event controllers of plants [3]. In this paper, the design of a Colored Petri Net as a discrete event controller for a large plant is considered.
Makino, Jun
GRAPE-6 is a special-purpose parallel computer for scientific simulation with a peak speed exceeding one teraflops, about one hundred times faster than the GRAPE-4 system. The positioning of the GRAPE-6 system compared to general-purpose parallel computers is briefly discussed.
Simulating the Immune Response on a Distributed Parallel Computer
NASA Astrophysics Data System (ADS)
Castiglione, F.; Bernaschi, M.; Succi, S.
The application of ideas and methods of statistical mechanics to problems of biological relevance is one of the most promising frontiers of theoretical and computational mathematical physics [1, 2]. Among others, the computer simulation of the immune system dynamics stands out as one of the prominent candidates for this type of investigation. In recent years immunological research has been drawing increasing benefits from the resort to advanced mathematical modeling on modern computers [3, 4]. Among others, Cellular Automata (CA), i.e., fully discrete dynamical systems evolving according to boolean laws, appear to be extremely well suited to computer simulation of biological systems [5]. A prominent example of immunological CA is represented by the Celada-Seiden automaton, which has proven capable of providing several new insights into the dynamics of the immune system response. To date, the Celada-Seiden automaton was not in a position to exploit the impressive advances of computer technology, and notably parallel processing, simply because no parallel version of this automaton had been developed yet. In this paper we fill this gap and describe a parallel version of the Celada-Seiden cellular automaton aimed at simulating the dynamic response of the immune system. Details on the parallel implementation as well as performance data on the IBM SP2 parallel platform are presented and commented on.
Parallelization of Rocket Engine Simulator Software (PRESS)
NASA Technical Reports Server (NTRS)
Cezzar, Ruknet
1998-01-01
We have outlined our work in the last half of the funding period. We have shown how a demo package for RESSAP using MPI can be done. However, we also mentioned the difficulties with the UNIX platform. We have reiterated some of the suggestions made during the presentation of our progress at the Fourth Annual HBCU Conference. Although we have discussed, in some detail, how the TURBDES/PUMPDES software can be run in parallel using MPI, at present we are unable to experiment any further with either MPI or PVM. Because X windows is not implemented, we are also not able to experiment further with XPVM, which, it will be recalled, has a nice GUI interface. There are also some concerns, on our part, about MPI being an appropriate tool. The best thing about MPI is that it is in the public domain. Although plenty of documentation exists for the intricacies of using MPI, little information is available on its actual implementations. Other than very typical, somewhat contrived examples, such as the Jacobi algorithm for solving Laplace's equation, there are few examples which can readily be applied to real situations such as ours. In effect, the review of the literature on both MPI and PVM, and there is a lot of it, indicates something similar to the enormous effort that was spent on LISP and LISP-like languages as tools for artificial intelligence research. During the development of a book on programming languages [12], when we searched the literature for very simple examples like taking averages, reading and writing records, or multiplying matrices, we could hardly find any! Yet so much was said and done on that topic in academic circles. It appears that we face the same problem with MPI, where, despite significant documentation, we could not find even a simple example that supports coarse-grain parallelism involving only a few processes.
From the foregoing, it appears that a new direction may be required for more productive research during the extension period (10/19/98 - 10/18/99). At the least, the research would need to be done on Windows 95/Windows NT based platforms. Moreover, with the acquisition of the Lahey Fortran package for the PC platform, and the existing Borland C++ 5.0, we can work on C++ wrapper issues. We have carefully studied the blueprint for the Space Transportation Propulsion Integrated Design Environment for the next 25 years [13] and found the inclusion of HBCUs in that effort encouraging. Especially over the long period for which a map is provided, there is no doubt that HBCUs will grow and become better equipped to do meaningful research. In the shorter period, as was suggested in our presentation at the HBCU conference, some key decisions regarding the aging Fortran-based software for rocket propellants will need to be made. One important issue is whether object-oriented languages such as C++ or Java should be used for distributed computing. Whether "distributed computing" is necessary for the existing software at all is yet another, larger question to be tackled.
Parallel computer simulation of autotransformer-fed AC traction networks
R. John Hill; I. H. Cevik
1990-01-01
Simulation of electrical conditions in power lines and rails in an on-line mode in AC and DC electrified railways is necessary for effective engineering design. A description is given of the use of a parallel computer to simulate voltages along an autotransformer-fed AC railway. The algorithm, based on the solution of algebraic equations for a single train, produces a faster-than-real-time
A Bounded-Optimistic, Parallel Beta-Binders Simulator
Stefan Leye; Adelinde M. Uhrmacher; Corrado Priami
2008-01-01
Compartments play an important role in molecular and cell biology modeling, which motivated the development of BETA-BINDERS, a formalism which is an extension of the pi-CALCULUS. To execute BETA-BINDERS models, sophisticated simulators are required to ensure a sound and efficient execution. Parallel and distributed simulation represents one means to achieve the latter. However, stochastically scheduled events hamper the definition of
Xyce parallel electronic simulator reference guide, version 6.0.
Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Verley, Jason C.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Warrender, Christina E.; Baur, David G. [Raytheon, Albuquerque, NM
2013-08-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide [1]. The focus of this document is to list, as exhaustively as possible, the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide [1].
Xyce Parallel Electronic Simulator : reference guide, version 4.1.
Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Keiter, Eric Richard; Pawlowski, Roger Patrick
2009-02-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list, as exhaustively as possible, the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide.
PARALLEL SIMULATION OF LARGE-SCALE WATER DISTRIBUTION SYSTEMS
Bargiela, Andrzej
J.K. Hartley, A. Bargiela, R.J. Cant ... in the specific context of water distribution networks. Today's increasing complexity of such systems, together ... recovery from measurement failure situations. INTRODUCTION: Water distribution systems are large
Efficient Parallel Simulations in Support of Medical Device Design
Marek Behr; Mike Nicolai; Markus Probst
2007-01-01
A parallel solver for incompressible fluid flow simulation, used in biomedical device design among other applications, is discussed. The major compute- and communication-intensive portions of the code are described. Using unsteady flow in a complex implantable axial blood pump as a model problem, scalability characteristics of the solver are briefly examined. The code that exhibited
Balanced Decomposition for Power System Simulation on Parallel Computers
Catholic University of Chile (Universidad Católica de Chile)
Felipe Morales, Hugh ... iterations are attempted using Jacobian-matrix balanced decomposition. The partition attends both to balance among the blocks and to the proximity error with respect to the original matrix.
Parallel electronic circuit simulation on the iPSC system
C.-P. Yuan; R. Lucas; P. Chan; R. Dutton
1988-01-01
A parallel circuit simulator was implemented on the iPSC system. Concurrent model evaluation, hierarchical BBDF (bordered block diagonal form) reordering, and distributed multifrontal decomposition to solve the sparse matrix are used. A speedup of six times has been achieved on an eight-processor iPSC hypercube system
Point-centered domain decomposition for parallel molecular dynamics simulation
R. Koradi; M. Billeter; P. Güntert
2000-01-01
A new algorithm for molecular dynamics simulations of biological macromolecules on parallel computers, point-centered domain decomposition, is introduced. The molecular system is divided into clusters that are assigned to individual processors. Each cluster is characterized by a center point and comprises all atoms that are closer to its center point than to the center point of any other cluster. The
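The cluster-assignment rule described above (each atom belongs to the cluster whose center point is nearest, one cluster per processor) can be sketched in a few lines. The array names and sizes below are illustrative, not taken from the paper:

```python
import numpy as np

def assign_clusters(atoms, centers):
    """Point-centered domain decomposition: each atom joins the cluster
    whose center point is nearest (one cluster per processor)."""
    # atoms: (N, 3) coordinates; centers: (P, 3) cluster center points
    d2 = ((atoms[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)          # owning-cluster index for every atom

rng = np.random.default_rng(0)
atoms = rng.random((1000, 3))         # hypothetical atom coordinates
centers = rng.random((4, 3))          # one center point per processor
owner = assign_clusters(atoms, centers)
print(owner.shape)                    # one cluster label per atom
```

This nearest-center rule yields Voronoi-cell-shaped domains; in a real MD run the centers would be updated as the atoms move so the per-processor load stays balanced.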
Parallel-in-time molecular-dynamics simulations
L. Baffico; S. Bernard; Y. Maday; G. Turinici; G. Zérah
2002-01-01
While there has been much progress in the field of multiscale simulations in the space domain, in particular due to efficient parallelization techniques, much less is known about how to perform similar approaches in the time domain. In this paper we show on two examples that, provided we can describe in a rough but still accurate way the system
A Parallel Engine for Graphical Interactive Molecular Dynamics Simulations
Eduardo Rocha Rodrigues; Airam Jonatas Preto; Stephan Stephany
2004-01-01
The current work proposes a parallel implementation for interactive molecular dynamics (MD) simulations. The interactive capability is modeled by finite automata that are executed on the processing nodes. Any interaction implies a communication between the user interface and the finite automata. The ADKS, an interactive sequential MD code that provides graphical output, was chosen as a case study. A
Leaky Modes in Parallel-Plate EMP Simulators
Ali Rushdi; Ronald Menendez; Raj Mittra; Shung-Wu Lee
1978-01-01
The finite-width parallel-plate waveguide is a useful tool as an EMP simulator, and its characteristics have recently been investigated by a number of workers. In this paper, we report the results of a study of the modal fields in such a waveguide. Once these modal fields and their corresponding wavenumbers are known, the problem of source excitation in such a
Molecular simulation of rheological properties using massively parallel supercomputers
Bhupathiraju, R.K.; Cui, S.T.; Gupta, S.A.; Cummings, P.T. [Univ. of Tennessee, Knoxville, TN (United States). Dept of Chemical Engineering; Cochran, H.D. [Oak Ridge National Lab., TN (United States)
1996-11-01
Advances in parallel supercomputing now make possible molecular-based engineering and science calculations that will soon revolutionize many technologies, such as those involving polymers and those involving aqueous electrolytes. We have developed a suite of message-passing codes for classical molecular simulation of such complex fluids and amorphous materials and have completed a number of demonstration calculations of problems of scientific and technological importance with each. In this paper, we will focus on the molecular simulation of rheological properties, particularly viscosity, of simple and complex fluids using parallel implementations of non-equilibrium molecular dynamics. Such calculations represent significant challenges computationally because, in order to reduce the thermal noise in the calculated properties within acceptable limits, large systems and/or long simulated times are required.
Reusable Component Model Development Approach for Parallel and Distributed Simulation
Zhu, Feng; Yao, Yiping; Chen, Huilong; Yao, Feng
2014-01-01
Model reuse is a key issue to be resolved in parallel and distributed simulation at present. However, component models built by different domain experts usually have diversiform interfaces, couple tightly, and bind closely with simulation platforms. As a result, they are difficult to reuse across different simulation platforms and applications. To address the problem, this paper first proposes a reusable component model framework. Based on this framework, our reusable model development approach is then elaborated, which contains two phases: (1) domain experts create simulation computational modules observing three principles to achieve their independence; (2) a model developer encapsulates these simulation computational modules with six standard service interfaces to improve their reusability. The case study of a radar model indicates that a model developed using our approach has good reusability and is easy to use in different simulation platforms and applications. PMID:24729751
Simulation optimization: a survey of simulation optimization techniques and procedures
James R. Swisher; Paul D. Hyden; Sheldon H. Jacobson; Lee W. Schruben
2000-01-01
Discrete-event simulation optimization is a problem of significant interest to practitioners interested in extracting useful information about an actual (or yet to be designed) system that can be modeled using discrete-event simulation. This paper presents a brief survey of the literature on discrete-event simulation optimization over the past decade (1988 to the present). Swisher et al. (2000) provides a more
Potts-model grain growth simulations: Parallel algorithms and applications
Wright, S.A.; Plimpton, S.J.; Swiler, T.P. [and others
1997-08-01
Microstructural morphology and grain boundary properties often control the service properties of engineered materials. This report uses the Potts model to simulate the development of microstructures in realistic materials. Three areas of microstructural morphology simulation were studied. They include the development of massively parallel algorithms for Potts-model grain growth simulations, modeling of mass transport via diffusion in these simulated microstructures, and the development of a gradient-dependent Hamiltonian to simulate columnar grain growth. Potts grain growth models for massively parallel supercomputers were developed for the conventional Potts model in both two and three dimensions. Simulations using these parallel codes showed self-similar grain growth and no finite-size effects for previously unapproachable large-scale problems. In addition, new enhancements to the conventional Metropolis algorithm used in the Potts model were developed to accelerate the calculations. These techniques enable both the sequential and parallel algorithms to run faster and use essentially an infinite number of grain orientation values to avoid non-physical grain coalescence events. Mass transport phenomena in polycrystalline materials were studied in two dimensions using numerical diffusion techniques on microstructures generated using the Potts model. The results of the mass transport modeling showed excellent quantitative agreement with one-dimensional diffusion problems; however, the results also suggest that transient multi-dimensional diffusion effects cannot be parameterized as the product of the grain boundary diffusion coefficient and the grain boundary width. Instead, both properties are required. Gradient-dependent grain growth mechanisms were included in the Potts model by adding an extra term to the Hamiltonian. Under normal grain growth, the primary driving term is the curvature of the grain boundary, which is included in the standard Potts-model Hamiltonian.
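A minimal sequential sketch of the zero-temperature Metropolis dynamics underlying such Potts grain-growth simulations may help fix ideas; the lattice size, orientation count, and sweep count below are arbitrary, and the report's parallel decomposition and acceleration techniques are omitted:

```python
import random

def potts_sweep(spins, L, rng):
    """One sweep of a 2-D Potts grain-growth model: a randomly chosen site
    may adopt a neighbour's orientation if that does not increase the
    number of unlike-neighbour bonds (zero-temperature dynamics)."""
    for _ in range(L * L):
        i, j = rng.randrange(L), rng.randrange(L)
        nbrs = [spins[(i + 1) % L][j], spins[(i - 1) % L][j],
                spins[i][(j + 1) % L], spins[i][(j - 1) % L]]
        old = spins[i][j]
        new = rng.choice(nbrs)                 # try a neighbour's orientation
        e_old = sum(n != old for n in nbrs)    # boundary energy before
        e_new = sum(n != new for n in nbrs)    # boundary energy after
        if e_new <= e_old:                     # accept non-increasing moves
            spins[i][j] = new

L, Q = 32, 48                                  # lattice size, orientations
rng = random.Random(1)
spins = [[rng.randrange(Q) for _ in range(L)] for _ in range(L)]
for _ in range(20):
    potts_sweep(spins, L, rng)
# coarsening: grains grow, so fewer distinct orientations survive
print(len({s for row in spins for s in row}))
```

Flipping only toward neighbouring orientations (rather than any of the Q values) is one of the standard tricks for making very large Q practical, in the spirit of the report's remark about using essentially unlimited orientation values.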
Verifying Ptolemy II Discrete-Event Models Using Real-Time Maude
Kyungmin Bae; Peter Csaba Ölveczky; Thomas Huining Feng; Stavros Tripakis
2009-01-01
This paper shows how Ptolemy II discrete-event (DE) models can be formally analyzed using Real-Time Maude. We formalize in Real-Time Maude the semantics of a subset of hierarchical Ptolemy II DE models, and explain how the code generation infrastructure of Ptolemy II has been used to automatically synthesize a Real-Time Maude verification model from a Ptolemy II design model. This
Yuanlin Wen; Muder Jeng; Lider Jeng; Pei-shu Fan
2006-01-01
This paper presents an intelligent systematic methodology for enhancing the diagnosability of discrete event systems by adding sensors. The methodology consists of the following iterative steps. First, Petri nets are used to model the target system. Then, an algorithm of polynomial complexity is adopted to analyze a sufficient condition of diagnosability of the modeled system. Here, diagnosability is defined in the
Fault Diagnosis in Discrete Event Systems Modeled by Partially Observed Petri Nets
Yu Ru; Christoforos N. Hadjicostis
2009-01-01
In this paper, we study fault diagnosis in discrete event systems modeled by partially observed Petri nets, i.e., Petri nets equipped with sensors that allow observation of the number of tokens in some of the places and/or partial observation of the firing of some of the transitions. We assume that the Petri net model is accompanied by a (possibly implicit)
Fault Diagnosis in Discrete-Event Systems with Incomplete Models: Learnability and Diagnosability.
Kwong, Raymond H; Yonge-Mallo, David L
2015-07-01
Most model-based approaches to fault diagnosis of discrete-event systems require a complete and accurate model of the system to be diagnosed. However, the discrete-event model may have arisen from abstraction and simplification of a continuous time system, or through model building from input-output data. As such, it may not capture the dynamic behavior of the system completely. In a previous paper, we addressed the problem of diagnosing faults given an incomplete model of the discrete-event system. We presented the learning diagnoser which not only diagnoses faults, but also attempts to learn missing model information through parsimonious hypothesis generation. In this paper, we study the properties of learnability and diagnosability. Learnability deals with the issue of whether the missing model information can be learned, while diagnosability corresponds to the ability to detect and isolate a fault after it has occurred. We provide conditions under which the learning diagnoser can learn missing model information. We define the notions of weak and strong diagnosability and also give conditions under which they hold. PMID:25204002
Casting pearls ballistically: Efficient massively parallel simulation of particle deposition
Lubachevsky, B.D. [AT&T Bell Laboratories, Murray Hill, NJ (United States)]; Privman, V. [Clarkson Univ., Potsdam, NY (United States)]; Roy, S.C. [College of William and Mary, Williamsburg, VA (United States)]
1996-06-01
We simulate ballistic particle deposition wherein a large number of spherical particles are "cast" vertically over a planar horizontal surface. Upon first contact (with the surface or with a previously deposited particle) each particle stops. This model helps material scientists to study adsorption and sediment formation. The model is sequential, with particles deposited one by one. We have found an equivalent formulation using a continuous-time random process, and we simulate the latter in parallel using a method similar to the one previously employed for simulating Ising spins. We augment the parallel algorithm for simulating Ising spins with several techniques aimed at increasing the efficiency of producing the particle configuration and collecting statistics. Some of these techniques are similar to earlier ones. We implement the resulting algorithm on a 16K PE MasPar MP-1 and a 4K PE MasPar MP-2. The parallel code runs on MasPar computers nearly two orders of magnitude faster than an optimized sequential code runs on a fast workstation. 17 refs., 9 figs.
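The sequential model described (particles cast vertically, stopping at first contact) can be illustrated with a standard lattice ballistic-deposition sketch. This 1+1-dimensional simplification is illustrative only; it is not the authors' off-lattice sphere model or their parallel continuous-time reformulation:

```python
import random

def ballistic_deposition(width, n_particles, rng):
    """Lattice ballistic deposition: each particle falls straight down in a
    random column and sticks at first contact, i.e. either on top of its own
    column or at the height of a taller neighbouring column (overhangs)."""
    h = [0] * width                       # surface height profile
    for _ in range(n_particles):
        c = rng.randrange(width)          # column the particle is cast into
        left = h[c - 1] if c > 0 else 0
        right = h[c + 1] if c < width - 1 else 0
        h[c] = max(h[c] + 1, left, right) # stop at first contact
    return h

rng = random.Random(42)
heights = ballistic_deposition(64, 2000, rng)
print(max(heights), sum(heights))
```

Because each deposit raises exactly one column by at least one unit, the deposit is porous: the summed height grows at least as fast as the particle count, which is the signature of the overhang-forming sticking rule.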
Numerical simulation of supersonic wake flow with parallel computers
Wong, C.C. [Sandia National Labs., Albuquerque, NM (United States); Soetrisno, M. [Amtec Engineering, Inc., Bellevue, WA (United States)
1995-07-01
Simulating a supersonic wake flow field behind a conical body is a computing intensive task. It requires a large number of computational cells to capture the dominant flow physics and a robust numerical algorithm to obtain a reliable solution. High performance parallel computers with unique distributed processing and data storage capability can provide this need. They have larger computational memory and faster computing time than conventional vector computers. We apply the PINCA Navier-Stokes code to simulate a wind-tunnel supersonic wake experiment on Intel Gamma, Intel Paragon, and IBM SP2 parallel computers. These simulations are performed to study the mean flow in the near wake region of a sharp, 7-degree half-angle, adiabatic cone at Mach number 4.3 and freestream Reynolds number of 40,600. Overall the numerical solutions capture the general features of the hypersonic laminar wake flow and compare favorably with the wind tunnel data. With a refined and clustering grid distribution in the recirculation zone, the calculated location of the rear stagnation point is consistent with the 2D axisymmetric and 3D experiments. In this study, we also demonstrate the importance of having a large local memory capacity within a computer node and the effective utilization of the number of computer nodes to achieve good parallel performance when simulating a complex, large-scale wake flow problem.
Noise simulation in cone beam CT imaging with parallel computing
Tu, Shu-Ju; Shaw, Chris C; Chen, Lingyun
2007-01-01
We developed a computer noise simulation model for cone beam computed tomography imaging using a general purpose PC cluster. This model uses a mono-energetic x-ray approximation and allows us to investigate three primary performance components, specifically quantum noise, detector blurring and additive system noise. A parallel random number generator based on the Weyl sequence was implemented in the noise simulation and a visualization technique was accordingly developed to validate the quality of the parallel random number generator. In our computer simulation model, three-dimensional (3D) phantoms were mathematically modelled and used to create 450 analytical projections, which were then sampled into digital image data. Quantum noise was simulated and added to the analytical projection image data, which were then filtered to incorporate flat panel detector blurring. Additive system noise was generated and added to form the final projection images. The Feldkamp algorithm was implemented and used to reconstruct the 3D images of the phantoms. A 24 dual-Xeon PC cluster was used to compute the projections and reconstructed images in parallel with each CPU processing 10 projection views for a total of 450 views. Based on this computer simulation system, simulated cone beam CT images were generated for various phantoms and technique settings. Noise power spectra for the flat panel x-ray detector and reconstructed images were then computed to characterize the noise properties. As an example among the potential applications of our noise simulation model, we showed that images of low contrast objects can be produced and used for image quality evaluation. PMID:16481694
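The mono-energetic noise chain described (Beer-Lambert attenuation, Poisson quantum noise, additive Gaussian system noise) might be sketched as follows. The function name, array shapes, and parameter values are illustrative, and detector blurring is omitted:

```python
import numpy as np

def noisy_projection(line_integrals, n0, sigma_add, rng):
    """Mono-energetic projection noise sketch: expected detector counts
    follow Beer's law, quantum noise is Poisson-distributed, and electronic
    system noise is additive Gaussian (n0 = incident photons per element)."""
    expected = n0 * np.exp(-line_integrals)          # Beer-Lambert attenuation
    quantum = rng.poisson(expected)                  # photon-counting noise
    additive = rng.normal(0.0, sigma_add, line_integrals.shape)
    return quantum + additive

rng = np.random.default_rng(7)
mu_l = rng.uniform(0.5, 2.0, size=(8, 8))            # simulated line integrals
proj = noisy_projection(mu_l, n0=1e4, sigma_add=5.0, rng=rng)
print(proj.shape)
```

In a full pipeline such as the one described, the quantum-noise stage would be followed by a blurring filter before the additive noise is applied, and the resulting projections fed to the Feldkamp reconstruction.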
Design and application of parallel hybrid vehicle simulation platform
Xiao Ye; Zhenhua Jin; Biao Liu; Mingjie Chen; Qinchun Lu
2010-01-01
A parallel type hybrid-electric bus model is constructed to study hybrid powertrain and its sub-systems. The model is a forward-looking model based on LabVIEW simulation module. The engine, clutch and battery model were specially designed according to the target powertrain. Each part of the model is separately validated by test bench experiment. Part model behavior is compared with that of
MPI-SIM: using parallel simulation to evaluate MPI programs
Sundeep Prakash; Rajive L. Bagrodia
1998-01-01
This paper describes the design and implementation of MPI-SIM, a library for the execution-driven parallel simulation of MPI programs. MPI-LITE, a portable library that supports multithreaded MPI, is also described. MPI-SIM, which is built on top of MPI-LITE, can be used to predict the performance of existing MPI programs as a function of architectural characteristics including number of
Dynamic Load Balancing Strategies for Conservative Parallel Simulations
Azzedine Boukerche; Sajal K. Das
1997-01-01
This paper studies the problem of load balancing for conservative parallel simulations for execution on a multicomputer. The synchronization protocol makes use of Chandy-Misra null-messages. We propose a dynamic load balancing algorithm which assumes no compile time knowledge about the workload parameters. It is based upon a process migration mechanism, and the notion of CPU-queue length, which indicates the workload
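The Chandy-Misra null-message protocol that this synchronization scheme builds on can be shown in a toy sketch: an LP may only process events up to the minimum timestamp over its input channels, and null messages carrying a lookahead promise keep idle channels from deadlocking the receiver. The two-LP ring below is illustrative only and is not the paper's load-balancing algorithm:

```python
LOOKAHEAD = 1.0  # every LP delays outgoing events by at least this much

def safe_time(channel_clocks):
    """Conservative rule: an LP may only process events whose timestamp
    does not exceed the minimum clock over all of its input channels."""
    return min(channel_clocks.values())

def null_message(sender_clock):
    """Promise: 'I will send nothing with a timestamp earlier than this'."""
    return sender_clock + LOOKAHEAD

# Two LPs in a ring, both initially blocked at time 0. Exchanging null
# messages alone advances both clocks in lookahead-sized steps, which is
# exactly how the protocol avoids deadlock without rollback.
clock = {"A": 0.0, "B": 0.0}
chan = {"A": {"B": 0.0}, "B": {"A": 0.0}}   # receiver -> {sender: channel clock}
for _ in range(3):
    for lp, other in (("A", "B"), ("B", "A")):
        chan[other][lp] = null_message(clock[lp])          # send null downstream
        clock[other] = max(clock[other], safe_time(chan[other]))
print(clock)
```

A dynamic load balancer of the kind studied would sit on top of such a loop, migrating LPs between processors based on a metric like CPU-queue length.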
Massively parallel simulations of diffusion in dense polymeric structures
Jean-Loup Faulon; J. David Hobbs; David M. Ford; Robert T. Wilcox
1997-01-01
An original computational technique to generate close-to-equilibrium dense polymeric structures is proposed. Diffusion of small gases is studied on the equilibrated structures using massively parallel molecular dynamics simulations running on the Intel Teraflops (9216 Pentium Pro processors) and Intel Paragon (1840 processors). Compared to the current state-of-the-art equilibration methods, this new technique appears to be faster by some orders of
K-NN algorithm in Parallel VLSI Simulation
Tropper, Carl
level circuit simulation. A fundamental problem posed by a parallel environment is the decision of whether it is best to simulate a particular circuit sequentially or on a parallel platform. Furthermore, in the event that a circuit should be simulated on a parallel platform, it is necessary to decide how many
Xyce Parallel Electronic Simulator - Users' Guide Version 2.1.
Hutchinson, Scott A; Hoekstra, Robert J.; Russo, Thomas V.; Rankin, Eric; Pawlowski, Roger P.; Fixel, Deborah A; Schiek, Richard; Bogdan, Carolyn W.; Shirley, David N.; Campbell, Phillip M.; Keiter, Eric R.
2005-06-01
This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state of the art in the following areas: capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors), including support for most popular parallel and serial computers; improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques; device models which are specifically tailored to meet Sandia's needs, including many radiation-aware devices; and object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message-passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an "in-house" capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed.
As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.
Massively Parallel Processing for Fast and Accurate Stamping Simulations
NASA Astrophysics Data System (ADS)
Gress, Jeffrey J.; Xu, Siguang; Joshi, Ramesh; Wang, Chuan-tao; Paul, Sabu
2005-08-01
The competitive automotive market drives automotive manufacturers to speed up vehicle development cycles and reduce lead time. Fast tooling development is one of the key areas supporting fast and short vehicle development programs (VDP). In the past ten years, stamping simulation has become the most effective validation tool for predicting and resolving potential formability and quality problems before the dies are physically made. Stamping simulation and formability analysis have become a critical business segment in the GM math-based die engineering process. As simulation becomes one of the major production tools in the engineering factory, simulation speed and accuracy are two of the most important measures of stamping simulation technology. The speed and time-in-system of forming analysis become even more critical to support fast VDP and tooling readiness. Since 1997, the General Motors Die Center has been working jointly with our software vendor to develop and implement a parallel version of simulation software for mass-production analysis applications. By 2001, this technology had matured in the form of distributed memory processing (DMP) of draw die simulations in a networked distributed-memory computing environment. In 2004, this technology was refined to massively parallel processing (MPP) and extended to line die forming analysis (draw, trim, flange, and associated spring-back) running on a dedicated computing environment. The evolution of this technology and the insight gained through the implementation of DMP/MPP technology, as well as performance benchmarks, are discussed in this publication.
Particle simulation of plasmas on the massively parallel processor
NASA Technical Reports Server (NTRS)
Gledhill, I. M. A.; Storey, L. R. O.
1987-01-01
Particle simulations, in which collective phenomena in plasmas are studied by following the self consistent motions of many discrete particles, involve several highly repetitive sets of calculations that are readily adaptable to SIMD parallel processing. A fully electromagnetic, relativistic plasma simulation for the massively parallel processor is described. The particle motions are followed in 2 1/2 dimensions on a 128 x 128 grid, with periodic boundary conditions. The two dimensional simulation space is mapped directly onto the processor network; a Fast Fourier Transform is used to solve the field equations. Particle data are stored according to an Eulerian scheme, i.e., the information associated with each particle is moved from one local memory to another as the particle moves across the spatial grid. The method is applied to the study of the nonlinear development of the whistler instability in a magnetospheric plasma model, with an anisotropic electron temperature. The wave distribution function is included as a new diagnostic to allow simulation results to be compared with satellite observations.
Stochastic Event Counter for Discrete-Event Systems Under Unreliable Observations
Tae-Sic Yoo; Humberto E. Garcia
2008-06-01
This paper addresses the issue of counting the occurrence of special events in the framework of partially observed discrete-event dynamical systems (DEDS). First, we develop a novel recursive procedure that updates the active counter information state sequentially with available observations. In general, the cardinality of the active counter information state is unbounded, which makes the exact recursion computationally infeasible. To overcome this difficulty, we develop an approximate recursive procedure that regulates and bounds the size of the active counter information state. Using the approximate active counter information state, we give an approximate minimum mean square error (MMSE) counter. The developed algorithms are then applied to count special routing events in a material flow system.
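The core idea, bounding an information state by truncation, can be sketched in a toy form (this is an illustration of the concept, not the paper's exact recursion; the likelihood model and cap are hypothetical):

```python
# Toy bounded counter information state: a distribution over possible event
# counts, updated by Bayes' rule on each unreliable observation, truncated
# to the K most probable entries (the approximation that bounds its size).
K = 4  # cap on the information-state size (hypothetical)

def update(counts, obs_likelihood, cap=K):
    """counts: dict {count: prob}; obs_likelihood(count) -> P(obs | count)."""
    posterior = {c: p * obs_likelihood(c) for c, p in counts.items()}
    # keep the cap most probable counts, then renormalize
    top = dict(sorted(posterior.items(), key=lambda kv: -kv[1])[:cap])
    z = sum(top.values())
    return {c: p / z for c, p in top.items()}

def mmse_estimate(counts):
    """Minimum mean-square-error estimate = posterior mean of the count."""
    return sum(c * p for c, p in counts.items())

post = update({0: 0.5, 1: 0.5}, lambda c: 1.0 if c == 1 else 0.25)
```

With a uniform prior over counts 0 and 1 and an observation four times as likely under count 1, the posterior concentrates on count 1 and the MMSE estimate moves toward it.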
Supervisor Localization: A Top-Down Approach to Distributed Control of Discrete-Event Systems
Cai, K.; Wonham, W. M. [Systems Control Group, Department of Electrical and Computer Engineering, University of Toronto, 10 King's College Road, Toronto, ON, M5S 3G4 (Canada)]
2009-03-05
A purely distributed control paradigm is proposed for discrete-event systems (DES). In contrast to control by one or more external supervisors, distributed control aims to design built-in strategies for individual agents. First a distributed optimal nonblocking control problem is formulated. To solve it, a top-down localization procedure is developed which systematically decomposes an external supervisor into local controllers while preserving optimality and nonblockingness. An efficient localization algorithm is provided to carry out the computation, and an automated guided vehicles (AGV) example presented for illustration. Finally, the 'easiest' and 'hardest' boundary cases of localization are discussed.
Niehof, Jonathan T.; Morley, Steven K.
2012-01-01
We review and develop techniques to determine associations between series of discrete events. The bootstrap, a nonparametric statistical method, allows the determination of the significance of associations with minimal assumptions about the underlying processes. We find the key requirement for this method: one of the series must be widely spaced in time to guarantee the theoretical applicability of the bootstrap. If this condition is met, the calculated significance passes a reasonableness test. We conclude with some potential future extensions and caveats on the applicability of these methods. The techniques presented have been implemented in a Python-based software toolkit.
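A minimal sketch of the bootstrap-association idea described above (the toolkit's actual API differs; the association measure, window, and event times here are illustrative assumptions):

```python
# Measure association between two event series as the number of events in
# series B falling within +/-window of any event in A, then build a null
# distribution by redrawing B's event times uniformly over the interval.
import random

def n_associations(a, b, window):
    return sum(1 for t in b if any(abs(t - s) <= window for s in a))

def bootstrap_p(a, b, window, t_max, n_boot=1000, seed=1):
    rng = random.Random(seed)
    observed = n_associations(a, b, window)
    null = [
        n_associations(a, [rng.uniform(0, t_max) for _ in b], window)
        for _ in range(n_boot)
    ]
    # fraction of resamples at least as associated as the observed data
    return sum(1 for n in null if n >= observed) / n_boot

a = [10.0, 20.0, 30.0]
b = [10.1, 19.8, 30.2]   # tightly associated with a
p = bootstrap_p(a, b, window=0.5, t_max=100.0)
```

When, as here, the events of series A are widely spaced relative to the window, the null distribution is well behaved and a small p indicates a significant association.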
Massively Parallel Methods for Simulating the Phase-Field Model
Tikare, V.; Fan, D.; Plimpton, S.J.; Fye, R.M.
2000-12-01
Prediction of the evolution of microstructures in weapons systems is critical to meeting the objectives of stockpile stewardship in accordance with the Nuclear Weapons Test Ban Treaty. For example, accurate simulation of microstructural evolution in solder joints, cermets, PZT power generators, etc. is necessary for predicting the performance, aging, and reliability both of individual components and of entire weapons systems. A recently developed and promising approach called the "Phase-Field Model" (PFM) has the potential of allowing accurate quantitative prediction of microstructural evolution, with all the spatial and thermodynamic complexity of a real microstructure. Simulating with the PFM requires solving a set of coupled nonlinear differential equations, one for each material variable (e.g., grain orientation, phase, composition, stresses, anisotropy, etc.). While the PFM is versatile and able to incorporate the necessary complexity for modeling real material systems, it is very computationally intensive, and it has been a difficult and major challenge to formulate an efficient algorithmic implementation of the approach. We found that a second-order-in-space algorithm is more stable and leads to more accurate results. However, the computational requirements remain high, so we have developed a single-field algorithm to reduce the computations by 2 orders of magnitude. We have created a 3-D parallel version of the basic phase-field (PF) model and benchmarked its performance. Preliminary results indicate that we will be able to run very large problems effectively with the new parallel code. Microstructural evolution in a diffusion couple was simulated using the PFM to simultaneously simulate grain growth, diffusion and phase transformation. Solute drag in a variable-composition material, a process no other model can simulate, was successfully simulated using the phase-field model.
The phase field model was used to study the evolution of fractal high curvature structures to show that these structures have very different morphological and kinetic behaviors than those of equi-axed structures.
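To make the structure of such a solver concrete, here is a hedged sketch (not Sandia's parallel code) of one explicit Euler step of the Allen-Cahn phase-field equation on a periodic 1D grid, using the second-order central difference in space mentioned in the abstract; all parameters are illustrative:

```python
# One explicit step of d(phi)/dt = -m * (phi^3 - phi - kappa * d2(phi)/dx2)
# on a periodic 1D grid (second-order central difference in space).
def allen_cahn_step(phi, dt=0.01, dx=1.0, m=1.0, kappa=1.0):
    n = len(phi)
    out = []
    for i in range(n):
        lap = (phi[(i - 1) % n] - 2 * phi[i] + phi[(i + 1) % n]) / dx**2
        out.append(phi[i] - dt * m * (phi[i]**3 - phi[i] - kappa * lap))
    return out

# A small perturbation of the stable phase relaxes back toward phi = 1:
phi = [1.0] * 16
phi[8] = 0.8
for _ in range(100):
    phi = allen_cahn_step(phi)
```

In the full PFM, one such coupled equation is evolved per material variable, which is what makes the method computationally intensive and a natural target for parallel decomposition.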
Modelling and real-time simulation of continuous-discrete systems in mechatronics
Lindow, H. [Rostocker, Magdeburg (Germany)]
1996-12-31
This work presents a methodology for simulation and modelling of systems with continuous-discrete dynamics. It derives hybrid discrete event models from Lagrange's equations of motion. The method combines continuous mechanical, electrical and thermodynamical submodels on the one hand with discrete event models on the other hand into a hybrid discrete event model. This straightforward software development avoids numeric overhead.
A new parallel environment for interactive simulations implementing safe multithreading with MPI
Eduardo Rocha Rodrigues; Airam Jonatas Preto; Stephan Stephany
2005-01-01
This work presents a new parallel environment for interactive simulations. This environment integrates an MPI-based parallel simulation engine, a visualization module, and a user interface that supports modification of simulation parameters and visualization at runtime. This requires multiple threads, one to execute the simulation or the visualization, and another to receive user input. Since many MPI implementations are not thread-safe,
An evaluation of parallel simulated annealing strategies with application to standard cell placement
John A. Chandy; Sungho Kim; Balkrishna Ramkumar; Steven Parkes; Prithviraj Banerjee
1997-01-01
Simulated annealing, a methodology for solving combinatorial optimization problems, is a very computationally expensive algorithm, and as such, numerous researchers have undertaken efforts to parallelize it. In this paper, we investigate three of these parallel simulated annealing strategies when applied to standard cell placement, specifically the TimberWolfSC placement tool. We have examined a parallel moves strategy, as well as two
Parallel Hyperbolic PDE Simulation on Clusters: Cell versus GPU Scott Rostrup and Hans De Sterck
De Sterck, Hans
Increasingly, high-performance computing is looking towards data-parallel computational devices to enhance performance. We investigate the acceleration of parallel hyperbolic partial differential equation simulation on structured
Control synthesis of timed discrete event systems based on predicate invariance.
Chen, H; Hanisch, H M
2000-01-01
In this paper, arc-timed Petri nets are used to model controlled real-time discrete event systems, and the control synthesis problem that designs a controller for a system to satisfy its given closed-loop behavior specification is addressed. For the problem with the closed-loop behavior specified by a state predicate, real-time control-invariant predicates are introduced, and a fixpoint algorithm to compute the unique extremal control-invariant subpredicate of a given predicate, key to the control synthesis, is presented. For the problem with the behavior specified by a labeled arc-timed Petri net, it is shown that the control synthesis problem can be transformed into one that synthesizes a controller for an induced arc-timed Petri net with a state predicate specification. The problem can then be solved by using the fixpoint algorithm as well. The algorithm involves conjunction and disjunction operations of polyhedral sets and can be algorithmically implemented, making automatic synthesis of controllers for real-time discrete event systems possible. PMID:18252404
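The fixpoint computation at the heart of such synthesis can be illustrated in a simplified setting (a plain finite transition system rather than the paper's arc-timed Petri nets; the transition relation and event sets below are hypothetical): iteratively shrink the state predicate until no uncontrollable event can exit it.

```python
# Largest control-invariant subpredicate of a 'good' state set: remove any
# state from which an uncontrollable transition leaves the set, and repeat
# until nothing changes (the fixpoint).
def supremal_invariant(predicate, transitions, uncontrollable):
    """predicate: set of good states; transitions: dict
    (state, event) -> next_state."""
    p = set(predicate)
    changed = True
    while changed:
        changed = False
        for (s, e), t in transitions.items():
            if s in p and e in uncontrollable and t not in p:
                p.discard(s)   # s can be forced out of the predicate
                changed = True
    return p

trans = {(0, "u"): 1, (1, "c"): 2, (2, "u"): 3}
# State 3 is bad; "u" is uncontrollable, "c" is controllable.
good = supremal_invariant({0, 1, 2}, trans, uncontrollable={"u"})
```

State 2 is removed because the uncontrollable event "u" leads from it to the bad state 3; state 1 survives because its only exit toward 2 is controllable and can be disabled by a supervisor.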
Parallel Unsteady Turbopump Simulations for Liquid Rocket Engines
NASA Technical Reports Server (NTRS)
Kiris, Cetin C.; Kwak, Dochan; Chan, William
2000-01-01
This paper reports the progress being made towards complete turbo-pump simulation capability for liquid rocket engines. Space Shuttle Main Engine (SSME) turbo-pump impeller is used as a test case for the performance evaluation of the MPI and hybrid MPI/Open-MP versions of the INS3D code. Then, a computational model of a turbo-pump has been developed for the shuttle upgrade program. Relative motion of the grid system for rotor-stator interaction was obtained by employing overset grid techniques. Time-accuracy of the scheme has been evaluated by using simple test cases. Unsteady computations for SSME turbo-pump, which contains 136 zones with 35 Million grid points, are currently underway on Origin 2000 systems at NASA Ames Research Center. Results from time-accurate simulations with moving boundary capability, and the performance of the parallel versions of the code will be presented in the final paper.
Financial simulations on a massively parallel Connection Machine
Hutchinson, J.M.; Zenios, S.A. (Thinking Machine Corp., Cambridge, MA (US))
1991-01-01
This paper reports on the valuation of complex financial instruments that appear in the banking and insurance industries, which requires simulations of their cashflow behavior in a volatile interest rate environment. These simulations are complex and computationally intensive. Their use, thus far, has been limited to intra-day analysis and planning. Researchers at the Wharton School and Thinking Machines Corporation have developed model formulations for massively parallel architectures, like the Connection Machine CM-2. A library of financial modeling primitives has been designed and used to implement a model for the valuation of mortgage-backed securities. Analyzing a portfolio of these securities, which would require 2 days on a large mainframe, is carried out in 1 hour on a CM-2a.
Time Complexity of a Parallel Conjugate Gradient Solver for Light Scattering Simulations
Hoffmann, Walter
Time Complexity of a Parallel Conjugate Gradient Solver for Light Scattering Simulations: Theory and SPMD Implementation. ABSTRACT: We describe systems of equations emerging from Elastic Light Scattering simulations. The execution time
Use of Parallel Tempering for the Simulation of Polymer Melts
NASA Astrophysics Data System (ADS)
Bunker, Alex; Duenweg, Burkhard; Theodorou, Doros
2000-03-01
The parallel tempering algorithm (C. J. Geyer, Computing Science and Statistics: Proceedings of the 23rd Symposium of the Interface, 156 (1991)) is based on simulating several systems in parallel, each of which has a slightly different Hamiltonian. The systems are put in equilibrium with each other by stochastic swaps between neighboring Hamiltonians. Previous implementations have mainly focused on the temperature as the control variable. In contrast, we vary the excluded-volume interaction in a continuum bead-spring polymer melt, as has been done for lattice polymers already (Y. Iba, G. Chikenji, M. Kikuchi, J. Phys. Soc. Japan v. 67, 3327 (1998)). The "softest" interactions allow for substantial monomer overlap such that pivot moves become feasible. We have benchmarked the algorithm by comparing it to the chain breaking algorithm used on the same system. Possible applications of the algorithm include the simulation of polymer systems with complex topologies and combining the method with the Gibbs ensemble technique for the phase behavior of polymer blends.
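The replica-swap rule at the heart of parallel tempering can be stated in a few lines (shown here for the common temperature case; the paper above varies the excluded-volume interaction instead): neighboring replicas at inverse temperatures beta_i, beta_j with energies e_i, e_j are exchanged with probability min(1, exp((beta_i - beta_j) * (e_i - e_j))).

```python
# Metropolis acceptance probability for swapping two tempering replicas.
import math

def swap_probability(beta_i, beta_j, e_i, e_j):
    return min(1.0, math.exp((beta_i - beta_j) * (e_i - e_j)))

# Moving the higher-energy configuration to the hotter (lower-beta) replica
# is always accepted; the reverse move is accepted only stochastically.
p_always = swap_probability(1.0, 0.5, 2.0, 1.0)
p_rare = swap_probability(0.5, 1.0, 2.0, 1.0)
```

This criterion preserves detailed balance in the joint ensemble, which is what keeps every replica sampling its own Boltzmann distribution despite the swaps.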
Roadmap for efficient parallelization of breast anatomy simulation
NASA Astrophysics Data System (ADS)
Chui, Joseph H.; Pokrajac, David D.; Maidment, Andrew D. A.; Bakic, Predrag R.
2012-03-01
A roadmap has been proposed to optimize the simulation of breast anatomy by parallel implementation, in order to reduce the time needed to generate software breast phantoms. The rapid generation of high-resolution phantoms is needed to support virtual clinical trials of breast imaging systems. We have recently developed an octree-based recursive partitioning algorithm for breast anatomy simulation. The algorithm has good asymptotic complexity; however, its current MATLAB implementation cannot provide optimal execution times. The proposed roadmap for efficient parallelization includes the following steps: (i) migrate the current code to a C/C++ platform and optimize it for single-threaded implementation; (ii) modify the code to allow for multi-threaded CPU implementation; (iii) identify and migrate the code to a platform designed for multithreaded GPU implementation. In this paper, we describe our results in optimizing the C/C++ code for single-threaded and multi-threaded CPU implementations. As the first step of the proposed roadmap we have identified a bottleneck component in the MATLAB implementation using MATLAB's profiling tool, and created a single-threaded CPU implementation of the algorithm using C/C++'s overloaded operators and standard template library. The C/C++ implementation has been compared to the MATLAB version in terms of accuracy and simulation time. A 520-fold reduction of the execution time was observed in a test of phantoms with 50-400 μm voxels. In addition, we have identified several places in the code which will be modified to allow for the next roadmap milestone of the multithreaded CPU implementation.
Massively Parallel Simulations of Diffusion in Dense Polymeric Structures
Faulon, Jean-Loup, Wilcox, R.T. [Sandia National Labs., Albuquerque, NM (United States)], Hobbs, J.D. [Montana Tech of the Univ. of Montana, Butte, MT (United States). Dept. of Chemistry and Geochemistry], Ford, D.M. [Texas A and M Univ., College Station, TX (United States). Dept. of Chemical Engineering
1997-11-01
An original computational technique to generate close-to-equilibrium dense polymeric structures is proposed. Diffusion of small gases is studied on the equilibrated structures using massively parallel molecular dynamics simulations running on the Intel Teraflops (9216 Pentium Pro processors) and Intel Paragon (1840 processors). Compared to the current state-of-the-art equilibration methods, this new technique appears to be faster by some orders of magnitude. The main advantage of the technique is that one can circumvent the bottlenecks in configuration space that inhibit relaxation in molecular dynamics simulations. The technique is based on the fact that tetravalent atoms (such as carbon and silicon) fit in the center of a regular tetrahedron and that regular tetrahedrons can be used to mesh three-dimensional space. Thus, the problem of polymer equilibration described by continuous equations in molecular dynamics is reduced to a discrete problem where solutions are approximated by simple algorithms. Practical modeling applications include the construction of butyl rubber and ethylene-propylene-diene-monomer (EPDM) models for oxygen and water diffusion calculations. Butyl and EPDM are used in O-ring systems and serve as sealing joints in many manufactured objects. Diffusion coefficients of small gases have been measured experimentally on both polymeric systems, and in general the diffusion coefficients in EPDM are an order of magnitude larger than in butyl. In order to better understand the diffusion phenomena, 10,000-atom models were generated and equilibrated for butyl and EPDM. The models were submitted to a massively parallel molecular dynamics simulation to monitor the trajectories of the diffusing species.
Parallel continuous simulated tempering and its applications in large-scale molecular simulations
NASA Astrophysics Data System (ADS)
Zang, Tianwu; Yu, Linglin; Zhang, Chong; Ma, Jianpeng
2014-07-01
In this paper, we introduce a parallel continuous simulated tempering (PCST) method for enhanced sampling in studying large complex systems. It mainly inherits the continuous simulated tempering (CST) method from our previous studies [C. Zhang and J. Ma, J. Chem. Phys. 130, 194112 (2009); C. Zhang and J. Ma, J. Chem. Phys. 132, 244101 (2010)], while adopting the spirit of parallel tempering (PT), or the replica exchange method, by employing multiple copies with different temperature distributions. Differing from conventional PT methods, despite the large stride of the total temperature range, the PCST method requires very few copies of simulations, typically 2-3 copies, yet is still capable of maintaining a high rate of exchange between neighboring copies. Furthermore, in the PCST method, the size of the system does not dramatically affect the number of copies needed because the exchange rate is independent of total potential energy, thus providing an enormous advantage over conventional PT methods in studying very large systems. The sampling efficiency of PCST was tested on the two-dimensional Ising model, a Lennard-Jones liquid, and an all-atom folding simulation of a small globular protein, trp-cage, in explicit solvent. The results demonstrate that the PCST method significantly improves sampling efficiency compared with other methods and is particularly effective in simulating systems with long relaxation or correlation times. We expect the PCST method to be a good alternative to parallel tempering methods in simulating large systems such as phase transitions and dynamics of macromolecules in explicit solvent.
LARGE-SCALE MOLECULAR DYNAMICS SIMULATION USING VECTOR AND PARALLEL COMPUTERS
Rapaport, Dennis C.
Techniques on vector and parallel architectures for molecular dynamics simulation are described, covering the molecular dynamics approach to the N-body problem and computational
Wisconsin Wind Tunnel II: A Fast and Portable Parallel Architecture Simulator
Shubhendu S. Mukherjee; Steven K. Reinhardt; Babak Falsafi; Mike Litzkow; Steve Huss-Lederman; Mark D. Hill; James R. Larus; David A. Wood
1997-01-01
The design of future parallel computers requires rapid simulation of target designs running realistic workloads. These simulations have been accelerated using two techniques: direct execution and the use of a parallel host. Historically, these techniques have been considered to have poor portability. This paper identifies and describes the implementation of four key operations necessary to make such simulation
Lushnikov, P M
2002-06-01
An efficient numerical algorithm is presented for massively parallel simulations of dispersion-managed wavelength-division-multiplexed optical fiber systems. The algorithm is based on a weak nonlinearity approximation and independent parallel calculations of fast Fourier transforms on multiple central processor units (CPUs). The algorithm allows one to implement numerical simulations M/2 times faster than a direct numerical simulation by a split-step method, where M is a number of CPUs in a parallel network. PMID:18026330
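The serial building block that the parallel algorithm distributes across CPUs is the split-step Fourier step for the nonlinear Schroedinger equation. A hedged sketch of one such step (coefficients and pulse shape are illustrative, not the paper's system parameters):

```python
# One symmetric split-step: linear (dispersion) half-step in Fourier space,
# full nonlinear phase rotation in the time domain, second linear half-step.
import numpy as np

def split_step(u, dz, beta2=-1.0, gamma=1.0):
    n = len(u)
    w = 2 * np.pi * np.fft.fftfreq(n)              # angular frequencies
    half_linear = np.exp(0.5j * beta2 * w**2 * (dz / 2))
    u = np.fft.ifft(half_linear * np.fft.fft(u))   # first dispersion half
    u = u * np.exp(1j * gamma * np.abs(u)**2 * dz) # nonlinearity
    return np.fft.ifft(half_linear * np.fft.fft(u))

u0 = np.exp(-np.linspace(-10, 10, 256) ** 2).astype(complex)  # Gaussian pulse
u1 = split_step(u0, dz=0.01)
```

Both sub-steps are unitary phase rotations, so the pulse energy (the sum of |u|^2) is conserved to machine precision, which is a useful sanity check on any implementation, serial or parallel.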
NASA Astrophysics Data System (ADS)
Damiani, Sarah; Griffin, Christopher; Phoha, Shashi
2003-12-01
Autonomous Sensor Networks have the potential for broad applicability to national security, intelligent transportation, industrial production, and environmental and hazardous process control. Distributed sensors may be used for detecting bio-terrorist attacks, for contraband interdiction, border patrol, monitoring building safety and security, battlefield surveillance, or may be embedded in complex dynamic systems for enabling fault tolerant operations. In this paper we present algorithms and automation tools for constructing discrete event controllers for complex networked systems that restrict the dynamic behavior of the system according to given specifications. In our previous work we have modeled the dynamic system as a discrete event automaton whose open loop behavior is represented as a language L of strings generated over the alphabet Σ of all possible atomic events that cause state transitions in the network. The controlled behavior is represented by a sublanguage K, contained in L, that restricts the behavior of the system according to the specifications of the controller. We have developed the algebraic structure of controllable sublanguages as perfect right partial ideals that satisfy a precontrollability condition. In this paper we develop an iterative algorithm to take an ad hoc specification described using a natural language, and to formulate a complete specification that results in a controllable sublanguage. A supervisory controller, modeled as an automaton that runs synchronously with the open loop system in the sense of Ramadge and Wonham, is automatically generated to restrict the behavior of the open loop system to the controllable sublanguage. A battlefield surveillance scenario illustrates the iterative evolution of ad hoc specifications for controlling an autonomous sensor network and the generation of a controller that reconfigures the sensor network to dynamically adapt to environmental perturbations.
Parallel algorithm for multiscale atomistic/continuum simulations using LAMMPS
NASA Astrophysics Data System (ADS)
Pavia, F.; Curtin, W. A.
2015-07-01
Deformation and fracture processes in engineering materials often require simultaneous descriptions over a range of length and time scales, with each scale using a different computational technique. Here we present a high-performance parallel 3D computing framework for executing large multiscale studies that couple an atomic domain, modeled using molecular dynamics, and a continuum domain, modeled using explicit finite elements. We use the robust Coupled Atomistic/Discrete-Dislocation (CADD) displacement-coupling method, but without the transfer of dislocations between atoms and continuum. The main purpose of the work is to provide a multiscale implementation within an existing large-scale parallel molecular dynamics code (LAMMPS) that enables use of all the tools associated with this popular open-source code, while extending CADD-type coupling to 3D. Validation of the implementation includes the demonstration of (i) stability in finite-temperature dynamics using Langevin dynamics, (ii) elimination of wave reflections due to large dynamic events occurring in the MD region and (iii) the absence of spurious forces acting on dislocations due to the MD/FE coupling, for dislocations further than 10 Å from the coupling boundary. A first non-trivial example application of dislocation glide and bowing around obstacles is shown, for dislocation lengths of ~50 nm using fewer than 1,000,000 atoms but reproducing results of extremely large atomistic simulations at much lower computational cost.
Ion dynamics at supercritical quasi-parallel shocks: Hybrid simulations
Su Yanqing; Lu Quanming; Gao Xinliang; Huang Can; Wang Shui [CAS Key Laboratory of Basic Plasma Physics, Department of Geophysics and Planetary Science, University of Science and Technology of China, Hefei 230026 (China)
2012-09-15
By separating the incident ions into directly transmitted, downstream thermalized, and diffuse ions, we perform one-dimensional (1D) hybrid simulations to investigate ion dynamics at a supercritical quasi-parallel shock. In the simulations, the angle between the upstream magnetic field and the shock nominal direction is θ_Bn = 30°, and the Alfvén Mach number is M_A ≈ 5.5. The shock exhibits a periodic reformation process. The ion reflection occurs at the beginning of the reformation cycle. Part of the reflected ions is trapped between the old and new shock fronts for an extended time period. These particles eventually form superthermal diffuse ions after they escape to the upstream of the new shock front at the end of the reformation cycle. The other reflected ions may return to the shock immediately or be trapped between the old and new shock fronts for a short time period. When the amplitude of the new shock front exceeds that of the old shock front and the reformation cycle is finished, these ions become thermalized ions in the downstream. No noticeable heating can be found in the directly transmitted ions. The relevance of our simulations to the satellite observations is also discussed in the paper.
A Parallel Lattice-Boltzmann Method For Large Scale Simulations Of Complex Fluids
Maziar Nekovee; Jonathan Chin; Nélido González-segredo; Peter V. Coveney
2001-01-01
In this paper we give a brief overview of our LB scheme for amphiphilic fluids, describe a parallel implementation of the algorithm, and examine its performance on massively parallel platforms. We present preliminary results of parallel LB simulations of phase separation and self-assembly in binary and ternary systems and conclude with an outlook on future applications of our method to complex fluids.
Moldy: a portable molecular dynamics simulation program for serial and parallel computers
Keith Refson
2000-01-01
Moldy is a highly portable C program for performing molecular-dynamics simulations of solids and liquids using periodic boundary conditions. It runs in serial mode on a conventional workstation or on a parallel system using an interface to a parallel communications library such as MPI or BSP. The “replicated data” parallelization strategy is used to achieve reasonable performance with a minimal
NASA Technical Reports Server (NTRS)
Hsieh, Shang-Hsien
1993-01-01
The principal objective of this research is to develop, test, and implement coarse-grained, parallel-processing strategies for nonlinear dynamic simulations of practical structural problems. There are contributions to four main areas: finite element modeling and analysis of rotational dynamics, numerical algorithms for parallel nonlinear solutions, automatic partitioning techniques to effect load-balancing among processors, and an integrated parallel analysis system.
Ultrasonic and parallel gap welding of GaAs solar cells, experimental and simulation results
S. Reul; W. Snakker
1989-01-01
Different ultrasonic and parallel gap welding techniques are discussed. Their application to the welding of GaAs solar cells is outlined. The results of finite element simulation of thermal, electrical and mechanical parameters affecting parallel gap welding are presented. Thermal cycling tests for both types of welding are performed. Welding tests demonstrate problematic breakage behavior for parallel gap, but not for
A Partitioning Approach for Parallel Simulation of AC-Radial Shipboard Power Systems
Uriarte, Fabian Marcel
2011-08-08
An approach to parallelize the simulation of AC-Radial Shipboard Power Systems (SPSs) using multicore computers is presented. Time domain simulations of SPSs are notoriously slow, due principally to the number of components, and the time...
Variance reduction algorithms for parallel replicated simulation of uniformized Markov chains
Simon Streltsov; Pirooz Vakili
1996-01-01
We discuss the simulation of M replications of a uniformizable Markov chain simultaneously and in parallel (the so-called parallel replicated approach). Distributed implementation on a number of processors and parallel SIMD implementation on massively parallel computers are described. We investigate various ways of inducing correlation across replications in order to reduce the variance of estimators obtained from the M replications. In particular,
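One standard way to induce correlation across replications is common random numbers. A toy sketch of the idea (details differ from the paper; the birth-death chain, its rates, and the seed are illustrative assumptions):

```python
# M replications of a uniformized birth-death chain: each uniform draw
# selects a birth, a death, or the uniformization self-loop. With
# shared_stream=True, all replications consume the same random stream.
import random

def simulate(m_reps, steps, p_birth=0.3, p_death=0.3,
             shared_stream=True, seed=7):
    rng = random.Random(seed)
    shared = [rng.random() for _ in range(steps)] if shared_stream else None
    states = []
    for _ in range(m_reps):
        u = shared if shared_stream else [rng.random() for _ in range(steps)]
        x = 0
        for t in range(steps):
            if u[t] < p_birth:
                x += 1
            elif u[t] < p_birth + p_death and x > 0:
                x -= 1
            # otherwise: the uniformization self-loop (no state change)
        states.append(x)
    return states

final = simulate(m_reps=4, steps=50)
```

With identical dynamics this extreme form of sharing makes all replications coincide; in a variance-reduction setting each replication would differ in parameters or in how the common stream is transformed, so the shared draws induce positive correlation rather than exact equality.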
Discrete Event Dyn Syst (2006) 16: 451-492 DOI 10.1007/s10626-006-0021-9
Antsaklis, Panos
2006-01-01
of inequalities that the Petri net marking must satisfy at any time. However, as we show, problems involving more, and synchronic-distance based designs. Keywords: Petri nets · Supervisory control · Mutual exclusion. Petri nets (PNs) are an important class of discrete event systems
Jose Mireles Jr.; Frank L. Lewis; Roberto C. Ambrosio; Edgar Martínez
2009-01-01
Using a patented matrix formulation, a Discrete Event (DE) controller is designed for a manufacturing cell. The DE controller can directly be implemented from standard manufacturing tools such as the Bill of Materials or the assembly tree. The matrices also make it straightforward to actually implement the DE controller for sequencing the jobs and assigning the resources. We use virtual
Teneketzis, Demosthenis
2000-01-01
from fault-tree (Lapp and Powers, 1977) and analytical redundancy (Willsky, 1976; Frank, 1990) for systems where the information used for fault diagnosis is centralized. A notable exception is Holloway
Steel product transportation and storage simulation: a combined simulation\\/optimization approach
Nobuyuki Ueno; Yoshiyuki Nakagawa; Yoshiro Okuno; Susumu Morito
1988-01-01
Product transportation and storage in steel works are modeled in order to evaluate both facility specifications and transportation rules. Discrete-event simulation together with mathematical optimization techniques is carried out based on probability distributions determined by analysis of actual operating data. The model utilizes a combined network and discrete-event approach written under the SLAM II simulation language. Simulation results
A massively parallel solution strategy for efficient thermal radiation simulation
NASA Astrophysics Data System (ADS)
Nguyen, P. D.; Moureau, V.; Vervisch, L.; Perret, N.
2012-06-01
A novel and efficient methodology to solve the Radiative Transfer Equations (RTE) in thermal radiation is discussed. The BiCGStab(2) iterative solution method, as designed for non-symmetric linear equation systems, is used to solve the discretized RTE. The numerical upwind and central schemes are blended to provide a stable numerical scheme (MUCS) for interpolation of the cell facial radiation intensities in the finite volume formulation. The combination of the BiCGStab(2) and MUCS methods proved to be very efficient when coupled with the DOM approach to solve the RTE. A cost-effective tabulation technique for the gaseous radiative property model SNB-FSCK using a 7-point Gauss-Lobatto quadrature scheme is also introduced. The whole methodology is implemented into a massively parallel unstructured CFD code where the radiative and fluid flow solutions share the same domain decomposition, which is the bottleneck in current radiative solvers. The dual mesh decomposition at the cell-group and processor levels is adopted to optimize the CFD code for massively parallel computing. The whole method is applied to simulate the radiation heat transfer in a 3D rectangular enclosure containing non-isothermal CO2 and H2O mixtures. Two test cases are studied for homogeneous and inhomogeneous distributions of CO2 and H2O in the enclosure. The result is reported for the heat flux and radiation energy source, and a comparison is also made between the present methodology (BiCGStab(2)/MUCS/tabulated SNB-FSCK), the benchmark method SNB-CK (implemented at 25 cm⁻¹ narrow-band), and some other methods available in the literature. The present method (BiCGStab(2)/MUCS/tabulated SNB-FSCK) yields more accurate predictions, particularly for the radiation source term. When comparing with the benchmark solution, the relative error of the radiation source term is remarkably reduced to less than 4% and the CPU time is drastically diminished.
Application of discrete event simulation to the activity based costing of manufacturing systems
T. A. Spedding; G. Q. Sun
1999-01-01
In the last two decades traditional cost accounting practices have been unable to respond to the changing information needs of manufacturing management. Activity Based Costing (ABC) is a method which can solve many of the limitations of traditional cost systems. This method of accounting involves the breaking down of the individual activities and costing of the amount of time spent
A discrete event simulation model for unstructured supervisory control of unmanned vehicles
McDonald, Anthony D. (Anthony Douglas)
2010-01-01
Most current Unmanned Vehicle (UV) systems consist of teams of operators controlling a single UV. Technological advances will likely lead to the inversion of this ratio, and automation of low level tasking. These advances ...
Systems analysis and optimization through discrete event simulation at Amazon.com
Price, Cameron S. (Cameron Stalker), 1972-
2004-01-01
The basis for this thesis involved a six and a half month LFM internship at the Amazon.com fulfillment center in the United Kingdom. The fulfillment center management sought insight into the substantial variation across ...
Integrating System Performance Engineering into MASCOT Methodology through Discrete-Event Simulation
Pere P. Sancho; Carlos Juiz; Ramón Puigjaner
2004-01-01
Software design methodologies are incorporating non-functional features into design system descriptions. MASCOT, which has been a traditional design methodology for European defence companies, has no performance extension. In this paper we present a set of performance annotations to the MASCOT methodology, called MASCOTime. These annotations extend the MDL (MASCOT Description Language) design components transparently. Thus, in order to evaluate the ...
Quantifying supply chain disruption risk using Monte Carlo and discrete-event simulation
Schmitt, Amanda J.
We present a model constructed for a large consumer products company to assess their vulnerability to disruption risk and quantify its impact on customer service. Risk profiles for the locations and connections in the ...
Modeling protein-DNA binding time in Stochastic Discrete Event Simulation of Biological Processes
Preetam Ghosh; Samik Ghosh; Kalyan Basu; Sajal K. Das
2007-01-01
This paper presents a parametric model to estimate the DNA-protein binding time using the DNA and protein structures and details of the binding site. To understand the stochastic behavior of biological systems, we propose an ...
Analysis of a hospital network transportation system with discrete event simulation
Kwon, Annie Y. (Annie Yean)
2011-01-01
VA New England Healthcare System (VISN1) provides transportation to veterans between eight medical centers and over 35 Community Based Outpatient Clinics across New England. Due to high variation in its geographic area, ...
GPU-based Parallel Computing for the Simulation of Complex Multibody Systems with
Anitescu, Mihai
The approach is used to illustrate the potential of GPUs for parallel scientific computing: the equations of motion associated with the dynamics ... enable the simulation of systems with more than one million bodies on commodity desktops. Efforts are under way ...
Solving the WDM network operation problem using dynamic synchronous parallel simulated annealing
Asheq Khan; Dale R. Thompson
2005-01-01
Several variations of synchronous parallel simulated annealing (PSA) were applied to solve the static lightpath establishment wavelength-selective cross-connect network operation problem for a WDM network. The goal was to find high-quality solutions to determine the efficiency of the dynamic routing and wavelength assignment algorithms. Multiple parallel processes ran the simulated annealing algorithm and exchanged solutions among themselves. A proposed ...
Particle/Continuum Hybrid Simulation in a Parallel Computing Environment
NASA Technical Reports Server (NTRS)
Baganoff, Donald
1996-01-01
The objective of this study was to modify an existing parallel particle code based on the direct simulation Monte Carlo (DSMC) method to include a Navier-Stokes (NS) calculation so that a hybrid solution could be developed. In carrying out this work, it was determined that the following five issues had to be addressed before extensive program development of a three dimensional capability was pursued: (1) find a set of one-sided kinetic fluxes that are fully compatible with the DSMC method, (2) develop a finite volume scheme to make use of these one-sided kinetic fluxes, (3) make use of the one-sided kinetic fluxes together with DSMC type boundary conditions at a material surface so that velocity slip and temperature slip arise naturally for near-continuum conditions, (4) find a suitable sampling scheme so that the values of the one-sided fluxes predicted by the NS solution at an interface between the two domains can be converted into the correct distribution of particles to be introduced into the DSMC domain, (5) carry out a suitable number of tests to confirm that the developed concepts are valid, individually and in concert for a hybrid scheme.
Sensor Configuration Selection for Discrete-Event Systems under Unreliable Observations
Wen-Chiao Lin; Tae-Sic Yoo; Humberto E. Garcia
2010-08-01
Algorithms for counting the occurrences of special events in the framework of partially-observed discrete event dynamical systems (DEDS) were developed in previous work. Their performance typically improves as the sensors providing the observations become more costly or increase in number. This paper addresses the problem of finding a sensor configuration that achieves an optimal balance between cost and the performance of the special-event counting algorithm, while satisfying given observability requirements and constraints. Since this problem is generally computationally hard in the framework considered, a sensor optimization algorithm is developed using two greedy heuristics, one myopic and the other based on projected performances of candidate sensors. The two heuristics are executed sequentially in order to find the best sensor configurations. The developed algorithm is then applied to a sensor optimization problem for a multiunit-operation system. Results show that improved sensor configurations can be found that may significantly reduce the sensor configuration cost while still yielding acceptable performance for counting the occurrences of special events.
WARPP - A Toolkit for Simulating High-Performance Parallel Scientific Codes
Jarvis, Stephen
S.D. Hammond et al. ... system jitter to the simulator can improve the quality of the application performance model results ... Keywords: Application Performance Modelling, Simulation, High Performance Computing.
Parallel I/O, Analysis, and Visualization of a Trillion Particle Simulation
Southern California, University of
Surendra Byna, Jerry ... Physics simulations have recently entered the regime of simulating trillions of particles, producing ... terabytes (TB) of data per simulation. For instance, the Intergovernmental Panel on Climate Change ...
Contact-impact simulations on massively parallel SIMD supercomputers
Plaskacz, E.J. (Argonne National Lab., IL (United States)); Belytschko, T.; Chiang, H.Y. (Northwestern Univ., Evanston, IL (United States))
1992-01-01
The implementation of explicit finite element methods with contact-impact on massively parallel SIMD computers is described. The basic parallel finite element algorithm employs an exchange process which minimizes interprocessor communication at the expense of redundant computations and storage. The contact-impact algorithm is based on the pinball method in which compatibility is enforced by preventing interpenetration on spheres embedded in elements adjacent to surfaces. The enhancements to the pinball algorithm include a parallel assembled surface normal algorithm and a parallel detection of interpenetrating pairs. Some timings with and without contact-impact are given.
xSim: The Extreme-Scale Simulator
Boehm, Swen [ORNL]; Engelmann, Christian [ORNL]
2011-01-01
Investigating parallel application performance properties at scale is becoming an important part of high-performance computing (HPC) application development and deployment. The Extreme-scale Simulator (xSim) is a performance investigation toolkit that permits running an application in a controlled environment at extreme scale without the need for a respective extreme-scale HPC system. Using a lightweight parallel discrete event simulation, xSim executes a parallel application with a virtual wall clock time, such that performance data can be extracted based on a processor model and a network model. This paper presents significant enhancements to the xSim toolkit prototype that provide a more complete Message Passing Interface (MPI) support and improve its versatility. These enhancements include full virtual MPI group, communicator and collective communication support, and global variables support. The new capabilities are demonstrated by executing the entire NAS Parallel Benchmark suite in a simulated HPC environment.
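The lightweight discrete-event core that such a simulator builds on can be sketched as a minimal event loop driving a virtual clock; this is an illustrative sketch, not xSim's actual API (the class and method names are invented):

```python
import heapq

class Simulator:
    """Minimal discrete-event loop: events fire in timestamp order,
    and 'now' is a virtual clock decoupled from wall-clock time."""
    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._seq = 0  # tie-breaker so same-time events never compare actions

    def schedule(self, delay, action):
        heapq.heappush(self._queue, (self.now + delay, self._seq, action))
        self._seq += 1

    def run(self):
        while self._queue:
            self.now, _, action = heapq.heappop(self._queue)
            action()

sim = Simulator()
log = []
sim.schedule(2.0, lambda: log.append(("b", sim.now)))
sim.schedule(1.0, lambda: log.append(("a", sim.now)))
sim.run()
print(log)  # events fire in timestamp order, not scheduling order
```

A parallel discrete-event simulator distributes such queues across processes and adds a synchronization protocol so that no process handles an event out of timestamp order.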
Parallel computing in enterprise modeling.
Goldsby, Michael E.; Armstrong, Robert C.; Shneider, Max S.; Vanderveen, Keith; Ray, Jaideep; Heath, Zach; Allan, Benjamin A.
2008-08-01
This report presents the results of our efforts to apply high-performance computing to entity-based simulations with a multi-use plugin for parallel computing. We use the term 'entity-based simulation' to describe a class of simulation which includes both discrete event simulation and agent based simulation. What simulations of this class share, and what differs from more traditional models, is that the result sought is emergent from a large number of contributing entities. Logistic, economic and social simulations are members of this class where things or people are organized or self-organize to produce a solution. Entity-based problems never have an a priori ergodic principle that will greatly simplify calculations. Because the results of entity-based simulations can only be realized at scale, scalable computing is de rigueur for large problems. Having said that, the absence of a spatial organizing principle makes the decomposition of the problem onto processors problematic. In addition, practitioners in this domain commonly use the Java programming language, which presents its own problems in a high-performance setting. The plugin we have developed, called the Parallel Particle Data Model, overcomes both of these obstacles and is now being used by two Sandia frameworks: the Decision Analysis Center and the Seldon social simulation facility. While the ability to engage U.S.-sized problems is now available to the Decision Analysis Center, this plugin is central to the success of Seldon. Because Seldon relies on computationally intensive cognitive sub-models, this work is necessary to achieve the scale necessary for realistic results. With the recent upheavals in the financial markets, and the inscrutability of terrorist activity, this simulation domain will likely need a capability with ever greater fidelity. High-performance computing will play an important part in enabling that greater fidelity.
Ultrasonic and parallel gap welding of GaAs solar cells, experimental and simulation results
NASA Astrophysics Data System (ADS)
Reul, S.; Snakker, W.
1989-08-01
Different ultrasonic and parallel gap welding techniques are discussed. Their application to the welding of GaAs solar cells is outlined. The results of finite element simulation of thermal, electrical and mechanical parameters affecting parallel gap welding are presented. Thermal cycling tests for both types of welding are performed. Welding tests demonstrate problematic breakage behavior for parallel gap, but not for ultrasonic welding. In both cases deep thermal cycling is successful.
Design and implementation of parallel and distributed wargame simulation system
A. Ozaki; M. Furuichi; K. Takahashi; H. Matsukawa
2000-01-01
Simulation-based education and training, especially wargame simulations, are widely used in the defence modeling and simulation community. In order to train students and trainees efficiently, wargame simulations must have both high performance and high fidelity. In this paper, we discuss design and implementation issues for a tentative wargame simulation on a distributed system. This wargame ...
Simulation of Sheared Suspensions With a Parallel Implementation of QDPD
Bentz, Dale P.
A parallel quaternion-based dissipative particle dynamics (QDPD) program has been developed in Fortran ... By mapping the DPD equations of motion to the Fokker-Planck equation [4] ...
Parallel climate model (PCM) control and transient simulations
W. M. Washington; J. W. Weatherly; G. A. Meehl; A. J. Semtner Jr.; T. W. Bettge; A. P. Craig; W. G. Strand Jr.; J. M. Arblaster; V. B. Wayland; R. James; Y. Zhang
2000-01-01
The Department of Energy (DOE) supported Parallel Climate Model (PCM) makes use of the NCAR Community Climate Model (CCM3) and Land Surface Model (LSM) for the atmospheric and land surface components, respectively, the DOE Los Alamos National Laboratory Parallel Ocean Program (POP) for the ocean component, and the Naval Postgraduate School sea-ice model. The PCM executes on several distributed and ...
High Performance System Framework for Parallel in-Silico Biological Simulations
Plamenka Borovska; Ognian Nakov; Veska Gancheva; Ivailo Georgiev
2011-01-01
The parallel implementation of methods and algorithms for the analysis of biological data using high-performance computing is essential for accelerating research and reducing investment. The paper presents a high-performance framework for carrying out scientific experiments in the area of bioinformatics, based on parallel computer simulations on a heterogeneous compact computer cluster. Several of the most popular and ...
Jiun-Ming Hsu; Prithviraj Banerjee
1990-01-01
This paper presents the performance evaluation, workload characterization and trace-driven simulation of a hypercube multicomputer running realistic workloads. Six representative parallel applications were selected as benchmarks. Software monitoring techniques were then used to collect execution traces. Based on the measurement results, we investigated both the computation and communication behavior of these parallel programs, including CPU utilization, computation task granularity, ...
Parallel Simulation of Peer-to-Peer Systems Martin Quinson, Cristian Rosa, Christophe Thiery
Paris-Sud XI, Université de
LORIA. In the context of Peer-to-Peer (P2P) protocols, most studies rely on simulation. Surprisingly enough, none ... to this context. The constraints posed on the simulator internals are presented, and an OS-inspired architecture ...
ACCELERATION OF RADIANCE FOR LIGHTING SIMULATION BY USING PARALLEL COMPUTING WITH OPENCL
Wangda Zuo. ... the acceleration of annual daylighting simulations for fenestration systems in the Radiance ray-tracing program ... To further accelerate the simulation speed, the calculation for matrix multiplications ...
Scalable Time-Parallelization of Molecular Dynamics Simulations in Nano Mechanics
Srinivasan, Ashok
Yanan Yu. Molecular Dynamics (MD) is an important atomistic simulation technique, widely used to simulate the behavior of physical systems in such applications ...
G. V. FRAZIER
1998-01-01
This research presents a Simulated Annealing based technique to address the assembly line balancing problem for multiple-objective problems when paralleling of workstations is permitted. The Simulated Annealing methodology is applied to 23 line balancing strategies across seven problems. The resulting performance of each solution was studied through a simulation experiment. Many of the problems consisted of multiple products, ...
Abstractions and Techniques for Parallel N-body Simulation, Michael S. Warren, John K. Salmon. ... generated as needed. In our implementation, we assign a Key to each particle, which is based on Morton ...
Three-dimensional shock wave physics simulations with MIMD PAGOSA on massively parallel computers
NASA Astrophysics Data System (ADS)
Gardner, D. R.; Vaughan, C. T.; Cline, D. D.
The numerical modeling of penetrator-armor interactions for design studies requires rapid, detailed, three-dimensional simulation of complex interactions of exotic materials at high speeds and high rates of strain. To perform such simulations, we have developed a multiple-instruction, multiple-data (MIMD) version of the PAGOSA hydrocode. The code includes a variety of models for material strength, fracture, and the detonation of high explosives. We present a typical armor/antiarmor penetration simulation conducted with this code, and measurements of its performance. The scaled speedups for MIMD PAGOSA on the 1024-processor nCUBE 2 parallel computer, measured as the simulation size is increased with the number of processors, reveal that small grind times (computational time per cell per cycle) and parallel scaled efficiencies of 90% can be achieved for realistic problems. This simulation demonstrates that massively parallel hydrocodes can provide rapid, highly-detailed armor/antiarmor simulations.
Three-dimensional shock wave physics simulations with MIMD PAGOSA on massively parallel computers
Gardner, D.R.; Vaughan, C.T. [Sandia National Labs., Albuquerque, NM (United States); Cline, D.D. [Texas Univ., Austin, TX (United States). Center for High Performance Computing
1992-12-31
The numerical modeling of penetrator-armor interactions for design studies requires rapid, detailed, three-dimensional simulation of complex interactions of exotic materials at high speeds and high rates of strain. To perform such simulations, we have developed a multiple-instruction, multiple-data (MIMD) version of the PAGOSA hydrocode. The code includes a variety of models for material strength, fracture, and the detonation of high explosives. We present a typical armor/antiarmor penetration simulation conducted with this code, and measurements of its performance. The scaled speedups for MIMD PAGOSA on the 1024-processor nCUBE 2 parallel computer, measured as the simulation size is increased with the number of processors, reveal that small grind times (computational time per cell per cycle) and parallel scaled efficiencies of 90% can be achieved for realistic problems. This simulation demonstrates that massively parallel hydrocodes can provide rapid, highly-detailed armor/ antiarmor simulations.
Biophysically Accurate Brain Modeling and Simulation using Hybrid MPI/OpenMP Parallel Processing
Hu, Jingzhen
2012-07-16
A Thesis by Jingzhen Hu, submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment ... Copyright 2012 Jingzhen Hu.
BRADLEY J. BRUMMEL; DEBORAH E. RUPP; SETH M. SPAIN
2009-01-01
Assessment centers rely on multiple, carefully constructed behavioral simulation exercises to measure individuals on multiple performance dimensions. Although methods for establishing parallelism among alternate forms of paper-and-pencil tests have been well researched (i.e., to equate tests on difficulty such that the scores can be compared), little research has considered the why and how of parallel simulation exercises. This ...
iPRIDE: a parallel integrated circuit simulator using direct method
Mi-Chang Chang; I. N. Hajj
1988-01-01
A parallel circuit simulator, iPRIDE, which uses a direct solution method and runs on a shared-memory multiprocessor, is described. The simulator is based on a multilevel node-tearing approach which produces a nested bordered-block-diagonal (BBD) form of the circuit equation matrix. The parallel solution of the nested BBD matrix is described. Its efficiency is shown to depend on how the ...
Dynamics of a Parallel Platform for Helicopter Flight Simulation Considering Friction
D. L. Pisla; T. P. Itul; A. Pisla; B. Gherman
In the paper the inverse dynamical model with friction of a 6-DOF parallel structure destined for helicopter flight simulation is presented using the Newton-Euler equations. The obtained dynamical algorithms offer the possibility of a complex study of the parallel structure in order to evaluate the dynamic capabilities and to generate the control algorithms. Using a numerical and graphical simulation the ...
Hisashi Ishida; Mariko Higuchi; Yoshiteru Yonetani; Takuma Kano; Yasumasa Joti; Akio Kitao; Nobuhiro Go
The Earth Simulator has the highest power ever achieved for performing molecular dynamics simulations of large-scale supramolecular systems. We are developing a molecular dynamics simulation system, called PABIOS, which is designed to run a system composed of more than a million particles efficiently on parallel computers. To perform large-scale simulations rapidly and accurately, state-of-the-art algorithms, such as Particle-Particle ...
NASA Astrophysics Data System (ADS)
Rudd, Kevin Edward
In this dissertation, we present two parallelized 3D simulation techniques for three-dimensional acoustic and elastic wave propagation based on the finite integration technique. We demonstrate their usefulness in solving real-world problems with examples in the three very different areas of nondestructive evaluation, medical imaging, and security screening. More precisely, these include concealed weapons detection, periodontal ultrasography, and guided wave inspection of complex piping systems. We have employed these simulation methods to study complex wave phenomena and to develop and test a variety of signal processing and hardware configurations. Simulation results are compared to experimental measurements to confirm the accuracy of the parallel simulation methods.
Kothe, D.B.; Turner, J.A.; Mosso, S.J. [Los Alamos National Lab., NM (United States); Ferrell, R.C. [Cambridge Power Computer Assoc. (United States)
1997-03-01
We discuss selected aspects of a new parallel three-dimensional (3-D) computational tool for the unstructured mesh simulation of Los Alamos National Laboratory (LANL) casting processes. This tool, known as Telluride, draws upon robust, high resolution finite volume solutions of metal alloy mass, momentum, and enthalpy conservation equations to model the filling, cooling, and solidification of LANL castings. We briefly describe the current Telluride physical models and solution methods, then detail our parallelization strategy as implemented with Fortran 90 (F90). This strategy has yielded straightforward and efficient parallelization on distributed and shared memory architectures, aided in large part by the new parallel libraries JTpack90 for Krylov-subspace iterative solution methods and PGSLib for efficient gather/scatter operations. We illustrate our methodology and current capabilities with source code examples and parallel efficiency results for a LANL casting simulation.
Parallel molecular dynamics simulation on thin-film formation process
Huawei Chen; Ichiro Hagiwara; Dawei Zhang; Tian Huang
2005-01-01
Chemical vapor deposition is regarded as one of the most promising methods of epitaxial growth for materials such as thin films, nanotubes, etc. The properties of such thin films depend on the states of the cluster, such as initial velocity, size, etc. We developed a parallel molecular dynamics code using the embedded atom method (EAM) potential, which can make the scale ...
Towards scalable parallel-in-time turbulent flow simulations
Wang, Qiqi
We present a reformulation of unsteady turbulent flow simulations. The initial condition is relaxed and information is allowed to propagate both forward and backward in time. Simulations of chaotic dynamical systems with ...
Wang, Zhiteng; Zhang, Hongjun; Zhang, Rui; Li, Yong; Zhang, Xuliang
2014-01-01
Service-oriented modeling and simulation are hot issues in the field, and service resources must be called while a simulation task workflow is running. How to optimize the allocation of service resources to ensure that the task completes effectively is an important issue in this area. In military modeling and simulation, it is important to improve the probability of success and the timeliness of simulation task workflows. Therefore, this paper proposes an optimization algorithm for multipath service resource parallel allocation, in which a multipath service resource parallel allocation model is built and a multiple-chain coding scheme quantum optimization algorithm is used for optimization and solution. The multiple-chain coding scheme extends the parallel search space to improve search efficiency. Through simulation experiments, this paper investigates how different optimization algorithms, service allocation strategies, and path numbers affect the probability of success in simulation task workflows; the results show that the proposed algorithm is an effective method to improve the probability of success and timeliness in simulation task workflows. PMID:24963506
Parallel kinetic Monte Carlo simulations of Ag(111) island coarsening using a large database
NASA Astrophysics Data System (ADS)
Nandipati, Giridhar; Shim, Yunsic; Amar, Jacques G.; Karim, Altaf; Kara, Abdelkader; Rahman, Talat S.; Trushin, Oleg
2009-02-01
The results of parallel kinetic Monte Carlo (KMC) simulations of the room-temperature coarsening of Ag(111) islands carried out using a very large database obtained via self-learning KMC simulations are presented. Our results indicate that, while cluster diffusion and coalescence play an important role for small clusters and at very early times, at late time the coarsening proceeds via Ostwald ripening, i.e. large clusters grow while small clusters evaporate. In addition, an asymptotic analysis of our results for the average island size S(t) as a function of time t leads to a coarsening exponent n = 1/3 (where S(t)~t2n), in good agreement with theoretical predictions. However, by comparing with simulations without concerted (multi-atom) moves, we also find that the inclusion of such moves significantly increases the average island size. Somewhat surprisingly we also find that, while the average island size increases during coarsening, the scaled island-size distribution does not change significantly. Our simulations were carried out both as a test of, and as an application of, a variety of different algorithms for parallel kinetic Monte Carlo including the recently developed optimistic synchronous relaxation (OSR) algorithm as well as the semi-rigorous synchronous sublattice (SL) algorithm. A variation of the OSR algorithm corresponding to optimistic synchronous relaxation with pseudo-rollback (OSRPR) is also proposed along with a method for improving the parallel efficiency and reducing the number of boundary events via dynamic boundary allocation (DBA). A variety of other methods for enhancing the efficiency of our simulations are also discussed. We note that, because of the relatively high temperature of our simulations, as well as the large range of energy barriers (ranging from 0.05 to 0.8 eV), developing an efficient algorithm for parallel KMC and/or SLKMC simulations is particularly challenging. 
However, by using DBA to minimize the number of boundary events, we have achieved significantly improved parallel efficiencies for the OSRPR and SL algorithms. Finally, we note that, among the three parallel algorithms which we have tested here, the semi-rigorous SL algorithm with DBA led to the highest parallel efficiencies. As a result, we have obtained reasonable parallel efficiencies in our simulations of room-temperature Ag(111) island coarsening for a small number of processors (e.g. Np = 2 and 4). Since the SL algorithm scales with system size for fixed processor size, we expect that comparable and/or even larger parallel efficiencies should be possible for parallel KMC and/or SLKMC simulations of larger systems with larger numbers of processors.
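Independent of the parallel OSR/SL machinery discussed in this entry, the rejection-free event selection at the heart of any kinetic Monte Carlo simulation can be sketched in a few lines; the rate values below are hypothetical, standing in for hop rates with different energy barriers:

```python
import math
import random

def kmc_step(rates, rng=random):
    """One rejection-free (BKL-style) KMC step: choose an event with
    probability proportional to its rate, and draw the time increment
    from an exponential distribution with the total rate."""
    total = sum(rates)
    r = rng.random() * total
    acc = 0.0
    for i, rate in enumerate(rates):
        acc += rate
        if r < acc:
            chosen = i
            break
    dt = -math.log(rng.random()) / total  # exponential waiting time
    return chosen, dt

# hypothetical rates for three event types (e.g. hops over different barriers)
random.seed(42)
rates = [1.0, 0.1, 0.01]
counts = [0, 0, 0]
t = 0.0
for _ in range(10000):
    i, dt = kmc_step(rates)
    counts[i] += 1
    t += dt
print(counts)  # selections roughly proportional to the rates
```

The wide barrier range quoted in the abstract (0.05 to 0.8 eV) translates into rates spanning many orders of magnitude, which is exactly what makes load-balancing such a loop across processors difficult.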
Parallel, AMR MHD for Global Space Weather Simulations
Kenneth G. Powell; Darren L. De Zeeuw; Igor V. Sokolov; Gábor Tóth; Tamas Gombosi; Quentin Stout
This paper presents the methodology behind and results of adaptive mesh refinement in global magnetohydrodynamic models of the space environment. Techniques used in solving the governing equations of semi-relativistic magnetohydrodynamics (MHD) are presented. These techniques include high-resolution upwind schemes, block-based solution-adaptive grids, explicit, implicit and partial-implicit time-stepping, and domain decomposition for parallelization. Recent work done in coupling the MHD model
Parallel Simulation of High Reynolds Number Vascular Flows
Fischer, Paul F.
Argonne National Laboratory, Argonne, IL 60439, U.S.A. Introduction: The simulation of turbulent vascular flow, such as occurs in post-stenotic regions or subsequent to graft implantation, exhibits a much ...
Light Scattering Simulations with a Massively Parallel Computer at the IC3A
Amsterdam, Universiteit van
Peter M.A. Sloot. ... a powerful method to simulate the Elastic Light Scattering from arbitrary particles. This method however has ... Definition of the physical problem and the computational challenge: Elastic Light Scattering (ELS) ...
Development of a parallel optimization method based on genetic simulated annealing algorithm
Zhi-gang Wang; Yoke-san Wong; Mustafizur Rahman
2005-01-01
This paper presents a parallel genetic simulated annealing (PGSA) algorithm that has been developed and applied to optimize continuous problems. In PGSA, the entire population is divided into subpopulations, and in each subpopulation the algorithm uses the local search ability of simulated annealing after crossover and mutation. The best individuals of each subpopulation are migrated to neighboring ones after certain ...
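The scheme this abstract describes (subpopulations evolved by crossover and mutation, a simulated-annealing acceptance step, and periodic migration of the best individuals) can be illustrated with a toy sequential implementation; the function, parameter values, and structure are illustrative assumptions, not taken from the paper:

```python
import math
import random

random.seed(0)

def sa_accept(f_old, f_new, temp):
    # Metropolis rule: the simulated-annealing local-search acceptance step
    return f_new < f_old or random.random() < math.exp((f_old - f_new) / temp)

def evolve(pop, fitness, temp):
    # One generation for a subpopulation: crossover, mutation, SA acceptance
    nxt = []
    for parent in pop:
        mate = random.choice(pop)
        child = [(a + b) / 2 for a, b in zip(parent, mate)]  # crossover
        child = [g + random.gauss(0, 0.1) for g in child]    # mutation
        nxt.append(child if sa_accept(fitness(parent), fitness(child), temp)
                   else parent)
    return nxt

def pgsa(fitness, dim=2, n_sub=4, sub_size=10, gens=200):
    subs = [[[random.uniform(-5, 5) for _ in range(dim)]
             for _ in range(sub_size)] for _ in range(n_sub)]
    temp, best = 1.0, None
    for gen in range(gens):
        subs = [evolve(p, fitness, temp) for p in subs]
        if gen % 20 == 0:
            # migrate each subpopulation's best into its neighbour,
            # replacing the neighbour's worst individual
            bests = [min(p, key=fitness) for p in subs]
            for i in range(n_sub):
                worst = max(subs[i], key=fitness)
                subs[i][subs[i].index(worst)] = bests[i - 1]
        for p in subs:
            cand = min(p, key=fitness)
            if best is None or fitness(cand) < fitness(best):
                best = cand
        temp *= 0.98  # geometric cooling schedule
    return best

best = pgsa(lambda x: sum(g * g for g in x))  # minimize the sphere function
print(best)
```

In the parallel version described by the paper, each subpopulation would run on its own processor and only the migration step requires communication, which is what makes the scheme easy to parallelize.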
A Fast Algorithm for Massively Parallel, Long-Term Simulation of Complex Molecular Dynamics
Çagin, Tahir
In this paper a novel algorithm for the solution of constrained equations of motion, with application to the simulation of molecular dynamics systems, is presented. The algorithm enables ...
A Scalable Architecture for Crowd Simulation: Implementing a Parallel Action Server
Lozano, Miguel
A Scalable Architecture for Crowd Simulation: Implementing a Parallel Action Server G. Vigueras, M. Lozano. The scalability of crowd simulation is still an open issue. In this paper, we propose a scalable architecture for ... agents. Although several proposals have focused on the software architectures for these systems, the scalability
A Multidimensional Study on the Feasibility of Parallel Switch-Level Circuit Simulation
Yu-an Chen; Vikas Jha; Rajive Bagrodia
1997-01-01
This paper presents the results of an experimental study to evaluate the effectiveness of multiple synchronization protocols and partitioning algorithms in reducing the execution time of switch-level models of VLSI circuits. Specific contributions of this paper include: (i) parallelizing an existing switch-level simulator such that the model can be executed using conservative and optimistic simulation protocols with minor changes, (ii)
Kevin L. Kapp; Thomas C. Hartrum; Tom S. Wailes
1995-01-01
Distributing computation among multiple processors is one approach to reducing simulation time for large VLSI circuit designs. However, parallel simulation introduces the problem of how to partition the logic gates and system behaviors of the circuit among the available processors in order to obtain maximum speedup. A complicating factor that is often ignored is the effect of the time-synchronization protocol
TimeNET-Sim-a parallel simulator for stochastic Petri nets
Christian Kelling; E R. Germany
1995-01-01
TimeNET is a software package for modeling and performance evaluation with non-Markovian Petri nets. Concepts and implementation of the simulation component of this tool are introduced. The paper focuses on a reliable statistical analysis and the application of variance reduction techniques in a parallel, distributed simulation framework. It examines the application of variance reduction with control variates and shows an
An Improved Parallel Simulated Annealing Algorithm Used for Protein Structure Prediction
Yun-Ling Liu; Lan Tao
2006-01-01
This paper introduces an improved simulated annealing method, parallel simulated annealing with genetic crossover (PSAGC), and uses this method on the energy minimization problem of protein structure prediction. Through experiments on three real proteins, PSAGC is shown to be more effective than SA and PSA; it can achieve conformations which have lower energy values. The paper then investigates the crossover interval and crossover method
Johannes Roth; Jörg Stadler; Marco Brunelli; Dietmar Bunz; Franz Gähler; Jutta Hahn; Martin Hohl; Christof Horn; Jutta Kaiser; Ralf Mikulla; Günther Schaaf; Joachim Stelzer; Hans-Rainer Trebin
1999-01-01
We describe the current development status of IMD (ITAP Molecular Dynamics), a software package for classical molecular dynamics simulations on massively parallel computers. IMD is a general-purpose program which can be used for all kinds of two- and three-dimensional studies in condensed matter physics. In addition to the usual MD features, it contains a number of special routines for simulation of
Haibo Dong
2003-01-01
Due to the progress in computer technology in recent years, distributed memory parallel computer systems are rapidly gaining importance in direct numerical simulation (DNS) of the stability and transition of compressible boundary layers. In most works, explicit methods have mainly been used in such simulations to advance the compressible Navier-Stokes equations in time. However, the small wall-normal grid sizes for
Acceleration of Radiance for Lighting Simulation by Using Parallel Computing with OpenCL
Zuo, Wangda; McNeil, Andrew; Wetter, Michael; Lee, Eleanor
2011-09-06
We report on the acceleration of annual daylighting simulations for fenestration systems in the Radiance ray-tracing program. The algorithm was optimized to reduce both the redundant data input/output operations and the floating-point operations. To further accelerate the simulation speed, the calculation for matrix multiplications was implemented using parallel computing on a graphics processing unit. We used OpenCL, which is a cross-platform parallel programming language. Numerical experiments show that the combination of the above measures can speed up the annual daylighting simulations 101.7 times or 28.6 times when the sky vector has 146 or 2306 elements, respectively.
: A Scalable and Transparent System for Simulating MPI Programs
Perumalla, Kalyan S [ORNL
2010-01-01
is a scalable, transparent system for experimenting with the execution of parallel programs on simulated computing platforms. The level of simulated detail can be varied for application behavior as well as for machine characteristics. Unique features of are repeatability of execution, scalability to millions of simulated (virtual) MPI ranks, scalability to hundreds of thousands of host (real) MPI ranks, portability of the system to a variety of host supercomputing platforms, and the ability to experiment with scientific applications whose source-code is available. The set of source-code interfaces supported by is being expanded to support a wider set of applications, and MPI-based scientific computing benchmarks are being ported. In proof-of-concept experiments, has been successfully exercised to spawn and sustain very large-scale executions of an MPI test program given in source code form. Low slowdowns are observed, due to its use of purely discrete event style of execution, and due to the scalability and efficiency of the underlying parallel discrete event simulation engine, sik. In the largest runs, has been executed on up to 216,000 cores of a Cray XT5 supercomputer, successfully simulating over 27 million virtual MPI ranks, each virtual rank containing its own thread context, and all ranks fully synchronized by virtual time.
Partitioning and packing mathematical simulation models for calculation on parallel computers
NASA Technical Reports Server (NTRS)
Arpasi, D. J.; Milner, E. J.
1986-01-01
The development of multiprocessor simulations from a serial set of ordinary differential equations describing a physical system is described. Degrees of parallelism (i.e., coupling between the equations) and their impact on parallel processing are discussed. The problem of identifying computational parallelism within sets of closely coupled equations that require the exchange of current values of variables is described. A technique is presented for identifying this parallelism and for partitioning the equations for parallel solution on a multiprocessor. An algorithm which packs the equations into a minimum number of processors is also described. The results of the packing algorithm when applied to a turbojet engine model are presented in terms of processor utilization.
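The packing step described in this abstract can be illustrated with a classical bin-packing heuristic. This is a hedged sketch, not the paper's algorithm: `pack_equations`, the cost values, and the capacity are our illustrative assumptions, and first-fit decreasing merely shows the flavor of packing equations into a minimum number of processors.

```python
def pack_equations(costs, capacity):
    """Pack equations (given per-equation compute costs) onto processors
    using first-fit decreasing, never exceeding a per-processor budget.
    Returns a list of processors, each a list of equation indices."""
    order = sorted(range(len(costs)), key=lambda i: costs[i], reverse=True)
    processors, loads = [], []
    for i in order:
        # place the equation in the first processor with room for it
        for p, load in enumerate(loads):
            if load + costs[i] <= capacity:
                processors[p].append(i)
                loads[p] += costs[i]
                break
        else:
            # no existing processor fits: open a new one
            processors.append([i])
            loads.append(costs[i])
    return processors

procs = pack_equations([4, 8, 1, 4, 2, 1], capacity=10)
```

With these example costs, the six equations fit on two processors, each loaded to the capacity of 10.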
DL_POLY_2.0: A general-purpose parallel molecular dynamics simulation package
W. Smith; T. R. Forester
1996-01-01
DL_POLY_2.0 is a general-purpose parallel molecular dynamics simulation package developed at Daresbury Laboratory under the auspices of the Council for the Central Laboratory of the Research Councils. Written to support academic research, it has a wide range of applications and is designed to run on a wide range of computers: from single processor workstations to parallel supercomputers. Its structure, functionality,
Jianming Yang; Elias Balaras
A parallel embedded boundary approach for large-eddy simulations of turbulent flows with complex geometries and dynamically moving boundaries on fixed orthogonal grids is presented. The underlying solver is based on a second-order fractional step method on a staggered grid. The boundary conditions on an arbitrary immersed interface are satisfied via second-order local reconstructions. The parallelization is implemented via a slab
Parallel Many-Body Simulations Without All-to-All Communication
Bruce Hendrickson; Steve Plimpton Sandia
1993-01-01
Simulations of interacting particles are common in science and engineering, appearing in such diverse disciplines as astrophysics, fluid dynamics, molecular physics, and materials science. These simulations are often computationally intensive and so are natural candidates for massively parallel computing. Many-body simulations that directly compute interactions between pairs of particles, be they short-range or long-range interactions, have been parallelized in several standard ways. The...
Parallel Many-Body Simulations without All-to-All Communication
Bruce Hendrickson; Steve Plimpton
1995-01-01
Simulations of interacting particles are common in science and engineering, appearing in such diverse disciplines as astrophysics, fluid dynamics, molecular physics, and materials science. These simulations are often computationally intensive and so are natural candidates for massively parallel computing. Many-body simulations that directly compute interactions between pairs of particles, be they short-range or long-range interactions, have been parallelized in several standard ways. The...
Shared Memory Implementation of a Parallel Switch-Level Circuit Simulator
Yu-an Chen; Rajive Bagrodia
1998-01-01
Circuit simulation is a critical bottleneck in VLSI design. This paper describes the implementation of an existing parallel switch-level simulator called MIRSIM on a shared-memory multiprocessor architecture. The simulator uses a set of three different conservative protocols: the null message protocol, the conditional event protocol, and the accelerated null message protocol, a combination of the preceding two algorithms. The paper describes the implementation of these protocols to exploit...
Modeling and dynamic simulation of high voltage generator parallel in the grid
Ge Baojun; Zhao Jinshi; Tao Dajun; Zhang Zhiqiang; Lin Peng
2009-01-01
The simulation model of a high voltage generator is proposed in MATLAB/SIMULINK based on the equivalent mathematical model of the new type of generator. The simulation module is established in a single-machine infinite-bus system. A high voltage generator prototype rated at 162.5 kVA/5 kV is taken as the research object in this paper. The quasi-synchronizing parallel-in operation characteristic is studied with digital simulation,
Parallelized modelling and solution scheme for hierarchically scaled simulations
NASA Technical Reports Server (NTRS)
Padovan, Joe
1995-01-01
This two-part paper presents the results of a benchmarked analytical-numerical investigation into the operational characteristics of a unified parallel processing strategy for implicit fluid mechanics formulations. This hierarchical poly tree (HPT) strategy is based on multilevel substructural decomposition. The tree morphology is chosen to minimize memory, communications and computational effort. The methodology is general enough to apply to existing finite difference (FD), finite element (FEM), finite volume (FV) or spectral element (SE) based computer programs without an extensive rewrite of code. In addition to finding large reductions in memory, communications, and computational effort associated with a parallel computing environment, substantial reductions are generated in the sequential mode of application. Such improvements grow with increasing problem size. Along with a theoretical development of general 2-D and 3-D HPT, several techniques for expanding the problem size that the current generation of computers is capable of solving are presented and discussed. Among these techniques are several interpolative reduction methods. It was found that by combining several of these techniques, a relatively small interpolative reduction resulted in substantial performance gains. Several other unique features/benefits are discussed in this paper. Along with Part 1's theoretical development, Part 2 presents a numerical approach to the HPT along with four prototype CFD applications. These demonstrate the potential of the HPT strategy.
Large Eddy simulation of parallel blade-vortex interaction
NASA Astrophysics Data System (ADS)
Felten, Frederic; Lund, Thomas
2002-11-01
Helicopter Blade-Vortex Interaction (BVI) generally occurs under certain conditions of powered descent or during extreme maneuvering. The vibration and acoustic problems associated with the interaction of rotor tip vortices and the following blades are a major aerodynamic concern for the helicopter community. Numerous experimental and computational studies have been done over the last two decades in order to gain a better understanding of the physical mechanisms involved in BVI. The most severe interaction, in terms of generated noise, happens when the vortex filament is parallel to the blade, thus affecting a great portion of it. The majority of the previous numerical studies of parallel BVI fall within a potential flow framework. Some Navier-Stokes approaches using dissipative numerical methods and RANS-type turbulence models have also been attempted, but with limited success. The current investigation makes use of an incompressible, non-dissipative, kinetic energy conserving collocated mesh scheme in conjunction with a dynamic subgrid-scale model. The concentrated tip vortex is not attenuated as it is convected downstream and over a NACA-0012 airfoil. The lift, drag, moment and pressure coefficients induced by the passage of the vortex are monitored in time and compared with experimental data.
Parallel Algorithms for Time and Frequency Domain Circuit Simulation
Dong, Wei
2010-10-12
be computationally expensive, especially for ever-larger transistor circuits with more complex device models. Therefore, it is becoming increasingly desirable to accelerate circuit simulation. On the other hand, the emergence of multi-core machines offers a promising...
NASA Astrophysics Data System (ADS)
Seough, J.; Yoon, P. H.; Hwang, J.; Kim, K. H.
2014-12-01
In situ observations have shown that the measured electron temperature anisotropy in the expanding solar wind is regulated by the electron fire-hose instabilities (EFI), which could be excited by excessive parallel temperature anisotropy. It is known that for parallel propagation mode the enhanced transverse fluctuations driven by the parallel EFI are resonant with the ions. In the present study, nonlinear properties of the parallel EFI are investigated using one-dimensional particle-in-cell simulations with various initial proton plasma betas. It is found that the protons in resonance with the left-hand polarized EFI modes are anisotropically heated and subsequently their resonant interactions give rise to the excitation of the ion-acoustic waves (IAW). It is shown that the intensity of the excited IAW is proportional to the values of the electron to proton temperature ratio. In addition, the presence of the unusual electrostatic modes driven by nonlinear behavior of the protons, especially for the lower proton beta simulations, leads to the formation of the suprathermal component in the proton parallel velocity distribution, although the parallel proton temperature does not practically change throughout the simulation period.
A parallel algorithm for transient solid dynamics simulations with contact detection
Attaway, S.; Hendrickson, B.; Plimpton, S.; Gardner, D.; Vaughan, C.; Heinstein, M.; Peery, J.
1996-06-01
Solid dynamics simulations with Lagrangian finite elements are used to model a wide variety of problems, such as the calculation of impact damage to shipping containers for nuclear waste and the analysis of vehicular crashes. Using parallel computers for these simulations has been hindered by the difficulty of searching efficiently for material surface contacts in parallel. A new parallel algorithm for calculation of arbitrary material contacts in finite element simulations has been developed and implemented in the PRONTO3D transient solid dynamics code. This paper will explore some of the issues involved in developing efficient, portable, parallel finite element models for nonlinear transient solid dynamics simulations. The contact-detection problem poses interesting challenges for efficient implementation of a solid dynamics simulation on a parallel computer. The finite element mesh is typically partitioned so that each processor owns a localized region of the finite element mesh. This mesh partitioning is optimal for the finite element portion of the calculation since each processor must communicate only with the few connected neighboring processors that share boundaries with the decomposed mesh. However, contacts can occur between surfaces that may be owned by any two arbitrary processors. Hence, a global search across all processors is required at every time step to search for these contacts. Load-imbalance can become a problem since the finite element decomposition divides the volumetric mesh evenly across processors but typically leaves the surface elements unevenly distributed. In practice, these complications have been limiting factors in the performance and scalability of transient solid dynamics on massively parallel computers. In this paper the authors present a new parallel algorithm for contact detection that overcomes many of these limitations.
Z. G. Wang; M. Rahman; Y. S. Wong; J. Sun
2005-01-01
This paper presents an approach to select the optimal machining parameters for multi-pass milling. It is based on two recent approaches, genetic algorithm (GA) and simulated annealing (SA), which have been applied to many difficult combinatorial optimization problems with certain strengths and weaknesses. In this paper, a hybrid of GA and SA (GSA) is presented to use the strengths of
LISP based simulation generators for modeling complex space processes
NASA Technical Reports Server (NTRS)
Tseng, Fan T.; Schroer, Bernard J.; Dwan, Wen-Shing
1987-01-01
The development of a simulation assistant for modeling discrete event processes is presented. Included are an overview of the system, a description of the simulation generators, and a sample process generated using the simulation assistant.
NASA Technical Reports Server (NTRS)
Sohn, Andrew; Biswas, Rupak
1996-01-01
Solving the hard Satisfiability Problem is time consuming even for modest-sized problem instances. Solving the Random L-SAT Problem is especially difficult due to the ratio of clauses to variables. This report presents a parallel synchronous simulated annealing method for solving the Random L-SAT Problem on a large-scale distributed-memory multiprocessor. In particular, we use a parallel synchronous simulated annealing procedure, called Generalized Speculative Computation, which guarantees the same decision sequence as sequential simulated annealing. To demonstrate the performance of the parallel method, we have selected problem instances varying in size from 100-variables/425-clauses to 5000-variables/21,250-clauses. Experimental results on the AP1000 multiprocessor indicate that our approach can satisfy 99.9 percent of the clauses while giving almost a 70-fold speedup on 500 processors.
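The "same decision sequence as sequential simulated annealing" property can be illustrated with a toy model of speculative computation. This is our own sketch, not the paper's Generalized Speculative Computation implementation: here each of `width` notional processors speculates that all earlier moves in its window were rejected, which is exactly the case in which its precomputed evaluation remains valid.

```python
import math
import random

def decisions_sequential(f, x0, moves, uniforms, temp=1.0):
    """Reference serial SA: record the accept/reject decision for each move."""
    x, taken = x0, []
    for m, u in zip(moves, uniforms):
        cand = x + m
        accept = f(cand) < f(x) or u < math.exp(-(f(cand) - f(x)) / temp)
        taken.append(accept)
        if accept:
            x = cand
    return taken

def decisions_speculative(f, x0, moves, uniforms, temp=1.0, width=4):
    """Speculative variant: evaluate a window of candidate moves from the
    committed state (concurrently, on a real machine), then commit decisions
    in order. Decisions up to the window's first acceptance match the serial
    run exactly, because rejected moves do not change the state."""
    x, taken, i = x0, [], 0
    while i < len(moves):
        window = list(range(i, min(i + width, len(moves))))
        evals = {j: f(x + moves[j]) for j in window}  # speculative evaluations
        fx = f(x)
        for j in window:
            accept = evals[j] < fx or uniforms[j] < math.exp(-(evals[j] - fx) / temp)
            taken.append(accept)
            i = j + 1
            if accept:
                x = x + moves[j]  # later speculations in this window are stale
                break
    return taken

random.seed(1)
moves = [random.gauss(0, 1) for _ in range(32)]
unis = [random.random() for _ in range(32)]
cost = lambda x: x * x
seq = decisions_sequential(cost, 3.0, moves, unis)
spec = decisions_speculative(cost, 3.0, moves, unis)
```

Both runs consume the same pre-drawn random stream, so the speculative schedule reproduces the serial decision sequence bit-for-bit; speculation pays off whenever rejections dominate, as is typical late in an annealing schedule.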
Virtual reality visualization of parallel molecular dynamics simulation
Disz, T.; Papka, M.; Stevens, R.; Pellegrino, M. [Argonne National Lab., IL (United States); Taylor, V. [Northwestern Univ., Evanston, IL (United States). Electrical Engineering and Computer Science
1995-12-31
When performing communications mapping experiments for massively parallel processors, it is important to be able to visualize the mappings and resulting communications. In a molecular dynamics model, visualization of the atom to atom interaction and the processor mappings provides insight into the effectiveness of the communications algorithms. The basic quantities available for visualization in a model of this type are the number of molecules per unit volume, the mass, and velocity of each molecule. The computational information available for visualization is the atom to atom interaction within each time step, the atom to processor mapping, and the energy rescaling events. We use the CAVE (CAVE Automatic Virtual Environment) to provide interactive, immersive visualization experiences.
Parallel computation for reservoir thermal simulation: An overlapping domain decomposition approach
NASA Astrophysics Data System (ADS)
Wang, Zhongxiao
2005-11-01
This dissertation concerns parallel computing for the thermal simulation of multicomponent, multiphase fluid flow in petroleum reservoirs. We report the development and applications of such a simulator. Unlike many efforts made to parallelize locally the solver of a linear equations system, which affects the performance the most, this research takes a global parallelization strategy by decomposing the computational domain into smaller subdomains. This dissertation addresses the domain decomposition techniques and, based on the comparison, adopts an overlapping domain decomposition method. This global parallelization method hands over each subdomain to a single processor of the parallel computer to process. Communication is required when handling overlapping regions between subdomains. For this purpose, MPI (message passing interface) is used for data communication and communication control. A physical and mathematical model is introduced for the reservoir thermal simulation. Numerical tests on two sets of industrial data from practical oilfields indicate that this model and the parallel implementation match the history data accurately. Therefore, we expect to use both the model and the parallel code to predict oil production and guide the design, implementation and real-time fine tuning of new well operating schemes. A new adaptive mechanism to synchronize processes on different processors has been introduced, which not only ensures the computational accuracy but also improves the time performance. To accelerate the convergence rate of iterative solution of the large linear equations systems derived from the discretization of the governing equations of our physical and mathematical model in space and time, we adopt the ORTHOMIN method in conjunction with an incomplete LU factorization preconditioning technique.
Important improvements have been made in both ORTHOMIN method and incomplete LU factorization in order to enhance time performance without affecting the convergence rate of iterative solution. More importantly, the parallel implementation may serve as a working platform for any further research, for example, building and testing new physical and mathematical models, developing and testing new solver of pertinent linear equations system, etc.
A conflict-free, path-level parallelization approach for sequential simulation algorithms
NASA Astrophysics Data System (ADS)
Rasera, Luiz Gustavo; Machado, Péricles Lopes; Costa, João Felipe C. L.
2015-07-01
Pixel-based simulation algorithms are the most widely used geostatistical technique for characterizing the spatial distribution of natural resources. However, sequential simulation does not scale well for stochastic simulation on very large grids, which are now commonly found in many petroleum, mining, and environmental studies. With the availability of multiple-processor computers, there is an opportunity to develop parallelization schemes for these algorithms to increase their performance and efficiency. Here we present a conflict-free, path-level parallelization strategy for sequential simulation. The method consists of partitioning the simulation grid into a set of groups of nodes and delegating all available processors for simulation of multiple groups of nodes concurrently. An automated classification procedure determines which groups are simulated in parallel according to their spatial arrangement in the simulation grid. The major advantage of this approach is that it does not require conflict resolution operations, and thus allows exact reproduction of results. Besides offering a large performance gain when compared to the traditional serial implementation, the method provides efficient use of computational resources and is generic enough to be adapted to several sequential algorithms.
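The conflict-free classification described above can be sketched with a simple block-coloring scheme. This is a minimal illustration under our own assumptions (square blocks at least one search radius wide, a 2x2 checkerboard of classes); the paper's automated classification procedure is more general.

```python
def color_blocks(nx, ny, block, radius):
    """Assign each block of an nx-by-ny grid a color such that same-colored
    blocks are separated by at least `radius` cells and therefore have
    disjoint search neighborhoods."""
    assert block >= radius, "blocks must span at least one search radius"
    colors = {}
    for bi in range((nx + block - 1) // block):
        for bj in range((ny + block - 1) // block):
            colors[(bi, bj)] = (bi % 2, bj % 2)  # 2x2 checkerboard of classes
    return colors

# Group blocks by color: all blocks in one group can be simulated
# concurrently with no conflict-resolution operations, and processing
# the groups one after another reproduces the serial result exactly.
colors = color_blocks(nx=8, ny=8, block=2, radius=2)
groups = {}
for blk, c in colors.items():
    groups.setdefault(c, []).append(blk)
```

Because any two same-colored blocks are at least one full block apart, their simulation neighborhoods cannot overlap, which is the property that makes the parallel path conflict-free.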
Constructing School Timetables Using Simulated Annealing: Sequential and Parallel Algorithms
D. Abramson
1991-01-01
This paper considers a solution to the school timetabling problem. The timetabling problem involves scheduling a number of tuples, each consisting of a class of students, a teacher, a subject and a room, to a fixed number of time slots. A Monte Carlo scheme called simulated annealing is used as an optimisation technique. The paper introduces the timetabling problem, and then
Error-Free Parallel Implementation of Simulated Annealing
Pierre Roussel-ragot; Nadia Kouicem; Gérard Dreyfus
1990-01-01
Simulated Annealing is a powerful optimization method which has been applied to a number of industrial problems; however, it often requires large computation times. One way to overcome this drawback is the design of finely tuned cooling schedules, which may ensure a fast convergence of the sequential algorithm towards near-optimal solutions. Another way of decreasing the computation time is the
Gedney, S.D.
1990-12-01
The Parallel-Plate Bounded-Wave EMP Simulator is typically used to test the vulnerability of electronic systems to the electromagnetic pulse (EMP) produced by a high altitude nuclear burst by subjecting the systems to a simulated EMP environment. However, when large test objects are placed within the simulator for investigation, the desired EMP environment may be affected by the interaction between the simulator and the test object. This simulator/obstacle interaction can be attributed to the following phenomena: (1) mutual coupling between the test object and the simulator, (2) fringing effects due to the finite width of the conducting plates of the simulator, and (3) multiple reflections between the object and the simulator's tapered end-sections. When the interaction is significant, the measurement of currents coupled into the system may not accurately represent those induced by an actual EMP. To better understand the problem of simulator/obstacle interaction, a dynamic analysis of the fields within the parallel-plate simulator is presented. The fields are computed using a moment method solution based on a wire mesh approximation of the conducting surfaces of the simulator. The fields within an empty simulator are found to be predominantly transverse electromagnetic (TEM) for frequencies within the simulator's bandwidth, properly simulating the properties of the EMP propagating in free space. However, when a large test object is placed within the simulator, it is found that the currents induced on the object can be quite different from those on an object situated in free space. A comprehensive study of the mechanisms contributing to this deviation is presented.
NASA Technical Reports Server (NTRS)
Fijany, Amir (inventor); Bejczy, Antal K. (inventor)
1993-01-01
This is a real-time robotic controller and simulator which is a MIMD-SIMD parallel architecture for interfacing with an external host computer and providing a high degree of parallelism in computations for robotic control and simulation. It includes a host processor for receiving instructions from the external host computer and for transmitting answers to the external host computer. There are a plurality of SIMD microprocessors, each SIMD processor being a SIMD parallel processor capable of exploiting fine grain parallelism and further being able to operate asynchronously to form a MIMD architecture. Each SIMD processor comprises a SIMD architecture capable of performing two matrix-vector operations in parallel while fully exploiting parallelism in each operation. There is a system bus connecting the host processor to the plurality of SIMD microprocessors and a common clock providing a continuous sequence of clock pulses. There is also a ring structure interconnecting the plurality of SIMD microprocessors and connected to the clock for providing the clock pulses to the SIMD microprocessors and for providing a path for the flow of data and instructions between the SIMD microprocessors. The host processor includes logic for controlling the RRCS by interpreting instructions sent by the external host computer, decomposing the instructions into a series of computations to be performed by the SIMD microprocessors, using the system bus to distribute associated data among the SIMD microprocessors, and initiating activity of the SIMD microprocessors to perform the computations on the data by procedure call.
Strong-strong beam-beam simulation on parallel computer
Qiang, Ji
2004-08-02
The beam-beam interaction puts a strong limit on the luminosity of the high energy storage ring colliders. At the interaction points, the electromagnetic fields generated by one beam focus or defocus the opposite beam. This can cause beam blowup and a reduction of luminosity. An accurate simulation of the beam-beam interaction is needed to help optimize the luminosity in high energy colliders.
A parallel simulated annealing algorithm for standard cell placement on a hypercube computer
NASA Technical Reports Server (NTRS)
Jones, Mark Howard
1987-01-01
A parallel version of a simulated annealing algorithm is presented which is targeted to run on a hypercube computer. A strategy for mapping the cells in a two dimensional area of a chip onto processors in an n-dimensional hypercube is proposed such that both small and large distance moves can be applied. Two types of moves are allowed: cell exchanges and cell displacements. The computation of the cost function in parallel among all the processors in the hypercube is described along with a distributed data structure that needs to be stored in the hypercube to support parallel cost evaluation. A novel tree broadcasting strategy is used extensively in the algorithm for updating cell locations in the parallel environment. Studies on the performance of the algorithm on example industrial circuits show that it is faster and gives better final placement results than the uniprocessor simulated annealing algorithms. An improved uniprocessor algorithm is proposed which is based on the improved results obtained from parallelization of the simulated annealing algorithm.
Characterization of parallel-hole collimator using Monte Carlo Simulation
Pandey, Anil Kumar; Sharma, Sanjay Kumar; Karunanithi, Sellam; Kumar, Praveen; Bal, Chandrasekhar; Kumar, Rakesh
2015-01-01
Objective: Accuracy of in vivo activity quantification improves after the correction of penetrated and scattered photons. However, accurate assessment is not possible with a physical experiment. We have used Monte Carlo simulation to accurately assess the contribution of penetrated and scattered photons in the photopeak window. Materials and Methods: Simulations were performed with the Simulation of Imaging Nuclear Detectors Monte Carlo code. The simulations were set up in such a way that they provide geometric, penetration, and scatter components after each simulation and write binary images to a data file. These components were analyzed graphically using Microsoft Excel (Microsoft Corporation, USA). Each binary image was imported into software (ImageJ) and a logarithmic transformation was applied for visual assessment of image quality, plotting a profile across the center of the images, and calculating the full width at half maximum (FWHM) in the horizontal and vertical directions. Results: The geometric, penetration, and scatter components at 140 keV for the low-energy general-purpose (LEGP) collimator were 93.20%, 4.13%, and 2.67%, respectively. Similarly, the geometric, penetration, and scatter components at 140 keV for the low-energy high-resolution (LEHR), medium-energy general-purpose (MEGP), and high-energy general-purpose (HEGP) collimators were (94.06%, 3.39%, 2.55%), (96.42%, 1.52%, 2.06%), and (96.70%, 1.45%, 1.85%), respectively. For the MEGP collimator at 245 keV and the HEGP collimator at 364 keV, they were 89.10%, 7.08%, 3.82% and 67.78%, 18.63%, 13.59%, respectively. Conclusion: The LEGP and LEHR collimators are best to image 140 keV photons. HEGP can be used for 245 keV and 364 keV; however, correction for penetration and scatter must be applied if one is interested in quantifying the in vivo activity at 364 keV. Due to heavy penetration and scattering, 511 keV photons should not be imaged with the HEGP collimator. PMID:25829730
A parallel computational framework for integrated surface-subsurface flow and transport simulations
NASA Astrophysics Data System (ADS)
Park, Y.; Hwang, H.; Sudicky, E. A.
2010-12-01
HydroGeoSphere is a 3D control-volume finite element hydrologic model describing fully integrated surface and subsurface water flow and solute and thermal energy transport. Because the model solves tightly coupled, highly nonlinear partial differential equations and is often applied at regional and continental scales (for example, to analyze the impact of climate change on water resources), high-performance computing (HPC) is essential. The target parallelization includes the composition of the Jacobian matrix for the iterative linearization method and the sparse-matrix solver, a preconditioned Bi-CGSTAB. The matrix assembly is parallelized using a coarse-grained scheme in which the local matrix compositions can be performed independently. The preconditioned Bi-CGSTAB algorithm performs a number of LU substitutions, matrix-vector multiplications, and inner products, where the parallelization of the LU substitution is not trivial. The parallelization of the solver is achieved by partitioning the domain into equal-size subdomains, with an efficient reordering scheme. The computational flow of the Bi-CGSTAB solver is also modified to reduce the parallelization overhead and to suit parallel architectures. The parallelized model is tested on several benchmark simulations, which include linear and nonlinear flow problems involving various domain sizes and degrees of hydrologic complexity. The performance is evaluated in terms of computational robustness and efficiency, using standard scaling performance measures. The results of simulation profiling indicate that the efficiency becomes higher with an increasing number of nodes/elements in the mesh, for increasingly nonlinear transient simulations, and with domains of irregular geometry. These characteristics are promising for the large-scale analysis of water resources problems involving integrated surface/subsurface flow regimes.
Parallel FEM Simulation of Electromechanics in the Heart
NASA Astrophysics Data System (ADS)
Xia, Henian; Wong, Kwai; Zhao, Xiaopeng
2011-11-01
Cardiovascular disease is the leading cause of death in America. Computer simulation of the complicated dynamics of the heart could provide valuable quantitative guidance for the diagnosis and treatment of heart problems. In this paper, we present an integrated numerical model which encompasses the interaction of cardiac electrophysiology, electromechanics, and mechanoelectrical feedback. The model is solved by the finite element method on a Linux cluster and the Cray XT5 supercomputer Kraken. Dynamical influences between the effects of electromechanical coupling and mechanoelectrical feedback are shown.
GalaxSee HPC Module 1: The N-Body Problem, Serial and Parallel Simulation
NSDL National Science Digital Library
David Joiner
This module introduces the N-body problem, which seeks to account for the dynamics of systems of multiple interacting objects. Galaxy dynamics serves as the motivating example to introduce a variety of computational methods for simulating change and criteria that can be used to check for model accuracy. Finally, the basic issues and ideas that must be considered when developing a parallel implementation of the simulation are introduced.
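The direct-sum force calculation at the heart of the N-body problem can be sketched as follows (a minimal serial Python version with softening and a leapfrog integrator; all names are illustrative, not taken from the GalaxSee code):

```python
import math

def accelerations(masses, positions, G=1.0, eps=1e-3):
    """Direct-sum gravitational accelerations: O(N^2) pairwise interactions,
    with softening eps to avoid the singularity at zero separation."""
    n = len(masses)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(d * d for d in dx) + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * masses[j] * dx[k] * inv_r3
    return acc

def leapfrog_step(masses, positions, velocities, dt):
    """One kick-drift-kick leapfrog step, a common choice for N-body codes."""
    n = len(masses)
    a0 = accelerations(masses, positions)
    vh = [[velocities[i][k] + 0.5 * dt * a0[i][k] for k in range(3)] for i in range(n)]
    pos = [[positions[i][k] + dt * vh[i][k] for k in range(3)] for i in range(n)]
    a1 = accelerations(masses, pos)
    vel = [[vh[i][k] + 0.5 * dt * a1[i][k] for k in range(3)] for i in range(n)]
    return pos, vel
```

The inner double loop over pairs is what a parallel implementation distributes across processors, which is the step the module's parallelization discussion targets.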
ParaSol: a multithreaded system for parallel simulation based on mobile threads
Edward Mascarenhas; Felipe Knop; Vernon Rego
1995-01-01
ParaSol is a novel multithreaded system for shared- and distributed-memory parallel simulation, designed to support a variety of domain-specific simulation object libraries. We report on the design of the ParaSol kernel, which drives executions based on optimistic and adaptive synchronization protocols. The active-transaction flow methodology we advocate is enabled by an underlying, efficient lightweight process system. Though this process- and object-interaction
NASA Astrophysics Data System (ADS)
Neumann, Rebecca; Bastian, Peter; Ippisch, Olaf
2013-04-01
Carbon capture and storage is simulated with a two-phase, two-component flow model employing a special set of primary variables (capillary pressure / phase pressure) to deal with the (dis-)appearance of the non-wetting phase. The implementation is based on DUNE PDELab. Numerical results of massively parallel simulations for test cases with millions of unknowns on the supercomputer HERMIT (Cray XE6) are presented and discussed.
Xyce parallel electronic simulator reference guide, Version 6.0.1.
Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Verley, Jason C.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Warrender, Christina E.; Baur, David Gregory [Raytheon, Albuquerque, NM]
2014-01-01
This document is a reference guide to the Xyce Parallel Electronic Simulator and is a companion to the Xyce Users' Guide [1]. The focus of this document is to list, as exhaustively as possible, the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide [1].
A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL)
NASA Technical Reports Server (NTRS)
Carroll, Chester C.; Owen, Jeffrey E.
1988-01-01
A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL) is presented which overcomes the traditional disadvantages of simulations executed on a digital computer. The incorporation of parallel processing allows the mapping of simulations into a digital computer to be done in the same inherently parallel manner as they are currently mapped onto an analog computer. The direct-execution format maximizes the efficiency of the executed code since the need for a high-level language compiler is eliminated. Resolution is greatly increased over that which is available with an analog computer, without the sacrifice in execution speed normally expected with digital computer simulations. Although this report covers all aspects of the new architecture, key emphasis is placed on the processing element configuration and the microprogramming of the ACSL constructs. The execution times for all ACSL constructs are computed using a model of a processing element based on the AMD 29000 CPU and the AMD 29027 FPU. The increase in execution speed provided by parallel processing is exemplified by comparing the derived execution times of two ACSL programs with the execution times for the same programs executed on a similar sequential architecture.
Parallel Discrete Molecular Dynamics Simulation with Speculation (to appear in Journal of Computational Physics)
Herbordt, Martin
Parallel Discrete Molecular Dynamics Simulation with Speculation. To appear in Journal of Computational Physics. Boston University CAAD Lab, Boston, MA 02215; www.bu.edu/caadlab; email: azkhan@bu.edu, herbordt@bu.edu. Abstract: Discrete molecular dynamics
A study on parallel processing of electro-magnetic transient simulation
Yasuyuki Kowada; Isao Iyoda; Nobuyuki Sato; Akira Yamazaki; Seiichi Matoba
1995-01-01
This paper presents a parallel processing method for power system electromagnetic transient simulation. It describes an outline of the algorithm, improvements in the combined calculation between generators and networks, the relationship between the granularity of the divided subsystems and execution time, and an evaluation of precision by comparison with EMTP results
A study on parallel processing of electro-magnetic transient simulation
Kowada, Yasuyuki; Iyoda, Isao [Mitsubishi Electric Corp., Kobe, Hyogo (Japan)]; Sato, Nobuyuki; Yamazaki, Akira [Tokyo Electric Power Co., Chofu, Tokyo (Japan)]; Matoba, Seiichi [Kaihatsu Computing Service Center Ltd., Tokyo (Japan)]
1995-08-01
This paper presents a parallel processing method for electro-magnetic transient simulation. It describes an outline of the algorithm, improvements in the combined calculation between generators and networks, the relationship between the granularity of the divided subsystems and execution time, and an evaluation of precision by comparison with EMTP results.
Patrick Odent; Luc J. M. Claesen; Hugo De Man
1990-01-01
This paper presents two new techniques for accelerating circuit simulation. The first technique is an improvement of the parallel Waveform Relaxation Newton (WRN) method. The computations of all the timepoints are executed concurrently. Static task partitioning is shown to be an efficient method to limit the scheduling overhead. The second technique combines in a dynamic way the efficiency of the
An Architecture for Large ModSAF Simulations Using Scalable Parallel Processors
Sharon Brunett; Thomas Gottschalk
1997-01-01
An implementation of ModSAF for Scalable Parallel Processors (SPPs) is presented. This model exploits the large number of processing elements and fast interprocessor communications of SPPs to simulate many thousands of vehicles on a single SPP. The implementation uses a heterogeneous assignment of tasks to processors, with most processors running independent copies of the standard SAFSim code and additional
Wisconsin Wind Tunnel II: A Fast and Portable Parallel Architecture Simulator
Wood, David A.
Wisconsin Wind Tunnel II: A Fast and Portable Parallel Architecture Simulator. Shubhendu S., 1997. Computer Sciences Department, University of Wisconsin-Madison, 1210 West Dayton Street, Madison, Wisconsin 53706-1685, USA. URL: http://www.cs.wisc.edu/~wwt; email: wwt@cs.wisc.edu; EECS Department
Parallel Object Oriented Implementation of a 2D Bounded Electrostatic Plasma PIC Simulation \\Lambda
Bystroff, Chris
NASA Offices of Aeronautics, Mission to Planet Earth, and Space Science; Department of Computer Science, Rensselaer Polytechnic Institute. The paper discusses issues involved in designing parallel programs using object-oriented techniques, simulations involving 1D models, and the conversion of procedural Fortran codes into object-oriented C++ versions, with a comparative analysis
LIBOR MARKET MODEL SIMULATION ON AN FPGA PARALLEL MACHINE Xiang Tian and Khaled Benkrid
Arslan, Tughrul
LIBOR Market Model Simulation on an FPGA Parallel Machine. Xiang Tian and Khaled Benkrid, UK; (X.Tian, k.benkrid)@ed.ac.uk. Abstract: In this paper, we present a high-performance, scalable FPGA implementation of derivative pricing based on the LIBOR market model. We implemented this design on the Maxwell FPGA supercomputer
Comparison of gyrokinetic simulations of parallel plasma conductivity with analytical models
NASA Astrophysics Data System (ADS)
Kiviniemi, T. P.; Leerink, S.; Niskala, P.; Heikkinen, J. A.; Korpilo, T.; Janhunen, S.
2014-07-01
A full f gyrokinetic particle-in-cell simulation including Coulomb collisions is shown to reproduce the results from analytic estimates for parallel plasma conductivity in the collisional parameter regime, with reasonably good agreement, when varying the temperature and impurity content of the plasma. Differences between the models are discussed.
SWiMNet: A Scalable Parallel Simulation Testbed for Wireless and Mobile Networks
Azzedine Boukerche; Sajal K. Das; Alessandro Fabbri
2001-01-01
We present a framework, called SWiMNet, for parallel simulation of wireless and mobile PCS networks, which allows realistic and detailed modeling of mobility, call traffic, and PCS network deployment. SWiMNet is based upon event precomputation and a combination of optimistic and conservative synchronization mechanisms. Event precomputation is the result of model independence within the global PCS network. Low percentage of
Practical Parallel Simulation Applied to Aviation Modeling, Dr. Frederick Wieland
Tropper, Carl
Practical Parallel Simulation Applied to Aviation Modeling. Dr. Frederick Wieland, Center for Advanced Aviation Systems Development, The MITRE Corporation, 1820 Dolley Madison Dr., McLean, VA 22102; fwieland@mitre.org. Abstract: This paper analyzes the Detailed Policy Assessment Tool (DPAT) as an example of a practical real
Synthetic Simulation of Mesh-Based Parallel Applications Driven by Fine-Grained Profiling
Qingyuan Liu; Amol S. Deshmukh; Karen A. Tomko
2005-01-01
We are interested in discovering the intrinsic dynamics of parallel applications, which are independent of runtime environment, to aid in the development of appropriate tuning policies, especially dynamic load balancing policies. Based on the novel idea of profiling mesh-based applications at a fine granularity of each mesh element, this paper proposes a synthetic application simulator which is driven by a
Parallel Algorithm for Detonation Wave Simulation P. Ravindran and F. K. Lu
Texas at Arlington, University of
Parallel Algorithm for Detonation Wave Simulation. P. Ravindran and F. K. Lu, Aerodynamics Research Center. Governing equations: the time-dependent conservation equations are those for an inviscid, non-heat-conducting flow. The flow was assumed to be unsteady, inviscid, and non-heat-conducting, reducing computational time without compromising accuracy.
Improving the Performance of the Extreme-scale Simulator
Engelmann, Christian [ORNL; Naughton, III, Thomas J [ORNL
2014-01-01
Investigating the performance of parallel applications at scale on future high-performance computing (HPC) architectures and the performance impact of different architecture choices is an important component of HPC hardware/software co-design. The Extreme-scale Simulator (xSim) is a simulation-based toolkit for investigating the performance of parallel applications at scale. xSim scales to millions of simulated Message Passing Interface (MPI) processes. The overhead introduced by a simulation tool is an important performance and productivity aspect. This paper documents two improvements to xSim: (1) a new deadlock resolution protocol to reduce the parallel discrete event simulation management overhead and (2) a new simulated MPI message matching algorithm to reduce the oversubscription management overhead. The results clearly show a significant performance improvement, such as reducing the simulation overhead for running the NAS Parallel Benchmark suite inside the simulator from 1,020% to 238% for the conjugate gradient (CG) benchmark and from 102% to 0% for the embarrassingly parallel (EP) benchmark, as well as from 37,511% to 13,808% for CG and from 3,332% to 204% for EP with accurate process failure simulation.
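For clarity, the overhead figures quoted above follow the usual definition of simulation overhead relative to native runtime (assumed here, since the abstract does not spell it out):

```python
def overhead_percent(t_simulated, t_native):
    """Overhead introduced by running a benchmark inside a simulator,
    expressed as a percentage of the native (unsimulated) runtime.
    This is the conventional definition; the paper's exact formula
    is an assumption on our part."""
    return 100.0 * (t_simulated - t_native) / t_native
```

For example, a run taking 11.2 s inside the simulator versus 1.0 s natively corresponds to the 1,020% figure reported for CG before the improvements.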
Stepp, Justin Wayne
2011-10-21
-fuel requires the implementation of efficient culturing processes to maximize production and reduce costs. Therefore, three discrete rate event simulation models were developed to analyze different scaling scenarios and determine total costs associated with each...
Parallel Simulation of Three-Dimensional Free Surface Fluid Flow Problems
BAER,THOMAS A.; SACKINGER,PHILIP A.; SUBIA,SAMUEL R.
1999-10-14
Simulation of viscous three-dimensional fluid flow typically involves a large number of unknowns. When free surfaces are included, the number of unknowns increases dramatically. Consequently, this class of problem is an obvious application of parallel high performance computing. We describe parallel computation of viscous, incompressible, free surface, Newtonian fluid flow problems that include dynamic contact lines. The Galerkin finite element method was used to discretize the fully coupled governing conservation equations, and a "pseudo-solid" mesh mapping approach was used to determine the shape of the free surface. In this approach, the finite element mesh is allowed to deform to satisfy quasi-static solid mechanics equations subject to geometric or kinematic constraints on the boundaries. As a result, nodal displacements must be included in the set of unknowns. Also discussed are the proper constraints appearing along the dynamic contact line in three dimensions. Issues affecting efficient parallel simulations include problem decomposition to equally distribute computational work across the processors of an SPMD computer and determination of robust, scalable preconditioners for the distributed matrix systems that must be solved. Solution continuation strategies important for serial simulations have an enhanced relevance in a parallel computing environment due to the difficulty of solving large-scale systems. Parallel computations will be demonstrated on an example taken from the coating flow industry: flow in the vicinity of a slot coater edge. This is a three-dimensional free surface problem possessing a contact line that advances at the web speed in one region but transitions to static behavior in another region. As such, a significant fraction of the computational time is devoted to processing boundary data. Discussion focuses on parallel speed-ups for fixed problem size, a class of problems of immediate practical importance.
Midpoint cell method for hybrid (MPI+OpenMP) parallelization of molecular dynamics simulations.
Jung, Jaewoon; Mori, Takaharu; Sugita, Yuji
2014-05-30
We have developed a new hybrid (MPI+OpenMP) parallelization scheme for molecular dynamics (MD) simulations by combining a cell-wise version of the midpoint method with pair-wise Verlet lists. In this scheme, which we call the midpoint cell method, simulation space is divided into subdomains, each of which is assigned to an MPI processor. Each subdomain is further divided into small cells. The interaction between two particles existing in different cells is computed in the subdomain containing the midpoint cell of the two cells where the particles reside. In each MPI processor, cell pairs are distributed over OpenMP threads for shared memory parallelization. The midpoint cell method keeps the advantages of the original midpoint method, while eliminating the per-particle-pair midpoint checks through a single midpoint cell determination prior to the MD simulation. Distributing cell pairs over OpenMP threads allows for more efficient shared memory parallelization compared with distributing atom indices over threads. Furthermore, cell grouping of particle data improves memory access, reducing the number of cache misses. The parallel performance of the midpoint cell method on the K computer showed scalability up to 512 and 32,768 cores for systems of 20,000 and 1 million atoms, respectively. One MD time step for long-range interactions could be calculated within 4.5 ms even for a 1-million-atom system with particle-mesh Ewald electrostatics. PMID:24659253
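The single midpoint cell determination described above can be sketched as follows (illustrative helper names, not the authors' code): for integer cell indices, the midpoint cell is found per dimension, and the subdomain owning that cell computes the pair interaction.

```python
def midpoint_cell(cell_a, cell_b):
    """Index of the cell nearest the midpoint of two cells, per dimension.
    Integer division breaks ties deterministically toward the lower index."""
    return tuple((a + b) // 2 for a, b in zip(cell_a, cell_b))

def interaction_owner(cell_a, cell_b, cell_to_rank):
    """The MPI rank that computes a pair interaction is the owner of the
    midpoint cell -- determined once for every cell pair before the MD run,
    which is what removes the per-particle-pair midpoint checks."""
    return cell_to_rank[midpoint_cell(cell_a, cell_b)]
```

Because the rule is symmetric in the two cells, both owners of the interacting particles agree on which rank computes the interaction without communication.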
Parallel Monte Carlo simulations on an ARC-enabled computing grid
NASA Astrophysics Data System (ADS)
Nilsen, Jon K.; Samset, Bjørn H.
2011-12-01
Grid computing opens new possibilities for running heavy Monte Carlo simulations of physical systems in parallel. The presentation gives an overview of GaMPI, a system for running an MPI-based random walker simulation on grid resources. Integrating the ARC middleware and the new storage system Chelonia with the Ganga grid job submission and control system, we show that MPI jobs can be run on a world-wide computing grid with good performance and promising scaling properties. Results for relatively communication-heavy Monte Carlo simulations run on multiple heterogeneous, ARC-enabled computing clusters in several countries are presented.
Design of a real-time wind turbine simulator using a custom parallel architecture
NASA Technical Reports Server (NTRS)
Hoffman, John A.; Gluck, R.; Sridhar, S.
1995-01-01
The design of a new parallel-processing digital simulator is described. The new simulator has been developed specifically for analysis of wind energy systems in real time. The new processor has been named the Wind Energy System Time-domain simulator, version 3 (WEST-3). Like previous WEST versions, WEST-3 performs many computations in parallel. The modules in WEST-3 are pure digital processors, however. These digital processors can be programmed individually and operated in concert to achieve real-time simulation of wind turbine systems. Because of this programmability, WEST-3 is much more flexible and general than its two predecessors. The design features of WEST-3 are described to show how the system produces high-speed solutions of nonlinear time-domain equations. WEST-3 has two very fast Computational Units (CU's) that use minicomputer technology plus special architectural features that make them many times faster than a microcomputer. These CU's are needed to perform the complex computations associated with the wind turbine rotor system in real time. The parallel architecture of the CU allows several tasks to be performed in each cycle, including an IO operation and the combination of a multiply, add, and store. The WEST-3 simulator can be expanded at any time for additional computational power. This is possible because the CU's are interfaced to each other and to other portions of the simulation using special serial buses. These buses can be 'patched' together in essentially any configuration (in a manner very similar to the programming methods used in analog computation) to balance the input/output requirements. CU's can be added in any number to share a given computational load. This flexible bus feature is very different from many other parallel processors, which usually have a throughput limit because of rigid bus architecture.
AUVNetSim: a Simulator for Underwater Acoustics Networks
Montana, Josep Miquel Jornet
AUVNetSim is a simulation library for testing acoustic networking algorithms. It is written in Python and makes extensive use of the SimPy discrete event simulation package. AUVNetSim is interesting for both end users and ...
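The SimPy package that the library builds on is, at its core, an event queue ordered by timestamp. A minimal sketch of that pattern in plain Python (illustrative names only, not the AUVNetSim or SimPy API):

```python
import heapq
import itertools

_seq = itertools.count()  # tie-breaker: same-time events pop in schedule order

def schedule(queue, time, name, handler):
    """Push an event (time, name, handler) onto the pending-event heap."""
    heapq.heappush(queue, (time, next(_seq), name, handler))

def run(queue, until):
    """Core discrete-event loop: pop the earliest event, advance the clock,
    and run its handler, which may schedule further events. SimPy's
    Environment wraps this same pattern in a generator-based process API."""
    trace = []
    while queue and queue[0][0] <= until:
        now, _, name, handler = heapq.heappop(queue)
        trace.append((now, name))
        handler(queue, now)
    return trace

# A toy periodic source, e.g. a node emitting one packet every 2 time units.
def make_source(period):
    def emit(queue, now):
        schedule(queue, now + period, "packet", emit)
    return emit

pending = []
schedule(pending, 0.0, "packet", make_source(2.0))
trace = run(pending, until=7.0)  # packets fire at t = 0, 2, 4, 6
```

The simulated clock jumps directly from event to event, which is what makes discrete-event simulation efficient for sparse network traffic.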
Using parallel computing for the display and simulation of the space debris environment
NASA Astrophysics Data System (ADS)
Möckel, M.; Wiedemann, C.; Flegel, S.; Gelhaus, J.; Vörsmann, P.; Klinkrad, H.; Krag, H.
2011-07-01
Parallelism is becoming the leading paradigm in today's computer architectures. In order to take full advantage of this development, new algorithms have to be specifically designed for parallel execution while many old ones have to be upgraded accordingly. One field in which parallel computing has been firmly established for many years is computer graphics. Calculating and displaying three-dimensional computer generated imagery in real time requires complex numerical operations to be performed at high speed on a large number of objects. Since most of these objects can be processed independently, parallel computing is applicable in this field. Modern graphics processing units (GPUs) have become capable of performing millions of matrix and vector operations per second on multiple objects simultaneously. As a side project, a software tool is currently being developed at the Institute of Aerospace Systems that provides an animated, three-dimensional visualization of both actual and simulated space debris objects. Due to the nature of these objects it is possible to process them individually and independently from each other. Therefore, an analytical orbit propagation algorithm has been implemented to run on a GPU. By taking advantage of all its processing power a huge performance increase, compared to its CPU-based counterpart, could be achieved. For several years efforts have been made to harness this computing power for applications other than computer graphics. Software tools for the simulation of space debris are among those that could profit from embracing parallelism. With recently emerged software development tools such as OpenCL it is possible to transfer the new algorithms used in the visualization outside the field of computer graphics and implement them, for example, into the space debris simulation environment. This way they can make use of parallel hardware such as GPUs and Multi-Core-CPUs for faster computation. 
In this paper the visualization software will be introduced, including a comparison between the serial and the parallel method of orbit propagation. Ways of how to use the benefits of the latter method for space debris simulation will be discussed. An introduction to OpenCL will be given as well as an exemplary algorithm from the field of space debris simulation.
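The analytical orbit propagation described above parallelizes because each debris object is independent of every other. A minimal CPU-side sketch (circular orbits only; the function names and the circular-orbit simplification are illustrative, not the tool's actual algorithm):

```python
import math

MU_EARTH = 398600.4418  # Earth's gravitational parameter, km^3/s^2

def propagate_circular(a_km, phase0_rad, t_s):
    """Analytical position of one object on a circular orbit after t seconds.
    Each object depends only on its own elements, so large populations can be
    propagated independently -- one GPU thread (or CPU core) per object."""
    n = math.sqrt(MU_EARTH / a_km ** 3)  # mean motion [rad/s]
    phase = phase0_rad + n * t_s
    return a_km * math.cos(phase), a_km * math.sin(phase)

def propagate_all(objects, t_s):
    """The embarrassingly parallel sweep: this loop is what an OpenCL or
    CUDA kernel would replace with one work-item per object."""
    return [propagate_circular(a, ph, t_s) for a, ph in objects]
```

Because there is no data dependency between iterations of `propagate_all`, the speedup from moving it to a GPU is limited mainly by memory transfer, which matches the performance behavior the paper reports.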
Knowledge-based environment for hierarchical modeling and simulation
Kim, Taggon.
1988-01-01
This dissertation develops a knowledge-based environment for hierarchical modeling and simulation of discrete-event systems as the major part of a longer, ongoing research project in artificial intelligence and distributed simulation. In developing the environment, a knowledge representation framework for modeling and simulation, which unifies structural and behavioral knowledge of simulation models, is proposed by incorporating knowledge-representation schemes in artificial intelligence within simulation models. The knowledge base created using the framework is composed of a structural knowledge base called the entity structure base and a behavioral knowledge base called the model base. The DEVS-Scheme, a realization of the DEVS (Discrete Event System Specification) formalism in a LISP-based, object-oriented environment, is extended to facilitate the specification of behavioral knowledge of models, especially for kernel models that are suited to model massively parallel computer architectures. The ESP Scheme, a realization of the entity structure formalism in a frame-theoretic representation, is extended to represent structural knowledge of models and to manage it in the structural knowledge base.
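The DEVS formalism mentioned above specifies an atomic model by its internal transition, output, and time-advance functions. A minimal sketch of that structure and an abstract simulator loop (a periodic job generator; illustrative only, not the DEVS-Scheme code):

```python
class Generator:
    """Minimal DEVS atomic model: emits a job every `period` time units.
    DEVS defines an atomic model by delta_int, lambda, and ta (inputs and
    delta_ext are omitted here for brevity)."""
    def __init__(self, period):
        self.period = period
        self.count = 0            # the model's state

    def time_advance(self):       # ta(s): time until the next internal event
        return self.period

    def output(self):             # lambda(s): output emitted at that event
        return ("job", self.count)

    def internal(self):           # delta_int(s): state after the event
        self.count += 1

def simulate(model, until):
    """Abstract DEVS simulator loop for a single atomic model with no inputs:
    advance time by ta, collect the output, apply the internal transition."""
    t, outputs = 0.0, []
    while t + model.time_advance() <= until:
        t += model.time_advance()
        outputs.append((t, model.output()))
        model.internal()
    return outputs
```

Coupled DEVS models compose such atomic models through coordinators, which is the hierarchical structure the dissertation's entity structure base organizes.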
Srini Ramaswamy; K. P. Valavanis
Hierarchical Time-Extended Petri Nets (H-EPNs) are proposed as a modeling, analysis, and simulation tool to study and coordinate real-time system operations, including Failure Detection and Identification (FDI). The structure of any hierarchically decomposable system is considered to be a three-level interactive hierarchy of organization, coordination, and execution of tasks [7, 9]. The proposed hybrid model may be used at
Lee, Anthony; Yau, Christopher; Giles, Michael B; Doucet, Arnaud; Holmes, Christopher C
2010-12-01
We present a case-study on the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods. Graphics cards, containing multiple Graphics Processing Units (GPUs), are self-contained parallel computational devices that can be housed in conventional desktop and laptop computers and can be thought of as prototypes of the next generation of many-core processors. For certain classes of population-based Monte Carlo algorithms they offer massively parallel simulation, with the added advantage over conventional distributed multi-core processors that they are cheap, easily accessible, easy to maintain, easy to code, dedicated local devices with low power consumption. On a canonical set of stochastic simulation examples, including population-based Markov chain Monte Carlo methods and Sequential Monte Carlo methods, we find speedups from 35- to 500-fold over conventional single-threaded computer code. Our findings suggest that GPUs have the potential to facilitate the growth of statistical modelling into complex data-rich domains through the availability of cheap and accessible many-core computation. We believe the speedup we observe should motivate wider use of parallelizable simulation methods and greater methodological attention to their design. PMID:22003276
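The population-based methods described are "embarrassingly parallel" across chains: each chain owns its RNG state and shares nothing. A CPU-side sketch of that structure (a hit-or-miss toy estimator, not the paper's MCMC/SMC algorithms; on a GPU, `run_chain` would become one kernel thread per chain):

```python
import random

def run_chain(seed, n_samples):
    """One independent Monte Carlo chain: hit-or-miss estimate of pi.
    Each chain carries its own RNG, so chains never synchronize."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples)
               if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return 4.0 * hits / n_samples

def parallel_estimate(n_chains, n_samples):
    """Average over independently seeded chains. Replacing this loop with a
    GPU kernel launch is the essence of the 35- to 500-fold speedups."""
    return sum(run_chain(seed, n_samples) for seed in range(n_chains)) / n_chains
```

Per-chain seeding keeps the computation deterministic and reproducible, which matters when validating a GPU port against a single-threaded reference.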
Wakefield Simulation of CLIC PETS Structure Using Parallel 3D Finite Element Time-Domain Solver T3P
Candel, A.; Kabel, A.; Lee, L.; Li, Z.; Ng, C.; Schussman, G.; Ko, K.; /SLAC; Syratchev, I.; /CERN
2009-06-19
In recent years, SLAC's Advanced Computations Department (ACD) has developed the parallel 3D Finite Element electromagnetic time-domain code T3P. Higher-order Finite Element methods on conformal unstructured meshes and massively parallel processing allow unprecedented simulation accuracy for wakefield computations and simulations of transient effects in realistic accelerator structures. Applications include simulation of wakefield damping in the Compact Linear Collider (CLIC) power extraction and transfer structure (PETS).
Kumar, Ratnesh
Synthesis of Optimal Fault-Tolerant Supervisor for Discrete Event Systems. Q. Wen, Member IEEE, and R. Kumar, Iowa State University, Ames, Iowa 50011. Abstract: In an earlier work [1], [2], we introduced a framework for fault-tolerant supervisory control. Here we propose an approach to synthesize an optimal fault-tolerant supervisory controller. Given
Mahinthakumar, G. [Oak Ridge National Lab., TN (United States). Center for Computational Sciences; Saied, F.; Valocchi, A.J. [Univ. of Illinois, Urbana, IL (United States)
1997-03-01
Some popular iterative solvers for non-symmetric systems arising from the finite-element discretization of the three-dimensional groundwater contaminant transport problem are implemented and compared on distributed memory parallel platforms. This paper attempts to determine which solvers are most suitable for the contaminant transport problem under varied conditions for large-scale simulations on distributed parallel platforms. The original parallel implementation was targeted for the 1024-node Intel Paragon platform using explicit message passing with the NX library. This code was then ported to SGI Power Challenge Array, Convex Exemplar, and Origin 2000 machines using an MPI implementation. The performance of these solvers is studied for increasing problem size, roughness of the coefficients, and selected problem scenarios. These conditions affect the properties of the matrix and hence the difficulty level of the solution process. Performance is analyzed in terms of convergence behavior, overall time, parallel efficiency, and scalability. The solvers that are presented are BiCGSTAB, GMRES, ORTHOMIN, and CGS. A simple diagonal preconditioner is used in this parallel implementation for all the methods. The results indicate that all methods are comparable in performance, with BiCGSTAB slightly outperforming the other methods for most problems. The authors achieved very good scalability in all the methods up to 1024 processors of the Intel Paragon XPS/150. They demonstrate scalability by solving 100 time steps of a 40 million element problem in about 5 minutes using either BiCGSTAB or GMRES.
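For reference, the BiCGSTAB iteration with the simple diagonal (Jacobi) preconditioner used in the study can be sketched as follows (pure Python on a dense toy matrix; the paper's codes are sparse and message-passing based, so this is a didactic sketch, not their implementation):

```python
def bicgstab_jacobi(A, b, tol=1e-10, max_iter=200):
    """BiCGSTAB with a diagonal (Jacobi) preconditioner M = diag(A),
    for a dense matrix A given as a list of rows."""
    n = len(b)
    matvec = lambda v: [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    precond = lambda v: [v[i] / A[i][i] for i in range(n)]

    x = [0.0] * n
    r = [bi - axi for bi, axi in zip(b, matvec(x))]
    r_hat = r[:]                       # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    for _ in range(max_iter):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        rho = rho_new
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        y = precond(p)
        v = matvec(y)
        alpha = rho / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if dot(s, s) ** 0.5 < tol:     # converged at the half step
            x = [xi + alpha * yi for xi, yi in zip(x, y)]
            break
        z = precond(s)
        t = matvec(z)
        omega = dot(t, s) / dot(t, t)
        x = [xi + alpha * yi + omega * zi for xi, yi, zi in zip(x, y, z)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        if dot(r, r) ** 0.5 < tol:
            break
    return x
```

The diagonal preconditioner costs one division per unknown and needs no communication, which is why it scales so well in the distributed setting the paper benchmarks.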
Modeling of Weakly Collisional Parallel Electron Transport for Edge Plasma Simulations
NASA Astrophysics Data System (ADS)
Umansky, M. V.; Dimits, A. M.; Joseph, I.; Omotani, J. T.; Rognlien, T. D.
2014-10-01
The parallel electron heat transport in a weakly collisional regime can be represented in the framework of the Landau-fluid (LF) model. Practical implementation of LF-based transport models has become possible due to the recent invention of an efficient non-spectral method for the non-local closure operators. Here the implementation of an LF-based model for the parallel plasma transport is described, and the model is tested for different collisionality regimes against a Fokker-Planck code. The new method appears to represent weakly collisional parallel electron transport more accurately than the conventional flux-limiter based models; on the other hand, it is computationally efficient enough to be used in tokamak edge plasma simulations. Implementation of an LF-based model for the parallel plasma transport in the UEDGE code is described, and applications to realistic divertor simulations are discussed. Work performed for U.S. DoE by LLNL under Contract DE-AC52-07NA27344.
A Parallel, Finite-Volume Algorithm for Large-Eddy Simulation of Turbulent Flows
NASA Technical Reports Server (NTRS)
Bui, Trong T.
1999-01-01
A parallel, finite-volume algorithm has been developed for large-eddy simulation (LES) of compressible turbulent flows. This algorithm includes piecewise linear least-square reconstruction, trilinear finite-element interpolation, Roe flux-difference splitting, and second-order MacCormack time marching. Parallel implementation is done using the message-passing programming model. In this paper, the numerical algorithm is described. To validate the numerical method for turbulence simulation, LES of fully developed turbulent flow in a square duct is performed for a Reynolds number of 320 based on the average friction velocity and the hydraulic diameter of the duct. Direct numerical simulation (DNS) results are available for this test case, and the accuracy of this algorithm for turbulence simulations can be ascertained by comparing the LES solutions with the DNS results. The effects of grid resolution, upwind numerical dissipation, and subgrid-scale dissipation on the accuracy of the LES are examined. Comparison with DNS results shows that the standard Roe flux-difference splitting dissipation adversely affects the accuracy of the turbulence simulation. For accurate turbulence simulations, only 3-5 percent of the standard Roe flux-difference splitting dissipation is needed.
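The dissipation finding above is easy to illustrate on the simplest possible case. Below is a hedged sketch of a Roe-type flux for scalar linear advection with a tunable dissipation factor `eps`; the function and the scalar model problem are illustrative stand-ins, not the paper's 3D compressible finite-volume scheme.

```python
def roe_flux(uL, uR, a, eps=1.0):
    """Roe-type numerical flux for linear advection u_t + a u_x = 0.
    The first term is the central (average) flux; the second is the upwind
    dissipation. eps = 1 recovers the standard Roe flux, while eps = 0.03-0.05
    mimics the 3-5 percent dissipation the study found sufficient for LES."""
    return 0.5 * a * (uL + uR) - 0.5 * eps * abs(a) * (uR - uL)

print(roe_flux(2.0, 1.0, 3.0, eps=1.0))  # 6.0: pure upwind (= a * uL for a > 0)
print(roe_flux(2.0, 1.0, 3.0, eps=0.0))  # 4.5: pure central average
```

Scaling `eps` down moves the scheme toward the non-dissipative central flux, which is why full Roe dissipation can swamp the resolved turbulent fluctuations in an LES.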
Parallel Solutions for Voxel-Based Simulations of Reaction-Diffusion Systems
D'Agostino, Daniele; Pasquale, Giulia; Clematis, Andrea; Maj, Carlo; Mosca, Ettore; Milanesi, Luciano; Merelli, Ivan
2014-01-01
There is an increasing awareness of the pivotal role of noise in biochemical processes and of the effect of molecular crowding on the dynamics of biochemical systems. This has given rise to a strong need for suitable and sophisticated algorithms for the simulation of biological phenomena that take into account both spatial effects and noise. However, the high computational effort characterizing simulation approaches, coupled with the need to simulate models several times to obtain statistically relevant information on model behaviour, makes such algorithms very time-consuming for studying real systems. So far, different parallelization approaches have been deployed to reduce the computational time required to simulate the temporal dynamics of biochemical systems using stochastic algorithms. In this work we discuss these aspects for the spatial TAU-leaping in crowded compartments (STAUCC) simulator, a voxel-based method for the stochastic simulation of reaction-diffusion processes which relies on the Sτ-DPP algorithm. In particular we present how the characteristics of the algorithm can be exploited for an effective parallelization on present heterogeneous HPC architectures. PMID:25045716
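For readers unfamiliar with tau-leaping, the core step can be sketched for a well-mixed (non-spatial) toy model: fire a Poisson-distributed number of each reaction over a fixed leap interval. This is a plain tau-leaping sketch, not the voxel-based Sτ-DPP/STAUCC algorithm itself, and the birth-death model and rate constants are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def tau_leap_step(x, tau, stoich, propensities, rng):
    """One tau-leaping step: each reaction channel fires a Poisson number
    of times in [t, t + tau), with mean propensity * tau."""
    a = propensities(x)
    k = rng.poisson(a * tau)       # event counts per reaction channel
    return x + stoich.T @ k

# Toy birth-death model: 0 -> S (rate kb), S -> 0 (rate kd * x)
kb, kd = 10.0, 0.1
stoich = np.array([[1], [-1]])     # rows: reactions, cols: species
prop = lambda x: np.array([kb, kd * x[0]])

x = np.array([0])
for _ in range(2000):
    # clamp negatives -- a standard crude safeguard against overshooting
    x = np.maximum(tau_leap_step(x, 0.05, stoich, prop, rng), 0)
print(x)  # fluctuates around the steady state kb/kd = 100
```

The spatial variant adds diffusion events between voxels as extra channels, which is where the per-voxel parallelism discussed in the abstract comes from.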
Schuchardt, Karen L.; Agarwal, Khushbu; Chase, Jared M.; Rockhold, Mark L.; Freedman, Vicky L.; Elsethagen, Todd O.; Scheibe, Timothy D.; Chin, George; Sivaramakrishnan, Chandrika
2010-07-15
The Support Architecture for Large-Scale Subsurface Analysis (SALSSA) provides an extensible framework, sophisticated graphical user interface, and underlying data management system that simplifies the process of running subsurface models, tracking provenance information, and analyzing the model results. Initially, SALSSA supported two styles of job control: user directed execution and monitoring of individual jobs, and load balancing of jobs across multiple machines taking advantage of many available workstations. Recent efforts in subsurface modelling have been directed at advancing simulators to take advantage of leadership class supercomputers. We describe two approaches, current progress, and plans toward enabling efficient application of the subsurface simulator codes via the SALSSA framework: automating sensitivity analysis problems through task parallelism, and task parallel parameter estimation using the PEST framework.
Multi-objective optimization of high-speed milling with parallel genetic simulated annealing
Z. G. Wang; Y. S. Wong; M. Rahman; J. Sun
2006-01-01
In this paper, the optimization of multi-pass milling has been investigated in terms of two objectives: machining time and production cost. An advanced search algorithm—parallel genetic simulated annealing (PGSA)—was used to obtain the optimal cutting parameters. In the implementation of PGSA, the fitness assignment is based on the concept of a non-dominated sorting genetic algorithm (NSGA). An application example is
Construction of a parallel processor for simulating manipulators and other mechanical systems
NASA Technical Reports Server (NTRS)
Hannauer, George
1991-01-01
This report summarizes the results of NASA Contract NAS5-30905, awarded under phase 2 of the SBIR Program, for a demonstration of the feasibility of a new high-speed parallel simulation processor, called the Real-Time Accelerator (RTA). The principal goals were met, and EAI is now proceeding with phase 3: development of a commercial product. This product is scheduled for commercial introduction in the second quarter of 1992.
NASA Technical Reports Server (NTRS)
Morgan, Philip E.
2004-01-01
This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers" and "Develop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications." The discussion of Scalable High Performance Computing reports on three objectives: validate, assess the scalability of, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulation (DNS) and Large Eddy Simulation (LES) problems; and investigate and develop a high-order Reynolds-averaged Navier-Stokes turbulence model. The discussion of High-Performance Time-Domain Computational Electromagnetics reports on five objectives: enhancement of an electromagnetics code (CHARGE) to effectively model antenna problems; application of lessons learned from the high-order/spectral solution of swirling 3D jets to the electromagnetics project; transition of a high-order fluids code, FDL3DI, to solve Maxwell's equations using compact differencing; development and demonstration of improved radiation-absorbing boundary conditions for high-order CEM; and extension of the high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.
Spontaneous Hot Flow Anomalies at Quasi-Parallel Shocks: 2. Hybrid Simulations
NASA Technical Reports Server (NTRS)
Omidi, N.; Zhang, H.; Sibeck, D.; Turner, D.
2013-01-01
Motivated by recent THEMIS observations, this paper uses 2.5-D electromagnetic hybrid simulations to investigate the formation of Spontaneous Hot Flow Anomalies (SHFA) upstream of quasi-parallel bow shocks during steady solar wind conditions and in the absence of discontinuities. The results show the formation of a large number of structures along and upstream of the quasi-parallel bow shock. Their outer edges exhibit density and magnetic field enhancements, while their cores exhibit drops in density, magnetic field, solar wind velocity and enhancements in ion temperature. Using virtual spacecraft in the simulation, we show that the signatures of these structures in the time series data are very similar to those of SHFAs seen in THEMIS data and conclude that they correspond to SHFAs. Examination of the simulation data shows that SHFAs form as the result of foreshock cavitons interacting with the bow shock. Foreshock cavitons in turn form due to the nonlinear evolution of ULF waves generated by the interaction of the solar wind with the backstreaming ions. Because foreshock cavitons are an inherent part of the shock dissipation process, the formation of SHFAs is also an inherent part of the dissipation process leading to a highly non-uniform plasma in the quasi-parallel magnetosheath including large scale density and magnetic field cavities.
NASA Astrophysics Data System (ADS)
Honkonen, I.
2015-03-01
I present a method for developing extensible and modular computational models without sacrificing serial or parallel performance or source code readability. By using a generic simulation cell method I show that it is possible to combine several distinct computational models to run in the same computational grid without requiring modification of existing code. This is an advantage for the development and testing of, e.g., geoscientific software as each submodel can be developed and tested independently and subsequently used without modification in a more complex coupled program. An implementation of the generic simulation cell method presented here, generic simulation cell class (gensimcell), also includes support for parallel programming by allowing model developers to select which simulation variables of, e.g., a domain-decomposed model to transfer between processes via a Message Passing Interface (MPI) library. This allows the communication strategy of a program to be formalized by explicitly stating which variables must be transferred between processes for the correct functionality of each submodel and the entire program. The generic simulation cell class requires a C++ compiler that supports a version of the language standardized in 2011 (C++11). The code is available at https://github.com/nasailja/gensimcell for everyone to use, study, modify and redistribute; those who do are kindly requested to acknowledge and cite this work.
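The generic-cell idea above, each submodel owning its own variables and explicitly declaring which of them must be exchanged between processes, can be mimicked in Python. This is a hypothetical analogy for illustration only, not the C++ gensimcell API; all names below are invented.

```python
from dataclasses import dataclass, field

@dataclass
class Cell:
    """Toy analogue of a generic simulation cell: variables live in `data`,
    and `transfer` records which of them a domain-decomposed run would need
    to communicate to neighboring processes (e.g. via MPI)."""
    data: dict = field(default_factory=dict)
    transfer: dict = field(default_factory=dict)

    def set_var(self, name, value, transfer=False):
        self.data[name] = value
        self.transfer[name] = transfer

    def mpi_payload(self):
        """Only the flagged variables would be serialized and sent."""
        return {k: v for k, v in self.data.items() if self.transfer[k]}

c = Cell()
c.set_var("mass_density", 1.2, transfer=True)   # neighbors of the fluid submodel need this
c.set_var("temp_scratch", 0.0, transfer=False)  # purely local work variable
print(c.mpi_payload())  # {'mass_density': 1.2}
```

The point of the pattern is exactly what the abstract states: each submodel's communication requirements are declared next to its variables, so coupling submodels does not require touching existing code.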
Scalable parallel Monte Carlo algorithm for atomistic simulations of precipitation in alloys
NASA Astrophysics Data System (ADS)
Sadigh, Babak; Erhart, Paul; Stukowski, Alexander; Caro, Alfredo; Martinez, Enrique; Zepeda-Ruiz, Luis
2012-05-01
We present an extension of the semi-grand-canonical (SGC) ensemble that we refer to as the variance-constrained semi-grand-canonical (VC-SGC) ensemble. It allows for transmutation Monte Carlo simulations of multicomponent systems in multiphase regions of the phase diagram and lends itself to scalable simulations on massively parallel platforms. By combining transmutation moves with molecular dynamics steps, structural relaxations and thermal vibrations in realistic alloys can be taken into account. In this way, we construct a robust and efficient simulation technique that is ideally suited for large-scale simulations of precipitation in multicomponent systems in the presence of structural disorder. To illustrate the algorithm introduced in this work, we study the precipitation of Cu in nanocrystalline Fe.
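The transmutation Monte Carlo move underlying the SGC ensemble (which the VC-SGC ensemble above extends with a variance constraint) can be sketched on a non-interacting binary lattice. The model, field strength, and chemical-potential difference below are invented for illustration; this is not the paper's algorithm or code.

```python
import math, random

random.seed(1)

def sgc_accept(dE, dmu, dN, kT):
    """Metropolis acceptance for a transmutation move in the SGC ensemble:
    flip one atom's species; dmu couples to the change in composition dN."""
    return random.random() < math.exp(min(0.0, -(dE - dmu * dN) / kT))

# Toy system: non-interacting binary lattice with energy penalty h per B atom
h, dmu, kT = 0.3, 0.1, 1.0
spins = [0] * 1000                      # 0 = species A, 1 = species B
for _ in range(20000):
    i = random.randrange(len(spins))
    dN = 1 - 2 * spins[i]               # +1 for A -> B, -1 for B -> A
    if sgc_accept(h * dN, dmu, dN, kT):
        spins[i] ^= 1
c = sum(spins) / len(spins)
print(c)  # approaches 1 / (1 + exp((h - dmu)/kT)) ≈ 0.45
```

In the paper's hybrid scheme such transmutation moves are interleaved with molecular dynamics steps so that structural relaxation and thermal vibrations are sampled as well.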
Supporting the Development of Resilient Message Passing Applications using Simulation
Naughton, Thomas J., III [ORNL]; Engelmann, Christian [ORNL]; Vallee, Geoffroy R. [ORNL]; Boehm, Swen [ORNL]
2014-01-01
An emerging aspect of high-performance computing (HPC) hardware/software co-design is investigating performance under failure. The work in this paper extends the Extreme-scale Simulator (xSim), which was designed for evaluating the performance of message passing interface (MPI) applications on future HPC architectures, with fault-tolerant MPI extensions proposed by the MPI Fault Tolerance Working Group. xSim permits running MPI applications with millions of concurrent MPI ranks, while observing application performance in a simulated extreme-scale system using a lightweight parallel discrete event simulation. The newly added features offer user-level failure mitigation (ULFM) extensions at the simulated MPI layer to support algorithm-based fault tolerance (ABFT). The presented solution permits investigating performance under failure and failure handling of ABFT solutions. The newly enhanced xSim is the very first performance tool that supports ULFM and ABFT.
Gobbert, Matthias K.
a Parallelized Genetic Algorithm
Joseph Cornish, Robert Forder, Ivan Erill, Matthias K. Gobbert
... a genetic algorithm in parallel using a server-client organization to simulate the evolution ... are not able to recognize correlation information in binding sites. We implement a genetic algorithm
H. Uehara; T. Fujimoto; N. Nanba; K. Kudo
2009-01-01
We carried out a three-dimensional tree simulation of a composite system with an interface perpendicular or parallel to the electric force line based on a dielectric breakdown model (DBM) considering the growth probability. It was found that an interface perpendicular to the electric force line exhibits a positive barrier effect and that an interface parallel to the electric force line
Simões, Marcelo Godoy
Simulation and Analysis of Parallel Self-Excited Induction Generators for Islanded Wind Farm Systems
Index Terms--Induction generators, parallel machines, state-space methods, transient analysis.
Bhaskara ... mathematical model to describe the transient behavior of a system of self-excited induction generators (SEIGs
T. Hoshino; T. Shirakawa
1982-01-01
The three-dimensional boiling water reactor (BWR) core following the daily load was simulated using the processor array for continuum simulation (PACS-32), a newly developed parallel microprocessor system. The PACS system consists of 32 processing units (PUs) and has a multi-instruction, multi-data architecture, well suited to the numerical simulation of partial differential equations. The BWR
Massively parallel Monte Carlo for many-particle simulations on GPUs
Anderson, Joshua A.; Jankowski, Eric [Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109 (United States)]; Grubb, Thomas L. [Department of Materials Science and Engineering, University of Michigan, Ann Arbor, MI 48109 (United States)]; Engel, Michael [Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109 (United States)]; Glotzer, Sharon C., E-mail: sglotzer@umich.edu [Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109 (United States); Department of Materials Science and Engineering, University of Michigan, Ann Arbor, MI 48109 (United States)]
2013-12-01
Current trends in parallel processors call for the design of efficient massively parallel algorithms for scientific computing. Parallel algorithms for Monte Carlo simulations of thermodynamic ensembles of particles have received little attention because of the inherent serial nature of the statistical sampling. In this paper, we present a massively parallel method that obeys detailed balance and implement it for a system of hard disks on the GPU. We reproduce results of serial high-precision Monte Carlo runs to verify the method. This is a good test case because the hard disk equation of state over the range where the liquid transforms into the solid is particularly sensitive to small deviations away from the balance conditions. On a Tesla K20, our GPU implementation executes over one billion trial moves per second, which is 148 times faster than on a single Intel Xeon E5540 CPU core, enables 27 times better performance per dollar, and cuts energy usage by a factor of 13. With this improved performance we are able to calculate the equation of state for systems of up to one million hard disks. These large system sizes are required in order to probe the nature of the melting transition, which has been debated for the last forty years. In this paper we present the details of our computational method, and discuss the thermodynamics of hard disks separately in a companion paper.
M.DynaMix - a scalable portable parallel MD simulation package for arbitrary molecular mixtures
NASA Astrophysics Data System (ADS)
Lyubartsev, Alexander P.; Laaksonen, Aatto
2000-06-01
A general-purpose, scalable parallel molecular dynamics package for simulations of arbitrary mixtures of flexible or rigid molecules is presented. It allows use of most types of conventional molecular-mechanical force fields and contains a variety of auxiliary terms for inter- and intramolecular interactions, including anharmonic bond stretching. It can handle both isotropic and ordered systems. Besides the NVE ensemble, simulations can also be carried out in either NVT or NPT ensembles, by employing the Nosé-Hoover thermostats and barostats, respectively. If required, the NPT ensemble can be generated by maintaining anisotropic pressures. The simulation cell can be cubic, rectangular, hexagonal, or a truncated octahedron, with corresponding periodic boundary conditions and minimum images. In all cases, the optimized Ewald method can be used to treat the Coulombic interactions. Double time-step and constrained dynamics schemes are included. An external electric field can be applied across the simulation cell. The whole program is highly modular and is written in standard Fortran 77. It can be compiled to run efficiently on both parallel and sequential computers. The inherent complexity of the studied system does not affect the scalability of the program. The scaling is good with the size of the system and with the number of processors. The portability of the program is good: it runs regularly on several common single- and multiprocessor platforms, both scalar and vector architectures included.
Switching to High Gear: Opportunities for Grand-scale Real-time Parallel Simulations
Perumalla, Kalyan S [ORNL]
2009-01-01
The recent emergence of dramatically large computational power, spanning desktops with multi-core processors and multiple graphics cards to supercomputers with 10^5 processor cores, has suddenly resulted in simulation-based solutions trailing behind in the ability to fully tap the new computational capacity. Here, we motivate the need for switching the parallel simulation research to a higher gear to exploit the new, immense levels of computational power. The potential for grand-scale real-time solutions is illustrated using preliminary results from prototypes in four example application areas: (a) state- or regional-scale vehicular mobility modeling, (b) very large-scale epidemic modeling, (c) modeling the propagation of wireless network signals in very large, cluttered terrains, and, (d) country- or world-scale social behavioral modeling. We believe the stage is perfectly poised for the parallel/distributed simulation community to envision and formulate similar grand-scale, real-time simulation-based solutions in many application areas.
Use of Parallel Micro-Platform for the Simulation the Space Exploration
NASA Astrophysics Data System (ADS)
Velasco Herrera, Victor Manuel; Velasco Herrera, Graciela; Rosano, Felipe Lara; Rodriguez Lozano, Salvador; Lucero Roldan Serrato, Karen
The purpose of this work is to create a parallel micro-platform that simulates the virtual movements of space exploration in 3D. One of the innovations presented in this design is the application of a lever mechanism for the transmission of movement. The development of such a robot is a challenging task, very different from industrial manipulators due to a totally different set of target requirements. This work presents the computer-aided study and simulation of the movement of this parallel manipulator. The model was developed using the Unigraphics computer-aided design platform, on which the geometric modeling of each component and the final assembly (CAD), the generation of files for the computer-aided manufacture (CAM) of each piece, and the kinematic simulation of the system under different driving schemes were carried out. We used the MATLAB Aerospace Toolbox and created an adaptive control module to simulate the system.
Parallel, adaptive, multi-object trajectory integrator for space simulation applications
NASA Astrophysics Data System (ADS)
Atanassov, Atanas Marinov
2014-10-01
Computer simulation is a very helpful approach for improving results from space-borne experiments. Initial-value problems (IVPs) can be used to model the dynamics of different objects: artificial Earth satellites, charged particles in magnetic and electric fields, charged or non-charged dust particles, and space debris. An integrator for systems of ordinary differential equations (ODEs), based on embedded Runge-Kutta-Fehlberg methods of different orders, is developed. These methods enable evaluation of the local error. Instead of step-size control based on the local-error evaluation, an optimal integration method is selected; integration then proceeds with constant-sized steps while meeting the required local error. This optimal scheme selection reduces the amount of computation needed to solve the IVPs. In addition, for implementation on a multi-core processor with thread-based parallelization, we describe how to solve multiple systems of IVPs efficiently in parallel. The proposed integrator allows a different force model to be applied to every object in multi-satellite simulation models. Simultaneous application of the integrator to different kinds of problems within one combined simulation model is also possible. The basic application of the integrator is solving mechanical IVPs in the context of simulation models, with applications to complex multi-satellite space missions and as a design tool for experiments.
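The embedded-pair mechanism described above, two solutions of different order sharing the same stage evaluations so that their difference estimates the local error, can be sketched with the lowest-order pair. The Euler/Heun pair below is an illustrative stand-in for the higher-order Runge-Kutta-Fehlberg pairs the integrator actually uses.

```python
import numpy as np

def embedded_step(f, t, y, h):
    """One step of an embedded Euler/Heun pair (orders 1 and 2): the
    higher-order solution advances the state, and the difference between
    the two solutions estimates the local error."""
    k1 = f(t, y)
    k2 = f(t + h, y + h * k1)
    y_low = y + h * k1                  # 1st-order (Euler) solution
    y_high = y + 0.5 * h * (k1 + k2)    # 2nd-order (Heun) solution
    return y_high, float(np.max(np.abs(y_high - y_low)))

# Fixed-step integration of y' = -y from y(0) = 1 to t = 1
f = lambda t, y: -y
y, h = np.array([1.0]), 0.01
for i in range(100):
    y, err = embedded_step(f, i * h, y, h)
print(y[0])  # close to exp(-1) ≈ 0.3679
```

In the scheme described above the error estimate is used once, to pick the cheapest embedded method that meets the tolerance, after which the step size stays constant, rather than adapting the step on every iteration.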
Parallelization of Particle-Particle, Particle-Mesh Method within N-Body Simulation
NSDL National Science Digital Library
Nicholas Nocito
The N-Body problem has become an integral part of the computational sciences, and many methods have arisen to solve and approximate it. A direct solution potentially requires on the order of N^2 calculations each time step, so efficient performance of these N-Body algorithms is very significant [5]. This work describes the parallelization and optimization of the Particle-Particle, Particle-Mesh (P3M) algorithm within GalaxSeeHPC, an open-source N-Body simulation code. After profiling, MPI (Message Passing Interface) routines were implemented in the population of the density grid in the P3M method in GalaxSeeHPC. Each problem size recorded different results, and for a problem set dealing with 10,000 celestial bodies, speedups of up to 10x were achieved. However, according to Amdahl's Law, the maximum speedup for the code should have been closer to 16x. In order to achieve maximum optimization, additional research is needed, and parallelization of the Fourier transform routines could prove rewarding. In conclusion, the GalaxSeeHPC simulation was successfully parallelized and obtained very respectable results, while further optimization remains possible.
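The density-grid population step that was parallelized above can be sketched serially in 1D using cloud-in-cell (CIC) assignment, where each particle's mass is shared between its two nearest cells. This is an illustrative sketch with invented names, not GalaxSeeHPC's actual 3D/MPI code.

```python
import numpy as np

def cic_deposit(positions, masses, n_cells, box):
    """Cloud-in-cell mass assignment onto a 1D periodic density grid.
    Each particle deposits mass into the two cells whose centers bracket it,
    weighted linearly by distance (the grid-population step of P3M)."""
    rho = np.zeros(n_cells)
    dx = box / n_cells
    for x, m in zip(positions, masses):
        s = x / dx - 0.5                 # position in cell-centered coordinates
        i = int(np.floor(s)) % n_cells
        frac = s - np.floor(s)           # fraction assigned to the right neighbor
        rho[i] += m * (1 - frac) / dx
        rho[(i + 1) % n_cells] += m * frac / dx
    return rho

rho = cic_deposit([2.5], [1.0], 10, 10.0)
print(rho)  # all mass lands in cell 2 (the particle sits at that cell's center)
```

In the MPI version each rank deposits its own particles into a local grid and the partial grids are summed (e.g. with a reduction) before the Fourier-space force solve.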
NASA Astrophysics Data System (ADS)
Mosaddeghi, Hamid; Alavi, Saman; Kowsari, M. H.; Najafi, Bijan
2012-11-01
We use molecular dynamics simulations to study the structure, dynamics, and transport properties of nano-confined water between parallel graphite plates with separation distances (H) from 7 to 20 Å at different water densities, with an emphasis on anisotropies generated by confinement. The behavior of the confined water phase is compared to non-confined bulk water under similar pressure and temperature conditions. Our simulations show anisotropic structure and dynamics of the confined water phase in directions parallel and perpendicular to the graphite plate. The magnitude of these anisotropies depends on the slit width H. Confined water shows "solid-like" structure and slow dynamics for the water layers near the plates. The mean square displacements (MSDs) and velocity autocorrelation functions (VACFs) for directions parallel and perpendicular to the graphite plates are calculated. By increasing the confinement distance from H = 7 Å to H = 20 Å, the MSD increases and the behavior of the VACF indicates that the confined water changes from solid-like to liquid-like dynamics. If the initial density of the water phase is set up using geometric criteria (i.e., distance between the graphite plates), large pressures (on the order of ~10 katm) and large pressure anisotropies are established within the water. By decreasing the density of the water between the confined plates to about 0.9 g cm-3, bubble formation and restructuring of the water layers are observed.
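The MSD diagnostic used above is straightforward to compute from a stored trajectory. The sketch below averages over particles and time origins, with a ballistic toy trajectory as a sanity check; it is illustrative code, not the paper's analysis scripts. Restricting the sum over dimensions to the in-plane or out-of-plane axes gives the parallel/perpendicular anisotropy discussed in the abstract.

```python
import numpy as np

def msd(traj):
    """Mean square displacement vs. lag time from a trajectory of shape
    (n_frames, n_particles, n_dims), averaged over particles and all
    available time origins."""
    n = len(traj)
    out = np.zeros(n)
    for lag in range(1, n):
        d = traj[lag:] - traj[:-lag]        # displacements at this lag
        out[lag] = (d ** 2).sum(axis=-1).mean()
    return out

# Sanity check: ballistic motion x = v t in each dimension gives MSD = 3 (v lag)^2
t = np.arange(50, dtype=float)
traj = (0.1 * t)[:, None, None] * np.ones((1, 3))   # one particle, 3D
m = msd(traj)
print(m[10])  # 3 * (0.1 * 10)^2 = 3.0
```

For diffusive dynamics the long-lag MSD instead grows linearly, with slope 2 d D in d dimensions, which is how a diffusion coefficient would be extracted from such data.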
Parallel Agent-Based Simulations on Clusters of GPUs and Multi-Core Processors
Aaby, Brandon G [ORNL]; Perumalla, Kalyan S [ORNL]; Seal, Sudip K [ORNL]
2010-01-01
An effective latency-hiding mechanism is presented in the parallelization of agent-based model simulations (ABMS) with millions of agents. The mechanism is designed to accommodate the hierarchical organization as well as the heterogeneity of current state-of-the-art parallel computing platforms. We use it to explore the computation vs. communication trade-off continuum available with the deep computational and memory hierarchies of extant platforms and present a novel analytical model of the trade-off. We describe our implementation and report preliminary performance results on two distinct parallel platforms suitable for ABMS: CUDA threads on multiple, networked graphical processing units (GPUs), and pthreads on multi-core processors. Message Passing Interface (MPI) is used for inter-GPU as well as inter-socket communication on a cluster of multiple GPUs and multi-core processors. Results indicate the benefits of our latency-hiding scheme, delivering as much as a 100-fold improvement in runtime for certain benchmark ABMS application scenarios with several million agents. This speed improvement is obtained on a system that is already two to three orders of magnitude faster on one GPU than an equivalent CPU-based execution in a popular Java simulator. Thus, the overall execution of our current work is over four orders of magnitude faster when executed on multiple GPUs.
Jeong Hoon Kim; Chang Beom Choi; Tag Gon Kim
2011-01-01
The modern naval air defense of a fleet is a critical task dictating the equipment, the operation, and the management of the fleet. Military modelers consider that an improved weapon system in naval air defense (i.e. the AEGIS system) is the most critical enabler of defense at the engagement level. However, at the mission execution level, naval air defense is
JULIE EATOCK; RAY J. PAUL; ALAN SERRANO
Advocates of Business Process (BP) approaches argue that the real value of IT is that it provokes innovative changes in business processes. Despite the fact that many BP and IT academics and practitioners agree on this idea, BP and IT design are still performed separately. Moreover, there is very little research that is concerned with studying the ways in which
Fancher, Robert H.
1997-01-01
Finally, if the potential recruit agrees to accept a specific job, he or she is enrolled in the Delayed Entry Program (DEP) and awaits final admittance into the Army. Data on contacts made while soliciting recruits and appointments scheduled..., but once shipped, they are no longer considered a part of the recruiting system.
[Figure 1: Army recruiting process flow - Solicit Recruits, Appointments Scheduled, Appointments Made, ASVAB Test, Physical Examination, Negotiate Contracts, DEP, Army; with a "Waiver Required" branch]
Bradley, Randolph L. (Randolph Lewis)
2012-01-01
Heavy industries operate equipment having a long life to generate revenue or perform a mission. These industries must invest in the specialized service parts needed to maintain their equipment, because unlike in other ...
A New Fighter Simulator Based on a Full Spinning Six Degrees-of-freedom Parallel Mechanism Platform
Jongwon Kim; Sun Ho Kim
2005-01-01
This paper presents an innovative motion base for flight simulators, based on the new six degrees-of-freedom parallel mechanism called 'Eclipse-II'. Most conventional simulators adopt the Stewart platform as their motion base. The Stewart platform is a six degrees-of-freedom parallel mechanism that enables both translational and rotational motions. However, motions such as a continuous 360-degree overturn of the
De Novo Ultrascale Atomistic Simulations On High-End Parallel Supercomputers
Nakano, A; Kalia, R K; Nomura, K; Sharma, A; Vashishta, P; Shimojo, F; van Duin, A; Goddard, III, W A; Biswas, R; Srivastava, D; Yang, L H
2006-09-04
We present a de novo hierarchical simulation framework for first-principles based predictive simulations of materials and their validation on high-end parallel supercomputers and geographically distributed clusters. In this framework, high-end chemically reactive and non-reactive molecular dynamics (MD) simulations explore a wide solution space to discover microscopic mechanisms that govern macroscopic material properties, into which highly accurate quantum mechanical (QM) simulations are embedded to validate the discovered mechanisms and quantify the uncertainty of the solution. The framework includes an embedded divide-and-conquer (EDC) algorithmic framework for the design of linear-scaling simulation algorithms with minimal bandwidth complexity and tight error control. The EDC framework also enables adaptive hierarchical simulation with automated model transitioning assisted by graph-based event tracking. A tunable hierarchical cellular decomposition parallelization framework then maps the O(N) EDC algorithms onto Petaflops computers, while achieving performance tunability through a hierarchy of parameterized cell data/computation structures, as well as its implementation using hybrid Grid remote procedure call + message passing + threads programming. High-end computing platforms such as IBM BlueGene/L, SGI Altix 3000 and the NSF TeraGrid provide an excellent test ground for the framework. On these platforms, we have achieved unprecedented scales of quantum-mechanically accurate and well validated, chemically reactive atomistic simulations--1.06 billion-atom fast reactive force-field MD and 11.8 million-atom (1.04 trillion grid points) quantum-mechanical MD in the framework of the EDC density functional theory on adaptive multigrids--in addition to 134 billion-atom non-reactive space-time multiresolution MD, with the parallel efficiency as high as 0.998 on 65,536 dual-processor BlueGene/L nodes.
We have also achieved an automated execution of hierarchical QM/MD simulation on a Grid consisting of 6 supercomputer centers in the US and Japan (a total of 150 thousand processor-hours), in which the number of processors changes dynamically on demand and resources are allocated and migrated dynamically in response to faults. Furthermore, performance portability has been demonstrated on a wide range of platforms such as BlueGene/L, Altix 3000, and AMD Opteron-based Linux clusters.
NASA Astrophysics Data System (ADS)
Laundy, D.; Sutter, J. P.; Wagner, U. H.; Rau, C.; Thomas, C. A.; Sawhney, K. J. S.; Chubar, O.
2013-03-01
Hard X-ray undulator radiation at 3rd-generation storage rings falls between the geometrical and the fully coherent limit. This is a result of the small but finite emittance of the electron beam source, and means that the radiation cannot be completely modelled either by incoherent ray tracing or by fully coherent wave propagation. Using the wavefront propagation code Synchrotron Radiation Workshop (SRW) running in a Python environment, we have developed a parallel computer program that uses the Monte Carlo method to model the partially coherent emission from electron beam sources, taking into account the finite emittance of the source. Using a parallel computing cluster with in excess of 500 cores, each core calculating the wavefront from in excess of 1000 electrons, a source containing millions of electrons could be simulated. We have applied this method to the Diamond X-ray Imaging and Coherence beamline (I13).
Superposition-Enhanced Estimation of Optimal Temperature Spacings for Parallel Tempering Simulations
2014-01-01
Effective parallel tempering simulations rely crucially on a properly chosen sequence of temperatures. While it is desirable to achieve a uniform exchange acceptance rate across neighboring replicas, finding a set of temperatures that achieves this end is often a difficult task, in particular for systems undergoing phase transitions. Here we present a method for determination of optimal replica spacings, which is based upon knowledge of local minima in the potential energy landscape. Working within the harmonic superposition approximation, we derive an analytic expression for the parallel tempering acceptance rate as a function of the replica temperatures. For a particular system and a given database of minima, we show how this expression can be used to determine optimal temperatures that achieve a desired uniform acceptance rate. We test our strategy for two atomic clusters that exhibit broken ergodicity, demonstrating that our method achieves uniform acceptance as well as significant efficiency gains. PMID:25512744
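The Metropolis criterion underlying replica exchange makes the role of temperature spacing concrete. A minimal sketch follows, using a generic geometric ladder and the standard swap rule — not the authors' superposition-based optimization:

```python
import math
import random

def geometric_ladder(t_min, t_max, n):
    """Geometric temperature spacing: a common first guess before
    any optimization of the exchange acceptance rates."""
    ratio = (t_max / t_min) ** (1.0 / (n - 1))
    return [t_min * ratio**i for i in range(n)]

def swap_acceptance(e_i, e_j, t_i, t_j):
    """Metropolis probability of exchanging replicas at temperatures
    t_i < t_j holding configurations with energies e_i and e_j."""
    delta = (1.0 / t_i - 1.0 / t_j) * (e_i - e_j)
    return min(1.0, math.exp(delta))
```

A uniform-acceptance ladder replaces the fixed geometric ratio with spacings tuned until `swap_acceptance`, averaged over sampled energies, is constant across neighboring pairs.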
Univ. of California, San Diego; Li, Xiaoye Sherry; Cicotti, Pietro; Li, Xiaoye Sherry; Baden, Scott B.
2008-04-15
Sparse parallel factorization is among the most complicated and irregular algorithms to analyze and optimize. Performance depends both on system characteristics such as the floating point rate, the memory hierarchy, and the interconnect performance, as well as on input matrix characteristics such as the number and location of nonzeros. We present LUsim, a simulation framework for modeling the performance of sparse LU factorization. Our framework uses micro-benchmarks to calibrate the parameters of machine characteristics and additional tools to facilitate real-time performance modeling. We are using LUsim to analyze an existing parallel sparse LU factorization code, and to explore a latency-tolerant variant. We developed and validated a model of the factorization in SuperLU_DIST, then modeled and implemented a new variant of slud, replacing a blocking collective communication phase with a non-blocking asynchronous point-to-point one. Our strategy realized a mean improvement of 11% over a suite of test matrices.
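The kind of micro-benchmark-calibrated analytic model such a framework embodies can be conveyed by a toy cost function; the parameter names and default values below are illustrative assumptions, not LUsim's API:

```python
def step_time(flops, msgs, bytes_moved,
              flop_rate=1e9, latency=1e-6, bandwidth=1e9):
    """Toy performance model: compute time plus per-message latency
    plus transfer time. The three machine parameters would be
    calibrated by micro-benchmarks on the target system."""
    return flops / flop_rate + msgs * latency + bytes_moved / bandwidth
```

Summing `step_time` over the panels of a factorization schedule yields a predicted runtime that can be compared against measurements.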
Massively Parallel Phase-Field Simulations for Ternary Eutectic Directional Solidification
Bauer, Martin; Steinmetz, Philipp; Jainta, Marcus; Berghoff, Marco; Schornbaum, Florian; Godenschwager, Christian; Köstler, Harald; Nestler, Britta; Rüde, Ulrich
2015-01-01
Microstructures forming during ternary eutectic directional solidification processes have a significant influence on the macroscopic mechanical properties of metal alloys. For a realistic simulation, we use the well-established thermodynamically consistent phase-field method and improve it with a new grand potential formulation to couple the concentration evolution. This extension is very compute-intensive due to a temperature-dependent diffusive concentration. We significantly extend previous simulations that used simpler phase-field models or were performed on smaller domain sizes. The new method has been implemented within the massively parallel HPC framework waLBerla, which is designed to exploit current supercomputers efficiently. We apply various optimization techniques, including buffering techniques, explicit SIMD kernel vectorization, and communication hiding. Simulations utilizing up to 262,144 cores have been run on three different supercomputing architectures, and weak scalability results are shown.
CLUSTEREASY: A Program for Simulating Scalar Field Evolution on Parallel Computers
Gary N Felder
2007-12-05
We describe a new, parallel programming version of the scalar field simulation program LATTICEEASY. The new C++ program, CLUSTEREASY, can simulate arbitrary scalar field models on distributed-memory clusters. The speed and memory requirements scale well with the number of processors. As with the serial version of LATTICEEASY, CLUSTEREASY can run simulations in one, two, or three dimensions, with or without expansion of the universe, with customizable parameters and output. The program and its full documentation are available on the LATTICEEASY website at http://www.science.smith.edu/departments/Physics/fstaff/gfelder/latticeeasy/. In this paper we provide a brief overview of what CLUSTEREASY does and the ways in which it does and doesn't differ from the serial version of LATTICEEASY.
NASA Astrophysics Data System (ADS)
Reynolds-Barredo, J. M.; Newman, D. E.; Sanchez, R.
2013-12-01
Parareal is a recent time parallelization algorithm based on a predictor-corrector mechanism. Recently, it has been applied for the first time to a fully-developed plasma turbulent simulation, and a qualitative understanding of how parareal converges exists for this case. In this paper, we construct an analytical framework of the process of convergence that should be applicable to parareal simulations of general turbulent systems. This framework allows one to gain a quantitative understanding of the dependence of the convergence on the physics of the problem and the choices that must be made to implement parareal. The analytical knowledge provided by this new framework can be used to optimize the implementation of parareal. We illustrate the inner workings of the framework and demonstrate its predictive capabilities by applying it to the modeling of the parareal convergence of drift-wave plasma turbulent simulations.
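The predictor-corrector mechanism parareal applies can be sketched on a scalar ODE; `f_coarse` and `f_fine` are generic propagators supplied by the caller, standing in for the turbulence solvers used in the paper:

```python
def parareal(f_coarse, f_fine, y0, t0, t1, n_slices, n_iter):
    """Minimal parareal iteration on one scalar state. Each propagator
    maps a state across one time slice; in a real code the fine solves
    of each iteration run in parallel across the slices."""
    dt = (t1 - t0) / n_slices
    ts = [t0 + i * dt for i in range(n_slices + 1)]
    # predictor: initial guess from the coarse propagator alone
    y = [y0]
    for i in range(n_slices):
        y.append(f_coarse(y[i], ts[i], ts[i + 1]))
    for _ in range(n_iter):
        # fine and coarse sweeps over the previous iterate (parallelizable)
        fine = [f_fine(y[i], ts[i], ts[i + 1]) for i in range(n_slices)]
        coarse_old = [f_coarse(y[i], ts[i], ts[i + 1]) for i in range(n_slices)]
        # corrector: sequential coarse sweep plus fine-minus-coarse correction
        y_new = [y0]
        for i in range(n_slices):
            g_new = f_coarse(y_new[i], ts[i], ts[i + 1])
            y_new.append(g_new + fine[i] - coarse_old[i])
        y = y_new
    return y
```

After k iterations the first k slice values match the sequential fine solution exactly, which is why convergence in few iterations is the whole game.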
Xyce parallel electronic simulator design : mathematical formulation, version 2.0.
Hoekstra, Robert John; Waters, Lon J.; Hutchinson, Scott Alan; Keiter, Eric Richard; Russo, Thomas V.
2004-06-01
This document is intended to contain a detailed description of the mathematical formulation of Xyce, a massively parallel SPICE-style circuit simulator developed at Sandia National Laboratories. The target audience of this document is people in the role of 'service provider'. An example of such a person would be a linear solver expert who is spending a small fraction of his time developing solver algorithms for Xyce. Such a person probably is not an expert in circuit simulation, and would benefit from a description of the equations solved by Xyce. In this document, modified nodal analysis (MNA) is described in detail, with a number of examples. Issues that are unique to circuit simulation, such as voltage limiting, are also described in detail.
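Modified nodal analysis can be made concrete on a toy resistive divider. The sketch below (plain Python for illustration, not Xyce code) stamps two conductances and one voltage source into the MNA system and solves it; unknowns are the node voltages plus the source branch current:

```python
def solve(a, b):
    """Dense Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

# MNA for a 5 V source driving a 1 kOhm / 1 kOhm divider.
# Unknowns: node voltages v1, v2 and the source branch current i_v.
g1 = g2 = 1e-3  # conductances in siemens
a = [[g1, -g1, 1.0],     # KCL at node 1 (includes source current)
     [-g1, g1 + g2, 0.0],  # KCL at node 2
     [1.0, 0.0, 0.0]]    # source constraint: v1 = 5 V
z = [0.0, 0.0, 5.0]
v1, v2, i_v = solve(a, z)
```

The extra branch-current row/column is exactly what "modified" adds to plain nodal analysis: it lets ideal voltage sources (whose current is not a function of node voltages) be stamped directly.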
Large-scale nonlinear structural analysis simulation on CRAY parallel/vector supercomputers
NASA Astrophysics Data System (ADS)
Poole, Eugene; Bauer, John; Stratton, Troy; Weidner, Tom
1993-12-01
Two large-scale nonlinear structural analysis simulations of a nine-degree model of the Space Shuttle Redesigned Solid Rocket Motor (RSRM) factory joint are described. The first analysis simulates a burst pressure test where a pressure load was incrementally applied from 0 to 1500 p.s.i. This simulation was used to assist in evaluating and defining an error criterion based on correlation of the finite element analysis with experimentally determined failure points from an actual burst pressure test of a metal case containing the tang and clevis joint. The second part of the analysis is used to show the cyclic stress-strain behavior of the joint assembly and to evaluate low cycle fatigue in the local yield zones. The finite element analysis of the RSRM factory joint was conducted using the ANSYS structural analysis code, and the computer runs were made using CRAY Y-MP 8I and CRAY C90 supercomputers at Cray Research Inc. Performance aspects of this large-scale analysis are described, including both parallel/vector algorithms used within the ANSYS code and a new I/O software layer under development at Cray Research designed to reduce I/O wait time for engineering applications. Key features of the finite element model are discussed, including features to improve accuracy and efficiency.
The Poisson Simulation Approach to Combined Simulation
Gustafsson, Leif (Signals and Systems Dept.)
Extends the foundations of combined Discrete Event Simulation (DES) and Continuous Systems Simulation (CSS) beyond the traditional combined DES/CSS approach, in which the modeller selects between a DES and a CSS description.
Guo Fan [Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545 (United States); Giacalone, Joe, E-mail: guofan.ustc@gmail.com [Department of Planetary Sciences and Lunar and Planetary Laboratory, University of Arizona, 1629 E. University Blvd., Tucson, AZ 85721 (United States)
2013-08-20
We present three-dimensional hybrid simulations of collisionless shocks that propagate parallel to the background magnetic field to study the acceleration of protons that form a high-energy tail on the distribution. We focus on the initial acceleration of thermal protons and compare it with results from one-dimensional simulations. We find that for both one- and three-dimensional simulations, particles that end up in the high-energy tail of the distribution later in the simulation gained their initial energy right at the shock. This confirms previous results, but is the first demonstration using fully three-dimensional fields. The result is not consistent with the ''thermal leakage'' model. We also show that the gyrocenters of protons in the three-dimensional simulation can drift away from the magnetic field lines on which they started, due to the removal of ignorable coordinates that exist in one- and two-dimensional simulations. Our study clarifies the injection problem for diffusive shock acceleration.
pWeb: A High-Performance, Parallel-Computing Framework for Web-Browser-Based Medical Simulation.
Halic, Tansel; Ahn, Woojin; De, Suvranu
2014-01-01
This work presents pWeb, a new language and compiler for parallelization of client-side compute-intensive web applications such as surgical simulations. The recently introduced HTML5 standard has enabled creating unprecedented applications on the web. The low performance of the web browser relative to native code, however, remains the bottleneck for computationally intensive applications, including visualization of complex scenes, real-time physical simulations, and image processing. The new proposed language is built upon web workers for multithreaded programming in HTML5. The language provides fundamental functionalities of parallel programming languages as well as the fork/join parallel model, which is not supported by web workers. The language compiler automatically generates an equivalent parallel script that complies with the HTML5 standard. A case study on realistic rendering for surgical simulations demonstrates enhanced performance with a compact set of instructions. PMID:24732497
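The fork/join model the compiler layers on top of web workers can be illustrated by analogy with Python's executor API (an assumption purely for illustration; pWeb itself emits HTML5 web-worker scripts):

```python
from concurrent.futures import ThreadPoolExecutor

def fork_join(task, chunks):
    """Fork one worker per data chunk, then join by collecting the
    results in submission order, mirroring the fork/join pattern."""
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        futures = [pool.submit(task, c) for c in chunks]  # fork
        return [f.result() for f in futures]              # join
```

The essential property, which plain web workers lack, is the implicit barrier at the join: the caller resumes only after every forked task has finished.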
Stochastic simulation of charged particle transport on the massively parallel processor
NASA Technical Reports Server (NTRS)
Earl, James A.
1988-01-01
Computations of cosmic-ray transport based upon finite-difference methods are afflicted by instabilities, inaccuracies, and artifacts. To avoid these problems, researchers developed a Monte Carlo formulation which is closely related not only to the finite-difference formulation, but also to the underlying physics of transport phenomena. Implementations of this approach are currently running on the Massively Parallel Processor at Goddard Space Flight Center, whose enormous computing power overcomes the poor statistical accuracy that usually limits the use of stochastic methods. These simulations have progressed to a stage where they provide a useful and realistic picture of solar energetic particle propagation in interplanetary space.
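The flavor of such stochastic transport calculations can be conveyed by a toy 1-D random walk whose mean-square displacement grows linearly with step count; this is a generic Monte Carlo sketch, not the authors' formulation:

```python
import random

def random_walk_msd(n_particles, n_steps, seed=1):
    """Monte Carlo estimate of the mean-square displacement of an
    unbiased 1-D random walk (unit steps); the exact expectation
    is n_steps, and statistical error shrinks with n_particles."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_particles):
        x = 0
        for _ in range(n_steps):
            x += 1 if rng.random() < 0.5 else -1
        total += x * x
    return total / n_particles
```

The "poor statistical accuracy" the abstract mentions is visible here: halving the error requires quadrupling `n_particles`, which is why massively parallel hardware suits the method so well.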
Understanding Performance of Parallel Scientific Simulation Codes using Open|SpeedShop
Ghosh, K K
2011-11-07
Conclusions of this presentation are: (1) Open|SpeedShop (OSS) is convenient to use for large, parallel, scientific simulation codes; (2) large codes benefit from uninstrumented execution; (3) many experiments can be run in a short time, though some may need multiple runs, e.g. usertime for caller-callee data and hwcsamp for hardware counters; (4) a decent idea of a code's performance is easily obtained; (5) statistical sampling calls for a decent number of samples; and (6) hardware counter (HWC) data is very useful for micro-analysis but can be tricky to analyze.
Simulation of optical devices using parallel finite-difference time-domain method
NASA Astrophysics Data System (ADS)
Li, Kang; Kong, Fanmin; Mei, Liangmo; Liu, Xin
2005-11-01
This paper presents a new parallel finite-difference time-domain (FDTD) numerical method in a low-cost network environment to simulate optical waveguide characteristics. The PC-motherboard-based cluster is used, as it is relatively low-cost, reliable, and has high computing performance. Four clusters are networked by fast Ethernet technology. Due to the simple nature of the FDTD algorithm, a native Ethernet packet communication mechanism is used to reduce the overhead of the communication between adjacent clusters. To validate the method, a microcavity ring resonator based on semiconductor waveguides is chosen as an instance of FDTD parallel computation. The speed-up rate under different division densities is calculated. From the result we can conclude that when the decomposition size reaches a certain point, a good parallel computing speed-up will be maintained. This simulation shows that by overlapping computation and communication and by controlling the decomposition size, the overhead of communicating the shared data can be overcome. The result indicates that the implementation can achieve significant speed-up for the FDTD algorithm. This will enable us to tackle larger real electromagnetic problems with low-cost PC clusters.
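The core FDTD update being distributed is a simple leapfrog. A serial 1-D sketch in normalized units follows; in the parallel version each node would own a slab of cells and exchange only the boundary field values each step, which is the "shared data" the paper overlaps with computation:

```python
import math

def fdtd_1d(n_cells, n_steps, src_pos):
    """1-D FDTD in normalized units (Courant number 0.5): leapfrog
    update of E and H on a staggered grid, with a hard sinusoidal
    source injected at src_pos each step."""
    ez = [0.0] * n_cells
    hy = [0.0] * n_cells
    c = 0.5  # Courant number
    for t in range(n_steps):
        for k in range(n_cells - 1):          # H update from curl of E
            hy[k] += c * (ez[k + 1] - ez[k])
        for k in range(1, n_cells):           # E update from curl of H
            ez[k] += c * (hy[k] - hy[k - 1])
        ez[src_pos] = math.sin(0.1 * t)       # hard source
    return ez
```

Because each cell update reads only nearest neighbors, a slab decomposition needs just one ghost cell per boundary per step, keeping the communication-to-computation ratio low.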
Parallel Simulation of Three-Dimensional Free-Surface Fluid Flow Problems
BAER,THOMAS A.; SUBIA,SAMUEL R.; SACKINGER,PHILIP A.
2000-01-18
We describe parallel simulations of viscous, incompressible, free-surface, Newtonian fluid flow problems that include dynamic contact lines. The Galerkin finite element method was used to discretize the fully-coupled governing conservation equations, and a ''pseudo-solid'' mesh mapping approach was used to determine the shape of the free surface. In this approach, the finite element mesh is allowed to deform to satisfy quasi-static solid mechanics equations subject to geometric or kinematic constraints on the boundaries. As a result, nodal displacements must be included in the set of problem unknowns. Issues concerning the proper constraints along the solid-fluid dynamic contact line in three dimensions are discussed. Parallel computations are carried out for an example taken from the coating flow industry: flow in the vicinity of a slot coater edge. This is a three-dimensional free-surface problem possessing a contact line that advances at the web speed in one region but transitions to static behavior in another part of the flow domain. Discussion focuses on parallel speedups for fixed problem size, a class of problems of immediate practical importance.
L-PICOLA: A parallel code for fast dark matter simulation
Howlett, Cullan; Percival, Will J
2015-01-01
Robust measurements based on current large-scale structure surveys require precise knowledge of statistical and systematic errors. This can be obtained from large numbers of realistic mock galaxy catalogues that mimic the observed distribution of galaxies within the survey volume. To this end we present a fast, distributed-memory, planar-parallel code, L-PICOLA, which can be used to generate and evolve a set of initial conditions into a dark matter field much faster than a full non-linear N-Body simulation. Additionally, L-PICOLA has the ability to include primordial non-Gaussianity in the simulation and simulate the past lightcone at run-time, with optional replication of the simulation volume. Through comparisons to fully non-linear N-Body simulations we find that our code can reproduce the $z=0$ power spectrum and reduced bispectrum of dark matter to within 2% on all scales of interest to measurements of Baryon Acoustic Oscillations and Redshift Space Distortions, but 3 orders of magnitude faster. The accu...
Pesce, Lorenzo L.; Lee, Hyong C.; Stevens, Rick L.
2013-01-01
Our limited understanding of the relationship between the behavior of individual neurons and large neuronal networks is an important limitation in current epilepsy research and may be one of the main causes of our inadequate ability to treat it. Addressing this problem directly via experiments is impossibly complex; thus, we have been developing and studying medium-large-scale simulations of detailed neuronal networks to guide us. Flexibility in the connection schemas and a complete description of the cortical tissue seem necessary for this purpose. In this paper we examine some of the basic issues encountered in these multiscale simulations. We have determined the detailed behavior of two such simulators on parallel computer systems. The observed memory and computation-time scaling behavior for a distributed memory implementation were very good over the range studied, both in terms of network sizes (2,000 to 400,000 neurons) and processor pool sizes (1 to 256 processors). Our simulations required between a few megabytes and about 150 gigabytes of RAM and lasted between a few minutes and about a week, well within the capability of most multinode clusters. Therefore, simulations of epileptic seizures on networks with millions of cells should be feasible on current supercomputers. PMID:24416069
Gait simulation via a 6-DOF parallel robot with iterative learning control.
Aubin, Patrick M; Cowley, Matthew S; Ledoux, William R
2008-03-01
We have developed a robotic gait simulator (RGS) by leveraging a 6-degree-of-freedom parallel robot, with the goal of overcoming three significant challenges of gait simulation, including: 1) operating at near physiologically correct velocities; 2) inputting full-scale ground reaction forces; and 3) simulating motion in all three planes (sagittal, coronal and transverse). The robot will eventually be employed with cadaveric specimens, but as a means of exploring the capability of the system, we have first used it with a prosthetic foot. Gait data were recorded from one transtibial amputee using a motion analysis system and force plate. Using the same prosthetic foot as the subject, the RGS accurately reproduced the recorded kinematics and kinetics, and the appropriate vertical ground reaction force was realized with a proportional iterative learning controller. After six gait iterations the controller reduced the root mean square (RMS) error between the simulated and in situ vertical ground reaction forces to 35 N during a 1.5 s simulation of the stance phase of gait with a prosthetic foot. This paper addresses the design, methodology and validation of the novel RGS. PMID:18334421
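A proportional iterative learning controller of the kind described updates the next iteration's input trajectory with a scaled copy of the current iteration's tracking error. A minimal sketch, with an assumed static toy plant for illustration (the real plant is the robot/foot dynamics):

```python
def ilc_update(u, target, measured, gain=0.5):
    """Proportional ILC: next iteration's input = current input plus
    gain times this iteration's tracking error, sample by sample."""
    return [ui + gain * (t - m) for ui, t, m in zip(u, target, measured)]
```

Because the same trajectory repeats every gait cycle, the error contracts geometrically across iterations as long as the gain keeps the iteration map stable, matching the reported convergence within a handful of gait iterations.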
Simulating massively parallel electron beam inspection for sub-20 nm defects
NASA Astrophysics Data System (ADS)
Bunday, Benjamin D.; Mukhtar, Maseeh; Quoi, Kathy; Thiel, Brad; Malloy, Matt
2015-03-01
SEMATECH has initiated a program to develop massively parallel electron beam defect inspection (MPEBI). Here we use JMONSEL simulations to generate expected imaging responses of chosen test cases of patterns and defects, with the ability to vary parameters for beam energy, spot size, pixel size, and/or defect material and form factor. The patterns are representative of the design rules for an aggressively scaled FinFET-type design. With these simulated images and the resulting shot noise, a signal-to-noise framework is developed, which relates to defect detection probabilities. Additionally, with this infrastructure the effects of detection chain noise and frequency-dependent system response can be assessed, allowing for targeting of the best recipe parameters for MPEBI validation experiments, ultimately leading to insights into how such parameters will impact MPEBI tool design, including the necessary doses for defect detection and estimates of the scanning speeds needed to achieve high throughput for HVM.
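Under pure Poisson shot noise, the contrast signal-to-noise ratio between a defect pixel and its background follows directly from the collected electron counts; a minimal sketch of that relationship (illustrative, not the JMONSEL framework itself):

```python
import math

def shot_noise_snr(signal_e, background_e):
    """Shot-noise-limited contrast SNR between a defect pixel and its
    background, both in detected electrons per pixel. The noise of a
    difference of two Poisson counts is sqrt of their sum."""
    return abs(signal_e - background_e) / math.sqrt(signal_e + background_e)
```

Since both counts scale linearly with dose, the SNR grows as the square root of dose; this is the trade-off that couples required detection probability to scanning speed and throughput.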
Infrastructure for distributed enterprise simulation
Johnson, M.M.; Yoshimura, A.S.; Goldsby, M.E. [and others]
1998-01-01
Traditional discrete-event simulations employ an inherently sequential algorithm and are run on a single computer. However, the demands of many real-world problems exceed the capabilities of sequential simulation systems. Often the capacity of a computer's primary memory limits the size of the models that can be handled, and in some cases parallel execution on multiple processors could significantly reduce the simulation time. This paper describes the development of an Infrastructure for Distributed Enterprise Simulation (IDES) - a large-scale portable parallel simulation framework developed to support Sandia National Laboratories' mission in stockpile stewardship. IDES is based on the Breathing-Time-Buckets synchronization protocol, and maps a message-based model of distributed computing onto an object-oriented programming model. IDES is portable across heterogeneous computing architectures, including single-processor systems, networks of workstations and multi-processor computers with shared or distributed memory. The system provides a simple and sufficient application programming interface that can be used by scientists to quickly model large-scale, complex enterprise systems. In the background and without involving the user, IDES is capable of making dynamic use of idle processing power available throughout the enterprise network. 16 refs., 14 figs.
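The "inherently sequential algorithm" referred to above is the classic event-list loop; a minimal sketch with `heapq` (generic discrete-event simulation, not IDES's Breathing-Time-Buckets protocol):

```python
import heapq

def run(events, horizon):
    """Sequential discrete-event loop: repeatedly pop the earliest
    event, execute its handler, and schedule any events it returns.
    Events are (time, seq, handler); seq is a unique tie-breaker so
    handlers are never compared. Handlers return (delay, handler)
    pairs for new events. Returns the number of events processed."""
    queue = list(events)
    heapq.heapify(queue)
    seq = len(queue)
    processed = 0
    while queue and queue[0][0] <= horizon:
        t, _, handler = heapq.heappop(queue)
        processed += 1
        for dt, h in handler(t):
            heapq.heappush(queue, (t + dt, seq, h))
            seq += 1
    return processed
```

The global event queue is exactly what parallel schemes must break up: each logical process gets its own queue, and a synchronization protocol keeps cross-process event timestamps causally consistent.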
A PARALLEL MONTE CARLO CODE FOR SIMULATING COLLISIONAL N-BODY SYSTEMS
Pattabiraman, Bharath; Umbreit, Stefan; Liao, Wei-keng; Choudhary, Alok; Kalogera, Vassiliki; Memik, Gokhan; Rasio, Frederic A., E-mail: bharath@u.northwestern.edu [Center for Interdisciplinary Exploration and Research in Astrophysics, Northwestern University, Evanston, IL (United States)
2013-02-15
We present a new parallel code for computing the dynamical evolution of collisional N-body systems with up to N ≈ 10^7 particles. Our code is based on the Hénon Monte Carlo method for solving the Fokker-Planck equation, and makes assumptions of spherical symmetry and dynamical equilibrium. The principal algorithmic developments involve optimizing data structures and the introduction of a parallel random number generation scheme as well as a parallel sorting algorithm required to find nearest neighbors for interactions and to compute the gravitational potential. The new algorithms we introduce along with our choice of decomposition scheme minimize communication costs and ensure optimal distribution of data and workload among the processing units. Our implementation uses the Message Passing Interface library for communication, which makes it portable to many different supercomputing architectures. We validate the code by calculating the evolution of clusters with initial Plummer distribution functions up to core collapse with the number of stars, N, spanning three orders of magnitude from 10^5 to 10^7. We find that our results are in good agreement with self-similar core-collapse solutions, and the core-collapse times generally agree with expectations from the literature. Also, we observe good total energy conservation, within ≲0.04% throughout all simulations. We analyze the performance of the code, and demonstrate near-linear scaling of the runtime with the number of processors up to 64 processors for N = 10^5, 128 for N = 10^6 and 256 for N = 10^7. The runtime reaches saturation with the addition of processors beyond these limits, which is a characteristic of the parallel sorting algorithm. The resulting maximum speedups we achieve are approximately 60×, 100×, and 220×, respectively.
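Setting up the initial Plummer model can be sketched by inverting its cumulative mass profile, M(r)/M = r^3 / (r^2 + a^2)^(3/2); the serial illustration below uses the standard inverse-transform trick and is not the authors' parallel code:

```python
import random

def plummer_radii(n, a=1.0, seed=7):
    """Draw n radii from a Plummer sphere of scale length a by
    inverse-transform sampling: set u = M(r)/M uniform in (0, 1)
    and solve for r, giving r = a * (u**(-2/3) - 1)**(-1/2)."""
    rng = random.Random(seed)
    radii = []
    for _ in range(n):
        u = rng.random()
        radii.append(a * (u ** (-2.0 / 3.0) - 1.0) ** -0.5)
    return radii
```

In a distributed setting each rank would draw its share of stars with an independent stream, which is where the paper's parallel random number generation scheme comes in.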
MDSLB: A new static load balancing method for parallel molecular dynamics simulations
NASA Astrophysics Data System (ADS)
Wu, Yun-Long; Xu, Xin-Hai; Yang, Xue-Jun; Zou, Shun; Ren, Xiao-Guang
2014-02-01
Large-scale parallelization of molecular dynamics simulations is facing challenges which seriously affect the simulation efficiency, among which the load imbalance problem is the most critical. In this paper, we propose a new molecular dynamics static load balancing method (MDSLB). By analyzing the characteristics of the short-range force of molecular dynamics programs running in parallel, we divide the short-range force into three kinds of force models, and then package the computations of each force model into many tiny computational units called “cell loads”, which provide the basic data structures for our load balancing method. In MDSLB, the spatial region is separated into sub-regions called “local domains”, and the cell loads of each local domain are allocated to every processor in turn. Compared with dynamic load balancing methods, MDSLB can guarantee load balance by executing the algorithm only once at program startup, without migrating loads dynamically. We implemented MDSLB in OpenFOAM and tested it on the TianHe-1A supercomputer with 16 to 512 processors. Experimental results show that MDSLB can save 34%-64% of execution time for the load-imbalanced cases.
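The "allocated to every processor in turn" step can be pictured as a round-robin deal of cell loads; a toy sketch of the static assignment idea (illustrative only, not the OpenFOAM implementation):

```python
def assign_round_robin(cell_loads, n_procs):
    """Deal cell loads out to processors in turn and return the total
    load per processor. Interleaving many small units tends to even
    out totals even when individual loads differ."""
    bins = [[] for _ in range(n_procs)]
    for i, load in enumerate(cell_loads):
        bins[i % n_procs].append(load)
    return [sum(b) for b in bins]
```

The key design point is that balance is fixed once at startup: the per-unit costs come from the static force-model analysis, so no runtime load migration (and none of its communication cost) is needed.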
SDA 7: A modular and parallel implementation of the simulation of diffusional association software.
Martinez, Michael; Bruce, Neil J; Romanowska, Julia; Kokh, Daria B; Ozboyaci, Musa; Yu, Xiaofeng; Öztürk, Mehmet Ali; Richter, Stefan; Wade, Rebecca C
2015-08-01
The simulation of diffusional association (SDA) Brownian dynamics software package has been widely used in the study of biomacromolecular association. Initially developed to calculate bimolecular protein-protein association rate constants, it has since been extended to study electron transfer rates, to predict the structures of biomacromolecular complexes, to investigate the adsorption of proteins to inorganic surfaces, and to simulate the dynamics of large systems containing many biomacromolecular solutes, allowing the study of concentration-dependent effects. These extensions have led to a number of divergent versions of the software. In this article, we report the development of the latest version of the software (SDA 7). This release was developed to consolidate the existing codes into a single framework, while improving the parallelization of the code to better exploit modern multicore shared memory computer architectures. It is built using a modular object-oriented programming scheme, to allow for easy maintenance and extension of the software, and includes new features, such as adding flexible solute representations. We discuss a number of application examples, which describe some of the methods available in the release, and provide benchmarking data to demonstrate the parallel performance. © 2015 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc. PMID:26123630
Matsuda, K.; Terada, N.; Katoh, Y. [Space and Terrestrial Plasma Physics Laboratory, Department of Geophysics, Graduate School of Science, Tohoku University, Sendai, Miyagi 980-8578 (Japan); Misawa, H. [Planetary Plasma and Atmospheric Research Center, Graduate School of Science, Tohoku University, Sendai, Miyagi 980-8578 (Japan)
2011-08-15
There has been great concern about the origin of the parallel electric field, in the frame of fluid equations, in the auroral acceleration region. This paper proposes a new method to simulate magnetohydrodynamic (MHD) equations that include the electron convection term and shows its efficiency with simulation results in one dimension. We apply a third-order semi-discrete central scheme to investigate the characteristics of the electron convection term, including its nonlinearity. At a steady-state discontinuity, the sum of the ion and electron convection terms balances with the ion pressure gradient. We find that the electron convection term works like the gradient of a negative pressure and reduces the ion sound speed, or amplifies the sound mode, when parallel current flows. The electron convection term enables us to describe a situation in which a parallel electric field and parallel electron acceleration coexist, which is impossible for ideal or resistive MHD.
University of Miami; Zuo, Wangda; McNeil, Andrew; Wetter, Michael; Lee, Eleanor S.
2013-04-30
Building designers are increasingly relying on complex fenestration systems to reduce energy consumed for lighting and HVAC in low-energy buildings. Radiance, a lighting simulation program, has been used to conduct daylighting simulations for complex fenestration systems. Depending on the configuration, a simulation can take hours or even days on a personal computer. This paper describes how to accelerate the matrix multiplication portion of a Radiance three-phase daylight simulation by conducting parallel computing on the heterogeneous hardware of a personal computer. The algorithm was optimized and the computational part was implemented in parallel using OpenCL. The speed of the new approach was evaluated using various daylighting simulation cases on a multicore central processing unit and a graphics processing unit. Based on the measurements and analysis of the time usage for the Radiance daylighting simulation, further speedups can be achieved by using fast I/O devices and storing the data in a binary format.
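The accelerated portion is a chain of dense matrix products (in the three-phase method, roughly illuminance = view matrix × transmission matrix × daylight matrix × sky vector). The naive kernel below is a plain-Python stand-in for the OpenCL version, purely for illustration:

```python
def matmul(a, b):
    """Naive dense matrix multiply; each output element is an
    independent dot product, which is what makes this kernel easy
    to parallelize across CPU cores or GPU work-items."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]
```

Because every output element can be computed independently, the work maps naturally onto OpenCL's grid of work-items, with the remaining cost dominated by moving the matrices to and from device memory.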
NASA Astrophysics Data System (ADS)
Saito, Kenichiro; Koizumi, Eiko; Koizumi, Hideya
2012-09-01
In our previous study, we introduced a new hybrid approach to effectively approximate the total force on each ion during a trajectory calculation in mass spectrometry device simulations, and the algorithm worked successfully with SIMION. We took one step further and applied the method in massively parallel general-purpose computing with GPU (GPGPU) to test its performance in simulations with thousands to over a million ions. We took extra care to minimize the barrier synchronization and data transfer between the host (CPU) and the device (GPU) memory, and took full advantage of latency hiding. Parallel codes were written in CUDA C++ and implemented in SIMION via the user-defined Lua program. In this study, we tested the parallel hybrid algorithm with a couple of basic models and analyzed the performance by comparing it to that of the original, fully explicit method written in serial code. The Coulomb explosion simulation with 128,000 ions was completed in 309 s, over 700 times faster than the 63 h taken by the original explicit method, in which we evaluated two-body Coulomb interactions explicitly between each ion and all the others. The simulation of 1,024,000 ions was completed in 2650 s. In another example, we applied the hybrid method to a simulation of ions in a simple quadrupole ion storage model with 100,000 ions, and it took less than 10 days. Based on our estimate, the same simulation would be expected to take 5-7 years with the explicit method in serial code.
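The fully explicit baseline being compared against is the O(N^2) pairwise loop; a 1-D unit-charge sketch for illustration (the real code works in 3-D with physical charges):

```python
def coulomb_forces(xs, k=1.0):
    """Explicit O(N^2) pairwise Coulomb forces on a line of unit
    charges at positions xs; Newton's third law halves the pair
    count. This per-step cost is what the hybrid/GPU method avoids."""
    n = len(xs)
    forces = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            d = xs[i] - xs[j]
            f = k * d / abs(d) ** 3  # repulsive force on i from j
            forces[i] += f
            forces[j] -= f
    return forces
```

At N = 128,000 this loop touches about 8 × 10^9 pairs per time step, which is why the explicit serial version took 63 hours and why both the hybrid approximation and GPU parallelism pay off.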
Parallel Adaptive Simulation of Weak and Strong Transverse-Wave Structures in H2-O2 Detonations
Deiterding, Ralf [ORNL]
2010-01-01
Two- and three-dimensional simulation results are presented that investigate at great detail the temporal evolution of Mach reflection sub-structure patterns intrinsic to gaseous detonation waves. High local resolution is achieved by utilizing a distributed memory parallel shock-capturing finite volume code that employs block-structured dynamic mesh adaptation. The computational approach, the implemented parallelization strategy, and the software design are discussed.
NASA Astrophysics Data System (ADS)
Weiss, C. J.; Schultz, A.
2011-12-01
The high computational cost of the forward solution for modeling low-frequency electromagnetic induction phenomena is one of the primary impediments against broad-scale adoption by the geoscience community of exploration techniques, such as magnetotellurics and geomagnetic depth sounding, that rely on fast and cheap forward solutions to make tractable the inverse problem. As geophysical observables, electromagnetic fields are direct indicators of Earth's electrical conductivity - a physical property independent of (but in some cases correlative with) seismic wavespeed. Electrical conductivity is known to be a function of Earth's physiochemical state and temperature, and to be especially sensitive to the presence of fluids, melts and volatiles. Hence, electromagnetic methods offer a critical and independent constraint on our understanding of Earth's interior processes. Existing methods for parallelization of time-harmonic electromagnetic simulators, as applied to geophysics, have relied heavily on a combination of strategies: coarse-grained decompositions of the model domain; and/or, a high-order functional decomposition across spectral components, which in turn can be domain-decomposed themselves. Hence, in terms of scaling, both approaches are ultimately limited by the growing communication cost as the granularity of the forward problem increases. In this presentation we examine alternate parallelization strategies based on OpenMP shared-memory parallelization and CUDA-based GPU parallelization. As a test case, we use two different numerical simulation packages, each based on a staggered Cartesian grid: FDM3D (Weiss, 2006) which solves the curl-curl equation directly in terms of the scattered electric field (available under the LGPL at www.openem.org); and APHID, the A-Phi Decomposition based on mixed vector and scalar potentials, in which the curl-curl operator is replaced operationally by the vector Laplacian. 
We describe progress made in modifying the code to use direct solvers in GPU cores dedicated to each small subdomain, iteratively improving the solution by matching adjacent subdomain boundary solutions, rather than iterative Krylov space sparse solvers as currently applied to the whole domain.
Levin, A.E. (Georgia Inst. of Tech., Atlanta, GA (USA)); Montgomery, B.H. (Oak Ridge National Lab., TN (USA))
1990-01-01
The Thermal-Hydraulic Out of Reactor Safety (THORS) Program at Oak Ridge National Laboratory (ORNL) had as its objective the testing of simulated, electrically heated liquid metal reactor (LMR) fuel assemblies in an engineering-scale sodium loop. Between 1971 and 1985, the THORS Program operated 11 simulated fuel bundles under a wide range of normal and off-normal conditions. The last test series in the Program, THORS-SHRS Assembly 1, employed two parallel, 19-pin, full-length, simulated fuel assemblies of a design consistent with the large LMR (Large Scale Prototype Breeder -- LSPB) under development at that time. These bundles were installed in the THORS Facility, allowing single- and parallel-bundle testing in thermal-hydraulic conditions up to and including sodium boiling and dryout. As the name SHRS (Shutdown Heat Removal System) implies, a major objective of the program was testing under conditions expected during low-power reactor operation, including low-flow forced convection, natural convection, and forced-to-natural convection transition at various powers. The THORS-SHRS Assembly 1 experimental program was divided into four phases. Phase 1 included preliminary and shakedown tests, including the collection of baseline steady-state thermal-hydraulic data. Phase 2 comprised natural convection testing. Forced convection testing was conducted in Phase 3. The final phase of testing included forced-to-natural convection transition tests. Phases 1, 2, and 3 have been discussed in previous papers. The fourth phase is described in this paper. 3 refs., 2 figs.
SIMULATING COMPLEX SYSTEMS IN LABVIEW
György Lipovszki
Modeling and simulation of systems, especially in science and engineering, can help to reduce the risk and cost of design and testing processes. According to Cellier, the established mathematical models can be classified as follows: continuous time, discrete time, quantitative models and discrete event models. A large number of simulation software packages have been developed to support modeling and simulation efforts. All
Accelerating Dust Storm Simulation by Balancing Task Allocation in Parallel Computing Environment
NASA Astrophysics Data System (ADS)
Gui, Z.; Yang, C.; Xia, J.; Huang, Q.; Yu, M.
2013-12-01
Dust storms have serious negative impacts on the environment, human health, and assets. The continuing global climate change has increased the frequency and intensity of dust storms in the past decades. To better understand and predict the distribution, intensity and structure of dust storms, a series of dust storm models have been developed, such as the Dust Regional Atmospheric Model (DREAM), the NMM meteorological module (NMM-dust) and the Chinese Unified Atmospheric Chemistry Environment for Dust (CUACE/Dust). The developments and applications of these models have contributed significantly to both scientific research and our daily life. However, dust storm simulation is a data- and computing-intensive process. Normally, a simulation of a single dust storm event may take several hours or even days to run. This seriously impacts the timeliness of prediction and potential applications. To speed up the process, high performance computing is widely adopted. By partitioning a large study area into small subdomains according to their geographic location and executing them on different computing nodes in a parallel fashion, the computing performance can be significantly improved. Since spatiotemporal correlations exist in the geophysical process of dust storm simulation, each subdomain allocated to a node needs to communicate with other geographically adjacent subdomains to exchange data. Inappropriate allocations may introduce imbalanced task loads and unnecessary communications among computing nodes. Therefore, the task allocation method is a key factor that may determine the feasibility of the parallelization. The allocation algorithm needs to carefully balance the computing cost and communication cost for each computing node to minimize total execution time and reduce overall communication cost for the entire system. This presentation introduces two algorithms for such allocation and compares them with an evenly distributed allocation method.
Specifically: 1) In order to obtain optimized solutions, a quadratic programming based modeling method is proposed. This algorithm performs well with a small number of computing tasks. However, its efficiency decreases significantly as the subdomain number and computing node number increase. 2) To compensate for this performance decrease on large-scale tasks, a K-Means clustering based algorithm is introduced. Instead of striving for optimal solutions, this method obtains relatively good feasible solutions within acceptable time. However, it may introduce imbalanced communication among nodes or node-isolated subdomains. This research shows that both algorithms have their own strengths and weaknesses for task allocation. A combination of the two algorithms is under study to obtain a better performance. Keywords: Scheduling; Parallel Computing; Load Balance; Optimization; Cost Model
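The K-Means-based allocation in 2) can be pictured as clustering subdomain centroids by geographic location so that each computing node receives a spatially compact group of subdomains, which keeps most data exchange node-local. The helper below is an illustrative NumPy-only sketch of that idea, not the authors' implementation:

```python
import numpy as np

def kmeans_allocate(centroids, n_nodes, iters=50, seed=0):
    """Assign subdomains (given by their 2-D geographic centroids) to
    n_nodes computing nodes via a plain K-Means loop, so geographically
    adjacent subdomains tend to land on the same node."""
    rng = np.random.default_rng(seed)
    centers = centroids[rng.choice(len(centroids), n_nodes, replace=False)]
    for _ in range(iters):
        # assign each subdomain to its nearest cluster center
        d = np.linalg.norm(centroids[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each center to the mean of its assigned subdomains
        for k in range(n_nodes):
            members = centroids[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    return labels

# An 8x8 grid of subdomain centroids allocated across 4 nodes.
xy = np.array([[i, j] for i in range(8) for j in range(8)], dtype=float)
labels = kmeans_allocate(xy, 4)
assert labels.shape == (64,)
```

As the abstract notes, this ignores per-node load balance, so a cluster may end up with more subdomains (and work) than another; that is the weakness the quadratic-programming formulation addresses.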
A parallelized particle tracing code for massive 3D mantle flow simulations
NASA Astrophysics Data System (ADS)
Manea, V.; Manea, M.; Pomeran, M.; Besutiu, L.; Zlagnean, L.
2013-05-01
The problem of convective flows in a highly viscous fluid represents a common research direction in Earth Sciences. For tracing the convective motion of the fluid material, a set of passive particles (or tracers) that flow at the local convection velocity and do not affect the flow pattern is commonly used. Here we present a parallelized tracer code that uses passive and weightless particles, with their positions computed from their displacement during a small time interval at the flow velocity previously calculated for a given point in space and time. The tracer code is integrated in the open source package CitcomS, which is widely used in the solid earth community (www.geodynamics.org). We benchmarked the tracer code on the state-of-the-art CyberDyn parallel machine, a High Performance Computing (HPC) cluster with 1344 computing cores available at the Institute of Geodynamics of the Romanian Academy. The benchmark tests were performed using a series of 3D geodynamic settings in which we introduced various clusters of tracers at different places in the models. Using several million particles, the benchmark results show that the parallelized tracer code performs well, with an optimum number of computing cores between 32 and 64. Because of the large amount of communication among the computing cores, high-resolution CFD simulations for geodynamic predictions that require tens of millions, or even billions, of tracers to accurately track mantle flow will greatly benefit from HPC systems based on low-latency high-speed interconnects. In this paper we present several case studies of 3D mantle flow as revealed by tracers in active subduction zones, such as the subduction of the Rivera and Cocos plates beneath the North American plate and the subduction of the Nazca plate beneath the South American plate.
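The tracer update described above, displacement over a small time interval at the locally calculated flow velocity, amounts to explicit advection of the form x ← x + v(x) Δt. A minimal sketch, with an assumed analytic velocity field standing in for the CitcomS flow solution:

```python
import numpy as np

def advect(tracers, velocity, dt, steps):
    """Advance passive, weightless tracers: each position is displaced by
    the flow velocity evaluated at the tracer's location over a small dt."""
    for _ in range(steps):
        tracers = tracers + velocity(tracers) * dt
    return tracers

# Illustrative analytic flow: solid-body rotation about the origin.
def velocity(p):
    return np.stack([-p[:, 1], p[:, 0]], axis=1)

pts = np.array([[1.0, 0.0], [0.0, 0.5]])
out = advect(pts, velocity, dt=1e-3, steps=1000)
# Forward Euler approximately preserves the radius for small dt.
assert np.allclose(np.linalg.norm(out, axis=1),
                   np.linalg.norm(pts, axis=1), rtol=1e-2)
```

In the parallel code, the expensive part is not this update but locating each tracer's velocity on a domain-decomposed mesh, which is what generates the inter-core communication the abstract identifies as the bottleneck.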
Weaver, R. P. (Robert P.); Gittings, M. L. (Michael L.)
2004-01-01
The Los Alamos Crestone Project is part of the Department of Energy's (DOE) Accelerated Strategic Computing Initiative, or ASCI Program. The main goal of this software development project is to investigate the use of continuous adaptive mesh refinement (CAMR) techniques for application to problems of interest to the Laboratory. There are many code development efforts in the Crestone Project, both unclassified and classified codes. In this overview I will discuss the unclassified SAGE and RAGE codes. The SAGE (SAIC adaptive grid Eulerian) code is a one-, two-, and three-dimensional multimaterial Eulerian massively parallel hydrodynamics code for use in solving a variety of high-deformation flow problems. The RAGE CAMR code is built from the SAGE code by adding various radiation packages, improved setup utilities and graphics packages and is used for problems in which radiation transport of energy is important. The goal of these massively-parallel versions of the codes is to run extremely large problems in a reasonable amount of calendar time. Our target is scalable performance to ~10,000 processors on a 1 billion CAMR computational cell problem that requires hundreds of variables per cell, multiple physics packages (e.g. radiation and hydrodynamics), and implicit matrix solves for each cycle. A general description of the RAGE code has been published in [1], [2], [3] and [4]. Currently, the largest simulations we do are three-dimensional, using around 500 million computational cells and running for literally months of calendar time using ~2000 processors. Current ASCI platforms range from several 3-teraOPS supercomputers to one 12-teraOPS machine at Lawrence Livermore National Laboratory, the White machine, and one 20-teraOPS machine installed at Los Alamos, the Q machine. Each machine is a system comprised of many component parts that must perform in unity for the successful run of these simulations.
Key features of any massively parallel system include the processors, the disks, the interconnection between processors, the operating system, libraries for message passing and parallel I/O, and other fundamental units of the system. We will give an overview of the current status of the Crestone Project codes SAGE and RAGE. These codes are intended for general applications without tuning of algorithms or parameters. We have run a wide variety of physical applications from millimeter-scale laboratory laser experiments to the multikilometer-scale asteroid impacts into the Pacific Ocean to parsec-scale galaxy formation. Examples of these simulations will be shown. The goal of our effort is to avoid ad hoc models and attempt to rely on first-principles physics. In addition to the large effort on developing parallel code physics packages, a substantial effort in the project is devoted to improving the computer science and software quality engineering (SQE) of the Project codes as well as a sizable effort on the verification and validation (V&V) of the resulting codes. Examples of these efforts for our project will be discussed.
NASA Astrophysics Data System (ADS)
Hobson, T.; Clarkson, V.
2012-09-01
As a result of continual space activity since the 1950s, there are now a large number of man-made Resident Space Objects (RSOs) orbiting the Earth. Because of the large number of items and their relative speeds, the possibility of destructive collisions involving important space assets is now of significant concern to users and operators of space-borne technologies. As a result, a growing number of international agencies are researching methods for improving techniques to maintain Space Situational Awareness (SSA). Computer simulation is a method commonly used by many countries to validate competing methodologies prior to full scale adoption. The use of supercomputing and/or reduced scale testing is often necessary to effectively simulate such a complex problem on today's computers. Recently, the authors presented a simulation aimed at reducing the computational burden by selecting the minimum level of fidelity necessary for contrasting methodologies and by utilising multi-core CPU parallelism for increased computational efficiency. The resulting simulation runs on a single PC while maintaining the ability to effectively evaluate competing methodologies. Nonetheless, the ability to control the scale and expand upon the computational demands of the sensor management system is limited. In this paper, we examine the advantages of increasing the parallelism of the simulation by means of General Purpose computing on Graphics Processing Units (GPGPU). As many sub-processes pertaining to SSA management are independent, we demonstrate how parallelisation via GPGPU has the potential to significantly enhance not only research into techniques for maintaining SSA, but also to enhance the level of sophistication of existing space surveillance sensors and sensor management systems. Nonetheless, the use of GPGPU imposes certain limitations and adds to the implementation complexity, both of which require consideration to achieve an effective system.
We discuss these challenges and how they can be overcome. We further describe an application of the parallelised system where visibility prediction is used to enhance sensor management. This facilitates significant improvement in maximum catalogue error when RSOs become temporarily unobservable. The objective is to demonstrate the enhanced scalability and increased computational capability of the system.
Yi-Sheng Huang; Yi-Shun Weng; MengChu Zhou
2010-01-01
Deterministic and stochastic Petri nets (DSPNs) are well utilized as a visual and mathematical formalism to model discrete event systems. This paper proposes to use them to model parallel railroad level crossing (LC) control systems. Their applications to both single- and double-track railroad lines are illustrated. The resulting models allow one to identify and thus avoid critical scenarios in such
Experiences with serial and parallel algorithms for channel routing using simulated annealing
NASA Technical Reports Server (NTRS)
Brouwer, Randall Jay
1988-01-01
Two algorithms for channel routing using simulated annealing are presented. Simulated annealing is an optimization methodology which allows the solution process to back up out of local minima that may be encountered through inappropriate selections. By properly controlling the annealing process, it is very likely that the optimal solution to an NP-complete problem such as channel routing may be found. The algorithm presented imposes very relaxed restrictions on the types of allowable transformations, including overlapping nets. By relaxing that restriction and controlling overlap situations with an appropriate cost function, the algorithm becomes very flexible and can be applied to many extensions of channel routing. The selection of the transformation utilizes a number of heuristics, still retaining the pseudorandom nature of simulated annealing. The algorithm was implemented as a serial program for a workstation and as a parallel program designed for a hypercube computer. The details of the serial implementation are presented, including many of the heuristics used and some of the resulting solutions.
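The "back up out of local minima" behavior comes from the standard Metropolis acceptance rule: a worse candidate is accepted with probability exp(-Δcost/T), and T is gradually lowered. A generic sketch of that loop, with a toy one-dimensional cost standing in for the channel-routing cost function (the real router's moves and cost differ):

```python
import math
import random

def anneal(state, cost, neighbor, t0=10.0, cooling=0.995, steps=5000, seed=1):
    """Generic simulated-annealing loop: always accept improvements;
    accept worse moves with probability exp(-delta/T) so the search can
    climb out of local minima while T is high."""
    rng = random.Random(seed)
    best = cur = state
    t = t0
    for _ in range(steps):
        cand = neighbor(cur, rng)
        delta = cost(cand) - cost(cur)
        if delta < 0 or rng.random() < math.exp(-delta / t):
            cur = cand
            if cost(cur) < cost(best):
                best = cur
        t *= cooling  # geometric cooling schedule
    return best

# Toy stand-in for a routing cost: a multimodal 1-D function.
cost = lambda x: (x * x - 4) ** 2 + 0.3 * x
neighbor = lambda x, rng: x + rng.uniform(-0.5, 0.5)
best = anneal(5.0, cost, neighbor)
assert cost(best) < cost(5.0)
```

In the routing context, `neighbor` would apply one of the relaxed transformations (including moves that create net overlaps), and `cost` would penalize overlap so it is eliminated as the temperature drops.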
Hybrid parallel strategy for the simulation of fast transient accidental situations at reactor scale
NASA Astrophysics Data System (ADS)
Faucher, V.; Galon, P.; Beccantini, A.; Crouzet, F.; Debaud, F.; Gautier, T.
2014-06-01
This contribution is dedicated to the latest methodological developments implemented in the fast transient dynamics software EUROPLEXUS (EPX) to simulate the mechanical response of fully coupled fluid-structure systems to accidental situations to be considered at reactor scale, among which the Loss of Coolant Accident, the Core Disruptive Accident and the Hydrogen Explosion. Time integration is explicit and the search for reference solutions within the safety framework prevents any simplification and approximations in the coupled algorithm: for instance, all kinematic constraints are dealt with using Lagrange Multipliers, yielding a complex flow chart when non-permanent constraints such as unilateral contact or immersed fluid-structure boundaries are considered. The parallel acceleration of the solution process is then achieved through a hybrid approach, based on a weighted domain decomposition for distributed memory computing and the use of the KAAPI library for self-balanced shared memory processing inside subdomains.
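Handling kinematic constraints with Lagrange multipliers in an explicit scheme means solving, at each step, for the multiplier forces that make the constrained accelerations consistent. A minimal sketch for a single permanent tie between two degrees of freedom (a toy case, far simpler than the non-permanent contact and immersed fluid-structure boundaries treated in EPX):

```python
def tied_accelerations(m1, m2, f1, f2):
    """Enforce the kinematic tie a1 == a2 between two explicit degrees of
    freedom via a Lagrange multiplier lam, where the constrained equations
    of motion are m1*a1 = f1 + lam and m2*a2 = f2 - lam."""
    # Imposing a1 == a2 and eliminating the accelerations gives:
    lam = (m1 * f2 - m2 * f1) / (m1 + m2)
    a1 = (f1 + lam) / m1
    a2 = (f2 - lam) / m2
    return a1, a2, lam

# Masses 2 kg and 1 kg, external forces 3 N and 0 N: tied together they
# accelerate as one aggregate mass, a = (f1 + f2) / (m1 + m2) = 1 m/s^2.
a1, a2, lam = tied_accelerations(2.0, 1.0, 3.0, 0.0)
assert abs(a1 - a2) < 1e-12 and abs(a1 - 1.0) < 1e-12
```

With many simultaneous and possibly non-permanent constraints (contact activating and deactivating), the multipliers couple into a linear system that changes every step, which is what produces the complex flow chart and motivates the hybrid domain-decomposition/shared-memory parallelization described above.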
A package of Linux scripts for the parallelization of Monte Carlo simulations
NASA Astrophysics Data System (ADS)
Badal, Andreu; Sempau, Josep
2006-09-01
Despite the fact that fast computers are nowadays available at low cost, there are many situations where obtaining a reasonably low statistical uncertainty in a Monte Carlo (MC) simulation involves a prohibitively large amount of time. This limitation can be overcome by having recourse to parallel computing. Most tools designed to facilitate this approach require modification of the source code and the installation of additional software, which may be inconvenient for some users. We present a set of tools, named clonEasy, that implement a parallelization scheme of a MC simulation that is free from these drawbacks. In clonEasy, which is designed to run under Linux, a set of "clone" CPUs is governed by a "master" computer by taking advantage of the capabilities of the Secure Shell (ssh) protocol. Any Linux computer on the Internet that can be ssh-accessed by the user can be used as a clone. A key ingredient for the parallel calculation to be reliable is the availability of an independent string of random numbers for each CPU. Many generators—such as RANLUX, RANECU or the Mersenne Twister—can readily produce these strings by initializing them appropriately and, hence, they are suitable to be used with clonEasy. This work was primarily motivated by the need to find a straightforward way to parallelize PENELOPE, a code for MC simulation of radiation transport that (in its current 2005 version) employs the generator RANECU, which uses a combination of two multiplicative linear congruential generators (MLCGs). Thus, this paper is focused on this class of generators and, in particular, we briefly present an extension of RANECU that increases its period up to ~5×10 and we introduce seedsMLCG, a tool that provides the information necessary to initialize disjoint sequences of an MLCG to feed different CPUs.
This program, in combination with clonEasy, makes it possible to run PENELOPE in parallel easily, without requiring specific libraries or significant alterations of the sequential code.
Program summary 1
Title of program: clonEasy
Catalogue identifier: ADYD_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADYD_v1_0
Program obtainable from: CPC Program Library, Queen's University of Belfast, Northern Ireland
Computer for which the program is designed and others in which it is operable: Any computer with a Unix style shell (bash), support for the Secure Shell protocol and a FORTRAN compiler
Operating systems under which the program has been tested: Linux (RedHat 8.0, SuSe 8.1, Debian Woody 3.1)
Compilers: GNU FORTRAN g77 (Linux); g95 (Linux); Intel Fortran Compiler 7.1 (Linux)
Programming language used: Linux shell (bash) script, FORTRAN 77
No. of bits in a word: 32
No. of lines in distributed program, including test data, etc.: 1916
No. of bytes in distributed program, including test data, etc.: 18 202
Distribution format: tar.gz
Nature of the physical problem: There are many situations where a Monte Carlo simulation involves a huge amount of CPU time. The parallelization of such calculations is a simple way of obtaining a relatively low statistical uncertainty using a reasonable amount of time.
Method of solution: The presented collection of Linux scripts and auxiliary FORTRAN programs implements Secure Shell-based communication between a "master" computer and a set of "clones". The aim of this communication is to execute a code that performs a Monte Carlo simulation on all the clones simultaneously. The code is unique, but each clone is fed with a different set of random seeds. Hence, clonEasy effectively permits the parallelization of the calculation.
Restrictions on the complexity of the program: clonEasy can only be used with programs that produce statistically independent results using the same code, but with a different sequence of random numbers. Users must choose the initialization values for the random number generator on each computer and combine the output from the different executions. A FORTRAN program to combine the final results is also provided.
Typical running time: The execution time
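The key requirement above, a disjoint random-number sequence per clone, can be met for an MLCG by jumping ahead with modular exponentiation, which is the idea behind seedsMLCG. A Python sketch using the constants of the first of RANECU's two generators (the real tool is a FORTRAN program; this only illustrates the principle):

```python
# Multiplier and modulus of the first of RANECU's two MLCGs.
A, M = 40014, 2147483563

def mlcg_stream(seed, n):
    """Generate n successive MLCG states x_{k+1} = A * x_k mod M."""
    out, x = [], seed
    for _ in range(n):
        x = (A * x) % M
        out.append(x)
    return out

def disjoint_seeds(seed, n_cpus, stride):
    """Seed i starts exactly i*stride steps along the single underlying
    sequence, via the jump-ahead relation x_{k+s} = A^s * x_k mod M
    (three-argument pow does the modular exponentiation). Each CPU then
    consumes its own non-overlapping block of `stride` numbers."""
    return [(pow(A, i * stride, M) * seed) % M for i in range(n_cpus)]

seeds = disjoint_seeds(12345, n_cpus=4, stride=10**6)
# Advancing stream 0 by `stride` steps lands exactly on stream 1's seed.
assert mlcg_stream(seeds[0], 10**6)[-1] == seeds[1]
```

The same reasoning applies independently to RANECU's second generator, so each clone gets a disjoint pair of subsequences and the combined results remain statistically independent.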
Staged Simulation: A General Technique for Improving Simulation Scale and Performance
Hybinette, Maria
Staged Simulation: A General Technique for Improving Simulation Scale and Performance. Kevin Walsh and Emin Gün Sirer, Cornell University. This article describes staged simulation, a technique for improving the run time performance and scale of discrete event simulators. Typical network simulations are limited
Implementation of a parallel algorithm for thermo-chemical nonequilibrium flow simulations
Wong, C.C.; Blottner, F.G.; Payne, J.L. [Sandia National Labs., Albuquerque, NM (United States); Soetrisno, M. [Amtec Engineering, Inc., Bellevue, WA (United States)
1995-01-01
Massively parallel (MP) computing is considered to be the future direction of high performance computing. When engineers apply this new MP computing technology to solve large-scale problems, one major question is the maximum problem size that an MP computer can handle. To determine the maximum size, it is important to address the code scalability issue. Scalability indicates whether the code can provide an increase in performance proportional to an increase in problem size. If the problem size increases, then, by utilizing more computer nodes, the elapsed time to simulate the problem should ideally not increase much. Hence one important task in the development of MP computing technology is to ensure scalability. A scalable code is an efficient code. In order to obtain good scaled performance, it is necessary to first optimize the code for single-node performance before proceeding to a large-scale simulation with a large number of computer nodes. This paper discusses the implementation of a massively parallel computing strategy and the process of optimization to improve the scaled performance. Specifically, we look at domain decomposition, resource management in the code, communication overhead, and problem mapping. By incorporating these improvements and adopting an efficient MP computing strategy, efficiencies of about 85% and 96%, respectively, have been achieved using 64 nodes on MP computers for perfect gas and chemically reactive gas problems. A comparison of the performance between MP computers and a vectorized computer, such as the Cray Y-MP, is also presented.
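The efficiency figures quoted (85% and 96% on 64 nodes) follow from the usual definition, efficiency = speedup / node count. A one-line sketch with illustrative timings (not the paper's measurements):

```python
def parallel_efficiency(t_serial, t_parallel, n_nodes):
    """Parallel efficiency = speedup / node count; values near 1.0 (100%)
    indicate a well-scaling code."""
    return (t_serial / t_parallel) / n_nodes

# Hypothetical example: a run taking 1000 s on one node and 18.4 s on
# 64 nodes achieves roughly 85% parallel efficiency.
eff = parallel_efficiency(1000.0, 18.4, 64)
assert 0.84 < eff < 0.86
```

The scaled (weak-scaling) variant in the abstract instead grows the problem with the node count and asks that the elapsed time stay roughly constant; both metrics degrade as communication overhead and load imbalance grow.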
Nielsen, Jens; D’Avezac, Mayeul; Hetherington, James [Research Software Development Team, Research IT Services, University College London, Torrington Place, London WC1E 6BT (United Kingdom)]; Stamatakis, Michail, E-mail: m.stamatakis@ucl.ac.uk [Department of Chemical Engineering, University College London, Torrington Place, London WC1E 7JE (United Kingdom)]
2013-12-14
Ab initio kinetic Monte Carlo (KMC) simulations have been successfully applied for over two decades to elucidate the underlying physico-chemical phenomena on the surfaces of heterogeneous catalysts. These simulations necessitate detailed knowledge of the kinetics of elementary reactions constituting the reaction mechanism, and the energetics of the species participating in the chemistry. The information about the energetics is encoded in the formation energies of gas and surface-bound species, and the lateral interactions between adsorbates on the catalytic surface, which can be modeled at different levels of detail. The majority of previous works accounted for only pairwise-additive first nearest-neighbor interactions. More recently, cluster-expansion Hamiltonians incorporating long-range interactions and many-body terms have been used for detailed estimations of catalytic rate [C. Wu, D. J. Schmidt, C. Wolverton, and W. F. Schneider, J. Catal. 286, 88 (2012)]. In view of the increasing interest in accurate predictions of catalytic performance, there is a need for general-purpose KMC approaches incorporating detailed cluster expansion models for the adlayer energetics. We have addressed this need by building on the previously introduced graph-theoretical KMC framework, and we have developed Zacros, a FORTRAN2003 KMC package for simulating catalytic chemistries. To tackle the high computational cost in the presence of long-range interactions we introduce parallelization with OpenMP. We further benchmark our framework by simulating a KMC analogue of the NO oxidation system established by Schneider and co-workers [J. Catal. 286, 88 (2012)]. We show that taking into account only first nearest-neighbor interactions may lead to large errors in the prediction of the catalytic rate, whereas for accurate estimates thereof, one needs to include long-range terms in the cluster expansion.
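The KMC machinery underlying such simulations repeatedly selects one elementary event with probability proportional to its rate and draws an exponential waiting time. A minimal, illustrative sketch of that core step (Zacros itself is a FORTRAN2003 package with graph-theoretical event detection and cluster-expansion energetics, none of which is reproduced here):

```python
import math
import random

def kmc_step(rates, rng):
    """One rejection-free KMC step: choose event j with probability
    rate_j / sum(rates), then advance time by an exponentially
    distributed waiting time with mean 1 / sum(rates)."""
    total = sum(rates)
    r = rng.random() * total
    acc, j = 0.0, 0
    for j, k in enumerate(rates):
        acc += k
        if r < acc:
            break
    dt = -math.log(rng.random()) / total
    return j, dt

# Illustrative elementary-event rates (s^-1), e.g. adsorption,
# desorption, surface reaction.
rates = [2.0, 0.5, 1.0]
rng = random.Random(42)
t, counts = 0.0, [0, 0, 0]
for _ in range(10000):
    j, dt = kmc_step(rates, rng)
    counts[j] += 1
    t += dt
# Event frequencies track the rate ratios; simulated time accumulates.
assert counts[0] > counts[2] > counts[1] and t > 0.0
```

In a lattice KMC code with lateral interactions, the expensive part is recomputing the affected rates after every step, which is the cost the OpenMP parallelization in Zacros targets.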
Perfect simulation, monotonicity and finite queueing networks
Université Paris-Sud XI
Perfect simulation, monotonicity and finite queueing networks. Jean-Marc Vincent, LIG Laboratory. Simulation approaches are alternative methods to estimate the quality of service of such networks. Based on discrete event simulation [4] or on Markov properties (MCMC methods) [9], simulations estimate the steady
Simulated Wake Characteristics Data for Closely Spaced Parallel Runway Operations Analysis
NASA Technical Reports Server (NTRS)
Guerreiro, Nelson M.; Neitzke, Kurt W.
2012-01-01
A simulation experiment was performed to generate and compile wake characteristics data relevant to the evaluation and feasibility analysis of closely spaced parallel runway (CSPR) operational concepts. While the experiment in this work is not tailored to any particular operational concept, the generated data apply to the broader class of CSPR concepts, where a trailing aircraft on a CSPR approach is required to stay ahead of the wake vortices generated by a lead aircraft on an adjacent CSPR. Data for wake age, circulation strength, and wake altitude change at various lateral offset distances from the wake-generating lead aircraft approach path were compiled for a set of nine aircraft spanning the full range of FAA and ICAO wake classifications. A total of 54 scenarios were simulated to generate data related to key parameters that determine wake behavior. Of particular interest are wake age characteristics that can be used to evaluate both time- and distance-based in-trail separation concepts for all aircraft wake-class combinations. A simple first-order difference model was developed to enable the computation of wake parameter estimates for aircraft models having weight, wingspan and speed characteristics similar to those of the nine aircraft modeled in this work.
Adelmann, Andreas; Gsell, Achim; Oswald, Benedikt; Schietinger, Thomas; Bethel, Wes; Shalf, John; Siegerist, Cristina; Stockinger, Kurt
2007-06-22
Significant problems facing all experimental and computational sciences arise from growing data size and complexity. Common to all these problems is the need to perform efficient data I/O on diverse computer architectures. In our scientific application, the largest parallel particle simulations generate vast quantities of six-dimensional data. Such a simulation run produces data for an aggregate data size up to several TB per run. Motivated by the need to address data I/O and access challenges, we have implemented H5Part, an open source data I/O API that simplifies the use of the Hierarchical Data Format v5 library (HDF5). HDF5 is an industry standard for high performance, cross-platform data storage and retrieval that runs on all contemporary architectures, from large parallel supercomputers to laptops. H5Part, which is oriented to the needs of the particle physics and cosmology communities, provides support for parallel storage and retrieval of particles, structured and, in the future, unstructured meshes. In this paper, we describe recent work focusing on I/O support for particles and structured meshes and provide data showing performance on modern supercomputer architectures like the IBM POWER 5.