Discrete Event Simulation Parallel Discrete-Event Simulation
Discrete Event Simulation Parallel Discrete-Event Simulation Using TM for PDES Conclusions & Future Works Using TM for high-performance Discrete-Event Simulation on multi-core architectures EuroTM 2013 14, 2013 Olivier Dalle Using TM for high-performance Discrete-Event Simulation on mu Discrete
Synchronization Of Parallel Discrete Event Simulations
NASA Technical Reports Server (NTRS)
Steinman, Jeffrey S.
1992-01-01
Adaptive, parallel, discrete-event-simulation-synchronization algorithm, Breathing Time Buckets, developed in Synchronous Parallel Environment for Emulation and Discrete Event Simulation (SPEEDES) operating system. Algorithm allows parallel simulations to process events optimistically in fluctuating time cycles that naturally adapt while simulation in progress. Combines best of optimistic and conservative synchronization strategies while avoiding major disadvantages. Well suited for modeling communication networks, for large-scale war games, for simulated flights of aircraft, for simulations of computer equipment, for mathematical modeling, for interactive engineering simulations, and for depictions of flows of information.
Parallel Discrete Event Simulation of Lyme Disease
Szymanski, Boleslaw K.
Parallel Discrete Event Simulation of Lyme Disease Ewa Deelman, Thomas Caraco and Boleslaw K distribution of Lyme disease, currently the most frequently reported vector-borne disease of humans). Our goal is to understand patterns in the Lyme disease epidemic at the regional scale through studying
Program For Parallel Discrete-Event Simulation
NASA Technical Reports Server (NTRS)
Beckman, Brian C.; Blume, Leo R.; Geiselman, John S.; Presley, Matthew T.; Wedel, John J., Jr.; Bellenot, Steven F.; Diloreto, Michael; Hontalas, Philip J.; Reiher, Peter L.; Weiland, Frederick P.
1991-01-01
User does not have to add any special logic to aid in synchronization. Time Warp Operating System (TWOS) computer program is special-purpose operating system designed to support parallel discrete-event simulation. Complete implementation of Time Warp mechanism. Supports only simulations and other computations designed for virtual time. Time Warp Simulator (TWSIM) subdirectory contains sequential simulation engine interface-compatible with TWOS. TWOS and TWSIM written in, and support simulations in, C programming language.
Continuously Monitored Global Virtual Time in Parallel Discrete Event Simulation
Bystroff, Chris
Continuously Monitored Global Virtual Time in Parallel Discrete Event Simulation Ewa Deelman for Parallel Discrete Event Simulation (PDES) rely heavily on the Global Virtual Time (GVT) calculation. Since the simulation uses large amounts of memory, the GVT is used to synchronize processes and discard obsolete system
Parallel discrete-event simulation of FCFS stochastic queueing networks
David M. Nicol
1988-01-01
Physical systems are inherently parallel; intuition suggests that simulations of these systems may be amenable to parallel execution. The parallel execution of a discrete-event simulation requires careful synchronization of processes in order to ensure the execution's correctness; this synchronization can degrade performance. Largely negative results were recently reported in a study which used a well-known synchronization method on queueing network
The cost of conservative synchronization in parallel discrete event simulations
David M. Nicol
1993-01-01
This paper analytically studies the performance of a synchronous conservative parallel discrete-event simulation protocol. The class of models considered simulates activity in a physical domain, and possesses a limited ability to predict future behavior. Using a stochastic model, it is shown that as the volume of simulation activity in the model increases relative to a fixed architecture, the complexity of
Presented by Parallel Discrete Event Simulation
Command and control, business processes · Logistics simulations Supply chain processes, contingency results · Parameterized analysis of PDES dynamics Unique experimental analysis approach and empirical requirements: Implemented in sik · Global time synchronization Total time-stamped ordering of events
An adaptive synchronization protocol for parallel discrete event simulation
Bisset, K.R.
1998-12-01
Simulation, especially discrete event simulation (DES), is used in a variety of disciplines where numerical methods are difficult or impossible to apply. One problem with this method is that a sufficiently detailed simulation may take hours or days to execute, and multiple runs may be needed in order to generate the desired results. Parallel discrete event simulation (PDES) has been explored for many years as a method to decrease the time taken to execute a simulation. Many protocols have been developed which work well for particular types of simulations, but perform poorly when used for other types of simulations. Often it is difficult to know a priori whether a particular protocol is appropriate for a given problem. In this work, an adaptive synchronization method (ASM) is developed which works well on an entire spectrum of problems. The ASM determines, using an artificial neural network (ANN), the likelihood that a particular event is safe to process.
Parallel discrete-event simulation of FCFS stochastic queueing networks
NASA Technical Reports Server (NTRS)
Nicol, David M.
1988-01-01
Physical systems are inherently parallel. Intuition suggests that simulations of these systems may be amenable to parallel execution. The parallel execution of a discrete-event simulation requires careful synchronization of processes in order to ensure the execution's correctness; this synchronization can degrade performance. Largely negative results were recently reported in a study which used a well-known synchronization method on queueing network simulations. Discussed here is a synchronization method (appointments), which has proven itself to be effective on simulations of FCFS queueing networks. The key concept behind appointments is the provision of lookahead. Lookahead is a prediction of a processor's future behavior, based on an analysis of the processor's simulation state. It is shown how lookahead can be computed for FCFS queueing network simulations, performance data are given that demonstrate the method's effectiveness under moderate to heavy loads, and performance tradeoffs between the quality of lookahead and the cost of computing lookahead are discussed.
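One common way to obtain lookahead in an FCFS queue is to sample the next service time before the job that will use it has arrived. The Python sketch below illustrates that idea only; it is not the appointments protocol from the report, and the class name `FcfsServer`, its methods, and the exponential service distribution are all assumptions made for illustration.

```python
import random


class FcfsServer:
    """Hypothetical FCFS server whose next service time is sampled ahead of need."""

    def __init__(self, mean_service, seed=0):
        self._rng = random.Random(seed)
        self._mean = mean_service
        self.free_at = 0.0                 # time at which the current backlog is finished
        self._next_service = self._draw()  # pre-sampled value: the source of lookahead

    def _draw(self):
        # Assumption: exponentially distributed service times.
        return self._rng.expovariate(1.0 / self._mean)

    def lookahead_bound(self, now):
        """Lower bound on the departure time of any job that has not yet arrived."""
        return max(now, self.free_at) + self._next_service

    def accept(self, arrival_time):
        """Serve one arriving job in FCFS order and return its departure time."""
        start = max(arrival_time, self.free_at)
        self.free_at = start + self._next_service
        self._next_service = self._draw()
        return self.free_at


server = FcfsServer(mean_service=2.0, seed=1)
promise = server.lookahead_bound(now=5.0)   # could be sent downstream as a guarantee
departure = server.accept(arrival_time=6.0)
```

Because the service time is fixed before the arrival is known, a downstream process could safely advance to `promise` without waiting for the actual job, which is the kind of prediction the abstract calls lookahead.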
Parallel discrete event simulation: A shared memory approach
NASA Technical Reports Server (NTRS)
Reed, Daniel A.; Malony, Allen D.; Mccredie, Bradley D.
1987-01-01
With traditional event list techniques, evaluating a detailed discrete event simulation model can often require hours or even days of computation time. Parallel simulation mimics the interacting servers and queues of a real system by assigning each simulated entity to a processor. By eliminating the event list and maintaining only sufficient synchronization to ensure causality, parallel simulation can potentially provide speedups that are linear in the number of processors. A set of shared memory experiments is presented using the Chandy-Misra distributed simulation algorithm to simulate networks of queues. Parameters include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential simulation of most queueing network models.
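For readers unfamiliar with the "traditional event list" that the abstract takes as its baseline, the following Python sketch shows a sequential single-server queue driven by one time-ordered event list. It is a generic illustration, not code from the report; the function name `simulate_queue` and the exponential distributions are assumptions.

```python
import heapq
import random


def simulate_queue(num_jobs, mean_interarrival, mean_service, seed=0):
    """Sequential single-server FCFS queue driven by one global event list."""
    rng = random.Random(seed)
    events = []   # the event list: (time, sequence number, kind)
    seq = 0       # tie-breaker so equal times never compare the kind strings first

    # Schedule every arrival up front.
    t = 0.0
    for _ in range(num_jobs):
        t += rng.expovariate(1.0 / mean_interarrival)
        heapq.heappush(events, (t, seq, "arrival"))
        seq += 1

    waiting = 0
    busy = False
    departures = []
    while events:
        now, _, kind = heapq.heappop(events)   # always the earliest pending event
        if kind == "departure":
            departures.append(now)
            busy = False
        else:
            waiting += 1
        # Start the next job whenever the server is idle and someone is waiting.
        if not busy and waiting:
            waiting -= 1
            busy = True
            done = now + rng.expovariate(1.0 / mean_service)
            heapq.heappush(events, (done, seq, "departure"))
            seq += 1
    return departures


departure_times = simulate_queue(50, mean_interarrival=1.0, mean_service=0.5, seed=3)
```

Every event passes through the single heap, which is the serial bottleneck that distributed schemes such as Chandy-Misra try to remove by giving each simulated entity its own processor and clock.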
Optimal topology for parallel discrete-event simulations
NASA Astrophysics Data System (ADS)
Kim, Yup; Kim, Jung-Hwa; Yook, Soon-Hyung
2011-05-01
The effect of shortcuts on the task completion landscape in parallel discrete-event simulation (PDES) is investigated. The morphology of the task completion landscape in PDES is known to be described well by the Langevin-type equation for nonequilibrium interface growth phenomena, such as the Kardar-Parisi-Zhang equation. From the numerical simulations, we find that the root-mean-squared fluctuation of task completion landscape, W(t,N), scales as W(t→∞,N) ~ N when the number of shortcuts, ℓ, is finite. Here N is the number of nodes. This behavior can be understood from the mean-field type argument with effective defects when ℓ is finite. We also study the behavior of W(t,N) when ℓ increases as N increases and provide a criterion to design an optimal topology to achieve a better synchronizability in PDES.
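The quantity W(t,N) can be made concrete with a small numerical sketch. The Python code below assumes the basic conservative update rule often used in studies of this kind: each node holds a local virtual time and may advance by an exponentially distributed increment only when it is not ahead of any node it is linked to. That rule, the function name `landscape_width`, and the way shortcuts are drawn at random are assumptions for illustration, not details taken from the abstract.

```python
import random


def landscape_width(num_nodes, num_shortcuts, steps, seed=0):
    """Root-mean-squared spread of local virtual times after `steps` parallel steps."""
    rng = random.Random(seed)

    # Ring topology plus a hypothetical set of randomly placed shortcut links.
    links = [{(i - 1) % num_nodes, (i + 1) % num_nodes} for i in range(num_nodes)]
    for _ in range(num_shortcuts):
        a, b = rng.sample(range(num_nodes), 2)
        links[a].add(b)
        links[b].add(a)

    tau = [0.0] * num_nodes   # local virtual time of each node
    for _ in range(steps):
        # Decide from the old state who may advance, then update all at once.
        may_advance = [all(tau[i] <= tau[j] for j in links[i])
                       for i in range(num_nodes)]
        for i in range(num_nodes):
            if may_advance[i]:
                tau[i] += rng.expovariate(1.0)

    mean = sum(tau) / num_nodes
    return (sum((x - mean) ** 2 for x in tau) / num_nodes) ** 0.5


width_ring = landscape_width(num_nodes=64, num_shortcuts=0, steps=500, seed=1)
width_shortcuts = landscape_width(num_nodes=64, num_shortcuts=8, steps=500, seed=1)
```

Running the function for several values of N and a long enough `steps` would give an estimate of the saturated width whose scaling the abstract discusses; no particular numerical outcome is claimed here.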
The cost of conservative synchronization in parallel discrete event simulations
NASA Technical Reports Server (NTRS)
Nicol, David M.
1990-01-01
The performance of a synchronous conservative parallel discrete-event simulation protocol is analyzed. The class of simulation models considered is oriented around a physical domain and possesses a limited ability to predict future behavior. A stochastic model is used to show that as the volume of simulation activity in the model increases relative to a fixed architecture, the complexity of the average per-event overhead due to synchronization, event list manipulation, lookahead calculations, and processor idle time approaches the complexity of the average per-event overhead of a serial simulation. The method is therefore within a constant factor of optimal. The analysis demonstrates that on large problems--those for which parallel processing is ideally suited--there is often enough parallel workload so that processors are not usually idle. The viability of the method is also demonstrated empirically, showing how good performance is achieved on large problems using a thirty-two node Intel iPSC/2 distributed memory multiprocessor.
Synchronous parallel system for emulation and discrete event simulation
NASA Technical Reports Server (NTRS)
Steinman, Jeffrey S. (inventor)
1992-01-01
A synchronous parallel system for emulation and discrete event simulation having parallel nodes responds to received messages at each node by generating event objects having individual time stamps, stores only the changes to state variables of the simulation object attributable to the event object, and produces corresponding messages. The system refrains from transmitting the messages and changing the state variables while it determines whether the changes are superseded, and then stores the unchanged state variables in the event object for later restoral to the simulation object if called for. This determination preferably includes sensing the time stamp of each new event object and determining which new event object has the earliest time stamp as the local event horizon, determining the earliest local event horizon of the nodes as the global event horizon, and ignoring the events whose time stamps are less than the global event horizon. Host processing between the system and external terminals enables such a terminal to query, monitor, command or participate with a simulation object during the simulation process.
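The event-horizon determination described above reduces to two minimum operations. The Python sketch below shows only that bookkeeping; the function names and the list-of-lists representation of nodes are assumptions, and nothing here reproduces the patented system itself.

```python
INFINITY = float("inf")


def local_event_horizon(new_event_times):
    """Earliest time stamp among the event objects one node generated this cycle."""
    return min(new_event_times, default=INFINITY)


def global_event_horizon(per_node_new_event_times):
    """Earliest local event horizon over all nodes."""
    return min((local_event_horizon(times) for times in per_node_new_event_times),
               default=INFINITY)


def split_at_horizon(event_times, horizon):
    """Partition time stamps into those below the horizon and those at or after it."""
    below = [t for t in event_times if t < horizon]
    at_or_after = [t for t in event_times if t >= horizon]
    return below, at_or_after


# Hypothetical cycle: three nodes, the third generated no new events.
new_events = [[5.0, 7.5], [4.2], []]
horizon = global_event_horizon(new_events)
below, at_or_after = split_at_horizon([1.0, 4.2, 6.0], horizon)
```

How each of the two groups is then treated (messages released, state variables restored, and so on) is left to the description in the abstract; the sketch only shows where the dividing line comes from.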
Synchronous Parallel System for Emulation and Discrete Event Simulation
NASA Technical Reports Server (NTRS)
Steinman, Jeffrey S. (Inventor)
2001-01-01
A synchronous parallel system for emulation and discrete event simulation having parallel nodes responds to received messages at each node by generating event objects having individual time stamps, stores only the changes to the state variables of the simulation object attributable to the event object and produces corresponding messages. The system refrains from transmitting the messages and changing the state variables while it determines whether the changes are superseded, and then stores the unchanged state variables in the event object for later restoral to the simulation object if called for. This determination preferably includes sensing the time stamp of each new event object and determining which new event object has the earliest time stamp as the local event horizon, determining the earliest local event horizon of the nodes as the global event horizon, and ignoring events whose time stamps are less than the global event horizon. Host processing between the system and external terminals enables such a terminal to query, monitor, command or participate with a simulation object during the simulation process.
Optimistic Parallel Discrete Event Simulation on a Beowulf Cluster of Multi-core Machines
Wilsey, Philip A.
Optimistic Parallel Discrete Event Simulation on a Beowulf Cluster of Multi-core Machines The trend towards multi-core and many-core CPUs is forever changing the composition of the Beowulf cluster. The modern Beowulf cluster is now a heterogeneous cluster of single core, multi-core, and even many
Parallel Discrete Event Simulation-Applications Carl Tropper
Tropper, Carl
and complex models have grown. Now we want to simulate models of the Internet and models of the formation environment (e.g. a CAVE automatic virtual environment). In order to accommodate the growing need
SPEEDES - A multiple-synchronization environment for parallel discrete-event simulation
NASA Technical Reports Server (NTRS)
Steinman, Jeff S.
1992-01-01
Synchronous Parallel Environment for Emulation and Discrete-Event Simulation (SPEEDES) is a unified parallel simulation environment. It supports multiple-synchronization protocols without requiring users to recompile their code. When a SPEEDES simulation runs on one node, all the extra parallel overhead is removed automatically at run time. When the same executable runs in parallel, the user preselects the synchronization algorithm from a list of options. SPEEDES currently runs on UNIX networks and on the California Institute of Technology/Jet Propulsion Laboratory Mark III Hypercube. SPEEDES also supports interactive simulations. Featured in the SPEEDES environment is a new parallel synchronization approach called Breathing Time Buckets. This algorithm uses some of the conservative techniques found in Time Bucket synchronization, along with the optimism that characterizes the Time Warp approach. A mathematical model derived from first principles predicts the performance of Breathing Time Buckets. Along with the Breathing Time Buckets algorithm, this paper discusses the rules for processing events in SPEEDES, describes the implementation of various other synchronization protocols supported by SPEEDES, describes some new ones for the future, discusses interactive simulations, and then gives some performance results.
Thulasidasan, Sunil; Kasiviswanathan, Shiva; Eidenbenz, Stephan; Romero, Philip
2010-01-01
We re-examine the problem of load balancing in conservatively synchronized parallel, discrete-event simulations executed on high-performance computing clusters, focusing on simulations where computational and messaging load tend to be spatially clustered. Such domains are frequently characterized by the presence of geographic 'hot-spots' - regions that generate significantly more simulation events than others. Examples of such domains include simulation of urban regions, transportation networks and networks where interaction between entities is often constrained by physical proximity. Noting that in conservatively synchronized parallel simulations, the speed of execution of the simulation is determined by the slowest (i.e., most heavily loaded) simulation process, we study different partitioning strategies in achieving equitable processor-load distribution in domains with spatially clustered load. In particular, we study the effectiveness of partitioning via spatial scattering to achieve optimal load balance. In this partitioning technique, nearby entities are explicitly assigned to different processors, thereby scattering the load across the cluster. This is motivated by two observations, namely, (i) since load is spatially clustered, spatial scattering should, intuitively, spread the load across the compute cluster, and (ii) in parallel simulations, equitable distribution of CPU load is a greater determinant of execution speed than message passing overhead. Through large-scale simulation experiments - both of abstracted and real simulation models - we observe that scatter partitioning, even with its greatly increased messaging overhead, significantly outperforms more conventional spatial partitioning techniques that seek to reduce messaging overhead. Further, even if hot-spots change over the course of the simulation, if the underlying feature of spatial clustering is retained, load continues to be balanced with spatial scattering, leading us to the observation that spatial scattering can often obviate the need for dynamic load balancing.
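The contrast between conventional spatial partitioning and scatter partitioning can be illustrated on a toy grid. The Python sketch below is a hypothetical example, not the authors' experiment: the grid size, the load values, the strip-based block partitioner, and the `(x + y) % procs` scatter rule are all assumptions chosen to keep the arithmetic simple, and messaging cost is ignored entirely.

```python
def block_owner(x, y, width, procs):
    """Conventional spatial partitioning: contiguous vertical strips of cells."""
    return x * procs // width


def scatter_owner(x, y, width, procs):
    """Scatter partitioning: horizontally and vertically adjacent cells get different owners."""
    return (x + y) % procs


def max_load(load, owner, procs):
    """Load of the most heavily loaded processor, which bounds conservative speed."""
    width = len(load)
    totals = [0.0] * procs
    for x in range(width):
        for y in range(len(load[x])):
            totals[owner(x, y, width, procs)] += load[x][y]
    return max(totals)


# Hypothetical 8x8 region with a 4x4 hot-spot in one corner.
grid = [[10.0 if x < 4 and y < 4 else 1.0 for y in range(8)] for x in range(8)]
worst_block = max_load(grid, block_owner, procs=4)
worst_scatter = max_load(grid, scatter_owner, procs=4)
```

In this toy case the strips that contain the hot-spot carry most of the work, while the scatter rule spreads hot cells evenly over the four processors. The sketch does not model the extra messaging that scattering causes, which the abstract reports as the price paid for the better balance.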
Korniss, Gyorgy
it to understand optimizing the size of a moving "time window" to enforce memory constraints. Keywords conservative to exchange their local simulated (or "virtual") times only with "neighboring" processing elements in the virtual topology. Systems which can be modeled on a lattice (regular grid) with short
Distributed discrete event simulation. Final report
De Vries, R.C. [Univ. of New Mexico, Albuquerque, NM (United States). EECE Dept.
1988-02-01
The presentation given here is restricted to discrete event simulation. The complexity of and time required for many present and potential discrete simulations exceeds the reasonable capacity of most present serial computers. The desire, then, is to implement the simulations on a parallel machine. However, certain problems arise in an effort to program the simulation on a parallel machine. In one category of methods deadlock can arise and some method is required to either detect deadlock and recover from it or to avoid deadlock through information passing. In the second category of methods, potentially incorrect simulations are allowed to proceed. If the situation is later determined to be incorrect, recovery from the error must be initiated. In either case, computation and information passing are required which would not be required in a serial implementation. The net effect is that the parallel simulation may not be much better than a serial simulation. In an effort to determine alternate approaches, important papers in the area were reviewed. As a part of that review process, each of the papers was summarized. The summary of each paper is presented in this report in the hopes that those doing future work in the area will be able to gain insight that might not otherwise be available, and to aid in deciding which papers would be most beneficial to pursue in more detail. The papers are broken down into categories and then by author. Conclusions reached after examining the papers and other material, such as direct talks with an author, are presented in the last section. Also presented there are some ideas that surfaced late in the research effort. These promise to be of some benefit in limiting information which must be passed between processes and in better understanding the structure of a distributed simulation. Pursuit of these ideas seems appropriate.
Performance bounds on parallel self-initiating discrete-event
NASA Technical Reports Server (NTRS)
Nicol, David M.
1990-01-01
Considered here is the use of massively parallel architectures to execute discrete-event simulations of what are termed self-initiating models. A logical process in a self-initiating model schedules its own state re-evaluation times, independently of any other logical process, and sends its new state to other logical processes following the re-evaluation. The interest is in the effects of that communication on synchronization. The performance of various synchronization protocols is considered by deriving upper and lower bounds on optimal performance, upper bounds on Time Warp's performance, and lower bounds on the performance of a new conservative protocol. The analysis of Time Warp includes the overhead costs of state-saving and rollback. The analysis points out sufficient conditions for the conservative protocol to outperform Time Warp. The analysis also quantifies the sensitivity of performance to message fan-out, lookahead ability, and the probability distributions underlying the simulation.
Unsynchronized Parallel Discrete Event Simulation
Wilsey, Philip A.
of Master of Science in the Department of Electrical and Computer Engineering and Computer Science of The College of Engineering November 24, 1998 by Narayanan Thondugulam B.Tech., Indian Institute of Technology such as Time Warp, process events without blocking, they recover from causal violations by rolling back in time
Regenerative Steady-State Simulation of Discrete-Event Systems
Henderson, Shane
Regenerative Steady-State Simulation of Discrete-Event Systems Shane G. Henderson University that holds great appeal, for Name: Shane G. Henderson Address: shane.henderson@umich.edu Name: Peter W. Glynn
Discrete event simulation in the artificial intelligence environment
Egdorf, H.W.; Roberts, D.J.
1987-01-01
Discrete Event Simulations performed in an Artificial Intelligence (AI) environment provide benefits in two major areas. The productivity provided by Object Oriented Programming, Rule Based Programming, and AI development environments allows simulations to be developed and maintained more efficiently than conventional environments allow. Secondly, the use of AI techniques allows direct simulation of human decision making processes and Command and Control aspects of a system under study. An introduction to AI techniques is presented. Two discrete event simulations produced in these environments are described. Finally, a software engineering methodology is discussed that allows simulations to be designed for use in these environments. 3 figs.
Data coupling and downcasting in discrete event simulation software
Nutaro, James J [ORNL; Ward, Richard C [ORNL; Allgood, Glenn O [ORNL; Parfenov, Alexander [Physical Optics Corporation; Jason, Holmstedt [Physical Optics Corporation
2006-01-01
Discrete Event System Specification (DEVS) simulation libraries commonly make use of indirection and, essentially, typeless events as part of their interface specification. This forces library users to employ downcasting and/or strong data coupling in the design of their simulation applications. These techniques are anathema to good object oriented design principles, but seem to be inescapable when using pre-built discrete event simulation libraries. This paper describes how downcasting and data coupling emerge in the design of a computer architecture model. It is hoped that, by exposing the problem and its underlying causes, future research can be directed at improving software engineering techniques for DEVS simulation software.
Discrete-event simulation of queues with spreadsheets: a teaching case
Marco Aurélio De Mesquita; Alvaro Euzebio Hernandez
2006-01-01
This paper describes the use of spreadsheets combined with simple VBA code as a tool for teaching queuing theory and discrete-event simulation. Four different cases are considered: single server, parallel servers, tandem queuing, and closed queuing system. The data obtained in the simulation run are conveniently stored in spreadsheets for subsequent statistical analysis. This approach was successfully deployed in a
Discrete-Event Simulation in Chemical Engineering.
ERIC Educational Resources Information Center
Schultheisz, Daniel; Sommerfeld, Jude T.
1988-01-01
Gives examples, descriptions, and uses for various types of simulation systems, including the Flowtran, Process, Aspen Plus, Design II, GPSS, Simula, and Simscript. Explains similarities in simulators, terminology, and a batch chemical process. Tables and diagrams are included. (RT)
Optimization of Operations Resources via Discrete Event Simulation Modeling
NASA Technical Reports Server (NTRS)
Joshi, B.; Morris, D.; White, N.; Unal, R.
1996-01-01
The resource levels required for operation and support of reusable launch vehicles are typically defined through discrete event simulation modeling. Minimizing these resources constitutes an optimization problem involving discrete variables and simulation. Conventional approaches to solve such optimization problems involving integer valued decision variables are the pattern search and statistical methods. However, in a simulation environment that is characterized by search spaces of unknown topology and stochastic measures, these optimization approaches often prove inadequate. In this paper, we have explored the applicability of genetic algorithms to the simulation domain. Genetic algorithms provide a robust search strategy that does not require continuity and differentiability of the problem domain. The genetic algorithm successfully minimized the operation and support activities for a space vehicle, through a discrete event simulation model. The practical issues associated with simulation optimization, such as stochastic variables and constraints, were also taken into consideration.
Reversible Parallel Discrete-Event Execution of Large-scale Epidemic Outbreak Models
Perumalla, Kalyan S [ORNL; Seal, Sudip K [ORNL
2010-01-01
The spatial scale, runtime speed and behavioral detail of epidemic outbreak simulations together require the use of large-scale parallel processing. In this paper, an optimistic parallel discrete event execution of a reaction-diffusion simulation model of epidemic outbreaks is presented, with an implementation over the μsik simulator. Rollback support is achieved with the development of a novel reversible model that combines reverse computation with a small amount of incremental state saving. Parallel speedup and other runtime performance metrics of the simulation are tested on a small (8,192-core) Blue Gene/P system, while scalability is demonstrated on 65,536 cores of a large Cray XT5 system. Scenarios representing large population sizes (up to several hundred million individuals in the largest case) are exercised.
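The combination of reverse computation with a small amount of incremental state saving can be sketched for a single, made-up infection event. The Python code below is an illustration under stated assumptions, not the authors' model: the `Region` class, its fields, and the two handler functions are hypothetical, and the real reaction-diffusion model is far richer.

```python
class Region:
    """Hypothetical state of one spatial cell in an outbreak model."""

    def __init__(self, susceptible, infected):
        self.susceptible = susceptible
        self.infected = infected
        self.last_infection_time = None


def forward_infect(region, now):
    """Apply one infection event and return the record needed to undo it."""
    # The old time stamp is overwritten below, so it cannot be recomputed;
    # this one value is the incremental state saving.
    saved_time = region.last_infection_time
    happened = region.susceptible > 0
    if happened:
        region.susceptible -= 1
        region.infected += 1
        region.last_infection_time = now
    return happened, saved_time


def reverse_infect(region, record):
    """Undo forward_infect during a rollback."""
    happened, saved_time = record
    if happened:
        # The counter updates are invertible, so they are undone by reverse computation.
        region.infected -= 1
        region.susceptible += 1
        region.last_infection_time = saved_time


cell = Region(susceptible=3, infected=1)
undo_record = forward_infect(cell, now=2.5)
reverse_infect(cell, undo_record)   # cell is back to its state before the event
```

The design point is that increments and decrements need no saved copy of the state, so only destructive assignments contribute to the rollback record.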
Discrete Event Modeling and Massively Parallel Execution of Epidemic Outbreak Phenomena
Perumalla, Kalyan S [ORNL; Seal, Sudip K [ORNL
2011-01-01
In complex phenomena such as epidemiological outbreaks, the intensity of inherent feedback effects and the significant role of transients in the dynamics make simulation the only effective method for proactive, reactive or post-facto analysis. The spatial scale, runtime speed, and behavioral detail needed in detailed simulations of epidemic outbreaks make it necessary to use large-scale parallel processing. Here, an optimistic parallel execution of a new discrete event formulation of a reaction-diffusion simulation model of epidemic propagation is presented to dramatically increase the fidelity and speed with which epidemiological simulations can be performed. Rollback support needed during optimistic parallel execution is achieved by combining reverse computation with a small amount of incremental state saving. Parallel speedup of over 5,500 and other runtime performance metrics of the system are observed with weak-scaling execution on a small (8,192-core) Blue Gene/P system, while scalability with a weak-scaling speedup of over 10,000 is demonstrated on 65,536 cores of a large Cray XT5 system. Scenarios representing large population sizes exceeding several hundreds of millions of individuals in the largest cases are successfully exercised to verify model scalability.
Multi-threaded, discrete event simulation of distributed computing systems
NASA Astrophysics Data System (ADS)
Legrand, Iosif; MONARC Collaboration
2001-10-01
The LHC experiments have envisaged computing systems of unprecedented complexity, for which it is necessary to provide a realistic description and modeling of data access patterns, and of many jobs running concurrently on large scale distributed systems and exchanging very large amounts of data. A process oriented approach for discrete event simulation is well suited to describe various activities running concurrently, as well as the stochastic arrival patterns specific to this type of simulation. Threaded objects or "Active Objects" can provide a natural way to map the specific behaviour of distributed data processing into the simulation program. The simulation tool developed within MONARC is based on Java (TM) technology which provides adequate tools for developing a flexible and distributed process oriented simulation. Proper graphics tools, and ways to analyze data interactively, are essential in any simulation project. The design elements, status and features of the MONARC simulation tool are presented. The program allows realistic modeling of complex data access patterns by multiple concurrent users in large scale computing systems in a wide range of possible architectures, from centralized to highly distributed. Comparison between queuing theory and realistic client-server measurements is also presented.
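The process oriented style can be sketched without threads by letting each activity be a generator that yields the simulated time it wants to wait. The Python code below uses generators as a stand-in for the threaded "Active Objects" mentioned above; it is not the MONARC tool, and the names `Clock`, `job`, and `run` as well as the job parameters are assumptions for illustration.

```python
import heapq


class Clock:
    """Shared simulated clock that the scheduler advances."""

    def __init__(self):
        self.now = 0.0


def job(name, clock, transfer_time, cpu_time, log):
    """One hypothetical job: fetch data, then process it."""
    log.append((clock.now, name, "start transfer"))
    yield transfer_time                 # wait for the data transfer to finish
    log.append((clock.now, name, "start processing"))
    yield cpu_time                      # wait for the CPU work to finish
    log.append((clock.now, name, "done"))


def run(clock, processes):
    """Resume each process at the simulated time it asked to be woken."""
    queue = [(0.0, index, proc) for index, proc in enumerate(processes)]
    heapq.heapify(queue)
    while queue:
        clock.now, index, proc = heapq.heappop(queue)
        try:
            delay = next(proc)          # run the process until its next wait
        except StopIteration:
            continue                    # the process has finished
        heapq.heappush(queue, (clock.now + delay, index, proc))


clock = Clock()
log = []
run(clock, [job("a", clock, 2.0, 3.0, log), job("b", clock, 1.0, 1.0, log)])
```

Each job reads as straight-line code describing one activity, while the scheduler interleaves the activities in simulated-time order, which is the appeal of the process oriented view for modeling many concurrent jobs.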
Bruno R. Preiss; V. C. Hamacher; W. M. Loucks
1988-01-01
The main problem associated with comparing distributed discrete event simulation mechanisms is the need to base the comparisons on some common problem specification. This paper presents a specification strategy and language which allows the same simulation problem specification to be used for both distributed discrete event simulation mechanisms as well as the traditional single event list mechanism. This paper includes:
Metrics for Availability Analysis Using a Discrete Event Simulation Method
Schryver, Jack C; Nutaro, James J; Haire, Marvin Jonathan
2012-01-01
The system performance metric 'availability' is a central concept with respect to the concerns of a plant's operators and owners, yet it can be abstract enough to resist explanation at system levels. Hence, there is a need for a system-level metric more closely aligned with a plant's (or, more generally, a system's) raison d'etre. Historically, availability of repairable systems - intrinsic, operational, or otherwise - has been defined as a ratio of times. This paper introduces a new concept of availability, called endogenous availability, defined in terms of a ratio of quantities of product yield. Endogenous availability can be evaluated using a discrete event simulation analysis methodology. A simulation example shows that endogenous availability reduces to conventional availability in a simple series system with different processing rates and without intermediate storage capacity, but diverges from conventional availability when storage capacity is progressively increased. It is shown that conventional availability tends to be conservative when a design includes features, such as in-process storage, that partially decouple the components of a larger system.
Enhancing Complex System Performance Using Discrete-Event Simulation
Allgood, Glenn O [ORNL; Olama, Mohammed M [ORNL; Lake, Joe E [ORNL
2010-01-01
In this paper, we utilize discrete-event simulation (DES) merged with human factors analysis to provide the venue within which the separation and deconfliction of the system/human operating principles can occur. A concrete example is presented to illustrate the performance enhancement gains for an aviation cargo flow and security inspection system achieved through the development and use of a process DES. The overall performance of the system is computed, analyzed, and optimized for the different system dynamics. Various performance measures are considered such as system capacity, residual capacity, and total number of pallets waiting for inspection in the queue. These metrics are performance indicators of the system's ability to service current needs and respond to additional requests. We studied and analyzed different scenarios by changing various model parameters such as the number of pieces per pallet ratio, number of inspectors and cargo handling personnel, number of forklifts, number and types of detection systems, inspection modality distribution, alarm rate, and cargo closeout time. The increased physical understanding resulting from execution of the queuing model utilizing these vetted performance measures identified effective ways to meet inspection requirements while maintaining or reducing overall operational cost and eliminating any shipping delays associated with any proposed changes in inspection requirements. With this understanding effective operational strategies can be developed to optimally use personnel while still maintaining plant efficiency, reducing process interruptions, and holding or reducing costs.
Parallel discrete event simulation with predictors
Gummadi, Vidya
1995-01-01
The motivation for this research has been its applicability in sequence checking in a spacecraft's control commands. Spacecraft are controlled by sequences of time-tagged control commands which are essentially onboard computer programs. 'The...
Continuum Representation for Simulating Discrete Events of Battery Operation
Subramanian, Venkat
that are currently followed for the modeling of charge/discharge cycles of lithium-ion batteries involve different the discrete events in the cycling studies of lithium-ion batteries as a continuum event has been proposed-order pseudo-two-dimensional lithium-ion battery model that has several coupled and nonlinear partial
Simulation of AC Electrical Machines Behaviour Using Discrete Event System Simulator
Paris-Sud XI, Université de
Simulation of AC Electrical Machines Behaviour Using Discrete Event System Simulator L. Capocchi simulation model that could require much more execution time. So far, electrical machine digital simulation library inside. However, the approach used is always the same since the electrical machine has
DEVS Framework for Component-based Modeling/Simulation of Discrete Event Systems
Young Ik Cho; Tag Gon Kim
This paper applies a component-based framework to discrete event systems simulation and then develops a component-based simulation environment. The environment is based on a combination of the sound modeling formalism of DEVS (Discrete Event Systems Specification) and the powerful component standard of COM (Component Object Model). The combination results in the DEVS/COM run-time infrastructure which supports binary reusability of simulation
Stochastic discrete event simulation of germinal center reactions
Stochastic discrete event simulation of germinal center reactions Marc Thilo Figge* Centre October 2004; published 24 May 2005 We introduce a generic reaction-diffusion model for germinal center in order to simulate the correct time evolution of this complex biological system. Germinal centers play
Ulrich von Beck; John W. Nowak
2000-01-01
Activity based costing (ABC) has revolutionized product costing, planning, and forecasting in the last decade. It is based on a philosophy of estimation that: “it is better to be approximately right, than precisely wrong.” The philosophy of discrete-event simulation modeling follows a similar tack, where statistical inference and the stochastic nature of processes are used to replicate the behavior of
Using Discrete-Event Simulation to Model Situational Awareness of Unmanned-Vehicle Operators
Cummings, Mary "Missy"
1 Using Discrete-Event Simulation to Model Situational Awareness of Unmanned-Vehicle Operators Carl Institute of Technology Cambridge, MA 02139 As the paradigm of operators supervising multiple unmanned on situational awareness as the size of the unmanned vehicle team being supervised is varied. INTRODUCTION N
Sally C. Brailsford; Bernd Schmidt
2003-01-01
Operational Research models are well established as an effective tool for tackling a vast range of health care problems. Many of these models involve parameters which depend on human behaviour, and thus individuals’ characteristics or personality traits should be included. In this paper we describe a discrete event simulation model of attendance for screening for diabetic retinopathy, a sight-threatening complication
A graphical, intelligent interface for discrete-event simulations
Michelsen, C.; Dreicer, J.; Morgeson, D.
1988-01-01
This paper presents a prototype of an engagement analysis simulation tool. This simulation environment assists a user (analyst) in performing sensitivity analysis via the repeated execution of user-specified engagement scenarios. The analysis tool provides an intelligent front-end which is easy to use and modify. The intelligent front-end provides the capabilities to assist the user in the selection of appropriate scenario values. The incorporated graphics capabilities also provide additional insight into the simulation events as they are "unfolding." 4 refs., 4 figs.
DISCRETE EVENT SIMULATION OF OPTICAL SWITCH MATRIX PERFORMANCE IN COMPUTER NETWORKS
Imam, Neena [ORNL; Poole, Stephen W [ORNL
2013-01-01
In this paper, we present application of a Discrete Event Simulator (DES) for performance modeling of optical switching devices in computer networks. Network simulators are valuable tools in situations where one cannot investigate the system directly. This situation may arise if the system under study does not exist yet or the cost of studying the system directly is prohibitive. Most available network simulators are based on the paradigm of discrete-event-based simulation. As computer networks become increasingly larger and more complex, sophisticated DES tool chains have become available for both commercial and academic research. Some well-known simulators are NS2, NS3, OPNET, and OMNEST. For this research, we have applied OMNEST for the purpose of simulating multi-wavelength performance of optical switch matrices in computer interconnection networks. Our results suggest that the application of DES to computer interconnection networks provides valuable insight in device performance and aids in topology and system optimization.
Ghany, Ahmad; Vassanji, Karim; Kuziemsky, Craig; Keshavjee, Karim
2013-01-01
Electronic prescribing (e-prescribing) is expected to bring many benefits to Canadian healthcare, such as a reduction in errors and adverse drug reactions. As there currently is no functioning e-prescribing system in Canada that is completely electronic, we are unable to evaluate the performance of a live system. An alternative approach is to use simulation modeling for evaluation. We developed two discrete-event simulation models, one of the current handwritten prescribing system and one of a proposed e-prescribing system, to compare the performance of these two systems. We were able to compare the number of processes in each model, workflow efficiency, and the distribution of patients or prescriptions. Although we were able to compare these models to each other, using discrete-event simulation software was challenging. We were limited in the number of variables we could measure. We discovered non-linear processes and feedback loops in both models that could not be adequately represented using discrete-event simulation software. Finally, interactions between entities in both models could not be modeled using this type of software. We have come to the conclusion that a more appropriate approach to modeling both the handwritten and electronic prescribing systems would be to use a complex adaptive systems approach using agent-based modeling or systems-based modeling. PMID:23388319
Krukenberg, Harry J.
1996-01-01
This study, through the use of discrete event simulation and modeling, explores various prioritization disciplines for U.S. Air Force Military Family Housing maintenance, repair, and renovation projects. Actual data from the Military Family Housing...
Using discrete event simulation to design a more efficient hospital pharmacy for outpatients.
Reynolds, Matthew; Vasilakis, Christos; McLeod, Monsey; Barber, Nicholas; Mounsey, Ann; Newton, Sue; Jacklin, Ann; Franklin, Bryony Dean
2011-09-01
We present the findings of a discrete event simulation study of the hospital pharmacy outpatient dispensing systems at two London hospitals. Having created a model and established its face validity, we tested scenarios to estimate the likely impact of changes in prescription workload, staffing levels and skill-mix, and utilisation of the dispensaries' automatic dispensing robots. The scenarios were compared in terms of mean prescription turnaround times and percentage of prescriptions completed within 45 min. The findings are being used to support business cases for changes in staffing levels and skill-mix in response to changes in workload. PMID:21344201
Modeling and Solution Issues in Discrete Event Simulation
Grossmann, Ignacio E.
Manufacturing Systems. Design/Operation (planning, scheduling) Supply Chain Management. Logistics/Design/Operation Simulations are used as an aid in the Design, Emulation, and Operation, Global warming) Operator training, model validation (computational pilot plant)
Discrete-event simulation for the design and evaluation of physical protection systems
Jordan, S.E.; Snell, M.K.; Madsen, M.M.; Smith, J.S.; Peters, B.A.
1998-08-01
This paper explores the use of discrete-event simulation for the design and control of physical protection systems for fixed-site facilities housing items of significant value. It begins by discussing several modeling and simulation activities currently performed in designing and analyzing these protection systems and then discusses capabilities that design/analysis tools should have. The remainder of the article then discusses in detail how some of these new capabilities have been implemented in software to achieve a prototype design and analysis tool. The simulation software technology provides a communications mechanism between a running simulation and one or more external programs. In the prototype security analysis tool, these capabilities are used to facilitate human-in-the-loop interaction and to support a real-time connection to a virtual reality (VR) model of the facility being analyzed. This simulation tool can be used for both training (in real-time mode) and facility analysis and design (in fast mode).
A Framework for the Optimization of Discrete-Event Simulation Models
NASA Technical Reports Server (NTRS)
Joshi, B. D.; Unal, R.; White, N. H.; Morris, W. D.
1996-01-01
With the growing use of computer modeling and simulation in all aspects of engineering, the scope of traditional optimization has to be extended to include simulation models. Some unique aspects have to be addressed while optimizing via stochastic simulation models. The optimization procedure has to explicitly account for the randomness inherent in the stochastic measures predicted by the model. This paper outlines a general purpose framework for optimization of terminating discrete-event simulation models. The methodology combines a chance constraint approach for problem formulation, together with standard statistical estimation and analysis techniques. The applicability of the optimization framework is illustrated by minimizing the operation and support resources of a launch vehicle, through a simulation model.
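A chance constraint formulation of the kind the abstract mentions can be written generically as follows. This is a textbook-style sketch, not the paper's own notation: the symbols x (decision variables, e.g. resource levels), c (cost), g (a stochastic performance measure from the simulation), b (its target), \alpha (allowed risk), and n (number of replications) are all assumptions.

```latex
\min_{x} \; c(x)
\quad \text{subject to} \quad
\Pr\{\, g(x,\xi) \le b \,\} \;\ge\; 1 - \alpha

% The probability is not available in closed form for a simulation model,
% so it is estimated from n independent replications \xi_1, \dots, \xi_n:
\hat{p}_n(x) \;=\; \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{\, g(x,\xi_i) \le b \,\}
```

Under this reading, a candidate design x is accepted only if the estimate, together with its statistical uncertainty, supports the claim that the constraint holds with probability at least 1 - \alpha.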
DeMO: An Ontology for Discrete-event Modeling and Simulation
Silver, Gregory A; Miller, John A; Hybinette, Maria; Baramidze, Gregory; York, William S
2011-01-01
Several fields have created ontologies for their subdomains. For example, the biological sciences have developed extensive ontologies such as the Gene Ontology, which is considered a great success. Ontologies could provide similar advantages to the Modeling and Simulation community. They provide a way to establish common vocabularies and capture knowledge about a particular domain with community-wide agreement. Ontologies can support significantly improved (semantic) search and browsing, integration of heterogeneous information sources, and improved knowledge discovery capabilities. This paper discusses the design and development of an ontology for Modeling and Simulation called the Discrete-event Modeling Ontology (DeMO), and it presents prototype applications that demonstrate various uses and benefits that such an ontology may provide to the Modeling and Simulation community. PMID:22919114
Statistical and Probabilistic Extensions to Ground Operations' Discrete Event Simulation Modeling
NASA Technical Reports Server (NTRS)
Trocine, Linda; Cummings, Nicholas H.; Bazzana, Ashley M.; Rychlik, Nathan; LeCroy, Kenneth L.; Cates, Grant R.
2010-01-01
NASA's human exploration initiatives will invest in technologies, public/private partnerships, and infrastructure, paving the way for the expansion of human civilization into the solar system and beyond. As it has been for the past half century, the Kennedy Space Center will be the embarkation point for humankind's journey into the cosmos. Functioning as a next generation space launch complex, Kennedy's launch pads, integration facilities, processing areas, and launch and recovery ranges will bustle with the activities of the world's space transportation providers. In developing this complex, KSC teams work through the potential operational scenarios: conducting trade studies, planning and budgeting for expensive and limited resources, and simulating alternative operational schemes. Numerous tools, among them discrete event simulation (DES), were matured during the Constellation Program to conduct such analyses with the purpose of optimizing the launch complex for maximum efficiency, safety, and flexibility while minimizing life cycle costs. Discrete event simulation is a computer-based modeling technique for complex and dynamic systems where the state of the system changes at discrete points in time and whose inputs may include random variables. DES is used to assess timelines and throughput, and to support operability studies and contingency analyses. It is applicable to any space launch campaign and informs decision-makers of the effects of varying numbers of expensive resources and the impact of off-nominal scenarios on measures of performance. In order to develop representative DES models, methods were adopted, exploited, or created to extend traditional uses of DES. The Delphi method was adopted and utilized for task duration estimation. DES software was exploited for probabilistic event variation. A roll-up process was developed to reuse models and model elements in other, less-detailed models.
The DES team continues to innovate and expand DES capabilities to address KSC's planning needs.
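The definition of discrete event simulation given in the abstract above (state changes only at discrete points in time, with inputs that may be random variables) can be made concrete with a minimal sketch. It assumes a hypothetical chain of sequential tasks with exponentially distributed durations; the function name, parameters, and event names are illustrative and unrelated to the KSC models.

```python
import heapq
import random


def run_simulation(num_tasks, mean_duration, seed=0):
    """Run a chain of sequential tasks; return (end_time, event_log)."""
    rng = random.Random(seed)   # seeded so runs are repeatable
    events = []                 # priority queue of (time, seq, name)
    log = []
    clock = 0.0
    seq = 0                     # tie-breaker for events at equal times
    started = 0

    heapq.heappush(events, (0.0, seq, "start"))
    seq += 1

    while events:
        # Jump the clock straight to the next event; nothing happens between.
        clock, _, name = heapq.heappop(events)
        log.append((clock, name))

        # Each event triggers the next task until all tasks are scheduled.
        if started < num_tasks:
            duration = rng.expovariate(1.0 / mean_duration)  # random input
            heapq.heappush(events, (clock + duration, seq, f"task_{started}_done"))
            seq += 1
            started += 1

    return clock, log


end_time, log = run_simulation(num_tasks=5, mean_duration=2.0)
for time, name in log:
    print(f"{time:8.3f}  {name}")
```

The priority queue is the essential design choice: the simulation clock advances from event to event rather than in fixed steps, so idle periods cost no computation.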
Developing Flexible Discrete Event Simulation Models in an Uncertain Policy Environment
NASA Technical Reports Server (NTRS)
Miranda, David J.; Fayez, Sam; Steele, Martin J.
2011-01-01
On February 1st, 2010, U.S. President Barack Obama submitted to Congress his proposed budget request for Fiscal Year 2011. This budget included significant changes to the National Aeronautics and Space Administration (NASA), including the proposed cancellation of the Constellation Program. This change proved to be controversial and Congressional approval of the program's official cancellation would take many months to complete. During this same period an end-to-end discrete event simulation (DES) model of Constellation operations was being built through the joint efforts of Productivity Apex Inc. (PAI) and Science Applications International Corporation (SAIC) teams under the guidance of NASA. The uncertainty regarding the Constellation program presented a major challenge to the DES team: to continue the development of this program-of-record simulation while remaining prepared for possible changes to the program. This required the team to rethink how it would develop its model and make it flexible enough to support possible future vehicles while at the same time being specific enough to support the program-of-record. This challenge was compounded by the fact that this model was being developed through the traditional DES process-orientation, which lacked the flexibility of object-oriented approaches. The team met this challenge through significant pre-planning that led to the "modularization" of the model's structure by identifying what was generic, finding natural logic break points, and standardizing the interlogic numbering system. This work resulted in a model that not only was ready to be easily modified to support any future rocket programs, but also was extremely structured and organized in a way that facilitated rapid verification.
This paper discusses in detail the process the team followed to build this model and the many advantages this method provides builders of traditional process-oriented discrete event simulations.
A Formal Framework for Stochastic Discrete Event System Specification Modeling and Simulation
Rodrigo Castro; Ernesto Kofman; Gabriel A. Wainer
2010-01-01
We introduce an extension of the classic Discrete Event System Specification (DEVS) formalism that includes stochastic features. Based on the use of the probability spaces theory we define the stochastic DEVS (STDEVS) specification, which provides a formal framework for modeling and simulation of general non-deterministic discrete event systems. The main theoretical properties of the STDEVS framework are treated, including
The effects of indoor environmental exposures on pediatric asthma: a discrete event simulation model
2012-01-01
Background In the United States, asthma is the most common chronic disease of childhood across all socioeconomic classes and is the most frequent cause of hospitalization among children. Asthma exacerbations have been associated with exposure to residential indoor environmental stressors such as allergens and air pollutants as well as numerous additional factors. Simulation modeling is a valuable tool that can be used to evaluate interventions for complex multifactorial diseases such as asthma but in spite of its flexibility and applicability, modeling applications in either environmental exposures or asthma have been limited to date. Methods We designed a discrete event simulation model to study the effect of environmental factors on asthma exacerbations in school-age children living in low-income multi-family housing. Model outcomes include asthma symptoms, medication use, hospitalizations, and emergency room visits. Environmental factors were linked to percent predicted forced expiratory volume in 1 second (FEV1%), which in turn was linked to risk equations for each outcome. Exposures affecting FEV1% included indoor and outdoor sources of NO2 and PM2.5, cockroach allergen, and dampness as a proxy for mold. Results Model design parameters and equations are described in detail. We evaluated the model by simulating 50,000 children over 10 years and showed that pollutant concentrations and health outcome rates are comparable to values reported in the literature. In an application example, we simulated what would happen if the kitchen and bathroom exhaust fans were improved for the entire cohort, and showed reductions in pollutant concentrations and healthcare utilization rates. Conclusions We describe the design and evaluation of a discrete event simulation model of pediatric asthma for children living in low-income multi-family housing. 
Our model simulates the effect of environmental factors (combustion pollutants and allergens), medication compliance, seasonality, and medical history on asthma outcomes (symptom-days, medication use, hospitalizations, and emergency room visits). The model can be used to evaluate building interventions and green building construction practices on pollutant concentrations, energy savings, and asthma healthcare utilization costs, and demonstrates the value of a simulation approach for studying complex diseases such as asthma. PMID:22989068
cloud elements into abstract representations that may lose key interactions on fine spatiotemporal scales. Koala is based loosely on the Amazon Elastic Compute Cloud (EC2) and on Eucalyptus openKoala: A DiscreteEvent Simulation Model of Infrastructure Clouds Koala is a discrete
Towards High Performance Discrete-Event Simulations of Smart Electric Grids
Perumalla, Kalyan S [ORNL; Nutaro, James J [ORNL; Yoginath, Srikanth B [ORNL
2011-01-01
Future electric grid technology is envisioned on the notion of a smart grid in which responsive end-user devices play an integral part of the transmission and distribution control systems. Detailed simulation is often the primary choice in analyzing small network designs, and the only choice in analyzing large-scale electric network designs. Here, we identify and articulate the high-performance computing needs underlying high-resolution discrete event simulation of smart electric grid operation in large network scenarios such as the entire Eastern Interconnect. We focus on the simulator's most computationally intensive operation, namely, the dynamic numerical solution for the electric grid state, for both time-integration as well as event-detection. We explore solution approaches using general-purpose dense and sparse solvers, and propose a scalable solver specialized for the sparse structures of actual electric networks. Based on experiments with an implementation in the THYME simulator, we identify performance issues and possible solution approaches for smart grid experimentation in the large.
Discrete event simulation tool for analysis of qualitative models of continuous processing systems
NASA Technical Reports Server (NTRS)
Malin, Jane T. (inventor); Basham, Bryan D. (inventor); Harris, Richard A. (inventor)
1990-01-01
An artificial intelligence design and qualitative modeling tool is disclosed for creating computer models and simulating continuous activities, functions, and/or behavior using developed discrete event techniques. Conveniently, the tool is organized in four modules: library design module, model construction module, simulation module, and experimentation and analysis. The library design module supports the building of library knowledge including component classes and elements pertinent to a particular domain of continuous activities, functions, and behavior being modeled. The continuous behavior is defined discretely with respect to invocation statements, effect statements, and time delays. The functionality of the components is defined in terms of variable cluster instances, independent processes, and modes, further defined in terms of mode transition processes and mode dependent processes. Model construction utilizes the hierarchy of libraries and connects them with appropriate relations. The simulation executes a specialized initialization routine and executes events in a manner that includes selective inherency of characteristics through a time and event schema until the event queue in the simulator is emptied. The experimentation and analysis module supports analysis through the generation of appropriate log files and graphics developments and includes the ability of log file comparisons.
Jonathan Karnon
2003-01-01
Markov models have traditionally been used to evaluate the cost-effectiveness of competing health care technologies that require the description of patient pathways over extended time horizons. Discrete event simulation (DES) is a more flexible, but more complicated decision modelling technique, that can also be used to model extended time horizons. Through the application of a Markov process and a DES
Aickelin, Uwe
Technology University of Nottingham, Nottingham, UK {mva,uxa,pos}@cs.nott.ac.uk Abstract Discrete Event study at a department store. Both DES and ABS models will be compared using the same problem domain
StratBAM: A Discrete-Event Simulation Model to Support Strategic Hospital Bed Capacity Decisions.
Devapriya, Priyantha; Strömblad, Christopher T B; Bailey, Matthew D; Frazier, Seth; Bulger, John; Kemberling, Sharon T; Wood, Kenneth E
2015-10-01
The ability to accurately measure and assess current and potential health care system capacities is an issue of local and national significance. Recent joint statements by the Institute of Medicine and the Agency for Healthcare Research and Quality have emphasized the need to apply industrial and systems engineering principles to improving health care quality and patient safety outcomes. To address this need, a decision support tool was developed for planning and budgeting of current and future bed capacity, and evaluating potential process improvement efforts. The Strategic Bed Analysis Model (StratBAM) is a discrete-event simulation model created after a thorough analysis of patient flow and data from Geisinger Health System's (GHS) electronic health records. Key inputs include: timing, quantity and category of patient arrivals and discharges; unit-level length of care; patient paths; and projected patient volume and length of stay. Key outputs include: admission wait time by arrival source and receiving unit, and occupancy rates. Electronic health records were used to estimate parameters for probability distributions and to build empirical distributions for unit-level length of care and for patient paths. Validation of the simulation model against GHS operational data confirmed its ability to model real-world data consistently and accurately. StratBAM was successfully used to evaluate the system impact of forecasted patient volumes and length of stay in terms of patient wait times, occupancy rates, and cost. The model is generalizable and can be appropriately scaled for larger and smaller health care settings. PMID:26310949
Tutorial: Parallel Simulation on Supercomputers
Perumalla, Kalyan S
2012-01-01
This tutorial introduces typical hardware and software characteristics of extant and emerging supercomputing platforms, and presents issues and solutions in executing large-scale parallel discrete event simulation scenarios on such high performance computing systems. Covered topics include synchronization, model organization, example applications, and observed performance from illustrative large-scale runs.
Wilsey, Philip A.
. Several simulation models are also included in the WARPED v2.0 distribution for use in analyzing the system. In this initial version of WARPED v2.x, the system includes sequential and parallel simulation. With the available simulations and extensible design, WARPED v2.0 can be used to explore new optimizations
Using Discrete Event Computer Simulation to Improve Patient Flow in a Ghanaian Acute Care Hospital
Best, Allyson M.; Dixon, Cinnamon A.; Kelton, W. David; Lindsell, Christopher J.
2014-01-01
Objectives Crowding and limited resources have increased the strain on acute care facilities and emergency departments (EDs) worldwide. These problems are particularly prevalent in developing countries. Discrete event simulation (DES) is a computer-based tool that can be used to estimate how changes to complex healthcare delivery systems, such as EDs, will affect operational performance. Using this modality, our objective was to identify operational interventions that could potentially improve patient throughput of one acute care setting in a developing country. Methods We developed a simulation model of acute care at a district level hospital in Ghana to test the effects of resource-neutral (e.g. modified staff start times and roles) and resource-additional (e.g. increased staff) operational interventions on patient throughput. Previously captured, de-identified time-and-motion data from 487 acute care patients were used to develop and test the model. The primary outcome was the modeled effect of interventions on patient length of stay (LOS). Results The base-case (no change) scenario had a mean LOS of 292 minutes (95% CI 291, 293). In isolation, neither adding staffing, changing staff roles, nor varying shift times affected overall patient LOS. Specifically, adding two registration workers, history takers, and physicians resulted in a 23.8 (95% CI 22.3, 25.3) minute LOS decrease. However, when shift start-times were coordinated with patient arrival patterns, potential mean LOS was decreased by 96 minutes (95% CI 94, 98); and with the simultaneous combination of staff roles (Registration and History-taking) there was an overall mean LOS reduction of 152 minutes (95% CI 150, 154). Conclusions Resource-neutral interventions identified through DES modeling have the potential to improve acute care throughput in this Ghanaian municipal hospital. 
DES offers another approach to identifying potentially effective interventions to improve patient flow in emergency and acute care in resource-limited settings. PMID:24953788
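Results such as "mean LOS of 292 minutes (95% CI 291, 293)" are typically summaries over many independent simulation replications. The abstract does not state how its intervals were computed; the sketch below assumes a normal-approximation interval over per-replication mean LOS values, and the sample numbers are hypothetical.

```python
import math
import statistics


def mean_with_ci(samples, z=1.96):
    """Return (mean, lower, upper) using a normal-approximation interval.

    Assumption: `samples` holds one mean length of stay per independent
    replication, and z=1.96 corresponds to a 95% confidence level.
    """
    mean = statistics.mean(samples)
    half_width = z * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, mean - half_width, mean + half_width


# Hypothetical per-replication mean LOS values, in minutes.
replication_means = [290, 292, 294]
mean, lower, upper = mean_with_ci(replication_means)
print(f"mean LOS {mean:.1f} min (95% CI {lower:.1f}, {upper:.1f})")
```

With only a handful of replications a Student's t quantile would be more appropriate than 1.96; the interval narrows roughly with the square root of the number of replications.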
A discrete event-based simulation model for real-time traffic management in railways
Jose L. Espinosa; Ricardo García-Ródenas
2012-01-01
Rail systems are highly complex and their control requires mathematical-computational tools. The main drawback of the models used to represent railway traffic, and to resolve any conflicts that occur, is the large computational time needed to obtain satisfactory results. Therefore the purpose of this paper is to study and design a discrete event-based model characterized by the positioning of trains
Towards Adaptive Caching for Parallel and Discrete Event Abhishek Chugh and Maria Hybinette
Hybinette, Maria
this problem. In earlier work, we proposed simulation cloning as a means of reducing the number of redundant simulation is a means of eval- uating the impact of different conditions or policies on the outcome of a real system (e.g. air traffic control systems). Cloning, however, does not address the problem of repeated
Conery, John
for spatially explicit ecological modeling. Our simulation engine, which we call EcoKit, operates on top of Warp simulation environment called EcoKit, which operates on top of the WarpKit implementation of Time Warp. OurKit, a shared memory implementation of the Time Warp Operating System [2]. The next section describes
Optimistic simulation of parallel message-passing applications
Thomas Phan; Rajive Bagrodia
2001-01-01
Optimistic techniques can improve the performance of discrete-event simulations, but one area where optimistic simulators have been unable to show performance improvement is in the simulation of parallel programs. Unfortunately, parallel program simulation using direct execution is difficult: the use of direct execution implies that the memory and computation requirements of the simulator are at least as large as that
NASA Technical Reports Server (NTRS)
Dubos, Gregory F.; Cornford, Steven
2012-01-01
While the ability to model the state of a space system over time is essential during spacecraft operations, the use of time-based simulations remains rare in preliminary design. The absence of the time dimension in most traditional early design tools can, however, become a hurdle when designing complex systems whose development and operations can be disrupted by various events, such as delays or failures. As the value delivered by a space system is highly affected by such events, exploring the trade space for designs that yield the maximum value calls for the explicit modeling of time. This paper discusses the use of discrete-event models to simulate spacecraft development schedule as well as operational scenarios and on-orbit resources in the presence of uncertainty. It illustrates how such simulations can be utilized to support trade studies, through the example of a tool developed for DARPA's F6 program to assist the design of "fractionated spacecraft".
Aggarwal, S.; Ryland, S.; Peck, R.
1980-06-19
This report outlines a methodology to study the effects of disruptive events on nuclear waste material in stable geologic sites. The methodology is based upon developing a discrete events model that can be simulated on the computer. This methodology allows a natural development of simulation models that use computer resources in an efficient manner. Accurate modeling in this area depends in large part upon accurate modeling of ion transport behavior in the storage media. Unfortunately, developments in this area are not at a stage where there is any consensus on proper models for such transport. Consequently, our work is directed primarily towards showing how disruptive events can be properly incorporated in such a model, rather than as a predictive tool at this stage. When and if proper geologic parameters can be determined, then it would be possible to use this as a predictive model. Assumptions and their bases are discussed, and the mathematical and computer model are described.
Pan, Chong; Zhang, Dali; Kon, Audrey Wan Mei; Wai, Charity Sue Lea; Ang, Woo Boon
2015-06-01
Continuous improvement in process efficiency for specialist outpatient clinic (SOC) systems is increasingly being demanded due to the growth of the patient population in Singapore. In this paper, we propose a discrete event simulation (DES) model to represent the patient and information flow in an ophthalmic SOC system in the Singapore National Eye Centre (SNEC). Different improvement strategies to reduce the turnaround time for patients in the SOC were proposed and evaluated with the aid of the DES model and the Design of Experiment (DOE). Two strategies for better patient appointment scheduling and one strategy for dilation-free examination are estimated to have a significant impact on turnaround time for patients. One of the improvement strategies has been implemented in the actual SOC system in the SNEC with promising improvement reported. PMID:25012400
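The kind of turnaround-time comparison described above can be illustrated with a minimal discrete-event sketch. It is not the SNEC model: the single-server clinic, the function name simulate_clinic, and all arrival and service times are hypothetical values chosen only to show the event-list mechanics.

```python
import heapq

def simulate_clinic(arrivals, service_times):
    """Single-server, first-come-first-served clinic (illustrative assumption).

    arrivals: list of (patient_id, arrival_time)
    service_times: dict mapping patient_id to consultation length
    Returns a dict of turnaround time (departure minus arrival) per patient.
    """
    events = []  # future event list ordered by (time, sequence number)
    for seq, (pid, t) in enumerate(arrivals):
        heapq.heappush(events, (t, seq, "arrive", pid))
    seq = len(arrivals)
    waiting = []       # patients queued for the single server
    busy = False
    arrived_at = {}
    turnaround = {}
    while events:
        now, _, kind, pid = heapq.heappop(events)
        if kind == "arrive":
            arrived_at[pid] = now
            waiting.append(pid)
        else:  # "depart"
            turnaround[pid] = now - arrived_at[pid]
            busy = False
        # Start the next consultation whenever the server is idle.
        if not busy and waiting:
            nxt = waiting.pop(0)
            busy = True
            heapq.heappush(events, (now + service_times[nxt], seq, "depart", nxt))
            seq += 1
    return turnaround

# Hypothetical scenario: three patients, consultation shortened from 10 to 8 minutes.
arrivals = [("p1", 0), ("p2", 5), ("p3", 6)]
baseline = simulate_clinic(arrivals, {"p1": 10, "p2": 10, "p3": 10})
improved = simulate_clinic(arrivals, {"p1": 8, "p2": 8, "p3": 8})
print(baseline, improved)
```

Running both scenarios on the same arrival stream isolates the effect of the shorter consultation, which is the basic idea behind evaluating an improvement strategy with a DES model before changing the real clinic.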
Using machine learning techniques to interpret results from discrete event
Mladenic, Dunja
Using machine learning techniques to interpret results from discrete event simulation Dunja Mladeni machine learning techniques. The results of two simulators were processed as machine learning problems discovered. Key words: discrete event simulation, machine learning, artificial intelligence 1 Introduction
Scaling time warp-based discrete event execution to 10^4 processors on a Blue Gene supercomputer
Kalyan S. Perumalla
2007-01-01
Lately, important large-scale simulation applications, such as emergency/event planning and response, are emerging that are based on discrete event models. The applications are characterized by their scale (several millions of simulated entities), their fine-grained nature of computation (microseconds per event), and their highly dynamic inter-entity event interactions. The desired scale and speed together call for highly scalable parallel discrete event
Parallelized direct execution simulation of message-passing parallel programs
NASA Technical Reports Server (NTRS)
Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.
1994-01-01
As massively parallel computers proliferate, there is growing interest in finding ways by which performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing computers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution where one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization, specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, Large Application Parallel Simulation Environment (LAPSE), we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well, typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.
2013-01-01
Background Computer simulation studies of the emergency department (ED) are often patient driven and consider the physician as a human resource whose primary activity is interacting directly with the patient. In many EDs, physicians supervise delegates such as residents, physician assistants and nurse practitioners each with different skill sets and levels of independence. The purpose of this study is to present an alternative approach where physicians and their delegates in the ED are modeled as interacting pseudo-agents in a discrete event simulation (DES) and to compare it with the traditional approach ignoring such interactions. Methods The new approach models a hierarchy of heterogeneous interacting pseudo-agents in a DES, where pseudo-agents are entities with embedded decision logic. The pseudo-agents represent a physician and delegate, where the physician plays a senior role to the delegate (i.e. treats high acuity patients and acts as a consult for the delegate). A simple model without the complexity of the ED is first created in order to validate the building blocks (programming) used to create the pseudo-agents and their interaction (i.e. consultation). Following validation, the new approach is implemented in an ED model using data from an Ontario hospital. Outputs from this model are compared with outputs from the ED model without the interacting pseudo-agents. They are compared based on physician and delegate utilization, patient waiting time for treatment, and average length of stay. Additionally, we conduct sensitivity analyses on key parameters in the model. Results In the hospital ED model, comparisons between the approach with interaction and without showed physician utilization increase from 23% to 41% and delegate utilization increase from 56% to 71%. Results show statistically significant mean time differences for low acuity patients between models. 
Interaction time between physician and delegate results in increased ED length of stay and longer waits for beds. Conclusion This example shows the importance of accurately modeling physician relationships and the roles in which they treat patients. Neglecting these relationships could lead to inefficient resource allocation due to inaccurate estimates of physician and delegate time spent on patient related activities and length of stay. PMID:23692710
Hessam S. Sarjoughian; Dongping Huang; Gary W. Godding; Karl G. Kempf; Wenlin Wang; Daniel E. Rivera; Hans D. Mittelmann
2005-01-01
Simulation modeling combined with decision control can offer important benefits for analysis, design, and operation of semiconductor supply-chain network systems. Detailed simulation of physical processes provides information for its controller to account for (expected) stochasticity present in the manufacturing processes. In turn, the controller can provide (near) optimal decisions for the operation of the processes and thus handle uncertainty in
A methodology for fabrication of intelligent discrete-event simulation models
Morgeson, J.D.; Burns, J.R.
1987-01-01
In this article a meta-specification for the software requirements and design of intelligent discrete next-event simulation models has been presented. The specification is consistent with established practices for software development as presented in the software engineering literature. The specification has been adapted to take into consideration the specialized needs of object-oriented programming resulting in the actor-centered taxonomy. The heart of the meta-specification is the methodology for requirements specification and design specification of the model. The software products developed by use of the methodology proposed herein are at the leading edge of technology in two very synergistic disciplines - expert systems and simulation. By incorporating simulation concepts into expert systems a deeper reasoning capability is obtained - one that is able to emulate the dynamics or behavior of the object system or process over time. By including expert systems concepts into simulation, the capability to emulate the reasoning functions of decision-makers involved with (and subsumed by) the object system is attained. In either case the robustness of the technology is greatly enhanced.
Using Distributed Analytics to Enable Real-Time Exploration of Discrete Event Simulations
Pallickara, Sangmi
of internal or external stimuli, and direct experimentation is often prohibitively expensive, time-consuming, or simply not feasible. In these situations, computer simulation is a compelling solution. Specifically, dis adjustments to quarantine procedures or the number of vaccines available in order to analyze economic
SimPackJ/S: a web-oriented toolkit for discrete event simulation
NASA Astrophysics Data System (ADS)
Park, Minho; Fishwick, Paul A.
2002-07-01
SimPackJ/S is the JavaScript and Java version of SimPack, which means SimPackJ/S is a collection of JavaScript and Java libraries and executable programs for computer simulations. The main purpose of creating SimPackJ/S is that we allow existing SimPack users to expand simulation areas and provide future users with a freeware simulation toolkit to simulate and model a system in web environments. One of the goals for this paper is to introduce SimPackJ/S. The other goal is to propose translation rules for converting C to JavaScript and Java. Most parts demonstrate the translation rules with examples. In addition, we discuss a 3D dynamic system model and overview an approach to 3D dynamic systems using SimPackJ/S. We explain an interface between SimPackJ/S and the 3D language--Virtual Reality Modeling Language (VRML). This paper documents how to translate C to JavaScript and Java and how to utilize SimPackJ/S within a 3D web environment.
Forest biomass supply logistics for a power plant using the discrete-event simulation approach
Mobini, Mahdi; Sowlati, T.; Sokhansanj, Shahabaddine
2011-04-01
This study investigates the logistics of supplying forest biomass to a potential power plant. Due to the complexities in such a supply logistics system, a simulation model based on the framework of Integrated Biomass Supply Analysis and Logistics (IBSAL) is developed in this study to evaluate the cost of delivered forest biomass, the equilibrium moisture content, and carbon emissions from the logistics operations. The model is applied to a proposed 300 MW power plant in Quesnel, BC, Canada. The results show that the biomass demand of the power plant would not be met every year. The weighted average cost of delivered biomass to the gate of the power plant is about C$ 90 per dry tonne. Estimates of equilibrium moisture content of delivered biomass and CO2 emissions resulting from the processes are also provided.
Discrete event simulation of a proton therapy facility: a case study.
Corazza, Uliana; Filippini, Roberto; Setola, Roberto
2011-06-01
Proton therapy is a type of particle therapy which utilizes a beam of protons to irradiate diseased tissue. The main difference with respect to conventional radiotherapy (X-rays, γ-rays) is the capability to target tumors with extreme precision, which makes it possible to treat deep-seated tumors and tumors affecting noble tissues such as the brain, eyes, etc. However, proton therapy needs high-energy cyclotrons, and this requires sophisticated control-supervision schema to guarantee, beyond the prescribed performance, the safety of the patients and of the operators. In this paper we present the modeling and simulation of the irradiation process of the PROSCAN facility at the Paul Scherrer Institut. This is a challenging task because of the complexity of the operation scenario, which consists of deterministic and stochastic processes resulting from the coordination-interaction among diverse entities such as distributed automatic control systems, safety protection systems, and human operators. PMID:20675013
Knowledge acquisition for discrete event systems using machine learning
Mladenic, Dunja
Knowledge acquisition for discrete event systems using machine learning Dunja Mladenić, 1 and Ivan of discrete event simulation systems is a difficult task. Machine Learning has been investigated to help of discrete event simulation models and machine learning as tools for the intelligent analysis
Higashi, Hideki; Barendregt, Jan J.
2011-01-01
Background Osteoarthritis constitutes a major musculoskeletal burden for the aged Australians. Hip and knee replacement surgeries are effective interventions once all conservative therapies to manage the symptoms have been exhausted. This study aims to evaluate the cost-effectiveness of hip and knee replacements in Australia. To our best knowledge, the study is the first attempt to account for the dual nature of hip and knee osteoarthritis in modelling the severities of right and left joints separately. Methodology/Principal Findings We developed a discrete-event simulation model that follows up the individuals with osteoarthritis over their lifetimes. The model defines separate attributes for right and left joints and accounts for several repeat replacements. The Australian population with osteoarthritis who were 40 years of age or older in 2003 were followed up until extinct. Intervention effects were modelled by means of disability-adjusted life-years (DALYs) averted. Both hip and knee replacements are highly cost effective (AUD 5,000 per DALY and AUD 12,000 per DALY respectively) under an AUD 50,000/DALY threshold level. The exclusion of cost offsets, and inclusion of future unrelated health care costs in extended years of life, did not change the findings that the interventions are cost-effective (AUD 17,000 per DALY and AUD 26,000 per DALY respectively). However, there was a substantial difference between hip and knee replacements where surgeries administered for hips were more cost-effective than for knees. Conclusions/Significance Both hip and knee replacements are cost-effective interventions to improve the quality of life of people with osteoarthritis. It was also shown that the dual nature of hip and knee OA should be taken into account to provide more accurate estimation on the cost-effectiveness of hip and knee replacements. PMID:21966520
Simulating Billion-Task Parallel Programs
Perumalla, Kalyan S [ORNL]; Park, Alfred J [ORNL]
2014-01-01
In simulating large parallel systems, bottom-up approaches exercise detailed hardware models with effects from simplified software models or traces, whereas top-down approaches evaluate the timing and functionality of detailed software models over coarse hardware models. Here, we focus on the top-down approach and significantly advance the scale of the simulated parallel programs. Via the direct execution technique combined with parallel discrete event simulation, we stretch the limits of the top-down approach by simulating message passing interface (MPI) programs with millions of tasks. Using a timing-validated benchmark application, a proof-of-concept scaling level is achieved to over 0.22 billion virtual MPI processes on 216,000 cores of a Cray XT5 supercomputer, representing one of the largest direct execution simulations to date, combined with a multiplexing ratio of 1024 simulated tasks per real task.
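The scale quoted above can be checked with a line of arithmetic. The sketch below assumes one real task per core, which the abstract does not state explicitly; the variable names are illustrative.

```python
# Figures taken from the abstract above.
cores = 216_000
simulated_tasks_per_real_task = 1024  # multiplexing ratio

# Assumption: one real task runs on each core.
virtual_mpi_processes = cores * simulated_tasks_per_real_task
print(f"{virtual_mpi_processes:,}")  # 221,184,000
```

Under that assumption the product is about 0.221 billion, consistent with the "over 0.22 billion virtual MPI processes" reported.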
Symbolic discrete event system specification
NASA Technical Reports Server (NTRS)
Zeigler, Bernard P.; Chi, Sungdo
1992-01-01
Extending discrete event modeling formalisms to facilitate greater symbol manipulation capabilities is important to further their use in intelligent control and design of high autonomy systems. An extension to the DEVS formalism that facilitates symbolic expression of event times by extending the time base from the real numbers to the field of linear polynomials over the reals is defined. A simulation algorithm is developed to generate the branching trajectories resulting from the underlying nondeterminism. To efficiently manage symbolic constraints, a consistency checking algorithm for linear polynomial constraints based on feasibility checking algorithms borrowed from linear programming has been developed. The extended formalism offers a convenient means to conduct multiple, simultaneous explorations of model behaviors. Examples of application are given with concentration on fault model analysis.
Threaded WARPED: An Optimistic Parallel Discrete Event Simulator for Cluster of Multi-Core Machines
Wilsey, Philip A.
. However, the emergence of low-cost multi-core and many-core processors suitable for use in Beowulf called WARPED to a Beowulf Cluster of many-core processors. More precisely, WARPED is an optimistically for efficient execution on single-core Beowulf Clusters. The work of this thesis extends the WARPED kernel
Parallel Discrete Event Simulation of Molecular Dynamics Through Event-Based Decomposition
Herbordt, Martin
uses simplified models: atoms as hard spheres, covalent bonds as infinite barriers, and van der Waals, it inherently falls short--by several orders of magnitude--of being able to model many important biological
On extending parallelism to serial simulators
NASA Technical Reports Server (NTRS)
Nicol, David; Heidelberger, Philip
1994-01-01
This paper describes an approach to discrete event simulation modeling that appears to be effective for developing portable and efficient parallel execution of models of large distributed systems and communication networks. In this approach, the modeler develops submodels using an existing sequential simulation modeling tool, using the full expressive power of the tool. A set of modeling language extensions permit automatically synchronized communication between submodels; however, the automation requires that any such communication must take a nonzero amount of simulation time. Within this modeling paradigm, a variety of conservative synchronization protocols can transparently support conservative execution of submodels on potentially different processors. A specific implementation of this approach, U.P.S. (Utilitarian Parallel Simulator), is described, along with performance results on the Intel Paragon.
Concurrency and discrete event control
NASA Technical Reports Server (NTRS)
Heymann, Michael
1990-01-01
Much of discrete event control theory has been developed within the framework of automata and formal languages. An alternative approach inspired by the theories of process-algebra as developed in the computer science literature is presented. The framework, which rests on a new formalism of concurrency, can adequately handle nondeterminism and can be used for analysis of a wide range of discrete event phenomena.
Fancher, Robert H.
1997-01-01
simulating changes in the recruiting process brought about by policy changes initiated by the U.S. Army Recruiting Command. The standard model is based strictly on the data collected, while the alternative models simulate increases in the number...
Inflated speedups in parallel simulations via malloc()
NASA Technical Reports Server (NTRS)
Nicol, David M.
1990-01-01
Discrete-event simulation programs make heavy use of dynamic memory allocation in order to support simulation's very dynamic space requirements. When programming in C one is likely to use the malloc() routine. However, a parallel simulation which uses the standard Unix System V malloc() implementation may achieve an overly optimistic speedup, possibly superlinear. An alternate implementation provided on some (but not all) systems can avoid the speedup anomaly, but at the price of significantly reduced available free space. This is especially severe on most parallel architectures, which tend not to support virtual memory. It is shown how a simply implemented user-constructed interface to malloc() can both avoid artificially inflated speedups, and make efficient use of the dynamic memory space. The interface simply caches blocks on the basis of their size. The problem is demonstrated empirically, and the effectiveness of the solution is shown both empirically and analytically.
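The idea of holding freed blocks in per-size lists in front of the allocator can be sketched as follows. The abstract gives no implementation details, so this is only a conceptual illustration in Python rather than C: the class name SizeCache, its methods, and the use of bytearray as a stand-in for a malloc() call are all assumptions.

```python
from collections import defaultdict

class SizeCache:
    """Keeps freed blocks in per-size free lists and reuses them on request."""

    def __init__(self):
        self.free_lists = defaultdict(list)  # block size -> list of freed blocks
        self.fresh_allocations = 0           # requests passed to the allocator

    def alloc(self, size):
        # Reuse a cached block of exactly this size when one is available.
        if self.free_lists[size]:
            return self.free_lists[size].pop()
        self.fresh_allocations += 1
        return bytearray(size)  # stand-in for a call to malloc(size)

    def free(self, block):
        # Keep the block for later reuse instead of returning it to the allocator.
        self.free_lists[len(block)].append(block)

cache = SizeCache()
first = cache.alloc(64)
cache.free(first)
second = cache.alloc(64)   # served from the cache, no new allocation
third = cache.alloc(128)   # different size, so the allocator is called
print(cache.fresh_allocations)
```

Because simulations tend to allocate and release event records of a few fixed sizes, a cache like this serves most requests in constant time regardless of how the underlying allocator searches its free space.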
Guo, Shien; Getsios, Denis; Hernandez, Luis; Cho, Kelly; Lawler, Elizabeth; Altincatal, Arman; Lanes, Stephan; Blankenburg, Michael
2012-01-01
The growing understanding of the use of biomarkers in Alzheimer's disease (AD) may enable physicians to make more accurate and timely diagnoses. Florbetaben, a beta-amyloid tracer used with positron emission tomography (PET), is one of these diagnostic biomarkers. This analysis was undertaken to explore the potential value of florbetaben PET in the diagnosis of AD among patients with suspected dementia and to identify key data that are needed to further substantiate its value. A discrete event simulation was developed to conduct exploratory analyses from both US payer and societal perspectives. The model simulates the lifetime course of disease progression for individuals, evaluating the impact of their patient management from initial diagnostic work-up to final diagnosis. Model inputs were obtained from specific analyses of a large longitudinal dataset from the New England Veterans Healthcare System and supplemented with data from public data sources and assumptions. The analyses indicate that florbetaben PET has the potential to improve patient outcomes and reduce costs under certain scenarios. Key data on the use of florbetaben PET, such as its influence on time to confirmation of final diagnosis, treatment uptake, and treatment persistency, are unavailable and would be required to confirm its value. PMID:23326754
Massachusetts at Amherst, University of
, wise}@cs.umass.edu Keywords: Executable process models, Resource Management, Complex and Dynamic systems, Human-centric processes, Health care simulation. Abstract A process modeling language to include humans [9] have lacked detailed definition of resources [11]. Similarly, although there have
Khalid, Ruzelan; M. Nawawi, Mohd Kamal; Kawsar, Luthful A.; Ghani, Noraida A.; Kamil, Anton A.; Mustafa, Adli
2013-01-01
M/G/C/C state-dependent queuing networks consider service rates as a function of the number of residing entities (e.g., pedestrians, vehicles, and products). However, modeling such dynamic rates is not supported in modern Discrete Simulation System (DES) software. We designed an approach to cater for this limitation and used it to construct the M/G/C/C state-dependent queuing model in Arena software. Using the model, we have evaluated and analyzed the impacts of various arrival rates on the throughput, the blocking probability, the expected service time and the expected number of entities in a complex network topology. Results indicated that there is a range of arrival rates for each network where the simulation results fluctuate drastically across replications, and this causes the simulation results and analytical results to exhibit discrepancies. Detailed results showing how well the simulation results and the analytical results tally, in both abstract and graphical forms, and some scientific justifications for these have been documented and discussed. PMID:23560037
Oakley, Jeremy
simulation tool for analysing emergency medical services. To reflect upon this aim the following objectives to be investigated in the context of model re-use and the emergency medical services field. 2. To develop an Agent Medical Service and perform quantitative analysis of the findings. 7. To test emergency medical service
Comas, Mercè; Arrospide, Arantzazu; Mar, Javier; Sala, Maria; Vilaprinyó, Ester; Hernández, Cristina; Cots, Francesc; Martínez, Juan; Castells, Xavier
2014-01-01
Objective To assess the budgetary impact of switching from screen-film mammography to full-field digital mammography in a population-based breast cancer screening program. Methods A discrete-event simulation model was built to reproduce the breast cancer screening process (biennial mammographic screening of women aged 50 to 69 years) combined with the natural history of breast cancer. The simulation started with 100,000 women and, during a 20-year simulation horizon, new women were dynamically entered according to the aging of the Spanish population. Data on screening were obtained from Spanish breast cancer screening programs. Data on the natural history of breast cancer were based on US data adapted to our population. A budget impact analysis comparing digital with screen-film screening mammography was performed in a sample of 2,000 simulation runs. A sensitivity analysis was performed for crucial screening-related parameters. Distinct scenarios for recall and detection rates were compared. Results Statistically significant savings were found for overall costs, treatment costs and the costs of additional tests in the long term. The overall cost saving was 1,115,857€ (95%CI from 932,147 to 1,299,567) in the 10th year and 2,866,124€ (95%CI from 2,492,610 to 3,239,638) in the 20th year, representing 4.5% and 8.1% of the overall cost associated with screen-film mammography. The sensitivity analysis showed net savings in the long term. Conclusions Switching to digital mammography in a population-based breast cancer screening program saves long-term budget expense, in addition to providing technical advantages. Our results were consistent across distinct scenarios representing the different results obtained in European breast cancer screening programs. PMID:24832200
NASA Technical Reports Server (NTRS)
Nicol, David; Fujimoto, Richard
1992-01-01
This paper surveys topics that presently define the state of the art in parallel simulation. Included in the tutorial are discussions on new protocols, mathematical performance analysis, time parallelism, hardware support for parallel simulation, load balancing algorithms, and dynamic memory management for optimistic synchronization.
Parallel Atomistic Simulations
HEFFELFINGER,GRANT S.
2000-01-18
Algorithms developed to enable the use of atomistic molecular simulation methods with parallel computers are reviewed. Methods appropriate for bonded as well as non-bonded (and charged) interactions are included. While strategies for obtaining parallel molecular simulations have been developed for the full variety of atomistic simulation methods, molecular dynamics and Monte Carlo have received the most attention. Three main types of parallel molecular dynamics simulations have been developed: the replicated data decomposition, the spatial decomposition, and the force decomposition. For Monte Carlo simulations, parallel algorithms have been developed which can be divided into two categories, those which require a modified Markov chain and those which do not. Parallel algorithms developed for other simulation methods such as Gibbs ensemble Monte Carlo, grand canonical molecular dynamics, and Monte Carlo methods for protein structure determination are also reviewed, and issues such as how to measure parallel efficiency, especially in the case of parallel Monte Carlo algorithms with modified Markov chains, are discussed.
Karnon, Jonathan; Haji Ali Afzali, Hossein
2014-06-01
Modelling in economic evaluation is an unavoidable fact of life. Cohort-based state transition models are most common, though discrete event simulation (DES) is increasingly being used to implement more complex model structures. The benefits of DES relate to the greater flexibility around the implementation and population of complex models, which may provide more accurate or valid estimates of the incremental costs and benefits of alternative health technologies. The costs of DES relate to the time and expertise required to implement and review complex models, when perhaps a simpler model would suffice. The costs are not borne solely by the analyst, but also by reviewers. In particular, modelled economic evaluations are often submitted to support reimbursement decisions for new technologies, for which detailed model reviews are generally undertaken on behalf of the funding body. This paper reports the results from a review of published DES-based economic evaluations. Factors underlying the use of DES were defined, and the characteristics of applied models were considered, to inform options for assessing the potential benefits of DES in relation to each factor. Four broad factors underlying the use of DES were identified: baseline heterogeneity, continuous disease markers, time varying event rates, and the influence of prior events on subsequent event rates. If relevant individual-level data are available, representation of the four factors is likely to improve model validity, and it is possible to assess the importance of their representation in individual cases. A thorough model performance evaluation is required to overcome the costs of DES from the users' perspective, but few of the reviewed DES models reported such a process. More generally, further direct, empirical comparisons of complex models with simpler models would better inform the benefits of DES to implement more complex models, and the circumstances in which such benefits are most likely. PMID:24627341
2014-01-01
Background Osteoporotic fractures cause a large health burden and substantial costs. This study estimated the expected fracture numbers and costs for the remaining lifetime of postmenopausal women in Germany. Methods A discrete event simulation (DES) model which tracks changes in fracture risk due to osteoporosis, a previous fracture or institutionalization in a nursing home was developed. Expected lifetime fracture numbers and costs per capita were estimated for postmenopausal women (aged 50 and older) at average osteoporosis risk (AOR) and for those never suffering from osteoporosis. Direct and indirect costs were modeled. Deterministic univariate and probabilistic sensitivity analyses were conducted. Results The expected fracture numbers over the remaining lifetime of a 50 year old woman with AOR for each fracture type (% attributable to osteoporosis) were: hip 0.282 (57.9%), wrist 0.229 (18.2%), clinical vertebral 0.206 (39.2%), humerus 0.147 (43.5%), pelvis 0.105 (47.5%), and other femur 0.033 (52.1%). Expected discounted fracture lifetime costs (excess cost attributable to osteoporosis) per 50 year old woman with AOR amounted to €4,479 (€1,995). Most costs were accrued in the hospital €1,743 (€751) and long-term care sectors €1,210 (€620). Univariate sensitivity analysis resulted in percentage changes between -48.4% (if fracture rates decreased by 2% per year) and +83.5% (if fracture rates increased by 2% per year) compared to base case excess costs. Costs for women with osteoporosis were about 3.3 times of those never getting osteoporosis (€7,463 vs. €2,247), and were markedly increased for women with a previous fracture. Conclusion The results of this study indicate that osteoporosis causes a substantial share of fracture costs in postmenopausal women, which strongly increase with age and previous fractures. PMID:24981316
Compiling Esterel into Static Discrete-Event Code
Stephen A. Edwards; Vimal Kapadia; Michael Halas
2004-01-01
Executing concurrent specifications on sequential hardware is important for both simulation of systems that are eventually implemented on concurrent hardware and for those most conveniently described as a set of concurrent processes. As with most forms of simulation, this is easy to do correctly but difficult to do efficiently. Solutions such as preemptive operating systems and discrete-event simulators
Xyce parallel electronic simulator.
Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Rankin, Eric Lamont; Schiek, Richard Louis; Thornquist, Heidi K.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Santarelli, Keith R.
2010-05-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide. The focus of this document is to list exhaustively (to the extent possible) device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide.
Adelinde Uhrmacher; Corrado Priami
2005-01-01
The goal of Systems Biology is to analyze the behavior and interrelationships between entities of entire functional biological systems. Discrete event approaches are of particular interest if small numbers of entities, like DNA molecules, shall be modeled. Two general approaches toward discrete event modeling and simulation are presented. They provide rather different perspectives on the system to be modeled,
Optimal Discrete Event Supervisory Control of Aircraft Gas Turbine Engines
NASA Technical Reports Server (NTRS)
Litt, Jonathan (Technical Monitor); Ray, Asok
2004-01-01
This report presents an application of the recently developed theory of optimal Discrete Event Supervisory (DES) control that is based on a signed real measure of regular languages. The DES control techniques are validated on an aircraft gas turbine engine simulation test bed. The test bed is implemented on a networked computer system in which two computers operate in the client-server mode. Several DES controllers have been tested for engine performance and reliability.
An algebra of discrete event processes
NASA Technical Reports Server (NTRS)
Heymann, Michael; Meyer, George
1991-01-01
This report deals with an algebraic framework for modeling and control of discrete event processes. The report consists of two parts. The first part is introductory, and consists of a tutorial survey of the theory of concurrency in the spirit of Hoare's CSP, and an examination of the suitability of such an algebraic framework for dealing with various aspects of discrete event control. To this end a new concurrency operator is introduced and it is shown how the resulting framework can be applied. It is further shown that a suitable theory that deals with the new concurrency operator must be developed. In the second part of the report the formal algebra of discrete event control is developed. At the present time the second part of the report is still an incomplete and occasionally tentative working paper.
Discrete Events as Units of Perceived Time
ERIC Educational Resources Information Center
Liverence, Brandon M.; Scholl, Brian J.
2012-01-01
In visual images, we perceive both space (as a continuous visual medium) and objects (that inhabit space). Similarly, in dynamic visual experience, we perceive both continuous time and discrete events. What is the relationship between these units of experience? The most intuitive answer may be similar to the spatial case: time is perceived as an…
Discrete event control of an unmanned aircraft
Mehdi Fatemi; James Millan; Jonathan Stevenson; Tina Yu; Siu O'Young
2008-01-01
This paper describes the application of a limited-lookahead discrete event supervisory controller that handles the control aspects of the unmanned aerial vehicle (UAV) "sense and avoid" (SAA) problem. The controlled UAV and the approaching uncontrolled intruding aircraft that must be avoided are treated as a hybrid system. The UAV control decision making process is discrete, while the embedded flight model
Discrete Event Execution with One-Sided and Two-Sided GVT Algorithms on 216,000 Processor Cores
Perumalla, Kalyan S; Park, Alfred J; Tipparaju, Vinod
2014-01-01
Global virtual time (GVT) computation is a key determinant of the efficiency and runtime dynamics of parallel discrete event simulations (PDES), especially on large-scale parallel platforms. Here, three execution modes of a generalized GVT computation algorithm are studied on high-performance parallel computing systems: (1) a synchronous GVT algorithm that affords ease of implementation, (2) an asynchronous GVT algorithm that is more complex to implement but can relieve blocking latencies, and (3) a variant of the asynchronous GVT algorithm to exploit one-sided communication in extant supercomputing platforms. Performance results are presented of implementations of these algorithms on up to 216,000 cores of a Cray XT5 system, exercised on a range of parameters: optimistic and conservative synchronization, fine- to medium-grained event computation, synthetic and non-synthetic applications, and different lookahead values. Performance of up to 54 billion events executed per second is registered. Detailed PDES-specific runtime metrics are presented to further the understanding of tightly-coupled discrete event dynamics on massively parallel platforms.
Tomeczkowski, Jörg; Stern, Sean; Müller, Alfred; von Heymann, Christian
2013-01-01
Objectives Transfusion of allogeneic blood is still common in orthopedic surgery. This analysis evaluates from the perspective of a German hospital the potential cost savings of Epoetin alfa (EPO) compared to predonated autologous blood transfusions or to a no blood conservation strategy (allogeneic blood transfusion strategy) during elective hip and knee replacement surgery. Methods Individual patients (N = 50,000) were simulated based on data from controlled trials, the German DRG institute (InEK) and various publications and entered into a stochastic model (Monte-Carlo) of three treatment arms: EPO, preoperative autologous donation and no blood conservation strategy. All three strategies lead to a different risk for an allogeneic blood transfusion. The model focused on the costs and events of the three different procedures. The costs were obtained from clinical trial databases, the German DRG system, patient records and medical publications: transfusion (allogeneic red blood cells: €320/unit and autologous red blood cells: €250/unit), pneumonia treatment (€5,000), and length of stay (€300/day). Probabilistic sensitivity analyses were performed to determine which factors had an influence on the model's clinical and cost outcomes. Results At acquisition costs of €200/40,000 IU, EPO is cost saving compared to autologous blood donation, and cost-effective compared to a no blood conservation strategy. The results were most sensitive to the cost of EPO, blood units and hospital days. Conclusions EPO might become an attractive blood conservation strategy for anemic patients at reasonable costs due to the reduction in allogeneic blood transfusions, in the modeled incidence of transfusion-associated pneumonia and the prolonged length of stay. PMID:24039829
Multiple Autonomous Discrete Event Controllers for Constellations
NASA Technical Reports Server (NTRS)
Esposito, Timothy C.
2003-01-01
The Multiple Autonomous Discrete Event Controllers for Constellations (MADECC) project is an effort within the National Aeronautics and Space Administration Goddard Space Flight Center's (NASA/GSFC) Information Systems Division to develop autonomous positioning and attitude control for constellation satellites. It will be accomplished using traditional control theory and advanced coordination algorithms developed by the Johns Hopkins University Applied Physics Laboratory (JHU/APL). This capability will be demonstrated in the discrete event control test-bed located at JHU/APL. This project will be modeled for the Leonardo constellation mission, but is intended to be adaptable to any constellation mission. To develop a common software architecture, the controllers will only model very high-level responses. For instance, after determining that a maneuver must be made, the MADECC system will output a (Delta)V (velocity change) value. Lower level systems must then decide which thrusters to fire and for how long to achieve that (Delta)V.
Nonlinear Control and Discrete Event Systems
NASA Technical Reports Server (NTRS)
Meyer, George; Null, Cynthia H. (Technical Monitor)
1995-01-01
As the operation of large systems becomes ever more dependent on extensive automation, the need for an effective solution to the problem of design and validation of the underlying software becomes more critical. Large systems possess much detailed structure, typically hierarchical, and they are hybrid. Information processing at the top of the hierarchy is by means of formal logic and sentences; on the bottom it is by means of simple scalar differential equations and functions of time; and in the middle it is by an interacting mix of nonlinear multi-axis differential equations and automata, and functions of time and discrete events. The lecture will address the overall problem as it relates to flight vehicle management, describe the middle level, and offer a design approach that is based on Differential Geometry and Discrete Event Dynamic Systems Theory.
GVT Algorithms and Discrete Event Dynamics on 128K+ Processor Cores
Perumalla, Kalyan S [ORNL]; Park, Alfred J [ORNL]; Tipparaju, Vinod [ORNL]
2011-01-01
Parallel discrete event simulation (PDES) represents a class of codes that are challenging to scale to large numbers of processors due to tight global timestamp-ordering and fine-grained event execution. One of the critical factors in scaling PDES is the efficiency of the underlying global virtual time (GVT) algorithm needed for correctness of parallel execution and speed of progress. Although many GVT algorithms have been proposed previously, few have been proposed for scalable asynchronous execution and none customized to exploit one-sided communication. Moreover, the detailed performance effects of actual GVT algorithm implementations on large platforms are unknown. Here, three major GVT algorithms intended for scalable execution on high-performance systems are studied: (1) a synchronous GVT algorithm that affords ease of implementation, (2) an asynchronous GVT algorithm that is more complex to implement but can relieve blocking latencies, and (3) a variant of the asynchronous GVT algorithm, proposed and studied for the first time here, to exploit one-sided communication in extant supercomputing platforms. Performance results are presented of implementations of these algorithms on over 64,000 cores of a Cray XT5 system, exercised on a range of parameters: optimistic and conservative synchronization, fine- to medium-grained event computation, synthetic and non-synthetic applications, and different lookahead values. Performance of tens of billions of events executed per second is registered, exceeding the speeds of any known PDES engine, and showing asynchronous GVT algorithms to outperform state-of-the-art synchronous GVT algorithms. Detailed PDES-specific runtime metrics are presented to further the understanding of tightly-coupled discrete event execution dynamics on massively parallel platforms.
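As a rough illustration of what a synchronous GVT computation yields, the sketch below takes GVT to be the minimum over every logical process's local virtual time and the timestamps of messages still in transit. The function name, the snapshot values, and the assumption that all processes are paused at a barrier are hypothetical; this is not the algorithm evaluated in the abstract above.

```python
from typing import Iterable


def synchronous_gvt(local_times: Iterable[float],
                    in_transit: Iterable[float]) -> float:
    """Return a GVT estimate from a consistent global snapshot.

    local_times: local virtual time of each logical process (assumed layout).
    in_transit:  timestamps of messages sent but not yet received.
    No event with a timestamp below the returned value can still appear,
    so state older than it could be reclaimed.
    """
    candidates = list(local_times) + list(in_transit)
    if not candidates:
        raise ValueError("need at least one logical process")
    return min(candidates)


# Hypothetical snapshot taken while every process waits at a barrier.
lvt = [12.5, 9.0, 15.2]
messages = [8.4, 20.0]
gvt = synchronous_gvt(lvt, messages)
```

In this snapshot an in-transit message, not a local clock, determines the result, which is why GVT algorithms have to account for messages in flight.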
Discrete events as units of perceived time.
Liverence, Brandon M; Scholl, Brian J
2012-06-01
In visual images, we perceive both space (as a continuous visual medium) and objects (that inhabit space). Similarly, in dynamic visual experience, we perceive both continuous time and discrete events. What is the relationship between these units of experience? The most intuitive answer may be similar to the spatial case: time is perceived as an underlying medium, which is later segmented into discrete event representations. Here we explore the opposite possibility--that our subjective experience of time itself can be influenced by how durations are temporally segmented, beyond more general effects of change and complexity. We show that the way in which a continuous dynamic display is segmented into discrete units (via a path shuffling manipulation) greatly influences duration judgments, independent of psychophysical factors previously implicated in time perception, such as overall stimulus energy, attention and predictability. It seems that we may use the passage of discrete events--and the boundaries between them--in our subjective experience as part of the raw material for inferring the strength of the underlying "current" of time. PMID:22369229
Planning and supervision of reactor defueling using discrete event techniques
Garcia, H.E.; Imel, G.R. [Argonne National Lab., IL (United States); Houshyar, A. [Western Michigan Univ., Kalamazoo, MI (United States). Dept. of Physics
1995-12-31
New fuel handling and conditioning activities for the defueling of the Experimental Breeder Reactor II are being performed at Argonne National Laboratory. Research is being conducted to investigate the use of discrete event simulation, analysis, and optimization techniques to plan, supervise, and perform these activities in such a way that productivity can be improved. The central idea is to characterize this defueling operation as a collection of interconnected serving cells, and then apply operational research techniques to identify appropriate planning schedules for given scenarios. In addition, a supervisory system is being developed to provide personnel with on-line information on the progress of fueling tasks and to suggest courses of action to accommodate changing operational conditions. This paper provides an introduction to the research in progress at ANL. In particular, it briefly describes the fuel handling configuration for reactor defueling at ANL, presenting the flow of material from the reactor grid to the interim storage location, and the expected contributions of this work. As an example of the studies being conducted for planning and supervision of fuel handling activities at ANL, an application of discrete event simulation techniques to evaluate different fuel cask transfer strategies is given at the end of the paper.
A new parallel simulation technique
NASA Astrophysics Data System (ADS)
Blanco-Pillado, Jose J.; Olum, Ken D.; Shlaer, Benjamin
2012-01-01
We develop a "semi-parallel" simulation technique suggested by Pretorius and Lehner, in which the simulation spacetime volume is divided into a large number of small 4-volumes that have only initial and final surfaces. Thus there is no two-way communication between processors, and the 4-volumes can be simulated independently and potentially at different times. This technique allows us to simulate much larger volumes than we otherwise could, because we are not limited by total memory size. No processor time is lost waiting for other processors. We compare a cosmic string simulation we developed using the semi-parallel technique with our previous MPI-based code for several test cases and find a factor of 2.6 improvement in the total amount of processor time required to accomplish the same job for strings evolving in the matter-dominated era.
CAISSON: Interconnect Network Simulator
NASA Technical Reports Server (NTRS)
Springer, Paul L.
2006-01-01
Cray response to HPCS initiative. Model future petaflop computer interconnect. Parallel discrete event simulation techniques for large scale network simulation. Built on WarpIV engine. Run on laptop and Altix 3000. Can be sized up to 1000 simulated nodes per host node. Good parallel scaling characteristics. Flexible: multiple injectors, arbitration strategies, queue iterators, network topologies.
Discrete-event Execution Alternatives on General Purpose Graphical Processing Units (GPGPUs)
Kalyan S. Perumalla
2006-01-01
Graphics cards, traditionally designed as accelerators for computer graphics, have evolved to support more general-purpose computation. General Purpose Graphical Processing Units (GPGPUs) are now being used as highly efficient, cost-effective platforms for executing certain simulation applications. While most of these applications belong to the category of time-stepped simulations, little is known about the applicability of GPGPUs to discrete event
Controllers for Discrete Event Systems via P. Madhusudan 1 and P. S. Thiagarajan 2 ?
Parthasarathy, Madhusudan
but not least, we deploy a type of homomorphisms called c-simulations (control-simulations) to model tems -- one modelling the plant and the other the specification -- admits a controller is decidable therein. From the control-theoretic perspective, the modelling of discrete-event systems (DES
Optimal stabilization of discrete event systems
NASA Technical Reports Server (NTRS)
Passino, Kevin M.; Antsaklis, Panos J.
1990-01-01
An optimal stabilization problem for discrete event systems (DES) is addressed. A class of not necessarily finite state 'logical' DES models is utilized which can also model the costs for events to occur. Let P and A denote two such models. Suppose that P characterizes the valid behavior of a dynamical system and A represents certain design objectives which specify the allowable DES behavior which is 'contained in' the valid behavior. An optimal control problem for P and A is how to choose the sequence of inputs to P so that the DES behavior lies in A (i.e., it is allowable) and so that a performance index defined in terms of the costs of the events is minimized. Two solutions are provided to an optimal stabilization problem, i.e. how to find a sequence of inputs that results in an optimal state trajectory which cycles in a pre-specified set.
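One ingredient of such a problem, choosing a minimum-cost event sequence that drives the system into a pre-specified set of states, can be sketched as a shortest-path search. The sketch assumes a finite state set and non-negative event costs, which is narrower than the model class in the abstract, and it only reaches the target set rather than establishing cycling within it. All names are hypothetical.

```python
import heapq


def cheapest_event_sequence(trans, start, targets):
    """Dijkstra-style search over a finite event-labelled graph.

    trans:   dict mapping state -> list of (event, cost, next_state);
             costs are assumed non-negative.
    targets: set of states the trajectory should reach.
    Returns (total_cost, [events]) or None if no target is reachable.
    """
    # The counter breaks ties so the heap never compares event lists.
    frontier = [(0.0, 0, start, [])]
    settled = {}
    counter = 1
    while frontier:
        cost, _, state, events = heapq.heappop(frontier)
        if state in targets:
            return cost, events
        if state in settled and settled[state] <= cost:
            continue
        settled[state] = cost
        for event, step_cost, nxt in trans.get(state, []):
            heapq.heappush(
                frontier, (cost + step_cost, counter, nxt, events + [event])
            )
            counter += 1
    return None


# Hypothetical three-state plant: the direct event "b" is more expensive
# than the two-step route "a" then "c".
plant = {
    "s0": [("a", 1.0, "s1"), ("b", 5.0, "s2")],
    "s1": [("c", 1.0, "s2")],
}
result = cheapest_event_sequence(plant, "s0", {"s2"})
```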
Parallel Simulation Today David Nicol
messages, with the same timestamp, and these may be processed. As a result, each queue sends two new null messages, now simulation. Included in the tutorial are discussions on new protocols, mathematical performance analysis, analytic performance analysis, time parallelism, hardware support, load balancing, and dynamic memory
An assessment of the ModSim/TWOS parallel simulation environment
Rich, D.O.; Michelsen, R.E.
1991-01-01
The Time Warp Operating System (TWOS) has been the focus of significant research in parallel, discrete-event simulation (PDES). A new language, ModSim, has been developed for use in conjunction with TWOS. The coupling of ModSim and TWOS is an attempt to address the development of large-scale, complex, discrete-event simulation models for parallel execution. The approach, simply stated, is to provide a high-level simulation language that embodies well-known software engineering principles combined with a high-performance parallel execution environment. The inherent difficulty with this approach is the mapping of the simulation application to the parallel run-time environment. To use TWOS, Time Warp applications are currently developed in C and must be tailored according to a set of constraints and conventions. C/TWOS applications are carefully developed using explicit calls to the Time Warp primitives; thus, the mapping of application to parallel run-time environment is done by the application developer. The disadvantage to this approach is the questionable scalability to larger software efforts; the obvious advantage is the degree of control over managing the efficient execution of the application. The ModSim/TWOS system provides an automatic mapping from a ModSim application to an equivalent C/TWOS application. The major flaw with the ModSim/TWOS system as it currently exists is that there is no compiler support for mapping a ModSim application into an efficient C/TWOS application. Moreover, the ModSim language as currently defined does not provide explicit hooks into the Time Warp Operating System and hence the developer is unable to tailor a ModSim application in the same fashion that a C application can be tailored. Without sufficient compiler support, there is a mismatch between ModSim's object-oriented, process-based execution model and the Time Warp execution model.
Parallel Markov chain Monte Carlo simulations
NASA Astrophysics Data System (ADS)
Ren, Ruichao; Orkoulas, G.
2007-06-01
With strict detailed balance, parallel Monte Carlo simulation through domain decomposition cannot be validated with conventional Markov chain theory, which describes an intrinsically serial stochastic process. In this work, the parallel version of Markov chain theory and its role in accelerating Monte Carlo simulations via cluster computing is explored. It is shown that sequential updating is the key to improving efficiency in parallel simulations through domain decomposition. A parallel scheme is proposed to reduce interprocessor communication or synchronization, which slows down parallel simulation with increasing number of processors. Parallel simulation results for the two-dimensional lattice gas model show substantial reduction of simulation time for systems of moderate and large size.
Fault diagnosis of continuous systems using discrete-event methods
Matthew Daigle; Xenofon Koutsoukos; Gautam Biswas
2007-01-01
Fault diagnosis is crucial for ensuring the safe operation of complex engineering systems. Although discrete-event diagnosis methods are used extensively, they do not easily apply to parametric fault isolation in systems with complex continuous dynamics. This paper presents a novel discrete-event system diagnosis approach for abrupt parametric faults in continuous systems that is based on a qualitative abstraction
Monitoring and Active Diagnosis for Discrete-Event Systems
Pencolé, Yannick
by processes integrated in the on-board architecture of the system. On-line diagnosis is usually consideredMonitoring and Active Diagnosis for Discrete-Event Systems Elodie Chanthery , Yannick Pencol the monitoring of discrete-event systems named active diagnosis. The objective of on-line active diagnosis
Model Transformation with Hierarchical Discrete-Event Control
Model Transformation with Hierarchical Discrete-Event Control Thomas Huining Feng Electrical. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission. Model Transformation with Hierarchical Discrete-Event Control by Huining Feng B.S. (Nanjing
Decentralized Modular Control of Concurrent Discrete Event Systems
Kumar, Ratnesh
Decentralized Modular Control of Concurrent Discrete Event Systems Changyan Zhou, Ratnesh Kumar, and Ramavarapu S. Sreenivas Abstract-- The paper studies decentralized modular control of concurrent discrete event systems that are composed of multiple interacting modules. A modular supervisor consists of a set
Parallel circuit simulation on supercomputers
Saleh, R.A.; Gallivan, K.A. (Center for Supercomputing Research and Development); Chang, M.C.; Hajj, I.N.; Trick, T.N. (Coordinated Science Lab.); Smart, D.
1989-12-01
Circuit simulation is a very time-consuming and numerically intensive application, especially when the problem size is large as in the case of VLSI circuits. To improve the performance of circuit simulators without sacrificing accuracy, a variety of parallel processing algorithms have been investigated due to the recent availability of a number of commercial multiprocessor machines. In this paper, research in the field of parallel circuit simulation is surveyed and the ongoing research in this area at the University of Illinois is described. Both standard and relaxation-based approaches are considered. In particular, the forms of parallelism available within the direct method approach, used in programs such as SPICE2 and SLATE, and within the relaxation-based approaches, such as waveform relaxation, iterated timing analysis, and waveform-relaxation-Newton, are described. The specific implementation issues addressed here are primarily related to general-purpose multiprocessors with a shared-memory architecture having a limited number of processors, although many of the comments also apply to a number of other architectures.
Automated Control Synthesis for an Assembly Line using Discrete Event System Control Theory
Kumar, Ratnesh
Automated Control Synthesis for an Assembly Line using Discrete Event System Control Theory- tems, an educational test-bed that simulates an automated car assembly line has been built using LEGO® blocks. Finite state machines (FSMs) are used for modeling operations of the assembly line
Computational Issues in Intelligent Control: Discrete-Event and Hybrid Systems
Koutsoukos, Xenofon D.
that are central in intelligent control. In particular, we discuss how the design, simulationComputational Issues in Intelligent Control: Discrete-Event and Hybrid Systems Xenofon D, IN 46556 e-mail: xkoutsou,antsaklis.1@nd.edu Abstract Intelligent control methodologies are being developed
Xyce parallel electronic simulator design.
Thornquist, Heidi K.; Rankin, Eric Lamont; Mei, Ting; Schiek, Richard Louis; Keiter, Eric Richard; Russo, Thomas V.
2010-09-01
This document is the Xyce Circuit Simulator developer guide. Xyce has been designed from the 'ground up' to be a SPICE-compatible, distributed memory parallel circuit simulator. While it is in many respects a research code, Xyce is intended to be a production simulator. As such, having software quality engineering (SQE) procedures in place to ensure a high level of code quality and robustness is essential. Version control, issue tracking, customer support, C++ style guidelines and the Xyce release process are all described. The Xyce Parallel Electronic Simulator has been under development at Sandia since 1999. Historically, Xyce has mostly been funded by ASC, and the original focus of Xyce development has primarily been related to circuits for nuclear weapons. However, this has not been the only focus and it is expected that the project will diversify. Like many ASC projects, Xyce is a group development effort, which involves a number of researchers, engineers, scientists, mathematicians and computer scientists. In addition to diversity of background, it is to be expected on long term projects for there to be a certain amount of staff turnover, as people move on to different projects. As a result, it is very important that the project maintain high software quality standards. The point of this document is to formally document a number of the software quality practices followed by the Xyce team in one place. Also, it is hoped that this document will be a good source of information for new developers.
Parallel network simulations with NEURON.
Migliore, M; Cannia, C; Lytton, W W; Markram, Henry; Hines, M L
2006-10-01
The NEURON simulation environment has been extended to support parallel network simulations. Each processor integrates the equations for its subnet over an interval equal to the minimum (interprocessor) presynaptic spike generation to postsynaptic spike delivery connection delay. The performance of three published network models with very different spike patterns exhibits superlinear speedup on Beowulf clusters and demonstrates that spike communication overhead is often less than the benefit of an increased fraction of the entire problem fitting into high speed cache. On the EPFL IBM Blue Gene, almost linear speedup was obtained up to 100 processors. Increasing one model from 500 to 40,000 realistic cells exhibited almost linear speedup on 2,000 processors, with an integration time of 9.8 seconds and communication time of 1.3 seconds. The potential for speed-ups of several orders of magnitude makes practical the running of large network simulations that could otherwise not be explored. PMID:16732488
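The scheduling idea in this abstract, that each processor can integrate independently for an interval equal to the smallest interprocessor connection delay before spikes must be exchanged, can be sketched as follows. This is a plain-Python illustration with hypothetical function names and delay values; it does not use or reproduce NEURON's own API.

```python
def exchange_interval(delays_ms):
    """Smallest interprocessor connection delay, in milliseconds.

    A spike generated inside one interval cannot be delivered on another
    processor before the interval ends, so spikes only need to be
    exchanged once per interval.
    """
    return min(delays_ms)


def exchange_times(t_stop_ms, delays_ms):
    """Simulation times at which processors would swap spikes."""
    step = exchange_interval(delays_ms)
    times = []
    k = 1
    while k * step <= t_stop_ms:
        times.append(k * step)
        k += 1
    return times


# Hypothetical delays on connections that cross processor boundaries.
cross_delays = [2.0, 1.5, 3.0]
schedule = exchange_times(6.0, cross_delays)
```

A larger minimum delay means fewer exchanges per simulated second, which is one reason communication overhead can stay small relative to integration time.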
Graphite : a parallel distributed simulator for multicores
Kasture, Harshad
2010-01-01
This thesis describes Graphite, a parallel, distributed simulator for simulating large-scale multicore architectures, and focuses particularly on the functional aspects of simulating a single, unmodified multi-threaded ...
Discrete Event Supervisory Control Applied to Propulsion Systems
NASA Technical Reports Server (NTRS)
Litt, Jonathan S.; Shah, Neerav
2005-01-01
The theory of discrete event supervisory (DES) control was applied to the optimal control of a twin-engine aircraft propulsion system and demonstrated in a simulation. The supervisory control, which is implemented as a finite-state automaton, oversees the behavior of a system and manages it in such a way that it maximizes a performance criterion, similar to a traditional optimal control problem. DES controllers can be nested such that a high-level controller supervises multiple lower level controllers. This structure can be expanded to control huge, complex systems, providing optimal performance and increasing autonomy with each additional level. The DES control strategy for propulsion systems was validated using a distributed testbed consisting of multiple computers--each representing a module of the overall propulsion system--to simulate real-time hardware-in-the-loop testing. In the first experiment, DES control was applied to the operation of a nonlinear simulation of a turbofan engine (running in closed loop using its own feedback controller) to minimize engine structural damage caused by a combination of thermal and structural loads. This enables increased on-wing time for the engine through better management of the engine-component life usage. Thus, the engine-level DES acts as a life-extending controller through its interaction with and manipulation of the engine s operation.
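A supervisor implemented as a finite-state automaton can be pictured as a small state machine that observes plant events and, in each state, permits only some controllable events. The sketch below is a minimal illustration; the state names, event names, and class layout are hypothetical and are not taken from the report.

```python
class Supervisor:
    """Minimal finite-state supervisor (illustrative only)."""

    def __init__(self, transitions, enabled, initial):
        self.transitions = transitions  # (state, event) -> next state
        self.enabled = enabled          # state -> set of permitted controllable events
        self.state = initial

    def allows(self, event):
        """True if the controllable event is permitted in the current state."""
        return event in self.enabled.get(self.state, set())

    def observe(self, event):
        """Advance on an observed event; unknown events leave the state unchanged."""
        self.state = self.transitions.get((self.state, event), self.state)


# Hypothetical engine-level supervisor that forbids full thrust while a
# high structural load has been detected.
sup = Supervisor(
    transitions={
        ("nominal", "high_load_detected"): "protect",
        ("protect", "load_cleared"): "nominal",
    },
    enabled={
        "nominal": {"full_thrust", "reduced_thrust"},
        "protect": {"reduced_thrust"},
    },
    initial="nominal",
)
allowed_before = sup.allows("full_thrust")
sup.observe("high_load_detected")
allowed_after = sup.allows("full_thrust")
```

Nesting, as described in the abstract, would correspond to a higher-level supervisor whose controllable events select among the configurations of several such lower-level supervisors.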
Randomized Priority Queues for Fast Parallel Access Peter Sanders
Sanders, Peter
are timestamps in optimistic discrete event simulation [18, 8]. A sequential simulator processing events units may be more difficult to process or superfluous work may be necessary. One example for priorities in time stamp order never has to perform a roll back. For parallel simulation this is not possible
Yoginath, Srikanth B [ORNL; Perumalla, Kalyan S [ORNL
2013-01-01
Virtual machine (VM) technologies, especially those offered via Cloud platforms, present new dimensions with respect to performance and cost in executing parallel discrete event simulation (PDES) applications. Due to the introduction of overall cost as a metric, the choice of the highest-end computing configuration is no longer the most economical one. Moreover, runtime dynamics unique to VM platforms introduce new performance characteristics, and the variety of possible VM configurations gives rise to a range of choices for hosting a PDES run. Here, an empirical study of these issues is undertaken to guide an understanding of the dynamics, trends and trade-offs in executing PDES on VM/Cloud platforms. Performance results and cost measures are obtained from actual execution of a range of scenarios in two PDES benchmark applications on the Amazon Cloud offerings and on a high-end VM host machine. The data reveal interesting insights into the new VM-PDES dynamics that come into play and also lead to counter-intuitive guidelines with respect to choosing the best and second-best configurations when overall cost of execution is considered. In particular, it is found that choosing the highest-end VM configuration guarantees neither the best runtime nor the least cost. Interestingly, choosing a (suitably scaled) low-end VM configuration provides the least overall cost without adversely affecting the total runtime.
A Framework for Performance Evaluation of Parallel Discrete Event
Wilsey, Philip A.
of the requirements for the degree of MASTER OF SCIENCE in the Department of Electrical and Computer Engineering and Computer Science of the College of Engineering 27th March, 1997 by Vijay Balakrishnan Bachelor of Engineering, Birla Institute of Technology, Mesra, Ranchi, India 1993 Thesis Advisor and Committee Chair: Dr
State Estimation and Detectability of Probabilistic Discrete Event Systems
Shu, Shaolong; Ying, Hao; Chen, Xinguang
2009-01-01
A probabilistic discrete event system (PDES) is a nondeterministic discrete event system where the probabilities of nondeterministic transitions are specified. State estimation problems of PDES are more difficult than those of non-probabilistic discrete event systems. In our previous papers, we investigated state estimation problems for non-probabilistic discrete event systems. We defined four types of detectabilities and derived necessary and sufficient conditions for checking these detectabilities. In this paper, we extend our study to state estimation problems for PDES by considering the probabilities. The first step in our approach is to convert a given PDES into a nondeterministic discrete event system and find sufficient conditions for checking probabilistic detectabilities. Next, to find necessary and sufficient conditions for checking probabilistic detectabilities, we investigate the “convergence” of event sequences in PDES. An event sequence is convergent if along this sequence, it is more and more certain that the system is in a particular state. We derive conditions for convergence and hence for detectabilities. We focus on systems with complete event observation and no state observation. For better presentation, the theoretical development is illustrated by a simplified example of nephritis diagnosis. PMID:19956775
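One informal way to picture an event sequence along which "it is more and more certain that the system is in a particular state" is a belief update over states driven only by observed events. The sketch below uses a plain Bayesian update on a hypothetical two-state system; it is an illustration of the idea, not the detectability conditions derived in the paper.

```python
def update_belief(belief, event, trans):
    """Update a probability distribution over states after one observed event.

    belief: dict state -> probability.
    trans:  dict state -> {(event, next_state): probability} (assumed layout).
    Only events are observed; states are never observed directly.
    """
    new_belief = {}
    for state, p in belief.items():
        for (ev, nxt), q in trans.get(state, {}).items():
            if ev == event:
                new_belief[nxt] = new_belief.get(nxt, 0.0) + p * q
    total = sum(new_belief.values())
    if total == 0:
        raise ValueError("observed event is impossible under the current belief")
    return {s: p / total for s, p in new_belief.items()}


# Hypothetical system: event "a" can only occur in state A and keeps the
# system there, so observing it removes all uncertainty about the state.
system = {
    "A": {("a", "A"): 0.5, ("b", "B"): 0.5},
    "B": {("b", "B"): 1.0},
}
prior = {"A": 0.5, "B": 0.5}
after_a = update_belief(prior, "a", system)
after_b = update_belief(prior, "b", system)
```

Dropping the probabilities and keeping only which transitions are possible gives the nondeterministic system mentioned in the abstract as the first step of the approach.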
Hierarchical Discrete Event Supervisory Control of Aircraft Propulsion Systems
NASA Technical Reports Server (NTRS)
Yasar, Murat; Tolani, Devendra; Ray, Asok; Shah, Neerav; Litt, Jonathan S.
2004-01-01
This paper presents a hierarchical application of Discrete Event Supervisory (DES) control theory for intelligent decision and control of a twin-engine aircraft propulsion system. A dual layer hierarchical DES controller is designed to supervise and coordinate the operation of two engines of the propulsion system. The two engines are individually controlled to achieve enhanced performance and reliability, necessary for fulfilling the mission objectives. Each engine is operated under a continuously varying control system that maintains the specified performance and a local discrete-event supervisor for condition monitoring and life extending control. A global upper level DES controller is designed for load balancing and overall health management of the propulsion system.
Parallel methods for the flight simulation model
Xiong, Wei Zhong; Swietlik, C.
1994-06-01
The Advanced Computer Applications Center (ACAC) has been involved in evaluating advanced parallel architecture computers and the applicability of these machines to computer simulation models. The advanced systems investigated include parallel machines with shared memory and distributed architectures consisting of an eight processor Alliant FX/8, a twenty-four processor Sequent Symmetry, Cray XMP, IBM RISC 6000 model 550, and the Intel Touchstone eight processor Gamma and 512 processor Delta machines. Since parallelizing a truly efficient application program for the parallel machine is a difficult task, the implementation for these machines in a realistic setting has been largely overlooked. The ACAC has developed considerable expertise in optimizing and parallelizing application models on a collection of advanced multiprocessor systems. One such application model is the Flight Simulation Model, which uses a set of differential equations to describe the flight characteristics of a launched missile by means of a trajectory. The Flight Simulation Model was written in the FORTRAN language with approximately 29,000 lines of source code. Depending on the number of trajectories, the computation can require several hours to a full day of CPU time on a DEC/VAX 8650 system. There is an impetus to reduce the execution time and utilize the advanced parallel architecture computing environment available. ACAC researchers developed a parallel method that allows the Flight Simulation Model to run in parallel on the multiprocessor system. For the benchmark data tested, the parallel Flight Simulation Model implemented on the Alliant FX/8 has achieved nearly linear speedup. In this paper, we describe a parallel method for the Flight Simulation Model. We believe the method presented in this paper provides a general concept for the design of parallel applications. This concept, in most cases, can be adapted to many other sequential application programs.
Extending Decentralized Discrete-Event Modelling to Diagnose Reconfigurable Systems
Grastien, Alban
or power transportation networks. The reason of a reconfiguration can be the update of the system (substituExtending Decentralized Discrete-Event Modelling to Diagnose Reconfigurable Systems Alban Grastien1 of systems, is particularly well suited to the diagnosis of reconfigurable systems. The contribution
Discrete Event Modelling and Performance Evaluation of Versions of TCP
Kumar, Anurag
Discrete Event Modelling and Performance Evaluation of Versions of TCP: Lossy Links and Rate Introduction The Internet protocol suite's Transmission Control Protocol (TCP) adapts IP's unreliable, TCP serves two other functions: (i) sender-receiver flow control, and (ii) congestion control
The Computational Complexity of Decentralized Discrete-Event Control Problems
], [LW88], [CM89], [Kro87], [Laf88] and an application within semiconductor manufacturing [Bal91], [HSF91 to model control problems that arise from discrete-event processes. Problems associated with centralized flexible manufacturing systems [LW90] and communication systems [CDFV88], [RW90], [RW92a]. At this On one
Extracting Discrete Event System Models from Hybrid Control Systems
Antsaklis, Panos
and output. In addition to the plant and controller, there is an interface which provides communication. The index n indicates the order of the symbols. 2.3 Interface The controller and plant cannot communicate halfspaces are used to define a set of plant events and a discrete event system model is generated which
Notions of security and opacity in discrete event systems
Anooshiravan Saboori; Christoforos N. Hadjicostis
2007-01-01
In this paper, we follow a state-based approach to extend the notion of opacity in computer security to discrete event systems. A system is (S, P)-opaque if the evolution of its true state through a set of secret states S remains opaque to an observer who is observing activity in the system through the projection map P. In other words,
Graphite: A Distributed Parallel Simulator for Multicores
Beckmann, Nathan
2009-11-09
This paper introduces the open-source Graphite distributed parallel multicore simulator infrastructure. Graphite is designed from the ground up for exploration of future multicore processors containing dozens, hundreds, ...
Data parallel sorting for particle simulation
NASA Technical Reports Server (NTRS)
Dagum, Leonardo
1992-01-01
Sorting on a parallel architecture is a communications intensive event which can incur a high penalty in applications where it is required. In the case of particle simulation, only integer sorting is necessary, and sequential implementations easily attain the minimum performance bound of O(N) for N particles. Parallel implementations, however, have to cope with the parallel sorting problem which, in addition to incurring a heavy communications cost, can make the minimum performance bound difficult to attain. This paper demonstrates how the sorting problem in a particle simulation can be reduced to a merging problem, and describes an efficient data parallel algorithm to solve this merging problem in a particle simulation. The new algorithm is shown to be optimal under conditions usual for particle simulation, and its fieldwise implementation on the Connection Machine is analyzed in detail. The new algorithm is about four times faster than a fieldwise implementation of radix sort on the Connection Machine.
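The O(N) sequential bound mentioned above can be illustrated with a counting sort keyed on the integer cell index of each particle. This is only a sketch of the sequential baseline, assuming particles are identified by their position in a list of cell indices; it is not the data-parallel merging algorithm the paper describes, and the function name is hypothetical.

```python
from typing import List


def sort_particles_by_cell(cells: List[int], num_cells: int) -> List[int]:
    """Return particle indices ordered by cell index, in O(N + num_cells) time.

    cells[i] is the integer cell that particle i currently occupies.
    The sort is stable: particles in the same cell keep their relative order.
    """
    # counts[c + 1] first holds the number of particles in cell c.
    counts = [0] * (num_cells + 1)
    for c in cells:
        counts[c + 1] += 1
    # Prefix sum: counts[c] becomes the first output slot for cell c.
    for c in range(num_cells):
        counts[c + 1] += counts[c]
    order = [0] * len(cells)
    for particle, c in enumerate(cells):
        order[counts[c]] = particle
        counts[c] += 1
    return order


cells = [2, 0, 1, 0, 2]
order = sort_particles_by_cell(cells, num_cells=3)
print(order)  # particle indices grouped by cell: [1, 3, 2, 0, 4]
```

Because the keys are small integers, no comparisons are needed, which is why the linear bound is attainable sequentially.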
Xyce parallel electronic simulator : users' guide.
Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Warrender, Christina E.; Keiter, Eric Richard; Pawlowski, Roger Patrick
2011-05-01
This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers; (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques; (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed.
As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.
Parallel Implementation of Power System Dynamic Simulation
Jin, Shuangshuang; Huang, Zhenyu; Diao, Ruisheng; Wu, Di; Chen, Yousu
2013-07-21
Dynamic simulation of power system transient stability is important for planning, monitoring, operation, and control of electrical power systems. However, modeling the system dynamics and network involves the computationally intensive time-domain solution of numerous differential and algebraic equations (DAE). This results in a transient stability implementation that may not maintain the real-time constraints of an online security assessment. This paper presents a parallel implementation of the dynamic simulation on a high-performance computing (HPC) platform using parallel simulation algorithms and computation architectures. It enables the simulation to run even faster than real time, enabling the “look-ahead” capability of upcoming stability problems in the power grid.
Simulating the scheduling of parallel supercomputer applications
Seager, M.K.; Stichnoth, J.M.
1989-09-19
An Event Driven Simulator for Evaluating Multiprocessing Scheduling (EDSEMS) disciplines is presented. The simulator is made up of three components: machine model; parallel workload characterization; and scheduling disciplines for mapping parallel applications (many processes cooperating on the same computation) onto processors. A detailed description of how the simulator is constructed, how to use it and how to interpret the output is also given. Initial results are presented from the simulation of parallel supercomputer workloads using "Dog-Eat-Dog," "Family" and "Gang" scheduling disciplines. These results indicate that Gang scheduling is far better at giving the number of processors that a job requests than Dog-Eat-Dog or Family scheduling. In addition, the system throughput and turnaround time are not adversely affected by this strategy. 10 refs., 8 figs., 1 tab.
Time parallel gravitational collapse simulation
Kreienbuehl, Andreas; Ruprecht, Daniel; Krause, Rolf
2015-01-01
This article demonstrates the applicability of the parallel-in-time method Parareal to the numerical solution of the Einstein gravity equations for the spherical collapse of a massless scalar field. To account for the shrinking of the spatial domain in time, a tailored load balancing scheme is proposed and compared to load balancing based on number of time steps alone. The performance of Parareal is studied for both the sub-critical and black hole case; our experiments show that Parareal generates substantial speedup and, in the super-critical regime, can also reproduce the black hole mass scaling law.
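For readers unfamiliar with Parareal, the iteration can be sketched on a scalar ODE. The example below is a minimal illustration under stated assumptions: forward Euler is used for both the coarse and the fine propagator and the test problem is simple exponential decay, neither of which comes from the article, which treats the Einstein equations with its own discretization. All function names are hypothetical.

```python
from typing import Callable, List

Rhs = Callable[[float, float], float]


def euler(f: Rhs, y: float, t0: float, t1: float, steps: int) -> float:
    """Integrate y' = f(t, y) from t0 to t1 with forward Euler."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        y = y + h * f(t, y)
        t += h
    return y


def serial_fine(f: Rhs, y0: float, t_end: float, slices: int,
                fine_steps: int = 100) -> List[float]:
    """Reference solution: the fine propagator applied slice after slice."""
    ts = [t_end * n / slices for n in range(slices + 1)]
    u = [y0]
    for n in range(slices):
        u.append(euler(f, u[n], ts[n], ts[n + 1], fine_steps))
    return u


def parareal(f: Rhs, y0: float, t_end: float, slices: int, iterations: int,
             coarse_steps: int = 1, fine_steps: int = 100) -> List[float]:
    """Parareal: U[n+1] = G(U_new[n]) + F(U_old[n]) - G(U_old[n])."""
    ts = [t_end * n / slices for n in range(slices + 1)]

    def coarse(y: float, n: int) -> float:
        return euler(f, y, ts[n], ts[n + 1], coarse_steps)

    def fine(y: float, n: int) -> float:
        return euler(f, y, ts[n], ts[n + 1], fine_steps)

    # Initial guess from a cheap serial coarse sweep.
    u = [y0]
    for n in range(slices):
        u.append(coarse(u[n], n))

    for _ in range(iterations):
        # These fine solves are independent, so they could run in parallel.
        fine_vals = [fine(u[n], n) for n in range(slices)]
        coarse_old = [coarse(u[n], n) for n in range(slices)]
        new = [y0]
        for n in range(slices):  # serial coarse correction sweep
            new.append(coarse(new[n], n) + fine_vals[n] - coarse_old[n])
        u = new
    return u


if __name__ == "__main__":
    decay = lambda t, y: -y
    reference = serial_fine(decay, 1.0, 2.0, slices=4)[-1]
    for k in (0, 1, 2, 4):
        approx = parareal(decay, 1.0, 2.0, slices=4, iterations=k)[-1]
        print(k, abs(approx - reference))
```

After as many iterations as there are time slices, the Parareal result coincides with the serial fine solution up to rounding; speedup is only obtained when far fewer iterations suffice.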
Simulation of Advanced Large-Scale HPC Architectures Simulation of Advanced Large-Scale HPC
Engelmann, Christian
Architectures Design Project based on parallel discrete event simulation (PDES) Logical process Virtual compute simulation (PDES) Logical process Virtual compute node Event MPI message Virtual time CPU utilisation time + network delay Global virtual time Frank Lauer 10 / 1 #12;Simulation of Advanced Large-Scale HPC
The Xyce Parallel Electronic Simulator - An Overview
HUTCHINSON,SCOTT A.; KEITER,ERIC R.; HOEKSTRA,ROBERT J.; WATTS,HERMAN A.; WATERS,ARLON J.; SCHELLS,REGINA L.; WIX,STEVEN D.
2000-12-08
The Xyce™ Parallel Electronic Simulator has been written to support the simulation needs of the Sandia National Laboratories electrical designers. As such, the development has focused on providing the capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). In addition, they are providing improved performance for numerical kernels using state-of-the-art algorithms, support for modeling circuit phenomena at a variety of abstraction levels and using object-oriented and modern coding-practices that ensure the code will be maintainable and extensible far into the future. The code is a parallel code in the most general sense of the phrase--a message passing parallel implementation--which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Furthermore, careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved even as the number of processors grows.
Xyce parallel electronic simulator release notes.
Keiter, Eric Richard; Hoekstra, Robert John; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Rankin, Eric Lamont; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Santarelli, Keith R.
2010-05-01
The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. Specific requirements include, among others, the ability to solve extremely large circuit problems by supporting large-scale parallel computing platforms, improved numerical performance and object-oriented code design and implementation. The Xyce release notes describe: hardware and software requirements; new features and enhancements; any defects fixed since the last release; and current known defects and defect workarounds. For up-to-date information not available at the time these notes were produced, please visit the Xyce web page at http://www.cs.sandia.gov/xyce.
Department of Electrical Engineering and Computer Science Discrete Event Systems Group
Tilbury, Dawn
Department of Electrical Engineering and Computer Science 1 Discrete Event Systems Group A Discrete 2000 #12;Department of Electrical Engineering and Computer Science 2 Discrete Event Systems Group of Electrical Engineering and Computer Science 3 Discrete Event Systems Group Requirements for Industrial
A Method for Deadlock Prevention in Discrete Event Systems Using Petri Nets
Antsaklis, Panos
A Method for Deadlock Prevention in Discrete Event Systems Using Petri Nets Technical Report.J. Antsaklis, "A Method for Deadlock Prevention in Discrete Event Systems Using Petri Nets," Technical Report PREVENTION IN DISCRETE EVENT SYSTEMS USING PETRI NETS Marian V. Iordache , John O. Moody , Panos J. Antsaklis
NASA Technical Reports Server (NTRS)
Greenberg, Albert G.; Lubachevsky, Boris D.; Nicol, David M.; Wright, Paul E.
1994-01-01
Fast, efficient parallel algorithms are presented for discrete event simulations of dynamic channel assignment schemes for wireless cellular communication networks. The driving events are call arrivals and departures, in continuous time, to cells geographically distributed across the service area. A dynamic channel assignment scheme decides which call arrivals to accept, and which channels to allocate to the accepted calls, attempting to minimize call blocking while ensuring co-channel interference is tolerably low. Specifically, the scheme ensures that the same channel is used concurrently at different cells only if the pairwise distances between those cells are sufficiently large. Much of the complexity of the system comes from ensuring this separation. The network is modeled as a system of interacting continuous time automata, each corresponding to a cell. To simulate the model, conservative methods are used; i.e., methods in which no errors occur in the course of the simulation and so no rollback or relaxation is needed. Implemented on a 16K processor MasPar MP-1, an elegant and simple technique provides speedups of about 15 times over an optimized serial simulation running on a high speed workstation. A drawback of this technique, typical of conservative methods, is that processor utilization is rather low. To overcome this, new methods were developed that exploit slackness in event dependencies over short intervals of time, thereby raising the utilization to above 50 percent and the speedup over the optimized serial code to about 120 times.
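The separation rule described above (a channel may be reused concurrently only in cells that are far enough apart) can be sketched as a simple admission check. This is a serial illustration with hypothetical names, a Euclidean distance, and a first-fit channel choice, all of which are assumptions; it does not reproduce the paper's parallel conservative algorithms.

```python
import math
from typing import Dict, Optional, Set, Tuple


def can_assign(channel: int, cell: int, in_use: Dict[int, Set[int]],
               positions: Dict[int, Tuple[float, float]],
               reuse_distance: float) -> bool:
    """True if no other cell closer than reuse_distance already uses channel."""
    x, y = positions[cell]
    for other, channels in in_use.items():
        if other != cell and channel in channels:
            ox, oy = positions[other]
            if math.hypot(x - ox, y - oy) < reuse_distance:
                return False
    return True


def accept_call(cell: int, num_channels: int, in_use: Dict[int, Set[int]],
                positions: Dict[int, Tuple[float, float]],
                reuse_distance: float) -> Optional[int]:
    """Handle a call arrival: allocate the first usable channel, or block."""
    busy_here = in_use.setdefault(cell, set())
    for channel in range(num_channels):
        if channel not in busy_here and can_assign(
                channel, cell, in_use, positions, reuse_distance):
            busy_here.add(channel)
            return channel
    return None  # call is blocked


def release_call(cell: int, channel: int, in_use: Dict[int, Set[int]]) -> None:
    """Handle a call departure by freeing its channel."""
    in_use[cell].discard(channel)


# Three cells on a line; cells 0 and 1 are too close to share a channel.
positions = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (5.0, 0.0)}
in_use: Dict[int, Set[int]] = {}
results = [
    accept_call(0, 2, in_use, positions, 2.0),
    accept_call(1, 2, in_use, positions, 2.0),
    accept_call(2, 2, in_use, positions, 2.0),
    accept_call(1, 2, in_use, positions, 2.0),
]
print(results)  # [0, 1, 0, None]
```

The last arrival at cell 1 is blocked because channel 0 is held by the nearby cell 0 and channel 1 is already busy in cell 1 itself.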
Plasma simulation using the massively parallel processor
NASA Technical Reports Server (NTRS)
Lin, C. S.; Thring, A. L.; Koga, J.; Janetzke, R. W.
1987-01-01
Two dimensional electrostatic simulation codes using the particle-in-cell model are developed on the Massively Parallel Processor (MPP). The conventional plasma simulation procedure that computes electric fields at particle positions by means of a gridded system is found to be inefficient on the MPP. The MPP simulation code is thus based on the gridless system in which particles are assigned to processing elements and electric fields are computed directly via Discrete Fourier Transform. Currently, the gridless model on the MPP in two dimensions is about nine times slower than the gridded system on the CRAY X-MP without considering I/O time. However, the gridless system on the MPP can be improved by incorporating a faster I/O between the staging memory and Array Unit and a more efficient procedure for taking floating point sums over processing elements. The initial results suggest that the parallel processors have the potential for performing large scale plasma simulations.
Parallel software simulation using PS-nets
Markov, N.G.; Miroshnichenko, E.A.; Saraikin, A.V. [Tomsk Polytechnical Univ. (Russian Federation)
1995-09-01
The requirements of techniques for parallel software simulation are discussed. According to these requirements, techniques on the basis of PS-nets are suggested. The fundamentals of program system modeling by PS-nets are given. The tools developed for modeling are described.
Parallel Performance of a Combustion Chemistry Simulation
Skinner, Gregg; Eigenmann, Rudolf
1995-01-01
We used a description of a combustion simulation's mathematical and computational methods to develop a version for parallel execution. The result was a reasonable performance improvement on small numbers of processors. We applied several important programming techniques, which we describe, in optimizing the application. This work has implications for programming languages, compiler design, and software engineering.
Performance limitations in parallel processor simulations
O'Grady, E.P.; Wang, C.H.
1987-10-01
A jet-engine model is partitioned and simulated on a parallel processor system consisting of five 8086/8087 floating-point computers. The simulation uses Heun's integration method. A near-optimal parallel simulation (in the sense of minimum execution time) achieves speedup of only 2.13 and efficiency of 42.6 percent, in effect wasting 57.4 percent of the available processing power. A detailed analysis identifies and graphically demonstrates why the system fails to achieve ideal performance (viz., speedup of 5 and efficiency of 100 percent). Inherent characteristics of the problem equations and solution algorithm account for the loss of nearly half of the available processing power. Overheads associated with interprocessor communication and processor synchronization account for only a small fraction of the lost processing power. The effects of these and other factors which limit parallel processor performance are illustrated through real-time timing-analyzer traces describing the run/idle status of the parallel processors during the simulation. 12 references.
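The reported figures are consistent with the usual definition of parallel efficiency as speedup divided by processor count. A minimal check, using only the numbers quoted in the abstract (the function name is an assumption for illustration):

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Efficiency = speedup / number of processors (ideal value is 1.0)."""
    return speedup / processors


speedup = 2.13    # speedup reported in the abstract
processors = 5    # five 8086/8087 computers
efficiency = parallel_efficiency(speedup, processors)
wasted = 1.0 - efficiency
print(f"efficiency = {efficiency:.1%}, unused = {wasted:.1%}")
# prints: efficiency = 42.6%, unused = 57.4%
```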
Distributed, Parallel Simulation of Multiple, Deliberative Agents A.M.Uhrmacher K.Gugler
Hybinette, Maria
DEVS (Discrete Event System Specification) [19], enriched by means to support variable structures. Time some time to decide which action to take, this "reaction" and "deliberation" time is translated of agents in virtual, dynamic environments necessitates relating the simulation time to the actual
Parallel Simulation of Unsteady Turbulent Flames
NASA Technical Reports Server (NTRS)
Menon, Suresh
1996-01-01
Time-accurate simulation of turbulent flames in high Reynolds number flows is a challenging task since both fluid dynamics and combustion must be modeled accurately. To numerically simulate this phenomenon, very large computer resources (both time and memory) are required. Although current vector supercomputers are capable of providing adequate resources for simulations of this nature, their high cost and limited availability make practical use of such machines less than satisfactory. At the same time, the explicit time integration algorithms used in unsteady flow simulations often possess a very high degree of parallelism, making them very amenable to efficient implementation on large-scale parallel computers. Under these circumstances, distributed memory parallel computers offer an excellent near-term solution for greatly increased computational speed and memory, at a cost that may render the unsteady simulations of the type discussed above more feasible and affordable. This paper discusses the study of unsteady turbulent flames using a simulation algorithm that is capable of retaining high parallel efficiency on distributed memory parallel architectures. Numerical studies are carried out using large-eddy simulation (LES). In LES, the scales larger than the grid are computed using a time- and space-accurate scheme, while the unresolved small scales are modeled using eddy viscosity based subgrid models. This is acceptable for the momentum/energy closure since the small scales primarily provide a dissipative mechanism for the energy transferred from the large scales. However, for combustion to occur, the species must first undergo mixing at the small scales and then come into molecular contact. Therefore, global models cannot be used.
Recently, a new model for turbulent combustion was developed, in which the combustion is modeled, within the subgrid (small-scales) using a methodology that simulates the mixing and the molecular transport and the chemical kinetics within each LES grid cell. Finite-rate kinetics can be included without any closure and this approach actually provides a means to predict the turbulent rates and the turbulent flame speed. The subgrid combustion model requires resolution of the local time scales associated with small-scale mixing, molecular diffusion and chemical kinetics and, therefore, within each grid cell, a significant amount of computations must be carried out before the large-scale (LES resolved) effects are incorporated. Therefore, this approach is uniquely suited for parallel processing and has been implemented on various systems such as: Intel Paragon, IBM SP-2, Cray T3D and SGI Power Challenge (PC) using the system independent Message Passing Interface (MPI) compiler. In this paper, timing data on these machines is reported along with some characteristic results.
Parallel and Distributed System Simulation
NASA Technical Reports Server (NTRS)
Dongarra, Jack
1998-01-01
This exploratory study initiated our research into the software infrastructure necessary to support the modeling and simulation techniques that are most appropriate for the Information Power Grid. Such computational power grids will use high-performance networking to connect hardware, software, instruments, databases, and people into a seamless web that supports a new generation of computation-rich problem solving environments for scientists and engineers. In this context we looked at evaluating the NetSolve software environment for network computing that leverages the potential of such systems while addressing their complexities. NetSolve's main purpose is to enable the creation of complex applications that harness the immense power of the grid, yet are simple to use and easy to deploy. NetSolve uses a modular, client-agent-server architecture to create a system that is very easy to use. Moreover, it is designed to be highly composable in that it readily permits new resources to be added by anyone willing to do so. In these respects NetSolve is to the Grid what the World Wide Web is to the Internet. But like the Web, the design that makes these wonderful features possible can also impose significant limitations on the performance and robustness of a NetSolve system. This project explored the design innovations that push the performance and robustness of the NetSolve paradigm as far as possible without sacrificing the Web-like ease of use and composability that make it so powerful.
Control of discrete event systems modeled as hierarchical state machines
NASA Technical Reports Server (NTRS)
Brave, Y.; Heymann, M.
1991-01-01
The authors examine a class of discrete event systems (DESs) modeled as asynchronous hierarchical state machines (AHSMs). For this class of DESs, they provide an efficient method for testing reachability, which is an essential step in many control synthesis procedures. This method utilizes the asynchronous nature and hierarchical structure of AHSMs, thereby illustrating the advantage of the AHSM representation as compared with its equivalent (flat) state machine representation. An application of the method is presented where an online minimally restrictive solution is proposed for the problem of maintaining a controlled AHSM within prescribed legal bounds.
Parallel algorithm strategies for circuit simulation.
Thornquist, Heidi K.; Schiek, Richard Louis; Keiter, Eric Richard
2010-01-01
Circuit simulation tools (e.g., SPICE) have become invaluable in the development and design of electronic circuits. However, they have been pushed to their performance limits in addressing circuit design challenges that come from the technology drivers of smaller feature scales and higher integration. Improving the performance of circuit simulation tools through exploiting new opportunities in widely-available multi-processor architectures is a logical next step. Unfortunately, not all traditional simulation applications are inherently parallel, and quickly adapting mature application codes (even codes designed to parallel applications) to new parallel paradigms can be prohibitively difficult. In general, performance is influenced by many choices: hardware platform, runtime environment, languages and compilers used, algorithm choice and implementation, and more. In this complicated environment, the use of mini-applications, small self-contained proxies for real applications, is an excellent approach for rapidly exploring the parameter space of all these choices. In this report we present a multi-core performance study of Xyce, a transistor-level circuit simulation tool, and describe the future development of a mini-application for circuit simulation.
Xyce parallel electronic simulator : reference guide.
Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Warrender, Christina E.; Keiter, Eric Richard; Pawlowski, Roger Patrick
2011-05-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list exhaustively (to the extent possible) device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide. The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. It is targeted specifically to run on large-scale parallel computing platforms but also runs well on a variety of architectures including single processor workstations. It also aims to support a variety of devices and models specific to Sandia needs. This document is intended to complement the Xyce Users Guide. It contains comprehensive, detailed information about a number of topics pertinent to the usage of Xyce. Included in this document is a netlist reference for the input-file commands and elements supported within Xyce; a command line reference, which describes the available command line arguments for Xyce; and quick-references for users of other circuit codes, such as Orcad's PSpice and Sandia's ChileSPICE.
Parallel node placement method by bubble simulation
NASA Astrophysics Data System (ADS)
Nie, Yufeng; Zhang, Weiwei; Qi, Nan; Li, Yiqiang
2014-03-01
An efficient Parallel Node Placement method by Bubble Simulation (PNPBS), employing METIS-based domain decomposition (DD) for an arbitrary number of processors, is introduced. In accordance with the desired nodal density and Newton’s Second Law of Motion, automatic generation of node sets by bubble simulation has been demonstrated in previous work. Since the interaction force between nodes is short-range, the positions and velocities of two distant nodes can be updated simultaneously and independently during dynamic simulation. This inherent parallelism makes the method well suited to parallel computing. In this PNPBS method, the METIS-based DD scheme has been investigated for uniform and non-uniform node sets, and dynamic load balancing is obtained by evenly distributing work among the processors. For the nodes near the common interface of two neighboring subdomains, there is no need for special treatment after dynamic simulation. These nodes have good geometrical properties and a smooth density distribution which is desirable in the numerical solution of partial differential equations (PDEs). The results of numerical examples show that quasi linear speedup in the number of processors and high efficiency are achieved.
State-space supervision of reconfigurable discrete event systems
Garcia, H.E. [Argonne National Lab., IL (United States); Ray, A. [Pennsylvania State Univ., University Park, PA (United States)
1995-12-31
The Discrete Event Systems (DES) theory of supervisory and state feedback control offers many advantages for implementing supervisory systems. Algorithmic concepts have been introduced to assure that the supervising algorithms are correct and meet the specifications. It is often assumed that the supervisory specifications are invariant, at least until a given supervisory task is completed. However, there are many practical applications where the supervising specifications are updated in real time. For example, in a Reconfigurable Discrete Event System (RDES) architecture, a bank of supervisors is defined to accommodate each identified operational condition or different supervisory specifications. This adaptive supervisory control system changes the supervisory configuration to accept coordinating commands or to adjust for changes in the controlled process. This paper addresses reconfiguration at the supervisory level of hybrid systems along with an underlying RDES architecture. It reviews the state-based supervisory control theory and extends it to the paradigm of RDES in view of process control applications. The paper addresses theoretical issues with a limited number of practical examples. This control approach is particularly suitable for hierarchical reconfigurable hybrid implementations.
Discrete Event Based Simulation and Control of Continuous
in partial fulfilment of the requirements for the degree of Doctor en Ingeniería Director: Sergio Junco Facultad de Ciencias Exactas, Ingeniería y Agrimensura Universidad Nacional de Rosario Abstract would not been involved with teaching and research (and now I would be probably working at the industry
Reactive-Process Programming Distributed Discrete-Event Simulation
enlightenments. To my buddies from UC Davis, Glenn Saito and John Bakos, for their help in my college years peers, Bill Athas, Bill Dally, John Ngai, and Craig Steele, for their help and advice. To my junior co-workers, Nanette Boden, Charles Flaig, Glenn Lewis, Mike Pertel, and Jakov Seizovic, for their feedback and support
Improving ICU patient flow through discrete-event simulation
Christensen, Benjamin A. (Benjamin Arthur)
2012-01-01
Massachusetts General Hospital (MGH), the largest hospital in New England and a national leader in care delivery, teaching, and research, operates ten Intensive Care Units (ICUs), including the 20-bed Ellison 4 Surgical ...
Discrete Event Simulation of Molecular Dynamics with Configurable Logic
Herbordt, Martin
is problematic; 9-12 orders of magnitude more time is needed to model many important biological phenomena, e, covalent This work was supported in part by the NIH through award #RR020209-01 and facilitated by donations from Xilinx Corporation. Web: http://www.bu.edu/caadlab. bonds as infinite barriers, and van der
Incremental Checkpointing with Application to Distributed Discrete Event Simulation
and the following companies: Agilent, DGIST, General Motors, Hewlett Packard, Infineon, Microsoft, and Toyota. editing applications may attempt
Parallel Simulated Annealing for Materialized View Selection in Data Warehousing
Dehene, Frank
Parallel Simulated Annealing for Materialized View Selection in Data Warehousing Environments. Keywords: Parallel Simulated Annealing, Data Warehousing, Materialized view selection. 1 Introduction for selecting an appropriate set of views to materialize which increases the query performance, commonly
Darsim: A Parallel Cycle-Level NoC Simulator
Devadas, Srinivas
2010-01-01
We present DARSIM, a parallel, highly configurable, cycle-level network-on-chip simulator based on an ingress-queued wormhole router architecture. The parallel simulation engine offers cycle-accurate as well as periodic ...
Parallel Strategies for Crash and Impact Simulations
Attaway, S.; Brown, K.; Hendrickson, B.; Plimpton, S.
1998-12-07
We describe a general strategy we have found effective for parallelizing solid mechanics simulations. Such simulations often have several computationally intensive parts, including finite element integration, detection of material contacts, and particle interaction if smoothed particle hydrodynamics is used to model highly deforming materials. The need to balance all of these computations simultaneously is a difficult challenge that has kept many commercial and government codes from being used effectively on parallel supercomputers with hundreds or thousands of processors. Our strategy is to load-balance each of the significant computations independently with whatever balancing technique is most appropriate. The chief benefit is that each computation can be scalably parallelized. The drawback is the data exchange between processors and extra coding that must be written to maintain multiple decompositions in a single code. We discuss these trade-offs and give performance results showing this strategy has led to a parallel implementation of a widely-used solid mechanics code that can now be run efficiently on thousands of processors of the Pentium-based Sandia/Intel TFLOPS machine. We illustrate with several examples the kinds of high-resolution, million-element models that can now be simulated routinely. We also look to the future and discuss what possibilities this new capability promises, as well as the new set of challenges it poses in material models, computational techniques, and computing infrastructure.
BigSim: A Parallel Simulator for Performance Prediction of Extremely Large Parallel Machines
Tropper, Carl
, application developers must spend some time visualizing and analyzing the performance data before the next setBigSim: A Parallel Simulator for Performance Prediction of Extremely Large Parallel Machines at Urbana-Champaign {gzheng, kakulapa, kale}@cs.uiuc.edu Abstract We present a parallel simulator -- Big
Accelerating conservative parallel simulation of VHDL circuits
NASA Astrophysics Data System (ADS)
Hurford, Joel F.
1994-12-01
This research effort considers heuristic and cost model based techniques for the optimal partitioning of VHDL circuits for parallel simulation. Correlation statistics are gathered on a wide variety of graph-based a priori parameters. Linear regression is used to identify significant parameters for inclusion in a representative cost model. Driving a greedy search, this cost model is used to improve upon initial heuristic partitions. The influence of feedback dominated previous research so a no-feedback algorithm is used to create the initial partition. The circuits studied range from 1050 to 4243 gates.
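A cost-model-driven greedy search of the kind the abstract above mentions can be sketched as follows. This is an illustration only: the cost terms (cut edges plus load imbalance), the weights `w_cut` and `w_imbalance` (standing in for regression-fitted coefficients), and the function names are assumptions, not the model from the thesis.

```python
def cost(assign, edges, k, w_cut=1.0, w_imbalance=0.5):
    """Hypothetical cost model: weighted cut edges plus partition imbalance."""
    cut = sum(1 for a, b in edges if assign[a] != assign[b])
    sizes = [0] * k
    for part in assign.values():
        sizes[part] += 1
    return w_cut * cut + w_imbalance * (max(sizes) - min(sizes))


def greedy_refine(assign, edges, k):
    """Move one gate at a time to another partition while the cost drops."""
    assign = dict(assign)  # work on a copy of the initial partition
    best = cost(assign, edges, k)
    improved = True
    while improved:
        improved = False
        for gate in sorted(assign):
            home = assign[gate]
            for target in range(k):
                if target == home:
                    continue
                assign[gate] = target
                trial = cost(assign, edges, k)
                if trial < best:
                    best, home, improved = trial, target, True
            assign[gate] = home  # keep the best placement found for this gate
    return assign, best


# Two triangles of gates joined by a single wire, badly interleaved at first.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
initial = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}
refined, final_cost = greedy_refine(initial, edges, k=2)
print(cost(initial, edges, 2), final_cost, refined)
```

Because every accepted move strictly lowers the cost, the loop terminates; like any greedy search it may stop at a local minimum, which is why the quality of the initial partition matters.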
Implementation of Real-Time Distributed Discrete-Event Execution with Fault Tolerance
Implementation of Real-Time Distributed Discrete-Event Execution with Fault Tolerance Thomas this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers Instruments, and Toyota. Implementation of Real-Time Distributed Discrete-Event Execution with Fault
Conflict-Based Diagnosis of Discrete Event Systems: Theory and Practice Alban Grastien1,2
Thiébaux, Sylvie
Conflict-Based Diagnosis of Discrete Event Systems: Theory and Practice Alban Grastien1,2 and P, Australian National University Abstract We present a conflict-based approach to diagnosing Discrete Event of diagnosis hypotheses, testing hypotheses for consistency, and generating conflicts which rule out successors
Control and Stabilization of Discrete Event Systems with Limited Look-Ahead Policies
Control and Stabilization of Discrete Event Systems with Limited Look-Ahead Policies A Thesis of limited look-ahead policies (LLP) for control of DES is employed and a result for the special case and Stabilization of Discrete Event Systems with Limited Look-Ahead Policies By Veysel Gazi, M.S. The Ohio State
Benefits from semi-asynchronous checkpointing for time warp simulations of a large state PCS model
Andrea Santoro; Francesco Quaglia
2001-01-01
Checkpointing overhead is a major obstacle for the effectiveness of time warp parallel discrete event simulators. Semi-asynchronous checkpointing is a solution to tackle this obstacle for time warp simulations on distributed memory systems based on Myrinet. In this solution, checkpoint operations are offloaded from the host CPU and are charged to a DMA engine on board Myrinet network cards. In
Parallel Proximity Detection for Computer Simulation
NASA Technical Reports Server (NTRS)
Steinman, Jeffrey S. (Inventor); Wieland, Frederick P. (Inventor)
1997-01-01
The present invention discloses a system for performing proximity detection in computer simulations on parallel processing architectures utilizing a distribution list which includes movers and sensor coverages which check in and out of grids. Each mover maintains a list of sensors that detect the mover's motion as the mover and sensor coverages check in and out of the grids. Fuzzy grids are included by fuzzy resolution parameters to allow movers and sensor coverages to check in and out of grids without computing exact grid crossings. The movers check in and out of grids while moving sensors periodically inform the grids of their coverage. In addition, a lookahead function is also included for providing a generalized capability without making any limiting assumptions about the particular application to which it is applied. The lookahead function is initiated so that risk-free synchronization strategies never roll back grid events. The lookahead function adds fixed delays as events are scheduled for objects on other nodes.
Parallel Proximity Detection for Computer Simulations
NASA Technical Reports Server (NTRS)
Steinman, Jeffrey S. (Inventor); Wieland, Frederick P. (Inventor)
1998-01-01
The present invention discloses a system for performing proximity detection in computer simulations on parallel processing architectures utilizing a distribution list which includes movers and sensor coverages which check in and out of grids. Each mover maintains a list of sensors that detect the mover's motion as the mover and sensor coverages check in and out of the grids. Fuzzy grids are included by fuzzy resolution parameters to allow movers and sensor coverages to check in and out of grids without computing exact grid crossings. The movers check in and out of grids while moving sensors periodically inform the grids of their coverage. In addition, a lookahead function is also included for providing a generalized capability without making any limiting assumptions about the particular application to which it is applied. The lookahead function is initiated so that risk-free synchronization strategies never roll back grid events. The lookahead function adds fixed delays as events are scheduled for objects on other nodes.
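The grid check-in idea in the abstract above can be sketched roughly as below. This is not the patented method: the cell size, the `FUZZ` margin (a stand-in for the "fuzzy resolution parameters"), the `LOOKAHEAD` delay, and all function names are hypothetical, and the sketch runs on a single process.

```python
import math
from collections import defaultdict

CELL = 10.0      # grid cell edge length (assumed units)
FUZZ = 1.0       # hypothetical fuzzy margin added around each sensor coverage
LOOKAHEAD = 0.5  # hypothetical fixed delay for events sent to other nodes


def cell_of(x, y):
    """Grid cell index containing the point (x, y)."""
    return (math.floor(x / CELL), math.floor(y / CELL))


def cells_covered(x, y, radius):
    """Cells a sensor checks into: bounding box of its padded coverage."""
    r = radius + FUZZ
    x0, y0 = cell_of(x - r, y - r)
    x1, y1 = cell_of(x + r, y + r)
    return [(i, j) for i in range(x0, x1 + 1) for j in range(y0, y1 + 1)]


def build_grid(sensors):
    """Map each grid cell to the set of sensors checked into it."""
    grid = defaultdict(set)
    for name, (x, y, radius) in sensors.items():
        for c in cells_covered(x, y, radius):
            grid[c].add(name)
    return grid


def detecting_sensors(mover_xy, sensors, grid):
    """Exact range test, but only against sensors sharing the mover's cell."""
    mx, my = mover_xy
    candidates = grid.get(cell_of(mx, my), set())
    return sorted(
        n for n in candidates
        if math.hypot(mx - sensors[n][0], my - sensors[n][1]) <= sensors[n][2]
    )


def remote_event_time(now):
    """Timestamp for an event scheduled on another node, delayed by lookahead."""
    return now + LOOKAHEAD


sensors = {"radar_a": (5.0, 5.0, 8.0), "radar_b": (40.0, 40.0, 3.0)}
grid = build_grid(sensors)
print(detecting_sensors((9.0, 9.0), sensors, grid))
print(detecting_sensors((25.0, 25.0), sensors, grid))
```

The grid limits exact distance tests to sensors checked into the mover's cell, and padding the coverage means a sensor is registered in a cell slightly before an exact crossing would require it.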
Parallel multiscale simulations of a brain aneurysm
Grinberg, Leopold [Division of Applied Mathematics, Brown University, Providence, RI 02912 (United States)] [Division of Applied Mathematics, Brown University, Providence, RI 02912 (United States); Fedosov, Dmitry A. [Institute of Complex Systems and Institute for Advanced Simulation, Forschungszentrum Jülich, Jülich 52425 (Germany)] [Institute of Complex Systems and Institute for Advanced Simulation, Forschungszentrum Jülich, Jülich 52425 (Germany); Karniadakis, George Em, E-mail: george_karniadakis@brown.edu [Division of Applied Mathematics, Brown University, Providence, RI 02912 (United States)
2013-07-01
Cardiovascular pathologies, such as a brain aneurysm, are affected by the global blood circulation as well as by the local microrheology. Hence, developing computational models for such cases requires the coupling of disparate spatial and temporal scales often governed by diverse mathematical descriptions, e.g., by partial differential equations (continuum) and ordinary differential equations for discrete particles (atomistic). However, interfacing atomistic-based with continuum-based domain discretizations is a challenging problem that requires both mathematical and computational advances. We present here a hybrid methodology that enabled us to perform the first multiscale simulations of platelet depositions on the wall of a brain aneurysm. The large scale flow features in the intracranial network are accurately resolved by using the high-order spectral element Navier–Stokes solver NεκTαr. The blood rheology inside the aneurysm is modeled using a coarse-grained stochastic molecular dynamics approach (the dissipative particle dynamics method) implemented in the parallel code LAMMPS. The continuum and atomistic domains overlap with interface conditions provided by effective forces computed adaptively to ensure continuity of states across the interface boundary. A two-way interaction is allowed with the time-evolving boundary of the (deposited) platelet clusters tracked by an immersed boundary method. The corresponding heterogeneous solvers (NεκTαr and LAMMPS) are linked together by a computational multilevel message passing interface that facilitates modularity and high parallel efficiency. Results of multiscale simulations of clot formation inside the aneurysm in a patient-specific arterial tree are presented. We also discuss the computational challenges involved and present scalability results of our coupled solver on up to 300 K computer processors. Validation of such coupled atomistic-continuum models is a main open issue that has to be addressed in future work.
Parallel multiscale simulations of a brain aneurysm
Grinberg, Leopold; Fedosov, Dmitry A.; Karniadakis, George Em
2012-01-01
Cardiovascular pathologies, such as a brain aneurysm, are affected by the global blood circulation as well as by the local microrheology. Hence, developing computational models for such cases requires the coupling of disparate spatial and temporal scales often governed by diverse mathematical descriptions, e.g., by partial differential equations (continuum) and ordinary differential equations for discrete particles (atomistic). However, interfacing atomistic-based with continuum-based domain discretizations is a challenging problem that requires both mathematical and computational advances. We present here a hybrid methodology that enabled us to perform the first multi-scale simulations of platelet depositions on the wall of a brain aneurysm. The large scale flow features in the intracranial network are accurately resolved by using the high-order spectral element Navier-Stokes solver NεκTαr. The blood rheology inside the aneurysm is modeled using a coarse-grained stochastic molecular dynamics approach (the dissipative particle dynamics method) implemented in the parallel code LAMMPS. The continuum and atomistic domains overlap with interface conditions provided by effective forces computed adaptively to ensure continuity of states across the interface boundary. A two-way interaction is allowed with the time-evolving boundary of the (deposited) platelet clusters tracked by an immersed boundary method. The corresponding heterogeneous solvers (NεκTαr and LAMMPS) are linked together by a computational multilevel message passing interface that facilitates modularity and high parallel efficiency. Results of multiscale simulations of clot formation inside the aneurysm in a patient-specific arterial tree are presented. We also discuss the computational challenges involved and present scalability results of our coupled solver on up to 300K computer processors. Validation of such coupled atomistic-continuum models is a main open issue that has to be addressed in future work.
PMID:23734066
Andrea Santoro; Francesco Quaglia
2001-01-01
Checkpointing overhead is a major obstacle for the effectiveness of Time Warp parallel discrete event simulators. Semi-asynchronous checkpointing is a recent solution to tackle this obstacle for Time Warp simulations on distributed memory systems based on Myrinet. In this solution, checkpoint operations are offloaded from the host CPU and are charged to a DMA engine on board of Myrinet network
SPINET: A Parallel Computing Approach to Spine Simulations
Schneider, Jean-Guy
in this paper applies modern scientific computing tools to biomechanical simulations: parallel computingSPINET: A Parallel Computing Approach to Spine Simulations Peter G. Kropf 1 , Edgar F.A. Lederer 2 hand, application driven demands on computing methods and power are continuously growing. Therefore
Optimistic Parallel Simulation of TCP/IP Over ATM Networks
Kansas, University of
Optimistic Parallel Simulation of TCP/IP Over ATM Networks Ming H. Chong Victor S. Frost ITTC on TCP/IP over ATM networks, and compares the performance of a parallel simulator to ProTEuS (a to construct large-scale TCP/IP over ATM scenarios to evaluate the performance of GTW. Results indicate
Partitioning strategies for parallel KIVA-4 engine simulations
Torres, D J; Kong, S C
2008-01-01
Parallel KIVA-4 is described and simulated in four different engine geometries. The Message Passing Interface (MPI) was used to parallelize KIVA-4. Partitioning strategies are assessed in light of the fact that cells can become deactivated and activated during the course of an engine simulation, which will affect the load balance between processors.
On Parallelizing On-Line Statistics for Stochastic Biological Simulations
Troina, Angelo
difficult when the complexity of the biological systems increases. To address these issues, in the lastOn Parallelizing On-Line Statistics for Stochastic Biological Simulations Marco Aldinucci1 , Mario concerns a general technique to enrich parallel version of stochastic simulators for biological systems
Parallelization of Rocket Engine Simulator Software (PRESS)
NASA Technical Reports Server (NTRS)
Cezzar, Ruknet
1997-01-01
The Parallelization of Rocket Engine System Software (PRESS) project is part of a collaborative effort with Southern University at Baton Rouge (SUBR), University of West Florida (UWF), and Jackson State University (JSU). The second-year funding, which supports two graduate students enrolled in our new Master's program in Computer Science at Hampton University and the principal investigator, has been obtained for the period from October 19, 1996 through October 18, 1997. The key part of the interim report was new directions for the second-year funding. This came about from discussions during the Rocket Engine Numeric Simulator (RENS) project meeting in Pensacola on January 17-18, 1997. At that time, a software agreement between Hampton University and NASA Lewis Research Center had already been concluded. That agreement concerns off-NASA-site experimentation with PUMPDES/TURBDES software. Before this agreement, during the first year of the project, another large-scale FORTRAN-based software package, Two-Dimensional Kinetics (TDK), was being used for translation to an object-oriented language and parallelization experiments. However, that package proved to be too complex and lacked sufficient documentation for an effective translation effort to object-oriented C++ source code. The focus, this time with the better documented and more manageable PUMPDES/TURBDES package, was still on translation to C++ with design improvements. At the RENS meeting, however, the impetus for the RENS projects in general, and PRESS in particular, shifted in two important ways. One was closer alignment with the work on the Numerical Propulsion System Simulator (NPSS) through cooperation and collaboration with the LERC ACLU organization. The other was to see whether and how NASA's various rocket design software can be run over local networks and intranets without any radical efforts for redesign and translation into object-oriented source code.
There were also suggestions that the Fortran-based code be encapsulated in C++ code, thereby facilitating reuse without undue development effort. The details are covered in the aforementioned section of the interim report filed on April 28, 1997.
Gregory Provan; Yi-Liang Chen
2000-01-01
Shows the relationship between two discrete event system representations, finite state machines and causal networks. Finite state machine models have been used extensively for the supervisory control of logical (and timed, with some extension) discrete event systems. On the other hand, causal networks have been applied mainly to the diagnosis of discrete event systems. Advances in finite-state-machine based diagnosis and
Parallel methods for dynamic simulation of multiple manipulator systems
NASA Technical Reports Server (NTRS)
Mcmillan, Scott; Sadayappan, P.; Orin, David E.
1993-01-01
In this paper, efficient dynamic simulation algorithms for a system of m manipulators, cooperating to manipulate a large load, are developed; their performance, using two possible forms of parallelism on a general-purpose parallel computer, is investigated. One form, temporal parallelism, is obtained with the use of parallel numerical integration methods. A speedup of 3.78 on four processors of CRAY Y-MP8 was achieved with a parallel four-point block predictor-corrector method for the simulation of a four manipulator system. These multi-point methods suffer from reduced accuracy, and when comparing these runs with a serial integration method, the speedup can be as low as 1.83 for simulations with the same accuracy. To regain the performance lost due to accuracy problems, a second form of parallelism is employed. Spatial parallelism allows most of the dynamics of each manipulator chain to be computed simultaneously. Used exclusively in the four processor case, this form of parallelism in conjunction with a serial integration method results in a speedup of 3.1 on four processors over the best serial method. In cases where there are either more processors available or fewer chains in the system, the multi-point parallel integration methods are still advantageous despite the reduced accuracy because both forms of parallelism can then combine to generate more parallel tasks and achieve greater effective speedups. This paper also includes results for these cases.
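The speedup figures quoted in the abstract above are easier to compare as parallel efficiencies (speedup divided by processor count). The short sketch below does that arithmetic; the function name and the dictionary labels are illustrative, and only the three speedups and the processor count of four come from the abstract.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return speedup divided by processor count (1.0 means ideal scaling)."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Speedups quoted in the abstract, all measured on four processors.
reported = {
    "temporal parallelism (block predictor-corrector)": 3.78,
    "temporal parallelism at equal accuracy": 1.83,
    "spatial parallelism with serial integration": 3.1,
}

for label, speedup in reported.items():
    print(f"{label}: efficiency = {parallel_efficiency(speedup, 4):.2f}")
```

Seen this way, the accuracy-matched temporal scheme uses under half of the four processors effectively, while the spatial scheme retains roughly three quarters, which matches the abstract's motivation for adding spatial parallelism.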
Applications Parallel PIC plasma simulation through particle
Vlad, Gregorio
2000 Abstract Parallelization of a particle-in-cell (PIC) code has been accomplished through for the confinement degradation of the plasma. www.elsevier.com/locate/parco Parallel Computing 27 (2001) 295, each of them representing a cloud of non-mutually interacting physical particles. The mutual
Makino, Jun
Next-Generation Massively Parallel Computers --- Massively Parallel Computer for Particle-based Simulations Junichiro Makino (School of Science, University of Tokyo) 1 Project Organization Subleader Professor, College of Arts and Sciences, University of Tokyo Makoto Taiji Associate Professor, Institute
Parallel-Processing Test Bed For Simulation Software
NASA Technical Reports Server (NTRS)
Blech, Richard; Cole, Gary; Townsend, Scott
1996-01-01
Second-generation Hypercluster computing system is multiprocessor test bed for research on parallel algorithms for simulation in fluid dynamics, electromagnetics, chemistry, and other fields with large computational requirements but relatively low input/output requirements. Built from standard, off-shelf hardware readily upgraded as improved technology becomes available. System used for experiments with such parallel-processing concepts as message-passing algorithms, debugging software tools, and computational steering. First-generation Hypercluster system described in "Hypercluster Parallel Processor" (LEW-15283).
Parallel simulated annealing algorithms for cell placement on hypercube multiprocessors
NASA Technical Reports Server (NTRS)
Banerjee, Prithviraj; Jones, Mark Howard; Sargent, Jeff S.
1990-01-01
Two parallel algorithms for standard cell placement using simulated annealing are developed to run on distributed-memory message-passing hypercube multiprocessors. The cells can be mapped in a two-dimensional area of a chip onto processors in an n-dimensional hypercube in two ways, such that both small and large cell exchange and displacement moves can be applied. The computation of the cost function in parallel among all the processors in the hypercube is described, along with a distributed data structure that needs to be stored in the hypercube to support the parallel cost evaluation. A novel tree broadcasting strategy is used extensively for updating cell locations in the parallel environment. A dynamic parallel annealing schedule estimates the errors due to interacting parallel moves and adapts the rate of synchronization automatically. Two novel approaches in controlling error in parallel algorithms are described: heuristic cell coloring and adaptive sequence control.
Parallel architecture for real-time simulation. Master's thesis
Cockrell, C.D.
1989-01-01
This thesis is concerned with the development of a very fast and highly efficient parallel computer architecture for real-time simulation of continuous systems. Currently, several parallel processing systems exist that may be capable of executing a complex simulation in real-time. These systems are examined and the pros and cons of each system discussed. The thesis then introduces a custom-designed parallel architecture based upon The University of Alabama's OPERA architecture. Each component of this system is discussed and rationale presented for its selection. The problem selected for the test and evaluation of the proposed architecture, real-time simulation of the Space Shuttle Main Engine, is explored, identifying the areas where parallelism can be exploited and parallel processing applied. Results from the test and evaluation phase are presented and compared with the results of the same problem processed on a uniprocessor system.
Parallel Simulation of Ion Recombination in Nonpolar Liquids
Seinstra, Frank J.
Parallel Simulation of Ion Recombination in Nonpolar Liquids Frank J. Seinstra a;1 , Henri E. Bal a Boelelaan 1081, 1081 HV Amsterdam, The Netherlands Abstract Ion recombination in nonpolar liquids is an important problem in radiation chemistry. We have designed and implemented a parallel Monte Carlo
PARALLEL COMPUTER SIMULATION TECHNIQUES FOR THE STUDY OF MACROMOLECULES
Wilson, Mark R.
PARALLEL COMPUTER SIMULATION TECHNIQUES FOR THE STUDY OF MACROMOLECULES Mark R. Wilson and Jaroslav years two important developments in computing have occurred. At the high-cost end of the scale, supercomputers have become parallel computers. The ultra-fast (specialist) processors and the expensive vector-computers
Simulation of electric power systems by parallel computation
K. Schmidt; W. Leonhard
1982-01-01
The aim of dynamic contingency calculations in power systems is to estimate the effects of assumed disturbances, such as loss of generation, faults, etc. Due to the large dimensions of the problem these simulations require considerable computing time and cost, which may be reduced by parallel computation. Some results of a study employing a prototype parallel computer (SMS) are presented.
Parallel Simulation of Electron-Solid Interactions for Electron Microscopy Modeling
Plimpton, Steve
Parallel Simulation of Electron-Solid Interactions for Electron Microscopy Modeling S. J, Monte Carlo, electron, microscopy, random number generation Abstract A parallel implementation Introduction Analytical electron microscopy (AEM) is a tool for characterizing the spatial distribution of ele
Parallel FEM Simulation of Crack Propagation --Challenges, Status, and Perspectives
Stodghill, Paul
Parallel FEM Simulation of Crack Propagation -- Challenges, Status, and Perspectives List and accurate computer simulation of crack propagation in realistic 3D structures would be a valuable tool generation crack propagation simulation software that aims to make this potential a reality. Within the scope
Merging Parallel Simulation Programs Abhishek Agarwal and Maria Hybinette
Hybinette, Maria
simulation cloning as a means of gaining efficiency in the execution of parallel simulations. Simulations Department University of Georgia Athens, GA 30602-7404, USA maria@cs.uga.edu Abstract In earlier work cloning is proposed as a means for efficiently splitting a running simulation midway through its execution
SUPERB: Simulator Utilizing Parallel Evaluation of Resistive Bridges Piet Engelke1
Polian, Ilia
SUPERB: Simulator Utilizing Parallel Evaluation of Resistive Bridges Piet Engelke1 Bettina bridging fault simulator SUPERB (Simulator Utilizing Parallel Evaluation of Resistive Bridges- stuck-at simulation. It outperforms a conventional interval-based resistive bridging fault simulator
Parallel Algorithms for Time and Frequency Domain Circuit Simulation
Dong, Wei
2010-10-12
device model evaluation and matrix solutions. This dissertation also exploits the recently developed explicit telescopic projective integration method for efficient parallel transient circuit simulation by addressing the stability limitation of explicit...
Discrete event command and control for networked teams with multiple missions
NASA Astrophysics Data System (ADS)
Lewis, Frank L.; Hudas, Greg R.; Pang, Chee Khiang; Middleton, Matthew B.; McMurrough, Christopher
2009-05-01
During mission execution in military applications, the TRADOC Pamphlet 525-66 Battle Command and Battle Space Awareness capabilities prescribe expectations that networked teams will perform in a reliable manner under changing mission requirements, varying resource availability and reliability, and resource faults. In this paper, a Command and Control (C2) structure is presented that allows for computer-aided execution of the networked team decision-making process, control of force resources, shared resource dispatching, and adaptability to change based on battlefield conditions. A mathematically justified networked computing environment is provided called the Discrete Event Control (DEC) Framework. DEC has the ability to provide the logical connectivity among all team participants including mission planners, field commanders, war-fighters, and robotic platforms. The proposed data management tools are developed and demonstrated on a simulation study and an implementation on a distributed wireless sensor network. The results show that the tasks of multiple missions are correctly sequenced in real-time, and that shared resources are suitably assigned to competing tasks under dynamically changing conditions without conflicts and bottlenecks.
HPC Infrastructure for Solid Earth Simulation on Parallel Computers
NASA Astrophysics Data System (ADS)
Nakajima, K.; Chen, L.; Okuda, H.
2004-12-01
Recently, various types of parallel computers with various architectures and processing elements (PEs) have emerged, including PC clusters and the Earth Simulator. Moreover, users can easily access these computing resources over the network in a Grid environment. It is well known that programmers must tune thoroughly to achieve excellent performance on each computer, and the tuning method depends strongly on the type of PE and architecture. Optimization by tuning is hard work, especially for application developers. Parallel programming with a message-passing library such as MPI is another large task for application programmers. In the GeoFEM project (http://gefeom.tokyo.rist.or.jp), the authors have developed a parallel FEM platform for solid earth simulation on the Earth Simulator, which supports parallel I/O, parallel linear solvers and parallel visualization. This platform efficiently hides the complicated procedures for parallel programming and optimization on vector processors from application programmers. This type of infrastructure is very useful: source code developed on a single-processor PC is easily optimized for a massively parallel computer by linking it to the parallel platform installed on the target computer. This parallel platform, called HPC Infrastructure, will provide dramatic efficiency, portability and reliability in the development of scientific simulation codes. For example, the source code is expected to be fewer than 10,000 lines, and porting legacy codes to a parallel computer takes 2 or 3 weeks. The original GeoFEM platform supports only I/O, linear solvers and visualization. In the present work, further development for adaptive mesh refinement (AMR) and dynamic load-balancing (DLB) has been carried out. In this presentation, examples of large-scale solid earth simulation using the Earth Simulator will be demonstrated.
Moreover, recent results of a parallel computational steering tool using an MxN communication model will be shown. In an MxN communication model, the large-scale computation modules run on M PEs and the high-performance parallel visualization modules run on N PEs, concurrently. This allows computation and visualization each to select a suitable parallel hardware environment. Meanwhile, real-time steering can be achieved during computation, so that users can check and adjust the computation process in real time. Furthermore, using different numbers of PEs allows a better balance between computation and visualization in a Grid environment.
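The MxN communication model mentioned in this abstract can be illustrated with a minimal sketch. It assumes that the data is a one-dimensional array split into contiguous blocks on both sides, which the abstract does not state; the function names block_ranges and mxn_schedule are hypothetical and are not part of GeoFEM. The sketch only computes which index ranges each of the M computation PEs would send to each of the N visualization PEs.

```python
def block_ranges(n_items, n_parts):
    """Split range(n_items) into n_parts contiguous blocks of near-equal size."""
    base, extra = divmod(n_items, n_parts)
    ranges = []
    start = 0
    for part in range(n_parts):
        size = base + (1 if part < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def mxn_schedule(n_items, m, n):
    """List the (src, dst, start, stop) transfers from M senders to N receivers.

    Assumption: both sides use the block partition above. A transfer exists
    wherever a sender's block overlaps a receiver's block.
    """
    transfers = []
    for src, (s0, s1) in enumerate(block_ranges(n_items, m)):
        for dst, (d0, d1) in enumerate(block_ranges(n_items, n)):
            lo, hi = max(s0, d0), min(s1, d1)
            if lo < hi:  # non-empty overlap
                transfers.append((src, dst, lo, hi))
    return transfers


if __name__ == "__main__":
    for transfer in mxn_schedule(12, 3, 2):
        print(transfer)
```

With 12 items, 3 senders and 2 receivers, the middle sender splits its block between both receivers, which is the case a fixed one-to-one mapping cannot express.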
Xyce parallel electronic simulator : users' guide. Version 5.1.
Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Keiter, Eric Richard; Pawlowski, Roger Patrick
2009-11-01
This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. 
As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.
Xyce Parallel Electronic Simulator : users' guide, version 4.1.
Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Keiter, Eric Richard; Pawlowski, Roger Patrick
2009-02-01
This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. 
As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.
Traffic simulations on parallel computers using domain decomposition techniques
Hanebutte, U.R.; Tentner, A.M.
1995-12-31
Large-scale simulations of Intelligent Transportation Systems (ITS) can only be achieved by using the computing resources offered by parallel computing architectures. Domain decomposition techniques are proposed which allow traffic simulations to be performed with the standard simulation package TRAF-NETSIM on a 128-node IBM SPx parallel supercomputer as well as on a cluster of SUN workstations. Whilst this particular parallel implementation is based on NETSIM, a microscopic traffic simulation model, the presented strategy is applicable to a broad class of traffic simulations. An outer iteration loop must be introduced in order to converge to a global solution. A performance study using a scalable test network that consists of square grids is presented, which addresses the performance penalty introduced by the additional iteration loop.
Beam dynamics simulations using a parallel version of PARMILA
Ryne, R.D.
1996-12-01
The computer code PARMILA has been the primary tool for the design of proton and ion linacs in the United States for nearly three decades. Previously it was sufficient to perform simulations with on the order of 10,000 particles, but recently the need to perform high-resolution halo studies for next-generation, high-intensity linacs has made it necessary to perform simulations with on the order of 100 million particles. With the advent of massively parallel computers such simulations are now within reach. Parallel computers already make it possible, for example, to perform beam dynamics calculations with tens of millions of particles, requiring over 10 GByte of core memory, in just a few hours. Also, parallel computers are becoming easier to use thanks to the availability of mature, Fortran-like languages such as Connection Machine Fortran and High Performance Fortran. We will describe our experience developing a parallel version of PARMILA and the performance of the new code.
An approach to real-time simulation using parallel processing
NASA Technical Reports Server (NTRS)
Blech, R. A.; Arpasi, D. J.
1981-01-01
Current applications of real-time simulations to the development of complex aircraft propulsion system controls have demonstrated the need for accurate, portable, and low-cost simulators. This paper presents a preliminary simulator design that uses a parallel computer organization to provide these features. The hardware and software for this prototype simulator are discussed. A detailed discussion of the inter-computer data transfer mechanism is also presented.
High-performance retargetable simulator for parallel architectures. Technical report
Dellarocas, C.N.
1991-06-01
In this thesis, the authors describe Proteus, a high-performance simulation-based system for the evaluation of parallel algorithms and system software. Proteus is built around a retargetable parallel architecture simulator and a flexible data collection and display component. The simulator uses a combination of simulation and direct execution to achieve high performance, while retaining simulation accuracy. Proteus can be configured to simulate a wide range of shared memory and message passing MIMD architectures and the level of simulation detail can be chosen by the user. Detailed memory, cache and network simulation is supported. Parallel programs can be written using a programming model based on C and a set of runtime system calls for thread and memory management. The system allows nonintrusive monitoring of arbitrary information about an execution, and provides flexible graphical utilities for displaying recorded data. To validate the accuracy of the system, a number of published experiments were reproduced on Proteus. In all cases the results obtained by simulation are very close to those published, a fact that provides support for the reliability of the system. Performance measurements demonstrate that the simulator is one to two orders of magnitude faster than other similar multiprocessor simulators.
Parallel Simulation of Ion Recombination in Nonpolar Liquids
Seinstra, Frank J.
Parallel Simulation of Ion Recombination in Nonpolar Liquids Frank J. Seinstra1 , Henri E. Bal1, 1081 HV Amsterdam The Netherlands Abstract. Ion recombination in nonpolar liquids is an important problem in radiation chemistry. We have designed and implemented a parallel Monte Carlo simulation
An approach to real-time simulation using parallel processing
NASA Technical Reports Server (NTRS)
Blech, R. A.; Arpasi, D. J.
1981-01-01
A preliminary simulator design that uses a parallel computer organization to provide accuracy, portability, and low cost is presented. The hardware and software for this prototype simulator are discussed. A detailed discussion of the inter-computer data transfer mechanism is also presented.
Clustering Algorithms for Parallel Car-Crash Simulation Analysis
Liquan Meil; Clemens Thole
Buckling and certain contact situations cause scattering results of numerical crash simulation: For a BMW model differences between the position of a node in two simulation runs of up to 10 cm were observed, just as a result of round-off differences in the case of parallel computing. An engineer has to measure this scatter, to check whether important parts of
3-D massively parallel impact simulations using PCTH
Fang, H.E.; Robinson, A.C.
1992-01-01
Simulations of hypervelocity impact problems are performed frequently by government laboratories and contractors for armor/anti-armor applications. These simulations need to deal with shock wave physics phenomena, large material deformation, motion of debris particles and complex geometries. As a result, memory and processing time requirements are large for detailed, three-dimensional calculations. The large massively parallel supercomputing systems of the future will provide the power necessary to greatly reduce simulation times currently required by shared-memory, vector supercomputers. This paper gives an introduction to PCTH, a next-generation shock wave physics code which is being built at Sandia National Laboratories for massively parallel supercomputers, and demonstrates that massively parallel hydrocodes, such as PCTH, can provide highly-detailed, three-dimensional simulations of armor/anti-armor systems.
3-D massively parallel impact simulations using PCTH
Fang, H.E.; Robinson, A.C.
1992-12-31
Simulations of hypervelocity impact problems are performed frequently by government laboratories and contractors for armor/anti-armor applications. These simulations need to deal with shock wave physics phenomena, large material deformation, motion of debris particles and complex geometries. As a result, memory and processing time requirements are large for detailed, three-dimensional calculations. The large massively parallel supercomputing systems of the future will provide the power necessary to greatly reduce simulation times currently required by shared-memory, vector supercomputers. This paper gives an introduction to PCTH, a next-generation shock wave physics code which is being built at Sandia National Laboratories for massively parallel supercomputers, and demonstrates that massively parallel hydrocodes, such as PCTH, can provide highly-detailed, three-dimensional simulations of armor/anti-armor systems.
NASA Technical Reports Server (NTRS)
Zeigler, Bernard P.
1989-01-01
It is shown how systems can be advantageously represented as discrete-event models by using DEVS (discrete-event system specification), a set-theoretic formalism. Such DEVS models provide a basis for the design of event-based logic control. In this control paradigm, the controller expects to receive confirming sensor responses to its control commands within definite time windows determined by its DEVS model of the system under control. The event-based control paradigm is applied in advanced robotics and intelligent automation, showing how classical process control can be readily interfaced with rule-based symbolic reasoning systems.
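The time-window idea in this abstract can be sketched in a few lines. This is an illustrative assumption about how such a check might look, not the DEVS formalism itself: the Expectation record, the check_response function and the valve example are all hypothetical names. The controller issues a command and classifies the first matching sensor response as on time, too early, or too late.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expectation:
    """What the controller's model predicts after issuing a command."""
    command: str
    sensor: str
    t_min: float  # earliest plausible response delay
    t_max: float  # latest acceptable response delay


def check_response(expect, issued_at, events):
    """Classify the first matching sensor event after the command.

    events is a list of (time, sensor_name) pairs sorted by time.
    Returns "ok", "early" or "timeout".
    """
    for time, name in events:
        if name != expect.sensor or time < issued_at:
            continue
        elapsed = time - issued_at
        if elapsed < expect.t_min:
            return "early"    # response arrived before the window opened
        if elapsed <= expect.t_max:
            return "ok"       # response arrived inside the window
        return "timeout"      # first response arrived after the window closed
    return "timeout"          # no response at all


if __name__ == "__main__":
    expect = Expectation("open_valve", "valve_open", t_min=1.0, t_max=3.0)
    print(check_response(expect, 10.0, [(12.0, "valve_open")]))
```

An "early" or "timeout" result is the kind of discrepancy a rule-based diagnostic layer could then reason about.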
Parallel Implicit Kinetic Simulation with PARSEK
Markidis Stefano; Lapenta Giovanni
2004-01-01
Kinetic plasma simulation is the ultimate tool for plasma analysis. One of the prime tools for kinetic simulation is the particle in cell (PIC) method. The explicit or semi-implicit (i.e. implicit only on the fields) PIC method requires exceedingly small time steps and grid spacing, limited by the necessity to resolve the electron plasma frequency, the Debye length and the
Parallelization of sequential Gaussian, indicator and direct simulation algorithms
NASA Astrophysics Data System (ADS)
Nunes, Ruben; Almeida, José A.
2010-08-01
Improving the performance and robustness of algorithms on new high-performance parallel computing architectures is a key issue in efficiently performing 2D and 3D studies with large amount of data. In geostatistics, sequential simulation algorithms are good candidates for parallelization. When compared with other computational applications in geosciences (such as fluid flow simulators), sequential simulation software is not extremely computationally intensive, but parallelization can make it more efficient and creates alternatives for its integration in inverse modelling approaches. This paper describes the implementation and benchmarking of a parallel version of the three classic sequential simulation algorithms: direct sequential simulation (DSS), sequential indicator simulation (SIS) and sequential Gaussian simulation (SGS). For this purpose, the source used was GSLIB, but the entire code was extensively modified to take into account the parallelization approach and was also rewritten in the C programming language. The paper also explains in detail the parallelization strategy and the main modifications. Regarding the integration of secondary information, the DSS algorithm is able to perform simple kriging with local means, kriging with an external drift and collocated cokriging with both local and global correlations. SIS includes a local correction of probabilities. Finally, a brief comparison is presented of simulation results using one, two and four processors. All performance tests were carried out on 2D soil data samples. The source code is completely open source and easy to read. It should be noted that the code is only fully compatible with Microsoft Visual C and should be adapted for other systems/compilers.
Direct simulation Monte Carlo analysis on parallel processors
NASA Technical Reports Server (NTRS)
Wilmoth, Richard G.
1989-01-01
A method is presented for executing a direct simulation Monte Carlo (DSMC) analysis using parallel processing. The method is based on using domain decomposition to distribute the work load among multiple processors, and the DSMC analysis is performed completely in parallel. Message passing is used to transfer molecules between processors and to provide the synchronization necessary for the correct physical simulation. Benchmark problems are described for testing the method and results are presented which demonstrate the performance on two commercially available multicomputers. The results show that reasonable parallel speedup and efficiency can be obtained if the problem is properly sized to the number of processors. It is projected that with a massively parallel system, performance exceeding that of current supercomputers is possible.
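The domain-decomposition step described in this abstract can be sketched as follows. The sketch is a serial emulation under several assumptions that are not in the source: a one-dimensional domain cut into equal slabs, reflecting walls, no collisions, and a time step small enough that a particle bounces at most once. The names owner and step are hypothetical. Each slab stands for one processor, and the outbox lists stand for the messages that would carry molecules to their new owners.

```python
def owner(x, n_domains, length):
    """Index of the slab [k*w, (k+1)*w) that contains position x."""
    width = length / n_domains
    return min(int(x // width), n_domains - 1)


def step(domains, dt, length):
    """Move every particle, reflect at the walls, then migrate to new owners.

    domains is a list with one list of (position, velocity) pairs per slab.
    Assumption: |velocity * dt| < length, so at most one wall bounce occurs.
    """
    n_domains = len(domains)
    outbox = [[] for _ in range(n_domains)]  # stands in for message passing
    for particles in domains:
        for x, v in particles:
            x += v * dt
            if x < 0.0:            # bounce off the left wall
                x, v = -x, -v
            elif x > length:       # bounce off the right wall
                x, v = 2.0 * length - x, -v
            outbox[owner(x, n_domains, length)].append((x, v))
    return outbox


if __name__ == "__main__":
    slabs = [[(0.75, 1.0)], [(1.25, -1.0)]]
    print(step(slabs, dt=0.5, length=2.0))
```

In a real message-passing code the exchange would also act as the synchronization point between time steps, which is the role the abstract assigns to it.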
Improved task scheduling for parallel simulations. Master's thesis
McNear, A.E.
1991-12-01
The objective of this investigation is to design, analyze, and validate the generation of optimal schedules for simulation systems. Improved performance in simulation execution times can greatly improve the return rate of information provided by such simulations, resulting in reduced development costs of future computer/electronic systems. Optimal schedule generation of precedence-constrained task systems, including iterative feedback systems such as VHDL or war gaming simulations, for execution on a parallel computer is known to be NP-hard. Efficiently parallelizing such problems takes full advantage of present computer technology to achieve a significant reduction in the search times required. Unfortunately, the extreme combinatoric 'explosion' of possible task assignments to processors creates an exponential search space that is prohibitive on any computer for search algorithms which maintain more than one branch of the search graph at any one time. This work develops various parallel modified backtracking (MBT) search algorithms for execution on an iPSC/2 hypercube that bound the space requirements and produce an optimally minimum schedule with linear speed-up. The parallel MBT search algorithm is validated using various feedback task simulation systems which are scheduled for execution on an iPSC/2 hypercube. The search time, size of the enumerated search space, and communications overhead required to ensure efficient utilization during the parallel search process are analyzed. The various applications indicated appreciable improvement in performance using this method.
Deterministic simulation of idealized parallel computers on more realistic ones
Alt, H.; Hagerup, T.; Mehlhorn, K.; Preparata, F.P.
1987-10-01
The authors describe a nonuniform deterministic simulation of PRAMs on module parallel computers (MPCs) and on processor networks of bounded degree. The simulating machines have the same number n of processors as the simulated PRAM, and if the size of the PRAM's shared memory is polynomial in n, each PRAM step is simulated by O(log n) MPC steps or by O((log n)^2) steps of the bounded-degree network. This improves upon a previous result. The authors also prove an Ω((log n)^2 / log log n) lower bound on the number of steps needed to simulate one PRAM step on a bounded-degree network under the assumption that the communication in the network is point to point. As an important part of the simulation of PRAMs on MPCs, a new technique for dynamically averaging out a given work load among a set of processors operating in parallel is used.
Parallel Performance in Multi-physics Simulation
Kevin Mcmanus; Mark Cross; Chris Walshaw; Nick Croft; Alison Williams
2002-01-01
A comprehensive simulation of solidification/melting processes requires the simultaneous representation of free surface fluid flow, heat transfer, phase change, non-linear solid mechanics and, possibly, electromagnetics together with their interactions in what is now referred to as 'multi-physics' simulation. A 3D computational procedure and software tool, PHYSICA, embedding the above multi-physics models using finite volume methods on unstructured meshes
Efficient parallel simulation of CO2 geologic sequestration in saline aquifers
Zhang, Keni; Doughty, Christine; Wu, Yu-Shu; Pruess, Karsten
2007-01-01
An efficient parallel simulator for large-scale, long-term CO2 geologic sequestration in saline aquifers has been developed. The parallel simulator is a three-dimensional, fully implicit model that solves large, sparse linear systems arising from discretization of the partial differential equations for mass and energy balance in porous and fractured media. The simulator is based on the ECO2N module of the TOUGH2 code and inherits all the process capabilities of the single-CPU TOUGH2 code, including a comprehensive description of the thermodynamics and thermophysical properties of H2O-NaCl-CO2 mixtures, modeling single and/or two-phase isothermal or non-isothermal flow processes, two-phase mixtures, fluid phases appearing or disappearing, as well as salt precipitation or dissolution. The new parallel simulator uses MPI for parallel implementation, the METIS software package for simulation domain partitioning, and the iterative parallel linear solver package Aztec for solving linear equations by multiple processors. In addition, the parallel simulator has been implemented with an efficient communication scheme. Test examples show that a linear or super-linear speedup can be obtained on Linux clusters as well as on supercomputers. Because of the significant improvement in both simulation time and memory requirement, the new simulator provides a powerful tool for tackling larger scale and more complex problems than can be solved by single-CPU codes. A high-resolution simulation example is presented that models buoyant convection, induced by a small increase in brine density caused by dissolution of CO2.
A hybrid parallel framework for the cellular Potts model simulations
Jiang, Yi; He, Kejing; Dong, Shoubin
2009-01-01
The Cellular Potts Model (CPM) has been widely used for biological simulations. However, most current implementations are either sequential or approximated, and cannot be used for large-scale complex 3D simulation. In this paper we present a hybrid parallel framework for CPM simulations. The time-consuming PDE solving, cell division, and cell reaction operations are distributed to clusters using the Message Passing Interface (MPI). The Monte Carlo lattice update is parallelized on shared-memory SMP systems using OpenMP. Because the Monte Carlo lattice update is much faster than the PDE solving and SMP systems are more and more common, this hybrid approach achieves good performance and high accuracy at the same time. Based on the parallel Cellular Potts Model, we studied avascular tumor growth using a multiscale model. The application and performance analysis show that the hybrid parallel framework is quite efficient. The hybrid parallel CPM can be used for large-scale simulation (~10^8 sites) of the complex collective behavior of numerous cells (~10^6).
Catya Zúñiga; Miquel Ángel Piera; Mercedes Narciso
2010-01-01
This paper presents a new challenging modelling approach to support different heuristics to tackle the pallet loading problem (PLP). A discrete event system model to tackle the PLP is specified using the coloured Petri net formalism in order to integrate the model with the industrial context in which the PLP must be solved. New events can be formalised in the
A General Architecture for Decentralized Supervisory Control of Discrete-Event Systems
Tae-sic Yoo; Stéphane Lafortune
2002-01-01
We consider a generalized form of the conventional decentralized control architecture for discrete-event systems where the control actions of a set of supervisors can be “fused” using both union and intersection of enabled events. Namely, the supervisors agree a priori on choosing “fusion by union” for certain controllable events and “fusion by intersection” for certain other controllable events. We show
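The two fusion rules named in this abstract can be illustrated with a minimal sketch. It is a simplification under stated assumptions: every supervisor is treated as voting on every controllable event, whereas the architecture in the paper also involves local observation and per-supervisor control authority. The function name fused_enabled and the event names are hypothetical.

```python
def fused_enabled(decisions, union_events, intersection_events):
    """Combine local enablement decisions into one global decision.

    decisions is a list of sets, one per supervisor, holding the events
    that supervisor currently enables.
    - An event in union_events is enabled if ANY supervisor enables it.
    - An event in intersection_events is enabled only if ALL enable it.
    """
    enabled = set()
    for event in union_events:
        if any(event in local for local in decisions):
            enabled.add(event)
    for event in intersection_events:
        if all(event in local for local in decisions):
            enabled.add(event)
    return enabled


if __name__ == "__main__":
    supervisors = [{"a", "b"}, {"b", "c"}]
    print(sorted(fused_enabled(supervisors, {"a"}, {"b", "c"})))
```

Here "a" passes because one supervisor suffices under fusion by union, "b" passes because both agree, and "c" is blocked because fusion by intersection needs unanimity.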
Protocol Verification Using Discrete-Event Systems Karen Rudie W. Murray Wonham
Protocol Verification Using Discrete-Event Systems Karen Rudie W. Murray Wonham Institute are used to verify the correctness of a protocol for the data transmission problem. In particular, it is demonstrated that our method provides a systematic check on whether the protocol satisfies the required
Safety Control of Discrete Event Systems Using Finite State Machines with Parameters
Lin, Feng
in a relatively short time period is that we adapted a simple model of finite state machines. Because of this, weSafety Control of Discrete Event Systems Using Finite State Machines with Parameters Yi-Liang Chen modeled as finite state machines have been well developed over the years in addressing various fundamental
Signed real measure of regular languages for discrete-event automata Asok Ray and Shashi Phoha
Ray, Asok
Signed real measure of regular languages for discrete-event automata Asok Ray and Shashi Phoha This paper presents the concept and formulation of a signed real measure of regular languages for analysis a partial ordering on a set of controlled sublanguages {Lk} of a regular plant language L, the signed real
Discrete-event requirements model for sensor fusion to provide real-time diagnostic feedback
NASA Astrophysics Data System (ADS)
Rokonuzzaman, Mohd; Gosine, Raymond G.
1998-06-01
Minimally invasive surgical techniques reduce the size of the access corridor and affected zones, resulting in limited real-time perceptual information available to the practitioners. A real-time feedback system is required to offset deficiencies in perceptual information. This feedback system acquires data from multiple sensors and fuses these data to extract pertinent information within defined time windows. To perform this task, a set of computing components interact with each other, resulting in a discrete event dynamic system. In this work, a new discrete event requirements model for sensor fusion has been proposed to ensure logical and temporal correctness of the operation of the real-time diagnostic feedback system. This proposed scheme models system requirements as a Petri net based discrete event dynamic machine. The graphical representation and quantitative analysis of this model have been developed. Having a natural graphical property, this Petri net based model enables the requirements engineer to communicate intuitively with the client to avoid faults in the early phase of the development process. The quantitative analysis helps justify the logical and temporal correctness of the operation of the system. It has been shown that this model can be analyzed to check for deadlock, reachability, and repetitiveness of the operation of the sensor fusion system. This proposed novel technique to model the requirements of sensor fusion as a discrete event dynamic system has the potential to realize highly reliable real-time diagnostic feedback systems for many applications, such as minimally invasive instrumentation.
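The deadlock and reachability checks mentioned in this abstract can be sketched with a small explicit-state search. This is a generic illustration, not the authors' model: the explore function, the place names (idle, acquired, fused) and the three transitions are hypothetical, and the search assumes a bounded net, guarded by a state limit.

```python
from collections import deque


def explore(places, initial, transitions, limit=10_000):
    """Breadth-first search over the markings reachable from the initial one.

    places      : tuple of place names (fixes the order inside a marking)
    initial     : dict place -> token count
    transitions : dict name -> (pre, post), each a dict place -> arc weight
    Returns (set of reachable markings, list of deadlocked markings).
    """
    index = {place: i for i, place in enumerate(places)}
    start = tuple(initial[place] for place in places)
    seen = {start}
    deadlocks = []
    queue = deque([start])
    while queue:
        marking = queue.popleft()
        successors = []
        for pre, post in transitions.values():
            # A transition is enabled if every input place holds enough tokens.
            if all(marking[index[p]] >= k for p, k in pre.items()):
                nxt = list(marking)
                for p, k in pre.items():
                    nxt[index[p]] -= k
                for p, k in post.items():
                    nxt[index[p]] += k
                successors.append(tuple(nxt))
        if not successors:
            deadlocks.append(marking)  # nothing can fire from this marking
        for succ in successors:
            if succ not in seen:
                if len(seen) >= limit:
                    raise RuntimeError("state limit reached; net may be unbounded")
                seen.add(succ)
                queue.append(succ)
    return seen, deadlocks


if __name__ == "__main__":
    places = ("idle", "acquired", "fused")
    net = {
        "acquire": ({"idle": 1}, {"acquired": 1}),
        "fuse": ({"acquired": 1}, {"fused": 1}),
        "report": ({"fused": 1}, {"idle": 1}),
    }
    reachable, dead = explore(places, {"idle": 1, "acquired": 0, "fused": 0}, net)
    print(len(reachable), dead)
```

Removing the report transition from this toy net leaves the marking with one token in fused with nothing enabled, which the search reports as a deadlock.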
Decision making in fuzzy discrete event systems F. Lin a,c,*, H. Ying a
Lin, Feng
with the optimal control problem efficiently. As an application, we apply the approach to HIV/AIDS treatment decision model for HIV/AIDS treatment based on expert's knowledge, treatment guidelines, clinic trials Elsevier Inc. All rights reserved. Keywords: Discrete event systems; Fuzzy logic; Decision making; HIV/AIDS
An Analytic Method for Predicting Simulation Parallelism
Teo, Yong-Meng
and dynamics of the processes are analyzed under the following assumptions: exponential task times and time-stamp of implementation details. We assume that the system to be simulated is modelled as a network of logical processes, and each logical process models a queuing server center. Unlike many analytic models reported
Parallel runway requirement analysis study. Volume 2: Simulation manual
NASA Technical Reports Server (NTRS)
Ebrahimi, Yaghoob S.; Chun, Ken S.
1993-01-01
This document is a user manual for operating the PLAND_BLUNDER (PLB) simulation program. This simulation is based on two aircraft approaching parallel runways independently and using parallel Instrument Landing System (ILS) equipment during Instrument Meteorological Conditions (IMC). If an aircraft should deviate from its assigned localizer course toward the opposite runway, this constitutes a blunder which could endanger the aircraft on the adjacent path. The worst case scenario would be if the blundering aircraft were unable to recover and continue toward the adjacent runway. PLAND_BLUNDER is a Monte Carlo-type simulation which employs the events and aircraft positioning during such a blunder situation. The model simulates two aircraft performing parallel ILS approaches using Instrument Flight Rules (IFR) or visual procedures. PLB uses a simple movement model and control law in three dimensions (X, Y, Z). The parameters of the simulation inputs and outputs are defined in this document along with a sample of the statistical analysis. This document is the second volume of a two volume set. Volume 1 is a description of the application of the PLB to the analysis of close parallel runway operations.
Efficient parallel CFD-DEM simulations using OpenMP
NASA Astrophysics Data System (ADS)
Amritkar, Amit; Deb, Surya; Tafti, Danesh
2014-01-01
The paper describes parallelization strategies for the Discrete Element Method (DEM) used for simulating dense particulate systems coupled to Computational Fluid Dynamics (CFD). While the field equations of CFD are best parallelized by spatial domain decomposition techniques, the N-body particulate phase is best parallelized over the number of particles. When the two are coupled together, both modes are needed for efficient parallelization. It is shown that under these requirements, OpenMP thread based parallelization has advantages over MPI processes. Two representative examples, fairly typical of dense fluid-particulate systems are investigated, including the validation of the DEM-CFD and thermal-DEM implementation with experiments. Fluidized bed calculations are performed on beds with uniform particle loading, parallelized with MPI and OpenMP. It is shown that as the number of processing cores and the number of particles increase, the communication overhead of building ghost particle lists at processor boundaries dominates time to solution, and OpenMP which does not require this step is about twice as fast as MPI. In rotary kiln heat transfer calculations, which are characterized by spatially non-uniform particle distributions, the low overhead of switching the parallelization mode in OpenMP eliminates the load imbalances, but introduces increased overheads in fetching non-local data. In spite of this, it is shown that OpenMP is between 50-90% faster than MPI.
Parallel PDE-Based Simulations Using the Common Component Architecture
McInnes, Lois C.; Allan, Benjamin A.; Armstrong, Robert; Benson, Steven J.; Bernholdt, David E.; Dahlgren, Tamara L.; Diachin, Lori; Krishnan, Manoj Kumar; Kohl, James A.; Larson, J. Walter; Lefantzi, Sophia; Nieplocha, Jarek; Norris, Boyana; Parker, Steven G.; Ray, Jaideep; Zhou, Shujia
2006-03-05
Summary. The complexity of parallel PDE-based simulations continues to increase as multimodel, multiphysics, and multi-institutional projects become widespread. A goal of component-based software engineering in such large-scale simulations is to help manage this complexity by enabling better interoperability among various codes that have been independently developed by different groups. The Common Component Architecture (CCA) Forum is defining a component architecture specification to address the challenges of high-performance scientific computing. In addition, several execution frameworks, supporting infrastructure, and general-purpose components are being developed. Furthermore, this group is collaborating with others in the high-performance computing community to design suites of domain-specific component interface specifications and underlying implementations. This chapter discusses recent work on leveraging these CCA efforts in parallel PDE-based simulations involving accelerator design, climate modeling, combustion, and accidental fires and explosions. We explain how component technology helps to address the different challenges posed by each of these applications, and we highlight how component interfaces built on existing parallel toolkits facilitate the reuse of software for parallel mesh manipulation, discretization, linear algebra, integration, optimization, and parallel data redistribution. We also present performance data to demonstrate the suitability of this approach, and we discuss strategies for applying component technologies to both new and existing applications.
A Parallel Simulation Framework for Integrated Regional Ecosystem Modeling
Dali Wang; Michael W. Berry; Eric A. Carr; Jane Comiskey; Louis J. Gross
2007-01-01
Abstract This paper presents a general framework to utilize high performance computations in regional ecosystem simulation. First, a comprehensive modeling package is introduced to demonstrate the challenges encountered due to the multiple spatial and temporal scales which arise in regional ecosystem modeling. Second, a parallel simulation framework is presented to support multi-component ecosystem modeling on high performance computational platforms. Third, two ecological models
Xyce parallel electronic simulator reference guide, version 6.0.
Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Verley, Jason C.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Warrender, Christina E.; Baur, David G.
2013-08-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide [1]. The focus of this document is to list exhaustively (to the extent possible) the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide [1].
Xyce Parallel Electronic Simulator : reference guide, version 4.1.
Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Keiter, Eric Richard; Pawlowski, Roger Patrick
2009-02-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list exhaustively (to the extent possible) the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide.
Xyce parallel electronic simulator reference guide, version 6.1.
Keiter, Eric R; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Sholander, Peter E.; Thornquist, Heidi K.; Verley, Jason C.; Baur, David Gregory
2014-03-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide [1]. The focus of this document is to list exhaustively (to the extent possible) the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide [1].
Parallel and Distributed Multi-Algorithm Circuit Simulation
Dai, Ruicheng
2012-10-19
$\|x_{k+1} - x^{*}\| \le C\,\|x_{k} - x^{*}\|^{2}$ (3.5) Here $C$ is a constant. Hence, Newton's method has a quadratic convergence rate. When Newton's method is applied in circuit simulation, its Jacobian matrix needs... The obvious difference is that the tangent lines are parallel. Figure 7. Successive Chord method. Compared to Newton-Raphson, the SC method's advantage is that it uses a constant Jacobian matrix $J_{sc}$ in simulation. The Jacobian matrix is constructed...
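The contrast drawn in this excerpt, Newton-Raphson re-evaluating the Jacobian at every iterate versus the successive chord (SC) method reusing one constant Jacobian, can be illustrated on a scalar equation. The following is a minimal sketch only: the test equation `x**3 - 2 = 0`, the starting point, the tolerances, and the function names are illustrative assumptions and are not taken from the thesis.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Newton-Raphson: the derivative (Jacobian) is re-evaluated every iteration."""
    x = x0
    for k in range(1, max_iter + 1):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x, k
    return x, max_iter


def successive_chord(f, df, x0, tol=1e-12, max_iter=500):
    """Successive chord: the derivative is evaluated once and then reused."""
    slope = df(x0)  # constant "Jacobian", fixed at the starting point
    x = x0
    for k in range(1, max_iter + 1):
        step = f(x) / slope
        x -= step
        if abs(step) < tol:
            return x, k
    return x, max_iter


# Hypothetical toy problem (not from the thesis): solve x**3 - 2 = 0.
def f(x):
    return x**3 - 2.0


def df(x):
    return 3.0 * x**2


root_newton, iters_newton = newton(f, df, 1.5)
root_chord, iters_chord = successive_chord(f, df, 1.5)
```

In this toy case the chord iteration needs more iterations than Newton because it converges only linearly, but each iteration avoids rebuilding the derivative; in a circuit simulator that saving would correspond to skipping Jacobian construction and factorization.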
PARALLEL SIMULATION USING THE TIME WARP OPERATING SYSTEM
California at Los Angeles, University of
4800 Oak Grove Drive Pasadena, California 91109 ABSTRACT The Time Warp Operating System runs discrete ... PARALLEL SIMULATION USING THE TIME WARP OPERATING SYSTEM Peter L. Reiher Jet Propulsion Laboratory ... for experimental use. The first half of this tutorial will discuss how to use the Time Warp Operating System
A Framework for Transparent Load Balancing in Parallel Numerical Simulation
Josef Weidendorfer; Peter Luksch
2001-01-01
Load imbalance is the most important factor that limits scalability of parallel applications in scientific computing. Dynamic load balancing at the application level usually is implemented in a proprietary manner. This paper presents a generic framework for application level dynamic load balancing. Our framework can be applied to any grid-based iterative numerical simulation. It defines a programming model that is
Parallel Transient Dynamics Simulations: Algorithms for Contact Detection
Plimpton, Steve
crashes, underwater explosions, and the response of shipping containers to high-speed impacts. Physical ... Parallel Transient Dynamics Simulations: Algorithms for Contact Detection and Smoothed Particle ... challenge is to efficiently detect the contacts that occur within the deforming mesh and between mesh
Leaky Modes in Parallel-Plate EMP Simulators
Ali Rushdi; Ronald Menendez; Raj Mittra; Shung-Wu Lee
1978-01-01
The finite-width parallel-plate waveguide is a useful tool as an EMP simulator, and its characteristics have recently been investigated by a number of workers. In this paper, we report the results of a study of the modal fields in such a waveguide. Once these modal fields and their corresponding wavenumbers are known, the problem of source excitation in such a
Parallel Monte Carlo Ion Recombination Simulation in Orca
Seinstra, Frank J.
Parallel Monte Carlo Ion Recombination Simulation in Orca Frank J. Seinstra Department of Mathematics and Computer Science Vrije Universiteit Amsterdam, The Netherlands August 1996 Abstract: Orca in most languages for distributed programming is based on message passing. In Orca, however, a shared
PARALLEL SIMULATION OF LARGE-SCALE WATER DISTRIBUTION SYSTEMS
Bargiela, Andrzej
PARALLEL SIMULATION OF LARGE-SCALE WATER DISTRIBUTION SYSTEMS J.K. Hartley, A. Bargiela, R.J. Cant in the specific context of water distribution networks. Today's increasing complexity of such systems, together recovery from measurement failure situations. INTRODUCTION Water distribution systems are large
Parallel Computing Environments and Methods for Power Distribution System Simulation
Lu, Ning; Taylor, Zachary T.; Chassin, David P.; Guttromson, Ross T.; Studham, Scott S.
2005-11-10
The development of cost-effective high-performance parallel computing on multi-processor supercomputers makes it attractive to port excessively time-consuming simulation software from personal computers (PCs) to supercomputers. The power distribution system simulator (PDSS) takes a bottom-up approach and simulates load at the appliance level, where detailed thermal models for appliances are used. This approach works well for a small power distribution system consisting of a few thousand appliances. When the number of appliances increases, the simulation uses up the PC memory and its run time increases to a point where the approach is no longer feasible for modeling a practical large power distribution system. This paper presents an effort made to port a PC-based power distribution system simulator (PDSS) to a 128-processor shared-memory supercomputer. The paper offers an overview of the parallel computing environment and a description of the modifications made to the PDSS model. The performance of the PDSS running on a standalone PC and on the supercomputer is compared. Future research directions for utilizing parallel computing in power distribution system simulation are also addressed.
Parallelization Strategies for Large Particle Simulations in Astrophysics
NASA Astrophysics Data System (ADS)
Pattabiraman, Bharath
The modeling of collisional N-body stellar systems is a topic of great current interest in several branches of astrophysics and cosmology. These systems are dominated by the physics of relaxation, the collective effect of many weak, random gravitational encounters between stars. They connect directly to our understanding of star clusters, and to the formation of exotic objects such as X-ray binaries, pulsars, and massive black holes. As a prototypical multi-physics, multi-scale problem, the numerical simulation of such systems is computationally intensive, and can only be achieved through high-performance computing. The goal of this thesis is to present parallelization and optimization strategies that can be used to develop efficient computational tools for simulating collisional N-body systems. This leads to major advances: 1) From an astrophysics perspective, these tools enable the study of new physical regimes out of reach by previous simulations. They also lead to much more complete parameter space exploration, allowing direct comparison of numerical results to observational data. 2) On the high-performance computing front, efficient parallelization of a multi-component application requires the meticulous redesign of the various components, as well as innovative parallelization techniques. Many of the challenges faced in this process lie at the very heart of high-performance computing research, including achieving optimal load balancing, maximizing utilization of computational resources, and making effective use of different parallel platforms. For modeling collisional N-body systems, a Monte Carlo approach provides ideal balance between speed and accuracy, as opposed to the more accurate but less scalable direct N-body method. We describe the development of a new version of the Cluster Monte Carlo (CMC) code capable of simulating systems with a realistic number of stars, while accounting for all important physical processes. 
This efficient and scalable parallel version of CMC runs on both GPUs and distributed-memory architectures. We introduce various parallelization and optimization strategies that include the use of best-suited data structures, adaptive data partitioning schemes, parallel random number generation, parallel I/O, and optimized parallel algorithms, resulting in a very desirable scalability of the run-time with the processor number.
Molecular simulation of rheological properties using massively parallel supercomputers
Bhupathiraju, R.K.; Cui, S.T.; Gupta, S.A.; Cummings, P.T. [Univ. of Tennessee, Knoxville, TN (United States). Dept of Chemical Engineering; Cochran, H.D. [Oak Ridge National Lab., TN (United States)
1996-11-01
Advances in parallel supercomputing now make possible molecular-based engineering and science calculations that will soon revolutionize many technologies, such as those involving polymers and those involving aqueous electrolytes. We have developed a suite of message-passing codes for classical molecular simulation of such complex fluids and amorphous materials and have completed a number of demonstration calculations of problems of scientific and technological importance with each. In this paper, we will focus on the molecular simulation of rheological properties, particularly viscosity, of simple and complex fluids using parallel implementations of non-equilibrium molecular dynamics. Such calculations represent significant challenges computationally because, in order to reduce the thermal noise in the calculated properties within acceptable limits, large systems and/or long simulated times are required.
Random number generators for massively parallel simulations on GPU
Markus Manssen; Martin Weigel; Alexander K. Hartmann
2012-04-27
High-performance streams of (pseudo) random numbers are crucial for the efficient implementation of countless stochastic algorithms, most importantly, Monte Carlo simulations and molecular dynamics simulations with stochastic thermostats. A number of implementations of random number generators have been discussed for GPU platforms before and some generators are even included in the CUDA supporting libraries. Nevertheless, not all of these generators are well suited for highly parallel applications where each thread requires its own generator instance. For this specific situation, encountered, for instance, in simulations of lattice models, most of the high-quality generators with large states such as Mersenne twister cannot be used efficiently without substantial changes. We provide a broad review of existing CUDA variants of random-number generators and present the CUDA implementation of a new massively parallel high-quality, high-performance generator with a small memory load overhead.
PRATHAM: Parallel Thermal Hydraulics Simulations using Advanced Mesoscopic Methods
Joshi, Abhijit S [ORNL] [ORNL; Jain, Prashant K [ORNL] [ORNL; Mudrich, Jaime A [ORNL] [ORNL; Popov, Emilian L [ORNL] [ORNL
2012-01-01
At the Oak Ridge National Laboratory, efforts are under way to develop a 3D, parallel LBM code called PRATHAM (PaRAllel Thermal Hydraulic simulations using Advanced Mesoscopic Methods) to demonstrate the accuracy and scalability of LBM for turbulent flow simulations in nuclear applications. The code has been developed using FORTRAN-90, and parallelized using the message passing interface MPI library. Silo library is used to compact and write the data files, and VisIt visualization software is used to post-process the simulation data in parallel. Both the single relaxation time (SRT) and multi relaxation time (MRT) LBM schemes have been implemented in PRATHAM. To capture turbulence without prohibitively increasing the grid resolution requirements, an LES approach [5] is adopted allowing large scale eddies to be numerically resolved while modeling the smaller (subgrid) eddies. In this work, a Smagorinsky model has been used, which modifies the fluid viscosity by an additional eddy viscosity depending on the magnitude of the rate-of-strain tensor. In LBM, this is achieved by locally varying the relaxation time of the fluid.
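The last two sentences of this abstract describe adding a Smagorinsky eddy viscosity to the fluid viscosity by locally varying the LBM relaxation time. A minimal sketch of that relation follows, assuming standard lattice units (lattice spacing and time step equal to 1, so that viscosity is `(tau - 0.5) / 3`) and assuming the rate-of-strain tensor is already available as a 3x3 list. The function names, the Smagorinsky constant of 0.1, and the filter width are hypothetical choices, not values taken from PRATHAM.

```python
import math


def strain_rate_magnitude(S):
    """|S| = sqrt(2 * S_ij * S_ij) for a symmetric 3x3 rate-of-strain tensor."""
    return math.sqrt(2.0 * sum(S[i][j] ** 2 for i in range(3) for j in range(3)))


def effective_relaxation_time(tau0, S, c_smag=0.1, delta=1.0):
    """Local relaxation time after adding the Smagorinsky eddy viscosity.

    Assumes lattice units, where kinematic viscosity nu = (tau - 0.5) / 3.
    c_smag and delta are illustrative defaults, not PRATHAM settings.
    """
    nu0 = (tau0 - 0.5) / 3.0                                   # molecular viscosity
    nu_t = (c_smag * delta) ** 2 * strain_rate_magnitude(S)    # eddy viscosity
    return 3.0 * (nu0 + nu_t) + 0.5


# Hypothetical inputs: a quiescent cell and a cell under simple shear.
no_strain = [[0.0] * 3 for _ in range(3)]
shear = [[0.0, 0.01, 0.0], [0.01, 0.0, 0.0], [0.0, 0.0, 0.0]]

tau_quiescent = effective_relaxation_time(0.6, no_strain)
tau_sheared = effective_relaxation_time(0.6, shear)
```

With zero strain the relaxation time is unchanged, and it grows with the local strain-rate magnitude, which is how the subgrid model adds dissipation only where the resolved flow is strongly sheared. Production LBM codes often obtain the strain rate from non-equilibrium distribution moments rather than from a precomputed tensor; that step is omitted here.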
Efficient Parallel Algorithm For Direct Numerical Simulation of Turbulent Flows
NASA Technical Reports Server (NTRS)
Moitra, Stuti; Gatski, Thomas B.
1997-01-01
A distributed algorithm for a high-order-accurate finite-difference approach to the direct numerical simulation (DNS) of transition and turbulence in compressible flows is described. This work has two major objectives. The first objective is to demonstrate that parallel and distributed-memory machines can be successfully and efficiently used to solve computationally intensive and input/output intensive algorithms of the DNS class. The second objective is to show that the computational complexity involved in solving the tridiagonal systems inherent in the DNS algorithm can be reduced by algorithm innovations that obviate the need to use a parallelized tridiagonal solver.
Potts-model grain growth simulations: Parallel algorithms and applications
Wright, S.A.; Plimpton, S.J.; Swiler, T.P. [and others
1997-08-01
Microstructural morphology and grain boundary properties often control the service properties of engineered materials. This report uses the Potts-model to simulate the development of microstructures in realistic materials. Three areas of microstructural morphology simulations were studied. They include the development of massively parallel algorithms for Potts-model grain growth simulations, modeling of mass transport via diffusion in these simulated microstructures, and the development of a gradient-dependent Hamiltonian to simulate columnar grain growth. Potts grain growth models for massively parallel supercomputers were developed for the conventional Potts-model in both two and three dimensions. Simulations using these parallel codes showed self-similar grain growth and no finite size effects for previously unapproachable large scale problems. In addition, new enhancements to the conventional Metropolis algorithm used in the Potts-model were developed to accelerate the calculations. These techniques enable both the sequential and parallel algorithms to run faster and use essentially an infinite number of grain orientation values to avoid non-physical grain coalescence events. Mass transport phenomena in polycrystalline materials were studied in two dimensions using numerical diffusion techniques on microstructures generated using the Potts-model. The results of the mass transport modeling showed excellent quantitative agreement with one dimensional diffusion problems; however, the results also suggest that transient multi-dimension diffusion effects cannot be parameterized as the product of the grain boundary diffusion coefficient and the grain boundary width. Instead, both properties are required. Gradient-dependent grain growth mechanisms were included in the Potts-model by adding an extra term to the Hamiltonian. Under normal grain growth, the primary driving term is the curvature of the grain boundary, which is included in the standard Potts-model Hamiltonian.
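For readers unfamiliar with the conventional Metropolis algorithm mentioned in this abstract, a serial sketch of one Potts-model sweep is given below. It is a sketch under stated assumptions only: a periodic square lattice, four nearest neighbours, an energy equal to the number of unlike neighbour pairs, and `q` discrete orientations. It does not reproduce the report's parallel algorithms or its accelerated variants, and all names are hypothetical.

```python
import math
import random


def unlike_neighbors(lattice, i, j, spin):
    """Count the four nearest neighbours of site (i, j) whose value differs from `spin`."""
    n = len(lattice)
    neighbors = (
        lattice[(i - 1) % n][j],
        lattice[(i + 1) % n][j],
        lattice[i][(j - 1) % n],
        lattice[i][(j + 1) % n],
    )
    return sum(1 for s in neighbors if s != spin)


def total_energy(lattice):
    """Number of unlike nearest-neighbour pairs, each bond counted once."""
    n = len(lattice)
    return sum(
        (lattice[i][j] != lattice[(i + 1) % n][j])
        + (lattice[i][j] != lattice[i][(j + 1) % n])
        for i in range(n)
        for j in range(n)
    )


def metropolis_sweep(lattice, q, kT, rng):
    """Attempt n*n single-site orientation changes with Metropolis acceptance."""
    n = len(lattice)
    for _ in range(n * n):
        i, j = rng.randrange(n), rng.randrange(n)
        old, new = lattice[i][j], rng.randrange(q)
        dE = unlike_neighbors(lattice, i, j, new) - unlike_neighbors(lattice, i, j, old)
        # Accept moves that do not raise the energy; otherwise accept with
        # Boltzmann probability (never at kT == 0).
        if dE <= 0 or (kT > 0 and rng.random() < math.exp(-dE / kT)):
            lattice[i][j] = new


rng = random.Random(0)
n, q = 16, 8
lattice = [[rng.randrange(q) for _ in range(n)] for _ in range(n)]
energy_before = total_energy(lattice)
for _ in range(20):
    metropolis_sweep(lattice, q, kT=0.0, rng=rng)
energy_after = total_energy(lattice)
```

At zero temperature only moves that do not raise the boundary energy are accepted, so the total grain-boundary length cannot increase from sweep to sweep; this is the coarsening behaviour that grain growth simulations rely on.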
NASA Technical Reports Server (NTRS)
Goswami, Kumar K.; Iyer, Ravishankar K.
1990-01-01
Discrete event-driven simulation makes it possible to model a computer system in detail. However, such simulation models can require a significant time to execute. This is especially true when modeling large parallel or distributed systems containing many processors and a complex communication network. One solution is to distribute the simulation over several processors. If enough parallelism is achieved, large simulation models can be efficiently executed. This study proposes a distributed simulator called DSIM which can run on various architectures. A simulated test environment is used to verify and characterize the performance of DSIM. The results of the experiments indicate that speedup is application-dependent and, in DSIM's case, is also dependent on how the simulation model is distributed among the processors. Furthermore, the experiments reveal that the communication overhead of ethernet-based distributed systems makes it difficult to achieve reasonable speedup unless the simulation model is computation bound.
Xyce Parallel Electronic Simulator Users Guide Version 6.2.
Keiter, Eric R.; Mei, Ting; Russo, Thomas V.; Schiek, Richard; Sholander, Peter E.; Thornquist, Heidi K.; Verley, Jason; Baur, David Gregory
2014-09-01
This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase -- a message passing parallel implementation -- which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Parallelizing a DNA simulation code for the Cray MTA-2.
Bokhari, Shahid H; Glaser, Matthew A; Jordan, Harry F; Lansac, Yves; Sauer, Jon R; Van Zeghbroeck, Bart
2002-01-01
The Cray MTA-2 (Multithreaded Architecture) is an unusual parallel supercomputer that promises ease of use and high performance. We describe our experience on the MTA-2 with a molecular dynamics code, SIMU-MD, that we are using to simulate the translocation of DNA through a nanopore in a silicon based ultrafast sequencer. Our sequencer is constructed using standard VLSI technology and consists of a nanopore surrounded by Field Effect Transistors (FETs). We propose to use the FETs to sense variations in charge as a DNA molecule translocates through the pore and thus differentiate between the four building block nucleotides of DNA. We were able to port SIMU-MD, a serial C code, to the MTA with only a modest effort and with good performance. Our porting process needed neither a parallelism support platform nor attention to the intimate details of parallel programming and interprocessor communication, as would have been the case with more conventional supercomputers. PMID:15838145
Numerical simulation of supersonic wake flow with parallel computers
Wong, C.C. [Sandia National Labs., Albuquerque, NM (United States); Soetrisno, M. [Amtec Engineering, Inc., Bellevue, WA (United States)
1995-07-01
Simulating a supersonic wake flow field behind a conical body is a computing-intensive task. It requires a large number of computational cells to capture the dominant flow physics and a robust numerical algorithm to obtain a reliable solution. High performance parallel computers with unique distributed processing and data storage capability can meet this need. They have larger computational memory and faster computing time than conventional vector computers. We apply the PINCA Navier-Stokes code to simulate a wind-tunnel supersonic wake experiment on Intel Gamma, Intel Paragon, and IBM SP2 parallel computers. These simulations are performed to study the mean flow in the near wake region of a sharp, 7-degree half-angle, adiabatic cone at Mach number 4.3 and freestream Reynolds number of 40,600. Overall the numerical solutions capture the general features of the hypersonic laminar wake flow and compare favorably with the wind tunnel data. With a refined and clustered grid distribution in the recirculation zone, the calculated location of the rear stagnation point is consistent with the 2D axisymmetric and 3D experiments. In this study, we also demonstrate the importance of having a large local memory capacity within a computer node and the effective utilization of the number of computer nodes to achieve good parallel performance when simulating a complex, large-scale wake flow problem.
Adaptive domain decomposition for Monte Carlo simulations on parallel processors
NASA Technical Reports Server (NTRS)
Wilmoth, Richard G.
1990-01-01
A method is described for performing direct simulation Monte Carlo (DSMC) calculations on parallel processors using adaptive domain decomposition to distribute the computational work load. The method has been implemented on a commercially available hypercube and benchmark results are presented which show the performance of the method relative to current supercomputers. The problems studied were simulations of equilibrium conditions in a closed, stationary box, a two-dimensional vortex flow, and the hypersonic, rarefied flow in a two-dimensional channel. For these problems, the parallel DSMC method ran 5 to 13 times faster than on a single processor of a Cray-2. The adaptive decomposition method worked well in uniformly distributing the computational work over an arbitrary number of processors and reduced the average computational time by over a factor of two in certain cases.
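The abstract does not say how the adaptive decomposition is computed. Purely as an illustration of the idea of redistributing work as the particle distribution changes, the sketch below cuts a one-dimensional row of cells into contiguous blocks with roughly equal particle counts using a running sum. The function names, the 1D layout, and the sample particle counts are all assumptions and should not be read as the paper's method.

```python
def balanced_partition(cell_counts, n_procs):
    """Return block boundaries so contiguous blocks hold roughly equal particle counts.

    bounds[p]:bounds[p + 1] is the cell range owned by processor p. A single
    very heavy cell may leave a neighbouring block empty; this sketch does not
    try to split cells.
    """
    total = sum(cell_counts)
    bounds = [0]
    running = 0
    next_cut = 1
    for idx, count in enumerate(cell_counts):
        running += count
        # Close a block each time the running sum passes the next equal share.
        while next_cut < n_procs and running >= total * next_cut / n_procs:
            bounds.append(idx + 1)
            next_cut += 1
    while len(bounds) < n_procs:
        bounds.append(len(cell_counts))
    bounds.append(len(cell_counts))
    return bounds


def block_loads(cell_counts, bounds):
    """Particles per processor for the given boundaries."""
    return [sum(cell_counts[a:b]) for a, b in zip(bounds, bounds[1:])]


# Hypothetical non-uniform particle distribution over 16 cells.
cell_counts = [5] * 8 + [20] * 8
uniform_bounds = [0, 4, 8, 12, 16]              # equal cells per processor
adaptive_bounds = balanced_partition(cell_counts, 4)

uniform_loads = block_loads(cell_counts, uniform_bounds)
adaptive_loads = block_loads(cell_counts, adaptive_bounds)
```

In this example the busiest processor holds 80 particles under the equal-cell split and 60 under the particle-balanced split. Re-running the partition whenever the counts drift is one simple way to keep the decomposition adaptive.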
Adaptive domain decomposition for Monte Carlo simulations on parallel processors
NASA Technical Reports Server (NTRS)
Wilmoth, Richard G.
1991-01-01
A method is described for performing direct simulation Monte Carlo (DSMC) calculations on parallel processors using adaptive domain decomposition to distribute the computational work load. The method has been implemented on a commercially available hypercube and benchmark results are presented which show the performance of the method relative to current supercomputers. The problems studied were simulations of equilibrium conditions in a closed, stationary box, a two-dimensional vortex flow, and the hypersonic, rarefied flow in a two-dimensional channel. For these problems, the parallel DSMC method ran 5 to 13 times faster than on a single processor of a Cray-2. The adaptive decomposition method worked well in uniformly distributing the computational work over an arbitrary number of processors and reduced the average computational time by over a factor of two in certain cases.
Parallel algorithms for simulating continuous time Markov chains
NASA Technical Reports Server (NTRS)
Nicol, David M.; Heidelberger, Philip
1992-01-01
We have previously shown that the mathematical technique of uniformization can serve as the basis of synchronization for the parallel simulation of continuous-time Markov chains. This paper reviews the basic method and compares five different methods based on uniformization, evaluating their strengths and weaknesses as a function of problem characteristics. The methods vary in their use of optimism, logical aggregation, communication management, and adaptivity. Performance evaluation is conducted on the Intel Touchstone Delta multiprocessor, using up to 256 processors.
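Uniformization, as used in this abstract, replaces the state-dependent holding times of a continuous-time Markov chain with a single Poisson clock whose rate bounds every state's exit rate; ticks that cause no state change are pseudo-events. Because the tick times do not depend on the state, they can be generated ahead of time, which is what makes them usable as synchronization points. The sketch below is a serial illustration under assumed names and a made-up two-state chain; it does not implement any of the five parallel methods compared in the paper.

```python
import random


def uniformization_rate(rates):
    """Smallest valid uniformization rate: the largest total exit rate of any state."""
    return max(sum(out.values()) for out in rates.values())


def simulate_uniformized(rates, state, t_end, rng):
    """Simulate a CTMC by uniformization.

    rates[i] maps each successor j to the transition rate i -> j.
    Returns [(time, state), ...] with one entry per real transition.
    """
    lam = uniformization_rate(rates)
    t = 0.0
    path = [(0.0, state)]
    while True:
        # Tick times form a Poisson process of rate lam, independent of the state.
        t += rng.expovariate(lam)
        if t > t_end:
            return path
        u = rng.random() * lam
        acc = 0.0
        for nxt, rate in rates[state].items():
            acc += rate
            if u < acc:  # real transition, probability rate / lam
                state = nxt
                path.append((t, state))
                break
        # Falling through the loop is a pseudo-event: the state is unchanged.


# Hypothetical two-state chain: 0 -> 1 at rate 1.0, 1 -> 0 at rate 3.0.
rates = {0: {1: 1.0}, 1: {0: 3.0}}
path = simulate_uniformized(rates, 0, 50.0, random.Random(1))
```

State 0 has exit rate 1.0 but is clocked at rate 3.0, so on average two of every three ticks in that state are pseudo-events. That wasted work is the price paid for a state-independent event schedule.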
Kumar, Ratnesh
IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, VOL. XX, NO. Y, MONTH 2001 1 A Discrete Event Systems was supported in part by the National Science Foundation under the grants NSF-ECS-9709796 and NSF-ECS-0099851
Parallel hyperbolic PDE simulation on clusters: Cell versus GPU
NASA Astrophysics Data System (ADS)
Rostrup, Scott; De Sterck, Hans
2010-12-01
Increasingly, high-performance computing is looking towards data-parallel computational devices to enhance computational performance. Two technologies that have received significant attention are IBM's Cell Processor and NVIDIA's CUDA programming model for graphics processing unit (GPU) computing. In this paper we investigate the acceleration of parallel hyperbolic partial differential equation simulation on structured grids with explicit time integration on clusters with Cell and GPU backends. The message passing interface (MPI) is used for communication between nodes at the coarsest level of parallelism. Optimizations of the simulation code at the several finer levels of parallelism that the data-parallel devices provide are described in terms of data layout, data flow and data-parallel instructions. Optimized Cell and GPU performance are compared with reference code performance on a single x86 central processing unit (CPU) core in single and double precision. We further compare the CPU, Cell and GPU platforms on a chip-to-chip basis, and compare performance on single cluster nodes with two CPUs, two Cell processors or two GPUs in a shared memory configuration (without MPI). We finally compare performance on clusters with 32 CPUs, 32 Cell processors, and 32 GPUs using MPI. Our GPU cluster results use NVIDIA Tesla GPUs with GT200 architecture, but some preliminary results on recently introduced NVIDIA GPUs with the next-generation Fermi architecture are also included. This paper provides computational scientists and engineers who are considering porting their codes to accelerator environments with insight into how structured grid based explicit algorithms can be optimized for clusters with Cell and GPU accelerators. It also provides insight into the speed-up that may be gained on current and future accelerator architectures for this class of applications. 
Program summary
Program title: SWsolver
Catalogue identifier: AEGY_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEGY_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GPL v3
No. of lines in distributed program, including test data, etc.: 59 168
No. of bytes in distributed program, including test data, etc.: 453 409
Distribution format: tar.gz
Programming language: C, CUDA
Computer: Parallel Computing Clusters. Individual compute nodes may consist of x86 CPU, Cell processor, or x86 CPU with attached NVIDIA GPU accelerator.
Operating system: Linux
Has the code been vectorised or parallelized?: Yes. Tested on 1-128 x86 CPU cores, 1-32 Cell Processors, and 1-32 NVIDIA GPUs.
RAM: Tested on Problems requiring up to 4 GB per compute node.
Classification: 12
External routines: MPI, CUDA, IBM Cell SDK
Nature of problem: MPI-parallel simulation of Shallow Water equations using high-resolution 2D hyperbolic equation solver on regular Cartesian grids for x86 CPU, Cell Processor, and NVIDIA GPU using CUDA.
Solution method: SWsolver provides 3 implementations of a high-resolution 2D Shallow Water equation solver on regular Cartesian grids, for CPU, Cell Processor, and NVIDIA GPU. Each implementation uses MPI to divide work across a parallel computing cluster.
Additional comments: Sub-program numdiff is used for the test run.
Xyce Parallel Electronic Simulator - Users' Guide Version 2.1.
Hutchinson, Scott A; Hoekstra, Robert J.; Russo, Thomas V.; Rankin, Eric; Pawlowski, Roger P.; Fixel, Deborah A; Schiek, Richard; Bogdan, Carolyn W.; Shirley, David N.; Campbell, Phillip M.; Keiter, Eric R.
2005-06-01
This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas:
- Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers.
- Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques.
- Device models which are specifically tailored to meet Sandia's needs, including many radiation-aware devices.
- Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future.
Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an "in-house" capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed.
As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.
Acknowledgements: The authors would like to acknowledge the entire Sandia National Laboratories HPEMS (High Performance Electrical Modeling and Simulation) team, including Steve Wix, Carolyn Bogdan, Regina Schells, Ken Marx, Steve Brandon and Bill Ballard, for their support on this project. We also appreciate very much the work of Jim Emery, Becky Arnold and Mike Williamson for the help in reviewing this document. Lastly, a very special thanks to Hue Lai for typesetting this document with LaTeX.
Trademarks: The information herein is subject to change without notice. Copyright © 2002-2003 Sandia Corporation. All rights reserved. Xyce™ Electronic Simulator and Xyce™ are trademarks of Sandia Corporation. Orcad, Orcad Capture, PSpice and Probe are registered trademarks of Cadence Design Systems, Inc. Silicon Graphics, the Silicon Graphics logo and IRIX are registered trademarks of Silicon Graphics, Inc. Microsoft, Windows and Windows 2000 are registered trademarks of Microsoft Corporation. Solaris and UltraSPARC are registered trademarks of Sun Microsystems Corporation. Medici, DaVinci and Taurus are registered trademarks of Synopsys Corporation. HP and Alpha are registered trademarks of Hewlett-Packard Company. Amtec and TecPlot are trademarks of Amtec Engineering, Inc. Xyce's expression library is based on that inside Spice 3F5 developed by the EECS Department at the University of California. All other trademarks are property of their respective owners.
Contacts: Bug Reports: http://tvrusso.sandia.gov/bugzilla; Email: xyce-support@sandia.gov; World Wide Web: http://www.cs.sandia.gov/xyce
Particle simulation of plasmas on the massively parallel processor
NASA Technical Reports Server (NTRS)
Gledhill, I. M. A.; Storey, L. R. O.
1987-01-01
Particle simulations, in which collective phenomena in plasmas are studied by following the self consistent motions of many discrete particles, involve several highly repetitive sets of calculations that are readily adaptable to SIMD parallel processing. A fully electromagnetic, relativistic plasma simulation for the massively parallel processor is described. The particle motions are followed in 2 1/2 dimensions on a 128 x 128 grid, with periodic boundary conditions. The two dimensional simulation space is mapped directly onto the processor network; a Fast Fourier Transform is used to solve the field equations. Particle data are stored according to an Eulerian scheme, i.e., the information associated with each particle is moved from one local memory to another as the particle moves across the spatial grid. The method is applied to the study of the nonlinear development of the whistler instability in a magnetospheric plasma model, with an anisotropic electron temperature. The wave distribution function is included as a new diagnostic to allow simulation results to be compared with satellite observations.
MapReduce Parallel Cuckoo Hashing and Oblivious RAM Simulations
Goodrich, Michael T
2010-01-01
We present an efficient algorithm for performing cuckoo hashing in the MapReduce parallel model of computation and we show how this result in turn leads to improved methods for performing data-oblivious RAM simulations. Our contributions involve a number of seemingly unrelated new results, including: a parallel MapReduce cuckoo hashing algorithm that runs in O(log n) time and uses O(n) total work, with very high probability; a reduction of data-oblivious simulation of sparse-streaming MapReduce algorithms to oblivious sorting; an external-memory data-oblivious sorting algorithm using O((N/B) log^2_(M/B) (N/B)) I/Os; constant-memory data-oblivious RAM simulation with O(log^2 n) amortized time overhead, with very high probability, or with expected O(log^2 n) amortized time overhead and better constant factors; sublinear-memory data-oblivious RAM simulation with O(n^nu) private memory and O(log n) amortized time overhead, with very high probability, for constant nu > 0. This last result is, in fact, the main result o...
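For readers unfamiliar with cuckoo hashing, the following is a minimal sequential sketch in Python. It is not the MapReduce algorithm of the abstract and is not data-oblivious; it only illustrates the basic idea that each key has one candidate slot in each of two tables and that an insertion may evict ("kick") an existing key to its alternate slot. The class name `CuckooHash`, the salted use of Python's built-in `hash`, the `max_kicks` limit and the doubling rehash policy are all illustrative assumptions.

```python
import random


class CuckooHash:
    """Sequential cuckoo hash set: two tables, one candidate slot per table."""

    def __init__(self, capacity=16, max_kicks=32, seed=0):
        self.capacity = capacity          # slots per table
        self.max_kicks = max_kicks        # evictions allowed before rehashing
        self.rng = random.Random(seed)
        self._new_salts()
        self.tables = [[None] * capacity, [None] * capacity]

    def _new_salts(self):
        # Two salts stand in for two independent hash functions (assumption).
        self.salts = (self.rng.getrandbits(32), self.rng.getrandbits(32))

    def _slot(self, which, key):
        return hash((self.salts[which], key)) % self.capacity

    def lookup(self, key):
        # A key can only live in one of its two candidate slots.
        return any(self.tables[w][self._slot(w, key)] == key for w in (0, 1))

    def insert(self, key):
        if self.lookup(key):
            return
        which = 0
        for _ in range(self.max_kicks):
            slot = self._slot(which, key)
            # Place the key and pick up whatever occupied the slot before.
            key, self.tables[which][slot] = self.tables[which][slot], key
            if key is None:
                return                    # the slot was empty: done
            which = 1 - which             # re-insert the evicted key elsewhere
        self._rehash(key)                 # too many evictions: rebuild

    def _rehash(self, pending):
        keys = [k for t in self.tables for k in t if k is not None] + [pending]
        self.capacity *= 2
        self._new_salts()
        self.tables = [[None] * self.capacity, [None] * self.capacity]
        for k in keys:
            self.insert(k)
```

Lookups inspect at most two slots, which is the property that makes the structure attractive as a building block; how the insertions are parallelized and made oblivious is the subject of the paper itself.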
Massively Parallel Processing for Fast and Accurate Stamping Simulations
NASA Astrophysics Data System (ADS)
Gress, Jeffrey J.; Xu, Siguang; Joshi, Ramesh; Wang, Chuan-tao; Paul, Sabu
2005-08-01
The competitive automotive market drives automotive manufacturers to speed up vehicle development cycles and reduce lead time. Fast tooling development is one of the key areas supporting fast and short vehicle development programs (VDP). In the past ten years, stamping simulation has become the most effective validation tool for predicting and resolving all potential formability and quality problems before the dies are physically made. Stamping simulation and formability analysis have become a critical business segment in the GM math-based die engineering process. As simulation becomes one of the major production tools in the engineering factory, simulation speed and accuracy are two of the most important measures for stamping simulation technology. The speed and time-in-system of forming analysis become even more critical in supporting fast VDP and tooling readiness. Since 1997, General Motors Die Center has been working jointly with our software vendor to develop and implement a parallel version of simulation software for mass production analysis applications. By 2001, this technology had matured in the form of distributed memory processing (DMP) of draw die simulations in a networked distributed memory computing environment. In 2004, this technology was refined to massively parallel processing (MPP) and extended to line die forming analysis (draw, trim, flange, and associated spring-back) running on a dedicated computing environment. The evolution of this technology and the insight gained through the implementation of DMP/MPP technology, as well as performance benchmarks, are discussed in this publication.
Long-range interactions and parallel scalability in molecular simulations
NASA Astrophysics Data System (ADS)
Patra, Michael; Hyvönen, Marja T.; Falck, Emma; Sabouri-Ghomi, Mohsen; Vattulainen, Ilpo; Karttunen, Mikko
2007-01-01
Typical biomolecular systems such as cellular membranes, DNA, and protein complexes are highly charged. Thus, efficient and accurate treatment of electrostatic interactions is of great importance in computational modeling of such systems. We have employed the GROMACS simulation package to perform extensive benchmarking of different commonly used electrostatic schemes on a range of computer architectures (Pentium-4, IBM Power 4, and Apple/IBM G5) for single processor and parallel performance up to 8 nodes—we have also tested the scalability on four different networks, namely Infiniband, GigaBit Ethernet, Fast Ethernet, and nearly uniform memory architecture, i.e. communication between CPUs is possible by directly reading from or writing to other CPUs' local memory. It turns out that the particle-mesh Ewald method (PME) performs surprisingly well and offers competitive performance unless parallel runs on PC hardware with older network infrastructure are needed. Lipid bilayers of sizes 128, 512 and 2048 lipid molecules were used as the test systems representing typical cases encountered in biomolecular simulations. Our results enable an accurate prediction of computational speed on most current computing systems, both for serial and parallel runs. These results should be helpful in, for example, choosing the most suitable configuration for a small departmental computer cluster.
Gyrokinetic simulations of the parallel velocity shear instability
NASA Astrophysics Data System (ADS)
Findley, T.; McCarthy, D.; Dorland, W.
2004-11-01
A linear theory of the parallel velocity shear instability in a uniformly magnetized slab is developed using the Braginskii Equations. Included in this model are electromagnetic effects, parallel viscosity, a radial density gradient, and a radial gradient of the current density. The growth rates from the analytical theory are compared to the results from the gyrokinetic code, gs2. We find that for Ti ~ Te the growth rates derived from the fluid equations are described well by the results from gs2, thus indicating that kinetic effects do not have a significant effect on the stability of this mode. In an effort to further understand the dynamics of an edge plasma with radial parallel velocity gradients, we perform fully nonlinear fluid simulations of this mode with a nonlocal variation of the equilibrium quantities, as well as fully nonlinear gyrokinetic simulations in the local limit. The similarities and differences of these results are presented, and compared to experimental data, in particular the edge harmonic oscillations from DIII-D.
Plimpton, Steve; Thompson, Aidan; Crozier, Paul
LAMMPS (http://lammps.sandia.gov/index.html) stands for Large-scale Atomic/Molecular Massively Parallel Simulator and is a code that can be used to model atoms or, as the LAMMPS website says, as a parallel particle simulator at the atomic, meso, or continuum scale. This Sandia-based website provides a long list of animations from large simulations. These were created using different visualization packages to read LAMMPS output, and each one provides the name of the PI and a brief description of the work done or visualization package used. See also the static images produced from simulations at http://lammps.sandia.gov/pictures.html The foundation paper for LAMMPS is: S. Plimpton, Fast Parallel Algorithms for Short-Range Molecular Dynamics, J Comp Phys, 117, 1-19 (1995), but the website also lists other papers describing contributions to LAMMPS over the years.
Mapping a battlefield simulation onto message-passing parallel architectures
NASA Technical Reports Server (NTRS)
Nicol, David M.
1987-01-01
Perhaps the most critical problem in distributed simulation is that of mapping: without an effective mapping of workload to processors the speedup potential of parallel processing cannot be realized. Mapping a simulation onto a message-passing architecture is especially difficult when the computational workload dynamically changes as a function of time and space; this is exactly the situation faced by battlefield simulations. This paper studies an approach where the simulated battlefield domain is first partitioned into many regions of equal size; typically there are more regions than processors. The regions are then assigned to processors; a processor is responsible for performing all simulation activity associated with the regions. The assignment algorithm is quite simple and attempts to balance load by exploiting locality of workload intensity. The performance of this technique is studied on a simple battlefield simulation implemented on the Flex/32 multiprocessor. Measurements show that the proposed method achieves reasonable processor efficiencies. Furthermore, the method shows promise for use in dynamic remapping of the simulation.
Numerical techniques for parallel dynamics in electromagnetic gyrokinetic Vlasov simulations
NASA Astrophysics Data System (ADS)
Maeyama, S.; Ishizawa, A.; Watanabe, T.-H.; Nakajima, N.; Tsuji-Iio, S.; Tsutsui, H.
2013-11-01
Numerical techniques for parallel dynamics in electromagnetic gyrokinetic simulations are introduced to regulate unphysical grid-size oscillations in the field-aligned coordinate. It is found that a fixed boundary condition and the nonlinear mode coupling in the field-aligned coordinate, as well as numerical errors of non-dissipative finite difference methods, produce fluctuations with high parallel wave numbers. The theoretical and numerical analyses demonstrate that an outflow boundary condition and a low-pass filter efficiently remove the numerical oscillations, providing small but acceptable errors of the entropy variables. The new method is advantageous for quantitative evaluation of the entropy balance that is required for obtaining a steady state in gyrokinetic turbulence.
Conservative parallel simulation of priority class queueing networks
NASA Technical Reports Server (NTRS)
Nicol, David M.
1990-01-01
A conservative synchronization protocol is described for the parallel simulation of queueing networks having C job priority classes, where a job's class is fixed. This problem has long vexed designers of conservative synchronization protocols because lookahead, the time of the next departure, seems hard to compute: a job in service having low priority can be preempted at any time by an arrival having higher priority and an arbitrarily small service time. The solution is to skew the event generation activity so that the events for higher priority jobs are generated farther ahead in simulated time than lower priority jobs. Thus, when a lower priority job enters service for the first time, all the higher priority jobs that may preempt it are already known and the job's departure time can be exactly predicted. Finally, the protocol was analyzed and it was demonstrated that good performance can be expected on the simulation of large queueing networks.
Development of magnetron sputtering simulator with GPU parallel computing
NASA Astrophysics Data System (ADS)
Sohn, Ilyoup; Kim, Jihun; Bae, Junkyeong; Lee, Jinpil
2014-12-01
Sputtering devices are widely used in the semiconductor and display panel manufacturing process. Currently, a number of surface treatment applications using magnetron sputtering techniques are being used to improve the efficiency of the sputtering process, through the installation of magnets outside the vacuum chamber. Within the internal space of the low pressure chamber, plasma generated from the combination of a rarefied gas and an electric field is influenced interactively. Since the quality of the sputtering and deposition rate on the substrate is strongly dependent on the multi-physical phenomena of the plasma regime, numerical simulations using PIC-MCC (Particle In Cell, Monte Carlo Collision) should be employed to develop an efficient sputtering device. In this paper, the development of a magnetron sputtering simulator based on the PIC-MCC method and the associated numerical techniques are discussed. To solve the electric field equations in the 2-D Cartesian domain, a Poisson equation solver based on the FDM (Finite Differencing Method) is developed and coupled with the Monte Carlo Collision method to simulate the motion of gas particles influenced by an electric field. The magnetic field created from the permanent magnet installed outside the vacuum chamber is also numerically calculated using Biot-Savart's Law. All numerical methods employed in the present PIC code are validated by comparison with analytical and well-known commercial engineering software results, with all of the results showing good agreement. Finally, the developed PIC-MCC code is parallelized to be suitable for general purpose computing on graphics processing unit (GPGPU) acceleration, so as to reduce the large computation time which is generally required for particle simulations. The efficiency and accuracy of the GPGPU parallelized magnetron sputtering simulator are examined by comparison with the calculated results and computation times from the original serial code. 
It is found that initially both simulations are in good agreement; however, differences develop over time due to statistical noise in the PIC-MCC GPGPU model.
Sequential Window Diagnoser for Discrete-Event Systems Under Unreliable Observations
Wen-Chiao Lin; Humberto E. Garcia; David Thorsley; Tae-Sic Yoo
2009-09-01
This paper addresses the issue of counting the occurrence of special events in the framework of partially observed discrete-event dynamical systems (DEDS). The diagnosers developed here, referred to as sequential window diagnosers (SWDs), utilize the stochastic diagnoser probability transition matrices developed in [9] along with a resetting mechanism that allows on-line monitoring of special event occurrences. To illustrate their performance, the SWDs are applied to detect and count the occurrence of special events in a particular DEDS. Results show that SWDs are able to accurately track the number of times special events occur.
Supervisor Localization: A Top-Down Approach to Distributed Control of Discrete-Event Systems
Cai, K.; Wonham, W. M. [Systems Control Group, Department of Electrical and Computer Engineering, University of Toronto, 10 King's College Road, Toronto, ON, M5S 3G4 (Canada)
2009-03-05
A purely distributed control paradigm is proposed for discrete-event systems (DES). In contrast to control by one or more external supervisors, distributed control aims to design built-in strategies for individual agents. First a distributed optimal nonblocking control problem is formulated. To solve it, a top-down localization procedure is developed which systematically decomposes an external supervisor into local controllers while preserving optimality and nonblockingness. An efficient localization algorithm is provided to carry out the computation, and an automated guided vehicles (AGV) example is presented for illustration. Finally, the 'easiest' and 'hardest' boundary cases of localization are discussed.
Niehof, Jonathan T.; Morley, Steven K.
2012-01-01
We review and develop techniques to determine associations between series of discrete events. The bootstrap, a nonparametric statistical method, allows the determination of the significance of associations with minimal assumptions about the underlying processes. We find the key requirement for this method: one of the series must be widely spaced in time to guarantee the theoretical applicability of the bootstrap. If this condition is met, the calculated significance passes a reasonableness test. We conclude with some potential future extensions and caveats on the applicability of these methods. The techniques presented have been implemented in a Python-based software toolkit.
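The abstract does not spell out the procedure, so the following Python sketch is only one plausible reading of a bootstrap association analysis, under stated assumptions: for each event in series A we count the events of series B that fall within a window around it (optionally shifted by a lag), and we then resample those per-event counts with replacement to obtain a confidence interval on the total. Treating the per-event counts as independent is where the "widely spaced" requirement would enter. The function names, the window convention and the percentile interval are illustrative choices, not the toolkit's API.

```python
import random
from bisect import bisect_left, bisect_right


def association_counts(series_a, series_b, lag, half_window):
    """For each event time in A, count B events within lag +/- half_window."""
    b = sorted(series_b)
    counts = []
    for t in series_a:
        lo = bisect_left(b, t + lag - half_window)
        hi = bisect_right(b, t + lag + half_window)
        counts.append(hi - lo)
    return counts


def bootstrap_interval(counts, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the total association number."""
    rng = random.Random(seed)
    n = len(counts)
    # Each surrogate total is built from n per-event counts drawn with replacement.
    totals = sorted(sum(rng.choices(counts, k=n)) for _ in range(n_resamples))
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return totals[lo_idx], totals[hi_idx]
```

Repeating the calculation over a range of lags and comparing the interval near zero lag with the interval at large lags would be one way to judge whether an association is significant.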
High Performance Parallel Methods for Space Weather Simulations
NASA Technical Reports Server (NTRS)
Hunter, Paul (Technical Monitor); Gombosi, Tamas I.
2003-01-01
This is the final report of our NASA AISRP grant entitled 'High Performance Parallel Methods for Space Weather Simulations'. The main thrust of the proposal was to achieve significant progress towards new high-performance methods which would greatly accelerate global MHD simulations and eventually make it possible to develop first-principles based space weather simulations which run much faster than real time. We are pleased to report that with the help of this award we made major progress in this direction and developed the first parallel implicit global MHD code with adaptive mesh refinement. The main limitation of all earlier global space physics MHD codes was the explicit time stepping algorithm. Explicit time steps are limited by the Courant-Friedrichs-Lewy (CFL) condition, which essentially ensures that no information travels more than a cell size during a time step. This condition represents a non-linear penalty for highly resolved calculations, since finer grid resolution (and consequently smaller computational cells) not only results in more computational cells, but also in smaller time steps.
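The penalty described above can be made concrete with a small Python sketch. It assumes the usual form of the CFL bound, dt <= C * dx / (fastest wave speed), on a uniform grid; the function names, the CFL number of 0.8 and the unit domain are illustrative assumptions and are not taken from the report.

```python
def cfl_time_step(dx, max_wave_speed, cfl_number=0.8):
    """Largest stable explicit time step for cell size dx (assumed CFL form)."""
    if dx <= 0 or max_wave_speed <= 0:
        raise ValueError("dx and max_wave_speed must be positive")
    return cfl_number * dx / max_wave_speed


def explicit_cost(cells_per_dim, dims=3, domain=1.0, wave_speed=1.0, t_end=1.0):
    """Approximate cell updates needed to reach t_end on a uniform grid."""
    dx = domain / cells_per_dim
    dt = cfl_time_step(dx, wave_speed)
    n_steps = t_end / dt                  # more steps as the grid is refined
    n_cells = cells_per_dim ** dims       # more cells as the grid is refined
    return n_steps * n_cells
```

Under these assumptions, doubling the resolution in three dimensions multiplies the number of cells by 8 and the number of steps by 2, so `explicit_cost(200) / explicit_cost(100)` is about 16. An implicit scheme is not bound by this step-size limit, which is the motivation given in the report.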
CHOLLA: A New Massively Parallel Hydrodynamics Code for Astrophysical Simulation
NASA Astrophysics Data System (ADS)
Schneider, Evan E.; Robertson, Brant E.
2015-04-01
We present Computational Hydrodynamics On ParaLLel Architectures (Cholla), a new three-dimensional hydrodynamics code that harnesses the power of graphics processing units (GPUs) to accelerate astrophysical simulations. Cholla models the Euler equations on a static mesh using state-of-the-art techniques, including the unsplit Corner Transport Upwind algorithm, a variety of exact and approximate Riemann solvers, and multiple spatial reconstruction techniques including the piecewise parabolic method (PPM). Using GPUs, Cholla evolves the fluid properties of thousands of cells simultaneously and can update over 10 million cells per GPU-second while using an exact Riemann solver and PPM reconstruction. Owing to the massively parallel architecture of GPUs and the design of the Cholla code, astrophysical simulations with physically interesting grid resolutions (≳256^3) can easily be computed on a single device. We use the Message Passing Interface library to extend calculations onto multiple devices and demonstrate nearly ideal scaling beyond 64 GPUs. A suite of test problems highlights the physical accuracy of our modeling and provides a useful comparison to other codes. We then use Cholla to simulate the interaction of a shock wave with a gas cloud in the interstellar medium, showing that the evolution of the cloud is highly dependent on its density structure. We reconcile the computed mixing time of a turbulent cloud with a realistic density distribution destroyed by a strong shock with the existing analytic theory for spherical cloud destruction by describing the system in terms of its median gas density.
Simulation of hypervelocity impact on massively parallel supercomputer
Fang, H.E.
1994-12-31
Hypervelocity impact studies are important for debris shield and armor/anti-armor research and development. Numerical simulations are frequently performed to complement experimental studies, and to evaluate code accuracy. Parametric computational studies involving material properties, geometry and impact velocity can be used to understand hypervelocity impact processes. These impact simulations normally need to address shock wave physics phenomena, material deformation and failure, and motion of debris particles. Detailed, three-dimensional calculations of such events have large memory and processing time requirements. At Sandia National Laboratories, many impact problems of interest require tens of millions of computational cells. Furthermore, even the inadequately resolved problems often require tens or hundreds of Cray CPU hours to complete. Recent numerical studies done by Grady and Kipp at Sandia using the Eulerian shock wave physics code CTH demonstrated very good agreement with many features of a copper sphere-on-steel plate oblique impact experiment, fully utilizing the compute power and memory of Sandia's Cray supercomputer. To satisfy requirements for more finely resolved simulations in order to obtain a better understanding of the crater formation process and impact ejecta motion, the numerical work has been moved from the shared-memory Cray to a large, distributed-memory, massively parallel supercomputing system using PCTH, a parallel version of CTH. The current work is a continuation of the studies, but done on Sandia's Intel 1840-processor Paragon X/PS parallel computer. With the great compute power and large memory provided by the Paragon, a highly detailed PCTH calculation has been completed for the copper sphere impacting steel plate experiment. Although the PCTH calculation used a mesh which is 4.5 times bigger than the original Cray setup, it finished in much less CPU time.
Massively parallel algorithms for trace-driven cache simulations
NASA Technical Reports Server (NTRS)
Nicol, David M.; Greenberg, Albert G.; Lubachevsky, Boris D.
1991-01-01
Trace driven cache simulation is central to computer design. A trace is a very long sequence of reference lines from main memory. At the t-th instant, reference x_t is hashed into a set of cache locations, the contents of which are then compared with x_t. If at the t-th instant x_t is not present in the cache, then it is said to be a miss, and is loaded into the cache set, possibly forcing the replacement of some other memory line, and making x_t present for the (t+1)-st instant. The problem of parallel simulation of a subtrace of N references directed to a C line cache set is considered, with the aim of determining which references are misses and related statistics. A simulation method is presented for the Least Recently Used (LRU) policy, which regardless of the set size C runs in time O(log N) using N processors on the exclusive read, exclusive write (EREW) parallel model. A simpler LRU simulation algorithm is given that runs in O(C log N) time using N/log N processors. Timings are presented of the second algorithm's implementation on the MasPar MP-1, a machine with 16384 processors. A broad class of reference based line replacement policies is considered, which includes LRU as well as the Least Frequently Used and Random replacement policies. A simulation method is presented for any such policy that on any trace of length N directed to a C line set runs in O(C log N) time with high probability using N processors on the EREW model. The algorithms are simple, have very little space overhead, and are well suited for SIMD implementation.
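As a point of reference for the problem statement above, here is a minimal sequential Python sketch of trace-driven simulation of a single C-line cache set under LRU. It is the straightforward one-reference-at-a-time baseline, not the parallel EREW algorithms of the paper; the function name `lru_misses` and the list-of-booleans output are illustrative assumptions.

```python
from collections import OrderedDict


def lru_misses(trace, set_size):
    """Return, for each reference in the trace, True if it misses under LRU."""
    cache = OrderedDict()                 # keys ordered from least to most recent
    misses = []
    for ref in trace:
        if ref in cache:
            cache.move_to_end(ref)        # hit: mark as most recently used
            misses.append(False)
        else:
            misses.append(True)
            if len(cache) >= set_size:
                cache.popitem(last=False) # evict the least recently used line
            cache[ref] = True             # load the missing line
    return misses
```

For the trace `[1, 2, 3, 1, 4, 2]` and a 3-line set, only the fourth reference hits. This loop takes time proportional to N because each step depends on the cache state left by the previous one; removing that sequential dependence is what the parallel algorithms in the abstract address.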
NASA Technical Reports Server (NTRS)
Mizell, Carolyn Barrett; Malone, Linda
2007-01-01
The development process for a large software development project is very complex and dependent on many variables that are dynamic and interrelated. Factors such as size, productivity and defect injection rates will have substantial impact on the project in terms of cost and schedule. These factors can be affected by the intricacies of the process itself as well as human behavior because the process is very labor intensive. The complex nature of the development process can be investigated with software development process models that utilize discrete event simulation to analyze the effects of process changes. The organizational environment and its effects on the workforce can be analyzed with system dynamics that utilizes continuous simulation. Each has unique strengths and the benefits of both types can be exploited by combining a system dynamics model and a discrete event process model. This paper will demonstrate how the two types of models can be combined to investigate the impacts of human resource interactions on productivity and ultimately on cost and schedule.
Data analysis for parallel car-crash simulation results and model optimization
Liquan Mei; Clemens-august Thole
2008-01-01
The paper discusses automotive crash simulation in a stochastic context, addressing the uncertainties in numerical simulation results generated by parallel computing. Since a crash is a non-repeatable phenomenon, qualification for crashworthiness based on a single test is not meaningful, and should be replaced by stochastic simulation. However, stochastic simulations may generate different results on parallel machines, if the same application
A parallel algorithm for switch-level timing simulation on a hypercube multiprocessor
NASA Technical Reports Server (NTRS)
Rao, Hariprasad Nannapaneni
1989-01-01
The parallel approach to speeding up simulation is studied, specifically the simulation of digital LSI MOS circuitry on the Intel iPSC/2 hypercube. The simulation algorithm is based on RSIM, an event driven switch-level simulator that incorporates a linear transistor model for simulating digital MOS circuits. Parallel processing techniques based on the concepts of Virtual Time and rollback are utilized so that portions of the circuit may be simulated on separate processors, in parallel for as large an increase in speed as possible. A partitioning algorithm is also developed in order to subdivide the circuit for parallel processing.
µsik: A Micro-Kernel for Parallel/Distributed Simulation Systems, Kalyan S. Perumalla
Tropper, Carl
µsik A Micro-Kernel for Parallel/Distributed Simulation Systems Kalyan S. Perumalla kalyan micro-kernel approach to building parallel/distributed simulation systems. Using this approach, we of this interface in µsik, which is an efficient parallel/distributed realization of our micro-kernel architecture
Financial simulations on a massively parallel Connection Machine
Hutchinson, J.M.; Zenios, S.A.
1991-01-01
This paper reports on the valuation of complex financial instruments that appear in the banking and insurance industries, which requires simulations of their cashflow behavior in a volatile interest rate environment. These simulations are complex and computationally intensive. Their use, thus far, has been limited to intra-day analysis and planning. Researchers at the Wharton School and Thinking Machines Corporation have developed model formulations for massively parallel architectures, like the Connection Machine CM-2. A library of financial modeling primitives has been designed and used to implement a model for the valuation of mortgage-backed securities. Analyzing a portfolio of these securities, which would require 2 days on a large mainframe, is carried out in 1 hour on a CM-2a.
PARSEC: PARALLEL SELF-CONSISTENT 3D ELECTRON-CLOUD SIMULATION IN ARBITRARY EXTERNAL FIELDS
Furman, Miguel
PARSEC: PARALLEL SELF-CONSISTENT 3D ELECTRON-CLOUD SIMULATION IN ARBITRARY EXTERNAL FIELDS Andreas Adelmann and Miguel A. Furman, LBNL, Berkeley CA 94720, USA Abstract We present PARSEC, a 3D parallel self
Concurrent simulation of a parallel jaw end effector
NASA Technical Reports Server (NTRS)
Bynum, Bill
1985-01-01
A system of programs developed to aid in the design and development of the command/response protocol between a parallel jaw end effector and the strategic planner program controlling it is presented. The system executes concurrently with the LISP controlling program to generate a graphical image of the end effector that moves in approximately real time in response to commands sent from the controlling program. Concurrent execution of the simulation program is useful for revealing flaws in the communication command structure arising from the asynchronous nature of the message traffic between the end effector and the strategic planner. Software simulation helps to minimize the number of hardware changes necessary to the microprocessor driving the end effector because of changes in the communication protocol. The simulation of other actuator devices can be easily incorporated into the system of programs by using the underlying support that was developed for the concurrent execution of the simulation process and the communication between it and the controlling program.
Parallel programming in MIMD type parallel systems using transputer and i860 in physical simulations
S. Ido; S. Hikosaka
1992-01-01
Parallel programming and calculation performance were examined by using two types of MIMD parallel systems, that is, a transputer (T800) network and iPSC/860. Some interface subroutines were developed to apply the programs parallelized by using a transputer network to iPSC/860. Compatibility and performance of parallelized programs are discussed.
Simulation of Distributed Systems Fernando G. Gonzalez
Gonzalez, Fernando
Simulation of Distributed Systems Fernando G. Gonzalez School of Electrical Engineering@pegasus.cc.ucf.edu Keywords: discrete event simulation, single threaded simulation, discrete event control Abstract This paper in a single threaded simulation. It is assumed that the distributed system is described by a collection
Parallel continuous simulated tempering and its applications in large-scale molecular simulations
Zang, Tianwu; Yu, Linglin; Zhang, Chong; Ma, Jianpeng
2014-01-01
In this paper, we introduce a parallel continuous simulated tempering (PCST) method for enhanced sampling in studying large complex systems. It largely inherits the continuous simulated tempering (CST) method from our previous studies [C. Zhang and J. Ma, J. Chem. Phys. 141, 194112 (2009); C. Zhang and J. Ma, J. Chem. Phys. 141, 244101 (2010)], while adopting the spirit of parallel tempering (PT), or the replica exchange method, by employing multiple copies with different temperature distributions. Differing from conventional PT methods, despite the large stride of total temperature range, the PCST method requires very few copies of simulations, typically 2–3 copies, yet it is still capable of maintaining a high rate of exchange between neighboring copies. Furthermore, in the PCST method, the size of the system does not dramatically affect the number of copies needed because the exchange rate is independent of total potential energy, thus providing an enormous advantage over conventional PT methods in studying very large systems. The sampling efficiency of PCST was tested on a two-dimensional Ising model, a Lennard-Jones liquid and an all-atom folding simulation of a small globular protein, trp-cage, in explicit solvent. The results demonstrate that the PCST method significantly improves sampling efficiency compared with other methods and that it is particularly effective in simulating systems with long relaxation or correlation times. We expect the PCST method to be a good alternative to parallel tempering methods in simulating large systems such as phase transition and dynamics of macromolecules in explicit solvent. PMID:25084887
Parallel Numerical Simulation with MUFTEUG: Two Phase Flow Processes in the Subsurface
Cirpka, Olaf Arie
[ 1 ] Parallel Numerical Simulation with MUFTEUG: Two Phase Flow Processes in the Subsurface UUG, a parallel numerical simulator for multiphase flow, is introduced. The basic PDEs for twophase flow together using a 2D example. Keywords Parallelisation, Numerical Simulation, TwoPhaseFlow, Multigrid Methods 1
Parallel grid library for rapid and flexible simulation development
NASA Astrophysics Data System (ADS)
Honkonen, I.; von Alfthan, S.; Sandroos, A.; Janhunen, P.; Palmroth, M.
2013-04-01
We present an easy to use and flexible grid library for developing highly scalable parallel simulations. The distributed cartesian cell-refinable grid (dccrg) supports adaptive mesh refinement and allows an arbitrary C++ class to be used as cell data. The amount of data in grid cells can vary both in space and time allowing dccrg to be used in very different types of simulations, for example in fluid and particle codes. Dccrg transfers the data between neighboring cells on different processes transparently and asynchronously allowing one to overlap computation and communication. This enables excellent scalability at least up to 32 k cores in magnetohydrodynamic tests depending on the problem and hardware. In the version of dccrg presented here part of the mesh metadata is replicated between MPI processes reducing the scalability of adaptive mesh refinement (AMR) to between 200 and 600 processes. Dccrg is free software that anyone can use, study and modify and is available at https://gitorious.org/dccrg. Users are also kindly requested to cite this work when publishing results obtained with dccrg. Catalogue identifier: AEOM_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEOM_v1_0.html Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland Licensing provisions: GNU Lesser General Public License version 3 No. of lines in distributed program, including test data, etc.: 54975 No. of bytes in distributed program, including test data, etc.: 974015 Distribution format: tar.gz Programming language: C++. Computer: PC, cluster, supercomputer. Operating system: POSIX. The code has been parallelized using MPI and tested with 1-32768 processes RAM: 10 MB-10 GB per process Classification: 4.12, 4.14, 6.5, 19.3, 19.10, 20. 
External routines: MPI-2 [1], boost [2], Zoltan [3], sfc++ [4] Nature of problem: Grid library supporting arbitrary data in grid cells, parallel adaptive mesh refinement, transparent remote neighbor data updates and load balancing. Solution method: The simulation grid is represented by an adjacency list (graph) with vertices stored into a hash table and edges into contiguous arrays. Message Passing Interface standard is used for parallelization. Cell data is given as a template parameter when instantiating the grid. Restrictions: Logically cartesian grid. Running time: Running time depends on the hardware, problem and the solution method. Small problems can be solved in under a minute and very large problems can take weeks. The examples and tests provided with the package take less than about one minute using default options. In the version of dccrg presented here the speed of adaptive mesh refinement is at most of the order of 10^6 total created cells per second. References: [1] http://www.mpi-forum.org/. [2] http://www.boost.org/. [3] K. Devine, E. Boman, R. Heaphy, B. Hendrickson, C. Vaughan, Zoltan data management services for parallel dynamic applications, Comput. Sci. Eng. 4 (2002) 90-97. http://dx.doi.org/10.1109/5992.988653. [4] https://gitorious.org/sfc++.
Parallel algorithm for spin and spin-lattice dynamics simulations
NASA Astrophysics Data System (ADS)
Ma, Pui-Wai; Woo, C. H.
2009-04-01
Controlling the numerical errors accumulated over tens of millions of time steps during the integration of a set of highly coupled equations of motion is not a trivial task. In this paper, we propose a parallel algorithm for spin dynamics and the newly developed spin-lattice dynamics simulation [P. W. Ma, Phys. Rev. B 78, 024434 (2008)]. The algorithm is successfully tested in both types of dynamic calculations involving a million spins. It shows good stability and numerical accuracy over millions of time steps (~1 ns). The scheme is based on the second-order Suzuki-Trotter decomposition (STD). Its use avoids numerical energy dissipation despite trajectory and machine errors. The mathematical basis of the symplecticity, for properly decomposed evolution operators, is presented. Due to the noncommutative nature of the spin in the present STD scheme, a unique parallel algorithm is needed. The efficiency and stability are tested. The algorithm attains a six- to sevenfold speedup when eight threads are used. The run time per time step is linearly proportional to the system size.
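As a rough illustration of the second-order Suzuki-Trotter idea mentioned in this abstract, the following serial Python sketch rotates each spin in turn about its effective field, sweeping forward and then backward with half time steps. The model is an assumption made for the example only: classical unit spins on an open chain, nearest-neighbour exchange, reduced units, and the equation of motion dS_i/dt = H_i × S_i. The function names are hypothetical, and this is not the parallel algorithm of the paper.

```python
import math


def rotate(v, field, dt):
    """Rotate v about the direction of `field` by the angle |field| * dt (Rodrigues formula)."""
    h = math.sqrt(sum(c * c for c in field))
    if h == 0.0:
        return list(v)
    k = [c / h for c in field]
    theta = h * dt
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_dot_v = sum(a * b for a, b in zip(k, v))
    k_cross_v = [k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0]]
    return [v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
            for i in range(3)]


def effective_field(spins, i, coupling):
    """Exchange field on spin i from its nearest neighbours (open chain, assumed model)."""
    field = [0.0, 0.0, 0.0]
    for j in (i - 1, i + 1):
        if 0 <= j < len(spins):
            field = [f + coupling * s for f, s in zip(field, spins[j])]
    return field


def std2_step(spins, coupling, dt):
    """One symmetric step: half-step sweep over spins 0..N-1, then N-1..0."""
    n = len(spins)
    for i in list(range(n)) + list(range(n - 1, -1, -1)):
        spins[i] = rotate(spins[i], effective_field(spins, i, coupling), dt / 2.0)
    return spins


def energy(spins, coupling):
    """Heisenberg exchange energy of the open chain."""
    return -coupling * sum(sum(a * b for a, b in zip(spins[i], spins[i + 1]))
                           for i in range(len(spins) - 1))
```

Because every sub-step is an exact rotation about a field that does not depend on the spin being rotated, spin lengths and the exchange energy of this toy model are preserved up to rounding error. The sequential dependence between neighbouring updates also hints at why a naive parallelization of such a sweep is not straightforward.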
Exception handling controllers: An application of pushdown systems to discrete event control
Griffin, Christopher H
2008-01-01
Recent work by the author has extended the Supervisory Control Theory to include the class of control languages defined by pushdown machines. A pushdown machine is a finite state machine extended by an infinite stack memory. In this paper, we define a specific type of deterministic pushdown machine that is particularly useful as a discrete event controller. Checking controllability of pushdown machines requires computing the complement of the controller machine. We show that Exception Handling Controllers have the property that algorithms for taking their complements and determining their prefix closures are nearly identical to the algorithms available for finite state machines. Further, they exhibit an important property that makes checking for controllability extremely simple. Hence, they maintain the simplicity of the finite state machine, while providing the extra power associated with a pushdown stack memory. We provide an example of a useful control specification that cannot be implemented using a finite state machine, but can be implemented using an Exception Handling Controller.
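To make the definition of a pushdown machine concrete, here is a minimal Python sketch of a deterministic one. It accepts strings of the form a^n b^n (n >= 1), a language that no finite state machine can recognize. The state names, alphabet and function name are illustrative assumptions; the sketch does not reproduce the Exception Handling Controller construction described in the abstract.

```python
def accepts_anbn(word):
    """Deterministic pushdown machine for { a^n b^n : n >= 1 } (illustrative only)."""
    stack = []        # unbounded stack memory
    state = "push"    # finite control: "push" while reading a's, "pop" while reading b's
    for symbol in word:
        if state == "push" and symbol == "a":
            stack.append("A")      # remember one 'a'
        elif symbol == "b" and stack:
            stack.pop()            # match one 'b' against a stored 'a'
            state = "pop"
        else:
            return False           # no transition defined: reject
    # Accept only if at least one 'b' was read and every 'a' was matched.
    return state == "pop" and not stack
```

For example, `accepts_anbn("aabb")` is true while `accepts_anbn("abab")` is false; the stack is what lets the machine count an unbounded number of a's.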
Petascale turbulence simulation using a highly parallel fast multipole method
Yokota, R; Barba, L A; Yasuoka, K
2011-01-01
We present a 0.5 Petaflop/s calculation of homogeneous isotropic turbulence in a cube of 2048^3 particles, using a highly parallel fast multipole method (FMM) using 2048 GPUs on the TSUBAME 2.0 system. We compare this particle-based code with a spectral DNS code under the same calculation condition and the same machine. The results of our particle-based turbulence simulation match quantitatively with that of the spectral method. The calculation time for one time step is approximately 30 seconds for both methods; this result shows that the scalability of the FMM starts to become an advantage over FFT-based methods beyond 2000 GPUs.
Large parallel cosmic string simulations: New results on loop production
Blanco-Pillado, Jose J.; Olum, Ken D.; Shlaer, Benjamin [Institute of Cosmology, Department of Physics and Astronomy, Tufts University, Medford, Massachusetts 02155 (United States)
2011-04-15
Using a new parallel computing technique, we have run the largest cosmic string simulations ever performed. Our results confirm the existence of a long transient period where a nonscaling distribution of small loops is produced at lengths depending on the initial correlation scale. As time passes, this initial population gives way to the true scaling regime, where loops of size approximately equal to one-twentieth the horizon distance become a significant component. We observe similar behavior in matter and radiation eras, as well as in flat space. In the matter era, the scaling population of large loops becomes the dominant component; we expect this to eventually happen in the other eras as well.
Parallelizing N-Body Simulations on a Heterogeneous Cluster
NASA Astrophysics Data System (ADS)
Stenborg, T. N.
2009-10-01
This thesis evaluates quantitatively the effectiveness of a new technique for parallelising direct gravitational N-body simulations on a heterogeneous computing cluster. In addition to being an investigation into how a specific computational physics task can be optimally load balanced across the heterogeneity factors of a distributed computing cluster, it is also, more generally, a case study in effective heterogeneous parallelisation of an all-pairs programming task. If high-performance computing clusters are not designed to be heterogeneous initially, they tend to become so over time as new nodes are added, or existing nodes are replaced or upgraded. As a result, effective techniques for application parallelisation on heterogeneous clusters are needed if maximum cluster utilisation is to be achieved, and developing them is an active area of research. A custom C/MPI parallel particle-particle N-body simulator was developed, validated and deployed for this evaluation. Simulation communication proceeds over cluster nodes arranged in a logical ring and employs nonblocking message passing to encourage overlap of communication with computation. Redundant calculations arising from force symmetry given by Newton's third law are removed by combining chordal data transfer of accumulated forces with ring passing data transfer. Heterogeneity in node computation speed is addressed by decomposing system data across nodes in proportion to node computation speed, in conjunction with use of evenly sized communication buffers. This scheme is shown experimentally to have some potential in improving simulation performance in comparison with an even decomposition of data across nodes. Techniques for further heterogeneous cluster load balancing are discussed and remain an opportunity for further work.
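The decomposition of system data in proportion to node computation speed can be sketched as follows. This is a minimal illustration under stated assumptions: node speeds are given as relative positive numbers, and the largest-remainder rounding rule and the function name are choices made for this example rather than details taken from the thesis.

```python
def proportional_partition(n_items, speeds):
    """Split n_items across nodes in proportion to their relative speeds.

    The largest-remainder rule is used so that the counts sum exactly to n_items.
    """
    total = sum(speeds)
    exact = [n_items * s / total for s in speeds]     # ideal (fractional) shares
    counts = [int(x) for x in exact]                  # floor of each share
    leftover = n_items - sum(counts)
    # Give the remaining items to the nodes with the largest fractional parts.
    by_fraction = sorted(range(len(speeds)),
                         key=lambda i: exact[i] - counts[i], reverse=True)
    for i in by_fraction[:leftover]:
        counts[i] += 1
    return counts
```

For a hypothetical three-node cluster in which one node is twice as fast as the other two, `proportional_partition(1000, [2.0, 1.0, 1.0])` assigns 500, 250 and 250 particles, whereas an even decomposition would leave the fast node idle while the slower nodes finish.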
NASA Technical Reports Server (NTRS)
Hsieh, Shang-Hsien
1993-01-01
The principal objective of this research is to develop, test, and implement coarse-grained, parallel-processing strategies for nonlinear dynamic simulations of practical structural problems. There are contributions to four main areas: finite element modeling and analysis of rotational dynamics, numerical algorithms for parallel nonlinear solutions, automatic partitioning techniques to effect load-balancing among processors, and an integrated parallel analysis system.
Simulation as decision tool for capacity planning
Siebren Groothuis; Godefridus G. Van Merode; Arie Hasman
2001-01-01
In this paper we demonstrate how discrete event simulation technique can be used to optimise the use of catheterization capacity. The patient flow at the catheterization room is described. A simulation model of the current situation was built in MedModel, a discrete event simulation package, and the model was validated. A short presentation of MedModel is given. To investigate alternative
A performance study of the hypercube parallel processor architecture
Lamanna, C.A.; Shaw, W.H. Jr.
1991-03-01
This paper investigates the relationship between workload characteristics and process speedup obtainable on a hypercube parallel processor architecture. There were two goals: first was to determine the functional relationship between workload characteristics and speedup, and second was to show how simulation could be used to model the concurrently executing process to allow estimation of such a relation. The hypercube implementation used in this study was a packet-switched network with predetermined routing and a balanced computational workload. Three independent variables were controlled: total computational workload, number of processors and the message traffic load. A benchmark program was used to estimate the fundamental timing models and to validate a discrete event simulation. Results of this study are useful to software designers seeking to predict the degree of performance improvement attainable on a hypercube class machine. The methodology and results can be extended to other parallel processing architectures.
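The abstract states only that the hypercube network used predetermined routing. One common deterministic scheme for hypercubes is dimension-order (e-cube) routing, sketched below in Python purely as an assumed example of what such routing can look like; the function name and the bit ordering are illustrative and are not taken from the paper.

```python
def ecube_route(src, dst, dim):
    """Return the node sequence from src to dst in a dim-dimensional hypercube,
    correcting differing address bits from the lowest to the highest dimension."""
    if not (0 <= src < 2 ** dim and 0 <= dst < 2 ** dim):
        raise ValueError("node label out of range")
    path = [src]
    node = src
    for bit in range(dim):
        mask = 1 << bit
        if (node ^ dst) & mask:    # addresses differ in this dimension
            node ^= mask           # traverse the link along this dimension
            path.append(node)
    return path
```

The number of hops equals the Hamming distance between the two node labels, so the message traffic cost of a workload depends on how communicating processes are placed on the cube.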
van Rosmalen, Joost; Toy, Mehlika; O'Mahony, James F
2013-08-01
Markov models are a simple and powerful tool for analyzing the health and economic effects of health care interventions. These models are usually evaluated in discrete time using cohort analysis. The use of discrete time assumes that changes in health states occur only at the end of a cycle period. Discrete-time Markov models only approximate the process of disease progression, as clinical events typically occur in continuous time. The approximation can yield biased cost-effectiveness estimates for Markov models with long cycle periods and if no half-cycle correction is made. The purpose of this article is to present an overview of methods for evaluating Markov models in continuous time. These methods use mathematical results from stochastic process theory and control theory. The methods are illustrated using an applied example on the cost-effectiveness of antiviral therapy for chronic hepatitis B. The main result is a mathematical solution for the expected time spent in each state in a continuous-time Markov model. It is shown how this solution can account for age-dependent transition rates and discounting of costs and health effects, and how the concept of tunnel states can be used to account for transition rates that depend on the time spent in a state. The applied example shows that the continuous-time model yields more accurate results than the discrete-time model but does not require much computation time and is easily implemented. In conclusion, continuous-time Markov models are a feasible alternative to cohort analysis and can offer several theoretical and practical advantages. PMID:23715464
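The bias described in this abstract can be seen in a deliberately small example. The Python sketch below assumes a two-state model (alive, dead) with a constant exit rate, which is far simpler than the hepatitis B model of the article; the function names and parameter values are hypothetical. It compares the exact continuous-time expected time alive with a discrete-time cohort analysis, with and without half-cycle correction.

```python
import math


def time_alive_continuous(rate, horizon):
    """Exact expected time spent alive up to `horizon` for a constant exit rate."""
    return (1.0 - math.exp(-rate * horizon)) / rate


def time_alive_discrete(rate, horizon, cycle, half_cycle_correction):
    """Cohort analysis in which state changes are applied once per cycle."""
    n_cycles = round(horizon / cycle)
    stay = math.exp(-rate * cycle)    # per-cycle probability of remaining alive
    alive = 1.0                       # the whole cohort starts alive
    total = 0.0
    for _ in range(n_cycles):
        nxt = alive * stay
        if half_cycle_correction:
            total += cycle * 0.5 * (alive + nxt)   # trapezoid over the cycle
        else:
            total += cycle * alive                 # transitions only at cycle end
        alive = nxt
    return total
```

With an assumed rate of 0.1 per year, a 20-year horizon and one-year cycles, the uncorrected cohort estimate overstates the time alive, the half-cycle correction removes most of that bias, and shortening the cycle moves both estimates toward the continuous-time value.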
A discrete event simulation model for unstructured supervisory control of unmanned vehicles
McDonald, Anthony D. (Anthony Douglas)
2010-01-01
Most current Unmanned Vehicle (UV) systems consist of teams of operators controlling a single UV. Technological advances will likely lead to the inversion of this ratio, and automation of low level tasking. These advances ...
Using Distributed Analytics to Enable Real-Time Exploration of Discrete Event Simulations
Pallickara, Shrideep
different diseases, including foot-and-mouth disease [2], avian influenza [3], and pseudorabies [4 adjustments to quarantine procedures or the number of vaccines available in order to analyze economic
An Efficient Priority Queue for Large FPGA-Based Discrete Event Simulations of
Herbordt, Martin
: atoms are modeled as hard spheres, covalent bonds as hard chain links, and the van der Waals attraction those of time-step driven MD by eight or more orders of magnitude [6, 21]. The most important aspect to a significant number of processors [14]. Any FPGA acceleration of DMD is therefore doubly important: not only
Quantifying supply chain disruption risk using Monte Carlo and discrete-event simulation
Schmitt, Amanda J.
We present a model constructed for a large consumer products company to assess their vulnerability to disruption risk and quantify its impact on customer service. Risk profiles for the locations and connections in the ...
HUMAN BEHAVIOUR MODELLING FOR DISCRETE EVENT AND AGENT BASED SIMULATION: A CASE STUDY
Aickelin, Uwe
of both in tackling the human behaviour issues which relates to queuing time and customer satisfaction is to maximise customer satisfaction, for example by minimising their waiting times for the different services
Strengths & Drawbacks of MILP, CP and Discrete-Event Simulation based
Grossmann, Ignacio E.
manufacturing and service industries Pulp & Paper, Oil & Gas, Food & Beverages, Pharmaceuticals · Type) · Other aspects Storage policies · Fixed capacity (shared or not), unlimited or no storage Changeovers
Analysis of a hospital network transportation system with discrete event simulation
Kwon, Annie Y. (Annie Yean)
2011-01-01
VA New England Healthcare System (VISN1) provides transportation to veterans between eight medical centers and over 35 Community Based Outpatient Clinics across New England. Due to high variation in its geographic area, ...
Discrete event simulation as a tool to determine necessary nuclear power plant operating crew size
Ron Laughery; Beth M. Plott; Thomas H. Engh; Shelly Scott-Nash
1996-01-01
There are not always sufficient resources or time available to identify human factors issues early enough for development of detailed technical bases using empirical experimentation with human subjects. Consequently, analytical approaches are needed to augment the experimental approach for human factors regulatory decision making at the US Nuclear Regulatory Commission. One analytical approach, computer modeling of human performance, is being
Exploration of Cancellation Strategies for Parallel Simulation on Multi-core Beowulf Clusters
Wilsey, Philip A.
Exploration of Cancellation Strategies for Parallel Simulation on Multi-core Beowulf Clusters of cancellation strategies on multi-core Beowulf clusters with both shared memory and distributed memory, and dynamic cancellation strategies for parallel simulation on multi-core Beowulf Clusters in a multi
LARGE-SCALE MOLECULAR DYNAMICS SIMULATION USING VECTOR AND PARALLEL COMPUTERS
Rapaport, Dennis C.
LARGE-SCALE MOLECULAR DYNAMICS SIMULATION USING VECTOR AND PARALLEL COMPUTERS D.C. RAPAPORT Physics Department, Bar-Ilan University, Ramat-Gan S2100, Israel and Center for Simulational Physics, University for parallel processing .......................... 8 1.6. Algorithm development
Parallel Computation in Simulating Diffusion and Deformation in Human Brain
Zhang, Jun
that a neurosurgeon has to face is the removal from the brain of as much diseased tissue as possible and meanwhile Parallel Computation in Simulating Diffusion and Deformation in Human Brain Ning Kang Jun of parallel and high performance computation in simulating the diffusion process in the human brain
Parallel simulation of multiphase flows using octree adaptivity and the volume-of-fluid method
Fuster, Daniel
Parallel simulation of multiphase flows using octree adaptivity and the volume-of-fluid method solvers on an octree adaptive grid together with a piecewise linear volume of fluid interface tracking Résumé Simulation parallèle adaptative octree d'écoulements multiphasiques par suivi d'interface de type volume de
K. Morgan; N. P. Weatherill; O. Hassan; P. J. Brookes; R. Said; J. Jones
1999-01-01
High performance parallel computers offer the promise of sufficient computational power to enable the routine use of large scale simulations during the process of engineering design. With this in mind, and with particular reference to the aerospace industry, this paper describes developments that have been undertaken to provide parallel implementations of algorithms for simulation, mesh generation and visualization. Designers are
Kumar, Ratnesh
Decentralized Modular Diagnosis of Concurrent Discrete Event Systems C. Zhou , Member IEEE, R-- The problem of decentralized modular fault diagnosis of concurrent discrete event systems, that is composed of a set of component modules, is formulated and studied. In the proposed decentralized modular framework
Optimal Memory Management for Time Warp Parallel Simulation Yi-Bing Lin
Tropper, Carl
Optimal Memory Management for Time Warp Parallel Simulation Yi-Bing Lin Bellcore Morristown, NJ 07962-1910 Bruno R. Preiss Department of Electrical and Computer Engineering University of Waterloo of sequential simulation, Chandy-Misra simulation, and Time Warp simulation. We show that Chandy-Misra may
A bridged diagnostic method for the monitoring of polymorphic discrete-event systems.
Lamperti, Gianfranco; Zanella, Marina
2004-10-01
Diagnosis of discrete-event systems (DESs) is a challenging problem that has been tackled both by automatic control and artificial intelligence communities. The relevant approaches share similarities, including modeling by automata, compositional modeling, and model-based reasoning. This paper aims to bridge two complementary approaches from these communities, namely, the diagnoser approach and the active system approach, respectively. The more significant shortcomings of such approaches are, on the one side, the need for the generation of the global system model and, on the other, the lack of monitoring capabilities. The former makes the application of the diagnoser approach prohibitive in real contexts, where the system model is too large to be generated, even offline. The latter requires the completion of the system observation before starting the diagnostic task, thereby making the monitoring of the system impossible. The bridged diagnostic method subsumes, to a large extent, the peculiarities of the two approaches and is capable of coping with an extended class of DESs that integrate both synchronous and asynchronous behavior. The bridge is built by extending the active system approach by means of several enhanced techniques, which eventually allow the efficient monitoring of polymorphic DESs. Upon the occurrence of each system message, two pieces of diagnostic information are generated, namely, the snapshot and historic diagnostic sets. While the former accounts for the faults pertinent to the newly generated message only, the latter is based on the whole sequence of messages yielded by the system during operation. PMID:15503518
Sensor Configuration Selection for Discrete-Event Systems under Unreliable Observations
Wen-Chiao Lin; Tae-Sic Yoo; Humberto E. Garcia
2010-08-01
Algorithms for counting the occurrences of special events in the framework of partially-observed discrete event dynamical systems (DEDS) were developed in previous work. Their performance typically improves as the sensors providing the observations become more costly or increase in number. This paper addresses the problem of finding a sensor configuration that achieves an optimal balance between cost and the performance of the special event counting algorithm, while satisfying given observability requirements and constraints. Since this problem is generally computationally hard in the framework considered, a sensor optimization algorithm is developed using two greedy heuristics, one myopic and the other based on projected performances of candidate sensors. The two heuristics are sequentially executed in order to find the best sensor configurations. The developed algorithm is then applied to a sensor optimization problem for a multi-unit-operation system. Results show that improved sensor configurations can be found that may significantly reduce the sensor configuration cost but still yield acceptable performance for counting the occurrences of special events.
A sweep algorithm for massively parallel simulation of circuit-switched networks
NASA Technical Reports Server (NTRS)
Gaujal, Bruno; Greenberg, Albert G.; Nicol, David M.
1992-01-01
A new massively parallel algorithm is presented for simulating large asymmetric circuit-switched networks, controlled by a randomized-routing policy that includes trunk-reservation. A single instruction multiple data (SIMD) implementation is described, and corresponding experiments on a 16384 processor MasPar parallel computer are reported. A multiple instruction multiple data (MIMD) implementation is also described, and corresponding experiments on an Intel iPSC/860 parallel computer, using 16 processors, are reported. By exploiting parallelism, our algorithm increases the possible execution rate of such complex simulations by as much as an order of magnitude.
GloMoSim: A Library for Parallel Simulation of Large-Scale Wireless Networks
Xiang Zeng; Rajive Bagrodia; Mario Gerla
1998-01-01
A number of library-based parallel and sequential network simulators have been designed. This paper describes a library, called GloMoSim (for Global Mobile system Simulator), for parallel simulation of wireless networks. GloMoSim has been designed to be extensible and composable: the communication protocol stack for wireless networks is divided into a set of layers, each with its own API. Models of
Simulation-Based Average Case Analysis for Parallel Job Scheduling
Fabrício Alves Barbosa Da Silva; Isaac D. Scherson
2001-01-01
This paper analyses the resource allocation problem in parallel jobs scheduling, with emphasis given to gang service algorithms. Gang service has been widely used as a practical solution to the dynamic parallel job scheduling problem. To provide a sound analysis of gang service performance, a novel methodology based on the traditional concept of competitive ratio is introduced. Dubbed
Hasenkamp, Daren
2011-01-01
of climate simulation data using varying numbers of virtualvirtual machine instances to perform data-driven parallel analysis of climate simulationvirtual machines to parallelize a program called TSTORMS [2], which finds tropical storms in climate simulation
ANNarchy: a code generation approach to neural simulations on parallel hardware
Vitay, Julien; Dinkelbach, Helge Ü.; Hamker, Fred H.
2015-01-01
Many modern neural simulators focus on the simulation of networks of spiking neurons on parallel hardware. Another important framework in computational neuroscience, rate-coded neural networks, is mostly difficult or impossible to implement using these simulators. We present here the ANNarchy (Artificial Neural Networks architect) neural simulator, which makes it easy to define and simulate rate-coded and spiking networks, as well as combinations of both. The interface in Python has been designed to be close to the PyNN interface, while neuron and synapse models can be specified using an equation-oriented mathematical description similar to that of the Brian neural simulator. This information is used to generate C++ code that will efficiently perform the simulation on the chosen parallel hardware (multi-core system or graphical processing unit). Several numerical methods are available to transform ordinary differential equations into efficient C++ code. We compare the parallel performance of the simulator to existing solutions.
xSim: The Extreme-Scale Simulator
Boehm, Swen; Engelmann, Christian
2011-01-01
Investigating parallel application performance properties at scale is becoming an important part of high-performance computing (HPC) application development and deployment. The Extreme-scale Simulator (xSim) is a performance investigation toolkit that permits running an application in a controlled environment at extreme scale without the need for a respective extreme-scale HPC system. Using a lightweight parallel discrete event simulation, xSim executes a parallel application with a virtual wall clock time, such that performance data can be extracted based on a processor model and a network model. This paper presents significant enhancements to the xSim toolkit prototype that provide a more complete Message Passing Interface (MPI) support and improve its versatility. These enhancements include full virtual MPI group, communicator and collective communication support, and global variables support. The new capabilities are demonstrated by executing the entire NAS Parallel Benchmark suite in a simulated HPC environment.
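The idea of running an application against a virtual wall clock driven by a processor model and a network model can be sketched as follows. This is a toy illustration under assumptions made for the example: a linear slowdown factor stands in for the processor model, a latency plus size-over-bandwidth formula stands in for the network model, and the class and function names are hypothetical. It is not xSim's actual interface or implementation.

```python
class VirtualProcess:
    """A simulated rank that keeps its own virtual wall clock, in seconds."""

    def __init__(self, rank):
        self.rank = rank
        self.clock = 0.0

    def compute(self, native_seconds, slowdown=1.0):
        # Assumed processor model: scale measured time to the simulated processor.
        self.clock += native_seconds * slowdown


def send(sender, receiver, n_bytes, latency=1e-6, bandwidth=1e9):
    """Assumed network model: arrival = send time + latency + size / bandwidth."""
    arrival = sender.clock + latency + n_bytes / bandwidth
    # A blocking receive cannot complete before the message has arrived.
    receiver.clock = max(receiver.clock, arrival)
    return arrival
```

In this sketch a receiver that is behind the sender in virtual time is advanced to the arrival time of the message, while a receiver that is already ahead keeps its own clock, so the reported times reflect the modeled machine rather than the host that runs the simulation.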
Efficient parallelization of molecular dynamics simulations with short-ranged forces
NASA Astrophysics Data System (ADS)
Meyer, Ralf
2014-10-01
Recently, an alternative strategy for the parallelization of molecular dynamics simulations with short-ranged forces has been proposed. In this work, this algorithm is tested on a variety of multi-core systems using three types of benchmark simulations. The results show that the new algorithm gives consistent speedups which, depending on the properties of the simulated system, are either comparable or superior to those obtained with spatial decomposition. Comparisons of the parallel speedup on different systems indicate that on multi-core machines the parallel efficiency of the method is mainly limited by memory access speed.
NASA Technical Reports Server (NTRS)
Hsieh, Shang-Hsien; Abel, J. F.
1993-01-01
The principal objective of this research is to investigate, develop and demonstrate coarse-grained, parallel-processing strategies for nonlinear dynamic simulations of rotating bladed-disk assemblies. The parallel-processing strategies addressed include numerical algorithms for parallel nonlinear solutions and techniques to effect load balancing among processors. The parallel environment employed is a distributed-memory, coarse-grained one consisting of networked workstations. A parallel explicit time integration method has been implemented for transient nonlinear solutions of rotating bladed-disk assemblies. Automatic domain partitioning techniques have been investigated for load balancing among processors. Advanced computing environments, data structures and interactive computer graphics all contribute to an integrated parallel finite element analysis system to facilitate more efficient and powerful dynamic simulations.
A natural partitioning scheme for parallel simulation of multibody systems
NASA Technical Reports Server (NTRS)
Chiou, J. C.; Park, K. C.; Farhat, C.
1993-01-01
A parallel partitioning scheme based on physical-coordinate variables is presented to systematically eliminate system constraint forces and yield the equations of motion of multibody dynamics systems in terms of their independent coordinates. Key features of the present scheme include an explicit determination of the independent coordinates, a parallel construction of the null space matrix of the constraint Jacobian matrix, an easy incorporation of the previously developed two-stage staggered solution procedure, and a Schur complement based parallel preconditioned conjugate gradient numerical algorithm.
A natural partitioning scheme for parallel simulation of multibody systems
NASA Technical Reports Server (NTRS)
Chiou, J. C.; Park, K. C.; Farhat, C.
1991-01-01
A parallel partitioning scheme based on physical-coordinate variables is presented to systematically eliminate system constraint forces and yield the equations of motion of multibody dynamics systems in terms of their independent coordinates. Key features of the present scheme include an explicit determination of the independent coordinates, a parallel construction of the null space matrix of the constraint Jacobian matrix, an easy incorporation of the previously developed two-stage staggered solution procedure, and a Schur complement based parallel preconditioned conjugate gradient numerical algorithm.
Performance Analysis of a Parallel Discrete Model for the Simulation of Laser Dynamics
Jose Luis Guisado; Francisco Fernández De Vega; K. Iskra
2006-01-01
This paper presents an analysis of the performance of a parallel implementation of a discrete model of laser dynamics, which is based on cellular automata. The performance of a 2D parallel version of the model is studied as a first step to test the feasibility of a parallel 3D version, which is needed to simulate specific laser systems. The 3D
Humans can integrate feedback of discrete events in their sensorimotor control of a robotic hand.
Cipriani, Christian; Segil, Jacob L; Clemente, Francesco; ff Weir, Richard F; Edin, Benoni
2014-11-01
Providing functionally effective sensory feedback to users of prosthetics is a largely unsolved challenge. Traditional solutions require high band-widths for providing feedback for the control of manipulation and yet have been largely unsuccessful. In this study, we have explored a strategy that relies on temporally discrete sensory feedback that is technically simple to provide. According to the Discrete Event-driven Sensory feedback Control (DESC) policy, motor tasks in humans are organized in phases delimited by means of sensory encoded discrete mechanical events. To explore the applicability of DESC for control, we designed a paradigm in which healthy humans operated an artificial robot hand to lift and replace an instrumented object, a task that can readily be learned and mastered under visual control. Assuming that the central nervous system of humans naturally organizes motor tasks based on a strategy akin to DESC, we delivered short-lasting vibrotactile feedback related to events that are known to forcefully affect progression of the grasp-lift-and-hold task. After training, we determined whether the artificial feedback had been integrated with the sensorimotor control by introducing short delays and we indeed observed that the participants significantly delayed subsequent phases of the task. This study thus gives support to the DESC policy hypothesis. Moreover, it demonstrates that humans can integrate temporally discrete sensory feedback while controlling an artificial hand and invites further studies in which inexpensive, noninvasive technology could be used in clever ways to provide physiologically appropriate sensory feedback in upper limb prosthetics with much lower band-width requirements than with traditional solutions. PMID:24992899
Parallel Monte-Carlo Tree Search with Simulation Servers
Hideki Kato; Ikuo Takeuchi
2010-01-01
Monte-Carlo tree search is a new best-first tree search algorithm that triggered a revolution in the computer Go world. Developing good parallel Monte-Carlo tree search algorithms is important because single-processor performance cannot be expected to increase as it used to. A novel parallel Monte-Carlo tree search algorithm is proposed. A tree searcher runs on a client computer and multiple Monte-Carlo
High Performance System Framework for Parallel in-Silico Biological Simulations
Plamenka Borovska; Ognian Nakov; Veska Gancheva; Ivailo Georgiev
2011-01-01
The parallel implementation of methods and algorithms for analysis of biological data using high-performance computing is essential for accelerating research and reducing investment. The paper presents a high-performance framework for carrying out scientific experiments in the area of bioinformatics, on the basis of parallel computer simulations on a heterogeneous compact computer cluster. Several of the most popular and
Parallelization of Shallow Water Simulations on Current Multi-threaded Systems
Fraguela, Basilio B.
: Shallow water, pollutant transport, stream programming, compiler parallelizing transformations, GPU. Parallelization of Shallow Water Simulations on Current Multi-threaded Systems. J. Lobeiras (jacobo@anamat.cie.uma.es). September 24, 2012. Abstract: In this work, several parallel implementations of a numerical model of pollutant
Dynamic simulation of a parallel-plate electrochemical fluorination reactor
Weidner, John W.
Dynamic simulation of a parallel-plate electrochemical fluorination reactor K. JHA1 , G.L. BAUER2 for correspondence) Received 2 July 1998; accepted in revised form 29 June 1999 Key words: fluorination reactor, time-dependent modelling Abstract A time-dependent mathematical model of a parallel-plate reactor was developed to study
Electrical Simulations of Series and Parallel PV Arc-Faults Jack Flicker and Jay Johnson
Electrical Simulations of Series and Parallel PV Arc-Faults Jack Flicker and Jay Johnson Sandia this danger by requiring arc-fault circuit interrupters (AFCI). Currently, the requirement is only for series arc-faults, but to fully protect PV installations from arc-fault-generated fires, parallel arc-faults
A systolic parallel simulation system for dynamic traffic assignment: SPSS-DTA
Kwangho Park; Wonkyu Kim
2001-01-01
This paper presents a new approach to solve dynamic traffic assignment problems. The approach employs a mixed method of real-time simulation and off-line optimization. The fundamental approach to the simulation is systolic parallel processing based on autonomous agent modeling. Agents continuously act on their own initiative and access the database to get the status of the simulation world. In particular,
Parallel Co-simulation Using Virtual Synchronization with Redundant Host Execution
Gupta, Rajesh
Parallel Co-simulation Using Virtual Synchronization with Redundant Host Execution Dohyung Kim simulation from trace generation from component simulators. The latter is known as virtual caused by data dependency in simulation models. By combining virtual synchronization and redundant
Parallel Simulation of Subsonic Fluid Dynamics on a Cluster of Workstations
Skordos, Panayotis A.
1995-12-01
An effective approach of simulating fluid dynamics on a cluster of non-dedicated workstations is presented. The approach uses local interaction algorithms, small communication capacity, and automatic migration of parallel ...
Parallelization of particle-in-cell simulation modeling Hall-effect thrusters
Fox, Justin M., 1981-
2005-01-01
MIT's fully kinetic particle-in-cell Hall thruster simulation is adapted for use on parallel clusters of computers. Significant computational savings are thus realized with a predicted linear speed up efficiency for certain ...
Three-dimensional shock wave physics simulations with MIMD PAGOSA on massively parallel computers
Gardner, D.R.; Vaughan, C.T. [Sandia National Labs., Albuquerque, NM (United States); Cline, D.D. [Texas Univ., Austin, TX (United States). Center for High Performance Computing
1992-12-31
The numerical modeling of penetrator-armor interactions for design studies requires rapid, detailed, three-dimensional simulation of complex interactions of exotic materials at high speeds and high rates of strain. To perform such simulations, we have developed a multiple-instruction, multiple-data (MIMD) version of the PAGOSA hydrocode. The code includes a variety of models for material strength, fracture, and the detonation of high explosives. We present a typical armor/antiarmor penetration simulation conducted with this code, and measurements of its performance. The scaled speedups for MIMD PAGOSA on the 1024-processor nCUBE 2 parallel computer, measured as the simulation size is increased with the number of processors, reveal that small grind times (computational time per cell per cycle) and parallel scaled efficiencies of 90% can be achieved for realistic problems. This simulation demonstrates that massively parallel hydrocodes can provide rapid, highly detailed armor/antiarmor simulations.
Three-dimensional shock wave physics simulations with MIMD PAGOSA on massively parallel computers
Gardner, D.R.; Vaughan, C.T. (Sandia National Labs., Albuquerque, NM (United States)); Cline, D.D. (Texas Univ., Austin, TX (United States). Center for High Performance Computing)
1992-01-01
The numerical modeling of penetrator-armor interactions for design studies requires rapid, detailed, three-dimensional simulation of complex interactions of exotic materials at high speeds and high rates of strain. To perform such simulations, we have developed a multiple-instruction, multiple-data (MIMD) version of the PAGOSA hydrocode. The code includes a variety of models for material strength, fracture, and the detonation of high explosives. We present a typical armor/antiarmor penetration simulation conducted with this code, and measurements of its performance. The scaled speedups for MIMD PAGOSA on the 1024-processor nCUBE 2 parallel computer, measured as the simulation size is increased with the number of processors, reveal that small grind times (computational time per cell per cycle) and parallel scaled efficiencies of 90% can be achieved for realistic problems. This simulation demonstrates that massively parallel hydrocodes can provide rapid, highly detailed armor/antiarmor simulations.
A Parallel Overset Grid High-Order Flow Solver for Large Eddy Simulation
P. Morgan; M. Visbal; D. Rizzetta
2006-01-01
This work describes the development and validation of a parallel high-order compact finite difference Navier–Stokes solver for application to large-eddy simulation (LES) and direct numerical simulation. The implicit solver can employ up to sixth-order spatial formulations and tenth-order filtering. The parallelization of the solver is founded on the overset grid technique. LES were then performed for turbulent channel flow with
A Queue Simulation Tool for a High Performance Scientific Computing Center
NASA Technical Reports Server (NTRS)
Spear, Carrie; McGalliard, James
2007-01-01
The NASA Center for Computational Sciences (NCCS) at the Goddard Space Flight Center provides high performance highly parallel processors, mass storage, and supporting infrastructure to a community of computational Earth and space scientists. Long running (days) and highly parallel (hundreds of CPUs) jobs are common in the workload. NCCS management structures batch queues and allocates resources to optimize system use and prioritize workloads. NCCS technical staff use a locally developed discrete event simulation tool to model the impacts of evolving workloads, potential system upgrades, alternative queue structures and resource allocation policies.
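To make the idea of a discrete event simulation of batch queues concrete, the following is a minimal sketch, not the NCCS tool itself. It assumes a single strict first-come-first-served queue and a fixed CPU pool; the function name `simulate_queue` and the job tuple layout `(submit_time, cpus, runtime)` are hypothetical choices made only for this illustration.

```python
import heapq
from collections import deque

ARRIVAL, COMPLETION = 1, 0  # completions sort first when times tie


def simulate_queue(jobs, total_cpus):
    """Simulate a strict FCFS batch queue (illustrative assumption only).

    jobs: list of (submit_time, cpus, runtime) tuples.
    Returns a dict mapping job index -> simulated start time.
    Jobs that never fit (cpus > total_cpus) block the queue and never start.
    """
    events = [(submit, ARRIVAL, i) for i, (submit, _, _) in enumerate(jobs)]
    heapq.heapify(events)  # future event list ordered by simulated time
    free = total_cpus
    waiting = deque()
    start_times = {}
    while events:
        now, kind, i = heapq.heappop(events)  # advance the clock to next event
        if kind == COMPLETION:
            free += jobs[i][1]  # release the CPUs held by the finished job
        else:
            waiting.append(i)
        # Strict FCFS: only the job at the head of the queue may start.
        while waiting and jobs[waiting[0]][1] <= free:
            j = waiting.popleft()
            free -= jobs[j][1]
            start_times[j] = now
            heapq.heappush(events, (now + jobs[j][2], COMPLETION, j))
    return start_times


if __name__ == "__main__":
    demo = [(0, 4, 10), (1, 4, 5), (2, 2, 3)]
    print(simulate_queue(demo, total_cpus=6))
```

Changing `total_cpus` or replacing the head-of-queue rule with a different policy is the kind of "what if" experiment such a model supports; the policy shown here is only one simple possibility.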
Motion simulation using a high-speed parallel link mechanism
S. Taraot; E. Inohiraf; M. Uchiyama
2000-01-01
A new kind of motion simulation is proposed for the purpose of simulating complicated motions such as those caused by interactions between objects in special environments (e.g., those of micro-gravity). The motion simulation is based on a hybrid simulation consisting of a combined analog-digital system. Hence it has functions of real-time numerical simulation and embedding a physical model in the
Hisashi Ishida; Mariko Higuchi; Yoshiteru Yonetani; Takuma Kano; Yasumasa Joti; Akio Kitao; Nobuhiro Go
The Earth Simulator has the highest power ever achieved to perform molecular dynamics simulation of large-scale supramolecular systems. Now, we are developing a molecular dynamics simulation system, called PABIOS, which is designed to run a system composed of more than a million particles efficiently on parallel computers. To perform large-scale simulations rapidly and accurately, state-of-the-art algorithms, such as Particle-Particle
Kothe, D.B.; Turner, J.A.; Mosso, S.J.; Ferrell, R.C.
1997-03-01
We discuss selected aspects of a new parallel three-dimensional (3-D) computational tool for the unstructured mesh simulation of Los Alamos National Laboratory (LANL) casting processes. This tool, known as Telluride, draws upon robust, high resolution finite volume solutions of metal alloy mass, momentum, and enthalpy conservation equations to model the filling, cooling, and solidification of LANL castings. We briefly describe the current Telluride physical models and solution methods, then detail our parallelization strategy as implemented with Fortran 90 (F90). This strategy has yielded straightforward and efficient parallelization on distributed and shared memory architectures, aided in large part by new parallel libraries JTpack90 for Krylov-subspace iterative solution methods and PGSLib for efficient gather/scatter operations. We illustrate our methodology and current capabilities with source code examples and parallel efficiency results for a LANL casting simulation.
Xyce parallel electronic simulator users guide, version 6.1
Keiter, Eric R; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Sholander, Peter E.; Thornquist, Heidi K.; Verley, Jason C.; Baur, David Gregory
2014-03-01
This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. (2) A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. (3) Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message-passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Xyce parallel electronic simulator users' guide, Version 6.0.1.
Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Verley, Jason C.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Warrender, Christina E.; Baur, David Gregory. [Raytheon, Albuquerque, NM
2014-01-01
This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. (2) A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. (3) Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message-passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
NASA Astrophysics Data System (ADS)
Wu, Di M.; Zhao, S. S.; Lu, Jun Q.; Hu, Xin-Hua
2000-06-01
In Monte Carlo simulations of light propagating in biological tissues, photons propagating in the media are described as classical particles being scattered and absorbed randomly in the media, and their paths are tracked individually. To obtain any statistically significant results, however, a large number of photons is needed in the simulations, and the calculations are time consuming and sometimes impossible with existing computing resources, especially when considering inhomogeneous boundary conditions. To overcome this difficulty, we have implemented a parallel computing technique in our Monte Carlo simulations. This approach is well justified by the nature of Monte Carlo simulation. Utilizing PVM (Parallel Virtual Machine, a parallel computing software package), parallel codes in both C and Fortran have been developed on the massively parallel Cray T3E computer and a local PC network running Unix/Sun Solaris. Our results show that parallel computing can significantly reduce the running time and make efficient use of low-cost personal computers. In this report, we present a numerical study of light propagation in a slab phantom of skin tissue using the parallel computing technique.
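The photon-tracking idea described in this abstract can be illustrated with a deliberately simplified sketch. It is not the authors' code: it assumes a one-dimensional slab, isotropic scattering, index-matched boundaries, and hypothetical parameter names (`mu_a` and `mu_s` for the absorption and scattering coefficients). Because each photon history is independent, the outer loop is the part that a parallel version would split across processors.

```python
import math
import random


def simulate_slab(n_photons, thickness, mu_a, mu_s, seed=0):
    """Track photons through a slab and tally their fates.

    Simplifying assumptions (illustrative only): 1-D geometry, isotropic
    scattering, no refractive-index mismatch at the boundaries.
    Requires mu_a + mu_s > 0.
    """
    rng = random.Random(seed)
    mu_t = mu_a + mu_s  # total interaction coefficient
    counts = {"reflected": 0, "absorbed": 0, "transmitted": 0}
    for _ in range(n_photons):  # independent histories: trivially parallel
        z, uz = 0.0, 1.0  # depth and direction cosine; photon enters normally
        while True:
            # Sample the free path length from an exponential distribution.
            step = -math.log(1.0 - rng.random()) / mu_t
            z += uz * step
            if z < 0.0:
                counts["reflected"] += 1
                break
            if z > thickness:
                counts["transmitted"] += 1
                break
            if rng.random() < mu_a / mu_t:  # interaction is an absorption
                counts["absorbed"] += 1
                break
            uz = 2.0 * rng.random() - 1.0  # isotropic: cosine uniform in [-1, 1)
    return counts


if __name__ == "__main__":
    print(simulate_slab(20000, thickness=1.0, mu_a=1.0, mu_s=10.0))
```

With scattering switched off (`mu_s=0`), the transmitted fraction should approach the Beer-Lambert value exp(-mu_a * thickness), which gives a simple sanity check for the sampling.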
Parallel simulation of strong ground motions during recent and historical
Furumura, Takashi
2004 Available online 14 April 2005 Abstract The development of high-performance computing facilities-1-1 Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan b Research Organization for Information Sciences and Technology: furumura@eri.u-tokyo.ac.jp (T. Furumura). www.elsevier.com/locate/parco Parallel Computing 31 (2005) 149
A Perceptually-Driven Parallel Algorithm for Efficient Radiosity Simulation
Simon Gibson; Roger J. Hubbold
2000-01-01
The authors describe a novel algorithm for computing view-independent finite element radiosity solutions on distributed shared memory parallel architectures. Our approach is based on the notion of a subiteration being the transfer of energy from a single source to a subset of the scene's receiver patches. By using an efficient queue based scheduling system to process these subiterations, we show
Towards scalable parallel-in-time turbulent flow simulations
Wang, Qiqi
We present a reformulation of unsteady turbulent flow simulations. The initial condition is relaxed and information is allowed to propagate both forward and backward in time. Simulations of chaotic dynamical systems with ...
Asynchronous distributed simulation via a sequence of parallel computations
K. Mani Chandy; Jayadev Misra
1981-01-01
An approach to carrying out asynchronous, distributed simulation on multiprocessor message-passing architectures is presented. This scheme differs from other distributed simulation schemes because (1) the amount of memory required by all processors together is bounded and is no more than the amount required in sequential simulation and (2) the multiprocessor network is allowed to deadlock, the deadlock is detected, and
A Visualized Parallel Network Simulator for Modeling Large-Scale Distributed Applications
Siming Lin; Xueqi Cheng; Lv Jianming
2007-01-01
Large-scale distributed systems, with thousands or even millions of nodes, produce complex and dynamic behaviors. Packet-level simulation is necessary to test and analyze these systems, such as grids, peer-to-peer (P2P) applications as well as worm and DDoS containment systems. However, the current network simulators are not convenient for application layer simulation. We present the NSME, a visualized parallel simulator that
Parallel kinetic Monte Carlo simulations of Ag(111) island coarsening using a large database
NASA Astrophysics Data System (ADS)
Nandipati, Giridhar; Shim, Yunsic; Amar, Jacques G.; Karim, Altaf; Kara, Abdelkader; Rahman, Talat S.; Trushin, Oleg
2009-02-01
The results of parallel kinetic Monte Carlo (KMC) simulations of the room-temperature coarsening of Ag(111) islands carried out using a very large database obtained via self-learning KMC simulations are presented. Our results indicate that, while cluster diffusion and coalescence play an important role for small clusters and at very early times, at late time the coarsening proceeds via Ostwald ripening, i.e. large clusters grow while small clusters evaporate. In addition, an asymptotic analysis of our results for the average island size S(t) as a function of time t leads to a coarsening exponent n = 1/3 (where $S(t) \sim t^{2n}$), in good agreement with theoretical predictions. However, by comparing with simulations without concerted (multi-atom) moves, we also find that the inclusion of such moves significantly increases the average island size. Somewhat surprisingly we also find that, while the average island size increases during coarsening, the scaled island-size distribution does not change significantly. Our simulations were carried out both as a test of, and as an application of, a variety of different algorithms for parallel kinetic Monte Carlo including the recently developed optimistic synchronous relaxation (OSR) algorithm as well as the semi-rigorous synchronous sublattice (SL) algorithm. A variation of the OSR algorithm corresponding to optimistic synchronous relaxation with pseudo-rollback (OSRPR) is also proposed along with a method for improving the parallel efficiency and reducing the number of boundary events via dynamic boundary allocation (DBA). A variety of other methods for enhancing the efficiency of our simulations are also discussed. We note that, because of the relatively high temperature of our simulations, as well as the large range of energy barriers (ranging from 0.05 to 0.8 eV), developing an efficient algorithm for parallel KMC and/or SLKMC simulations is particularly challenging. 
However, by using DBA to minimize the number of boundary events, we have achieved significantly improved parallel efficiencies for the OSRPR and SL algorithms. Finally, we note that, among the three parallel algorithms which we have tested here, the semi-rigorous SL algorithm with DBA led to the highest parallel efficiencies. As a result, we have obtained reasonable parallel efficiencies in our simulations of room-temperature Ag(111) island coarsening for a small number of processors (e.g. Np = 2 and 4). Since the SL algorithm scales with system size for fixed processor size, we expect that comparable and/or even larger parallel efficiencies should be possible for parallel KMC and/or SLKMC simulations of larger systems with larger numbers of processors.
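The coarsening exponent quoted in this abstract relates the average island size to time through a power law, with S(t) growing as t to the power 2n. As a hedged illustration of how such an exponent could be extracted from size-versus-time data, the sketch below does a least-squares fit on log-log axes. The function name `coarsening_exponent` and the synthetic data are assumptions for this example, not the authors' analysis.

```python
import math


def coarsening_exponent(times, sizes):
    """Estimate n in S(t) ~ t**(2n) from a straight-line fit of
    log S against log t (ordinary least squares)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(s) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope / 2.0  # the log-log slope equals 2n


if __name__ == "__main__":
    # Synthetic data generated with n = 1/3, purely for illustration.
    ts = [1.0, 10.0, 100.0, 1000.0]
    ss = [5.0 * t ** (2.0 / 3.0) for t in ts]
    print(coarsening_exponent(ts, ss))
```

In practice only the late-time (asymptotic) portion of the data would be fitted, since early-time behavior is dominated by other processes such as cluster diffusion and coalescence.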
A parallel finite element simulator for ion transport through three-dimensional ion channel systems.
Tu, Bin; Chen, Minxin; Xie, Yan; Zhang, Linbo; Eisenberg, Bob; Lu, Benzhuo
2013-09-15
A parallel finite element simulator, ichannel, is developed for ion transport through three-dimensional ion channel systems that consist of protein and membrane. The coordinates of heavy atoms of the protein are taken from the Protein Data Bank and the membrane is represented as a slab. The simulator contains two components: a parallel adaptive finite element solver for a set of Poisson-Nernst-Planck (PNP) equations that describe the electrodiffusion process of ion transport, and a mesh generation tool chain for ion channel systems, which is an essential component for the finite element computations. The finite element method has advantages in modeling irregular geometries and complex boundary conditions. We have built a tool chain to get the surface and volume mesh for ion channel systems, which consists of a set of mesh generation tools. The adaptive finite element solver in our simulator is implemented using the parallel adaptive finite element package Parallel Hierarchical Grid (PHG) developed by one of the authors, which provides the capability of doing large scale parallel computations with high parallel efficiency and the flexibility of choosing high order elements to achieve high order accuracy. The simulator is applied to a real transmembrane protein, the gramicidin A (gA) channel protein, to calculate the electrostatic potential, ion concentrations and I-V curve, with which both primitive and transformed PNP equations are studied and their numerical performances are compared. To further validate the method, we also apply the simulator to two other ion channel systems, the voltage dependent anion channel (VDAC) and α-Hemolysin (α-HL). The simulation results agree well with Brownian dynamics (BD) simulation results and experimental results. Moreover, because ionic finite size effects can be included in PNP model now, we also perform simulations using a size-modified PNP (SMPNP) model on VDAC and α-HL. 
It is shown that the size effects in SMPNP can effectively lead to reduced current in the channel, and the results are closer to BD simulation results. PMID:23740647
California at Berkeley, University of
PROTEUS: A High-Performance Parallel-Architecture Simulator Eric A. Brewer Chrysanthos N Introduction PROTEUS is an execution-driven simulator for MIMD machines. Like Tango [3] and RPPT [2], it directly executes most instructions to achieve very high performance. Despite exceptional speed PROTEUS
Martín, Pino
simulations (DNS), all possible length scales and time scales must be resolved by the numerical method. Thus A parallel implicit method for the direct numerical simulation of wall-bounded compressible is presented. The formulation of the implicit method and the corresponding tunable parameters are introduced
A parallel computational method for simulating two-phase gel dynamics
Wright, Grady B.
A parallel computational method for simulating two-phase gel dynamics Jian Du 1 , Aaron L computational algorithm for simulating models of gel dynamics where the gel is described by two phases, a networked polymer and a fluid solvent. The models consist of transport equations for the two phases, two
InAs/GaAs square nanomesas: Multimillion-atom molecular dynamics simulations on parallel computers
Southern California, University of
been determined using scanning tunneling microscopy (STM) and first-principles electronic InAs/GaAs square nanomesas: Multimillion-atom molecular dynamics simulations on parallel computers-scale molecular dynamics (MD) simulations are performed to investigate mechanical stresses in InAs/GaAs nanomesas
A simulator for adaptive parallel applications Basile Schaeli, Sebastian Gerlach, Roger D. Hersch
Hersch, Roger D.
A simulator for adaptive parallel applications Basile Schaeli, Sebastian Gerlach, Roger D. Hersch taking CPU and network sharing into account. Simulations can be carried out without needing to modify number of compute nodes to an application, causing nodes to become idle or underutilized when
A Scalable Multi-scale Framework for Parallel Simulation and Visualization of Microbial Evolution
Tagkopoulos, Ilias
microbial evolution dynamics. The simulator employs multi-scale models and data structures that capture A Scalable Multi-scale Framework for Parallel Simulation and Visualization of Microbial Evolution new strains with desired properties (e.g. resilient strains for recombinant protein or bio-fuels
Acceleration of Radiance for Lighting Simulation by Using Parallel Computing with OpenCL
Zuo, Wangda; McNeil, Andrew; Wetter, Michael; Lee, Eleanor
2011-09-06
We report on the acceleration of annual daylighting simulations for fenestration systems in the Radiance ray-tracing program. The algorithm was optimized to reduce both the redundant data input/output operations and the floating-point operations. To further accelerate the simulation speed, the calculation for matrix multiplications was implemented using parallel computing on a graphics processing unit. We used OpenCL, which is a cross-platform parallel programming language. Numerical experiments show that the combination of the above measures can speed up the annual daylighting simulations 101.7 times or 28.6 times when the sky vector has 146 or 2306 elements, respectively.
Parallel Monte Carlo Electron and Photon Transport Simulation Code (PMCEPT code)
NASA Astrophysics Data System (ADS)
Kum, Oyeon
2004-11-01
Simulations for customized cancer radiation treatment planning for each patient are very useful for both patient and doctor. These simulations can be used to find the most effective treatment with the least possible dose to the patient. This kind of system, the so-called "Doctor by Information Technology", will be useful to provide high quality medical services everywhere. However, the large amount of computing time required by the well-known general purpose Monte Carlo (MC) codes has prevented their use for routine dose distribution calculations for customized radiation treatment planning. The optimal solution to provide an "accurate" dose distribution within an "acceptable" time limit is to develop a parallel simulation algorithm on a Beowulf PC cluster because it is the most accurate, efficient, and economical. I developed a parallel MC electron and photon transport simulation code based on the standard MPI message passing interface. This algorithm solved the main difficulty of parallel MC simulation (overlapping random number series on the different processors) by using multiple random number seeds. The parallel results agreed well with the serial ones. The parallel efficiency approached 100%, as was expected.
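The abstract attributes its handling of overlapping random number series to the use of multiple seeds, one per processor. A minimal sketch of that general idea follows; it is not the PMCEPT implementation. The seed-derivation rule in `make_rng`, the toy tally, and the serial loop standing in for MPI ranks are all assumptions made for illustration. Note that distinct seeds make overlapping streams very unlikely for a long-period generator but do not strictly guarantee it.

```python
import random


def make_rng(rank, base_seed=12345):
    """Give each worker its own generator, seeded from (base_seed, rank).

    The derivation rule is a hypothetical choice for this sketch.
    """
    return random.Random(base_seed * 1000003 + rank)


def worker_tally(rank, histories, base_seed=12345):
    """Toy per-worker tally: count histories whose sample falls below 0.5."""
    rng = make_rng(rank, base_seed)
    return sum(1 for _ in range(histories) if rng.random() < 0.5)


def run(num_workers, histories_per_worker, base_seed=12345):
    """Combine the per-worker tallies into one estimate.

    The loop over ranks stands in for independent MPI processes.
    """
    tallies = [
        worker_tally(rank, histories_per_worker, base_seed)
        for rank in range(num_workers)
    ]
    return sum(tallies) / (num_workers * histories_per_worker)


if __name__ == "__main__":
    print(run(num_workers=4, histories_per_worker=5000))
```

Because every worker's stream is fixed by its rank and the base seed, a run is reproducible regardless of how the workers are scheduled, which also makes comparison against a serial run straightforward.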
Parallel kinetic Monte Carlo simulations of two-dimensional island coarsening.
Shi, Feng; Shim, Yunsic; Amar, Jacques G
2007-09-01
The results of parallel kinetic Monte Carlo (KMC) simulations of island coarsening based on a bond-counting model are presented. Our simulations were carried out both as a test of and as an application of the recently developed semirigorous synchronous sublattice (SL) algorithm. By carrying out simulations over long times and for large system sizes the asymptotic coarsening behavior and scaled island-size distribution (ISD) were determined. Our results indicate that while cluster diffusion and coalescence play a role at early and intermediate times, at late times the coarsening proceeds via Ostwald ripening. In addition, we find that the asymptotic scaled ISD is significantly narrower and more sharply peaked than the mean-field theory prediction. The dependence of the scaled ISD on coverage is also studied. Our results demonstrate that parallel KMC simulations can be used to effectively extend the time scale over which realistic coarsening simulations can be carried out. In particular, for simulations of the late stages of coarsening with system size L=1600 and eight processors, a parallel efficiency larger than 80% was obtained. These results suggest that the SL algorithm is likely to be useful in the future in parallel KMC simulations of more complicated models of coarsening. PMID:17930256
Parallel kinetic Monte Carlo simulations of two-dimensional island coarsening
NASA Astrophysics Data System (ADS)
Shi, Feng; Shim, Yunsic; Amar, Jacques G.
2007-09-01
The results of parallel kinetic Monte Carlo (KMC) simulations of island coarsening based on a bond-counting model are presented. Our simulations were carried out both as a test of and as an application of the recently developed semirigorous synchronous sublattice (SL) algorithm. By carrying out simulations over long times and for large system sizes the asymptotic coarsening behavior and scaled island-size distribution (ISD) were determined. Our results indicate that while cluster diffusion and coalescence play a role at early and intermediate times, at late times the coarsening proceeds via Ostwald ripening. In addition, we find that the asymptotic scaled ISD is significantly narrower and more sharply peaked than the mean-field theory prediction. The dependence of the scaled ISD on coverage is also studied. Our results demonstrate that parallel KMC simulations can be used to effectively extend the time scale over which realistic coarsening simulations can be carried out. In particular, for simulations of the late stages of coarsening with system size L=1600 and eight processors, a parallel efficiency larger than 80% was obtained. These results suggest that the SL algorithm is likely to be useful in the future in parallel KMC simulations of more complicated models of coarsening.
Simulations of Ion Thruster Plume–Spacecraft Interactions on Parallel Supercomputer
Joseph Wang; Yong Cao; Raed Kafafy; Julien Pierru; Viktor K. Decyk
2006-01-01
A parallel three-dimensional electrostatic Particle-In-Cell (PIC) code is developed for large-scale simulations of ion thruster plume-spacecraft interactions on parallel supercomputers. This code is based on a newly developed immersed finite-element (IFE) PIC. The IFE-PIC is designed to handle complex boundary conditions accurately while maintaining the computational speed of the standard PIC code. Domain decomposition is used in both field solve
Parallelizing Power Systems Simulation for Multi-core Clusters: Design for an SME
Hossein Pourreza; Ani M. Gole; Shaahin Filizadeh; Peter Graham
2009-01-01
In this paper, we discuss the parallelization of power systems simulation using modern clusters consisting of multi-core compute nodes interconnected by a low latency interconnect. We describe an implementation strategy that exploits three types of parallelization that are amenable to the various types of inter-core “connectivity” commonly seen in such clusters. We also consider a set of design criteria that
Parallelized modelling and solution scheme for hierarchically scaled simulations
NASA Technical Reports Server (NTRS)
Padovan, Joe
1995-01-01
This two-part paper presents the results of a benchmarked analytical-numerical investigation into the operational characteristics of a unified parallel processing strategy for implicit fluid mechanics formulations. This hierarchical poly tree (HPT) strategy is based on multilevel substructural decomposition. The tree morphology is chosen to minimize memory, communications and computational effort. The methodology is general enough to apply to existing finite difference (FD), finite element (FEM), finite volume (FV) or spectral element (SE) based computer programs without an extensive rewrite of code. In addition to finding large reductions in memory, communications, and computational effort associated with a parallel computing environment, substantial reductions are generated in the sequential mode of application. Such improvements grow with increasing problem size. Along with a theoretical development of general 2-D and 3-D HPT, several techniques for expanding the problem size that the current generation of computers is capable of solving are presented and discussed. Among these techniques are several interpolative reduction methods. It was found that, by combining several of these techniques, a relatively small interpolative reduction resulted in substantial performance gains. Several other unique features/benefits are discussed in this paper. Along with Part 1's theoretical development, Part 2 presents a numerical approach to the HPT along with four prototype CFD applications. These demonstrate the potential of the HPT strategy.
Simulation of Transients in Plasma Processing Reactors Using Moderate Parallelism
NASA Astrophysics Data System (ADS)
Subramonium, Pramod; Kushner, Mark J.
2000-10-01
Quantifying transient phenomena in plasma processing, such as startup, shutdown, recipe changes or pulsed operation, is important to optimizing plasma and materials properties. These long term phenomena are difficult to resolve in multi-dimensional plasma equipment models due to the large computational burden. Hybrid models, which sequentially execute modules, may not be adequate to resolve the physics of transients. In this paper, we describe a new numerical technique in which a moderately parallel implementation of a hybrid plasma equipment model is used to address long term transients. In this implementation, the Electromagnetics Module (EMM), Electron Energy Transport Module (EETM) and the Fluid Kinetics Module (FKM) of the Hybrid Plasma Equipment Model are executed in parallel. Rate coefficients are continuously updated in EETM and are immediately available in shared memory for FKM. Electric fields from the EMM and FKM are, likewise, continuously updated and are immediately available for the EETM. Results will be discussed for 2-dimensional plasma properties during transients and pulses in low pressure inductively coupled plasmas for etching.
Large Eddy simulation of parallel blade-vortex interaction
NASA Astrophysics Data System (ADS)
Felten, Frederic; Lund, Thomas
2002-11-01
Helicopter Blade-Vortex Interaction (BVI) generally occurs under certain conditions of powered descent or during extreme maneuvering. The vibration and acoustic problems associated with the interaction of rotor tip vortices and the following blades are a major aerodynamic concern for the helicopter community. Numerous experimental and computational studies have been done over the last two decades in order to gain a better understanding of the physical mechanisms involved in BVI. The most severe interaction, in terms of generated noise, happens when the vortex filament is parallel to the blade, thus affecting a great portion of it. The majority of the previous numerical studies of parallel BVI fall within a potential flow framework. Some Navier-Stokes approaches using dissipative numerical methods and RANS-type turbulence models have also been attempted, but with limited success. The current investigation makes use of an incompressible, non-dissipative, kinetic energy conserving collocated mesh scheme in conjunction with a dynamic subgrid-scale model. The concentrated tip vortex is not attenuated as it is convected downstream and over a NACA-0012 airfoil. The lift, drag, moment and pressure coefficients induced by the passage of the vortex are monitored in time and compared with experimental data.
The Design for Parallel Computing Simulation of Waveforms in the Front End of Sonar
Qiu Feng; Xu Ren-zhou; Cai Zhiming; Wei Hongkai
2010-01-01
A commercial off-the-shelf (COTS) parallel processing system, consisting of multiple DSPs and a control computer, is used as a hardware platform for the online simulation of waveforms in the front end of sonar. The simulation system should be kept open and run in real time, so the design issues for the simulation software on the multi-DSP platform were addressed. The
Molecular Dynamic Simulations of Nanostructured Ceramic Materials on Parallel Computers
Vashishta, Priya; Kalia, Rajiv
2005-02-24
Large-scale molecular-dynamics (MD) simulations have been performed to gain insight into: (1) sintering, structure, and mechanical behavior of nanophase SiC and SiO2; (2) effects of dynamic charge transfers on the sintering of nanophase TiO2; (3) high-pressure structural transformation in bulk SiC and GaAs nanocrystals; (4) nanoindentation in Si3N4; and (5) lattice mismatched InAs/GaAs nanomesas. In addition, we have designed a multiscale simulation approach that seamlessly embeds MD and quantum-mechanical (QM) simulations in a continuum simulation. The above research activities have involved strong interactions with researchers at various universities, government laboratories, and industries. 33 papers have been published and 22 talks have been given based on the work described in this report.
μπ: A Scalable and Transparent System for Simulating MPI Programs
Perumalla, Kalyan S
2010-01-01
μπ is a scalable, transparent system for experimenting with the execution of parallel programs on simulated computing platforms. The level of simulated detail can be varied for application behavior as well as for machine characteristics. Unique features of μπ are repeatability of execution, scalability to millions of simulated (virtual) MPI ranks, scalability to hundreds of thousands of host (real) MPI ranks, portability of the system to a variety of host supercomputing platforms, and the ability to experiment with scientific applications whose source-code is available. The set of source-code interfaces supported by μπ is being expanded to support a wider set of applications, and MPI-based scientific computing benchmarks are being ported. In proof-of-concept experiments, μπ has been successfully exercised to spawn and sustain very large-scale executions of an MPI test program given in source code form. Low slowdowns are observed, due to its use of a purely discrete event style of execution, and due to the scalability and efficiency of the underlying parallel discrete event simulation engine, μsik. In the largest runs, μπ has been executed on up to 216,000 cores of a Cray XT5 supercomputer, successfully simulating over 27 million virtual MPI ranks, each virtual rank containing its own thread context, and all ranks fully synchronized by virtual time.
NASA Astrophysics Data System (ADS)
Seough, J.; Yoon, P. H.; Hwang, J.; Kim, K. H.
2014-12-01
In situ observations have shown that the measured electron temperature anisotropy in the expanding solar wind is regulated by the electron fire-hose instabilities (EFI), which could be excited by excessive parallel temperature anisotropy. It is known that for parallel propagation mode the enhanced transverse fluctuations driven by the parallel EFI are resonant with the ions. In the present study, nonlinear properties of the parallel EFI are investigated using one-dimensional particle-in-cell simulations with various initial proton plasma betas. It is found that the protons in resonance with the left-hand polarized EFI modes are anisotropically heated and subsequently their resonant interactions give rise to the excitation of the ion-acoustic waves (IAW). It is shown that the intensity of the excited IAW is proportional to the values of the electron to proton temperature ratio. In addition, the presence of the unusual electrostatic modes driven by nonlinear behavior of the protons, especially for the lower proton beta simulations, leads to the formation of the suprathermal component in the proton parallel velocity distribution, although the parallel proton temperature does not practically change throughout the simulation period.
A parallel algorithm for transient solid dynamics simulations with contact detection
Attaway, S.; Hendrickson, B.; Plimpton, S.; Gardner, D.; Vaughan, C.; Heinstein, M.; Peery, J.
1996-06-01
Solid dynamics simulations with Lagrangian finite elements are used to model a wide variety of problems, such as the calculation of impact damage to shipping containers for nuclear waste and the analysis of vehicular crashes. Using parallel computers for these simulations has been hindered by the difficulty of searching efficiently for material surface contacts in parallel. A new parallel algorithm for calculation of arbitrary material contacts in finite element simulations has been developed and implemented in the PRONTO3D transient solid dynamics code. This paper will explore some of the issues involved in developing efficient, portable, parallel finite element models for nonlinear transient solid dynamics simulations. The contact-detection problem poses interesting challenges for efficient implementation of a solid dynamics simulation on a parallel computer. The finite element mesh is typically partitioned so that each processor owns a localized region of the finite element mesh. This mesh partitioning is optimal for the finite element portion of the calculation since each processor must communicate only with the few connected neighboring processors that share boundaries with the decomposed mesh. However, contacts can occur between surfaces that may be owned by any two arbitrary processors. Hence, a global search across all processors is required at every time step to search for these contacts. Load-imbalance can become a problem since the finite element decomposition divides the volumetric mesh evenly across processors but typically leaves the surface elements unevenly distributed. In practice, these complications have been limiting factors in the performance and scalability of transient solid dynamics on massively parallel computers. In this paper the authors present a new parallel algorithm for contact detection that overcomes many of these limitations.
A General Simulation Framework for Supply Chain Modeling: State of the Art and Case Study
Cimino, Antonio; Mirabelli, Giovanni
2010-01-01
Nowadays a wide range of discrete event simulation software is available that can be easily used in different domains: from industry to supply chain, from healthcare to business management, from training to complex systems design. Simulation engines of commercial discrete event simulation software use specific rules and logics for simulation time and events management. Difficulties and limitations come up when commercial discrete event simulation software is used for modeling complex real-world systems (e.g., supply chains, industrial plants). The objective of this paper is twofold: first, the state of the art in commercial discrete event simulation software and an overview of discrete event simulation model development using general purpose programming languages are presented; then a Supply Chain Order Performance Simulator (SCOPS, developed in C++) for investigating the inventory management problem along the supply chain under different supply chain scenarios is presented.
Parallel Performance Optimizations on Unstructured Mesh-based Simulations
Sarje, Abhinav; Song, Sukhyun; Jacobsen, Douglas; Huck, Kevin; Hollingsworth, Jeffrey; Malony, Allen; Williams, Samuel; Oliker, Leonid
2015-01-01
This paper addresses two key parallelization challenges of the unstructured mesh-based ocean modeling code, MPAS-Ocean, which uses a mesh based on Voronoi tessellations: (1) load imbalance across processes, and (2) unstructured data access patterns that inhibit intra- and inter-node performance. Our work analyzes the load imbalance due to naive partitioning of the mesh, and develops methods to generate mesh partitioning with better load balance and reduced communication. Furthermore, we present methods that minimize both inter- and intranode data movement and maximize data reuse. Our techniques include predictive ordering of data elements for higher cache efficiency, as well as communication reduction approaches. We present detailed performance data when running on thousands of cores using the Cray XC30 supercomputer and show that our optimization strategies can exceed the original performance by over 2×. Additionally, many of these solutions can be broadly applied to a wide variety of unstructured grid-based computations.
Parallel performance optimizations on unstructured mesh-based simulations
Sarje, Abhinav [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Song, Sukhyun [Univ. of Maryland, College Park (United States); Jacobsen, Douglas [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Huck, Kevin [Univ. of Oregon, Eugene, OR (United States); Hollingsworth, Jeffrey [Univ. of Maryland, College Park (United States); Malony, Allen [Univ. of Oregon, Eugene, OR (United States); Williams, Samuel [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Oliker, Leonid [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
2015-01-01
This paper addresses two key parallelization challenges of the unstructured mesh-based ocean modeling code, MPAS-Ocean, which uses a mesh based on Voronoi tessellations: (1) load imbalance across processes, and (2) unstructured data access patterns that inhibit intra- and inter-node performance. Our work analyzes the load imbalance due to naive partitioning of the mesh, and develops methods to generate mesh partitioning with better load balance and reduced communication. Furthermore, we present methods that minimize both inter- and intranode data movement and maximize data reuse. Our techniques include predictive ordering of data elements for higher cache efficiency, as well as communication reduction approaches. We present detailed performance data when running on thousands of cores using the Cray XC30 supercomputer and show that our optimization strategies can exceed the original performance by over 2×. Additionally, many of these solutions can be broadly applied to a wide variety of unstructured grid-based computations.
Virtual reality visualization of parallel molecular dynamics simulation
Disz, T.; Papka, M.; Stevens, R.; Pellegrino, M. [Argonne National Lab., IL (United States); Taylor, V. [Northwestern Univ., Evanston, IL (United States). Electrical Engineering and Computer Science
1995-12-31
When performing communications mapping experiments for massively parallel processors, it is important to be able to visualize the mappings and resulting communications. In a molecular dynamics model, visualization of the atom-to-atom interaction and the processor mappings provides insight into the effectiveness of the communications algorithms. The basic quantities available for visualization in a model of this type are the number of molecules per unit volume and the mass and velocity of each molecule. The computational information available for visualization is the atom-to-atom interaction within each time step, the atom-to-processor mapping, and the energy rescaling events. We use the CAVE (CAVE Automatic Virtual Environment) to provide interactive, immersive visualization experiences.
Dependability analysis of parallel systems using a simulation-based approach. M.S. Thesis
NASA Technical Reports Server (NTRS)
Sawyer, Darren Charles
1994-01-01
The analysis of dependability in large, complex, parallel systems executing real applications or workloads is examined in this thesis. To effectively demonstrate the wide range of dependability problems that can be analyzed through simulation, the analysis of three case studies is presented. For each case, the organization of the simulation model used is outlined, and the results from simulated fault injection experiments are explained, showing the usefulness of this method in dependability modeling of large parallel systems. The simulation models are constructed using DEPEND and C++. Where possible, methods to increase dependability are derived from the experimental results. Another interesting facet of all three cases is the presence of some kind of workload or application executing in the simulation while faults are injected. This provides a completely new dimension to this type of study, not possible to model accurately with analytical approaches.
A conflict-free, path-level parallelization approach for sequential simulation algorithms
NASA Astrophysics Data System (ADS)
Rasera, Luiz Gustavo; Machado, Péricles Lopes; Costa, João Felipe C. L.
2015-07-01
Pixel-based simulation algorithms are the most widely used geostatistical technique for characterizing the spatial distribution of natural resources. However, sequential simulation does not scale well for stochastic simulation on very large grids, which are now commonly found in many petroleum, mining, and environmental studies. With the availability of multiple-processor computers, there is an opportunity to develop parallelization schemes for these algorithms to increase their performance and efficiency. Here we present a conflict-free, path-level parallelization strategy for sequential simulation. The method consists of partitioning the simulation grid into a set of groups of nodes and delegating all available processors for simulation of multiple groups of nodes concurrently. An automated classification procedure determines which groups are simulated in parallel according to their spatial arrangement in the simulation grid. The major advantage of this approach is that it does not require conflict resolution operations, and thus allows exact reproduction of results. Besides offering a large performance gain when compared to the traditional serial implementation, the method provides efficient use of computational resources and is generic enough to be adapted to several sequential algorithms.
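One simple way to obtain groups of nodes that can be simulated concurrently without conflicts is a checkerboard colouring of square blocks. The sketch below illustrates that general idea and is not the classification procedure of the paper: the block size, the colouring rule, and the function names are assumptions. If the block edge is at least the search radius, nodes in two different blocks of the same colour are farther apart than the radius, so their search neighbourhoods cannot contain each other.

```python
from itertools import product

def block_groups(nx, ny, block):
    """Group grid nodes into square blocks, then colour blocks in a 2x2 checkerboard."""
    groups = {}  # colour -> {block index -> list of (i, j) nodes}
    for i, j in product(range(nx), range(ny)):
        bi, bj = i // block, j // block
        colour = (bi % 2, bj % 2)
        groups.setdefault(colour, {}).setdefault((bi, bj), []).append((i, j))
    return groups

def min_gap(blocks):
    """Smallest Chebyshev distance between nodes of two different blocks."""
    keys = list(blocks)
    best = None
    for a in range(len(keys)):
        for b in range(a + 1, len(keys)):
            for (i1, j1), (i2, j2) in product(blocks[keys[a]], blocks[keys[b]]):
                d = max(abs(i1 - i2), abs(j1 - j2))
                best = d if best is None else min(best, d)
    return best

radius = 3  # hypothetical search-neighbourhood radius, in grid cells
groups = block_groups(nx=12, ny=12, block=radius)
gaps = {colour: min_gap(blocks) for colour, blocks in groups.items()}
```

Blocks of one colour could then be handed to different processors at the same time, with the four colours visited one after another.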
NASA Astrophysics Data System (ADS)
Bordovitsyna, T. V.; Avdyushev, V. A.; Chuvashov, I. N.; Aleksandrova, A. G.; Tomilova, I. V.
2009-11-01
This paper discusses features of the numerical simulation of the motion of large-scale systems of artificial satellites by parallel computing, using as an example the implementation of the program complex "Numerical model of the system artificial satellites motion" on the cluster "Skiff Cyberia". It is shown that parallel computing makes it possible to carry out high-precision numerical simulation of the motion of a large-scale system of artificial satellites simultaneously. This opens up extensive possibilities for solving direct and inverse problems of the dynamics of satellite systems such as GLONASS and of space debris objects.
Massively Parallel Simulations of Solar Flares and Plasma Turbulence
Grauer, Rainer
-threaded implementation. Figure 1. Simulation of a Rayleigh-Taylor instability. Load balancing is based on a Hilbert in the first place can be naturally obtained in AMR by the use of small grid block sizes that fit well
Efficient Optimistic Parallel Simulations using Reverse Computation Christopher D. Carothers
computations, and to optimize them to transparently reap the performance benefits of reverse computation realizing that it was in fact incorrect. The "computation" in simulation applications is one in which a set of the individual operations that are executed in the event computation. The system guarantees that the inverse op
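The general idea of reverse computation, undoing an optimistically processed event by running an inverse handler instead of restoring a saved copy of the state, can be shown with a small sketch. The queue model, the capacity, and the handler names below are hypothetical and are not taken from the paper. Increments are undone by decrements, and the only information saved per event is the one bit that records which branch was taken.

```python
class QueueState:
    def __init__(self):
        self.length = 0    # jobs waiting
        self.dropped = 0   # jobs rejected because the queue was full

CAPACITY = 2  # hypothetical queue capacity

def arrive_forward(state):
    """Process an arrival; return the one bit needed to undo it."""
    accepted = state.length < CAPACITY
    if accepted:
        state.length += 1
    else:
        state.dropped += 1
    return accepted

def arrive_reverse(state, accepted):
    """Undo an arrival using the saved branch bit instead of a state copy."""
    if accepted:
        state.length -= 1
    else:
        state.dropped -= 1

state = QueueState()
undo_log = [arrive_forward(state) for _ in range(3)]  # optimistic execution
after_forward = (state.length, state.dropped)
for bit in reversed(undo_log):                         # rollback, newest event first
    arrive_reverse(state, bit)
after_rollback = (state.length, state.dropped)
```

Rolling back in reverse order of execution matters: the inverse of the most recent event must run first so that each reverse handler sees the state its forward handler produced.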
Massively Parallel Reactive and Quantum Molecular Dynamics Simulations
NASA Astrophysics Data System (ADS)
Vashishta, Priya
2015-03-01
In this talk I will discuss two simulations. Cavitation bubbles readily occur in fluids subjected to rapid changes in pressure. We use billion-atom reactive molecular dynamics simulations on a 163,840-processor BlueGene/P supercomputer to investigate chemical and mechanical damage caused by shock-induced collapse of nanobubbles in water near a silica surface. Collapse of an empty nanobubble generates a high-speed nanojet, resulting in the formation of a pit on the surface. The gas-filled bubbles undergo partial collapse and consequently the damage on the silica surface is mitigated. Quantum molecular dynamics (QMD) simulations are performed on a 786,432-processor Blue Gene/Q to study on-demand production of hydrogen gas from water using Al nanoclusters. QMD simulations reveal rapid hydrogen production from water by an Al nanocluster. We find a low activation-barrier mechanism, in which a pair of Lewis acid and base sites on the Al_n surface preferentially catalyzes hydrogen production. I will also discuss on-demand production of hydrogen gas from water using LiAl alloy particles. Research reported in this lecture was carried out in collaboration with Rajiv Kalia, Aiichiro Nakano and Ken-ichi Nomura from the University of Southern California, and Fuyuki Shimojo and Kohei Shimamura from Kumamoto University, Japan.
A parallel finite volume algorithm for large-eddy simulation of turbulent flows
NASA Astrophysics Data System (ADS)
Bui, Trong Tri
1998-11-01
A parallel unstructured finite volume algorithm is developed for large-eddy simulation of compressible turbulent flows. Major components of the algorithm include piecewise linear least-square reconstruction of the unknown variables, trilinear finite element interpolation for the spatial coordinates, Roe flux difference splitting, and second-order MacCormack explicit time marching. The computer code is designed from the start to take full advantage of the additional computational capability provided by the current parallel computer systems. Parallel implementation is done using the message passing programming model and message passing libraries such as the Parallel Virtual Machine (PVM) and Message Passing Interface (MPI). The development of the numerical algorithm is presented in detail. The parallel strategy and issues regarding the implementation of a flow simulation code on the current generation of parallel machines are discussed. The results from parallel performance studies show that the algorithm is well suited for parallel computer systems that use the message passing programming model. Nearly perfect parallel speedup is obtained on MPP systems such as the Cray T3D and IBM SP2. Performance comparison with the older supercomputer systems such as the Cray YMP show that the simulations done on the parallel systems are approximately 10 to 30 times faster. The results of the accuracy and performance studies for the current algorithm are reported. To validate the flow simulation code, a number of Euler and Navier-Stokes simulations are done for internal duct flows. Inviscid Euler simulation of a very small amplitude acoustic wave interacting with a shock wave in a quasi-1D convergent-divergent nozzle shows that the algorithm is capable of simultaneously tracking the very small disturbances of the acoustic wave and capturing the shock wave. 
Navier-Stokes simulations are made for fully developed laminar flow in a square duct, developing laminar flow in a rectangular duct, and developing laminar flow in a 90-degree square bend. The Navier-Stokes solutions show good agreement with available analytical solutions and experimental data. To validate the flow simulation code for turbulence simulation, LES of fully-developed turbulent flow in a square duct is performed for a Reynolds number of 320 based on the average friction velocity and the hydraulic diameter of the duct. The accuracy of the above algorithm for turbulence simulations is evaluated by comparison with the DNS solution. The effects of grid resolution, upwind numerical dissipation, and subgrid scale dissipation on the accuracy of the LES are examined. Comparison with DNS results shows that the standard Roe flux difference splitting dissipation adversely affects the accuracy of the turbulence simulation. This problem is unique to the turbulence simulation, since it does not occur in the Euler and laminar Navier-Stokes simulations using the same code. For accurate turbulence simulation, it is found that only three to five percent of the standard Roe flux difference splitting dissipation is needed.
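The second-order MacCormack explicit time marching named in the abstract is a predictor-corrector scheme. The sketch below applies it to the simplest possible case, 1-D linear advection on a periodic grid. It is only an illustration of the scheme, not the unstructured finite volume solver described above; the model equation, the grid, and the function name are assumptions.

```python
def maccormack_step(u, c):
    """One MacCormack step for u_t + a u_x = 0 on a periodic grid, with c = a*dt/dx."""
    n = len(u)
    # Predictor: forward difference in space.
    pred = [u[i] - c * (u[(i + 1) % n] - u[i]) for i in range(n)]
    # Corrector: backward difference on the predicted values, then average.
    # pred[i - 1] wraps to the last cell when i == 0, giving periodicity.
    return [0.5 * (u[i] + pred[i] - c * (pred[i] - pred[i - 1])) for i in range(n)]

pulse = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
shifted = maccormack_step(pulse, 1.0)   # Courant number 1: exact one-cell shift
smeared = maccormack_step(pulse, 0.5)   # Courant number 0.5: dispersive, but conservative
```

At a Courant number of 1 the scheme reproduces the exact solution (a shift by one cell), and for any Courant number the sum of the cell values is conserved on a periodic grid, which makes both properties convenient sanity checks.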
A new parallel method for molecular dynamics simulation of macromolecular systems
NASA Astrophysics Data System (ADS)
Plimpton, S.; Hendrickson, B.
1994-08-01
Short-range molecular dynamics simulations of molecular systems are commonly parallelized by replicated-data methods, where each processor stores a copy of all atom positions. This enables computation of bonded 2-, 3-, and 4-body forces within the molecular topology to be partitioned among processors straightforwardly. A drawback to such methods is that the inter-processor communication scales as N, the number of atoms, independent of P, the number of processors. Thus, their parallel efficiency falls off rapidly when large numbers of processors are used. In this paper a new parallel method called force-decomposition for simulating macromolecular or small-molecule systems is presented. Its memory and communication costs scale as N/√P, allowing larger problems to be run faster on greater numbers of processors. Like replicated-data techniques, and in contrast to spatial-decomposition approaches, the new method can be simply load-balanced and performs well even for irregular simulation geometries. The implementation of the algorithm in a prototypical macromolecular simulation code ParBond is also discussed. On a 1024-processor Intel Paragon, ParBond runs a standard benchmark simulation of solvated myoglobin with a parallel efficiency of 61% and at 40 times the speed of a vectorized version of CHARMM running on a single Cray Y-MP processor.
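The scaling argument can be made concrete with a toy block decomposition of the force matrix. In the sketch below each of the P processors is assigned one block of a √P × √P partition of the N × N matrix of pair forces, so it only needs the positions of 2N/√P atoms instead of all N. This is a simplified serial illustration under assumed names and an assumed 1-D toy force; it omits the refinements of the published method and uses a loop in place of message passing.

```python
import math

def pair_force(xi, xj, cutoff=1.5):
    """Toy short-range force on atom i from atom j (hypothetical linear spring)."""
    d = xj - xi
    return d if 0 < abs(d) < cutoff else 0.0

def serial_forces(x):
    """Reference result: every atom sums over all other atoms."""
    return [sum(pair_force(xi, xj) for xj in x) for xi in x]

def force_decomposition(x, p):
    """Split the N x N force matrix into sqrt(P) x sqrt(P) blocks, one per processor."""
    q = math.isqrt(p)
    assert q * q == p and len(x) % q == 0
    size = len(x) // q
    total = [0.0] * len(x)
    for row in range(q):            # "processor" (row, col) owns one block
        for col in range(q):
            rows = range(row * size, (row + 1) * size)
            cols = range(col * size, (col + 1) * size)
            for i in rows:
                # Partial force on atom i from the atoms in the column block only.
                partial = sum(pair_force(x[i], x[j]) for j in cols)
                total[i] += partial  # stands in for the sum across a processor row
    return total

positions = [0.0, 0.4, 1.1, 1.9, 2.2, 3.0, 3.7, 4.1]
reference = serial_forces(positions)
decomposed = force_decomposition(positions, p=4)
```

With N = 8 and P = 4, each block touches 4 row atoms and 4 column atoms, that is 2N/√P = 8 positions in the worst case, and the gap relative to replicating all N positions widens as P grows.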
Parallel simulation of tsunami inundation on a large-scale supercomputer
NASA Astrophysics Data System (ADS)
Oishi, Y.; Imamura, F.; Sugawara, D.
2013-12-01
An accurate prediction of tsunami inundation is important for disaster mitigation purposes. One approach is to approximate the tsunami wave source through an instant inversion analysis using real-time observation data (e.g., Tsushima et al., 2009) and then use the resulting wave source data in an instant tsunami inundation simulation. However, a bottleneck of this approach is the large computational cost of the non-linear inundation simulation and the computational power of recent massively parallel supercomputers is helpful to enable faster than real-time execution of a tsunami inundation simulation. Parallel computers have become approximately 1000 times faster in 10 years (www.top500.org), and so it is expected that very fast parallel computers will be more and more prevalent in the near future. Therefore, it is important to investigate how to efficiently conduct a tsunami simulation on parallel computers. In this study, we are targeting very fast tsunami inundation simulations on the K computer, currently the fastest Japanese supercomputer, which has a theoretical peak performance of 11.2 PFLOPS. One computing node of the K computer consists of 1 CPU with 8 cores that share memory, and the nodes are connected through a high-performance torus-mesh network. The K computer is designed for distributed-memory parallel computation, so we have developed a parallel tsunami model. Our model is based on TUNAMI-N2 model of Tohoku University, which is based on a leap-frog finite difference method. A grid nesting scheme is employed to apply high-resolution grids only at the coastal regions. To balance the computation load of each CPU in the parallelization, CPUs are first allocated to each nested layer in proportion to the number of grid points of the nested layer. Using CPUs allocated to each layer, 1-D domain decomposition is performed on each layer. 
In the parallel computation, three types of communication are necessary: (1) communication to adjacent neighbours for the finite difference calculation, (2) communication between adjacent layers for the calculations to connect each layer, and (3) global communication to obtain the time step which satisfies the CFL condition in the whole domain. A preliminary test on the K computer showed the parallel efficiency on 1024 cores was 57% relative to 64 cores. We estimate that the parallel efficiency will be considerably improved by applying a 2-D domain decomposition instead of the present 1-D domain decomposition in future work. The present parallel tsunami model was applied to the 2011 Great Tohoku tsunami. The coarsest resolution layer covers a 758 km × 1155 km region with a 405 m grid spacing. A nesting of five layers was used with the resolution ratio of 1/3 between nested layers. The finest resolution region has 5 m resolution and covers most of the coastal region of Sendai city. To complete 2 hours of simulation time, the serial (non-parallel) computation took approximately 4 days on a workstation. To complete the same simulation on 1024 cores of the K computer, it took 45 minutes which is more than two times faster than real-time. This presentation discusses the updated parallel computational performance and the efficient use of the K computer when considering the characteristics of the tsunami inundation simulation model in relation to the characteristics and capabilities of the K computer.
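The load-balancing rule described above (CPUs allocated to each nested layer in proportion to its grid points, followed by a 1-D domain decomposition within each layer) can be sketched as follows. The rounding rule, the function names, and the layer sizes are hypothetical, and the sketch assumes every layer is large enough to receive at least one CPU.

```python
def allocate_cpus(grid_points, total_cpus):
    """Give each nested layer a CPU share proportional to its grid-point count."""
    total = sum(grid_points)
    exact = [total_cpus * g / total for g in grid_points]
    shares = [int(e) for e in exact]  # round down first
    # Hand out the CPUs lost to rounding, largest fractional remainder first.
    order = sorted(range(len(exact)), key=lambda k: exact[k] - shares[k], reverse=True)
    for k in order[: total_cpus - sum(shares)]:
        shares[k] += 1
    return shares

def split_1d(n_rows, n_cpus):
    """1-D domain decomposition: contiguous row ranges of near-equal size."""
    base, extra = divmod(n_rows, n_cpus)
    bounds, start = [], 0
    for rank in range(n_cpus):
        stop = start + base + (1 if rank < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds

layer_points = [1000, 3000, 4000]          # hypothetical grid points per nested layer
shares = allocate_cpus(layer_points, 16)   # CPUs per layer
strips = split_1d(n_rows=10, n_cpus=3)     # row ranges owned by each CPU in one layer
```

For the timings quoted in the abstract, 2 hours of simulated time in 45 minutes of wall-clock time is 120/45 ≈ 2.7 times faster than real time, consistent with the "more than two times" stated above.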
NASA Technical Reports Server (NTRS)
Fijany, Amir (inventor); Bejczy, Antal K. (inventor)
1993-01-01
This is a real-time robotic controller and simulator which is a MIMD-SIMD parallel architecture for interfacing with an external host computer and providing a high degree of parallelism in computations for robotic control and simulation. It includes a host processor for receiving instructions from the external host computer and for transmitting answers to the external host computer. There are a plurality of SIMD microprocessors, each SIMD processor being a SIMD parallel processor capable of exploiting fine grain parallelism and further being able to operate asynchronously to form a MIMD architecture. Each SIMD processor comprises a SIMD architecture capable of performing two matrix-vector operations in parallel while fully exploiting parallelism in each operation. There is a system bus connecting the host processor to the plurality of SIMD microprocessors and a common clock providing a continuous sequence of clock pulses. There is also a ring structure interconnecting the plurality of SIMD microprocessors and connected to the clock for providing the clock pulses to the SIMD microprocessors and for providing a path for the flow of data and instructions between the SIMD microprocessors. The host processor includes logic for controlling the RRCS by interpreting instructions sent by the external host computer, decomposing the instructions into a series of computations to be performed by the SIMD microprocessors, using the system bus to distribute associated data among the SIMD microprocessors, and initiating activity of the SIMD microprocessors to perform the computations on the data by procedure call.
NASA Astrophysics Data System (ADS)
Perumalla, K.; Fujimoto, R.; Pande, S.; Karimabadi, H.; Driscoll, J.; Omelchenko, Y.
2005-12-01
Large parallel/distributed scientific simulations are very complex, and their dynamic behavior is hard to predict. Efficient development of massively parallel codes remains a computational challenge. For example, almost none of the kinetic codes in use in space physics today have dynamic load balancing capability. Here we present a new infrastructure for design and prediction of parallel codes. Performance prediction is useful to analyze, understand and experiment with different partitioning schemes, multiple modeling alternatives and so on, without having to run the application on supercomputers. Instrumentation of the model (with minimal perturbation of performance) is useful to glean key metrics and understand application-level behavior. Unfortunately, traditional approaches to virtual execution and instrumentation are limited by either slow execution speed or low resolution or both. We present a new high-resolution framework that provides a virtual CPU abstraction (with a full thread context per CPU), yet scales to thousands of virtual CPUs. The tool, called PDES2, presents different levels of modeling interfaces, from general purpose parallel simulations to parallel grid-based particle-in-cell (PIC) codes. The tool itself runs on multiple processors in order to accommodate the high resolution by distributing the virtual execution across processors. Validation experiments of PIC models in the framework using a 1-D hybrid shock application show close agreement of results from virtual executions with results from actual supercomputer runs. The utility of this tool is further illustrated through an application to a parallel global hybrid code.
Hendrickson, B.; Plimpton, S.; Attaway, S.; Swegle, J.
1996-09-01
Transient dynamics simulations are commonly used to model phenomena such as car crashes, underwater explosions, and the response of shipping containers to high-speed impacts. Physical objects in such a simulation are typically represented by Lagrangian meshes because the meshes can move and deform with the objects as they undergo stress. Fluids (gasoline, water) or fluid-like materials (earth) in the simulation can be modeled using the techniques of smoothed particle hydrodynamics. Implementing a hybrid mesh/particle model on a massively parallel computer poses several difficult challenges. One challenge is to simultaneously parallelize and load-balance both the mesh and particle portions of the computation. A second challenge is to efficiently detect the contacts that occur within the deforming mesh and between mesh elements and particles as the simulation proceeds. These contacts impart forces to the mesh elements and particles which must be computed at each timestep to accurately capture the physics of interest. In this paper we describe new parallel algorithms for smoothed particle hydrodynamics and contact detection which turn out to have several key features in common. Additionally, we describe how to join the new algorithms with traditional parallel finite element techniques to create an integrated particle/mesh transient dynamics simulation. Our approach to this problem differs from previous work in that we use three different parallel decompositions, a static one for the finite element analysis and dynamic ones for particles and for contact detection. We have implemented our ideas in a parallel version of the transient dynamics code PRONTO-3D and present results for the code running on a large Intel Paragon.
Gedney, S.D.
1990-12-01
The Parallel-Plate Bounded-Wave EMP Simulator is typically used to test the vulnerability of electronic systems to the electromagnetic pulse (EMP) produced by a high altitude nuclear burst by subjecting the systems to a simulated EMP environment. However, when large test objects are placed within the simulator for investigation, the desired EMP environment may be affected by the interaction between the simulator and the test object. This simulator/obstacle interaction can be attributed to the following phenomena: (1) mutual coupling between the test object and the simulator, (2) fringing effects due to the finite width of the conducting plates of the simulator, and (3) multiple reflections between the object and the simulator's tapered end-sections. When the interaction is significant, the measurement of currents coupled into the system may not accurately represent those induced by an actual EMP. To better understand the problem of simulator/obstacle interaction, a dynamic analysis of the fields within the parallel-plate simulator is presented. The fields are computed using a moment method solution based on a wire mesh approximation of the conducting surfaces of the simulator. The fields within an empty simulator are found to be predominantly transverse electromagnetic (TEM) for frequencies within the simulator's bandwidth, properly simulating the properties of the EMP propagating in free space. However, when a large test object is placed within the simulator, it is found that the currents induced on the object can be quite different from those on an object situated in free space. A comprehensive study of the mechanisms contributing to this deviation is presented.
A parallel simulated annealing algorithm for standard cell placement on a hypercube computer
NASA Technical Reports Server (NTRS)
Jones, Mark Howard
1987-01-01
A parallel version of a simulated annealing algorithm is presented which is targeted to run on a hypercube computer. A strategy for mapping the cells in a two dimensional area of a chip onto processors in an n-dimensional hypercube is proposed such that both small and large distance moves can be applied. Two types of moves are allowed: cell exchanges and cell displacements. The computation of the cost function in parallel among all the processors in the hypercube is described along with a distributed data structure that needs to be stored in the hypercube to support parallel cost evaluation. A novel tree broadcasting strategy is used extensively in the algorithm for updating cell locations in the parallel environment. Studies on the performance of the algorithm on example industrial circuits show that it is faster and gives better final placement results than the uniprocessor simulated annealing algorithms. An improved uniprocessor algorithm is proposed which is based on the improved results obtained from parallelization of the simulated annealing algorithm.
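As a point of reference for the two move types named above (cell exchanges and cell displacements), here is a minimal serial sketch of annealing-based placement. It is not the hypercube algorithm of the report: the cost function (Manhattan wirelength over two-pin nets), the geometric cooling schedule, and all names and parameter values are assumptions chosen for illustration.

```python
import math
import random


def wirelength(pos, nets):
    """Total Manhattan length of two-pin nets; pos maps cell -> (x, y)."""
    return sum(
        abs(pos[a][0] - pos[b][0]) + abs(pos[a][1] - pos[b][1]) for a, b in nets
    )


def anneal(pos, nets, grid, steps=2000, t0=5.0, alpha=0.995, seed=0):
    """Serial simulated annealing with exchange and displacement moves."""
    rng = random.Random(seed)
    pos = dict(pos)
    cost = wirelength(pos, nets)
    best, best_cost = dict(pos), cost
    cells = sorted(pos)
    t = t0
    for _ in range(steps):
        trial = dict(pos)
        if rng.random() < 0.5:
            # Exchange move: swap the slots of two distinct cells.
            a, b = rng.sample(cells, 2)
            trial[a], trial[b] = trial[b], trial[a]
        else:
            # Displacement move: send one cell to a random empty slot.
            a = rng.choice(cells)
            slot = (rng.randrange(grid), rng.randrange(grid))
            if slot in trial.values():
                continue  # slot occupied, skip this proposal
            trial[a] = slot
        delta = wirelength(trial, nets) - cost
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            pos, cost = trial, cost + delta
            if cost < best_cost:
                best, best_cost = dict(pos), cost
        t *= alpha  # geometric cooling (assumed schedule)
    return best, best_cost


# Hypothetical 4-cell circuit on a 4x4 grid of slots.
initial = {"a": (0, 0), "b": (3, 3), "c": (0, 3), "d": (3, 0)}
nets = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
placement, final_cost = anneal(initial, nets, grid=4)
print(wirelength(initial, nets), final_cost)
```

A parallel version would additionally have to partition the chip area across processors and keep cell locations consistent, which is what the mapping and tree-broadcast strategies in the abstract address.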
Parallel electric fields in a simulation of magnetotail reconnection and plasmoid evolution
NASA Technical Reports Server (NTRS)
Hesse, M.; Birn, J.
1990-01-01
Properties of the electric field component parallel to the magnetic field are investigated in a 3D MHD simulation of plasmoid formation and evolution in the magnetotail, in the presence of a net dawn-dusk magnetic field component. The spatial localization of E-parallel, the concept of a diffusion zone, and the role of E-parallel in accelerating electrons are discussed. A localization of the region of enhanced E-parallel in all space directions is found, with a strong concentration in the z direction. This region is identified as the diffusion zone, which plays a crucial role in reconnection theory through the local break-down of magnetic flux conservation.
Efficient Parallel Algorithm for Statistical Ion Track Simulations in Crystalline Materials
Jeon, Byoungseon
2008-01-01
We present an efficient parallel algorithm for statistical Molecular Dynamics simulations of ion tracks in solids. The method is based on the Rare Event Enhanced Domain following Molecular Dynamics (REED-MD) algorithm, which has been successfully applied to studies of, e.g., ion implantation into crystalline semiconductor wafers. We discuss the strategies for parallelizing the method, and we settle on a host-client type polling scheme in which a multiple of asynchronous processors are continuously fed to the host, which, in turn, distributes the resulting feed-back information to the clients. This real-time feed-back consists of, e.g., cumulative damage information or statistics updates necessary for the cloning in the rare event algorithm. We finally demonstrate the algorithm for radiation effects in a nuclear oxide fuel, and we show the balanced parallel approach with high parallel efficiency in multiple processor configurations.
Efficient parallel algorithm for statistical ion track simulations in crystalline materials
NASA Astrophysics Data System (ADS)
Jeon, Byoungseon; Grønbech-Jensen, Niels
2009-02-01
We present an efficient parallel algorithm for statistical Molecular Dynamics simulations of ion tracks in solids. The method is based on the Rare Event Enhanced Domain following Molecular Dynamics (REED-MD) algorithm, which has been successfully applied to studies of, e.g., ion implantation into crystalline semiconductor wafers. We discuss the strategies for parallelizing the method, and we settle on a host-client type polling scheme in which a multiple of asynchronous processors are continuously fed to the host, which, in turn, distributes the resulting feed-back information to the clients. This real-time feed-back consists of, e.g., cumulative damage information or statistics updates necessary for the cloning in the rare event algorithm. We finally demonstrate the algorithm for radiation effects in a nuclear oxide fuel, and we show the balanced parallel approach with high parallel efficiency in multiple processor configurations.
Massively parallel Monte Carlo for many-particle simulations on GPUs
Anderson, Joshua A; Grubb, Thomas L; Engel, Michael; Glotzer, Sharon C
2013-01-01
Current trends in parallel processors call for the design of efficient massively parallel algorithms for scientific computing. Parallel algorithms for Monte Carlo simulations of thermodynamic ensembles of particles have received little attention because of the inherent serial nature of the statistical sampling. In this paper, we present a massively parallel method that obeys detailed balance and implement it for a system of hard disks on the GPU. We reproduce results of serial high-precision Monte Carlo runs to verify the method. This is a good test case because the hard disk equation of state over the range where the liquid transforms into the solid is particularly sensitive to small deviations away from the balance conditions. On a GeForce GTX 680, our GPU implementation executes 95 times faster than on a single Intel Xeon E5540 CPU core, enabling 17 times better performance per dollar and cutting energy usage by a factor of 10.
Pacheco, P; Miller, P; Kim, J; Leese, T; Zabiyaka, Y
2003-05-07
Object-oriented NeuroSys (ooNeuroSys) is a collection of programs for simulating very large networks of biologically accurate neurons on distributed memory parallel computers. It includes two principal programs: ooNeuroSys, a parallel program for solving the large systems of ordinary differential equations arising from the interconnected neurons, and Neurondiz, a parallel program for visualizing the results of ooNeuroSys. Both programs are designed to be run on clusters and use the MPI library to obtain parallelism. ooNeuroSys also includes an easy-to-use Python interface. This interface allows neuroscientists to quickly develop and test complex neuron models. Both ooNeuroSys and Neurondiz have a design that allows for both high performance and relative ease of maintenance.
Wake Encounter Analysis for a Closely Spaced Parallel Runway Paired Approach Simulation
NASA Technical Reports Server (NTRS)
Mckissick, Burnell T.; Rico-Cusi, Fernando J.; Murdoch, Jennifer; Oseguera-Lohr, Rosa M.; Stough, Harry P, III; O'Connor, Cornelius J.; Syed, Hazari I.
2009-01-01
A Monte Carlo simulation of simultaneous approaches performed by two transport category aircraft from the final approach fix to a pair of closely spaced parallel runways was conducted to explore the aft boundary of the safe zone in which separation assurance and wake avoidance are provided. The simulation included variations in runway centerline separation, initial longitudinal spacing of the aircraft, crosswind speed, and aircraft speed during the approach. The data from the simulation showed that the majority of the wake encounters occurred near or over the runway and the aft boundaries of the safe zones were identified for all simulation conditions.
Zhao, Jinkui
2011-01-01
IB is a Monte Carlo simulation tool for aiding neutron scattering instrument designs. It is written in C++ and implemented under Parallel Virtual Machine (PVM). The program has a few basic components, or modules, that can be used to build a virtual neutron scattering instrument. More complex components, such as neutron guides and multichannel beam benders, can be constructed using the grouping technique unique to IB. Users can specify a collection of modules as a group. For example, a neutron guide can be constructed by grouping together the four neutron mirrors that make up the four sides of the guide. IB's simulation engine ensures that neutrons entering a group will be properly operated upon by all members of the group. For simulations that require higher computer speed, the program can be run in parallel mode under the PVM architecture. The program was initially written for designing instruments on pulsed neutron sources; it has since been used to simulate reactor-based instruments as well.
Time-partitioning simulation models for calculation on parallel computers
NASA Technical Reports Server (NTRS)
Milner, Edward J.; Blech, Richard A.; Chima, Rodrick V.
1987-01-01
A technique allowing time-staggered solution of partial differential equations is presented in this report. Using this technique, called time-partitioning, simulation execution speedup is proportional to the number of processors used because all processors operate simultaneously, with each updating the solution grid at a different time point. The technique is limited neither by the number of processors available nor by the dimension of the solution grid. Time-partitioning was used to obtain the flow pattern through a cascade of airfoils, modeled by the Euler partial differential equations. An execution speedup factor of 1.77 was achieved using a two-processor Cray X-MP/24 computer.
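To put the reported speedup of 1.77 on two processors in context, the short sketch below computes the corresponding parallel efficiency and the Karp-Flatt experimentally determined serial fraction. These two metrics are standard definitions and are not taken from the report; the function names are illustrative.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processors


def karp_flatt(speedup, processors):
    """Experimentally determined serial fraction (Karp-Flatt metric)."""
    return (1.0 / speedup - 1.0 / processors) / (1.0 - 1.0 / processors)


efficiency = parallel_efficiency(1.77, 2)
serial_fraction = karp_flatt(1.77, 2)
print(f"efficiency = {efficiency:.3f}, serial fraction = {serial_fraction:.3f}")
```

Under these definitions the two-processor run reaches roughly 88% efficiency, which corresponds to an effective serial fraction of about 13%.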
Molecular Dynamic Simulations of Nanostructured Ceramic Materials on Parallel Computers
Priya Vashishta; Rajiv Kalia
2005-01-01
Large-scale molecular-dynamics (MD) simulations have been performed to gain insight into: (1) sintering, structure, and mechanical behavior of nanophase SiC and SiO2; (2) effects of dynamic charge transfers on the sintering of nanophase TiO2; (3) high-pressure structural transformation in bulk SiC and GaAs nanocrystals; (4) nanoindentation in Si3N4; and (5) lattice mismatched InAs/GaAs nanomesas. In addition, we have designed a
PARSEK: a Parallel Software Package for Implicit Particle-in-Cell Simulations
Stefano Markidis; Giovanni Lapenta; Enrico Camporeale
2007-01-01
A C++ software package, called PARSEK, for Particle-in-Cell (PIC) plasma simulations on parallel computers is presented. The PARSEK computational engine is based on the fully implicit solution of the discretized three-dimensional Maxwell's equations and the particle equations of motion. The implicit method allows low-frequency plasma phenomena to be described effectively without the severe restrictions that explicit numerical schemes impose on simulation time steps and
Shuji Ogata; Elefterios Lidorikis; Fuyuki Shimojo; Aiichiro Nakano; Priya Vashishta; Rajiv K. Kalia
2001-01-01
A hybrid simulation approach is developed to study chemical reactions coupled with long-range mechanical phenomena in materials. The finite-element method for continuum mechanics is coupled with the molecular dynamics method for an atomic system that embeds a cluster of atoms described quantum-mechanically with the electronic density-functional method based on real-space multigrids. The hybrid simulation approach is implemented on parallel computers
Xyce parallel electronic simulator reference guide, Version 6.0.1.
Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Verley, Jason C.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Warrender, Christina E.; Baur, David Gregory. [Raytheon, Albuquerque, NM
2014-01-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide [1]. The focus of this document is to list, as exhaustively as possible, device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide [1].
GalaxSee HPC Module 1: The N-Body Problem, Serial and Parallel Simulation
NSDL National Science Digital Library
David Joiner
This module introduces the N-body problem, which seeks to account for the dynamics of systems of multiple interacting objects. Galaxy dynamics serves as the motivating example to introduce a variety of computational methods for simulating change and criteria that can be used to check for model accuracy. Finally, the basic issues and ideas that must be considered when developing a parallel implementation of the simulation are introduced.
Zhang, Jun
Comparison of Parallel Preconditioners in Anisotropic Diffusion Simulation with Human Brain DT ... diffusion process in the human brain by discretizing the governing diffusion equation on Cartesian ... approximate inverse preconditioning strategy, which is robust and efficient with good scalability, gives a much
Parallel Tempering Simulations of HP-36 Chai-Yu Lin,1
Parallel Tempering Simulations of HP-36 Chai-Yu Lin,1 Chin-Kun Hu,1 and Ulrich H. E. Hansmann2* 1-residue villin headpiece subdomain HP-36. Protein-solvent interactions are approximated by an implicit of available CPU time, and physical quantities cannot be calculated accurately. A second problem
Komarov, Ivan; D'Souza, Roshan M.
2012-01-01
The Gillespie Stochastic Simulation Algorithm (GSSA) and its variants are cornerstone techniques to simulate reaction kinetics in situations where the concentration of the reactant is too low to allow deterministic techniques such as differential equations. The inherent limitations of the GSSA include the time required for executing a single run and the need for multiple runs for parameter sweep exercises due to the stochastic nature of the simulation. Even very efficient variants of GSSA are prohibitively expensive to compute and perform parameter sweeps. Here we present a novel variant of the exact GSSA that is amenable to acceleration by using graphics processing units (GPUs). We parallelize the execution of a single realization across threads in a warp (fine-grained parallelism). A warp is a collection of threads that are executed synchronously on a single multi-processor. Warps executing in parallel on different multi-processors (coarse-grained parallelism) simultaneously generate multiple trajectories. Novel data structures and algorithms reduce memory traffic, which is the bottleneck in computing the GSSA. Our benchmarks show an 8× to 120× performance gain over various state-of-the-art serial algorithms when simulating different types of models. PMID:23152751
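For readers unfamiliar with the algorithm being accelerated, the following is a minimal serial sketch of Gillespie's direct method, the classic exact formulation. It is not the GPU variant described in the abstract; the function name, the reaction encoding (a propensity function paired with a state-change vector), and the decay example are assumptions made for illustration.

```python
import math
import random


def gillespie_direct(x0, reactions, t_end, seed=0):
    """Simulate one trajectory with Gillespie's direct method.

    reactions is a list of (propensity_fn, state_change) pairs, where
    propensity_fn maps the state list to a rate and state_change is the
    per-species increment applied when the reaction fires.
    """
    rng = random.Random(seed)
    t, x = 0.0, list(x0)
    trajectory = [(t, tuple(x))]
    while True:
        props = [rate(x) for rate, _ in reactions]
        total = sum(props)
        if total <= 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next event is exponential with rate `total`.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Choose reaction j with probability props[j] / total.
        threshold = rng.random() * total
        acc = 0.0
        for p, (_, change) in zip(props, reactions):
            acc += p
            if threshold < acc:
                break
        x = [xi + d for xi, d in zip(x, change)]
        trajectory.append((t, tuple(x)))
    return trajectory


# Hypothetical example: first-order decay A -> (nothing) with rate 0.5 * A.
decay = [(lambda x: 0.5 * x[0], (-1,))]
traj = gillespie_direct([50], decay, t_end=100.0)
print(len(traj), traj[-1])
```

Each trajectory is inherently sequential in time, which is why the abstract distinguishes parallelism within one realization (across threads in a warp) from parallelism across independent realizations.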
Exploiting Temporal Uncertainty in Parallel and Distributed Simulations 1 Richard M. Fujimoto
zero lookahead using conservative synchronization techniques, a longstanding problem in the parallel Technology Thrust (ASTT) program. a sequential simulation program. Rather than partitioning a large of a large telecommunication network can be realized by federating separate, stand alone sequential
Asymptotic dispersion in 2D heterogeneous porous media determined by parallel numerical simulations
Jean-Raynald de Dreuzy; Anthony Beaudoin; Jocelyne Erhel
2007-01-01
We determine the asymptotic dispersion coefficients in 2D exponentially correlated lognormally distributed permeability fields by using parallel computing. Fluid flow is computed by solving the flow equation discretized on a regular grid and transport triggered by advection and diffusion is simulated by a particle tracker. To obtain a well-defined asymptotic regime under ergodic conditions (initial plume size much larger than
A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL)
NASA Technical Reports Server (NTRS)
Carroll, Chester C.; Owen, Jeffrey E.
1988-01-01
A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL) is presented which overcomes the traditional disadvantages of simulations executed on a digital computer. The incorporation of parallel processing allows the mapping of simulations into a digital computer to be done in the same inherently parallel manner as they are currently mapped onto an analog computer. The direct-execution format maximizes the efficiency of the executed code since the need for a high level language compiler is eliminated. Resolution is greatly increased over that which is available with an analog computer without the sacrifice in execution speed normally expected with digital computer simulations. Although this report covers all aspects of the new architecture, key emphasis is placed on the processing element configuration and the microprogramming of the ACSL constructs. The execution times for all ACSL constructs are computed using a model of a processing element based on the AMD 29000 CPU and the AMD 29027 FPU. The increase in execution speed provided by parallel processing is exemplified by comparing the derived execution times of two ACSL programs with the execution times for the same programs executed on a similar sequential architecture.
A component-based architecture for parallel multi-physics PDE simulation
Steven G. Parker
2006-01-01
We describe the Uintah Computational Framework (UCF), a set of software components and libraries that facilitate the simulation of partial differential equations on structured adaptive mesh refinement grids using hundreds to thousands of processors. The UCF uses a non-traditional approach to achieving parallelism, employing an abstract taskgraph representation to describe computation and communication. This representation has a number of advantages
Christopher J. Hughes; Radek Grzeszczuk; Eftychios Sifakis; Daehyun Kim; Sanjeev Kumar; Andrew Selle; Jatin Chhugani; Matthew J. Holliman; Yen-kuang Chen
2007-01-01
We explore the emerging application area of physics-based simulation for computer animation and visual special effects. In particular, we examine its parallelization potential and characterize its behavior on a chip multiprocessor (CMP). Applications in this domain model and simulate natural phenomena, and often direct visual components of motion pictures. We study a set of
Parallel spatial direct numerical simulations on the Intel iPSC/860 hypercube
NASA Technical Reports Server (NTRS)
Joslin, Ronald D.; Zubair, Mohammad
1993-01-01
The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube is documented. The direct numerical simulation approach is used to compute spatially evolving disturbances associated with the laminar-to-turbulent transition in boundary-layer flows. The feasibility of using the PSDNS on the hypercube to perform transition studies is examined. The results indicate that the direct numerical simulation approach can effectively be parallelized on a distributed-memory parallel machine. As the number of processors increases, nearly ideal linear speedups are achieved with nonoptimized routines; slower-than-linear speedups are achieved with optimized (machine-dependent library) routines. This slower-than-linear speedup results because the Fast Fourier Transform (FFT) routine dominates the computational cost and itself shows less-than-ideal speedups. However, with the machine-dependent routines the total computational cost decreases by a factor of 4 to 5 compared with standard FORTRAN routines. The computational cost increases linearly with spanwise, wall-normal, and streamwise grid refinements. The hypercube with 32 processors was estimated to require approximately twice the amount of Cray supercomputer single-processor time to complete a comparable simulation; however, it is estimated that a subgrid-scale model, which reduces the required number of grid points and turns the approach into a large-eddy simulation (PSLES), would reduce the computational cost and memory requirements by a factor of 10 over the PSDNS. This PSLES implementation would enable transition simulations on the hypercube at a reasonable computational cost.
Seinstra, Frank J.
COMPUTER SYSTEMS Parallel simulation of ion recombination in nonpolar liquids. Frank J. Seinstra, Henri E. Bal. Boelelaan 1081, 1081 HV Amsterdam, Netherlands. Abstract: Ion recombination in nonpolar liquids is an important problem in radiation chemistry. We have designed and implemented a parallel Monte Carlo simulation
Robust large-scale parallel nonlinear solvers for simulations.
Bader, Brett William; Pawlowski, Roger Patrick; Kolda, Tamara Gibson (Sandia National Laboratories, Livermore, CA)
2005-11-01
This report documents research to develop robust and efficient solution techniques for solving large-scale systems of nonlinear equations. The most widely used method for solving systems of nonlinear equations is Newton's method. While much research has been devoted to augmenting Newton-based solvers (usually with globalization techniques), little has been devoted to exploring the application of different models. Our research has been directed at evaluating techniques using different models than Newton's method: a lower order model, Broyden's method, and a higher order model, the tensor method. We have developed large-scale versions of each of these models and have demonstrated their use in important applications at Sandia. Broyden's method replaces the Jacobian with an approximation, allowing codes that cannot evaluate a Jacobian or have an inaccurate Jacobian to converge to a solution. Limited-memory methods, which have been successful in optimization, allow us to extend this approach to large-scale problems. We compare the robustness and efficiency of Newton's method, modified Newton's method, Jacobian-free Newton-Krylov method, and our limited-memory Broyden method. Comparisons are carried out for large-scale applications of fluid flow simulations and electronic circuit simulations. Results show that, in cases where the Jacobian was inaccurate or could not be computed, Broyden's method converged in some cases where Newton's method failed to converge. We identify conditions where Broyden's method can be more efficient than Newton's method. We also present modifications to a large-scale tensor method, originally proposed by Bouaricha, for greater efficiency, better robustness, and wider applicability. Tensor methods are an alternative to Newton-based methods and are based on computing a step based on a local quadratic model rather than a linear model. 
The advantage of Bouaricha's method is that it can use any existing linear solver, which makes it simple to write and easily portable. However, the method usually takes twice as long to solve as Newton-GMRES on general problems because it solves two linear systems at each iteration. In this paper, we discuss modifications to Bouaricha's method for a practical implementation, including a special globalization technique and other modifications for greater efficiency. We present numerical results showing computational advantages over Newton-GMRES on some realistic problems. We further discuss a new approach for dealing with singular (or ill-conditioned) matrices. In particular, we modify an algorithm for identifying a turning point so that an increasingly ill-conditioned Jacobian does not prevent convergence.
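The idea behind Broyden's method, replacing the Jacobian with an approximation that is corrected by a rank-one update at every step, can be sketched in a few lines. The version below is a small dense illustration using the "good Broyden" update applied directly to the inverse approximation (via the Sherman-Morrison formula), starting from the identity. It is not the limited-memory large-scale solver developed in the report; the function names, the starting guess, and the test system are assumptions.

```python
import math


def broyden(F, x0, tol=1e-10, max_iter=50):
    """Solve F(x) = 0 with Broyden's method, never forming a Jacobian.

    H approximates the inverse Jacobian and starts as the identity
    (an assumption that suits problems whose Jacobian is near identity).
    """
    n = len(x0)
    x = list(x0)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    fx = F(x)
    for iteration in range(max_iter):
        if math.sqrt(sum(v * v for v in fx)) < tol:
            return x, iteration
        # Quasi-Newton step: s = -H * F(x).
        s = [-sum(H[i][j] * fx[j] for j in range(n)) for i in range(n)]
        x_new = [xi + si for xi, si in zip(x, s)]
        f_new = F(x_new)
        y = [a - b for a, b in zip(f_new, fx)]
        Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
        sTH = [sum(s[i] * H[i][j] for i in range(n)) for j in range(n)]
        denom = sum(si * hyi for si, hyi in zip(s, Hy))
        if denom != 0.0:
            # Rank-one update: H += (s - H y) (s^T H) / (s^T H y).
            for i in range(n):
                for j in range(n):
                    H[i][j] += (s[i] - Hy[i]) * sTH[j] / denom
        x, fx = x_new, f_new
    return x, max_iter


def example_system(x):
    """Hypothetical weakly coupled 2x2 nonlinear system."""
    return [x[0] - 0.1 * math.cos(x[1]), x[1] - 0.1 * math.sin(x[0])]


root, iterations = broyden(example_system, [0.0, 0.0])
print(root, iterations)
```

Only function evaluations are needed, which is what makes the approach usable for codes that cannot supply an accurate Jacobian. A limited-memory variant would store the update vectors instead of the dense matrix H.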
Relationship between parallel faults and stress field in rock mass based on numerical simulation
NASA Astrophysics Data System (ADS)
Imai, Y.; Mikada, H.; Goto, T.; Takekawa, J.
2012-12-01
Parallel cracks and faults, caused by earthquakes and crustal deformations, are often observed at various scales, from regional to laboratory scale. However, the mechanism of formation of these parallel faults has not yet been quantitatively clarified. Since the stress field plays a key role in the nucleation of parallel faults, it is fundamental to investigate the failure and the extension of cracks in a large-scale rock mass (not a laboratory-scale specimen) under a mechanically loaded stress field. In this study, we developed a numerical simulation code for rock mass failures under different loading conditions, and conducted rock failure experiments using this code. We assumed a numerical rock mass consisting of basalt with a rectangular shape for the model. We also assumed that the rock mass fails in accordance with the Mohr-Coulomb criterion, and that the initial tensile and compressive strengths of the rock elements follow a Weibull distribution. We use the Hamiltonian Particle Method (HPM), one of the particle methods, to represent large deformation and the destruction of materials. Our simulation results suggest that the confining pressure has a dominant influence on the initiation of parallel faults and their conjugates in compressive conditions. We conclude that the shearing force provokes the propagation of parallel fractures along the shearing direction, but prevents the propagation of fractures in the conjugate direction.
Improving the Performance of the Extreme-scale Simulator
Engelmann, Christian [ORNL; Naughton, III, Thomas J [ORNL
2014-01-01
Investigating the performance of parallel applications at scale on future high-performance computing (HPC) architectures and the performance impact of different architecture choices is an important component of HPC hardware/software co-design. The Extreme-scale Simulator (xSim) is a simulation-based toolkit for investigating the performance of parallel applications at scale. xSim scales to millions of simulated Message Passing Interface (MPI) processes. The overhead introduced by a simulation tool is an important performance and productivity aspect. This paper documents two improvements to xSim: (1) a new deadlock resolution protocol to reduce the parallel discrete event simulation management overhead and (2) a new simulated MPI message matching algorithm to reduce the oversubscription management overhead. The results clearly show a significant performance improvement, such as reducing the simulation overhead for running the NAS Parallel Benchmark suite inside the simulator from 1,020% to 238% for the conjugate gradient (CG) benchmark and from 102% to 0% for the embarrassingly parallel (EP) benchmark, as well as from 37,511% to 13,808% for CG and from 3,332% to 204% for EP with accurate process failure simulation.
Large-scale numerical simulation of laser propulsion by parallel computing
NASA Astrophysics Data System (ADS)
Zeng, Yaoyuan; Zhao, Wentao; Wang, Zhenghua
2013-05-01
As one of the most significant methods for studying laser-propelled rockets, the numerical simulation of laser propulsion has drawn ever-increasing attention. Nevertheless, the traditional serial simulation model cannot satisfy practical needs because of its excessive memory overhead and considerable computation time. To solve this problem, we study a general algorithm for laser propulsion design and parallelize it using a two-level hybrid parallel programming model. The total computing domain is decomposed into distributed data spaces, and each partition is assigned to an MPI process. A single step of computation operates in the inter loop level, where a compiler directive is used to split each MPI process into several OpenMP threads. Finally, the parallel efficiency of the hybrid program for two typical configurations on a China-made supercomputer with 4 to 256 cores is compared with that of a pure MPI program. The hybrid program exhibits better performance than the pure MPI program on the whole, roughly as expected. The result indicates that our hybrid parallel approach is effective and practical for large-scale numerical simulation of laser propulsion.
Midpoint cell method for hybrid (MPI+OpenMP) parallelization of molecular dynamics simulations.
Jung, Jaewoon; Mori, Takaharu; Sugita, Yuji
2014-05-30
We have developed a new hybrid (MPI+OpenMP) parallelization scheme for molecular dynamics (MD) simulations by combining a cell-wise version of the midpoint method with pair-wise Verlet lists. In this scheme, which we call the midpoint cell method, simulation space is divided into subdomains, each of which is assigned to an MPI processor. Each subdomain is further divided into small cells. The interaction between two particles in different cells is computed in the subdomain containing the midpoint cell of the two cells where the particles reside. In each MPI processor, cell pairs are distributed over OpenMP threads for shared memory parallelization. The midpoint cell method keeps the advantages of the original midpoint method, while filtering out unnecessary calculations of midpoint checking for all the particle pairs by single midpoint cell determination prior to MD simulations. Distributing cell pairs over OpenMP threads allows for more efficient shared memory parallelization compared with distributing atom indices over threads. Furthermore, cell grouping of particle data improves memory access, reducing the number of cache misses. The parallel performance of the midpoint cell method on the K computer showed scalability up to 512 and 32,768 cores for systems of 20,000 and 1 million atoms, respectively. One MD time step for long-range interactions could be calculated within 4.5 ms even for a 1-million-atom system with particle-mesh Ewald electrostatics. PMID:24659253
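The ownership rule described in this abstract can be sketched in a few lines: a cell pair is assigned to whichever subdomain contains the pair's midpoint cell. This is a simplified illustration, not the authors' implementation. It assumes integer cell indices, no periodic boundaries, and rounding down when the midpoint falls between two cells; `midpoint_cell` and `owner_rank` are hypothetical names.

```python
def midpoint_cell(cell_a, cell_b):
    """Index of the cell midway between two cells (ties rounded down)."""
    return tuple((a + b) // 2 for a, b in zip(cell_a, cell_b))


def owner_rank(cell, cells_per_subdomain, subdomains_per_axis):
    """Rank of the subdomain containing `cell`, in row-major order."""
    sx, sy, sz = (c // n for c, n in zip(cell, cells_per_subdomain))
    _, ny, nz = subdomains_per_axis
    return (sx * ny + sy) * nz + sz


# Hypothetical layout: 4 x 4 x 4 subdomains, each holding 2 x 2 x 2 cells.
pair = ((0, 0, 0), (2, 4, 6))
rank = owner_rank(midpoint_cell(*pair), (2, 2, 2), (4, 4, 4))
```

Since the owner depends only on the two cell indices, it can be determined once per cell pair before the simulation starts rather than once per particle pair at every step, which is the saving the abstract describes.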
libMesh: a C++ library for parallel adaptive mesh refinement/coarsening simulations
Benjamin S. Kirk; John W. Peterson; Roy H. Stogner; Graham F. Carey
2006-01-01
In this paper we describe the libMesh (http://libmesh.sourceforge.net) framework for parallel adaptive finite element applications. libMesh is an open-source software library that has been developed to facilitate serial and parallel simulation of multiscale, multiphysics applications using adaptive mesh refinement and coarsening strategies. The main software development is being carried out in the CFDLab (http://cfdlab.ae.utexas.edu) at the University of Texas, but
A parallel multiprocessor system for Monte Carlo simulations in statistical physics
NASA Astrophysics Data System (ADS)
Saarinen, Jukka; Kaski, Kimmo; Viitanen, Jouko
1989-09-01
Hardware architecture and operating software of a parallel multiprocessor system for various Monte Carlo simulations of models in statistical physics are described. The main parts of the multiprocessor system are the host processor, three tightly coupled MIMD parallel processors, and their individual memories. Our multiprocessor system is flexible to operate, easy to expand, and executes programs written in the high-level tamc language, which is a subset of the C programming language. The host computer is used to develop, compile, and download programs and to transmit the results from work memory to mass memory for further analysis.
Re-forming supercritical quasi-parallel shocks. I - One- and two-dimensional simulations
NASA Technical Reports Server (NTRS)
Thomas, V. A.; Winske, D.; Omidi, N.
1990-01-01
The process of reforming supercritical quasi-parallel shocks is investigated using one-dimensional and two-dimensional hybrid (particle ion, massless fluid electron) simulations both of shocks and of simpler two-stream interactions. It is found that the supercritical quasi-parallel shock is not steady. Instead of a well-defined shock ramp between upstream and downstream states that remains at a fixed position in the flow, the ramp periodically steepens, broadens, and then reforms upstream of its former position. It is concluded that the wave generation process is localized at the shock ramp and that the reformation process proceeds in the absence of upstream perturbations intersecting the shock.
NASA Astrophysics Data System (ADS)
Abraham, Mark James; Murtola, Teemu; Schulz, Roland; Páll, Szilárd; Smith, Jeremy C.; Hess, Berk; Lindahl, Erik
2015-09-01
GROMACS is one of the most widely used open-source and free software codes in chemistry, used primarily for dynamical simulations of biomolecules. It provides a rich set of calculation types, preparation and analysis tools. Several advanced techniques for free-energy calculations are supported. In version 5, it reaches new performance heights, through several new and enhanced parallelization algorithms. These work on every level; SIMD registers inside cores, multithreading, heterogeneous CPU-GPU acceleration, state-of-the-art 3D domain decomposition, and ensemble-level parallelization through built-in replica exchange and the separate Copernicus framework. The latest best-in-class compressed trajectory storage format is supported.
NASA Technical Reports Server (NTRS)
Lyons, Daniel T.; Desai, Prasun N.
2005-01-01
This paper will describe the Entry, Descent and Landing simulation tradeoffs and techniques that were used to provide the Monte Carlo data required to approve entry during a critical period just before entry of the Genesis Sample Return Capsule. The same techniques will be used again when Stardust returns on January 15, 2006. Only one hour was available for the simulation which propagated 2000 dispersed entry states to the ground. Creative simulation tradeoffs combined with parallel processing were needed to provide the landing footprint statistics that were an essential part of the Go/NoGo decision that authorized release of the Sample Return Capsule a few hours before entry.
Parallel Monte Carlo simulations on an ARC-enabled computing grid
NASA Astrophysics Data System (ADS)
Nilsen, Jon K.; Samset, Bjørn H.
2011-12-01
Grid computing opens new possibilities for running heavy Monte Carlo simulations of physical systems in parallel. The presentation gives an overview of GaMPI, a system for running an MPI-based random walker simulation on grid resources. Integrating the ARC middleware and the new storage system Chelonia with the Ganga grid job submission and control system, we show that MPI jobs can be run on a world-wide computing grid with good performance and promising scaling properties. Results for relatively communication-heavy Monte Carlo simulations run on multiple heterogeneous, ARC-enabled computing clusters in several countries are presented.
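The pattern behind a parallel random-walker simulation of this kind can be sketched without MPI: each rank runs its own walkers with its own random stream, and the partial sums are combined at the end. This is not GaMPI code. The list comprehension stands in for an MPI reduction, and seeding each rank with `base_seed + rank` is a simplification that production codes would replace with a proper parallel random number generator.

```python
import random


def walk_rank(rank, walkers, steps, base_seed=1234):
    """Sum of squared end positions for the 1D walkers owned by one rank."""
    rng = random.Random(base_seed + rank)  # simplistic per-rank stream
    total = 0.0
    for _ in range(walkers):
        pos = 0
        for _ in range(steps):
            pos += rng.choice((-1, 1))
        total += pos * pos
    return total


def mean_square_displacement(ranks, walkers_per_rank, steps):
    """Combine per-rank partial sums (stand-in for an MPI reduce)."""
    partial = [walk_rank(r, walkers_per_rank, steps) for r in range(ranks)]
    return sum(partial) / (ranks * walkers_per_rank)


msd = mean_square_displacement(ranks=4, walkers_per_rank=500, steps=100)
```

For an unbiased walk the mean square displacement should be close to the number of steps. The only communication here is the final reduction; the abstract describes its workloads as relatively communication-heavy, so real runs exchange considerably more than this sketch does.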
Parallel DEVS: a parallel, hierarchical, modular, modeling formalism
Alex Chung Hen Chow; Bernard P. Zeigler
1994-01-01
ABSTRACT We present a revision of the hierarchical, modular Discrete Event System Specification (DEVS) modeling formalism. The revision distinguishes between transition collisions and ordinary external events in the external transition function of DEVS models. Such separation enables us to extend the modeling capability of the collisions. The revision also does away with the necessity for tie-breaking of simultaneously scheduled events, as embodied in the select function. The latter is replaced by a well-defined and consistent formal construct that allows all transitions to be simultaneously activated. The revision provides a modeler with both conceptual and parallel-execution benefits. 1 INTRODUCTION The Discrete Event System Specification (DEVS) for-
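The distinction this abstract draws can be illustrated with a toy atomic model that has three transition functions: internal, external, and a separate one for collisions, where inputs arrive at exactly the moment an internal event is due. The sketch below is an illustration rather than the paper's formal definition. The `Counter` model, the `step` driver, and the choice to resolve a collision by applying the internal transition first are all assumptions.

```python
class Counter:
    """Toy atomic model: accumulates inputs, emits the total every `period`."""

    def __init__(self, period=1.0):
        self.period = period
        self.total = 0
        self.sigma = period  # time remaining until the next internal event

    def time_advance(self):
        return self.sigma

    def output(self):
        return self.total

    def delta_int(self):
        self.total = 0
        self.sigma = self.period

    def delta_ext(self, elapsed, bag):
        self.total += sum(bag)  # inputs arrive as a bag, not one at a time
        self.sigma -= elapsed

    def delta_con(self, bag):
        # Collision handling (assumed policy): internal first, then external.
        self.delta_int()
        self.delta_ext(0.0, bag)


def step(model, elapsed, bag):
    """Apply exactly one transition; return the output if one was emitted."""
    internal_due = elapsed >= model.time_advance()
    if internal_due:
        out = model.output()
        if bag:
            model.delta_con(bag)
        else:
            model.delta_int()
        return out
    if bag:
        model.delta_ext(elapsed, bag)
    return None
```

Because the model itself states what happens on a collision, the simulator needs no global tie-breaking order among models scheduled at the same time and can activate them all together.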
Parallel computing simulation of fluid flow in the unsaturated zone of Yucca Mountain, Nevada.
Zhang, Keni; Wu, Yu-Shu; Bodvarsson, G S
2003-01-01
This paper presents the application of parallel computing techniques to large-scale modeling of fluid flow in the unsaturated zone (UZ) at Yucca Mountain, Nevada. In this study, parallel computing techniques, as implemented in the TOUGH2 code, are applied in large-scale numerical simulations on a distributed-memory parallel computer. The modeling study has been conducted using an over-1-million-cell three-dimensional numerical model, which incorporates a wide variety of field data for the highly heterogeneous fractured formation at Yucca Mountain. The objective of this study is to analyze the impact of various surface infiltration scenarios (under current and possible future climates) on flow through the UZ system, using various hydrogeological conceptual models with refined grids. The results indicate that the 1-million-cell models produce better-resolved results and reveal some flow patterns that cannot be obtained using coarse-grid models. PMID:12714301
Zhang, Keni; Wu, Yu-Shu; Bodvarsson, G.S.
2001-08-31
This paper presents the application of parallel computing techniques to large-scale modeling of fluid flow in the unsaturated zone (UZ) at Yucca Mountain, Nevada. In this study, parallel computing techniques, as implemented in the TOUGH2 code, are applied in large-scale numerical simulations on a distributed-memory parallel computer. The modeling study has been conducted using an over-one-million-cell three-dimensional numerical model, which incorporates a wide variety of field data for the highly heterogeneous fractured formation at Yucca Mountain. The objective of this study is to analyze the impact of various surface infiltration scenarios (under current and possible future climates) on flow through the UZ system, using various hydrogeological conceptual models with refined grids. The results indicate that the one-million-cell models produce better-resolved results and reveal some flow patterns that cannot be obtained using coarse-grid models.
Evaluating the performance of parallel subsurface simulators: An illustrative example with PFLOTRAN
NASA Astrophysics Data System (ADS)
Hammond, G. E.; Lichtner, P. C.; Mills, R. T.
2014-01-01
To better inform the subsurface scientist on the expected performance of parallel simulators, this work investigates performance of the reactive multiphase flow and multicomponent biogeochemical transport code PFLOTRAN as it is applied to several realistic modeling scenarios run on the Jaguar supercomputer. After a brief introduction to the code's parallel layout and code design, PFLOTRAN's parallel performance (measured through strong and weak scalability analyses) is evaluated in the context of conceptual model layout, software and algorithmic design, and known hardware limitations. PFLOTRAN scales well (with regard to strong scaling) for three realistic problem scenarios: (1) in situ leaching of copper from a mineral ore deposit within a 5-spot flow regime, (2) transient flow and solute transport within a regional doublet, and (3) a real-world problem involving uranium surface complexation within a heterogeneous and extremely dynamic variably saturated flow field. Weak scalability is discussed in detail for the regional doublet problem, and several difficulties with its interpretation are noted.
Parallel Sparse Matrix Solver on the GPU Applied to Simulation of Electrical Machines
Rodrigues, Antonio Wendell De Oliveira; Menach, Yvonnick Le; Dekeyser, Jean-Luc
2010-01-01
Nowadays, several industrial applications are being ported to parallel architectures, since these platforms offer higher performance for system modelling and simulation. In the electric machines area, there are many problems whose solution needs to be sped up. This paper examines the parallelization of a sparse matrix solver on graphics processors. More specifically, we implement the conjugate gradient technique with the input matrix stored in CSR, and Symmetric CSR and CSC formats. This method is one of the most efficient iterative methods available for solving the finite-element basis functions of Maxwell's equations. The GPU (Graphics Processing Unit), which is used for its implementation, provides mechanisms to parallelize the algorithm. Thus, it significantly increases the computation speed relative to serial code on CPU-based systems.
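For readers unfamiliar with the two ingredients named here, the following serial sketch shows a conjugate gradient solver whose only access to the matrix is a CSR matrix-vector product. It is a plain-Python illustration, not the authors' GPU code, and it assumes a symmetric positive-definite matrix. The function names and the 2 x 2 test system are invented for the example.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """y = A @ x for a matrix stored in compressed sparse row (CSR) form."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):  # rows are independent: the natural parallel loop
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def conjugate_gradient(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A given in CSR form."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the initial guess x = 0
    p = list(r)
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# A = [[4, 1], [1, 3]], b = [1, 2]; the exact solution is [1/11, 7/11].
solution = conjugate_gradient([4.0, 1.0, 1.0, 3.0], [0, 1, 0, 1], [0, 2, 4], [1.0, 2.0])
```

Each iteration consists of one sparse matrix-vector product plus a few dot products and vector updates, all of which are data-parallel, which is what makes the method a natural fit for a GPU.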
NASA Astrophysics Data System (ADS)
Jaure, S.; Duchaine, F.; Staffelbach, G.; Gicquel, L. Y. M.
2013-01-01
Optimizing gas turbines is a complex multi-physical and multi-component problem that has long been based on expensive experiments. Today, computer simulation can reduce design process costs and is acknowledged as a promising path for optimization. However, performing such computations using high-fidelity methods such as a large eddy simulation (LES) on gas turbines is challenging. Nevertheless, such simulations are becoming accessible for specific components of gas turbines. These stand-alone simulations face a new challenge: to improve the quality of the results, new physics must be introduced. Therefore, an efficient massively parallel coupling methodology is investigated. The flow solver modeling relies on the LES code AVBP, which has already been ported to massively parallel architectures. The conduction solver is based on the same data structure and thus shares its scalability. Accurately coupling these solvers while maintaining their scalability is challenging and is the actual objective of this work. To reach this goal, a methodology is proposed and different key issues in coding the coupling are addressed: convergence, stability, parallel geometry mapping, transfers and interpolation. This methodology is then applied to a real burner configuration, hence demonstrating the possibilities and limitations of the solution.
Simulation and Optimization of Wind Farm Operations under Stochastic Conditions
Byon, Eunshin
2011-08-08
generation model, wind speed model, and maintenance model. We provide practical insights gained by examining different maintenance strategies. To the best of our knowledge, our simulation model is the first discrete-event simulation model for wind farm...
A Component-Based Extension Framework for Large-Scale Parallel Simulations in NEURON
King, James G.; Hines, Michael; Hill, Sean; Goodman, Philip H.; Markram, Henry; Schürmann, Felix
2008-01-01
As neuronal simulations approach larger scales with increasing levels of detail, the neurosimulator software represents only a part of a chain of tools ranging from setup, simulation, and interaction with virtual environments to analysis and visualization. Previously published approaches to abstracting simulator engines have not received widespread acceptance, which in part may be due to the fact that they tried to address the challenge of solving the model specification problem. Here, we present an approach that uses a neurosimulator, in this case NEURON, to describe and instantiate the network model in the simulator's native model language but then replaces the main integration loop with its own. Existing parallel network models are easily adapted to run in the presented framework. The presented approach is thus an extension to NEURON but uses a component-based architecture to allow for replaceable spike exchange components and pluggable components for monitoring, analysis, or control that can run in this framework alongside the simulation. PMID:19430597
Parallel electric fields in a simulation of magnetotail reconnection and plasmoid evolution
NASA Technical Reports Server (NTRS)
Hesse, Michael; Birn, Joachim
1989-01-01
Properties of the electric field component parallel to the magnetic field (E(sub parallel)) in a three-dimensional MHD simulation of plasmoid formation and evolution in the magnetotail in the presence of a net dawn-dusk magnetic field component were observed. Particularly emphasized was the spatial location of E(sub parallel), the concept of a diffusion zone and the role of E(sub parallel) in accelerating electrons. A localization of the region of enhanced E(sub parallel) in all space directions with a strong concentration in the z direction was found. This region was identified as the diffusion zone, which plays a crucial role in reconnection theory through the local break-down of magnetic flux conservation. The presence of B(sub y) implies a north-south asymmetry of the injection of accelerated particles into the near-earth region, if the net B(sub y) field is strong enough to force particles to follow field lines through the diffusion region. It is estimated that for a typical net B(sub y) field this should affect the injection of electrons into the near-earth dawn region, so that precipitation into the Northern (Southern) Hemisphere should dominate for duskward (dawnward) net B(sub y). In addition, a spatial clottiness of the expected injection of adiabatic particles, which could be related to the appearance of bright spots in auroras, was observed.
Relevance of the parallel nonlinearity in gyrokinetic simulations of tokamak plasmas
Candy, J.; Waltz, R. E.; Parker, S. E.; Chen, Y. [General Atomics, San Diego, California 92121 (United States); Center for Integrated Plasma Studies, University of Colorado at Boulder, Boulder, Colorado 80309 (United States)
2006-07-15
The influence of the parallel nonlinearity on transport in gyrokinetic simulations is assessed for values of $\rho_*$ which are typical of current experiments. Here, $\rho_* = \rho_s/a$ is the ratio of gyroradius, $\rho_s$, to plasma minor radius, $a$. The conclusion, derived from simulations with both GYRO [J. Candy and R. E. Waltz, J. Comput. Phys., 186, 585 (2003)] and GEM [Y. Chen and S. E. Parker, J. Comput. Phys., 189, 463 (2003)], is that no measurable effect of the parallel nonlinearity is apparent for $\rho_* < 0.012$. This result is consistent with scaling arguments, which suggest that the parallel nonlinearity should be $O(\rho_*)$ smaller than the $E \times B$ nonlinearity. Indeed, for the plasma parameters under consideration, the magnitude of the parallel nonlinearity is a factor of $8\rho_*$ smaller (for $0.00075 < \rho_* < 0.012$) than the other retained terms in the nonlinear gyrokinetic equation.
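To make the quoted scaling concrete, the short calculation below evaluates the stated factor at both ends of the quoted range. It assumes that "a factor of 8 rho-star smaller" means the ratio of the parallel nonlinearity to the other retained terms is about eight times the gyroradius-to-minor-radius ratio; that reading is an interpretation of the abstract, not a statement taken from the paper.

```latex
\rho_* = \frac{\rho_s}{a}, \qquad
\frac{|\text{parallel nonlinearity}|}{|\text{other retained terms}|} \approx 8\rho_*
\quad\Longrightarrow\quad
8 \times 0.00075 = 0.006, \qquad 8 \times 0.012 = 0.096 .
```

Under that reading, the parallel nonlinearity ranges from roughly 0.6% to roughly 10% of the other terms over the quoted interval.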
Parallel-in-time implementation of transient stability simulations on a transputer network
La Scala, M.; Sblendorio, G.; Sbrizzai, R. (Politecnico di Bari (Italy). Dept. di Elettrotecnica ed Elettronica)
1994-05-01
The most time consuming computer simulation in power system studies is the transient stability analysis. In recent years, parallel processing has been applied for time domain simulations of power system transient behavior. In this paper, a parallel implementation of an algorithm based on Shifted-Picard dynamic iterations is presented. The main idea is that a set of nonlinear Differential Algebraic Equations (DAEs), which describes the system, can be solved by the iterative solution of a linear set of DAEs. The time behavior of the linear set of differential equations can be obtained by the evaluation of the convolution integral. In the parallel-in-time implementation of the proposed algorithm, each processor is devoted to the evaluation of the complete set of variables relative to each time step. The quadrature formula, adopted for the integral evaluation, can be easily parallelized by using a number of processors equal to the number of time steps. The algorithm, implemented on a transputer network with 32 Inmos T800/20 adopting a uni-directional ring topology, has been tested on standard power systems.
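The parallel-in-time idea can be illustrated with ordinary Picard iteration on a scalar ODE: every sweep evaluates the right-hand side at all time steps using only the previous iterate, so those evaluations are independent of one another. This sketch is not the Shifted-Picard algorithm of the paper, which works on a linearized set of DAEs and a convolution integral. The trapezoidal quadrature, the function names, and the test problem are assumptions for illustration.

```python
import math


def picard_sweep(f, t, x_prev, x0):
    """One Picard update: x_new(t_i) = x0 + integral of f(s, x_prev(s)) ds."""
    # These evaluations use only the previous iterate, so each time step
    # could be handled by a different processor.
    g = [f(ti, xi) for ti, xi in zip(t, x_prev)]
    x_new = [x0]
    for i in range(1, len(t)):  # trapezoidal running sum (a prefix scan)
        h = t[i] - t[i - 1]
        x_new.append(x_new[-1] + 0.5 * h * (g[i - 1] + g[i]))
    return x_new


def picard_solve(f, t, x0, sweeps=30):
    """Iterate from the constant initial guess x(t) = x0."""
    x = [x0] * len(t)
    for _ in range(sweeps):
        x = picard_sweep(f, t, x, x0)
    return x


# Test problem dx/dt = -x, x(0) = 1 on [0, 1]; the exact value at t = 1 is exp(-1).
grid = [i / 100.0 for i in range(101)]
trajectory = picard_solve(lambda t, x: -x, grid, 1.0)
error_at_end = abs(trajectory[-1] - math.exp(-1.0))
```

The whole time window is updated on every sweep, trading extra iterations for concurrency across time steps. The running sum in the sweep is itself a scan operation, which also has well-known parallel implementations.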
A Parallel, Finite-Volume Algorithm for Large-Eddy Simulation of Turbulent Flows
NASA Technical Reports Server (NTRS)
Bui, Trong T.
1999-01-01
A parallel, finite-volume algorithm has been developed for large-eddy simulation (LES) of compressible turbulent flows. This algorithm includes piecewise linear least-square reconstruction, trilinear finite-element interpolation, Roe flux-difference splitting, and second-order MacCormack time marching. Parallel implementation is done using the message-passing programming model. In this paper, the numerical algorithm is described. To validate the numerical method for turbulence simulation, LES of fully developed turbulent flow in a square duct is performed for a Reynolds number of 320 based on the average friction velocity and the hydraulic diameter of the duct. Direct numerical simulation (DNS) results are available for this test case, and the accuracy of this algorithm for turbulence simulations can be ascertained by comparing the LES solutions with the DNS results. The effects of grid resolution, upwind numerical dissipation, and subgrid-scale dissipation on the accuracy of the LES are examined. Comparison with DNS results shows that the standard Roe flux-difference splitting dissipation adversely affects the accuracy of the turbulence simulation. For accurate turbulence simulations, only 3-5 percent of the standard Roe flux-difference splitting dissipation is needed.
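The role of the upwind dissipation term can be seen in a scalar analogue. The sketch below writes a Roe-type interface flux for the inviscid Burgers equation as a central average minus a dissipation term, with a scale factor on the dissipation. The paper's scheme applies to the compressible flow equations, so this is only a simplified illustration; `roe_flux_burgers` and `dissipation_scale` are hypothetical names.

```python
def roe_flux_burgers(u_left, u_right, dissipation_scale=1.0):
    """Interface flux for u_t + (u^2 / 2)_x = 0.

    dissipation_scale = 1.0 gives the standard Roe (upwind) flux;
    0.0 gives a purely central flux with no numerical dissipation.
    """
    f_left = 0.5 * u_left ** 2
    f_right = 0.5 * u_right ** 2
    roe_speed = 0.5 * (u_left + u_right)  # Roe-averaged wave speed
    central = 0.5 * (f_left + f_right)
    dissipation = 0.5 * abs(roe_speed) * (u_right - u_left)
    return central - dissipation_scale * dissipation


full = roe_flux_burgers(2.0, 1.0)           # standard Roe flux
reduced = roe_flux_burgers(2.0, 1.0, 0.05)  # 5 percent of the dissipation
```

With the full term the flux reduces to the upwind value, while a small scale factor keeps the flux close to central. This is the kind of adjustment the abstract refers to when it reports that only 3 to 5 percent of the standard dissipation is needed for accurate turbulence simulation.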
Parallel solutions for voxel-based simulations of reaction-diffusion systems.
D'Agostino, Daniele; Pasquale, Giulia; Clematis, Andrea; Maj, Carlo; Mosca, Ettore; Milanesi, Luciano; Merelli, Ivan
2014-01-01
There is an increasing awareness of the pivotal role of noise in biochemical processes and of the effect of molecular crowding on the dynamics of biochemical systems. This awareness has given rise to a strong need for suitable and sophisticated algorithms for the simulation of biological phenomena taking into account both spatial effects and noise. However, the high computational effort characterizing simulation approaches, coupled with the necessity to simulate the models several times to achieve statistically relevant information on the model behaviours, makes this kind of algorithm very time-consuming for studying real systems. So far, different parallelization approaches have been deployed to reduce the computational time required to simulate the temporal dynamics of biochemical systems using stochastic algorithms. In this work we discuss these aspects for the spatial TAU-leaping in crowded compartments (STAUCC) simulator, a voxel-based method for the stochastic simulation of reaction-diffusion processes which relies on the Sτ-DPP algorithm. In particular, we present how the characteristics of the algorithm can be exploited for an effective parallelization on present-day heterogeneous HPC architectures. PMID:25045716
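The basic step that tau-leaping methods build on can be sketched for a single well-mixed compartment: over a leap of length tau, each reaction fires a Poisson-distributed number of times with mean equal to its propensity times tau. This is generic tau-leaping, not the STAUCC simulator. Diffusion between voxels, crowding, and adaptive selection of tau are omitted, and all names and rate values are illustrative assumptions.

```python
import math
import random


def poisson(rng, mean):
    """Poisson sample by Knuth's multiplication method (fine for small means)."""
    if mean <= 0.0:
        return 0
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def tau_leap_step(state, reactions, tau, rng):
    """Advance species counts by one leap of length tau.

    `reactions` is a list of (propensity_function, change) pairs, where
    `change` maps a species name to its change per firing.
    Note: nothing here prevents negative counts; real codes bound tau.
    """
    new_state = dict(state)
    for propensity, change in reactions:
        firings = poisson(rng, propensity(state) * tau)
        for species, delta in change.items():
            new_state[species] += firings * delta
    return new_state


# Hypothetical first-order decay A -> B with rate constant 0.1.
decay = [(lambda s: 0.1 * s["A"], {"A": -1, "B": +1})]
after = tau_leap_step({"A": 1000, "B": 0}, decay, tau=0.1, rng=random.Random(42))
```

In a voxel-based setting, a step like this would be applied to every voxel, with diffusion treated as additional events that move molecules between neighbouring voxels. The independence of the per-voxel work is what a parallel implementation can exploit.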
Object-Oriented Parallel Particle-in-Cell Code for Beam Dynamics Simulation in Linear Accelerators
Qiang, J.; Ryne, R.D.; Habib, S.; Decky, V.
1999-11-13
In this paper, we present an object-oriented three-dimensional parallel particle-in-cell code for beam dynamics simulation in linear accelerators. A two-dimensional parallel domain decomposition approach is employed within a message passing programming paradigm along with a dynamic load balancing. Implementing object-oriented software design provides the code with better maintainability, reusability, and extensibility compared with conventional structure based code. This also helps to encapsulate the details of communications syntax. Performance tests on SGI/Cray T3E-900 and SGI Origin 2000 machines show good scalability of the object-oriented code. Some important features of this code also include employing symplectic integration with linear maps of external focusing elements and using z as the independent variable, typical in accelerators. A successful application was done to simulate beam transport through three superconducting sections in the APT linac design.
Schuchardt, Karen L.; Agarwal, Khushbu; Chase, Jared M.; Rockhold, Mark L.; Freedman, Vicky L.; Elsethagen, Todd O.; Scheibe, Timothy D.; Chin, George; Sivaramakrishnan, Chandrika
2010-07-15
The Support Architecture for Large-Scale Subsurface Analysis (SALSSA) provides an extensible framework, sophisticated graphical user interface, and underlying data management system that simplifies the process of running subsurface models, tracking provenance information, and analyzing the model results. Initially, SALSSA supported two styles of job control: user directed execution and monitoring of individual jobs, and load balancing of jobs across multiple machines taking advantage of many available workstations. Recent efforts in subsurface modelling have been directed at advancing simulators to take advantage of leadership class supercomputers. We describe two approaches, current progress, and plans toward enabling efficient application of the subsurface simulator codes via the SALSSA framework: automating sensitivity analysis problems through task parallelism, and task parallel parameter estimation using the PEST framework.
Adaptive finite element simulation of flow and transport applications on parallel computers
NASA Astrophysics Data System (ADS)
Kirk, Benjamin Shelton
The subject of this work is the adaptive finite element simulation of problems arising in flow and transport applications on parallel computers. Of particular interest are new contributions to adaptive mesh refinement (AMR) in this parallel high-performance context, including novel work on data structures, treatment of constraints in a parallel setting, generality and extensibility via object-oriented programming, and the design/implementation of a flexible software framework. This technology and software capability then enables more robust, reliable treatment of multiscale--multiphysics problems and specific studies of fine scale interaction such as those in biological chemotaxis (Chapter 4) and high-speed shock physics for compressible flows (Chapter 5). The work begins by presenting an overview of key concepts and data structures employed in AMR simulations. Of particular interest is how these concepts are applied in the physics-independent software framework which is developed here and is the basis for all the numerical simulations performed in this work. This open-source software framework has been adopted by a number of researchers in the U.S. and abroad for use in a wide range of applications. The dynamic nature of adaptive simulations poses particular issues for efficient implementation on distributed-memory parallel architectures. Communication cost, computational load balance, and memory requirements must all be considered when developing adaptive software for this class of machines. Specific extensions to the adaptive data structures to enable implementation on parallel computers are therefore considered in detail. The libMesh framework for performing adaptive finite element simulations on parallel computers is developed to provide a concrete implementation of the above ideas.
This physics-independent framework is applied to two distinct flow and transport application classes in the subsequent application studies to illustrate the flexibility of the design and to demonstrate the capability for resolving complex multiscale processes efficiently and reliably. The first application considered is the simulation of chemotactic biological systems such as colonies of Escherichia coli. This work appears to be the first application of AMR to chemotactic processes. These systems exhibit transient, highly localized features and are important in many biological processes, which make them ideal for simulation with adaptive techniques. A nonlinear reaction-diffusion model for such systems is described and a finite element formulation is developed. The solution methodology is described in detail. Several phenomenological studies, which use the parallel adaptive refinement capability developed in this work, are conducted to study chemotactic processes and the resulting biological patterns. The other application study is much more extensive and deals with fine scale interactions for important hypersonic flows arising in aerospace applications. These flows are characterized by highly nonlinear, convection-dominated flowfields with very localized features such as shock waves and boundary layers. These localized features are well-suited to simulation with adaptive techniques. A novel treatment of the inviscid flux terms arising in a streamline-upwind Petrov-Galerkin finite element formulation of the compressible Navier-Stokes equations is also presented and is found to be superior to the traditional approach. The parallel adaptive finite element formulation is then applied to several complex flow studies, culminating in fully three-dimensional viscous flows about complex geometries such as the Space Shuttle Orbiter.
Physical phenomena such as viscous/inviscid interaction, shock wave/boundary layer interaction, shock/shock interaction, and unsteady acoustic-driven flowfield response are considered in detail. A computational investigation of a 25°/55° double cone configuration details the complex multiscale flow features and investigates a potential source of experimentally-observed unsteady flowfield response.
Simulation Programming with Python
Nelson, Barry L.
Chapter 4 Simulation Programming with Python This chapter shows how simulations of some of the examples in Chap. 3 can be programmed using Python and the SimPy simulation library[1]. The goals-based discrete-event simulation library for Python. It is open source and released under the M license. Sim
Lin, Feng
Human Immunodeficiency Virus (HIV)/Acquired Immune Deficiency Syndrome (AIDS) treatment guidelines 2007 A Self-Learning Fuzzy Discrete Event System for HIV/AIDS Treatment Regimen Selection Hao Ying developed a self-learning HIV/AIDS regimen selection system for the initial round of combination
Parallel Adaptive Solvers in Compressible PETSc-FUN3D Simulations
S. Bhowmick; D. Kaushik; L. McInnes; B. Norris; P. Raghavan
We consider parallel, three-dimensional transonic Euler flow using the PETSc-FUN3D application, which employs pseudo-transient Newton-Krylov methods. Solving a large, sparse linear system at each nonlinear iteration dominates the overall simulation time for this fully implicit strategy. This paper presents a polyalgorithmic technique for adaptively selecting the linear solver method to match the numeric properties of the linear systems as they evolve during the course of
Construction of a parallel processor for simulating manipulators and other mechanical systems
NASA Technical Reports Server (NTRS)
Hannauer, George
1991-01-01
This report summarizes the results of NASA Contract NAS5-30905, awarded under phase 2 of the SBIR Program, for a demonstration of the feasibility of a new high-speed parallel simulation processor, called the Real-Time Accelerator (RTA). The principal goals were met, and EAI is now proceeding with phase 3: development of a commercial product. This product is scheduled for commercial introduction in the second quarter of 1992.
Parallel Simulation of Stress Evolution of Wenchuan 8.0 Earthquake
H. Zhang; Y. Shi; M. Liu; Z. Wu; G. Zhu; D. Yuen
2008-01-01
We used parallel finite element modeling to simulate time-dependent Coulomb stress migration following the 12 May 2008 Wenchuan earthquake (Mw 8.0) in Sichuan, China. The model domain is 650x650x100 km (E100-106, N27-33); we used more than 2.6 million unstructured meshes to represent realistic fault systems in the region and three-dimensional variations of lithospheric structure and rheology. We calculated day-by-day
Reducing I/O Complexity by Simulating Coarse Grained Parallel Algorithms
Dehne, Frank
/rounds, local memory size , computation time +L, communication time g +L and mes- sage size N v2 canL, communication time v pg + v pL, and I/O time v pG O DB + v pL for v p, M = , N = vDB and B = ON v2 . GisReducing I/O Complexity by Simulating Coarse Grained Parallel Algorithms Frank Dehne, David
NASA Technical Reports Server (NTRS)
Campbell, David; Wysong, Ingrid; Kaplan, Carolyn; Mott, David; Wadsworth, Dean; VanGilder, Douglas
2000-01-01
An AFRL/NRL team has recently been selected to develop a scalable, parallel, reacting, multidimensional (SUPREM) Direct Simulation Monte Carlo (DSMC) code for the DoD user community under the High Performance Computing Modernization Office (HPCMO) Common High Performance Computing Software Support Initiative (CHSSI). This paper will introduce the JANNAF Exhaust Plume community to this three-year development effort and present the overall goals, schedule, and current status of this new code.
A Component-Based Architecture for Parallel Multi-physics PDE Simulation
Steven G. Parker
2002-01-01
We describe the Uintah Computational Framework (UCF), a set of software components and libraries that facilitate the simulation of Partial Differential Equations (PDEs) on Structured AMR (SAMR) grids using hundreds to thousands of processors. The UCF uses a nontraditional approach to achieving parallelism, employing an abstract taskgraph representation to describe computation and communication. This representation has a number of advantages
NASA Astrophysics Data System (ADS)
Lee, Nicholas Jabari Ouma
Parallel molecular dynamics (MD) simulations are performed to investigate pressure-induced solid-to-solid structural phase transformations in cadmium selenide (CdSe) nanorods. The effects of the size and shape of nanorods on different aspects of structural phase transformations are studied. Simulations are based on interatomic potentials validated extensively by experiments. Simulations range from 10^5 to 10^6 atoms. These simulations are enabled by highly scalable algorithms executed on massively parallel Beowulf computing architectures. Pressure-induced structural transformations are studied using a hydrostatic pressure medium simulated by atoms interacting via Lennard-Jones potential. Four single-crystal CdSe nanorods, each 44 Å in diameter but varying in length, in the range between 44 Å and 600 Å, are studied independently in two sets of simulations. The first simulation is the downstroke simulation, where each rod is embedded in the pressure medium and subjected to increasing pressure during which it undergoes a forward transformation from a 4-fold coordinated wurtzite (WZ) crystal structure to a 6-fold coordinated rocksalt (RS) crystal structure. In the second so-called upstroke simulation, the pressure on the rods is decreased and a reverse transformation from 6-fold RS to a 4-fold coordinated phase is observed. The transformation pressure in the forward transformation depends on the nanorod size, with longer rods transforming at lower pressures close to the bulk transformation pressure. Spatially-resolved structural analyses, including pair-distributions, atomic-coordinations and bond-angle distributions, indicate nucleation begins at the surface of nanorods and spreads inward. The transformation results in a single RS domain, in agreement with experiments. The microscopic mechanism for transformation is observed to be the same as for bulk CdSe.
A nanorod size dependency is also found in reverse structural transformations, with longer nanorods transforming more readily than smaller ones. Nucleation initiates at the center of the rod and grows outward.
Spontaneous Hot Flow Anomalies at Quasi-Parallel Shocks: 2. Hybrid Simulations
NASA Technical Reports Server (NTRS)
Omidi, N.; Zhang, H.; Sibeck, D.; Turner, D.
2013-01-01
Motivated by recent THEMIS observations, this paper uses 2.5-D electromagnetic hybrid simulations to investigate the formation of Spontaneous Hot Flow Anomalies (SHFA) upstream of quasi-parallel bow shocks during steady solar wind conditions and in the absence of discontinuities. The results show the formation of a large number of structures along and upstream of the quasi-parallel bow shock. Their outer edges exhibit density and magnetic field enhancements, while their cores exhibit drops in density, magnetic field, solar wind velocity and enhancements in ion temperature. Using virtual spacecraft in the simulation, we show that the signatures of these structures in the time series data are very similar to those of SHFAs seen in THEMIS data and conclude that they correspond to SHFAs. Examination of the simulation data shows that SHFAs form as the result of foreshock cavitons interacting with the bow shock. Foreshock cavitons in turn form due to the nonlinear evolution of ULF waves generated by the interaction of the solar wind with the backstreaming ions. Because foreshock cavitons are an inherent part of the shock dissipation process, the formation of SHFAs is also an inherent part of the dissipation process leading to a highly non-uniform plasma in the quasi-parallel magnetosheath including large scale density and magnetic field cavities.
Simulation study of a parallel processor with unbalanced loads. Master's thesis
Moore, T.S.
1987-12-01
The purpose of this thesis was twofold: first, to estimate the impact of unbalanced computational loads on a parallel-processing architecture via Monte Carlo simulation; and second, to investigate the impact of representing the dynamics of the parallel-processing problem via animated simulation. It is constrained to the hypercube architecture in which each node is connected in a predetermined topology and allowed to communicate with other nodes through calls to the operating system. Routing of messages through the network is fixed and specified within the operating system. Message transmission preempts nodal processing, causing internodal communications to complicate the concurrent operation of the network. Two independent variables are defined: 1) the degree of imbalance characterizes the nature or severity of the load imbalance, and 2) the degree of locality characterizes the node loadings with respect to node locations across the cube. A SLAM II simulation model of a generic 16-node hypercube was constructed in which each node processes a predetermined number of computational tasks and, following each task, sends a message to a single randomly chosen receiver node. An experiment was designed in which the independent variables, degree of imbalance and degree of locality, were varied across two computation-to-IO ratios to determine their separate and interactive effects on the dependent variable, job speedup. ANOVA and regression techniques were used to estimate the relationship of load imbalance, locality, computation-to-IO ratio, and their interactions to job speedup. Results show that load imbalance severely impacts a parallel processor's performance.
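The effect of load imbalance on speedup can be illustrated with a deliberately simplified sketch. It is not the thesis's SLAM II model: it assumes equal-cost tasks and ignores message traffic entirely, and the function name and task counts are hypothetical.

```python
def ideal_speedup(task_counts):
    """Speedup over one node when each node runs its tasks back to back.

    Assumes equal-cost tasks and ignores all communication, so the
    most heavily loaded node alone determines the parallel run time.
    """
    if not task_counts or max(task_counts) <= 0:
        raise ValueError("need at least one node with work")
    return sum(task_counts) / max(task_counts)


balanced = [100] * 16           # 16 nodes, 1600 tasks spread evenly
skewed = [400] + [80] * 15      # same 1600 tasks, one overloaded node

print(ideal_speedup(balanced))  # 16.0
print(ideal_speedup(skewed))    # 4.0
```

Even without communication costs, one overloaded node caps the speedup of this hypothetical 16-node job at 4; the abstract's model adds message preemption on top of that.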
NASA Astrophysics Data System (ADS)
Qiang, Ji; Li, Xiaoye
2010-12-01
Particle-in-cell (PIC) simulation is widely used in many branches of physics and engineering. In this paper, we give an analysis of the particle-field decomposition method and the domain decomposition method in parallel particle-in-cell beam dynamics simulation. The parallel performance of the two decomposition methods was studied on the Cray XT4 and the IBM Blue Gene/P Computers. The domain decomposition method shows better scalability but is slower than the particle-field decomposition in most cases (up to a few thousand processors) for macroparticle dominant applications. The particle-field decomposition method also shows less memory usage than the domain decomposition method due to its use of perfect static load balance. For applications with a smaller ratio of macroparticles to grid points, the domain decomposition method exhibits better scalability and faster speed. Application of the particle-field decomposition scheme to high-resolution macroparticle-dominant parallel beam dynamics simulation for a future light source linear accelerator is presented as an example.
NASA Technical Reports Server (NTRS)
Morgan, Philip E.
2004-01-01
This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers" and "Develop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications." The discussion of Scalable High Performance Computing reports on three objectives: validate, assess scalability, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulations (DNS) and Large Eddy Simulation (LES) problems; and investigate and develop a high-order Reynolds-averaged Navier-Stokes turbulence model. The discussion of High-Performance Time-Domain Computational Electromagnetics reports on five objectives: enhancement of an electromagnetics code (CHARGE) to be able to effectively model antenna problems; utilize lessons learned in high-order/spectral solution of swirling 3D jets to apply to solving electromagnetics project; transition a high-order fluids code, FDL3DI, to be able to solve Maxwell's Equations using compact-differencing; develop and demonstrate improved radiation absorbing boundary conditions for high-order CEM; and extend high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.
Holkundkar, Amol R. [Department of Physics, Birla Institute of Technology and Science, Pilani-333 031 (India)]
2013-11-15
The objective of this article is to report the parallel implementation of a 3D molecular dynamics simulation code for laser-cluster interactions. The code has been benchmarked by comparing the simulation results with some of the experiments reported in the literature. Scaling laws for the computational time are established by varying the number of processor cores and the number of macroparticles used. The capabilities of the code are highlighted by implementing various diagnostic tools. To study the dynamics of the laser-cluster interactions, the executable version of the code is available from the author.
NASA Astrophysics Data System (ADS)
Honkonen, I.
2015-03-01
I present a method for developing extensible and modular computational models without sacrificing serial or parallel performance or source code readability. By using a generic simulation cell method I show that it is possible to combine several distinct computational models to run in the same computational grid without requiring modification of existing code. This is an advantage for the development and testing of, e.g., geoscientific software as each submodel can be developed and tested independently and subsequently used without modification in a more complex coupled program. An implementation of the generic simulation cell method presented here, generic simulation cell class (gensimcell), also includes support for parallel programming by allowing model developers to select which simulation variables of, e.g., a domain-decomposed model to transfer between processes via a Message Passing Interface (MPI) library. This allows the communication strategy of a program to be formalized by explicitly stating which variables must be transferred between processes for the correct functionality of each submodel and the entire program. The generic simulation cell class requires a C++ compiler that supports a version of the language standardized in 2011 (C++11). The code is available at https://github.com/nasailja/gensimcell for everyone to use, study, modify and redistribute; those who do are kindly requested to acknowledge and cite this work.
Parallel and Distributed Simulation from Many Cores to the Public Cloud (Extended Version)
D'Angelo, Gabriele
2011-01-01
In this tutorial paper, we first review some basic simulation concepts and then introduce parallel and distributed simulation techniques in view of some new challenges of today and tomorrow. In particular, in recent years there has been a wide diffusion of many-core architectures, and we can expect this trend to continue. On the other hand, the success of cloud computing is strongly promoting the "everything as a service" paradigm. Is parallel and distributed simulation ready for these new challenges? The current approaches present many limitations in terms of usability and adaptivity: there is a strong need for new evaluation metrics and for revising the currently implemented mechanisms. In the last part of the paper, we propose a new approach based on multi-agent systems for the simulation of complex systems. It is possible to implement advanced techniques such as the migration of simulated entities in order to build mechanisms that are both adaptive and very easy to use. Adaptive mechanisms...
Large-eddy simulations of compressible convection on massively parallel computers. [stellar physics
NASA Technical Reports Server (NTRS)
Xie, Xin; Toomre, Juri
1993-01-01
We report preliminary implementation of the large-eddy simulation (LES) technique in 2D simulations of compressible convection carried out on the CM-2 massively parallel computer. The convective flow fields in our simulations possess structures similar to those found in a number of direct simulations, with roll-like flows coherent across the entire depth of the layer that spans several density scale heights. Our detailed assessment of the effects of various subgrid scale (SGS) terms reveals that they may affect the gross character of convection. Yet, somewhat surprisingly, we find that our LES solutions, and another in which the SGS terms are turned off, only show modest differences. The resulting 2D flows realized here are rather laminar in character, and achieving substantial turbulence may require stronger forcing and less dissipation.
Field-Scale, Massively Parallel Simulation of Production from Oceanic Gas Hydrate Deposits
NASA Astrophysics Data System (ADS)
Reagan, M. T.; Moridis, G. J.; Freeman, C. M.; Pan, L.; Boyle, K. L.; Johnson, J. N.; Husebo, J. A.
2012-12-01
The quantity of hydrocarbon gases trapped in natural hydrate accumulations is enormous, leading to significant interest in the evaluation of their potential as an energy source. It has been shown that large volumes of gas can be readily produced at high rates for long times from some types of methane hydrate accumulations by means of depressurization-induced dissociation, and using conventional technologies with horizontal or vertical well configurations. However, these systems are currently assessed using simplified or reduced-scale 3D or even 2D production simulations. In this study, we use the massively parallel TOUGH+HYDRATE code (pT+H) to assess the production potential of a large, deep-ocean hydrate reservoir and develop strategies for effective production. The simulations model a full 3D system of over 24 km² extent, examine the productivity of vertical and horizontal wells, single or multiple wells, and explore variations in reservoir properties. Systems of up to 2.5M gridblocks, running on thousands of supercomputing nodes, are required to simulate such large systems at the highest level of detail. The simulations reveal the challenges inherent in producing from deep, relatively cold systems with extensive water-bearing channels and connectivity to large aquifers, including the difficulty of achieving depressurization, the challenges of high water removal rates, and the complexity of production design. Also highlighted are new frontiers in large-scale reservoir simulation of coupled flow, transport, thermodynamics, and phase behavior, including the construction of large meshes, the use of parallel numerical solvers and MPI, and large-scale, parallel 3D visualization of results.
Wang, Junye; Zhang, Xiaoxian; Bengough, Anthony G; Crawford, John W
2005-07-01
The lattice Boltzmann method has proven to be a promising method to simulate flow in porous media. Its practical application often relies on parallel computation because of the demand for a large domain and fine grid resolution to adequately resolve pore heterogeneity. The existing domain-decomposition methods for parallel computation usually decompose a domain into a number of subdomains first and then recover the interfaces and perform the load balance. Normally, the interface recovery and the load balance have to be performed iteratively until an acceptable load balance is achieved; this costs time. In this paper we propose a cell-based domain-decomposition method for parallel lattice Boltzmann simulation of flow in porous media. Unlike the existing methods, the cell-based method performs the load balance first, dividing the fluid cells into a number of groups (or subdomains) whose sizes differ by either 0 or 1, depending on whether the total number of fluid cells is a multiple of the number of processors; the interfaces between the subdomains are recovered last. Because the cell-based method recovers the interfaces rather than the load balance, it does not need iteration and gives an exact load balance. The performance of the proposed method is analyzed and compared using different computer systems; the results indicate that it reaches the theoretical parallel efficiency. The method is then applied to simulate flow in a three-dimensional porous medium obtained with microfocus x-ray computed tomography to calculate the permeability, and the result shows good agreement with the experimental data. PMID:16090133
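The load-balance-first step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name, the grid, and the row-major cell ordering are assumptions, and the subsequent interface recovery is omitted.

```python
def split_fluid_cells(fluid_cells, n_procs):
    """Assign fluid cells to n_procs groups whose sizes differ by at most 1."""
    if n_procs < 1:
        raise ValueError("n_procs must be positive")
    base, extra = divmod(len(fluid_cells), n_procs)
    groups, start = [], 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)  # first `extra` ranks get one more
        groups.append(fluid_cells[start:start + size])
        start += size
    return groups


# Hypothetical 4x4 slice: 1 marks a fluid (pore) cell, 0 marks solid.
grid = [
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [1, 0, 0, 1],
]
fluid = [(i, j) for i, row in enumerate(grid) for j, v in enumerate(row) if v == 1]
groups = split_fluid_cells(fluid, 3)
print([len(g) for g in groups])  # [4, 3, 3]
```

Only fluid cells are counted, so solid cells never skew the balance; with 10 fluid cells on 3 processors the group sizes differ by exactly 1, as the abstract states for totals that are not a multiple of the processor count.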
Parallel Verlet Neighbor List Algorithm for GPU-Optimized MD Simulations
NASA Astrophysics Data System (ADS)
Cho, Samuel
2013-03-01
How biomolecules fold and assemble into well-defined structures that correspond to cellular functions is a fundamental problem in biophysics. Molecular dynamics (MD) simulations provide a molecular-resolution physical description of the folding and assembly processes, but the computational demands of the algorithms restrict the size and the timescales one can simulate. In a recent study, we introduced a parallel neighbor list algorithm that was specifically optimized for MD simulations on GPUs. We now analyze the performance of our MD simulation code that incorporates the algorithm, and we observe that the force calculations and the evaluation of the neighbor list and pair lists constitute a majority of the overall execution time. The overall speedup of the GPU-optimized MD simulations as compared to the CPU-optimized version is N-dependent and ~30x for the full 70S ribosome (10,219 beads). The pair and neighbor list evaluations have performance speedups of ~25x and ~55x, respectively. We then directly compare the performance of our MD simulation code with that of the SOP model implemented in the simulation code of HOOMD, a leading general particle dynamics simulation package that is specifically optimized for GPUs.
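For readers unfamiliar with the neighbor list and pair list mentioned above, the following is a serial, textbook-style sketch of the two-level idea. It is not the GPU algorithm of the abstract; the function names, the skin distance, and the bead coordinates are assumptions chosen for illustration.

```python
import math


def build_neighbor_list(positions, cutoff, skin):
    """Return, for each particle, the indices within cutoff + skin."""
    reach = cutoff + skin
    neighbors = [[] for _ in positions]
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) < reach:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors


def pairs_within_cutoff(positions, neighbors, cutoff):
    """Filter the neighbor list down to the pairs that actually interact."""
    return [
        (i, j)
        for i, near in enumerate(neighbors)
        for j in near
        if i < j and math.dist(positions[i], positions[j]) < cutoff
    ]


beads = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (9.0, 0.0, 0.0)]
nlist = build_neighbor_list(beads, cutoff=2.0, skin=1.0)
print(nlist)                                   # [[1, 2], [0, 2], [0, 1], []]
print(pairs_within_cutoff(beads, nlist, 2.0))  # [(0, 1), (1, 2)]
```

The expensive all-pairs search runs only when the neighbor list is rebuilt; between rebuilds, each step scans just the short per-particle lists to form the pair list used in the force calculation.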
Taufer, Michela
such as catalysis, crystal growth, surface diffusion, phase transitions on single crystals, and cell membrane, and cell membrane receptor dynamics. In parallel CGMC, the tau-leap method is used for parallel simulations phenomena such as catalysis, crystal growth, surface diffusion, phase transitions on single crystals
Massively parallel Monte Carlo for many-particle simulations on GPUs
Joshua A. Anderson; Eric Jankowski; Thomas L. Grubb; Michael Engel; Sharon C. Glotzer
2013-08-23
Current trends in parallel processors call for the design of efficient massively parallel algorithms for scientific computing. Parallel algorithms for Monte Carlo simulations of thermodynamic ensembles of particles have received little attention because of the inherent serial nature of the statistical sampling. In this paper, we present a massively parallel method that obeys detailed balance and implement it for a system of hard disks on the GPU. We reproduce results of serial high-precision Monte Carlo runs to verify the method. This is a good test case because the hard disk equation of state over the range where the liquid transforms into the solid is particularly sensitive to small deviations away from the balance conditions. On a Tesla K20, our GPU implementation executes over one billion trial moves per second, which is 148 times faster than on a single Intel Xeon E5540 CPU core, enables 27 times better performance per dollar, and cuts energy usage by a factor of 13. With this improved performance we are able to calculate the equation of state for systems of up to one million hard disks. These large system sizes are required in order to probe the nature of the melting transition, which has been debated for the last forty years. In this paper we present the details of our computational method, and discuss the thermodynamics of hard disks separately in a companion paper.
A New Simulation Technique for Study of Collisionless Shocks: Self-Adaptive Simulations
Karimabadi, H.; Omelchenko, Y.; Driscoll, J.; Krauss-Varban, D. [SciberQuest, Inc., Solana Beach, CA, 92075 (United States); Fujimoto, R.; Perumalla, K. [Georgia Institute of Technology, Atlanta, GA, 30332 (United States)
2005-08-01
The traditional technique for simulating physical systems modeled by partial differential equations is by means of time-stepping methodology where the state of the system is updated at regular discrete time intervals. This method has inherent inefficiencies. In contrast to this methodology, we have developed a new asynchronous type of simulation based on a discrete-event-driven (as opposed to time-driven) approach, where the simulation state is updated on a 'need-to-be-done-only' basis. Here we report on this new technique, show an example of particle acceleration in a fast magnetosonic shockwave, and briefly discuss additional issues that we are addressing concerning algorithm development and parallel execution.
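The contrast between time-driven and event-driven updating can be sketched with a priority queue. This is a generic illustration, not the authors' self-adaptive scheme: the two "cells", their update periods, and all names are hypothetical.

```python
import heapq


def run_event_driven(initial_events, handler, t_end):
    """Process (time, name) events in time order until t_end.

    handler(time, name) returns an iterable of new (time, name) events.
    """
    queue = list(initial_events)
    heapq.heapify(queue)
    log = []
    while queue and queue[0][0] <= t_end:
        time, name = heapq.heappop(queue)
        log.append((time, name))
        for event in handler(time, name):
            heapq.heappush(queue, event)
    return log


# Hypothetical model: a fast cell needs an update every 1.0, a slow one every 4.0.
PERIOD = {"fast": 1.0, "slow": 4.0}


def reschedule(time, name):
    return [(time + PERIOD[name], name)]


log = run_event_driven([(0.0, "fast"), (0.0, "slow")], reschedule, t_end=4.0)
print(len(log))  # 7 updates, versus 10 if both cells stepped every 1.0
```

A time-stepped loop would have to advance both cells at the smallest interval; here each cell is touched only when its own next update is due.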
NASA Astrophysics Data System (ADS)
Mosaddeghi, Hamid; Alavi, Saman; Kowsari, M. H.; Najafi, Bijan
2012-11-01
We use molecular dynamics simulations to study the structure, dynamics, and transport properties of nano-confined water between parallel graphite plates with separation distances (H) from 7 to 20 Å at different water densities with an emphasis on anisotropies generated by confinement. The behavior of the confined water phase is compared to non-confined bulk water under similar pressure and temperature conditions. Our simulations show anisotropic structure and dynamics of the confined water phase in directions parallel and perpendicular to the graphite plate. The magnitude of these anisotropies depends on the slit width H. Confined water shows "solid-like" structure and slow dynamics for the water layers near the plates. The mean square displacements (MSDs) and velocity autocorrelation functions (VACFs) for directions parallel and perpendicular to the graphite plates are calculated. By increasing the confinement distance from H = 7 Å to H = 20 Å, the MSD increases and the behavior of the VACF indicates that the confined water changes from solid-like to liquid-like dynamics. If the initial density of the water phase is set up using geometric criteria (i.e., distance between the graphite plates), large pressures (on the order of ~10 katm) and large pressure anisotropies are established within the water. By decreasing the density of the water between the confined plates to about 0.9 g cm^-3, bubble formation and restructuring of the water layers are observed.
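A direction-resolved MSD of the kind described above could be computed along the following lines. This is a sketch under stated assumptions: the plates are taken to lie in the x-y plane, positions are unwrapped, and the function name and toy trajectory are hypothetical rather than taken from the paper.

```python
def directional_msd(trajectory, lag):
    """Mean square displacement at a given lag, split by direction.

    trajectory[t][k] is the (x, y, z) position of molecule k at frame t;
    the plates are assumed to lie in the x-y plane, so z is perpendicular.
    Returns (msd_parallel, msd_perpendicular).
    """
    n_origins = len(trajectory) - lag
    if lag < 1 or n_origins < 1:
        raise ValueError("lag must be between 1 and len(trajectory) - 1")
    par = perp = 0.0
    count = 0
    for t0 in range(n_origins):
        for (x0, y0, z0), (x1, y1, z1) in zip(trajectory[t0], trajectory[t0 + lag]):
            par += (x1 - x0) ** 2 + (y1 - y0) ** 2
            perp += (z1 - z0) ** 2
            count += 1
    return par / count, perp / count


# Hypothetical one-molecule trajectory: steady drift along x, pinned in z.
frames = [[(0.5 * t, 0.0, 3.0)] for t in range(5)]
print(directional_msd(frames, lag=2))  # (1.0, 0.0)
```

Averaging over all time origins and molecules, a perpendicular MSD that stays far below the parallel one would be the signature of the confinement-induced anisotropy the abstract reports.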
Trinitis, C; Schulz, M
2006-06-29
In today's world, the use of parallel programming and architectures is essential for simulating practical problems in engineering and related disciplines. Remarkable progress in CPU architecture, system scalability, and interconnect technology continues to provide new opportunities, as well as new challenges for both system architects and software developers. These trends are paralleled by progress in parallel algorithms, simulation techniques, and software integration from multiple disciplines. ParSim brings together researchers from both application disciplines and computer science and aims at fostering closer cooperation between these fields. Since its successful introduction in 2002, ParSim has established itself as an integral part of the EuroPVM/MPI conference series. In contrast to traditional conferences, emphasis is put on the presentation of up-to-date results with a short turn-around time. This offers a unique opportunity to present new aspects in this dynamic field and discuss them with a wide, interdisciplinary audience. The EuroPVM/MPI conference series, as one of the prime events in parallel computation, serves as an ideal surrounding for ParSim. This combination enables the participants to present and discuss their work within the scope of both the session and the host conference. This year, eleven papers from authors in nine countries were submitted to ParSim, and we selected five of them. They cover a wide range of different application fields including gas flow simulations, thermo-mechanical processes in nuclear waste storage, and cosmological simulations. At the same time, the selected contributions also address the computer science side of their codes and discuss different parallelization strategies, programming models and languages, as well as the use of nonblocking collective operations in MPI.
We are confident that this provides an attractive program and that ParSim will be an informal setting for lively discussions and for fostering new collaborations. We hope this session will fulfill its purpose to provide new insights from both the engineering and the computer science side and encourage interdisciplinary exchange of ideas and cooperation. We hope that this will continue ParSim's tradition at EuroPVM/MPI.
NASA Astrophysics Data System (ADS)
Dong, Haibo
Due to the progress in computer technology in recent years, distributed memory parallel computer systems are rapidly gaining importance in direct numerical simulation (DNS) of the stability and transition of compressible boundary layers. In most works, explicit methods have mainly been used in such simulations to advance the compressible Navier-Stokes equations in time. However, the small wall-normal grid sizes for viscous flow simulations impose a severe stability restriction on the allowable time steps in simulations using explicit methods. This calls for implicit treatment in the numerical methods. Although fully implicit methods are often used in steady-flow calculations to remove the stability restriction on time steps, they are seldom used in transient flow simulations because the time steps used in time-accurate calculations are often not large enough to offset the high computational cost of using fully implicit methods. In this thesis, we present an efficient high-order semi-implicit method, which only treats the stiff terms implicitly, for the DNS study of the hypersonic boundary-layer receptivity to freestream disturbances over blunt bodies. It is shown that the semi-implicit method can meet the requirements for both computational efficiency and numerical accuracy in the DNS studies. However, our semi-implicit method, developed for a single computer, cannot be used directly on parallel computers to solve the unsteady Navier-Stokes equations for the direct numerical simulation of supersonic and hypersonic boundary layer flows. The semi-implicit algorithm has to be modified to handle the communication among processors in solving the global block linear systems. In this thesis, a divide and conquer (DAC) method is used to solve in parallel the block linear system from the semi-implicit method. A parallel Fourier collocation method is also implemented in the periodic spanwise direction.
It is shown that by implementing the new parallel semi-implicit scheme, simulations of compressible transient flow can benefit greatly from parallel computer systems by increasing both simulation size and speed while maintaining high temporal accuracy. To apply our new numerical methods to numerical studies of compressible boundary layer stability and transition, numerical simulations of the receptivity process of hypersonic boundary layer flows over 3-D blunt leading edges are chosen for investigation because the receptivity phenomena are much more complex and currently not well understood. In this thesis, parametric simulations of receptivity to freestream disturbances, which include fast acoustic waves, vorticity waves and entropy waves, for Mach 15 flow over 3-D blunt leading edges have been carried out using our new methods. The results show that initial transient growth generated and developed inside the hypersonic boundary layer near the leading edge can be observed in the receptivity to freestream standing vorticity or entropy waves, but not acoustic waves or traveling waves. It has been shown that this initial transient growth near the leading edge can possibly be explained by the transient growth theory. Additionally, cooling the surface will increase the growth. Adding inhomogeneous boundary conditions or random roughness on the surface can strongly increase the magnitude of growth.
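The idea of treating only the stiff terms implicitly can be shown on a scalar model problem. This is a first-order toy sketch, not the thesis's high-order scheme: the equation du/dt = -k*u + source(u), the value of k, and all function names are assumptions chosen for illustration.

```python
def explicit_step(u, dt, k, source):
    """Forward Euler on du/dt = -k*u + source(u): everything explicit."""
    return u + dt * (-k * u + source(u))


def semi_implicit_step(u, dt, k, source):
    """Stiff linear term -k*u taken implicitly, source(u) kept explicit.

    Solves u_new = u + dt * (-k * u_new + source(u)) for u_new.
    """
    return (u + dt * source(u)) / (1.0 + dt * k)


def integrate(step, u0, dt, n_steps, k, source):
    u = u0
    for _ in range(n_steps):
        u = step(u, dt, k, source)
    return u


def constant_forcing(u):
    return 1.0                # mild non-stiff forcing (illustrative)


k = 1000.0                    # stiff decay rate (illustrative value)
dt = 0.01                     # dt * k = 10, beyond forward Euler's limit of 2

u_explicit = integrate(explicit_step, 1.0, dt, 50, k, constant_forcing)
u_semi = integrate(semi_implicit_step, 1.0, dt, 50, k, constant_forcing)
print(abs(u_explicit) > 1e6)           # True: the explicit update blows up
print(abs(u_semi - 1.0 / k) < 1e-9)    # True: settles at the steady state 1/k
```

In the scalar case the implicit solve is a single division; for the Navier-Stokes systems in the abstract it becomes a block linear system, which is the part that has to be distributed across processors.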
Bradley, Randolph L. (Randolph Lewis)
2012-01-01
Heavy industries operate equipment having a long life to generate revenue or perform a mission. These industries must invest in the specialized service parts needed to maintain their equipment, because unlike in other ...
Changbum Ahn; Wenjia Pan; SangHyun Lee; Feniosky A. Peña-Mora
2010-01-01
Construction operations have a tremendous impact upon both the environment and public health due to the generation of significant amounts of airborne emissions, including greenhouse gases and other traditional criteria air pollutants. Quantifying emissions in the pre-planning phase of construction operations is the first step in identifying mitigation opportunities. The authors therefore have quantified construction emissions produced by various types
Implementation of unsteady sampling procedures for the parallel direct simulation Monte Carlo method
NASA Astrophysics Data System (ADS)
Cave, H. M.; Tseng, K.-C.; Wu, J.-S.; Jermy, M. C.; Huang, J.-C.; Krumdieck, S. P.
2008-06-01
An unsteady sampling routine for a general parallel direct simulation Monte Carlo method called PDSC is introduced, allowing the simulation of time-dependent flow problems in the near continuum range. A post-processing procedure called DSMC rapid ensemble averaging method (DREAM) is developed to reduce the statistical scatter in the results while minimising both memory and simulation time. This method builds an ensemble average of repeated runs over a small number of sampling intervals prior to the sampling point of interest by restarting the flow using either a Maxwellian distribution based on macroscopic properties for near equilibrium flows (DREAM-I) or output instantaneous particle data obtained by the original unsteady sampling of PDSC for strongly non-equilibrium flows (DREAM-II). The method is validated by simulating shock tube flow and the development of simple Couette flow. Unsteady PDSC is found to accurately predict the flow field in both cases with significantly reduced run-times over single processor code, and DREAM greatly reduces the statistical scatter in the results while maintaining accurate particle velocity distributions. Simulations are then conducted of two applications involving the interaction of shocks over wedges. The results of these simulations are compared to experimental data and simulations from the literature where these are available. In general, it was found that 10 ensembled runs of DREAM processing could reduce the statistical uncertainty in the raw PDSC data by 2.5-3.3 times, based on the limited number of cases in the present study.
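The reported 2.5-3.3 times reduction from 10 ensembled runs is in line with the familiar 1/sqrt(N) behaviour of averaged independent samples, for which sqrt(10) is about 3.16. The sketch below checks that scaling numerically; it assumes independent unit-variance Gaussian noise and is not a model of DSMC or of DREAM itself, and the function name and seed are arbitrary.

```python
import random
import statistics


def scatter_of_mean(n_runs, n_trials, rng):
    """Standard deviation of the average over n_runs noisy samples."""
    means = [
        statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n_runs))
        for _ in range(n_trials)
    ]
    return statistics.stdev(means)


rng = random.Random(2008)
single = scatter_of_mean(1, 4000, rng)
ensembled = scatter_of_mean(10, 4000, rng)
print(round(single / ensembled, 2))  # close to sqrt(10), about 3.16; varies with seed
```

Correlated restarts would be expected to give a smaller reduction than this independent-sample limit.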
NASA Astrophysics Data System (ADS)
He, W.; Beyer, C.; Fleckenstein, J. H.; Jang, E.; Kolditz, O.; Naumov, D.; Kalbacher, T.
2015-03-01
This technical paper presents an efficient and performance-oriented method to model reactive mass transport processes in environmental and geotechnical subsurface systems. The open source scientific software packages OpenGeoSys and IPhreeqc have been coupled, to combine their individual strengths and features to simulate thermo-hydro-mechanical-chemical coupled processes in porous and fractured media with simultaneous consideration of aqueous geochemical reactions. Furthermore, a flexible parallelization scheme using MPI (Message Passing Interface) grouping techniques has been implemented, which allows an optimized allocation of computer resources for the node-wise calculation of chemical reactions on the one hand, and the underlying processes such as for groundwater flow or solute transport on the other hand. The coupling interface and parallelization scheme have been tested and verified in terms of precision and performance.
NASA Astrophysics Data System (ADS)
Laundy, D.; Sutter, J. P.; Wagner, U. H.; Rau, C.; Thomas, C. A.; Sawhney, K. J. S.; Chubar, O.
2013-03-01
Hard X-ray undulator radiation at 3rd generation storage rings falls between the geometrical and the fully coherent limit. This is a result of the small but finite emittance of the electron beam source and means that the radiation cannot be completely modelled by incoherent ray tracing or by fully coherent wave propagation. Using the wavefront propagation code Synchrotron Radiation Workshop (SRW) running in a Python environment, we have developed a parallel computer program that uses the Monte Carlo method to model the partially coherent emission from electron beam sources, taking into account the finite emittance of the source. Using a parallel computing cluster with in excess of 500 cores, each core calculating the wavefront from in excess of 1000 electrons, a source containing millions of electrons could be simulated. We have applied this method to the Diamond X-ray Imaging and Coherence beamline (I13).
A study of the parallel algorithm for large-scale DC simulation of nonlinear systems
NASA Astrophysics Data System (ADS)
Cortés Udave, Diego Ernesto; Ogrodzki, Jan; Gutiérrez de Anda, Miguel Angel
Newton-Raphson DC analysis of large-scale nonlinear circuits may be an extremely time-consuming process even if sparse matrix techniques and bypassing of nonlinear model calculations are used. A slight decrease in the time required for this task may be achieved on multi-core, multithreaded computers if the calculation of the mathematical models for the nonlinear elements and the stamp management of the sparse matrix entries are handled by concurrent processes. The numerical complexity can be further reduced via circuit decomposition and parallel solution of blocks, taking the BBD matrix structure as a point of departure. This block-parallel approach may give a considerable gain, though it is strongly dependent on the system topology and, of course, on the processor type. This contribution presents an easily parallelizable decomposition-based algorithm for DC simulation and provides a detailed study of its effectiveness.
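For readers unfamiliar with Newton-Raphson DC analysis, the following is a minimal sketch on a toy single-node circuit (a voltage source driving a diode through a resistor). It is not the authors' decomposition-based algorithm and involves no sparse matrices or parallelism; the circuit, the component values, and the function names are all assumptions chosen for illustration.

```python
import math

# Hypothetical component values: 5 V source, 1 kOhm resistor, Shockley diode.
VS, R, I_SAT, V_T = 5.0, 1e3, 1e-12, 0.025852


def kcl_residual(v):
    """Net current leaving the diode node (zero at the DC operating point)."""
    return (v - VS) / R + I_SAT * (math.exp(v / V_T) - 1.0)


def kcl_derivative(v):
    """Derivative of the residual, i.e. the 1x1 Jacobian of this circuit."""
    return 1.0 / R + (I_SAT / V_T) * math.exp(v / V_T)


def solve_dc(v0=0.6, tol=1e-12, max_iter=100):
    """Newton-Raphson iteration v <- v - f(v) / f'(v) from the initial guess v0."""
    v = v0
    for _ in range(max_iter):
        step = kcl_residual(v) / kcl_derivative(v)
        v -= step
        if abs(step) < tol:
            return v
    raise RuntimeError("Newton-Raphson did not converge")


v_diode = solve_dc()
print(f"diode voltage at the operating point: {v_diode:.4f} V")
```

In a large circuit the scalar division becomes a sparse linear solve at every iteration, which is the step that decomposition and block-parallel methods aim to accelerate.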
Parallel Vortex Method Simulation of Turbulent Flow in a Hydraulic Spool Valve
NASA Astrophysics Data System (ADS)
Dimas, Athanassios; Lottati, Isaac; Bernard, Peter
2000-11-01
Many mechanical systems employing fluid power use one or more spool-type hydraulic valves to control fluid flow. These valves experience flow forces which affect their axial motion and radial clearance. The unique software VORCAT (VORtex Computational Algorithm for Turbulence), which is based on a fast parallel vortex method, is employed to perform a time-accurate simulation of the flow in the spool valve of a pressure regulator from an automotive automatic transmission. The vortex method is based on a parallel implementation of the Fast Multipole Method, and includes a hybrid vortex filament/sheet representation of the flow field, a vortex filament creation model that imitates the physical vortex self-replication process, and a hairpin removal and reconnection mechanism to limit the number and scale of vortical structures to those that are dynamically essential. Velocity, vorticity and pressure distribution results are presented for spool valve openings of 1mm and 0.25mm.
Neural network simulations on massively parallel computers: Applications in chemical physics
Sumpter, B.G.; Getino, C.; Noid, D.W.; Guenther, R.E. (Dept. of Physics); Halloy, C. (Joint Inst. of Computational Sciences)
1993-01-01
A fully connected feedforward neural network is simulated on a number of parallel computers (MasPar-1, Connection Machine CM5, Intel iPSC-2 and iPSC-860) and the performance is compared to that obtained on sequential vector computers (Cray YMP, Cray C90, IBM-3090) and to a scalar workstation (IBM RISC-6000). Peak performances of up to 342 million connections per second (MCPS) could be obtained on the Cray C90 using a single processor while the optimum performance obtained on the parallel computers was 90 MCPS using 4096 processors. Efficiencies such as these have enabled neural network computations to be carried out for a number of chemical physics problems. Several examples are discussed: multi-dimensional function/surface fitting, coordinate transformations, and predictions of physical properties from chemical structure.
Neural network simulations on massively parallel computers: Applications in chemical physics
Sumpter, B.G.; Getino, C.; Noid, D.W.; Guenther, R.E.; Halloy, C.
1993-03-01
A fully connected feedforward neural network is simulated on a number of parallel computers (MasPar-1, Connection Machine CM5, Intel iPSC-2 and iPSC-860) and the performance is compared to that obtained on sequential vector computers (Cray YMP, Cray C90, IBM-3090) and to a scalar workstation (IBM RISC-6000). Peak performances of up to 342 million connections per second (MCPS) could be obtained on the Cray C90 using a single processor while the optimum performance obtained on the parallel computers was 90 MCPS using 4096 processors. Efficiencies such as these have enabled neural network computations to be carried out for a number of chemical physics problems. Several examples are discussed: multi-dimensional function/surface fitting, coordinate transformations, and predictions of physical properties from chemical structure.
Billion-atom synchronous parallel kinetic Monte Carlo simulations of critical 3D Ising systems
Martinez, E. [IMDEA-Materiales, Madrid 28040 (Spain); Monasterio, P.R. [Massachusetts Institute of Technology, Cambridge, MA 02139 (United States); Marian, J., E-mail: marian1@llnl.go [Lawrence Livermore National Laboratory, Livermore, CA 94551 (United States)
2011-02-20
An extension of the synchronous parallel kinetic Monte Carlo (spkMC) algorithm developed by Martinez et al. [J. Comp. Phys. 227 (2008) 3804] to discrete lattices is presented. The method solves the master equation synchronously by recourse to null events that keep all processors' time clocks current in a global sense. Boundary conflicts are resolved by adopting a chessboard decomposition into non-interacting sublattices. We find that the bias introduced by the spatial correlations attendant to the sublattice decomposition is within the standard deviation of serial calculations, which confirms the statistical validity of our algorithm. We have analyzed the parallel efficiency of spkMC and find that it scales consistently with problem size and sublattice partition. We apply the method to the calculation of scale-dependent critical exponents in billion-atom 3D Ising systems, with very good agreement with state-of-the-art multispin simulations.
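The role of null events can be illustrated with a generic sketch: each domain pads its total rate up to the largest total rate among all domains, so every domain advances by the same time increment. This is an assumed, simplified reading of a null-event scheme and not the authors' spkMC implementation; it omits the chessboard sublattice decomposition entirely, and the function name and rate values are hypothetical.

```python
import math
import random


def synchronous_step(domain_rates, rng):
    """One synchronous step over all domains.

    domain_rates: list of per-domain lists of event rates (hypothetical values).
    Returns the shared time increment and, per domain, the index of the chosen
    event, or None when the null event was selected.
    """
    totals = [sum(rates) for rates in domain_rates]
    r_max = max(totals)  # every domain is padded up to this total rate
    dt = -math.log(1.0 - rng.random()) / r_max  # one clock increment for all domains
    chosen = []
    for rates in domain_rates:
        x = rng.random() * r_max
        event, acc = None, 0.0
        for idx, rate in enumerate(rates):
            acc += rate
            if x < acc:
                event = idx
                break
        chosen.append(event)  # None means x fell in the null-event padding
    return dt, chosen


rng = random.Random(1)
rates = [[1.0, 2.0], [0.5], []]  # total rates 3.0, 0.5 and 0.0
history = [synchronous_step(rates, rng) for _ in range(2000)]
null_fraction = sum(1 for _, c in history if c[1] is None) / len(history)
print(f"null-event fraction in the slow domain: {null_fraction:.2f}")
```

The slow domain idles in most steps, which is the cost of keeping all clocks synchronized without rollbacks.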
Univ. of California, San Diego; Li, Xiaoye Sherry; Cicotti, Pietro; Li, Xiaoye Sherry; Baden, Scott B.
2008-04-15
Sparse parallel factorization is among the most complicated and irregular algorithms to analyze and optimize. Performance depends both on system characteristics such as the floating point rate, the memory hierarchy, and the interconnect performance, and on input matrix characteristics such as the number and location of nonzeros. We present LUsim, a simulation framework for modeling the performance of sparse LU factorization. Our framework uses micro-benchmarks to calibrate the parameters of machine characteristics and additional tools to facilitate real-time performance modeling. We are using LUsim to analyze an existing parallel sparse LU factorization code, and to explore a latency tolerant variant. We developed and validated a model of the factorization in SuperLU_DIST, then we modeled and implemented a new variant of slud, replacing a blocking collective communication phase with a non-blocking asynchronous point-to-point one. Our strategy realized a mean improvement of 11 percent over a suite of test matrices.
Superposition-Enhanced Estimation of Optimal Temperature Spacings for Parallel Tempering Simulations
2014-01-01
Effective parallel tempering simulations rely crucially on a properly chosen sequence of temperatures. While it is desirable to achieve a uniform exchange acceptance rate across neighboring replicas, finding a set of temperatures that achieves this end is often a difficult task, in particular for systems undergoing phase transitions. Here we present a method for determination of optimal replica spacings, which is based upon knowledge of local minima in the potential energy landscape. Working within the harmonic superposition approximation, we derive an analytic expression for the parallel tempering acceptance rate as a function of the replica temperatures. For a particular system and a given database of minima, we show how this expression can be used to determine optimal temperatures that achieve a desired uniform acceptance rate. We test our strategy for two atomic clusters that exhibit broken ergodicity, demonstrating that our method achieves uniform acceptance as well as significant efficiency gains. PMID:25512744
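As background, the acceptance rate that the temperature spacing controls comes from the standard Metropolis swap criterion for parallel tempering. The sketch below shows that textbook criterion together with a geometric temperature ladder, a common baseline spacing; neither is the superposition-based expression derived in the paper, and the function names and reduced units (k_B = 1) are assumptions.

```python
import math


def swap_acceptance(e_i, e_j, t_i, t_j, k_b=1.0):
    """Standard Metropolis probability of swapping replicas i and j.

    p = min(1, exp((beta_i - beta_j) * (E_i - E_j))), with beta = 1 / (k_B * T).
    """
    delta = (1.0 / (k_b * t_i) - 1.0 / (k_b * t_j)) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def geometric_ladder(t_min, t_max, n):
    """n temperatures with a constant ratio between neighbours (baseline only)."""
    ratio = (t_max / t_min) ** (1.0 / (n - 1))
    return [t_min * ratio ** k for k in range(n)]


# The colder replica (T = 1) holds the higher energy: the swap is always accepted.
p_favourable = swap_acceptance(e_i=-1.0, e_j=-2.0, t_i=1.0, t_j=2.0)
# The colder replica holds the lower energy: the swap is accepted with exp(-0.5).
p_unfavourable = swap_acceptance(e_i=-2.0, e_j=-1.0, t_i=1.0, t_j=2.0)
ladder = geometric_ladder(1.0, 8.0, 4)
print(p_favourable, round(p_unfavourable, 4), [round(t, 3) for t in ladder])
```

A geometric ladder gives uniform acceptance only when the heat capacity is roughly constant, which is why systems with phase transitions need a more careful spacing.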
CLUSTEREASY: A Program for Simulating Scalar Field Evolution on Parallel Computers
Gary N Felder
2007-12-05
We describe a new, parallel programming version of the scalar field simulation program LATTICEEASY. The new C++ program, CLUSTEREASY, can simulate arbitrary scalar field models on distributed-memory clusters. The speed and memory requirements scale well with the number of processors. As with the serial version of LATTICEEASY, CLUSTEREASY can run simulations in one, two, or three dimensions, with or without expansion of the universe, with customizable parameters and output. The program and its full documentation are available on the LATTICEEASY website at http://www.science.smith.edu/departments/Physics/fstaff/gfelder/latticeeasy/. In this paper we provide a brief overview of what CLUSTEREASY does and the ways in which it does and doesn't differ from the serial version of LATTICEEASY.
Embedded Microclusters in Zeolites and Cluster Beam Sputtering -- Simulation on Parallel Computers
Greenwell, Donald L.; Kalia, Rajiv K.; Vashishta, Priya
1996-12-01
This report summarizes the research carried out under DOE supported program (DOE/ER/45477) Computer Science--during the course of this project. Large-scale molecular-dynamics (MD) simulations were performed to investigate: (1) sintering of microporous and nanophase Si3N4; (2) crack-front propagation in amorphous silica; (3) phonons in highly efficient multiscale algorithms and dynamic load-balancing schemes for mapping process, structural correlations, and mechanical behavior including dynamic fracture in graphitic tubules; and (4) amorphization and fracture in nanowires. The simulations were carried out with irregular atomistic simulations on distributed-memory parallel architectures. These research activities resulted in fifty-three publications and fifty-five invited presentations.
DSMC Simulations Of Low-Density Choked Flows In Parallel-Plate Channels
NASA Astrophysics Data System (ADS)
Ilgaz, M.; Çelenligil, M. C.
2003-05-01
Rarefied choked flows in parallel-plate channels have been studied using the direct simulation Monte Carlo technique. Calculations are performed for various transitional flows, and results are presented for the computed flowfield quantities, wall pressures and discharge coefficients. Comparisons are made with the available experimental data, and the physical and numerical factors which affect the solutions are discussed. Separate calculations are performed for the "nearly-incompressible" rarefied channel flows in which density variations are small. The calculations of this study indicate that the DSMC simulations are well suited for determining transitional flows from free-molecule limit to laminar flow regime, but become prohibitive for turbulent channel flow simulations with today's computers.
Xyce parallel electronic simulator users' guide, version 6.0.
Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Verley, Jason C.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Warrender, Christina E.; Baur, David G. [Raytheon, Albuquerque, NM
2013-08-01
This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. (2) A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. (3) Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.