For comprehensive and current results, perform a real-time search at Science.gov.

1

Parallel discrete event simulation

Parallel discrete event simulation (PDES), sometimes called distributed simulation, refers to the execution of a single discrete event simulation program on a parallel computer. PDES has attracted a considerable amount of interest in recent years. From a pragmatic standpoint, this interest arises from the fact that large simulations in engineering, computer science, economics, and military applications, to mention a few,

Richard M. Fujimoto

1990-01-01

2

Using TM for high-performance Discrete-Event Simulation on multi-core architectures

Slide extract covering discrete event simulation, parallel discrete-event simulation (PDES), the use of transactional memory (TM) for PDES, and conclusions and future work. Presented at EuroTM 2013.

Olivier Dalle

3

Parallel discrete-event simulation framework

This paper describes a software environment devised to support parallel and sequential discrete-event simulation. It provides assistance to the user in issues such as selection of the synchronization protocol to be used in the execution of the simulation of the model. The software framework has been built upon the bulk-synchronous model of parallel computing. The well-defined structure of this model

M. Mann; Rodrigo Miranda; A. Alavarado

2003-01-01

4

Parallel discrete event simulation using shared memory

NASA Technical Reports Server (NTRS)

With traditional event-list techniques, evaluating a detailed discrete-event simulation model can often require hours or even days of computation time. By eliminating the event list and maintaining only sufficient synchronization to ensure causality, parallel simulation can potentially provide speedups that are linear in the number of processors. A set of shared-memory experiments, using the Chandy-Misra distributed-simulation algorithm to simulate networks of queues, is presented. Parameters of the study include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential simulation of most queueing network models.
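The blocking rule at the heart of the Chandy-Misra algorithm can be sketched in a few lines (an illustrative Python sketch, not the paper's code; the data layout is invented): a logical process may consume an event only when every input channel holds at least one message, because only then is the smallest head timestamp provably the next event for that process.

```python
from collections import deque

def chandy_misra_step(lp):
    """A logical process may safely consume an event only when every
    input channel is nonempty: the smallest head timestamp is then
    guaranteed to be the next event for this LP (causality holds)."""
    if any(len(ch) == 0 for ch in lp["inputs"]):
        return None                      # must block: an earlier event could still arrive
    ch = min(lp["inputs"], key=lambda c: c[0][0])
    return ch.popleft()                  # (timestamp, payload)

lp = {"inputs": [deque([(3, "a"), (9, "b")]), deque([(5, "c")])]}
assert chandy_misra_step(lp) == (3, "a")
assert chandy_misra_step(lp) == (5, "c")
assert chandy_misra_step(lp) is None     # second channel drained -> block
```

Blocking on an empty channel is also the source of the algorithm's deadlock problem, which the null-message and deadlock-recovery variants address.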

Reed, Daniel A.; Malony, Allen D.; Mccredie, Bradley D.

1988-01-01

5

Parallel Discrete Event Simulation

Simulation can be described as the complex of activities associated with constructing models of real-world systems and simulating them. The extracted fragments mention an introduction to discrete event simulation, followed in section 3 by a parallel view of the sequential approach.

Benno Overeinder; Bob Hertzberger; Peter Sloot

6

Parallel discrete-event simulation of FCFS stochastic queueing networks

NASA Technical Reports Server (NTRS)

Physical systems are inherently parallel. Intuition suggests that simulations of these systems may be amenable to parallel execution. The parallel execution of a discrete-event simulation requires careful synchronization of processes in order to ensure the execution's correctness; this synchronization can degrade performance. Largely negative results were recently reported in a study which used a well-known synchronization method on queueing network simulations. Discussed here is a synchronization method (appointments), which has proven itself to be effective on simulations of FCFS queueing networks. The key concept behind appointments is the provision of lookahead. Lookahead is a prediction of a processor's future behavior, based on an analysis of the processor's simulation state. It is shown how lookahead can be computed for FCFS queueing network simulations; performance data are given that demonstrate the method's effectiveness under moderate to heavy loads, and performance tradeoffs between the quality of lookahead and the cost of computing lookahead are discussed.
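The appointment idea can be illustrated with a toy lookahead computation for a single FCFS server (a hypothetical sketch; the function name and parameters are invented, not taken from the paper):

```python
def fcfs_lookahead(now, busy_until, min_service):
    """Lower bound on the timestamp of the next message this FCFS
    server can send (its 'appointment' with downstream processors).
    If a job is in service, its departure time is already fixed;
    otherwise no job can depart before now + the minimum service time."""
    if busy_until > now:                 # job in service: departure known exactly
        return busy_until
    return now + min_service             # idle: earliest possible departure

# Idle server at t=10 with service times >= 2: promises nothing before 12.
assert fcfs_lookahead(10.0, busy_until=0.0, min_service=2.0) == 12.0
# Busy server: the in-service job departs at 15 regardless of future arrivals.
assert fcfs_lookahead(10.0, busy_until=15.0, min_service=2.0) == 15.0
```

Downstream processors can safely simulate up to this bound without waiting, which is exactly the mechanism that makes conservative FCFS simulations efficient.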

Nicol, David M.

1988-01-01

7

Parallel Discrete Event Simulation Using Space-Time Memory

An abstraction called space-time memory is discussed that allows parallel discrete event simulation programs using the Time Warp mechanism to be written using shared memory constructs. A few salient points concerning the implementation and use of space-time memory in parallel simulation are discussed. It is argued that this abstraction is useful from a programming standpoint for certain applications, and can yield good performance. Initial

Kaushik Ghosh; Richard Fujimoto

1991-01-01

8

The cost of conservative synchronization in parallel discrete event simulations

NASA Technical Reports Server (NTRS)

The performance of a synchronous conservative parallel discrete-event simulation protocol is analyzed. The class of simulation models considered is oriented around a physical domain and possesses a limited ability to predict future behavior. A stochastic model is used to show that as the volume of simulation activity in the model increases relative to a fixed architecture, the complexity of the average per-event overhead due to synchronization, event list manipulation, lookahead calculations, and processor idle time approaches the complexity of the average per-event overhead of a serial simulation. The method is therefore within a constant factor of optimal. The analysis demonstrates that on large problems--those for which parallel processing is ideally suited--there is often enough parallel workload so that processors are not usually idle. The viability of the method is also demonstrated empirically, showing how good performance is achieved on large problems using a thirty-two node Intel iPSC/2 distributed memory multiprocessor.

Nicol, David M.

1990-01-01

9

Parallel discrete-event simulation schemes with heterogeneous processing elements

NASA Astrophysics Data System (ADS)

To understand the effects of nonidentical processing elements (PEs) on parallel discrete-event simulation (PDES) schemes, two stochastic growth models, the restricted solid-on-solid (RSOS) model and the Family model, are investigated by simulations. The RSOS model is the model for the PDES scheme governed by the Kardar-Parisi-Zhang equation (KPZ scheme). The Family model is the model for the scheme governed by the Edwards-Wilkinson equation (EW scheme). Two kinds of distributions for nonidentical PEs are considered. In the first kind, the computing capacities of the PEs are not much different, whereas in the second kind the capacities are extremely widespread. The KPZ scheme on complex networks shows synchronizability and scalability regardless of the kind of PEs. The EW scheme never shows synchronizability for a random configuration of PEs of the first kind. However, by regularizing the arrangement of PEs of the first kind, the EW scheme can be made to show synchronizability. In contrast, the EW scheme never shows synchronizability for any configuration of PEs of the second kind.

Kim, Yup; Kwon, Ikhyun; Chae, Huiseung; Yook, Soon-Hyung

2014-07-01


11

Optimistic Parallel Discrete Event Simulation on a Beowulf Cluster of Multi-core Machines

The trend towards multi-core and many-core CPUs is forever changing the composition of the Beowulf cluster. The modern Beowulf cluster is now a heterogeneous cluster of single core, multi-core, and even many

Wilsey, Philip A.

12

On the Trade-off between Time and Space in Optimistic Parallel Discrete-Event Simulation

Optimistically synchronized parallel discrete-event simulation is based on the use of communicating sequential processes. Optimistic synchronization means that the processes execute under the assumption that synchronization is fortuitous. Periodic checkpointing of the state of a process allows the process to roll back to an earlier state when synchronization errors occur. This paper examines the effects of varying the

Bruno R. Preiss; Ian D. MacIntyre; Wayne M. Loucks

1992-01-01

13

We re-examine the problem of load balancing in conservatively synchronized parallel discrete-event simulations executed on high-performance computing clusters, focusing on simulations where computational and messaging load tend to be spatially clustered. Such domains are frequently characterized by the presence of geographic 'hot-spots' - regions that generate significantly more simulation events than others. Examples of such domains include simulations of urban regions, transportation networks, and networks where interaction between entities is often constrained by physical proximity. Noting that in conservatively synchronized parallel simulations the speed of execution is determined by the slowest (i.e., most heavily loaded) simulation process, we study different partitioning strategies for achieving equitable processor-load distribution in domains with spatially clustered load. In particular, we study the effectiveness of partitioning via spatial scattering to achieve optimal load balance. In this partitioning technique, nearby entities are explicitly assigned to different processors, thereby scattering the load across the cluster. This is motivated by two observations, namely, (i) since load is spatially clustered, spatial scattering should, intuitively, spread the load across the compute cluster, and (ii) in parallel simulations, equitable distribution of CPU load is a greater determinant of execution speed than message-passing overhead. Through large-scale simulation experiments - both of abstracted and real simulation models - we observe that scatter partitioning, even with its greatly increased messaging overhead, significantly outperforms more conventional spatial partitioning techniques that seek to reduce messaging overhead.
Further, even if hot-spots change over the course of the simulation, load continues to be balanced with spatial scattering as long as the underlying feature of spatial clustering is retained, leading us to the observation that spatial scattering can often obviate the need for dynamic load balancing.
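The intuition behind scatter partitioning can be illustrated with a toy load-assignment comparison (an invented Python sketch, not the authors' experiment): contiguous block partitioning puts an entire hot-spot on one processor, while assigning nearby cells to different processors spreads it evenly.

```python
def max_processor_load(cells, assign):
    """Load of the most heavily loaded processor under a partitioning
    rule; in conservative PDES this processor sets the execution speed."""
    loads = {}
    for i, load in enumerate(cells):
        p = assign(i)
        loads[p] = loads.get(p, 0) + load
    return max(loads.values())

P = 4
# Spatially clustered load: a 'hot-spot' region generates most events.
cells = [100] * 8 + [1] * 24            # 32 cells, hot-spot in cells 0..7

block   = max_processor_load(cells, lambda i: i // (len(cells) // P))
scatter = max_processor_load(cells, lambda i: i % P)   # nearby cells scattered

# Block partitioning puts the whole hot-spot on one processor (load 800);
# scattering spreads it (load 206), at the price of more messaging.
assert block == 800 and scatter == 206
```

The trade-off the paper studies is exactly this: scattering raises messaging overhead (neighbors now live on different processors) but equalizes CPU load, which dominates execution speed.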

Thulasidasan, Sunil [Los Alamos National Laboratory]; Kasiviswanathan, Shiva [Los Alamos National Laboratory]; Eidenbenz, Stephan [Los Alamos National Laboratory]; Romero, Philip [Los Alamos National Laboratory]

2010-01-01

14

Distributed discrete-event simulation

Traditional discrete-event simulations employ an inherently sequential algorithm. In practice, simulations of large systems are limited by this sequentiality, because only a modest number of events can be simulated. Distributed discrete-event simulation (carried out on a network of processors with asynchronous message-communicating capabilities) is proposed as an alternative; it may provide better performance by partitioning the simulation among the component
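The inherently sequential algorithm being contrasted here is the classic event-list loop, which can be sketched as follows (an illustrative Python sketch; the names are invented):

```python
import heapq

def simulate(initial_events, end_time):
    """Classic sequential discrete-event loop: repeatedly pop the
    lowest-timestamp event and let its handler schedule new events."""
    future = list(initial_events)          # (time, seq, handler) tuples
    heapq.heapify(future)
    log = []
    while future and future[0][0] <= end_time:
        clock, _, handler = heapq.heappop(future)
        log.append(clock)
        for ev in handler(clock):          # handler returns new events
            heapq.heappush(future, ev)
    return log

# A toy model: a single arrival process with fixed inter-arrival time 2.0.
def arrival(t, seq=[0]):
    seq[0] += 1
    return [(t + 2.0, seq[0], arrival)]

times = simulate([(0.0, 0, arrival)], end_time=7.0)
```

Because every event passes through the single priority queue, event processing is serialized; distributed simulation replaces this global queue with per-process queues and message passing.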

Jayadev Misra

1986-01-01

15

We investigate the universal characteristics of the simulated time horizon of the basic conservative parallel algorithm when implemented on regular lattices. This technique [1, 2] is generically applicable to various physical, biological, or chemical systems where the underlying dynamics is asynchronous. Employing direct simulations, and using standard tools and the concept of dynamic scaling from non-equilibrium surface/interface physics, we identify

G. Korniss; M. A. Novotny; A. K. Kolakowska; H. Guclu

2002-01-01

16

Writing parallel, discrete-event simulations in ModSim: Insight and experience

The Time Warp Operating System (TWOS) has been the focus of much research in parallel simulation. A new language, called ModSim, has been developed for use in conjunction with TWOS. The coupling of ModSim and TWOS provides a tool to construct large, complex simulation models that will run on several parallel and distributed computer systems. As part of the "Griffin Project" underway here at Los Alamos National Laboratory, there is strong interest in assessing the coupling of ModSim and TWOS from an application-oriented perspective. To this end, a key component of the Eagle combat simulation has been implemented in ModSim for execution on TWOS. In this paper brief overviews of ModSim and TWOS will be presented. Finally, the compatibility of the computational models presented by the language and the operating system will be examined in light of experience gained to date. 18 refs., 4 figs.

Rich, D.O.; Michelsen, R.E.

1989-09-11

17

THE OMNET++ DISCRETE EVENT SIMULATION SYSTEM

The paper introduces OMNeT++, a C++-based discrete event simulation package primarily targeted at simulating computer networks and other distributed systems. OMNeT++ is fully programmable and modular, and it was designed from the ground up to support modeling very large networks built from reusable model components. Much emphasis was also placed on the traceability and debuggability of simulation models: one can

András Varga

2001-01-01

18

Performance bounds on parallel self-initiating discrete-event

NASA Technical Reports Server (NTRS)

The use of massively parallel architectures to execute discrete-event simulations of what are termed self-initiating models is considered. A logical process in a self-initiating model schedules its own state re-evaluation times, independently of any other logical process, and sends its new state to other logical processes following the re-evaluation. The interest is in the effects of that communication on synchronization. The performance of various synchronization protocols is considered by deriving upper and lower bounds on optimal performance, upper bounds on Time Warp's performance, and lower bounds on the performance of a new conservative protocol. The analysis of Time Warp includes the overhead costs of state-saving and rollback. The analysis points out sufficient conditions for the conservative protocol to outperform Time Warp. The analysis also quantifies the sensitivity of performance to message fan-out, lookahead ability, and the probability distributions underlying the simulation.

Nicol, David M.

1990-01-01

19

RFIDSim: a discrete event simulator for Radio Frequency Identification systems

This thesis presents RFIDSim, a discrete event process-oriented simulator designed to model Radio Frequency Identification (RFID) communication. The simulator focuses on the discovery and identification process of passive ...

Yu, Kenneth Kwan-Wai, 1979-

2003-01-01

20

Maisie: A Language for the Design of Efficient Discrete-Event Simulations

Maisie is a C-based discrete-event simulation language that was designed to cleanly separate a simulation model from the underlying algorithm (sequential or parallel) used for the execution of the model. With few modifications, a Maisie program may be executed using a sequential simulation algorithm, a parallel conservative algorithm or a parallel optimistic algorithm. The language constructs allow the runtime system to implement optimizations that

Rajive L. Bagrodia; Wen-toh Liao

1994-01-01

21

Discrete event simulation in the artificial intelligence environment

Discrete event simulations performed in an Artificial Intelligence (AI) environment provide benefits in two major areas. First, the productivity provided by Object-Oriented Programming, Rule-Based Programming, and AI development environments allows simulations to be developed and maintained more efficiently than conventional environments allow. Second, the use of AI techniques allows direct simulation of human decision-making processes and the Command and Control aspects of a system under study. An introduction to AI techniques is presented. Two discrete event simulations produced in these environments are described. Finally, a software engineering methodology is discussed that allows simulations to be designed for use in these environments. 3 figs.

Egdorf, H.W.; Roberts, D.J.

1987-01-01

22

Discrete-Event Simulation in Chemical Engineering.

ERIC Educational Resources Information Center

Gives examples, descriptions, and uses for various types of simulation systems, including the Flowtran, Process, Aspen Plus, Design II, GPSS, Simula, and Simscript. Explains similarities in simulators, terminology, and a batch chemical process. Tables and diagrams are included. (RT)

Schultheisz, Daniel; Sommerfeld, Jude T.

1988-01-01

23

Reversible Discrete Event Formulation and Optimistic Parallel Execution of Vehicular Traffic Models

Vehicular traffic simulations are useful in applications such as emergency planning and traffic management. High speed of traffic simulations translates to speed of response and level of resilience in those applications. Discrete event formulation of traffic flow at the level of individual vehicles affords both the flexibility of simulating complex scenarios of vehicular flow behavior as well as rapid simulation time advances. However, efficient parallel/distributed execution of the models becomes challenging due to synchronization overheads. Here, a parallel traffic simulation approach is presented that is aimed at reducing the time for simulating emergency vehicular traffic scenarios. Our approach resolves the challenges that arise in parallel execution of microscopic, vehicular-level models of traffic. We apply a reverse computation-based optimistic execution approach to address the parallel synchronization problem. This is achieved by formulating a reversible version of a discrete event model of vehicular traffic, and by utilizing this reversible model in an optimistic execution setting. Three unique aspects of this effort are: (1) exploration of optimistic simulation applied to vehicular traffic simulation, (2) addressing reverse computation challenges specific to optimistic vehicular traffic simulation, and (3) achieving absolute (as opposed to self-relative) speedup with a sequential speed close to that of a fast, de facto standard sequential simulator for emergency traffic. The design and development of the parallel simulation system is presented, along with a performance study that demonstrates excellent sequential performance as well as parallel performance. The benefits of optimistic execution are demonstrated, including a speedup of nearly 20 on 32 processors observed on a vehicular network of over 65,000 intersections and over 13 million vehicles.
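The reverse-computation idea can be illustrated with a toy reversible event (a hypothetical Python sketch, not the authors' model): the forward handler records only the constant-size information needed to undo itself, so rollback re-executes an inverse function instead of restoring a state snapshot.

```python
def forward_move(state, ev):
    """Event: the vehicle at the head of segment `src` advances to `dst`.
    The handler returns the minimal record needed to invert itself."""
    src, dst = ev
    vehicle = state[src].pop(0)
    state[dst].append(vehicle)
    return ("moved", src, dst)           # constant-size undo record

def reverse_move(state, undo):
    """Inverse event, applied during rollback instead of restoring a
    full checkpoint (the essence of reverse computation)."""
    _, src, dst = undo
    state[src].insert(0, state[dst].pop())

state = {"A": ["car1", "car2"], "B": []}
undo = forward_move(state, ("A", "B"))
assert state == {"A": ["car2"], "B": ["car1"]}
reverse_move(state, undo)
assert state == {"A": ["car1", "car2"], "B": []}
```

Because the undo record is tiny compared with a full state snapshot, optimistic execution can roll back cheaply, which is what makes the approach attractive at the scale of millions of vehicles.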

Yoginath, Srikanth B [ORNL]; Perumalla, Kalyan S [ORNL]

2009-01-01

24

Reversible Parallel Discrete-Event Execution of Large-scale Epidemic Outbreak Models

The spatial scale, runtime speed and behavioral detail of epidemic outbreak simulations together require the use of large-scale parallel processing. In this paper, an optimistic parallel discrete event execution of a reaction-diffusion simulation model of epidemic outbreaks is presented, with an implementation over the µsik simulator. Rollback support is achieved with the development of a novel reversible model that combines reverse computation with a small amount of incremental state saving. Parallel speedup and other runtime performance metrics of the simulation are tested on a small (8,192-core) Blue Gene/P system, while scalability is demonstrated on 65,536 cores of a large Cray XT5 system. Scenarios representing large population sizes (up to several hundred million individuals in the largest case) are exercised.

Perumalla, Kalyan S [ORNL]; Seal, Sudip K [ORNL]

2010-01-01

25

Discrete event simulation in an artificial intelligence environment: Some examples

Several Los Alamos National Laboratory (LANL) object-oriented discrete-event simulation efforts have been completed during the past three years. One of these systems has been put into production and has a growing customer base. Another (started two years earlier than the first project) was completed but has not yet been used. This paper will describe these simulation projects. Factors which were pertinent to the success of the one project, and to the failure of the second project will be discussed (success will be measured as the extent to which the simulation model was used as originally intended). 5 figs.

Roberts, D.J.; Farish, T.

1991-01-01

26

Discrete event simulation of patient admissions to a neurovascular unit.

Evidence exists that clinical outcomes improve for stroke patients admitted to specialized Stroke Units. The Toronto Western Hospital created a Neurovascular Unit (NVU) using beds from general internal medicine, Neurology and Neurosurgery to care for patients with stroke and acute neurovascular conditions. Using patient-level data for NVU-eligible patients, a discrete event simulation was created to study changes in patient flow and length of stay pre- and post-NVU implementation. Varying patient volumes and resources were tested to determine the ideal number of beds under various conditions. In the first year of operation, the NVU admitted 507 patients, over 66% of NVU-eligible patient volumes. With the introduction of the NVU, length of stay decreased by around 8%. Scenario testing showed that the current level of 20 beds is sufficient for accommodating the current demand and would continue to be sufficient with an increase in demand of up to 20%. PMID:25193372

Hahn-Goldberg, S; Chow, E; Appel, E; Ko, F T F; Tan, P; Gavin, M B; Ng, T; Abrams, H B; Casaubon, L K; Carter, M W

2014-01-01

27

Discrete Event Modeling and Massively Parallel Execution of Epidemic Outbreak Phenomena

In complex phenomena such as epidemiological outbreaks, the intensity of inherent feedback effects and the significant role of transients in the dynamics make simulation the only effective method for proactive, reactive, or post-facto analysis. The spatial scale, runtime speed, and behavioral detail needed in detailed simulations of epidemic outbreaks make it necessary to use large-scale parallel processing. Here, an optimistic parallel execution of a new discrete event formulation of a reaction-diffusion simulation model of epidemic propagation is presented to dramatically increase the fidelity and speed with which epidemiological simulations can be performed. Rollback support needed during optimistic parallel execution is achieved by combining reverse computation with a small amount of incremental state saving. Parallel speedup of over 5,500 and other runtime performance metrics of the system are observed with weak-scaling execution on a small (8,192-core) Blue Gene/P system, while scalability with a weak-scaling speedup of over 10,000 is demonstrated on 65,536 cores of a large Cray XT5 system. Scenarios representing large population sizes exceeding several hundred million individuals in the largest cases are successfully exercised to verify model scalability.

Perumalla, Kalyan S [ORNL]; Seal, Sudip K [ORNL]

2011-01-01

28

Metrics for Availability Analysis Using a Discrete Event Simulation Method

The system performance metric 'availability' is a central concept with respect to the concerns of a plant's operators and owners, yet it can be abstract enough to resist explanation at system levels. Hence, there is a need for a system-level metric more closely aligned with a plant's (or, more generally, a system's) raison d'être. Historically, availability of repairable systems - intrinsic, operational, or otherwise - has been defined as a ratio of times. This paper introduces a new concept of availability, called endogenous availability, defined in terms of a ratio of quantities of product yield. Endogenous availability can be evaluated using a discrete event simulation analysis methodology. A simulation example shows that endogenous availability reduces to conventional availability in a simple series system with different processing rates and without intermediate storage capacity, but diverges from conventional availability when storage capacity is progressively increased. It is shown that conventional availability tends to be conservative when a design includes features, such as in-process storage, that partially decouple the components of a larger system.

Schryver, Jack C [ORNL]; Nutaro, James J [ORNL]; Haire, Marvin Jonathan [ORNL]

2012-01-01

29

Discrete EVent Systems Specification (DEVS) formalism supports specification of discrete event models in a hierarchical, modular manner. This paper proposes a DEVS modeling language called DEVS Specification Language (DEVSpecL), based on which discrete event systems are modeled, simulated, and analyzed within a DEVS-based framework for seamless systems design. Models specified in DEVSpecL can be translated into different forms of code

Ki Jung Hong; Tag Gon Kim

2006-01-01

30

Parallel discrete event simulation with predictors

If a process at simulation time Clock can guarantee that it will not send any message with a timestamp earlier than Clock + L, the process is said to have a lookahead of L. Lookahead enhances one's ability to predict future events, which in turn can be used to determine which other events are safe to process. Lookahead is used in the deadlock avoidance...
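The safe-event rule implied by this definition of lookahead can be sketched as follows (an illustrative Python sketch with invented names): an event is safe once its timestamp cannot be preceded by anything an input could still send.

```python
def safe_events(pending, channel_clocks, lookaheads):
    """An event is safe to process if its timestamp does not exceed the
    earliest time any input could still send a message: the minimum over
    input channels of (channel clock + that channel's lookahead L)."""
    horizon = min(c + lookaheads[i] for i, c in enumerate(channel_clocks))
    return [e for e in pending if e[0] <= horizon]

pending = [(4, "x"), (6, "y"), (9, "z")]
# Inputs have advanced to times 3 and 5, with lookaheads 4 and 2:
# nothing can arrive before min(3+4, 5+2) = 7, so events at 4 and 6 are safe.
assert safe_events(pending, [3, 5], [4, 2]) == [(4, "x"), (6, "y")]
```

Larger lookahead pushes the horizon further ahead, letting more events be processed without blocking, which is why lookahead is central to conservative protocols and deadlock avoidance.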

Gummadi, Vidya

2012-06-07

31

Discrete event simulation and production system design for Rockwell hardness test blocks

The research focuses on increasing production volume and decreasing costs at a hardness test block manufacturer. A discrete event simulation model is created to investigate potential system wide improvements. Using the ...

Scheinman, David Eliot

2009-01-01

32

Web-based simulation 1: D-SOL; a distributed Java based discrete event simulation architecture

Most discrete event simulation environments are based on a process-oriented, and therefore multi-threaded paradigm. This results in simulation environments that are very hard to distribute over more computers, and not easy to integrate with scattered external information sources. The architecture presented here is based on the event-based DES paradigm which is implemented by scheduled method invocation. Objects used in the
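The scheduled-method-invocation paradigm described here can be sketched in a few lines (a minimal illustrative Python sketch, not D-SOL's actual API): instead of suspendable processes on separate threads, each event is a (time, method, arguments) record executed by a single-threaded loop.

```python
import heapq

class Simulator:
    """Event-based DES via scheduled method invocation: every event is
    a plain method call scheduled at a simulated time; no event needs
    its own thread, so the simulator distributes easily."""
    def __init__(self):
        self._queue, self._seq, self.now = [], 0, 0.0
    def schedule(self, delay, method, *args):
        self._seq += 1                   # tie-breaker keeps ordering stable
        heapq.heappush(self._queue, (self.now + delay, self._seq, method, args))
    def run(self):
        while self._queue:
            self.now, _, method, args = heapq.heappop(self._queue)
            method(*args)

sim = Simulator()
trace = []
def ping(n):
    trace.append((sim.now, n))
    if n < 3:
        sim.schedule(1.5, ping, n + 1)   # event schedules its successor

sim.schedule(0.0, ping, 1)
sim.run()
assert trace == [(0.0, 1), (1.5, 2), (3.0, 3)]
```

Because no event ever blocks mid-execution, there is no thread state to migrate, which is the property the abstract credits for easier distribution and integration with external information sources.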

Peter H. M. Jacobs; Niels A. Lang; Alexander Verbraeck

2002-01-01

33

An approach for loosely coupled discrete event simulation models and animation components

Animation techniques are used during simulation studies for verifying and validating the model, and for communication purposes with external parties. Including animation in a simulation model often results in models that contain many references to animation-specific code. Moreover, in the specific case of discrete event simulation, challenges arise due to the difference in time-bases with animation techniques that mostly

Michele Fumarola; Mamadou Seck; Alexander Verbraeck

2010-01-01

34

Interactive Discrete-Event Simulation of Construction (Virginia Tech, September 24th-26th, 2003)

Interactive discrete-event simulation can be of significant help in the verification, validation, and accreditation of simulation models. The extracted fragments describe interacting with a running simulation model to proactively influence/control the model's behavior and/or outcome in real time, and manipulating one or more simulation entities (i.e., resources) in a running model via a Virtual Reality interface.

Kamat, Vineet R.

35

Objectives: This article explores the potential of discrete event simulation (DES) methods to advance system-level investigation of emergency department (ED) operations. To this end, the authors describe the development and operation of Emergency Department SIMulation (EDSIM), a new platform for computer simulation of ED activity at a Level 1 trauma center. The authors also demonstrate one potential application of EDSIM

Lloyd G. Connelly; Aaron E. Bair

2004-01-01

36

A sustainable manufacturing systems design using processes, methodologies, and technologies that are energy efficient and environmentally friendly is desirable and essential for the sustainable development of products and services. Efforts must be made to create and maintain such sustainable manufacturing systems. Discrete Event Simulation (DES) in combination with a Life Cycle Assessment (LCA) system can be utilized to evaluate a manufacturing system

Björn Johansson; Anders Skoogh; Mahesh Mani; Swee Leong

2009-01-01

37

Bottleneck analysis in MDF-production by means of discrete event simulation

Material flow analysis by means of discrete event simulation proved to be a useful tool for decision support by several studies. This case study presents a bottleneck analysis for an Austrian Medium Density Fibreboard (MDF) production plant. The developed model was linked to actual production data and animated. The aim was to picture production, storage and transporting processes from the

2007-01-01

38

A Survey of the Use of the Discrete-Event Simulation in Manufacturing Industry

Discrete-event simulation improves the ability to study manufacturing systems. Its use is increasing compared to previous studies: 15% of the 80 companies investigated use the tool, four of them extensively. The main advantage according to the survey, besides visualization, is that knowledge about a system is investigated and documented. Other benefits can be

A. Ingemansson; G. S. Bolmsjö; U. Harlin

39

Using Discrete-Event Simulation to Model Situational Awareness of Unmanned-Vehicle Operators

As the paradigm of operators supervising multiple unmanned vehicles emerges, discrete-event simulation is used to model the effect on situational awareness as the size of the unmanned-vehicle team being supervised is varied.

Cummings, Mary "Missy"

40

Koala: A Discrete-Event Simulation Model of Infrastructure Clouds

Koala is a discrete-event simulator that can model infrastructure-as-a-service (IaaS) clouds of up to O(10^5) nodes. Koala is based loosely on the Amazon Elastic Compute Cloud (EC2) and on the Eucalyptus open

41

DISCRETE EVENT SIMULATION OF OPTICAL SWITCH MATRIX PERFORMANCE IN COMPUTER NETWORKS

In this paper, we present application of a Discrete Event Simulator (DES) for performance modeling of optical switching devices in computer networks. Network simulators are valuable tools in situations where one cannot investigate the system directly. This situation may arise if the system under study does not exist yet or the cost of studying the system directly is prohibitive. Most available network simulators are based on the paradigm of discrete-event-based simulation. As computer networks become increasingly larger and more complex, sophisticated DES tool chains have become available for both commercial and academic research. Some well-known simulators are NS2, NS3, OPNET, and OMNEST. For this research, we have applied OMNEST for the purpose of simulating multi-wavelength performance of optical switch matrices in computer interconnection networks. Our results suggest that the application of DES to computer interconnection networks provides valuable insight in device performance and aids in topology and system optimization.

Imam, Neena [ORNL; Poole, Stephen W [ORNL

2013-01-01

42

A hybrid system is a combination of discrete event and continuous systems that act together to perform a function not possible with either of the individual system types alone. A simulation model for such a system consists of two sub-models, a continuous system model and a discrete event model, and the interfaces between them. Naturally, the modeling/simulation tool/environment of each sub-model

Changho Sung; Tag Gon Kim

2011-01-01

43

Discrete event model-based simulation for train movement on a single-line railway

NASA Astrophysics Data System (ADS)

The aim of this paper is to present a discrete event model-based approach to simulate train movement while accounting for energy-saving factors. We conduct extensive case studies to show the dynamic characteristics of the traffic flow and demonstrate the effectiveness of the proposed approach. The simulation results indicate that the proposed discrete event model-based simulation approach is suitable for characterizing the movements of a group of trains on a single railway line with fewer iterations and less CPU time. Additionally, some other qualitative and quantitative characteristics are investigated. In particular, because of the cumulative influence of the preceding trains, the following trains must accelerate or brake frequently to control the headway distance, leading to more energy consumption.
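The headway-control behavior described above can be sketched as a single movement event for a following train: brake when too close to the predecessor, otherwise accelerate toward line speed. All constants (minimum headway, acceleration and braking rates, line speed) are illustrative assumptions, not the paper's calibrated values:

```python
MIN_HEADWAY = 500.0   # metres (assumed)
LINE_SPEED  = 30.0    # m/s   (assumed)

def follower_step(lead_pos, foll_pos, foll_speed,
                  accel=0.5, brake=0.8, dt=10.0):
    """One movement event for the following train: brake when the
    headway to the train ahead is too short, otherwise accelerate
    toward line speed. Returns (new_position, new_speed)."""
    headway = lead_pos - foll_pos
    if headway < MIN_HEADWAY:
        foll_speed = max(0.0, foll_speed - brake * dt)   # brake
    else:
        foll_speed = min(LINE_SPEED, foll_speed + accel * dt)  # accelerate
    return foll_pos + foll_speed * dt, foll_speed
```

Chaining such events for a queue of trains reproduces the cumulative effect the abstract notes: a short headway forces repeated brake/accelerate cycles, and hence extra energy consumption, for the following trains.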

Xu, Xiao-Ming; Li, Ke-Ping; Yang, Li-Xing

2014-08-01

44

Although discrete-event simulation has pedagogically been rooted in computer science, and geographic information systems in geography, the combined use of both in the business world allows solving some very challenging temporal/spatial (time- and space-dependent) business problems. The discrete-event simulation language WebGPSS, an ideal simulation environment for the business person, is teamed with Microsoft MapPoint,

Richard G. Born

2005-01-01

45

Discrete event simulation of the Defense Waste Processing Facility (DWPF) analytical laboratory

A discrete event simulation of the Savannah River Site (SRS) Defense Waste Processing Facility (DWPF) analytical laboratory has been constructed in the GPSS language. It was used to estimate laboratory analysis times at process analytical hold points and to study the effect of sample number on those times. Typical results are presented for three different scenarios representing increasing levels of complexity, and for different sampling schemes. Example equipment-utilization time plots are also included. SRS DWPF laboratory management and chemists found the simulations very useful for resource and schedule planning.

Shanahan, K.L.

1992-02-01

46

Discrete-event simulation of uncertainty in single-neutron experiments

A discrete-event simulation approach which provides a cause-and-effect description of many experiments with photons and neutrons exhibiting interference and entanglement is applied to a recent single-neutron experiment that tests (generalizations of) Heisenberg's uncertainty relation. The event-based simulation algorithm reproduces the results of the quantum theoretical description of the experiment but does not require the knowledge of the solution of a wave equation nor does it rely on concepts of quantum theory. In particular, the data satisfies uncertainty relations derived in the context of quantum theory.

Hans De Raedt; Kristel Michielsen

2014-03-18

47

Modeling and Solution Issues in Discrete Event Simulation

Simulation builds a model of a real system and helps to predict performance, test ideas, and support operator training and model validation (e.g., a computational pilot plant). Events mark changes in the state of the system; resources represent anything with restricted capacity.

Grossmann, Ignacio E.

48

A Framework for the Optimization of Discrete-Event Simulation Models

NASA Technical Reports Server (NTRS)

With the growing use of computer modeling and simulation in all aspects of engineering, the scope of traditional optimization has to be extended to include simulation models. Some unique aspects have to be addressed when optimizing via stochastic simulation models: the optimization procedure has to explicitly account for the randomness inherent in the stochastic measures predicted by the model. This paper outlines a general-purpose framework for the optimization of terminating discrete-event simulation models. The methodology combines a chance-constraint approach for problem formulation with standard statistical estimation and analysis techniques. The applicability of the optimization framework is illustrated by minimizing the operation and support resources of a launch vehicle through a simulation model.
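One way to realize the chance-constraint idea is to estimate the constraint probability from independent replications and search for the smallest resource level that satisfies it. The toy workload below is an assumed stand-in for illustration, not the launch-vehicle model from the paper:

```python
import random

def simulate_makespan(servers, rng):
    """Toy terminating simulation: makespan of 20 jobs with exponential
    work shared evenly by `servers` parallel servers (assumed model)."""
    work = sum(rng.expovariate(1.0) for _ in range(20))
    return work / servers

def meets_chance_constraint(servers, limit, alpha=0.95, reps=500, seed=1):
    """Estimate P(makespan <= limit) from replications; the chance
    constraint requires this probability to be at least alpha."""
    rng = random.Random(seed)
    hits = sum(simulate_makespan(servers, rng) <= limit for _ in range(reps))
    return hits / reps >= alpha

def minimize_servers(limit, max_servers=10):
    """Smallest resource level that satisfies the chance constraint."""
    for s in range(1, max_servers + 1):
        if meets_chance_constraint(s, limit):
            return s
    return None
```

The explicit replication loop is what "accounting for the randomness inherent in the stochastic measures" amounts to in practice: a candidate is judged by an estimated probability, not a single simulated value.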

Joshi, B. D.; Unal, R.; White, N. H.; Morris, W. D.

1996-01-01

49

Discrete-event simulation for the design and evaluation of physical protection systems

This paper explores the use of discrete-event simulation for the design and control of physical protection systems for fixed-site facilities housing items of significant value. It begins by discussing several modeling and simulation activities currently performed in designing and analyzing these protection systems and then discusses capabilities that design/analysis tools should have. The remainder of the article then discusses in detail how some of these new capabilities have been implemented in software to achieve a prototype design and analysis tool. The simulation software technology provides a communications mechanism between a running simulation and one or more external programs. In the prototype security analysis tool, these capabilities are used to facilitate human-in-the-loop interaction and to support a real-time connection to a virtual reality (VR) model of the facility being analyzed. This simulation tool can be used for both training (in real-time mode) and facility analysis and design (in fast mode).

Jordan, S.E.; Snell, M.K.; Madsen, M.M. [Sandia National Labs., Albuquerque, NM (United States); Smith, J.S.; Peters, B.A. [Texas A and M Univ., College Station, TX (United States). Industrial Engineering Dept.

1998-08-01

50

DeMO: An Ontology for Discrete-event Modeling and Simulation

Several fields have created ontologies for their subdomains. For example, the biological sciences have developed extensive ontologies such as the Gene Ontology, which is considered a great success. Ontologies could provide similar advantages to the Modeling and Simulation community. They provide a way to establish common vocabularies and capture knowledge about a particular domain with community-wide agreement. Ontologies can support significantly improved (semantic) search and browsing, integration of heterogeneous information sources, and improved knowledge discovery capabilities. This paper discusses the design and development of an ontology for Modeling and Simulation called the Discrete-event Modeling Ontology (DeMO), and it presents prototype applications that demonstrate various uses and benefits that such an ontology may provide to the Modeling and Simulation community. PMID:22919114

Silver, Gregory A; Miller, John A; Hybinette, Maria; Baramidze, Gregory; York, William S

2011-01-01

51

Statistical and Probabilistic Extensions to Ground Operations' Discrete Event Simulation Modeling

NASA Technical Reports Server (NTRS)

NASA's human exploration initiatives will invest in technologies, public/private partnerships, and infrastructure, paving the way for the expansion of human civilization into the solar system and beyond. As it has been for the past half century, the Kennedy Space Center will be the embarkation point for humankind's journey into the cosmos. Functioning as a next generation space launch complex, Kennedy's launch pads, integration facilities, processing areas, launch and recovery ranges will bustle with the activities of the world's space transportation providers. In developing this complex, KSC teams work through the potential operational scenarios: conducting trade studies, planning and budgeting for expensive and limited resources, and simulating alternative operational schemes. Numerous tools, among them discrete event simulation (DES), were matured during the Constellation Program to conduct such analyses with the purpose of optimizing the launch complex for maximum efficiency, safety, and flexibility while minimizing life cycle costs. Discrete event simulation is a computer-based modeling technique for complex and dynamic systems where the state of the system changes at discrete points in time and whose inputs may include random variables. DES is used to assess timelines and throughput, and to support operability studies and contingency analyses. It is applicable to any space launch campaign and informs decision-makers of the effects of varying numbers of expensive resources and the impact of off-nominal scenarios on measures of performance. In order to develop representative DES models, methods were adopted, exploited, or created to extend traditional uses of DES. The Delphi method was adopted and utilized for task-duration estimation. DES software was exploited for probabilistic event variation. A roll-up process was developed and used to reuse models and model elements in other, less-detailed models.
The DES team continues to innovate and expand DES capabilities to address KSC's planning needs.

Trocine, Linda; Cummings, Nicholas H.; Bazzana, Ashley M.; Rychlik, Nathan; LeCroy, Kenneth L.; Cates, Grant R.

2010-01-01

52

NASA Astrophysics Data System (ADS)

Sudden Cardiac Death (SCD) is responsible for at least 180,000 deaths a year and incurs an average cost of $286 billion annually in the United States alone. Herein, we present a novel discrete event simulation model of SCD, which quantifies the chains of events associated with the formation, growth, and rupture of atheroma plaques, and the subsequent formation of clots, thrombosis, and onset of arrhythmias within a population. The predictions generated by the model are in good agreement both with results obtained from pathological examinations on the frequencies of three major types of atheroma, and with epidemiological data on the prevalence and risk of SCD. These model predictions allow for identification of interventions, and importantly of the optimal time of intervention, leading to high potential impact on SCD risk reduction (up to an 8-fold reduction in the number of SCDs in the population) as well as an increase in life expectancy.

Andreev, Victor P.; Head, Trajen; Johnson, Neil; Deo, Sapna K.; Daunert, Sylvia; Goldschmidt-Clermont, Pascal J.

2013-05-01

53

This study used discrete event simulation to model the personnel recruiting process for a U.S. Army recruiting company. Actual data from the company was collected and used to build the simulation model. The model is run under various conditions...

Fancher, Robert H.

2012-06-07

54

The objective of the study was to develop a simulation model for field machinery operations using a discrete event simulation technique in order to analyse machinery performance based on daily status of soil workability for a series of years (Daily Work. method), and to compare the results with those from a simpler method based on average probability values of available

A. de Toro; P.-A. Hansson

2004-01-01

55

Modelling Large Pathways: A Hybrid Agent-Based Discrete Event Simulation Tool for Emergency Medical Services. The aim is to develop an agent-based discrete event simulation tool for analysing emergency medical services, to be investigated in the context of model re-use and the emergency medical services field. To reflect upon this aim, the following objectives

Oakley, Jeremy

56

The effects of indoor environmental exposures on pediatric asthma: a discrete event simulation model

Background In the United States, asthma is the most common chronic disease of childhood across all socioeconomic classes and is the most frequent cause of hospitalization among children. Asthma exacerbations have been associated with exposure to residential indoor environmental stressors such as allergens and air pollutants as well as numerous additional factors. Simulation modeling is a valuable tool that can be used to evaluate interventions for complex multifactorial diseases such as asthma but in spite of its flexibility and applicability, modeling applications in either environmental exposures or asthma have been limited to date. Methods We designed a discrete event simulation model to study the effect of environmental factors on asthma exacerbations in school-age children living in low-income multi-family housing. Model outcomes include asthma symptoms, medication use, hospitalizations, and emergency room visits. Environmental factors were linked to percent predicted forced expiratory volume in 1 second (FEV1%), which in turn was linked to risk equations for each outcome. Exposures affecting FEV1% included indoor and outdoor sources of NO2 and PM2.5, cockroach allergen, and dampness as a proxy for mold. Results Model design parameters and equations are described in detail. We evaluated the model by simulating 50,000 children over 10 years and showed that pollutant concentrations and health outcome rates are comparable to values reported in the literature. In an application example, we simulated what would happen if the kitchen and bathroom exhaust fans were improved for the entire cohort, and showed reductions in pollutant concentrations and healthcare utilization rates. Conclusions We describe the design and evaluation of a discrete event simulation model of pediatric asthma for children living in low-income multi-family housing. 
Our model simulates the effect of environmental factors (combustion pollutants and allergens), medication compliance, seasonality, and medical history on asthma outcomes (symptom-days, medication use, hospitalizations, and emergency room visits). The model can be used to evaluate building interventions and green building construction practices on pollutant concentrations, energy savings, and asthma healthcare utilization costs, and demonstrates the value of a simulation approach for studying complex diseases such as asthma. PMID:22989068
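The exposure-to-outcome chain described above (environmental exposures, mapped to FEV1%, mapped to risk equations for each outcome) can be sketched as follows. All coefficients here are invented placeholders, not the paper's fitted values:

```python
import math

def fev1_percent(no2_ppb, pm25_ugm3, cockroach_allergen, dampness):
    """Map indoor exposures to percent predicted FEV1 (linear sketch;
    every coefficient is a placeholder, not an estimate from the study)."""
    return (95.0
            - 0.05 * no2_ppb
            - 0.10 * pm25_ugm3
            - 2.0 * cockroach_allergen
            - 3.0 * dampness)

def weekly_exacerbation_prob(fev1_pct):
    """Illustrative logistic risk equation: lower lung function implies
    a higher probability of an exacerbation event this week."""
    return 1.0 / (1.0 + math.exp(0.15 * (fev1_pct - 60.0)))
```

In the full model each simulated child would draw exposures over time, and the resulting weekly probabilities would drive discrete events such as symptom-days, medication use, and emergency room visits.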

2012-01-01

57

Discrete event simulation tool for analysis of qualitative models of continuous processing systems

NASA Technical Reports Server (NTRS)

An artificial intelligence design and qualitative modeling tool is disclosed for creating computer models and simulating continuous activities, functions, and/or behavior using developed discrete event techniques. Conveniently, the tool is organized in four modules: library design module, model construction module, simulation module, and experimentation and analysis. The library design module supports the building of library knowledge including component classes and elements pertinent to a particular domain of continuous activities, functions, and behavior being modeled. The continuous behavior is defined discretely with respect to invocation statements, effect statements, and time delays. The functionality of the components is defined in terms of variable cluster instances, independent processes, and modes, further defined in terms of mode transition processes and mode dependent processes. Model construction utilizes the hierarchy of libraries and connects them with appropriate relations. The simulation executes a specialized initialization routine and executes events in a manner that includes selective inherency of characteristics through a time and event schema until the event queue in the simulator is emptied. The experimentation and analysis module supports analysis through the generation of appropriate log files and graphics developments and includes the ability of log file comparisons.

Malin, Jane T. (inventor); Basham, Bryan D. (inventor); Harris, Richard A. (inventor)

1990-01-01

58

In this paper, we present the first published healthcare application of discrete-event simulation embedded in an ant colony optimization model. We consider the problem of choosing optimal screening policies for retinopathy, a serious complication of diabetes. In order to minimize the screening cost per year of sight saved, compared with a situation with no screening, individuals aged between 30 and

Marion S. Rauner; Walter J. Gutjahr; Sally C. Brailsford; Wolfgang Zeppelzauer

59

In this paper we present the first application to a healthcare problem of discrete-event simulation (DES) embedded in an ant colony optimisation (ACO) model. We are concerned with choosing optimal screening policies for retinopathy, a sight-threatening complication of diabetes. The early signs of retinopathy can be detected by screening before the patient is aware of symptoms, and blindness prevented by

Sally C. Brailsford; Walter J. Gutjahr; Marion S. Rauner; Wolfgang Zeppelzauer

2007-01-01

60

NASA Astrophysics Data System (ADS)

The Time Warp algorithm [3] offers a run-time recovery mechanism that deals with causality errors. These recovery mechanisms consist of rollback, anti-messages, and Global Virtual Time (GVT) techniques. Rollback requires computing GVT, which is used in discrete-event simulation to reclaim memory, commit output, detect termination, and handle errors. However, the computation of GVT must deal with the transient-message problem and the simultaneous-reporting problem. These problems can be handled efficiently by Samadi's algorithm [8], which works well in the presence of causality errors. However, the performance of both the Time Warp and Samadi's algorithms depends on the latency involved in GVT computation, and both give poor latency for large simulation systems, especially in the presence of causality errors. To improve the latency and reduce processor idle time, we implement tree and butterfly barriers with the optimistic algorithm. Our analysis shows that the use of synchronous barriers such as tree and butterfly with the optimistic algorithm not only minimizes the GVT latency but also minimizes the processor idle time.
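Conceptually, GVT is a minimum over all local virtual times and all in-transit message timestamps; a minimal sketch, ignoring the transient-message and simultaneous-reporting subtleties that Samadi's algorithm addresses:

```python
def compute_gvt(local_virtual_times, in_transit_timestamps):
    """GVT lower-bounds the timestamp of any future event: the minimum
    over every logical process's local virtual time and every
    unacknowledged (transient) message timestamp. State saved before
    GVT can be fossil-collected, and output up to GVT committed."""
    return min(list(local_virtual_times) + list(in_transit_timestamps))

# Three logical processes, plus one message with timestamp 12.0 still
# in flight: the transient message, not any local clock, sets GVT.
gvt = compute_gvt([15.0, 20.0, 18.0], [12.0])
```

The latency question the abstract studies is how quickly all processes can agree on this minimum; the tree and butterfly barriers are two synchronous reduction patterns for gathering it.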

Rizvi, Syed S.; Shah, Dipali; Riasat, Aasia

61

Discrete event simulation for exploring strategies: an urban water management case.

This paper presents a model structure aimed at offering an overview of the various elements of a strategy and exploring their multidimensional effects through time in an efficient way. It treats a strategy as a set of discrete events planned to achieve a certain strategic goal and develops a new form of causal networks as an interfacing component between decision makers and environment models, e.g., life cycle inventory and material flow models. The causal network receives a strategic plan as input in a discrete manner and then outputs the updated parameter sets to the subsequent environmental models. Accordingly, the potential dynamic evolution of environmental systems caused by various strategies can be stepwise simulated. It enables a way to incorporate discontinuous change in models for environmental strategy analysis, and enhances the interpretability and extendibility of a complex model by its cellular constructs. It is exemplified using an urban water management case in Kunming, a major city in Southwest China. By utilizing the presented method, the case study modeled the cross-scale interdependencies of the urban drainage system and regional water balance systems, and evaluated the effectiveness of various strategies for improving the situation of Dianchi Lake. PMID:17328203

Huang, Dong-Bin; Scholz, Roland W; Gujer, Willi; Chitwood, Derek E; Loukopoulos, Peter; Schertenleib, Roland; Siegrist, Hansruedi

2007-02-01

62

Discrete-event simulation of a wide-area health care network.

OBJECTIVE: Predict the behavior and estimate the telecommunication cost of a wide-area message store-and-forward network for health care providers that uses the telephone system. DESIGN: A tool with which to perform large-scale discrete-event simulations was developed. Network models for star and mesh topologies were constructed to analyze the differences in performances and telecommunication costs. The distribution of nodes in the network models approximates the distribution of physicians, hospitals, medical labs, and insurers in the Province of Saskatchewan, Canada. Modeling parameters were based on measurements taken from a prototype telephone network and a survey conducted at two medical clinics. Simulation studies were conducted for both topologies. RESULTS: For either topology, the telecommunication cost of a network in Saskatchewan is projected to be less than $100 (Canadian) per month per node. The estimated telecommunication cost of the star topology is approximately half that of the mesh. Simulations predict that a mean end-to-end message delivery time of two hours or less is achievable at this cost. A doubling of the data volume results in an increase of less than 50% in the mean end-to-end message transfer time. CONCLUSION: The simulation models provided an estimate of network performance and telecommunication cost in a specific Canadian province. At the expected operating point, network performance appeared to be relatively insensitive to increases in data volume. Similar results might be anticipated in other rural states and provinces in North America where a telephone-based network is desired. PMID:7583646
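In a polled telephone store-and-forward network, the dominant delay is the wait for the destination's next scheduled poll. A small Monte Carlo sketch of that wait (the polling interval and horizon are assumed values, not the study's parameters):

```python
import random

def mean_transfer_wait(n_messages, poll_interval, horizon, seed=0):
    """Average wait (same time units as the inputs) until the
    destination's next poll, for messages submitted uniformly at
    random over the horizon."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_messages):
        t = rng.uniform(0.0, horizon)
        next_poll = (int(t // poll_interval) + 1) * poll_interval
        total += next_poll - t
    return total / n_messages
```

With uniform submissions the mean wait approaches half the polling interval, which is one reason end-to-end delivery time can stay relatively insensitive to moderate growth in data volume.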

McDaniel, J G

1995-01-01

63

Can discrete event simulation be of use in modelling major depression?

Background Depression is among the major contributors to worldwide disease burden, and adequate modelling requires a framework designed to depict real-world disease progression, as well as its economic implications, as closely as possible. Objectives In light of the specific characteristics associated with depression (multiple episodes at varying intervals, impact of disease history on course of illness, sociodemographic factors), our aim was to clarify to what extent Discrete Event Simulation (DES) models provide methodological benefits in depicting disease evolution. Methods We conducted a comprehensive review of published Markov models in depression and identified potential limits to their methodology. A model based on DES principles was developed to investigate the benefits and drawbacks of this simulation method compared with Markov modelling techniques. Results The major drawback of Markov models is that they may not be suitable for tracking patients' disease history properly, unless the analyst defines multiple health states, which may lead to intractable situations. They are also too rigid to take into consideration multiple patient-specific sociodemographic characteristics in a single model; to do so would likewise require defining multiple health states, which would render the analysis entirely too complex. We show that DES resolves these weaknesses and that its flexibility allows patients with differing attributes to move from one event to another in sequential order while simultaneously taking into account important risk factors such as age, gender, disease history, and patients' attitudes towards treatment, together with any disease-related events (adverse events, suicide attempts, etc.). Conclusion DES modelling appears to be an accurate, flexible, and comprehensive means of depicting disease progression compared with conventional simulation methodologies. Its use in analysing recurrent and chronic diseases appears particularly useful compared with Markov processes.
PMID:17147790
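The DES advantage argued here, letting the time to the next event depend on a patient's full history, can be sketched as follows; the hazard model and its constants are invented for illustration:

```python
import random

def next_relapse_delay(n_prior_episodes, rng):
    """History-dependent hazard: more prior episodes shorten the
    expected time (in months) to the next one. Constants are
    illustrative, not clinical estimates."""
    mean_months = 24.0 / (1 + n_prior_episodes)
    return rng.expovariate(1.0 / mean_months)

def simulate_patient(years, seed):
    """Advance one patient event by event, carrying the full episode
    history as an entity attribute."""
    rng = random.Random(seed)
    t, episodes = 0.0, 0
    while True:
        t += next_relapse_delay(episodes, rng)
        if t > years * 12:
            break
        episodes += 1
    return episodes
```

A memoryless Markov state cannot express this dependence without multiplying health states (one per episode count, per attribute combination); in the DES formulation the entity's attributes carry the history directly.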

Le Lay, Agathe; Despiegel, Nicolas; Francois, Clement; Duru, Gerard

2006-01-01

64

Purpose – Seeks to present a methodology for working with bottleneck reduction by using a combination of automatic data collection and discrete-event simulation (DES) for a manufacturing system. Design/methodology/approach – In the DES model, the bottleneck was identified by studying the simulation runs based on the collected automatic data from the different machines in the manufacturing system. Findings – A

Arne Ingemansson; Torbjörn Ylipää; Gunnar S. Bolmsjö

2005-01-01

65

Tutorial: Parallel Simulation on Supercomputers

This tutorial introduces typical hardware and software characteristics of extant and emerging supercomputing platforms, and presents issues and solutions in executing large-scale parallel discrete event simulation scenarios on such high performance computing systems. Covered topics include synchronization, model organization, example applications, and observed performance from illustrative large-scale runs.

Perumalla, Kalyan S [ORNL

2012-01-01

66

This report outlines a methodology to study the effects of disruptive events on nuclear waste material in stable geologic sites. The methodology is based upon developing a discrete events model that can be simulated on the computer. This methodology allows a natural development of simulation models that use computer resources in an efficient manner. Accurate modeling in this area depends in large part upon accurate modeling of ion transport behavior in the storage media. Unfortunately, developments in this area are not at a stage where there is any consensus on proper models for such transport. Consequently, our work is directed primarily towards showing how disruptive events can be properly incorporated in such a model, rather than as a predictive tool at this stage. When and if proper geologic parameters can be determined, then it would be possible to use this as a predictive model. Assumptions and their bases are discussed, and the mathematical and computer model are described.

Aggarwal, S.; Ryland, S.; Peck, R.

1980-06-19

67

Users of simulation continue to demand more realism and accuracy. This has been addressed by the development of new simulation languages, better simulation software, more user-friendly interface, and more advanced computers. Despite this advance in general simulation capability, only recently has significant research addressed the efficiency of the simulation of the motion of autonomous spatial objects in discrete simulation. The

Eugene P. Paulo; Linda C. Malone

1997-01-01

68

Using machine learning techniques to interpret results from discrete event simulation

The results of two simulators were processed as machine learning problems. Keywords: discrete event simulation, machine learning, artificial intelligence.

Mladenic, Dunja

69

The development of simulation technology provides a technical base for multimedia teaching to realize three-dimensional, dynamic, and extensive forms, but how to apply simulation technology to improve the quality of modern education is a challenge that college teaching faces. Simulation science and technology is not only a new education method, but also a new means of practice and experiment. More

Wang Yachao

2008-01-01

70

Optimistic Parallel Simulation of TCP/IP Over ATM Networks

This work studies optimistic parallel simulation of TCP/IP over ATM networks and compares the performance of a parallel simulator to ProTEuS (Proportional Time Emulation and Simulation).

Kansas, University of

71

A fuzzy approach on simulating and optimizing the performance of a discrete-event production line

Applications of fuzzy set theory are spreading widely into different research areas. The main difference between fuzzy and non-fuzzy simulation approaches lies in the most important attribute of fuzzy approaches: modeling uncertain key system parameters with vague data (in this study, specifically by fuzzy triangular numbers), and therefore obtaining more realistic system outputs, whilst utilizing overused

A. H. A. Rahnama; Akhavan Rahnama

2010-01-01

72

A methodology for fabrication of intelligent discrete-event simulation models

In this article a meta-specification for the software requirements and design of intelligent discrete next-event simulation models has been presented. The specification is consistent with established practices for software development as presented in the software engineering literature. The specification has been adapted to take into consideration the specialized needs of object-oriented programming resulting in the actor-centered taxonomy. The heart of the meta-specification is the methodology for requirements specification and design specification of the model. The software products developed by use of the methodology proposed herein are at the leading edge of technology in two very synergistic disciplines - expert systems and simulation. By incorporating simulation concepts into expert systems a deeper reasoning capability is obtained - one that is able to emulate the dynamics or behavior of the object system or process over time. By including expert systems concepts into simulation, the capability to emulate the reasoning functions of decision-makers involved with (and subsumed by) the object system is attained. In either case the robustness of the technology is greatly enhanced.

Morgeson, J.D.; Burns, J.R.

1987-01-01

73

Discrete event front tracking simulator of a physical fire spread model

The simulated phenomenon evolves on an area composed of non-burnable roads, rivers, and fuel breaks together with burnable terrain; taking roads and fuel breaks into account requires 25 million cells, which greatly impacts simulation performance.

Boyer, Edmond

74

Discrete event simulation of a proton therapy facility: a case study.

Proton therapy is a type of particle therapy which utilizes a beam of protons to irradiate diseased tissue. The main difference with respect to conventional radiotherapy (X-rays, γ-rays) is the capability to target tumors with extreme precision, which makes it possible to treat deep-seated tumors and tumors affecting noble tissues such as the brain, eyes, etc. However, proton therapy needs high-energy cyclotrons, and this requires sophisticated control-supervision schemes to guarantee, beyond the prescribed performance, the safety of the patients and of the operators. In this paper we present the modeling and simulation of the irradiation process of the PROSCAN facility at the Paul Scherrer Institut. This is a challenging task because of the complexity of the operation scenario, which consists of deterministic and stochastic processes resulting from the coordination and interaction among diverse entities such as distributed automatic control systems, safety protection systems, and human operators. PMID:20675013

Corazza, Uliana; Filippini, Roberto; Setola, Roberto

2011-06-01

75

Knowledge acquisition for discrete event systems using machine learning

Knowledge acquisition for discrete event simulation systems is a difficult task. Machine learning has been investigated to help, using discrete event simulation models and machine learning as tools for intelligent analysis.

Mladenic, Dunja

76

Objective Develop and validate particular, concrete, and abstract yet plausible in silico mechanistic explanations for large intra- and interindividual variability observed for eleven bioequivalence study participants. Do so in the face of considerable uncertainty about mechanisms. Methods We constructed an object-oriented, discrete event model called subject (we use small caps to distinguish computational objects from their biological counterparts). It maps abstractly to a dissolution test system and study subject to whom product was administered orally. A subject comprises four interconnected grid spaces and event mechanisms that map to different physiological features and processes. Drugs move within and between spaces. We followed an established, Iterative Refinement Protocol. Individualized mechanisms were made sufficiently complicated to achieve prespecified Similarity Criteria, but no more so. Within subjects, the dissolution space is linked to both a product-subject Interaction Space and the GI tract. The GI tract and Interaction Space connect to plasma, from which drug is eliminated. Results We discovered parameterizations that enabled the eleven subject simulation results to achieve the most stringent Similarity Criteria. Simulated profiles closely resembled those with normal, odd, and double peaks. We observed important subject-by-formulation interactions within subjects. Conclusion We hypothesize that there were interactions within bioequivalence study participants corresponding to the subject-by-formulation interactions within subjects. Further progress requires methods to transition currently abstract subject mechanisms iteratively and parsimoniously to be more physiologically realistic. As that objective is achieved, the approach presented is expected to become beneficial to drug development (e.g., controlled release) and to a reduction in the number of subjects needed per study plus faster regulatory review. PMID:22938185

2012-01-01

77

The diffusion rate of Radio Frequency Identification (RFID) technology in supply chains is lower than expected. The main reason is the doubtful Return On Investment (ROI), mainly due to high tag prices. In this contribution, we leverage a prototypical RFID implementation in the fashion industry, extend this prototype by a discrete event simulation, and discuss the impact of an Electronic

Jürgen Müller; Ralph Tröger; Alexander Zeier; Rainer Alt

2009-01-01

78

The local Time Warp approach to parallel simulation

The two main approaches to parallel discrete event simulation – conservative and optimistic – are likely to encounter some limitations when the size and complexity of the simulation system increases. For such large scale simulations, the conservative approach appears to be limited by blocking overhead and sensitivity to lookahead, whereas the optimistic approach may become prone to cascading rollbacks, state

Hassan Rajaei; Rassul Ayani; Lars-Erik Thorelli

1993-01-01

79

Simulating Billion-Task Parallel Programs

In simulating large parallel systems, bottom-up approaches exercise detailed hardware models with effects from simplified software models or traces, whereas top-down approaches evaluate the timing and functionality of detailed software models over coarse hardware models. Here, we focus on the top-down approach and significantly advance the scale of the simulated parallel programs. Via the direct execution technique combined with parallel discrete event simulation, we stretch the limits of the top-down approach by simulating message passing interface (MPI) programs with millions of tasks. Using a timing-validated benchmark application, proof-of-concept scaling is achieved to over 0.22 billion virtual MPI processes on 216,000 cores of a Cray XT5 supercomputer, representing one of the largest direct execution simulations to date, with a multiplexing ratio of 1024 simulated tasks per real task.

Perumalla, Kalyan S [ORNL]; Park, Alfred J [ORNL]

2014-01-01

80

Threaded WARPED: An Optimistic Parallel Discrete Event Simulator for Cluster of Multi-Core Machines

The emergence of low-cost multi-core and many-core processors suitable for use in Beowulf clusters motivates porting the parallel discrete event simulator called WARPED to a Beowulf cluster of many-core processors. More precisely, WARPED is an optimistically synchronized simulation kernel designed for efficient execution on single-core Beowulf clusters. The work of this thesis extends the WARPED kernel

Wilsey, Philip A.

81

Inflated speedups in parallel simulations via malloc()

NASA Technical Reports Server (NTRS)

Discrete-event simulation programs make heavy use of dynamic memory allocation in order to support simulation's very dynamic space requirements. When programming in C one is likely to use the malloc() routine. However, a parallel simulation which uses the standard Unix System V malloc() implementation may achieve an overly optimistic speedup, possibly superlinear. An alternate implementation provided on some (but not all) systems can avoid the speedup anomaly, but at the price of significantly reduced available free space. This is especially severe on most parallel architectures, which tend not to support virtual memory. It is shown how a simply implemented user-constructed interface to malloc() can both avoid artificially inflated speedups, and make efficient use of the dynamic memory space. The interface simply caches blocks on the basis of their size. The problem is demonstrated empirically, and the effectiveness of the solution is shown both empirically and analytically.
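The size-caching interface described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's actual code: the names (sim_alloc, sim_free) and the power-of-two binning policy are assumptions.

```c
#include <stdlib.h>

/* Hypothetical sketch of a user-level interface to malloc() that caches
 * freed blocks by size: a freed block is pushed onto a per-size-class
 * free list, and a later request in the same class is served from that
 * list without touching malloc()'s internal structures. */

#define NBINS 16                       /* size classes: 16, 32, 64, ... bytes */

typedef struct block { struct block *next; } block;

static block *bins[NBINS];

/* Smallest power-of-two class (>= 16 bytes) that holds `size` bytes. */
static int bin_index(size_t size) {
    int i = 0;
    size_t cap = 16;
    while (cap < size && i < NBINS - 1) { cap <<= 1; i++; }
    return i;
}

void *sim_alloc(size_t size) {
    int i = bin_index(size);
    if (bins[i]) {                     /* reuse a cached block of this class */
        block *b = bins[i];
        bins[i] = b->next;
        return b;
    }
    return malloc((size_t)16 << i);    /* otherwise allocate the full class size */
}

void sim_free(void *p, size_t size) {
    int i = bin_index(size);           /* push onto the class's free list */
    block *b = (block *)p;
    b->next = bins[i];
    bins[i] = b;
}
```

Because blocks are recycled within a size class, the per-call cost of allocation stays flat regardless of allocation history, which removes the cost asymmetry between serial and parallel runs that can inflate measured speedup.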

Nicol, David M.

1990-01-01

82

characteristics can be derived. Thus, these cyclic activities are of primary concern to management and should be the focus of the simulation model. AbouRizk and Halpin (1990) and (1992) have developed more detailed guides which build upon Halpin's original... of such models requires: (1) Application of input modeling techniques; (2) appropriate analysis of output parameters based on multiple runs; and (3) validation and verification of the results. AbouRizk and Halpin (1992) focus on the statistical characteristics...

Krukenberg, Harry J.

2012-06-07

83

Background Osteoporotic fractures cause a large health burden and substantial costs. This study estimated the expected fracture numbers and costs for the remaining lifetime of postmenopausal women in Germany. Methods A discrete event simulation (DES) model which tracks changes in fracture risk due to osteoporosis, a previous fracture, or institutionalization in a nursing home was developed. Expected lifetime fracture numbers and costs per capita were estimated for postmenopausal women (aged 50 and older) at average osteoporosis risk (AOR) and for those never suffering from osteoporosis. Direct and indirect costs were modeled. Deterministic univariate and probabilistic sensitivity analyses were conducted. Results The expected fracture numbers over the remaining lifetime of a 50 year old woman with AOR for each fracture type (% attributable to osteoporosis) were: hip 0.282 (57.9%), wrist 0.229 (18.2%), clinical vertebral 0.206 (39.2%), humerus 0.147 (43.5%), pelvis 0.105 (47.5%), and other femur 0.033 (52.1%). Expected discounted fracture lifetime costs (excess cost attributable to osteoporosis) per 50 year old woman with AOR amounted to €4,479 (€1,995). Most costs were accrued in the hospital €1,743 (€751) and long-term care sectors €1,210 (€620). Univariate sensitivity analysis resulted in percentage changes between -48.4% (if fracture rates decreased by 2% per year) and +83.5% (if fracture rates increased by 2% per year) compared to base case excess costs. Costs for women with osteoporosis were about 3.3 times those of women never getting osteoporosis (€7,463 vs. €2,247), and were markedly increased for women with a previous fracture. Conclusion The results of this study indicate that osteoporosis causes a substantial share of fracture costs in postmenopausal women, which strongly increase with age and previous fractures. PMID:24981316

2014-01-01

84

This tutorial demonstrates the use of agent-based simulation (ABS) in modeling emergent behaviors. We first introduce key concepts of ABS by using two simple examples: the Game of Life and the Boids models. We illustrate agent-based modeling issues and simulation of emergent behaviors by using examples in social networks, auction-type markets, emergency evacuation, crowd behavior under normal situations, biology, material

Young-Jun Son; Charles M. Macal

2010-01-01

85

NASA Technical Reports Server (NTRS)

This paper surveys topics that presently define the state of the art in parallel simulation. Included in the tutorial are discussions on new protocols, mathematical performance analysis, time parallelism, hardware support for parallel simulation, load balancing algorithms, and dynamic memory management for optimistic synchronization.

Nicol, David; Fujimoto, Richard

1992-01-01

86

Parallel Atomistic Simulations

Algorithms developed to enable the use of atomistic molecular simulation methods with parallel computers are reviewed. Methods appropriate for bonded as well as non-bonded (and charged) interactions are included. While strategies for obtaining parallel molecular simulations have been developed for the full variety of atomistic simulation methods, molecular dynamics and Monte Carlo have received the most attention. Three main types of parallel molecular dynamics simulations have been developed: the replicated data decomposition, the spatial decomposition, and the force decomposition. For Monte Carlo simulations, parallel algorithms have been developed which can be divided into two categories, those which require a modified Markov chain and those which do not. Parallel algorithms developed for other simulation methods such as Gibbs ensemble Monte Carlo, grand canonical molecular dynamics, and Monte Carlo methods for protein structure determination are also reviewed, and issues such as how to measure parallel efficiency, especially in the case of parallel Monte Carlo algorithms with modified Markov chains, are discussed.
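As a toy illustration of the replicated-data (atom) decomposition this review describes: every process holds all coordinates, computes forces only for the atoms it owns, and a global reduction then combines the partial results. The round-robin ownership rule and function names below are assumptions for the sketch, and `pair` stands in for a real potential.

```c
/* Replicated-data decomposition sketch (hypothetical names): process r of
 * `nprocs` handles atoms r, r+nprocs, r+2*nprocs, ... using the full
 * coordinate set. Summing the per-process partial results must reproduce
 * the serial answer. */

#define N 8                            /* toy number of atoms */

/* Toy pairwise interaction standing in for a real potential. */
static double pair(int i, int j) { return 1.0 / (1 + (i - j) * (i - j)); }

/* Total interaction accumulated by process r out of nprocs. */
double partial_sum(int r, int nprocs) {
    double s = 0.0;
    for (int i = r; i < N; i += nprocs)    /* atoms owned by process r */
        for (int j = 0; j < N; j++)
            if (j != i) s += pair(i, j);
    return s;
}
```

In a real code the combination step would be a global reduction over force arrays; the known drawback of this decomposition is that communication volume scales with the full system size rather than with a process's subdomain.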

HEFFELFINGER,GRANT S.

2000-01-18

87

Supervision Patterns Discrete Event Systems Diagnosis

Supervision Patterns in Discrete Event Systems Diagnosis (Thierry Jéron, Hervé Marchand, et al., IRISA). Abstract: In this paper, we are interested in the diagnosis of discrete event systems modeled by finite state machines, where the diagnosis objective is the detection of particular trajectories of the system. Modeling the diagnosis objective by a supervision pattern allows us

Université Paris-Sud XI

88

Scalable Parallel Crash Simulations

We are pleased to submit our efforts in parallelizing the PRONTO application suite for consideration in the SuParCup 99 competition. PRONTO is a finite element transient dynamics simulator which includes a smoothed particle hydrodynamics (SPH) capability; it is similar in scope to the well-known DYNA, PamCrash, and ABAQUS codes. Our efforts over the last few years have produced a fully parallel version of the entire PRONTO code which (1) runs fast and scalably on thousands of processors, (2) has performed the largest finite-element transient dynamics simulations we are aware of, and (3) includes several new parallel algorithmic ideas that have solved some difficult problems associated with contact detection and SPH scalability. We motivate this work, describe the novel algorithmic advances, give performance numbers for PRONTO running on Sandia's Intel Teraflop machine, and highlight two prototypical large-scale computations we have performed with the parallel code. We have successfully parallelized a large-scale production transient dynamics code with a novel algorithmic approach that utilizes multiple decompositions for different key segments of the computations. To be able to simulate a model of more than ten million elements in a few tenths of a second per timestep is unprecedented for solid dynamics simulations, especially when full global contact searches are required. The key reason is our new algorithmic ideas for efficiently parallelizing the contact detection stage. To our knowledge, scalability of this computation had never before been demonstrated on more than 64 processors. This has enabled parallel PRONTO to become the only solid dynamics code we are aware of that can run effectively on thousands of processors. More importantly, our parallel performance compares very favorably to the original serial PRONTO code, which is optimized for vector supercomputers. On the container crush problem, a Teraflop node is as fast as a single processor of the Cray Jedi. This means that on the Teraflop machine we can now run simulations with tens of millions of elements thousands of times faster than we could on the Jedi! This is enabling transient dynamics simulations of unprecedented scale and fidelity. Not only can previous applications be run with vastly improved resolution and speed, but qualitatively new and different analyses have been made possible.

Attaway, Stephen; Barragy, Ted; Brown, Kevin; Gardner, David; Gruda, Jeff; Heinstein, Martin; Hendrickson, Bruce; Metzinger, Kurt; Neilsen, Mike; Plimpton, Steve; Pott, John; Swegle, Jeff; Vaughan, Courtenay

1999-06-01

89

The Lightweight Time Warp (LTW) protocol offers a novel approach to high-performance optimistic parallel discrete-event simulation, especially when a large number of simultaneous events need to be executed at each virtual time. With LTW, the local simulation space on each node is partitioned into two sub-domains, allowing purely optimistic simulation to be driven by only a few full-fledged logical processes

Qi Liu; Gabriel Wainer

2009-01-01

90

An algebra of discrete event processes

NASA Technical Reports Server (NTRS)

This report deals with an algebraic framework for modeling and control of discrete event processes. The report consists of two parts. The first part is introductory, and consists of a tutorial survey of the theory of concurrency in the spirit of Hoare's CSP, and an examination of the suitability of such an algebraic framework for dealing with various aspects of discrete event control. To this end a new concurrency operator is introduced and it is shown how the resulting framework can be applied. It is further shown that a suitable theory that deals with the new concurrency operator must be developed. In the second part of the report the formal algebra of discrete event control is developed. At the present time the second part of the report is still an incomplete and occasionally tentative working paper.

Heymann, Michael; Meyer, George

1991-01-01

91

Discrete Events as Units of Perceived Time

ERIC Educational Resources Information Center

In visual images, we perceive both space (as a continuous visual medium) and objects (that inhabit space). Similarly, in dynamic visual experience, we perceive both continuous time and discrete events. What is the relationship between these units of experience? The most intuitive answer may be similar to the spatial case: time is perceived as an…

Liverence, Brandon M.; Scholl, Brian J.

2012-01-01

92

Failure diagnosis using discrete-event models

Detection and isolation of failures in large, complex systems is a crucial and challenging task. The increasingly stringent requirements on performance and reliability of complex technological systems have necessitated the development of sophisticated and systematic methods for the timely and accurate diagnosis of system failures. We propose a discrete-event systems (DES) approach to the failure diagnosis problem. This approach is

Meera Sampath; Raja Sengupta; Stephane Lafortune; Kasim Sinnamohideen; Demosthenis C. Teneketzis

1996-01-01

93

Complex system analysis through discrete event simulation

E-commerce is generally thought of as a world without walls. Although a computer monitor may replace a storefront window, the products that are purchased online have to be distributed from a brick and mortar warehouse. ...

Faranca, Anthony G. (Anthony Gilbert), 1971-

2004-01-01

94

A language orientation for the discrete event modeling of application programs and operating systems

Discrete event simulation of computer programs, particularly operating system programs, has until recently been an elusive objective of computer performance simulation. However, out of my own experience and my observations of the efforts of others, I have come to believe that the most direct approach to achieving such a simulation capability is essentially linguistic. I consider these subjects to be

Leo J. Cohen

1974-01-01

95

Simulation Environment Configuration for Parallel Simulation of Multicore Embedded Systems

Simulation of multicore embedded systems suffers from long turnaround time due to the growing demand of simulation time. Parallel simulation aims to accelerate the simulation speed by running component simulators concurrently, but the extra overhead of communication

Ha, Soonhoi

96

Parallelizing Timed Petri Net simulations

NASA Technical Reports Server (NTRS)

The possibility of using parallel processing to accelerate the simulation of Timed Petri Nets (TPN's) was studied. It was recognized that complex system development tools often transform system descriptions into TPN's or TPN-like models, which are then simulated to obtain information about system behavior. Viewed this way, it was important that the parallelization of TPN's be as automatic as possible, to admit the possibility of the parallelization being embedded in the system design tool. Later years of the grant were devoted to examining the problem of joint performance and reliability analysis, to explore whether both types of analysis could be accomplished within a single framework. In this final report, the results of our studies are summarized. We believe that the problem of parallelizing TPN's automatically for MIMD architectures has been almost completely solved for a large and important class of problems. Our initial investigations into joint performance/reliability analysis are two-fold; it was shown that Monte Carlo simulation, with importance sampling, offers promise of joint analysis in the context of a single tool, and methods for the parallel simulation of general Continuous Time Markov Chains, a model framework within which joint performance/reliability models can be cast, were developed. However, very much more work is needed to determine the scope and generality of these approaches. The results obtained in our two studies, future directions for this type of work, and a list of publications are included.

Nicol, David M.

1993-01-01

97

Gradient estimation for discrete-event systems by measure-valued differentiation

In simulation of complex stochastic systems, such as Discrete-Event Systems (DES), statistical distributions are used to model the underlying randomness in the system. A sensitivity analysis of the simulation output with respect to parameters of the input distributions, such as the mean and the variance, is therefore of great value. The focus of this article is to provide a practical

Bernd Heidergott; Felisa J. Vázquez--Abad; Georg Ch. Pflug; Taoying Farenhorst-Yuan

2010-01-01

98

A Parallel Quantum Computer Simulator

A Quantum Computer is a new type of computer which can efficiently solve complex problems such as prime factorization. A quantum computer threatens the security of public key encryption systems because these systems rely on the fact that prime factorization is computationally difficult. Errors limit the effectiveness of quantum computers. Because of the exponential nature of quantum computers, simulating the effect of errors on them requires a vast amount of processing and memory resources. In this paper we describe a parallel simulator which assesses the feasibility of quantum computers. We also derive and validate an analytical model of execution time for the simulator, which shows that parallel quantum computer simulation is very scalable.

Kevin M. Obenland; Alvin M. Despain

1998-04-16

99

PARALLEL AND DISTRIBUTED SIMULATION SYSTEMS

Originating from basic research conducted in the 1970s and 1980s, the parallel and distributed simulation field has matured over the last few decades. Today, operational systems have been fielded for applications such as military training, analysis of communication networks, and air traffic control systems, to mention a few. This tutorial gives an overview of technologies to distribute the execution

Richard M. Fujimoto

1999-01-01

100

An assessment of the ModSim/TWOS parallel simulation environment

The Time Warp Operating System (TWOS) has been the focus of significant research in parallel, discrete-event simulation (PDES). A new language, ModSim, has been developed for use in conjunction with TWOS. The coupling of ModSim and TWOS is an attempt to address the development of large-scale, complex, discrete-event simulation models for parallel execution. The approach, simply stated, is to provide a high-level simulation language that embodies well-known software engineering principles combined with a high-performance parallel execution environment. The inherent difficulty with this approach is the mapping of the simulation application to the parallel run-time environment. To use TWOS, Time Warp applications are currently developed in C and must be tailored according to a set of constraints and conventions. C/TWOS applications are carefully developed using explicit calls to the Time Warp primitives; thus, the mapping of application to parallel run-time environment is done by the application developer. The disadvantage to this approach is the questionable scalability to larger software efforts; the obvious advantage is the degree of control over managing the efficient execution of the application. The ModSim/TWOS system provides an automatic mapping from a ModSim application to an equivalent C/TWOS application. The major flaw with the ModSim/TWOS system as it currently exists is that there is no compiler support for mapping a ModSim application into an efficient C/TWOS application. Moreover, the ModSim language as currently defined does not provide explicit hooks into the Time Warp Operating System and hence the developer is unable to tailor a ModSim application in the same fashion that a C application can be tailored. Without sufficient compiler support, there is a mismatch between ModSim's object-oriented, process-based execution model and the Time Warp execution model.

Rich, D.O.; Michelsen, R.E.

1991-01-01

101

Diagnosis of asynchronous discrete event systems, a net unfolding approach

Diagnosis of asynchronous discrete event systems, a net unfolding approach. Albert Benveniste, Fellow, IEEE, Eric Fabre, Stefan Haar, and Claude Jard. Abstract: In this paper we consider the diagnosis of asynchronous discrete event systems, as it arises in telecommunications network management. Keywords: diagnosis, asynchronous diagnosis, discrete event systems, Petri nets

Université Paris-Sud XI

102

Discrete event dynamic system (DES)-based modeling for dynamic material flow in the pyroprocess

A modeling and simulation methodology was proposed in order to implement the dynamic material flow of the pyroprocess. Since the static mass balance provides the limited information on the material flow, it is hard to predict dynamic behavior according to event. Therefore, a discrete event system (DES)-based model named, PyroFlow, was developed at the Korea Atomic Energy Research Institute (KAERI).

Hyo Jik Lee; Kiho Kim; Ho Dong Kim; Han Soo Lee

2011-01-01

103

Xyce parallel electronic simulator design.

This document is the Xyce Circuit Simulator developer guide. Xyce has been designed from the 'ground up' to be a SPICE-compatible, distributed memory parallel circuit simulator. While it is in many respects a research code, Xyce is intended to be a production simulator. As such, having software quality engineering (SQE) procedures in place to ensure a high level of code quality and robustness is essential. Version control, issue tracking, customer support, C++ style guidelines, and the Xyce release process are all described. The Xyce Parallel Electronic Simulator has been under development at Sandia since 1999. Historically, Xyce has mostly been funded by ASC, and the original focus of Xyce development has primarily been related to circuits for nuclear weapons. However, this has not been the only focus, and it is expected that the project will diversify. Like many ASC projects, Xyce is a group development effort, which involves a number of researchers, engineers, scientists, mathematicians, and computer scientists. In addition to diversity of background, it is to be expected on long-term projects for there to be a certain amount of staff turnover as people move on to different projects. As a result, it is very important that the project maintain high software quality standards. The point of this document is to formally document a number of the software quality practices followed by the Xyce team in one place. Also, it is hoped that this document will be a good source of information for new developers.

Thornquist, Heidi K.; Rankin, Eric Lamont; Mei, Ting; Schiek, Richard Louis; Keiter, Eric Richard; Russo, Thomas V.

2010-09-01

104

Parallel network simulations with NEURON.

The NEURON simulation environment has been extended to support parallel network simulations. Each processor integrates the equations for its subnet over an interval equal to the minimum (interprocessor) presynaptic spike generation to postsynaptic spike delivery connection delay. The performance of three published network models with very different spike patterns exhibits superlinear speedup on Beowulf clusters and demonstrates that spike communication overhead is often less than the benefit of an increased fraction of the entire problem fitting into high speed cache. On the EPFL IBM Blue Gene, almost linear speedup was obtained up to 100 processors. Increasing one model from 500 to 40,000 realistic cells exhibited almost linear speedup on 2,000 processors, with an integration time of 9.8 seconds and communication time of 1.3 seconds. The potential for speed-ups of several orders of magnitude makes practical the running of large network simulations that could otherwise not be explored. PMID:16732488
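The synchronization rule in this abstract — integrate each subnet over an interval equal to the minimum interprocessor connection delay, then exchange spikes — can be checked with a small sketch. The constant and function names here are illustrative assumptions, not NEURON's API.

```c
/* Each exchange round covers [k*MIN_DELAY, (k+1)*MIN_DELAY). A spike
 * generated at t_gen travels over a connection with axonal delay
 * >= MIN_DELAY, so its delivery time t_gen + delay always falls in a
 * strictly later round. Exchanging spikes once per round therefore
 * never misses a delivery. */

#define MIN_DELAY 2.0                  /* ms, assumed minimum connection delay */

/* Index of the exchange round containing simulated time t. */
static int round_of(double t) { return (int)(t / MIN_DELAY); }

/* 1 if a spike generated at t_gen with the given delay is delivered
 * after the round boundary at which it would be exchanged. */
int delivery_is_causal(double t_gen, double delay) {
    return delay >= MIN_DELAY && round_of(t_gen + delay) > round_of(t_gen);
}
```

This is why the integration interval can be as large as the minimum delay without any state saving or rollback: causality is guaranteed by the round structure itself, and longer intervals mean fewer, larger spike exchanges.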

Migliore, M; Cannia, C; Lytton, W W; Markram, Henry; Hines, M L

2006-10-01

105

Modelling machine ensembles with discrete event dynamical system theory

NASA Technical Reports Server (NTRS)

Discrete Event Dynamical System (DEDS) theory can be utilized as a control strategy for future complex machine ensembles that will be required for in-space construction. The control strategy involves orchestrating a set of interactive submachines to perform a set of tasks for a given set of constraints such as minimum time, minimum energy, or maximum machine utilization. Machine ensembles can be hierarchically modeled as a global model that combines the operations of the individual submachines. These submachines are represented in the global model as local models. Local models, from the perspective of DEDS theory, are described by the following: a set of system and transition states, an event alphabet that portrays actions that take a submachine from one state to another, an initial system state, a partial function that maps the current state and event alphabet to the next state, and the time required for the event to occur. Each submachine in the machine ensemble is represented by a unique local model. The global model combines the local models such that the local models can operate in parallel under the additional logistic and physical constraints due to submachine interactions. The global model is constructed from the states, events, event functions, and timing requirements of the local models. Supervisory control can be implemented in the global model by various methods such as task scheduling (open-loop control) or implementing a feedback DEDS controller (closed-loop control).

Hunter, Dan

1990-01-01

106

Parallel simulation of the Sharks World problem

The Sharks World problem has been suggested as a suitable application to evaluate the effectiveness of parallel simulation algorithms. This paper develops a simulation model in Maisie, a C-based simulation language. With minor modifications, a Maisie program may be executed using either sequential or parallel simulation algorithms. The paper presents the results of executing the Maisie model on a multicomputer

Rajive L. Bagrodia; Wen-Toh Liao

1990-01-01

107

Diagnosis of a class of distributed discrete-event systems

Discrete-event modeling can be applied to a large variety of physical systems, in order to support different tasks, including fault detection, monitoring, and diagnosis. The paper focuses on the model-based diagnosis of a class of distributed discrete-event systems, called active systems. An active system, which is designed to react to possibly harmful external events, is modeled as a network of

Pietro Baroni; Gianfranco Lamperti; Paolo Pogliano; Marina Zanella

2000-01-01

108

A Discrete Event Control Based on EVALPSN Stable Model Computation

In this paper, we introduce a discrete event control for the Cat and Mouse example based on a paraconsistent logic program EVALPSN stable model computation. Predicting and avoiding control deadlock states are crucial problems in discrete event control systems. We show that the EVALPSN control can deal with prediction and avoidance of control deadlock states in the Cat and Mouse by

Kazumi Nakamatsu; Sheng-luen Chung; Hayato Komaba; Atsuyuki Suzuki

2005-01-01

109

State Estimation and Detectability of Probabilistic Discrete Event Systems

A probabilistic discrete event system (PDES) is a nondeterministic discrete event system where the probabilities of nondeterministic transitions are specified. State estimation problems of PDES are more difficult than those of non-probabilistic discrete event systems. In our previous papers, we investigated state estimation problems for non-probabilistic discrete event systems. We defined four types of detectabilities and derived necessary and sufficient conditions for checking these detectabilities. In this paper, we extend our study to state estimation problems for PDES by considering the probabilities. The first step in our approach is to convert a given PDES into a nondeterministic discrete event system and find sufficient conditions for checking probabilistic detectabilities. Next, to find necessary and sufficient conditions for checking probabilistic detectabilities, we investigate the “convergence” of event sequences in PDES. An event sequence is convergent if along this sequence, it is more and more certain that the system is in a particular state. We derive conditions for convergence and hence for detectabilities. We focus on systems with complete event observation and no state observation. For better presentation, the theoretical development is illustrated by a simplified example of nephritis diagnosis. PMID:19956775

Shu, Shaolong; Ying, Hao; Chen, Xinguang

2009-01-01

110

Loosely coupled parallel network simulator

This patent describes a network simulator for simulating a plurality of parallel processing networks. It comprises: buses for transmitting information segments to processing sites; each of the buses includes data line means to transmit an information segment, at least one control line means to transmit control information related to the information segment, and a reply line means for indicating that another processing site is coupled to the bus to receive the information segment; sets of processing sites, each set being coupled to a given one of the plurality of buses, each processing site having a processor means and interface means coupling the processor to the bus; clock means coupled to each of the interface means to synchronize time intervals during which the respective interface means couples its corresponding processor means to its corresponding bus; time multiplex switching means coupled to each of the buses to receive an information segment from one of the buses for transmission on another of the buses; and each of the interface means including sequencing means coupled to the clock means to select during which time interval the corresponding processor means is to be coupled to its respective bus.

Woodward, T.R.

1990-07-17

111

Incremental Diagnosis of Discrete-Event Systems Alban Grastien

Incremental Diagnosis of Discrete-Event Systems. Alban Grastien, IRISA, Université Rennes 1, Rennes. Systems are represented as finite-state machines (or automata), and the diagnosis is formally defined as a synchronized composition; the idea is to consider the observations one after the other and to incrementally compute the global diagnosis. In this paper, we rely

Université Paris-Sud XI

112

Time templates for discrete event fault monitoring in manufacturing systems

The input and output signals of automated manufacturing systems can be characterized as observed time functions of discrete events. Fault monitoring is the online analysis of the process observations to determine if they correspond to a specification of correct process operation. In this paper, we describe a new fault monitoring method, called template monitoring. Template monitoring overcomes several limitations associated

Lawrence E. Holloway; Sujeet Chand

1994-01-01

113

Course topics include: random variate generation; 5. Input data analysis; 6. Output data analysis; 7. Verification & validation of models; …; 12. Simulation modeling using software. Textbook/References: 1. J. Banks, J. S. Carson, B. L. … 2. A. M. Law and W. D. Kelton (2000), Simulation Modeling and Analysis, 3rd ed., McGraw Hill

Venkateswaran, Jayendran

114

Graphite: A Distributed Parallel Simulator for Multicores

This paper introduces the open-source Graphite distributed parallel multicore simulator infrastructure. Graphite is designed from the ground up for exploration of future multicore processors containing dozens, hundreds, ...

Beckmann, Nathan

2009-11-09

115

Template languages for fault monitoring of timed discrete event processes

This paper introduces a new framework for modeling discrete event processes. This framework, called condition templates, allows the modeling of processes in which both single-instance and multiple-instance behaviors are exhibited concurrently. A single-instance behavior corresponds to a trace from a single finite-state process, and a multiple-instance behavior corresponds to the timed interleavings of an unspecified number of identical processes operating

Deepa N. Pandalai; Larry E. Holloway

2000-01-01

116

Efficient Parallel Simulation of Pulse-Coded Neural Networks (PCNN)

Neural networks are a common model for brain-style data processing. The algorithms are inherently parallel, so a parallel implementation of neural network simulations seems straightforward. However, typical parallel artificial neural network (ANN) simulations show only poor speedup on most parallel computers. In contrast, pulse-coded neural networks (PCNN) seem to be better suited for a parallel simulation,

R. Preis; K. Salzwedel; G. Hartmann; C. Wolff

117

Parallel optimization for large eddy simulations

We developed a parallel Bayesian optimization algorithm for large eddy simulations. These simulations challenge optimization methods because they take hours or days to compute, and their objective function contains noise, as the turbulent statistics are averaged over a finite time. Surrogate-based optimization methods, including Bayesian optimization, have shown promise for noisy and expensive objective functions. Here we adapt Bayesian optimization to minimize drag in a turbulent channel flow and to design the trailing edge of a turbine blade to reduce turbulent heat transfer and pressure loss. Our optimization simultaneously runs several simulations, each parallelized to thousands of cores, in order to utilize the additional concurrency offered by today's supercomputers.

Talnikar, Chaitanya; Bodart, Julien; Wang, Qiqi

2014-01-01

118

Simulating the scheduling of parallel supercomputer applications

An Event Driven Simulator for Evaluating Multiprocessing Scheduling (EDSEMS) disciplines is presented. The simulator is made up of three components: a machine model; parallel workload characterization; and scheduling disciplines for mapping parallel applications (many processes cooperating on the same computation) onto processors. A detailed description of how the simulator is constructed, how to use it, and how to interpret its output is also given. Initial results are presented from the simulation of parallel supercomputer workloads using "Dog-Eat-Dog," "Family," and "Gang" scheduling disciplines. These results indicate that Gang scheduling is far better at giving jobs the number of processors they request than Dog-Eat-Dog or Family scheduling. In addition, system throughput and turnaround time are not adversely affected by this strategy. 10 refs., 8 figs., 1 tab.

Seager, M.K.; Stichnoth, J.M.

1989-09-19
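The gang-scheduling property measured in the study above (a job receives all of its requested processors at once, or waits) can be sketched as follows. This is a hypothetical illustration; the job data and function names are not from EDSEMS.

```python
# All-or-nothing ("gang") processor allocation: a job either gets every
# processor it requested or joins the waiting list. Contrast with
# partial-allocation disciplines, where a job may start undersized.

def gang_allocate(jobs, total_procs):
    """jobs: list of (job_id, procs_requested). Returns (running, waiting)."""
    free = total_procs
    running, waiting = [], []
    for job_id, req in jobs:
        if req <= free:          # all-or-nothing allocation
            running.append(job_id)
            free -= req
        else:
            waiting.append(job_id)
    return running, waiting

jobs = [("A", 4), ("B", 6), ("C", 2)]
# With 8 processors: A (4) fits, B (6) must wait, C (2) fits behind A.
```

On this toy workload, `gang_allocate(jobs, 8)` starts A and C with exactly the processor counts they asked for, which is the behavior the paper credits Gang scheduling with.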

119

Parallel processing of a rotating shaft simulation

NASA Technical Reports Server (NTRS)

A FORTRAN program describing the vibration modes of a rotor-bearing system is analyzed for parallelism using a Pascal-like structured language. Potential vector operations are also identified. A critical path through the simulation is identified and used in conjunction with somewhat fictitious processor characteristics to determine the time to calculate the problem on a parallel processing system having those characteristics. A parallel processing overhead time is included as a parameter for proper evaluation of the gain over serial calculation. The serial calculation time is determined for the same fictitious system. An improvement of up to 640 percent is possible, depending on the value of the overhead time. Based on the analysis, certain conclusions are drawn pertaining to the development needs of parallel processing technology, and to the specification of parallel processing systems to meet computational needs.

Arpasi, Dale J.

1989-01-01

120

Fault Diagnosis in Discrete-Event Systems: How to Analyse Algorithm Performance?

Fault Diagnosis in Discrete-Event Systems: How to Analyse Algorithm Performance? Yannick Pencolé. The paper studies the fault diagnosis problem in discrete-event systems in an experimental way. To achieve this purpose, we present an experimental platform based on a tool called DIADES (Diagnosis of Discrete-Event Systems) to run experiments.

Pencolé, Yannick

121

Ion Propulsion Simulations Using Parallel Supercomputer

A parallel, three-dimensional electrostatic PIC code is developed for large-scale electric propulsion simulations using parallel supercomputers. Two algorithms are implemented in the code, a standard finite-difference (FD) PIC and a newly developed immersed-finite-element (IFE) PIC. The IFE-PIC is designed to handle complex boundary conditions accurately while maintaining the computational speed of the standard PIC code. Domain decomposition is

J. Wang; Y. Cao; R. Kafafy; J. Pierru; V. Decyk

122

Ion Propulsion Plume Simulations Using Parallel Supercomputer

Abstract: A parallel, three-dimensional electrostatic PIC code is developed for large-scale electric propulsion simulations using parallel supercomputers. Two algorithms are implemented in the code, a standard finite-difference (FD) PIC and a newly developed immersed-finite-element (IFE) PIC. The IFE-PIC is designed to handle complex boundary conditions accurately while maintaining the computational speed of the standard PIC code. Domain decomposition

J. Wang; Y. Cao; R. Kafafy; J. Pierru; V. Decyk

2006-01-01

123

The Xyce Parallel Electronic Simulator - An Overview

The Xyce™ Parallel Electronic Simulator has been written to support the simulation needs of the Sandia National Laboratories electrical designers. As such, the development has focused on providing the capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). In addition, it provides improved performance for numerical kernels using state-of-the-art algorithms, support for modeling circuit phenomena at a variety of abstraction levels, and object-oriented, modern coding practices that ensure the code will be maintainable and extensible far into the future. The code is a parallel code in the most general sense of the phrase: a message-passing parallel implementation, which allows it to run efficiently on the widest possible range of computing platforms. These include serial, shared-memory, and distributed-memory parallel as well as heterogeneous platforms. Furthermore, careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved even as the number of processors grows.

Hutchinson, Scott A.; Keiter, Eric R.; Hoekstra, Robert J.; Watts, Herman A.; Waters, Arlon J.; Schells, Regina L.; Wix, Steven D.

2000-12-08

124

NASA Technical Reports Server (NTRS)

Fast, efficient parallel algorithms are presented for discrete event simulations of dynamic channel assignment schemes for wireless cellular communication networks. The driving events are call arrivals and departures, in continuous time, to cells geographically distributed across the service area. A dynamic channel assignment scheme decides which call arrivals to accept, and which channels to allocate to the accepted calls, attempting to minimize call blocking while ensuring co-channel interference is tolerably low. Specifically, the scheme ensures that the same channel is used concurrently at different cells only if the pairwise distances between those cells are sufficiently large. Much of the complexity of the system comes from ensuring this separation. The network is modeled as a system of interacting continuous time automata, each corresponding to a cell. To simulate the model, conservative methods are used; i.e., methods in which no errors occur in the course of the simulation and so no rollback or relaxation is needed. Implemented on a 16K processor MasPar MP-1, an elegant and simple technique provides speedups of about 15 times over an optimized serial simulation running on a high speed workstation. A drawback of this technique, typical of conservative methods, is that processor utilization is rather low. To overcome this, new methods were developed that exploit slackness in event dependencies over short intervals of time, thereby raising the utilization to above 50 percent and the speedup over the optimized serial code to about 120 times.

Greenberg, Albert G.; Lubachevsky, Boris D.; Nicol, David M.; Wright, Paul E.

1994-01-01
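The co-channel separation rule described above (the same channel may be reused at two cells only if the cells are sufficiently far apart) can be sketched as a simple admission test. This is a hedged illustration; the cell coordinates, reuse radius, and function names are assumptions, not the paper's model.

```python
# Conservative co-channel check: a channel can be assigned at a cell
# only if every cell already using that channel is at least
# MIN_REUSE_DIST away, keeping co-channel interference tolerably low.
import math

MIN_REUSE_DIST = 2.0   # assumed minimum separation between co-channel cells

def can_assign(channel, cell, assignments, positions):
    """assignments: channel -> set of cells currently using it.
    positions: cell -> (x, y) coordinates."""
    cx, cy = positions[cell]
    for other in assignments.get(channel, ()):
        ox, oy = positions[other]
        if math.hypot(cx - ox, cy - oy) < MIN_REUSE_DIST:
            return False   # too close: reuse would interfere
    return True

positions = {0: (0, 0), 1: (1, 0), 2: (3, 0)}
assignments = {7: {0}}          # channel 7 already in use at cell 0
# can_assign(7, 1, assignments, positions) -> False (distance 1 < 2)
# can_assign(7, 2, assignments, positions) -> True  (distance 3 >= 2)
```

Enforcing this pairwise-distance constraint across cells is the interaction that makes the simulated automata dependent on one another, and hence what the conservative synchronization in the paper must respect.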

125

Discrete-Event Execution Alternatives on General Purpose Graphical Processing Units

Graphics cards, traditionally designed as accelerators for computer graphics, have evolved to support more general-purpose computation. General Purpose Graphical Processing Units (GPGPUs) are now being used as highly efficient, cost-effective platforms for executing certain simulation applications. While most of these applications belong to the category of time-stepped simulations, little is known about the applicability of GPGPUs to discrete event simulation (DES). Here, we identify some of the issues & challenges that the GPGPU stream-based interface raises for DES, and present some possible approaches to moving DES to GPGPUs. Initial performance results on simulation of a diffusion process show that DES-style execution on GPGPU runs faster than DES on CPU and also significantly faster than time-stepped simulations on either CPU or GPGPU.

Perumalla, Kalyan S. [ORNL]

2006-01-01

126

Safety Discrete Event Models for Holonic Cyclic Manufacturing Systems

NASA Astrophysics Data System (ADS)

In this paper the expression “holonic cyclic manufacturing systems” refers to complex assembly/disassembly systems or fork/join systems, kanban systems, and, in general, to any discrete event system that transforms raw material and/or components into products. Such a system is said to be cyclic if it provides the same sequence of products indefinitely. This paper considers the scheduling of holonic cyclic manufacturing systems and describes a new approach using the Petri net formalism. We propose an approach to frame the optimum schedule of holonic cyclic manufacturing systems in order to maximize the throughput while minimizing the work in process. We also propose an algorithm to verify the optimum schedule.

Ciufudean, Calin; Filote, Constantin

127

Parallel simulation of cellular neural networks

In this paper a new simulator for cellular neural networks (CNN), called PSIMCNN, is presented. It has been designed for a parallel general-purpose computing architecture based on transputers. The Gauss-Jacobi waveform relaxation (WR) algorithm has been adopted. It has been analytically proved that the WR algorithm is convergent for the most common CNN models. Implementation issues have been described and

L. Fortuna; G. Manganaro; G. Muscato; G. Nunnari

1996-01-01

128

Parallel and Distributed System Simulation

NASA Technical Reports Server (NTRS)

This exploratory study initiated our research into the software infrastructure necessary to support the modeling and simulation techniques that are most appropriate for the Information Power Grid. Such computational power grids will use high-performance networking to connect hardware, software, instruments, databases, and people into a seamless web that supports a new generation of computation-rich problem solving environments for scientists and engineers. In this context we looked at evaluating the NetSolve software environment for network computing that leverages the potential of such systems while addressing their complexities. NetSolve's main purpose is to enable the creation of complex applications that harness the immense power of the grid, yet are simple to use and easy to deploy. NetSolve uses a modular, client-agent-server architecture to create a system that is very easy to use. Moreover, it is designed to be highly composable in that it readily permits new resources to be added by anyone willing to do so. In these respects NetSolve is to the Grid what the World Wide Web is to the Internet. But like the Web, the design that makes these wonderful features possible can also impose significant limitations on the performance and robustness of a NetSolve system. This project explored the design innovations that push the performance and robustness of the NetSolve paradigm as far as possible without sacrificing the Web-like ease of use and composability that make it so powerful.

Dongarra, Jack

1998-01-01

129

State-space supervision of reconfigurable discrete event systems

The Discrete Event Systems (DES) theory of supervisory and state feedback control offers many advantages for implementing supervisory systems. Algorithmic concepts have been introduced to assure that the supervising algorithms are correct and meet the specifications. It is often assumed that the supervisory specifications are invariant or, at least, fixed until a given supervisory task is completed. However, there are many practical applications where the supervising specifications update in real time. For example, in a Reconfigurable Discrete Event System (RDES) architecture, a bank of supervisors is defined to accommodate each identified operational condition or different supervisory specifications. This adaptive supervisory control system changes the supervisory configuration to accept coordinating commands or to adjust for changes in the controlled process. This paper addresses reconfiguration at the supervisory level of hybrid systems along with an RDES underlying architecture. It reviews the state-based supervisory control theory and extends it to the paradigm of RDES and in view of process control applications. The paper addresses theoretical issues with a limited number of practical examples. This control approach is particularly suitable for hierarchical reconfigurable hybrid implementations.

Garcia, H.E. [Argonne National Lab., IL (United States); Ray, A. [Pennsylvania State Univ., University Park, PA (United States)

1995-12-31

130

Continuum Representation for Simulating Discrete Events of Battery Operation

… engineering model, which describes the galvanostatic charge/open-circuit/discharge processes of a thin … efforts and computation cost. However, it is not ideal for state detection. © 2009 The Electrochemical … or discharge process. Wu and White [12] devised an initialization subroutine called differential algebraic …

Panchagnula, Mahesh

131

Improving ICU patient flow through discrete-event simulation

Massachusetts General Hospital (MGH), the largest hospital in New England and a national leader in care delivery, teaching, and research, operates ten Intensive Care Units (ICUs), including the 20-bed Ellison 4 Surgical ...

Christensen, Benjamin A. (Benjamin Arthur)

2012-01-01

132

Discrete event simulation to improve aircraft availability and maintainability

This paper presents a study of maintenance operations at the Daytona Beach, Florida campus of Embry-Riddle Aeronautical University. Embry-Riddle is well-known for its large flight training programs. The Flight Training Department also maintains the school's aircraft on-site at the Daytona Beach campus. There, overall system availability at the operational level has been a chronic problem. The number of aircraft grounded

Massoud Bazargan; R. N. McGrath

2003-01-01

133

Parallel Implicit Kinetic Simulation with PARSEK

NASA Astrophysics Data System (ADS)

Kinetic plasma simulation is the ultimate tool for plasma analysis. One of the prime tools for kinetic simulation is the particle in cell (PIC) method. The explicit or semi-implicit (i.e. implicit only on the fields) PIC method requires exceedingly small time steps and grid spacing, limited by the necessity to resolve the electron plasma frequency, the Debye length and the speed of light (for fully explicit schemes). A different approach is to consider fully implicit PIC methods where both particles and fields are discretized implicitly. This approach allows radically larger time steps and grid spacing, reducing the cost of a simulation by orders of magnitude while keeping the full kinetic treatment. In our previous work, simulations impossible for the explicit PIC method even on massively parallel computers have been made possible on a single processor machine using the implicit PIC code CELESTE3D [1]. We propose here another quantum leap: PARSEK, a parallel cousin of CELESTE3D, based on the same approach but sporting a radically redesigned software architecture (object oriented C++, where CELESTE3D was structured and written in FORTRAN77/90) and fully parallelized using MPI for both particle and grid communication. [1] G. Lapenta, J.U. Brackbill, W.S. Daughton, Phys. Plasmas, 10, 1577 (2003).

Stefano, Markidis; Giovanni, Lapenta

2004-11-01

134

Parallel node placement method by bubble simulation

NASA Astrophysics Data System (ADS)

An efficient Parallel Node Placement method by Bubble Simulation (PNPBS), employing METIS-based domain decomposition (DD) for an arbitrary number of processors, is introduced. In accordance with the desired nodal density and Newton’s Second Law of Motion, automatic generation of node sets by bubble simulation has been demonstrated in previous work. Since the interaction force between nodes is short-range, the positions and velocities of two distant nodes can be updated simultaneously and independently during dynamic simulation; this inherent parallelism makes the method well suited to parallel computing. In this PNPBS method, the METIS-based DD scheme has been investigated for uniform and non-uniform node sets, and dynamic load balancing is obtained by evenly distributing work among the processors. For the nodes near the common interface of two neighboring subdomains, there is no need for special treatment after dynamic simulation. These nodes have good geometrical properties and a smooth density distribution, which is desirable in the numerical solution of partial differential equations (PDEs). The results of numerical examples show that quasi-linear speedup in the number of processors and high efficiency are achieved.

Nie, Yufeng; Zhang, Weiwei; Qi, Nan; Li, Yiqiang

2014-03-01
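The parallelism argument in the abstract above (a short-range inter-node force means distant nodes do not interact, so their updates are independent) can be illustrated with a toy force step. The force law, cutoff, and constants below are hypothetical, not the paper's bubble model.

```python
# Toy short-range force update: only neighbours within CUTOFF contribute,
# so nodes farther apart than the cutoff move independently and could be
# updated on different processors without communication.
import math

CUTOFF = 1.5

def force(d):
    # assumed repulsion profile: linear falloff, zero beyond the cutoff
    return max(0.0, 1.0 - d / CUTOFF)

def step(nodes, dt=0.1):
    """One explicit update of node positions under pairwise repulsion."""
    new = []
    for i, (xi, yi) in enumerate(nodes):
        fx = fy = 0.0
        for j, (xj, yj) in enumerate(nodes):
            if i == j:
                continue
            d = math.hypot(xi - xj, yi - yj)
            if 0.0 < d < CUTOFF:       # only near neighbours contribute
                f = force(d)
                fx += f * (xi - xj) / d
                fy += f * (yi - yj) / d
        new.append((xi + dt * fx, yi + dt * fy))
    return new
```

Two nodes separated by more than the cutoff are left exactly where they were, while a close pair is pushed apart; the former case is what lets a domain-decomposed implementation update distant subdomains concurrently.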

135

Optimal memory management for time warp parallel simulation

Recently there has been a great deal of interest in performance evalution of parallel simulation. Most work is devoted to the time complexity and assumes that the amount of memory available for parallel simulation is unlimited. This paper studies the space complexity of parallel simulation. Our goal is to design an efficient memory management protocol which guarantees that the memory

Yi-Bing Lin; Bruno R. Preiss

1991-01-01

136

Fracture simulations via massively parallel molecular dynamics

Fracture simulations at the atomistic level have heretofore been carried out for relatively small systems of particles, typically 10,000 or fewer. In order to study anything approaching a macroscopic system, massively parallel molecular dynamics (MD) must be employed. In two spatial dimensions (2D), it is feasible to simulate a sample that is 0.1 μm on a side. We report on recent MD simulations of mode I crack extension under tensile loading at high strain rates. The method of uniaxial, homogeneously expanding periodic boundary conditions was employed to represent tensile stress conditions near the crack tip. The effects of strain rate, temperature, material properties (equation of state and defect energies), and system size were examined. We found that, in order to mimic a bulk sample, several tricks (in addition to expansion boundary conditions) need to be employed: (1) the sample must be pre-strained to nearly the condition at which the crack will spontaneously open; (2) to relieve the stresses at free surfaces, such as the initial notch, annealing by kinetic-energy quenching must be carried out to prevent unwanted rarefactions; (3) sound waves emitted as the crack tip opens and dislocations emitted from the crack tip during blunting must be absorbed by special reservoir regions. The tricks described briefly in this paper will be especially important to carrying out feasible massively parallel 3D simulations via MD.

Holian, B.L. [Los Alamos National Lab., NM (United States); Abraham, F.F. [IBM Research Div., San Jose, CA (United States). Almaden Research Center; Ravelo, R. [Texas Univ., El Paso, TX (United States)

1993-09-01

137

Parallel Strategies for Crash and Impact Simulations

We describe a general strategy we have found effective for parallelizing solid mechanics simulations. Such simulations often have several computationally intensive parts, including finite element integration, detection of material contacts, and particle interaction if smoothed particle hydrodynamics is used to model highly deforming materials. The need to balance all of these computations simultaneously is a difficult challenge that has kept many commercial and government codes from being used effectively on parallel supercomputers with hundreds or thousands of processors. Our strategy is to load-balance each of the significant computations independently with whatever balancing technique is most appropriate. The chief benefit is that each computation can be scalably parallelized. The drawback is the data exchange between processors and extra coding that must be written to maintain multiple decompositions in a single code. We discuss these trade-offs and give performance results showing this strategy has led to a parallel implementation of a widely-used solid mechanics code that can now be run efficiently on thousands of processors of the Pentium-based Sandia/Intel TFLOPS machine. We illustrate with several examples the kinds of high-resolution, million-element models that can now be simulated routinely. We also look to the future and discuss what possibilities this new capability promises, as well as the new set of challenges it poses in material models, computational techniques, and computing infrastructure.

Attaway, S.; Brown, K.; Hendrickson, B.; Plimpton, S.

1998-12-07

138

Multibus-based parallel processor for simulation

NASA Technical Reports Server (NTRS)

A Multibus-based parallel processor simulation system is described. The system is intended to serve as a vehicle for gaining hands-on experience, testing system and application software, and evaluating parallel processor performance during development of a larger system based on the horizontal/vertical-bus interprocessor communication mechanism. The prototype system consists of up to seven Intel iSBC 86/12A single-board computers which serve as processing elements, a multiple transmission controller (MTC) designed to support system operation, and an Intel Model 225 Microcomputer Development System which serves as the user interface and input/output processor. All components are interconnected by a Multibus/IEEE 796 bus. An important characteristic of the system is that it provides a mechanism for a processing element to broadcast data to other selected processing elements. This parallel transfer capability is provided through the design of the MTC and a minor modification to the iSBC 86/12A board. The operation of the MTC, the basic hardware-level operation of the system, and pertinent details about the iSBC 86/12A and the Multibus are described.

Ogrady, E. P.; Wang, C.-H.

1983-01-01

139

Improving the Teaching of Discrete-Event Control Systems Using a LEGO Manufacturing Prototype

ERIC Educational Resources Information Center

This paper discusses the usefulness of employing LEGO as a teaching-learning aid in a post-graduate-level first course on the control of discrete-event systems (DESs). The final assignment of the course is presented, which asks students to design and implement a modular hierarchical discrete-event supervisor for the coordination layer of a…

Sanchez, A.; Bucio, J.

2012-01-01

140

Diagnosis of Discrete-Event Systems Using Satisfiability Algorithms Alban Grastien

Diagnosis of Discrete-Event Systems Using Satisfiability Algorithms. Alban Grastien, NICTA, University of Melbourne, Melbourne, Australia. Abstract: The diagnosis of a discrete-event system is the problem of determining whether the behaviors are normal or faulty. We show how the diagnosis problems can be translated into the propositional …

Anbulagan, A.

141

On-Time Diagnosis of Discrete Event Systems Aditya Mahajan and Demosthenis Teneketzis

On-Time Diagnosis of Discrete Event Systems. Aditya Mahajan and Demosthenis Teneketzis. A formulation and solution methodology for on-time fault diagnosis in discrete event systems is presented. This formulation and solution methodology captures the timeliness aspect of fault diagnosis and is therefore

Mahajan, Aditya

142

Failure Diagnosis of Discrete Event Systems With Linear-Time Temporal Logic Specifications

Failure Diagnosis of Discrete Event Systems With Linear-Time Temporal Logic Specifications. Shengbing Jiang and Ratnesh Kumar. Abstract: The paper studies failure diagnosis of discrete event systems; an assumption required for the diagnosis algorithms in prior methods to work is relaxed. Finally, a simple example is given.

Kumar, Ratnesh

143

Supervisory Control and Deadlock Avoidance Control Problem for Concurrent Discrete Event

Supervisory Control and Deadlock Avoidance Control Problem for Concurrent Discrete Event Systems. {Gaudin, Herve.Marchand}@irisa.fr. Abstract: We tackle the Non-Blocking Supervisory Control Problem for concurrent discrete event systems. We first outline the method allowing us to solve the state avoidance control problem on concurrent systems.

Paris-Sud XI, UniversitÃ© de

144

A Survey of Petri Net Methods for Controlled Discrete Event Systems

This paper surveys recent research on the application of Petri net models to the analysis and synthesis of controllers for discrete event systems. Petri nets have been used extensively in applications such as automated manufacturing, and there exists a large body of tools for qualitative and quantitative analysis of Petri nets. The goal of Petri net research in discrete event

L. E. HOLLOWAY; B. H. KROGH; A. GIUA

1997-01-01

145

Parallel Finite Element Simulation of Tracer Injection in Oil Reservoirs

Parallel Finite Element Simulation of Tracer Injection in Oil Reservoirs. Alvaro L.G.A. Coutinho. In this work, parallel finite element techniques for the simulation of tracer injection in oil reservoirs are presented. The pressure, velocity and concentration linear systems of equations are solved with parallel element-by-element

Coutinho, Alvaro L. G. A.

146

Abstraction of continuous system to discrete event system using neural network

NASA Astrophysics Data System (ADS)

A hybrid system consists of continuous systems and discrete event systems, which interact with each other. In such configuration, a continuous system can't directly communicate with a discrete event system. Therefore, a form of interface between two systems is required for possible communication. An interface from a continuous system to a discrete event system requires abstraction of a continuous system as a discrete event system. This paper proposes a methodology for abstraction of a continuous system as a discrete event system using neural network. A continuous system is first represented by a timed state transition model and then the model is mapped into a neural network by learning capability of the network. With a simple example, this paper describes the abstraction process in detail and discusses application methods of the neural network model. Finally, an application of such abstraction in design of intelligent control is discussed.

Jung, Sung H.; Kim, Tag G.

1997-06-01

147

Parallel solvers for reservoir simulation on MIMD computers

We have investigated parallel solvers for reservoir simulation. We compare different solvers and preconditioners using T3D and SP1 parallel computers. We use block diagonal domain decomposition preconditioner with non-overlapping sub-domains.

Piault, E. [Cisis and Comissariat a l`Energie Atomique, Gif sur Yvette (France); Willien, F. [Institut Francais du Petrole, Rueil-Malmaison (France); Roux, F.X. [Office Nationnal d`Etudes et Recherches Aerospatiales, Chatillon (France)

1995-12-01

148

Parallel and Distributed Multi-Algorithm Circuit Simulation

With the proliferation of parallel computing, parallel computer-aided design (CAD) has received significant research interest. Transient transistor-level circuit simulation plays an important role in digital/analog circuit design and verification...

Dai, Ruicheng

2012-10-19

149

Optimal Parametric Discrete Event Control: Problem and Solution

We present a novel optimization problem for discrete event control, similar in spirit to the optimal parametric control problem common in statistical process control. In our problem, we assume a known finite state machine plant model $G$ defined over an event alphabet $\Sigma$, so that the plant model language $L = \mathcal{L}(G)$ is prefix closed. We further assume the existence of a base control structure $M_K$, which may be either a finite state machine or a deterministic pushdown machine. If $K = \mathcal{L}(M_K)$, we assume $K$ is prefix closed and that $K \subseteq L$. We associate each controllable transition of $M_K$ with a binary variable $X_1,\dots,X_n$ indicating whether the transition is enabled or not. This leads to a function $M_K(X_1,\dots,X_n)$ that returns a new control specification depending upon the values of $X_1,\dots,X_n$. We exhibit a branch-and-bound algorithm to solve the optimization problem $\min_{X_1,\dots,X_n}\max_{w \in K} C(w)$ such that $M_K(X_1,\dots,X_n) \models \Pi$ and $\mathcal{L}(M_K(X_1,\dots,X_n)) \in \mathcal{C}(L)$. Here $\Pi$ is a set of logical assertions on the structure of $M_K(X_1,\dots,X_n)$; $M_K(X_1,\dots,X_n) \models \Pi$ indicates that $M_K(X_1,\dots,X_n)$ satisfies the logical assertions; and $\mathcal{C}(L)$ is the set of controllable sublanguages of $L$.

Griffin, Christopher H. [ORNL]

2008-01-01
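The min-max problem in the abstract above can be made concrete with a brute-force search over the binary enable variables (the paper solves it with branch and bound; exhaustive enumeration is only feasible for tiny instances). Everything in this sketch is hypothetical: the toy language generator, cost function, and structural predicate stand in for $\mathcal{L}(M_K(\cdot))$, $C(w)$, and $\Pi$.

```python
# Brute-force sketch of min over enables X of max over w in K of C(w),
# subject to a structural predicate. Toy stand-ins for the paper's
# controlled language, cost, and logical assertions.
from itertools import product

def controlled_language(enables):
    # Toy plant: each enabled transition i contributes the word "t{i}";
    # the empty word keeps the language prefix closed.
    return {f"t{i}" for i, on in enumerate(enables) if on} | {""}

def cost(word):
    return len(word)          # toy cost C(w)

def satisfies_pi(enables):
    return any(enables)       # toy assertion: at least one transition enabled

def optimal_enables(n):
    """Enumerate all 2^n enable vectors; keep the feasible one whose
    worst-case word cost is smallest."""
    best, best_cost = None, float("inf")
    for enables in product([0, 1], repeat=n):
        if not satisfies_pi(enables):
            continue
        worst = max(cost(w) for w in controlled_language(enables))
        if worst < best_cost:
            best, best_cost = enables, worst
    return best, best_cost
```

For this toy instance every feasible vector yields the same worst-case cost, so the search simply returns the first feasible vector; a branch-and-bound solver would prune dominated subtrees instead of enumerating all $2^n$ assignments.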

150

Empirical study of parallel LRU simulation algorithms

NASA Technical Reports Server (NTRS)

This paper reports on the performance of five parallel algorithms for simulating a fully associative cache operating under the LRU (Least-Recently-Used) replacement policy. Three of the algorithms are SIMD, and are implemented on the MasPar MP-2 architecture. The two others are parallelizations of an efficient serial algorithm on the Intel Paragon. One SIMD algorithm is quite simple, but its cost is linear in the cache size. The other two SIMD algorithms are more complex, but have costs that are independent of the cache size. Both the second and third SIMD algorithms compute all stack distances; the second is completely general, whereas the third presumes and takes advantage of bounds on the range of reference tags. Both MIMD algorithms implemented on the Paragon are general and compute all stack distances; they differ in one step that may affect their respective scalability. We assess the strengths and weaknesses of these algorithms as a function of problem size and characteristics, and compare their performance on traces derived from the execution of three SPEC benchmark programs.

Carr, Eric; Nicol, David M.

1994-01-01
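The serial computation that the parallel algorithms above compete against can be sketched in a few lines. This is a hedged illustration of the standard stack-distance technique, with hypothetical names and trace; it is not code from the paper.

```python
# Serial LRU stack-distance computation. A reference's stack distance is
# its depth in the recency stack (0 = most recently used); a reference
# hits in a fully associative cache of C lines iff its distance < C, so
# one pass yields hit ratios for every cache size at once.

def lru_stack_distances(trace):
    """For each reference, return its LRU stack distance,
    or None on a cold (compulsory) miss."""
    stack = []          # most-recent tag at index 0
    distances = []
    for tag in trace:
        if tag in stack:
            d = stack.index(tag)
            stack.pop(d)
            distances.append(d)
        else:
            distances.append(None)   # never seen before
        stack.insert(0, tag)
    return distances

dists = lru_stack_distances(["a", "b", "a", "c", "b", "a"])
# dists == [None, None, 1, None, 2, 2]
```

The serial version is inherently sequential (each step depends on the stack left by the previous one), which is precisely what makes the SIMD and MIMD parallelizations studied in the paper nontrivial.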

151

Parallel Proximity Detection for Computer Simulation

NASA Technical Reports Server (NTRS)

The present invention discloses a system for performing proximity detection in computer simulations on parallel processing architectures utilizing a distribution list which includes movers and sensor coverages which check in and out of grids. Each mover maintains a list of sensors that detect the mover's motion as the mover and sensor coverages check in and out of the grids. Fuzzy grids are included by fuzzy resolution parameters to allow movers and sensor coverages to check in and out of grids without computing exact grid crossings. The movers check in and out of grids while moving sensors periodically inform the grids of their coverage. In addition, a lookahead function is also included for providing a generalized capability without making any limiting assumptions about the particular application to which it is applied. The lookahead function is initiated so that risk-free synchronization strategies never roll back grid events. The lookahead function adds fixed delays as events are scheduled for objects on other nodes.

Steinman, Jeffrey S. (Inventor); Wieland, Frederick P. (Inventor)

1997-01-01

152

Parallel Proximity Detection for Computer Simulations

NASA Technical Reports Server (NTRS)

The present invention discloses a system for performing proximity detection in computer simulations on parallel processing architectures utilizing a distribution list which includes movers and sensor coverages which check in and out of grids. Each mover maintains a list of sensors that detect the mover's motion as the mover and sensor coverages check in and out of the grids. Fuzzy grids are included by fuzzy resolution parameters to allow movers and sensor coverages to check in and out of grids without computing exact grid crossings. The movers check in and out of grids while moving sensors periodically inform the grids of their coverage. In addition, a lookahead function is also included for providing a generalized capability without making any limiting assumptions about the particular application to which it is applied. The lookahead function is initiated so that risk-free synchronization strategies never roll back grid events. The lookahead function adds fixed delays as events are scheduled for objects on other nodes.

Steinman, Jeffrey S. (Inventor); Wieland, Frederick P. (Inventor)

1998-01-01

153

Parallel multiscale simulations of a brain aneurysm

Cardiovascular pathologies, such as a brain aneurysm, are affected by the global blood circulation as well as by the local microrheology. Hence, developing computational models for such cases requires the coupling of disparate spatial and temporal scales often governed by diverse mathematical descriptions, e.g., by partial differential equations (continuum) and ordinary differential equations for discrete particles (atomistic). However, interfacing atomistic-based with continuum-based domain discretizations is a challenging problem that requires both mathematical and computational advances. We present here a hybrid methodology that enabled us to perform the first multi-scale simulations of platelet depositions on the wall of a brain aneurysm. The large scale flow features in the intracranial network are accurately resolved by using the high-order spectral element Navier-Stokes solver NekTar. The blood rheology inside the aneurysm is modeled using a coarse-grained stochastic molecular dynamics approach (the dissipative particle dynamics method) implemented in the parallel code LAMMPS. The continuum and atomistic domains overlap with interface conditions provided by effective forces computed adaptively to ensure continuity of states across the interface boundary. A two-way interaction is allowed with the time-evolving boundary of the (deposited) platelet clusters tracked by an immersed boundary method. The corresponding heterogeneous solvers (NekTar and LAMMPS) are linked together by a computational multilevel message passing interface that facilitates modularity and high parallel efficiency. Results of multiscale simulations of clot formation inside the aneurysm in a patient-specific arterial tree are presented. We also discuss the computational challenges involved and present scalability results of our coupled solver on up to 300K computer processors. Validation of such coupled atomistic-continuum models is a main open issue that has to be addressed in future work.
PMID:23734066

Grinberg, Leopold; Fedosov, Dmitry A.; Karniadakis, George Em

2012-01-01

154

Parallel Performance of a Combustion Chemistry Simulation Gregg Skinner

combustion-generated pollutants, reducing knocking in internal combustion engines, studying ... Parallel Performance of a Combustion Chemistry Simulation. Gregg Skinner, Rudolf Eigenmann, Center ... used a description of a combustion simulation's mathematical and computational methods to develop

Padua, David

155

Non-intrusive parallelization of multibody system dynamic simulations

This paper evaluates two non-intrusive parallelization techniques for multibody system dynamics: parallel sparse linear equation solvers and OpenMP. Both techniques can be applied to existing simulation software with minimal changes in the code structure; this is a major advantage over Message Passing Interface, the standard parallelization method in multibody dynamics. Both techniques have been applied to parallelize a starting sequential

Francisco González; Alberto Luaces; Urbano Lugrís; Manuel González

2009-01-01

156

Parsec: A Parallel Simulation Environment for Complex Systems

simulating large-scale systems. Widespread use of parallel simulation, however, has been significantly hindered by a lack of tools for integrating parallel model execution into the overall framework of system simulation. Although a number of algorithmic alternatives exist for parallel execution of discrete-event simulation models, performance analysts not expert in parallel simulation have relatively few tools giving them flexibility to experiment with multiple algorithmic or architectural...

Rajive Bagrodia; Richard A. Meyer; Mineo Takai; Yu-An Chen; Xiang Zeng; Jay Martin; Ha Yoon Song

1998-01-01

157

Parallel magnetic field perturbations in gyrokinetic simulations

At low β it is common to neglect parallel magnetic field perturbations on the basis that they are of order β². This is only true if effects of order β are canceled by a term in the ∇B drift also of order β [H. L. Berk and R. R. Dominguez, J. Plasma Phys. 18, 31 (1977)]. To our knowledge this has not been rigorously tested with modern gyrokinetic codes. In this work we use the gyrokinetic code GS2 [Kotschenreuther et al., Comput. Phys. Commun. 88, 128 (1995)] to investigate whether the compressional magnetic field perturbation B∥ is required for accurate gyrokinetic simulations at low β for microinstabilities commonly found in tokamaks. The kinetic ballooning mode (KBM) demonstrates the principle described by Berk and Dominguez strongly, as does the trapped electron mode, in a less dramatic way. The ion and electron temperature gradient (ETG) driven modes do not typically exhibit this behavior; the effects of B∥ are found to depend on the pressure gradients. The terms which are seen to cancel at long wavelength in KBM calculations can be cumulative in the ion temperature gradient case and increase with ηe. The effect of B∥ on the ETG instability is shown to depend on the normalized pressure gradient β′ at constant β.

Joiner, N.; Hirose, A. [Department of Physics and Engineering Physics, University of Saskatchewan, Saskatoon, Saskatchewan S7N 5E2 (Canada); Dorland, W. [University of Maryland, College Park, Maryland 20742 (United States)

2010-07-15

158

Parallel magnetic field perturbations in gyrokinetic simulations

NASA Astrophysics Data System (ADS)

At low β it is common to neglect parallel magnetic field perturbations on the basis that they are of order β². This is only true if effects of order β are canceled by a term in the ∇B drift also of order β [H. L. Berk and R. R. Dominguez, J. Plasma Phys. 18, 31 (1977)]. To our knowledge this has not been rigorously tested with modern gyrokinetic codes. In this work we use the gyrokinetic code GS2 [Kotschenreuther et al., Comput. Phys. Commun. 88, 128 (1995)] to investigate whether the compressional magnetic field perturbation B∥ is required for accurate gyrokinetic simulations at low β for microinstabilities commonly found in tokamaks. The kinetic ballooning mode (KBM) demonstrates the principle described by Berk and Dominguez strongly, as does the trapped electron mode, in a less dramatic way. The ion and electron temperature gradient (ETG) driven modes do not typically exhibit this behavior; the effects of B∥ are found to depend on the pressure gradients. The terms which are seen to cancel at long wavelength in KBM calculations can be cumulative in the ion temperature gradient case and increase with ηe. The effect of B∥ on the ETG instability is shown to depend on the normalized pressure gradient β′ at constant β.

Joiner, N.; Hirose, A.; Dorland, W.

2010-07-01

159

Parallelization of DQMC Simulation for Strongly Correlated Electron Systems

Parallelization of DQMC Simulation for Strongly Correlated Electron Systems. Che-Rung Lee, Dept. ... simulations. For instance, the study of strongly correlated materials in many technically important ... C++ program. It is particularly suitable for simulating strongly correlated materials. In the DQMC simulation

California at Davis, University of

160

Discrete event command and control for networked teams with multiple missions

NASA Astrophysics Data System (ADS)

During mission execution in military applications, the TRADOC Pamphlet 525-66 Battle Command and Battle Space Awareness capabilities prescribe expectations that networked teams will perform in a reliable manner under changing mission requirements, varying resource availability and reliability, and resource faults. In this paper, a Command and Control (C2) structure is presented that allows for computer-aided execution of the networked team decision-making process, control of force resources, shared resource dispatching, and adaptability to change based on battlefield conditions. A mathematically justified networked computing environment is provided called the Discrete Event Control (DEC) Framework. DEC has the ability to provide the logical connectivity among all team participants including mission planners, field commanders, war-fighters, and robotic platforms. The proposed data management tools are developed and demonstrated on a simulation study and an implementation on a distributed wireless sensor network. The results show that the tasks of multiple missions are correctly sequenced in real-time, and that shared resources are suitably assigned to competing tasks under dynamically changing conditions without conflicts and bottlenecks.

Lewis, Frank L.; Hudas, Greg R.; Pang, Chee Khiang; Middleton, Matthew B.; McMurrough, Christopher

2009-05-01

161

Time warp for efficient parallel logic simulation on a massively parallel SIMD machine

A variation of the time warp protocol for efficient parallel logic simulation on a massively parallel SIMD machine is proposed. A scheme in which each object has a single event queue, instead of the three queues needed in the conventional time warp protocol, is developed for fast queue manipulation and small storage requirements. An immediate cancellation technique of the rollback mechanism
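For contrast with the single-queue variant proposed here, a conventional Time Warp logical process with state saving and rollback can be sketched as follows (a minimal illustration; antimessage cancellation is omitted and all names are invented):

```python
import copy

class TimeWarpLP:
    """Minimal optimistic logical process: executes events in
    timestamp order, saves state before each event, and rolls
    back when a straggler (an event earlier than local virtual
    time) arrives."""

    def __init__(self, state):
        self.state = state
        self.lvt = 0.0        # local virtual time
        self.processed = []   # (timestamp, event, state *before* the event)

    def handle(self, ts, event):
        redo = [(ts, event)]
        if ts < self.lvt:                       # straggler: roll back
            while self.processed and self.processed[-1][0] > ts:
                old_ts, old_ev, saved = self.processed.pop()
                self.state = saved              # restore pre-event state
                redo.append((old_ts, old_ev))   # must be re-executed
        for t, ev in sorted(redo, key=lambda p: p[0]):
            self.processed.append((t, ev, copy.deepcopy(self.state)))
            self.state = ev(self.state)         # apply the event
            self.lvt = t
```

The per-event `deepcopy` is the state-saving overhead that motivates the storage-reduction schemes discussed in this entry.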

Yunmo Chung; Moon Jung Chung

1991-01-01

162

SCALABLE PARALLEL COLLISION DETECTION SIMULATION Ilan Grinberg

Calibration Optimization, Safety Military Experiments and Simulation of Battles. All of these tools are based on simulation. 1 Introduction. There are many simulation software tools in the civilian and military markets

Wiseman, Yair

163

A Comparative Analysis of Various Time Warp Algorithms Implemented in the WARPED Simulation Kernel

The Time Warp mechanism conceptually has the potential to speed up discrete event simulations on parallel platforms. However, practical implementations of the optimistic mechanism have been hindered by several drawbacks such as large memory usage, excessive rollbacks (instability), and wasted lookahead computation. Several optimizations and variations of the original Time Warp algorithm have been presented in the

Radharamanan Radhakrishnan; Timothy J. Mcbrayer; Krishnan Subramani; Malolan Chetlur; Vijay Balakrishnan; Philip A. Wilsey

1996-01-01

164

Parallel-Processing Test Bed For Simulation Software

NASA Technical Reports Server (NTRS)

Second-generation Hypercluster computing system is multiprocessor test bed for research on parallel algorithms for simulation in fluid dynamics, electromagnetics, chemistry, and other fields with large computational requirements but relatively low input/output requirements. Built from standard, off-the-shelf hardware readily upgraded as improved technology becomes available. System used for experiments with such parallel-processing concepts as message-passing algorithms, debugging software tools, and computational steering. First-generation Hypercluster system described in "Hypercluster Parallel Processor" (LEW-15283).

Blech, Richard; Cole, Gary; Townsend, Scott

1996-01-01

165

Parallel simulated annealing algorithms for cell placement on hypercube multiprocessors

NASA Technical Reports Server (NTRS)

Two parallel algorithms for standard cell placement using simulated annealing are developed to run on distributed-memory message-passing hypercube multiprocessors. The cells can be mapped in a two-dimensional area of a chip onto processors in an n-dimensional hypercube in two ways, such that both small and large cell exchange and displacement moves can be applied. The computation of the cost function in parallel among all the processors in the hypercube is described, along with a distributed data structure that needs to be stored in the hypercube to support the parallel cost evaluation. A novel tree broadcasting strategy is used extensively for updating cell locations in the parallel environment. A dynamic parallel annealing schedule estimates the errors due to interacting parallel moves and adapts the rate of synchronization automatically. Two novel approaches in controlling error in parallel algorithms are described: heuristic cell coloring and adaptive sequence control.
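The serial annealing core that such parallel placement algorithms distribute across processors looks roughly like this (a generic Metropolis sketch, not the paper's hypercube algorithm; the parameters and names are invented):

```python
import math
import random

def anneal_placement(cost, swap, init, T0=10.0, alpha=0.995,
                     iters=2000, seed=0):
    """Serial skeleton of simulated-annealing placement: repeatedly
    propose a cell exchange, accept uphill moves with probability
    exp(-dC/T), and cool geometrically.  The hypercube algorithms
    in the abstract evaluate many such moves in parallel and must
    then control the error from interacting moves."""
    rng = random.Random(seed)
    cur = list(init)
    best, best_c = cur, cost(cur)
    T = T0
    for _ in range(iters):
        i = rng.randrange(len(cur))
        j = rng.randrange(len(cur))
        cand = swap(cur, i, j)          # propose exchanging two cells
        dC = cost(cand) - cost(cur)
        if dC <= 0 or rng.random() < math.exp(-dC / T):
            cur = cand                  # Metropolis acceptance rule
            if cost(cur) < best_c:
                best, best_c = cur, cost(cur)
        T = max(T * alpha, 1e-6)        # geometric cooling schedule
    return best
```

In the serial version every accepted move sees a consistent cost; applying moves concurrently on different processors is what introduces the evaluation errors that the paper's adaptive schedule estimates and bounds.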

Banerjee, Prithviraj; Jones, Mark Howard; Sargent, Jeff S.

1990-01-01

166

PARALLEL COMPUTER SIMULATION TECHNIQUES FOR THE STUDY OF MACROMOLECULES

PARALLEL COMPUTER SIMULATION TECHNIQUES FOR THE STUDY OF MACROMOLECULES. Mark R. Wilson and Jaroslav ... years two important developments in computing have occurred. At the high-cost end of the scale, supercomputers have become parallel computers. The ultra-fast (specialist) processors and the expensive vector-computers

Wilson, Mark R.

167

Extensions to Time Warp Parallel Simulation for Spatial Decomposed Applications

Extensions to Time Warp Parallel Simulation for Spatial Decomposed Applications. Benno J. Overeinder ... systems puts some extra requirements on the parallel synchronization schemes such as Time Warp. The large scientific problems require efficient memory management, both time and space efficient

Amsterdam, Universiteit van

168

Balanced Decomposition for Power System Simulation on Parallel Computers

Balanced Decomposition for Power System Simulation on Parallel Computers. Felipe Morales, Hugh ... parallelization strategy is tested in a Parsytec computer incorporating two PowerXplorer systems, each one ... System. 1 Introduction. Power system analysis is intensive in computational terms [1]. In fact, the power

Catholic University of Chile (Universidad Católica de Chile)

169

PROTEUS: A High-Performance Parallel-Architecture Simulator

PROTEUS: A High-Performance Parallel-Architecture Simulator, by Eric A. Brewer, Chrysanthos N. Dellarocas, Adrian Colbrook, William E. Weihl. September 1991. Abstract: Proteus is a high-performance simulator ... configured to simulate a wide range of architectures. Proteus provides a modular structure that simplifies

Koppelman, David M.

170

Massively Parallel Simulations of Solar Flares and Plasma Turbulence

Massively Parallel Simulations of Solar Flares and Plasma Turbulence. Lukas Arnold, Christoph Beetz ... in space- and astrophysical plasma systems include solar flares and hydro- or magnetohydrodynamic turbulence ... of solar flares and Lagrangian statistics of compressible and incompressible turbulent flows

Grauer, Rainer

171

SUPERB: Simulator Utilizing Parallel Evaluation of Resistive Bridges. Piet Engelke

SUPERB: Simulator Utilizing Parallel Evaluation of Resistive Bridges. Piet Engelke, Bettina ... bridging fault simulator SUPERB (Simulator Utilizing Parallel Evaluation of Resistive Bridges) ... stuck-at simulation. It outperforms a conventional interval-based resistive bridging fault simulator

Polian, Ilia

172

Xyce parallel electronic simulator : users' guide. Version 5.1.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. 
As a result, Xyce is a unique electrical simulation capability, designed to meet the specific needs of the laboratory.

Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Keiter, Eric Richard; Pawlowski, Roger Patrick

2009-11-01

173

Parallel PIC plasma simulation through particle

for the confinement degradation of the plasma. Parallel Computing 27 (2001) 295 ... magnetohydrodynamic (MHD) description of the plasma. Such a description, indeed, corresponds to taking a set ... because the latter are dominated by long-range effects. This is true as long as the number of particles

Vlad, Gregorio

174

Iterative Schemes for Time Parallelization with Application to Reservoir Simulation

Parallel methods are usually not applied to the time domain because of the inherent sequentialness of time evolution. But for many evolutionary problems, computer simulation can benefit substantially from time parallelization methods. In this paper, the authors present several such algorithms that actually exploit the sequential nature of time evolution through a predictor-corrector procedure. This sequentialness ensures convergence of a parallel predictor-corrector scheme within a fixed number of iterations. The performance of these novel algorithms, which are derived from the classical alternating Schwarz method, is illustrated through several numerical examples using the reservoir simulator Athena.
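A generic predictor-corrector time-parallel iteration of the kind this abstract describes can be sketched as follows (this is the standard parareal-style update with explicit Euler propagators, not necessarily the authors' Schwarz-derived schemes; all names are invented):

```python
def euler(f, y, t, dt, steps):
    """Explicit Euler propagator for y' = f(t, y)."""
    for _ in range(steps):
        y = y + dt * f(t, y)
        t += dt
    return y

def parareal(f, y0, t0, t1, n, k_iters=4, coarse=2, fine=64):
    """Predictor-corrector time parallelization over n time slices.
    A cheap coarse propagator G predicts slice endpoints serially;
    an accurate fine propagator F (parallelizable across slices)
    corrects them:  Y[i+1] = G_new(Y[i]) + F(Y[i]) - G_old(Y[i])."""
    H = (t1 - t0) / n
    ts = [t0 + i * H for i in range(n + 1)]
    Y = [y0]
    for i in range(n):                  # serial coarse prediction
        Y.append(euler(f, Y[i], ts[i], H / coarse, coarse))
    for _ in range(k_iters):
        G_old = [euler(f, Y[i], ts[i], H / coarse, coarse) for i in range(n)]
        # the fine sweeps below are independent: this is the parallel part
        F = [euler(f, Y[i], ts[i], H / fine, fine) for i in range(n)]
        new = [y0]
        for i in range(n):              # serial corrector sweep
            g_new = euler(f, new[i], ts[i], H / coarse, coarse)
            new.append(g_new + F[i] - G_old[i])
        Y = new
    return Y
```

The convergence property mentioned in the abstract shows up directly: after at most n iterations the corrected trajectory reproduces the serial fine solution exactly.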

Garrido, I; Fladmark, G E; Espedal, M S; Lee, B

2005-04-18

175

A conservative approach to parallelizing the Sharks World simulation

NASA Technical Reports Server (NTRS)

Parallelizing a benchmark problem for parallel simulation, the Sharks World, is described. The described solution is conservative, in the sense that no state information is saved, and no 'rollbacks' occur. The used approach illustrates both the principal advantage and principal disadvantage of conservative parallel simulation. The advantage is that by exploiting lookahead an approach was found that dramatically improves the serial execution time, and also achieves excellent speedups. The disadvantage is that if the model rules are changed in such a way that the lookahead is destroyed, it is difficult to modify the solution to accommodate the changes.
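The lookahead-based safety test at the heart of conservative synchronization can be sketched as follows (a minimal single-process illustration; null messages and deadlock avoidance are omitted, and all names are invented):

```python
import heapq

class ConservativeLP:
    """Minimal conservative (Chandy-Misra style) logical process.
    Each input channel delivers timestamped messages in order; the
    LP may only process events up to
        safe_time = min over channels of (channel clock) + lookahead.
    Lookahead is the promise that a neighbor sends nothing earlier
    than its clock plus lookahead, which is what allows progress
    without any rollback."""

    def __init__(self, channels, lookahead):
        self.clocks = {c: 0.0 for c in channels}
        self.lookahead = lookahead
        self.pending = []          # heap of (timestamp, event)

    def receive(self, channel, ts, event=None):
        assert ts >= self.clocks[channel], "channels must be FIFO in time"
        self.clocks[channel] = ts
        if event is not None:
            heapq.heappush(self.pending, (ts, event))

    def safe_events(self):
        """Pop every event that is provably safe to execute now."""
        horizon = min(self.clocks.values()) + self.lookahead
        out = []
        while self.pending and self.pending[0][0] <= horizon:
            out.append(heapq.heappop(self.pending))
        return out
```

This makes the abstract's trade-off concrete: large lookahead lets many events clear the horizon at once, while a model change that destroys lookahead stalls `safe_events` even though nothing is logically wrong.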

Nicol, David M.; Riffe, Scott E.

1990-01-01

176

Traffic simulations on parallel computers using domain decomposition techniques

Large scale simulations of Intelligent Transportation Systems (ITS) can only be achieved by using the computing resources offered by parallel computing architectures. Domain decomposition techniques are proposed which allow traffic simulations to be performed with the standard simulation package TRAF-NETSIM on a 128-node IBM SPx parallel supercomputer as well as on a cluster of SUN workstations. Whilst this particular parallel implementation is based on NETSIM, a microscopic traffic simulation model, the presented strategy is applicable to a broad class of traffic simulations. An outer iteration loop must be introduced in order to converge to a global solution. A performance study that utilizes a scalable test network consisting of square grids is presented, which addresses the performance penalty introduced by the additional iteration loop.

Hanebutte, U.R.; Tentner, A.M.

1995-12-31

177

High-performance retargetable simulator for parallel architectures. Technical report

In this thesis, the authors describe Proteus, a high-performance simulation-based system for the evaluation of parallel algorithms and system software. Proteus is built around a retargetable parallel architecture simulator and a flexible data collection and display component. The simulator uses a combination of simulation and direct execution to achieve high performance, while retaining simulation accuracy. Proteus can be configured to simulate a wide range of shared memory and message passing MIMD architectures and the level of simulation detail can be chosen by the user. Detailed memory, cache and network simulation is supported. Parallel programs can be written using a programming model based on C and a set of runtime system calls for thread and memory management. The system allows nonintrusive monitoring of arbitrary information about an execution, and provides flexible graphical utilities for displaying recorded data. To validate the accuracy of the system, a number of published experiments were reproduced on Proteus. In all cases the results obtained by simulation are very close to those published, a fact that provides support for the reliability of the system. Performance measurements demonstrate that the simulator is one to two orders of magnitude faster than other similar multiprocessor simulators.

Dellarocas, C.N.

1991-06-01

178

Parallel and Adaptive Simulation of Fuel Cells. Robert Klöfkorn

Parallel and Adaptive Simulation of Fuel Cells in 3d. Robert Klöfkorn, Dietmar Kröner, Mario ... fuel cells. Hereby, we focus on the simulation done in 3d using modern techniques like higher order ... and the transport of species in the cathodic gas diffusion layer of the fuel cell. Therefore, from the detailed

Münster, Westfälische Wilhelms-Universität

179

Feedback Control in Time Warp Synchronized Parallel Simulators

The Time Warp mechanism is one of the most important synchronization protocols for parallel simulation. However, for most applications, the successful use of Time Warp requires the careful selection of its optimization parameters (e.g. cancellation strategies, state saving frequency, and so on). Unfortunately, the optimal setting for the simulation parameters may not hold across an application domain or even

Philip A. Wilsey

1997-01-01

180

Parallel Transient Dynamics Simulations: Algorithms for Contact Detection

Parallel Transient Dynamics Simulations: Algorithms for Contact Detection and Smoothed Particle ... in the simulation can be modeled using the techniques of smoothed particle hydrodynamics. Implementing a hybrid mesh ... Smoothed particle hydrodynamics (SPH) provides an attractive method for including fluids in the model

Plimpton, Steve

181

PLC-based implementation of supervisory control for discrete event systems

The supervisory control theory is a general theory for the automatic synthesis of controllers (supervisors) for discrete event systems, given a plant model and a specification for the controlled behavior. Though the theory has for over a decade received substantial attention in academia, still very few industrial applications exist. The main reason for this seems to be a discrepancy between the
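The fixed-point computation behind supervisor synthesis can be sketched on a toy automaton (an illustrative sketch of the standard controllability idea, not this paper's PLC implementation; all names are invented):

```python
def synthesize(states, trans, bad, uncontrollable):
    """Tiny fixed-point computation of the safe state set for a
    supervisory controller.  A state is unsafe if it is forbidden,
    or if an *uncontrollable* event can lead from it to an unsafe
    state, since the supervisor cannot disable such events.
    trans: dict mapping (state, event) -> next state."""
    unsafe = set(bad)
    changed = True
    while changed:                      # backward propagation to fixpoint
        changed = False
        for (s, e), t in trans.items():
            if e in uncontrollable and t in unsafe and s not in unsafe:
                unsafe.add(s)
                changed = True
    safe = set(states) - unsafe
    # the supervisor disables every controllable event leaving the safe set
    disabled = {(s, e) for (s, e), t in trans.items()
                if s in safe and e not in uncontrollable and t not in safe}
    return safe, disabled
```

The resulting disablement map is exactly the kind of enable/disable logic that a PLC-based supervisor, as discussed in this entry, would execute against plant inputs and outputs.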

M. Fabian; A. Hellgren

1998-01-01

182

Decision Fusion and Supervisor Synthesis in Decentralized Discrete-Event Systems

Decision Fusion and Supervisor Synthesis in Decentralized Discrete-Event Systems. Joseph H. Prosser ... supervisory control architecture involving decision fusion. We formulate two problems that extend GP and SGP. These are GP with Fusion (GPF) and SGP with Fusion (SGPF). Our main result is a set of local supervisors

Kwatny, Harry G.

183

Fuzzy discrete event system modeling and temporal fuzzy reasoning in urban traffic control

In this paper, a P-DEVS-based fuzzy supervisory controller is applied to an urban traffic control problem, itself modeled from a discrete event perspective via a state flow formalism. The time-varying nature of intersections, the involvement of human behavior in the system, and the distributed nature of the problem at large make urban traffic control a challenging problem. Specifically, because of this distributed

A. Akramizadeh; M.-R. Akbarzadeh-T; M. Khademi

2004-01-01

184

Diagnosis of Repeated Failures For Discrete Event Systems With Linear-Time Temporal Logic

Diagnosis of Repeated Failures for Discrete Event Systems with Linear-Time Temporal Logic ... In earlier work we introduced a state-based approach for the diagnosis of repeatedly occurring failures ... is provided. The diagnosis problem for repeated failures in the temporal logic setting is reduced to one

Kumar, Ratnesh

185

UML2ALLOY: A TOOL FOR LIGHTWEIGHT MODELLING OF DISCRETE EVENT SYSTEMS

UML2ALLOY: A TOOL FOR LIGHTWEIGHT MODELLING OF DISCRETE EVENT SYSTEMS. Behzad Bordbar, School ... United Kingdom. ABSTRACT: Alloy is a textual language developed by Daniel ... of the UML and Alloy into a single CASE tool, which aims to take advantage of the positive aspects of both

Bordbar, Behzad

186

Modeling of discrete event systems: A holistic and incremental approach using Petri nets

In this article, the authors provide an alternative view on Petri nets modeling of discrete event systems. The proposed modeling procedure follows the Systems Specification guidelines underlying the well-known DEVS modeling formalism. The authors' endeavour is towards perfecting the design of reusable Petri nets-based models by searching for a good primitive for a modular model construction and the introduction of

Carmen-Veronica Bobeanu; Eugene J. H. Kerckhoffs; Hendrik Van Landeghem

2004-01-01

187

Behavior Coordination of Mobile Robotics Using Supervisory Control of Fuzzy Discrete Event Systems

In order to incorporate the uncertainty and impreciseness present in real-world event-driven asynchronous systems, fuzzy discrete event systems (FDESs) have been proposed as an extension to crisp DESs. In this paper, first, we propose an extension to the supervisory control theory of FDES by redefining fuzzy controllable and uncontrollable events. The proposed supervisor is capable of enabling

Awantha Jayasiri; George K. I. Mann; Raymond G. Gosine

2011-01-01

188

Fault Detection and Isolation in Manufacturing Systems with an Identified Discrete Event Model

Fault Detection and Isolation in Manufacturing Systems with an Identified Discrete Event Model ... In this paper a generic method for fault detection and isolation (FDI) in manufacturing systems is considered and a controller built on the basis of observed fault-free system behavior. An identification algorithm known from

Paris-Sud XI, UniversitÃ© de

189

Parallel Multiphysics Simulations of Charged Particles in Microfluidic Flows

The article describes parallel multiphysics simulations of charged particles in microfluidic flows with the waLBerla framework. To this end, three physical effects are coupled: rigid body dynamics, fluid flow modelled by a lattice Boltzmann algorithm, and electric potentials represented by a finite volume discretisation. For solving the finite volume discretisation for the electrostatic forces, a cell-centered multigrid algorithm is developed that conforms to the lattice Boltzmann meshes and the parallel communication structure of waLBerla. The new functionality is validated with suitable benchmark scenarios. Additionally, the parallel scaling and the numerical efficiency of the algorithms are analysed on an advanced supercomputer.

Bartuschat, Dominik

2014-01-01

190

Diagnosis of Discrete-Event Systems in Rules-based Model using First-order Linear Temporal Logic

Diagnosis of Discrete-Event Systems in Rules-based Model using First-order Linear Temporal Logic ... We study diagnosis of discrete-event systems (DESs) modeled in the rules-based modeling formalism presented ... is to develop failure diagnosis techniques that are able to exploit this compactness

Kumar, Ratnesh

191

Fault Diagnosis of Continuous Systems Using Discrete-Event Methods

Fault Diagnosis of Continuous Systems Using Discrete-Event Methods. Matthew Daigle, Xenofon Koutsoukos, Gautam Biswas ... Fault diagnosis is crucial for ensuring the safe operation of complex engineering systems. Although discrete-event diagnosis methods are used extensively, they do not easily apply to parametric

Koutsoukos, Xenofon D.

192

Parallel Monte Carlo Simulation for control system design

NASA Technical Reports Server (NTRS)

The research during the 1993/94 academic year addressed the design of parallel algorithms for stochastic robustness synthesis (SRS). SRS uses Monte Carlo simulation to compute probabilities of system instability and other design-metric violations. The probabilities form a cost function which is used by a genetic algorithm (GA). The GA searches for the stochastic optimal controller. The existing sequential algorithm was analyzed and modified to execute in a distributed environment. For this, parallel approaches to Monte Carlo simulation and genetic algorithms were investigated. Initial empirical results are available for the KSR1.
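The Monte Carlo core of such a cost function can be sketched on a toy uncertain plant (entirely hypothetical plant and parameter distributions; the genetic-algorithm search layer the abstract describes is omitted):

```python
import random

def instability_probability(k, n_samples=10000, seed=1):
    """Monte Carlo estimate of P(closed loop unstable) for a toy
    plant 1/(s^2 + a*s + b) with uncertain parameters
    a ~ U(-0.5, 1.5) and b ~ U(0.5, 1.5), under proportional
    feedback gain k.  The closed-loop polynomial is
    s^2 + a*s + (b + k), which is Hurwitz-stable exactly when
    a > 0 and b + k > 0."""
    rng = random.Random(seed)
    unstable = 0
    for _ in range(n_samples):
        a = rng.uniform(-0.5, 1.5)   # sampled plant parameters
        b = rng.uniform(0.5, 1.5)
        if not (a > 0 and b + k > 0):
            unstable += 1
    return unstable / n_samples
```

In stochastic robustness synthesis this estimate (plus similar probabilities for other design-metric violations) forms the cost that the GA minimizes over controller parameters; the samples are independent, which is what makes the evaluation easy to distribute across processors.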

Schubert, Wolfgang M.

1995-01-01

193

Parallel runway requirement analysis study. Volume 2: Simulation manual

NASA Technical Reports Server (NTRS)

This document is a user manual for operating the PLAND_BLUNDER (PLB) simulation program. This simulation is based on two aircraft approaching parallel runways independently and using parallel Instrument Landing System (ILS) equipment during Instrument Meteorological Conditions (IMC). If an aircraft should deviate from its assigned localizer course toward the opposite runway, this constitutes a blunder which could endanger the aircraft on the adjacent path. The worst case scenario would be if the blundering aircraft were unable to recover and continue toward the adjacent runway. PLAND_BLUNDER is a Monte Carlo-type simulation which employs the events and aircraft positioning during such a blunder situation. The model simulates two aircraft performing parallel ILS approaches using Instrument Flight Rules (IFR) or visual procedures. PLB uses a simple movement model and control law in three dimensions (X, Y, Z). The parameters of the simulation inputs and outputs are defined in this document along with a sample of the statistical analysis. This document is the second volume of a two volume set. Volume 1 is a description of the application of the PLB to the analysis of close parallel runway operations.

Ebrahimi, Yaghoob S.; Chun, Ken S.

1993-01-01

194

Efficient parallel CFD-DEM simulations using OpenMP

NASA Astrophysics Data System (ADS)

The paper describes parallelization strategies for the Discrete Element Method (DEM) used for simulating dense particulate systems coupled to Computational Fluid Dynamics (CFD). While the field equations of CFD are best parallelized by spatial domain decomposition techniques, the N-body particulate phase is best parallelized over the number of particles. When the two are coupled together, both modes are needed for efficient parallelization. It is shown that under these requirements, OpenMP thread-based parallelization has advantages over MPI processes. Two representative examples, fairly typical of dense fluid-particulate systems, are investigated, including the validation of the DEM-CFD and thermal-DEM implementations against experiments. Fluidized bed calculations are performed on beds with uniform particle loading, parallelized with MPI and OpenMP. It is shown that as the number of processing cores and the number of particles increase, the communication overhead of building ghost particle lists at processor boundaries dominates the time to solution, and OpenMP, which does not require this step, is about twice as fast as MPI. In rotary kiln heat transfer calculations, which are characterized by spatially non-uniform particle distributions, the low overhead of switching the parallelization mode in OpenMP eliminates the load imbalances but introduces increased overheads in fetching non-local data. In spite of this, it is shown that OpenMP is between 50% and 90% faster than MPI.
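
The particle-mode parallelization contrasted with MPI above can be illustrated with a thread-based sketch: in shared memory, no ghost-particle lists are needed at partition boundaries. Python stands in for OpenMP here, and the linear drag law and all names are illustrative assumptions:

```python
from concurrent.futures import ThreadPoolExecutor

def drag_force(v_particle, v_fluid, coeff=0.5):
    # Simple linear drag on one particle (illustrative, not the paper's model).
    return coeff * (v_fluid - v_particle)

def forces_parallel(velocities, v_fluid, n_workers=4):
    # Parallelize over particles: each worker handles one contiguous chunk,
    # analogous to OpenMP worksharing over the particle loop. All threads
    # share the same address space, so no ghost lists need to be exchanged.
    def chunk_forces(chunk):
        return [drag_force(v, v_fluid) for v in chunk]
    n = len(velocities)
    size = (n + n_workers - 1) // n_workers
    chunks = [velocities[i:i + size] for i in range(0, n, size)]
    with ThreadPoolExecutor(max_workers=n_workers) as ex:
        results = ex.map(chunk_forces, chunks)
    return [f for part in results for f in part]
```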

Amritkar, Amit; Deb, Surya; Tafti, Danesh

2014-01-01

195

A Reconfigurable Stochastic Model Simulator for Analysis of Parallel Systems

Markov chains and queueing models are convenient tools with which architects can analyze parallel systems. For high-speed execution and easy modeling, a reconfigurable Markov chain/queueing model simulation system called RSMS (Reconfigurable Stochastic Model Simulator) is proposed. A user describes the target system in a dedicated description language called Taico. The description is automatically translated into the HDL

Ou Yamamoto; Yuichiro Shibata; Hitoshi Kurosawa; Hideharu Amano

2000-01-01

196

Xyce Parallel Electronic Simulator : reference guide, version 2.0.

This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide. The focus of this document is to list, as exhaustively as possible, device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide.

Hoekstra, Robert John; Waters, Lon J.; Rankin, Eric Lamont; Fixel, Deborah A.; Russo, Thomas V.; Keiter, Eric Richard; Hutchinson, Scott Alan; Pawlowski, Roger Patrick; Wix, Steven D.

2004-06-01

197

Parallel PDE-Based Simulations Using the Common Component Architecture

The complexity of parallel PDE-based simulations continues to increase as multimodel, multiphysics, and multi-institutional projects become widespread. A goal of component- based software engineering in such large-scale simulations is to help manage this complexity by enabling better interoperability among various codes that have been independently developed by different groups. The Common Component Architecture (CCA) Forum is defining a component architecture

Lois Curfman McInnes; Benjamin A. Allan; Robert Armstrong; Steven J. Benson; David E. Bernholdt; Tamara L. Dahlgren; Lori Freitag Diachin; Manojkumar Krishnan; James A. Kohl; J. Walter Larson; Sophia Lefantzi; Jarek Nieplocha; Boyana Norris; Steven G. Parker; Jaideep Ray; Shujia Zhou

2006-01-01

198

Xyce parallel electronic simulator reference guide, version 6.0.

This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide [1]. The focus of this document is to list, as exhaustively as possible, device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide [1].

Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Verley, Jason C.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Warrender, Christina E.; Baur, David G. [Raytheon, Albuquerque, NM]

2013-08-01

199

Parallel Monte Carlo Ion Recombination Simulation in Orca

Parallel Monte Carlo Ion Recombination Simulation in Orca. Frank J. Seinstra, Department of Mathematics and Computer Science, Vrije Universiteit Amsterdam, The Netherlands. August 1996. Abstract: Interprocess communication in most languages for distributed programming is based on message passing. In Orca, however, a shared

Seinstra, Frank J.

200

PARALLEL SIMULATION USING THE TIME WARP OPERATING SYSTEM

PARALLEL SIMULATION USING THE TIME WARP OPERATING SYSTEM. Peter L. Reiher, Jet Propulsion Laboratory, 4800 Oak Grove Drive, Pasadena, California 91109. ABSTRACT: The Time Warp Operating System runs discrete event simulations and is available for experimental use. The first half of this tutorial will discuss how to use the Time Warp Operating System

California at Los Angeles, University of

201

Parallel FEM Simulation of Crack Propagation --Challenges, Status, and Perspectives

research challenges: developing algorithms for parallel mesh generation of unstructured 3D meshes. Accurate computer simulation of crack propagation in realistic 3D structures would be a valuable tool. (1 Rhodes Hall, Cornell University, Ithaca, NY 14853; 2 CS Department, Upson Hall, Cornell University, Ithaca)

Stodghill, Paul

202

A General Model for Non-Markovian Stochastic Decision Discrete-Event Systems

This paper extends previous work on modeling stochastic decision discrete-event systems (DDES) through a generalized semi-Markov decision process (GSMDP), which discards any restrictive unrealistic assumptions and can be applied to complex cases including non-Markovian environment. Moreover, as a typical example, we develop a GSMDP model for the optimal call admission control (CAC) problem in an integrated voice/data wireless network supporting

Wen Chen; Feiyu Lei; Weinong Wang

2005-01-01

203

We present an analysis of possible blocking phenomena, deadlock, in Discrete Event Systems (DES) having corrective and/or Preventive Maintenance Schedules (PMS). Although deadlock avoidance analysis for several classes of DES systems has been widely published, and although different approaches for PMS exist, it is not obvious how to mix deadlock avoidance and maintenance theories to improve throughput. In this paper

Jose Mireles; Frank L. Lewis

204

A Paraconsistent Logic Program Based Control for a Discrete Event Cat and Mouse

We have developed a paraconsistent logic program called an Extended Vector Annotated Logic Program with Strong Negation (abbr. EVALPSN), which can deal with defeasible deontic reasoning and contradiction, and applied it to safety verification and control such as railway interlocking safety verification, traffic signal control, etc. In this paper, we introduce how to apply EVALPSN to discrete event control with

Kazumi Nakamatsu; Ryuji Ishikawa; Atsuyuki Suzuki

2004-01-01

205

Qualitative representation of continuous time system toward discrete event system application

The purpose of this paper is to establish a performance evaluation or control strategy for an advanced control system using a programmable controller (PC), including not only discrete-event systems but also continuous-time systems. First of all, the structure of PC control systems, including continuous-time systems, is introduced, and the way in which a PC executes is explained. A Petri net

Y. Itoh

2001-01-01

206

Molecular simulation of rheological properties using massively parallel supercomputers

Advances in parallel supercomputing now make possible molecular-based engineering and science calculations that will soon revolutionize many technologies, such as those involving polymers and those involving aqueous electrolytes. We have developed a suite of message-passing codes for classical molecular simulation of such complex fluids and amorphous materials and have completed a number of demonstration calculations of problems of scientific and technological importance with each. In this paper, we will focus on the molecular simulation of rheological properties, particularly viscosity, of simple and complex fluids using parallel implementations of non-equilibrium molecular dynamics. Such calculations represent significant challenges computationally because, in order to reduce the thermal noise in the calculated properties within acceptable limits, large systems and/or long simulated times are required.

Bhupathiraju, R.K.; Cui, S.T.; Gupta, S.A.; Cummings, P.T. [Univ. of Tennessee, Knoxville, TN (United States), Dept. of Chemical Engineering]; Cochran, H.D. [Oak Ridge National Lab., TN (United States)]

1996-11-01

207

Random number generators for massively parallel simulations on GPU

High-performance streams of (pseudo) random numbers are crucial for the efficient implementation of countless stochastic algorithms, most importantly Monte Carlo simulations and molecular dynamics simulations with stochastic thermostats. A number of implementations of random number generators have been discussed for GPU platforms before, and some generators are even included in the CUDA supporting libraries. Nevertheless, not all of these generators are well suited for highly parallel applications where each thread requires its own generator instance. For this specific situation, encountered for instance in simulations of lattice models, most of the high-quality generators with large states such as the Mersenne twister cannot be used efficiently without substantial changes. We provide a broad review of existing CUDA variants of random-number generators and present the CUDA implementation of a new massively parallel high-quality, high-performance generator with a small memory load overhead.
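
The per-thread-generator requirement can be sketched structurally in Python; `random.Random` stands in for a GPU generator, and deriving child seeds from a master generator is an illustrative stand-in for the counter-based or stream-splitting schemes such codes actually use:

```python
import random

def spawn_generators(master_seed, n_threads):
    # Give each "thread" its own generator instance with a distinct seed
    # derived from the master seed, so the streams share no state. Production
    # GPU codes use counter-based or stream-splitting generators rather than
    # ad-hoc seed derivation; this is only a structural sketch.
    master = random.Random(master_seed)
    seeds = [master.getrandbits(64) for _ in range(n_threads)]
    return [random.Random(s) for s in seeds]
```

Each worker then draws only from its own instance, avoiding both contention on shared state and correlated streams across threads.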

Markus Manssen; Martin Weigel; Alexander K. Hartmann

2012-04-27

208

Reusable Component Model Development Approach for Parallel and Distributed Simulation

Model reuse is a key issue to be resolved in parallel and distributed simulation at present. However, component models built by different domain experts usually have diversiform interfaces, couple tightly, and bind closely with simulation platforms. As a result, they are difficult to reuse across different simulation platforms and applications. To address this problem, this paper first proposes a reusable component model framework. Based on this framework, our reusable model development approach is then elaborated, which contains two phases: (1) domain experts create simulation computational modules, observing three principles to achieve their independence; (2) the model developer encapsulates these simulation computational modules with six standard service interfaces to improve their reusability. The case study of a radar model indicates that a model developed using our approach has good reusability and is easy to use in different simulation platforms and applications. PMID:24729751

Zhu, Feng; Yao, Yiping; Chen, Huilong; Yao, Feng

2014-01-01

209

PRATHAM: Parallel Thermal Hydraulics Simulations using Advanced Mesoscopic Methods

At the Oak Ridge National Laboratory, efforts are under way to develop a 3D, parallel LBM code called PRATHAM (PaRAllel Thermal Hydraulic simulations using Advanced Mesoscopic Methods) to demonstrate the accuracy and scalability of LBM for turbulent flow simulations in nuclear applications. The code has been developed using FORTRAN-90 and parallelized using the Message Passing Interface (MPI) library. The Silo library is used to compact and write the data files, and the VisIt visualization software is used to post-process the simulation data in parallel. Both the single-relaxation-time (SRT) and multi-relaxation-time (MRT) LBM schemes have been implemented in PRATHAM. To capture turbulence without prohibitively increasing the grid resolution requirements, an LES approach [5] is adopted, allowing large-scale eddies to be numerically resolved while modeling the smaller (subgrid) eddies. In this work, a Smagorinsky model has been used, which modifies the fluid viscosity by an additional eddy viscosity depending on the magnitude of the rate-of-strain tensor. In LBM, this is achieved by locally varying the relaxation time of the fluid.
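
The locally varying relaxation time mentioned above can be written down compactly. This is the generic Smagorinsky-LBM relation in lattice units, with an assumed value for the Smagorinsky constant; it is not taken from PRATHAM's source:

```python
def smagorinsky_relaxation_time(nu0, strain_rate_mag, cs_const=0.17, delta=1.0):
    # Eddy viscosity from the Smagorinsky model: nu_t = (Cs * Delta)^2 * |S|,
    # where |S| is the magnitude of the rate-of-strain tensor.
    nu_t = (cs_const * delta) ** 2 * strain_rate_mag
    # In standard BGK lattice units, viscosity relates to the relaxation
    # time via nu = (tau - 0.5) / 3, i.e. tau = 3 * nu + 0.5, so the local
    # tau grows where the resolved strain rate is large.
    return 3.0 * (nu0 + nu_t) + 0.5
```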

Joshi, Abhijit S. [ORNL]; Jain, Prashant K. [ORNL]; Mudrich, Jaime A. [ORNL]; Popov, Emilian L. [ORNL]

2012-01-01

210

Efficient Parallel Algorithm For Direct Numerical Simulation of Turbulent Flows

NASA Technical Reports Server (NTRS)

A distributed algorithm for a high-order-accurate finite-difference approach to the direct numerical simulation (DNS) of transition and turbulence in compressible flows is described. This work has two major objectives. The first objective is to demonstrate that parallel and distributed-memory machines can be successfully and efficiently used to solve computationally intensive and input/output intensive algorithms of the DNS class. The second objective is to show that the computational complexity involved in solving the tridiagonal systems inherent in the DNS algorithm can be reduced by algorithm innovations that obviate the need to use a parallelized tridiagonal solver.

Moitra, Stuti; Gatski, Thomas B.

1997-01-01

211

Numerical simulation of polymer flows: A parallel computing approach

We present a parallel algorithm for the numerical simulation of viscoelastic fluids on distributed memory computers. The algorithm has been implemented within a general-purpose commercial finite element package used in polymer processing applications. Results obtained on the Intel iPSC/860 computer demonstrate high parallel efficiency in complex flow problems. However, since the computational load is unknown a priori, load balancing is a challenging issue. We have developed an adaptive allocation strategy which dynamically reallocates the work load to the processors based upon the history of the computational procedure. We compare the results obtained with the adaptive and static scheduling schemes.

Aggarwal, R.; Keunings, R. [Universite Catholique de Louvain (Belgium)]; Roux, F.X. [O.N.E.R.A., Chatillon (France)]

1993-12-31

212

Parallelization of Program to Optimize Simulated Trajectories (POST3D)

NASA Technical Reports Server (NTRS)

This paper describes the parallelization of the Program to Optimize Simulated Trajectories (POST3D). POST3D uses a gradient-based optimization algorithm that reaches an optimum design point by moving from one design point to the next. The gradient calculations required to complete the optimization process dominate the computational time and have been parallelized using a Single Program Multiple Data (SPMD) model on a distributed-memory NUMA (non-uniform memory access) architecture. The Origin2000 was used for the tests presented.

Hammond, Dana P.; Korte, John J. (Technical Monitor)

2001-01-01

213

Potts-model grain growth simulations: Parallel algorithms and applications

Microstructural morphology and grain boundary properties often control the service properties of engineered materials. This report uses the Potts model to simulate the development of microstructures in realistic materials. Three areas of microstructural morphology simulation were studied: the development of massively parallel algorithms for Potts-model grain growth simulations, modeling of mass transport via diffusion in these simulated microstructures, and the development of a gradient-dependent Hamiltonian to simulate columnar grain growth. Potts grain growth models for massively parallel supercomputers were developed for the conventional Potts model in both two and three dimensions. Simulations using these parallel codes showed self-similar grain growth and no finite-size effects for previously unapproachable large-scale problems. In addition, new enhancements to the conventional Metropolis algorithm used in the Potts model were developed to accelerate the calculations. These techniques enable both the sequential and parallel algorithms to run faster and use an essentially infinite number of grain orientation values to avoid non-physical grain coalescence events. Mass transport phenomena in polycrystalline materials were studied in two dimensions using numerical diffusion techniques on microstructures generated using the Potts model. The results of the mass transport modeling showed excellent quantitative agreement with one-dimensional diffusion problems; however, the results also suggest that transient multi-dimensional diffusion effects cannot be parameterized as the product of the grain boundary diffusion coefficient and the grain boundary width. Instead, both properties are required. Gradient-dependent grain growth mechanisms were included in the Potts model by adding an extra term to the Hamiltonian. Under normal grain growth, the primary driving term is the curvature of the grain boundary, which is included in the standard Potts-model Hamiltonian.
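
The Metropolis dynamics underlying such grain growth simulations can be sketched in one dimension; this minimal zero-temperature Potts sweep is an illustrative reduction, not the report's parallel three-dimensional code:

```python
import random

def boundary_count(spins):
    # Number of unlike nearest-neighbor pairs on a 1D ring: the "grain
    # boundary energy" of the configuration.
    n = len(spins)
    return sum(spins[i] != spins[(i + 1) % n] for i in range(n))

def metropolis_sweep(spins, q, rng):
    # One zero-temperature Metropolis sweep of a 1D q-state Potts model:
    # propose a new orientation at a random site and accept it only if the
    # local boundary energy does not increase. Repeated sweeps coarsen the
    # grain structure, a 1D analogue of curvature-driven grain growth.
    n = len(spins)
    for _ in range(n):
        i = rng.randrange(n)
        new = rng.randrange(q)
        left, right = spins[(i - 1) % n], spins[(i + 1) % n]
        e_old = (spins[i] != left) + (spins[i] != right)
        e_new = (new != left) + (new != right)
        if e_new <= e_old:
            spins[i] = new
    return spins
```

Because an accepted move never raises the local energy, the total boundary count is non-increasing over sweeps, which is the driving property behind coarsening.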

Wright, S.A.; Plimpton, S.J.; Swiler, T.P. [and others]

1997-08-01

214

Parallelizing a DNA simulation code for the Cray MTA-2.

The Cray MTA-2 (Multithreaded Architecture) is an unusual parallel supercomputer that promises ease of use and high performance. We describe our experience on the MTA-2 with a molecular dynamics code, SIMU-MD, that we are using to simulate the translocation of DNA through a nanopore in a silicon based ultrafast sequencer. Our sequencer is constructed using standard VLSI technology and consists of a nanopore surrounded by Field Effect Transistors (FETs). We propose to use the FETs to sense variations in charge as a DNA molecule translocates through the pore and thus differentiate between the four building block nucleotides of DNA. We were able to port SIMU-MD, a serial C code, to the MTA with only a modest effort and with good performance. Our porting process needed neither a parallelism support platform nor attention to the intimate details of parallel programming and interprocessor communication, as would have been the case with more conventional supercomputers. PMID:15838145

Bokhari, Shahid H; Glaser, Matthew A; Jordan, Harry F; Lansac, Yves; Sauer, Jon R; Van Zeghbroeck, Bart

2002-01-01

215

Massively parallelized replica-exchange simulations of polymers on GPUs

NASA Astrophysics Data System (ADS)

We discuss the advantages of parallelization by multithreading on graphics processing units (GPUs) for parallel tempering Monte Carlo computer simulations of an exemplified bead-spring model for homopolymers. Since the sampling of a large ensemble of conformations is a prerequisite for the precise estimation of statistical quantities such as typical indicators for conformational transitions like the peak structure of the specific heat, the advantage of a strong increase in performance of Monte Carlo simulations cannot be overestimated. Employing multithreading and utilizing the massive power of the large number of cores on GPUs, being available in modern but standard graphics cards, we find a rapid increase in efficiency when porting parts of the code from the central processing unit (CPU) to the GPU.
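
The replica-exchange (parallel tempering) step that such GPU codes parallelize reduces to a simple Metropolis criterion for swapping configurations between two inverse temperatures; this sketch is generic, not the paper's CUDA kernel:

```python
import math
import random

def swap_accepted(beta_i, beta_j, e_i, e_j, rng):
    # Parallel-tempering swap criterion: accept the exchange of the two
    # replicas' configurations with probability
    #   min(1, exp((beta_i - beta_j) * (e_i - e_j))).
    delta = (beta_i - beta_j) * (e_i - e_j)
    return delta >= 0 or rng.random() < math.exp(delta)
```

On a GPU, many independent replica pairs evaluate this criterion concurrently, while the costly energy evaluations inside each replica are themselves multithreaded.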

Gross, Jonathan; Janke, Wolfhard; Bachmann, Michael

2011-08-01

216

Optimistic Simulations of Physical Systems using Reverse Computation

Efficient computer simulation of complex physical phenomena has long been challenging due to their multi-physics and multi-scale nature. In contrast to traditional time-stepped execution methods, we describe an approach using optimistic parallel discrete event simulation (PDES) and reverse computation techniques to execute plasma physics codes. We show that reverse computation-based optimistic parallel execution can significantly reduce the execution time of an example plasma simulation without requiring a significant amount of additional memory compared to conservative execution techniques. We describe an application-level reverse computation technique that is efficient and suitable for complex scientific simulations.

Tang, Yarong [Georgia Institute of Technology; Perumalla, Kalyan S [ORNL; Fujimoto, Richard [ORNL; Karimabadi, Dr. Homa [SciberQuest Inc.; Driscoll, Jonathan [SciberQuest Inc.; Omelchenko, Yuri [SciberQuest Inc.

2006-01-01

217

Parallel algorithms for simulating continuous time Markov chains

NASA Technical Reports Server (NTRS)

We have previously shown that the mathematical technique of uniformization can serve as the basis of synchronization for the parallel simulation of continuous-time Markov chains. This paper reviews the basic method and compares five different methods based on uniformization, evaluating their strengths and weaknesses as a function of problem characteristics. The methods vary in their use of optimism, logical aggregation, communication management, and adaptivity. Performance evaluation is conducted on the Intel Touchstone Delta multiprocessor, using up to 256 processors.
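
Uniformization itself is compact enough to sketch for a small chain: pick a rate Λ bounding the diagonal of the generator, convert the CTMC to a DTMC, and sum Poisson-weighted matrix powers. A minimal serial version follows (the paper's contribution is the parallel synchronization built on top of this, which is not shown):

```python
import math

def uniformize(Q, t, p0, n_terms=50):
    # Uniformization: choose Lambda >= max_i |Q[i][i]|, set P = I + Q/Lambda,
    # then the transient distribution is
    #   p(t) = sum_k Poisson(k; Lambda*t) * p0 @ P^k.
    n = len(Q)
    lam = max(-Q[i][i] for i in range(n)) or 1.0
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
         for i in range(n)]
    pk = list(p0)                    # running value of p0 @ P^k
    out = [0.0] * n
    weight = math.exp(-lam * t)      # Poisson(0; lam*t)
    for k in range(n_terms):
        for i in range(n):
            out[i] += weight * pk[i]
        pk = [sum(pk[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= lam * t / (k + 1)  # Poisson recurrence
    return out
```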

Nicol, David M.; Heidelberger, Philip

1992-01-01

218

An automated parallel simulation execution and analysis approach

NASA Astrophysics Data System (ADS)

State-of-the-art simulation computing requirements are continually approaching and then exceeding the performance capabilities of existing computers. This trend remains true even with huge yearly gains in processing power and general computing capabilities; simulation scope and fidelity often increase as well. Accordingly, simulation studies often expend days or weeks executing a single test case. Compounding the problem, stochastic models often require execution of each test case with multiple random number seeds to provide valid results. Many techniques have been developed to improve the performance of simulations without sacrificing model fidelity: optimistic simulation, distributed simulation, parallel multi-processing, and the use of supercomputers such as Beowulf clusters. An approach and prototype toolset have been developed that augment existing optimization techniques to improve multiple-execution timelines. This approach, similar in concept to the SETI@home experiment, makes maximum use of unused licenses and computers, which can be geographically distributed. Using a publish/subscribe architecture, simulation executions are dispatched to distributed machines for execution. Simulation results are then processed, collated, and transferred to a single site for analysis.

Dallaire, Joel D.; Green, David M.; Reaper, Jerome H.

2004-08-01

219

We review and develop techniques to determine associations between series of discrete events. The bootstrap, a nonparametric statistical method, allows the determination of the significance of associations with minimal assumptions about the underlying processes. We find the key requirement for this method: one of the series must be widely spaced in time to guarantee the theoretical applicability of the bootstrap. If this condition is met, the calculated significance passes a reasonableness test. We conclude with some potential future extensions and caveats on the applicability of these methods. The techniques presented have been implemented in a Python-based software toolkit.
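
The bootstrap test described above can be sketched with a simple association measure; the uniform-resampling null hypothesis and all names here are illustrative assumptions, not the authors' Python toolkit API:

```python
import random

def n_close_pairs(a_times, b_times, window):
    # Association measure: number of A events with a B event within +/- window.
    return sum(any(abs(a - b) <= window for b in b_times) for a in a_times)

def bootstrap_p_value(a_times, b_times, window, span, n_boot=500, seed=1):
    # Null hypothesis: B events are placed uniformly at random over the
    # observation span, independently of A. The p-value is the fraction of
    # bootstrap resamples whose association is at least the observed one.
    rng = random.Random(seed)
    observed = n_close_pairs(a_times, b_times, window)
    hits = 0
    for _ in range(n_boot):
        fake_b = [rng.uniform(0, span) for _ in b_times]
        if n_close_pairs(a_times, fake_b, window) >= observed:
            hits += 1
    return hits / n_boot
```

The wide-spacing requirement noted in the abstract corresponds here to the A events being far apart relative to `window`, so that resampled B events rarely produce spurious associations.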

Niehof, Jonathan T.; Morley, Steven K.

2012-01-01

220

Simulation and parallel connection of step-down piezoelectric transformers

NASA Astrophysics Data System (ADS)

Piezoelectric transformers have been used widely in electronic circuits due to advantages such as high efficiency, miniaturization and no flammability; however, their output power has been limited. To overcome this drawback, some research has recently been focused on connections between piezoelectric transformers. With such connected operation, the output power has been improved compared to single operation. Parallel operation of step-down piezoelectric transformers is presented in this paper. An important factor affecting the parallel operation of piezoelectric transformers was the resonance frequency, and a small difference in resonance frequencies was obtained with transformers having the same dimensions and fabrication processes. The piezoelectric transformers were found to operate in the first radial mode at a frequency of 68 kHz. An equivalent circuit was used to investigate parallel driving of piezoelectric transformers and then to compare the result with experimental observations. The electrical characteristics, including the output voltage, output power and efficiency, were measured at a matching resistive load. Effects of frequency on the step-down ratio and of the input voltage on the power properties in the simulation were similar to the experimental results. The output power of the parallel operation was 35 W at a load of 50 Ω and an input voltage of 100 V; the temperature rise was 30 °C and the efficiency was 88%.

Thang, Vo Viet; Kim, Insung; Jeong, Soonjong; Kim, Minsoo; Song, Jaesung

2012-01-01

221

Massively Parallel Processing for Fast and Accurate Stamping Simulations

NASA Astrophysics Data System (ADS)

The competitive automotive market drives automotive manufacturers to speed up vehicle development cycles and reduce lead time. Fast tooling development is one of the key areas supporting fast and short vehicle development programs (VDP). In the past ten years, stamping simulation has become the most effective validation tool for predicting and resolving all potential formability and quality problems before the dies are physically made. Stamping simulation and formability analysis have become a critical business segment in the GM math-based die engineering process. As simulation becomes one of the major production tools in the engineering factory, simulation speed and accuracy are two of the most important measures of stamping simulation technology. The speed and time-in-system of forming analysis become even more critical in supporting fast VDP and tooling readiness. Since 1997, the General Motors Die Center has been working jointly with our software vendor to develop and implement a parallel version of simulation software for mass production analysis applications. By 2001, this technology had matured in the form of distributed memory processing (DMP) of draw die simulations in a networked distributed-memory computing environment. In 2004, this technology was refined to massively parallel processing (MPP) and extended to line die forming analysis (draw, trim, flange, and associated spring-back) running on a dedicated computing environment. The evolution of this technology, the insight gained through the implementation of DMP/MPP technology, and performance benchmarks are discussed in this publication.

Gress, Jeffrey J.; Xu, Siguang; Joshi, Ramesh; Wang, Chuan-tao; Paul, Sabu

2005-08-01

222

Particle simulation of plasmas on the massively parallel processor

NASA Technical Reports Server (NTRS)

Particle simulations, in which collective phenomena in plasmas are studied by following the self consistent motions of many discrete particles, involve several highly repetitive sets of calculations that are readily adaptable to SIMD parallel processing. A fully electromagnetic, relativistic plasma simulation for the massively parallel processor is described. The particle motions are followed in 2 1/2 dimensions on a 128 x 128 grid, with periodic boundary conditions. The two dimensional simulation space is mapped directly onto the processor network; a Fast Fourier Transform is used to solve the field equations. Particle data are stored according to an Eulerian scheme, i.e., the information associated with each particle is moved from one local memory to another as the particle moves across the spatial grid. The method is applied to the study of the nonlinear development of the whistler instability in a magnetospheric plasma model, with an anisotropic electron temperature. The wave distribution function is included as a new diagnostic to allow simulation results to be compared with satellite observations.

Gledhill, I. M. A.; Storey, L. R. O.

1987-01-01

223

LAMMPS (http://lammps.sandia.gov/index.html) stands for Large-scale Atomic/Molecular Massively Parallel Simulator and is a code that can be used to model atoms or, as the LAMMPS website says, to act as a parallel particle simulator at the atomic, meso, or continuum scale. This Sandia-based website provides a long list of animations from large simulations. These were created using different visualization packages to read LAMMPS output, and each one provides the name of the PI and a brief description of the work done or visualization package used. See also the static images produced from simulations at http://lammps.sandia.gov/pictures.html. The foundation paper for LAMMPS is: S. Plimpton, Fast Parallel Algorithms for Short-Range Molecular Dynamics, J Comp Phys, 117, 1-19 (1995), but the website also lists other papers describing contributions to LAMMPS over the years.

Plimpton, Steve; Thompson, Aidan; Crozier, Paul

224

Conservative parallel simulation of priority class queueing networks

NASA Technical Reports Server (NTRS)

A conservative synchronization protocol is described for the parallel simulation of queueing networks having C job priority classes, where a job's class is fixed. This problem has long vexed designers of conservative synchronization protocols because of its seemingly poor ability to compute lookahead: the time of the next departure. A job in service having low priority can be preempted at any time by an arrival having higher priority and an arbitrarily small service time. The solution is to skew the event generation activity so that the events for higher priority jobs are generated farther ahead in simulated time than those for lower priority jobs. Thus, when a lower priority job enters service for the first time, all the higher priority jobs that may preempt it are already known and the job's departure time can be exactly predicted. Finally, the protocol was analyzed and it was demonstrated that good performance can be expected on the simulation of large queueing networks.
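
The exact-prediction claim can be made concrete: once every preempting higher-priority job is known in advance, the low-priority departure time is a deterministic function of those preemptions. A sketch under the assumption that preemptions arrive sorted and do not overlap one another (illustrative, not the paper's protocol code):

```python
def departure_time(start, service, hp_jobs):
    # With every preempting higher-priority job known in advance (their events
    # generated farther ahead in simulated time, as the protocol arranges),
    # a low-priority job's departure is exactly computable: its service is
    # suspended for the duration of each preemption. hp_jobs is a sorted list
    # of (arrival_time, hp_service_time) pairs; preemptions must not overlap.
    t, remaining = start, service
    for arrival, hp_service in hp_jobs:
        if t <= arrival < t + remaining:   # preempts before we would finish
            remaining -= arrival - t       # low-priority work done so far
            t = arrival + hp_service       # resume after the preemption
    return t + remaining
```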

Nicol, David M.

1990-01-01

225

MRISIMUL: a GPU-based parallel approach to MRI simulations.

A new step-by-step comprehensive MR physics simulator (MRISIMUL) of the Bloch equations is presented. The aim was to develop a magnetic resonance imaging (MRI) simulator that makes no assumptions with respect to the underlying pulse sequence and also allows for complex large-scale analysis on a single computer without requiring simplifications of the MRI model. We hypothesized that such a simulation platform could be developed with parallel acceleration of the executable core within the graphics processing unit (GPU) environment. MRISIMUL integrates realistic aspects of the MRI experiment from signal generation to image formation and solves the entire complex problem for densely spaced isochromats and for a densely spaced time axis. The simulation platform was developed in MATLAB, whereas the computationally demanding core services were developed in CUDA-C. The MRISIMUL simulator imaged three different computer models: a user-defined phantom, a human brain model and a human heart model. The high computational power of GPU-based simulations was compared against other computer configurations. A speedup of about 228 times was achieved when compared to serially executed C-code on the CPU, whereas a speedup of between 31 and 115 times was achieved when compared to the OpenMP parallel executed C-code on the CPU, depending on the number of threads used in multithreading (2-8 threads). The high performance of MRISIMUL allows its application in large-scale analysis and can bring the computational power of a supercomputer or a large computer cluster to a single-GPU personal computer. PMID:24595337

Xanthis, Christos G; Venetis, Ioannis E; Chalkias, A V; Aletras, Anthony H

2014-03-01

226

Diagnosis of delay-deadline failures in real time discrete event models.

In this paper a method for fault detection and diagnosis (FDD) of real time systems has been developed. A modeling framework termed as real time discrete event system (RTDES) model is presented and a mechanism for FDD of the same has been developed. The use of RTDES framework for FDD is an extension of the works reported in the discrete event system (DES) literature, which are based on finite state machines (FSM). FDD of RTDES models are suited for real time systems because of their capability of representing timing faults leading to failures in terms of erroneous delays and deadlines, which FSM-based ones cannot address. The concept of measurement restriction of variables is introduced for RTDES and the consequent equivalence of states and indistinguishability of transitions have been characterized. Faults are modeled in terms of an unmeasurable condition variable in the state map. Diagnosability is defined and the procedure of constructing a diagnoser is provided. A checkable property of the diagnoser is shown to be a necessary and sufficient condition for diagnosability. The methodology is illustrated with an example of a hydraulic cylinder. PMID:17559854

Biswas, Santosh; Sarkar, Dipankar; Bhowal, Prodip; Mukhopadhyay, Siddhartha

2007-10-01

227

High Performance Parallel Methods for Space Weather Simulations

NASA Technical Reports Server (NTRS)

This is the final report of our NASA AISRP grant entitled 'High Performance Parallel Methods for Space Weather Simulations'. The main thrust of the proposal was to achieve significant progress towards new high-performance methods which would greatly accelerate global MHD simulations and eventually make it possible to develop first-principles based space weather simulations which run much faster than real time. We are pleased to report that with the help of this award we made major progress in this direction and developed the first parallel implicit global MHD code with adaptive mesh refinement. The main limitation of all earlier global space physics MHD codes was the explicit time stepping algorithm. Explicit time steps are limited by the Courant-Friedrichs-Lewy (CFL) condition, which essentially ensures that no information travels more than a cell size during a time step. This condition represents a non-linear penalty for highly resolved calculations, since finer grid resolution (and consequently smaller computational cells) not only results in more computational cells, but also in smaller time steps.
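The CFL penalty described above can be made concrete: an explicit scheme's time step shrinks in proportion to the cell size, so refining the grid raises cost both through more cells and through more (smaller) steps. A one-line sketch, where the safety factor of 0.5 is an assumed default rather than a value from the report:

```python
def cfl_timestep(dx, max_speed, cfl=0.5):
    """Largest explicit time step allowed by the CFL condition: no
    information may travel more than a cell width per step, so
    dt <= cfl * dx / v_max.  Halving dx also halves dt, which is the
    compounding cost the report attributes to explicit time stepping."""
    return cfl * dx / max_speed
```

Halving dx from 1.0 to 0.5 at a fixed maximum wave speed halves the allowed time step, on top of doubling the cell count per dimension.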

Hunter, Paul (Technical Monitor); Gombosi, Tamas I.

2003-01-01

228

Numerical Simulation of Flow Field Within Parallel Plate Plastometer

NASA Technical Reports Server (NTRS)

The Parallel Plate Plastometer (PPP) is a device commonly used for measuring the viscosity of high polymers at low rates of shear, in the range 10^4 to 10^9 poise. This device is being validated for use in measuring the viscosity of liquid glasses at high temperatures, which have similar ranges of viscosity values. The PPP instrument consists of two similar parallel plates, both about 1 inch in diameter, with the upper plate being movable while the lower one is kept stationary. Load is applied to the upper plate by means of a beam connected to a shaft attached to the upper plate. The viscosity of the fluid is deduced by measuring the variation of the plate separation, h, as a function of time when a specified fixed load is applied on the beam. Operating plate speeds measured with the PPP are usually in the range of 10.3 cm/s or lower. The flow field within the PPP can be simulated using the equations of motion of fluid flow for this configuration. With flow speeds in the range quoted above, the flow field between the two plates is certainly incompressible and laminar. Such flows can be easily simulated using numerical modeling with computational fluid dynamics (CFD) codes. We present below the mathematical model used to simulate this flow field, and also the solutions obtained for the flow using a commercially available finite element CFD code.

Antar, Basil N.

2002-01-01

229

Regional scale hydrologic simulation utilizing cluster-based parallel computing

NASA Astrophysics Data System (ADS)

To conduct regional-scale hydrologic-response/sediment-transport simulation at high resolution, a complex physics-based numerical model, the Integrated Hydrology Model (InHM), is revised utilizing cluster-based parallel computing. The revised InHM (RInHM) divides the simulated area into multiple catchments based upon geomorphologic features and generates boundary-value problems for each catchment to construct simulation tasks, which are then dispatched to different computers to start the simulation. The system takes a pause-integrate-divide-resume routine during the simulation to ensure hydrologic validity, and the routine repeats until the entire simulation period is accomplished. RInHM has been tested on a computer cluster (5 PCs: 1 ‘master’ node and 4 ‘worker’ nodes, using 13 processors in total) to simulate 100 years of hydrologic response and soil erosion for the 117 km² Hawaiian island of Kaho’olawe. The simulation time step ranges from 1.0×10^-6 to 252,000 seconds, and the horizontal/vertical resolution of the discretization is 75-200 m and 0.05 m, respectively, resulting in a 3D irregular mesh of 981,612 nodes and 1,945,778 elements. The efficiency of RInHM is evaluated by comparing the performance of the cluster against a single-processor computer. The results of this study show that it is feasible to conduct regional-scale hydrologic-response and sediment-transport simulation at high resolution without demanding significant computing resources.

Su, D.; Ran, Q.

2010-12-01

230

An understanding of particle transport is necessary to reduce contamination of semiconductor wafers during low-pressure processing. The trajectories of particles in these reactors are determined by external forces (the most important being neutral fluid drag, thermophoresis, electrostatic, viscous ion drag, and gravitational), by Brownian motion (due to neutral and charged gas molecule collisions), and by particle inertia. Gas velocity and temperature fields are also needed for particle transport calculations, but conventional continuum fluid approximations break down at low pressures when the gas mean free path becomes comparable to chamber dimensions. Thus, in this work we use a massively parallel direct simulation Monte Carlo method to calculate low-pressure internal gas flow fields which show temperature jump and velocity slip at the reactor boundaries. Because particle residence times can be short compared to particle response times in these low-pressure systems (for which continuum diffusion theory fails), we solve the Langevin equation using a numerical Lagrangian particle tracking model which includes a fluctuating Brownian force. Because of the need for large numbers of particle trajectories to ensure statistical accuracy, the particle tracking model is also implemented on a massively parallel computer. The particle transport model is validated by comparison to the Ornstein–Fürth theoretical result for the mean square displacement of a cloud of particles. For long times, the particles tend toward a Maxwellian spatial distribution, while at short times, particle spread is controlled by their initial (Maxwellian) velocity distribution. Several simulations using these techniques are presented for particle transport and deposition in a low-pressure, parallel-plate reactor geometry. The corresponding particle collection efficiencies on a wafer for different particle sizes, gas temperature gradients, and gas pressures are evaluated.

Choi, S.J.; Rader, D.J.; Geller, A.S. [Sandia National Laboratories, Engineering Science Center, Albuquerque, New Mexico 87185-0827 (United States)]

1996-03-01

231

Data Parallel Simulation Using Time-Warp on the Connection Machine

A new data parallel simulation technique of Time-Warp on the Connection Machine is presented. Our scheme handles the rollback problem of Time-Warp efficiently, and maximizes data parallelism, where the parallelism is extracted from simultaneous evaluation of different circuit elements. Each event is assigned to one of the processors of the Connection Machine to achieve high data parallelism. A new scheme

Moon Jung Chung; Yunmo Chung

1989-01-01

233

A Partitioning Approach for Parallel Simulation of AC-Radial Shipboard Power Systems

A partitioning approach for the parallel simulation of AC-radial shipboard power systems (SPS) is described: the subsystems are defined using a diakoptics-based approach, and the simulation is parallelized on a multicore computer. A program was developed in C# to conduct multithreaded parallel-sequential simulations of an SPS. The program first...

Uriarte, Fabian Marcel

2011-08-08

234

A parallel algorithm for switch-level timing simulation on a hypercube multiprocessor

NASA Technical Reports Server (NTRS)

The parallel approach to speeding up simulation is studied, specifically the simulation of digital LSI MOS circuitry on the Intel iPSC/2 hypercube. The simulation algorithm is based on RSIM, an event driven switch-level simulator that incorporates a linear transistor model for simulating digital MOS circuits. Parallel processing techniques based on the concepts of Virtual Time and rollback are utilized so that portions of the circuit may be simulated on separate processors, in parallel for as large an increase in speed as possible. A partitioning algorithm is also developed in order to subdivide the circuit for parallel processing.
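The Virtual Time and rollback machinery mentioned above can be illustrated with a toy logical process that snapshots its state after each event and restores an earlier snapshot when a straggler (an event timestamped in its local past) arrives. This is a deliberately minimal Python sketch: it omits the antimessages and re-execution of rolled-back events that a full Time Warp simulator would perform, and all names are hypothetical.

```python
class LogicalProcess:
    """Minimal optimistic-simulation sketch: process events in timestamp
    order as they arrive, snapshot state after each, and roll back to the
    last valid snapshot when a straggler appears."""

    def __init__(self, state):
        self.state = state
        self.lvt = 0                        # local virtual time
        self.history = [(0, dict(state))]   # (time, state snapshot)

    def process(self, t, apply_event):
        if t < self.lvt:                    # straggler: roll back first
            self.rollback(t)
        apply_event(self.state)
        self.lvt = t
        self.history.append((t, dict(self.state)))

    def rollback(self, t):
        # discard snapshots taken after the straggler's timestamp and
        # restore the most recent one at or before it
        while self.history[-1][0] > t:
            self.history.pop()
        self.lvt, snapshot = self.history[-1]
        self.state = dict(snapshot)
```

A real implementation would also cancel the messages sent by the rolled-back events and reprocess them in correct order.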

Rao, Hariprasad Nannapaneni

1989-01-01

235

Parallel Unsteady Turbopump Simulations for Liquid Rocket Engines

NASA Technical Reports Server (NTRS)

This paper reports the progress being made towards a complete turbopump simulation capability for liquid rocket engines. The Space Shuttle Main Engine (SSME) turbopump impeller is used as a test case for the performance evaluation of the MPI and hybrid MPI/OpenMP versions of the INS3D code. A computational model of a turbopump has then been developed for the shuttle upgrade program. Relative motion of the grid system for rotor-stator interaction was obtained by employing overset grid techniques. Time-accuracy of the scheme has been evaluated by using simple test cases. Unsteady computations for the SSME turbopump, which contains 136 zones with 35 million grid points, are currently underway on Origin 2000 systems at NASA Ames Research Center. Results from time-accurate simulations with moving-boundary capability, and the performance of the parallel versions of the code, will be presented in the final paper.

Kiris, Cetin C.; Kwak, Dochan; Chan, William

2000-01-01

236

Financial simulations on a massively parallel Connection Machine

This paper reports on the valuation of complex financial instruments appearing in the banking and insurance industries, which requires simulations of their cashflow behavior in a volatile interest-rate environment. These simulations are complex and computationally intensive. Their use, thus far, has been limited to intra-day analysis and planning. Researchers at the Wharton School and Thinking Machines Corporation have developed model formulations for massively parallel architectures, like the Connection Machine CM-2. A library of financial modeling primitives has been designed and used to implement a model for the valuation of mortgage-backed securities. Analyzing a portfolio of these securities, which would require 2 days on a large mainframe, is carried out in 1 hour on a CM-2a.

Hutchinson, J.M.; Zenios, S.A. (Thinking Machine Corp., Cambridge, MA (US))

1991-01-01

237

Use of Parallel Tempering for the Simulation of Polymer Melts

NASA Astrophysics Data System (ADS)

The parallel tempering algorithm (C. J. Geyer, Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface, 156 (1991)) is based on simulating several systems in parallel, each of which has a slightly different Hamiltonian. The systems are put in equilibrium with each other by stochastic swaps between neighboring Hamiltonians. Previous implementations have mainly focused on the temperature as the control variable. In contrast, we vary the excluded-volume interaction in a continuum bead-spring polymer melt, as has already been done for lattice polymers (Y. Iba, G. Chikenji, M. Kikuchi, J. Phys. Soc. Japan v. 67, 3327 (1998)). The "softest" interactions allow for substantial monomer overlap such that pivot moves become feasible. We have benchmarked the algorithm by comparing it to the chain-breaking algorithm used on the same system. Possible applications of the algorithm include the simulation of polymer systems with complex topologies and combining the method with the Gibbs ensemble technique for the phase behavior of polymer blends.
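The "stochastic swaps between neighboring Hamiltonians" are conventionally a Metropolis acceptance test. The sketch below uses the familiar temperature form; the study described here varies the excluded-volume interaction instead, which would replace the beta*E products with evaluations of the two neighboring Hamiltonians (function and argument names are illustrative):

```python
import math
import random

def attempt_swap(beta_i, beta_j, E_i, E_j, rng=random.random):
    """Metropolis criterion for exchanging configurations between two
    replicas with inverse temperatures beta_i, beta_j and current
    energies E_i, E_j: accept with probability
    min(1, exp((beta_i - beta_j) * (E_i - E_j))).
    Detailed balance is preserved because the swap is its own reverse move."""
    delta = (beta_i - beta_j) * (E_i - E_j)
    return delta >= 0 or rng() < math.exp(delta)
```

In a generalized-ensemble variant, `(beta_i - beta_j) * (E_i - E_j)` becomes the difference of the two Hamiltonians evaluated on the two configurations.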

Bunker, Alex; Duenweg, Burkhard; Theodorou, Doros

2000-03-01

238

NASA Astrophysics Data System (ADS)

A new parallel computing environment, called ``Parallel Molecular Dynamics Stencil'', has been developed to carry out large-scale short-range molecular dynamics simulations of solids. The stencil is written in C using MPI for parallelization and is designed to separate and conceal the parts of the programs describing cutoff schemes and parallel algorithms for data communication. This has been made possible by introducing the concept of image atoms. Therefore, only sequential programming of the force calculation routine is required for executing the stencil in a parallel environment. Typical molecular dynamics routines, such as various ensembles, time integration methods, and empirical potentials, have been implemented in the stencil. In the presentation, the performance of the stencil on parallel computers of Hitachi, IBM, SGI, and a PC-cluster, using Lennard-Jones and EAM-type potentials for a fracture problem, will be reported.

Shimizu, Futoshi; Kimizuka, Hajime; Kaburaki, Hideo

2002-08-01

239

Massively Parallel Simulations of Diffusion in Dense Polymeric Structures

An original computational technique to generate close-to-equilibrium dense polymeric structures is proposed. Diffusion of small gases is studied on the equilibrated structures using massively parallel molecular dynamics simulations running on the Intel Teraflops (9216 Pentium Pro processors) and Intel Paragon (1840 processors). Compared to the current state-of-the-art equilibration methods, this new technique appears to be faster by some orders of magnitude. The main advantage of the technique is that one can circumvent the bottlenecks in configuration space that inhibit relaxation in molecular dynamics simulations. The technique is based on the fact that tetravalent atoms (such as carbon and silicon) fit in the center of a regular tetrahedron and that regular tetrahedrons can be used to mesh three-dimensional space. Thus, the problem of polymer equilibration described by continuous equations in molecular dynamics is reduced to a discrete problem where solutions are approximated by simple algorithms. Practical modeling applications include the construction of butyl rubber and ethylene-propylene-diene-monomer (EPDM) models for oxygen and water diffusion calculations. Butyl and EPDM are used in O-ring systems and serve as sealing joints in many manufactured objects. Diffusion coefficients of small gases have been measured experimentally on both polymeric systems; in general, the diffusion coefficients in EPDM are an order of magnitude larger than in butyl. In order to better understand the diffusion phenomena, 10,000-atom models were generated and equilibrated for butyl and EPDM. The models were submitted to a massively parallel molecular dynamics simulation to monitor the trajectories of the diffusing species.

Faulon, Jean-Loup; Wilcox, R.T. [Sandia National Labs., Albuquerque, NM (United States)]; Hobbs, J.D. [Montana Tech of the Univ. of Montana, Butte, MT (United States). Dept. of Chemistry and Geochemistry]; Ford, D.M. [Texas A and M Univ., College Station, TX (United States). Dept. of Chemical Engineering]

1997-11-01

240

NASA Astrophysics Data System (ADS)

Autonomous sensor networks have the potential for broad applicability to national security, intelligent transportation, industrial production, and environmental and hazardous process control. Distributed sensors may be used for detecting bio-terrorist attacks, for contraband interdiction, border patrol, monitoring building safety and security, battlefield surveillance, or may be embedded in complex dynamic systems for enabling fault-tolerant operations. In this paper we present algorithms and automation tools for constructing discrete event controllers for complex networked systems that restrict the dynamic behavior of the system according to given specifications. In our previous work we have modeled the dynamic system as a discrete event automaton whose open-loop behavior is represented as a language L of strings generated from the alphabet Σ of all possible atomic events that cause state transitions in the network. The controlled behavior is represented by a sublanguage K, contained in L, that restricts the behavior of the system according to the specifications of the controller. We have developed the algebraic structure of controllable sublanguages as perfect right partial ideals that satisfy a precontrollability condition. In this paper we develop an iterative algorithm to take an ad hoc specification described using a natural language, and to formulate a complete specification that results in a controllable sublanguage. A supervisory controller, modeled as an automaton that runs synchronously with the open-loop system in the sense of Ramadge and Wonham, is automatically generated to restrict the behavior of the open-loop system to the controllable sublanguage. A battlefield surveillance scenario illustrates the iterative evolution of ad hoc specifications for controlling an autonomous sensor network and the generation of a controller that reconfigures the sensor network to dynamically adapt to environmental perturbations.

Damiani, Sarah; Griffin, Christopher; Phoha, Shashi

2003-12-01

241

Parallel continuous simulated tempering and its applications in large-scale molecular simulations.

In this paper, we introduce a parallel continuous simulated tempering (PCST) method for enhanced sampling in studying large complex systems. It mainly inherits the continuous simulated tempering (CST) method from our previous studies [C. Zhang and J. Ma, J. Chem. Phys. 130, 194112 (2009); C. Zhang and J. Ma, J. Chem. Phys. 132, 244101 (2010)], while adopting the spirit of parallel tempering (PT), or the replica exchange method, by employing multiple copies with different temperature distributions. Differing from conventional PT methods, the PCST method requires very few copies of the simulation, typically 2-3 copies, despite the large stride of the total temperature range, yet it is still capable of maintaining a high rate of exchange between neighboring copies. Furthermore, in the PCST method, the size of the system does not dramatically affect the number of copies needed, because the exchange rate is independent of the total potential energy, thus providing an enormous advantage over conventional PT methods in studying very large systems. The sampling efficiency of PCST was tested on the two-dimensional Ising model, a Lennard-Jones liquid, and an all-atom folding simulation of the small globular protein trp-cage in explicit solvent. The results demonstrate that the PCST method significantly improves sampling efficiency compared with other methods and is particularly effective in simulating systems with long relaxation or correlation times. We expect the PCST method to be a good alternative to parallel tempering methods in simulating large systems, such as phase transitions and the dynamics of macromolecules in explicit solvent. PMID:25084887

Zang, Tianwu; Yu, Linglin; Zhang, Chong; Ma, Jianpeng

2014-07-28

242

Simulation of Distributed Systems Fernando G. Gonzalez

School of Electrical Engineering, University of Central Florida. Keywords: discrete event simulation, single threaded simulation, discrete event control. This paper ... in a single threaded simulation. It is assumed that the distributed system is described by a collection

Gonzalez, Fernando

243

Ion dynamics at supercritical quasi-parallel shocks: Hybrid simulations

By separating the incident ions into directly transmitted, downstream thermalized, and diffuse ions, we perform one-dimensional (1D) hybrid simulations to investigate ion dynamics at a supercritical quasi-parallel shock. In the simulations, the angle between the upstream magnetic field and the shock nominal direction is θ_Bn = 30°, and the Alfvén Mach number is M_A ≈ 5.5. The shock exhibits a periodic reformation process. Ion reflection occurs at the beginning of the reformation cycle. Part of the reflected ions is trapped between the old and new shock fronts for an extended time period. These particles eventually form superthermal diffuse ions after they escape upstream of the new shock front at the end of the reformation cycle. The other reflected ions may return to the shock immediately or be trapped between the old and new shock fronts for a short time period. When the amplitude of the new shock front exceeds that of the old shock front and the reformation cycle is finished, these ions become thermalized ions in the downstream. No noticeable heating can be found in the directly transmitted ions. The relevance of our simulations to satellite observations is also discussed in the paper.

Su Yanqing; Lu Quanming; Gao Xinliang; Huang Can; Wang Shui [CAS Key Laboratory of Basic Plasma Physics, Department of Geophysics and Planetary Science, University of Science and Technology of China, Hefei 230026 (China)

2012-09-15

244

Parallel Numerical Simulation with MUFTE-UG: Two-Phase Flow Processes in the Subsurface

MUFTE-UG, a parallel numerical simulator for multiphase flow, is introduced. The basic PDEs for two-phase flow are presented together ... using a 2D example. Keywords: parallelisation, numerical simulation, two-phase flow, multigrid methods.

Cirpka, Olaf Arie

245

Pseudorandom number generator for massively parallel molecular-dynamics simulations

A class of uniform pseudorandom number generators is proposed for modeling and simulations on massively parallel computers. The algorithm is simple, nonrecursive, and is easily transported to serial or vector computers. We have tested the procedure for uniformity, independence, and correlations by several methods. Related, less complex sequences passed some of these tests well enough; however, inadequacies were revealed by tests for correlations and in an interesting application, namely, annealing from an initial lattice that is mechanically unstable. In the latter case, initial velocities chosen by a random number generator that is not sufficiently random lead quickly to unphysical regularity in grain structure. The new class of generators passes this dynamical diagnostic for unwanted correlations.
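A nonrecursive generator of the kind described computes the i-th variate directly from (seed, i), so parallel processors can draw independent values without communicating or sharing state. The sketch below uses SplitMix64-style bit mixing as a stand-in; it is not the generator proposed in the paper:

```python
def prng(seed, i):
    """Counter-based uniform generator in the spirit of the paper's
    nonrecursive class: the i-th value is a pure function of (seed, i),
    so streams can be split across processors without any recursion or
    communication.  SplitMix64-style mixing is an assumption here, not
    the paper's actual map."""
    mask = 0xFFFFFFFFFFFFFFFF
    x = (seed + i * 0x9E3779B97F4A7C15) & mask
    x = ((x ^ (x >> 30)) * 0xBF58476D1CE4E5B9) & mask
    x = ((x ^ (x >> 27)) * 0x94D049BB133111EB) & mask
    x ^= x >> 31
    return x / 2**64   # uniform in [0, 1)
```

Because the output is a pure function of (seed, i), any subsequence can be replayed exactly, which also simplifies debugging of massively parallel runs.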

Holian, B.L. (Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545 (United States)); Percus, O.E. (Courant Institute of Mathematical Science, New York University, 251 Mercer Street, New York, New York 10012 (United States)); Warnock, T.T. (Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545 (United States)); Whitlock, P.A. (Computer and Information Sciences Department, Brooklyn College, 2900 Bedford Avenue, Brooklyn, New York 11210 (United States))

1994-08-01

246

Petascale turbulence simulation using a highly parallel fast multipole method

We present a 0.5 Petaflop/s calculation of homogeneous isotropic turbulence in a cube of 2048^3 particles, using a highly parallel fast multipole method (FMM) on 2048 GPUs of the TSUBAME 2.0 system. We compare this particle-based code with a spectral DNS code under the same calculation conditions on the same machine. The results of our particle-based turbulence simulation match quantitatively with those of the spectral method. The calculation time for one time step is approximately 30 seconds for both methods; this result shows that the scalability of the FMM starts to become an advantage over FFT-based methods beyond 2000 GPUs.

Yokota, R; Barba, L A; Yasuoka, K

2011-01-01

247

Parallelizing N-Body Simulations on a Heterogeneous Cluster

NASA Astrophysics Data System (ADS)

This thesis evaluates quantitatively the effectiveness of a new technique for parallelising direct gravitational N-body simulations on a heterogeneous computing cluster. In addition to being an investigation into how a specific computational physics task can be optimally load balanced across the heterogeneity factors of a distributed computing cluster, it is also, more generally, a case study in effective heterogeneous parallelisation of an all-pairs programming task. If high-performance computing clusters are not designed to be heterogeneous initially, they tend to become so over time as new nodes are added, or existing nodes are replaced or upgraded. As a result, effective techniques for application parallelisation on heterogeneous clusters are needed if maximum cluster utilisation is to be achieved and is an active area of research. A custom C/MPI parallel particle-particle N-body simulator was developed, validated and deployed for this evaluation. Simulation communication proceeds over cluster nodes arranged in a logical ring and employs nonblocking message passing to encourage overlap of communication with computation. Redundant calculations arising from force symmetry given by Newton's third law are removed by combining chordal data transfer of accumulated forces with ring passing data transfer. Heterogeneity in node computation speed is addressed by decomposing system data across nodes in proportion to node computation speed, in conjunction with use of evenly sized communication buffers. This scheme is shown experimentally to have some potential in improving simulation performance in comparison with an even decomposition of data across nodes. Techniques for further heterogeneous cluster load balancing are discussed and remain an opportunity for further work.
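The speed-proportional decomposition described above can be sketched as a small allocation routine: each node receives a share of bodies proportional to its measured computation speed, with the rounding remainder handed to the fastest nodes (an illustrative sketch; the thesis's exact policy may differ):

```python
def decompose(n_bodies, speeds):
    """Split n_bodies across cluster nodes in proportion to measured
    node speeds, so faster nodes get more work.  The integer rounding
    remainder is distributed to the fastest nodes first."""
    total = sum(speeds)
    counts = [int(n_bodies * s / total) for s in speeds]
    remainder = n_bodies - sum(counts)
    # hand leftover bodies to the fastest nodes
    fastest_first = sorted(range(len(speeds)), key=lambda i: -speeds[i])
    for idx in fastest_first[:remainder]:
        counts[idx] += 1
    return counts
```

With evenly sized communication buffers, each node then spends roughly the same wall-clock time per force-computation phase of the ring pass.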

Stenborg, T. N.

2009-10-01

248

A discrete-event simulation approach to predict power consumption in machining processes

Whereas in the past the sustainable use of resources and the reduction of waste have mainly been looked at from an ecological point of view, resource efficiency has recently become more and more an issue of cost saving as well. In manufacturing engineering, especially the reduction of the power consumption of machine tools and production facilities is in the focus of industry,

Roland Larek; Ekkard Brinksmeier; Daniel Meyer; Thorsten Pawletta; Olaf Hagendorf

249

A regional transportation system and the movement of large traffic volumes through it are characteristic of stochastic systems. The standard traffic management or transportation planning approach uses a slice-in-time view of the system. Static, mean values of system variables are used as the basis for incident-caused congestion-management decisions. By reason of the highly variable nature of transportation

Roy Brooks Wiley; Thomas K. Keyser

1998-01-01

250

Discrete-event simulation of fluid stochastic Petri nets

The extensions of the FSPN formalism we propose include: fluid impulses associated with both immediate ... impulses are the continuous analogue of ordinary token movements for standard Petri nets, while complete dependency of any behavior (including the guards of immediate transitions) on the entire state

Ciardo, Gianfranco

251

Quantifying supply chain disruption risk using Monte Carlo and discrete-event simulation

We present a model constructed for a large consumer products company to assess their vulnerability to disruption risk and quantify its impact on customer service. Risk profiles for the locations and connections in the ...

Schmitt, Amanda J.

252

Graduate School of Engineering and Management, Air Force Institute of Technology, Wright-Patterson. A stochastic processor allocation algorithm is developed for assigning processes to processors in an effective ... simultaneously across the processors. The efficiency and effectiveness of the distributed system is dependent

Coello, Carlos A. Coello

253

World-class utilization of manufacturing resources is of vital importance to any manufacturing enterprise in the global competition of today. This requirement calls for superior performance of all processes related to the manufacturing of products. One of these processes is the resetting of machinery and equipment between two product runs, which will be the focus area of this text. This paper

Björn Johansson; Jürgen Kaiser

2002-01-01

254

A discrete event simulation model for unstructured supervisory control of unmanned vehicles

Most current Unmanned Vehicle (UV) systems consist of teams of operators controlling a single UV. Technological advances will likely lead to the inversion of this ratio, and automation of low level tasking. These advances ...

McDonald, Anthony D. (Anthony Douglas)

2010-01-01

255

Can discrete event simulation be of use in modelling major depression?

BACKGROUND: Depression is among the major contributors to worldwide disease burden and adequate modelling requires a framework designed to depict real world disease progression as well as its economic implications as closely as possible. OBJECTIVES: In light of the specific characteristics associated with depression (multiple episodes at varying intervals, impact of disease history on course of illness, sociodemographic factors), our

Agathe Le Lay; Nicolas Despiegel; Clément François; Gérard Duru

2006-01-01

256

Analysis of a hospital network transportation system with discrete event simulation

VA New England Healthcare System (VISN1) provides transportation to veterans between eight medical centers and over 35 Community Based Outpatient Clinics across New England. Due to high variation in its geographic area, ...

Kwon, Annie Y. (Annie Yean)

2011-01-01

257

DIAGNOSIS OF DISCRETE EVENT SYSTEMS USING TIMED AUTOMATA. Zineb Simeu-Abazi, Maria Di Mascolo. This paper proposes an effective way for diagnosis of discrete-event systems by using a timed ... the diagnosis path in the timed automaton. The proposed approach could be applied to dynamical systems, where

Paris-Sud XI, Université de

258

Simulation of the charge transfer inefficiency of column parallel CCDs

NASA Astrophysics Data System (ADS)

Charge-Coupled Devices (CCDs) have been successfully used in several high-energy physics experiments over the past two decades. Their high spatial resolution and thin sensitive layer make them an excellent tool for studying short-lived particles. The Linear Collider Flavour Identification (LCFI) collaboration is developing Column Parallel CCDs (CPCCDs) for the vertex detector of the International Linear Collider (ILC). The CPCCDs can be read out many times faster than standard CCDs, significantly increasing their operating speed. The results of detailed simulations of the Charge Transfer Inefficiency (CTI) of a prototype CPCCD chip are reported. The effects of the radiation damage on the CTI of a Si-based CCD particle detector are studied by simulating the effects of two electron trap levels Ec-0.17 and Ec-0.44 eV at different concentrations and operating temperatures. The dependence of the CTI on different occupancy levels (percentage of hit pixels) and readout frequencies is also studied. The optimal operating temperature—where the effects of the trapping are at a minimum—is found to be ˜230 K for the range of readout speeds proposed for the ILC.

Maneuski, Dzmitry

2008-06-01

259

Parallel Simulation of Electron-Solid Interactions: Electron Microscopy Modeling

Parallel Computational Sciences Department; Materials and Process Sciences Center. ... the sample composition [Michael, et al. 1990]. A simpler, yet effective method for accomplishing ... composition data and to characterize electron microscope performance are briefly highlighted.

Plimpton, Steve

260

Parallel computation for reservoir thermal simulation of multicomponent and multiphase fluid flow

NASA Astrophysics Data System (ADS)

We consider parallel computing technology for the thermal simulation of multicomponent, multiphase fluid flow in petroleum reservoirs. This paper reports the development and applications of a parallel thermal recovery simulation code. This code utilizes the message passing interface (MPI) library, overlapping domain decomposition, and dynamic memory allocation techniques. Its efficiency is investigated through simulation of two three-dimensional multicomponent, multiphase field models for heavy oil crudes. Numerical results for these two simulation models indicate that this parallel code can significantly improve capacity and efficiency for large-scale thermal simulations.
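The overlapping domain-decomposition idea described above can be illustrated with a toy example. The sketch below is a serial stand-in (plain Python, with hypothetical names) for the MPI halo exchange such a code performs: each subdomain carries one overlap cell per side, and boundary values are copied between subdomains before every explicit update of a 1-D heat equation.

```python
# Illustrative sketch: overlapping (one-cell halo) domain decomposition
# for an explicit 1-D heat equation. Serial stand-in for the MPI halo
# exchange described in the abstract; all names are hypothetical.

def step_subdomain(u, alpha=0.25):
    """One explicit diffusion update on the interior points of a
    subdomain; the two endpoint cells (physical boundary or halo)
    are held fixed."""
    return [u[0]] + [
        u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
        for i in range(1, len(u) - 1)
    ] + [u[-1]]

def exchange_halos(left, right):
    """Copy interface values into the neighbour's halo cells
    (what MPI send/recv would do across ranks)."""
    left[-1] = right[1]   # left's right halo <- right's first owned cell
    right[0] = left[-2]   # right's left halo <- left's last owned cell

def run(n_steps=50):
    # Global domain of 10 cells (hot spot at the interface), split into
    # two overlapping subdomains of 6 cells each (one halo per side).
    left = [0.0] * 6
    right = [0.0] * 6
    left[-2] = right[1] = 100.0   # the two interface cells start hot
    for _ in range(n_steps):
        exchange_halos(left, right)
        left = step_subdomain(left)
        right = step_subdomain(right)
    return left, right
```

Because the halos are refreshed before every step, the decomposed run reproduces an undecomposed solve on the full domain exactly, which is the property that makes the decomposition transparent to the numerics.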

Ma, Yuanle; Chen, Zhangxin

2004-11-01

261

The Concept of Residuals for Fault Localization in Discrete Event Systems (Matthias Roth et al., Cachan, France)

... faults. The second possibility uses models of fault-free system behavior only. A prominent example ... Keywords: discrete event systems, fault detection, fault ...

Paris-Sud XI, Université de

262

Sensor Configuration Selection for Discrete-Event Systems under Unreliable Observations

Algorithms for counting the occurrences of special events in the framework of partially-observed discrete event dynamical systems (DEDS) were developed in previous work. Their performances typically become better as the sensors providing the observations become more costly or increase in number. This paper addresses the problem of finding a sensor configuration that achieves an optimal balance between cost and the performance of the special event counting algorithm, while satisfying given observability requirements and constraints. Since this problem is generally computationally hard in the framework considered, a sensor optimization algorithm is developed using two greedy heuristics, one myopic and the other based on projected performances of candidate sensors. The two heuristics are sequentially executed in order to find the best sensor configurations. The developed algorithm is then applied to a sensor optimization problem for a multiunit-operation system. Results show that improved sensor configurations can be found that may significantly reduce the sensor configuration cost but still yield acceptable performance for counting the occurrences of special events.
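The myopic greedy heuristic mentioned above can be sketched generically. The code below is an illustrative sketch, not the paper's algorithm: the costs and the coverage-based performance function are hypothetical placeholders. Sensors are added in order of marginal performance gain per unit cost until the budget is exhausted.

```python
# Illustrative sketch of a myopic greedy heuristic for sensor
# configuration selection: repeatedly add the sensor with the best
# marginal performance-per-cost until the budget runs out.
# Costs and the performance model are hypothetical, not the paper's.

def greedy_select(sensors, perf, budget):
    """sensors: {name: cost}; perf(config) -> score (higher is better)."""
    config, spent = set(), 0.0
    while True:
        best, best_gain = None, 0.0
        for name, cost in sensors.items():
            if name in config or spent + cost > budget:
                continue
            gain = (perf(config | {name}) - perf(config)) / cost
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:
            return config          # nothing affordable improves the score
        config.add(best)
        spent += sensors[best]

# Hypothetical example: diminishing-returns performance model.
sensors = {"s1": 1.0, "s2": 2.0, "s3": 4.0}
coverage = {"s1": 0.5, "s2": 0.7, "s3": 0.9}

def perf(config):
    # probability that at least one selected sensor observes the event
    miss = 1.0
    for s in config:
        miss *= 1.0 - coverage[s]
    return 1.0 - miss
```

With a budget of 3.0 this picks s1 (best gain per cost) and then s2; s3 is never affordable, illustrating how a greedy pass trades observation performance against configuration cost.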

Wen-Chiao Lin; Tae-Sic Yoo; Humberto E. Garcia

2010-08-01

263

A performance study of the hypercube parallel processor architecture

This paper investigates the relationship between workload characteristics and process speedup obtainable on a hypercube parallel processor architecture. There were two goals: first was to determine the functional relationship between workload characteristics and speedup, and second was to show how simulation could be used to model the concurrently executing process to allow estimation of such a relation. The hypercube implementation used in this study was a packet-switched network with predetermined routing and a balanced computational workload. Three independent variables were controlled: total computational workload, number of processors and the message traffic load. A benchmark program was used to estimate the fundamental timing models and to validate a discrete event simulation. Results of this study are useful to software designers seeking to predict the degree of performance improvement attainable on a hypercube class machine. The methodology and results can be extended to other parallel processing architectures.
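A minimal example of the kind of functional relationship such a study estimates is given below. The model is hypothetical, not the paper's: each of p hypercube nodes performs an equal share of the computation plus a per-neighbour message cost that grows with the hypercube dimension log2(p).

```python
# Illustrative sketch of a speedup relation for a balanced workload on a
# hypercube: each of the p nodes has log2(p) neighbours, so communication
# overhead grows with the cube dimension. Cost model is hypothetical.

import math

def estimated_speedup(work, p, msg_cost):
    """work: total sequential compute time; each of p nodes does work/p
    compute plus msg_cost per neighbour (log2(p) neighbours)."""
    t_parallel = work / p + msg_cost * math.log2(p)
    return work / t_parallel
```

The sketch reproduces the qualitative finding such studies report: speedup approaches p only while the message traffic load stays small relative to the per-node computation.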

Lamanna, C.A. (Air Force Inst. of Tech., Wright-Patterson AFB, OH (United States)); Shaw, W.H. Jr. (Florida Inst. of Tech., Melbourne, FL (United States))

1991-03-01

264

Computer simulation machining a 3D free surface by using a 3-RPRU parallel machine tool

A novel computer aided geometric approach is proposed to machine a 3D free surface by adopting a 3-RPRU parallel machine tool. Based on the geometry constraint and dimension-driving technique, a 3-RPRU parallel simulation mechanism is first created. Next, a 3D free surface and a guiding plane of tool path are constituted. Finally, the 3-RPRU parallel simulation mechanism, the 3D free ...

Yi Lu; Jiayin Xu

2007-01-01

265

Parallel Computation in Simulating Diffusion and Deformation in Human Brain

... in the development of improved approaches for the reconstruction of nerve fiber tracts in the human brain ... of parallel and high performance computation in simulating the diffusion process in the human brain ...

Zhang, Jun

266

Exploration of Cancellation Strategies for Parallel Simulation on Multi-core Beowulf Clusters

Exploration of cancellation strategies on multi-core Beowulf clusters with both shared memory and distributed memory ... and dynamic cancellation strategies for parallel simulation on multi-core Beowulf clusters in a multi- ...

Wilsey, Philip A.

267

Parallel climate model (PCM) control and transient simulations

NASA Astrophysics Data System (ADS)

The Department of Energy (DOE) supported Parallel Climate Model (PCM) makes use of the NCAR Community Climate Model (CCM3) and Land Surface Model (LSM) for the atmospheric and land surface components, respectively, the DOE Los Alamos National Laboratory Parallel Ocean Program (POP) for the ocean component, and the Naval Postgraduate School sea-ice model. The PCM executes on several distributed and shared memory computer systems. The coupling method is similar to that used in the NCAR Climate System Model (CSM) in that a flux coupler ties the components together, with interpolations between the different grids of the component models. Flux adjustments are not used in the PCM. The ocean component has 2/3° average horizontal grid spacing with 32 vertical levels and a free surface that allows calculation of sea level changes. Near the equator, the grid spacing is approximately 1/2° in latitude to better capture the ocean equatorial dynamics. The North Pole is rotated over northern North America thus producing resolution smaller than 2/3° in the North Atlantic where the sinking part of the world conveyor circulation largely takes place. Because this ocean model component does not have a computational point at the North Pole, the Arctic Ocean circulation systems are more realistic and similar to the observed. The elastic viscous plastic sea ice model has a grid spacing of 27 km to represent small-scale features such as ice transport through the Canadian Archipelago and the East Greenland current region. Results from a 300-year present-day coupled climate control simulation are presented, as well as for a transient 1% per year compound CO2 increase experiment which shows a global warming of 1.27°C for a 10-year average at the doubling point of CO2 and 2.89°C at the quadrupling point. There is a gradual warming beyond the doubling and quadrupling points with CO2 held constant.
Globally averaged sea level rise at the time of CO2 doubling is approximately 7 cm and at the time of quadrupling it is 23 cm. Some of the regional sea level changes are larger and reflect the adjustments in the temperature, salinity, internal ocean dynamics, surface heat flux, and wind stress on the ocean. A 0.5% per year CO2 increase experiment also was performed showing a global warming of 1.5°C around the time of CO2 doubling and a similar warming pattern to the 1% CO2 per year increase experiment. El Niño and La Niña events in the tropical Pacific show approximately the observed frequency distribution and amplitude, which leads to near observed levels of variability on interannual time scales.

Washington, W. M.; Weatherly, J. W.; Meehl, G. A.; Semtner, A. J., Jr.; Bettge, T. W.; Craig, A. P.; Strand, W. G., Jr.; Arblaster, J.; Wayland, V. B.; James, R.; Zhang, Y.

268

Parallel Vehicular Traffic Simulation using Reverse Computation-based Optimistic Execution

Vehicular traffic simulations are useful in applications such as emergency management and homeland security planning tools. High speed of traffic simulations translates directly to speed of response and level of resilience in those applications. Here, a parallel traffic simulation approach is presented that is aimed at reducing the time for simulating emergency vehicular traffic scenarios. Three unique aspects of this effort are: (1) exploration of optimistic simulation applied to vehicular traffic simulation (2) addressing reverse computation challenges specific to optimistic vehicular traffic simulation (3) achieving absolute (as opposed to self-relative) speedup with a sequential speed equal to that of a fast, de facto standard sequential simulator for emergency traffic. The design and development of the parallel simulation system is presented, along with a performance study that demonstrates excellent sequential performance as well as parallel performance.

Yoginath, Srikanth B [ORNL; Perumalla, Kalyan S [ORNL

2008-01-01

269

Humans can integrate feedback of discrete events in their sensorimotor control of a robotic hand.

Providing functionally effective sensory feedback to users of prosthetics is a largely unsolved challenge. Traditional solutions require high band-widths for providing feedback for the control of manipulation and yet have been largely unsuccessful. In this study, we have explored a strategy that relies on temporally discrete sensory feedback that is technically simple to provide. According to the Discrete Event-driven Sensory feedback Control (DESC) policy, motor tasks in humans are organized in phases delimited by means of sensory encoded discrete mechanical events. To explore the applicability of DESC for control, we designed a paradigm in which healthy humans operated an artificial robot hand to lift and replace an instrumented object, a task that can readily be learned and mastered under visual control. Assuming that the central nervous system of humans naturally organizes motor tasks based on a strategy akin to DESC, we delivered short-lasting vibrotactile feedback related to events that are known to forcefully affect progression of the grasp-lift-and-hold task. After training, we determined whether the artificial feedback had been integrated with the sensorimotor control by introducing short delays and we indeed observed that the participants significantly delayed subsequent phases of the task. This study thus gives support to the DESC policy hypothesis. Moreover, it demonstrates that humans can integrate temporally discrete sensory feedback while controlling an artificial hand and invites further studies in which inexpensive, noninvasive technology could be used in clever ways to provide physiologically appropriate sensory feedback in upper limb prosthetics with much lower band-width requirements than with traditional solutions. PMID:24992899

Cipriani, Christian; Segil, Jacob L; Clemente, Francesco; Ff Weir, Richard F; Edin, Benoni

2014-11-01

270

Discrete Event Dyn Syst (2010) 20:3-17, DOI 10.1007/s10626-009-0078-3

On-Line Policy Gradient ... For a Markov chain X, we have the following Poisson equation: (I - P)g + e = f (Eq. 2), where I is the M × M identity matrix. (Address fragments: ... Water Bay, Kowloon, Hong Kong; e-mail: whylyj@ustc.edu; present address: Y.-J. Li, Division of Control ...)

Cao, Xiren

271

Parallel computing in enterprise modeling.

This report presents the results of our efforts to apply high-performance computing to entity-based simulations with a multi-use plugin for parallel computing. We use the term 'Entity-based simulation' to describe a class of simulation which includes both discrete event simulation and agent based simulation. What simulations of this class share, and what differs from more traditional models, is that the result sought is emergent from a large number of contributing entities. Logistic, economic and social simulations are members of this class where things or people are organized or self-organize to produce a solution. Entity-based problems never have an a priori ergodic principle that will greatly simplify calculations. Because the results of entity-based simulations can only be realized at scale, scalable computing is de rigueur for large problems. Having said that, the absence of a spatial organizing principle makes the decomposition of the problem onto processors problematic. In addition, practitioners in this domain commonly use the Java programming language which presents its own problems in a high-performance setting. The plugin we have developed, called the Parallel Particle Data Model, overcomes both of these obstacles and is now being used by two Sandia frameworks: the Decision Analysis Center, and the Seldon social simulation facility. While the ability to engage U.S.-sized problems is now available to the Decision Analysis Center, this plugin is central to the success of Seldon. Because Seldon relies on computationally intensive cognitive sub-models, this work is necessary to achieve the scale necessary for realistic results. With the recent upheavals in the financial markets, and the inscrutability of terrorist activity, this simulation domain will likely need a capability with ever greater fidelity. High-performance computing will play an important part in enabling that greater fidelity.

Goldsby, Michael E.; Armstrong, Robert C.; Shneider, Max S.; Vanderveen, Keith; Ray, Jaideep; Heath, Zach; Allan, Benjamin A.

2008-08-01

272

Parallel climate model (PCM) control and transient simulations

The Department of Energy (DOE) supported Parallel Climate Model (PCM) makes use of the NCAR Community Climate Model (CCM3) and Land Surface Model (LSM) for the atmospheric and land surface components, respectively, the DOE Los Alamos National Laboratory Parallel Ocean Program (POP) for the ocean component, and the Naval Postgraduate School sea-ice model. The PCM executes on several distributed and ...

W. M. Washington; J. W. Weatherly; G. A. Meehl; A. J. Semtner Jr.; T. W. Bettge; A. P. Craig; W. G. Strand Jr.; J. M. Arblaster; V. B. Wayland; R. James; Y. Zhang

2000-01-01

273

Simulation of Earthquake Liquefaction Response on Parallel Computers

This paper presents a parallel nonlinear finite element program, ParCYCLIC, which is designed for the analysis of cyclic seismically-induced liquefaction problems. Key elements of the computational strategy employed in ParCYCLIC include the deployment of an automatic domain decomposer, the use of the multilevel nested dissection algorithm for the ordering of finite element nodes, and the development of a parallel ...

Jun Peng; Jinchi Lu; Kincho H. Law; Ahmed Elgamal

274

Parallel Monte-Carlo Tree Search with Simulation Servers

Monte-Carlo tree search is a new best-first tree search algorithm that triggered a revolution in the computer Go world. Developing good parallel Monte-Carlo tree search algorithms is important because single-processor performance cannot be expected to increase as it used to. A novel parallel Monte-Carlo tree search algorithm is proposed. A tree searcher runs on a client computer and multiple Monte-Carlo ...

Hideki Kato; Ikuo Takeuchi

2010-01-01

275

xSim: The Extreme-Scale Simulator

Investigating parallel application performance properties at scale is becoming an important part of high-performance computing (HPC) application development and deployment. The Extreme-scale Simulator (xSim) is a performance investigation toolkit that permits running an application in a controlled environment at extreme scale without the need for a respective extreme-scale HPC system. Using a lightweight parallel discrete event simulation, xSim executes a parallel application with a virtual wall clock time, such that performance data can be extracted based on a processor model and a network model. This paper presents significant enhancements to the xSim toolkit prototype that provide a more complete Message Passing Interface (MPI) support and improve its versatility. These enhancements include full virtual MPI group, communicator and collective communication support, and global variables support. The new capabilities are demonstrated by executing the entire NAS Parallel Benchmark suite in a simulated HPC environment.
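The virtual-wall-clock idea can be sketched in a few lines. The following is a toy model, not xSim's implementation: each simulated rank is charged virtual time from a processor model for compute steps and from a network model for messages, so the maximum rank clock approximates the application's simulated runtime. The cost constants are hypothetical.

```python
# Toy sketch of virtual wall-clock accounting in the spirit of a
# lightweight PDES performance simulator: ranks alternate compute and a
# ring exchange; a receive cannot complete before its message arrives.
# COMPUTE_COST and MSG_LATENCY stand in for processor/network models.

COMPUTE_COST = 1.0   # virtual seconds per compute step (processor model)
MSG_LATENCY = 0.5    # virtual seconds per message (network model)

def simulate(n_ranks, n_steps):
    clock = [0.0] * n_ranks      # per-rank virtual clocks
    arrival = [0.0] * n_ranks    # when the next message reaches each rank
    for _ in range(n_steps):
        for r in range(n_ranks):
            clock[r] += COMPUTE_COST                  # charge compute time
            nxt = (r + 1) % n_ranks                   # send to ring neighbour
            arrival[nxt] = max(arrival[nxt], clock[r] + MSG_LATENCY)
        for r in range(n_ranks):
            clock[r] = max(clock[r], arrival[r])      # wait for the message
    return max(clock)   # virtual wall-clock time of the slowest rank
```

Under this model every step costs compute plus latency, so performance data can be read off without running on a real machine at scale, which is the point of the virtual-clock approach.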

Boehm, Swen [ORNL; Engelmann, Christian [ORNL

2011-01-01

276

The lattice Boltzmann method has proven to be a promising method to simulate flow in porous media. Its practical application often relies on parallel computation because of the demand for a large domain and fine grid resolution to adequately resolve pore heterogeneity. The existing domain-decomposition methods for parallel computation usually decompose a domain into a number of subdomains first and

Junye Wang; Xiaoxian Zhang; Anthony G. Bengough; John W. Crawford

2005-01-01

277

A parallel implementation of the Cellular Potts Model for simulation of cell-based morphogenesis

The Cellular Potts Model (CPM) has been used in a wide variety of biological simulations. However, most current CPM implementations use a sequential modified Metropolis algorithm which restricts the size of simulations. In this paper we present a parallel CPM algorithm for simulations of morphogenesis, which includes cell-cell adhesion, a cell volume constraint, and cell haptotaxis. The algorithm uses ...

Nan Chen; James A. Glazier; Jesús A. Izaguirre; Mark S. Alber

2007-01-01

278

Parallel-Platform Based Numerical Simulation of Instabilities in Nanoscale Tunneling Devices

... a physics-based simulator. The results were obtained from a numerical implementation of the Wigner ... simulation tool will allow for the detailed study of RTS devices coupled to circuits where numerical ...

280

A parallel implicit method for the direct numerical simulation of wall-bounded compressible turbulence

... order accurate implicit temporal numerical scheme for the direct numerical simulation of turbulent flows. The numerical simulation results are compared with the results given by explicit Runge-Kutta schemes ...

Martín, Pino

281

Computer simulation program for parallel SITAN. [Sandia Inertia Terrain-Aided Navigation, in FORTRAN

This computer program simulates the operation of parallel SITAN using digitized terrain data. An actual trajectory is modeled including the effects of inertial navigation errors and radar altimeter measurements.

Andreas, R.D.; Sheives, T.C.

1980-11-01

282

Parallelization of particle-in-cell simulation modeling Hall-effect thrusters

MIT's fully kinetic particle-in-cell Hall thruster simulation is adapted for use on parallel clusters of computers. Significant computational savings are thus realized with a predicted linear speed up efficiency for certain ...

Fox, Justin M., 1981-

2005-01-01

283

Parallel Simulation of Subsonic Fluid Dynamics on a Cluster of Workstations

An effective approach of simulating fluid dynamics on a cluster of non- dedicated workstations is presented. The approach uses local interaction algorithms, small communication capacity, and automatic migration of parallel ...

Skordos, Panayotis A.

1995-12-01

284

Air shower simulation using GEANT4 and commodity parallel computing

We present an evaluation of a simulated cosmic ray shower, based on GEANT4 and TOP-C, which tracks all the particles in the shower. TOP-C (Task Oriented Parallel C) provides a framework for parallel algorithm development which makes tractable the problem of following each particle. This method is compared with a simulation program which employs the Hillas thinning algorithm.

L. A. Anchordoqui; G. Cooperman; V. Grinberg; T. P. McCauley; T. Paul; S. Reucroft; J. D. Swain; G. Alverson

2000-06-09

285

Parallel Algorithms for Time and Frequency Domain Circuit Simulation

As a most critical form of pre-silicon verification, transistor-level circuit simulation is an indispensable step before committing to an expensive manufacturing process. However, considering the nature of circuit simulation, it can...

Dong, Wei

2010-10-12

286

Asynchronous distributed simulation via a sequence of parallel computations

An approach to carrying out asynchronous, distributed simulation on multiprocessor messagepassing architectures is presented. This scheme differs from other distributed simulation schemes because (1) the amount of memory required by all processors together is bounded and is no more than the amount required in sequential simulation and (2) the multiprocessor network is allowed to deadlock, the deadlock is detected, and
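The conservative safety rule underlying the Chandy-Misra approach can be illustrated as follows. This is a simplified sketch, not the paper's algorithm: a logical process may only consume events whose timestamps do not exceed the smallest head-of-queue timestamp across all its input channels, and an empty channel blocks progress, which is how the deadlocks the abstract refers to arise (the full scheme detects and breaks them).

```python
# Simplified sketch of the conservative (Chandy-Misra style) safety
# rule for one logical process. Not the paper's algorithm: the real
# scheme also detects and recovers from network-wide deadlock.

def safe_events(input_queues):
    """input_queues: {channel: [(timestamp, event), ...]} sorted by time.
    Returns events safe to process in timestamp order, or None when an
    empty channel blocks progress (potential deadlock)."""
    if any(not q for q in input_queues.values()):
        return None   # cannot bound future arrivals on an empty channel
    # horizon: smallest head-of-queue timestamp across all channels;
    # no channel can later deliver anything earlier than this.
    horizon = min(q[0][0] for q in input_queues.values())
    safe = []
    for q in input_queues.values():
        while q and q[0][0] <= horizon:
            safe.append(q.pop(0))
    return sorted(safe)
```

Processing only up to the horizon preserves causality without a global event list; the price, as the abstract notes, is that the network is allowed to deadlock and must be unblocked by a separate mechanism.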

K. Mani Chandy; Jayadev Misra

1981-01-01

287

Xyce parallel electronic simulator users' guide, Version 6.0.1.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.

Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Verley, Jason C.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Warrender, Christina E.; Baur, David Gregory [Raytheon, Albuquerque, NM]

2014-01-01

288

PROTEUS: A High-Performance Parallel-Architecture Simulator

Proteus is a high-performance simulator for MIMD multiprocessors. It is fast, accurate, and flexible: it is one to two orders of magnitude faster than comparable simulators, it can reproduce results from real multiprocessors, and it is easily configured to simulate a wide range of architectures. Proteus provides a modular structure that simplifies customization and independent replacement of parts of the architecture. There are typically multiple ...

Eric A. Brewer; Chrysanthos N. Dellarocas; Adrian Colbrook; William E. Weihl

1992-01-01

289

Parallel kinetic Monte Carlo simulations of Ag(111) island coarsening using a large database

NASA Astrophysics Data System (ADS)

The results of parallel kinetic Monte Carlo (KMC) simulations of the room-temperature coarsening of Ag(111) islands carried out using a very large database obtained via self-learning KMC simulations are presented. Our results indicate that, while cluster diffusion and coalescence play an important role for small clusters and at very early times, at late time the coarsening proceeds via Ostwald ripening, i.e. large clusters grow while small clusters evaporate. In addition, an asymptotic analysis of our results for the average island size S(t) as a function of time t leads to a coarsening exponent n = 1/3 (where S(t) ~ t^(2n)), in good agreement with theoretical predictions. However, by comparing with simulations without concerted (multi-atom) moves, we also find that the inclusion of such moves significantly increases the average island size. Somewhat surprisingly we also find that, while the average island size increases during coarsening, the scaled island-size distribution does not change significantly. Our simulations were carried out both as a test of, and as an application of, a variety of different algorithms for parallel kinetic Monte Carlo including the recently developed optimistic synchronous relaxation (OSR) algorithm as well as the semi-rigorous synchronous sublattice (SL) algorithm. A variation of the OSR algorithm corresponding to optimistic synchronous relaxation with pseudo-rollback (OSRPR) is also proposed along with a method for improving the parallel efficiency and reducing the number of boundary events via dynamic boundary allocation (DBA). A variety of other methods for enhancing the efficiency of our simulations are also discussed. We note that, because of the relatively high temperature of our simulations, as well as the large range of energy barriers (ranging from 0.05 to 0.8 eV), developing an efficient algorithm for parallel KMC and/or SLKMC simulations is particularly challenging.
However, by using DBA to minimize the number of boundary events, we have achieved significantly improved parallel efficiencies for the OSRPR and SL algorithms. Finally, we note that, among the three parallel algorithms which we have tested here, the semi-rigorous SL algorithm with DBA led to the highest parallel efficiencies. As a result, we have obtained reasonable parallel efficiencies in our simulations of room-temperature Ag(111) island coarsening for a small number of processors (e.g. Np = 2 and 4). Since the SL algorithm scales with system size for fixed processor size, we expect that comparable and/or even larger parallel efficiencies should be possible for parallel KMC and/or SLKMC simulations of larger systems with larger numbers of processors.
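The serial kernel that these parallel algorithms coordinate is a rejection-free (BKL/Gillespie-style) kinetic Monte Carlo step, sketched below. The event names and rates here are arbitrary illustrations, not the Ag(111) barriers used in the paper.

```python
# Illustrative sketch of one rejection-free kinetic Monte Carlo step
# (BKL/Gillespie style): pick an event with probability proportional to
# its rate, then advance the clock by an exponentially distributed dt.
# Event names and rates are arbitrary, not the paper's Ag(111) barriers.

import math
import random

def kmc_step(rates, rng):
    """rates: {event: rate}; rng: random.Random. Returns (event, dt)."""
    total = sum(rates.values())
    r = rng.random() * total          # uniform point on the rate ladder
    acc = 0.0
    for event, rate in rates.items():
        acc += rate
        if r < acc:
            chosen = event
            break
    dt = -math.log(1.0 - rng.random()) / total   # time to next event
    return chosen, dt
```

Parallel schemes such as OSR or the synchronous sublattice algorithm run this kernel independently on subdomains and then reconcile (or roll back) events near the boundaries, which is why minimizing boundary events, e.g. via DBA, improves efficiency.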

Nandipati, Giridhar; Shim, Yunsic; Amar, Jacques G.; Karim, Altaf; Kara, Abdelkader; Rahman, Talat S.; Trushin, Oleg

2009-02-01

290

A parallel finite element simulator for ion transport through three-dimensional ion channel systems.

A parallel finite element simulator, ichannel, is developed for ion transport through three-dimensional ion channel systems that consist of protein and membrane. The coordinates of heavy atoms of the protein are taken from the Protein Data Bank and the membrane is represented as a slab. The simulator contains two components: a parallel adaptive finite element solver for a set of Poisson-Nernst-Planck (PNP) equations that describe the electrodiffusion process of ion transport, and a mesh generation tool chain for ion channel systems, which is an essential component for the finite element computations. The finite element method has advantages in modeling irregular geometries and complex boundary conditions. We have built a tool chain to get the surface and volume mesh for ion channel systems, which consists of a set of mesh generation tools. The adaptive finite element solver in our simulator is implemented using the parallel adaptive finite element package Parallel Hierarchical Grid (PHG) developed by one of the authors, which provides the capability of doing large scale parallel computations with high parallel efficiency and the flexibility of choosing high order elements to achieve high order accuracy. The simulator is applied to a real transmembrane protein, the gramicidin A (gA) channel protein, to calculate the electrostatic potential, ion concentrations and I - V curve, with which both primitive and transformed PNP equations are studied and their numerical performances are compared. To further validate the method, we also apply the simulator to two other ion channel systems, the voltage dependent anion channel (VDAC) and α-Hemolysin (α-HL). The simulation results agree well with Brownian dynamics (BD) simulation results and experimental results. Moreover, because ionic finite size effects can be included in PNP model now, we also perform simulations using a size-modified PNP (SMPNP) model on VDAC and α-HL.
It is shown that the size effects in SMPNP can effectively lead to reduced current in the channel, and the results are closer to BD simulation results. PMID:23740647

Tu, Bin; Chen, Minxin; Xie, Yan; Zhang, Linbo; Eisenberg, Bob; Lu, Benzhuo

2013-09-15

291

Summary form only given. The tremendous amount of data generated by large-scale, parallel scientific and engineering simulations make the archive and analysis of this data difficult. To address this problem we, in previous work, developed an efficient archival scheme based on the functional representation of simulation data-this approximation scheme can reduce storage requirement significantly. However, common visualization tools such as

Chuang Li; Paul E. Plassmann

2004-01-01

292

Simulation of Earthquake Liquefaction Response on Parallel Computers

J. Peng, J. Lu, K. H. Law ... Large-scale finite element simulations of earthquake ground response including liquefaction ... software, such as traditional finite element programs, needs to be re-designed to take full advantage ...

Stanford University

293

PROTEUS: A High-Performance Parallel-Architecture Simulator (Eric A. Brewer, Chrysanthos N. Dellarocas, et al.)

PROTEUS is an execution-driven simulator for MIMD machines. Like Tango [3] and RPPT [2], it directly executes most instructions to achieve very high performance. Despite exceptional speed, PROTEUS ...

Brewer, Eric A.

294

Data parallel execution challenges and runtime performance of agent simulations on GPUs

Programmable graphics processing units (GPUs) have emerged as excellent computational platforms for certain general-purpose applications. The data parallel execution capabilities of GPUs specifically point to the potential for effective use in simulations of agent-based models (ABM). In this paper, the computational efficiency of ABM simulation on GPUs is evaluated on representative ABM benchmarks. The runtime speed of GPU-based models ...

Kalyan S. Perumalla; Brandon G. Aaby

2008-01-01

295

We describe the current development status of IMD (ITAPMolecular Dynamics), a software package for classical molecular dynamics simulations on massively-parallel computers. IMD is a general purpose program which can be used for all kinds of two -and three-dimensional studies in condensed matter physics, in addition to the usual MD features it contains a number of special routines for simulation of

Johannes Roth; Jorg Stadler; Marco Brunelli; Dietmar Bunz; Franz Gahler; Jutta Hahn; Martin Hohl; Christof Horn; Jutta Kaiser; Ralf Mikulla; Gunther Schaaf; Joachim Stelzer; Hans-Rainer Trebin

1999-01-01

296

A Parallel Finite Element Simulator for Ion Transport through Three-Dimensional Ion Channel Systems

A Parallel Finite Element Simulator for Ion Transport through Three-Dimensional Ion Channel Systems finite element simulator, ichannel, is developed for ion transport through three-dimensional ion channel the electrodiffusion process of ion transport, and a mesh generation tool chain for ion channel systems, which

Lu, Benzhuo

297

Acceleration of Radiance for Lighting Simulation by Using Parallel Computing with OpenCL

We report on the acceleration of annual daylighting simulations for fenestration systems in the Radiance ray-tracing program. The algorithm was optimized to reduce both the redundant data input/output operations and the floating-point operations. To further accelerate the simulation speed, the calculation for matrix multiplications was implemented using parallel computing on a graphics processing unit. We used OpenCL, which is a cross-platform parallel programming language. Numerical experiments show that the combination of the above measures can speed up the annual daylighting simulations 101.7 times or 28.6 times when the sky vector has 146 or 2306 elements, respectively.

Zuo, Wangda; McNeil, Andrew; Wetter, Michael; Lee, Eleanor

2011-09-06

298

Large Eddy simulation of parallel blade-vortex interaction

NASA Astrophysics Data System (ADS)

Helicopter Blade-Vortex Interaction (BVI) generally occurs under certain conditions of powered descent or during extreme maneuvering. The vibration and acoustic problems associated with the interaction of rotor tip vortices and the following blades is a major aerodynamic concern for the helicopter community. Numerous experimental and computational studies have been done over the last two decades in order to gain a better understanding of the physical mechanisms involved in BVI. The most severe interaction, in terms of generated noise, happens when the vortex filament is parallel to the blade, thus affecting a great portion of it. The majority of the previous numerical studies of parallel BVI fall within a potential flow framework. Some Navier-Stokes approaches using dissipative numerical methods and RANS-type turbulence models have also been attempted, but with limited success. The current investigation makes use of an incompressible, non-dissipative, kinetic energy conserving collocated mesh scheme in conjunction with a dynamic subgrid-scale model. The concentrated tip vortex is not attenuated as it is convected downstream and over a NACA-0012 airfoil. The lift, drag, moment and pressure coefficients induced by the passage of the vortex are monitored in time and compared with experimental data.

Felten, Frederic; Lund, Thomas

2002-11-01

299

NWO-P: Parallel Simulation of the Alewife Machine

This paper provides a brief overview of that effort, sample results indicating the performance of the current implementation, and a few comments about future work. The CM-5 port of our simulator has been operational since June 1992 and has proved invaluable, especially for running simulations of large Alewife systems (64 to 512 nodes). Alewife is an experimental distributed-memory multiprocessor under construction at the MIT Laboratory for

Kirk Johnson; David Chaiken; Alan Mainwaring; Alewife CMMU

1993-01-01

300

Partitioning and packing mathematical simulation models for calculation on parallel computers

NASA Technical Reports Server (NTRS)

The development of multiprocessor simulations from a serial set of ordinary differential equations describing a physical system is described. Degrees of parallelism (i.e., coupling between the equations) and their impact on parallel processing are discussed. The problem of identifying computational parallelism within sets of closely coupled equations that require the exchange of current values of variables is described. A technique is presented for identifying this parallelism and for partitioning the equations for parallel solution on a multiprocessor. An algorithm which packs the equations into a minimum number of processors is also described. The results of the packing algorithm when applied to a turbojet engine model are presented in terms of processor utilization.

Arpasi, D. J.; Milner, E. J.

1986-01-01
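
The packing step this entry describes is essentially a bin-packing problem: assign per-equation costs to the fewest processors without exceeding a per-frame capacity. A minimal first-fit-decreasing sketch, with hypothetical per-equation costs (an illustration of the general idea, not the authors' algorithm):

```python
def pack_equations(costs, capacity):
    """Pack equation evaluation costs (time units) into the fewest
    processors such that no processor exceeds `capacity` per frame.
    First-fit-decreasing heuristic, not the paper's exact algorithm."""
    processors = []   # remaining capacity per processor
    assignment = {}   # equation name -> processor index
    # Largest equations first tends to pack tighter.
    for eq in sorted(costs, key=costs.get, reverse=True):
        for i, remaining in enumerate(processors):
            if costs[eq] <= remaining:
                processors[i] -= costs[eq]
                assignment[eq] = i
                break
        else:
            # No existing processor fits this equation: open a new one.
            processors.append(capacity - costs[eq])
            assignment[eq] = len(processors) - 1
    return assignment, len(processors)

# Hypothetical per-equation costs for a small engine model.
costs = {"N1": 4, "N2": 3, "T4": 3, "P3": 2, "Wf": 2, "thrust": 1}
assignment, n_proc = pack_equations(costs, capacity=5)
```

For these costs the heuristic packs all six equations onto three processors, each fully loaded at capacity 5; the paper's utilization figures come from exactly this kind of load accounting.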

301

Parallel Monte Carlo Electron and Photon Transport Simulation Code (PMCEPT code)

NASA Astrophysics Data System (ADS)

Simulations for customized cancer radiation treatment planning for each patient are very useful for both patient and doctor. These simulations can be used to find the most effective treatment with the least possible dose to the patient. This typical system, so called "Doctor by Information Technology", will be useful to provide high quality medical services everywhere. However, the large amount of computing time required by the well-known general purpose Monte Carlo (MC) codes has prevented their use for routine dose distribution calculations for customized radiation treatment planning. The optimal solution to provide "accurate" dose distributions within an "acceptable" time limit is to develop a parallel simulation algorithm on a Beowulf PC cluster, because it is the most accurate, efficient, and economic. I developed a parallel MC electron and photon transport simulation code based on the standard MPI message passing interface. This algorithm solved the main difficulty of parallel MC simulation (overlapped random number series in the different processors) by using multiple random number seeds. The parallel results agreed well with the serial ones. The parallel efficiency approached 100%, as was expected.

Kum, Oyeon

2004-11-01
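
The multiple-random-number-seeds idea in this abstract can be sketched serially with Python's stdlib `random` module: each "processor" gets its own seeded generator so the streams do not overlap, and the partial results are averaged as an MPI reduce would. The function name, toy attenuation physics, and seed scheme are assumptions for illustration, not the PMCEPT code:

```python
import random

def mc_dose_fraction(seed, n_samples):
    """Toy MC estimate run by one 'processor': the fraction of
    particles penetrating past a threshold depth (stand-in physics,
    not electron/photon transport)."""
    rng = random.Random(seed)      # independent generator per processor
    hits = 0
    for _ in range(n_samples):
        depth = rng.expovariate(1.0)   # toy exponential attenuation
        if depth > 1.0:
            hits += 1
    return hits / n_samples

# Each "rank" gets its own seed so the random streams differ
# (the multiple-random-number-seeds approach, sketched serially).
seeds = [1000 + rank for rank in range(4)]
partials = [mc_dose_fraction(s, 50_000) for s in seeds]
estimate = sum(partials) / len(partials)   # combine like an MPI reduce
```

For this toy model the expected fraction is e⁻¹ ≈ 0.368, and the combined estimate converges to it as total samples grow, mirroring the near-100% parallel efficiency the abstract reports for embarrassingly parallel MC.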

302

Relevance of the parallel nonlinearity in gyrokinetic simulations of tokamak plasmas

NASA Astrophysics Data System (ADS)

The influence of the parallel nonlinearity on transport in gyrokinetic simulations is assessed for values of ρ* which are typical of current experiments. Here, ρ* = ρs/a is the ratio of gyroradius, ρs, to plasma minor radius, a. The conclusion, derived from simulations with both GYRO [J. Candy and R. E. Waltz, J. Comput. Phys., 186, 585 (2003)] and GEM [Y. Chen and S. E. Parker, J. Comput. Phys., 189, 463 (2003)], is that no measurable effect of the parallel nonlinearity is apparent for ρ* < 0.012. This result is consistent with scaling arguments, which suggest that the parallel nonlinearity should be O(ρ*) smaller than the E×B nonlinearity. Indeed, for the plasma parameters under consideration, the magnitude of the parallel nonlinearity is a factor of 8ρ* smaller (for 0.00075

Candy, J.; Waltz, R. E.; Parker, S. E.; Chen, Y.

2006-07-01

303

Development of parallel 3D RKPM meshless bulk forming simulation system

A parallel computational implementation of a modern meshless system is presented for explicit 3D bulk forming simulation problems. The system is implemented with the reproducing kernel particle method. Aspects of a coarse-grain parallel paradigm, the domain decomposition method, are detailed for a Lagrangian formulation using model partitioning. Integration cells are uniquely assigned to each process element, and particles overlap in boundary zones.

H. Wang; Guangyao Li; X. Han; Zhi Hua Zhong

2007-01-01

304

A component-based parallel infrastructure for the simulation of fluid-structure interaction

The Uintah computational framework is a component-based infrastructure designed for highly parallel simulations of complex fluid–structure interaction problems. Uintah utilizes an abstract representation of parallel computation and communication to express data dependencies between multiple physics components. These features allow parallelism to be integrated between multiple components while maintaining overall scalability. Uintah provides mechanisms for load-balancing, data communication, data I/O, and

Steven G. Parker; James Guilkey; Todd Harman

2006-01-01

305

GPU-based simulation of the long-range Potts model via parallel tempering

NASA Astrophysics Data System (ADS)

We discuss the efficiency of parallelization on graphical processing units (GPUs) for the simulation of the one-dimensional Potts model with long-range interactions via parallel tempering. We investigate the behavior of some thermodynamic properties, such as equilibrium energy and magnetization, critical temperatures as well as the separation between the first- and second-order regimes. By implementing multispin coding techniques and an efficient parallelization of the interaction energy computation among threads, the GPU-accelerated approach reached speedup factors of up to 37.

Boer, Attila

2014-07-01
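
The parallel tempering behind this entry swaps configurations between replicas at inverse temperatures β_i and β_j with the standard Metropolis probability min(1, exp[(β_i − β_j)(E_i − E_j)]). A minimal sketch of the acceptance test (not the paper's GPU implementation):

```python
import math
import random

def attempt_swap(beta_i, beta_j, energy_i, energy_j, rng):
    """Metropolis acceptance test for exchanging the configurations
    of two replicas held at inverse temperatures beta_i and beta_j."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # Accept with probability min(1, exp(delta)).
    return delta >= 0 or rng.random() < math.exp(delta)

# Example: a cold replica (beta=1.0) holding a high-energy state swaps
# with a hot replica (beta=0.2) holding a lower-energy one; delta > 0,
# so this exchange is always accepted.
rng = random.Random(2014)
accepted = attempt_swap(1.0, 0.2, energy_i=5.0, energy_j=-1.0, rng=rng)
```

Because each swap decision touches only two scalars per replica pair, the expensive part on a GPU is the interaction-energy evaluation, which is exactly what the entry parallelizes across threads.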

306

Virtual reality visualization of parallel molecular dynamics simulation

When performing communications mapping experiments for massively parallel processors, it is important to be able to visualize the mappings and resulting communications. In a molecular dynamics model, visualization of the atom-to-atom interactions and the processor mappings provides insight into the effectiveness of the communications algorithms. The basic quantities available for visualization in a model of this type are the number of molecules per unit volume and the mass and velocity of each molecule. The computational information available for visualization is the atom-to-atom interaction within each time step, the atom-to-processor mapping, and the energy rescaling events. We use the CAVE (CAVE Automatic Virtual Environment) to provide interactive, immersive visualization experiences.

Disz, T.; Papka, M.; Stevens, R.; Pellegrino, M. [Argonne National Lab., IL (United States); Taylor, V. [Northwestern Univ., Evanston, IL (United States). Electrical Engineering and Computer Science

1995-12-31

307

Object-oriented particle simulation on parallel computers

A general purpose, object-oriented particle simulation (OOPS) library has been developed for use on a variety of system architectures with a uniform high-level interface. This includes the development of library implementations for the CM5, Intel Paragon, and CRI T3D. Codes written on any of these platforms can be ported to other platforms without modifications by utilizing the high-level library. The general character of the library allows application to such diverse areas as plasma physics, suspension flows, vortex simulations, porous media, and materials science.

Reynders, J.V.W.; Forslund, D.W.; Hinker, P.J.; Tholburn, M.; Kilman, D.G.; Humphrey, W.F.

1994-04-01

308

Properties of a Family of Parallel Finite Element Simulations

The simulations model earthquake-induced ground motion in the San Fernando Valley of Southern California [1] and are partitioned using a recursive geometric bisection algorithm [7, 8]. Because the simulations model earthquakes, we refer to them

Shewchuk, Jonathan

309

Constructing School Timetables Using Simulated Annealing: Sequential and Parallel Algorithms

This paper considers a solution to the school timetabling problem. The timetabling problem involves scheduling a number of tuples, each consisting of a class of students, a teacher, a subject and a room, to a fixed number of time slots. A Monte Carlo scheme called simulated annealing is used as an optimisation technique. The paper introduces the timetabling problem, and then

D. Abramson

1991-01-01
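
The annealing scheme this abstract describes can be sketched on a toy timetabling instance. The cost function, linear cooling schedule, and data below are illustrative assumptions, not the paper's formulation: a candidate move reassigns one tuple's time slot, and worse timetables are accepted with probability exp(-delta/temperature).

```python
import math
import random

def clashes(assignment, tuples):
    """Hard-constraint violations: two tuples in the same time slot
    conflict if they share a teacher or a room."""
    conflicts = 0
    for a in range(len(tuples)):
        for b in range(a + 1, len(tuples)):
            if assignment[a] == assignment[b]:
                ta, tb = tuples[a], tuples[b]
                if ta["teacher"] == tb["teacher"] or ta["room"] == tb["room"]:
                    conflicts += 1
    return conflicts

def anneal(tuples, n_slots, steps=20000, t0=2.0, seed=1):
    """Simulated annealing: perturb one tuple's slot per step and
    accept uphill moves with probability exp(-delta/temperature)."""
    rng = random.Random(seed)
    assignment = [rng.randrange(n_slots) for _ in tuples]
    cost = clashes(assignment, tuples)
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9   # linear cooling
        k = rng.randrange(len(tuples))
        old_slot = assignment[k]
        assignment[k] = rng.randrange(n_slots)
        new_cost = clashes(assignment, tuples)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost                        # accept the move
        else:
            assignment[k] = old_slot               # reject and restore
    return assignment, cost

# Illustrative instance: six (class, teacher, subject, room) tuples
# reduced to the clash-relevant fields, and three time slots.
tuples = [
    {"teacher": "T1", "room": "R1"}, {"teacher": "T1", "room": "R1"},
    {"teacher": "T1", "room": "R2"}, {"teacher": "T2", "room": "R2"},
    {"teacher": "T2", "room": "R1"}, {"teacher": "T2", "room": "R2"},
]
assignment, cost = anneal(tuples, n_slots=3)
```

A zero-clash timetable exists for this instance (for example, slots `[0, 1, 2, 0, 2, 1]`), so the annealer has a feasible target to descend toward.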

310

Massively parallel simulation of cardiac electrical wave propagation ...

body in the form of the electrocardiogram (ECG). The QT ... the function of these proteins can lead to the initiation of a complex and life-threatening .... order of 10 cm, so directly simulating wave propagation in the human ventricle, a primary.

2007-10-30

311

High-Resolution Simulations of Parallel Blade-Vortex Interactions

contribution coming from the trailing edge. The simulations are then extended to three-dimensional moving overset meshes, where the vortex generation and convection are also resolved. In the numerical methodology, a high-resolution solution of the compressible Euler equations is performed on structured overset meshes

Alonso, Juan J.

312

Multi-Core Parallel Simulation of System-Level Description Languages. Rainer Dömer, Weiwei Chen, Xu. We present a multi-core parallel simulation approach which automatically protects communication between heterogeneous components, addressing complex interconnect, sophisticated functionality, and slow simulation

Gerstlauer, Andreas

313

Parallel density matrix propagation in spin dynamics simulations

Several methods for density matrix propagation in distributed computing environments, such as clusters and graphics processing units, are proposed and evaluated. It is demonstrated that the large communication overhead associated with each propagation step (two-sided multiplication of the density matrix by an exponential propagator and its conjugate) may be avoided and the simulation recast in a form that requires virtually no inter-thread communication. Good scaling is demonstrated on a 128-core (16 nodes, 8 cores each) cluster.

Edwards, Luke J

2011-01-01

314

Transient dynamics simulations are commonly used to model phenomena such as car crashes, underwater explosions, and the response of shipping containers to high-speed impacts. Physical objects in such a simulation are typically represented by Lagrangian meshes because the meshes can move and deform with the objects as they undergo stress. Fluids (gasoline, water) or fluid-like materials (earth) in the simulation can be modeled using the techniques of smoothed particle hydrodynamics. Implementing a hybrid mesh/particle model on a massively parallel computer poses several difficult challenges. One challenge is to simultaneously parallelize and load-balance both the mesh and particle portions of the computation. A second challenge is to efficiently detect the contacts that occur within the deforming mesh and between mesh elements and particles as the simulation proceeds. These contacts impart forces to the mesh elements and particles which must be computed at each timestep to accurately capture the physics of interest. In this paper we describe new parallel algorithms for smoothed particle hydrodynamics and contact detection which turn out to have several key features in common. Additionally, we describe how to join the new algorithms with traditional parallel finite element techniques to create an integrated particle/mesh transient dynamics simulation. Our approach to this problem differs from previous work in that we use three different parallel decompositions, a static one for the finite element analysis and dynamic ones for particles and for contact detection. We have implemented our ideas in a parallel version of the transient dynamics code PRONTO-3D and present results for the code running on a large Intel Paragon.

Hendrickson, B.; Plimpton, S.; Attaway, S.; Swegle, J. [and others

1996-09-01

315

Time-partitioning simulation models for calculation on parallel computers

NASA Technical Reports Server (NTRS)

A technique allowing time-staggered solution of partial differential equations is presented in this report. Using this technique, called time-partitioning, simulation execution speedup is proportional to the number of processors used because all processors operate simultaneously, with each updating of the solution grid at a different time point. The technique is limited by neither the number of processors available nor by the dimension of the solution grid. Time-partitioning was used to obtain the flow pattern through a cascade of airfoils, modeled by the Euler partial differential equations. An execution speedup factor of 1.77 was achieved using a two processor Cray X-MP/24 computer.

Milner, Edward J.; Blech, Richard A.; Chima, Rodrick V.

1987-01-01

316

A Solver for Massively Parallel Direct Numerical Simulation of Three-Dimensional Multiphase Flows

We present a new solver for massively parallel simulations of fully three-dimensional multiphase flows. The solver runs on a variety of computer architectures from laptops to supercomputers and on 65536 threads or more (limited only by the availability to us of more threads). The code is wholly written by the authors in Fortran 2003 and uses a domain decomposition strategy for parallelization with MPI. The fluid interface solver is based on a parallel implementation of the LCRM hybrid Front Tracking/Level Set method designed to handle highly deforming interfaces with complex topology changes. We discuss the implementation of this interface method and its particular suitability to distributed processing where all operations are carried out locally on distributed subdomains. We have developed parallel GMRES and Multigrid iterative solvers suited to the linear systems arising from the implicit solution of the fluid velocities and pressure in the presence of strong density and viscosity discontinuities across flu...

Shin, S; Juric, D

2014-01-01
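
The MPI domain decomposition this entry relies on comes down to halo (ghost-cell) exchange between subdomains. A serial toy sketch for 1D diffusion, standing in for the MPI exchange (an illustration only, not the solver's code), shows that the decomposed update reproduces the single-domain result exactly:

```python
def step(u, nu=0.25):
    """One explicit diffusion update; the endpoints are held fixed."""
    return ([u[0]]
            + [u[i] + nu * (u[i-1] - 2.0 * u[i] + u[i+1])
               for i in range(1, len(u) - 1)]
            + [u[-1]])

def run_single(u, steps):
    for _ in range(steps):
        u = step(u)
    return u

def run_decomposed(u, steps):
    """Two subdomains with one ghost cell each; the ghost values are
    refreshed every step, standing in for an MPI halo exchange."""
    half = len(u) // 2
    left, right = u[:half], u[half:]
    for _ in range(steps):
        lext = left + [right[0]]    # receive the neighbour's edge value
        rext = [left[-1]] + right
        left = step(lext)[:-1]      # drop the ghost after the update
        right = step(rext)[1:]
    return left + right

u0 = [0.0] * 8 + [1.0] + [0.0] * 7   # 16 grid points, unit spike
single = run_single(u0, 10)
decomposed = run_decomposed(u0, 10)
```

Because each ghost cell carries exactly the value the neighbouring subdomain computed, the decomposed run performs the same arithmetic in the same order as the single-domain run, so the two results agree to the last bit; this locality is what lets such solvers scale to tens of thousands of MPI ranks.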

317

Wake Encounter Analysis for a Closely Spaced Parallel Runway Paired Approach Simulation

NASA Technical Reports Server (NTRS)

A Monte Carlo simulation of simultaneous approaches performed by two transport category aircraft from the final approach fix to a pair of closely spaced parallel runways was conducted to explore the aft boundary of the safe zone in which separation assurance and wake avoidance are provided. The simulation included variations in runway centerline separation, initial longitudinal spacing of the aircraft, crosswind speed, and aircraft speed during the approach. The data from the simulation showed that the majority of the wake encounters occurred near or over the runway and the aft boundaries of the safe zones were identified for all simulation conditions.

McKissick, Burnell T.; Rico-Cusi, Fernando J.; Murdoch, Jennifer; Oseguera-Lohr, Rosa M.; Stough, Harry P., III; O'Connor, Cornelius J.; Syed, Hazari I.

2009-01-01

318

NASA Astrophysics Data System (ADS)

Strong scaling of scientific applications on parallel architectures is increasingly limited by communication latency. This talk will describe the techniques used to reduce latency and mitigate its effects on performance in Anton, a massively parallel special-purpose machine that accelerates molecular dynamics (MD) simulations by orders of magnitude compared with the previous state of the art. Achieving this speedup required both specialized hardware mechanisms and a restructuring of the application software to reduce network latency, sender and receiver overhead, and synchronization costs. Key elements of Anton's approach, in addition to tightly integrated communication hardware, include formulating data transfer in terms of counted remote writes and leveraging fine-grained communication. Anton delivers end-to-end inter-node latency significantly lower than any other large-scale parallel machine, and the total critical-path communication time for an Anton MD simulation is less than 3% that of the next-fastest MD platform.

Dror, Ron

2013-03-01

319

Parallel 3-d simulations for porous media models in soil mechanics

Numerical simulations in 3-d for porous media models in soil mechanics are a difficult task for the engineering modelling as well as for the numerical realization. Here, we present a general numerical scheme for the simulation of two-phase models in combination with an abstract material model via the stress response with a specialized parallel saddle point solver. Therefore, we give

C. Wieners; M. Ammann; S. Diebels; W. Ehlers

2002-01-01

320

Xyce parallel electronic simulator reference guide, Version 6.0.1.

This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide [1]. The focus of this document is to list, as exhaustively as possible, device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide [1].

Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Verley, Jason C.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Warrender, Christina E.; Baur, David Gregory [Raytheon, Albuquerque, NM]

2014-01-01

321

A Time-coherent Model for the Steering of Parallel Simulations

The on-line visualization and computational steering of parallel simulations have raised the need for new computational steering tools to better grasp the complex internal structure of large-scale scientific applications. Computational steering is an effort to make the simulations

Paris-Sud XI, UniversitÃ© de

322

Parallel hp-Finite Element Simulations of 3D Resistivity Logging Instruments

M. Paszyński. A goal-oriented hp-Finite Element Method (FEM) that delivers exponential convergence rates in terms of the quantity at the receiver antenna with a minimal number of degrees of freedom in the mesh. The 3D hp finite element mesh

Torres-Verdín, Carlos

323

Parallel Logic Simulation Using Time Warp on Shared-Memory Multiprocessors

The article presents an efficient parallel logic-circuit simulation scheme based on the Time Warp optimistic algorithm. The Time Warp algorithm is integrated with a new global virtual time (GVT) computation scheme for fossil collection. The new GVT computation is based on a token ring passing method, so that global synchronization is not required in a shared-memory multiprocessor system. This allows

Hong K. Kim; Soon Myoung Chung

1994-01-01
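
The token-ring GVT computation this entry describes can be sketched as folding per-LP minima into a circulating token: each logical process contributes the minimum of its local virtual time and the timestamps of its in-transit messages, and after a full circulation the token is a lower bound on global virtual time. The data layout below is an assumption for illustration, not the paper's exact protocol:

```python
def gvt_token_ring(lps):
    """One circulation of a GVT token around a ring of logical
    processes (LPs).  Each LP folds in the minimum of its local
    virtual time and the timestamps of its in-transit messages;
    after a full pass the token holds a safe lower bound on global
    virtual time, with no global barrier required.  Sketch of the
    idea only, not the paper's exact protocol."""
    token = float("inf")
    for lp in lps:
        local_min = min([lp["lvt"]] + lp["in_transit"])
        token = min(token, local_min)
    return token

# Hypothetical ring of three LPs.
lps = [
    {"lvt": 120.0, "in_transit": [95.0]},
    {"lvt": 80.0,  "in_transit": []},
    {"lvt": 200.0, "in_transit": [150.0, 60.0]},
]
gvt = gvt_token_ring(lps)
# Fossil collection may now reclaim saved state older than gvt.
```

The in-transit message at timestamp 60 dominates here, which is why a correct GVT scheme must account for messages not yet received, not just each LP's local clock.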

324

Hardware In the Loop Simulation of a Diesel Parallel Mild-Hybrid Electric Vehicle

Hybrid vehicles present a real potential to reduce CO2 emissions and energy dependency. The simulation of these vehicles is well adapted to highlight the first-order influential parameters. However, more realistic components and HEV performance versus cost could be identified and improved by testing using the HIL concept. This paper deals with the test and validation of a parallel mild-hybrid

R. Trigui; B. Jeanneret; B. Malaquin; F. Badin; C. Plasse

2007-01-01

325

Leveraging the power of today's graphics processing units for robust power grid simulation remains a challenging task. Existing preconditioned iterative methods that require incomplete matrix factorizations cannot be effectively accelerated on a graphics processing unit (GPU) due to its limited hardware resources and data-parallel computing model. This paper presents an efficient GPU-based multigrid preconditioning algorithm for robust power

Zhuo Feng; Xueqian Zhao; Zhiyu Zeng

2011-01-01

326

An unstructured-grid, finite-volume, nonhydrostatic, parallel coastal ocean simulator

An unstructured-grid, finite-volume, nonhydrostatic, parallel coastal ocean simulator (O.B. Fringer; received in revised form 23 March 2006; accepted 23 March 2006; available online 3 May 2006). Although their generation mechanism is not clear, it is hypothesized that they form when long internal waves of tidal frequency

Fringer, Oliver B.

327

A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL)

NASA Technical Reports Server (NTRS)

A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL) is presented which overcomes the traditional disadvantages of simulations executed on a digital computer. The incorporation of parallel processing allows the mapping of simulations into a digital computer to be done in the same inherently parallel manner as they are currently mapped onto an analog computer. The direct-execution format maximizes the efficiency of the executed code since the need for a high level language compiler is eliminated. Resolution is greatly increased over that which is available with an analog computer without the sacrifice in execution speed normally expected with digital computer simulations. Although this report covers all aspects of the new architecture, key emphasis is placed on the processing element configuration and the microprogramming of the ACSL constructs. The execution times for all ACSL constructs are computed using a model of a processing element based on the AMD 29000 CPU and the AMD 29027 FPU. The increase in execution speed provided by parallel processing is exemplified by comparing the derived execution times of two ACSL programs with the execution times for the same programs executed on a similar sequential architecture.

Carroll, Chester C.; Owen, Jeffrey E.

1988-01-01

328

Practical Parallel Simulation Applied to Aviation Modeling. Dr. Frederick Wieland

have used DPAT results to support their conclusions, and discuss the design and performance from its conceptual design through implementation, as opposed to adding it afterwards. Designing parallel simulations that are optimistically synchronized is more of an art

Tropper, Carl

329

Trading accuracy for speed in parallel simulated annealing with simultaneous moves

A common approach to parallelizing simulated annealing is to generate several perturbations to the current solution simultaneously, requiring synchronization to guarantee correct evaluation of the cost function. The cost of this synchronization may be reduced by allowing inaccuracies in the cost calculations. We provide a framework for understanding the theoretical implications of this approach based on a model of processor interaction

M. D. Durand; Steve R. White

2000-01-01

330

A Scalable Multi-scale Framework for Parallel Simulation and Visualization of Microbial Evolution

regulatory and biochemical networks. For each time point, the evolutionary "fossil record" is recorded. Microbes are among the smallest and fastest evolving life forms on the planet, yet even in their case, evolution is painstakingly difficult

Tagkopoulos, Ilias

331

A parallel algorithm for the detailed multidimensional numerical simulation of laminar flames, able to work efficiently with loosely coupled computers, is described. The governing equations have been discretized using the finite volume technique over staggered grids. A SIMPLE-like method has been employed to solve the velocity-pressure fields, while the species equations have been calculated in a segregated manner using

R. Cònsul; C. D. Pérez-Segarra; K. Claramunt; J. Cadafalch; A. Oliva

2003-01-01

332

The Gillespie Stochastic Simulation Algorithm (GSSA) and its variants are cornerstone techniques to simulate reaction kinetics in situations where the concentration of the reactant is too low to allow deterministic techniques such as differential equations. The inherent limitations of the GSSA include the time required for executing a single run and the need for multiple runs for parameter sweep exercises due to the stochastic nature of the simulation. Even very efficient variants of GSSA are prohibitively expensive to compute and perform parameter sweeps. Here we present a novel variant of the exact GSSA that is amenable to acceleration by using graphics processing units (GPUs). We parallelize the execution of a single realization across threads in a warp (fine-grained parallelism). A warp is a collection of threads that are executed synchronously on a single multi-processor. Warps executing in parallel on different multi-processors (coarse-grained parallelism) simultaneously generate multiple trajectories. Novel data-structures and algorithms reduce memory traffic, which is the bottleneck in computing the GSSA. Our benchmarks show an 8×–120× performance gain over various state-of-the-art serial algorithms when simulating different types of models. PMID:23152751

Komarov, Ivan; D'Souza, Roshan M.

2012-01-01
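
The direct-method GSSA that this entry accelerates draws an exponential waiting time from the total propensity, then selects which reaction fires proportionally to its propensity. A serial reference sketch (the paper's contribution is the GPU parallelization, not this loop; the birth-death model below is an illustrative assumption):

```python
import random

def gillespie(x, reactions, t_max, seed=0):
    """Direct-method Gillespie SSA.  `x` is the species-count state,
    `reactions` is a list of (propensity_fn, state_change) pairs.
    Serial reference loop; the paper's GPU variant runs one
    realization per warp and many realizations across
    multiprocessors."""
    rng = random.Random(seed)
    t = 0.0
    while t < t_max:
        props = [a(x) for a, _ in reactions]
        a0 = sum(props)
        if a0 == 0.0:
            break                       # no reaction can fire
        t += rng.expovariate(a0)        # exponential waiting time
        r = rng.random() * a0           # pick a reaction by propensity
        acc = 0.0
        for a_j, (_, change) in zip(props, reactions):
            acc += a_j
            if r <= acc:
                x = [xi + d for xi, d in zip(x, change)]
                break
    return t, x

# Illustrative birth-death model: 0 -> A at rate 5, A -> 0 at 0.1*[A].
reactions = [
    (lambda x: 5.0,        (+1,)),
    (lambda x: 0.1 * x[0], (-1,)),
]
t, x = gillespie([0], reactions, t_max=200.0, seed=7)
# The stationary mean of [A] for this model is 5.0 / 0.1 = 50.
```

Because every realization repeats this draw-and-fire loop independently, the stochastic nature of the algorithm is exactly what makes a trajectory-per-warp GPU mapping attractive.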

333

Automated Control Synthesis for an Assembly Line using Discrete Event System Control Theory

An educational test-bed that simulates an automated car assembly line has been built using LEGO (Department of Electrical and Computer Engineering, University of Kentucky, Lexington, KY; R. Kumar). Topics include FSM models of the safety and progress control specifications of the system, and use of supervisory control theory to

Kumar, Ratnesh

Robust large-scale parallel nonlinear solvers for simulations.

This report documents research to develop robust and efficient solution techniques for solving large-scale systems of nonlinear equations. The most widely used method for solving systems of nonlinear equations is Newton's method. While much research has been devoted to augmenting Newton-based solvers (usually with globalization techniques), little has been devoted to exploring the application of different models. Our research has been directed at evaluating techniques using different models than Newton's method: a lower order model, Broyden's method, and a higher order model, the tensor method. We have developed large-scale versions of each of these models and have demonstrated their use in important applications at Sandia. Broyden's method replaces the Jacobian with an approximation, allowing codes that cannot evaluate a Jacobian or have an inaccurate Jacobian to converge to a solution. Limited-memory methods, which have been successful in optimization, allow us to extend this approach to large-scale problems. We compare the robustness and efficiency of Newton's method, modified Newton's method, Jacobian-free Newton-Krylov method, and our limited-memory Broyden method. Comparisons are carried out for large-scale applications of fluid flow simulations and electronic circuit simulations. Results show that, in cases where the Jacobian was inaccurate or could not be computed, Broyden's method converged in some cases where Newton's method failed to converge. We identify conditions where Broyden's method can be more efficient than Newton's method. We also present modifications to a large-scale tensor method, originally proposed by Bouaricha, for greater efficiency, better robustness, and wider applicability. Tensor methods are an alternative to Newton-based methods and are based on computing a step based on a local quadratic model rather than a linear model. 
The advantage of Bouaricha's method is that it can use any existing linear solver, which makes it simple to write and easily portable. However, the method usually takes twice as long to solve as Newton-GMRES on general problems because it solves two linear systems at each iteration. In this paper, we discuss modifications to Bouaricha's method for a practical implementation, including a special globalization technique and other modifications for greater efficiency. We present numerical results showing computational advantages over Newton-GMRES on some realistic problems. We further discuss a new approach for dealing with singular (or ill-conditioned) matrices. In particular, we modify an algorithm for identifying a turning point so that an increasingly ill-conditioned Jacobian does not prevent convergence.

Bader, Brett William; Pawlowski, Roger Patrick; Kolda, Tamara Gibson (Sandia National Laboratories, Livermore, CA)

2005-11-01
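
Broyden's method, contrasted with Newton's method in the report above, replaces the Jacobian with an approximation updated by a rank-one (Sherman-Morrison) formula at each step, so no Jacobian evaluation is needed after the first. A dense two-variable sketch (the report's limited-memory variant avoids storing the matrix at all; the toy system below is an assumption for illustration):

```python
def broyden(f, x0, tol=1e-10, max_iter=50):
    """Broyden's 'good' method for a two-variable system F(x) = 0.
    Maintains an inverse-Jacobian approximation updated by a rank-one
    Sherman-Morrison formula; only the initial Jacobian is estimated
    by finite differences.  Dense illustrative sketch."""
    x = list(x0)
    fx = f(x)
    # Finite-difference Jacobian at x0, then invert the 2x2 matrix.
    h = 1e-6
    jac = [[(f([x[0] + h * (j == 0), x[1] + h * (j == 1)])[i] - fx[i]) / h
            for j in range(2)] for i in range(2)]
    det = jac[0][0] * jac[1][1] - jac[0][1] * jac[1][0]
    b_inv = [[jac[1][1] / det, -jac[0][1] / det],
             [-jac[1][0] / det, jac[0][0] / det]]
    for _ in range(max_iter):
        # Newton-like step: dx = -B_inv @ F(x)
        dx = [-(b_inv[i][0] * fx[0] + b_inv[i][1] * fx[1]) for i in range(2)]
        x = [x[i] + dx[i] for i in range(2)]
        f_new = f(x)
        if max(abs(v) for v in f_new) < tol:
            return x
        df = [f_new[i] - fx[i] for i in range(2)]
        # Rank-one update:
        # B_inv += (dx - B_inv df)(dx^T B_inv) / (dx^T B_inv df)
        bdf = [b_inv[i][0] * df[0] + b_inv[i][1] * df[1] for i in range(2)]
        dxb = [dx[0] * b_inv[0][j] + dx[1] * b_inv[1][j] for j in range(2)]
        denom = dx[0] * bdf[0] + dx[1] * bdf[1]
        for i in range(2):
            for j in range(2):
                b_inv[i][j] += (dx[i] - bdf[i]) * dxb[j] / denom
        fx = f_new
    return x

# Hypothetical toy system: x^2 + y^2 = 4 and x*y = 1.
def f(v):
    x, y = v
    return [x * x + y * y - 4.0, x * y - 1.0]

root = broyden(f, [2.0, 0.5])
```

The update never re-factorizes a matrix, which is the property the report scales up: for large systems the rank-one corrections can be stored as a short history of vectors instead of a dense inverse.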

336

PATHWAY: a simulation model of radionuclide-transport through agricultural food chains

PATHWAY simulates the transport of radionuclides from fallout through an agricultural ecosystem. The agro-ecosystem is subdivided into several land management units, each of which is used either for grazing animals, for growing hay, or for growing food crops. The model simulates the transport of radionuclides by both discrete events and continuous, time-dependent processes. The discrete events include tillage of soil,

T. B. Kirchner; F. W. Whicker; M. D. Otis

1982-01-01

337

Simulation of Earthquake Liquefaction Response on Parallel Computers. Jun Peng, Jinchi Lu, Kincho H. Law and Ahmed Elgamal. Simulations of earthquake responses and liquefaction effects require that software, such as traditional finite element programs, be re-designed in order to take full advantage of parallel computing. This joint

Stanford University

338

This paper investigates an efficient implementation of the circuit simulator sparta based on the preconditioned Jacobi method on a loosely coupled distributed memory machine, the Fujitsu AP1000. The preconditioned relaxation methods are linear solvers effective in large scale circuit simulations. Because the preconditioned Jacobi method has a high parallelism, the only problem in parallelizing sparta is the application of the preconditioner to the residual vector.
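The high parallelism of the Jacobi method comes from its structure: viewed as a preconditioned residual iteration, every component update is independent. A small serial sketch (dense toy system, not the sparta code):

```python
# The Jacobi method as a preconditioned iteration: each sweep computes
# x <- x + D^{-1} (b - A x), where D is the diagonal of A. Every
# component update is independent of the others, which is the "high
# parallelism" noted in the abstract. Small dense example for clarity;
# a circuit simulator would use sparse storage.

def jacobi(A, b, tol=1e-10, maxit=1000):
    n = len(b)
    x = [0.0] * n
    for _ in range(maxit):
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in r) < tol:
            break
        # Applying the preconditioner D^{-1} to the residual is the step
        # that must be parallelized; here every entry is independent.
        x = [x[i] + r[i] / A[i][i] for i in range(n)]
    return x

# Diagonally dominant test system (Jacobi converges for such matrices).
A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
b = [5.0, 6.0, 5.0]
print(jacobi(A, b))  # approximately [1.0, 1.0, 1.0]
```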

Reiji Suda; Yoshio Oyanagi

1995-01-01

339

Parallel 3D Multi-Stage Simulation of a Turbofan Engine

NASA Technical Reports Server (NTRS)

A 3D multistage simulation of each component of a modern GE Turbofan engine has been made. An axisymmetric view of this engine is presented in the document. This includes a fan, booster rig, high pressure compressor rig, high pressure turbine rig and a low pressure turbine rig. In the near future, all components will be run in a single calculation for a solution of 49 blade rows. The simulation exploits the use of parallel computations by using two levels of parallelism. Each blade row is run in parallel and each blade row grid is decomposed into several domains and run in parallel. 20 processors are used for the 4 blade row analysis. The average passage approach developed by John Adamczyk at NASA Lewis Research Center has been further developed and parallelized. This is APNASA Version A. It is a Navier-Stokes solver using a 4-stage explicit Runge-Kutta time marching scheme with variable time steps and residual smoothing for convergence acceleration. It has an implicit K-E turbulence model which uses an ADI solver to factor the matrix. Between 50 and 100 explicit time steps are solved before a blade row body force is calculated and exchanged with the other blade rows. This outer iteration has been coined a "flip." Efforts have been made to make the solver linearly scaleable with the number of blade rows. Enough flips are run (between 50 and 200) so the solution in the entire machine is not changing. The K-E equations are generally solved every other explicit time step. One of the key requirements in the development of the parallel code was to make the parallel solution exactly (bit for bit) match the serial solution. This has helped isolate many small parallel bugs and guarantee the parallelization was done correctly. The domain decomposition is done only in the axial direction since the number of points axially is much larger than the other two directions. This code uses MPI for message passing. 
The parallel speedup of the solver portion (no I/O or body force calculation) is reported for a grid with 227 points axially.
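The multi-stage explicit time marching described above can be sketched for a scalar model equation. The stage coefficients below are a common low-storage choice and are illustrative only, not necessarily APNASA's:

```python
import math

# One low-storage, 4-stage explicit Runge-Kutta update of the kind used
# for time marching in flow solvers: each stage restarts from the state
# at the beginning of the step, so only one extra array is needed.
# Stage coefficients and the model equation du/dt = -u are illustrative.

ALPHAS = (0.25, 1.0 / 3.0, 0.5, 1.0)

def rk4_stage_step(u0, dt, residual):
    u = u0
    for a in ALPHAS:
        u = u0 + a * dt * residual(u)   # stage k: u = u0 + a_k*dt*R(u_{k-1})
    return u

u, dt = 1.0, 0.1
for _ in range(10):                      # march to t = 1
    u = rk4_stage_step(u, dt, lambda v: -v)
print(u, math.exp(-1.0))  # close agreement (4th order for linear problems)
```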

Turner, Mark G.; Topp, David A.

1998-01-01

340

Applying parallel block predictor-corrector methods to a Space Shuttle main rocket engine simulation

The effectiveness of applying two types of parallel block predictor-corrector algorithms presented by L.F. Shampine and H.A. Watts (Math. Comput., vol.23, p.731-40, 1969; BIT, vol.12, p.252-66, 1972) and L.G. Birta and O. Abdou-Rabia (IEEE Trans. on Comput. vol.C-36, no.3, p.299-311 March 1987) to the simulation of a Space Shuttle main rocket engine is discussed. Comparisons between the sequential and parallel

B. Earl Wells; Chester C. Carroll

1990-01-01

341

NASA Technical Reports Server (NTRS)

This paper will describe the Entry, Descent and Landing simulation tradeoffs and techniques that were used to provide the Monte Carlo data required to approve entry during a critical period just before entry of the Genesis Sample Return Capsule. The same techniques will be used again when Stardust returns on January 15, 2006. Only one hour was available for the simulation which propagated 2000 dispersed entry states to the ground. Creative simulation tradeoffs combined with parallel processing were needed to provide the landing footprint statistics that were an essential part of the Go/NoGo decision that authorized release of the Sample Return Capsule a few hours before entry.

Lyons, Daniel T.; Desai, Prasun N.

2005-01-01

342

Design of a real-time wind turbine simulator using a custom parallel architecture

NASA Technical Reports Server (NTRS)

The design of a new parallel-processing digital simulator is described. The new simulator has been developed specifically for analysis of wind energy systems in real time. The new processor has been named the Wind Energy System Time-domain simulator, version 3 (WEST-3). Like previous WEST versions, WEST-3 performs many computations in parallel. The modules in WEST-3 are pure digital processors, however. These digital processors can be programmed individually and operated in concert to achieve real-time simulation of wind turbine systems. Because of this programmability, WEST-3 is much more flexible and general than its two predecessors. The design features of WEST-3 are described to show how the system produces high-speed solutions of nonlinear time-domain equations. WEST-3 has two very fast Computational Units (CU's) that use minicomputer technology plus special architectural features that make them many times faster than a microcomputer. These CU's are needed to perform the complex computations associated with the wind turbine rotor system in real time. The parallel architecture of the CU causes several tasks to be done in each cycle, including an I/O operation and the combination of a multiply, add, and store. The WEST-3 simulator can be expanded at any time for additional computational power. This is possible because the CU's are interfaced to each other and to other portions of the simulation using special serial buses. These buses can be 'patched' together in essentially any configuration (in a manner very similar to the programming methods used in analog computation) to balance the input/output requirements. CU's can be added in any number to share a given computational load. This flexible bus feature is very different from many other parallel processors, which usually have a throughput limit because of rigid bus architecture.

Hoffman, John A.; Gluck, R.; Sridhar, S.

1995-01-01

343

Parallel molecular dynamics simulations of alkane/hydroxylated α-aluminum oxide interfaces

NASA Astrophysics Data System (ADS)

In this paper we describe a practical implementation of parallel computation for the molecular dynamics (MD) simulation of an alkane/aluminum oxide interface. A serial MD program was converted into a parallel code utilizing the message passing interface (MPI). This code was evaluated on a twelve processor symmetrical multiprocessor as well as on a cluster of four processor SMPs. A maximum speedup of 5.25 was achieved with twelve processors on the large shared memory machine. The cluster performance saturated at a speedup of 4.5 with two nodes. High communication costs and considerable load imbalance in the system were identified as areas that need further investigation for obtaining better performance. This paper is published as part of a thematic issue on Parallel Computing in Chemical Physics.

Roy, S.; Jin, R. Y.; Chaudhary, V.; Hase, W. L.

2000-06-01

344

Cholla : A New Massively-Parallel Hydrodynamics Code For Astrophysical Simulation

We present Cholla (Computational Hydrodynamics On ParaLLel Architectures), a new three-dimensional hydrodynamics code that harnesses the power of graphics processing units (GPUs) to accelerate astrophysical simulations. Cholla models the Euler equations on a static mesh using state-of-the-art techniques, including the unsplit Corner Transport Upwind (CTU) algorithm, a variety of exact and approximate Riemann solvers, and multiple spatial reconstruction techniques including the piecewise parabolic method (PPM). Cholla performs all hydrodynamical calculations in a massively-parallel manner, using GPUs to evolve the fluid properties of thousands of cells simultaneously while leaving the power of central processing units (CPUs) available for modeling additional physics. On current hardware, Cholla can update more than ten million cells per GPU-second while using an exact Riemann solver and PPM reconstruction with the CTU algorithm. Owing to the massively-parallel architecture of GPUs and the design of the Cholla ...

Schneider, Evan E

2014-01-01

345

Evaluating the performance of parallel subsurface simulators: An illustrative example with PFLOTRAN

NASA Astrophysics Data System (ADS)

To better inform the subsurface scientist on the expected performance of parallel simulators, this work investigates performance of the reactive multiphase flow and multicomponent biogeochemical transport code PFLOTRAN as it is applied to several realistic modeling scenarios run on the Jaguar supercomputer. After a brief introduction to the code's parallel layout and code design, PFLOTRAN's parallel performance (measured through strong and weak scalability analyses) is evaluated in the context of conceptual model layout, software and algorithmic design, and known hardware limitations. PFLOTRAN scales well (with regard to strong scaling) for three realistic problem scenarios: (1) in situ leaching of copper from a mineral ore deposit within a 5-spot flow regime, (2) transient flow and solute transport within a regional doublet, and (3) a real-world problem involving uranium surface complexation within a heterogeneous and extremely dynamic variably saturated flow field. Weak scalability is discussed in detail for the regional doublet problem, and several difficulties with its interpretation are noted.
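The strong and weak scalability measures used in such studies reduce to two simple ratios. The timings in this sketch are invented solely to show the formulas:

```python
# Strong vs. weak scaling efficiency, as measured in studies like the
# PFLOTRAN evaluation above. All timings below are made-up numbers
# purely to illustrate the two formulas.

def strong_efficiency(t_base, p_base, t_p, p):
    """Strong scaling: fixed total problem size.
    Ideal time at p processes is t_base * p_base / p."""
    return (t_base * p_base) / (t_p * p)

def weak_efficiency(t_base, t_p):
    """Weak scaling: fixed work per process, so ideal time is constant."""
    return t_base / t_p

# Hypothetical runs: 100 s on 64 cores, 30 s on 256 cores (same problem).
print(strong_efficiency(100.0, 64, 30.0, 256))  # ~0.83
# Hypothetical weak-scaled runs: 100 s on 64 cores, 125 s on 4096 cores.
print(weak_efficiency(100.0, 125.0))            # 0.8
```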

Hammond, G. E.; Lichtner, P. C.; Mills, R. T.

2014-01-01

346

Parallel-vector algorithms for particle simulations on shared-memory multiprocessors

Over the last few decades, the computational demands of massive particle-based simulations for both scientific and industrial purposes have been continuously increasing. Hence, considerable efforts are being made to develop parallel computing techniques on various platforms. In such simulations, particles freely move within a given space, and so on a distributed-memory system, load balancing, i.e., assigning an equal number of particles to each processor, is not guaranteed. However, shared-memory systems achieve better load balancing for particle models, but suffer from the intrinsic drawback of memory access competition, particularly during (1) pairing of contact candidates from among neighboring particles and (2) force summation for each particle. Here, novel algorithms are proposed to overcome these two problems. For the first problem, the key is a pre-conditioning process during which particle labels are sorted by a cell label in the domain to which the particles belong. Then, a list of contact candidates is constructed by pairing the sorted particle labels. For the latter problem, a table comprising the list indexes of the contact candidate pairs is created and used to sum the contact forces acting on each particle for all contacts according to Newton's third law. With just these methods, memory access competition is avoided without additional redundant procedures. The parallel efficiency and compatibility of these two algorithms were evaluated in discrete element method (DEM) simulations on four types of shared-memory parallel computers: a multicore multiprocessor computer, scalar supercomputer, vector supercomputer, and graphics processing unit. The computational efficiency of a DEM code was found to be drastically improved with our algorithms on all but the scalar supercomputer. Thus, the developed parallel algorithms are useful on shared-memory parallel computers with sufficient memory bandwidth.
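The pre-conditioning idea behind the first algorithm can be sketched in serial form: group particle indices by cell label, then enumerate candidate pairs cell by cell. This 2-D toy version is illustrative only; the paper's version sorts the labels so that the pair list can be built without memory access competition on shared-memory hardware.

```python
# Sketch of the "sort by cell label" idea: particles are grouped by the
# cell they fall in, so each cell's particles are contiguous and the
# candidate-pair list can be built cell by cell. 2-D, serial version.

from collections import defaultdict
from itertools import combinations, product

def contact_candidates(positions, cell_size):
    cells = defaultdict(list)
    for i, (x, y) in enumerate(positions):
        cells[(int(x // cell_size), int(y // cell_size))].append(i)
    pairs = []
    for (cx, cy), members in cells.items():
        # Pairs within the cell itself.
        pairs.extend(combinations(members, 2))
        # Pairs with half of the neighboring cells (avoids duplicates,
        # since each unordered cell pair is visited exactly once).
        for dx, dy in ((1, 0), (1, 1), (0, 1), (-1, 1)):
            for i, j in product(members, cells.get((cx + dx, cy + dy), ())):
                pairs.append((i, j))
    return pairs

pts = [(0.1, 0.1), (0.2, 0.2), (0.9, 0.1), (3.5, 3.5)]
print(contact_candidates(pts, 1.0))  # [(0, 1), (0, 2), (1, 2)]
# particles 0, 1, 2 share a cell; particle 3 is isolated
```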

Nishiura, Daisuke, E-mail: nishiura@jamstec.go.j [Institute for Research on Earth Evolution, Japan Agency for Marine-Earth Science and Technology, Kanagawa 236-0001 (Japan); Sakaguchi, Hide [Institute for Research on Earth Evolution, Japan Agency for Marine-Earth Science and Technology, Kanagawa 236-0001 (Japan)

2011-03-01

347

Some popular iterative solvers for non-symmetric systems arising from the finite-element discretization of three-dimensional groundwater contaminant transport problem are implemented and compared on distributed memory parallel platforms. This paper attempts to determine which solvers are most suitable for the contaminant transport problem under varied conditions for large scale simulations on distributed parallel platforms. The original parallel implementation was targeted for the 1024 node Intel paragon platform using explicit message passing with the NX library. This code was then ported to SGI Power Challenge Array, Convex Exemplar, and Origin 2000 machines using an MPI implementation. The performance of these solvers is studied for increasing problem size, roughness of the coefficients, and selected problem scenarios. These conditions affect the properties of the matrix and hence the difficulty level of the solution process. Performance is analyzed in terms of convergence behavior, overall time, parallel efficiency, and scalability. The solvers that are presented are BiCGSTAB, GMRES, ORTHOMIN, and CGS. A simple diagonal preconditioner is used in this parallel implementation for all the methods. The results indicate that all methods are comparable in performance with BiCGSTAB slightly outperforming the other methods for most problems. The authors achieved very good scalability in all the methods up to 1024 processors of the Intel Paragon XPS/150. They demonstrate scalability by solving 100 time steps of a 40 million element problem in about 5 minutes using either BiCGSTAB or GMRES.

Mahinthakumar, G. [Oak Ridge National Lab., TN (United States). Center for Computational Sciences; Saied, F.; Valocchi, A.J. [Univ. of Illinois, Urbana, IL (United States)

1997-03-01

348

Relevance of the parallel nonlinearity in gyrokinetic simulations of tokamak plasmas

The influence of the parallel nonlinearity on transport in gyrokinetic simulations is assessed for values of ρ* which are typical of current experiments. Here, ρ* = ρ_s/a is the ratio of gyroradius, ρ_s, to plasma minor radius, a. The conclusion, derived from simulations with both GYRO [J. Candy and R. E. Waltz, J. Comput. Phys. 186, 585 (2003)] and GEM [Y. Chen and S. E. Parker, J. Comput. Phys. 189, 463 (2003)], is that no measurable effect of the parallel nonlinearity is apparent for ρ* < 0.012. This result is consistent with scaling arguments, which suggest that the parallel nonlinearity should be O(ρ*) smaller than the E×B nonlinearity. Indeed, for the plasma parameters under consideration, the magnitude of the parallel nonlinearity is a factor of 8ρ* smaller (for 0.00075 < ρ* < 0.012) than the other retained terms in the nonlinear gyrokinetic equation.

Candy, J.; Waltz, R. E.; Parker, S. E.; Chen, Y. [General Atomics, San Diego, California 92121 (United States); Center for Integrated Plasma Studies, University of Colorado at Boulder, Boulder, Colorado 80309 (United States)

2006-07-15

349

A Parallel, Finite-Volume Algorithm for Large-Eddy Simulation of Turbulent Flows

NASA Technical Reports Server (NTRS)

A parallel, finite-volume algorithm has been developed for large-eddy simulation (LES) of compressible turbulent flows. This algorithm includes piecewise linear least-square reconstruction, trilinear finite-element interpolation, Roe flux-difference splitting, and second-order MacCormack time marching. Parallel implementation is done using the message-passing programming model. In this paper, the numerical algorithm is described. To validate the numerical method for turbulence simulation, LES of fully developed turbulent flow in a square duct is performed for a Reynolds number of 320 based on the average friction velocity and the hydraulic diameter of the duct. Direct numerical simulation (DNS) results are available for this test case, and the accuracy of this algorithm for turbulence simulations can be ascertained by comparing the LES solutions with the DNS results. The effects of grid resolution, upwind numerical dissipation, and subgrid-scale dissipation on the accuracy of the LES are examined. Comparison with DNS results shows that the standard Roe flux-difference splitting dissipation adversely affects the accuracy of the turbulence simulation. For accurate turbulence simulations, only 3-5 percent of the standard Roe flux-difference splitting dissipation is needed.
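The role of the scaled upwind dissipation can be seen in a scalar stand-in for the Roe flux. The factor eps below models the 3-5 percent scaling reported above; this is scalar advection only, not the actual finite-volume scheme:

```python
# Scalar illustration of scaled Roe-type dissipation: the interface
# flux is a central average minus an upwind dissipation term multiplied
# by a factor eps. eps = 1 recovers full Roe/upwind dissipation; the
# finding above corresponds to eps ~ 0.03-0.05 for accurate LES.

def roe_flux(u_left, u_right, wavespeed, eps):
    central = 0.5 * wavespeed * (u_left + u_right)
    dissipation = 0.5 * abs(wavespeed) * (u_right - u_left)
    return central - eps * dissipation

# With eps = 1 and positive wavespeed this reduces to pure upwinding:
assert roe_flux(2.0, 5.0, 1.0, 1.0) == 1.0 * 2.0
# With eps = 0 it is the (non-dissipative) central flux:
assert roe_flux(2.0, 5.0, 1.0, 0.0) == 3.5
print(roe_flux(2.0, 5.0, 1.0, 0.05))  # lightly dissipative LES-type flux
```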

Bui, Trong T.

1999-01-01

350

Parallel Solutions for Voxel-Based Simulations of Reaction-Diffusion Systems

There is an increasing awareness of the pivotal role of noise in biochemical processes and of the effect of molecular crowding on the dynamics of biochemical systems. This has given rise to a strong need for suitable and sophisticated algorithms for the simulation of biological phenomena that take into account both spatial effects and noise. However, the high computational effort characterizing simulation approaches, coupled with the necessity to simulate models several times to achieve statistically relevant information on model behaviours, makes such algorithms very time-consuming for studying real systems. So far, different parallelization approaches have been deployed to reduce the computational time required to simulate the temporal dynamics of biochemical systems using stochastic algorithms. In this work we discuss these aspects for the spatial tau-leaping in crowded compartments (STAUCC) simulator, a voxel-based method for the stochastic simulation of reaction-diffusion processes which relies on the Sτ-DPP algorithm. In particular we present how the characteristics of the algorithm can be exploited for an effective parallelization on present heterogeneous HPC architectures. PMID:25045716
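The tau-leaping approximation underlying such simulators advances many reaction firings per step instead of one at a time. A minimal, well-mixed (non-spatial) sketch with a toy dimerization reaction; the species, rate constant, and sampler are illustrative stand-ins, not taken from the paper:

```python
import math
import random

# Core of a tau-leaping step: over a step tau, each reaction channel
# fires a Poisson-distributed number of times with mean propensity*tau,
# and the state is updated by the summed stoichiometry. Toy well-mixed
# dimerization system; Knuth's sampler stands in for a library Poisson
# generator. All names are illustrative.

def poisson(mean, rng):
    """Knuth's algorithm: count uniforms until their product < e^-mean."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def tau_leap_step(x, reactions, tau, rng):
    """x: species counts; reactions: (propensity_fn, stoichiometry)."""
    new = list(x)
    for propensity, stoich in reactions:
        k = poisson(propensity(x) * tau, rng)      # firings in [t, t+tau)
        for species, change in enumerate(stoich):
            new[species] += k * change
    return new

# 2 A -> B with rate c = 0.001: propensity c * nA * (nA - 1) / 2.
rng = random.Random(1)
state = [1000, 0]                                   # counts [A, B]
reactions = [(lambda s: 0.001 * s[0] * (s[0] - 1) / 2, (-2, +1))]
for _ in range(100):
    state = tau_leap_step(state, reactions, 0.01, rng)
print(state)  # A is consumed in pairs while B grows; A + 2B stays 1000
```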

D'Agostino, Daniele; Pasquale, Giulia; Clematis, Andrea; Maj, Carlo; Mosca, Ettore; Milanesi, Luciano; Merelli, Ivan

2014-01-01

351

Wakefield Simulation of CLIC PETS Structure Using Parallel 3D Finite Element Time-Domain Solver T3P

In recent years, SLAC's Advanced Computations Department (ACD) has developed the parallel 3D Finite Element electromagnetic time-domain code T3P. Higher-order Finite Element methods on conformal unstructured meshes and massively parallel processing allow unprecedented simulation accuracy for wakefield computations and simulations of transient effects in realistic accelerator structures. Applications include simulation of wakefield damping in the Compact Linear Collider (CLIC) power extraction and transfer structure (PETS).

Candel, A.; Kabel, A.; Lee, L.; Li, Z.; Ng, C.; Schussman, G.; Ko, K.; /SLAC; Syratchev, I.; /CERN

2009-06-19

352

Massively parallel molecular dynamics simulations of two-dimensional materials at high strain rates

NASA Astrophysics Data System (ADS)

Large scale molecular dynamics simulations on a massively parallel computer are performed to investigate the mechanical behavior of 2-dimensional materials. A pair potential and a model embedded atom many-body potential are examined, corresponding to 'brittle' and 'ductile' materials, respectively. A parallel molecular dynamics (MD) algorithm is developed to exploit the architecture of the Connection Machine, enabling simulations of greater than 10^6 atoms. A model spallation experiment is performed on a 2-D triangular crystal with a well-defined nanocrystalline defect on the spall plane. The process of spallation is modeled as a uniform adiabatic expansion. The spall strength is shown to be proportional to the logarithm of the applied strain rate, and a dislocation dynamics model is used to explain the results. Good predictions for the onset of spallation in the computer experiments are found from the simple model. The nanocrystal defect affects the propagation of the shock front, and failure is enhanced along the grain boundary.

Wagner, N. J.; Holian, B. L.

1992-11-01

353

Adaptive finite element simulation of flow and transport applications on parallel computers

NASA Astrophysics Data System (ADS)

The subject of this work is the adaptive finite element simulation of problems arising in flow and transport applications on parallel computers. Of particular interest are new contributions to adaptive mesh refinement (AMR) in this parallel high-performance context, including novel work on data structures, treatment of constraints in a parallel setting, generality and extensibility via object-oriented programming, and the design/implementation of a flexible software framework. This technology and software capability then enables more robust, reliable treatment of multiscale-multiphysics problems and specific studies of fine scale interaction such as those in biological chemotaxis (Chapter 4) and high-speed shock physics for compressible flows (Chapter 5). The work begins by presenting an overview of key concepts and data structures employed in AMR simulations. Of particular interest is how these concepts are applied in the physics-independent software framework which is developed here and is the basis for all the numerical simulations performed in this work. This open-source software framework has been adopted by a number of researchers in the U.S. and abroad for use in a wide range of applications. The dynamic nature of adaptive simulations poses particular issues for efficient implementation on distributed-memory parallel architectures. Communication cost, computational load balance, and memory requirements must all be considered when developing adaptive software for this class of machines. Specific extensions to the adaptive data structures to enable implementation on parallel computers are therefore considered in detail. The libMesh framework for performing adaptive finite element simulations on parallel computers is developed to provide a concrete implementation of the above ideas.
This physics-independent framework is applied to two distinct flow and transport applications classes in the subsequent application studies to illustrate the flexibility of the design and to demonstrate the capability for resolving complex multiscale processes efficiently and reliably. The first application considered is the simulation of chemotactic biological systems such as colonies of Escherichia coli. This work appears to be the first application of AMR to chemotactic processes. These systems exhibit transient, highly localized features and are important in many biological processes, which make them ideal for simulation with adaptive techniques. A nonlinear reaction-diffusion model for such systems is described and a finite element formulation is developed. The solution methodology is described in detail. Several phenomenological studies are conducted to study chemotactic processes and resulting biological patterns which use the parallel adaptive refinement capability developed in this work. The other application study is much more extensive and deals with fine scale interactions for important hypersonic flows arising in aerospace applications. These flows are characterized by highly nonlinear, convection-dominated flowfields with very localized features such as shock waves and boundary layers. These localized features are well-suited to simulation with adaptive techniques. A novel treatment of the inviscid flux terms arising in a streamline-upwind Petrov-Galerkin finite element formulation of the compressible Navier-Stokes equations is also presented and is found to be superior to the traditional approach. The parallel adaptive finite element formulation is then applied to several complex flow studies, culminating in fully three-dimensional viscous flows about complex geometries such as the Space Shuttle Orbiter. 
Physical phenomena such as viscous/inviscid interaction, shock wave/boundary layer interaction, shock/shock interaction, and unsteady acoustic-driven flowfield response are considered in detail. A computational investigation of a 25°/55° double cone configuration details the complex multiscale flow features and investigates a potential source of experimentally-observed unsteady flowfield response.

Kirk, Benjamin Shelton

354

Parallel spatial direct numerical simulation of boundary-layer flow transition on IBM SP1

The spatially evolving disturbances that are associated with laminar-to-turbulent transition in three-dimensional boundary-layer flows are computed with the PSDNS code on an IBM SP1 parallel supercomputer. By remapping the distributed data structure during the course of the calculation, optimized serial library routines can be utilized that substantially increase the computational performance. Although the remapping incurs a high communication penalty, the parallel efficiency of the code remains above 40 percent for all performed calculations. By using appropriate compile options and optimized library routines, the serial code achieves 52-56 Mflops on a single node of the SP1 (45 percent of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a "real world" simulation that consists of 1.7 million grid points. Comparisons to the Cray Y/MP and Cray C-90 are made for this large scale simulation.

Hanebutte, U.R. [Argonne National Lab., IL (United States); Joslin, R.D. [National Aeronautics and Space Administration, Hampton, VA (United States). Langley Research Center; Zubair, M. [International Business Machines Corp., Yorktown Heights, NY (United States). Thomas J. Watson Research Center

1995-07-01

355

Modeling and simulation of auction-based shop-floor control using parallel computing

The high level of complexity and cost involved in the development, testing, and implementation of software for traditional, hierarchical shop-floor control of automated manufacturing systems has motivated considerable research in recent years on the distributed shop-floor control paradigm. In this paper, we describe a methodology for modeling and simulation of an auction-based shop-floor control scheme in a parallel and distributed

DHARMARAJ VEERAMANI; KUNG-JENG WANG; JOSE ROJAS

1998-01-01

356

Alpha-Helix Formation in C-Peptide Rnase-A Investigated by Parallel Tempering Simulations

We have performed parallel tempering simulations of a 13-residue peptide fragment of ribonuclease-A, c-peptide, in implicit solvent with constant dielectric permittivity. This peptide has a strong tendency to form alpha-helical conformations in solvent, as suggested by circular dichroism (CD) and nuclear magnetic resonance (NMR) experiments. Our results demonstrate that the 5th and 8th-12th residues are in the alpha-helical region of the

Gökhan Gökoglu; Tarik Çelik

2007-01-01

357

3D Parallel Simulation Model of Continuous Beam Electron Cloud Interactions

Long-term beam-electron cloud interaction is modeled with a 3D parallel continuous model originally developed for plasma wakefield acceleration modeling. The simulation results are compared with the two macro-particle model for strong head-tail instability. The two macro-particle model qualitatively captures some of the instability features of the beam. The code is then used to model and make predictions for the

A. Z. Ghalam; T. Katsouleas; V. K. Decyk; C. K. Huang; W. B. Mori; G. Rumolo; E. Benedetto; F. Zimmermann

2005-01-01

358

Fault Simulation for the Electrostatic Parallel-Plate Micro-actuator

The fault analysis for the electrostatic parallel-plate actuator is a critical issue in the design flow, as it plays an important role in determining device reliability and the voltage required for successful operation. The simulations for pull-in and release voltages were carried out in CoventorWare. The FEM model was constructed. The displacement of the upper electrode and corresponding capacitance between the upper

Yongping Hao; Fengli Liu

2009-01-01

359

Whole-body optical molecular imaging is rapidly developing for preclinical research. It is essential to develop novel simulation methods of light propagation for optical imaging, especially when a priori knowledge, a large-volume domain and wide-range optical properties need to be considered in the reconstruction algorithm. In this paper, we develop a three-dimensional parallel adaptive finite element method with simplified

Yujie Lu; Arion F. Chatziioannou

2008-01-01

360

A scalable parallel algorithm for large-scale reactive force-field molecular dynamics simulations

A scalable parallel algorithm has been designed to perform multimillion-atom molecular dynamics (MD) simulations, in which first principles-based reactive force fields (ReaxFF) describe chemical reactions. Environment-dependent bond orders associated with atomic pairs and their derivatives are reused extensively with the aid of linked-list cells to minimize the computation associated with atomic n-tuple interactions (n ≤ 4 explicitly and n = 6 due

Ken-Ichi Nomura; Rajiv K. Kalia; Aiichiro Nakano; Priya Vashishta

2008-01-01

361

NASA Astrophysics Data System (ADS)

Parallel molecular dynamics (MD) simulations are performed to investigate pressure-induced solid-to-solid structural phase transformations in cadmium selenide (CdSe) nanorods. The effects of the size and shape of nanorods on different aspects of structural phase transformations are studied. Simulations are based on interatomic potentials validated extensively by experiments. Simulations range from 10^5 to 10^6 atoms. These simulations are enabled by highly scalable algorithms executed on massively parallel Beowulf computing architectures. Pressure-induced structural transformations are studied using a hydrostatic pressure medium simulated by atoms interacting via the Lennard-Jones potential. Four single-crystal CdSe nanorods, each 44 Å in diameter but varying in length, in the range between 44 Å and 600 Å, are studied independently in two sets of simulations. The first simulation is the downstroke simulation, where each rod is embedded in the pressure medium and subjected to increasing pressure during which it undergoes a forward transformation from a 4-fold coordinated wurtzite (WZ) crystal structure to a 6-fold coordinated rocksalt (RS) crystal structure. In the second, so-called upstroke simulation, the pressure on the rods is decreased and a reverse transformation from 6-fold RS to a 4-fold coordinated phase is observed. The transformation pressure in the forward transformation depends on the nanorod size, with longer rods transforming at lower pressures close to the bulk transformation pressure. Spatially-resolved structural analyses, including pair-distributions, atomic-coordinations and bond-angle distributions, indicate nucleation begins at the surface of nanorods and spreads inward. The transformation results in a single RS domain, in agreement with experiments. The microscopic mechanism for transformation is observed to be the same as for bulk CdSe.
A nanorod size dependency is also found in reverse structural transformations, with longer nanorods transforming more readily than smaller ones. Nucleation initiates at the center of the rod and grows outward.

Lee, Nicholas Jabari Ouma

362

A Hybrid MPI-OpenMP Parallel Implementation for Simulating Taylor-Couette Flow

A hybrid-parallel direct-numerical-simulation method for turbulent Taylor-Couette flow is presented. The Navier-Stokes equations are discretized in cylindrical coordinates with the spectral Fourier-Galerkin method in the axial and azimuthal directions, and high-order finite differences in the radial direction. Time is advanced by a second-order, semi-implicit projection scheme, which requires the solution of five Helmholtz/Poisson equations, avoids staggered grids and renders very small slip velocities. Nonlinear terms are computed with the pseudospectral method. The code is parallelized using a hybrid MPI-OpenMP strategy, which is simpler to implement, reduces inter-node communications and is more efficient compared to a flat MPI parallelization. A strong scaling study shows that the hybrid code maintains very good scalability up to O(10^4) processor cores, thus allowing simulations at higher resolutions than previously feasible, and opens up the possibility to simulate turbulent Tayl...

Shi, Liang; Hof, Bjoern; Avila, Marc

2013-01-01

363

NASA Technical Reports Server (NTRS)

This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers" and "Develop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications." The discussion of Scalable High Performance Computing reports on three objectives: validate, assess the scalability of, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulation (DNS) and Large Eddy Simulation (LES) problems; and investigate and develop a high-order Reynolds-averaged Navier-Stokes turbulence model. The discussion of High-Performance Time-Domain Computational Electromagnetics reports on five objectives: enhancement of an electromagnetics code (CHARGE) to effectively model antenna problems; application of lessons learned in the high-order/spectral solution of swirling 3D jets to the electromagnetics project; transition of a high-order fluids code, FDL3DI, to solve Maxwell's equations using compact differencing; development and demonstration of improved radiation-absorbing boundary conditions for high-order CEM; and extension of the high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.

Morgan, Philip E.

2004-01-01

364

Simulation study of a parallel processor with unbalanced loads. Master's thesis

The purpose of this thesis was twofold: first, to estimate the impact of unbalanced computational loads on a parallel-processing architecture via Monte Carlo simulation; and second, to investigate the impact of representing the dynamics of the parallel-processing problem via animated simulation. The study is constrained to the hypercube architecture, in which each node is connected in a predetermined topology and allowed to communicate with other nodes through calls to the operating system. Routing of messages through the network is fixed and specified within the operating system. Message transmission preempts nodal processing, so internodal communications complicate the concurrent operation of the network. Two independent variables are defined: (1) the degree of imbalance characterizes the nature or severity of the load imbalance, and (2) the degree of locality characterizes the node loadings with respect to node locations across the cube. A SLAM II simulation model of a generic 16-node hypercube was constructed in which each node processes a predetermined number of computational tasks and, following each task, sends a message to a single randomly chosen receiver node. An experiment was designed in which the independent variables, degree of imbalance and degree of locality, were varied across two computation-to-I/O ratios to determine their separate and interactive effects on the dependent variable, job speedup. ANOVA and regression techniques were used to estimate the relationship of load imbalance, locality, computation-to-I/O ratio, and their interactions to job speedup. Results show that load imbalance severely impacts a parallel processor's performance.
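The core effect studied in this thesis can be illustrated with a toy model (a sketch only: it ignores the communication and preemption effects the thesis actually measures). With equal-cost tasks, elapsed parallel time is set by the busiest node, so speedup is total work divided by the maximum node load:

```python
def speedup(loads):
    """Toy speedup model for a message-free run: serial time equals the
    total work; parallel time equals the busiest node's load."""
    return sum(loads) / max(loads)

# 16 nodes, 1600 unit-cost tasks in both scenarios:
balanced = [100] * 16            # perfectly balanced hypercube
imbalanced = [1000] + [40] * 15  # one hot-spot node holds most of the work
```

The balanced case achieves the ideal 16x, while the hot-spot case collapses to 1.6x despite identical total work, which is the qualitative result the ANOVA in the thesis quantifies.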

Moore, T.S.

1987-12-01

365

The objective of this article is to report the parallel implementation of a 3D molecular dynamics simulation code for laser-cluster interactions. The code is benchmarked by comparing the simulation results with some of the experiments reported in the literature. Scaling laws for the computational time are established by varying the number of processor cores and the number of macroparticles used. The capabilities of the code are highlighted by implementing various diagnostic tools. The executable version of the code is available from the author for studying the dynamics of laser-cluster interactions.

Holkundkar, Amol R. [Department of Physics, Birla Institute of Technology and Science, Pilani-333 031 (India)]

2013-11-15

366

NASA Astrophysics Data System (ADS)

Large-scale ground motion simulation requires supercomputing systems in order to obtain reliable and useful results within reasonable elapsed time. In this study, we develop a framework for terascale ground motion simulations in highly heterogeneous basins. As part of the development, we present a parallel octree-based multiresolution finite element methodology for the elastodynamic wave propagation problem. The octree-based multiresolution finite element method reduces memory use significantly and improves overall computational performance. The framework comprises three parts: (1) an octree-based mesh generator, Euclid, developed by Tu and O'Hallaron; (2) a parallel mesh partitioner, ParMETIS, developed by Karypis et al. [2]; and (3) a parallel octree-based multiresolution finite element solver, QUAKE, developed in this study. Realistic earthquake parameters, soil material properties, and sedimentary basin dimensions produce extremely large meshes. The out-of-core design of the octree-based mesh generator, Euclid, overcomes the resulting severe memory limitations. By using a parallel, distributed-memory graph partitioning algorithm, ParMETIS partitions large meshes, overcoming the memory and cost problem. Despite the capability of the Octree-Based Multiresolution Mesh Method (OBM3), large problem sizes necessitate parallelism to handle the large memory and work requirements. The parallel OBM3 elastic wave propagation code, QUAKE, has been developed to address these issues. The numerical methodology and the framework have been used to simulate the seismic response of both idealized systems and of the Greater Los Angeles basin to simple pulses and to a mainshock of the 1994 Northridge Earthquake, for frequencies of up to 1 Hz and a domain size of 80 km x 80 km x 30 km. In the idealized models, QUAKE shows good agreement with the analytical Green's function solutions.
In the realistic models of the Northridge earthquake mainshock, QUAKE agrees qualitatively with the observational data, to within at most a factor of 2.5. Through simulations for several models, ranging in size from 400,000 to 300 million degrees of freedom, on the 512-processor Cray T3E and the 3000-processor HP-Compaq AlphaServer Cluster at the Pittsburgh Supercomputing Center, we achieve excellent performance and scalability.
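Octree meshes of the kind generated here are commonly linearized with Morton (Z-order) keys, which interleave the bits of the octant coordinates so that spatially nearby octants get nearby keys; this is what makes partitioning and out-of-core traversal tractable. A sketch of the encoding (illustrative; not necessarily the exact scheme used by Euclid or ParMETIS):

```python
def morton_key(x, y, z, bits=10):
    """Interleave the low `bits` bits of non-negative integer coordinates
    (x, y, z) into one Z-order key: bit i of x lands at position 3*i,
    bit i of y at 3*i + 1, bit i of z at 3*i + 2."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key
```

Sorting octants by this key yields a space-filling-curve order, so splitting the sorted list into equal chunks gives a simple, locality-preserving domain decomposition.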

Kim, Eui Joong

367

Parallel and Distributed Simulation from Many Cores to the Public Cloud (Extended Version)

In this tutorial paper, we first review some basic simulation concepts and then introduce parallel and distributed simulation techniques in view of some new challenges of today and tomorrow. In particular, recent years have seen a wide diffusion of many-core architectures, and we can expect this trend to continue. On the other hand, the success of cloud computing is strongly promoting the everything-as-a-service paradigm. Is parallel and distributed simulation ready for these new challenges? The current approaches present many limitations in terms of usability and adaptivity: there is a strong need for new evaluation metrics and for revising the currently implemented mechanisms. In the last part of the paper, we propose a new approach based on multi-agent systems for the simulation of complex systems. It is possible to implement advanced techniques, such as the migration of simulated entities, in order to build mechanisms that are both adaptive and very easy to use. Adaptive mechanisms...

D'Angelo, Gabriele

2011-01-01

368

NASA Astrophysics Data System (ADS)

The Geophysical Finite Element Simulation Tool (GeoFEST) can be used to simulate and produce synthetic, observable, time-dependent surface deformations over both short and long time scales. Such simulations aid in the interpretation of GPS, InSAR, and other geodetic techniques that will require detailed analysis as increasingly large data volumes from NASA remote sensing programs are developed and deployed. The NASA Earth Science Technology Office Computational Technologies Program (ESTO/CT) has funded extensions to GeoFEST to support larger-scale simulations, adaptive methods, and scalability across a variety of parallel computing systems. The software and hardware technologies applied to make this transition, as well as additional near-term development plans for GeoFEST, will be described.

Norton, C. D.; Lyzenga, G. A.; Parker, J. W.; Tisdale, E. R.

2004-12-01

369

DEVS framework for modelling, simulation, analysis, and design of hybrid systems

We make the case that Discrete Event System Specification (DEVS) is a universal formalism for discrete event dynamical systems (DEDS). DEVS offers an expressive framework for modelling, design, analysis and simulation of autonomous and hybrid systems. We review some known features of DEVS and its extensions. We then focus on the use of DEVS to formulate and synthesize supervisory level

Bernard P. Zeigler; Hae Sang Song; Tag Gon Kim; Herbert Praehofer

370

National Technical Information Service (NTIS)

Anti-air warfare (AAW) has been a top priority for the world's navies in developing tactics and choosing the most effective ship defense systems. This thesis develops a model as an analysis tool to measure the effectiveness of radar and IR sensors in AAW ...

O. Kulac

1999-01-01

371

World-class utilization of manufacturing resources is of vital importance to any manufacturing enterprise in the global competition of today. This requirement calls for superior performance of all processes related to the manufacturing of products. One of these processes is the resetting of machinery and equipment between two product runs, which will be the focus area of this text. This paper

Bjørn Johansson; Jürgen Kaiser

2002-01-01

372

Heavy industries operate equipment having a long life to generate revenue or perform a mission. These industries must invest in the specialized service parts needed to maintain their equipment, because unlike in other ...

Bradley, Randolph L. (Randolph Lewis)

2012-01-01

373

Field-Scale, Massively Parallel Simulation of Production from Oceanic Gas Hydrate Deposits

NASA Astrophysics Data System (ADS)

The quantity of hydrocarbon gases trapped in natural hydrate accumulations is enormous, leading to significant interest in the evaluation of their potential as an energy source. It has been shown that large volumes of gas can be readily produced at high rates for long times from some types of methane hydrate accumulations by means of depressurization-induced dissociation, using conventional technologies with horizontal or vertical well configurations. However, these systems are currently assessed using simplified or reduced-scale 3D or even 2D production simulations. In this study, we use the massively parallel TOUGH+HYDRATE code (pT+H) to assess the production potential of a large, deep-ocean hydrate reservoir and develop strategies for effective production. The simulations model a full 3D system of over 24 km² extent, examining the productivity of vertical and horizontal wells, single or multiple wells, and exploring variations in reservoir properties. Systems of up to 2.5M gridblocks, running on thousands of supercomputing nodes, are required to simulate such large systems at the highest level of detail. The simulations reveal the challenges inherent in producing from deep, relatively cold systems with extensive water-bearing channels and connectivity to large aquifers, including the difficulty of achieving depressurization, the challenges of high water removal rates, and the complexity of production design. Also highlighted are new frontiers in large-scale reservoir simulation of coupled flow, transport, thermodynamics, and phase behavior, including the construction of large meshes, the use of parallel numerical solvers and MPI, and large-scale parallel 3D visualization of results.

Reagan, M. T.; Moridis, G. J.; Freeman, C. M.; Pan, L.; Boyle, K. L.; Johnson, J. N.; Husebo, J. A.

2012-12-01

374

Massively parallel Monte Carlo for many-particle simulations on GPUs

Current trends in parallel processors call for the design of efficient massively parallel algorithms for scientific computing. Parallel algorithms for Monte Carlo simulations of thermodynamic ensembles of particles have received little attention because of the inherent serial nature of the statistical sampling. In this paper, we present a massively parallel method that obeys detailed balance and implement it for a system of hard disks on the GPU. We reproduce results of serial high-precision Monte Carlo runs to verify the method. This is a good test case because the hard disk equation of state over the range where the liquid transforms into the solid is particularly sensitive to small deviations away from the balance conditions. On a Tesla K20, our GPU implementation executes over one billion trial moves per second, which is 148 times faster than on a single Intel Xeon E5540 CPU core, enables 27 times better performance per dollar, and cuts energy usage by a factor of 13. With this improved performance we are able to calculate the equation of state for systems of up to one million hard disks. These large system sizes are required in order to probe the nature of the melting transition, which has been debated for the last forty years. In this paper we present the details of our computational method, and discuss the thermodynamics of hard disks separately in a companion paper.

Anderson, Joshua A.; Jankowski, Eric [Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109 (United States)]; Grubb, Thomas L. [Department of Materials Science and Engineering, University of Michigan, Ann Arbor, MI 48109 (United States)]; Engel, Michael [Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109 (United States)]; Glotzer, Sharon C., E-mail: sglotzer@umich.edu [Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109 (United States); Department of Materials Science and Engineering, University of Michigan, Ann Arbor, MI 48109 (United States)]

2013-12-01

376

Massively parallel Monte Carlo for many-particle simulations on GPUs

NASA Astrophysics Data System (ADS)

Current trends in parallel processors call for the design of efficient massively parallel algorithms for scientific computing. Parallel algorithms for Monte Carlo simulations of thermodynamic ensembles of particles have received little attention because of the inherent serial nature of the statistical sampling. We present a massively parallel method that obeys detailed balance and implement it for a system of hard disks on the GPU.[1] We reproduce results of serial high-precision Monte Carlo runs to verify the method.[2] This is a good test case because the hard disk equation of state over the range where the liquid transforms into the solid is particularly sensitive to small deviations away from the balance conditions. On a GeForce GTX 680, our GPU implementation executes 95 times faster than on a single Intel Xeon E5540 CPU core, enabling 17 times better performance per dollar and cutting energy usage by a factor of 10. [1] J.A. Anderson, E. Jankowski, T. Grubb, M. Engel and S.C. Glotzer, arXiv:1211.1646. [2] J.A. Anderson, M. Engel, S.C. Glotzer, M. Isobe, E.P. Bernard and W. Krauth, arXiv:1211.1645.

Glotzer, Sharon; Anderson, Joshua; Jankowski, Eric; Grubb, Thomas; Engel, Michael

2013-03-01

377

Simulation and Analysis of Parallel Self-Excited Induction Generators for Islanded Wind Farm Systems

A mathematical model is presented to describe the transient behavior of a system of self-excited induction generators (SEIGs). Index terms: induction generators, parallel machines, state-space methods, transient analysis.

Simões, Marcelo Godoy

378

This paper proposes a novel Lightweight Time Warp (LTW) protocol for high-performance parallel optimistic simulation of large-scale DEVS and Cell-DEVS models. By exploiting the characteristics of the simulation process, the protocol is able to set free most logical processes (LPs) from the Time Warp mechanism, while the overall simulation still executes optimistically, driven by only a

Qi Liu; Gabriel A. Wainer

2008-01-01

379

Simulation of Cooperative Water Supply and Flood Operations for Two Parallel Reservoirs

Reallocation and re-operation project alternatives were simulated on a monthly

Lund, Jay R.

380

The three-dimensional boiling water reactor (BWR) core under daily load-following operation was simulated using the processor array for continuum simulation (PACS-32), a newly developed parallel microprocessor system. The PACS system consists of 32 processing units (PUs) (microprocessors) and has a multi-instruction, multi-data (MIMD) architecture, making it well suited to the numerical simulation of partial differential equations. The BWR

T. Hoshino; T. Shirakawa

1982-01-01

381

We report on a comparative study of the performance of shared and distributed memory parallel simulation algorithms on a large-scale military logistics simulation, and describe the nature of the application and its parallelisation in some detail. We demonstrate that the patterns of communication in the simulation were such that a standard implementation of Breathing Time Buckets (BTB) was unable to

C. J. M. Booth; D. I. Bruce; P. R. Hoare; M. J. Kirton; K. R. Milner; I. J. Relf

1996-01-01

383

Parallel, adaptive, multi-object trajectory integrator for space simulation applications

NASA Astrophysics Data System (ADS)

Computer simulation is a very helpful approach for improving results from space-borne experiments. Initial-value problems (IVPs) can be applied for modeling the dynamics of different objects: artificial Earth satellites, charged particles in magnetic and electric fields, charged or non-charged dust particles, and space debris. An integrator for systems of ordinary differential equations (ODESs), based on embedded Runge-Kutta-Fehlberg methods of different orders, is developed. These methods enable evaluation of the local error. Instead of step-size control based on local error evaluation, an optimal integration method is selected, and integration proceeds with constant-sized steps that meet the required local error. This optimal scheme selection reduces the amount of calculation needed for solving the IVPs. In addition, for implementation on a multi-core processor with thread-based parallelization, we describe how to solve multiple systems of IVPs efficiently in parallel. The proposed integrator allows the application of a different force model for every object in multi-satellite simulation models. Simultaneous application of the integrator to different kinds of problems within one combined simulation model is also possible. The basic application of the integrator is solving mechanical IVPs in the context of simulation models and their application in complex multi-satellite space missions and as a design tool for experiments.

Atanassov, Atanas Marinov

2014-10-01

384

NASA Astrophysics Data System (ADS)

We use molecular dynamics simulations to study the structure, dynamics, and transport properties of nano-confined water between parallel graphite plates with separation distances (H) from 7 to 20 Å at different water densities, with an emphasis on anisotropies generated by confinement. The behavior of the confined water phase is compared to non-confined bulk water under similar pressure and temperature conditions. Our simulations show anisotropic structure and dynamics of the confined water phase in directions parallel and perpendicular to the graphite plates. The magnitude of these anisotropies depends on the slit width H. Confined water shows "solid-like" structure and slow dynamics for the water layers near the plates. The mean square displacements (MSDs) and velocity autocorrelation functions (VACFs) for directions parallel and perpendicular to the graphite plates are calculated. By increasing the confinement distance from H = 7 Å to H = 20 Å, the MSD increases and the behavior of the VACF indicates that the confined water changes from solid-like to liquid-like dynamics. If the initial density of the water phase is set up using geometric criteria (i.e., the distance between the graphite plates), large pressures (on the order of ~10 katm) and large pressure anisotropies are established within the water. By decreasing the density of the water between the confined plates to about 0.9 g/cm³, bubble formation and restructuring of the water layers are observed.
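MSDs of the kind reported here are computed from trajectories as MSD(τ) = ⟨|r(t+τ) − r(t)|²⟩. A minimal single-particle, single-direction sketch (real analyses average over particles and resolve the directions parallel and perpendicular to the plates separately):

```python
def msd(positions, lag):
    """Mean square displacement at a given lag, averaged over all time
    origins, for a 1-D trajectory sampled at uniform intervals."""
    n = len(positions) - lag
    return sum((positions[t + lag] - positions[t]) ** 2 for t in range(n)) / n
```

For a ballistic trajectory x(t) = v·t the MSD grows as (v·lag)², while diffusive motion would grow linearly in the lag; comparing the two regimes is how solid-like versus liquid-like dynamics is diagnosed.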

Mosaddeghi, Hamid; Alavi, Saman; Kowsari, M. H.; Najafi, Bijan

2012-11-01

385

Parallelization of Particle-Particle, Particle-Mesh Method within N-Body Simulation

NSDL National Science Digital Library

The N-Body problem has become an integral part of the computational sciences, and many methods have arisen to solve and approximate it. The solution potentially requires on the order of N² calculations each time step, so efficient performance of these N-Body algorithms is very significant [5]. This work describes the parallelization and optimization of the Particle-Particle, Particle-Mesh (P3M) algorithm within GalaxSeeHPC, an open-source N-Body simulation code. Upon successful profiling, MPI (Message Passing Interface) routines were implemented into the population of the density grid in the P3M method in GalaxSeeHPC. Each problem size recorded different results, and for a problem set dealing with 10,000 celestial bodies, speedups up to 10x were achieved. However, in accordance with Amdahl's Law, maximum speedups for the code should have been closer to 16x. In order to achieve maximum optimization, additional research is needed, and parallelization of the Fourier transform routines could prove rewarding. In conclusion, the GalaxSeeHPC simulation was successfully parallelized and obtained very respectable results, while further optimization remains possible.
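The Amdahl's-law bound invoked above follows from the serial fraction s of the run: speedup on p processors is S(p) = 1/(s + (1−s)/p), with limit 1/s as p grows. A sketch (the quoted 16x ceiling corresponds to a serial fraction near 1/16, an inference from the abstract's numbers):

```python
def amdahl_speedup(serial_fraction, p):
    """Amdahl's law: attainable speedup on p processors when a fraction
    `serial_fraction` of the runtime cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)
```

This is why parallelizing the remaining serial Fourier-transform routines, as the authors suggest, is the natural next step: it directly lowers s and raises the ceiling.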

Nocito, Nicholas

386

Data Parallel Execution Challenges and Runtime Performance of Agent Simulations on GPUs

Programmable graphics processing units (GPUs) have emerged as excellent computational platforms for certain general-purpose applications. The data parallel execution capabilities of GPUs specifically point to the potential for effective use in simulations of agent-based models (ABM). In this paper, the computational efficiency of ABM simulation on GPUs is evaluated on representative ABM benchmarks. The runtime speed of GPU-based models is compared to that of traditional CPU-based implementations, and also to that of equivalent models in traditional ABM toolkits (Repast and NetLogo). As expected, it is observed that GPU-based ABM execution affords excellent speedup on simple models, with better speedup on models exhibiting good locality and a fair amount of computation per memory element. Execution is two to three orders of magnitude faster with a GPU than with leading ABM toolkits, but at the cost of a decrease in modularity, ease of programmability, and reusability. At a more fundamental level, however, the data parallel paradigm is found to be somewhat at odds with traditional model-specification approaches for ABM. Effective use of data parallel execution, in general, seems to require resolution of modeling and execution challenges. Some of the challenges are identified and related solution approaches are described.

Perumalla, Kalyan S. [ORNL]; Aaby, Brandon G. [ORNL]

2008-01-01

387

Parallel Agent-Based Simulations on Clusters of GPUs and Multi-Core Processors

An effective latency-hiding mechanism is presented in the parallelization of agent-based model simulations (ABMS) with millions of agents. The mechanism is designed to accommodate the hierarchical organization as well as heterogeneity of current state-of-the-art parallel computing platforms. We use it to explore the computation vs. communication trade-off continuum available with the deep computational and memory hierarchies of extant platforms and present a novel analytical model of the tradeoff. We describe our implementation and report preliminary performance results on two distinct parallel platforms suitable for ABMS: CUDA threads on multiple, networked graphical processing units (GPUs), and pthreads on multi-core processors. Message Passing Interface (MPI) is used for inter-GPU as well as inter-socket communication on a cluster of multiple GPUs and multi-core processors. Results indicate the benefits of our latency-hiding scheme, delivering as much as over 100-fold improvement in runtime for certain benchmark ABMS application scenarios with several million agents. This speed improvement is obtained on our system that is already two to three orders of magnitude faster on one GPU than an equivalent CPU-based execution in a popular simulator in Java. Thus, the overall execution of our current work is over four orders of magnitude faster when executed on multiple GPUs.

Aaby, Brandon G. [ORNL]; Perumalla, Kalyan S. [ORNL]; Seal, Sudip K. [ORNL]

2010-01-01

388

De Novo Ultrascale Atomistic Simulations On High-End Parallel Supercomputers

We present a de novo hierarchical simulation framework for first-principles based predictive simulations of materials and their validation on high-end parallel supercomputers and geographically distributed clusters. In this framework, high-end chemically reactive and non-reactive molecular dynamics (MD) simulations explore a wide solution space to discover microscopic mechanisms that govern macroscopic material properties, into which highly accurate quantum mechanical (QM) simulations are embedded to validate the discovered mechanisms and quantify the uncertainty of the solution. The framework includes an embedded divide-and-conquer (EDC) algorithmic framework for the design of linear-scaling simulation algorithms with minimal bandwidth complexity and tight error control. The EDC framework also enables adaptive hierarchical simulation with automated model transitioning assisted by graph-based event tracking. A tunable hierarchical cellular decomposition parallelization framework then maps the O(N) EDC algorithms onto Petaflops computers, while achieving performance tunability through a hierarchy of parameterized cell data/computation structures, as well as its implementation using hybrid Grid remote procedure call + message passing + threads programming. High-end computing platforms such as IBM BlueGene/L, SGI Altix 3000 and the NSF TeraGrid provide an excellent test ground for the framework. On these platforms, we have achieved unprecedented scales of quantum-mechanically accurate and well validated, chemically reactive atomistic simulations--1.06 billion-atom fast reactive force-field MD and 11.8 million-atom (1.04 trillion grid points) quantum-mechanical MD in the framework of the EDC density functional theory on adaptive multigrids--in addition to 134 billion-atom non-reactive space-time multiresolution MD, with the parallel efficiency as high as 0.998 on 65,536 dual-processor BlueGene/L nodes.
We have also achieved an automated execution of hierarchical QM/MD simulation on a Grid consisting of 6 supercomputer centers in the US and Japan (a total of 150 thousand processor-hours), in which the number of processors changes dynamically on demand and resources are allocated and migrated dynamically in response to faults. Furthermore, performance portability has been demonstrated on a wide range of platforms such as BlueGene/L, Altix 3000, and AMD Opteron-based Linux clusters.
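Linear-scaling MD of the kind described rests on spatial locality: with short-ranged forces, binning atoms into cells no smaller than the interaction cutoff reduces neighbor search from O(N²) to O(N). A minimal 2-D cell-list sketch (an illustration of the general cell-decomposition idea, not the EDC framework itself):

```python
def build_cell_list(points, box, cutoff):
    """Bin 2-D points into square cells of side >= cutoff so that every
    neighbor within `cutoff` of a point lies in its own or an adjacent cell."""
    ncell = max(1, int(box / cutoff))
    size = box / ncell
    cells = {}
    for idx, (x, y) in enumerate(points):
        key = (int(x / size) % ncell, int(y / size) % ncell)
        cells.setdefault(key, []).append(idx)
    return cells, ncell

def neighbors_within(points, box, cutoff, i):
    """Indices of points within `cutoff` of point i in a periodic box,
    scanning only the 3x3 block of cells around point i."""
    cells, ncell = build_cell_list(points, box, cutoff)
    size = box / ncell
    cx = int(points[i][0] / size) % ncell
    cy = int(points[i][1] / size) % ncell
    block = {((cx + dx) % ncell, (cy + dy) % ncell)
             for dx in (-1, 0, 1) for dy in (-1, 0, 1)}  # set avoids double counting
    out = []
    for key in block:
        for j in cells.get(key, []):
            if j == i:
                continue
            ddx = (points[i][0] - points[j][0] + box / 2) % box - box / 2
            ddy = (points[i][1] - points[j][1] + box / 2) % box - box / 2
            if ddx * ddx + ddy * ddy <= cutoff * cutoff:
                out.append(j)
    return out
```

Since each point is checked only against a bounded neighborhood, total work stays proportional to N at fixed density, which is the property the hierarchical cellular decomposition exploits for parallelization.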

Nakano, A; Kalia, R K; Nomura, K; Sharma, A; Vashishta, P; Shimojo, F; van Duin, A; Goddard, III, W A; Biswas, R; Srivastava, D; Yang, L H

2006-09-04

389

Parallel traffic flow simulation of freeway networks: Phase 2. Final report 1994--1995

Explicit and implicit numerical methods for solving simple macroscopic traffic flow continuum models have been studied and efficiently implemented in traffic simulation codes in the past. The authors have already studied and implemented explicit methods for solving the high-order flow conservation traffic model. Implicit methods allow a much larger time step size than explicit methods for the same accuracy. However, at each time step a nonlinear system must be solved. They use the Newton method coupled with a linear iterative method (Orthomin), and accelerate the convergence of Orthomin with parallel incomplete LU factorization preconditioners. The authors implemented this implicit method on a 16-processor nCUBE2 parallel computer and obtained significant execution time speedup.
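The nonlinear solve required at each implicit time step reduces, for a scalar analogue, to Newton iteration on the backward-Euler residual g(y) = y − y_old − h·f(y). A toy sketch (the paper solves the full discretized system with Newton plus preconditioned Orthomin; this scalar version only illustrates the per-step structure):

```python
def backward_euler_step(f, dfdy, y_old, h, tol=1e-12, max_iter=50):
    """Advance y' = f(y) one backward-Euler step by Newton iteration on
    g(y) = y - y_old - h*f(y), with g'(y) = 1 - h*f'(y)."""
    y = y_old                      # initial guess: previous value
    for _ in range(max_iter):
        g = y - y_old - h * f(y)
        step = g / (1.0 - h * dfdy(y))
        y -= step
        if abs(step) < tol:        # Newton converged
            break
    return y
```

The payoff, as the abstract notes, is unconditional stability: the implicit step tolerates much larger h than an explicit update, at the cost of this per-step solve.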

Chronopoulos, A.

1997-07-01

390

Xyce parallel electronic simulator design : mathematical formulation, version 2.0.

This document is intended to contain a detailed description of the mathematical formulation of Xyce, a massively parallel SPICE-style circuit simulator developed at Sandia National Laboratories. The target audience of this document is people in the role of 'service provider'. An example of such a person would be a linear solver expert who is spending a small fraction of his time developing solver algorithms for Xyce. Such a person probably is not an expert in circuit simulation and would benefit from a description of the equations solved by Xyce. In this document, modified nodal analysis (MNA) is described in detail, with a number of examples. Issues that are unique to circuit simulation, such as voltage limiting, are also described in detail.
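In MNA, Kirchhoff current-law equations at each non-ground node are augmented with branch-current unknowns for voltage sources. A hand-coded sketch for a single 10 V source driving a two-resistor divider (illustrative only; not Xyce code or its actual matrix assembly):

```python
def solve_linear(a, b):
    """Gaussian elimination with partial pivoting, for small dense systems."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            fct = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= fct * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

# MNA for: 10 V source at node 1; R1 = 1 kOhm between nodes 1 and 2;
# R2 = 1 kOhm from node 2 to ground. Unknowns: [v1, v2, i_src],
# with i_src defined as the current leaving node 1 through the source.
g = 1.0 / 1000.0
A = [[ g,   -g,     1.0],   # KCL at node 1
     [-g,    g + g, 0.0],   # KCL at node 2
     [ 1.0,  0.0,   0.0]]   # source constraint: v1 = 10
b = [0.0, 0.0, 10.0]
v1, v2, i_src = solve_linear(A, b)
```

The divider gives v2 = 5 V, and the source-current unknown comes out as −5 mA under this sign convention, showing why MNA handles voltage sources that pure nodal analysis cannot.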

Hoekstra, Robert John; Waters, Lon J.; Hutchinson, Scott Alan; Keiter, Eric Richard; Russo, Thomas V.

2004-06-01

391

Embedded Microclusters in Zeolites and Cluster Beam Sputtering -- Simulation on Parallel Computers

This report summarizes the research carried out under the DOE-supported program (DOE/ER/45477, Computer Science) during the course of this project. Large-scale molecular-dynamics (MD) simulations were performed to investigate: (1) sintering of microporous and nanophase Si{sub 3}N{sub 4}; (2) crack-front propagation in amorphous silica; (3) phonons, structural correlations, and mechanical behavior, including dynamic fracture, in graphitic tubules; and (4) amorphization and fracture in nanowires. The simulations were carried out using highly efficient multiscale algorithms and dynamic load-balancing schemes for mapping these irregular atomistic simulations onto distributed-memory parallel architectures. These research activities resulted in fifty-three publications and fifty-five invited presentations.

Greenwell, Donald L.; Kalia, Rajiv K.; Vashishta, Priya

1996-12-01

392

FLY. A parallel tree N-body code for cosmological simulations

NASA Astrophysics Data System (ADS)

FLY is a parallel treecode which makes heavy use of the one-sided communication paradigm to handle the management of the tree structure. In its public version the code implements the equations for cosmological evolution, and can be run for different cosmological models. This reference guide describes the actual implementation of the algorithms of the public version of FLY, and suggests how to modify them to implement other types of equations (for instance, the Newtonian ones).
Program summary
Title of program: FLY
Catalogue identifier: ADSC
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADSC
Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland
Computer for which the program is designed and others on which it has been tested: Cray T3E, Sgi Origin 3000, IBM SP
Operating systems under which the program has been tested: Unicos 2.0.5.40, Irix 6.5.14, Aix 4.3.3
Programming language used: Fortran 90, C
Memory required to execute with typical data: about 100 Mwords with 2 million particles
Number of bits in a word: 32
Number of processors used: parallel program; the user can select the number of processors (>=1)
Has the code been vectorized or parallelized?: parallelized
Number of bytes in distributed program, including test data, etc.: 4615604
Distribution format: tar gzip file
Keywords: parallel tree N-body code for cosmological simulations
Nature of physical problem: FLY is a parallel collisionless N-body code for the calculation of the gravitational force.
Method of solution: based on the hierarchical oct-tree domain decomposition introduced by Barnes and Hut (1986).
Restrictions on the complexity of the program: the program uses the leapfrog integrator scheme, but this could be changed by the user.
Typical running time: 50 seconds per time step for a 2-million-particle simulation on an Sgi Origin 3800 system with 8 processors, each with 512 Mbytes of RAM.
Unusual features of the program: FLY uses one-sided communication libraries: the SHMEM library on the Cray T3E and Sgi Origin systems, and the LAPI library on the IBM SP system.
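The summary notes that FLY uses a leapfrog integrator. A minimal kick-drift-kick sketch (in Python rather than FLY's Fortran 90/C, and with a toy fixed central field standing in for the tree-computed gravitational forces):

```python
import numpy as np

def accel(x, GM=1.0):
    """Acceleration of a single particle in a fixed 1/r^2 central field.
    This toy field is an illustrative assumption, not FLY's N-body force."""
    r = np.linalg.norm(x)
    return -GM * x / r**3

def leapfrog(x, v, dt, steps):
    """Kick-drift-kick leapfrog: second order and symplectic."""
    a = accel(x)
    for _ in range(steps):
        v = v + 0.5 * dt * a   # half kick
        x = x + dt * v         # drift
        a = accel(x)
        v = v + 0.5 * dt * a   # half kick
    return x, v

# Circular orbit at r = 1 (|v| = 1 for GM = 1), integrated for one period.
x, v = leapfrog(np.array([1.0, 0.0]), np.array([0.0, 1.0]),
                dt=1e-3, steps=6283)
```

Because leapfrog is symplectic, the orbital radius stays bounded near 1 over the full period instead of drifting as it would with forward Euler.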

Antonuccio-Delogu, V.; Becciani, U.; Ferro, D.

2003-10-01

393

Supporting the Development of Resilient Message Passing Applications using Simulation

An emerging aspect of high-performance computing (HPC) hardware/software co-design is investigating performance under failure. The work in this paper extends the Extreme-scale Simulator (xSim), which was designed for evaluating the performance of message passing interface (MPI) applications on future HPC architectures, with fault-tolerant MPI extensions proposed by the MPI Fault Tolerance Working Group. xSim permits running MPI applications with millions of concurrent MPI ranks, while observing application performance in a simulated extreme-scale system using a lightweight parallel discrete event simulation. The newly added features offer user-level failure mitigation (ULFM) extensions at the simulated MPI layer to support algorithm-based fault tolerance (ABFT). The presented solution permits investigating performance under failure and failure handling of ABFT solutions. The newly enhanced xSim is the very first performance tool that supports ULFM and ABFT.

Naughton, III, Thomas J [ORNL]; Engelmann, Christian [ORNL]; Vallee, Geoffroy R [ORNL]; Boehm, Swen [ORNL]

2014-01-01

394

NASA Astrophysics Data System (ADS)

I present a method for developing extensible and modular computational models without sacrificing serial or parallel performance or source code readability. By using a generic simulation cell method I show that it is possible to combine several distinct computational models to run in the same computational grid without requiring any modification of existing code. This is an advantage for the development and testing of computational modeling software as each submodel can be developed and tested independently and subsequently used without modification in a more complex coupled program. Support for parallel programming is also provided by allowing users to select which simulation variables to transfer between processes via a Message Passing Interface library. This allows the communication strategy of a program to be formalized by explicitly stating which variables must be transferred between processes for the correct functionality of each submodel and the entire program. The generic simulation cell class presented here requires a C++ compiler that supports variadic templates which were standardized in 2011 (C++11). The code is available at: https://github.com/nasailja/gensimcell for everyone to use, study, modify and redistribute; those that do are kindly requested to cite this work.

Honkonen, I.

2014-07-01

395

NASA Astrophysics Data System (ADS)

Flow within the healthy human vascular system is typically laminar, but diseased conditions can alter the geometry sufficiently to produce transitional/turbulent flows in regions at, and immediately downstream of, the diseased section. The mean unsteadiness (pulsatile or respiratory cycle) further complicates the situation, making traditional turbulence simulation techniques (e.g., Reynolds-averaged Navier-Stokes simulations (RANSS)) suspect. At the other extreme, direct numerical simulation (DNS), while fully appropriate, can lead to large computational expense, particularly when the simulations must be done quickly since they are intended to affect the outcome of a medical treatment (e.g., virtual surgical planning). To produce simulations in a clinically relevant time frame requires: 1) an adaptive meshing technique that closely matches the desired local mesh resolution in all three directions to the highly anisotropic physical length scales in the flow, 2) efficient solution algorithms, and 3) excellent scaling on massively parallel computers. In this presentation we will demonstrate results for a subject-specific simulation of an abdominal aortic aneurysm using a stabilized finite element method on anisotropically adapted meshes consisting of O(10^8) elements over O(10^4) processors.

Sahni, Onkar; Jansen, Kenneth; Shephard, Mark; Taylor, Charles

2007-11-01

396

We present three-dimensional hybrid simulations of collisionless shocks that propagate parallel to the background magnetic field to study the acceleration of protons that form a high-energy tail on the distribution. We focus on the initial acceleration of thermal protons and compare it with results from one-dimensional simulations. We find that for both one- and three-dimensional simulations, particles that end up in the high-energy tail of the distribution later in the simulation gained their initial energy right at the shock. This confirms previous results but is the first to demonstrate this using fully three-dimensional fields. The result is not consistent with the 'thermal leakage' model. We also show that the gyrocenters of protons in the three-dimensional simulation can drift away from the magnetic field lines on which they started due to the removal of ignorable coordinates that exist in one- and two-dimensional simulations. Our study clarifies the injection problem for diffusive shock acceleration.

Guo Fan [Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545 (United States); Giacalone, Joe, E-mail: guofan.ustc@gmail.com [Department of Planetary Sciences and Lunar and Planetary Laboratory, University of Arizona, 1629 E. University Blvd., Tucson, AZ 85721 (United States)

2013-08-20

397

A general parallelization strategy for random path based geostatistical simulation methods

NASA Astrophysics Data System (ADS)

The size of simulation grids used for numerical models has increased by many orders of magnitude in the past years, and this trend is likely to continue. Efficient pixel-based geostatistical simulation algorithms have been developed, but for very large grids and complex spatial models, the computational burden remains heavy. As cluster computers become widely available, using parallel strategies is a natural step for increasing the usable grid size and the complexity of the models. These strategies must profit from the possibilities offered by machines with a large number of processors. On such machines, the bottleneck is often the communication time between processors. We present a strategy distributing grid nodes among all available processors while minimizing communication and latency times. It consists of centralizing the simulation on a master processor that calls the other slave processors, as if they were functions, to simulate one node at a time. The key is to decouple the sending and the receiving operations to avoid synchronization. Centralization allows a conflict-management system ensuring that nodes being simulated simultaneously do not interfere in terms of neighborhood. The strategy is computationally efficient and is versatile enough to be applicable to all random path based simulation methods.
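The conflict-management idea, that nodes simulated simultaneously must not interfere in terms of neighborhood, can be sketched as follows. The grid, the fixed-radius neighborhood, and the greedy selection are illustrative assumptions, not the paper's exact scheme:

```python
def neighborhood(node, radius=1):
    """Cells within a Chebyshev radius of a 2-D grid node."""
    x, y = node
    return {(x + dx, y + dy)
            for dx in range(-radius, radius + 1)
            for dy in range(-radius, radius + 1)}

def conflict_free_batch(pending, radius=1):
    """Greedily pick nodes whose neighborhoods are pairwise disjoint,
    so they can be handed to slave processors at the same time."""
    claimed, batch = set(), []
    for node in pending:
        nb = neighborhood(node, radius)
        if claimed.isdisjoint(nb):
            batch.append(node)
            claimed |= nb
    return batch

# Nodes along a random path; (1, 0) conflicts with (0, 0) and is deferred.
path = [(0, 0), (1, 0), (3, 0), (0, 3), (4, 4)]
batch = conflict_free_batch(path)
```

Nodes rejected from a batch simply wait for a later one, so the master never lets two concurrent simulations share conditioning data.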

Mariethoz, Grégoire

2010-07-01

398

Understanding Performance of Parallel Scientific Simulation Codes using Open|SpeedShop

Conclusions of this presentation are: (1) Open|SpeedShop (OSS) is convenient to use for large, parallel, scientific simulation codes; (2) large codes benefit from uninstrumented execution; (3) many experiments can be run in a short time, though multiple runs may be needed (e.g., usertime for caller-callee, hwcsamp for HW counters); (4) a decent idea of a code's performance is easily obtained; (5) statistical sampling calls for a decent number of samples; and (6) HWC data is very useful for micro-analysis but can be tricky to analyze.

Ghosh, K K

2011-11-07

399

Visualization of parallel molecular dynamics simulation on a remote visualization platform

Visualization requires high performance computers. In order to use these shared high performance computers located at national centers, the authors need an environment for remote visualization. Remote visualization is a special process that uses computing resources and data that are physically distributed over long distances. In their experimental environment, a parallel raytracer is designed for the rendering task. It allows one to efficiently visualize molecular dynamics simulations represented by three dimensional ball-and-stick models. Different issues encountered in creating their platform are discussed, such as I/O, load balancing, and data distribution.

Lee, T.Y.; Raghavendra, C.S. [Washington State Univ., Pullman, WA (United States); Nicholas, J.B. [Pacific Northwest Lab., Richland, WA (United States). Molecular Science Research Center

1994-09-01

400

A parallel multigrid preconditioner for the simulation of large fracture networks

Computational modeling of fracture in disordered materials using discrete lattice models requires the solution of a linear system of equations every time a new lattice bond is broken. Solving these linear systems successively is the most expensive part of fracture simulations using large three-dimensional networks. In this paper, we present a parallel multigrid preconditioned conjugate gradient algorithm to solve these linear systems. Numerical experiments demonstrate that this algorithm performs significantly better than the algorithms previously used to solve this problem.
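A minimal preconditioned conjugate gradient sketch; a simple Jacobi (diagonal) preconditioner and a 1-D Laplacian stand in for the paper's parallel multigrid preconditioner and lattice stiffness system:

```python
import numpy as np

def pcg(A, b, M_inv, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradient for SPD A; M_inv applies the
    preconditioner (here Jacobi, as a stand-in for a multigrid V-cycle)."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv(r)
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# 1-D Laplacian (SPD), a toy stand-in for the lattice system.
n = 50
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = pcg(A, b, lambda r: r / np.diag(A))
```

In the fracture setting the payoff comes from reusing such a solver across the many nearly identical systems produced as bonds break one by one.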

Sampath, Rahul S [ORNL]; Barai, Pallab [ORNL]; Nukala, Phani K [ORNL]

2010-01-01

401

pWeb: A High-Performance, Parallel-Computing Framework for Web-Browser-Based Medical Simulation.

This work presents pWeb, a new language and compiler for parallelization of client-side compute-intensive web applications such as surgical simulations. The recently introduced HTML5 standard has enabled creating unprecedented applications on the web. The low performance of the web browser relative to native applications, however, remains the bottleneck for computationally intensive tasks, including visualization of complex scenes, real-time physical simulations and image processing. The proposed language is built upon web workers for multithreaded programming in HTML5. The language provides fundamental functionalities of parallel programming languages as well as the fork/join parallel model, which is not supported by web workers. The language compiler automatically generates an equivalent parallel script that complies with the HTML5 standard. A case study on realistic rendering for surgical simulations demonstrates enhanced performance with a compact set of instructions. PMID:24732497

Halic, Tansel; Ahn, Woojin; De, Suvranu

2014-01-01

402

Our limited understanding of the relationship between the behavior of individual neurons and large neuronal networks is an important limitation in current epilepsy research and may be one of the main causes of our inadequate ability to treat it. Addressing this problem directly via experiments is impossibly complex; thus, we have been developing and studying medium-large-scale simulations of detailed neuronal networks to guide us. Flexibility in the connection schemas and a complete description of the cortical tissue seem necessary for this purpose. In this paper we examine some of the basic issues encountered in these multiscale simulations. We have determined the detailed behavior of two such simulators on parallel computer systems. The observed memory and computation-time scaling behavior for a distributed memory implementation were very good over the range studied, both in terms of network sizes (2,000 to 400,000 neurons) and processor pool sizes (1 to 256 processors). Our simulations required between a few megabytes and about 150 gigabytes of RAM and lasted between a few minutes and about a week, well within the capability of most multinode clusters. Therefore, simulations of epileptic seizures on networks with millions of cells should be feasible on current supercomputers. PMID:24416069

Pesce, Lorenzo L; Lee, Hyong C; Hereld, Mark; Visser, Sid; Stevens, Rick L; Wildeman, Albert; van Drongelen, Wim

2013-01-01

403

Gait simulation via a 6-DOF parallel robot with iterative learning control.

We have developed a robotic gait simulator (RGS) by leveraging a 6-degree-of-freedom parallel robot, with the goal of overcoming three significant challenges of gait simulation: 1) operating at near physiologically correct velocities; 2) inputting full-scale ground reaction forces; and 3) simulating motion in all three planes (sagittal, coronal and transverse). The robot will eventually be employed with cadaveric specimens, but as a means of exploring the capability of the system, we have first used it with a prosthetic foot. Gait data were recorded from one transtibial amputee using a motion analysis system and force plate. Using the same prosthetic foot as the subject, the RGS accurately reproduced the recorded kinematics and kinetics, and the appropriate vertical ground reaction force was realized with a proportional iterative learning controller. After six gait iterations the controller reduced the root mean square (RMS) error between the simulated and in situ vertical ground reaction forces to 35 N during a 1.5 s simulation of the stance phase of gait with a prosthetic foot. This paper addresses the design, methodology and validation of the novel RGS. PMID:18334421
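A proportional iterative learning controller corrects the next iteration's command with a fraction of the previous iteration's tracking error, u_{k+1}(t) = u_k(t) + gamma * e_k(t). A sketch under assumed toy dynamics (the first-order plant, gain, and reference profile are invented; only the update rule reflects the abstract):

```python
import numpy as np

def plant(u, a=0.6):
    """Toy first-order lag standing in for the robot/prosthesis dynamics:
    y[t] = a*y[t-1] + (1-a)*u[t]."""
    y = np.zeros_like(u)
    for t in range(1, len(u)):
        y[t] = a * y[t - 1] + (1 - a) * u[t]
    return y

t = np.linspace(0.0, 1.5, 150)
y_ref = 800 * np.sin(np.pi * t / 1.5)   # desired vertical GRF profile (N)

u = np.zeros_like(y_ref)
rms = []
for k in range(6):                      # six iterations, as in the abstract
    e = y_ref - plant(u)
    rms.append(np.sqrt(np.mean(e**2)))
    u = u + 0.8 * e                     # proportional learning update

final_rms = np.sqrt(np.mean((y_ref - plant(u))**2))
```

Because the same stance-phase trajectory repeats every iteration, the controller can anticipate the plant's lag and drive the RMS tracking error down iteration by iteration.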

Aubin, Patrick M; Cowley, Matthew S; Ledoux, William R

2008-03-01

404

Simulation/Emulation Techniques: Compressing Schedules With Parallel (HW/SW) Development

NASA Technical Reports Server (NTRS)

NASA has always been in the business of balancing new technologies and techniques to achieve human space travel objectives. NASA's Kedalion engineering analysis lab has been validating and using many contemporary avionics HW/SW development and integration techniques, which represent new paradigms to NASA's heritage culture. Kedalion has validated many of the Orion HW/SW engineering techniques borrowed from the adjacent commercial aircraft avionics solution space, inserting new techniques and skills into the Multi-Purpose Crew Vehicle (MPCV) Orion program. Using contemporary agile techniques, commercial-off-the-shelf (COTS) products, early rapid prototyping, in-house expertise and tools, and extensive use of simulators and emulators, NASA has achieved cost-effective paradigms that are currently serving the Orion program effectively. Elements of long lead custom hardware on the Orion program have necessitated early use of simulators and emulators in advance of deliverable hardware to achieve parallel design and development on a compressed schedule.

Mangieri, Mark L.; Hoang, June

2014-01-01

405

Parallel Simulation of HGMS of Weakly Magnetic Nanoparticles in Irrotational Flow of Inviscid Fluid

The process of high gradient magnetic separation (HGMS) using a microferromagnetic wire for capturing weakly magnetic nanoparticles in the irrotational flow of an inviscid fluid is simulated using a parallel algorithm developed with OpenMP. The two-dimensional problem of particle transport under the influences of magnetic force and fluid flow is considered in an annular domain surrounding the wire, with inner radius equal to that of the wire and outer radius equal to various multiples of the wire radius. The differential equations governing particle transport are solved numerically as an initial and boundary value problem using the finite-difference method. The concentration distribution of the particles around the wire is investigated and compared with some previously reported results, showing good agreement between them. The results show the feasibility of accumulating weakly magnetic nanoparticles in specific regions on the wire surface, which is useful for applications in biomedical and environmental work. The speedup of the parallel simulation ranges from 1.8 to 21 depending on the number of threads, the problem domain size, and the number of iterations. Given the nature of the computation and current multicore technology, it is observed that 4-8 threads are sufficient to obtain the optimized speedup. PMID:24955411

Hournkumnuard, Kanok

2014-01-01

406

MDSLB: A new static load balancing method for parallel molecular dynamics simulations

NASA Astrophysics Data System (ADS)

Large-scale parallelization of molecular dynamics simulations is facing challenges which seriously affect the simulation efficiency, among which the load imbalance problem is the most critical. In this paper, we propose a new molecular dynamics static load balancing method (MDSLB). By analyzing the characteristics of the short-range force of molecular dynamics programs running in parallel, we divide the short-range force into three kinds of force models, and then package the computations of each force model into many tiny computational units called “cell loads”, which provide the basic data structures for our load balancing method. In MDSLB, the spatial region is separated into sub-regions called “local domains”, and the cell loads of each local domain are allocated to every processor in turn. Compared with dynamic load balancing methods, MDSLB can guarantee load balance by executing the algorithm only once at program startup, without migrating loads dynamically. We implemented MDSLB in the OpenFOAM software and tested it on the TianHe-1A supercomputer with 16 to 512 processors. Experimental results show that MDSLB can save 34%-64% of execution time for load-imbalanced cases.
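The core idea, dealing the cell loads of each domain out to processors in turn, once, at startup, can be sketched as follows. The grid, the per-cell costs, and the helper names are invented for illustration:

```python
def assign_round_robin(cell_costs, n_procs):
    """Deal cell loads to processors in turn (static, one-shot assignment);
    return the assignment and the per-processor cost totals."""
    totals = [0.0] * n_procs
    assignment = {}
    for i, (cell, cost) in enumerate(sorted(cell_costs.items())):
        p = i % n_procs
        assignment[cell] = p
        totals[p] += cost
    return assignment, totals

# A 4x4 grid of cells with non-uniform costs: one half of the domain
# (x >= 2) is three times as expensive, mimicking a denser region.
cells = {(x, y): (1.0 if x < 2 else 3.0) for x in range(4) for y in range(4)}
assignment, totals = assign_round_robin(cells, n_procs=4)
imbalance = max(totals) / (sum(totals) / len(totals))
```

Dealing cells out in turn interleaves cheap and expensive cells across processors, so even this skewed cost field ends up evenly balanced (imbalance 1.0 here) without any runtime load migration.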

Wu, Yun-Long; Xu, Xin-Hai; Yang, Xue-Jun; Zou, Shun; Ren, Xiao-Guang

2014-02-01

408

We present a method for performing parallel temperature-accelerated dynamics (TAD) simulations over extended length scales. In our method, a two-dimensional spatial decomposition is used along with the recently proposed semirigorous synchronous sublattice algorithm of Shim and Amar [Phys. Rev. B 71, 125432 (2005)]. The scaling behavior of the simulation time as a function of system size is studied and compared

Yunsic Shim; Jacques G. Amar; B. P. Uberuaga; A. F. Voter

2007-01-01

409

NASA Technical Reports Server (NTRS)

In recent efforts by NASA, the Army, and Advanced Rotorcraft Technology, Inc. (ART), the application of parallel processing techniques to real-time simulation have been studied. Traditionally, real-time helicopter simulations have omitted the modeling of high-frequency phenomena in order to achieve real-time operation on affordable computers. Parallel processing technology can now provide the means for significantly improving the fidelity of real-time simulation, and one specific area for improvement is the modeling of rotor dynamics. This paper focuses on the results of a piloted simulation in which a traditional rotor-map mathematical model was compared with a more sophisticated blade-element mathematical model that had been implemented using parallel processing hardware and software technology.

Corliss, Lloyd; Du Val, Ronald W.; Gillman, Herbert, III; Huynh, Loc C.

1990-01-01

410

NASA Astrophysics Data System (ADS)

An algorithm to parallelize Metropolis sampling was devised and applied to Monte Carlo simulations using quantum mechanical calculations (QM/MC simulations). A 4 CPU calculation with the parallelized program needs 670 000 s to finish the QM/MC simulation; thus the present algorithm reduced the computational time to almost half that of a one-CPU calculation. The parallelized QM/MC simulation was applied to Diels-Alder reactions between methyl vinyl ketone and cyclopentadiene. The QM/MC (B3LYP/6-311++G**//MP2/6-31G*, PM3) calculation gave averaged activation energies of 12.7, 13.8 and 15.2 kcal mol -1 in aqueous, methanol and propane solutions, respectively. These results are consistent with experimental observations.
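The underlying Metropolis step that the paper parallelizes is sketched below in serial form; a harmonic toy energy stands in for the quantum mechanical evaluation, and all names are illustrative:

```python
import math
import random

def metropolis(energy, x0, n_steps, step=0.5, kT=1.0, rng=None):
    """Metropolis sampling: propose a random move, accept it with
    probability min(1, exp(-dE/kT)). Returns samples and acceptance rate."""
    rng = rng or random.Random(0)
    x, e = x0, energy(x0)
    samples, accepted = [], 0
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)
        e_new = energy(x_new)   # the expensive QM call in a real QM/MC run
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / kT):
            x, e = x_new, e_new
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps

# Harmonic "energy" 0.5*x^2 with kT = 1 samples a standard normal.
samples, acc_rate = metropolis(lambda x: 0.5 * x * x, x0=3.0, n_steps=20000)
mean = sum(samples) / len(samples)
```

In QM/MC the energy evaluation dominates the cost, which is why parallelizing this loop (for example by evaluating proposals concurrently) pays off so directly.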

Yamaguchi, Toru; Sumimoto, Michinori; Hori, Kenzi

2008-07-01

411

DEVSim++ Toolset for Defense Modeling and Simulation and Interoperation

Discrete Event Systems Specification (DEVS) formalism supports the specification of discrete event models in a hierarchical and modular manner. Efforts have been made to develop the simulation environments for the modeling and simulation (M&S) of systems using DEVS formalism, particularly in defense M&S domains. This paper introduces the DEVSim++ toolset and its applications. The Object-Analysis Index (OAI) matrix is a

Tag Gon Kim; Chang Ho Sung; Su-Youn Hong; Jeong Hee Hong; Chang Beom Choi; Jeong Hoon Kim; Kyung Min Seo; Jang Won Bae

2011-01-01

412

There has been great concern about the origin of the parallel electric field in the frame of fluid equations in the auroral acceleration region. This paper proposes a new method to simulate magnetohydrodynamic (MHD) equations that include the electron convection term and shows its efficiency with simulation results in one dimension. We apply a third-order semi-discrete central scheme to investigate the characteristics of the electron convection term including its nonlinearity. At a steady state discontinuity, the sum of the ion and electron convection terms balances with the ion pressure gradient. We find that the electron convection term works like the gradient of the negative pressure and reduces the ion sound speed or amplifies the sound mode when parallel current flows. The electron convection term enables us to describe a situation in which a parallel electric field and parallel electron acceleration coexist, which is impossible for ideal or resistive MHD.

Matsuda, K.; Terada, N.; Katoh, Y. [Space and Terrestrial Plasma Physics Laboratory, Department of Geophysics, Graduate School of Science, Tohoku University, Sendai, Miyagi 980-8578 (Japan); Misawa, H. [Planetary Plasma and Atmospheric Research Center, Graduate School of Science, Tohoku University, Sendai, Miyagi 980-8578 (Japan)

2011-08-15

413

Parallel Adaptive Simulation of Weak and Strong Transverse-Wave Structures in H2-O2 Detonations

Two- and three-dimensional simulation results are presented that investigate at great detail the temporal evolution of Mach reflection sub-structure patterns intrinsic to gaseous detonation waves. High local resolution is achieved by utilizing a distributed memory parallel shock-capturing finite volume code that employs block-structured dynamic mesh adaptation. The computational approach, the implemented parallelization strategy, and the software design are discussed.

Deiterding, Ralf [ORNL]

2010-01-01

414

NASA Astrophysics Data System (ADS)

Molecular dynamics (MD) simulations of RDX is carried out using the ReaxFF force field supplied with the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS). Validation of ReaxFF to model RDX is carried out by extracting the (i) crystal unit cell parameters, (ii) bulk modulus and (iii) thermal expansion coefficient and comparing with reported values from both experiments and simulations.

Warrier, M.; Pahari, P.; Chaturvedi, S.

2010-12-01

416

A parallelized particle tracing code for massive 3D mantle flow simulations

NASA Astrophysics Data System (ADS)

The problem of convective flows in a highly viscous fluid represents a common research direction in Earth Sciences. For tracing the convective motion of the fluid material, a set of passive particles (or tracers) that flow at the local convection velocity and do not affect the pattern of flow is commonly used. Here we present a parallelized tracer code that uses passive and weightless particles, with their positions computed from their displacement during a small time interval at the velocity of flow previously calculated for a given point in space and time. The tracer code is integrated in the open source package CitcomS, which is widely used in the solid earth community (www.geodynamics.org). We benchmarked the tracer code on the state-of-the-art CyberDyn parallel machine, a High Performance Computing (HPC) cluster with 1344 computing cores available at the Institute of Geodynamics of the Romanian Academy. The benchmark tests are performed using a series of 3D geodynamic settings in which we introduced various clusters of tracers at different places in the models. Using several million particles, the benchmark results show that the parallelized tracer code performs well, with an optimum number of computing cores between 32 and 64. Because of the large amount of communication among the computing cores, high-resolution CFD simulations for geodynamic predictions that require tens of millions, or even billions, of tracers to accurately track mantle flow will greatly benefit from HPC systems based on low-latency high-speed interconnects. In this paper we present several case studies regarding 3D mantle flow as revealed by tracers in active subduction zones, such as the subduction of the Rivera and Cocos plates beneath the North America plate, as well as the subduction of the Nazca plate beneath the South America plate.
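The tracer update itself, displacing each passive particle over a small time interval at the locally evaluated flow velocity, can be sketched as follows. The solid-body rotation field is an illustrative stand-in for the mantle-flow velocities that CitcomS would supply:

```python
import math

def velocity(x, y, omega=1.0):
    """Toy velocity field: solid-body rotation about the origin."""
    return -omega * y, omega * x

def advect(x, y, dt, steps):
    """Passive-tracer update: x += v(x) * dt each small time interval."""
    for _ in range(steps):
        vx, vy = velocity(x, y)
        x, y = x + vx * dt, y + vy * dt
    return x, y

# One full revolution of a tracer starting at radius 1.
x, y = advect(1.0, 0.0, dt=1e-4, steps=62832)
radius = math.hypot(x, y)
```

With a small enough time step the tracer stays close to its true streamline (here, a circle of radius 1); in production codes a higher-order scheme would be used to allow larger steps.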

Manea, V.; Manea, M.; Pomeran, M.; Besutiu, L.; Zlagnean, L.

2013-05-01

417

NASA Astrophysics Data System (ADS)

Monte Carlo particle transport methods are being considered as a viable option for high-fidelity simulation of nuclear reactors. While Monte Carlo methods offer several potential advantages over deterministic methods, there are a number of algorithmic shortcomings that would prevent their immediate adoption for full-core analyses. In this thesis, algorithms are proposed both to ameliorate the degradation in parallel efficiency typically observed for large numbers of processors and to offer a means of decomposing large tally data that will be needed for reactor analysis. A nearest-neighbor fission bank algorithm was proposed and subsequently implemented in the OpenMC Monte Carlo code. A theoretical analysis of the communication pattern shows that the expected cost is O(√N) whereas traditional fission bank algorithms are O(N) at best. The algorithm was tested on two supercomputers, the Intrepid Blue Gene/P and the Titan Cray XK7, and demonstrated nearly linear parallel scaling up to 163,840 processor cores on a full-core benchmark problem. An algorithm for reducing network communication arising from tally reduction was analyzed and implemented in OpenMC. The proposed algorithm groups only particle histories on a single processor into batches for tally purposes; in doing so it prevents all network communication for tallies until the very end of the simulation. The algorithm was tested, again on a full-core benchmark, and shown to reduce network communication substantially. A model was developed to predict the impact of load imbalances on the performance of domain decomposed simulations. The analysis demonstrated that load imbalances in domain decomposed simulations arise from two distinct phenomena: non-uniform particle densities and non-uniform spatial leakage. The dominant performance penalty for domain decomposition was shown to come from these physical effects rather than insufficient network bandwidth or high latency.
The model predictions were verified with measured data from simulations in OpenMC on a full-core benchmark problem. Finally, a novel algorithm for decomposing large tally data was proposed, analyzed, and implemented/tested in OpenMC. The algorithm relies on disjoint sets of compute processes and tally servers. The analysis showed that for a range of parameters relevant to LWR analysis, the tally server algorithm should perform with minimal overhead. Tests were performed on Intrepid and Titan and demonstrated that the algorithm did indeed perform well over a wide range of parameters. (Copies available exclusively from MIT Libraries, libraries.mit.edu/docs - docs@mit.edu)

Romano, Paul Kollath

418

NASA Astrophysics Data System (ADS)

An implicit-particle simulation of the collisionless parallel shock created at the interface between an injected beam and a stationary plasma is performed in one-dimensional geometry. The solar wind plasma, which consists of ions and electrons, is injected into a stationary dense plasma that corresponds to the planetary ionosphere. Electromagnetic waves with right-hand circular polarization that propagate upstream (R- waves) are generated at the interface of the two plasmas, which decelerate the solar wind to form a shock. The shock transition region is not monotonic but consists of two distinct regions, a pedestal and a shock ramp. The transition region, which contains the ionopause, is a few thousand electron skin depths long. The parallel shock varies in time and periodically collapses and re-forms. The right-hand circularly polarized electromagnetic waves that propagate downstream (R+ waves) are excited at the shock ramp. Nonlinear wave-particle interaction between the solar wind and the R+ waves causes wave condensation and density modulation. These R+ waves may be sweeping away the downstream plasma to suppress its thermal diffusion across the shock. The electrons at the shock ramp exhibit a flat-topped velocity distribution along the magnetic field owing to the ion acoustic-like electrostatic waves.

Shimazu, H.; Machida, S.; Tanaka, M.

1996-04-01

419

Asymptotic dispersion in 2D heterogeneous porous media determined by parallel numerical simulations

NASA Astrophysics Data System (ADS)

We determine the asymptotic dispersion coefficients in 2D exponentially correlated lognormally distributed permeability fields by using parallel computing. Fluid flow is computed by solving the flow equation discretized on a regular grid, and transport by advection and diffusion is simulated with a particle tracker. To obtain a well-defined asymptotic regime under ergodic conditions (initial plume size much larger than the correlation length of the permeability field), the characteristic dimension of the simulated computational domains was of the order of 10^3 correlation lengths, with a resolution of ten cells per correlation length. We determine numerically the asymptotic effective longitudinal and transverse dispersion coefficients over 100 simulations for a broad range of heterogeneities σ² ∈ [0, 9], where σ² is the lognormal permeability variance. For purely advective transport, the asymptotic longitudinal dispersion coefficient depends linearly on σ² for σ² < 1 and quadratically on σ² for σ² > 1, and the asymptotic transverse dispersion coefficient is zero. Addition of homogeneous isotropic diffusion induces an increase of transverse dispersion and a decrease of longitudinal dispersion.
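As an illustration of the particle-tracking step described above, the sketch below advects a cloud of particles in a homogeneous 1-D velocity field with molecular diffusion and recovers a dispersion coefficient from the growth of plume variance. All names and parameter values are invented for illustration; the heterogeneous permeability fields of the study are not modeled.

```python
import numpy as np

rng = np.random.default_rng(0)

def track_particles(n_particles, n_steps, dt, u, d_m):
    """Random-walk particle tracker: each step advects at velocity u and
    adds a Gaussian diffusive displacement with molecular coefficient d_m."""
    x = np.zeros(n_particles)
    for _ in range(n_steps):
        x += u * dt + np.sqrt(2.0 * d_m * dt) * rng.standard_normal(n_particles)
    return x

# Apparent longitudinal dispersion from the growth of plume variance:
# D_L ~ Var(x) / (2 t) for an instantaneous point injection.
t = 200 * 0.1
x = track_particles(20000, 200, 0.1, u=1.0, d_m=0.01)
d_est = np.var(x) / (2.0 * t)
```

For a homogeneous field the recovered coefficient reduces to the molecular one; in the heterogeneous fields of the study, the same variance-growth estimator yields the macrodispersion coefficients discussed in the abstract.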

de Dreuzy, Jean-Raynald; Beaudoin, Anthony; Erhel, Jocelyne

2007-10-01

420

Experiences with serial and parallel algorithms for channel routing using simulated annealing

NASA Technical Reports Server (NTRS)

Two algorithms for channel routing using simulated annealing are presented. Simulated annealing is an optimization methodology which allows the solution process to back out of local minima that may be entered through inappropriate selections. By properly controlling the annealing process, it is very likely that the optimal solution to an NP-complete problem such as channel routing may be found. The algorithm presented imposes very relaxed restrictions on the types of allowable transformations, including overlapping nets. By relaxing that restriction and controlling overlap situations with an appropriate cost function, the algorithm becomes very flexible and can be applied to many extensions of channel routing. The selection of transformations utilizes a number of heuristics, still retaining the pseudorandom nature of simulated annealing. The algorithm was implemented as a serial program for a workstation and as a parallel program designed for a hypercube computer. The details of the serial implementation are presented, including many of the heuristics used and some of the resulting solutions.
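The back-out-of-local-minima acceptance rule described above can be sketched generically. The toy cost function below merely stands in for a routing cost and is purely illustrative, not the authors' formulation; temperatures and step sizes are invented.

```python
import math
import random

def anneal(cost, neighbor, state, t0=5.0, cooling=0.999, steps=20000):
    """Generic simulated annealing: uphill moves are accepted with
    probability exp(-delta/T), which lets the search back out of
    local minima; T decays geometrically."""
    random.seed(1)
    t = t0
    cur = state
    cur_c = cost(cur)
    best, best_c = cur, cur_c
    for _ in range(steps):
        cand = neighbor(cur)
        cand_c = cost(cand)
        delta = cand_c - cur_c
        if delta <= 0 or random.random() < math.exp(-delta / t):
            cur, cur_c = cand, cand_c
            if cur_c < best_c:
                best, best_c = cur, cur_c
        t *= cooling
    return best, best_c

# Toy stand-in for a routing cost: a 1-D function with many local minima.
f = lambda x: (x - 3.0) ** 2 + 0.5 * math.sin(8.0 * x)
best, best_c = anneal(f, lambda x: x + random.uniform(-0.5, 0.5), 0.0)
```

In a real channel router, `state` would be a placement of nets, `neighbor` one of the relaxed transformations (possibly creating overlaps), and `cost` would penalize overlap as the abstract describes.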

Brouwer, Randall Jay

1988-01-01

421

Mechanisms for the convergence of time-parallelized, parareal turbulent plasma simulations

Parareal is a recent algorithm able to parallelize the time dimension in spite of its sequential nature. It has been applied to several linear and nonlinear problems and, very recently, to a simulation of fully-developed, two-dimensional drift wave turbulence. The mere fact that parareal works in such a turbulent regime is in itself somewhat unexpected, due to the characteristic sensitivity of turbulence to any change in initial conditions. This fundamental property of any turbulent system should render the iterative correction procedure characteristic of the parareal method inoperative, but this seems not to be the case. In addition, the choices that must be made to implement parareal (division of the temporal domain, selection of the coarse solver and so on) are currently made using trial-and-error approaches. Here, we identify the mechanisms responsible for the convergence of parareal in these simulations of drift wave turbulence. We also investigate which conditions these mechanisms impose on any successful parareal implementation. The results reported here should be useful to guide future implementations of parareal within the much wider context of fully-developed fluid and plasma turbulent simulations.
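The parareal correction iteration mentioned above can be illustrated on the linear test problem dy/dt = -y, a hedged sketch far simpler than drift wave turbulence. The fine solves inside each iteration are the part that would run in parallel, one time slice per processor; all names and parameters are invented.

```python
import numpy as np

def coarse(y, dt):            # cheap propagator G: one forward-Euler step
    return y * (1.0 - dt)

def fine(y, dt, m=100):       # expensive propagator F: m Euler substeps
    h = dt / m
    for _ in range(m):
        y = y * (1.0 - h)
    return y

def parareal(y0, T, n_slices, n_iters):
    """Parareal update U[n+1] = G(U[n]) + F_old[n] - G_old[n]:
    a serial coarse sweep corrected by fine-propagator residuals."""
    dt = T / n_slices
    U = np.empty(n_slices + 1)
    U[0] = y0
    for n in range(n_slices):                 # initial coarse sweep
        U[n + 1] = coarse(U[n], dt)
    for _ in range(n_iters):
        F_old = np.array([fine(U[n], dt) for n in range(n_slices)])   # parallelizable
        G_old = np.array([coarse(U[n], dt) for n in range(n_slices)])
        for n in range(n_slices):             # sequential correction sweep
            U[n + 1] = coarse(U[n], dt) + F_old[n] - G_old[n]
    return U

U = parareal(1.0, 1.0, n_slices=10, n_iters=5)
y_fine = 1.0
for _ in range(10):                           # serial fine reference
    y_fine = fine(y_fine, 0.1)
```

For this contractive problem the iteration converges to the serial fine solution after a handful of corrections; the open question the paper addresses is why a similar mechanism still operates in a chaotic, turbulent regime.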

Reynolds-Barredo, J. [University of Alaska; University Carlos III de Madrid]; Newman, David E. [University of Alaska]; Sanchez, R. [Universidad Carlos III, Madrid, Spain]; Samaddar, D. [ITER Organization, Saint Paul Lez Durance, France]; Berry, Lee A. [ORNL]; Elwasif, Wael R. [ORNL]

2012-01-01

422

Parallel 3D-TLM algorithm for simulation of the Earth-ionosphere cavity

NASA Astrophysics Data System (ADS)

A parallel 3D algorithm for solving time-domain electromagnetic problems with arbitrary geometries is presented. The technique employed is the Transmission Line Modeling (TLM) method implemented in Shared Memory (SM) environments. The benchmarking performed reveals that the maximum speedup depends on the memory size of the problem as well as on multiple hardware factors, such as the disposition of CPUs, cache, or memory. A maximum speedup of 15 has been measured for the largest problem. In certain circumstances of low memory requirements, superlinear speedup is achieved with our algorithm. The method is then applied to the Earth-ionosphere cavity, enabling a study of the natural electromagnetic phenomena that occur in it. The algorithm allows complete 3D simulations of the cavity with a resolution of 10 km within a reasonable timescale.

Toledo-Redondo, Sergio; Salinas, Alfonso; Morente-Molinera, Juan Antonio; Méndez, Antonio; Fornieles, Jesús; Portí, Jorge; Morente, Juan Antonio

2013-03-01

423

NASA Astrophysics Data System (ADS)

The work is devoted to 3D and 2D parallel numerical computation of the pressure and velocity fields around an elastically supported airfoil self-oscillating due to interaction with the airflow. The numerical solution is computed in the OpenFOAM package, an open-source software package based on the finite volume method. The movement of the airfoil is described by translation and rotation, identified from experimental data. A new boundary condition for the 2DOF motion of the airfoil was implemented. The results of the numerical simulations (velocity) are compared with data measured in a wind tunnel, where a physical model of a NACA0015 airfoil was mounted and tuned to exhibit the flutter instability. The experimental results were obtained previously at the Institute of Thermomechanics by interferographic measurements in a subsonic wind tunnel in Nový Knín.

Řidký, Václav; Šidlof, Petr

2014-03-01

424

A parallel direct numerical simulation of dust particles in a turbulent flow

NASA Astrophysics Data System (ADS)

Due to their effects on radiation transport, aerosols play an important role in the global climate. Mineral dust aerosol is a predominant natural aerosol in the desert and semi-desert regions of the Middle East and North Africa (MENA). The Arabian Peninsula is one of the three predominant source regions on the planet "exporting" dust to almost the entire world. Mineral dust aerosols make up about 50% of the tropospheric aerosol mass and therefore produce a significant impact on the Earth's climate and the atmospheric environment, especially in the MENA region, which is characterized by frequent dust storms and large aerosol generation. Understanding the mechanisms of dust emission, transport, and deposition is therefore essential for correctly representing dust in numerical climate prediction. In this study we present results of numerical simulations of dust particles in a turbulent flow to study the interaction between dust and the atmosphere. Homogeneous and passive dust particles in the boundary layer are entrained and advected under the influence of a turbulent flow. Currently no interactions between particles are included. Turbulence is resolved through direct numerical simulation using a parallel incompressible Navier-Stokes flow solver. Model output provides information on particle trajectories, turbulent transport of dust, and the effects of gravity on dust motion, which will be used for comparison with the wind tunnel experiments at the University of Texas at Austin. Results of testing of parallel efficiency and scalability are provided. Future versions of the model will include air-particle momentum exchange, varying particle sizes, and the saltation effect. The results will be used for interpreting wind tunnel and field experiments and for improving dust generation parameterizations in meteorological models.

Nguyen, H. V.; Yokota, R.; Stenchikov, G.; Kocurek, G.

2012-04-01

425

NASA Astrophysics Data System (ADS)

Oxidation of a flat aluminum (111) surface and the reactive wetting of an aluminum (Al) droplet on a flat alumina (alpha-Al2O3) surface are investigated by using parallel molecular-dynamics simulations with dynamic charge transfer among atoms on a microscopic length scale. The interatomic potential, based on the formalism of Streitz and Mintmire, allows atoms to vary their charges dynamically between anions and cations as atoms move and their local environment is altered. We investigate the oxidation thickness as a function of time for oxygen densities 10--40 times that of the normal state (1 atm and 300 K). Stable amorphous oxide scales of about 51 Å form at 4.42 ns, 2.862 ns, and 2.524 ns, respectively, for the different molecular oxygen densities considered (10--40 times the normal state). We also study structural correlations in the resulting final oxide scale. The structure of the final oxide scale depends on depth, where the densities of aluminum (Al) and oxygen (O) atoms change. Reactive wetting of an aluminum nanodroplet on an alumina surface is also studied using parallel MD. We study heat transfer, diffusion within the droplet, and the structure of the inter-metallic phases at the liquid-solid interface. Oxygen (O) atoms diffuse into the spherical aluminum (Al) droplet and form an interface between the flat solid substrate and the Al droplet. This diffusion of oxygen atoms may be the main source of adhesion between the Al drop and the flat alpha-Al2O3 substrate. The temperature in the flat alpha-Al2O3 bulk substrate rises from 0 K to 200 K by the end of the simulation, 8.5 ps, but the temperature becomes much higher at the reactive interface. We have examined which oxygen atoms from the substrate participate in the wetting and the formation of a solder joint at the Al/alpha-Al2O3 interface.

Aral, Gurcan

426

Implementation of a parallel algorithm for thermo-chemical nonequilibrium flow simulations

Massively parallel (MP) computing is considered to be the future direction of high performance computing. When engineers apply this new MP computing technology to solve large-scale problems, a major question is the maximum problem size that an MP computer can handle. To determine the maximum size, it is important to address code scalability. Scalability refers to whether the code can provide an increase in performance proportional to an increase in problem size: if the problem size increases and more computer nodes are utilized, the ideal elapsed time to simulate the problem should not increase much. Hence one important task in the development of MP computing technology is to ensure scalability. A scalable code is an efficient code. In order to obtain good scaled performance, it is necessary to first optimize the code for single-node performance before proceeding to a large-scale simulation with a large number of computer nodes. This paper discusses the implementation of a massively parallel computing strategy and the process of optimization to improve scaled performance. Specifically, we look at domain decomposition, resource management in the code, communication overhead, and problem mapping. By incorporating these improvements and adopting an efficient MP computing strategy, efficiencies of about 85% and 96%, respectively, have been achieved using 64 nodes on MP computers for perfect gas and chemically reactive gas problems. A comparison of the performance between MP computers and a vectorized computer, such as the Cray Y-MP, is also presented.
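The scaled-performance quantities quoted above (85% and 96% efficiency on 64 nodes) follow from the standard definitions of speedup and efficiency. The helper below, with invented names, also includes Amdahl's law as the usual upper bound when a serial fraction remains:

```python
def parallel_metrics(t_serial, t_parallel, n_procs):
    """Speedup S = T1/Tp and parallel efficiency E = S/p."""
    speedup = t_serial / t_parallel
    efficiency = speedup / n_procs
    return speedup, efficiency

def amdahl_speedup(serial_fraction, n_procs):
    """Amdahl's law: upper bound on speedup when a fraction f of the
    work cannot be parallelized: S = 1 / (f + (1 - f)/p)."""
    f = serial_fraction
    return 1.0 / (f + (1.0 - f) / n_procs)
```

For example, an 85% efficiency on 64 nodes corresponds to a speedup of about 54, which by Amdahl's law already requires the serial fraction (including communication overhead) to be well under one percent.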

Wong, C.C.; Blottner, F.G.; Payne, J.L. [Sandia National Labs., Albuquerque, NM (United States); Soetrisno, M. [Amtec Engineering, Inc., Bellevue, WA (United States)

1995-01-01

427

A study of Gd-based parallel plate avalanche counter for thermal neutrons by MC simulation

NASA Astrophysics Data System (ADS)

In this work, we demonstrate the feasibility and characteristics of a single-gap parallel plate avalanche counter (PPAC) as a low energy neutron detector based on a Gd-converter coating. Upon falling on the Gd-converter surface, the incident low energy neutrons produce internal conversion electrons, which are then detected. To estimate the performance of the Gd-based PPAC, a simulation study has been performed using the GEANT4 Monte Carlo (MC) code. The detector response as a function of incident neutron energy in the range of 25-100 meV has been evaluated with two different physics lists. Using the QGSP_BIC_HP physics list and assuming a 5 μm converter thickness, detection efficiencies of 11.8%, 18.48%, and 30.28% were achieved for the forward, backward, and total response of the converter-based PPAC, respectively. With the same converter thickness and detector configuration, the QGSP_BERT_HP physics list yielded efficiencies of 12.19%, 18.62%, and 30.81%, respectively. These simulation results are briefly discussed.

Rhee, J. T.; Kim, H. G.; Ahmad, Farzana; Jeon, Y. J.; Jamil, M.

2013-12-01

428

Significant problems facing all experimental and computational sciences arise from growing data size and complexity. Common to all these problems is the need to perform efficient data I/O on diverse computer architectures. In our scientific application, the largest parallel particle simulations generate vast quantities of six-dimensional data. Such a simulation run produces data for an aggregate data size up to several TB per run. Motivated by the need to address data I/O and access challenges, we have implemented H5Part, an open source data I/O API that simplifies the use of the Hierarchical Data Format v5 library (HDF5). HDF5 is an industry standard for high performance, cross-platform data storage and retrieval that runs on all contemporary architectures, from large parallel supercomputers to laptops. H5Part, which is oriented to the needs of the particle physics and cosmology communities, provides support for parallel storage and retrieval of particles, structured and, in the future, unstructured meshes. In this paper, we describe recent work focusing on I/O support for particles and structured meshes and provide data showing performance on modern supercomputer architectures like the IBM POWER 5.
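One recurring detail of parallel particle I/O of this kind is that each rank must know its write offset into the shared dataset before a collective write. A minimal sketch of that bookkeeping (an exclusive prefix sum over per-rank particle counts, not H5Part's actual API) is:

```python
import numpy as np

def slab_offsets(counts):
    """Given the number of particles held by each rank, return the
    exclusive prefix sum (each rank's starting offset into a shared
    1-D dataset) and the total dataset length."""
    counts = np.asarray(counts)
    offsets = np.concatenate(([0], np.cumsum(counts)[:-1]))
    return offsets, int(counts.sum())

# Hypothetical per-rank particle counts for a 3-rank job.
offsets, total = slab_offsets([3, 5, 2])
```

Rank r then writes its `counts[r]` particles into the slab `[offsets[r], offsets[r] + counts[r])`, so all ranks can write concurrently without overlap.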

Adelmann, Andreas; Gsell, Achim; Oswald, Benedikt; Schietinger, Thomas; Bethel, Wes; Shalf, John; Siegerist, Cristina; Stockinger, Kurt

2007-06-22

429

Simulated Wake Characteristics Data for Closely Spaced Parallel Runway Operations Analysis

NASA Technical Reports Server (NTRS)

A simulation experiment was performed to generate and compile wake characteristics data relevant to the evaluation and feasibility analysis of closely spaced parallel runway (CSPR) operational concepts. While the experiment in this work is not tailored to any particular operational concept, the generated data applies to the broader class of CSPR concepts, where a trailing aircraft on a CSPR approach is required to stay ahead of the wake vortices generated by a lead aircraft on an adjacent CSPR. Data for wake age, circulation strength, and wake altitude change, at various lateral offset distances from the wake-generating lead aircraft approach path were compiled for a set of nine aircraft spanning the full range of FAA and ICAO wake classifications. A total of 54 scenarios were simulated to generate data related to key parameters that determine wake behavior. Of particular interest are wake age characteristics that can be used to evaluate both time- and distance- based in-trail separation concepts for all aircraft wake-class combinations. A simple first-order difference model was developed to enable the computation of wake parameter estimates for aircraft models having weight, wingspan and speed characteristics similar to those of the nine aircraft modeled in this work.

Guerreiro, Nelson M.; Neitzke, Kurt W.

2012-01-01

430

This paper presents the use of a dynamically adaptive mesh refinement strategy for simulations of shock-driven turbulent mixing. Large-eddy simulations are necessary due to the high Reynolds number turbulent regime. In this approach, the large scales are simulated directly and the small scales, at which viscous dissipation occurs, are modeled. A low-numerical-dissipation centered finite-difference scheme is used in turbulent flow regions while a shock-capturing method is employed to capture shocks. Three-dimensional parallel simulations of the Richtmyer-Meshkov instability performed in plane and converging geometries are described.

Lombardini, Manuel [California Institute of Technology, Pasadena]; Deiterding, Ralf [ORNL]

2010-01-01

431

Numerical field simulation for parallel transmission in MRI at 7 tesla

Parallel transmission (pTx) is a promising improvement to coil design that has been demonstrated to mitigate B1+ inhomogeneity, manifest as center brightening, for high-field magnetic resonance imaging (MRI). Parallel ...

Bernier, Jessica A. (Jessica Ashley)

2011-01-01

432

Staged Simulation: A General Technique for Improving Simulation Scale and Performance

Kevin Walsh and Emin Gün Sirer, Cornell University. This article describes staged simulation, a technique for improving the run-time performance and scale of discrete event simulators. Typical network simulations are limited

Hybinette, Maria

433

An EMI Simulator Based on the Parallel-Distributed FDTD Method for Large-Scale Printed Wiring Boards

This paper describes a full-wave EMI (electromagnetic interference) simulator for the new methodology of printed wiring board (PWB) design with consideration of electromagnetic compatibility (EMC) and signal integrity (SI). This simulator is based on the parallel-distributed finite-difference time-domain (FDTD) method, and works on a PC cluster. Using our simulator, the full-wave analysis of the large-scale
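The FDTD method underlying the simulator can be sketched in one dimension. The toy below uses a vacuum Yee leapfrog update with a hard sinusoidal source and invented parameters; the paper's solver is a full 3-D parallel-distributed implementation, where the grid (and these same updates) would be decomposed across cluster nodes.

```python
import numpy as np

def fdtd_1d(n_cells=200, n_steps=300, src=100):
    """Minimal 1-D vacuum FDTD: leapfrog updates of Ez and Hy on a
    staggered (Yee) grid with Courant number 0.5 and a hard source."""
    ez = np.zeros(n_cells)
    hy = np.zeros(n_cells)
    for t in range(n_steps):
        hy[:-1] += 0.5 * (ez[1:] - ez[:-1])      # H update (half step)
        ez[1:] += 0.5 * (hy[1:] - hy[:-1])       # E update (half step)
        ez[src] = np.sin(2.0 * np.pi * t / 40.0)  # hard sinusoidal source
    return ez

ez = fdtd_1d()
```

In a parallel-distributed version, each node owns a contiguous block of cells and exchanges one layer of boundary field values with its neighbors per time step, which is the communication pattern whose overhead the paper's PC-cluster implementation must amortize.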

Hideki ASAI; Takayuki WATANABE; Tohru SASAKI; Kenji ARAKI

2002-01-01

434

Automated integration of genomic physical mapping data via parallel simulated annealing

The Human Genome Center at the Lawrence Livermore National Laboratory (LLNL) is nearing closure on a high-resolution physical map of human chromosome 19. We have built automated tools to assemble 15,000 fingerprinted cosmid clones into 800 contigs with minimal spanning paths identified. These islands are being ordered, oriented, and spanned by a variety of other techniques including: Fluorescence In Situ Hybridization (FISH) at 3 levels of resolution, EcoRI restriction fragment mapping across all contigs, and a multitude of different hybridization and PCR techniques to link cosmid, YAC, BAC, PAC, and P1 clones. The FISH data provide us with partial order and distance data as well as orientation. We made the observation that map builders need a much rougher presentation of data than do map readers; the former wish to see raw data, since these can expose errors or interesting biology. We further noted that by ignoring our length and distance data we could simplify our problem into one that could be readily attacked with optimization techniques. The data integration problem could then be seen as an M x N ordering of our N cosmid clones which "intersect" M larger objects, by defining "intersection" to mean either contig/map membership or hybridization results. Clearly, the goal of making an integrated map is now to rearrange the N cosmid clone "columns" such that the number of gaps on the object "rows" is minimized. Our FISH partially-ordered cosmid clones provide us with a set of constraints that cannot be violated by the rearrangement process. We solved the optimization problem via simulated annealing performed on a network of 40+ Unix machines in parallel, using a server/client model built on explicit socket calls. For current maps we can create a map in about 4 hours on the parallel net versus 4+ days on a single workstation. Our biologists are now using this software on a daily basis to guide their efforts toward final closure.
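The objective described above, rearranging clone columns so that the gaps on the object rows are minimized, can be sketched as a cost function suitable for annealing. The representation (each row as a set of member column ids) is an assumption for illustration, not the LLNL data model.

```python
def gap_cost(membership, order):
    """Sum over all rows of the number of extra runs ('gaps') that a
    given column ordering induces: a row whose member columns sit in
    k separate contiguous runs contributes k - 1 to the cost, so a
    perfect ordering (every row contiguous) has cost 0."""
    cost = 0
    for row in membership:
        runs, prev = 0, False
        for col in order:
            cur = col in row
            if cur and not prev:   # a new run of members begins
                runs += 1
            prev = cur
        if runs > 1:
            cost += runs - 1
    return cost

# Two hypothetical objects over four clone columns.
rows = [{0, 1, 2}, {2, 3}]
```

An annealer would then propose column swaps, accept them by the usual Metropolis rule on this cost, and reject any move violating the FISH partial-order constraints.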

Slezak, T.

1994-06-01

435

Numerical investigation of parallel airfoil-vortex interaction using large eddy simulation

NASA Astrophysics Data System (ADS)

Helicopter Blade-Vortex Interaction (BVI) occurs under certain conditions of powered descent or during extreme maneuvering. The vibration and acoustic problems associated with the interaction of rotor tip vortices and the following blades are major aerodynamic concerns for the helicopter community. Researchers have performed numerous experimental and computational studies over the last two decades in order to gain a better understanding of the physical mechanisms involved in BVI. The most severe interaction, in terms of generated noise, happens when the vortex filament is parallel to the blade, thus affecting a great portion of it. The majority of the previous numerical studies of parallel BVI fall within a potential flow framework, therefore excluding all viscous phenomena. Some Navier-Stokes approaches using dissipative numerical methods in conjunction with RANS-type turbulence models have also been attempted, but with limited success. In this work, the situation is improved by increasing the fidelity of both the numerical method and the turbulence model. A kinetic-energy conserving finite-volume scheme using a collocated-mesh arrangement, specially designed for simulation of turbulence in complex geometries, was implemented. For the turbulence model, a cost-effective zonal hybrid RANS/LES technique is used. A RANS zone covers the boundary layers on the airfoil and the wake region behind, while the remainder of the flow field, including the region occupied by the vortex, makes up the dynamic LES zone. The concentrated tip vortex is not attenuated as it is convected downstream and over a NACA 0012 airfoil. The lift, drag, moment and friction coefficients induced by the passage of the vortex are monitored in time and compared with experimental data.

Felten, Frederic N.

436

Direct numerical simulation of instabilities in parallel flow with spherical roughness elements

NASA Technical Reports Server (NTRS)

Results from a direct numerical simulation of laminar flow over a flat surface with spherical roughness elements using a spectral-element method are given. The numerical simulation approximates roughness as a cellular pattern of identical spheres protruding from a smooth wall. Periodic boundary conditions on the domain's horizontal faces simulate an infinite array of roughness elements extending in the streamwise and spanwise directions, which implies the parallel-flow assumption, and results in a closed domain. A body force, designed to yield the horizontal Blasius velocity in the absence of roughness, sustains the flow. Instabilities above a critical Reynolds number reveal negligible oscillations in the recirculation regions behind each sphere and in the free stream, high-amplitude oscillations in the layer directly above the spheres, and a mean profile with an inflection point near the sphere's crest. The inflection point yields an unstable layer above the roughness (where U''(y) is less than 0) and a stable region within the roughness (where U''(y) is greater than 0). Evidently, the instability begins when the low-momentum or wake region behind an element, being the region most affected by disturbances (purely numerical in this case), goes unstable and moves. In compressible flow with periodic boundaries, this motion sends disturbances to all regions of the domain. In the unstable layer just above the inflection point, the disturbances grow while being carried downstream with a propagation speed equal to the local mean velocity; they do not grow amid the low energy region near the roughness patch. The most amplified disturbance eventually arrives at the next roughness element downstream, perturbing its wake and inducing a global response at a frequency governed by the streamwise spacing between spheres and the mean velocity of the most amplified layer.

Deanna, R. G.

1992-01-01

437

Simulation of a Reliable Parallel Robot Controller

D.L. Hamilton, J.K. Bennett & I.D. Walker. Parallel solutions to the robot control problem are an attractive alternative to single-processor systems as the need for fine motion robot control increases. Since multiple processors provide inherent redundancy, parallel

Bennett, John K.

438

Simulation of the Evolution of Information Content in Transcription Factor Binding Sites Using a Parallelized Genetic Algorithm. Joseph Cornish*, Robert Forder**, Ivan Erill*, Matthias K. Gobbert** (*Department ...) ... are not able to recognize correlation information in binding sites. We implement a genetic algorithm

Gobbert, Matthias K.

439

This paper presents the usefulness of simulation in studying the impacts of system failures and delays on the output and cycle time of finished weldments produced by a robotic work cell having both serial and parallel processes. Due to multiple processes and overlapped activities, process mapping plays a significant role in building the model. The model replicates a non-terminating welding

Carl R. Williams; P. Chompuming

2002-01-01

440

Verification and Validation of Agent-based Scientific Simulation Models

Most formalized model verification and validation techniques come from industrial and system engineering for discrete-event system simulations. These techniques are widely used in computational science. The agent-based modeling approach is different from the discrete event modeling approaches largely used in industrial and system engineering in many aspects. Since the agent-based modeling approach has recently become an attractive and

Xiaorong Xiang; Ryan Kennedy; Gregory Madey; Steve Cabaniss

2005-01-01

441

Computational Steering in Simulation of Manufacturing Systems

Traditional steps in a discrete event manufacturing simulation are to prepare input variables, select simulation parameters, run the simulation and review the results after the execution is completed. The main drawback in such systems is that complex “what if scenario” analysis would require several simulation cycles before they divulge interesting or valuable information. In an approach proposed here, we have

T. Kesavadas; Abhishek Sudhir

2000-01-01

442

NASA Technical Reports Server (NTRS)

A piloted comparison of rigid and aeroelastic blade-element rotor models was conducted at the Crew Station Research and Development Facility (CSRDF) at Ames Research Center. A simulation development and analysis tool, FLIGHTLAB, was used to implement these models in real time using parallel processing technology. Pilot comments and quantitative analysis performed both on-line and off-line confirmed that elastic degrees of freedom significantly affect perceived handling qualities. Trim comparisons show improved correlation with flight test data when elastic modes are modeled. The results demonstrate the efficiency with which the mathematical modeling sophistication of existing simulation facilities can be upgraded using parallel processing, and the importance of these upgrades to simulation fidelity.

Hill, Gary; Duval, Ronald W.; Green, John A.; Huynh, Loc C.

1991-01-01

443

Mesoscale Simulations of Particulate Flows with Parallel Distributed Lagrange Multiplier Technique

Fluid particulate flows are common phenomena in nature and industry. Modeling of such flows at micro and macro levels, as well as establishing relationships between these approaches, is needed to understand properties of the particulate matter. We propose a computational technique based on the direct numerical simulation of particulate flows. The numerical method is based on the distributed Lagrange multiplier technique following the ideas of Glowinski et al. (1999). Each particle is explicitly resolved on an Eulerian grid as a separate domain, using solid volume fractions. The fluid equations are solved through the entire computational domain; however, Lagrange multiplier constraints are applied inside the particle domain such that the fluid within any volume associated with a solid particle moves as an incompressible rigid body. Mutual forces for the fluid-particle interactions are internal to the system. Particles interact with the fluid via the fluid dynamic equations, resulting in implicit fluid-rigid-body coupling relations that produce realistic fluid flow around the particles (i.e., no-slip boundary conditions). The particle-particle interactions are implemented using explicit force-displacement interactions for frictional inelastic particles, similar to the DEM method of Cundall et al. (1979), with some modifications using the volume of the overlapping region as an input to the contact forces. The method is flexible enough to handle arbitrary particle shapes and size distributions. A parallel implementation of the method is based on the SAMRAI (Structured Adaptive Mesh Refinement Application Infrastructure) library, which allows handling of large numbers of rigid particles and enables local grid refinement. Accuracy and convergence of the presented method have been tested against known solutions for a falling sphere as well as by examining fluid flows through stationary particle beds (periodic and cubic packing).
To evaluate code performance and validate the particle contact physics algorithm, we performed simulations of a representative experiment conducted at the University of California at Berkeley for pebble flow through a narrow opening.
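The force-displacement contact described above (DEM-style, frictional inelastic) can be sketched as a linear spring-dashpot normal force based on sphere overlap. The spring and damping constants below are invented, only the normal component is shown, and the paper's overlap-volume modification is not reproduced.

```python
import numpy as np

def contact_force(x1, x2, r1, r2, v1, v2, k=1e4, damping=5.0):
    """Linear spring-dashpot normal contact between two spheres:
    returns the force on particle 1, zero when the spheres do not
    overlap; the dashpot term dissipates energy (inelastic contact)."""
    d = x1 - x2
    dist = np.linalg.norm(d)
    overlap = (r1 + r2) - dist
    if overlap <= 0.0:
        return np.zeros_like(d)          # no contact
    n = d / dist                          # unit normal, from 2 toward 1
    v_rel_n = np.dot(v1 - v2, n)          # normal relative velocity
    return (k * overlap - damping * v_rel_n) * n
```

A DEM driver would call this for every contacting pair each step and integrate the resulting forces (plus a tangential friction model) into the particles' rigid-body motion.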

Kanarska, Y

2010-03-24

444

NASA Technical Reports Server (NTRS)

Parallel processing of real-time dynamic systems simulation on a multiprocessor system named OSCAR is presented. In the simulation of dynamic systems, generally, the same calculations are repeated at every time step. However, Do-all or Do-across techniques cannot be applied to parallel processing of the simulation, since there exist data dependencies from the end of an iteration to the beginning of the next iteration, and furthermore data input and data output are required every sampling period. Therefore, parallelism inside the calculation required for a single time step, or a large basic block consisting of arithmetic assignment statements, must be used. In the proposed method, near-fine-grain tasks, each of which consists of one or more floating point operations, are generated to extract parallelism from the calculation and are assigned to processors using optimal static scheduling at compile time, in order to reduce the large run-time overhead caused by the use of near-fine-grain tasks. The practicality of the scheme is demonstrated on OSCAR (Optimally SCheduled Advanced multiprocessoR), which has been developed to extract the advantageous features of static scheduling algorithms to the maximum extent.
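The compile-time static assignment of tasks to processors can be illustrated with a classic list-scheduling heuristic: longest task first onto the currently least-loaded processor. This is a generic sketch with invented task data, not OSCAR's actual scheduling algorithm, and it ignores inter-task dependencies.

```python
import heapq

def list_schedule(task_costs, n_procs):
    """LPT list scheduling: sort tasks by decreasing cost and greedily
    place each on the least-loaded processor (tracked in a min-heap).
    Returns the task-to-processor map and the resulting makespan."""
    heap = [(0.0, p) for p in range(n_procs)]   # (current load, proc id)
    heapq.heapify(heap)
    assignment = {}
    for task, cost in sorted(task_costs.items(), key=lambda kv: -kv[1]):
        load, p = heapq.heappop(heap)
        assignment[task] = p
        heapq.heappush(heap, (load + cost, p))
    makespan = max(load for load, _ in heap)
    return assignment, makespan

# Hypothetical per-task floating-point costs for a 2-processor machine.
assignment, makespan = list_schedule({'a': 3.0, 'b': 3.0, 'c': 2.0, 'd': 2.0, 'e': 2.0}, 2)
```

Doing this once at compile time, as the paper advocates, avoids paying any scheduling cost at run time, which matters precisely because the tasks are near fine grain.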

Kasahara, Hironori; Honda, Hiroki; Narita, Seinosuke

1989-01-01

445

Multiscale modeling and simulation for polymer melt flows between parallel plates

The flow behaviors of polymer melt composed of short chains with ten beads between parallel plates are simulated by using a hybrid method of molecular dynamics and computational fluid dynamics. Three problems are solved: creep motion under a constant shear stress and its recovery motion after removing the stress, pressure-driven flows, and the flows in rapidly oscillating plates. In the creep/recovery problem, the delayed elastic deformation in the creep motion and evident elastic behavior in the recovery motion are demonstrated. The velocity profiles of the melt in pressure-driven flows are quite different from those of Newtonian fluid due to shear thinning. Velocity gradients of the melt become steeper near the plates and flatter at the middle between the plates as the pressure gradient increases and the temperature decreases. In the rapidly oscillating plates, the viscous boundary layer of the melt is much thinner than that of Newtonian fluid due to the shear thinning of the melt. Three different rheological regimes, i.e., the viscous fluid, visco-elastic liquid, and visco-elastic solid regimes, form over the oscillating plate according to the local Deborah numbers. The melt behaves as a viscous fluid in a region for $\omega\tau^R \lesssim 1$, and the crossover between the liquid-like and solid-like regimes takes place around $\omega\tau^\alpha \simeq 1$ (where $\omega$ is the angular frequency of the plate and $\tau^R$ and $\tau^\alpha$ are the Rouse and $\alpha$ relaxation times, respectively).

Shugo Yasuda; Ryoichi Yamamoto

2009-09-14

446

Over the past years, SLAC's Advanced Computations Department (ACD), under SciDAC sponsorship, has developed a suite of 3D (2D) parallel higher-order finite element (FE) codes, T3P (T2P) and Pic3P (Pic2P), aimed at accurate, large-scale simulation of wakefields and particle-field interactions in radio-frequency (RF) cavities of complex shape. The codes are built on the FE infrastructure that supports SLAC's frequency domain codes, Omega3P and S3P, to utilize conformal tetrahedral (triangular) meshes, higher-order basis functions and quadratic geometry approximation. For time integration, they adopt an unconditionally stable implicit scheme. Pic3P (Pic2P) extends T3P (T2P) to treat charged-particle dynamics self-consistently using the PIC (particle-in-cell) approach, the first such implementation on a conformal, unstructured grid using Whitney basis functions. Examples from applications to the International Linear Collider (ILC), Positron Electron Project-II (PEP-II), Linac Coherent Light Source (LCLS) and other accelerators will be presented to compare the accuracy and computational efficiency of these codes versus their counterparts using structured grids.

Candel, A.; Kabel, A.; Lee, L.; Li, Z.; Limborg, C.; Ng, C.; Prudencio, E.; Schussman, G.; Uplenchwar, R.; Ko, K.; /SLAC

2009-06-19

447

Parallelizing a Real-Time Steering Simulation for Computer Games with OpenMP

Future computer games need parallel programming to meet their ever-growing hunger for performance. We report on our experiences in parallelizing the game-like C++ application OpenSteerDemo with OpenMP. To enable deterministic data-parallel processing of real-time agent steering behaviour, we had to change the high-level design and refactor interfaces for explicit shared-resource access. Our experience is summarized in

Bjoern Knafla; Claudia Leopold

2007-01-01

448

Parallel in time simulation algorithms are presented and applied to conventional molecular dynamics (MD) and ab initio molecular dynamics (AIMD) models of realistic complexity. Assuming that a forward time integrator $f$ (e.g., the Verlet algorithm) is available to propagate the system from time $t_i$ (trajectory positions and velocities $x_i = (r_i, v_i)$) to time $t_{i+1}$ ($x_{i+1}$) by $x_{i+1} = f_i(x_i)$, the dynamics problem spanning an interval from $t_0 \ldots t_M$ can be transformed into a root finding problem, $F(X) = [x_i - f(x_{i-1})]_{i=1,\ldots,M} = 0$, for the trajectory variables. The root finding problem is solved using a variety of optimization techniques, including quasi-Newton and preconditioned quasi-Newton schemes that are all unconditionally convergent. The algorithms are parallelized by assigning a processor to each time-step entry in the columns of $F(X)$. The relation of this approach to other recently proposed parallel in time methods is discussed, and the effectiveness of various approaches to solving the root finding problem is tested. We demonstrate that more efficient dynamical models based on simplified interactions or coarsened time-steps provide preconditioners for the root finding problem. However, for MD and AIMD simulations such preconditioners are not required to obtain reasonable convergence, and their cost must be considered in the performance of the algorithm. The parallel in time algorithms developed are tested by applying them to MD and AIMD simulations of size and complexity similar to those encountered in present-day applications. These include a 1000 Si atom MD simulation using Stillinger-Weber potentials, and a HCl+4H2O AIMD simulation at the MP2 level. The maximum speedup obtained by parallelizing the Stillinger-Weber MD simulation was nearly 3.0. For the AIMD MP2 simulations the algorithms achieved speedups of up to 14.3. The parallel in time algorithms can be implemented in a distributed computing environment using very slow TCP/IP networks.
Scripts written in Python that make calls to a precompiled quantum chemistry package (NWChem) are demonstrated to provide an actual speedup of 8.2 for a 2.5 ps AIMD simulation of HCl+4H2O at the MP2/6-31G* level. Implemented in this way, these algorithms can be used for long-time, high-level AIMD simulations at a modest cost using machines connected by very slow networks such as WiFi, or in different time zones connected by the Internet. The algorithms can also be used with programs that are already parallel. By using these algorithms we are able to reduce the cost of a MP2/6-311++G(2d,2p) simulation, which had reached its maximum possible speedup in the parallelization of the electronic structure calculation, from 32 seconds per time step to 6.9 seconds per time step.
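The root-finding formulation F(X) = [x_i - f(x_{i-1})] = 0 can be sketched in a few lines. The propagator here is a velocity-Verlet step for a 1D harmonic oscillator standing in for the MD/AIMD integrators, and a plain Jacobi-style fixed-point sweep (which recovers the serial trajectory after M sweeps) stands in for the paper's quasi-Newton solvers; the time step and spring constant are assumed values.

```python
# Minimal parallel-in-time sketch: the whole trajectory X is the unknown of
# F(X) = [x_i - f(x_{i-1})]_{i=1..M} = 0, relaxed iteratively.
DT, K = 0.05, 1.0  # time step and spring constant (assumed values)

def f(x):
    """One velocity-Verlet step for x'' = -K x, state x = (r, v)."""
    r, v = x
    a = -K * r
    r_new = r + DT * v + 0.5 * DT * DT * a
    v_new = v + 0.5 * DT * (a - K * r_new)
    return (r_new, v_new)

def residual_norm(X):
    """Max norm of F(X) = [x_i - f(x_{i-1})]."""
    return max(max(abs(xi - fi) for xi, fi in zip(X[i], f(X[i - 1])))
               for i in range(1, len(X)))

def solve_trajectory(x0, M, sweeps):
    """Relax F(X) = 0 by Jacobi fixed-point sweeps X_i <- f(X_{i-1}).

    Each sweep's M updates read only the previous iterate, so they are
    mutually independent and could run one per processor -- exactly the
    parallelization strategy described in the abstract."""
    X = [x0] * (M + 1)          # crude initial guess: constant trajectory
    for _ in range(sweeps):
        X = [x0] + [f(X[i - 1]) for i in range(1, M + 1)]
    return X
```

After M sweeps the fixed-point iteration reproduces the serial trajectory exactly; the quasi-Newton and preconditioned variants in the paper exist to converge in far fewer sweeps.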

Bylaska, Eric J.; Weare, Jonathan Q.; Weare, John H.

2013-08-21

449

Parallel in time simulation algorithms are presented and applied to conventional molecular dynamics (MD) and ab initio molecular dynamics (AIMD) models of realistic complexity. Assuming that a forward time integrator $f$ (e.g., the Verlet algorithm) is available to propagate the system from time $t_i$ (trajectory positions and velocities $x_i = (r_i, v_i)$) to time $t_{i+1}$ ($x_{i+1}$) by $x_{i+1} = f_i(x_i)$, the dynamics problem spanning an interval from $t_0 \ldots t_M$ can be transformed into a root finding problem, $F(X) = [x_i - f(x_{i-1})]_{i=1,\ldots,M} = 0$, for the trajectory variables. The root finding problem is solved using a variety of root finding techniques, including quasi-Newton and preconditioned quasi-Newton schemes that are all unconditionally convergent. The algorithms are parallelized by assigning a processor to each time-step entry in the columns of $F(X)$. The relation of this approach to other recently proposed parallel in time methods is discussed, and the effectiveness of various approaches to solving the root finding problem is tested. We demonstrate that more efficient dynamical models based on simplified interactions or coarsened time-steps provide preconditioners for the root finding problem. However, for MD and AIMD simulations, such preconditioners are not required to obtain reasonable convergence and their cost must be considered in the performance of the algorithm. The parallel in time algorithms developed are tested by applying them to MD and AIMD simulations of size and complexity similar to those encountered in present day applications. These include a 1000 Si atom MD simulation using Stillinger-Weber potentials, and a HCl + 4H2O AIMD simulation at the MP2 level. The maximum speedup (serial execution time/parallel execution time) obtained by parallelizing the Stillinger-Weber MD simulation was nearly 3.0. For the AIMD MP2 simulations, the algorithms achieved speedups of up to 14.3.
The parallel in time algorithms can be implemented in a distributed computing environment using very slow transmission control protocol/Internet protocol networks. Scripts written in Python that make calls to a precompiled quantum chemistry package (NWChem) are demonstrated to provide an actual speedup of 8.2 for a 2.5 ps AIMD simulation of HCl + 4H2O at the MP2/6-31G* level. Implemented in this way, these algorithms can be used for long-time, high-level AIMD simulations at a modest cost using machines connected by very slow networks such as WiFi, or in different time zones connected by the Internet. The algorithms can also be used with programs that are already parallel. Using these algorithms, we are able to reduce the cost of a MP2/6-311++G(2d,2p) simulation that had reached its maximum possible speedup in the parallelization of the electronic structure calculation from 32 s/time step to 6.9 s/time step. PMID:23968079

Bylaska, Eric J; Weare, Jonathan Q; Weare, John H

2013-08-21

450

Modelling and simulation of a distributed battery management system

This paper discusses the modelling and simulation of a distributed battery management system with a continuous and discrete-event simulation environment. The simulation model focuses on replicating the generic components within the system into model blocks to provide a structured approach in simulating battery networks. The simulation model also deals with three key network levels, which are the process level, the

Darren LIM; A. Anbuky

2004-01-01

451

Simulation and critical care modeling

Simulation and critical care modeling (Jennifer E. Kreke, Andrew J. Schaefer, and Mark S.). Studies in the literature suggest that simulation modeling techniques such as Markov modeling, Monte Carlo simulation, and discrete-event simulation are useful tools for analyzing complex systems in critical care. These simulation

Schaefer, Andrew

452

In this study, the method of characteristics (MOC) was adopted to evaluate valve-induced water hammer phenomena in a parallel pumps feedwater system (PPFS) during the alternate startup process of the parallel pumps. Based on a closed set of physical and mathematical equations supplemented with reasonable boundary conditions, a code was developed to compute the transient phenomena, including the pressure wave vibration, local
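A textbook reduction of the MOC scheme referenced above can be sketched for a single frictionless pipe with an upstream reservoir and an instantly closing downstream valve. This is an illustrative sketch, not the paper's parallel-pump feedwater model; all parameters are assumptions.

```python
# Method-of-characteristics step for a single frictionless pipe.
# Interior nodes combine the C+ characteristic from node i-1 and the
# C- characteristic from node i+1; dt = dx / a (wave speed a).
def moc_step(H, Q, B, H_res, valve_open):
    """Advance head H[i] and flow Q[i] one time step.

    B = a / (g * A) is the characteristic impedance of the pipe."""
    n = len(H)
    Hn, Qn = H[:], Q[:]
    for i in range(1, n - 1):
        cp = H[i - 1] + B * Q[i - 1]   # C+ : information arriving from the left
        cm = H[i + 1] - B * Q[i + 1]   # C- : information arriving from the right
        Hn[i] = 0.5 * (cp + cm)
        Qn[i] = (cp - cm) / (2.0 * B)
    # upstream reservoir: head fixed, flow from the C- characteristic
    cm = H[1] - B * Q[1]
    Hn[0] = H_res
    Qn[0] = (H_res - cm) / B
    # downstream valve: flow prescribed, head from the C+ characteristic
    cp = H[-2] + B * Q[-2]
    Qn[-1] = Q[-1] if valve_open else 0.0
    Hn[-1] = cp - B * Qn[-1]
    return Hn, Qn
```

Starting from a steady flow Q0, closing the valve produces the Joukowsky head rise B·Q0 at the valve within one time step.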

Wenxi Tian; G. H. Su; Gaopeng Wang; Suizheng Qiu; Zejun Xiao

2008-01-01

453

An Empirical Study of Data Partitioning and Replication in Parallel Simulation

Two of the important design decisions in developing a parallel program are how the data space is to be partitioned and how much data replication there should be among the various processes. In this paper, we propose a strategy for handling data distribution and replication for the problem of proximity detection in a parallel combat simulation. The strategy

Frederick Wieland; Lawrence Hawley; Leo Blume

1990-01-01

454

NASA Astrophysics Data System (ADS)

Thousands of numerical tsunami simulations allow the computation of inundation and run-up along the coast for vulnerable areas over time. A so-called Matching Scenario Database (MSDB) [1] contains this large number of simulations in text file format. In order to visualize these wave propagations, the scenarios have to be reprocessed automatically. In the TRIDEC project, funded by the Seventh Framework Programme of the European Union, a Virtual Scenario Database (VSDB) and a Matching Scenario Database (MSDB) were established, amongst others, by the working group of the University of Bologna (UniBo) [1]. One part of TRIDEC was the development of a new generation of Decision Support System (DSS) for tsunami Early Warning Systems (TEWS) [2]. A working group of the GFZ German Research Centre for Geosciences was responsible for developing the Command and Control User Interface (CCUI), the central software application that supports operator activities, incident management and message dissemination. For integration and visualization in the CCUI, the numerical tsunami simulations from the MSDB must be converted into the shapefile format. The use of shapefiles enables much easier integration into standard Geographic Information Systems (GIS); the CCUI itself is based on two widely used open-source products (the GeoTools library and uDig), which provide shapefile integration out of the box. In this case, several thousand tsunami variations were processed for an example area around the Western Iberian margin. Due to the mass of data, only a program-controlled process was conceivable. In order to optimize the computing effort and operating time, an existing GFZ High Performance Computing (HPC) cluster was used. Thus, geospatial software capable of parallel processing was sought.
The FOSS tool Geospatial Data Abstraction Library (GDAL/OGR) was used to match the coordinates with the wave heights and to generate the different shapefiles for certain time steps. The shapefiles then contain lines visualizing the isochrones of the wave propagation and, moreover, data about the maximum wave height and the Estimated Time of Arrival (ETA) at the coast. Our contribution shows the entire workflow and the visualization results of the processing for the example region of the Western Iberian ocean margin. [1] Armigliato A., Pagnoni G., Zaniboni F., Tinti S. (2013), Database of tsunami scenario simulations for Western Iberia: a tool for the TRIDEC Project Decision Support System for tsunami early warning, Vol. 15, EGU2013-5567, EGU General Assembly 2013, Vienna (Austria). [2] Löwe, P., Wächter, J., Hammitzsch, M., Lendholt, M., Häner, R. (2013): The Evolution of Service-oriented Disaster Early Warning Systems in the TRIDEC Project, 23rd International Ocean and Polar Engineering Conference - ISOPE-2013, Anchorage (USA).
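The matching step of the workflow above, deriving per-point maximum wave height and ETA from a time series of simulated heights, can be sketched as follows. The actual TRIDEC processing writes shapefiles with GDAL/OGR; this fragment only computes the attributes, and the field names and arrival threshold are assumptions.

```python
# Match grid coordinates with simulated wave heights and derive, per point,
# the maximum wave height and the Estimated Time of Arrival (ETA).
def match_scenario(coords, heights_per_step, dt, arrival_threshold=0.01):
    """coords: list of (lon, lat); heights_per_step: list over time steps of
    per-point wave heights (metres); dt: seconds between stored steps.
    Returns one record per point with max height and ETA (None if dry)."""
    records = []
    for j, (lon, lat) in enumerate(coords):
        series = [step[j] for step in heights_per_step]
        hmax = max(series)
        eta = next((i * dt for i, h in enumerate(series)
                    if h >= arrival_threshold), None)
        records.append({"lon": lon, "lat": lat, "hmax": hmax, "eta_s": eta})
    return records
```

Each record would then become one shapefile feature, with `hmax` and `eta_s` as attribute fields.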

Schroeder, Matthias; Jankowski, Cedric; Hammitzsch, Martin; Wächter, Joachim

2014-05-01

455

PeliGRIFF, a parallel DEM-DLM/FD direct numerical simulation tool for 3D particulate flows

The problem of particulate flows at moderate to high concentration and finite Reynolds number is addressed by parallel direct numerical simulation. The present contribution is an extension of the work published in Computers & Fluids 38:1608 (2009), where systems of moderate size in a 2D geometry were examined. At the numerical level, the suggested method is inspired by the framework

Anthony Wachs

456

The transient, viscous, incompressible, hydrodynamic Couette flow in a rotating porous medium channel is studied in this paper. The channel comprises a pair of infinitely long parallel plates which rotate with uniform angular velocity about an axis normal to the plates. The porous medium is simulated using a Darcy–Forchheimer drag force model which includes both bulk matrix porous drag (dominant

O. Anwar Bég; H. S. Takhar; Joaquín Zueco; A. Sajid; R. Bhargava

2008-01-01

457

Tau-leaping is a stochastic simulation algorithm that efficiently reconstructs the temporal evolution of biological systems, modeled according to the stochastic formulation of chemical kinetics. The analysis of the dynamical properties of these systems in physiological and perturbed conditions usually requires the execution of a large number of simulations, leading to high computational costs. Since each simulation can be executed independently of the others, a massive parallelization of tau-leaping can bring relevant reductions in the overall running time. The emerging field of General Purpose Graphics Processing Unit (GPGPU) computing provides power-efficient, high-performance computing at a relatively low cost. In this work we introduce cuTauLeaping, a stochastic simulator of biological systems that makes use of GPGPU computing to execute multiple parallel tau-leaping simulations, fully exploiting Nvidia's Fermi GPU architecture. We show how a considerable computational speedup is achieved on the GPU by partitioning the execution of tau-leaping into multiple separated phases, and we describe how to avoid some implementation pitfalls related to the scarcity of memory resources on the GPU streaming multiprocessors. Our results show that cuTauLeaping largely outperforms the CPU-based tau-leaping implementation as the number of parallel simulations increases, with a break-even directly depending on the size of the biological system and on the complexity of its emergent dynamics. In particular, cuTauLeaping is exploited to investigate the probability distribution of bistable states in the Schlögl model, and to carry out a bidimensional parameter sweep analysis to study the oscillatory regimes of the Ras/cAMP/PKA pathway in S. cerevisiae. PMID:24663957
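A serial tau-leaping step, the kernel that cuTauLeaping replicates across GPU threads, can be sketched as below. The reversible isomerization A ↔ B used in the test is an illustrative toy model (not the Schlögl or Ras/cAMP/PKA systems studied in the paper), and the rate constants and leap size tau are assumed values.

```python
# Serial tau-leaping sketch: fire each reaction channel a Poisson number of
# times per leap and apply the summed stoichiometric updates.
import numpy as np

def tau_leap(state, stoich, propensities, tau, rng):
    """One tau-leap; stoich has one row per reaction, one column per species."""
    a = propensities(state)
    k = rng.poisson(a * tau)          # firings per channel in [t, t + tau)
    return state + stoich.T @ k

def simulate(x0, stoich, propensities, tau, steps, seed=0):
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=np.int64)
    for _ in range(steps):
        x = tau_leap(x, stoich, propensities, tau, rng)
        x = np.maximum(x, 0)          # clamp negatives from overly large leaps
    return x
```

Many independent replicas of `simulate` (one per GPU thread, in the paper's setting) can then be aggregated into distributions over the final states.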

Besozzi, Daniela; Pescini, Dario; Mauri, Giancarlo

2014-01-01

458

NASA Astrophysics Data System (ADS)

Since Urata et al. reported on Smith-Purcell superradiance, numerous studies have been carried out to develop novel types of terahertz free-electron lasers. The particle-in-cell finite-difference time-domain (PIC-FDTD) method has been widely employed to study the process numerically. We present our studies on parallel computing based on general-purpose computation on graphics processing units (GPGPU) to accelerate our homemade PIC-FDTD simulation. We have succeeded in reducing the computational time to a quarter of that required for the same simulation using only the CPU.
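The FDTD core that GPGPU computing accelerates is a data-parallel stencil update. A 1D vacuum sketch at the "magic" Courant number S = 1 shows the update pattern (the vectorized NumPy operations map one-to-one onto GPU threads); the grid size and Gaussian source are assumptions, and the PIC particle coupling is omitted.

```python
# 1D vacuum FDTD (Yee scheme) at Courant number S = 1, driven by a hard
# Gaussian source at the left boundary. At S = 1 the update is exact and
# the pulse travels one cell per step.
import numpy as np

def fdtd_1d(steps, n=200):
    ez = np.zeros(n)
    hy = np.zeros(n - 1)
    for t in range(steps):
        hy += ez[1:] - ez[:-1]        # update H from the curl of E
        ez[1:-1] += hy[1:] - hy[:-1]  # update E from the curl of H
        ez[0] = np.exp(-0.5 * ((t - 30) / 8.0) ** 2)   # hard source
    return ez
```

Each array-wide update touches every cell independently, which is exactly the parallelism a GPU kernel exploits.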

Iwata, Tsuyoshi; Okajima, Akiko; Matsui, Tatsunosuke

2014-03-01

459

NASA Technical Reports Server (NTRS)

Part 1 of this paper presented the requirements for the real-time simulation of the Cassini spacecraft, along with some discussion of the DARTS algorithm. Here, in Part 2, we discuss the development and implementation of the parallel/vectorized DARTS algorithm and architecture for real-time simulation. The development of fast algorithms and architectures for real-time hardware-in-the-loop simulation of spacecraft dynamics is motivated by the fact that it represents a hard real-time problem, in the sense that the correctness of the simulation depends on both the numerical accuracy and the exact timing of the computation. For a given model fidelity, the computation must be completed within a predefined time period. Further reduction in computation time allows increasing the fidelity of the model (i.e., inclusion of more flexible modes) and of the integration routine.

Fijany, A.; Roberts, J. A.; Jain, A.; Man, G. K.

1993-01-01

460

Scalable parallel simulation of small-scale structures in cold dark matter

We present a parallel implementation of the particle-particle/particle-mesh (P³M) algorithm for distributed memory clusters. The llp3m-hc code uses a hybrid method for both computation and domain decomposition. Long-range ...

Shirokov, Alexander V. (Alexander Victorovich)

2005-01-01

461

A Framework for Parallel Unstructured Grid Generation for Complex Aerodynamic Simulations

NASA Technical Reports Server (NTRS)

A framework for parallel unstructured grid generation targeting both shared memory multi-processors and distributed memory architectures is presented. The two fundamental building-blocks of the framework consist of: (1) the Advancing-Partition (AP) method used for domain decomposition and (2) the Advancing Front (AF) method used for mesh generation. Starting from the surface mesh of the computational domain, the AP method is applied recursively to generate a set of sub-domains. Next, the sub-domains are meshed in parallel using the AF method. The recursive nature of domain decomposition naturally maps to a divide-and-conquer algorithm which exhibits inherent parallelism. For the parallel implementation, the Master/Worker pattern is employed to dynamically balance the varying workloads of each task on the set of available CPUs. Performance results by this approach are presented and discussed in detail as well as future work and improvements.
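The two-phase strategy above, recursive decomposition followed by parallel meshing under a Master/Worker pattern, can be sketched as below. The longest-axis bisection and the placeholder "mesher" are illustrative assumptions, not the actual Advancing-Partition and Advancing-Front algorithms.

```python
# Divide-and-conquer sketch: recursively bisect the domain, then mesh the
# sub-domains in parallel with a worker pool (the "master").
from concurrent.futures import ThreadPoolExecutor

def partition(box, max_cells):
    """Recursively bisect box = (xmin, xmax, ymin, ymax) along its longest
    axis until each sub-domain is small enough (divide step)."""
    xmin, xmax, ymin, ymax = box
    if (xmax - xmin) * (ymax - ymin) <= max_cells:
        return [box]
    if xmax - xmin >= ymax - ymin:
        mid = 0.5 * (xmin + xmax)
        halves = [(xmin, mid, ymin, ymax), (mid, xmax, ymin, ymax)]
    else:
        mid = 0.5 * (ymin + ymax)
        halves = [(xmin, xmax, ymin, mid), (xmin, xmax, mid, ymax)]
    return [b for h in halves for b in partition(h, max_cells)]

def mesh_subdomain(box):
    """Placeholder for the per-sub-domain mesher (conquer step)."""
    xmin, xmax, ymin, ymax = box
    return {"box": box, "n_elems": int((xmax - xmin) * (ymax - ymin) * 100)}

def parallel_mesh(box, max_cells, workers=4):
    subdomains = partition(box, max_cells)
    with ThreadPoolExecutor(max_workers=workers) as pool:   # master/worker
        return list(pool.map(mesh_subdomain, subdomains))
```

The recursion yields a naturally balanced tree of sub-domains, and the pool dynamically distributes the varying per-sub-domain workloads, mirroring the load-balancing role of the Master/Worker pattern described above.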

Zagaris, George; Pirzadeh, Shahyar Z.; Chrisochoides, Nikos

2009-01-01

462

Semi-automatic parallelization of direct and inverse problems for geothermal simulation

We describe a strategy for parallelizing a geothermal simulation package using the shared-memory programming model OpenMP. During code development, OpenMP is employed for the direct problem in such a way that, in a subsequent step, the OpenMP-parallelized code can be transformed via automatic differentiation into an OpenMP-parallelized code capable of computing derivatives for the
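The automatic differentiation step relied on above can be illustrated with forward-mode AD over dual numbers. The real work applies an AD tool to the OpenMP-parallelized solver; this toy class only shows the underlying chain-rule mechanics.

```python
# Forward-mode automatic differentiation with dual numbers.
import math

class Dual:
    """Carries value and derivative: Dual(v, d) represents v + d*eps."""
    def __init__(self, v, d=0.0):
        self.v, self.d = v, d
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.v + o.v, self.d + o.d)
    __radd__ = __add__
    def __mul__(self, o):           # product rule
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.v * o.v, self.d * o.v + self.v * o.d)
    __rmul__ = __mul__

def d_sin(x):                       # chain rule for sin
    return Dual(math.sin(x.v), math.cos(x.v) * x.d)

def derivative(f, x):
    """Evaluate df/dx at x by seeding the dual part with 1."""
    return f(Dual(x, 1.0)).d
```

For instance, `derivative(lambda x: x * x + 3 * x, 2.0)` evaluates to 7.0; an AD tool performs the analogous transformation mechanically on whole source files, preserving the OpenMP parallel structure.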

H. Martin Bücker; Arno Rasch; Volker Rath; Andreas Wolf

2009-01-01

463

We have designed a time-space multiresolution approach for large-scale molecular-dynamics (MD) simulations involving long-range Coulomb forces and three-body interactions. This approach has been implemented on various parallel architectures including the 512-node Intel Touchstone Delta at Caltech and the 128-processor IBM SP1 at Argonne National Laboratory. Parallel MD simulations involving 1.12 million particles have been performed to investigate pore interface growth and the roughness of fracture surfaces in porous silica. When the mass density is reduced to a critical value, pores grow catastrophically to cause fracture. The roughness exponent for internally fractured surfaces, α = 0.87 ± 0.02, supports experimental claims about the universality of α. A reliable interatomic potential has been developed for MD simulations of Si3N4. The nature of phonon densities-of-states due to low-energy floppy modes in crystalline and glassy states has been investigated. Floppy modes appear continuously in the glass as the connectivity of the system is reduced. In the crystal, they appear suddenly at 30% volume expansion. The density-of-states due to floppy modes varies linearly with energy, and the specific heat is significantly enhanced by these modes. Thermal conductivities of ceramic materials are calculated with a nonequilibrium MD method and the Kubo-Greenwood formula using a parallel eigensolver and the parallel MD approach. The calculations for amorphous silica agree well with experiments over a very wide range of temperatures above the plateau region. Currently, we are investigating thermal transport mechanisms in technologically important materials - porous glasses, nanophase ceramics, and zeolites.
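The thermal-conductivity calculation mentioned above combines MD with a Kubo-type formula. A sketch of the Green-Kubo route, integrating the autocorrelation of one recorded heat-flux component, is below; the flux series, volume, and temperature are placeholders, not data from the paper.

```python
# Green-Kubo estimate of thermal conductivity from a recorded heat flux:
# kappa = V / (kB T^2) * integral_0^tmax <J(0) J(t)> dt
import numpy as np

KB = 1.380649e-23  # Boltzmann constant, J/K

def green_kubo_kappa(J, dt, volume, T, max_lag):
    """Estimate kappa from one scalar flux component J sampled every dt."""
    n = len(J)
    acf = np.array([np.mean(J[:n - lag] * J[lag:]) for lag in range(max_lag)])
    # trapezoidal rule for the time integral of the autocorrelation
    integral = dt * (acf.sum() - 0.5 * (acf[0] + acf[-1]))
    return volume / (KB * T * T) * integral
```

In a real calculation J comes from a long equilibrium MD trajectory and max_lag is chosen past the decay of the autocorrelation; the nonequilibrium MD route in the paper instead imposes a temperature gradient and reads kappa from Fourier's law.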

Vashishta, P.; Kalia, R.K.; Greenwell, D.L.

1994-09-01

464

The design and implementation of the NCTUns network simulation engine

NCTUns is a network simulator running on Linux. It has several unique advantages over traditional network simulators. This paper presents the novel design and implementation of its simulation engine. The paper focuses on how to combine the kernel re-entering and discrete-event simulation methodologies to execute simulations quickly. The performance and scalability of NCTUns are also presented and discussed.

S. Y. Wang; C. L. Chou; C. C. Lin

2007-01-01

465

In this paper, the influence of the parallel nonlinearity on zonal flows and heat transport in global particle-in-cell ion-temperature-gradient simulations is studied. Although this term is in theory orders of magnitude smaller than the others, several authors [L. Villard, P. Angelino, A. Bottino et al., Plasma Phys. Contr. Fusion 46, B51 (2004); L. Villard, S. J. Allfrey, A. Bottino et al., Nucl. Fusion 44, 172 (2004); J. C. Kniep, J. N. G. Leboeuf, and V. C. Decyck, Comput. Phys. Commun. 164, 98 (2004);