Note: This page contains sample records for the topic parallel discrete-event simulation from Science.gov.
While these samples are representative of the content of Science.gov,
they are not comprehensive nor are they the most current set.
We encourage you to perform a real-time search of Science.gov
to obtain the most current and comprehensive results.
Last update: November 12, 2013.
1

Parallel discrete event simulation  

Microsoft Academic Search

Parallel discrete event simulation (PDES), sometimes called distributed simulation, refers to the execution of a single discrete event simulation program on a parallel computer. PDES has attracted a considerable amount of interest in recent years. From a pragmatic standpoint, this interest arises from the fact that large simulations in engineering, computer science, economics, and military applications, to mention a few,

Richard M. Fujimoto

1990-01-01

2

Towards Flexible, Reliable, High Throughput Parallel Discrete Event Simulations  

Microsoft Academic Search

The excessive amount of time necessary to complete large-scale discrete-event simulations of complex systems such as telecommunication networks, transportation systems, and multiprocessor computers continues to plague researchers and impede progress in many important domains. Parallel discrete-event simulation techniques offer an attractive solution to addressing this problem by enabling scalable execution, and much prior research has been focused on this approach.

Richard Fujimoto; Alfred Park; Jen-Chih Huang

2007-01-01

3

Scheduling critical channels in conservative parallel discrete event simulation  

Microsoft Academic Search

This paper introduces the Critical Channel Traversing (CCT) algorithm, a new scheduling algorithm for both sequential and parallel discrete event simulation. CCT is a general conservative algorithm that is aimed at the simulation of low-granularity network models on shared-memory multi-processor computers. An implementation of the CCT algorithm within a kernel called TasKit has demonstrated excellent performance for large ATM network simulations

Xiao Zhonge; Brian Unger; Rob Simmonds; John G. Cleary

1999-01-01

4

An adaptive synchronization protocol for parallel discrete event simulation  

SciTech Connect

Simulation, especially discrete event simulation (DES), is used in a variety of disciplines where numerical methods are difficult or impossible to apply. One problem with this method is that a sufficiently detailed simulation may take hours or days to execute, and multiple runs may be needed in order to generate the desired results. Parallel discrete event simulation (PDES) has been explored for many years as a method to decrease the time taken to execute a simulation. Many protocols have been developed which work well for particular types of simulations, but perform poorly when used for other types of simulations. Often it is difficult to know a priori whether a particular protocol is appropriate for a given problem. In this work, an adaptive synchronization method (ASM) is developed which works well on an entire spectrum of problems. The ASM determines, using an artificial neural network (ANN), the likelihood that a particular event is safe to process.

Bisset, K.R.

1998-12-01
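The notion of an event being "safe to process" above is the crux of conservative synchronization. For reference, a minimal sketch of the classic lookahead-based safety test (the hard guarantee that the paper's adaptive method replaces with a learned likelihood estimate) might look like the following; the function name and arguments are illustrative, not taken from the paper:

```python
def safe_to_process(event_time, input_channel_times, lookahead):
    # Lower Bound on incoming Time Stamps (LBTS): no input channel can
    # deliver a message earlier than its current clock plus the lookahead.
    lbts = min(t + lookahead for t in input_channel_times)
    # Conservative rule: process only events provably earlier than any
    # message that could still arrive.
    return event_time <= lbts

# An event at t=5.0 is safe if the slowest input channel is at t=4.0
# with lookahead 1.5 (LBTS = 5.5), but not if that channel is at t=3.0.
```

When lookahead is small, this test blocks often, which is exactly the situation where an adaptive or learned predictor becomes attractive.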

5

The effects of parallel processing architectures on discrete event simulation  

Microsoft Academic Search

As systems become more complex, particularly those containing embedded decision algorithms, mathematical modeling presents a rigid framework that often impedes representation to a sufficient level of detail. Using discrete event simulation, one can build models that more closely represent physical reality, with actual algorithms incorporated in the simulations. Higher levels of detail increase simulation run time. Hardware designers have succeeded

William Cave; Edward Slatt; Robert E. Wassmer

2005-01-01

6

A Parallel Discrete-Event Simulation of Wafer Fabrication Processes  

Microsoft Academic Search

Simulation modeling is an important tool for planning factory operations, to identify and eliminate possible bottlenecks and to maintain high machine utilization. The objective of our project is to apply parallel simulation techniques for virtual factory modeling in the electronics manufacturing sector. We have implemented a parallel wafer fabrication simulation model based on the Sematech data

Chu-Cheow LIM; Yoke-Hean LOW; Boon-Ping GAN; Stephen J. Turner; Sanjay Jain; Wentong CAI; Wen Jing HSU; Shell Ying

1998-01-01

7

Scaling properties of a conservative algorithm for distributed parallel discrete-event simulations  

Microsoft Academic Search

We study asymptotic scaling properties of conservative algorithms for parallel discrete-event simulations [Korniss et al., Science 299, 677 (2003); Kolakowska et al., Phys. Rev. E 68, 046705 (2003)] (e.g., for spatially distributed parallel simulations of dynamic Monte Carlo for spin systems). The key concept is a simulated time horizon that is an evolving nonequilibrium surface, specific for the particular algorithm.

P. S. Verma; A. K. Kolakowska; M. A. Novotny

2004-01-01

8

A scalable framework for parallel discrete event simulations on desktop grids  

Microsoft Academic Search

Utilizing desktop grid infrastructures is challenging for parallel discrete event simulation (PDES) codes due to characteristics such as inter-process messaging, restricted execution, and overall lower concurrency than typical volunteer computing projects. The Aurora2 system uses an approach that simultaneously provides both replicated execution support and scalable performance of PDES applications through public resource computing. This is accomplished through a multithreaded

Alfred Park; Richard Fujimoto

2007-01-01

9

Explicit spatial scattering for load balancing in conservatively synchronized parallel discrete-event simulations  

SciTech Connect

We re-examine the problem of load balancing in conservatively synchronized parallel discrete-event simulations executed on high-performance computing clusters, focusing on simulations where computational and messaging load tend to be spatially clustered. Such domains are frequently characterized by the presence of geographic 'hot-spots' - regions that generate significantly more simulation events than others. Examples of such domains include simulation of urban regions, transportation networks and networks where interaction between entities is often constrained by physical proximity. Noting that in conservatively synchronized parallel simulations the speed of execution is determined by the slowest (i.e., most heavily loaded) simulation process, we study different partitioning strategies for achieving equitable processor-load distribution in domains with spatially clustered load. In particular, we study the effectiveness of partitioning via spatial scattering to achieve optimal load balance. In this partitioning technique, nearby entities are explicitly assigned to different processors, thereby scattering the load across the cluster. This is motivated by two observations, namely, (i) since load is spatially clustered, spatial scattering should, intuitively, spread the load across the compute cluster, and (ii) in parallel simulations, equitable distribution of CPU load is a greater determinant of execution speed than message passing overhead. Through large-scale simulation experiments - both of abstracted and real simulation models - we observe that scatter partitioning, even with its greatly increased messaging overhead, significantly outperforms more conventional spatial partitioning techniques that seek to reduce messaging overhead. Further, even if hot-spots change over the course of the simulation, as long as the underlying feature of spatial clustering is retained, load continues to be balanced with spatial scattering, leading us to the observation that spatial scattering can often obviate the need for dynamic load balancing.

Thulasidasan, Sunil [Los Alamos National Laboratory]; Kasiviswanathan, Shiva [Los Alamos National Laboratory]; Eidenbenz, Stephan [Los Alamos National Laboratory]; Romero, Philip [Los Alamos National Laboratory]

2010-01-01
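The scatter-partitioning idea above can be illustrated on a toy one-dimensional domain. This sketch (hypothetical function names, not the authors' code) contrasts contiguous block assignment with round-robin scattering when load is spatially clustered:

```python
def block_partition(loads, n_procs):
    """Assign contiguous spatial blocks of entities to processors."""
    block = (len(loads) + n_procs - 1) // n_procs
    per_proc = [0.0] * n_procs
    for i, load in enumerate(loads):
        per_proc[i // block] += load
    return per_proc

def scatter_partition(loads, n_procs):
    """Explicitly assign nearby entities to different processors,
    scattering any spatial hot-spot across the whole cluster."""
    per_proc = [0.0] * n_procs
    for i, load in enumerate(loads):
        per_proc[i % n_procs] += load
    return per_proc

# A hot-spot occupying the first quarter of the domain: under block
# partitioning it lands on a single processor, which then dictates the
# speed of a conservatively synchronized run; scattering spreads it out.
loads = [10.0] * 25 + [1.0] * 75
assert max(scatter_partition(loads, 4)) < max(block_partition(loads, 4))
```

The trade-off, as the abstract reports, is that scattering turns most neighbor interactions into cross-processor messages; the paper's finding is that this messaging overhead is outweighed by the improved load balance.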

10

Distributed discrete-event simulation  

Microsoft Academic Search

Traditional discrete-event simulations employ an inherently sequential algorithm. In practice, simulations of large systems are limited by this sequentiality, because only a modest number of events can be simulated. Distributed discrete-event simulation (carried out on a network of processors with asynchronous message-communicating capabilities) is proposed as an alternative; it may provide better performance by partitioning the simulation among the component

Jayadev Misra

1986-01-01
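The "inherently sequential algorithm" Misra refers to is the global event-list loop: events must be dequeued one at a time in timestamp order, because any event may schedule new events at earlier future times than those still queued. A minimal sketch (class and method names are illustrative, not from the paper):

```python
import heapq

class Simulator:
    """Minimal sequential discrete-event loop: a single global clock
    and a single priority queue of pending events."""

    def __init__(self):
        self.clock = 0.0
        self._queue = []   # min-heap of (timestamp, seq, handler, payload)
        self._seq = 0      # tie-breaker so equal-time events stay FIFO

    def schedule(self, delay, handler, payload=None):
        heapq.heappush(self._queue, (self.clock + delay, self._seq, handler, payload))
        self._seq += 1

    def run(self):
        # The sequential bottleneck: each iteration depends on the
        # previous one, since a handler may insert earlier events.
        while self._queue:
            self.clock, _, handler, payload = heapq.heappop(self._queue)
            handler(self, payload)
```

Distributed discrete-event simulation partitions this single queue among processes, which is precisely what creates the synchronization problems addressed by the conservative and optimistic protocols in the other records on this page.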

11

Writing parallel, discrete-event simulations in ModSim: Insight and experience  

SciTech Connect

The Time Warp Operating System (TWOS) has been the focus of much research in parallel simulation. A new language, called ModSim, has been developed for use in conjunction with TWOS. The coupling of ModSim and TWOS provides a tool to construct large, complex simulation models that will run on several parallel and distributed computer systems. As part of the "Griffin Project" underway here at Los Alamos National Laboratory, there is strong interest in assessing the coupling of ModSim and TWOS from an application-oriented perspective. To this end, a key component of the Eagle combat simulation has been implemented in ModSim for execution on TWOS. In this paper, brief overviews of ModSim and TWOS are presented, and the compatibility of the computational models presented by the language and the operating system is examined in light of experience gained to date. 18 refs., 4 figs.

Rich, D.O.; Michelsen, R.E.

1989-09-11

12

SIMONE, a Discrete Event Simulation Supervisor.  

National Technical Information Service (NTIS)

The general discrete event simulation supervisory system SIMONE is described. The facilities available and method of use are detailed and illustrated with an example. It uses a three phase system to control time and event selection, and provides additiona...

B. K. McMillan

1988-01-01

13

Discrete Event Simulation Model Decomposition.  

National Technical Information Service (NTIS)

Simulation models are currently being used for a multitude of purposes. Some simulation models are concise and well organized thereby facilitating their usage. However, some of these models may be quite lengthy and complex which causes development costs t...

S. R. Matthes

1988-01-01

14

Hierarchical Modeling for Discrete Event Simulation (Panel)  

Microsoft Academic Search

This panel session is to discuss the issues and current research in hierarchical modeling for discrete event simulation. Three academic researchers are to briefly describe their research in hierarchical modeling and the issues, and one industrial practitioner will present the issues from a user's perspective.

R. G. Sargent; J. H. Mize; D. H. Withers; B. P. Zeigler

1993-01-01

15

THE OMNET++ DISCRETE EVENT SIMULATION SYSTEM  

Microsoft Academic Search

The paper introduces OMNeT++, a C++-based discrete event simulation package primarily targeted at simulating computer networks and other distributed systems. OMNeT++ is fully programmable and modular, and it was designed from the ground up to support modeling very large networks built from reusable model components. Large emphasis was placed also on easy traceability and debuggability of simulation models: one can

András Varga

2001-01-01

16

Performance prediction of large-scale parallel discrete event models of physical systems  

Microsoft Academic Search

A virtualization system is presented that is designed to help predict the performance of parallel/distributed discrete event simulations on massively parallel (supercomputing) platforms. It is intended to be useful in experimenting with and understanding the effects of execution parameters, such as different load balancing schemes and mixtures of model fidelity. A case study of the virtualization system is presented in

Kalyan S. Perumalla; Richard M. Fujimoto; Prashant J. Thakare; Santosh Pande; Homa Karimabadi; Yuri Omelchenko; Jonathan Driscoll

2005-01-01

17

Predicting the Performance of Synchronous Discrete Event Simulation  

Microsoft Academic Search

We develop a model to predict the performance of synchronous discrete event simulation. Our model considers the two most important factors for the performance of synchronous simulation: load balancing and communication. The effect of load balancing in a synchronous simulation is computed using probability distribution models. We derive a formula that computes the cost of synchronous simulation by combining a

Jinsheng Xu; Moon-jung Chung

2004-01-01
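The two factors identified above compose in a simple way: in a barrier-stepped (synchronous) simulation, every round costs the work of the most loaded processor plus a communication/synchronization term. The following toy cost model is only the deterministic skeleton of that idea; the paper's actual model is probability-distribution based, and the names here are made up:

```python
def round_cost(events_per_proc, cost_per_event, comm_cost):
    # Every processor waits at the barrier for the most loaded one,
    # then all pay the communication/synchronization overhead.
    return max(events_per_proc) * cost_per_event + comm_cost

# Perfectly balanced load beats a skewed one with the same total work:
assert round_cost([2, 2, 2], 1.0, 3.0) < round_cost([4, 1, 1], 1.0, 3.0)
```

Even this skeleton shows why load balance dominates at scale: total work stays fixed, but per-round cost tracks the maximum, not the mean.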

18

Integrating agent based modeling into a discrete event simulation  

Microsoft Academic Search

Movement of entities in discrete event simulation typically requires predefined paths with decision points that dictate entity movement. Human-like travel is difficult to model correctly with these constraints, because that is not how people move, and large individual differences exist in capabilities and strategies. Agent based modeling is considered a better way to simulate the real-time interaction of people with

Benjamin Dubiel; Omer Tsimhoni

2005-01-01

19

Hierarchical modeling for discrete event simulation (panel)  

Microsoft Academic Search

This panel session is to discuss the issues and current research in hierarchical modeling for discrete event simulation. Three academic researchers are to briefly describe their research in hierarchical modeling and the issues, and one industrial practitioner will present the issues from a user's perspective.

Robert G. Sargent; Joe H. Mize; David H. Withers; Bernard P. Zeigler

1993-01-01

20

Oil-derivatives pipeline logistics using discrete-event simulation  

Microsoft Academic Search

The management of oil-product pipelines represents a critical task in the daily operation of petroleum supply chains. Efficient computational tools are needed to perform this activity in a reliable and cost-effective manner. This work presents a novel discrete event simulation system developed on Arena® for the detailed scheduling of a multiproduct pipeline consisting of a sequence of pipes that connect

Vanina G. Cafaro; Diego C. Cafaro; Carlos A. Méndez; Jaime Cerdá

2010-01-01

21

The complexity of rapid learning in discrete event simulation  

Microsoft Academic Search

Sensitivity analysis and optimization of discrete event simulation models require the ability to efficiently estimate performance measures under different parameter settings. One technique, termed rapid learning, aims at enumerating all possible sample paths of such models. There are two necessary conditions for this capability: observability and constructability. This paper shows that the verification of the observability condition is an NP-hard

SHELDON H. JACOBSON

1997-01-01

22

Quantitative, scalable discrete-event simulation of metabolic pathways.  

PubMed

DMSS (Discrete Metabolic Simulation System) is a framework for modelling and simulating metabolic pathways. Quantitative simulation of metabolic pathways is achieved using discrete-event techniques. The approach differs from most quantitative simulators of metabolism which employ either time-differentiated functions or mathematical modelling techniques. Instead, models are constructed from biochemical data and biological knowledge, with accessibility and relevance to biologists serving as key features of the system. PMID:10786301

Meric, P A; Wise, M J

1999-01-01

23

Discrete-Event Simulation of Health Care Systems  

Microsoft Academic Search

Over the past forty years, health care organizations have faced ever-increasing pressures to deliver quality care while facing rising costs, lower reimbursements, and new regulatory demands. Discrete-event simulation has become a popular and effective decision-making tool for the optimal allocation of scarce health care resources to improve patient flow, while minimizing health care delivery costs and increasing patient satisfaction. The

Sheldon H. Jacobson; Shane N. Hall; James R. Swisher

24

Reversible Discrete Event Formulation and Optimistic Parallel Execution of Vehicular Traffic Models  

SciTech Connect

Vehicular traffic simulations are useful in applications such as emergency planning and traffic management. High speed of traffic simulations translates to speed of response and level of resilience in those applications. Discrete event formulation of traffic flow at the level of individual vehicles affords both the flexibility of simulating complex scenarios of vehicular flow behavior as well as rapid simulation time advances. However, efficient parallel/distributed execution of the models becomes challenging due to synchronization overheads. Here, a parallel traffic simulation approach is presented that is aimed at reducing the time for simulating emergency vehicular traffic scenarios. Our approach resolves the challenges that arise in parallel execution of microscopic, vehicular-level models of traffic. We apply a reverse computation-based optimistic execution approach to address the parallel synchronization problem. This is achieved by formulating a reversible version of a discrete event model of vehicular traffic, and by utilizing this reversible model in an optimistic execution setting. Three unique aspects of this effort are: (1) exploration of optimistic simulation applied to vehicular traffic simulation (2) addressing reverse computation challenges specific to optimistic vehicular traffic simulation (3) achieving absolute (as opposed to self-relative) speedup with a sequential speed close to that of a fast, de facto standard sequential simulator for emergency traffic. The design and development of the parallel simulation system is presented, along with a performance study that demonstrates excellent sequential performance as well as parallel performance. The benefits of optimistic execution are demonstrated, including a speed up of nearly 20 on 32 processors observed on a vehicular network of over 65,000 intersections and over 13 million vehicles.

Yoginath, Srikanth B [ORNL; Perumalla, Kalyan S [ORNL

2009-01-01
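Reverse computation, as used above, pairs each event handler with an inverse that undoes its state changes, so an optimistic simulator can roll back a mis-speculated event without checkpointing full state. A minimal sketch of such a forward/reverse pair (a made-up counter model for illustration, not the ORNL traffic code):

```python
class Intersection:
    """Toy reversible event target: vehicle arrivals at an intersection."""

    def __init__(self):
        self.queue_len = 0       # vehicles currently queued
        self.total_arrivals = 0  # cumulative arrivals

    def arrive(self):
        # Forward event: constructive updates (+=) invert exactly.
        self.queue_len += 1
        self.total_arrivals += 1

    def arrive_reverse(self):
        # Rollback: undo the updates in exact reverse order; no state
        # snapshot is needed for this event.
        self.total_arrivals -= 1
        self.queue_len -= 1
```

The difficulty in a real model lies with destructive updates (e.g., overwriting a vehicle's speed), which cannot be inverted arithmetically and typically require saving a small amount of incremental state.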

25

Discrete-Event Simulation Models of Plasmodium falciparum Malaria  

PubMed Central

We develop discrete-event simulation models using a single “timeline” variable to represent the Plasmodium falciparum lifecycle in individual hosts and vectors within interacting host and vector populations. Where they are comparable, our conclusions regarding the relative importance of vector mortality and the durations of host immunity and parasite development are congruent with those of classic differential-equation models of malaria epidemiology. However, our results also imply that in regions with intense perennial transmission, the influence of mosquito mortality on malaria prevalence in humans may be rivaled by that of the duration of host infectivity.

McKenzie, F. Ellis; Wong, Roger C.; Bossert, William H.

2008-01-01

26

Discrete Event Modeling and Massively Parallel Execution of Epidemic Outbreak Phenomena  

SciTech Connect

In complex phenomena such as epidemiological outbreaks, the intensity of inherent feedback effects and the significant role of transients in the dynamics make simulation the only effective method for proactive, reactive or post-facto analysis. The spatial scale, runtime speed, and behavioral detail needed in detailed simulations of epidemic outbreaks make it necessary to use large-scale parallel processing. Here, an optimistic parallel execution of a new discrete event formulation of a reaction-diffusion simulation model of epidemic propagation is presented, to dramatically increase the fidelity and speed with which epidemiological simulations can be performed. Rollback support needed during optimistic parallel execution is achieved by combining reverse computation with a small amount of incremental state saving. Parallel speedup of over 5,500 and other runtime performance metrics of the system are observed with weak-scaling execution on a small (8,192-core) Blue Gene/P system, while scalability with a weak-scaling speedup of over 10,000 is demonstrated on 65,536 cores of a large Cray XT5 system. Scenarios representing large population sizes, exceeding several hundreds of millions of individuals in the largest cases, are successfully exercised to verify model scalability.

Perumalla, Kalyan S [ORNL; Seal, Sudip K [ORNL

2011-01-01

27

Multi-Decision Supervisory Control: Parallel Decentralized Architectures Cooperating for Controlling Discrete Event Systems  

Microsoft Academic Search

This paper deals with decentralized supervisory control, where a set of local supervisors cooperate in order to achieve a given global specification by controlling a discrete event system. We propose a new framework, called multi-decision control, whose basic principle consists in using several decentralized supervisory control architectures working in parallel and whose decisions are combined disjunctively or conjunctively. We

Hicham Chakib; Ahmed Khoumsi

2011-01-01

28

Discrete event simulation using PL/I based general and special purpose simulation languages  

Microsoft Academic Search

This paper describes the architecture and language features of a simulation model which was developed using a new IBM discrete event simulation package based on PL/I. The package contains implementations of both the GPSS and SIMPL/I simulation languages, and in addition provides the capability for a model developer to create special purpose simulation languages tailored to his unique simulation application.

Walter C. Metz

1981-01-01

29

PC-based discrete event simulation model of the Civilian Radioactive Waste Management System.  

National Technical Information Service (NTIS)

A System Simulation Model has been developed for the Department of Energy to simulate the movement of individual waste packages (spent fuel assemblies and fuel containers) through the Civilian Radioactive Waste Management System (CRWMS). A discrete event ...

G. L. Airth; D. S. Joy; J. W. Nehls

1991-01-01

30

DISCRETE-EVENT PROCESS SIMULATION FOR THE CONTINUOUS SIMULATION MODEL BUILDER  

Microsoft Academic Search

Simulation models, whether discrete, continuous, or a combination of both, are characteristically built to improve the understanding of a system and the processes operating within that system. Continuous simulation models study continuous variables, amenable to analysis via mathematical techniques such as differential and difference equations. Discrete-event process simulation models study integer-valued or binary variables requiring analysis via methods of discrete

Edward J. Williams

2002-01-01

31

Integrating Heterogeneous Distributed COTS Discrete-Event Simulation Packages: An Emerging Standards-Based Approach  

Microsoft Academic Search

This paper reports on the progress made toward the emergence of standards to support the integration of heterogeneous discrete-event simulations (DESs) created in specialist support tools called commercial-off-the-shelf (COTS) discrete-event simulation packages (CSPs). The general standard for heterogeneous integration in this area has been developed from research in distributed simulation and is the IEEE 1516 standard. The

Simon J. E. Taylor; Xiaoguang Wang; Stephen John Turner; Malcolm Yoke-hean Low

2006-01-01

32

Desktop Modeling and Simulation: Parsimonious, yet Effective Discrete-Event Simulation Analysis.  

National Technical Information Service (NTIS)

This paper evaluates how quickly students can be trained to construct useful discrete-event simulation models using Excel. The typical supply chain used by many large national retailers is described, and an Excel-based simulation model is constructed of it...

J. R. Bradley

2012-01-01

33

A discrete-event mesoscopic traffic simulation model for hybrid traffic simulation  

Microsoft Academic Search

The paper presents a mesoscopic traffic simulation model, particularly suited for the development of integrated meso-micro traffic simulation models. The model combines a number of the recent advances in simulation modeling, such as discrete-event time resolution and combined queue-server and speed-density modeling, with a number of new features such as the ability to integrate with microscopic models to create hybrid

Wilco Burghout; Haris N. Koutsopoulos; Ingmar Andreasson

2006-01-01

34

Discrete-event based simulation conceptual modeling of systems biology  

Microsoft Academic Search

The production of protein from DNA via RNA is a very complicated process, known as the central dogma. In this paper, we used event-based simulation to model, simulate, analyze and specify the three main processes involved in protein production: replication, transcription, and translation. The whole control flow of event-based simulation is composed

Joe W. Yeol; Issac Barjis; Yeong S. Ryu; Joseph Barjis

2005-01-01

35

A discrete event simulation for the crew assignment process in North American freight railroads  

Microsoft Academic Search

This paper introduces a discrete event simulation for crew assignments and crew movements as a result of train traffic, labor rules, government regulations and optional crew schedules. The software is part of a schedule development system, FRCOS (Freight Rail Crew Optimization System), that was co-developed by Canadian National (CN) Rail and Circadian Technologies, Inc. The simulation allows verification of the

Rainer Guttkuhn; Todd Dawson; U. Trutschel

2003-01-01

36

A Comparison of System Dynamics (SD) and Discrete Event Simulation (DES)  

Microsoft Academic Search

System Dynamics modeling and Discrete Event Simulation both can be used to model corporate business decisions. However, there seems to be little dialog between the two communities of modelers. For instance, at the recent WinterSim simulation conference in Washington, DC, there were no presentations of System Dynamics models. Discussions held by the author with a number of practitioners in each

Al Sweetser; Andersen Consulting

37

A graphical, intelligent interface for discrete-event simulations  

SciTech Connect

This paper will present a prototype of an engagement analysis simulation tool. This simulation environment is to assist a user (analyst) in performing sensitivity analysis via the repeated execution of user-specified engagement scenarios. This analysis tool provides an intelligent front-end which is easy to use and modify. The intelligent front-end provides the capabilities to assist the user in the selection of appropriate scenario values. The incorporated graphics capabilities also provide additional insight into the simulation events as they are "unfolding." 4 refs., 4 figs.

Michelsen, C.; Dreicer, J.; Morgeson, D.

1988-01-01

38

Representing dynamic social networks in discrete event social simulation  

Microsoft Academic Search

One of the key structural components of social systems is the social network. The representation of this network structure is key to providing a valid representation of the society under study. The social science concept of homophily provides a conceptual model of how social networks are formed and evolve over time. Previous work described the results of social simulation using

Jonathan K. Alt; Stephen Lieberman

2010-01-01

39

Discrete Event Simulation Model (DESM) and the Associated Control and Example Input Files.  

National Technical Information Service (NTIS)

The Discrete Event Simulation Model (DESM) provides the capability to model the operation of a mass transit system operating over a network composed of guideway links and stations within a given time domain. A wide range of transit classes can be modelled...

L. S. Yuan; L. E. Watson

1983-01-01

40

Managing Distribution in Refined Products Pipelines Using Discrete-Event Simulation  

Microsoft Academic Search

The management of oil-product pipelines represents a critical task in the daily operation of petroleum supply chains. Efficient computational tools are needed to schedule pipeline operations in a reliable and cost-effective manner. This work presents a novel discrete event simulation system for the detailed scheduling of a multiproduct pipeline consisting of a sequence of pipes that connects a single input

M. Fernanda Gleizes; Diego C. Cafaro

2012-01-01

41

Improving the rigor of discrete-event simulation in logistics and supply chain research  

Microsoft Academic Search

Purpose – The purpose of this paper is to present an eight-step simulation model development process (SMDP) for the design, implementation, and evaluation of logistics and supply chain simulation models, and to identify rigor criteria for each step. Design/methodology/approach – An extensive review of literature is undertaken to identify logistics and supply chain studies that employ discrete-event simulation modeling. From

Ila Manuj; John T. Mentzer; Melissa R. Bowers

2009-01-01

42

A PC-based discrete event simulation model of the Civilian Radioactive Waste Management System  

Microsoft Academic Search

A System Simulation Model has been developed for the Department of Energy to simulate the movement of individual waste packages (spent fuel assemblies and fuel containers) through the Civilian Radioactive Waste Management System (CRWMS). A discrete event simulation language, GPSS/PC, which runs on an IBM/PC and operates under DOS 5.0, mathematically represents the movement and processing of radioactive waste packages

G. L. Airth; D. S. Joy; J. W. Nehls

1991-01-01

43

Framework for Simulation of Hybrid Systems: Interoperation of Discrete Event and Continuous Simulators Using HLA/RTI  

Microsoft Academic Search

A hybrid system is a combination of discrete event and continuous systems that act together to perform a function not possible with any one of the individual system types alone. A simulation model for the system consists of two sub-models, a continuous system model, and a discrete event model, and interfaces between them. Naturally, the modeling/simulation tool/environment of each sub-model

Changho Sung; Tag Gon Kim

2011-01-01

44

Modeling of time in discrete-event simulation of systems-on-chip  

Microsoft Academic Search

Today's consumer electronics industry uses modeling and simulation to cope with the complexity and time-to-market challenges of designing high-tech devices. In such a context, Transaction-Level Modeling (TLM) is a widespread modeling approach often used in conjunction with the IEEE standard SystemC discrete-event simulator. In this paper, we present a novel approach to modeling time that distinguishes between

Giovanni Funchal; Matthieu Moy

2011-01-01

45

Revisiting the Issue of Performance Enhancement of Discrete Event Simulation Software  

Microsoft Academic Search

New approaches are considered for performance enhancement of discrete-event simulation software. Instead of taking a purely algorithmic analysis view, we supplement algorithmic considerations with focus on system factors such as compiler/interpreter efficiency, hybrid interpreted/compiled code, virtual and cache memory issues, and so on. The work here consists of a case study of the SimPy language, in which we

Alex Bahouth; Steven Crites; Norman Matloff; Todd Williamson

2007-01-01

46

Teaming discrete-event simulation and geographic information systems to solve a temporal/spatial business problem  

Microsoft Academic Search

Although discrete-event simulation has pedagogically been rooted in computer science, and the practicality of geographic information systems in geography, the combined use of both in the business world allows solving some very challenging temporal/spatial (time and space dependent) business problems. The discrete-event simulation language WebGPSS, an ideal simulation environment for the business person, is teamed with Microsoft MapPoint, a GIS

Richard G. Born

2005-01-01

47

Teaming discrete-event simulation and geographic information systems to solve a temporal/spatial business problem

Microsoft Academic Search

Although discrete-event simulation has pedagogically been rooted in computer science, and the practicality of geographic information systems in geography, the combined use of both in the business world allows solving some very challenging temporal/spatial (time and space dependent) business problems. The discrete-event simulation language WebGPSS, an ideal simulation environment for the business person, is teamed with Microsoft MapPoint,

Richard G. Born

2005-01-01

48

Discrete event simulation of the Defense Waste Processing Facility (DWPF) analytical laboratory  

SciTech Connect

A discrete event simulation of the Savannah River Site (SRS) Defense Waste Processing Facility (DWPF) analytical laboratory has been constructed in the GPSS language. It was used to estimate laboratory analysis times at process analytical hold points and to study the effect of sample number on those times. Typical results are presented for three different simulations representing increasing levels of complexity, and for different sampling schemes. Example equipment utilization time plots are also included. SRS DWPF laboratory management and chemists found the simulations very useful for resource and schedule planning.

Shanahan, K.L.

1992-02-01

49

Discrete event simulation as a tool in optimization of a professional complex adaptive system.  

PubMed

Similar urgent needs for improvement of health care systems exist in the developed and developing world. The culture and the organization of an emergency department in developing countries can best be described as a professional complex adaptive system, where each agent (employee) is ignorant of the behavior of the system as a whole; no one understands the entire system. Each agent's action is based on the state of the system at the moment (i.e. lack of medicine, unavailable laboratory investigation, lack of beds and lack of staff in certain functions). An important question is how one can improve the emergency service within the given constraints. The use of simulation is one new approach to studying issues amenable to improvement. Discrete event simulation was used to simulate part of the patient flow in an emergency department. A simple model was built using a prototyping approach. The simulation showed that a minor rotation among the nurses could reduce the mean number of visitors that had to be referred to alternative flows within the hospital from 87 to 37 on a daily basis, with a mean utilization of the staff between 95.8% (the nurses) and 87.4% (the doctors). We conclude that even faced with resource constraints and lack of accessible data, discrete event simulation is a tool that can be used successfully to study the consequences of changes in very complex and self-organizing professional complex adaptive systems. PMID:18487739

Nielsen, Anders Lassen; Hilwig, Helmer; Kissoon, Niranjan; Teelucksingh, Surujpal

2008-01-01

50

Simulation planning and rostering: a discrete event simulation for the crew assignment process in north american freight railroads  

Microsoft Academic Search

This paper introduces a discrete event simulation for crew assignments and crew movements as a result of train traffic, labor rules, government regulations and optional crew schedules. The software is part of a schedule development system, FRCOS (Freight Rail Crew Optimization System), that was co-developed by Canadian National (CN) Rail and Circadian Technologies, Inc. The simulation allows verification of the

Rainer Guttkuhn; Todd Dawson; Udo Trutschel; Jon Walker; Mike Moroz

2003-01-01

51

DeMO: An Ontology for Discrete-event Modeling and Simulation  

PubMed Central

Several fields have created ontologies for their subdomains. For example, the biological sciences have developed extensive ontologies such as the Gene Ontology, which is considered a great success. Ontologies could provide similar advantages to the Modeling and Simulation community. They provide a way to establish common vocabularies and capture knowledge about a particular domain with community-wide agreement. Ontologies can support significantly improved (semantic) search and browsing, integration of heterogeneous information sources, and improved knowledge discovery capabilities. This paper discusses the design and development of an ontology for Modeling and Simulation called the Discrete-event Modeling Ontology (DeMO), and it presents prototype applications that demonstrate various uses and benefits that such an ontology may provide to the Modeling and Simulation community.

Silver, Gregory A; Miller, John A; Hybinette, Maria; Baramidze, Gregory; York, William S

2011-01-01

52

A generic discrete-event simulation model for outpatient clinics in a large public hospital.  

PubMed

The orthopedic outpatient department (OPD) ward in a large Thai public hospital is modeled using Discrete-Event Stochastic (DES) simulation. Key Performance Indicators (KPIs) are used to measure effects across various clinical operations during different shifts throughout the day. By considering various KPIs such as wait times to see doctors, percentage of patients who can see a doctor within a target time frame, and the time that the last patient completes their doctor consultation, bottlenecks are identified and resource-critical clinics can be prioritized. The simulation model quantifies the chronic, high patient congestion that is prevalent amongst Thai public hospitals with very high patient-to-doctor ratios. Our model can be applied across five different OPD wards by modifying the model parameters. Throughout this work, we show how DES models can be used as decision-support tools for hospital management. PMID:23778015

Weerawat, Waressara; Pichitlamken, Juta; Subsombat, Peerapong

2013-01-01

53

The Impact of Inpatient Boarding on ED Efficiency: A Discrete-Event Simulation Study  

PubMed Central

In this study, a discrete-event simulation approach was used to model Emergency Department's (ED) patient flow to investigate the effect of inpatient boarding on the ED efficiency in terms of the National Emergency Department Crowding Scale (NEDOCS) score and the rate of patients who leave without being seen (LWBS). The decision variable in this model was the boarder-released-ratio, defined as the ratio of admitted patients whose boarding time is zero to all admitted patients. Our analysis shows that the Overcrowded+ (a NEDOCS score over 100) ratio decreased from 88.4% to 50.4%, and the rate of LWBS patients decreased from 10.8% to 8.4% when the boarder-released-ratio changed from 0% to 100%. These results show that inpatient boarding significantly impacts both the NEDOCS score and the rate of LWBS patients, and this analysis provides a quantification of the impact of boarding on emergency department patient crowding.

Bair, Aaron E.; Chen, Yi-Chun; Morris, Beth A.

2009-01-01

54

Discrete event simulation model of sudden cardiac death predicts high impact of preventive interventions.  

PubMed

Sudden Cardiac Death (SCD) is responsible for at least 180,000 deaths a year and incurs an average cost of $286 billion annually in the United States alone. Herein, we present a novel discrete event simulation model of SCD, which quantifies the chains of events associated with the formation, growth, and rupture of atheroma plaques, and the subsequent formation of clots, thrombosis and onset of arrhythmias within a population. The predictions generated by the model are in good agreement both with results obtained from pathological examinations on the frequencies of three major types of atheroma, and with epidemiological data on the prevalence and risk of SCD. These model predictions allow for identification of interventions and importantly for the optimal time of intervention leading to high potential impact on SCD risk reduction (up to 8-fold reduction in the number of SCDs in the population) as well as the increase in life expectancy. PMID:23648451

Andreev, Victor P; Head, Trajen; Johnson, Neil; Deo, Sapna K; Daunert, Sylvia; Goldschmidt-Clermont, Pascal J

2013-01-01

55

Discrete Event Simulation Model of Sudden Cardiac Death Predicts High Impact of Preventive Interventions  

NASA Astrophysics Data System (ADS)

Sudden Cardiac Death (SCD) is responsible for at least 180,000 deaths a year and incurs an average cost of $286 billion annually in the United States alone. Herein, we present a novel discrete event simulation model of SCD, which quantifies the chains of events associated with the formation, growth, and rupture of atheroma plaques, and the subsequent formation of clots, thrombosis and onset of arrhythmias within a population. The predictions generated by the model are in good agreement both with results obtained from pathological examinations on the frequencies of three major types of atheroma, and with epidemiological data on the prevalence and risk of SCD. These model predictions allow for identification of interventions and importantly for the optimal time of intervention leading to high potential impact on SCD risk reduction (up to 8-fold reduction in the number of SCDs in the population) as well as the increase in life expectancy.

Andreev, Victor P.; Head, Trajen; Johnson, Neil; Deo, Sapna K.; Daunert, Sylvia; Goldschmidt-Clermont, Pascal J.

2013-05-01

56

Enabling smooth and scalable dynamic 3D visualization of discrete-event construction simulations in outdoor augmented reality  

Microsoft Academic Search

Visualization is a powerful method for verifying, validating, and communicating the results of a simulated model. Lack of visual understanding about a simulated model is one of the major reasons inhibiting contractors and engineers from using results obtained from discrete-event simulation to plan and design their construction processes and commit real resources on the job site. The fast emerging information

Amir H. Behzadan; Vineet R. Kamat

2007-01-01

57

Enabling smooth and scalable dynamic 3D visualization of discrete-event construction simulations in outdoor augmented reality

Microsoft Academic Search

Visualization is a powerful method for verifying, validating, and communicating the results of a simulated model. Lack of visual understanding about a simulated model is one of the major reasons inhibiting contractors and engineers from using results obtained from discrete-event simulation to plan and design their construction processes and commit real resources on the job site. The fast

Amir H. Behzadan; Vineet R. Kamat

2007-01-01

58

The effects of indoor environmental exposures on pediatric asthma: a discrete event simulation model  

PubMed Central

Background In the United States, asthma is the most common chronic disease of childhood across all socioeconomic classes and is the most frequent cause of hospitalization among children. Asthma exacerbations have been associated with exposure to residential indoor environmental stressors such as allergens and air pollutants as well as numerous additional factors. Simulation modeling is a valuable tool that can be used to evaluate interventions for complex multifactorial diseases such as asthma but in spite of its flexibility and applicability, modeling applications in either environmental exposures or asthma have been limited to date. Methods We designed a discrete event simulation model to study the effect of environmental factors on asthma exacerbations in school-age children living in low-income multi-family housing. Model outcomes include asthma symptoms, medication use, hospitalizations, and emergency room visits. Environmental factors were linked to percent predicted forced expiratory volume in 1 second (FEV1%), which in turn was linked to risk equations for each outcome. Exposures affecting FEV1% included indoor and outdoor sources of NO2 and PM2.5, cockroach allergen, and dampness as a proxy for mold. Results Model design parameters and equations are described in detail. We evaluated the model by simulating 50,000 children over 10 years and showed that pollutant concentrations and health outcome rates are comparable to values reported in the literature. In an application example, we simulated what would happen if the kitchen and bathroom exhaust fans were improved for the entire cohort, and showed reductions in pollutant concentrations and healthcare utilization rates. Conclusions We describe the design and evaluation of a discrete event simulation model of pediatric asthma for children living in low-income multi-family housing. 
Our model simulates the effect of environmental factors (combustion pollutants and allergens), medication compliance, seasonality, and medical history on asthma outcomes (symptom-days, medication use, hospitalizations, and emergency room visits). The model can be used to evaluate building interventions and green building construction practices on pollutant concentrations, energy savings, and asthma healthcare utilization costs, and demonstrates the value of a simulation approach for studying complex diseases such as asthma.

2012-01-01

59

A statistical process control approach to selecting a warm-up period for a discrete-event simulation  

Microsoft Academic Search

The selection of a warm-up period for a discrete-event simulation continues to be problematic. A variety of selection methods have been devised, and are briefly reviewed. It is apparent that no one method can be recommended above any other. A new approach, based upon the principles of statistical process control, is described (SPC method). Because simulation output data are often

Stewart Robinson

2007-01-01
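The record above proposes selecting a warm-up period using statistical process control. A loose sketch of the general idea follows; it is inspired by, not a reproduction of, Robinson's published SPC method, and the batch size and 3-sigma limits are arbitrary choices for illustration:

```python
import random
import statistics

def spc_warmup(series, batch=10):
    """Truncation heuristic in the spirit of a control chart: estimate
    control limits from the second half of the run (assumed to be near
    steady state), then cut at the first batch after which every
    remaining batch mean stays inside mean +/- 3 sigma."""
    means = [statistics.fmean(series[i:i + batch])
             for i in range(0, len(series) - batch + 1, batch)]
    tail = means[len(means) // 2:]          # assumed steady-state portion
    mu, sd = statistics.fmean(tail), statistics.stdev(tail)
    lo, hi = mu - 3 * sd, mu + 3 * sd
    for i in range(len(means)):
        if all(lo <= x <= hi for x in means[i:]):
            return i * batch                # suggested warm-up length
    return len(series)

# Synthetic output series: initial transient decaying toward a mean of 10.
rng = random.Random(3)
data = [10 + 8 * (0.9 ** t) + rng.gauss(0, 0.3) for t in range(300)]
cut = spc_warmup(data)
```

As the record notes, no single method is recommended above all others; a heuristic like this still requires judgment about batch size and how much of the run can be trusted to set the limits.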

60

Discrete-event simulation of a wide-area health care network.  

PubMed Central

OBJECTIVE: Predict the behavior and estimate the telecommunication cost of a wide-area message store-and-forward network for health care providers that uses the telephone system. DESIGN: A tool with which to perform large-scale discrete-event simulations was developed. Network models for star and mesh topologies were constructed to analyze the differences in performances and telecommunication costs. The distribution of nodes in the network models approximates the distribution of physicians, hospitals, medical labs, and insurers in the Province of Saskatchewan, Canada. Modeling parameters were based on measurements taken from a prototype telephone network and a survey conducted at two medical clinics. Simulation studies were conducted for both topologies. RESULTS: For either topology, the telecommunication cost of a network in Saskatchewan is projected to be less than $100 (Canadian) per month per node. The estimated telecommunication cost of the star topology is approximately half that of the mesh. Simulations predict that a mean end-to-end message delivery time of two hours or less is achievable at this cost. A doubling of the data volume results in an increase of less than 50% in the mean end-to-end message transfer time. CONCLUSION: The simulation models provided an estimate of network performance and telecommunication cost in a specific Canadian province. At the expected operating point, network performance appeared to be relatively insensitive to increases in data volume. Similar results might be anticipated in other rural states and provinces in North America where a telephone-based network is desired.

McDaniel, J G

1995-01-01

61

Using discrete-event simulation in strategic capacity planning for an outpatient physical therapy service.  

PubMed

This study uses a simulation model as a tool for strategic capacity planning for an outpatient physical therapy clinic in Taipei, Taiwan. The clinic provides a wide range of physical treatments, with 6 full-time therapists in each session. We constructed a discrete-event simulation model to study the dynamics of patient mixes with realistic treatment plans, and to estimate the practical capacity of the physical therapy room. The changes in time-related and space-related performance measurements were used to evaluate the impact of various strategies on the capacity of the clinic. The simulation results confirmed that the clinic is extremely patient-oriented, with a bottleneck occurring at the traction units for Intermittent Pelvic Traction (IPT), with usage at 58.9%. Sensitivity analysis showed that attending to more patients would significantly increase the number of patients staying for overtime sessions. We found that pooling the therapists produced beneficial results. The average waiting time per patient could be reduced by 45% when we pooled 2 therapists. We found that treating up to 12 new patients per session had no significant negative impact on returning patients. Moreover, we found that the average waiting time for new patients decreased if they were given priority over returning patients when called by the therapists. PMID:23525907

Rau, Chi-Lun; Tsai, Pei-Fang Jennifer; Liang, Sheau-Farn Max; Tan, Jhih-Cian; Syu, Hong-Cheng; Jheng, Yue-Ling; Ciou, Ting-Syuan; Jaw, Fu-Shan

2013-03-24

62

Preventive maintenance optimisation of multi-equipment manufacturing systems by combining discrete event simulation and multi-objective evolutionary algorithms  

Microsoft Academic Search

This paper is focused on preventive maintenance optimisation in manufacturing environments, with the objective of determining the optimal preventive maintenance frequencies for multi-equipment systems under cost and profit criteria. The initiative considers the interaction of production, work in process material, quality and maintenance aspects. In this work the suitability of discrete event simulation to model or modify complex system models

A. Oyarbide-Zubillaga; A. Goti; A. Sanchez

2008-01-01

63

Discrete-Event Simulation Model for Evaluating Air Force Reusable Military Launch Vehicle Post-Landing Operations.  

National Technical Information Service (NTIS)

The purpose of this research was to develop a discrete-event computer simulation model of the post-landing vehicle recovery operations to allow the Air Force Research Laboratory, Air Vehicles Directorate to evaluate design and process decisions and their imp...

M. Martindale

2006-01-01

64

Can discrete event simulation be of use in modelling major depression?  

PubMed Central

Background Depression is among the major contributors to worldwide disease burden and adequate modelling requires a framework designed to depict real world disease progression as well as its economic implications as closely as possible. Objectives In light of the specific characteristics associated with depression (multiple episodes at varying intervals, impact of disease history on course of illness, sociodemographic factors), our aim was to clarify to what extent "Discrete Event Simulation" (DES) models provide methodological benefits in depicting disease evolution. Methods We conducted a comprehensive review of published Markov models in depression and identified potential limits to their methodology. A model based on DES principles was developed to investigate the benefits and drawbacks of this simulation method compared with Markov modelling techniques. Results The major drawback to Markov models is that they may not be suitable to tracking patients' disease history properly, unless the analyst defines multiple health states, which may lead to intractable situations. They are also too rigid to take into consideration multiple patient-specific sociodemographic characteristics in a single model. To do so would also require defining multiple health states, which would render the analysis entirely too complex. We show that DES resolves these weaknesses and that its flexibility allows patients with differing attributes to move from one event to another in sequential order while simultaneously taking into account important risk factors such as age, gender, disease history and patients' attitude towards treatment, together with any disease-related events (adverse events, suicide attempt etc.). Conclusion DES modelling appears to be an accurate, flexible and comprehensive means of depicting disease progression compared with conventional simulation methodologies. Its use in analysing recurrent and chronic diseases appears particularly useful compared with Markov processes.

Le Lay, Agathe; Despiegel, Nicolas; Francois, Clement; Duru, Gerard

2006-01-01

65

Modelica-A General Object-Oriented Language for Continuous and Discrete-Event System Modeling and Simulation  

Microsoft Academic Search

Modelica is a general equation-based object-oriented language for continuous and discrete-event modeling of physical systems for the purpose of efficient simulation. The language unifies and generalizes previous object-oriented modeling languages. The Modelica modeling language and technology is being warmly received by the world community in modeling and simulation. It is bringing about a revolution in this area, based on

Peter Fritzson; Peter Bunus

2002-01-01

66

A discrete event-based simulation model for real-time traffic management in railways  

Microsoft Academic Search

Rail systems are highly complex and their control requires mathematical-computational tools. The main drawback of the models used to represent railway traffic, and to resolve any conflicts that occur, is the large computational time needed to obtain satisfactory results. Therefore the purpose of this paper is to study and design a discrete event-based model characterized by the positioning of trains

Jose L. Espinosa; Ricardo García-Ródenas

2012-01-01

67

Potential Applications of Discrete-event Simulation and Fuzzy Rule-based Systems to Structural Reliability and Availability  

Microsoft Academic Search

This chapter discusses and illustrates some potential applications of discrete-event simulation (DES) techniques in structural reliability and availability analysis, emphasizing the convenience of using probabilistic approaches in modern building and civil engineering practices. After reviewing existing literature on the topic, some advantages of probabilistic techniques over analytical ones are highlighted. Then, we introduce a general framework for performing structural reliability

Angel A. Juan; Albert Ferrer; Carles Serrat; Javier Faulin; Gleb Beliakov; Joshua Hester

68

Optimal location for a helicopter in a rural trauma system: prediction using discrete-event computer simulation.  

PubMed Central

A discrete-event computer simulation was developed using the C programming language to determine the optimal base location for a trauma system helicopter in Maine, a rural area with unevenly distributed population. Ambulance run reports from a one-year period provided input data on the times and places where major injuries occurred. Data from a statewide trauma registry were used to estimate the percentage of cases which would require trauma center care and the locations of functional trauma centers. Climatic data for this region were used to estimate the likelihood that a helicopter could not fly due to bad weather. The incidence of trauma events was modeled as a nonstationary Poisson process, and location of the events by an empirical distribution. For each simulated event, if the injuries were sufficiently severe, if weather permitted flying, if the occurrence were not within 20 miles of a center or outside the range of the helicopter, and if the helicopter were not already in service, then it was used for transportation. 35 simulated years were run for each of 4 proposed locations for the helicopter base. One of the geographically intermediate locations was shown to produce the most frequent utilization of the helicopter. Discrete-event simulation is a potentially useful tool in planning for emergency medical services systems. Further refinements and validation of predictions may lead to wider utilization.

Clark, D. E.; Hahn, D. R.; Hall, R. W.; Quaker, R. E.

1994-01-01
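The record above models trauma incidence as a nonstationary Poisson process. A standard way to generate event times from such a process is Lewis-Shedler thinning; the sketch below uses a hypothetical diurnal intensity function, not the empirical distributions from the Maine study:

```python
import math
import random

def thinned_poisson(rate, rate_max, t_end, rng):
    """Draw event times on [0, t_end) from a nonstationary Poisson process
    with intensity rate(t), by Lewis-Shedler thinning. rate_max must be an
    upper bound on rate(t) over the whole interval."""
    t, events = 0.0, []
    while True:
        t += rng.expovariate(rate_max)      # candidate from homogeneous process
        if t >= t_end:
            return events
        if rng.random() < rate(t) / rate_max:
            events.append(t)                # accept with prob rate(t)/rate_max

# Hypothetical diurnal intensity (events per hour) peaking mid-cycle.
rate = lambda t: 2.0 + 1.5 * math.sin(2 * math.pi * t / 24.0)
events = thinned_poisson(rate, rate_max=3.5, t_end=24.0, rng=random.Random(42))
```

Each accepted time can then be fed into the rest of the simulation (severity draw, weather check, helicopter availability) exactly as the record describes.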

69

Combining Simulation and Animation of Queueing Scenarios in a Flash-Based Discrete Event Simulator  

Microsoft Academic Search

eLearning is an effective medium for delivering knowledge and skills to scattered learners. In spite of improvements in electronic delivery technologies, eLearning is still a long way away from offering anything close to efficient and effective learning environments. Much of the literature suggests that eLearning materials which embed simulation will improve the eLearning experience and promise many benefits for both teachers and

Ruzelan Khalid; Wolfgang Kreutzer; Tim Bell

2009-01-01

70

Discrete event fuzzy airport control  

Microsoft Academic Search

A discrete event simulation that uses a modified expert system as a controller is described. Fuzzy logic concepts from analog controllers are applied in the expert system controller to mimic human control of an airport, modeled with a combined discrete and continuous state space. The controller is adaptive so rule confidences are automatically varied to achieve near optimum system performance.

John R. Clymer; Philip D. Corey; Judith A. Gardner

1992-01-01

71

Keynote Speech 3: Discrete-Event Computer Simulation as a Paradigm of Scientific Investigations  

Microsoft Academic Search

Summary form only given. Advances of computer technology initiated in the twentieth century have resulted in adoption of computer simulation as the most popular tool of performance evaluation studies of such complex stochastic dynamic systems as e.g. modern multimedia telecommunication networks. Such widespread reliance on simulation studies raises the question of credibility of their results. This question needs to be

Krzysztof Pawlikowski

2007-01-01

72

Healthcare I: a discrete-event simulation application for clinics serving the poor  

Microsoft Academic Search

Healthcare management operates in an environment of aggressive pricing, tough competition, and rapidly changing guidelines. Computer simulation models are increasingly used by large healthcare institutions to meet these challenges. However, small healthcare facilities serving the poor are equally in need of meeting these challenges but lack the finances and personnel required to develop and implement their own simulation solutions. An

Christos Alexopoulos; David Goldsman; Mark Sawyer; Michelle De Guire; David Kopald; Kathy Holcomb

2001-01-01

73

Discrete-Event Simulation Model of Myocardial Electrical Activity: Mathematical Electrophysiology.  

National Technical Information Service (NTIS)

This work unit was opened to provide a channel for in-house work on mathematical modeling and computer simulation of the electrocardiogram (ECG) and its underlying electrophysiology. This work was intended to complement work being done in contracted effor...

R. B. Howe

1994-01-01

74

On Parallel Stochastic Simulation of Diffusive Systems  

Microsoft Academic Search

The parallel simulation of biochemical reactions is a very interesting problem: biochemical systems are inherently parallel, yet the majority of the algorithms to simulate them, including the well-known and widespread Gillespie SSA, are strictly sequential. Here we investigate, in a general way, how to characterize the simulation of biochemical systems in terms of Discrete Event Simulation. We dissect their inherent

Lorenzo Dematté; Tommaso Mazza

2008-01-01
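The record above characterizes the Gillespie SSA in discrete-event terms: each reaction firing is a timestamped event whose waiting time is drawn from the current propensities. A sketch of the direct method for a single hypothetical reaction channel A -> B (a real system would track many channels and choose among them):

```python
import random

def gillespie_direct(x_a, k, t_end, rng):
    """Direct-method Gillespie SSA for the single reaction A -> B with
    rate constant k; the propensity is a(t) = k * (number of A left)."""
    t, trajectory = 0.0, [(0.0, x_a)]
    while x_a > 0:
        a0 = k * x_a                  # total propensity
        t += rng.expovariate(a0)      # exponential wait to next firing
        if t >= t_end:
            break
        x_a -= 1                      # fire the single reaction channel
        trajectory.append((t, x_a))
    return trajectory

traj = gillespie_direct(x_a=100, k=0.5, t_end=10.0, rng=random.Random(1))
```

Viewed this way, the SSA's (time, state-change) pairs are exactly the event stream a DES kernel processes, which is the framing the record uses to ask what can run in parallel.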

75

Discrete event simulation to support capacity planning for heart failure patients  

Microsoft Academic Search

Patients with heart failure (HF) are admitted to the hospital in a non-elective way. Receiving intravenous medication, they are very limited in their activities of daily living (ADL), resulting in intensive nursing care. To predict the needed capacity, a distinction is made between currently-admitted patients and patients that will be admitted in the near future. Computer simulation is used to

S. Groothuis; van Pol; N. Lencer; J. Stappers; J. Jans; J. Janssen; W. Dassen; P. Doevendans; M. Baljon; A. Hasman; G. G. van Merode

2001-01-01

76

Resource-constrained critical path analysis based on discrete event simulation and particle swarm optimization  

Microsoft Academic Search

The absence of a valid resource-constrained critical path method (CPM) not only hampers the widespread use of mainstream project scheduling software in construction management practice, but also destabilizes the very foundation of any sophisticated, CPM-based time or cost analysis in construction scheduling research. This has motivated us into developing an innovative, fully-automated solution to resource-constrained CPM called the Simplified Simulation-based

Ming Lu; Hoi-Ching Lam; Fei Dai

2008-01-01

77

Forest biomass supply logistics for a power plant using the discrete-event simulation approach  

SciTech Connect

This study investigates the logistics of supplying forest biomass to a potential power plant. Due to the complexities in such a supply logistics system, a simulation model based on the framework of Integrated Biomass Supply Analysis and Logistics (IBSAL) is developed in this study to evaluate the cost of delivered forest biomass, the equilibrium moisture content, and carbon emissions from the logistics operations. The model is applied to a proposed 300 MW power plant in Quesnel, BC, Canada. The results show that the biomass demand of the power plant would not be met every year. The weighted average cost of delivered biomass to the gate of the power plant is about C$90 per dry tonne. Estimates of equilibrium moisture content of delivered biomass and CO2 emissions resulting from the processes are also provided.

Mobini, Mahdi [University of British Columbia, Vancouver; Sowlati, T. [University of British Columbia, Vancouver; Sokhansanj, Shahabaddine [ORNL

2011-04-01

78

Aurora: An Approach to High Throughput Parallel Simulation  

Microsoft Academic Search

A master/worker paradigm for executing large-scale parallel discrete event simulation programs over network-enabled computational resources is proposed and evaluated. In contrast to conventional approaches to parallel simulation, a client/server architecture is proposed where clients (workers) repeatedly download state vectors of logical processes and associated message data from a server (master), perform simulation computations locally at the client, and then return the

Alfred Park; Richard M. Fujimoto

2006-01-01

79

A discrete event simulation model for evaluating the performances of an m/g/c/c state dependent queuing system.  

PubMed

M/G/C/C state dependent queuing networks consider service rates as a function of the number of residing entities (e.g., pedestrians, vehicles, and products). However, modeling such dynamic rates is not supported in modern Discrete Simulation System (DES) software. We designed an approach to cater to this limitation and used it to construct the M/G/C/C state-dependent queuing model in Arena software. Using the model, we have evaluated and analyzed the impacts of various arrival rates on the throughput, the blocking probability, the expected service time and the expected number of entities in a complex network topology. Results indicated that there is a range of arrival rates for each network where the simulation results fluctuate drastically across replications, causing the simulation and analytical results to exhibit discrepancies. Detailed results showing how the simulation results tally with the analytical results, in both tabular and graphical forms, together with scientific justifications, have been documented and discussed. PMID:23560037

Khalid, Ruzelan; Nawawi, Mohd Kamal M; Kawsar, Luthful A; Ghani, Noraida A; Kamil, Anton A; Mustafa, Adli

2013-04-01
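The state-dependent service rates described in the record above can be imitated in a small hand-rolled loss-system simulation. The sketch below is a simplification, not the paper's Arena model: arrivals are Poisson, each entity's service time is drawn once at service start from an occupancy-dependent distribution, and the congestion function is hypothetical:

```python
import heapq
import random

def mgcc_blocking(lam, capacity, service_time, t_end, rng):
    """Estimate the blocking probability of a loss system with Poisson
    arrivals (rate lam), at most `capacity` entities present, and a
    service time drawn from service_time(n, rng), where n is the
    occupancy when service starts. A simplification of the M/G/C/C
    state-dependent model: the rate is frozen at service start."""
    events = [(rng.expovariate(lam), 0)]    # (time, kind): 0 arrival, 1 departure
    n = arrivals = blocked = 0
    while events:
        t, kind = heapq.heappop(events)
        if t >= t_end:
            break
        if kind == 0:                       # arrival
            arrivals += 1
            heapq.heappush(events, (t + rng.expovariate(lam), 0))
            if n >= capacity:
                blocked += 1                # lost: the system is full
            else:
                n += 1
                heapq.heappush(events, (t + service_time(n, rng), 1))
        else:                               # departure
            n -= 1
    return blocked / arrivals

# Hypothetical congestion effect: mean service time grows with occupancy.
slow = lambda n, rng: rng.expovariate(1.0 / (0.5 * (1 + n / 10)))
p_light = mgcc_blocking(2.0, 10, slow, 2000.0, random.Random(7))
p_heavy = mgcc_blocking(8.0, 10, slow, 2000.0, random.Random(7))
```

Raising the arrival rate should raise the estimated blocking probability, mirroring the arrival-rate sweeps the record describes; near saturation, replications of such a model also fluctuate noticeably, as the authors report.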

80

A Discrete Event Simulation Model for Evaluating the Performances of an M/G/C/C State Dependent Queuing System  

PubMed Central

M/G/C/C state dependent queuing networks consider service rates as a function of the number of residing entities (e.g., pedestrians, vehicles, and products). However, modeling such dynamic rates is not supported in modern Discrete Simulation System (DES) software. We designed an approach to cater for this limitation and used it to construct an M/G/C/C state-dependent queuing model in Arena software. Using the model, we evaluated and analyzed the impacts of various arrival rates on the throughput, the blocking probability, the expected service time, and the expected number of entities in a complex network topology. Results indicated that for each network there is a range of arrival rates where the simulation results fluctuate drastically across replications, causing the simulation results and analytical results to exhibit discrepancies. Detailed results showing how the simulation results tally with the analytical results, in both abstract and graphical forms, together with scientific justifications, have been documented and discussed.

Khalid, Ruzelan; M. Nawawi, Mohd Kamal; Kawsar, Luthful A.; Ghani, Noraida A.; Kamil, Anton A.; Mustafa, Adli

2013-01-01

81

The split system approach to managing time in simulations of hybrid systems having continuous and discrete event components  

Microsoft Academic Search

The efficient and accurate management of time in simulations of hybrid models is an outstanding engineering problem. General a priori knowledge about the dynamic behavior of the hybrid system (i.e. essentially continuous, essentially discrete, or 'truly hybrid') facilitates this task. Indeed, for essentially discrete and essentially continuous systems, existing software packages can be conveniently used to perform quite sophisticated and

James J Nutaro; Phani Teja Kuruganti; Vladimir A Protopopescu; Mallikarjun Shankar

2012-01-01

82

Agent-based simulation tutorial - simulation of emergent behavior and differences between agent-based simulation and discrete-event simulation  

Microsoft Academic Search

This tutorial demonstrates the use of agent-based simulation (ABS) in modeling emergent behaviors. We first introduce key concepts of ABS by using two simple examples: the Game of Life and the Boids models. We illustrate agent-based modeling issues and simulation of emergent behaviors by using examples in social networks, auction-type markets, emergency evacuation, crowd behavior under normal situations, biology, material

Young-Jun Son; Charles M. Macal

2010-01-01

83

Object-Oriented Military Simulation Baseline for Parallel Simulation Research.  

National Technical Information Service (NTIS)

This thesis documents the design and implementation of a discrete event military simulation using a modular object-oriented design and the C programming language. The basic simulation is one of interacting objects. The objects move along a predetermined p...

R. J. Rizza

1990-01-01

84

DISCRETE EVENT MODELING IN PTOLEMY II  

Microsoft Academic Search

This report describes the discrete-event semantics and its implementation in the Ptolemy II software architecture. The discrete-event system representation is appropriate for time-oriented systems such as queueing systems, communication networks, and hardware systems. A key strength of our discrete-event implementation is that simultaneous events are handled systematically and deterministically. A formal and rigorous treatment of this property is given. One

Lukito Muliadi

85

Parallel simulation of DEVS and Cell-DEVS models on Windows-based PC cluster systems  

Microsoft Academic Search

The growing popularity of Networks of Workstations (NOW) in scientific computation has drawn increasing interest from the M&S community. This paper addresses the issue of parallel discrete-event simulation of DEVS and Cell-DEVS models on a Microsoft Windows-based cluster system comprising interconnected general-purpose personal computers. We present the architecture and features of PCD++Win, a parallel simulator that takes advantage of the

Bo Feng; Qi Liu; Gabriel A. Wainer

2008-01-01

86

Parallel Lisp Simulator,  

National Technical Information Service (NTIS)

CSIM is a simulator for parallel Lisp, based on a continuation passing interpreter. It models a shared-memory multiprocessor executing programs written in Common Lisp, extended with several primitives for creating and controlling processes. This paper des...

J. S. Weening

1988-01-01

87

Scaling Time Warp-based Discrete Event Execution to 10^4 Processors on a Blue Gene Supercomputer  

SciTech Connect

Lately, important large-scale simulation applications, such as emergency/event planning and response, are emerging that are based on discrete event models. The applications are characterized by their scale (several millions of simulated entities), their fine-grained nature of computation (microseconds per event), and their highly dynamic inter-entity event interactions. The desired scale and speed together call for highly scalable parallel discrete event simulation (PDES) engines. However, few such parallel engines have been designed or tested on platforms with thousands of processors. Here an overview is given of a unique PDES engine that has been designed to support Time Warp-style optimistic parallel execution as well as a more generalized mixed, optimistic-conservative synchronization. The engine is designed to run on massively parallel architectures with minimal overheads. A performance study of the engine is presented, including the first results to date of PDES benchmarks demonstrating scalability to as many as 16,384 processors on an IBM Blue Gene supercomputer. The results show, for the first time, the promise of effectively sustaining very large scale discrete event execution on up to 10^4 processors.

Perumalla, Kalyan S. [ORNL]

2007-01-01
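The Time Warp-style optimistic execution mentioned above can be shown in miniature: a single logical process checkpoints its state before every event and rolls back when a straggler (a message with a timestamp in its past) arrives. Antimessages, output queues, and GVT computation are omitted; all names are illustrative, not part of the ORNL engine.

```python
# Illustrative Time Warp logical process: events execute optimistically,
# state is checkpointed per event, a straggler forces rollback and redo.
class TimeWarpLP:
    def __init__(self):
        self.state = 0          # toy state: a running sum
        self.lvt = 0.0          # local virtual time
        self.log = []           # checkpoint i: (lvt, state) before event i
        self.processed = []     # events applied, in execution order

    def _apply(self, ts, value):
        self.log.append((self.lvt, self.state))
        self.state += value
        self.lvt = ts
        self.processed.append((ts, value))

    def receive(self, ts, value):
        if ts >= self.lvt:                   # in timestamp order: execute
            self._apply(ts, value)
            return
        # Straggler: undo every optimistically executed event after ts,
        # restore the checkpoint, then re-execute in timestamp order.
        idx = next(i for i, (t, _) in enumerate(self.processed) if t > ts)
        undone = self.processed[idx:]
        self.lvt, self.state = self.log[idx]
        self.log, self.processed = self.log[:idx], self.processed[:idx]
        self._apply(ts, value)
        for t, v in undone:
            self._apply(t, v)

lp = TimeWarpLP()
lp.receive(1.0, 10)
lp.receive(5.0, 20)      # executed optimistically
lp.receive(3.0, 5)       # straggler: rolls back the t=5.0 event
print([t for t, _ in lp.processed])  # → [1.0, 3.0, 5.0]
print(lp.state)                      # → 35
```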

88

Parallel Event-Driven Neural Network Simulations Using the Hodgkin-Huxley Neuron Model  

Microsoft Academic Search

Neural systems are composed of a large number of highly-connected neurons and are widely simulated within the neurological community. In this paper, we examine the application of parallel discrete event simulation techniques to networks of a complex model called the Hodgkin-Huxley neuron[1]. We describe the conversion of this model into an event-driven simulation, a technique that offers the potential of

Collin J. Lobb; Zenas Chao; Richard M. Fujimoto; Steve M. Potter

2005-01-01
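The conversion to event-driven form described above can be illustrated with a much simpler neuron than Hodgkin-Huxley: a leaky integrate-and-fire cell whose state is advanced analytically only when a spike event arrives, so no fixed time step is needed. The model, constants, and names below are assumptions for illustration only.

```python
# Event-driven simulation of a tiny leaky integrate-and-fire network:
# membrane potential decays analytically between events; spikes are
# scheduled through a priority queue with a fixed synaptic delay.
import heapq, math

TAU, THRESH, DELAY = 10.0, 1.0, 1.0   # illustrative constants

class Neuron:
    def __init__(self):
        self.v, self.last = 0.0, 0.0
    def deliver(self, t, weight):
        """Advance the potential analytically to time t, add the input."""
        self.v = self.v * math.exp(-(t - self.last) / TAU) + weight
        self.last = t
        if self.v >= THRESH:
            self.v = 0.0
            return True                # neuron fires
        return False

def run(neurons, synapses, stimuli, horizon):
    spikes, q = [], list(stimuli)      # events: (time, target, weight)
    heapq.heapify(q)
    while q:
        t, tgt, w = heapq.heappop(q)
        if t > horizon:
            break
        if neurons[tgt].deliver(t, w):
            spikes.append((t, tgt))
            for post, wt in synapses.get(tgt, []):
                heapq.heappush(q, (t + DELAY, post, wt))
    return spikes

nrn = [Neuron(), Neuron()]
# Neuron 0 excites neuron 1 strongly enough to fire it.
spikes = run(nrn, {0: [(1, 1.2)]}, [(0.0, 0, 0.6), (0.5, 0, 0.6)], 10.0)
print(spikes)  # → [(0.5, 0), (1.5, 1)]
```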

89

Xyce parallel electronic simulator.  

SciTech Connect

This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide. The focus of this document is to list, as exhaustively as possible, the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide.

Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Rankin, Eric Lamont; Schiek, Richard Louis; Thornquist, Heidi K.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Santarelli, Keith R.

2010-05-01

90

Scalable Parallel Crash Simulations  

SciTech Connect

We are pleased to submit our efforts in parallelizing the PRONTO application suite for consideration in the SuParCup 99 competition. PRONTO is a finite element transient dynamics simulator which includes a smoothed particle hydrodynamics (SPH) capability; it is similar in scope to the well-known DYNA, PamCrash, and ABAQUS codes. Our efforts over the last few years have produced a fully parallel version of the entire PRONTO code which (1) runs fast and scalably on thousands of processors, (2) has performed the largest finite-element transient dynamics simulations we are aware of, and (3) includes several new parallel algorithmic ideas that have solved some difficult problems associated with contact detection and SPH scalability. We motivate this work, describe the novel algorithmic advances, give performance numbers for PRONTO running on Sandia's Intel Teraflop machine, and highlight two prototypical large-scale computations we have performed with the parallel code. We have successfully parallelized a large-scale production transient dynamics code with a novel algorithmic approach that utilizes multiple decompositions for different key segments of the computations. To be able to simulate a more than ten million element model in a few tenths of a second per timestep is unprecedented for solid dynamics simulations, especially when full global contact searches are required. The key reason is our new algorithmic ideas for efficiently parallelizing the contact detection stage. To our knowledge, scalability of this computation had never before been demonstrated on more than 64 processors. This has enabled parallel PRONTO to become the only solid dynamics code we are aware of that can run effectively on 1000s of processors. More importantly, our parallel performance compares very favorably to the original serial PRONTO code which is optimized for vector supercomputers. On the container crush problem, a Teraflop node is as fast as a single processor of the Cray Jedi.
This means that on the Teraflop machine we can now run simulations with tens of millions of elements thousands of times faster than we could on the Jedi! This is enabling transient dynamics simulations of unprecedented scale and fidelity. Not only can previous applications be run with vastly improved resolution and speed, but qualitatively new and different analyses have been made possible.

Attaway, Stephen; Barragy, Ted; Brown, Kevin; Gardner, David; Gruda, Jeff; Heinstein, Martin; Hendrickson, Bruce; Metzinger, Kurt; Neilsen, Mike; Plimpton, Steve; Pott, John; Swegle, Jeff; Vaughan, Courtenay

1999-06-01

91

Scalable Parallel Crash Simulations  

Microsoft Academic Search

We are pleased to submit our efforts in parallelizing the PRONTO application suite for consideration in the SuParCup 99 competition. PRONTO is a finite element transient dynamics simulator which includes a smoothed particle hydrodynamics (SPH) capability; it is similar in scope to the well-known DYNA, PamCrash, and ABAQUS codes. Our efforts over the last few years have produced a

Stephen Attaway; Ted Barragy; Kevin Brown; David Gardner; Jeff Gruda; Martin Heinstein; Bruce Hendrickson; Kurt Metzinger; Mike Neilsen; Steve Plimpton; John Pott; Jeff Swegle; Courtenay Vaughan

1999-01-01

92

Discrete Events as Units of Perceived Time  

ERIC Educational Resources Information Center

In visual images, we perceive both space (as a continuous visual medium) and objects (that inhabit space). Similarly, in dynamic visual experience, we perceive both continuous time and discrete events. What is the relationship between these units of experience? The most intuitive answer may be similar to the spatial case: time is perceived as an…

Liverence, Brandon M.; Scholl, Brian J.

2012-01-01

93

Asynchronous implementation of synchronous discrete event control  

Microsoft Academic Search

Discrete event control is typically designed under the synchronous hypothesis that sensing and actuation incur zero delays, i.e., there exists zero delay between an event execution at a plant site and its observation at a controller site, and also between a control computation at a controller site and its enforcement at a plant site. An actual implementation, however, is asynchronous,

S. Xu; R. Kumar

2008-01-01

94

Parallel implementation of VHDL simulations on the Intel iPSC/2 hypercube. Master's thesis  

SciTech Connect

VHDL models are executed sequentially in current commercial simulators. As chip designs grow larger and more complex, simulations must run faster. One approach to increasing simulation speed is through parallel processors. This research transforms the behavioral and structural models created by Intermetrics' sequential VHDL simulator into models for parallel execution. The models are simulated on an Intel iPSC/2 hypercube, with synchronization of the nodes achieved by utilizing the Chandy-Misra paradigm for discrete-event simulations. Three eight-bit adders, the ripple carry, the carry save, and the carry-lookahead, are each run through the parallel simulator. Simulation time is cut at least in half for all three test cases over the sequential Intermetrics model. Results with regard to speedup are given to show the effects of different mappings, varying workloads per node, and overhead due to output messages.

Comeau, R.C.

1991-12-01
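The Chandy-Misra synchronization used in the thesis above rests on one rule: a logical process may only consume an input event once every input channel's clock guarantees no earlier event can still arrive, and null messages (timestamp only, no payload) keep idle channels' clocks advancing so the simulation does not deadlock. A deliberately tiny sketch, with invented names and a single merge process; a full implementation would process events from all channels in global timestamp order.

```python
# Toy conservative (Chandy-Misra style) merge process with two input
# channels; null messages carry only a timestamp promise.
from collections import deque

class Channel:
    def __init__(self):
        self.q = deque()
        self.clock = 0.0                  # timestamp of last message seen
    def send(self, ts, payload=None):     # payload None = null message
        self.q.append((ts, payload))
        self.clock = ts

class MergeLP:
    """Consumes events from two channels only when provably safe."""
    def __init__(self, a, b):
        self.a, self.b, self.out = a, b, []
    def step(self):
        # Safe bound: minimum over channels of the head event time,
        # or the channel clock if that channel's queue is empty.
        def bound(ch):
            return ch.q[0][0] if ch.q else ch.clock
        safe = min(bound(self.a), bound(self.b))
        for ch in (self.a, self.b):
            while ch.q and ch.q[0][0] <= safe:
                ts, payload = ch.q.popleft()
                if payload is not None:   # null messages are discarded
                    self.out.append((ts, payload))

a, b = Channel(), Channel()
lp = MergeLP(a, b)
a.send(2.0, "x")
lp.step()
print(lp.out)        # → [] : t=2.0 is not yet safe; channel b might
                     #        still deliver an earlier event
b.send(3.0, None)    # null message: b promises nothing before t=3.0
lp.step()
print(lp.out)        # → [(2.0, 'x')] : now provably safe
```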

95

Structural patterns or discrete events? A link between pattern recognition and discrete-event systems  

Microsoft Academic Search

There is an interesting analogy between recognition of noisy, distorted, or incomplete structural patterns using the weighted distance based on symbol-to-symbol operations and modelling of actual discrete-event systems, where different types of uncertainty can occur. A method of analysis and modelling of a complex system's behaviour represented by a sufficiently long time series of formal symbols is considered. The ideas

J. Pik

1992-01-01

96

Approximate Time-Parallel Cache Simulation  

Microsoft Academic Search

In time-parallel simulation, the simulation time axis is decomposed into a number of slices which are assigned to parallel processes for concurrent simulation. Although a promising parallelization technique, it is difficult to apply. Recently, using approximation with time-parallel simulation has been proposed to extend the class of suitable models and to improve the performance of existing

Tobias Kiesling

2004-01-01
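The time-axis decomposition described above can be made concrete with a cache example, the paper's own domain. In this toy sketch (not the paper's approximate scheme), a memory reference trace is cut into slices that are first simulated independently from a guessed (empty) LRU state, then a fix-up pass re-runs each slice from its predecessor's true final state. Real implementations stop fixing up a slice as soon as its recomputed state matches the guess; here the fix-up is done in full for clarity.

```python
# Time-parallel simulation of an LRU cache: independent slices plus
# a sequential fix-up pass that recovers the exact sequential result.
from collections import OrderedDict

def lru_sim(trace, size, state=None):
    """Simulate an LRU cache over a trace; return (hits, final state)."""
    cache = OrderedDict(state or {})
    hits = 0
    for addr in trace:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark most recently used
        else:
            if len(cache) >= size:
                cache.popitem(last=False)  # evict least recently used
            cache[addr] = True
    return hits, cache

def time_parallel(trace, size, slices):
    n = len(trace) // slices
    parts = [trace[i * n:(i + 1) * n] for i in range(slices)]
    # Phase 1 ("parallel"): every slice starts from a guessed empty state.
    guesses = [lru_sim(p, size) for p in parts]
    # Phase 2 (fix-up): re-run each slice from its predecessor's true state.
    hits, state = guesses[0]
    for part in parts[1:]:
        h, state = lru_sim(part, size, state)
        hits += h
    return hits

trace = [1, 2, 3, 1, 2, 4, 1, 5, 2, 1, 3, 4] * 4
seq_hits, _ = lru_sim(trace, 3)
print(time_parallel(trace, 3, 4) == seq_hits)  # → True: fix-up is exact
```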

97

Reservoir Thermal Recovery Simulation on Parallel Computers  

Microsoft Academic Search

The rapid development of parallel computers has provided a hardware background for massive, refined reservoir simulation. However, the lack of parallel reservoir simulation software has blocked the application of parallel computers to reservoir simulation. Although a variety of parallel methods have been studied and applied to black oil, compositional, and chemical model numerical simulations, there has been limited parallel software

Baoyan Li; Yuanle Ma

2000-01-01

98

Parallel multi-delay simulation  

Microsoft Academic Search

The Multi-Delay Parallel (MDP) algorithm is an unconventional multi-delay algorithm in that it uses no timing wheel, or any event-sorting mechanism of any kind. Instead, wide bit-fields containing net values for several different times are used to resolve out-of-order events, and bit-parallel operations are performed to simulate the required gates. The MDP algorithm was designed to be implemented in

Yun Sik Lee; Peter M. Maurer

1993-01-01
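The bit-field idea above packs a net's value at several consecutive time steps into one machine word, so a gate can be evaluated for all those times with a single bitwise operation. A hypothetical illustration using Python integers as bit vectors (bit i = net value at time i); the exact encoding in MDP differs.

```python
# Bit-parallel gate evaluation: one bitwise op computes a gate's output
# waveform over WIDTH time steps at once.
WIDTH = 8

def delay(bits, d):
    """A d-step transport delay: the value at time i comes from time i-d."""
    return (bits << d) & ((1 << WIDTH) - 1)

# Input waveforms over 8 time steps (least significant bit = time 0).
a = 0b00001111   # a is high for times 0-3
b = 0b00111100   # b is high for times 2-5

# An AND gate with unit delay on each input: the entire 8-step output
# waveform is produced by a single bitwise AND.
out = delay(a, 1) & delay(b, 1)
print(format(out, "08b"))  # → 00011000 (high where both inputs were high,
                           #   shifted one step later by the gate delay)
```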

99

An assessment of the ModSim/TWOS parallel simulation environment  

SciTech Connect

The Time Warp Operating System (TWOS) has been the focus of significant research in parallel, discrete-event simulation (PDES). A new language, ModSim, has been developed for use in conjunction with TWOS. The coupling of ModSim and TWOS is an attempt to address the development of large-scale, complex, discrete-event simulation models for parallel execution. The approach, simply stated, is to provide a high-level simulation language that embodies well-known software engineering principles combined with a high-performance parallel execution environment. The inherent difficulty with this approach is the mapping of the simulation application to the parallel run-time environment. To use TWOS, Time Warp applications are currently developed in C and must be tailored according to a set of constraints and conventions. C/TWOS applications are carefully developed using explicit calls to the Time Warp primitives; thus, the mapping of application to parallel run-time environment is done by the application developer. The disadvantage of this approach is its questionable scalability to larger software efforts; the obvious advantage is the degree of control over managing the efficient execution of the application. The ModSim/TWOS system provides an automatic mapping from a ModSim application to an equivalent C/TWOS application. The major flaw with the ModSim/TWOS system as it currently exists is that there is no compiler support for mapping a ModSim application into an efficient C/TWOS application. Moreover, the ModSim language as currently defined does not provide explicit hooks into the Time Warp Operating System, and hence the developer is unable to tailor a ModSim application in the same fashion that a C application can be tailored. Without sufficient compiler support, there is a mismatch between ModSim's object-oriented, process-based execution model and the Time Warp execution model.

Rich, D.O.; Michelsen, R.E.

1991-01-01

100

Xyce parallel electronic simulator design.  

SciTech Connect

This document is the Xyce Circuit Simulator developer guide. Xyce has been designed from the 'ground up' to be a SPICE-compatible, distributed-memory parallel circuit simulator. While it is in many respects a research code, Xyce is intended to be a production simulator. As such, having software quality engineering (SQE) procedures in place to ensure a high level of code quality and robustness is essential. Version control, issue tracking, customer support, C++ style guidelines, and the Xyce release process are all described. The Xyce Parallel Electronic Simulator has been under development at Sandia since 1999. Historically, Xyce has mostly been funded by ASC, and the original focus of Xyce development has primarily been related to circuits for nuclear weapons. However, this has not been the only focus, and it is expected that the project will diversify. Like many ASC projects, Xyce is a group development effort, which involves a number of researchers, engineers, scientists, mathematicians, and computer scientists. In addition to diversity of background, it is to be expected on long-term projects for there to be a certain amount of staff turnover, as people move on to different projects. As a result, it is very important that the project maintain high software quality standards. The point of this document is to formally document a number of the software quality practices followed by the Xyce team in one place. Also, it is hoped that this document will be a good source of information for new developers.

Thornquist, Heidi K.; Rankin, Eric Lamont; Mei, Ting; Schiek, Richard Louis; Keiter, Eric Richard; Russo, Thomas V.

2010-09-01

101

Parallel Network Simulations with NEURON  

PubMed Central

The NEURON simulation environment has been extended to support parallel network simulations. Each processor integrates the equations for its subnet over an interval equal to the minimum (interprocessor) presynaptic spike generation to postsynaptic spike delivery connection delay. The performance of three published network models with very different spike patterns exhibits superlinear speedup on Beowulf clusters and demonstrates that spike communication overhead is often less than the benefit of an increased fraction of the entire problem fitting into high speed cache. On the EPFL IBM Blue Gene, almost linear speedup was obtained up to 100 processors. Increasing one model from 500 to 40,000 realistic cells exhibited almost linear speedup on 2000 processors, with an integration time of 9.8 seconds and communication time of 1.3 seconds. The potential for speed-ups of several orders of magnitude makes practical the running of large network simulations that could otherwise not be explored.

Migliore, M.; Cannia, C.; Lytton, W.W; Markram, Henry; Hines, M. L.

2009-01-01

102

Parallel network simulations with NEURON.  

PubMed

The NEURON simulation environment has been extended to support parallel network simulations. Each processor integrates the equations for its subnet over an interval equal to the minimum (interprocessor) presynaptic spike generation to postsynaptic spike delivery connection delay. The performance of three published network models with very different spike patterns exhibits superlinear speedup on Beowulf clusters and demonstrates that spike communication overhead is often less than the benefit of an increased fraction of the entire problem fitting into high speed cache. On the EPFL IBM Blue Gene, almost linear speedup was obtained up to 100 processors. Increasing one model from 500 to 40,000 realistic cells exhibited almost linear speedup on 2,000 processors, with an integration time of 9.8 seconds and communication time of 1.3 seconds. The potential for speed-ups of several orders of magnitude makes practical the running of large network simulations that could otherwise not be explored. PMID:16732488

Migliore, M; Cannia, C; Lytton, W W; Markram, Henry; Hines, M L

2006-05-26
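The NEURON synchronization scheme described in the two records above is simple to state: each process integrates its subnet independently for an interval equal to the minimum interprocessor connection delay, because no spike generated in that window can arrive before the window ends; processes then exchange spikes at a barrier. The sketch below mimics that loop sequentially with trivial "integrate" stand-ins; all names are invented, and the real implementation exchanges spikes via MPI.

```python
# Minimum-delay bulk-synchronous loop: integrate independently for
# MIN_DELAY, then exchange spikes generated in the window.
MIN_DELAY = 1.0   # smallest spike transmission delay between subnets

class Subnet:
    def __init__(self, name, fire_times):
        self.name = name
        self.fire_times = sorted(fire_times)
        self.received = []           # (arrival_time, source) spikes
    def integrate(self, t0, t1):
        """Advance the subnet from t0 to t1; return spikes generated."""
        return [(t, self.name) for t in self.fire_times if t0 <= t < t1]

def run(subnets, horizon):
    t = 0.0
    while t < horizon:
        # Every process may integrate [t, t + MIN_DELAY) independently:
        # no spike sent in this window can arrive before t + MIN_DELAY.
        window = [s.integrate(t, t + MIN_DELAY) for s in subnets]
        # Barrier + spike exchange at the end of the interval.
        for spikes in window:
            for ts, src in spikes:
                for s in subnets:
                    if s.name != src:
                        s.received.append((ts + MIN_DELAY, src))
        t += MIN_DELAY

a, b = Subnet("a", [0.25, 1.75]), Subnet("b", [0.75])
run([a, b], 3.0)
print(b.received)   # → [(1.25, 'a'), (2.75, 'a')]
print(a.received)   # → [(1.75, 'b')]
```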

103

Parallel simulation of the Sharks World problem  

Microsoft Academic Search

The Sharks World problem has been suggested as a suitable application to evaluate the effectiveness of parallel simulation algorithms. This paper develops a simulation model in Maisie, a C-based simulation language. With minor modifications, a Maisie program may be executed using either sequential or parallel simulation algorithms. The paper presents the results of executing the Maisie model on a multicomputer

Rajive L. Bagrodia; Wen-Toh Liao

1990-01-01

104

State Estimation and Detectability of Probabilistic Discrete Event Systems  

PubMed Central

A probabilistic discrete event system (PDES) is a nondeterministic discrete event system where the probabilities of nondeterministic transitions are specified. State estimation problems of PDES are more difficult than those of non-probabilistic discrete event systems. In our previous papers, we investigated state estimation problems for non-probabilistic discrete event systems. We defined four types of detectabilities and derived necessary and sufficient conditions for checking these detectabilities. In this paper, we extend our study to state estimation problems for PDES by considering the probabilities. The first step in our approach is to convert a given PDES into a nondeterministic discrete event system and find sufficient conditions for checking probabilistic detectabilities. Next, to find necessary and sufficient conditions for checking probabilistic detectabilities, we investigate the “convergence” of event sequences in PDES. An event sequence is convergent if along this sequence, it is more and more certain that the system is in a particular state. We derive conditions for convergence and hence for detectabilities. We focus on systems with complete event observation and no state observation. For better presentation, the theoretical development is illustrated by a simplified example of nephritis diagnosis.

Shu, Shaolong; Ying, Hao; Chen, Xinguang

2009-01-01
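The first step of the approach above, converting the PDES into a nondeterministic discrete event system and estimating its state from observed events, can be sketched with a small current-state observer: the estimate is the set of states consistent with the observed event string, and the system is detectable along a sequence if the estimate eventually stays a singleton. The transition structure below is an invented toy, not the paper's nephritis example, and probabilities are dropped.

```python
# Current-state observer for a nondeterministic discrete event system:
# propagate the set of possible states through each observed event.
def observe(transitions, initial, events):
    """transitions: dict (state, event) -> set of successor states."""
    est = set(initial)
    history = [est]
    for e in events:
        est = set().union(*(transitions.get((s, e), set()) for s in est))
        history.append(est)
    return history

# Toy model: event 'a' is ambiguous from state 1; event 'b' resolves it.
T = {(1, "a"): {1, 2}, (2, "a"): {2}, (2, "b"): {3}}
estimates = observe(T, {1}, ["a", "a", "b"])
print(estimates)  # → [{1}, {1, 2}, {1, 2}, {3}] — singleton after 'b'
```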

105

Exploiting Simulation Slack to Improve Parallel Simulation Speed  

Microsoft Academic Search

Parallel simulation is a technique to accelerate microarchitecture simulation of CMPs by exploiting the inherent parallelism of CMPs. In this paper, we explore the simulation paradigm of simulating each core of a target CMP in one thread and then spreading the threads across the hardware thread contexts of a host CMP. We start with cycle-by-cycle simulation and then relax the

Jianwei Chen; Murali Annavaram; Michel Dubois

2009-01-01

106

Perturbation analysis and augmented markov chains for discrete event systems  

Microsoft Academic Search

Given the complexity of stochastic Discrete Event Systems (DES), sample path analysis provides an approach through which performance sensitivity estimates can be obtained. In this paper, we review the development of this approach, and present the framework for one new direction known as augmented system analysis. The main issue we address is that of predicting a perturbed DES sample path

Christos G. Cassandras; Stephen G. Strickland

107

Parallel execution and scriptability in micromagnetic simulations  

NASA Astrophysics Data System (ADS)

We demonstrate the feasibility of an "encapsulated parallelism" approach toward micromagnetic simulations that combines offering a high degree of flexibility to the user with the efficient utilization of parallel computing resources. While parallelization is obviously desirable to address the high numerical effort required for realistic micromagnetic simulations through utilizing now widely available multiprocessor systems (including desktop multicore CPUs and computing clusters), conventional approaches toward parallelization impose strong restrictions on the structure of programs: numerical operations have to be executed across all processors in a synchronized fashion. This means that from the user's perspective, either the structure of the entire simulation is rigidly defined from the beginning and cannot be adjusted easily, or making modifications to the computation sequence requires advanced knowledge in parallel programming. We explain how this dilemma is resolved in the NMAG simulation package in such a way that the user can utilize, without any additional effort on his side, both the computational power of multiple CPUs and the flexibility to tailor execution sequences for specific problems: simulation scripts written for single-processor machines can just as well be executed on parallel machines and behave in precisely the same way, up to increased speed. We provide a simple instructive magnetic resonance simulation example that demonstrates utilizing both custom execution sequences and parallelism at the same time. Furthermore, we show that this strategy of encapsulating parallelism even allows one to benefit from speed gains through parallel execution in simulations controlled by interactive commands given at a command line interface.

Fischbacher, Thomas; Franchin, Matteo; Bordignon, Giuliano; Knittel, Andreas; Fangohr, Hans

2009-04-01

108

Logical models of discrete event systems: A comparative exposition  

Microsoft Academic Search

The increasing complexity of man-made systems calls for new tools and techniques to model them efficiently and at the desired level of abstraction. Well-established modelling paradigms, such as finite state machines, Petri nets, communicating sequential processes etc., which are borrowed from the fields of computer science and operations research, often lack certain essential features for capturing discrete event dynamics. New

Amit Patra; Siddhartha Mukhopadhyay; Supratik Bose

1996-01-01

109

Acoustic simulation in architecture with parallel algorithm  

Microsoft Academic Search

Given the complexity of architectural environments and the need for real-time simulation of architectural acoustics, a parallel radiosity algorithm was developed. The distribution of sound energy in the scene is solved with this method. The impulse responses between sources and receivers in each frequency segment, calculated with multiple processes, are then combined into the whole frequency response. The numerical experiment shows that parallel

Xiaohong Li; Xinrong Zhang; Dan Li

2004-01-01

110

A new proposal to provide estimation of QoS and QoE over WiMAX networks: An approach based on computational intelligence and discrete-event simulation  

Microsoft Academic Search

This paper presents an estimation of Quality of Experience (QoE) metrics based on Quality of Service (QoS) metrics in WiMAX networks. Applications used to generate such estimations were EvalVid and Network Simulator 2 (NS-2). The QoE was estimated by employing a Multilayer Artificial Neural Network by means of the WEKA tool. The results show a very efficient estimation of metrics

Victor A. Machado; Carlos N. Silva; Rosinei S. Oliveira; Alexandre M. Melo; Marcelino Silva; Carlos R. L. Frances; Joao C. W. A. Costa; Nandamudi L. Vijaykumar; Celso M. Hirata

2011-01-01

111

Acoustic simulation in architecture with parallel algorithm  

NASA Astrophysics Data System (ADS)

Given the complexity of architectural environments and the need for real-time simulation of architectural acoustics, a parallel radiosity algorithm was developed. The distribution of sound energy in the scene is solved with this method. The impulse responses between sources and receivers in each frequency segment, calculated with multiple processes, are then combined into the whole frequency response. The numerical experiment shows that the parallel algorithm can improve the acoustic simulation efficiency of complex scenes.

Li, Xiaohong; Zhang, Xinrong; Li, Dan

2004-03-01

112

Xyce parallel electronic simulator : users' guide.  

SciTech Connect

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers; (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. 
As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.

Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Warrender, Christina E.; Keiter, Eric Richard; Pawlowski, Roger Patrick

2011-05-01

113

Optimizing the Scalability of Parallelized GATE Simulations  

Microsoft Academic Search

GATE is a GEANT4 application toolkit for accurate simulation of positron emission tomography (PET) and single photon emission computed tomography (SPECT) systems. As Monte Carlo simulations are CPU-intensive, simulations often take up to several days to complete with state-of-the-art single-CPU computers. However, Monte Carlo simulations are also excellently suited for parallelization, theoretically showing a linear speed-up as a function of

Jan De Beenhouwer; Steven G. Staelens; Y. D'Asseler; I. Lemahieu

2006-01-01

114

Verifying Ptolemy II Discrete-Event Models Using Real-Time Maude  

NASA Astrophysics Data System (ADS)

This paper shows how Ptolemy II discrete-event (DE) models can be formally analyzed using Real-Time Maude. We formalize in Real-Time Maude the semantics of a subset of hierarchical Ptolemy II DE models, and explain how the code generation infrastructure of Ptolemy II has been used to automatically synthesize a Real-Time Maude verification model from a Ptolemy II design model. This enables a model-engineering process that combines the convenience of Ptolemy II DE modeling and simulation with formal verification in Real-Time Maude.

Bae, Kyungmin; Ölveczky, Peter Csaba; Feng, Thomas Huining; Tripakis, Stavros

115

Bias in parallel and distributed simulation systems  

Microsoft Academic Search

Even after several decades of research, modeling is considered an art, with a high liability to produce incorrect abstractions of real world systems. Therefore, validation and verification of simulation models is considered an indispensable method to establish the credibility of developed models. In the process of parallelizing or distributing a given credible simulation model, a bias is introduced, possibly leading

Tobias Kiesling; R. E. A. Khayari; J. Luthi

2005-01-01

116

Bias in parallel and distributed simulation systems  

Microsoft Academic Search

Even after several decades of research, modeling is considered an art, with a high liability to produce incorrect abstractions of real world systems. Therefore, validation and verification of simulation models is considered an indispensable method to establish the credibility of developed models. In the process of parallelizing or distributing a given credible simulation model, a bias is introduced,

Tobias Kiesling; Johannes Lüthi; Rachid El Abdouni Khayari

2005-01-01

117

Simulating patients with Parallel Health State Networks.  

PubMed Central

The American Board of Family Practice is developing a computer-based recertification process to generate patient simulations from a knowledge base. Simulated patients require a stochastically generated history and response to treatment, suggesting a Monte Carlo-like patient generation process. Knowledge acquisition experiments revealed that description of a patient's overall health as a node in a Monte Carlo model was difficult for domain experts to use, severely limited knowledge reusability, and created a plethora of awkwardly defined health states. We explored a model in which patients traverse several parallel health state networks simultaneously, so that overall health is a vector describing the current nodes from every Parallel Network. This model has a reasonable biological basis, more easily defined data, and greatly improved reuse potential, at the cost of more complex simulation algorithms. Experiments using osteoarthritis stages, weight classification, and absence or presence of gastric ulcers as three Parallel Networks demonstrate the feasibility of this approach to simulating patients.

Sumner, W.; Truszczynski, M.; Marek, V. W.

1998-01-01
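The vector-of-nodes idea above can be sketched as a handful of independent Markov-style networks stepped in lockstep; the network names, states, and transition probabilities below are illustrative assumptions, not values from the Board's actual knowledge base.

```python
import random

# Hypothetical transition tables for three Parallel Networks (all states and
# probabilities are made up for illustration).
NETWORKS = {
    "osteoarthritis": {"stage0": [("stage0", 0.9), ("stage1", 0.1)],
                       "stage1": [("stage1", 0.85), ("stage2", 0.15)],
                       "stage2": [("stage2", 1.0)]},
    "weight":         {"normal": [("normal", 0.8), ("overweight", 0.2)],
                       "overweight": [("overweight", 0.7), ("normal", 0.2), ("obese", 0.1)],
                       "obese": [("obese", 0.9), ("overweight", 0.1)]},
    "gastric_ulcer":  {"absent": [("absent", 0.95), ("present", 0.05)],
                       "present": [("present", 0.6), ("absent", 0.4)]},
}

def step(state, rng):
    """Advance every network one step; overall health is the vector of current nodes."""
    new_state = {}
    for name, node in state.items():
        r, cum = rng.random(), 0.0
        for nxt, p in NETWORKS[name][node]:
            cum += p
            if r < cum:
                new_state[name] = nxt
                break
        else:
            new_state[name] = node  # guard against rounding in the cumulative sum
    return new_state

rng = random.Random(42)
state = {"osteoarthritis": "stage0", "weight": "normal", "gastric_ulcer": "absent"}
for _ in range(10):
    state = step(state, rng)
print(state)
```

Because each network is defined independently, adding a fourth condition means adding one more table rather than re-enumerating a combined health state, which is the reuse benefit the abstract describes.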

118

Simulating the scheduling of parallel supercomputer applications  

SciTech Connect

An Event Driven Simulator for Evaluating Multiprocessing Scheduling (EDSEMS) disciplines is presented. The simulator is made up of three components: machine model; parallel workload characterization; and scheduling disciplines for mapping parallel applications (many processes cooperating on the same computation) onto processors. A detailed description of how the simulator is constructed, how to use it and how to interpret the output is also given. Initial results are presented from the simulation of parallel supercomputer workloads using "Dog-Eat-Dog," "Family" and "Gang" scheduling disciplines. These results indicate that Gang scheduling is far better at giving the number of processors that a job requests than Dog-Eat-Dog or Family scheduling. In addition, the system throughput and turnaround time are not adversely affected by this strategy. 10 refs., 8 figs., 1 tab.

Seager, M.K.; Stichnoth, J.M.

1989-09-19
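As a rough illustration of the gang-scheduling property the abstract highlights, here is a toy event-driven sketch (not the EDSEMS simulator itself; the job data and first-come-first-served order are assumptions) in which a job starts only when all of the processors it requests are simultaneously free:

```python
import heapq

def gang_schedule(jobs, total_procs):
    """Gang scheduling sketch: each job gets exactly the processor count it
    requests, all at once, waiting if necessary for running jobs to finish."""
    free = total_procs
    clock = 0.0
    running = []           # min-heap of (finish_time, procs) for running jobs
    finish_times = {}
    for jid, procs, runtime in jobs:       # jobs served in submission order
        while free < procs:                # wait for enough processors to free up
            t, p = heapq.heappop(running)
            clock = max(clock, t)
            free += p
        free -= procs
        heapq.heappush(running, (clock + runtime, procs))
        finish_times[jid] = clock + runtime
    return finish_times

jobs = [("a", 4, 10.0), ("b", 4, 5.0), ("c", 8, 2.0)]  # (id, procs requested, runtime)
print(gang_schedule(jobs, 8))  # → {'a': 10.0, 'b': 5.0, 'c': 12.0}
```

Job "c" must wait until both earlier jobs release their processors, but once started it holds all eight at once, which is the behavior the reported results favor.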

119

Particle simulations on massively parallel machines  

SciTech Connect

A wide variety of physical phenomena can be modeled with particles. Such simulations pose interesting challenges for parallel machines since the computations are often difficult to load-balance and can require irregular communication. We discuss the size of problems that can be simulated today, obstacles to higher performance, and areas where algorithmic improvements are needed. The relevant issues are illustrated with two prototypical simulations: a Monte Carlo model of low-density fluid flow and molecular dynamics.

Plimpton, S.

1993-06-01

120

Parallel logic simulation on general purpose machines  

Microsoft Academic Search

Three parallel algorithms for logic simulation have been developed and implemented on a general purpose shared-memory parallel machine. The first algorithm is a synchronous version of a traditional event-driven algorithm which achieves speed-ups of 6 to 9 with 15 processors. The second algorithm is a synchronous unit-delay compiled mode algorithm which achieves speed-ups of 10 to 13 with 15 processors.

Larry Soulé; Tom Blank

1988-01-01

121

The Xyce Parallel Electronic Simulator - An Overview  

SciTech Connect

The Xyce{trademark} Parallel Electronic Simulator has been written to support the simulation needs of the Sandia National Laboratories electrical designers. As such, the development has focused on providing the capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). In addition, the developers are providing improved performance for numerical kernels using state-of-the-art algorithms, supporting the modeling of circuit phenomena at a variety of abstraction levels, and using object-oriented design and modern coding practices that ensure the code will be maintainable and extensible far into the future. The code is a parallel code in the most general sense of the phrase--a message passing parallel implementation--which allows it to run efficiently on the widest possible range of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Furthermore, careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved even as the number of processors grows.

HUTCHINSON,SCOTT A.; KEITER,ERIC R.; HOEKSTRA,ROBERT J.; WATTS,HERMAN A.; WATERS,ARLON J.; SCHELLS,REGINA L.; WIX,STEVEN D.

2000-12-08

122

Estimating ICU bed capacity using discrete event simulation  

Microsoft Academic Search

Purpose – The intensive care unit (ICU) in a hospital caters for critically ill patients. The number of ICU beds has a direct impact on many aspects of hospital performance. Lack of ICU beds may cause ambulance diversion and surgery cancellation, while an excess of ICU beds may cause a waste of resources. This paper aims to develop

Zhecheng Zhu; Bee Hoon Hen; Kiok Liang Teow

2012-01-01
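A minimal sketch of the kind of discrete-event model the abstract describes, assuming Poisson arrivals and exponentially distributed stays; the bed count, rates, and diversion rule below are illustrative assumptions, not the paper's calibrated data:

```python
import heapq
import random

def simulate_icu(beds, arrival_rate, mean_stay, horizon, seed=1):
    """Toy discrete-event ICU model: a patient arriving when no bed is free
    is counted as diverted (loss system, no waiting queue)."""
    rng = random.Random(seed)
    departures = []        # min-heap of times at which occupied beds free up
    t, admitted, diverted = 0.0, 0, 0
    while True:
        t += rng.expovariate(arrival_rate)          # next patient arrival
        if t >= horizon:
            break
        while departures and departures[0] <= t:    # release beds that emptied
            heapq.heappop(departures)
        if len(departures) < beds:
            admitted += 1
            heapq.heappush(departures, t + rng.expovariate(1.0 / mean_stay))
        else:
            diverted += 1
    return admitted, diverted

print(simulate_icu(beds=10, arrival_rate=2.0, mean_stay=4.0, horizon=1000))
```

Sweeping the `beds` parameter in such a model exposes the trade-off the paper studies: too few beds drives up the diversion count, too many leaves capacity idle.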

123

Xyce parallel electronic simulator release notes.  

SciTech Connect

The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. Specific requirements include, among others, the ability to solve extremely large circuit problems by supporting large-scale parallel computing platforms, improved numerical performance and object-oriented code design and implementation. The Xyce release notes describe: hardware and software requirements; new features and enhancements; any defects fixed since the last release; and current known defects and defect workarounds. For up-to-date information not available at the time these notes were produced, please visit the Xyce web page at http://www.cs.sandia.gov/xyce.

Keiter, Eric Richard; Hoekstra, Robert John; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Rankin, Eric Lamont; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Santarelli, Keith R.

2010-05-01

124

Parallel Molecular Dynamics Simulations of Biomolecular Systems  

Microsoft Academic Search

We describe a general purpose parallel molecular dynamics code for simulations of arbitrary mixtures of flexible molecules in solution. The program allows us to simulate molecular systems described by standard force fields like AMBER, GROMOS or CHARMM, containing terms for short-range interactions of the Lennard-Jones type, electrostatic interactions, covalent bonds, covalent angles and torsional angles and a few other optional terms. The state-of-the-art molecular dynamics techniques

Alexander Lyubartsev; Aatto Laaksonen

1998-01-01

125

A parallel computational model for GATE simulations.  

PubMed

GATE/Geant4 Monte Carlo simulations are computationally demanding applications, requiring thousands of processor hours to produce realistic results. The classical strategy of distributing the simulation of individual events does not apply efficiently for Positron Emission Tomography (PET) experiments, because it requires centralized coincidence processing and incurs large communication overheads. We propose a parallel computational model for GATE that handles event generation and coincidence processing in a simple and efficient way by decentralizing event generation and processing but maintaining a centralized event and time coordinator. The model is implemented with the inclusion of a new set of factory classes that can run the same executable in sequential or parallel mode. A Mann-Whitney test shows that the output produced by this parallel model in terms of number of tallies is equivalent (but not equal) to its sequential counterpart. Computational performance evaluation shows that the software is scalable and well balanced. PMID: 24070545

Rannou, F R; Vega-Acevedo, N; El Bitar, Z

2013-08-19

126

Graphite: A distributed parallel simulator for multicores  

Microsoft Academic Search

This paper introduces the Graphite open-source distributed parallel multicore simulator infrastructure. Graphite is designed from the ground up for exploration of future multicore processors containing dozens, hundreds, or even thousands of cores. It provides high performance for fast design space exploration and software development. Several techniques are used to achieve this including: direct execution, seamless multicore and multi-machine

Jason E. Miller; Harshad Kasture; George Kurian; Charles Gruenwald III; Nathan Beckmann; Christopher Celio; Jonathan Eastep; Anant Agarwal

2010-01-01

127

Parallel adaptive simulations on unstructured meshes  

Microsoft Academic Search

This paper discusses methods being developed by the ITAPS center to support the execution of parallel adaptive simulations on unstructured meshes. The paper first outlines the ITAPS approach to the development of interoperable mesh, geometry and field services to support the needs of SciDAC applications in these areas. The paper then demonstrates the ability of unstructured adaptive meshing methods built

M S Shephard; K E Jansen; O Sahni; L A Diachin

2007-01-01

128

Parallel algorithm strategies for circuit simulation.  

SciTech Connect

Circuit simulation tools (e.g., SPICE) have become invaluable in the development and design of electronic circuits. However, they have been pushed to their performance limits in addressing circuit design challenges that come from the technology drivers of smaller feature scales and higher integration. Improving the performance of circuit simulation tools through exploiting new opportunities in widely-available multi-processor architectures is a logical next step. Unfortunately, not all traditional simulation applications are inherently parallel, and quickly adapting mature application codes (even codes designed as parallel applications) to new parallel paradigms can be prohibitively difficult. In general, performance is influenced by many choices: hardware platform, runtime environment, languages and compilers used, algorithm choice and implementation, and more. In this complicated environment, the use of mini-applications (small, self-contained proxies for real applications) is an excellent approach for rapidly exploring the parameter space of all these choices. In this report we present a multi-core performance study of Xyce, a transistor-level circuit simulation tool, and describe the future development of a mini-application for circuit simulation.

Thornquist, Heidi K.; Schiek, Richard Louis; Keiter, Eric Richard

2010-01-01

129

Parallel simulation of digital LSI circuits  

NASA Astrophysics Data System (ADS)

Integrated circuit technology has been advancing at a phenomenal rate over the last several years, and promises to continue to do so. If circuit design is to keep pace with fabrication technology, radically new approaches to computer-aided design will be necessary. One appealing approach is general purpose parallel processing. This thesis explores the issues involved in developing a framework for circuit simulation which exploits the locality exhibited by circuit operation to achieve a high degree of parallelism. This framework maps the topology of the circuit onto the multiprocessor, assigning the simulation of individual partitions to separate processors. A new form of synchronization is developed, based upon a history maintenance and roll back strategy. The circuit simulator PRSIM was designed and implemented to determine the efficacy of this approach. The results of several preliminary experiments are reported, along with an analysis of the behavior of PRSIM.

Arnold, J. M.

1985-02-01
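The history-maintenance-and-rollback idea can be sketched in a few lines. This is a generic Time-Warp-style illustration under assumed semantics (a state that simply accumulates event deltas), not PRSIM's actual implementation:

```python
class LogicalProcess:
    """Sketch of optimistic synchronization with saved history and rollback."""

    def __init__(self, state=0):
        self.state = state
        self.lvt = 0                     # local virtual time
        self.snapshots = [(0, state)]    # saved (time, state) pairs for rollback
        self.seen = []                   # every event received: (time, delta)

    def receive(self, t, delta):
        self.seen.append((t, delta))
        if t < self.lvt:                 # straggler: undo optimistic progress
            while self.snapshots[-1][0] >= t:
                self.snapshots.pop()
            self.lvt, self.state = self.snapshots[-1]
        # (re)apply, in timestamp order, every event later than the current LVT
        for et, d in sorted(e for e in self.seen if e[0] > self.lvt):
            self.state += d
            self.lvt = et
            self.snapshots.append((et, self.state))

lp = LogicalProcess()
lp.receive(5, +1)
lp.receive(10, +2)
lp.receive(7, +10)       # arrives "late": triggers rollback to t=5 and replay
print(lp.state, lp.lvt)  # → 13 10
```

The straggler at t=7 forces the process back to its t=5 snapshot, after which both the late event and the previously processed t=10 event are re-executed in timestamp order, so the final state is the same as a strictly sequential run.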

130

Xyce parallel electronic simulator : reference guide.  

SciTech Connect

This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list, as exhaustively as possible, device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide. The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. It is targeted specifically to run on large-scale parallel computing platforms but also runs well on a variety of architectures including single processor workstations. It also aims to support a variety of devices and models specific to Sandia needs. This document is intended to complement the Xyce Users Guide. It contains comprehensive, detailed information about a number of topics pertinent to the usage of Xyce. Included in this document is a netlist reference for the input-file commands and elements supported within Xyce; a command line reference, which describes the available command line arguments for Xyce; and quick-references for users of other circuit codes, such as Orcad's PSpice and Sandia's ChileSPICE.

Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Warrender, Christina E.; Keiter, Eric Richard; Pawlowski, Roger Patrick

2011-05-01

131

Communication Requirements in Parallel Crashworthiness Simulation  

Microsoft Academic Search

This paper deals with the design and implementation of communications strategies for the migration to distributed-memory, MIMD machines of an industrial crashworthiness simulation program, PAM-CRASH, using message-passing. A summary of the algorithmic features and parallelization approach is followed by a discussion of options to minimize overheads introduced by the need for global communication. Implementation issues will be specific to the

Guy Lonsdale; Jan Clinckemaillie; Stefanos Vlachoutsis; J. Dubois

1994-01-01

132

Parallelism extraction and program restructuring for parallel simulation of digital systems  

SciTech Connect

Two topics currently of interest to the computer-aided design (CAD) for very-large-scale integrated circuits (VLSI) community are using the VHSIC Hardware Description Language (VHDL) effectively and decreasing simulation times of VLSI designs through parallel execution of the simulator. The goal of this research is to increase the degree of parallelism obtainable in VHDL simulation, and consequently to decrease simulation times. The research targets simulation on massively parallel architectures. Experimentation and instrumentation were done on the SIMD Connection Machine. The author discusses her method used to extract parallelism and restructure a VHDL program, experimental results using this method, and requirements for a parallel architecture for fast simulation.

Vellandi, B.L.

1990-01-01

133

Decision Making in Fuzzy Discrete Event Systems  

PubMed Central

The primary goal of the study presented in this paper is to develop a novel and comprehensive approach to decision making using fuzzy discrete event systems (FDES) and to apply such an approach to real-world problems. At the theoretical front, we develop a new control architecture of FDES as a way of decision making, which includes a FDES decision model, a fuzzy objective generator for generating optimal control objectives, and a control scheme using both disablement and enforcement. We develop an online approach to dealing with the optimal control problem efficiently. As an application, we apply the approach to HIV/AIDS treatment planning, a technical challenge since AIDS is one of the most complex diseases to treat. We build a FDES decision model for HIV/AIDS treatment based on experts' knowledge, treatment guidelines, clinical trials, patient database statistics, and other available information. Our preliminary retrospective evaluation shows that the approach is capable of generating optimal control objectives for real patients in our AIDS clinic database and is able to apply our online approach to deciding an optimal treatment regimen for each patient. In the process, we have developed methods to resolve the following two new theoretical issues that have not been addressed in the literature: (1) the optimal control problem has a state-dependent performance index and hence is not monotonic, (2) the state space of a FDES is infinite.

Lin, F.; Ying, H.; MacArthur, R. D.; Cohn, J.A.; Barth-Jones, D.; Crane, L.R.

2009-01-01

134

Parallel Strategies for Crash and Impact Simulations  

SciTech Connect

We describe a general strategy we have found effective for parallelizing solid mechanics simulations. Such simulations often have several computationally intensive parts, including finite element integration, detection of material contacts, and particle interaction if smoothed particle hydrodynamics is used to model highly deforming materials. The need to balance all of these computations simultaneously is a difficult challenge that has kept many commercial and government codes from being used effectively on parallel supercomputers with hundreds or thousands of processors. Our strategy is to load-balance each of the significant computations independently with whatever balancing technique is most appropriate. The chief benefit is that each computation can be scalably parallelized. The drawback is the data exchange between processors and extra coding that must be written to maintain multiple decompositions in a single code. We discuss these trade-offs and give performance results showing this strategy has led to a parallel implementation of a widely-used solid mechanics code that can now be run efficiently on thousands of processors of the Pentium-based Sandia/Intel TFLOPS machine. We illustrate with several examples the kinds of high-resolution, million-element models that can now be simulated routinely. We also look to the future and discuss what possibilities this new capability promises, as well as the new set of challenges it poses in material models, computational techniques, and computing infrastructure.

Attaway, S.; Brown, K.; Hendrickson, B.; Plimpton, S.

1998-12-07

135

Ladder diagram and Petri-net-based discrete-event control design methods  

Microsoft Academic Search

Ladder diagrams (LDs) for a programmable logic controller are a dominant method in discrete event control of industrial automated systems. Yet, the ever-increasing functionality and complexity of these systems have challenged the use of LDs to design their discrete-event controllers. Researchers are constantly pursuing integrated tools that eliminate the limitations of LDs. These tools are aimed not only for control

Shih Sen Peng; Meng Chu Zhou

2004-01-01

136

Discrete event theory for the monitoring and control of robotic systems  

Microsoft Academic Search

Discrete event systems are presented as a powerful framework for a large number of robot control tasks. Their advantage lies in the ability to abstract a complex problem to the essential elements needed for task level control. Discrete event control has proven to be successful in numerous robotic applications, including assembly, on-line training of robots, mobile navigation, control of perception

Brenan J. McCarragher

137

Integrating discrete events and continuous head movements for video-based interaction techniques  

Microsoft Academic Search

Human head gestures can potentially trigger different commands from the list of available options in graphical user interfaces or in virtual and smart environments. However, continuous tracking techniques are limited in generating discrete events which could be used to execute a predefined set of commands. In this article, we discuss a possibility to encode a set of discrete events by

Tatiana V. Evreinova; Grigori Evreinov; Roope Raisamo

2009-01-01

138

Improving the Teaching of Discrete-Event Control Systems Using a LEGO Manufacturing Prototype  

ERIC Educational Resources Information Center

This paper discusses the usefulness of employing LEGO as a teaching-learning aid in a post-graduate-level first course on the control of discrete-event systems (DESs). The final assignment of the course is presented, which asks students to design and implement a modular hierarchical discrete-event supervisor for the coordination layer of a…

Sanchez, A.; Bucio, J.

2012-01-01

139

Application of discrete event systems theory for modeling and analysis of a power transmission network  

Microsoft Academic Search

Apart from the continuous time phenomena, various discrete events occur in an electric power system. Previously, power systems were considerably smaller in size and more centralized in operation than the large, decentralized power systems of the present day. Each discrete event occurring in the power system was manually acknowledged and necessary changes were made in the system operation.

Tamal Biswas; Asad Davari; Ali Feliachi

2004-01-01

140

On Building a Kohonen Neural Net Parallel Simulator  

Microsoft Academic Search

This paper presents a Kohonen neural net parallel simulator. The simulator was developed on a Sequent Balance 8000 computer system. Comparative results emphasize the impact of the different strategies of parallelization, the number of processors involved and inter-processor communications over the efficiency of the parallel implementation. The simulator was used in a pattern recognition application, in the automatic synthesis of

C. V. Buhusi; David J. Evans

1994-01-01

141

Optimal Parametric Discrete Event Control: Problem and Solution  

SciTech Connect

We present a novel optimization problem for discrete event control, similar in spirit to the optimal parametric control problem common in statistical process control. In our problem, we assume a known finite state machine plant model $G$ defined over an event alphabet $\Sigma$ so that the plant model language $L = \LanM(G)$ is prefix closed. We further assume the existence of a \textit{base control structure} $M_K$, which may be either a finite state machine or a deterministic pushdown machine. If $K = \LanM(M_K)$, we assume $K$ is prefix closed and that $K \subseteq L$. We associate each controllable transition of $M_K$ with a binary variable $X_1,\dots,X_n$ indicating whether the transition is enabled or not. This leads to a function $M_K(X_1,\dots,X_n)$, that returns a new control specification depending upon the values of $X_1,\dots,X_n$. We exhibit a branch-and-bound algorithm to solve the optimization problem $\min_{X_1,\dots,X_n}\max_{w \in K} C(w)$ such that $M_K(X_1,\dots,X_n) \models \Pi$ and $\LanM(M_K(X_1,\dots,X_n)) \in \Con(L)$. Here $\Pi$ is a set of logical assertions on the structure of $M_K(X_1,\dots,X_n)$, and $M_K(X_1,\dots,X_n) \models \Pi$ indicates that $M_K(X_1,\dots,X_n)$ satisfies the logical assertions; and, $\Con(L)$ is the set of controllable sublanguages of $L$.

Griffin, Christopher H [ORNL

2008-01-01
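In the abstract's notation, the search is over the binary vector (X_1, ..., X_n). A generic branch-and-bound skeleton for such a min-max problem might look like the following, where `worst_cost`, `feasible`, and `lower_bound` are stand-ins for max_{w in K} C(w), the checks against Pi and Con(L), and a user-supplied bound; everything here is an illustrative sketch, not the paper's algorithm:

```python
def branch_and_bound(n, worst_cost, feasible, lower_bound):
    """Depth-first branch-and-bound over binary enable variables X_1..X_n.
    lower_bound(prefix) must underestimate worst_cost for every completion
    of the partial assignment (all names are illustrative)."""
    best_cost, best_x = float("inf"), None
    stack = [[]]                       # partial assignments awaiting expansion
    while stack:
        prefix = stack.pop()
        if lower_bound(prefix) >= best_cost:
            continue                   # prune: cannot beat the incumbent
        if len(prefix) == n:           # complete assignment: candidate solution
            if feasible(prefix):
                c = worst_cost(prefix)
                if c < best_cost:
                    best_cost, best_x = c, tuple(prefix)
            continue
        stack.append(prefix + [0])     # branch on the next variable
        stack.append(prefix + [1])
    return best_cost, best_x

# Toy instance: the cost counts enabled transitions (plus a penalty when none
# are enabled) and feasibility demands at least one enabled transition.
cost = lambda x: sum(x) + (5 if not any(x) else 0)
print(branch_and_bound(3, cost, lambda x: any(x), lambda p: sum(p)))  # → (1, (1, 0, 0))
```

The pruning step is what makes the method practical: whole subtrees of assignments are discarded as soon as their bound cannot improve on the best feasible specification found so far.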

142

Experimental Analysis of Logical Process Simulation Algorithms in JAMES II  

Microsoft Academic Search

The notion of logical processes is a widely used modeling paradigm in parallel and distributed discrete-event simulation. Yet, the comparison among different simulation algorithms for LP models still remains difficult. Most simulation systems only provide a small subset of available algorithms, which are usually selected and tuned towards specific applications. Furthermore, many modeling and simulation frameworks blur the boundary between

Bing Wang; Jan Himmelspach; Roland Ewald; Adelinde M. Uhrmacher

2009-01-01

143

PARALLELIZATION OF THE PENELOPE MONTE CARLO PARTICLE TRANSPORT SIMULATION PACKAGE  

Microsoft Academic Search

We have parallelized the PENELOPE Monte Carlo particle transport simulation package (1). The motivation is to increase efficiency of Monte Carlo simulations for medical applications. Our parallelization is based on the standard MPI message passing interface. The parallel code is especially suitable for a distributed memory environment, and has been run on up to 256 processors on the Indiana University

R. B. Cruise; R. W. Sheppard; V. P. Moskvin

2003-01-01

144

MAPPING LARGE PARALLEL SIMULATION PROGRAMS TO MULTICOMPUTER SYSTEMS  

Microsoft Academic Search

We consider the problem of mapping parallel simulation programs to distributed memory parallel machines. Since a large fraction of computer simulations consists of solving partial differential equations, the communication patterns of the resulting parallel programs can be exploited to construct efficient mappings which lead to low communication overhead. We report on the application of Kohonen networks to find such mappings. Most

Hans-Ulrich Heiss; Marcus Dormanns

1994-01-01

145

Fault Diagnosis in Discrete-Event Systems: Incomplete Models and Learning  

Microsoft Academic Search

Most model-based approaches to fault diagnosis of discrete-event systems (DESs) require a complete and accurate model of the system to be diagnosed. However, the discrete-event model may have arisen from abstraction and simplification of a continuous time system or through model building from input-output data. As such, it may not capture the dynamic behavior of the system completely. In this

Raymond H. Kwong; David L. Yonge-Mallo

2011-01-01

146

Parallel Cycle Based Logic Simulation Using Graphics Processing Units  

Microsoft Academic Search

Graphics Processing Units (GPUs) are gaining popularity for parallelization of general purpose applications. GPUs are massively parallel processors with huge performance in a small and readily available package. At the same time, the emergence of general purpose programming environments for GPUs such as CUDA shortens the learning curve of GPU programming. We present a GPU-based parallelization of a logic simulation algorithm

Alper Sen; Baris Aksanli; Murat Bozkurt; Melih Mert

2010-01-01

147

Simulation of Sheared Suspensions With a Parallel Implementation of QDPD  

Microsoft Academic Search

A parallel quaternion-based dissipative particle dynamics (QDPD) program has been developed in Fortran to study the flow properties of complex fluids subject to shear. The parallelization allows for simulations of greater size and complexity and is accomplished with a parallel link-cell spatial (domain) decomposition using MPI. The technique has novel features arising from the DPD formalism, the use of

James S. Sims

2002-01-01

148

Parallel magnetic field perturbations in gyrokinetic simulations  

SciTech Connect

At low beta it is common to neglect parallel magnetic field perturbations on the basis that they are of order beta^2. This is only true if effects of order beta are canceled by a term in the grad-B drift also of order beta [H. L. Berk and R. R. Dominguez, J. Plasma Phys. 18, 31 (1977)]. To our knowledge this has not been rigorously tested with modern gyrokinetic codes. In this work we use the gyrokinetic code GS2 [Kotschenreuther et al., Comput. Phys. Commun. 88, 128 (1995)] to investigate whether the compressional magnetic field perturbation B_|| is required for accurate gyrokinetic simulations at low beta for microinstabilities commonly found in tokamaks. The kinetic ballooning mode (KBM) demonstrates the principle described by Berk and Dominguez strongly, as does the trapped electron mode, in a less dramatic way. The ion and electron temperature gradient (ETG) driven modes do not typically exhibit this behavior; the effects of B_|| are found to depend on the pressure gradients. The terms which are seen to cancel at long wavelength in KBM calculations can be cumulative in the ion temperature gradient case and increase with eta_e. The effect of B_|| on the ETG instability is shown to depend on the normalized pressure gradient beta' at constant beta.

Joiner, N.; Hirose, A. [Department of Physics and Engineering Physics, University of Saskatchewan, Saskatoon, Saskatchewan S7N 5E2 (Canada); Dorland, W. [University of Maryland, College Park, Maryland 20742 (United States)

2010-07-15

149

Optimistic Simulation of Parallel Architectures Using Program Executables  

Microsoft Academic Search

A key tool of computer architects is computer simulation at the level of detail that can execute program executables. The time and memory requirements of such simulations can be enormous, especially when the machine under design (the target) is a parallel machine. Thus, it is attractive to use parallel simulation, as successfully demonstrated by the Wisconsin Wind

S. Chandrasekaran; M. D. Hill

1996-01-01

150

Coupled Dipole Simulations of Elastic Light Scattering on Parallel Systems  

Microsoft Academic Search

The Coupled Dipole method is used to simulate Elastic Light Scattering from arbitrary shaped particles. To facilitate simulation of relatively large particles, such as human white blood cells, the number of dipoles required for the simulation is approximately 10^5 to 10^6. In order to carry out such simulations, very powerful computers are necessary. We have designed a parallel version of

A. G. Hoekstra; P. M. A. Sloot

1995-01-01

151

Parallel architecture for real-time simulation. Master's thesis  

SciTech Connect

This thesis is concerned with the development of a very fast and highly efficient parallel computer architecture for real-time simulation of continuous systems. Currently, several parallel processing systems exist that may be capable of executing a complex simulation in real-time. These systems are examined and the pros and cons of each system discussed. The thesis then introduces a custom-designed parallel architecture based upon The University of Alabama's OPERA architecture. Each component of this system is discussed and rationale presented for its selection. The problem selected for the test and evaluation of the proposed architecture, real-time simulation of the Space Shuttle Main Engine, is explored, identifying the areas where parallelism can be exploited and parallel processing applied. Results from the test and evaluation phase are presented and compared with the results of the same problem processed on a uniprocessor system.

Cockrell, C.D.

1989-01-01

152

Parallel Monte Carlo Driver (PMCD)-a software package for Monte Carlo simulations in parallel  

NASA Astrophysics Data System (ADS)

Thanks to the dramatic decrease in computer costs, the no less dramatic increase in those same computers' capabilities, and the availability of specific free software and libraries that allow the setup of small parallel computation installations, the scientific community is now in a position where parallel computation is within easy reach even of moderately budgeted research groups. The software package PMCD (Parallel Monte Carlo Driver) was developed to drive the Monte Carlo simulation of a wide range of user supplied models in parallel computation environments. The typical Monte Carlo simulation involves using a software implementation of a function to repeatedly generate function values. Typically these software implementations were developed for sequential runs. Our driver was developed to enable the Monte Carlo simulation to run in parallel, with minimum changes to the original code that implements the function of interest to the researcher. In this communication we present the main goals and characteristics of our software, together with a simple study of its expected performance. Monte Carlo simulations are informally classified as "embarrassingly parallel", meaning that the gains in parallelizing a Monte Carlo run should be close to ideal, i.e. with speed-ups close to linear. In this paper our simple study shows that without compromising the ease of use and implementation, one can get performance very close to the ideal.

Mendes, B.; Pereira, A.

2003-03-01

153

Massively Parallel Simulations of Diffusion in Dense Polymeric Structures.  

National Technical Information Service (NTIS)

An original computational technique to generate close-to-equilibrium dense polymeric structures is proposed. Diffusion of small gases is studied on the equilibrated structures using massively parallel molecular dynamics simulations running on the Intel T...

J. L. W. R. Faulon

1997-01-01

154

Parallel algorithm for transient solid dynamics simulations with contact detection.  

National Technical Information Service (NTIS)

Solid dynamics simulations with Lagrangian finite elements are used to model a wide variety of problems, such as the calculation of impact damage to shipping containers for nuclear waste and the analysis of vehicular crashes. Using parallel computers for ...

S. Attaway; B. Hendrickson; S. Plimpton; D. Gardner; C. Vaughan

1996-01-01

155

From System Dynamics and Discrete Event to Practical Agent Based Modeling: Reasons, Techniques, Tools  

Microsoft Academic Search

This paper may be considered as a practical reference for those who wish to add (now sufficiently matured) Agent Based modeling to their analysis toolkit and may or may not have some System Dynamics or Discrete Event modeling background. We focus on systems that contain large numbers of active objects (people, business units, animals, vehicles, or even things like projects,

Andrei Borshchev; Alexei Filippov

156

Supremica – A Tool for Verification and Synthesis of Discrete Event Supervisors  

Microsoft Academic Search

Abstract— A tool for automatic verification and synthesis of controllers for discrete event systems is presented. The tool, called Supremica, implements the main ideas of the Supervisory Control Theory. In addition to the verification and synthesis algorithms, Supremica can automatically generate code for C-like languages and also IEC 61131 languages. It is also possible to execute the control

Knut Û Akesson; Martin Fabian; Hugo Flordal; Arash Vahidi

157

EDEN: An Intelligent Software Environment for Diagnosis of Discrete-Event Systems  

Microsoft Academic Search

A software environment, called EDEN, that prototypes a recent approach to model-based diagnosis of discrete-event systems, is presented. The environment integrates a specification language, called SMILE, a model base, and a diagnostic engine. SMILE enables the user to create libraries of models and systems, which are permanently stored in the model base, wherein both final and intermediate results of the

Gianfranco Lamperti; Marina Zanella

2003-01-01

158

ASYNCHRONOUS IMPLEMENTATION OF A PETRI NET BASED DISCRETE EVENT CONTROL SYSTEM USING A XILINX FPGA  

Microsoft Academic Search

This paper presents an asynchronous implementation of a Petri net based discrete event control system (DECS) using a Xilinx field programmable gate array (FPGA). Unlike microprocessor, microcontroller or programmable logic controller (PLC)-based software implementations, and hardware-based synchronous implementations, the implementation method used in this paper is asynchronous and based on hardware, offering very high speed to control fast plants

Murat UZAM; I. Burak KOÇ; Gökhan GELEN; B. Hakan AKSEBZECI

159

Discrete event control system design using automation Petri nets and their ladder diagram implementation  

Microsoft Academic Search

As automated manufacturing systems become more complex, the need for an effective design tool to produce both high-level discrete event control systems (DECS) and low-level implementations becomes more important. Petri nets represent the most effective method for both the design and implementation of DECSs. In this paper, automation Petri nets (APN) are introduced to provide a new method for the

M. Uzam; A. H. Jones

1998-01-01

160

Parallel computing in conceptual sewer simulations.  

PubMed

Integrated urban drainage modelling is used to analyze how existing urban drainage systems respond to particular conditions. Based on these integrated models, researchers and engineers are able to, e.g., estimate long-term pollution effects, optimize the behaviour of a system by comparing the impacts of different measures on the desired target value, or gain new insights into system interactions. Although the use of simplified conceptual models reduces the computational time significantly, searching the enormous vector space spanned by the input parameters or by the measures to be compared means that computational time is still a limiting factor. Owing to the stagnation of single-thread performance in computers and the rising number of cores, algorithms need to be adapted to the parallel nature of the new CPUs to fully utilize the available computing power. In this work a newly developed software tool named CD3 for parallel computing in integrated urban drainage systems is introduced. Of the three parallel strategies investigated, two showed promising results; one achieves a speedup of up to 4.2 on an eight-way hyperthreaded quad-core CPU and shows significant run-time reductions for all investigated sewer systems. PMID:20107253

Burger, G; Fach, S; Kinzel, H; Rauch, W

2010-01-01

161

Xyce parallel electronic simulator : users' guide. Version 5.1.  

SciTech Connect

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. 
As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.

Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Keiter, Eric Richard; Pawlowski, Roger Patrick

2009-11-01

162

Xyce Parallel Electronic Simulator : users' guide, version 4.1.  

SciTech Connect

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. 
As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.

Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Keiter, Eric Richard; Pawlowski, Roger Patrick

2009-02-01

163

HPC Infrastructure for Solid Earth Simulation on Parallel Computers  

NASA Astrophysics Data System (ADS)

Recently, various types of parallel computers with various architectures and processing elements (PE) have emerged, including PC clusters and the Earth Simulator. Moreover, users can easily access these computer resources through the network in a Grid environment. It is well known that thorough tuning is required for programmers to achieve excellent performance on each computer, and the tuning method strongly depends on the type of PE and architecture. Optimization by tuning is very tough work, especially for developers of applications. Moreover, parallel programming using a message passing library such as MPI is another big task for application programmers. In the GeoFEM project (http://gefeom.tokyo.rist.or.jp), the authors have developed a parallel FEM platform for solid earth simulation on the Earth Simulator, which supports parallel I/O, parallel linear solvers and parallel visualization. This platform can efficiently hide complicated procedures for parallel programming and optimization on vector processors from application programmers. This type of infrastructure is very useful: source code developed on a single-processor PC is easily optimized on a massively parallel computer by linking the code to the parallel platform installed on the target computer. This parallel platform, called HPC Infrastructure, will provide dramatic efficiency, portability and reliability in the development of scientific simulation codes. For example, the number of source-code lines is expected to be less than 10,000, and porting legacy codes to a parallel computer takes two or three weeks. The original GeoFEM platform supports only I/O, linear solvers and visualization; in the present work, further development for adaptive mesh refinement (AMR) and dynamic load-balancing (DLB) has been carried out. In this presentation, examples of large-scale solid earth simulation using the Earth Simulator will be demonstrated. 
Moreover, recent results of a parallel computational steering tool using an MxN communication model will be shown. In an MxN communication model, the large-scale computation modules run on M PEs and high-performance parallel visualization modules run on N PEs, concurrently. This allows computation and visualization to select suitable parallel hardware environments, respectively; meanwhile, real-time steering can be achieved during computation, so that users can check and adjust the computation process in real time. Furthermore, different numbers of PEs can achieve a better configuration between computation and visualization in a Grid environment.

Nakajima, K.; Chen, L.; Okuda, H.

2004-12-01

164

n-body simulations using message passing parallel computers.  

NASA Astrophysics Data System (ADS)

The authors present new parallel formulations of the Barnes-Hut method for n-body simulations on message passing computers. These parallel formulations partition the domain efficiently incurring minimal communication overhead. This is in contrast to existing schemes that are based on sorting a large number of keys or on the use of global data structures. The new formulations are augmented by alternate communication strategies which serve to minimize communication overhead. The impact of these communication strategies is experimentally studied. The authors report on experimental results obtained from an astrophysical simulation on an nCUBE2 parallel computer.

Grama, A. Y.; Kumar, V.; Sameh, A.

165

SCGPSim: A fast SystemC simulator on GPUs  

Microsoft Academic Search

The main objective of this paper is to speed up the simulation performance of SystemC designs at the RTL abstraction level by exploiting the high degree of parallelism afforded by today's general purpose graphics processors (GPGPUs). Our approach parallelizes SystemC's discrete-event simulation (DES) on GPGPUs by transforming the model of computation of DES into a model of concurrent threads that
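As context for the model of computation this paper parallelizes, a discrete-event simulation kernel can be sketched in a few lines: pop the earliest event from a priority queue and invoke its handler, which may schedule further events. This is an illustrative sequential sketch, not SCGPSim's code; the event kinds and handlers below are made up.

```python
import heapq

def run_des(initial_events, handlers, t_end):
    """Minimal sequential discrete-event kernel.  Events are tuples
    (time, seq, kind, data); seq is a tie-breaker for equal times.
    Each handler returns a list of (delay, kind, data) follow-ups."""
    queue = list(initial_events)
    heapq.heapify(queue)
    seq = len(queue)
    trace = []
    while queue:
        t, _, kind, data = heapq.heappop(queue)
        if t > t_end:
            break
        trace.append((t, kind))
        for dt, new_kind, new_data in handlers[kind](t, data):
            heapq.heappush(queue, (t + dt, seq, new_kind, new_data))
            seq += 1
    return trace

# toy model: a clock that ticks every 2.0 time units
handlers = {"tick": lambda t, d: [(2.0, "tick", None)]}
trace = run_des([(0.0, 0, "tick", None)], handlers, t_end=7.0)
print(trace)  # [(0.0, 'tick'), (2.0, 'tick'), (4.0, 'tick'), (6.0, 'tick')]
```

The GPU approach described in the abstract replaces this single event loop with many concurrent threads, which is what makes the transformation of the model of computation necessary.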

Mahesh Nanjundappa; Hiren D. Patel; Bijoy A. Jose; Sandeep K. Shukla

2010-01-01

166

High-performance retargetable simulator for parallel architectures. Technical report  

SciTech Connect

In this thesis, the authors describe Proteus, a high-performance simulation-based system for the evaluation of parallel algorithms and system software. Proteus is built around a retargetable parallel architecture simulator and a flexible data collection and display component. The simulator uses a combination of simulation and direct execution to achieve high performance, while retaining simulation accuracy. Proteus can be configured to simulate a wide range of shared memory and message passing MIMD architectures and the level of simulation detail can be chosen by the user. Detailed memory, cache and network simulation is supported. Parallel programs can be written using a programming model based on C and a set of runtime system calls for thread and memory management. The system allows nonintrusive monitoring of arbitrary information about an execution, and provides flexible graphical utilities for displaying recorded data. To validate the accuracy of the system, a number of published experiments were reproduced on Proteus. In all cases the results obtained by simulation are very close to those published, a fact that provides support for the reliability of the system. Performance measurements demonstrate that the simulator is one to two orders of magnitude faster than other similar multiprocessor simulators.

Dellarocas, C.N.

1991-06-01

167

3-D massively parallel impact simulations using PCTH  

SciTech Connect

Simulations of hypervelocity impact problems are performed frequently by government laboratories and contractors for armor/anti-armor applications. These simulations need to deal with shock wave physics phenomena, large material deformation, motion of debris particles and complex geometries. As a result, memory and processing time requirements are large for detailed, three-dimensional calculations. The large massively parallel supercomputing systems of the future will provide the power necessary to greatly reduce simulation times currently required by shared-memory, vector supercomputers. This paper gives an introduction to PCTH, a next-generation shock wave physics code which is being built at Sandia National Laboratories for massively parallel supercomputers, and demonstrates that massively parallel hydrocodes, such as PCTH, can provide highly-detailed, three-dimensional simulations of armor/anti-armor systems.

Fang, H.E.; Robinson, A.C.

1992-01-01

168

3-D massively parallel impact simulations using PCTH  

SciTech Connect

Simulations of hypervelocity impact problems are performed frequently by government laboratories and contractors for armor/anti-armor applications. These simulations need to deal with shock wave physics phenomena, large material deformation, motion of debris particles and complex geometries. As a result, memory and processing time requirements are large for detailed, three-dimensional calculations. The large massively parallel supercomputing systems of the future will provide the power necessary to greatly reduce simulation times currently required by shared-memory, vector supercomputers. This paper gives an introduction to PCTH, a next-generation shock wave physics code which is being built at Sandia National Laboratories for massively parallel supercomputers, and demonstrates that massively parallel hydrocodes, such as PCTH, can provide highly-detailed, three-dimensional simulations of armor/anti-armor systems.

Fang, H.E.; Robinson, A.C.

1992-12-31

169

Fully Implicit Parallel Simulation of Single Neurons  

PubMed Central

When a multi-compartment neuron is divided into subtrees such that no subtree has more than two connection points to other subtrees, the subtrees can be on different processors and the entire system remains amenable to direct Gaussian elimination with only a modest increase in complexity. Accuracy is the same as with standard Gaussian elimination on a single processor. It is often feasible to divide a 3-d reconstructed neuron model onto a dozen or so processors and experience almost linear speedup. We have also used the method for purposes of load balance in network simulations when some cells are so large that their individual computation time is much longer than the average processor computation time or when there are many more processors than cells. The method is available in the standard distribution of the NEURON simulation program.

Hines, Michael L.; Markram, Henry; Schurmann, Felix

2009-01-01

170

Efficient parallel simulation of CO2 geologic sequestration in saline aquifers  

SciTech Connect

An efficient parallel simulator for large-scale, long-term CO2 geologic sequestration in saline aquifers has been developed. The parallel simulator is a three-dimensional, fully implicit model that solves large, sparse linear systems arising from discretization of the partial differential equations for mass and energy balance in porous and fractured media. The simulator is based on the ECO2N module of the TOUGH2 code and inherits all the process capabilities of the single-CPU TOUGH2 code, including a comprehensive description of the thermodynamics and thermophysical properties of H2O-NaCl-CO2 mixtures, modeling single- and/or two-phase isothermal or non-isothermal flow processes, two-phase mixtures, fluid phases appearing or disappearing, as well as salt precipitation or dissolution. The new parallel simulator uses MPI for parallel implementation, the METIS software package for simulation domain partitioning, and the iterative parallel linear solver package Aztec for solving linear equations by multiple processors. In addition, the parallel simulator has been implemented with an efficient communication scheme. Test examples show that a linear or super-linear speedup can be obtained on Linux clusters as well as on supercomputers. Because of the significant improvement in both simulation time and memory requirement, the new simulator provides a powerful tool for tackling larger scale and more complex problems than can be solved by single-CPU codes. A high-resolution simulation example is presented that models buoyant convection, induced by a small increase in brine density caused by dissolution of CO2.

Zhang, Keni; Doughty, Christine; Wu, Yu-Shu; Pruess, Karsten

2007-01-01

171

Parallel in Time Simulation of Multiscale Stochastic Chemical Kinetics  

Microsoft Academic Search

A version of the time-parallel algorithm parareal is analyzed and applied to stochastic models in chemical kinetics. A fast predictor at the macroscopic scale (evaluated in serial) is available in the form of the usual reaction rate equations. A stochastic simulation algorithm is used to obtain an exact realization of the process at the mesoscopic scale (in parallel). The underlying
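The parareal structure the abstract outlines — a cheap serial predictor, an expensive propagator applied to each time slice independently (hence in parallel), and a serial correction sweep — can be sketched on a toy deterministic problem dy/dt = -y. Here a fine Euler integration stands in for the stochastic simulation algorithm, so this illustrates the coupling scheme only, not the paper's stochastic setting.

```python
def coarse(y, t0, t1):
    """Cheap serial predictor: one explicit Euler step for dy/dt = -y
    (the analogue of the macroscopic reaction rate equations)."""
    return y + (t1 - t0) * (-y)

def fine(y, t0, t1, n=100):
    """Expensive propagator, run slice-by-slice in parallel in parareal
    (here a fine Euler integration stands in for the SSA realization)."""
    dt = (t1 - t0) / n
    for _ in range(n):
        y = y + dt * (-y)
    return y

def parareal(y0, t_grid, n_iter):
    """Parareal iteration: y[i+1] = G(y[i]) + F(y_old[i]) - G(y_old[i]),
    where G is coarse and F is fine; F and G on the old iterate are
    independent per slice and could be evaluated concurrently."""
    n = len(t_grid) - 1
    y = [y0] * (n + 1)
    for i in range(n):                       # initial serial coarse sweep
        y[i + 1] = coarse(y[i], t_grid[i], t_grid[i + 1])
    for _ in range(n_iter):
        f_vals = [fine(y[i], t_grid[i], t_grid[i + 1]) for i in range(n)]
        g_old = [coarse(y[i], t_grid[i], t_grid[i + 1]) for i in range(n)]
        for i in range(n):                   # serial correction sweep
            g_new = coarse(y[i], t_grid[i], t_grid[i + 1])
            y[i + 1] = g_new + f_vals[i] - g_old[i]
    return y

t_grid = [0.0, 0.5, 1.0, 1.5, 2.0]
y_end = parareal(1.0, t_grid, n_iter=4)[-1]  # approaches exp(-2) ≈ 0.135
```

After k iterations the scheme reproduces the fine solution exactly on the first k slices, so with as many iterations as slices it matches the serial fine solve; the speedup comes from stopping much earlier.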

Stefan Engblom

2009-01-01

172

Xyce Parallel Electronic Simulator : users' guide, version 2.0.  

SciTech Connect

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator capable of simulating electrical circuits at a variety of abstraction levels. Primarily, Xyce has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including many radiation-aware devices. (4) A client-server or multi-tiered operating model wherein the numerical kernel can operate independently of the graphical user interface (GUI). (5) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. One feature required by designers is the ability to add device models, many specific to the needs of Sandia, to the code. 
To this end, the device package in the Xyce Parallel Electronic Simulator is designed to support a variety of device model inputs. These input formats include standard analytical models, behavioral models, look-up tables, and mesh-level PDE device models. Combined with this flexible interface is an architectural design that greatly simplifies the addition of circuit models. One of the most important features of Xyce is in providing a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia now has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods) research and development can be performed. Ultimately, these capabilities are migrated to end users.

Hoekstra, Robert John; Waters, Lon J.; Rankin, Eric Lamont; Fixel, Deborah A.; Russo, Thomas V.; Keiter, Eric Richard; Hutchinson, Scott Alan; Pawlowski, Roger Patrick; Wix, Steven D.

2004-06-01

173

A hybrid parallel framework for the cellular Potts model simulations  

SciTech Connect

The Cellular Potts Model (CPM) has been widely used for biological simulations. However, most current implementations are either sequential or approximated, and cannot be used for large-scale, complex 3D simulation. In this paper we present a hybrid parallel framework for CPM simulations. The time-consuming PDE solving, cell division, and cell reaction operations are distributed to clusters using the Message Passing Interface (MPI). The Monte Carlo lattice update is parallelized on shared-memory SMP systems using OpenMP. Because the Monte Carlo lattice update is much faster than the PDE solving and SMP systems are more and more common, this hybrid approach achieves good performance and high accuracy at the same time. Based on the parallel Cellular Potts Model, we studied avascular tumor growth using a multiscale model. The application and performance analysis show that the hybrid parallel framework is quite efficient. The hybrid parallel CPM can be used for large-scale simulation ({approx}10{sup 8} sites) of the complex collective behavior of numerous cells ({approx}10{sup 6}).

Jiang, Yi [Los Alamos National Laboratory; He, Kejing [SOUTH CHINA UNIV; Dong, Shoubin [SOUTH CHINA UNIV

2009-01-01

174

Efficient Analysis of Large Discrete-Event Systems with Binary Decision Diagrams  

Microsoft Academic Search

Efficient analysis and controller synthesis in the context of Discrete-Event Systems (DES) is discussed in this paper. We consider efficient reachability search for solving common problems in the Supervisory Control Theory (SCT). The search is based on symbolic computations including crucial partitioning techniques. Finally, the efficiency of the presented algorithms is demonstrated on a set of hand-made and real-world industrial

Arash Vahidi; Bengt Lennartson; Martin Fabian

2005-01-01

175

Verifying Ptolemy II Discrete-Event Models Using Real-Time Maude  

Microsoft Academic Search

This paper shows how Ptolemy II discrete-event (DE) models can be formally analyzed using Real-Time Maude. We formalize in Real-Time Maude the semantics of a subset of hierarchical Ptolemy II DE models, and explain how the code generation infrastructure of Ptolemy II has been used to automatically synthesize a Real-Time Maude verification model from a Ptolemy II design model. This

Kyungmin Bae; Peter Csaba Ölveczky; Thomas Huining Feng; Stavros Tripakis

2009-01-01

176

Parallelization of a Monte Carlo particle transport simulation code  

NASA Astrophysics Data System (ADS)

We have developed a high-performance version of the Monte Carlo particle transport simulation code MC4. The original application code, developed in Visual Basic for Applications (VBA) for Microsoft Excel, was first rewritten in the C programming language to improve code portability. Several pseudo-random number generators have also been integrated and studied. The new MC4 version was then parallelized for shared- and distributed-memory multiprocessor systems using the Message Passing Interface. Two parallel pseudo-random number generator libraries (SPRNG and DCMT) have been seamlessly integrated. The performance speedup of parallel MC4 has been studied on a variety of parallel computing architectures, including an Intel Xeon server with 4 dual-core processors, a Sun cluster consisting of 16 nodes of 2 dual-core AMD Opteron processors, and a 200 dual-processor HP cluster. For large problem sizes, which are limited only by the physical memory of the multiprocessor server, the speedup results are almost linear on all systems. We have validated the parallel implementation against the serial VBA and C implementations using the same random number generator. Our experimental results on the transport and energy loss of electrons in a water medium show that the serial and parallel codes are equivalent in accuracy. The present improvements allow for the study of higher particle energies with more accurate physical models, and improve statistics, as more particle tracks can be simulated in a short response time.
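The parallel random-number requirement mentioned above (served by SPRNG and DCMT in the paper) comes down to giving each worker a statistically independent stream. The same idea can be sketched with NumPy's SeedSequence spawning as a stand-in for those libraries; the per-worker tally below is a made-up toy, not the MC4 physics.

```python
import numpy as np

def independent_streams(master_seed, n_workers):
    """Spawn statistically independent generators, one per worker --
    the role a parallel PRNG library plays in a parallel MC code."""
    children = np.random.SeedSequence(master_seed).spawn(n_workers)
    return [np.random.default_rng(s) for s in children]

def worker_tally(rng, n_particles):
    """Toy per-worker tally: mean of exponentially distributed step
    lengths (a hypothetical stand-in for a transport estimator)."""
    return rng.exponential(scale=1.0, size=n_particles).mean()

streams = independent_streams(1234, 4)
tallies = [worker_tally(rng, 50_000) for rng in streams]
print(sum(tallies) / len(tallies))  # close to the true mean of 1.0
```

Spawned streams avoid the classic pitfall of seeding every worker with `seed + rank`, which can produce correlated sequences for some generators.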

Hadjidoukas, P.; Bousis, C.; Emfietzoglou, D.

2010-05-01

177

Xyce Parallel Electronic Simulator : reference guide, version 4.1.  

SciTech Connect

This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list, as exhaustively as possible, device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide.

Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Keiter, Eric Richard; Pawlowski, Roger Patrick

2009-02-01

178

Xyce Parallel Electronic Simulator : reference guide, version 2.0.  

SciTech Connect

This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide. The focus of this document is to list, as exhaustively as possible, device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide.

Hoekstra, Robert John; Waters, Lon J.; Rankin, Eric Lamont; Fixel, Deborah A.; Russo, Thomas V.; Keiter, Eric Richard; Hutchinson, Scott Alan; Pawlowski, Roger Patrick; Wix, Steven D.

2004-06-01

179

Smoldyn on graphics processing units: massively parallel Brownian dynamics simulations.  

PubMed

Space is a very important aspect in the simulation of biochemical systems; recently, the need for simulation algorithms able to cope with space has become more and more compelling. Complex and detailed models of biochemical systems need to deal with the movement of single molecules and particles, taking into consideration localized fluctuations, transportation phenomena, and diffusion. A common drawback of spatial models lies in their complexity: models can become very large, and their simulation can be time consuming, especially if we want to capture the system's behavior in a reliable way using stochastic methods in conjunction with a high spatial resolution. In order to deliver on the promise made by systems biology to understand a system as a whole, we need to scale up the size of the models we are able to simulate, moving from sequential to parallel simulation algorithms. In this paper, we analyze Smoldyn, a widely used algorithm for stochastic simulation of chemical reactions with spatial resolution and single-molecule detail, and we propose an alternative, innovative implementation that exploits the parallelism of Graphics Processing Units (GPUs). The implementation executes the most computationally demanding steps (computation of diffusion, unimolecular and bimolecular reactions, as well as the most common cases of molecule-surface interaction) on the GPU, computing them in parallel for each molecule of the system. The implementation offers good speed-ups and real-time, high-quality graphics output. PMID:21788675
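The core data-parallel step in such a simulator, advancing every molecule's Brownian diffusion at once, can be sketched with NumPy vectorization standing in for the GPU kernel. This is illustrative only: Smoldyn's actual kernels also handle reactions and surface interactions, and the parameters below are arbitrary.

```python
import numpy as np

def diffusion_step(pos, D, dt, rng):
    """Advance all molecules by one Brownian step in a single vectorized
    operation: each coordinate gets a Gaussian displacement with
    standard deviation sqrt(2*D*dt).  On a GPU this loop-free form maps
    naturally to one thread per molecule."""
    sigma = np.sqrt(2.0 * D * dt)
    return pos + rng.normal(0.0, sigma, size=pos.shape)

rng = np.random.default_rng(0)
pos = np.zeros((50_000, 3))          # 5e4 molecules starting at the origin
for _ in range(50):                  # 50 steps of dt = 1e-3
    pos = diffusion_step(pos, D=10.0, dt=1e-3, rng=rng)
msd = (pos ** 2).sum(axis=1).mean()  # mean squared displacement
# for free 3-D diffusion, msd should be close to 6*D*t = 6*10*0.05 = 3.0
```

The absence of any per-molecule loop is exactly what makes the computation "massively parallel": every molecule's update is independent within a time step.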

Dematté, Lorenzo

180

Leaky Modes in Parallel-Plate EMP Simulators  

Microsoft Academic Search

The finite-width parallel-plate waveguide is a useful tool as an EMP simulator, and its characteristics have recently been investigated by a number of workers. In this paper, we report the results of a study of the modal fields in such a waveguide. Once these modal fields and their corresponding wavenumbers are known, the problem of source excitation in such a

Ali Rushdi; Ronald Menendez; Raj Mittra; Shung-Wu Lee

1978-01-01

181

A Parallel Engine for Graphical Interactive Molecular Dynamics Simulations  

Microsoft Academic Search

The current work proposes a parallel implementation for interactive molecular dynamics simulations (MD). The interactive capability is modeled by finite automata that are executed in the processing nodes. Any interaction implies a communication between the user interface and the finite automata. The ADKS, an interactive sequential MD code that provides graphical output, was chosen as a case study. A

Eduardo Rocha Rodrigues; Airam Jonatas Preto; Stephan Stephany

2004-01-01

182

Massively parallel architectures for large scale neural network simulations  

Microsoft Academic Search

A toroidal lattice architecture (TLA) and a planar lattice architecture (PLA) are proposed as massively parallel neurocomputer architectures for large-scale simulations. The performance of these architectures is almost proportional to the number of node processors, and they adopt the most efficient two-dimensional processor connections for WSI implementation. They also give a solution to the connectivity problem, the performance degradation caused

Yoshiji Fujimoto; Naoyuki Fukuda; Toshio Akabane

1992-01-01

183

Scalable parallel solution coupling for multiphysics reactor simulation  

NASA Astrophysics Data System (ADS)

Reactor simulation depends on the coupled solution of various physics types, including neutronics, thermal/hydraulics, and structural mechanics. This paper describes the formulation and implementation of a parallel solution coupling capability being developed for reactor simulation. The coupling process consists of mesh and coupler initialization, point location, field interpolation, and field normalization. We report here our test of this capability on an example problem, namely, a reflector assembly from an advanced burner test reactor. Performance of this coupler in parallel is reasonable for the chosen problem size and range of processor counts. The runtime is dominated by startup costs, which amortize over the entire coupled simulation. Future efforts will include adding more sophisticated interpolation and normalization methods, to accommodate different numerical solvers used in various physics modules and to obtain better conservation properties for certain field types.

Tautges, Timothy J.; Caceres, Alvaro

2009-07-01

184

Random number generators for massively parallel simulations on GPU  

NASA Astrophysics Data System (ADS)

High-performance streams of (pseudo) random numbers are crucial for the efficient implementation of countless stochastic algorithms, most importantly, Monte Carlo simulations and molecular dynamics simulations with stochastic thermostats. A number of implementations of random number generators have been discussed for GPU platforms before, and some generators are even included in the CUDA supporting libraries. Nevertheless, not all of these generators are well suited for highly parallel applications where each thread requires its own generator instance. For this specific situation, encountered for instance in simulations of lattice models, most of the high-quality generators with large states such as the Mersenne twister cannot be used efficiently without substantial changes. We provide a broad review of existing CUDA variants of random-number generators and present the CUDA implementation of a new massively parallel high-quality, high-performance generator with a small memory load overhead.

Manssen, M.; Weigel, M.; Hartmann, A. K.

2012-08-01
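The design constraint named in the abstract, one generator instance per thread with a small memory footprint, is commonly met with counter-based (stateless) generators. The paper's generator is not reproduced here; the sketch below uses the well-known splitmix64 mixing function, keyed by (thread id, counter), purely to illustrate the pattern:

```python
MASK = 0xFFFFFFFFFFFFFFFF

def splitmix64(x):
    """Stateless 64-bit mixing function (the splitmix64 finalizer).

    Keyed by an integer, it returns a well-mixed 64-bit word; keying with
    thread_id * 2**32 + counter gives every GPU thread an independent,
    reproducible stream with no per-thread state beyond a counter.
    """
    x = (x + 0x9E3779B97F4A7C15) & MASK
    x = ((x ^ (x >> 30)) * 0xBF58476D1CE4E5B9) & MASK
    x = ((x ^ (x >> 27)) * 0x94D049BB133111EB) & MASK
    return (x ^ (x >> 31)) & MASK

def uniform(thread_id, counter):
    """Map the mixed 64-bit word to a float in [0, 1)."""
    return splitmix64((thread_id << 32) | counter) / 2.0**64

# Two "threads" draw from independent, reproducible streams.
stream0 = [uniform(0, c) for c in range(4)]
stream1 = [uniform(1, c) for c in range(4)]
```

Because the state is just the counter, millions of such streams fit on a GPU, unlike a Mersenne-twister instance per thread.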

185

Molecular simulation of rheological properties using massively parallel supercomputers  

SciTech Connect

Advances in parallel supercomputing now make possible molecular-based engineering and science calculations that will soon revolutionize many technologies, such as those involving polymers and those involving aqueous electrolytes. We have developed a suite of message-passing codes for classical molecular simulation of such complex fluids and amorphous materials and have completed a number of demonstration calculations of problems of scientific and technological importance with each. In this paper, we will focus on the molecular simulation of rheological properties, particularly viscosity, of simple and complex fluids using parallel implementations of non-equilibrium molecular dynamics. Such calculations represent significant challenges computationally because, in order to reduce the thermal noise in the calculated properties within acceptable limits, large systems and/or long simulated times are required.

Bhupathiraju, R.K.; Cui, S.T.; Gupta, S.A.; Cummings, P.T. [Univ. of Tennessee, Knoxville, TN (United States). Dept of Chemical Engineering; Cochran, H.D. [Oak Ridge National Lab., TN (United States)

1996-11-01

186

PRATHAM: Parallel Thermal Hydraulics Simulations using Advanced Mesoscopic Methods  

SciTech Connect

At the Oak Ridge National Laboratory, efforts are under way to develop a 3D, parallel LBM code called PRATHAM (PaRAllel Thermal Hydraulic simulations using Advanced Mesoscopic Methods) to demonstrate the accuracy and scalability of LBM for turbulent flow simulations in nuclear applications. The code has been developed using FORTRAN-90 and parallelized using the Message Passing Interface (MPI) library. The Silo library is used to compact and write the data files, and the VisIt visualization software is used to post-process the simulation data in parallel. Both the single relaxation time (SRT) and multiple relaxation time (MRT) LBM schemes have been implemented in PRATHAM. To capture turbulence without prohibitively increasing the grid resolution requirements, an LES approach [5] is adopted, allowing large-scale eddies to be numerically resolved while modeling the smaller (subgrid) eddies. In this work, a Smagorinsky model has been used, which modifies the fluid viscosity by an additional eddy viscosity depending on the magnitude of the rate-of-strain tensor. In LBM, this is achieved by locally varying the relaxation time of the fluid.

Joshi, Abhijit S [ORNL; Jain, Prashant K [ORNL; Mudrich, Jaime A [ORNL; Popov, Emilian L [ORNL

2012-01-01
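The local viscosity modification described above can be sketched in a few lines. In lattice units (cs² = 1/3, Δt = 1) the BGK relaxation time is τ = 3ν + 1/2, so the Smagorinsky eddy viscosity ν_t = (Cs·Δ)²·|S| makes τ vary point by point. The Cs value and the strain-rate magnitudes are illustrative; in a real LBM code |S| would be recovered locally from the non-equilibrium moments:

```python
import numpy as np

def effective_tau(nu0, strain_rate_mag, Cs=0.1, delta=1.0):
    """Local SRT relaxation time with a Smagorinsky subgrid model.

    nu_t = (Cs*delta)^2 * |S| is added to the molecular viscosity nu0;
    in lattice units (cs^2 = 1/3, dt = 1) the BGK relaxation time is
    tau = 3*nu + 1/2, so tau follows the local strain-rate magnitude.
    """
    nu_t = (Cs * delta) ** 2 * strain_rate_mag
    return 3.0 * (nu0 + nu_t) + 0.5

S = np.array([0.0, 0.05, 0.2])   # sample local strain-rate magnitudes |S|
tau = effective_tau(nu0=0.01, strain_rate_mag=S)
```

Where the flow strains hardest, τ (and hence the effective viscosity) is largest, which is exactly the stabilizing effect the LES model provides.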

187

Potts-model grain growth simulations: Parallel algorithms and applications  

SciTech Connect

Microstructural morphology and grain boundary properties often control the service properties of engineered materials. This report uses the Potts-model to simulate the development of microstructures in realistic materials. Three areas of microstructural morphology simulations were studied. They include the development of massively parallel algorithms for Potts-model grain growth simulations, modeling of mass transport via diffusion in these simulated microstructures, and the development of a gradient-dependent Hamiltonian to simulate columnar grain growth. Potts grain growth models for massively parallel supercomputers were developed for the conventional Potts-model in both two and three dimensions. Simulations using these parallel codes showed self similar grain growth and no finite size effects for previously unapproachable large scale problems. In addition, new enhancements to the conventional Metropolis algorithm used in the Potts-model were developed to accelerate the calculations. These techniques enable both the sequential and parallel algorithms to run faster and use essentially an infinite number of grain orientation values to avoid non-physical grain coalescence events. Mass transport phenomena in polycrystalline materials were studied in two dimensions using numerical diffusion techniques on microstructures generated using the Potts-model. The results of the mass transport modeling showed excellent quantitative agreement with one dimensional diffusion problems; however, the results also suggest that transient multi-dimensional diffusion effects cannot be parameterized as the product of the grain boundary diffusion coefficient and the grain boundary width. Instead, both properties are required. Gradient-dependent grain growth mechanisms were included in the Potts-model by adding an extra term to the Hamiltonian. Under normal grain growth, the primary driving term is the curvature of the grain boundary, which is included in the standard Potts-model Hamiltonian.

Wright, S.A.; Plimpton, S.J.; Swiler, T.P. [and others

1997-08-01
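The conventional Metropolis Potts dynamics underlying these grain-growth simulations can be sketched compactly. The lattice size, temperature, and the "effectively infinite" orientation count (here q = 10⁶, echoing the abstract's device for avoiding non-physical coalescence) are illustrative choices:

```python
import math, random

def potts_step(lattice, L, T, rng):
    """One Metropolis attempt of Potts-model grain growth on an L x L grid.

    A random site tries to adopt the orientation of a random neighbor; the
    energy is the number of unlike nearest-neighbor bonds, so moves that
    shorten grain boundaries are always accepted, and uphill moves only
    with probability exp(-dE/T) (periodic boundaries for brevity).
    """
    i, j = rng.randrange(L), rng.randrange(L)
    neigh = [lattice[(i - 1) % L][j], lattice[(i + 1) % L][j],
             lattice[i][(j - 1) % L], lattice[i][(j + 1) % L]]
    old, new = lattice[i][j], rng.choice(neigh)
    dE = sum(n != new for n in neigh) - sum(n != old for n in neigh)
    if dE <= 0 or rng.random() < math.exp(-dE / T):
        lattice[i][j] = new

def boundary_length(lattice, L):
    """Count unlike nearest-neighbor bonds (total grain-boundary length)."""
    return sum(lattice[i][j] != lattice[(i + 1) % L][j]
               for i in range(L) for j in range(L)) + \
           sum(lattice[i][j] != lattice[i][(j + 1) % L]
               for i in range(L) for j in range(L))

rng = random.Random(1)
L = 32
# Many orientation values avoid non-physical grain coalescence.
lattice = [[rng.randrange(10**6) for _ in range(L)] for _ in range(L)]
before = boundary_length(lattice, L)
for _ in range(50000):
    potts_step(lattice, L, T=0.5, rng=rng)
after = boundary_length(lattice, L)   # coarsening shortens total boundary
```

Parallel versions partition the lattice into domains and restrict simultaneous updates to non-interacting sublattices so neighboring sites are never flipped concurrently.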

188

Simulations of Space Station data links and ground processing  

NASA Astrophysics Data System (ADS)

A program aimed at the study of the possibilities of using parallel processing configurations for the real-time processing of Space Station data is reviewed. The potential configurations are evaluated using a program based on discrete-event simulation models. The major near-term goals of the program include: simulation of specific configurations to verify the methodology of using the simulation packages; model development for the ground-based data processing components; model development for commercially available parallel architectures; and modeling of various data transport configurations to simulate the operation of parallel processing configurations.

Horan, Stephen

189

Supervisor Localization: A Top-Down Approach to Distributed Control of Discrete-Event Systems  

NASA Astrophysics Data System (ADS)

A purely distributed control paradigm is proposed for discrete-event systems (DES). In contrast to control by one or more external supervisors, distributed control aims to design built-in strategies for individual agents. First a distributed optimal nonblocking control problem is formulated. To solve it, a top-down localization procedure is developed which systematically decomposes an external supervisor into local controllers while preserving optimality and nonblockingness. An efficient localization algorithm is provided to carry out the computation, and an automated guided vehicle (AGV) example is presented for illustration. Finally, the `easiest' and `hardest' boundary cases of localization are discussed.

Cai, K.; Wonham, W. M.

2009-03-01

190

Determining the significance of associations between two series of discrete events : bootstrap methods /  

SciTech Connect

We review and develop techniques to determine associations between series of discrete events. The bootstrap, a nonparametric statistical method, allows the determination of the significance of associations with minimal assumptions about the underlying processes. We find the key requirement for this method: one of the series must be widely spaced in time to guarantee the theoretical applicability of the bootstrap. If this condition is met, the calculated significance passes a reasonableness test. We conclude with some potential future extensions and caveats on the applicability of these methods. The techniques presented have been implemented in a Python-based software toolkit.

Niehof, Jonathan T.; Morley, Steven K.

2012-01-01
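The bootstrap test described above can be illustrated on synthetic event series. The association measure (events falling within a fixed time window of each other) and the null model (redrawing one series uniformly over the observation span) are simplified stand-ins for the resampling schemes the paper develops:

```python
import random

def n_close_pairs(a, b, window):
    """Count events in series `a` with an event of `b` within `window`."""
    return sum(any(abs(t - s) <= window for s in b) for t in a)

def bootstrap_pvalue(a, b, window, span, n_boot=500, seed=0):
    """Bootstrap significance of the association between two event series.

    The observed count is compared against a null distribution built by
    redrawing the times of series `b` uniformly over the observation
    span; the (1 + ...)/(n + 1) form avoids a p-value of exactly zero.
    """
    rng = random.Random(seed)
    observed = n_close_pairs(a, b, window)
    null = [n_close_pairs(a, [rng.uniform(0, span) for _ in b], window)
            for _ in range(n_boot)]
    return (1 + sum(n >= observed for n in null)) / (n_boot + 1)

# Series b deliberately tracks series a with a small offset, and the
# events of `a` are widely spaced, per the paper's key requirement.
a = [10.0 * k for k in range(1, 20)]
b = [t + 0.3 for t in a]
p = bootstrap_pvalue(a, b, window=1.0, span=200.0)
```

With every event of `a` matched, the observed association far exceeds anything the uniform null produces, so the p-value is small.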

191

Schedulability Analysis of Periodic and Sporadic Tasks Using a Timed Discrete Event Model with Memorable Events  

NASA Astrophysics Data System (ADS)

In a real-time system, when the execution of a task is preempted by another task, the interrupted task falls into a blocked state. Since its re-execution generally begins from the interrupted point, the task's timer containing the remaining time until its completion should be maintained in the blocked state. This is the reason for introducing the notion of memorable events in this paper. We present a new timed discrete event model (TDEM) that adds the memorable events to the TDEM framework of Brandin and Wonham (1994). Using supervisory control theory upon the proposed TDEM, we analyze the schedulability of preemptable periodic and sporadic tasks executing on a uniprocessor.

Yang, Jung-Min; Park, Seong-Jin

192

Flow simulations by parallel computer MiPAX  

SciTech Connect

The authors have developed some parallel computer programs to show that the parallel computer MiPAX is suitable to deal with fluid flow simulations. MiPAX is the first commercially available parallel computer for scientific applications in Japan. They describe two typical methods for incompressible viscous flow problems: the MAC method and the third-order upwind scheme in the general curvilinear coordinate system. The techniques of mapping a physical space onto a processing unit array and the procedure of data transfer are also presented for MiPAX-32JFV, which consists of 32 processing units. They conclude that a program for MiPAX is basically identical with one for conventional machines.

Hara, H.; Kodera, Y.; Kanehiro (Mitsui Engineering and Shipbuilding Co., Ltd., Tamano, Okayama 706 (JP))

1988-01-01

193

Modularized Parallel Neutron Instrument Simulation on the TeraGrid  

SciTech Connect

In order to build a bridge between the TeraGrid (TG), a national scale cyberinfrastructure resource, and neutron science, the Neutron Science TeraGrid Gateway (NSTG) is focused on introducing productive HPC usage to the neutron science community, primarily the Spallation Neutron Source (SNS) at Oak Ridge National Laboratory (ORNL). Monte Carlo simulations are used as a powerful tool for instrument design and optimization at SNS. One of the successful efforts of a collaboration team composed of NSTG HPC experts and SNS instrument scientists is the development of a software facility named PSoNI, Parallelizing Simulations of Neutron Instruments. Parallelizing the traditional serial instrument simulation on TeraGrid resources, PSoNI quickly computes full instrument simulation at sufficient statistical levels in instrument design. Following the successful commissioning of SNS, by the end of 2007 three of the five commissioned instruments in the SNS target station will be available for initial users. Advanced instrument study, proposal feasibility evaluation, and experiment planning are on the immediate schedule of SNS, which pose further requirements such as flexibility and high runtime efficiency on fast instrument simulation. PSoNI has been redesigned to meet the new challenges and a preliminary version is developed on TeraGrid. This paper explores the motivation and goals of the new design, and the improved software structure. Further, it describes the realized new features seen from MPI-parallelized McStas running high resolution design simulations of the SEQUOIA and BSS instruments at SNS. A discussion regarding future work, which is targeted to do fast simulation for automated experiment adjustment and comparing models to data in analysis, is also presented.

Chen, Meili [ORNL; Cobb, John W [ORNL; Hagen, Mark E [ORNL; Miller, Stephen D [ORNL; Lynch, Vickie E [ORNL

2007-01-01

194

Noise simulation in cone beam CT imaging with parallel computing  

NASA Astrophysics Data System (ADS)

We developed a computer noise simulation model for cone beam computed tomography imaging using a general purpose PC cluster. This model uses a mono-energetic x-ray approximation and allows us to investigate three primary performance components, specifically quantum noise, detector blurring and additive system noise. A parallel random number generator based on the Weyl sequence was implemented in the noise simulation and a visualization technique was accordingly developed to validate the quality of the parallel random number generator. In our computer simulation model, three-dimensional (3D) phantoms were mathematically modelled and used to create 450 analytical projections, which were then sampled into digital image data. Quantum noise was simulated and added to the analytical projection image data, which were then filtered to incorporate flat panel detector blurring. Additive system noise was generated and added to form the final projection images. The Feldkamp algorithm was implemented and used to reconstruct the 3D images of the phantoms. A 24 dual-Xeon PC cluster was used to compute the projections and reconstructed images in parallel with each CPU processing 10 projection views for a total of 450 views. Based on this computer simulation system, simulated cone beam CT images were generated for various phantoms and technique settings. Noise power spectra for the flat panel x-ray detector and reconstructed images were then computed to characterize the noise properties. As an example among the potential applications of our noise simulation model, we showed that images of low contrast objects can be produced and used for image quality evaluation.

Tu, Shu-Ju; Shaw, Chris C.; Chen, Lingyun

2006-03-01
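The three noise components listed in the abstract chain together naturally under the mono-energetic approximation. A sketch for a single projection (the blur kernel, tube fluence N0, and additive-noise level are illustrative values, not the paper's calibrated ones):

```python
import numpy as np

def noisy_projection(line_integrals, n0, sigma_add, rng):
    """Simulate one noisy projection (mono-energetic approximation).

    Quantum noise: detected counts are Poisson with mean N0*exp(-p) per
    detector pixel; detector blurring: a small moving-average kernel
    stands in for the flat-panel blur; additive system noise: zero-mean
    Gaussian added after the blur, as in the abstract's ordering.
    """
    counts = rng.poisson(n0 * np.exp(-line_integrals)).astype(float)
    kernel = np.array([0.25, 0.5, 0.25])        # illustrative blur kernel
    blurred = np.convolve(counts, kernel, mode="same")
    return blurred + rng.normal(0.0, sigma_add, size=blurred.shape)

rng = np.random.default_rng(7)
p = np.full(4096, 1.0)                # uniform attenuation path length
proj = noisy_projection(p, n0=10000.0, sigma_add=5.0, rng=rng)
# Mean detected signal should sit near N0*exp(-1), about 3679 counts.
mean_signal = proj.mean()
```

Computing many such projections independently (here, 450 views spread over the PC cluster) is what makes the simulation embarrassingly parallel.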

195

Non-intrusive parallelization of multibody system dynamic simulations  

NASA Astrophysics Data System (ADS)

This paper evaluates two non-intrusive parallelization techniques for multibody system dynamics: parallel sparse linear equation solvers and OpenMP. Both techniques can be applied to existing simulation software with minimal changes in the code structure; this is a major advantage over Message Passing Interface, the standard parallelization method in multibody dynamics. Both techniques have been applied to parallelize a starting sequential implementation of a global index-3 augmented Lagrangian formulation combined with the trapezoidal rule as numerical integrator, in order to solve the forward dynamics of a variable-loop four-bar mechanism. Numerical experiments have been performed to measure the efficiency as a function of problem size and matrix filling. Results show that the best parallel solver (Pardiso) performs better than the best sequential solver (CHOLMOD) for multibody problems of large and medium sizes leading to matrix fillings above 10. OpenMP also proved to be advantageous even for problems of small sizes. Both techniques delivered speedups above 70% of the maximum theoretical values for a wide range of multibody problems.

González, Francisco; Luaces, Alberto; Lugrís, Urbano; González, Manuel

2009-09-01

196

Massively parallelized replica-exchange simulations of polymers on GPUs  

NASA Astrophysics Data System (ADS)

We discuss the advantages of parallelization by multithreading on graphics processing units (GPUs) for parallel tempering Monte Carlo computer simulations of an exemplified bead-spring model for homopolymers. Since the sampling of a large ensemble of conformations is a prerequisite for the precise estimation of statistical quantities such as typical indicators for conformational transitions like the peak structure of the specific heat, the advantage of a strong increase in performance of Monte Carlo simulations cannot be overestimated. Employing multithreading and utilizing the massive power of the large number of cores on GPUs, being available in modern but standard graphics cards, we find a rapid increase in efficiency when porting parts of the code from the central processing unit (CPU) to the GPU.

Gross, Jonathan; Janke, Wolfhard; Bachmann, Michael

2011-08-01
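The replica-exchange (parallel tempering) move this study accelerates on GPUs is a single detailed-balance test between neighboring temperatures. A minimal sketch (the energies and inverse temperatures are arbitrary illustrative values):

```python
import math, random

def attempt_swap(beta_i, beta_j, E_i, E_j, rng):
    """Parallel-tempering swap test between two replicas.

    Exchanging configurations at inverse temperatures beta_i and beta_j
    is accepted with probability min(1, exp((beta_i-beta_j)*(E_i-E_j))),
    which preserves detailed balance across the temperature ladder.
    """
    return rng.random() < min(1.0, math.exp((beta_i - beta_j) * (E_i - E_j)))

rng = random.Random(3)
# Energetically favorable exchange: acceptance probability is 1.
always = attempt_swap(1.0, 0.5, E_i=2.0, E_j=-1.0, rng=rng)
# Unfavorable exchange: accepted with probability exp(-1.5), about 0.223.
rate = sum(attempt_swap(1.0, 0.5, -1.0, 2.0, rng) for _ in range(20000)) / 20000
```

On a GPU, each replica's Monte Carlo sweep runs in its own thread block; only the scalar energies need to be exchanged for the swap test, which keeps communication negligible.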

197

Diffuse minor ions upstream of simulated quasi-parallel shocks  

Microsoft Academic Search

We have performed a number of one-dimensional hybrid (particle ions, massless fluid electrons) simulations of quasi-parallel collisionless shocks in order to elucidate the origin of diffuse upstream minor ions. Minor ions are treated self-consistently by the hybrid code; however, a number of runs have also been performed wherein the minor ions are treated as test particles. We have investigated the

K. J. Trattner; M. Scholer

1994-01-01

198

Simulation-Based Analysis of Parallel Runge-Kutta Solvers  

Microsoft Academic Search

We use simulation-based analysis to compare and investigate different shared-memory implementations of parallel and sequential embedded Runge-Kutta solvers for systems of ordinary differential equations. The results of the analysis help to provide a better understanding of the locality and scalability behavior of the implementations and can be used as a starting point for further optimizations.

Matthias Korch; Thomas Rauber

2004-01-01

199

Parallel density matrix propagation in spin dynamics simulations.  

PubMed

Several methods for density matrix propagation in parallel computing environments are proposed and evaluated. It is demonstrated that the large communication overhead associated with each propagation step (two-sided multiplication of the density matrix by an exponential propagator and its conjugate) may be avoided and the simulation recast in a form that requires virtually no inter-thread communication. Good scaling is demonstrated on a 128-core (16 nodes, 8 cores each) cluster. PMID:22299862

Edwards, Luke J; Kuprov, Ilya

2012-01-28
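The "two-sided multiplication" the abstract refers to is the step ρ → PρP† with P = exp(-iHΔt). A toy sketch of that baseline operation (the paper's communication-free reformulation is not reproduced here; the two-level Hamiltonian is illustrative, and the propagator is built by eigendecomposition rather than a matrix-exponential routine):

```python
import numpy as np

def step(rho, H, dt):
    """One two-sided propagation step: rho -> P rho P^dagger.

    P = exp(-i*H*dt) is built from the eigendecomposition of the
    Hermitian Hamiltonian H. In a distributed setting this double
    matrix multiplication is the communication-heavy operation the
    paper recasts to avoid inter-thread data exchange.
    """
    w, V = np.linalg.eigh(H)
    P = V @ np.diag(np.exp(-1j * w * dt)) @ V.conj().T
    return P @ rho @ P.conj().T

# Toy two-level system, initially in the pure state |0><0|.
H = np.array([[1.0, 0.5], [0.5, -1.0]])
rho = np.array([[1.0, 0.0], [0.0, 0.0]], dtype=complex)
for _ in range(50):
    rho = step(rho, H, dt=0.1)
trace = rho.trace().real    # unitary evolution preserves the trace
```

Trace and Hermiticity preservation make convenient sanity checks for any propagation scheme, serial or parallel.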

200

Massively parallel simulations of diffusion in dense polymeric structures  

Microsoft Academic Search

An original computational technique to generate close-to-equilibrium dense polymeric structures is proposed. Diffusion of small gases is studied on the equilibrated structures using massively parallel molecular dynamics simulations running on the Intel Teraflops (9216 Pentium Pro processors) and the Intel Paragon (1840 processors). Compared to the current state-of-the-art equilibration methods, this new technique appears to be faster by some orders of

Jean-Loup Faulon; J. David Hobbs; David M. Ford; Robert T. Wilcox

1997-01-01

201

Massively Parallel Simulations of Diffusion in Dense Polymeric Structures  

Microsoft Academic Search

An original technique to generate close-to-equilibrium dense polymeric structures is proposed. Diffusion of small gases is studied on the equilibrated structures using massively parallel molecular dynamics simulations running on the Intel Teraflops (9200 Pentium Pro processors) and Intel Paragon (1840 processors). Compared to the current state-of-the-art equilibration methods the new technique is faster by some orders of magnitude. The main

Jean-Loup Faulon; J. David Hobbs; D. M. Ford; R. T. Wilcox

1997-01-01

202

Parallel density matrix propagation in spin dynamics simulations  

NASA Astrophysics Data System (ADS)

Several methods for density matrix propagation in parallel computing environments are proposed and evaluated. It is demonstrated that the large communication overhead associated with each propagation step (two-sided multiplication of the density matrix by an exponential propagator and its conjugate) may be avoided and the simulation recast in a form that requires virtually no inter-thread communication. Good scaling is demonstrated on a 128-core (16 nodes, 8 cores each) cluster.

Edwards, Luke J.; Kuprov, Ilya

2012-01-01

203

Time parallelization of advanced operation scenario simulations of ITER plasma  

SciTech Connect

This work demonstrates that simulations of advanced burning plasma operation scenarios can be successfully parallelized in time using the parareal algorithm. CORSICA, an advanced operation scenario code for tokamak plasmas, is used as a test case. This is a unique application, since the parareal algorithm has so far been applied to much simpler systems, except for the case of turbulence. In the present application, a computational gain of an order of magnitude has been achieved, which is extremely promising. A successful implementation of the parareal algorithm in codes like CORSICA ushers in the possibility of time-efficient simulations of ITER plasmas.

Samaddar, D. [ITER Organization, Saint Paul Lez Durance, France; Casper, T. A. [Lawrence Livermore National Laboratory (LLNL); Kim, S. H. [ITER Organization, Saint Paul Lez Durance, France; Berry, Lee A [ORNL; Elwasif, Wael R [ORNL; Batchelor, Donald B [ORNL; Houlberg, Wayne A [ORNL

2013-01-01
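The parareal algorithm alternates a cheap serial coarse propagator with fine propagations that can run concurrently across time slices. A minimal sketch on the scalar test problem dy/dt = λy (backward Euler as the coarse solver G, the exact exponential as the fine solver F; nothing CORSICA-specific is reproduced):

```python
import math

def parareal(y0, t_end, n_slices, k_iters, lam=-1.0):
    """Parareal iteration for dy/dt = lam*y (a minimal test problem).

    Update rule: U[n+1] <- G(U_new[n]) + F(U_old[n]) - G(U_old[n]).
    In a real code the F evaluations within each iteration run in
    parallel across the time slices, which is the source of the
    time-parallel speedup.
    """
    dt = t_end / n_slices
    G = lambda y: y / (1.0 - lam * dt)      # coarse: one backward-Euler step
    F = lambda y: y * math.exp(lam * dt)    # fine: exact propagator

    U = [y0]                                 # iteration 0: serial coarse sweep
    for _ in range(n_slices):
        U.append(G(U[-1]))
    for _ in range(k_iters):
        FU = [F(u) for u in U[:-1]]          # the parallelizable fine solves
        new = [y0]
        for n in range(n_slices):
            new.append(G(new[-1]) + FU[n] - G(U[n]))
        U = new
    return U[-1]

exact = math.exp(-1.0)                       # y(1) for y0 = 1, lam = -1
approx = parareal(1.0, 1.0, n_slices=10, k_iters=5)
```

After a handful of iterations the parareal solution converges to the fine-solver accuracy while most of the work is concurrent.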

204

Discrete event versus continuous approach to reproduction in structured population dynamics.  

PubMed

The governing equations are derived for the dynamics of a population consisting of organisms which reproduce by laying one egg at a time, on the basis of a simple physiological model for the uptake and use of energy. Two life stages are assumed, the egg and the adult stage, where the adults do not grow. These assumptions hold true, for instance, for rotifers. From the model for the life history of the individuals, a physiologically structured population model for a rotifer population is derived. On the basis of this discrete event reproduction population model a continuous reproduction population model is proposed. The population model together with the equation for the food result in chemostat equations which are solved numerically. We show that for the calculation of the transient population dynamic behaviour after a step-wise change of the dilution rate, an age structure suffices, despite the size and energy structure used to describe the dynamics of the individuals. Aggregation of the continuous reproduction population model yields an approximate lumped parameter model in terms of delay differential equations. In order to assess the performance of the models, experimental data from the literature are fitted. The main purpose of this paper is to discuss the consequences of discrete event versus continuous reproduction. In both population models death by starvation is taken into account. Unlike the continuous reproduction model, the discrete model captures the experimentally observed lack of egg production shortly after the step change in the dilution rate of the chemostat. PMID:10438671

Kooi, B W; Kooijman, S A

1999-08-01

205

An automated parallel simulation execution and analysis approach  

NASA Astrophysics Data System (ADS)

State-of-the-art simulation computing requirements are continually approaching and then exceeding the performance capabilities of existing computers. This trend remains true even with huge yearly gains in processing power and general computing capabilities; simulation scope and fidelity often increase as well. Accordingly, simulation studies often expend days or weeks executing a single test case. Compounding the problem, stochastic models often require execution of each test case with multiple random number seeds to provide valid results. Many techniques have been developed to improve the performance of simulations without sacrificing model fidelity: optimistic simulation, distributed simulation, parallel multi-processing, and the use of supercomputers such as Beowulf clusters. An approach and prototype toolset has been developed that augments existing optimization techniques to improve multiple-execution timelines. This approach, similar in concept to the SETI@home experiment, makes maximum use of unused licenses and computers, which can be geographically distributed. Using a publish/subscribe architecture, simulation executions are dispatched to distributed machines for execution. Simulation results are then processed, collated, and transferred to a single site for analysis.

Dallaire, Joel D.; Green, David M.; Reaper, Jerome H.

2004-08-01

206

An Approach to ease the redevelopment of a parallel simulation package  

Microsoft Academic Search

An important obstacle to an industrial breakthrough of parallel computing is the complexity of parallel programming and of porting existing simulation packages to parallel computers. In the current paper a rigorous approach is worked out which simplifies the port of a package and parallel programming in general. The approach offers flexibility in modifying or extending a simulation package by identifying stages or

H. H. Ten Cate; Edwin A. H. Vollebregt; Mark R. Roest; Hai-xiang Lin

1996-01-01

207

Massively Parallel Processing for Fast and Accurate Stamping Simulations  

NASA Astrophysics Data System (ADS)

The competitive automotive market drives automotive manufacturers to speed up vehicle development cycles and reduce lead-time. Fast tooling development is one of the key areas supporting fast and short vehicle development programs (VDP). In the past ten years, stamping simulation has become the most effective validation tool for predicting and resolving all potential formability and quality problems before the dies are physically made. Stamping simulation and formability analysis have become a critical business segment in the GM math-based die engineering process. As simulation becomes one of the major production tools in the engineering factory, simulation speed and accuracy are two of the most important measures of stamping simulation technology. The speed and time-in-system of forming analysis become even more critical to support fast VDP and tooling readiness. Since 1997, General Motors Die Center has been working jointly with our software vendor to develop and implement a parallel version of simulation software for mass production analysis applications. By 2001, this technology had matured in the form of distributed memory processing (DMP) of draw die simulations in a networked distributed-memory computing environment. In 2004, this technology was refined to massively parallel processing (MPP) and extended to line die forming analysis (draw, trim, flange, and associated spring-back) running on a dedicated computing environment. The evolution of this technology and the insight gained through the implementation of DMP/MPP technology, as well as performance benchmarks, are discussed in this publication.

Gress, Jeffrey J.; Xu, Siguang; Joshi, Ramesh; Wang, Chuan-Tao; Paul, Sabu

2005-08-01

208

Numerical techniques for parallel dynamics in electromagnetic gyrokinetic Vlasov simulations  

NASA Astrophysics Data System (ADS)

Numerical techniques for parallel dynamics in electromagnetic gyrokinetic simulations are introduced to regulate unphysical grid-size oscillations in the field-aligned coordinate. It is found that a fixed boundary condition and the nonlinear mode coupling in the field-aligned coordinate, as well as numerical errors of non-dissipative finite difference methods, produce fluctuations with high parallel wave numbers. The theoretical and numerical analyses demonstrate that an outflow boundary condition and a low-pass filter efficiently remove the numerical oscillations, providing small but acceptable errors of the entropy variables. The new method is advantageous for quantitative evaluation of the entropy balance that is required for obtaining a steady state in gyrokinetic turbulence.

Maeyama, S.; Ishizawa, A.; Watanabe, T.-H.; Nakajima, N.; Tsuji-Iio, S.; Tsutsui, H.

2013-11-01
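The low-pass filter used above to suppress high parallel-wavenumber oscillations can be illustrated as a sharp spectral cutoff along the field-aligned coordinate (the cutoff fraction and the test signal are illustrative choices, not the paper's):

```python
import numpy as np

def lowpass(f, keep_fraction=0.5):
    """Spectral low-pass filter along the field-aligned coordinate.

    Fourier modes above keep_fraction of the resolved wavenumber range
    are zeroed, suppressing grid-size oscillations while leaving the
    well-resolved parallel wavenumbers untouched.
    """
    fk = np.fft.rfft(f)
    cut = int(keep_fraction * len(fk))
    fk[cut:] = 0.0
    return np.fft.irfft(fk, n=len(f))

z = np.linspace(0, 2 * np.pi, 64, endpoint=False)
signal = np.sin(2 * z)                    # well-resolved parallel mode
noisy = signal + 0.3 * np.cos(31 * z)     # near-grid-size oscillation
clean = lowpass(noisy, keep_fraction=0.5)
```

The near-Nyquist mode is removed exactly while the physical low-wavenumber mode passes through unchanged, mirroring the small-but-acceptable entropy errors reported.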

209

Parallel grid library for rapid and flexible simulation development  

NASA Astrophysics Data System (ADS)

As the single CPU core performance is saturating while the number of cores in the fastest supercomputers increases exponentially, the parallel performance of simulations on distributed memory machines is crucial. At the same time, utilizing efficiently the large number of available cores presents a challenge, especially in simulations with run-time adaptive mesh refinement which can be the key to high performance. We have developed a generic grid library (dccrg) that is easy to use and scales well up to tens of thousands of cores. The grid has several attractive features: It 1) allows an arbitrary C++ class or structure to be used as cell data; 2) is easy to use and provides a simple interface for run-time adaptive mesh refinement; 3) transfers the data of neighboring cells between processes transparently and asynchronously; and 4) provides a simple interface to run-time load balancing, e.g. domain decomposition, through the Zoltan library. Dccrg is freely available from https://gitorious.org/dccrg for anyone to use, study and modify under the GNU Lesser General Public License version 3. We present an overview of the implementation of dccrg, its parallel scalability and several source code examples of its usage in different types of simulations.

Honkonen, Ilja; von Alfthan, Sebastian; Sandroos, Arto; Janhunen, Pekka; Palmroth, Minna

2013-04-01

210

Simulation of hypervelocity impact on massively parallel supercomputer  

SciTech Connect

Hypervelocity impact studies are important for debris shield and armor/anti-armor research and development. Numerical simulations are frequently performed to complement experimental studies, and to evaluate code accuracy. Parametric computational studies involving material properties, geometry and impact velocity can be used to understand hypervelocity impact processes. These impact simulations normally need to address shock wave physics phenomena, material deformation and failure, and motion of debris particles. Detailed, three-dimensional calculations of such events have large memory and processing time requirements. At Sandia National Laboratories, many impact problems of interest require tens of millions of computational cells. Furthermore, even the inadequately resolved problems often require tens or hundreds of Cray CPU hours to complete. Recent numerical studies done by Grady and Kipp at Sandia using the Eulerian shock wave physics code CTH demonstrated very good agreement with many features of a copper sphere-on-steel plate oblique impact experiment, fully utilizing the compute power and memory of Sandia's Cray supercomputer. To satisfy requirements for more finely resolved simulations in order to obtain a better understanding of the crater formation process and impact ejecta motion, the numerical work has been moved from the shared-memory Cray to a large, distributed-memory, massively parallel supercomputing system using PCTH, a parallel version of CTH. The current work is a continuation of the studies, but done on Sandia's Intel 1840-processor Paragon X/PS parallel computer. With the great compute power and large memory provided by the Paragon, a highly detailed PCTH calculation has been completed for the copper sphere impacting steel plate experiment. Although the PCTH calculation used a mesh which is 4.5 times bigger than the original Cray setup, it finished in much less CPU time.

Fang, H.E.

1994-12-31

211

A new parallel environment for interactive simulations implementing safe multithreading with MPI  

Microsoft Academic Search

This work presents a new parallel environment for interactive simulations. This environment integrates an MPI-based parallel simulation engine, a visualization module, and a user interface that supports modification of simulation parameters and visualization at runtime. This requires multiple threads: one to execute the simulation or the visualization, and another to receive user input. Since many MPI implementations are not thread-safe,
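The thread-safety issue raised in this abstract is commonly handled by funneling all communication through a single dedicated thread. A minimal sketch of that pattern, using Python's standard threading tools and a hypothetical `mpi_send_stub` standing in for a real MPI send (no actual MPI library is used here):

```python
import threading
import queue

# Stand-in for an MPI send; with a real, non-thread-safe MPI build,
# only the communication thread below would ever call into MPI.
sent = []
def mpi_send_stub(msg):
    sent.append(msg)

def comm_thread(requests):
    """Single thread that owns all communication calls (the
    'funneled' pattern required when MPI is not thread-safe)."""
    while True:
        msg = requests.get()
        if msg is None:          # sentinel: shut down
            break
        mpi_send_stub(msg)

requests = queue.Queue()
t = threading.Thread(target=comm_thread, args=(requests,))
t.start()

# Any thread (e.g. a GUI thread reacting to user input) may enqueue
# parameter changes; none of them touches MPI directly.
for update in ("dt=0.01", "viscosity=1e-3"):
    requests.put(update)
requests.put(None)
t.join()
print(sent)                      # ['dt=0.01', 'viscosity=1e-3']
```

The FIFO queue preserves the order of user requests, so the simulation engine sees parameter changes in the order they were issued.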

Eduardo Rocha Rodrigues; Airam Jonatas Preto; Stephan Stephany

2005-01-01

212

Integrating System Performance Engineering into MASCOT Methodology through Discrete-Event Simulation  

Microsoft Academic Search

Software design methodologies are incorporating non-functional features into system design descriptions. MASCOT, which has been a traditional design methodology for European defence companies, has no performance extension. In this paper we present a set of performance annotations to the MASCOT methodology, called MASCOTime. These annotations extend the MDL (MASCOT Description Language) design components transparently. Thus, in order to evaluate the

Pere P. Sancho; Carlos Juiz; Ramón Puigjaner

2004-01-01

213

Systems Operations Studies for Automated Guideway Transit Systems - Discrete Event Simulation Model User's Manual.  

National Technical Information Service (NTIS)

In order to examine specific Automated Guideway Transit (AGT) developments and concepts, and to build a better knowledge base for future decision-making, the Urban Mass Transportation Administration (UMTA) undertook a new program of studies and technology...

J. F. Duke; R. Blanchard

1982-01-01

214

Discrete Event Simulation Model of the Ground Maintenance Operations Cycle of a Reusable Launch Vehicle.  

National Technical Information Service (NTIS)

The Air Force uses a family of expendable launch vehicles to meet its spacelift needs. Unfortunately, this method is not responsive: months of preparation are typically required and launch costs are high. Consequently, the Air Force seeks a reusable milit...

J. T. Pope

2006-01-01

215

System Operations Studies for Automated Guideway Transit Systems: Discrete Event Simulation Model Programmer's Manual.  

National Technical Information Service (NTIS)

In order to examine specific automated guideway transit (AGT) developments and concepts, UMTA undertook a program of studies and technology investigations called the Automated Guideway Transit Technology (AGTT) Program. The objectives of one segment of the AG...

J. F. Duke; R. Blanchard

1982-01-01

216

Discrete Event Simulation Model for Evaluating Air Force Reusable Military Launch Vehicle Prelaunch Operations.  

National Technical Information Service (NTIS)

As the control and exploitation of space becomes more important to the United States military, a responsive spacelift capability will become essential. Responsive spacelift could be defined as the ability to launch a vehicle within hours or days from the ...

A. T. Stiegelmeier

2006-01-01

217

User Guide and Specification for Discrete-Event Minehunting Simulation Model MHUNT.  

National Technical Information Service (NTIS)

Minehunting is a complex process involving detection and classification of contacts using sonar and the subsequent identification and disposal of mines generally using a remotely operated underwater vehicle (ROV). However, making suitable assumptions, min...

R. B. Watson; P. J. Ryan; B. Gilmartin

1993-01-01

218

Analyzing Discrete Event Simulation Models of Complex Manufacturing Systems: A Computational Complexity Approach.  

National Technical Information Service (NTIS)

Discrete manufacturing process design optimization is a challenging problem for the Air Force, due to the large number of manufacturing process design sequences that exist for a given part. This has forced researchers to develop heuristic strategies to ad...

S. H. Jacobson

1998-01-01

219

Oscillator Model for High-Precision Synchronization Protocol Discrete Event Simulation.  

National Technical Information Service (NTIS)

It is well known that a common notion of time in distributed systems can be used to ensure additional properties such as real-time behavior or the identification of the order of events. As large-scale hardware testbeds for such systems are neither efficie...

A. Nagy; G. Gaderer; J. Mad; P. Loschmidt; R. Beigelbeck

2007-01-01

220

Massively Parallel Simulations of Diffusion in Dense Polymeric Structures  

SciTech Connect

An original computational technique to generate close-to-equilibrium dense polymeric structures is proposed. Diffusion of small gases is studied on the equilibrated structures using massively parallel molecular dynamics simulations running on the Intel Teraflops (9216 Pentium Pro processors) and Intel Paragon (1840 processors). Compared to the current state-of-the-art equilibration methods, this new technique appears to be faster by some orders of magnitude. The main advantage of the technique is that one can circumvent the bottlenecks in configuration space that inhibit relaxation in molecular dynamics simulations. The technique is based on the fact that tetravalent atoms (such as carbon and silicon) fit in the center of a regular tetrahedron and that regular tetrahedrons can be used to mesh three-dimensional space. Thus, the problem of polymer equilibration described by continuous equations in molecular dynamics is reduced to a discrete problem where solutions are approximated by simple algorithms. Practical modeling applications include the construction of butyl rubber and ethylene-propylene-diene-monomer (EPDM) models for oxygen and water diffusion calculations. Butyl and EPDM are used in O-ring systems and serve as sealing joints in many manufactured objects. Diffusion coefficients of small gases have been measured experimentally on both polymeric systems, and in general the diffusion coefficients in EPDM are an order of magnitude larger than in butyl. In order to better understand the diffusion phenomena, 10,000-atom models were generated and equilibrated for butyl and EPDM. The models were submitted to a massively parallel molecular dynamics simulation to monitor the trajectories of the diffusing species.

Faulon, Jean-Loup; Wilcox, R.T. [Sandia National Labs., Albuquerque, NM (United States)]; Hobbs, J.D. [Montana Tech of the Univ. of Montana, Butte, MT (United States). Dept. of Chemistry and Geochemistry]; Ford, D.M. [Texas A and M Univ., College Station, TX (United States). Dept. of Chemical Engineering]

1997-11-01

221

Parallel finite element simulation of large ram-air parachutes  

NASA Astrophysics Data System (ADS)

In the near future, large ram-air parachutes are expected to provide the capability of delivering 21-ton payloads from altitudes as high as 25,000 ft. In the development, test, and evaluation of these parachutes, the size of the parachute needed and the deployment stages involved make high-performance computing (HPC) simulations a desirable alternative to costly airdrop tests. Although computational simulations based on realistic, 3D, time-dependent models will continue to be a major computational challenge, advanced finite element simulation techniques recently developed for this purpose, and the execution of these techniques on HPC platforms, are significant steps toward meeting this challenge. In this paper, two approaches for analysis of the inflation and gliding of ram-air parachutes are presented. In one approach, the point-mass flight mechanics equations are solved with the time-varying drag and lift areas obtained from empirical data. This approach is limited to parachutes with configurations similar to those for which data are available. The other approach is 3D finite element computation based on the Navier-Stokes equations governing the airflow around the parachute canopy and Newton's law of motion governing the 3D dynamics of the canopy, with the forces acting on the canopy calculated from the simulated flow field. At the earlier stages of canopy inflation the parachute is modelled as an expanding box, whereas at the later stages, as it expands, the box transforms to a parafoil and glides. These finite element computations are carried out on the massively parallel supercomputers CRAY T3D and Thinking Machines CM-5, typically with millions of coupled, non-linear finite element equations solved simultaneously at every time step or pseudo-time step of the simulation.

Kalro, V.; Aliabadi, S.; Garrard, W.; Tezduyar, T.; Mittal, S.; Stein, K.

1997-06-01

222

Massively parallel molecular dynamics simulations with EAM potentials  

NASA Astrophysics Data System (ADS)

Molecular dynamics simulations of cascades in pure iron and iron-copper alloys using embedded-atom-method interatomic potentials are presented. Reliable simulations of radiation damage at the atomic scale with high-energy Primary Knock-on Atoms (PKA) need systems with large numbers of particles and very long computational times. To perform the simulations in a reasonable amount of time, high-performance computer systems such as massively parallel machines need to be used. This paper presents the parallelisation strategy applied to a serial classical molecular dynamics code: DYMOKA. The original sequential Fortran code CDCMD from the University of Connecticut was first improved algorithmically by applying a link-cell method to the construction of the Verlet neighbour list, resulting in a fully linear algorithm. The parallelisation strategy adopted is a multidimensional domain decomposition of the simulation box using a link-cell method and a Verlet list method for each subdomain independently. The program paradigm is based on explicit message passing, and the standard Message-Passing Interface (MPI) was chosen in order to achieve portability. First measurements have demonstrated that the simulation of a system of 2,000,000 (750,000) atoms on 128 (32) processors costs 2.5 (10) μs per atom per step. The current implementation has proven good scalability up to 32 processors on a NEC Cenju-3 machine. To study the effects of irradiation on copper segregation, simulations with up to 1,000,000 atoms in iron and iron-copper were performed with PKA energies up to 20 keV.
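The link-cell neighbour search mentioned in this abstract can be sketched as follows. This is a generic illustration of the technique, not the DYMOKA code; the function names are invented for the example:

```python
from collections import defaultdict

def dist2(a, b, box):
    """Squared minimum-image distance under periodic boundaries."""
    s = 0.0
    for x, y in zip(a, b):
        d = x - y
        d -= box * round(d / box)
        s += d * d
    return s

def build_neighbour_list(positions, box, rcut):
    """Link-cell construction of a Verlet-style neighbour list:
    bin particles into cells of side >= rcut, then compare each
    particle only against the 27 surrounding cells. This gives the
    fully linear O(N) scaling the abstract refers to, instead of
    the O(N^2) all-pairs loop."""
    ncell = max(1, int(box // rcut))
    side = box / ncell
    cells = defaultdict(list)
    for i, p in enumerate(positions):
        cells[tuple(int(c // side) % ncell for c in p)].append(i)
    pairs = set()
    for (cx, cy, cz), members in cells.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    other = ((cx + dx) % ncell, (cy + dy) % ncell, (cz + dz) % ncell)
                    for i in members:
                        for j in cells.get(other, ()):
                            if i < j and dist2(positions[i], positions[j], box) < rcut * rcut:
                                pairs.add((i, j))
    return pairs

# Three atoms near the origin (one across the periodic boundary), one far away.
atoms = [(0.1, 0.1, 0.1), (0.4, 0.1, 0.1), (9.9, 0.1, 0.1), (5.0, 5.0, 5.0)]
print(sorted(build_neighbour_list(atoms, box=10.0, rcut=1.0)))
# [(0, 1), (0, 2), (1, 2)]
```

Pair (0, 2) is found only because the minimum-image convention wraps the 9.9 coordinate across the periodic box, which is why the cell indices are taken modulo `ncell`.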

Becquart, C. S.; Decker, K. M.; Domain, C.; Ruste, J.; Souffez, Y.; Turbatte, J. C.; van Duysen, J. C.

223

Ion dynamics at supercritical quasi-parallel shocks: Hybrid simulations  

SciTech Connect

By separating the incident ions into directly transmitted, downstream thermalized, and diffuse ions, we perform one-dimensional (1D) hybrid simulations to investigate ion dynamics at a supercritical quasi-parallel shock. In the simulations, the angle between the upstream magnetic field and shock nominal direction is θBn = 30°, and the Alfven Mach number is MA ≈ 5.5. The shock exhibits a periodic reformation process. The ion reflection occurs at the beginning of the reformation cycle. Part of the reflected ions is trapped between the old and new shock fronts for an extended time period. These particles eventually form superthermal diffuse ions after they escape to the upstream of the new shock front at the end of the reformation cycle. The other reflected ions may return to the shock immediately or be trapped between the old and new shock fronts for a short time period. When the amplitude of the new shock front exceeds that of the old shock front and the reformation cycle is finished, these ions become thermalized ions in the downstream. No noticeable heating can be found in the directly transmitted ions. The relevance of our simulations to the satellite observations is also discussed in the paper.

Su Yanqing; Lu Quanming; Gao Xinliang; Huang Can; Wang Shui [CAS Key Laboratory of Basic Plasma Physics, Department of Geophysics and Planetary Science, University of Science and Technology of China, Hefei 230026 (China)

2012-09-15

224

Ion dynamics at supercritical quasi-parallel shocks: Hybrid simulations  

NASA Astrophysics Data System (ADS)

By separating the incident ions into directly transmitted, downstream thermalized, and diffuse ions, we perform one-dimensional (1D) hybrid simulations to investigate ion dynamics at a supercritical quasi-parallel shock. In the simulations, the angle between the upstream magnetic field and shock nominal direction is θBn = 30°, and the Alfven Mach number is MA ≈ 5.5. The shock exhibits a periodic reformation process. The ion reflection occurs at the beginning of the reformation cycle. Part of the reflected ions is trapped between the old and new shock fronts for an extended time period. These particles eventually form superthermal diffuse ions after they escape to the upstream of the new shock front at the end of the reformation cycle. The other reflected ions may return to the shock immediately or be trapped between the old and new shock fronts for a short time period. When the amplitude of the new shock front exceeds that of the old shock front and the reformation cycle is finished, these ions become thermalized ions in the downstream. No noticeable heating can be found in the directly transmitted ions. The relevance of our simulations to the satellite observations is also discussed in the paper.

Su, Yanqing; Lu, Quanming; Gao, Xinliang; Huang, Can; Wang, Shui

2012-09-01

225

Parallel computation for reservoir thermal simulation of multicomponent and multiphase fluid flow  

Microsoft Academic Search

We consider parallel computing technology for the thermal simulation of multicomponent, multiphase fluid flow in petroleum reservoirs. This paper reports the development and applications of a parallel thermal recovery simulation code. This code utilizes the message passing interface (MPI) library, overlapping domain decomposition, and dynamic memory allocation techniques. Its efficiency is investigated through simulation of two three-dimensional multicomponent, multiphase field

Yuanle Ma; Zhangxin Chen

2004-01-01

226

The Investigation of Lamarckian Inheritance with Classifier Systems in a Massively Parallel Simulation Environment  

Microsoft Academic Search

In contrast to simulators for ecological processes as they are designed and implemented today, the ParalLife system which is introduced in this article can make use of problem inherent parallelism to speed up the simulation process. We describe how the simulated environment can be distributed over massively parallel computer architectures and what can be gained by doing so. We then

Eckhard Bartscht; Jens Engel; Christian Müller-Schloer

1995-01-01

227

Wisconsin Wind Tunnel II: A Fast and Portable Parallel Architecture Simulator  

Microsoft Academic Search

The design of future parallel computers requires rapid simulation of target designs running realistic workloads. These simulations have been accelerated using two techniques: direct execution and the use of a parallel host. Historically, these techniques have been considered to have poor portability. This paper identi- fies and describes the implementation of four key oper- ations necessary to make such simulation

Shubhendu S. Mukherjee; Steven K. Reinhardt; Babak Falsafi; Mike Litzkow; Steve Huss-Lederman; Mark D. Hill; James R. Larus; David A. Wood

1997-01-01

228

Full Particle Simulations of Quasi-parallel Shock  

NASA Astrophysics Data System (ADS)

The dynamics of a quasi-parallel shock propagating in the supercritical regime are analyzed with the help of full particle simulations. In contrast with previous hybrid simulations, access to very small time and spatial scales (in particular for electrons) is allowed. Self-reformation of the shock front and associated features are fully retrieved and are found to be in good agreement with previous work of Scholer (1993): (i) emission of upstream long-wavelength waves (excited by ion beam-plasma interaction), (ii) local ion reflection (with temporary trapping vortex), (iii) shrinking (with associated wave steepening) of the upstream waves' spatial scales when penetrating progressively within the downstream region, (iv) emission of whistlers from the steepened waves, and (v) relatively stable wave patterns far within the downstream region. Each wave structure is analysed in detail in order to determine its local interaction with both ions and electrons, and to propose a global scenario of the quasi-parallel shock dynamics. Results are compared with other previous hybrid and full particle simulations.

Tsubouchi, K.; Lembege, B.

229

Parallel grid library for rapid and flexible simulation development  

NASA Astrophysics Data System (ADS)

We present an easy to use and flexible grid library for developing highly scalable parallel simulations. The distributed cartesian cell-refinable grid (dccrg) supports adaptive mesh refinement and allows an arbitrary C++ class to be used as cell data. The amount of data in grid cells can vary both in space and time allowing dccrg to be used in very different types of simulations, for example in fluid and particle codes. Dccrg transfers the data between neighboring cells on different processes transparently and asynchronously allowing one to overlap computation and communication. This enables excellent scalability at least up to 32 k cores in magnetohydrodynamic tests depending on the problem and hardware. In the version of dccrg presented here part of the mesh metadata is replicated between MPI processes reducing the scalability of adaptive mesh refinement (AMR) to between 200 and 600 processes. Dccrg is free software that anyone can use, study and modify and is available at https://gitorious.org/dccrg. Users are also kindly requested to cite this work when publishing results obtained with dccrg.

Honkonen, I.; von Alfthan, S.; Sandroos, A.; Janhunen, P.; Palmroth, M.

2013-04-01

230

The Parallel Waveform IQMR Algorithm for Transient Simulation of Semiconductor Devices  

Microsoft Academic Search

We mainly study the parallelization aspects of the accelerated waveform relaxation algorithms for the transient simulation of semiconductor devices on parallel distributed memory computers since these methods are competitive with standard pointwise methods on serial machines, but are significantly faster on parallel computers. We propose an improved version of the quasi-minimal residual (IQMR) method by using the Lanczos process as

Laurence Tianruo Yang

2000-01-01

231

Concatenation Algorithms for Parallel Numerical Simulation of Radiation Hydrodynamics coupled with Neutron Transport  

Microsoft Academic Search

Complex physical phenomena can be usually split into several interacting physical computational models and can be numerically simulated by coupling parallel codes individually designed for these models. Besides rational splitting and efficient numerical methods for different models, we must design scalable parallel algorithms to concatenate these parallel codes. Meanwhile, three objectives should be well balanced. The first is how to

Mo Zeyao

2005-01-01

232

Partitioning and packing mathematical simulation models for calculation on parallel computers  

Microsoft Academic Search

The development of multiprocessor simulations from a serial set of ordinary differential equations describing a physical system is described. Degrees of parallelism (i.e., coupling between the equations) and their impact on parallel processing are discussed. The problem of identifying computational parallelism within sets of closely coupled equations that require the exchange of current values of variables is described. A technique

D. J. Arpasi; E. J. Milner

1986-01-01

233

Parallel Monte Carlo Driver (PMCD)—a software package for Monte Carlo simulations in parallel  

Microsoft Academic Search

Thanks to the dramatic decrease in computer costs, the no less dramatic increase in those same computers' capabilities, and the availability of free software and libraries that allow the setup of small parallel computing installations, the scientific community is now in a position where parallel computation is within easy reach even to moderately budgeted

B. Mendes; A. Pereira

2003-01-01

234

Discrete-event-based planning and control of telerobotic part-mating process with communication delay and geometric uncertainty  

Microsoft Academic Search

This paper presents a new planning/control method to integrate the master and slave site in a teleprogramming system in the framework of a discrete-event dynamic system model. Specifically, a simple telerobotic part-mating task environment is modelled based on a class of controlled Petri net (CPN) by associating contact states between objects with state places, and the model is used as

Young-Jo Cho; Tetsuo Kotoku; Kazuo Tanie

1995-01-01

235

Parallel climate model (PCM) control and transient simulations  

NASA Astrophysics Data System (ADS)

The Department of Energy (DOE)-supported Parallel Climate Model (PCM) makes use of the NCAR Community Climate Model (CCM3) and Land Surface Model (LSM) for the atmospheric and land surface components, respectively, the DOE Los Alamos National Laboratory Parallel Ocean Program (POP) for the ocean component, and the Naval Postgraduate School sea-ice model. The PCM executes on several distributed- and shared-memory computer systems. The coupling method is similar to that used in the NCAR Climate System Model (CSM) in that a flux coupler ties the components together, with interpolations between the different grids of the component models. Flux adjustments are not used in the PCM. The ocean component has 2/3° average horizontal grid spacing with 32 vertical levels and a free surface that allows calculation of sea level changes. Near the equator, the grid spacing is approximately 1/2° in latitude to better capture the ocean equatorial dynamics. The North Pole is rotated over northern North America, thus producing resolution smaller than 2/3° in the North Atlantic, where the sinking part of the world conveyor circulation largely takes place. Because this ocean model component does not have a computational point at the North Pole, the Arctic Ocean circulation systems are more realistic and similar to those observed. The elastic-viscous-plastic sea ice model has a grid spacing of 27 km to represent small-scale features such as ice transport through the Canadian Archipelago and the East Greenland current region. Results from a 300-year present-day coupled climate control simulation are presented, as well as for a transient 1% per year compound CO2 increase experiment, which shows a global warming of 1.27°C for a 10-year average at the doubling point of CO2 and 2.89°C at the quadrupling point. There is a gradual warming beyond the doubling and quadrupling points with CO2 held constant. Globally averaged sea level rise at the time of CO2 doubling is approximately 7 cm, and at the time of quadrupling it is 23 cm. Some of the regional sea level changes are larger and reflect the adjustments in the temperature, salinity, internal ocean dynamics, surface heat flux, and wind stress on the ocean. A 0.5% per year CO2 increase experiment was also performed, showing a global warming of 1.5°C around the time of CO2 doubling and a warming pattern similar to that of the 1% per year CO2 increase experiment. El Niño and La Niña events in the tropical Pacific show approximately the observed frequency distribution and amplitude, which leads to near-observed levels of variability on interannual time scales.

Washington, W. M.; Weatherly, J. W.; Meehl, G. A.; Semtner, A. J., Jr.; Bettge, T. W.; Craig, A. P.; Strand, W. G., Jr.; Arblaster, J.; Wayland, V. B.; James, R.; Zhang, Y.

236

Towards Real CFD Simulations on Parallel Computers in the Aeronautic and Automotive Industry  

Microsoft Academic Search

We describe in this paper the solution of some real industrial cases with the parallel version of the N3S-MUSCL 3D CFD code. The parallelization has been achieved on parallel MIMD machines using the message-passing paradigm. Performance numbers on MPPs or on a cluster of workstations allow the use of such a parallel code for industrial simulations at reasonable cost. This

Alain Stoessel; Emmanuel Issman; Mark Loriot

1996-01-01

237

Contact-impact simulations on massively parallel SIMD supercomputers  

SciTech Connect

The implementation of explicit finite element methods with contact-impact on massively parallel SIMD computers is described. The basic parallel finite element algorithm employs an exchange process which minimizes interprocessor communication at the expense of redundant computations and storage. The contact-impact algorithm is based on the pinball method in which compatibility is enforced by preventing interpenetration on spheres embedded in elements adjacent to surfaces. The enhancements to the pinball algorithm include a parallel assembled surface normal algorithm and a parallel detection of interpenetrating pairs. Some timings with and without contact-impact are given.
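The pinball idea of enforcing compatibility by preventing sphere interpenetration can be illustrated with a brute-force overlap sweep. This is a generic sketch of the detection step only (the names are invented for the example, and this is not the parallel pair-detection algorithm of the paper):

```python
def interpenetrating_pairs(pinballs):
    """Pinball-style contact detection: each surface element carries
    an embedded sphere (centre, radius); two elements are flagged as
    in contact when their pinballs overlap. A brute-force O(n^2)
    sweep is shown; the paper's version parallelizes this search."""
    hits = []
    n = len(pinballs)
    for i in range(n):
        for j in range(i + 1, n):
            (cx, cy, cz, r), (dx, dy, dz, s) = pinballs[i], pinballs[j]
            gap2 = (cx - dx) ** 2 + (cy - dy) ** 2 + (cz - dz) ** 2
            if gap2 < (r + s) ** 2:      # centres closer than sum of radii
                hits.append((i, j))
    return hits

balls = [(0.0, 0.0, 0.0, 1.0),   # overlaps the next sphere
         (1.5, 0.0, 0.0, 1.0),
         (5.0, 0.0, 0.0, 1.0)]   # well separated
print(interpenetrating_pairs(balls))   # [(0, 1)]
```

Once a pair is flagged, a contact force or constraint would be applied to push the spheres apart; only the detection stage is sketched here.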

Plaskacz, E.J. (Argonne National Lab., IL (United States)); Belytschko, T.; Chiang, H.Y. (Northwestern Univ., Evanston, IL (United States))

1992-01-01

238

A parallel framework for multidisciplinary aerospace engineering simulations using unstructured meshes  

Microsoft Academic Search

High performance parallel computers offer the promise of sufficient computational power to enable the routine use of large scale simulations during the process of engineering design. With this in mind, and with particular reference to the aerospace industry, this paper describes developments that have been undertaken to provide parallel implementations of algorithms for simulation, mesh generation and visualization. Designers are

K. Morgan; N. P. Weatherill; O. Hassan; P. J. Brookes; R. Said; J. Jones

1999-01-01

239

Vectorized and Parallelized Algorithms for Multi-Million Particle Md-Simulation  

NASA Astrophysics Data System (ADS)

We present fully vectorized and parallelized algorithms for the molecular dynamics simulation of particle systems with short-range interactions. Speeds of a million particle updates per second are achieved with modern vector and parallel computers, thus making the simulation of million-particle systems possible. We present results for the Cray YMP and JAERI/NEC MONTE-4 vector computers and the Intel Paragon XP/S and Thinking Machines CM-5 parallel computers.

Form, Wolfgang; Ito, Nobuyasu; Kohring, Gregory A.

240

Parallel Vehicular Traffic Simulation using Reverse Computation-based Optimistic Execution  

SciTech Connect

Vehicular traffic simulations are useful in applications such as emergency management and homeland security planning tools. High speed of traffic simulations translates directly to speed of response and level of resilience in those applications. Here, a parallel traffic simulation approach is presented that is aimed at reducing the time for simulating emergency vehicular traffic scenarios. Three unique aspects of this effort are: (1) exploration of optimistic simulation applied to vehicular traffic simulation (2) addressing reverse computation challenges specific to optimistic vehicular traffic simulation (3) achieving absolute (as opposed to self-relative) speedup with a sequential speed equal to that of a fast, de facto standard sequential simulator for emergency traffic. The design and development of the parallel simulation system is presented, along with a performance study that demonstrates excellent sequential performance as well as parallel performance.
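Reverse computation, as used in this work, rolls back speculative events by running an exact inverse handler instead of restoring saved state copies. A toy sketch of the idea (the names are hypothetical, not the API of the simulator described):

```python
class Car:
    """Minimal state for a reverse-computation demonstration."""
    def __init__(self):
        self.position = 0
        self.moves = 0

def advance(car, step):
    """Forward event handler: move the vehicle."""
    car.position += step
    car.moves += 1

def reverse_advance(car, step):
    """Inverse event handler: undo the move exactly, with no
    state saving. An optimistic simulator calls this on rollback
    when a straggler event arrives out of timestamp order."""
    car.moves -= 1
    car.position -= step

car = Car()
advance(car, 3)
advance(car, 2)              # speculative event that must be undone
reverse_advance(car, 2)      # rollback via reverse computation
print(car.position, car.moves)   # 3 1
```

The appeal for traffic models is that most state changes (position and counter updates) are trivially invertible, so rollback costs almost nothing compared with checkpointing every vehicle.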

Yoginath, Srikanth B [ORNL; Perumalla, Kalyan S [ORNL

2008-01-01

241

Application of Distributed Workstation Environment to the Parallel Simulation of Mobile Networks  

Microsoft Academic Search

In this paper, a distributed workstation environment, Diworse, for parallel simulation is presented. Distribution of the simulation model in Diworse is based on the modification of the conservative Chandy-Misra algorithm with null message delivery and constant lookahead. Simulation experiments have been carried out in the Ethernet network with TCP/IP protocols. In these experiments, a GSM network application has been simulated
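The conservative scheme with null messages and constant lookahead can be caricatured in a few lines: a logical process (LP) may only execute events whose timestamps do not exceed the minimum bound promised by its input channels, and it raises its neighbours' bounds by sending null messages stamped lookahead time units into the future. A conceptual sketch (not the Diworse implementation):

```python
import heapq

LOOKAHEAD = 5  # constant lookahead, as in the modified algorithm

def run_lp(events, incoming_bounds):
    """Process an LP's pending events conservatively: only events
    with timestamps <= the minimum channel bound are safe, since no
    neighbour can later send anything earlier. Returns the events
    processed and the timestamp of the null message the LP would
    send downstream to keep its neighbours from blocking."""
    safe_until = min(incoming_bounds)
    heapq.heapify(events)
    processed = []
    while events and events[0] <= safe_until:
        processed.append(heapq.heappop(events))
    null_timestamp = safe_until + LOOKAHEAD
    return processed, null_timestamp

# Pending events at t=3, 7, 12; input channels promise nothing before t=8 and t=10.
processed, null_ts = run_lp([3, 7, 12], incoming_bounds=[8, 10])
print(processed, null_ts)   # [3, 7] 13
```

The event at t=12 must wait: a neighbour could still send an event at t=8. The null message at t=13 is what lets the downstream LPs make progress even when no real traffic is flowing, which is how the scheme avoids deadlock.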

Veikko Hara; Jarmo Harju; Jouni Ikonen; Jari Porras

1996-01-01

242

Parallel Simulation of a High-Speed Wormhole Routing Network  

Microsoft Academic Search

A flexible simulator has been developed to simulate a two-level metropolitan area network which uses wormhole routing. To accurately model the nature of wormhole routing, the simulator performs discrete-byte rather than discrete-packet simulation. Despite the increased computational workload that this implies, it has been possible to create a simulator with acceptable performance by writing it in Maisie, a parallel discrete-event simulation language. The simulator provides an accurate model of...

Rajive Bagrodia; Yu-an Chen; Mario Gerla; Bruce Kwan; Jay Martin; Prasasth Palnati; Simon Walton

1996-01-01

243

Parallel computing in enterprise modeling.  

SciTech Connect

This report presents the results of our efforts to apply high-performance computing to entity-based simulations with a multi-use plugin for parallel computing. We use the term 'entity-based simulation' to describe a class of simulation which includes both discrete event simulation and agent-based simulation. What simulations of this class share, and what differs from more traditional models, is that the result sought is emergent from a large number of contributing entities. Logistic, economic and social simulations are members of this class where things or people are organized or self-organize to produce a solution. Entity-based problems never have an a priori ergodic principle that would greatly simplify calculations. Because the results of entity-based simulations can only be realized at scale, scalable computing is de rigueur for large problems. Having said that, the absence of a spatial organizing principle makes the decomposition of the problem onto processors problematic. In addition, practitioners in this domain commonly use the Java programming language, which presents its own problems in a high-performance setting. The plugin we have developed, called the Parallel Particle Data Model, overcomes both of these obstacles and is now being used by two Sandia frameworks: the Decision Analysis Center and the Seldon social simulation facility. While the ability to engage U.S.-sized problems is now available to the Decision Analysis Center, this plugin is central to the success of Seldon. Because Seldon relies on computationally intensive cognitive sub-models, this work is necessary to achieve the scale necessary for realistic results. With the recent upheavals in the financial markets, and the inscrutability of terrorist activity, this simulation domain will likely need a capability with ever greater fidelity. High-performance computing will play an important part in enabling that greater fidelity.

Goldsby, Michael E.; Armstrong, Robert C.; Shneider, Max S.; Vanderveen, Keith; Ray, Jaideep; Heath, Zach; Allan, Benjamin A.

2008-08-01

244

An Efficient and Adaptive Mechanism for Parallel Simulation Replication  

Microsoft Academic Search

Simulation replication is a necessity for all stochastic simulations. Its efficient execution is particularly important when additional techniques are used on top, such as optimization or sensitivity analysis. One way to improve replication efficiency is to ensure that the best configuration of the simulation system is used for execution. A selection of the best configuration is possible when the number

Roland Ewald; Stefan Leye; Adelinde M. Uhrmacher

2009-01-01

245

Parallel computation approaches for flexible multibody dynamics simulations  

Microsoft Academic Search

Finite element based formulations for flexible multibody systems are becoming increasingly popular, and as the complexity of the configurations to be treated increases, so does the computational cost. It seems natural to investigate the applicability of parallel processing to this type of problem; domain decomposition techniques have been used extensively for this purpose. In this approach, the computational domain is

Olivier A. Bauchau

2010-01-01

246

PARALLEL SIMULATION OF TRAFFIC IN GENEVA USING CELLULAR AUTOMATA  

Microsoft Academic Search

Road traffic microsimulations based on the individual motion of all vehicles are now recognized as an important tool to describe, understand and manage road traffic. Cellular automata models are a very efficient way to implement car motion. This paper presents a detailed description of a parallel cellular automata traffic microsimulator. We discuss the data structure, domain decomposition, and provide a

ALEXANDRE DUPUIS; BASTIEN CHOPARD
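The car-motion rules used in such cellular automata traffic models can be sketched compactly. The following is an illustrative implementation of the classic Nagel-Schreckenberg update (accelerate, brake, random slowdown, move) on a ring road; the rule set and the parameters `vmax` and `p_slow` are generic assumptions, not details taken from this record.

```python
import random

def nasch_step(pos, vel, length, vmax=5, p_slow=0.3, rng=random):
    """One Nagel-Schreckenberg update on a circular road.

    pos: car positions in driving order (one car per cell);
    vel: matching speeds. Returns the updated (pos, vel).
    """
    n = len(pos)
    new_vel = []
    for i in range(n):
        gap = (pos[(i + 1) % n] - pos[i] - 1) % length  # empty cells ahead
        v = min(vel[i] + 1, vmax)   # 1. accelerate toward the speed limit
        v = min(v, gap)             # 2. brake to avoid the car ahead
        if v > 0 and rng.random() < p_slow:
            v -= 1                  # 3. random slowdown (driver behavior)
        new_vel.append(v)
    new_pos = [(p + v) % length for p, v in zip(pos, new_vel)]
    return new_pos, new_vel

# Five cars on a 30-cell ring, evolved for 100 steps.
rng = random.Random(7)
pos, vel = [0, 3, 7, 12, 20], [0, 0, 0, 0, 0]
for _ in range(100):
    pos, vel = nasch_step(pos, vel, length=30, rng=rng)
```

A parallel version along the lines described in the abstract would split the ring into contiguous segments by domain decomposition, with each processor exchanging its boundary cars with neighboring processors at every step.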

247

Three-dimensional shock wave physics simulations with MIMD PAGOSA on massively parallel computers  

SciTech Connect

The numerical modeling of penetrator-armor interactions for design studies requires rapid, detailed, three-dimensional simulation of complex interactions of exotic materials at high speeds and high rates of strain. To perform such simulations, we have developed a multiple-instruction, multiple-data (MIMD) version of the PAGOSA hydrocode. The code includes a variety of models for material strength, fracture, and the detonation of high explosives. We present a typical armor/antiarmor penetration simulation conducted with this code, and measurements of its performance. The scaled speedups for MIMD PAGOSA on the 1024-processor nCUBE 2 parallel computer, measured as the simulation size is increased with the number of processors, reveal that small grind times (computational time per cell per cycle) and parallel scaled efficiencies of 90% can be achieved for realistic problems. This simulation demonstrates that massively parallel hydrocodes can provide rapid, highly-detailed armor/antiarmor simulations.

Gardner, D.R.; Vaughan, C.T. (Sandia National Labs., Albuquerque, NM (United States)); Cline, D.D. (Texas Univ., Austin, TX (United States). Center for High Performance Computing)

1992-01-01

248

Three-dimensional shock wave physics simulations with MIMD PAGOSA on massively parallel computers  

SciTech Connect

The numerical modeling of penetrator-armor interactions for design studies requires rapid, detailed, three-dimensional simulation of complex interactions of exotic materials at high speeds and high rates of strain. To perform such simulations, we have developed a multiple-instruction, multiple-data (MIMD) version of the PAGOSA hydrocode. The code includes a variety of models for material strength, fracture, and the detonation of high explosives. We present a typical armor/antiarmor penetration simulation conducted with this code, and measurements of its performance. The scaled speedups for MIMD PAGOSA on the 1024-processor nCUBE 2 parallel computer, measured as the simulation size is increased with the number of processors, reveal that small grind times (computational time per cell per cycle) and parallel scaled efficiencies of 90% can be achieved for realistic problems. This simulation demonstrates that massively parallel hydrocodes can provide rapid, highly-detailed armor/antiarmor simulations.

Gardner, D.R.; Vaughan, C.T. [Sandia National Labs., Albuquerque, NM (United States); Cline, D.D. [Texas Univ., Austin, TX (United States). Center for High Performance Computing

1992-12-31

249

Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers.  

National Technical Information Service (NTIS)

This final report contains reports of research related to the tasks 'Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers' and 'Develop High-Performance Time-Domain Computational Electroma...

P. E. Morgan

2004-01-01

250

Comparative Evaluation of Nodal and Supernodal Parallel Sparse Matrix Factorization: Detailed Simulation Results.  

National Technical Information Service (NTIS)

In the paper the authors consider the problem of factoring a large sparse system of equations on a modestly parallel shared-memory multiprocessor with a non-trivial memory hierarchy. Using detailed multiprocessor simulation, the authors study the behavior...

E. Rothberg; A. Gupta

1990-01-01

251

CONSTRUCTING PARALLEL SIMULATION EXERCISES FOR ASSESSMENT CENTERS AND OTHER FORMS OF BEHAVIORAL ASSESSMENT  

Microsoft Academic Search

Assessment centers rely on multiple, carefully constructed behavioral simulation exercises to measure individuals on multiple performance dimensions. Although methods for establishing parallelism among alternate forms of paper-and-pencil tests have been well researched (i.e., to equate tests on difficulty such that the scores can be compared), little research has considered the why and how of parallel simulation exercises. This

BRADLEY J. BRUMMEL; DEBORAH E. RUPP; SETH M. SPAIN

2009-01-01

252

3D Visualization of Molecular Simulations in High-performance Parallel Computing Environments  

Microsoft Academic Search

This paper presents a novel tool for interactive 3D visualization and computational steering of molecular simulations and other computer simulation techniques such as computational fluid dynamics in parallel computing environments. The visualization system consists of three major components (data source, streaming server and viewer) which are distributed in intra/internet networks. A parallelized data extraction and visualization library, which generates 3D scenes, is

Karsten Meier; Christopher Holzknecht; Stephan Kabelac; Stephan Olbrich; Karsten Chmielewski

2004-01-01

253

PROTEUS: A High-Performance Parallel-Architecture Simulator  

Microsoft Academic Search

Proteus is a high-performance simulator for MIMD multiprocessors. It is fast, accurate, and flexible: it is one to two orders of magnitude faster than comparable simulators, it can reproduce results from real multiprocessors, and it is easily configured to simulate a wide range of architectures. Proteus provides a modular structure that simplifies customization and independent replacement of parts of the architecture. There are typically multiple

Eric A. Brewer; Chrysanthos N. Dellarocas; Adrian Colbrook; William E. Weihl

1992-01-01

254

Molecular dynamics simulations of covalent amorphous insulators on parallel computers  

Microsoft Academic Search

Algorithms are designed to implement molecular dynamics (MD) simulations on emerging concurrent architectures. A highly efficient multiresolution algorithm is designed to carry out large-scale MD simulations for systems with long-range Coulomb and three-body covalent interactions. Large-scale MD simulations of amorphous silica are carried out on systems containing up to 41 472 particles. The intermediate-range order represented by the first sharp

Priya Vashishta; Aiichiro Nakano; Rajiv K. Kalia; Ingvar Ebbsjö

1995-01-01

255

Compressible Flow Simulations on a Massively Parallel Computer  

NASA Astrophysics Data System (ADS)

This paper describes model development and computations of multidimensional, highly compressible, time-dependent reacting flows on a Connection Machine (CM). We briefly discuss computational timings compared to a Cray YMP, optimal use of the available hardware and software, treatment of boundary conditions, and parallel solution of terms representing chemical reactions. In addition, we show the practical use of the system for large-scale reacting and nonreacting flows.

Oran, Elaine S.; Boris, Jay P.

256

Parallel three-dimensional acoustic and elastic wave simulation methods with applications in nondestructive evaluation  

NASA Astrophysics Data System (ADS)

In this dissertation, we present two parallel simulation techniques for three-dimensional acoustic and elastic wave propagation based on the finite integration technique. We demonstrate their usefulness in solving real-world problems with examples in the three very different areas of nondestructive evaluation, medical imaging, and security screening. More precisely, these include concealed weapons detection, periodontal ultrasonography, and guided wave inspection of complex piping systems. We have employed these simulation methods to study complex wave phenomena and to develop and test a variety of signal processing and hardware configurations. Simulation results are compared to experimental measurements to confirm the accuracy of the parallel simulation methods.

Rudd, Kevin Edward

257

Estimation of Transitional Probabilities of Discrete Event Systems from Cross-Sectional Survey and its Application in Tobacco Control  

PubMed Central

In order to find better strategies for tobacco control, it is often critical to know the transitional probabilities among various stages of tobacco use. Traditionally, such probabilities are estimated by analyzing data from longitudinal surveys that are often time-consuming and expensive to conduct. Since cross-sectional surveys are much easier to conduct, it will be much more practical and useful to estimate transitional probabilities from cross-sectional survey data if possible. However, no previous research has attempted to do this. In this paper, we propose a method to estimate transitional probabilities from cross-sectional survey data. The method is novel and is based on a discrete event system framework. In particular, we introduce state probabilities and transitional probabilities to conventional discrete event system models. We derive various equations that can be used to estimate the transitional probabilities. We test the method using cross-sectional data of the National Survey on Drug Use and Health. The estimated transitional probabilities can be used in predicting the future smoking behavior for decision-making, planning and evaluation of various tobacco control programs. The method also allows a sensitivity analysis that can be used to find the most effective way of tobacco control. Since there are much more cross-sectional survey data in existence than longitudinal ones, the impact of this new method is expected to be significant.

Lin, Feng; Chen, Xinguang

2009-01-01
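To illustrate how estimated transitional probabilities support prediction, the sketch below projects a population distribution over smoking stages forward in time with a simple Markov-chain update. The stages and all probability values are hypothetical placeholders, not estimates from the paper.

```python
# Hypothetical smoking stages and a row-stochastic transition matrix:
# entry P[i][j] is Pr(stage j next period | stage i now). All values
# below are made-up placeholders, not survey estimates.
STAGES = ["never", "current", "former"]
P = [
    [0.90, 0.10, 0.00],  # never   -> never / current / former
    [0.00, 0.85, 0.15],  # current -> ...
    [0.00, 0.20, 0.80],  # former  -> ...
]

def project(dist, steps=1):
    """Propagate a state-probability vector through `steps` periods."""
    for _ in range(steps):
        dist = [sum(dist[i] * P[i][j] for i in range(len(P)))
                for j in range(len(P))]
    return dist

# Project an initial population distribution five survey periods ahead.
future = project([0.6, 0.3, 0.1], steps=5)
```

Once such a projection is in place, a sensitivity analysis of the kind the abstract mentions amounts to perturbing individual entries of the matrix and observing which perturbation moves the projected distribution the most.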

258

Parallel Adaptive Multi-Mechanics Simulations using Diablo  

SciTech Connect

Coupled multi-mechanics simulations (such as thermal-stress and fluid-structure interaction problems) are of substantial interest to engineering analysts. In addition, adaptive mesh refinement techniques present an attractive alternative to current mesh generation procedures and provide quantitative error bounds that can be used for model verification. This paper discusses spatially adaptive multi-mechanics implicit simulations using the Diablo computer code. (U)

Parsons, D; Solberg, J

2004-12-03

259

Massively parallel molecular dynamics simulations with EAM potentials  

Microsoft Academic Search

Molecular dynamics of cascades in pure iron and iron-copper alloys using embedded atom method type of interatomic potentials are presented. Reliable simulations of radiation damage at the atomic scale with high energy Primary Knocked Atoms (PKA) need systems with large numbers of particles and very long computational time. To perform the simulation in a reasonable amount of time high-performance computer

C. S. Becquart; K. M. Decker; J. Ruste; Y. Souffez; J. C. Turbatte; J. C. van Duysen

1997-01-01

260

Parallelizing discrete dislocation dynamics simulations on multi-core systems  

Microsoft Academic Search

Materials science simulations are among the leading applications for scientific supercomputing. Discrete dislocation dynamics (DDD) is a numerical tool used to model the plastic behavior of crystalline materials using the elastic theory of dislocations. DDD simulations require very long running times to produce meaningful scientific results. This paper presents early experiences and results on improving the running time of Micromegas,

Florina M. Ciorba; Sebastien Groh; Mark F. Horstemeyer

2010-01-01

261

Pelegant : a parallel accelerator simulation code for electron generation and tracking.  

SciTech Connect

elegant is a general-purpose code for electron accelerator simulation that has a worldwide user base. Recently, many of the time-intensive elements were parallelized using MPI. Development has used modest Linux clusters and the BlueGene/L supercomputer at Argonne National Laboratory. This has provided very good performance for some practical simulations, such as multiparticle tracking with synchrotron radiation and emittance blow-up in the vertical rf kick scheme. The effort began with development of a concept that allowed for gradual parallelization of the code, using the existing beamline-element classification table in elegant. This was crucial as it allowed parallelization without major changes in code structure and without major conflicts with the ongoing evolution of elegant. Because of rounding error and finite machine precision, validating a parallel program against a uniprocessor program with the requirement of bitwise identical results is notoriously difficult. We will report validating simulation results of parallel elegant against those of serial elegant by applying Kahan's algorithm to improve accuracy dramatically for both versions. The quality of random numbers in a parallel implementation is very important for some simulations. Some practical experience with generating parallel random numbers by offsetting the seed of each random sequence according to the processor ID will be reported.

Wang, Y.; Borland, M. D.; Accelerator Systems Division (APS)

2006-01-01
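Kahan's algorithm, used above to make serial and parallel sums comparable, is a compensated summation that carries the rounding error of each addition in a separate variable. The following is a generic Python sketch of the technique, not code from elegant:

```python
def kahan_sum(values):
    """Compensated (Kahan) summation of floats.

    The variable c accumulates the low-order bits that each addition
    to the running sum s would otherwise discard.
    """
    s = 0.0
    c = 0.0
    for x in values:
        y = x - c        # apply the previous correction to the addend
        t = s + y        # low-order bits of y may be lost here
        c = (t - s) - y  # recover the lost part (algebraically zero)
        s = t
    return s

# Adding 0.1 a million times: the compensated sum stays far closer to
# the exact value than a plain running sum does.
data = [0.1] * 1_000_000
naive = 0.0
for x in data:
    naive += x
accurate = kahan_sum(data)
```

Applying the same compensated reduction in both the serial and the parallel code shrinks the order-dependent rounding differences that otherwise make direct comparison of the two results so difficult.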

262

Event Based Simulator for Parallel Computing over the Wide Area Network for Real Time Visualization  

NASA Astrophysics Data System (ADS)

As the computational requirements of applications in computational science continue to grow tremendously, the use of computational resources distributed across the Wide Area Network (WAN) becomes advantageous. However, not all applications can be executed over the WAN due to communication overhead that can drastically slow down the computation. In this paper, we introduce an event based simulator to investigate the performance of parallel algorithms executed over the WAN. The event based simulator, known as SIMPAR (SIMulator for PARallel computation), simulates the actual computations and communications involved in parallel computation over the WAN using time stamps. Real-time visualization applications require a steady stream of processed data. Hence, SIMPAR may prove to be a valuable tool for investigating the types of applications and computing resource requirements needed to provide an uninterrupted flow of processed data for real-time visualization. The results obtained from the simulation show concurrence with the expected performance using the L-BSP model.

Sundararajan, Elankovan; Harwood, Aaron; Kotagiri, Ramamohanarao; Satria Prabuwono, Anton

263

Molecular Dynamic Simulations of Nanostructured Ceramic Materials on Parallel Computers  

SciTech Connect

Large-scale molecular-dynamics (MD) simulations have been performed to gain insight into: (1) sintering, structure, and mechanical behavior of nanophase SiC and SiO2; (2) effects of dynamic charge transfers on the sintering of nanophase TiO2; (3) high-pressure structural transformation in bulk SiC and GaAs nanocrystals; (4) nanoindentation in Si3N4; and (5) lattice mismatched InAs/GaAs nanomesas. In addition, we have designed a multiscale simulation approach that seamlessly embeds MD and quantum-mechanical (QM) simulations in a continuum simulation. The above research activities have involved strong interactions with researchers at various universities, government laboratories, and industries. 33 papers have been published and 22 talks have been given based on the work described in this report.

Vashishta, Priya; Kalia, Rajiv

2005-02-24

264

Massively Parallel Methods for Simulating the Phase-Field Model.  

National Technical Information Service (NTIS)

Prediction of the evolution of microstructures in weapons systems is critical to meeting the objectives of stockpile stewardship in accordance with the Nuclear Weapons Test Ban Treaty. For example, accurate simulation of microstructural evolution in solde...

V. Tikare; D. Fan; S. J. Plimpton; R. M. Fye

2000-01-01

265

Parallel simulation of the global epidemiology of Avian Influenza  

Microsoft Academic Search

SEARUMS is an eco-modeling, bio-simulation, and analysis environment to study the global epidemiology of Avian Influenza. Originally developed in Java, SEARUMS enables comprehensive epidemiological analysis, forecasting of epicenters, and time lines of epidemics for prophylaxis, thereby mitigating disease outbreaks. However, SEARUMS-based simulations were time consuming due to the size and complexity of the models. In an endeavor to

Dhananjai M. Rao; Alexander Chernyakhovsky

2008-01-01

266

IMD - A Massively Parallel Molecular Dynamics Package for Classical Simulations in Condensed Matter Physics  

Microsoft Academic Search

We describe the current development status of IMD (ITAP Molecular Dynamics), a software package for classical molecular dynamics simulations on massively parallel computers. IMD is a general purpose program which can be used for all kinds of two- and three-dimensional studies in condensed matter physics. In addition to the usual MD features, it contains a number of special routines for simulation of

Johannes Roth; Jorg Stadler; Marco Brunelli; Dietmar Bunz; Franz Gahler; Jutta Hahn; Martin Hohl; Christof Horn; Jutta Kaiser; Ralf Mikulla; Gunther Schaaf; Joachim Stelzer; Hans-Rainer Trebin

1999-01-01

267

Parallel Numerical Simulation of Boltzmann Transport in Single-Walled Carbon Nanotubes  

NSDL National Science Digital Library

This module teaches the basic principles of semi-classical transport simulation based on the time-dependent Boltzmann transport equation (BTE) formalism with performance considerations for parallel implementations of multi-dimensional transport simulation and the numerical methods for efficient and accurate solution of the BTE for both electronic and thermal transport using the simple finite difference discretization and the stable upwind method.

Aksamija, Zlatan

268

Parallelization of the 'Particle-in-Cell' (PIC) density calculations in plasma computer simulations  

Microsoft Academic Search

The paper deals with the problem of parallelized Particle-In-Cell charge density calculations used in computer plasma simulation. The dependencies between the execution time and the simulation parameters such as the number of 'macro particles', plasma density, particular charge distribution technique and the number of processing units are presented. The local computer cluster and MPI standard have been used in order

Marcin Brzuszek; Marcin Turek; Juliusz Sielanko

269

Parallel and distributed simulation from many cores to the public cloud  

Microsoft Academic Search

In this tutorial paper, we first review some basic simulation concepts and then introduce parallel and distributed simulation techniques in view of some new challenges of today and tomorrow. In particular, in recent years there has been a wide diffusion of many-core architectures, and we can expect this trend to continue. On the other hand,

Gabriele D'Angelo

2011-01-01

270

Parallel implicit algorithms for direct numerical simulations of hypersonic boundary layer stability and transition  

Microsoft Academic Search

Due to the progress in computer technology in recent years, distributed memory parallel computer systems are rapidly gaining importance in direct numerical simulation (DNS) of the stability and transition of compressible boundary layers. In most works, explicit methods have mainly been used in such simulations to advance the compressible Navier-Stokes equations in time. However, the small wall-normal grid sizes for

Haibo Dong

2003-01-01

271

Parallel adaptive numerical simulation of dry avalanches over natural terrain  

NASA Astrophysics Data System (ADS)

High-fidelity computational simulation can be an invaluable tool in planning strategies for hazard risk mitigation. The accuracy and reliability of the predictions are crucial elements of these tools being successful. We present here a new simulation tool for dry granular avalanches using several new techniques for enhancing numerical solution accuracy. Highlights of our new methodology are the use of a depth-averaged model of the conservation laws and an adaptive grid Godunov solver to solve the resulting equations. The software is designed to run on distributed memory supercomputers and makes use of digital elevation data dynamically, i.e., it refines the grid and input data to finer resolutions to better capture flow features as the flow evolves. Our simulations are validated using quantitative and qualitative comparisons to tabletop experiments and data from field observations. Our software is freely available and uses only publicly available libraries and hence can be used on a wide range of hardware and software platforms.

Patra, A. K.; Bauer, A. C.; Nichita, C. C.; Pitman, E. B.; Sheridan, M. F.; Bursik, M.; Rupp, B.; Webber, A.; Stinton, A. J.; Namikawa, L. M.; Renschler, C. S.

2005-01-01

272

Object-oriented particle simulation on parallel computers  

SciTech Connect

A general purpose, object-oriented particle simulation (OOPS) library has been developed for use on a variety of system architectures with a uniform high-level interface. This includes the development of library implementations for the CM5, Intel Paragon, and CRI T3D. Codes written on any of these platforms can be ported to other platforms without modifications by utilizing the high-level library. The general character of the library allows application to such diverse areas as plasma physics, suspension flows, vortex simulations, porous media, and materials science.

Reynders, J.V.W.; Forslund, D.W.; Hinker, P.J.; Tholburn, M.; Kilman, D.G.; Humphrey, W.F.

1994-04-01

273

Acceleration of Radiance for Lighting Simulation by Using Parallel Computing with OpenCL  

SciTech Connect

We report on the acceleration of annual daylighting simulations for fenestration systems in the Radiance ray-tracing program. The algorithm was optimized to reduce both the redundant data input/output operations and the floating-point operations. To further accelerate the simulation speed, the calculation for matrix multiplications was implemented using parallel computing on a graphics processing unit. We used OpenCL, which is a cross-platform parallel programming language. Numerical experiments show that the combination of the above measures can speed up the annual daylighting simulations 101.7 times or 28.6 times when the sky vector has 146 or 2306 elements, respectively.

Zuo, Wangda; McNeil, Andrew; Wetter, Michael; Lee, Eleanor

2011-09-06

274

Toward parallel, adaptive mesh refinement for chemically reacting flow simulations  

SciTech Connect

Adaptive numerical methods offer greater efficiency than traditional numerical methods by concentrating computational effort in regions of the problem domain where the solution is difficult to obtain. In this paper, the authors describe progress toward adding mesh refinement to MPSalsa, a computer program developed at Sandia National Laboratories to solve coupled three-dimensional fluid flow and detailed reaction chemistry systems for modeling chemically reacting flow on large-scale parallel computers. Data structures that support refinement and dynamic load-balancing are discussed. Results using uniform refinement with mesh sequencing to improve convergence to steady-state solutions are also presented. Three examples are presented: a lid driven cavity, a thermal convection flow, and a tilted chemical vapor deposition reactor.

Devine, K.D.; Shadid, J.N.; Salinger, A.G.; Hutchinson, S.A. [Sandia National Labs., Albuquerque, NM (United States)]; Hennigan, G.L. [New Mexico State Univ., Las Cruces, NM (United States)]

1997-12-01

275

Virtual reality visualization of parallel molecular dynamics simulation  

SciTech Connect

When performing communications mapping experiments for massively parallel processors, it is important to be able to visualize the mappings and resulting communications. In a molecular dynamics model, visualization of the atom to atom interaction and the processor mappings provides insight into the effectiveness of the communications algorithms. The basic quantities available for visualization in a model of this type are the number of molecules per unit volume, the mass, and velocity of each molecule. The computational information available for visualization is the atom to atom interaction within each time step, the atom to processor mapping, and the energy rescaling events. We use the CAVE (CAVE Automatic Virtual Environment) to provide interactive, immersive visualization experiences.

Disz, T.; Papka, M.; Stevens, R.; Pellegrino, M. [Argonne National Lab., IL (United States); Taylor, V. [Northwestern Univ., Evanston, IL (United States). Electrical Engineering and Computer Science

1995-12-31

276

DC simulator of large-scale nonlinear systems for parallel processors  

NASA Astrophysics Data System (ADS)

In this paper it is shown how the idea of BBD decomposition of large-scale nonlinear systems can be implemented in a parallel DC circuit simulation algorithm. Usually, BBD nonlinear circuit decomposition is used together with a multi-level Newton-Raphson iterative process. We propose a simulation consisting of circuit decomposition and process parallelization on a single level only. This block-parallel approach may yield a considerable saving in simulation time, though it is strongly dependent on the system topology and, of course, on the processor type. The paper presents the architecture of the decomposition-based algorithm, explains details of its implementation, including two steps of the one-level bypassing techniques, and discusses the construction of dedicated benchmarks for this simulation software.

Cortés Udave, Diego Ernesto; Ogrodzki, Jan; Gutiérrez de Anda, Miguel Angel

2012-05-01

277

Constructing School Timetables Using Simulated Annealing: Sequential and Parallel Algorithms  

Microsoft Academic Search

This paper considers a solution to the school timetabling problem. The timetabling problem involves scheduling a number of tuples, each consisting of class of students, a teacher, a subject and a room, to a fixed number of time slots. A Monte Carlo scheme called simulated annealing is used as an optimisation technique. The paper introduces the timetabling problem, and then

D. Abramson

1991-01-01
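A minimal version of simulated annealing for timetabling might look like the following sketch, where the cost function counts clashes (a class or a teacher scheduled twice in the same slot). The data layout, cooling schedule, and parameter values are illustrative assumptions, not details taken from the paper.

```python
import math
import random

def clashes(assignment, lessons):
    """Count pairs of lessons placed in the same slot that share a
    class or a teacher (each lesson is a (class, teacher) tuple)."""
    cost = 0
    for i in range(len(lessons)):
        for j in range(i + 1, len(lessons)):
            if assignment[i] == assignment[j]:
                ci, ti = lessons[i]
                cj, tj = lessons[j]
                if ci == cj or ti == tj:
                    cost += 1
    return cost

def anneal(lessons, n_slots, steps=5000, t0=2.0, cooling=0.999, seed=0):
    """Simulated annealing: move one lesson to a random slot, accepting
    worse timetables with the Metropolis probability exp(-delta / t)."""
    rng = random.Random(seed)
    assign = [rng.randrange(n_slots) for _ in lessons]
    cost = clashes(assign, lessons)
    best, best_cost = list(assign), cost
    t = t0
    for _ in range(steps):
        i = rng.randrange(len(lessons))
        old_slot = assign[i]
        assign[i] = rng.randrange(n_slots)
        new_cost = clashes(assign, lessons)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / t):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = list(assign), cost
        else:
            assign[i] = old_slot  # reject the move, restore the slot
        t *= cooling
    return best, best_cost

# Nine lessons: three classes, each taught by each of three teachers.
lessons = [(c, t) for c in range(3) for t in range(3)]
best, best_cost = anneal(lessons, n_slots=4, seed=1)
```

The gradually decreasing temperature is what allows early escapes from local minima while still converging; a parallel version could, for instance, anneal independent chains from different seeds and keep the best timetable found.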

278

Timesteps and Parallel Domain Decomposition with Application to Astrophysical Simulations  

Microsoft Academic Search

Many modelling applications target systems with a broad range of dynamical timescales. If only a fraction of the modelled system requires small timesteps, large speedups in processing can be achieved by integrating each part of the system with a local timestep. In the astrophysical simulation example taken here with a range of 214 in timesteps, speed-ups over a single global

James Wadsley; Thomas Quinn

2006-01-01

279

xSim: The Extreme-Scale Simulator  

SciTech Connect

Investigating parallel application performance properties at scale is becoming an important part of high-performance computing (HPC) application development and deployment. The Extreme-scale Simulator (xSim) is a performance investigation toolkit that permits running an application in a controlled environment at extreme scale without the need for a respective extreme-scale HPC system. Using a lightweight parallel discrete event simulation, xSim executes a parallel application with a virtual wall clock time, such that performance data can be extracted based on a processor model and a network model. This paper presents significant enhancements to the xSim toolkit prototype that provide a more complete Message Passing Interface (MPI) support and improve its versatility. These enhancements include full virtual MPI group, communicator and collective communication support, and global variables support. The new capabilities are demonstrated by executing the entire NAS Parallel Benchmark suite in a simulated HPC environment.

Boehm, Swen [ORNL]; Engelmann, Christian [ORNL]

2011-01-01
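The lightweight discrete event core that drives a virtual wall clock, as described above, can be illustrated with a priority queue of timestamped events. This is a generic sketch of the technique, not xSim code:

```python
import heapq
import itertools

class DiscreteEventSimulator:
    """Minimal sequential discrete event engine with a virtual clock."""

    def __init__(self):
        self.now = 0.0                 # virtual wall-clock time
        self._queue = []               # (timestamp, seq, handler, payload)
        self._seq = itertools.count()  # tie-breaker for equal timestamps

    def schedule(self, delay, handler, payload=None):
        """Schedule handler(sim, payload) at virtual time now + delay."""
        heapq.heappush(self._queue,
                       (self.now + delay, next(self._seq), handler, payload))

    def run(self):
        """Pop events in timestamp order, advancing the virtual clock."""
        while self._queue:
            t, _, handler, payload = heapq.heappop(self._queue)
            self.now = t
            handler(self, payload)

# Deliveries fire in timestamp order regardless of scheduling order.
log = []
def deliver(sim, msg):
    log.append((sim.now, msg))

sim = DiscreteEventSimulator()
sim.schedule(2.0, deliver, "b")
sim.schedule(1.0, deliver, "a")
sim.run()
```

A parallel version would run one such engine per processor and synchronize the virtual clocks, conservatively or optimistically, whenever an event crosses a processor boundary.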

280

Discrete-event-dynamic-system-based approaches for control in integrated voice/data multihop radio networks  

NASA Astrophysics Data System (ADS)

This report describes the progress made towards the effort to develop and apply new discrete event dynamic system based techniques for the transmission scheduling problem in radio networks (RN). The report is in two parts: in part one we examine the transmission scheduling problem in the context of (fixed length) data traffic. In contrast to our earlier work which dealt with the broadcast scheduling problem in a fully connected packet RN, the first part of this report extends the proposed methodology therein to general topology networks. In part two, we look at the scheduling problem when processing packetized voice calls in a N-node multihop RN. As will be seen, the different grade-of-service (GOS) requirements associated with voice and data traffic lead to two decidedly different scheduling policies. We begin with a brief review of the transmission scheduling problem in N-node radio networks.

Cassandras, Christos G.; Julka, Vibhor

1993-11-01

281

Parallel Monte Carlo Electron and Photon Transport Simulation Code (PMCEPT code)  

NASA Astrophysics Data System (ADS)

Simulations for customized cancer radiation treatment planning for each patient are very useful for both patient and doctor. These simulations can be used to find the most effective treatment with the least possible dose to the patient. This typical system, so called "Doctor by Information Technology", will be useful to provide high quality medical services everywhere. However, the large amount of computing time required by the well-known general purpose Monte Carlo (MC) codes has prevented their use for routine dose distribution calculations for customized radiation treatment planning. The optimal solution to provide an "accurate" dose distribution within an "acceptable" time limit is to develop a parallel simulation algorithm on a Beowulf PC cluster because it is the most accurate, efficient, and economic. I developed a parallel MC electron and photon transport simulation code based on the standard MPI message passing interface. This algorithm solved the main difficulty of parallel MC simulation (overlapped random number series on different processors) by using multiple random number seeds. The parallel results agreed well with the serial ones. The parallel efficiency approached 100% as was expected.

Kum, Oyeon

2004-11-01
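The multiple-seed scheme described above, in which each processor derives its own random stream from a distinct seed, can be sketched as follows. The function and variable names are assumptions for illustration, and a simple base-seed-plus-rank offset stands in for whatever seeding rule the code actually uses:

```python
import random

def make_stream(base_seed, rank):
    """Random stream for one (simulated) processor, derived by
    offsetting the shared base seed with the processor ID (rank)."""
    return random.Random(base_seed + rank)

# Four simulated processors each draw their own transport random numbers.
streams = [make_stream(base_seed=12345, rank=r) for r in range(4)]
draws = [[s.random() for _ in range(3)] for s in streams]
```

Distinct seeds keep the per-processor sequences from overlapping, which is the difficulty the abstract identifies; note that for some generators naive seed offsets can still yield correlated streams, so counter-based or jump-ahead schemes are common alternatives in production codes.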

282

Numerical simulation of filamentary discharges with parallel adaptive mesh refinement  

Microsoft Academic Search

Direct simulation of filamentary gas discharges like streamers or dielectric barrier micro-discharges requires the use of an adaptive mesh. The objective of this paper is to develop a strategy which can use a set of grids with suitable local refinements for the continuity equations and Poisson’s equation in 2D and 3D geometries with a high-order discretization. The advantages of this

S. Pancheshnyi; P. Ségur; J. Capeillère; A. Bourdon

2008-01-01

283

Strong-strong beam-beam simulation on parallel computer  

SciTech Connect

The beam-beam interaction puts a strong limit on the luminosity of the high energy storage ring colliders. At the interaction points, the electromagnetic fields generated by one beam focus or defocus the opposite beam. This can cause beam blowup and a reduction of luminosity. An accurate simulation of the beam-beam interaction is needed to help optimize the luminosity in high energy colliders.

Qiang, Ji

2004-08-02

284

Parallel Direct Simulation Monte Carlo Computation Using CUDA on GPUs  

NASA Astrophysics Data System (ADS)

In this study, computations of the two-dimensional Direct Simulation Monte Carlo (DSMC) method using Graphics Processing Units (GPUs) are presented. An all-device (GPU) computational approach is adopted: the entire computation, including particle moving, indexing, collisions between particles, and state sampling, is performed on the GPU device, leaving the CPU idle. Adapting the method to GPU computation requires various changes to the original DSMC method to ensure efficient performance on the GPU device. Communication between the host (CPU) and device (GPU) occurs only during problem initialization and at simulation conclusion, when results are copied from the device to the host. Several multi-dimensional benchmark tests are employed to demonstrate the correctness of the DSMC implementation. We demonstrate the application of DSMC using a single GPU, with speedups of 3-10 times compared to a high-end Intel CPU (Intel Xeon X5472), depending upon the problem size and the level of rarefaction encountered in the simulation.
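The particle-indexing step named in this abstract is commonly implemented as a counting sort over cells, so that collision partners can be drawn from within a single cell as DSMC requires. A minimal sketch follows (hypothetical cell assignments, NumPy standing in for the CUDA kernels the paper actually uses):

```python
import numpy as np

def index_particles(cell_of, n_cells):
    """Counting-sort style indexing: group particle ids by cell."""
    counts = np.bincount(cell_of, minlength=n_cells)   # particles per cell
    starts = np.concatenate(([0], np.cumsum(counts)[:-1]))
    order = np.argsort(cell_of, kind="stable")         # ids sorted by cell
    return order, starts, counts

cell_of = np.array([2, 0, 1, 2, 0, 2])  # cell index of each of 6 particles
order, starts, counts = index_particles(cell_of, 3)
# Particles in cell c are order[starts[c] : starts[c] + counts[c]].
print(sorted(order[starts[2]:starts[2] + counts[2]].tolist()))  # [0, 3, 5]
```

On a GPU the histogram and prefix sum map naturally onto atomic adds and a parallel scan, which is why this formulation suits the all-device approach.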

Su, C.-C.; Hsieh, C.-W.; Smith, M. R.; Jermy, M. C.; Wu, J.-S.

2011-05-01

285

Automated Link16 Testing Using the Discrete Event System Specification and Extensible Markup Language  

Microsoft Academic Search

With the modernization of Department of Defense (DoD) systems and the growing complexity of communication equipment, traditional test methods and processes have to evolve in order to maintain their effectiveness. DoD acquisition policy requires the use of modeling and simulation (M&S) in all phases of system development life-cycles in order to ensure technical certification and mission effectiveness. The complexity of

Eddie Mak; Saurabh Mittal; Moon-Ho Hwang; James J. Nutaro

2010-01-01

286

DEVS-based Software Process Simulation Modeling: Formally Specified, Modularized, and Extensible SPSM  

Microsoft Academic Search

This paper proposes DEVS (Discrete Event System Specification)-based software process simulation modeling method which is a formally specified, modularized, and extensible simulation modeling approach. The proposed approach adopts DEVS formalism, a general purpose discrete event modeling and simulation framework, to the software process simulation modeling domain. This approach enables us to clearly understand the software process sim-

KeungSik Choi; Doo-Hwan Bae; TagGon Kim

287

Parallel simulation of beam-beam interaction in high energy accelerators  

SciTech Connect

In this paper, we present a self-consistent simulation model of beam-beam interaction in high energy accelerators. Using a parallel particle-in-cell approach, we have calculated the electromagnetic fields between two colliding beams. Dynamic load balance is implemented to improve the parallel efficiency. A preliminary performance test on IBM SP Power3, Cray T3E and PC cluster is presented. As an application, we studied the coherent beam-beam oscillation in the proposed Large Hadron Collider.
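The field calculation in a particle-in-cell approach starts by depositing particle charge onto a grid. A one-dimensional cloud-in-cell (CIC) deposition, the basic building block of that step, can be sketched as follows (illustrative positions and charges; not the authors' code, which works in 2D with two colliding beams):

```python
import numpy as np

def deposit_cic(positions, charges, n_grid, dx):
    """1D cloud-in-cell: split each macroparticle's charge linearly
    between its two neighboring grid points (periodic grid)."""
    rho = np.zeros(n_grid)
    cell = np.floor(positions / dx).astype(int)
    frac = positions / dx - cell
    # np.add.at accumulates correctly even with repeated indices.
    np.add.at(rho, cell, charges * (1.0 - frac))
    np.add.at(rho, (cell + 1) % n_grid, charges * frac)
    return rho

rho = deposit_cic(np.array([0.25, 1.5]), np.array([1.0, 2.0]), 4, 1.0)
print(rho.tolist())  # [0.75, 1.25, 1.0, 0.0]
```

The deposited density would then feed a Poisson solve for the beam fields; total charge is conserved by construction, which is a quick sanity check on any deposition routine.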

Qiang, Ji; Furman, Miguel A.; Ryne, Robert D.

2002-02-02

288

Parallel-adaptive simulation with the multigrid-based software framework UG  

Microsoft Academic Search

In this paper we present design aspects and concepts of the unstructured grids (UG) software framework that are relevant for parallel-adaptive simulation of time-dependent, nonlinear partial differential equations. The architectural design is discussed on system, subsystem and component level for distributed mesh management and local adaptation capabilities. Parallelization is founded on top of the innovative programming model dynamic distributed data

Stefan Lang

2006-01-01

289

A new parallel P3M code for very large-scale cosmological simulations  

Microsoft Academic Search

We have developed a parallel Particle–Particle, Particle–Mesh (P3M) simulation code for the Cray T3E parallel supercomputer that is well suited to studying the time evolution of systems of particles interacting via gravity and gas forces in cosmological contexts. The parallel code is based upon the public-domain serial Adaptive P3M-SPH (http://coho.astro.uwo.ca/pub/hydra/hydra.html) code of Couchman et al. (1995) [ApJ, 452, 797]. The algorithm

Tom MacFarland; H. M. P. Couchman; Frazer Pearce; Jakob Pichlmeier

1998-01-01

290

Parallelization of a Monte Carlo algorithm for the simulation of polymer melts  

NASA Astrophysics Data System (ADS)

The Continuum-configurational Bias Monte Carlo algorithm (CBMC) is a very efficient method for the simulation of polymer melts. This algorithm is well-suited to parallelization on shared-memory multiprocessors. The main effort for parallelization is often spent in the decomposition of the data structures and the design of the parallel program sections. We present some generally applicable methods to improve the optimization process on shared-memory multiprocessors using global address space and on distributed-memory multicomputers using message passing.

Widmann, Albert H.; Suter, Ulrich W.

1995-12-01

291

Parallelization issues of a code for physically-based simulation of fabrics  

NASA Astrophysics Data System (ADS)

The simulation of fabrics, clothes, and flexible materials is an essential topic in computer animation of realistic virtual humans and dynamic sceneries. New emerging technologies, such as interactive digital TV and multimedia products, make it necessary to develop powerful tools to perform real-time simulations. Parallelism is one such tool. When analyzing fabric simulations computationally, we found that these codes belong to the complex class of irregular applications. Codes of this kind frequently include reduction operations in their core, so that an important fraction of the computational time is spent on such operations. In fabric simulators these operations appear when evaluating forces, giving rise to the equation system to be solved. For this reason, this paper discusses only this phase of the simulation. This paper analyzes and evaluates different irregular reduction parallelization techniques on ccNUMA shared-memory machines, applied to a real, physically based fabric simulator we have developed. Several issues are taken into account in order to achieve high code performance, such as exploitation of data-access locality and parallelism, as well as careful use of memory resources (memory overhead). In this paper we use the concept of data affinity to develop various efficient algorithms for reduction parallelization exploiting data locality.
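One standard irregular-reduction parallelization of the kind this abstract surveys is the replicated-buffer technique: each worker accumulates into a private copy of the reduction array (no locks on shared data), and the copies are summed at the end. A minimal Python sketch (the edge list and force contributions are invented, and threads stand in for ccNUMA processors):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def reduce_private(edges, contribs, n_nodes, n_workers=2):
    """Sum per-edge contributions into per-node totals using one
    private buffer per worker, then combine the buffers."""
    buffers = np.zeros((n_workers, n_nodes))
    chunks = np.array_split(np.arange(len(edges)), n_workers)

    def work(w):
        for k in chunks[w]:
            i, j = edges[k]
            buffers[w, i] += contribs[k]   # action
            buffers[w, j] -= contribs[k]   # equal and opposite reaction

    with ThreadPoolExecutor(n_workers) as ex:
        list(ex.map(work, range(n_workers)))
    return buffers.sum(axis=0)            # final cross-buffer reduction

forces = reduce_private([(0, 1), (1, 2), (0, 2)], [1.0, 2.0, 4.0], 3)
print(forces.tolist())  # [5.0, 1.0, -6.0]
```

The trade-off the paper studies is exactly the one visible here: replication avoids synchronization on the shared array but multiplies memory use by the number of workers, which is why data affinity and locality matter.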

Romero, Sergio; Gutiérrez, Eladio; Romero, Luis F.; Plata, Oscar; Zapata, Emilio L.

2004-10-01

292

Gyrokinetic Simulation of Finite-Beta Plasmas on Parallel Architectures  

NASA Astrophysics Data System (ADS)

There exists a wide body of research on the linear and non-linear properties of plasma microinstabilities which are induced by density and temperature gradients. Recently, however, there has been interest in the electromagnetic or "finite-beta" effects on these microinstabilities. This thesis focuses on the finite-beta modification of an ion temperature gradient (ITG) driven microinstability in two-dimensional shearless and sheared-slab geometries. A gyrokinetic model is employed in both the numerical and analytic studies of this instability. This thesis is outlined as follows: Chapter 1 introduces the electromagnetic gyrokinetic model which is employed in both the numerical and analytic studies of the ITG instability. Some discussion of the Klimontovich particle representation of the gyrokinetic Vlasov equation and a multiple scale model of the background plasma gradient is also presented. Chapter 2 describes in more detail the computational issues facing an electromagnetic gyrokinetic particle simulation of the ITG mode. An electromagnetic extension of the partially linearized algorithm is presented along with a comparison of quiet particle initialization routines. Chapter 3 presents and compares algorithms for the gyrokinetic particle simulation technique on SIMD and MIMD computing platforms. Chapter 4 discusses electromagnetic gyrokinetic fluctuation theory and provides a comparison of analytic and numerical results. An anomalous numerical instability is reported which does not correspond to the normal modes of the system. Some suggestions as to this instability's origin are presented. Chapter 5 contains both a linear and a non-linear three-wave coupling analysis of the finite-beta modified ITG mode in a shearless slab geometry. Comparisons are made with linear and partially linearized gyrokinetic simulation results. Finally, Chapter 6 presents results from a finite-beta modified ITG mode in a sheared slab geometry. 
The linear dispersion relation is derived and results from an integral eigenvalue code are presented. Comparisons are made with the gyrokinetic particle code in a variety of limits with both adiabatic and non-adiabatic electrons. Evidence of ITG driven microtearing is presented.

Reynders, John Van Wicheren

1993-01-01

293

ExtendSim Advanced Technology: Discrete Rate Simulation  

Microsoft Academic Search

ExtendSim is used to model continuous, discrete event, discrete rate, and agent-based systems. This paper will focus on the ExtendSim discrete rate capabilities for modeling high-speed and rate based systems. Continuous, discrete event, and discrete rate simulation models will be compared.

David Krahl

2009-01-01

294

Performance Evaluation of Large-scale Parallel Simulation Codes and Designing New Language Features on the HPF (High Performance Fortran) Data-Parallel Programming Environment  

Microsoft Academic Search

High Performance Fortran (HPF) is provided for parallelizing your programs on the Earth Simulator (ES). We developed an optimal implementation of NAS Parallel Benchmarks (NPB) and some other benchmarks with the HPF compiler available on the ES, namely HPF/ES, and evaluated them on the ES. The result shows that the HPF implementation can achieve performance comparable to the MPI

Yasuo Okabe; Hitoshi Murai

295

Application of integration algorithms in a parallel processing environment for the simulation of jet engines  

SciTech Connect

Illustrates the application of predictor-corrector integration algorithms developed for the digital parallel processing environment. The algorithms are implemented and evaluated through the use of a software simulator which provides an approximate representation of the parallel processing hardware. Test cases which focus on the use of the algorithms are presented and a specific application using a linear model of a turbofan engine is considered. Results are presented showing the effects of integration step size and the number of processors on simulation accuracy. Real-time performance, inter-processor communication and algorithm startup are also discussed. 10 references.

Krosel, S.M.; Milner, E.J.

1982-01-01

296

A parallel finite volume algorithm for large-eddy simulation of turbulent flows  

NASA Astrophysics Data System (ADS)

A parallel unstructured finite volume algorithm is developed for large-eddy simulation of compressible turbulent flows. Major components of the algorithm include piecewise linear least-square reconstruction of the unknown variables, trilinear finite element interpolation for the spatial coordinates, Roe flux difference splitting, and second-order MacCormack explicit time marching. The computer code is designed from the start to take full advantage of the additional computational capability provided by the current parallel computer systems. Parallel implementation is done using the message passing programming model and message passing libraries such as the Parallel Virtual Machine (PVM) and Message Passing Interface (MPI). The development of the numerical algorithm is presented in detail. The parallel strategy and issues regarding the implementation of a flow simulation code on the current generation of parallel machines are discussed. The results from parallel performance studies show that the algorithm is well suited for parallel computer systems that use the message passing programming model. Nearly perfect parallel speedup is obtained on MPP systems such as the Cray T3D and IBM SP2. Performance comparison with older supercomputer systems such as the Cray Y-MP shows that the simulations done on the parallel systems are approximately 10 to 30 times faster. The results of the accuracy and performance studies for the current algorithm are reported. To validate the flow simulation code, a number of Euler and Navier-Stokes simulations are done for internal duct flows. Inviscid Euler simulation of a very small amplitude acoustic wave interacting with a shock wave in a quasi-1D convergent-divergent nozzle shows that the algorithm is capable of simultaneously tracking the very small disturbances of the acoustic wave and capturing the shock wave. 
Navier-Stokes simulations are made for fully developed laminar flow in a square duct, developing laminar flow in a rectangular duct, and developing laminar flow in a 90-degree square bend. The Navier-Stokes solutions show good agreement with available analytical solutions and experimental data. To validate the flow simulation code for turbulence simulation, LES of fully-developed turbulent flow in a square duct is performed for a Reynolds number of 320 based on the average friction velocity and the hydraulic diameter of the duct. The accuracy of the above algorithm for turbulence simulations is evaluated by comparison with the DNS solution. The effects of grid resolution, upwind numerical dissipation, and subgrid scale dissipation on the accuracy of the LES are examined. Comparison with DNS results shows that the standard Roe flux difference splitting dissipation adversely affects the accuracy of the turbulence simulation. This problem is unique to the turbulence simulation, since it does not occur in the Euler and laminar Navier-Stokes simulations using the same code. For accurate turbulence simulation, it is found that only three to five percent of the standard Roe flux difference splitting dissipation is needed.

Bui, Trong Tri

1998-11-01

297

Robust large-scale parallel nonlinear solvers for simulations.  

SciTech Connect

This report documents research to develop robust and efficient solution techniques for solving large-scale systems of nonlinear equations. The most widely used method for solving systems of nonlinear equations is Newton's method. While much research has been devoted to augmenting Newton-based solvers (usually with globalization techniques), little has been devoted to exploring the application of different models. Our research has been directed at evaluating techniques using different models than Newton's method: a lower order model, Broyden's method, and a higher order model, the tensor method. We have developed large-scale versions of each of these models and have demonstrated their use in important applications at Sandia. Broyden's method replaces the Jacobian with an approximation, allowing codes that cannot evaluate a Jacobian or have an inaccurate Jacobian to converge to a solution. Limited-memory methods, which have been successful in optimization, allow us to extend this approach to large-scale problems. We compare the robustness and efficiency of Newton's method, modified Newton's method, Jacobian-free Newton-Krylov method, and our limited-memory Broyden method. Comparisons are carried out for large-scale applications of fluid flow simulations and electronic circuit simulations. Results show that, in cases where the Jacobian was inaccurate or could not be computed, Broyden's method converged in some cases where Newton's method failed to converge. We identify conditions where Broyden's method can be more efficient than Newton's method. We also present modifications to a large-scale tensor method, originally proposed by Bouaricha, for greater efficiency, better robustness, and wider applicability. Tensor methods are an alternative to Newton-based methods and are based on computing a step based on a local quadratic model rather than a linear model. 
The advantage of Bouaricha's method is that it can use any existing linear solver, which makes it simple to write and easily portable. However, the method usually takes twice as long to solve as Newton-GMRES on general problems because it solves two linear systems at each iteration. In this paper, we discuss modifications to Bouaricha's method for a practical implementation, including a special globalization technique and other modifications for greater efficiency. We present numerical results showing computational advantages over Newton-GMRES on some realistic problems. We further discuss a new approach for dealing with singular (or ill-conditioned) matrices. In particular, we modify an algorithm for identifying a turning point so that an increasingly ill-conditioned Jacobian does not prevent convergence.
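Broyden's rank-one secant update, the lower-order model the report evaluates, can be illustrated on a small system. This is a bare sketch of the classical ("good") Broyden method, not Sandia's limited-memory implementation; the test function and starting point are invented for the example:

```python
import numpy as np

def broyden(F, x0, tol=1e-10, max_iter=50):
    """Solve F(x) = 0 by maintaining a Jacobian approximation B and
    updating it from secant data; F's true Jacobian is never formed."""
    x = np.asarray(x0, float)
    B = np.eye(len(x))                    # initial Jacobian guess
    f = F(x)
    for _ in range(max_iter):
        if np.linalg.norm(f) < tol:
            break
        dx = np.linalg.solve(B, -f)       # quasi-Newton step
        x_new = x + dx
        f_new = F(x_new)
        df = f_new - f
        # Rank-one secant update: B_new dx = df exactly.
        B += np.outer(df - B @ dx, dx) / (dx @ dx)
        x, f = x_new, f_new
    return x

# Decoupled mildly nonlinear system: x_i + 0.1 x_i^2 = 1 for i = 0, 1.
root = broyden(lambda x: x + 0.1 * x**2 - 1, [0.0, 0.0])
print(np.round(root, 5).tolist())  # [0.91608, 0.91608]
```

A limited-memory variant, as in the report, would store only the recent update vectors instead of the dense matrix B, which is what makes the approach viable at large scale.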

Bader, Brett William; Pawlowski, Roger Patrick; Kolda, Tamara Gibson (Sandia National Laboratories, Livermore, CA)

2005-11-01

298

A new parallel method for molecular dynamics simulation of macromolecular systems  

SciTech Connect

Short-range molecular dynamics simulations of molecular systems are commonly parallelized by replicated-data methods, where each processor stores a copy of all atom positions. This enables computation of bonded 2-, 3-, and 4-body forces within the molecular topology to be partitioned among processors straightforwardly. A drawback to such methods is that the inter-processor communication scales as N, the number of atoms, independent of P, the number of processors. Thus, their parallel efficiency falls off rapidly when large numbers of processors are used. In this paper a new parallel method called force-decomposition for simulating macromolecular or small-molecule systems is presented. Its memory and communication costs scale as N/√P, allowing larger problems to be run faster on greater numbers of processors. Like replicated-data techniques, and in contrast to spatial-decomposition approaches, the new method can be simply load-balanced and performs well even for irregular simulation geometries. The implementation of the algorithm in a prototypical macromolecular simulation code ParBond is also discussed. On a 1024-processor Intel Paragon, ParBond runs a standard benchmark simulation of solvated myoglobin with a parallel efficiency of 61% and at 40 times the speed of a vectorized version of CHARMM running on a single Cray Y-MP processor.
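The N/√P scaling comes from tiling the N x N pairwise-force matrix into a √P x √P grid of blocks, so each processor needs only the positions of the atoms in its block-row and block-column. A sketch of that index arithmetic (hypothetical sizes; P is assumed to be a perfect square dividing N evenly, which the real algorithm does not require):

```python
import math

def block_of(proc, n_atoms, n_procs):
    """Return the row and column atom ranges owned by one processor
    in a force-decomposition layout."""
    side = int(math.isqrt(n_procs))       # sqrt(P) processors per side
    b = n_atoms // side                   # N / sqrt(P) atoms per block edge
    r, c = divmod(proc, side)
    rows = range(r * b, (r + 1) * b)      # atoms whose forces we compute
    cols = range(c * b, (c + 1) * b)      # partner atoms for those forces
    return rows, cols

rows, cols = block_of(proc=5, n_atoms=12, n_procs=9)  # 3x3 processor grid
print(list(rows), list(cols))  # processor (1, 2): rows 4..7, cols 8..11
```

Each processor therefore communicates only with its block-row and block-column, and both its memory and its message volume are proportional to N/√P rather than N.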

Plimpton, S.; Hendrickson, B.

1994-08-01

299

Study of the parallel-plate EMP simulator and the simulator-obstacle interaction. Final technical report  

SciTech Connect

The Parallel-Plate Bounded-Wave EMP Simulator is typically used to test the vulnerability of electronic systems to the electromagnetic pulse (EMP) produced by a high altitude nuclear burst by subjecting the systems to a simulated EMP environment. However, when large test objects are placed within the simulator for investigation, the desired EMP environment may be affected by the interaction between the simulator and the test object. This simulator/obstacle interaction can be attributed to the following phenomena: (1) mutual coupling between the test object and the simulator, (2) fringing effects due to the finite width of the conducting plates of the simulator, and (3) multiple reflections between the object and the simulator's tapered end-sections. When the interaction is significant, the measurement of currents coupled into the system may not accurately represent those induced by an actual EMP. To better understand the problem of simulator/obstacle interaction, a dynamic analysis of the fields within the parallel-plate simulator is presented. The fields are computed using a moment method solution based on a wire mesh approximation of the conducting surfaces of the simulator. The fields within an empty simulator are found to be predominantly transverse electromagnetic (TEM) for frequencies within the simulator's bandwidth, properly simulating the properties of the EMP propagating in free space. However, when a large test object is placed within the simulator, it is found that the currents induced on the object can be quite different from those on an object situated in free space. A comprehensive study of the mechanisms contributing to this deviation is presented.

Gedney, S.D.

1990-12-01

300

Transient dynamics simulations: Parallel algorithms for contact detection and smoothed particle hydrodynamics  

SciTech Connect

Transient dynamics simulations are commonly used to model phenomena such as car crashes, underwater explosions, and the response of shipping containers to high-speed impacts. Physical objects in such a simulation are typically represented by Lagrangian meshes because the meshes can move and deform with the objects as they undergo stress. Fluids (gasoline, water) or fluid-like materials (earth) in the simulation can be modeled using the techniques of smoothed particle hydrodynamics. Implementing a hybrid mesh/particle model on a massively parallel computer poses several difficult challenges. One challenge is to simultaneously parallelize and load-balance both the mesh and particle portions of the computation. A second challenge is to efficiently detect the contacts that occur within the deforming mesh and between mesh elements and particles as the simulation proceeds. These contacts impart forces to the mesh elements and particles which must be computed at each timestep to accurately capture the physics of interest. In this paper we describe new parallel algorithms for smoothed particle hydrodynamics and contact detection which turn out to have several key features in common. Additionally, we describe how to join the new algorithms with traditional parallel finite element techniques to create an integrated particle/mesh transient dynamics simulation. Our approach to this problem differs from previous work in that we use three different parallel decompositions, a static one for the finite element analysis and dynamic ones for particles and for contact detection. We have implemented our ideas in a parallel version of the transient dynamics code PRONTO-3D and present results for the code running on a large Intel Paragon.

Hendrickson, B.; Plimpton, S.; Attaway, S.; Swegle, J. [and others]

1996-09-01

301

Object-Oriented NeuroSys: Parallel Programs for Simulating Large Networks of Biologically Accurate Neurons  

SciTech Connect

Object-oriented NeuroSys (ooNeuroSys) is a collection of programs for simulating very large networks of biologically accurate neurons on distributed memory parallel computers. It includes two principal programs: ooNeuroSys, a parallel program for solving the large systems of ordinary differential equations arising from the interconnected neurons, and Neurondiz, a parallel program for visualizing the results of ooNeuroSys. Both programs are designed to be run on clusters and use the MPI library to obtain parallelism. ooNeuroSys also includes an easy-to-use Python interface. This interface allows neuroscientists to quickly develop and test complex neuron models. Both ooNeuroSys and Neurondiz have a design that allows for both high performance and relative ease of maintenance.

Pacheco, P; Miller, P; Kim, J; Leese, T; Zabiyaka, Y

2003-05-07

302

Parallel Adaptive Mesh Refinement for Large Eddy Simulation Using the Finite Element Method  

Microsoft Academic Search

This paper describes work in progress at Hitachi Dublin Laboratory to develop a parallel adaptive mesh refinement library. The library has been designed to be linked with a finite element simulation engine for solving three-dimensional unstructured turbulent fluid dynamics problems, using large eddy simulation. The library takes as input a distributed mesh and a list of mesh elements to be

Darach Golden; Neil Hurley; Sean Mcgrath

1998-01-01

303

Multiphysics simulation of flow-induced vibrations and aeroelasticity on parallel computing platforms  

Microsoft Academic Search

This article describes the application of multiphysics simulation on parallel computing platforms to model aeroelastic instabilities and flow-induced vibrations. Multiphysics simulation is based on a single computational framework for the modeling of multiple interacting physical phenomena. Within the multiphysics framework, the finite element treatment of fluids is based on the Galerkin-Least-Squares (GLS) method with discontinuity capturing operators. The Arbitrary Lagrangian-Eulerian (ALE)

Steven M. Rifai; Zdeněk Johan; Wen-Ping Wang; Jean-Pierre Grisval; Thomas J. R. Hughes; Robert M. Ferencz

1999-01-01

304

GalaxSee HPC Module 1: The N-Body Problem, Serial and Parallel Simulation  

NSDL National Science Digital Library

This module introduces the N-body problem, which seeks to account for the dynamics of systems of multiple interacting objects. Galaxy dynamics serves as the motivating example to introduce a variety of computational methods for simulating change and criteria that can be used to check for model accuracy. Finally, the basic issues and ideas that must be considered when developing a parallel implementation of the simulation are introduced.

Joiner, David

305

Switching Surges on Parallel HV and EHV Untransposed Transmission Lines Studied by Analog Simulation  

Microsoft Academic Search

This paper describes techniques for analog computer simulation of coupled parallel transmission lines erected on the same right-of-way. A particular simulation is given consisting of one 500-kV and two 230-kV three-phase transmission lines interconnecting areas separated by 126 miles within the Florida Power Corporation network. Some typical results of a comprehensive line switching study are presented and discussed.

D. H. Welle; R. A. Hedin; R. W. Weishaupt; C. H. Thomas

1972-01-01

306

Overcoming Communication Latency Barriers in Massively Parallel Molecular Dynamics Simulations on Anton  

NASA Astrophysics Data System (ADS)

Strong scaling of scientific applications on parallel architectures is increasingly limited by communication latency. This talk will describe the techniques used to reduce latency and mitigate its effects on performance in Anton, a massively parallel special-purpose machine that accelerates molecular dynamics (MD) simulations by orders of magnitude compared with the previous state of the art. Achieving this speedup required both specialized hardware mechanisms and a restructuring of the application software to reduce network latency, sender and receiver overhead, and synchronization costs. Key elements of Anton's approach, in addition to tightly integrated communication hardware, include formulating data transfer in terms of counted remote writes and leveraging fine-grained communication. Anton delivers end-to-end inter-node latency significantly lower than any other large-scale parallel machine, and the total critical-path communication time for an Anton MD simulation is less than 3% that of the next-fastest MD platform.

Dror, Ron

2013-03-01

307

Time complexity of a parallel conjugate gradient solver for light scattering simulations: theory and SPMD implementation  

Microsoft Academic Search

We describe parallelization for distributed memory computers of a preconditioned Conjugate Gradient method, applied to solve systems of equations emerging from Elastic Light Scattering simulations. The execution time of the Conjugate Gradient method is analyzed theoretically. First, expressions for the execution time for three different data decompositions are derived. Next, two processor network topologies are taken into account and the theoretical execution times are

P. M. A. Sloot; W. Hoffmann; L. O. Hertzberger

1992-01-01

308

Asymptotic dispersion in 2D heterogeneous porous media determined by parallel numerical simulations  

Microsoft Academic Search

We determine the asymptotic dispersion coefficients in 2D exponentially correlated lognormally distributed permeability fields by using parallel computing. Fluid flow is computed by solving the flow equation discretized on a regular grid and transport triggered by advection and diffusion is simulated by a particle tracker. To obtain a well-defined asymptotic regime under ergodic conditions (initial plume size much larger than
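The variance-growth estimate underlying such particle-tracking studies can be sketched in one dimension. This is an illustrative toy with a constant velocity and diffusion coefficient (the simulations in the abstract use heterogeneous 2D permeability fields and a flow solver); the dispersion coefficient is read off the growth rate of the plume's longitudinal variance:

```python
import numpy as np

# Random-walk particle tracker: advection plus diffusive steps.
rng = np.random.default_rng(7)
n, dt, steps, D = 20_000, 0.1, 200, 0.5
x = np.zeros(n)                          # plume starts as a point source
for _ in range(steps):
    x += 1.0 * dt + np.sqrt(2 * D * dt) * rng.standard_normal(n)

t = steps * dt
D_est = x.var() / (2 * t)                # Var[x] = 2 D t for Fickian transport
print(round(D_est, 1))  # close to the input D = 0.5
```

In the heterogeneous case the same estimator is applied at late times, and the "asymptotic" regime is reached once Var[x] grows linearly in t.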

Jean-Raynald de Dreuzy; Anthony Beaudoin; Jocelyne Erhel

2007-01-01

309

Massively parallel simulation of flow and transport in variably saturated porous and fractured media  

SciTech Connect

This paper describes a massively parallel simulation method and its application for modeling multiphase flow and multicomponent transport in porous and fractured reservoirs. The parallel-computing method has been implemented into the TOUGH2 code and its numerical performance is tested on a Cray T3E-900 and IBM SP. The efficiency and robustness of the parallel-computing algorithm are demonstrated by completing two simulations with more than one million gridblocks, using site-specific data obtained from a site-characterization study. The first application involves the development of a three-dimensional numerical model for flow in the unsaturated zone of Yucca Mountain, Nevada. The second application is the study of tracer/radionuclide transport through fracture-matrix rocks for the same site. The parallel-computing technique enhances modeling capabilities by achieving several-orders-of-magnitude speedup for large-scale and high resolution modeling studies. The resulting modeling results provide many new insights into flow and transport processes that could not be obtained from simulations using the single-CPU simulator.

Wu, Yu-Shu; Zhang, Keni; Pruess, Karsten

2002-01-15

310

Load Balanced Parallel Simulation of Particle-Fluid DEM-SPH Systems with Moving Boundaries  

Microsoft Academic Search

We propose a new pure Lagrangian method for the parallel load-balanced simulation of particle-fluid systems with moving boundaries or free surfaces. Our method is completely meshless and models solid objects as well as the fluid as particles. By an Orthogonal Recursive Bisection we obtain a domain decomposition that is well suited for controller-based load balancing. This

Florian Fleissner; Peter Eberhard

2007-01-01

311

NGEN: A Massively Parallel Reconfigurable Computer for Biological Simulation: Towards a Self-Organizing Computer  

Microsoft Academic Search

NGEN is flexible computer hardware for rapid custom-circuit simulation of fine-grained physical processes via a massively parallel architecture. It is optimized to implement dataflow architectures and systolic algorithms for large problems. High-speed distributed SRAM on the chip-to-chip interconnect enables a transparent extension of problem size beyond the limits posed by the number of available processors. For

John S. Mccaskill; Thomas Maeke; Udo Gemm; Ludger Schulte; Uwe Tangen

1996-01-01

312

Two-dimensional particle simulation of plasma expansion between plane parallel electrodes  

Microsoft Academic Search

We simulate in two dimensions the expansion of a plasma between biased plane parallel electrodes using the particle-in-cell method. Such a plasma is frequently created in many experiments by the interaction of a pulsed laser with atomic vapor or gas stream. We describe the motion of the electrons and ions and reproduce the experimentally observed bulk drift of the plasma

Kartik Patel; V. K. Mago

1995-01-01

313

DSMC Simulations Of Low-Density Choked Flows In Parallel-Plate Channels  

Microsoft Academic Search

Rarefied choked flows in parallel-plate channels have been studied using the direct simulation Monte Carlo technique. Calculations are performed for various transitional flows, and results are presented for the computed flowfield quantities, wall pressures and discharge coefficients. Comparisons are made with the available experimental data, and the physical and numerical factors which affect the solutions are discussed. Separate calculations are

M. Ilgaz; M. C. Çelenligil

2003-01-01

314

Moldy: a portable molecular dynamics simulation program for serial and parallel computers  

NASA Astrophysics Data System (ADS)

Moldy is a highly portable C program for performing molecular-dynamics simulations of solids and liquids using periodic boundary conditions. It runs in serial mode on a conventional workstation or on a parallel system using an interface to a parallel communications library such as MPI or BSP. The "replicated data" parallelization strategy is used to achieve reasonable performance with a minimal difference between serial and parallel code. The code has been optimized for high performance in both serial and parallel cases. The model system is completely specified in a run-time input file and may contain atoms, molecules or ions in any mixture. Molecules or molecular ions are treated in the rigid-molecule approximation and their rotational motion is modeled using quaternion methods. The equations of motion are integrated using a modified form of the Beeman algorithm. Simulations may be performed in the usual NVE ensemble or in isobaric and/or isothermal ensembles. Potential functions of the Lennard-Jones, 6-exp and MCY forms are supported and the code is structured to give a straightforward interface for adding a new functional form. The Ewald method is used to calculate long-ranged electrostatic forces.
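
The Beeman integration mentioned above can be illustrated with a minimal single-particle sketch. This shows the standard Beeman scheme, not Moldy's modified form, and the harmonic-oscillator test problem is an illustrative assumption.

```python
import math

def beeman_step(x, v, a, a_prev, accel, dt):
    """One step of the standard Beeman integrator (1-D).

    x, v      -- position and velocity at time t
    a, a_prev -- accelerations at t and t - dt
    accel     -- callable returning acceleration as a function of position
    """
    x_new = x + v * dt + (4.0 * a - a_prev) * dt * dt / 6.0
    a_new = accel(x_new)
    v_new = v + (2.0 * a_new + 5.0 * a - a_prev) * dt / 6.0
    return x_new, v_new, a_new, a  # old a becomes next step's a_prev

# Harmonic oscillator: a(x) = -x, exact solution x(t) = cos(t).
accel = lambda x: -x
dt, steps = 0.01, 628           # integrate to t ~ 2*pi
x, v = 1.0, 0.0
a, a_prev = accel(x), accel(x)  # bootstrap with a(t - dt) ~ a(0)
for _ in range(steps):
    x, v, a, a_prev = beeman_step(x, v, a, a_prev, accel, dt)

print(abs(x - math.cos(steps * dt)))  # small discretization error
```

The appeal of Beeman-type schemes in MD is that velocities come out with accuracy comparable to positions, which matters for kinetic-energy (temperature) estimates.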

Refson, Keith

2000-04-01

315

The Distributed Diagonal Force Decomposition Method for Parallelizing Molecular Dynamics Simulations  

PubMed Central

Parallelization is an effective way to reduce the computational time needed for molecular dynamics simulations. We describe a new parallelization method, the distributed-diagonal force decomposition method, with which we extend and improve the existing force decomposition methods. Our new method requires less data communication during molecular dynamics simulations than replicated data and current force decomposition methods, increasing the parallel efficiency. It also dynamically balances the computational load across processors throughout the simulation. The method is readily implemented in existing molecular dynamics codes and it has been incorporated into the CHARMM program, allowing its immediate use in conjunction with the many molecular dynamics simulation techniques that are already present in the program. We also present the design of the Force Decomposition Machine, a cluster of personal computers and networks that is tailored to running molecular dynamics simulations using the distributed diagonal force decomposition method. The design is expandable and provides various degrees of fault resilience. This approach is easily adaptable to computers with Graphics Processing Units because it is independent of the processor type being used.

Borsnik, Urban; Miller, Benjamin T.; Brooks, Bernard R.; Janezic, Dusanka

2011-01-01

316

Parallel Simulation Algorithms for the Three Dimensional Strong-Strong Beam-Beam Interaction  

SciTech Connect

The strong-strong beam-beam effect is one of the most important effects limiting the luminosity of ring colliders. Little is known about it analytically, so most studies utilize numeric simulations. The two-dimensional realm is readily accessible to workstation-class computers (cf., e.g., [1, 2]), while three dimensions, which add effects such as phase averaging and the hourglass effect, require vastly higher amounts of CPU time. Thus, parallelization of three-dimensional simulation techniques is imperative; in the following we discuss parallelization strategies and describe the algorithms used in our simulation code, which reaches almost linear scaling of performance vs. number of CPUs for typical setups.

Kabel, A.C.; /SLAC

2008-03-17

317

Parallel Monte Carlo simulations on an ARC-enabled computing grid  

NASA Astrophysics Data System (ADS)

Grid computing opens new possibilities for running heavy Monte Carlo simulations of physical systems in parallel. The presentation gives an overview of GaMPI, a system for running an MPI-based random walker simulation on grid resources. Integrating the ARC middleware and the new storage system Chelonia with the Ganga grid job submission and control system, we show that MPI jobs can be run on a world-wide computing grid with good performance and promising scaling properties. Results for relatively communication-heavy Monte Carlo simulations run on multiple heterogeneous, ARC-enabled computing clusters in several countries are presented.

Nilsen, Jon K.; Samset, Bjørn H.

2011-12-01

318

Parallel Simulation of Three-Dimensional Free Surface Fluid Flow Problems  

SciTech Connect

Simulation of viscous three-dimensional fluid flow typically involves a large number of unknowns. When free surfaces are included, the number of unknowns increases dramatically. Consequently, this class of problem is an obvious application of parallel high-performance computing. We describe parallel computation of viscous, incompressible, free surface, Newtonian fluid flow problems that include dynamic contact lines. The Galerkin finite element method was used to discretize the fully-coupled governing conservation equations, and a "pseudo-solid" mesh mapping approach was used to determine the shape of the free surface. In this approach, the finite element mesh is allowed to deform to satisfy quasi-static solid mechanics equations subject to geometric or kinematic constraints on the boundaries. As a result, nodal displacements must be included in the set of unknowns. Also discussed are the proper constraints appearing along the dynamic contact line in three dimensions. Issues affecting efficient parallel simulations include problem decomposition to equally distribute computational work among the processors of an SPMD computer and determination of robust, scalable preconditioners for the distributed matrix systems that must be solved. Solution continuation strategies important for serial simulations have an enhanced relevance in a parallel computing environment due to the difficulty of solving large-scale systems. Parallel computations will be demonstrated on an example taken from the coating flow industry: flow in the vicinity of a slot coater edge. This is a three-dimensional free surface problem possessing a contact line that advances at the web speed in one region but transitions to static behavior in another region. As such, a significant fraction of the computational time is devoted to processing boundary data. Discussion focuses on parallel speedups for fixed problem size, a class of problems of immediate practical importance.

BAER,THOMAS A.; SACKINGER,PHILIP A.; SUBIA,SAMUEL R.

1999-10-14

319

Application of parallel computing to seismic damage process simulation of an arch dam  

NASA Astrophysics Data System (ADS)

The simulation of the damage process of a high arch dam subjected to strong earthquake shocks is significant to the evaluation of its performance and seismic safety, considering the catastrophic effect of dam failure. However, such numerical simulation requires rigorous computational capacity. Conventional serial computing falls short of that, and parallel computing is a fairly promising solution to this problem. The parallel finite element code PDPAD was developed for the damage prediction of arch dams, utilizing a damage model that accounts for the heterogeneity of concrete. Developed in Fortran, the code uses a master/slave mode for programming, the domain decomposition method for allocation of tasks, MPI (Message Passing Interface) for communication, and solvers from the AZTEC library for the solution of large-scale equations. A speedup test showed that the performance of PDPAD was quite satisfactory. The code was employed to study the damage process of an arch dam under construction on a 4-node PC cluster, with more than one million degrees of freedom considered. The obtained damage mode was quite similar to that of a shaking table test, indicating that the proposed procedure and parallel code PDPAD have good potential in simulating the seismic damage mode of arch dams. With the rapidly growing need for massive computation emerging from engineering problems, parallel computing will find more and more applications in pertinent areas.

Zhong, Hong; Lin, Gao; Li, Jianbo

2010-06-01

320

Application of parallel computing techniques to a large-scale reservoir simulation  

SciTech Connect

Even with the continual advances made in both computational algorithms and computer hardware used in reservoir modeling studies, large-scale simulation of fluid and heat flow in heterogeneous reservoirs remains a challenge. The problem commonly arises from intensive computational requirement for detailed modeling investigations of real-world reservoirs. This paper presents the application of a massive parallel-computing version of the TOUGH2 code developed for performing large-scale field simulations. As an application example, the parallelized TOUGH2 code is applied to develop a three-dimensional unsaturated-zone numerical model simulating flow of moisture, gas, and heat in the unsaturated zone of Yucca Mountain, Nevada, a potential repository for high-level radioactive waste. The modeling approach employs refined spatial discretization to represent the heterogeneous fractured tuffs of the system, using more than a million 3-D gridblocks. The problem of two-phase flow and heat transfer within the model domain leads to a total of 3,226,566 linear equations to be solved per Newton iteration. The simulation is conducted on a Cray T3E-900, a distributed-memory massively parallel computer. Simulation results indicate that the parallel computing technique, as implemented in the TOUGH2 code, is very efficient. The reliability and accuracy of the model results have been demonstrated by comparing them to those of small-scale (coarse-grid) models. These comparisons show that simulation results obtained with the refined grid provide more detailed predictions of the future flow conditions at the site, aiding in the assessment of proposed repository performance.

Zhang, Keni; Wu, Yu-Shu; Ding, Chris; Pruess, Karsten

2001-02-01

321

Discrete-event simulation of the shielding failure of the arrester protected overhead-lines to evaluate risk of flashover  

Microsoft Academic Search

Damages caused by lightning stroke in power system networks are severe for insulations and result in less reliable energy supply. Knowledge of protection schemes and better selection of these devices in power systems is a goal of designers to reduce the risk of flashover in any risky point. In this paper, a statistical procedure is presented to evaluate risk of

M. R. Bank Tavakoli; B. Vahidi; S. H. Hosseinian

2008-01-01

322

Cricket/Mica2 Based Discrete Event Simulator for WiHoc Ver.1.0 Localization Analysis  

Microsoft Academic Search

The deployment of wireless sensor networks (WSN) has enabled applications to operate at their optimum and maximum potential. This paper presents the enhancement of the wireless hockey (WiHoc) system ver.1.0. The sensors are utilized to acquire the movement of field hockey players on a coaching strategy board. These are denoted and executed by the Cricket/Mica-2 sensors which have been

S. Shamala; T. Sangeetha; A. Moqry

2009-01-01

323

Production control in a failure-prone manufacturing network using discrete event simulation and automated response surface methodology  

Microsoft Academic Search

In this paper, a system consisting of a network of machines with random breakdown and repair times is considered. The machines in this system can be in one of four states: operational, in repair, starved, and blocked. Failure and repair times of the machines are exponentially distributed. Previous research on multi-machine failure-prone manufacturing systems (FPMS) has focused on systems consisting

Seyed Mojtaba Sajadi; Mir Mehdi Seyed Esfahani; Kenneth Sörensen

2011-01-01

324

Battle Experiments of Naval Air Defense with Discrete Event System-based Mission-level Modeling and Simulations  

Microsoft Academic Search

The modern naval air defense of a fleet is a critical task dictating the equipment, the operation, and the management of the fleet. Military modelers consider that an improved weapon system in naval air defense (i.e. the AEGIS system) is the most critical enabler of defense at the engagement level. However, at the mission execution level, naval air defense is

Jeong Hoon Kim; Chang Beom Choi; Tag Gon Kim

2011-01-01

325

libMesh: a C++ library for parallel adaptive mesh refinement/coarsening simulations  

Microsoft Academic Search

In this paper we describe the libMesh (http://libmesh.sourceforge.net) framework for parallel adaptive finite element applications. libMesh is an open-source software library that has been developed to facilitate serial and parallel simulation of multiscale, multiphysics applications using adaptive mesh refinement and coarsening strategies. The main software development is being carried out in the CFDLab (http://cfdlab.ae.utexas.edu) at the University of Texas, but

Benjamin S. Kirk; John W. Peterson; Roy H. Stogner; Graham F. Carey

2006-01-01

326

A new parallel P3M code for very large-scale cosmological simulations  

NASA Astrophysics Data System (ADS)

We have developed a parallel Particle-Particle, Particle-Mesh (P3M) simulation code for the Cray T3E parallel supercomputer that is well suited to studying the time evolution of systems of particles interacting via gravity and gas forces in cosmological contexts. The parallel code is based upon the public-domain serial Adaptive P3M-SPH (http://coho.astro.uwo.ca/pub/hydra/hydra.html) code of Couchman et al. (1995) [ApJ, 452, 797]. The algorithm resolves gravitational forces into a long-range component computed by discretizing the mass distribution and solving Poisson's equation on a grid using an FFT convolution method, and a short-range component computed by direct force summation for sufficiently close particle pairs. The code consists primarily of a particle-particle computation parallelized by domain decomposition over blocks of neighbour-cells, a more regular mesh calculation distributed in planes along one dimension, and several transformations between the two distributions. The load balancing of the P3M code is static, since this greatly aids the ongoing implementation of parallel adaptive refinements of the particle and mesh systems. Great care was taken throughout to make optimal use of the available memory, so that a version of the current implementation has been used to simulate systems of up to 10^9 particles with a 1024^3 mesh for the long-range force computation. These are the largest cosmological N-body simulations of which we are aware. We discuss these memory optimizations as well as those motivated by computational performance. Performance results are very encouraging, and, even without refinements, the code has been used effectively for simulations in which the particle distribution becomes highly clustered as well as for other non-uniform systems of astrophysical interest.
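
The long-range half of the P3M force split described above, solving Poisson's equation on a grid by FFT convolution, can be sketched in one dimension. The 3-D mesh, mass deposition, and short-range pair correction of the actual code are omitted; the single-mode test problem is an illustrative assumption.

```python
import numpy as np

def poisson_periodic_1d(rho, L):
    """Solve d2(phi)/dx2 = rho on a periodic domain of length L via FFT."""
    n = rho.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=L / n)   # wavenumbers
    rho_k = np.fft.fft(rho)
    phi_k = np.zeros_like(rho_k)
    nonzero = k != 0.0
    # In Fourier space, -k^2 phi_k = rho_k (k = 0 mode fixed to zero).
    phi_k[nonzero] = -rho_k[nonzero] / k[nonzero] ** 2
    return np.fft.ifft(phi_k).real

# Single-mode check: rho = sin(2*pi*x/L) gives phi = -rho / (2*pi/L)**2.
n, L = 128, 1.0
x = np.arange(n) * L / n
rho = np.sin(2.0 * np.pi * x / L)
phi = poisson_periodic_1d(rho, L)
exact = -rho / (2.0 * np.pi / L) ** 2
print(np.max(np.abs(phi - exact)))
```

For a single Fourier mode the spectral solve is exact to machine precision, which is why the grid part of P3M handles the smooth long-range field so cheaply.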

MacFarland, Tom; Couchman, H. M. P.; Pearce, F. R.; Pichlmeier, Jakob

1998-12-01

327

Distributed parallel computers versus PVM on a workstation cluster in the simulation of time dependent partial differential equations  

Microsoft Academic Search

In the present paper, we present some experimental results about the parallel numerical simulation of a time dependent partial differential equation, the two dimensional nonlinear Schrodinger equation, on a message passing parallel machine and using PVM on a cluster of Sparc-stations. An implicit finite difference method has been used to carry out the simulation. Some features about the different scaling

I. Martin; Juan C. Fabero; Francisco Tirado; Alfredo Bautista

1995-01-01

328

Arrival Generators for Queueing Simulations.  

National Technical Information Service (NTIS)

The paper presents two methods for modeling cyclic inputs to a congested system in a discrete event digital simulation. Specifically, it is assumed that the interarrival times follow a probability distribution whose parameters are functions of time. Assum...
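
The excerpt does not reproduce the paper's two methods, but the standard Lewis-Shedler thinning algorithm is a common way to generate such cyclic, time-dependent Poisson arrivals; the rate function and its upper bound below are illustrative assumptions.

```python
import math
import random

def thinning(rate, rate_max, horizon, rng):
    """Generate arrival times on (0, horizon] for a nonhomogeneous Poisson
    process with intensity rate(t) <= rate_max, by Lewis-Shedler thinning."""
    t, arrivals = 0.0, []
    while True:
        # Candidate arrival from a homogeneous process at the bounding rate...
        t += rng.expovariate(rate_max)
        if t > horizon:
            return arrivals
        # ...accepted with probability rate(t) / rate_max.
        if rng.random() < rate(t) / rate_max:
            arrivals.append(t)

# Illustrative cyclic intensity: 5 + 4*sin(t/4) arrivals per unit time.
rate = lambda t: 5.0 + 4.0 * math.sin(t / 4.0)
rng = random.Random(42)
arrivals = thinning(rate, rate_max=9.0, horizon=100.0, rng=rng)
print(len(arrivals))
```

The expected number of arrivals is the integral of the intensity over the horizon (about 500 here), so the simulated count should fall near that value.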

G. S. Fishman; E. P. C. Kao

1974-01-01

329

Recent progress in 3D EM/EM-PIC simulation with ARGUS and parallel ARGUS  

SciTech Connect

ARGUS is an integrated, 3-D, volumetric simulation model for systems involving electric and magnetic fields and charged particles, including materials embedded in the simulation region. The code offers the capability to carry out time domain and frequency domain electromagnetic simulations of complex physical systems. ARGUS offers a Boolean solid-model structure input capability that can include essentially arbitrary structures on the computational domain, and a modular architecture that allows multiple physics packages to access the same data structure and to share common code utilities. Physics modules are in place to compute electrostatic and electromagnetic fields, the normal modes of RF structures, and self-consistent particle-in-cell (PIC) simulation in either a time-dependent mode or a steady-state mode. The PIC modules include multiple particle species, the Lorentz equations of motion, and algorithms for the creation of particles by emission from material surfaces, injection onto the grid, and ionization. In this paper, we present an updated overview of ARGUS, with particular emphasis given to recent algorithmic and computational advances. These include a completely rewritten frequency domain solver which efficiently treats lossy materials and periodic structures, a parallel version of ARGUS with support for both shared-memory parallel vector (i.e., CRAY) machines and distributed-memory massively parallel MIMD systems, and numerous new applications of the code.

Mankofsky, A.; Petillo, J.; Krueger, W.; Mondelli, A. [Science Applications International Corp., McLean, VA (United States); McNamara, B.; Philp, R. [Leabrook Computing Ltd., Oxford (United Kingdom)

1994-12-31

330

On the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods  

PubMed Central

We present a case study on the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods. Graphics cards, containing multiple Graphics Processing Units (GPUs), are self-contained parallel computational devices that can be housed in conventional desktop and laptop computers and can be thought of as prototypes of the next generation of many-core processors. For certain classes of population-based Monte Carlo algorithms they offer massively parallel simulation, with the added advantage over conventional distributed multi-core processors that they are cheap, easily accessible, easy to maintain, easy to code, dedicated local devices with low power consumption. On a canonical set of stochastic simulation examples including population-based Markov chain Monte Carlo methods and Sequential Monte Carlo methods, we find speedups from 35 to 500 fold over conventional single-threaded computer code. Our findings suggest that GPUs have the potential to facilitate the growth of statistical modelling into complex data rich domains through the availability of cheap and accessible many-core computation. We believe the speedup we observe should motivate wider use of parallelizable simulation methods and greater methodological attention to their design.
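
The population-based structure these speedups rely on can be sketched on the CPU: many independent random-walk Metropolis chains advanced in lockstep with vectorized operations, a stand-in for the GPU kernels in the paper. The standard-normal target and step size are illustrative assumptions.

```python
import numpy as np

def metropolis_population(n_chains, n_steps, step=1.0, seed=0):
    """Run n_chains independent random-walk Metropolis chains in lockstep,
    targeting a standard normal density; returns the final states."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_chains)
    log_p = -0.5 * x ** 2  # log-density up to an additive constant
    for _ in range(n_steps):
        prop = x + step * rng.standard_normal(n_chains)
        log_p_prop = -0.5 * prop ** 2
        # One acceptance decision per chain, all chains updated at once.
        accept = np.log(rng.random(n_chains)) < log_p_prop - log_p
        x = np.where(accept, prop, x)
        log_p = np.where(accept, log_p_prop, log_p)
    return x

samples = metropolis_population(n_chains=20000, n_steps=500)
print(samples.mean(), samples.var())
```

Because every chain executes the same instruction stream, the loop body maps directly onto SPMD hardware; on a GPU each chain would occupy one thread.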

Lee, Anthony; Yau, Christopher; Giles, Michael B.; Doucet, Arnaud; Holmes, Christopher C.

2011-01-01

331

Massively parallel computing simulation of fluid flow in the unsaturated zone of Yucca Mountain, Nevada  

SciTech Connect

This paper presents the application of parallel computing techniques to large-scale modeling of fluid flow in the unsaturated zone (UZ) at Yucca Mountain, Nevada. In this study, parallel computing techniques, as implemented into the TOUGH2 code, are applied in large-scale numerical simulations on a distributed-memory parallel computer. The modeling study has been conducted using an over-one-million-cell three-dimensional numerical model, which incorporates a wide variety of field data for the highly heterogeneous fractured formation at Yucca Mountain. The objective of this study is to analyze the impact of various surface infiltration scenarios (under current and possible future climates) on flow through the UZ system, using various hydrogeological conceptual models with refined grids. The results indicate that the one-million-cell models produce better-resolution results and reveal some flow patterns that cannot be obtained using coarse-grid models.

Zhang, Keni; Wu, Yu-Shu; Bodvarsson, G.S.

2001-08-31

332

Parallel computing simulation of fluid flow in the unsaturated zone of Yucca Mountain, Nevada.  

PubMed

This paper presents the application of parallel computing techniques to large-scale modeling of fluid flow in the unsaturated zone (UZ) at Yucca Mountain, Nevada. In this study, parallel computing techniques, as implemented into the TOUGH2 code, are applied in large-scale numerical simulations on a distributed-memory parallel computer. The modeling study has been conducted using an over-1-million-cell three-dimensional numerical model, which incorporates a wide variety of field data for the highly heterogeneous fractured formation at Yucca Mountain. The objective of this study is to analyze the impact of various surface infiltration scenarios (under current and possible future climates) on flow through the UZ system, using various hydrogeological conceptual models with refined grids. The results indicate that the 1-million-cell models produce better-resolution results and reveal some flow patterns that cannot be obtained using coarse-grid models. PMID:12714301

Zhang, Keni; Wu, Yu-Shu; Bodvarsson, G S

333

Relevance of the parallel nonlinearity in gyrokinetic simulations of tokamak plasmas  

SciTech Connect

The influence of the parallel nonlinearity on transport in gyrokinetic simulations is assessed for values of ρ* which are typical of current experiments. Here, ρ* = ρ_s/a is the ratio of gyroradius, ρ_s, to plasma minor radius, a. The conclusion, derived from simulations with both GYRO [J. Candy and R. E. Waltz, J. Comput. Phys. 186, 585 (2003)] and GEM [Y. Chen and S. E. Parker, J. Comput. Phys. 189, 463 (2003)], is that no measurable effect of the parallel nonlinearity is apparent for ρ* < 0.012. This result is consistent with scaling arguments, which suggest that the parallel nonlinearity should be O(ρ*) smaller than the E×B nonlinearity. Indeed, for the plasma parameters under consideration, the magnitude of the parallel nonlinearity is a factor of 8ρ* smaller (for 0.00075 < ρ* < 0.012) than the other retained terms in the nonlinear gyrokinetic equation.

Candy, J.; Waltz, R. E.; Parker, S. E.; Chen, Y. [General Atomics, San Diego, California 92121 (United States); Center for Integrated Plasma Studies, University of Colorado at Boulder, Boulder, Colorado 80309 (United States)

2006-07-15

334

A scalable parallel algorithm for large-scale reactive force-field molecular dynamics simulations  

NASA Astrophysics Data System (ADS)

A scalable parallel algorithm has been designed to perform multimillion-atom molecular dynamics (MD) simulations, in which first principles-based reactive force fields (ReaxFF) describe chemical reactions. Environment-dependent bond orders associated with atomic pairs and their derivatives are reused extensively with the aid of linked-list cells to minimize the computation associated with atomic n-tuple interactions (n ≤ 4 explicitly and ≤ 6 due to chain-rule differentiation). These n-tuple computations are made modular, so that they can be reconfigured effectively with a multiple time-step integrator to further reduce the computation time. Atomic charges are updated dynamically with an electronegativity equalization method, by iteratively minimizing the electrostatic energy with the charge-neutrality constraint. The ReaxFF-MD simulation algorithm has been implemented on parallel computers based on a spatial decomposition scheme combined with distributed n-tuple data structures. The measured parallel efficiency of the parallel ReaxFF-MD algorithm is 0.998 on 131,072 IBM BlueGene/L processors for a 1.01 billion-atom RDX system.
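
The linked-list cells mentioned above confine pair searches to neighbouring cells instead of all-pairs scans. A minimal serial sketch, with a Python dictionary standing in for the linked lists and a brute-force cross-check (not the paper's parallel implementation; box size and cutoff are illustrative):

```python
import itertools
import random

def dist2(a, b, box):
    """Squared minimum-image distance under periodic boundaries."""
    return sum(min(abs(x - y), box - abs(x - y)) ** 2 for x, y in zip(a, b))

def cell_pairs(points, cutoff, box):
    """Return all pairs (i, j), i < j, closer than cutoff in a periodic cubic
    box, searching only the 27 cells neighbouring each particle's cell."""
    ncell = max(1, int(box / cutoff))  # cell edge >= cutoff
    size = box / ncell
    cells = {}
    for i, p in enumerate(points):
        key = tuple(int(c / size) % ncell for c in p)
        cells.setdefault(key, []).append(i)
    pairs = set()
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in itertools.product((-1, 0, 1), repeat=3):
            nb = cells.get(((cx + dx) % ncell, (cy + dy) % ncell,
                            (cz + dz) % ncell), [])
            for i in members:
                for j in nb:
                    if i < j and dist2(points[i], points[j], box) < cutoff ** 2:
                        pairs.add((i, j))
    return pairs

random.seed(1)
box, cutoff = 10.0, 1.5
pts = [tuple(random.uniform(0, box) for _ in range(3)) for _ in range(200)]
fast = cell_pairs(pts, cutoff, box)
brute = {(i, j) for i in range(len(pts)) for j in range(i + 1, len(pts))
         if dist2(pts[i], pts[j], box) < cutoff ** 2}
print(len(fast), fast == brute)
```

Because the cell edge is at least the cutoff, every interacting pair lies in adjacent cells, so the cell search reproduces the brute-force pair set at far lower cost for large N.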

Nomura, Ken-Ichi; Kalia, Rajiv K.; Nakano, Aiichiro; Vashishta, Priya

2008-01-01

335

Parallel spatial direct numerical simulation of boundary-layer flow transition on IBM SP1  

SciTech Connect

The spatially evolving disturbances that are associated with laminar-to-turbulent transition in three-dimensional boundary-layer flows are computed with the PSDNS code on an IBM SP1 parallel supercomputer. By remapping the distributed data structure during the course of the calculation, optimized serial library routines can be utilized that substantially increase the computational performance. Although the remapping incurs a high communication penalty, the parallel efficiency of the code remains above 40 percent for all performed calculations. By using appropriate compile options and optimized library routines, the serial code achieves 52-56 Mflops on a single node of the SP1 (45 percent of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a "real-world" simulation that consists of 1.7 million grid points. Comparisons to the Cray Y/MP and Cray C-90 are made for this large-scale simulation.

Hanebutte, U.R. [Argonne National Lab., IL (United States); Joslin, R.D. [National Aeronautics and Space Administration, Hampton, VA (United States). Langley Research Center; Zubair, M. [International Business Machines Corp., Yorktown Heights, NY (United States). Thomas J. Watson Research Center

1995-07-01

336

Adaptive finite element simulation of flow and transport applications on parallel computers  

NASA Astrophysics Data System (ADS)

The subject of this work is the adaptive finite element simulation of problems arising in flow and transport applications on parallel computers. Of particular interest are new contributions to adaptive mesh refinement (AMR) in this parallel high-performance context, including novel work on data structures, treatment of constraints in a parallel setting, generality and extensibility via object-oriented programming, and the design/implementation of a flexible software framework. This technology and software capability then enables more robust, reliable treatment of multiscale-multiphysics problems and specific studies of fine scale interaction such as those in biological chemotaxis (Chapter 4) and high-speed shock physics for compressible flows (Chapter 5). The work begins by presenting an overview of key concepts and data structures employed in AMR simulations. Of particular interest is how these concepts are applied in the physics-independent software framework which is developed here and is the basis for all the numerical simulations performed in this work. This open-source software framework has been adopted by a number of researchers in the U.S. and abroad for use in a wide range of applications. The dynamic nature of adaptive simulations poses particular issues for efficient implementation on distributed-memory parallel architectures. Communication cost, computational load balance, and memory requirements must all be considered when developing adaptive software for this class of machines. Specific extensions to the adaptive data structures to enable implementation on parallel computers are therefore considered in detail. The libMesh framework for performing adaptive finite element simulations on parallel computers is developed to provide a concrete implementation of the above ideas. 
This physics-independent framework is applied to two distinct flow and transport applications classes in the subsequent application studies to illustrate the flexibility of the design and to demonstrate the capability for resolving complex multiscale processes efficiently and reliably. The first application considered is the simulation of chemotactic biological systems such as colonies of Escherichia coli. This work appears to be the first application of AMR to chemotactic processes. These systems exhibit transient, highly localized features and are important in many biological processes, which make them ideal for simulation with adaptive techniques. A nonlinear reaction-diffusion model for such systems is described and a finite element formulation is developed. The solution methodology is described in detail. Several phenomenological studies are conducted to study chemotactic processes and resulting biological patterns which use the parallel adaptive refinement capability developed in this work. The other application study is much more extensive and deals with fine scale interactions for important hypersonic flows arising in aerospace applications. These flows are characterized by highly nonlinear, convection-dominated flowfields with very localized features such as shock waves and boundary layers. These localized features are well-suited to simulation with adaptive techniques. A novel treatment of the inviscid flux terms arising in a streamline-upwind Petrov-Galerkin finite element formulation of the compressible Navier-Stokes equations is also presented and is found to be superior to the traditional approach. The parallel adaptive finite element formulation is then applied to several complex flow studies, culminating in fully three-dimensional viscous flows about complex geometries such as the Space Shuttle Orbiter. 
Physical phenomena such as viscous/inviscid interaction, shock wave/boundary layer interaction, shock/shock interaction, and unsteady acoustic-driven flowfield response are considered in detail. A computational investigation of a 25°/55° double cone configuration details the complex multiscale flow features and investigates a potential source of experimentally-observed unsteady flowfield response.

Kirk, Benjamin Shelton

337

Object-Oriented Parallel Particle-in-Cell Code for Beam Dynamics Simulation in Linear Accelerators  

SciTech Connect

In this paper, we present an object-oriented three-dimensional parallel particle-in-cell code for beam dynamics simulation in linear accelerators. A two-dimensional parallel domain decomposition approach is employed within a message passing programming paradigm, along with dynamic load balancing. Implementing object-oriented software design provides the code with better maintainability, reusability, and extensibility compared with conventional structure-based code. This also helps to encapsulate the details of communications syntax. Performance tests on SGI/Cray T3E-900 and SGI Origin 2000 machines show good scalability of the object-oriented code. Some important features of this code also include employing symplectic integration with linear maps of external focusing elements and using z as the independent variable, typical in accelerators. A successful application was done to simulate beam transport through three superconducting sections in the APT linac design.

Qiang, J.; Ryne, R.D.; Habib, S.; Decky, V.

1999-11-13

338

Parallel molecular dynamics simulations of pressure-induced structural transformations in cadmium selenide nanocrystals  

NASA Astrophysics Data System (ADS)

Parallel molecular dynamics (MD) simulations are performed to investigate pressure-induced solid-to-solid structural phase transformations in cadmium selenide (CdSe) nanorods. The effects of the size and shape of nanorods on different aspects of structural phase transformations are studied. Simulations are based on interatomic potentials validated extensively by experiments. Simulations range from 10^5 to 10^6 atoms. These simulations are enabled by highly scalable algorithms executed on massively parallel Beowulf computing architectures. Pressure-induced structural transformations are studied using a hydrostatic pressure medium simulated by atoms interacting via the Lennard-Jones potential. Four single-crystal CdSe nanorods, each 44 Å in diameter but varying in length, in the range between 44 Å and 600 Å, are studied independently in two sets of simulations. The first simulation is the downstroke simulation, where each rod is embedded in the pressure medium and subjected to increasing pressure during which it undergoes a forward transformation from a 4-fold coordinated wurtzite (WZ) crystal structure to a 6-fold coordinated rocksalt (RS) crystal structure. In the second, so-called upstroke simulation, the pressure on the rods is decreased and a reverse transformation from 6-fold RS to a 4-fold coordinated phase is observed. The transformation pressure in the forward transformation depends on the nanorod size, with longer rods transforming at lower pressures close to the bulk transformation pressure. Spatially-resolved structural analyses, including pair-distributions, atomic-coordinations and bond-angle distributions, indicate nucleation begins at the surface of nanorods and spreads inward. The transformation results in a single RS domain, in agreement with experiments. The microscopic mechanism for transformation is observed to be the same as for bulk CdSe. 
A nanorod size dependency is also found in reverse structural transformations, with longer nanorods transforming more readily than smaller ones. Nucleation initiates at the center of the rod and grows outward.
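As an illustrative aside (not code from the record above), the Lennard-Jones pair interaction used for the hydrostatic pressure medium can be sketched as follows; the reduced units and default parameters are assumptions for the sketch:

```python
import math

def lj_energy_force(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy and scalar force at separation r (reduced units)."""
    sr6 = (sigma / r) ** 6
    energy = 4.0 * epsilon * (sr6 * sr6 - sr6)
    # force = -dU/dr; a positive value pushes the pair apart
    force = 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r
    return energy, force

# The potential minimum sits at r = 2^(1/6) * sigma, where the force vanishes.
r_min = 2.0 ** (1.0 / 6.0)
e_min, f_min = lj_energy_force(r_min)
```

Summing this interaction over medium-medium and medium-rod pairs, and rescaling the medium's box, is one standard way such a pressure bath is realized in MD codes.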

Lee, Nicholas Jabari Ouma

339

A scalable parallel algorithm for large-scale reactive force-field molecular dynamics simulations  

Microsoft Academic Search

A scalable parallel algorithm has been designed to perform multimillion-atom molecular dynamics (MD) simulations, in which first-principles-based reactive force fields (ReaxFF) describe chemical reactions. Environment-dependent bond orders associated with atomic pairs and their derivatives are reused extensively with the aid of linked-list cells to minimize the computation associated with atomic n-tuple interactions (n ≤ 4 explicitly and ≤ 6 due

Ken-Ichi Nomura; Rajiv K. Kalia; Aiichiro Nakano; Priya Vashishta

2008-01-01

340

Fast parallel Particle-To-Grid interpolation for plasma PIC simulations on the GPU  

Microsoft Academic Search

Particle-In-Cell (PIC) methods have been widely used for plasma physics simulations in the past three decades. To ensure an acceptable level of statistical accuracy relatively large numbers of particles are needed. State-of-the-art Graphics Processing Units (GPUs), with their high memory bandwidth, hundreds of SPMD processors, and half-a-teraflop performance potential, offer a viable alternative to distributed memory parallel computers for running

George Stantchev; William Dorland; Nail A. Gumerov

2008-01-01

341

Component versus Anti-parallel Merging at the Magnetopause: Nonlinear Theory and Particle Simulations  

NASA Astrophysics Data System (ADS)

One of the controversial issues regarding magnetic reconnection at the dayside magnetopause is the location where reconnection first occurs during periods of a large interplanetary magnetic field By. The question is whether magnetic reconnection can occur in locations where the field is not exactly anti-parallel. Nonlinear theory of the collisionless tearing mode predicts that in the presence of a guide field the instability saturates at too small amplitudes to be of any relevance at the magnetopause. Our recent linear theory analysis shows that the growth rate of the collisionless tearing mode remains significant even in the presence of a large guide field. This would indicate the possibility of both component and anti-parallel merging. However, aside from sizeable linear growth, the tearing mode must saturate at sufficiently large amplitude if it is to be a viable mechanism for reconnection at the magnetopause. To this end, we have performed a series of full particle simulations to address the nonlinear saturation of the tearing instability both in the presence and in the absence of the guide field. These simulations are the first of their kind and were performed for large mass ratios and with very high resolution in order to address the saturation problem. Our results show major deviations from previous studies of both anti-parallel and component merging scenarios. The results of these simulations and their implications for the magnetopause will be presented.

Quest, K.; Karimabadi, H.; Daughton, W.

2003-12-01

342

Simulation study of a parallel processor with unbalanced loads. Master's thesis  

SciTech Connect

The purpose of this thesis was twofold: first, to estimate the impact of unbalanced computational loads on a parallel-processing architecture via Monte Carlo simulation; and second, to investigate the impact of representing the dynamics of the parallel-processing problem via animated simulation. The study is constrained to the hypercube architecture, in which each node is connected in a predetermined topology and allowed to communicate with other nodes through calls to the operating system. Routing of messages through the network is fixed and specified within the operating system. Message transmission preempts nodal processing, so internodal communication complicates the concurrent operation of the network. Two independent variables are defined: 1) the degree of imbalance, which characterizes the nature or severity of the load imbalance, and 2) the degree of locality, which characterizes the node loadings with respect to node locations across the cube. A SLAM II simulation model of a generic 16-node hypercube was constructed in which each node processes a predetermined number of computational tasks and, following each task, sends a message to a single randomly chosen receiver node. An experiment was designed in which the independent variables, degree of imbalance and degree of locality, were varied across two computation-to-I/O ratios to determine their separate and interactive effects on the dependent variable, job speedup. ANOVA and regression techniques were used to estimate the relationship of load imbalance, locality, computation-to-I/O ratio, and their interactions to job speedup. Results show that load imbalance severely impacts a parallel processor's performance.
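As a minimal sketch of why imbalance caps speedup (not the thesis's SLAM II model, and ignoring communication), the ideal job speedup is total work divided by the most loaded node's work:

```python
def speedup(node_loads):
    """Ideal speedup of a parallel run limited by its most loaded node.

    node_loads: work units assigned to each node; communication is ignored.
    """
    total = sum(node_loads)
    bottleneck = max(node_loads)
    return total / bottleneck

balanced = [100] * 16          # perfectly balanced 16-node hypercube
skewed = [400] + [80] * 15     # one hot node, same total work

s_bal = speedup(balanced)
s_skw = speedup(skewed)
```

With identical total work, the balanced loading reaches the full 16x while the skewed loading is held to 4x by its hot node, which is the qualitative effect the thesis quantifies.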

Moore, T.S.

1987-12-01

343

Scalable parallel Monte Carlo algorithm for atomistic simulations of precipitation in alloys  

NASA Astrophysics Data System (ADS)

We present an extension of the semi-grand-canonical (SGC) ensemble that we refer to as the variance-constrained semi-grand-canonical (VC-SGC) ensemble. It allows for transmutation Monte Carlo simulations of multicomponent systems in multiphase regions of the phase diagram and lends itself to scalable simulations on massively parallel platforms. By combining transmutation moves with molecular dynamics steps, structural relaxations and thermal vibrations in realistic alloys can be taken into account. In this way, we construct a robust and efficient simulation technique that is ideally suited for large-scale simulations of precipitation in multicomponent systems in the presence of structural disorder. To illustrate the algorithm introduced in this work, we study the precipitation of Cu in nanocrystalline Fe.
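As an illustrative aside, a single transmutation move in the plain semi-grand-canonical ensemble (the starting point the record extends; the variance constraint itself is not sketched here) can be written as a Metropolis step whose weight includes the chemical-potential difference. The toy non-interacting energy model is an assumption:

```python
import math
import random

def sgc_transmutation_move(spins, energy_fn, dmu, kT, rng):
    """One Metropolis transmutation move in the semi-grand-canonical ensemble.

    spins: list of +1/-1 site occupations (two species on a fixed lattice).
    dmu:   chemical-potential difference driving the composition.
    """
    i = rng.randrange(len(spins))
    trial = list(spins)
    trial[i] = -trial[i]                      # transmute the species at site i
    d_e = energy_fn(trial) - energy_fn(spins)
    d_n = trial[i] - spins[i]                 # change in the composition variable
    # Metropolis acceptance with the SGC weight exp(-(dE - dmu*dN)/kT)
    if rng.random() < math.exp(min(0.0, -(d_e - dmu * d_n) / kT)):
        return trial
    return spins

# Toy run: non-interacting sites, so dmu alone sets the equilibrium composition.
rng = random.Random(42)
spins = [-1] * 100
for _ in range(20000):
    spins = sgc_transmutation_move(spins, lambda s: 0.0, dmu=1.0, kT=1.0, rng=rng)
mean_spin = sum(spins) / len(spins)
```

In the hybrid scheme of the record, such moves would alternate with molecular dynamics segments so that structural relaxations and vibrations are sampled as well.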

Sadigh, Babak; Erhart, Paul; Stukowski, Alexander; Caro, Alfredo; Martinez, Enrique; Zepeda-Ruiz, Luis

2012-05-01

344

Real-time simulation of MHD/steam power plants by digital parallel processors  

NASA Astrophysics Data System (ADS)

Attention is given to a large FORTRAN-coded program which simulates the dynamic response of the MHD/steam plant on either a SEL 32/55 or VAX 11/780 computer. The code realizes a detailed first-principles model of the plant. Recently, in addition to the VAX 11/780, an AD-10 has been installed for use as a real-time simulation facility. The parallel processor AD-10 is capable of simulating the MHD/steam plant at several times real-time rates. This is desirable in order to rapidly develop a large database of varied plant operating conditions. The combined-cycle MHD/steam plant model is discussed, taking into account a number of disadvantages. These disadvantages can be overcome with the aid of an array processor used as an adjunct to the unit processor. The conversion of some computations for real-time simulation is considered.

Johnson, R. M.; Rudberg, D. A.

345

Facilitating arrhythmia simulation: the method of quantitative cellular automata modeling and parallel running  

PubMed Central

Background Many arrhythmias are triggered by abnormal electrical activity at the ionic channel and cell level, and then evolve spatio-temporally within the heart. To understand arrhythmias better and to diagnose them more precisely by their ECG waveforms, a whole-heart model is required to explore the association between the massively parallel activities at the channel/cell level and the integrative electrophysiological phenomena at organ level. Methods We have developed a method to build large-scale electrophysiological models by using extended cellular automata, and to run such models on a cluster of shared memory machines. We describe here the method, including the extension of a language-based cellular automaton to implement quantitative computing, the building of a whole-heart model with Visible Human Project data, the parallelization of the model on a cluster of shared memory computers with OpenMP and MPI hybrid programming, and a simulation algorithm that links cellular activity with the ECG. Results We demonstrate that electrical activities at channel, cell, and organ levels can be traced and captured conveniently in our extended cellular automaton system. Examples of some ECG waveforms simulated with a 2-D slice are given to support the ECG simulation algorithm. A performance evaluation of the 3-D model on a four-node cluster is also given. Conclusions Quantitative multicellular modeling with extended cellular automata is a highly efficient and widely applicable method to weave experimental data at different levels into computational models. This process can be used to investigate complex and collective biological activities that can be described neither by their governing differential equations nor by discrete parallel computation. Transparent cluster computing is a convenient and effective method to make time-consuming simulation feasible. Arrhythmias, as a typical case, can be effectively simulated with the methods described.
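As an illustrative aside, the excitable-media behavior that such cardiac cellular automata capture can be sketched with a classic Greenberg-Hastings-style update rule; this toy three-state automaton is a stand-in, not the quantitative model of the record:

```python
def ca_step(grid):
    """One update of a Greenberg-Hastings-style excitable-media automaton.

    States: 0 = resting, 1 = excited, 2 = refractory. A resting cell fires
    when any 4-neighbour is excited; excited cells become refractory;
    refractory cells recover. Periodic boundaries.
    """
    n, m = len(grid), len(grid[0])
    new = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = grid[i][j]
            if s == 1:
                new[i][j] = 2
            elif s == 2:
                new[i][j] = 0
            else:
                neigh = [grid[(i - 1) % n][j], grid[(i + 1) % n][j],
                         grid[i][(j - 1) % m], grid[i][(j + 1) % m]]
                new[i][j] = 1 if 1 in neigh else 0
    return new

# A single excited cell launches a ring wave across the tissue.
grid = [[0] * 5 for _ in range(5)]
grid[2][2] = 1
grid = ca_step(grid)
```

Because each cell reads only its neighbours' previous states, the grid partitions naturally across OpenMP threads or MPI ranks with a one-cell halo exchange, which is the parallelization pattern the record describes.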

Zhu, Hao; Sun, Yan; Rajagopal, Gunaretnam; Mondry, Adrian; Dhar, Pawan

2004-01-01

346

Use of Parallel Micro-Platform for the Simulation the Space Exploration  

NASA Astrophysics Data System (ADS)

The purpose of this work is to create a parallel micro-platform that simulates the virtual movements of space exploration in 3D. One of the innovations presented in this design is the application of a lever mechanism for the transmission of movement. The development of such a robot is a challenging task, very different from that of industrial manipulators, owing to a totally different set of target requirements. This work presents the computer-aided study and simulation of the movement of this parallel manipulator. The model was developed on the Unigraphics computer-aided design platform, which was used for the geometric modeling of each component and of the final assembly (CAD), the generation of files for the computer-aided manufacture (CAM) of each piece, and the kinematic simulation of the system under different driving schemes. We used the MATLAB aerospace toolbox and created an adaptive control module to simulate the system.

Velasco Herrera, Victor Manuel; Velasco Herrera, Graciela; Rosano, Felipe Lara; Rodriguez Lozano, Salvador; Lucero Roldan Serrato, Karen

347

Switching to High Gear: Opportunities for Grand-scale Real-time Parallel Simulations  

SciTech Connect

The recent emergence of dramatically large computational power, spanning desktops with multi-core processors and multiple graphics cards to supercomputers with 10^5 processor cores, has suddenly resulted in simulation-based solutions trailing behind in the ability to fully tap the new computational capacity. Here, we motivate the need for switching the parallel simulation research to a higher gear to exploit the new, immense levels of computational power. The potential for grand-scale real-time solutions is illustrated using preliminary results from prototypes in four example application areas: (a) state- or regional-scale vehicular mobility modeling, (b) very large-scale epidemic modeling, (c) modeling the propagation of wireless network signals in very large, cluttered terrains, and, (d) country- or world-scale social behavioral modeling. We believe the stage is perfectly poised for the parallel/distributed simulation community to envision and formulate similar grand-scale, real-time simulation-based solutions in many application areas.

Perumalla, Kalyan S [ORNL

2009-01-01

348

Massively parallel Monte Carlo for many-particle simulations on GPUs  

NASA Astrophysics Data System (ADS)

Current trends in parallel processors call for the design of efficient massively parallel algorithms for scientific computing. Parallel algorithms for Monte Carlo simulations of thermodynamic ensembles of particles have received little attention because of the inherent serial nature of the statistical sampling. We present a massively parallel method that obeys detailed balance and implement it for a system of hard disks on the GPU.[1] We reproduce results of serial high-precision Monte Carlo runs to verify the method.[2] This is a good test case because the hard disk equation of state over the range where the liquid transforms into the solid is particularly sensitive to small deviations away from the balance conditions. On a GeForce GTX 680, our GPU implementation executes 95 times faster than on a single Intel Xeon E5540 CPU core, enabling 17 times better performance per dollar and cutting energy usage by a factor of 10. [1] J.A. Anderson, E. Jankowski, T. Grubb, M. Engel and S.C. Glotzer, arXiv:1211.1646. [2] J.A. Anderson, M. Engel, S.C. Glotzer, M. Isobe, E.P. Bernard and W. Krauth, arXiv:1211.1645.
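As an illustrative aside, the core acceptance rule for hard disks (accept a trial displacement only if it creates no overlap) can be sketched serially; the checkerboard cell decomposition that makes the GPU version obey detailed balance is only noted in the comment, and all parameters here are assumptions:

```python
import random

def try_move(disks, i, delta, radius, box, rng):
    """Metropolis trial move for one hard disk: accept iff no overlap results.

    In a checkerboard-decomposition parallel scheme, disks in cells of one
    sub-lattice colour are moved simultaneously, with moves confined to their
    cell so that detailed balance is preserved. Here, a serial sketch with
    periodic boundaries.
    """
    x, y = disks[i]
    nx = (x + rng.uniform(-delta, delta)) % box
    ny = (y + rng.uniform(-delta, delta)) % box
    for j, (ox, oy) in enumerate(disks):
        if j == i:
            continue
        dx = (nx - ox + box / 2) % box - box / 2   # minimum-image separation
        dy = (ny - oy + box / 2) % box - box / 2
        if dx * dx + dy * dy < (2 * radius) ** 2:
            return False                            # overlap: reject
    disks[i] = (nx, ny)
    return True

rng = random.Random(7)
disks = [(1.0, 1.0), (5.0, 5.0)]
accepted = try_move(disks, 0, 0.1, 0.5, 10.0, rng)  # far apart: always accepted
```

The hard-disk system has no energy scale, so acceptance is purely geometric, which is what makes the equation of state so sensitive to any violation of the balance conditions.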

Glotzer, Sharon; Anderson, Joshua; Jankowski, Eric; Grubb, Thomas; Engel, Michael

2013-03-01

349

Implementation of unsteady sampling procedures for the parallel direct simulation Monte Carlo method  

NASA Astrophysics Data System (ADS)

An unsteady sampling routine for a general parallel direct simulation Monte Carlo method called PDSC is introduced, allowing the simulation of time-dependent flow problems in the near continuum range. A post-processing procedure called DSMC rapid ensemble averaging method (DREAM) is developed to improve the statistical scatter in the results while minimising both memory and simulation time. This method builds an ensemble average of repeated runs over a small number of sampling intervals prior to the sampling point of interest by restarting the flow using either a Maxwellian distribution based on macroscopic properties for near equilibrium flows (DREAM-I) or instantaneous particle data output by the original unsteady sampling of PDSC for strongly non-equilibrium flows (DREAM-II). The method is validated by simulating shock tube flow and the development of simple Couette flow. Unsteady PDSC is found to accurately predict the flow field in both cases with significantly reduced run-times over single-processor code, and DREAM greatly reduces the statistical scatter in the results while maintaining accurate particle velocity distributions. Simulations are then conducted of two applications involving the interaction of shocks over wedges. The results of these simulations are compared to experimental data and simulations from the literature where these are available. In general, it was found that 10 ensembled runs of DREAM processing could reduce the statistical uncertainty in the raw PDSC data by 2.5 to 3.3 times, based on the limited number of cases in the present study.
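As an illustrative aside, the statistical idea behind ensemble averaging of repeated runs (scatter shrinking roughly as 1/sqrt of the number of runs) can be sketched with a toy noisy sampler; the restart mechanics of DREAM itself are not reproduced, and the noise model is an assumption:

```python
import math
import random

def ensemble_average(run_once, n_runs, seed=0):
    """Average a noisy sampled quantity over repeated independent runs.

    Ensemble averaging reduces statistical scatter roughly as 1/sqrt(n_runs),
    which is the effect DREAM exploits with repeated short restarts.
    """
    rng = random.Random(seed)
    samples = [run_once(rng) for _ in range(n_runs)]
    mean = sum(samples) / n_runs
    var = sum((s - mean) ** 2 for s in samples) / (n_runs - 1)
    stderr = math.sqrt(var / n_runs)
    return mean, stderr

# Toy "one DSMC run": true value 1.0 plus Gaussian sampling noise.
noisy = lambda rng: 1.0 + rng.gauss(0.0, 0.2)
mean10, err10 = ensemble_average(noisy, 10)
```

With 10 runs the standard error is already several times smaller than a single run's scatter, consistent in spirit with the 2.5 to 3.3 times reduction reported above.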

Cave, H. M.; Tseng, K.-C.; Wu, J.-S.; Jermy, M. C.; Huang, J.-C.; Krumdieck, S. P.

2008-06-01

350

Parallel Agent-Based Simulations on Clusters of GPUs and Multi-Core Processors  

SciTech Connect

An effective latency-hiding mechanism is presented in the parallelization of agent-based model simulations (ABMS) with millions of agents. The mechanism is designed to accommodate the hierarchical organization as well as heterogeneity of current state-of-the-art parallel computing platforms. We use it to explore the computation vs. communication trade-off continuum available with the deep computational and memory hierarchies of extant platforms and present a novel analytical model of the trade-off. We describe our implementation and report preliminary performance results on two distinct parallel platforms suitable for ABMS: CUDA threads on multiple, networked graphical processing units (GPUs), and pthreads on multi-core processors. Message Passing Interface (MPI) is used for inter-GPU as well as inter-socket communication on a cluster of multiple GPUs and multi-core processors. Results indicate the benefits of our latency-hiding scheme, delivering over 100-fold improvements in runtime for certain benchmark ABMS application scenarios with several million agents. This speed improvement is obtained on our system that is already two to three orders of magnitude faster on one GPU than an equivalent CPU-based execution in a popular Java-based simulator. Thus, the overall execution of our current work is over four orders of magnitude faster when executed on multiple GPUs.

Aaby, Brandon G [ORNL; Perumalla, Kalyan S [ORNL; Seal, Sudip K [ORNL

2010-01-01

351

Data Parallel Execution Challenges and Runtime Performance of Agent Simulations on GPUs  

SciTech Connect

Programmable graphics processing units (GPUs) have emerged as excellent computational platforms for certain general-purpose applications. The data parallel execution capabilities of GPUs specifically point to the potential for effective use in simulations of agent-based models (ABM). In this paper, the computational efficiency of ABM simulation on GPUs is evaluated on representative ABM benchmarks. The runtime speed of GPU-based models is compared to that of traditional CPU-based implementation, and also to that of equivalent models in traditional ABM toolkits (Repast and NetLogo). As expected, it is observed that GPU-based ABM execution affords excellent speedup on simple models, with better speedup on models exhibiting good locality and a fair amount of computation per memory element. Execution is two to three orders of magnitude faster with a GPU than with leading ABM toolkits, but at the cost of a decrease in modularity, ease of programmability, and reusability. At a more fundamental level, however, the data parallel paradigm is found to be somewhat at odds with traditional model-specification approaches for ABM. Effective use of data parallel execution, in general, seems to require resolution of modeling and execution challenges. Some of the challenges are identified and related solution approaches are described.

Perumalla, Kalyan S [ORNL; Aaby, Brandon G [ORNL

2008-01-01

352

Simulations of structural and dynamic anisotropy in nano-confined water between parallel graphite plates  

NASA Astrophysics Data System (ADS)

We use molecular dynamics simulations to study the structure, dynamics, and transport properties of nano-confined water between parallel graphite plates with separation distances (H) from 7 to 20 Å at different water densities with an emphasis on anisotropies generated by confinement. The behavior of the confined water phase is compared to non-confined bulk water under similar pressure and temperature conditions. Our simulations show anisotropic structure and dynamics of the confined water phase in directions parallel and perpendicular to the graphite plate. The magnitude of these anisotropies depends on the slit width H. Confined water shows ``solid-like'' structure and slow dynamics for the water layers near the plates. The mean square displacements (MSDs) and velocity autocorrelation functions (VACFs) for directions parallel and perpendicular to the graphite plates are calculated. By increasing the confinement distance from H = 7 Å to H = 20 Å, the MSD increases and the behavior of the VACF indicates that the confined water changes from solid-like to liquid-like dynamics. If the initial density of the water phase is set up using geometric criteria (i.e., distance between the graphite plates), large pressures (on the order of ~10 katm), and large pressure anisotropies are established within the water. By decreasing the density of the water between the confined plates to about 0.9 g cm-3, bubble formation and restructuring of the water layers are observed.
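As an illustrative aside, the mean square displacement referenced above can be sketched for a single-particle trajectory; splitting the squared-displacement sum by coordinate would expose the parallel/perpendicular anisotropy the record discusses. The toy ballistic trajectory is an assumption:

```python
def msd(traj):
    """Mean square displacement vs. lag time from one particle's trajectory.

    traj: list of (x, y, z) positions sampled at equal time intervals.
    Averages over all time origins for each lag.
    """
    n = len(traj)
    out = []
    for lag in range(1, n):
        disp2 = [sum((a - b) ** 2 for a, b in zip(traj[t + lag], traj[t]))
                 for t in range(n - lag)]
        out.append(sum(disp2) / len(disp2))
    return out

# Ballistic toy trajectory x = t along one axis, so MSD(lag) = lag**2.
traj = [(float(t), 0.0, 0.0) for t in range(6)]
curve = msd(traj)
```

For a diffusive signal the curve would instead grow linearly in lag, and the slope ratio between in-plane and out-of-plane components is one standard anisotropy measure.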

Mosaddeghi, Hamid; Alavi, Saman; Kowsari, M. H.; Najafi, Bijan

2012-11-01

353

Simulations of structural and dynamic anisotropy in nano-confined water between parallel graphite plates.  

PubMed

We use molecular dynamics simulations to study the structure, dynamics, and transport properties of nano-confined water between parallel graphite plates with separation distances (H) from 7 to 20 Å at different water densities with an emphasis on anisotropies generated by confinement. The behavior of the confined water phase is compared to non-confined bulk water under similar pressure and temperature conditions. Our simulations show anisotropic structure and dynamics of the confined water phase in directions parallel and perpendicular to the graphite plate. The magnitude of these anisotropies depends on the slit width H. Confined water shows "solid-like" structure and slow dynamics for the water layers near the plates. The mean square displacements (MSDs) and velocity autocorrelation functions (VACFs) for directions parallel and perpendicular to the graphite plates are calculated. By increasing the confinement distance from H = 7 Å to H = 20 Å, the MSD increases and the behavior of the VACF indicates that the confined water changes from solid-like to liquid-like dynamics. If the initial density of the water phase is set up using geometric criteria (i.e., distance between the graphite plates), large pressures (in the order of ~10 katm), and large pressure anisotropies are established within the water. By decreasing the density of the water between the confined plates to about 0.9 g cm(-3), bubble formation and restructuring of the water layers are observed. PMID:23163385

Mosaddeghi, Hamid; Alavi, Saman; Kowsari, M H; Najafi, Bijan

2012-11-14

354

Parallelization of Particle-Particle, Particle-Mesh Method within N-Body Simulation  

NSDL National Science Digital Library

The N-Body problem has become an integral part of the computational sciences, and many methods have arisen to solve and approximate it. The solution potentially requires on the order of N^2 calculations each time step; therefore, efficient performance of these N-Body algorithms is very significant [5]. This work describes the parallelization and optimization of the Particle-Particle, Particle-Mesh (P3M) algorithm within GalaxSeeHPC, an open-source N-Body simulation code. Upon successful profiling, MPI (Message Passing Interface) routines were implemented into the population of the density grid in the P3M method in GalaxSeeHPC. Each problem size recorded different results, and for a problem set dealing with 10,000 celestial bodies, speedups up to 10x were achieved. However, in accordance with Amdahl's Law, maximum speedups for the code should have been closer to 16x. In order to achieve maximum optimization, additional research is needed, and parallelization of the Fourier transform routines could prove to be rewarding. In conclusion, the GalaxSeeHPC simulation was successfully parallelized and obtained very respectable results, while further optimization remains possible.
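As an illustrative aside, the "population of the density grid" step that was parallelized can be sketched with cloud-in-cell deposition in one dimension; the 1-D reduction and unit masses are simplifying assumptions, not GalaxSeeHPC code:

```python
def deposit_cic(positions, n_cells, box):
    """Cloud-in-cell deposition of unit-mass particles onto a 1-D density grid.

    Each particle's mass is split linearly between its two nearest grid
    points (periodic wrap). In an MPI version, each rank deposits its own
    particles and the per-rank grids are combined with a reduction
    (e.g. MPI_Allreduce).
    """
    grid = [0.0] * n_cells
    dx = box / n_cells
    for x in positions:
        s = (x % box) / dx          # position in grid units
        i = int(s)
        frac = s - i
        grid[i] += 1.0 - frac
        grid[(i + 1) % n_cells] += frac
    return grid

grid = deposit_cic([1.0, 2.5], n_cells=4, box=4.0)   # dx = 1.0
```

Because deposition is a sum over independent particles, it parallelizes cleanly; the FFT-based Poisson solve that follows is the harder step, consistent with the suggestion above to parallelize the Fourier transform routines next.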

Nocito, Nicholas

355

De Novo Ultrascale Atomistic Simulations On High-End Parallel Supercomputers  

SciTech Connect

We present a de novo hierarchical simulation framework for first-principles based predictive simulations of materials and their validation on high-end parallel supercomputers and geographically distributed clusters. In this framework, high-end chemically reactive and non-reactive molecular dynamics (MD) simulations explore a wide solution space to discover microscopic mechanisms that govern macroscopic material properties, into which highly accurate quantum mechanical (QM) simulations are embedded to validate the discovered mechanisms and quantify the uncertainty of the solution. The framework includes an embedded divide-and-conquer (EDC) algorithmic framework for the design of linear-scaling simulation algorithms with minimal bandwidth complexity and tight error control. The EDC framework also enables adaptive hierarchical simulation with automated model transitioning assisted by graph-based event tracking. A tunable hierarchical cellular decomposition parallelization framework then maps the O(N) EDC algorithms onto Petaflops computers, while achieving performance tunability through a hierarchy of parameterized cell data/computation structures, as well as its implementation using hybrid Grid remote procedure call + message passing + threads programming. High-end computing platforms such as IBM BlueGene/L, SGI Altix 3000 and the NSF TeraGrid provide an excellent test ground for the framework. On these platforms, we have achieved unprecedented scales of quantum-mechanically accurate and well validated, chemically reactive atomistic simulations--1.06 billion-atom fast reactive force-field MD and 11.8 million-atom (1.04 trillion grid points) quantum-mechanical MD in the framework of the EDC density functional theory on adaptive multigrids--in addition to 134 billion-atom non-reactive space-time multiresolution MD, with parallel efficiency as high as 0.998 on 65,536 dual-processor BlueGene/L nodes. We have also achieved an automated execution of hierarchical QM/MD simulation on a Grid consisting of 6 supercomputer centers in the US and Japan (a total of 150 thousand processor-hours), in which the number of processors changes dynamically on demand and resources are allocated and migrated dynamically in response to faults. Furthermore, performance portability has been demonstrated on a wide range of platforms such as BlueGene/L, Altix 3000, and AMD Opteron-based Linux clusters.

Nakano, A; Kalia, R K; Nomura, K; Sharma, A; Vashishta, P; Shimojo, F; van Duin, A; Goddard, III, W A; Biswas, R; Srivastava, D; Yang, L H

2006-09-04

356

Xyce parallel electronic simulator design : mathematical formulation, version 2.0.  

SciTech Connect

This document is intended to contain a detailed description of the mathematical formulation of Xyce, a massively parallel SPICE-style circuit simulator developed at Sandia National Laboratories. The target audience of this document is people in the role of 'service provider'. An example of such a person would be a linear solver expert who is spending a small fraction of his time developing solver algorithms for Xyce. Such a person probably is not an expert in circuit simulation, and would benefit from a description of the equations solved by Xyce. In this document, modified nodal analysis (MNA) is described in detail, with a number of examples. Issues that are unique to circuit simulation, such as voltage limiting, are also described in detail.
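As an illustrative aside, the element "stamping" at the heart of modified nodal analysis can be sketched for a two-node resistive circuit; the helper names, element values, and the Cramer's-rule solve are assumptions for the sketch, not Xyce code:

```python
def stamp_resistor(G, n1, n2, r):
    """MNA stamp for a resistor between nodes n1 and n2 (node 0 = ground)."""
    g = 1.0 / r
    for row, col, sgn in ((n1, n1, 1), (n2, n2, 1), (n1, n2, -1), (n2, n1, -1)):
        if row and col:                    # ground rows/columns are dropped
            G[row - 1][col - 1] += sgn * g

def stamp_current(b, n1, n2, i):
    """MNA stamp for an independent current source driving i amps into n1."""
    if n1:
        b[n1 - 1] += i
    if n2:
        b[n2 - 1] -= i

def solve2(G, b):
    """Cramer's rule, sufficient for this 2-node example."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return [(b[0] * G[1][1] - G[0][1] * b[1]) / det,
            (G[0][0] * b[1] - b[0] * G[1][0]) / det]

# 1 A source into node 1; 2-ohm resistors node1-gnd, node1-node2, node2-gnd.
G = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]
stamp_resistor(G, 1, 0, 2.0)
stamp_resistor(G, 1, 2, 2.0)
stamp_resistor(G, 2, 0, 2.0)
stamp_current(b, 1, 0, 1.0)
v = solve2(G, b)                           # node voltages
```

Nonlinear devices contribute linearized stamps that are refreshed on every Newton iteration, and the voltage limiting mentioned above guards those updates.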

Hoekstra, Robert John; Waters, Lon J.; Hutchinson, Scott Alan; Keiter, Eric Richard; Russo, Thomas V.

2004-06-01

357

Embedded Microclusters in Zeolites and Cluster Beam Sputtering -- Simulation on Parallel Computers  

SciTech Connect

This report summarizes the research carried out under the DOE-supported program (DOE/ER/45477) during the course of this project. Large-scale molecular-dynamics (MD) simulations were performed to investigate: (1) sintering of microporous and nanophase Si{sub 3}N{sub 4}; (2) crack-front propagation in amorphous silica; (3) phonons, structural correlations, and mechanical behavior including dynamic fracture in graphitic tubules; and (4) amorphization and fracture in nanowires. The simulations were carried out with highly efficient multiscale algorithms and dynamic load-balancing schemes for mapping irregular atomistic simulations onto distributed-memory parallel architectures. These research activities resulted in fifty-three publications and fifty-five invited presentations.

Greenwell, Donald L.; Kalia, Rajiv K.; Vashishta, Priya

1996-12-01

358

Multiple Point Geostatistical Simulation with Simulated Annealing: Implementation Using Speculative Parallel Computing  

Microsoft Academic Search

Multiple-point geostatistical simulation aims at generating realizations that reproduce pattern statistics inferred from some training source, usually a training image. The most widely used algorithm is based on solving a single normal equation at each location using the conditional probabilities inferred during the training process. Simulated annealing offers an alternative implementation that, in addition, permits incorporating additional statistics to

Julián M. Ortiz; Oscar Peredo

359

LUsim: A Framework for Simulation-Based Performance Modeling and Prediction of Parallel Sparse LU Factorization  

SciTech Connect

Sparse parallel factorization is among the most complicated and irregular algorithms to analyze and optimize. Performance depends both on system characteristics such as the floating-point rate, the memory hierarchy, and the interconnect performance, as well as on input matrix characteristics such as the number and location of nonzeros. We present LUsim, a simulation framework for modeling the performance of sparse LU factorization. Our framework uses micro-benchmarks to calibrate the parameters of machine characteristics and additional tools to facilitate real-time performance modeling. We are using LUsim to analyze an existing parallel sparse LU factorization code, and to explore a latency-tolerant variant. We developed and validated a model of the factorization in SuperLU_DIST, then we modeled and implemented a new variant of slud, replacing a blocking collective communication phase with a non-blocking asynchronous point-to-point one. Our strategy realized a mean improvement of 11 percent over a suite of test matrices.

Univ. of California, San Diego; Li, Xiaoye Sherry; Cicotti, Pietro; Li, Xiaoye Sherry; Baden, Scott B.

2008-04-15

360

A study of the parallel algorithm for large-scale DC simulation of nonlinear systems  

NASA Astrophysics Data System (ADS)

Newton-Raphson DC analysis of large-scale nonlinear circuits may be an extremely time-consuming process even if sparse matrix techniques and bypassing of nonlinear model calculations are used. A slight decrease in the time required for this task may be achieved on multi-core, multithreaded computers if the calculation of the mathematical models of the nonlinear elements, as well as the stamp management of the sparse matrix entries, is handled by concurrent processes. This numerical complexity can be further reduced via circuit decomposition and parallel solution of blocks, taking the BBD matrix structure as a departure point. This block-parallel approach may yield considerable gains, though it is strongly dependent on the system topology and, of course, on the processor type. This contribution presents an easily parallelizable decomposition-based algorithm for DC simulation and provides a detailed study of its effectiveness.
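As an illustrative aside, the Newton-Raphson iteration underlying DC analysis can be sketched in its simplest scalar form for a series resistor-diode circuit; the circuit, its parameter values, and the function names are assumptions for the sketch:

```python
import math

def newton_dc(f, dfdv, v0, tol=1e-12, max_iter=100):
    """Scalar Newton-Raphson iteration, as used per node in DC analysis."""
    v = v0
    for _ in range(max_iter):
        dv = f(v) / dfdv(v)
        v -= dv
        if abs(dv) < tol:
            return v
    raise RuntimeError("no convergence")

# KCL at the single internal node of a series resistor-diode circuit:
# (v - Vs)/R + Is*(exp(v/Vt) - 1) = 0
Vs, R, Is, Vt = 5.0, 1000.0, 1e-12, 0.025
f = lambda v: (v - Vs) / R + Is * (math.exp(v / Vt) - 1.0)
dfdv = lambda v: 1.0 / R + (Is / Vt) * math.exp(v / Vt)
v_diode = newton_dc(f, dfdv, v0=0.6)
```

In the full problem each iteration solves a sparse linear system instead of a scalar division, and it is the model evaluations and matrix stamps of that system that the record proposes to compute concurrently.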

Cortés Udave, Diego Ernesto; Ogrodzki, Jan; Gutiérrez de Anda, Miguel Angel

2012-05-01

361

A fast parallel Poisson solver on irregular domains applied to beam dynamics simulations  

SciTech Connect

We discuss the scalable parallel solution of the Poisson equation within a Particle-In-Cell (PIC) code for the simulation of electron beams in particle accelerators of irregular shape. The problem is discretized by finite differences. Depending on the treatment of the Dirichlet boundary, the resulting system of equations is symmetric or 'mildly' nonsymmetric positive definite. In all cases, the system is solved by the preconditioned conjugate gradient algorithm with smoothed aggregation (SA) based algebraic multigrid (AMG) preconditioning. We investigate variants of the implementation of SA-AMG that lead to considerable improvements in the execution times. We demonstrate good scalability of the solver on a distributed-memory parallel processor with up to 2048 processors. We also compare our iterative solver with an FFT-based solver that is more commonly used for applications in beam dynamics.
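As an illustrative aside, the conjugate gradient iteration at the core of the solver can be sketched on a 1-D finite-difference Laplacian; the AMG preconditioning of the record is omitted, and the tiny test problem is an assumption:

```python
def cg(apply_A, b, tol=1e-10, max_iter=1000):
    """Plain (unpreconditioned) conjugate gradients for an SPD operator."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual b - A*0
    p = list(r)                      # initial search direction
    rs = sum(v * v for v in r)
    for _ in range(max_iter):
        Ap = apply_A(p)
        alpha = rs / sum(pi * ai for pi, ai in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        rs_new = sum(v * v for v in r)
        if rs_new < tol * tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def laplace_1d(v):
    """1-D finite-difference Laplacian with Dirichlet ends (h absorbed)."""
    n = len(v)
    return [2 * v[i] - (v[i - 1] if i > 0 else 0.0)
                     - (v[i + 1] if i < n - 1 else 0.0) for i in range(n)]

# Solve A x = b with b = A e, e = all-ones, and recover e.
ones = [1.0] * 8
b = laplace_1d(ones)
x = cg(laplace_1d, b)
```

The parallel version distributes the vectors, turning the dot products into global reductions and the operator application into a halo exchange; the preconditioner is then what keeps the iteration count bounded as the mesh grows.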

Adelmann, A. [Paul Scherrer Institut, CH-5234 Villigen (Switzerland)], E-mail: andreas.adelmann@psi.ch; Arbenz, P. [ETH Zuerich, Chair of Computational Science, Universitaetsstrasse 6, CH-8092 Zuerich (Switzerland)], E-mail: arbenz@inf.ethz.ch; Ineichen, Y. [Paul Scherrer Institut, CH-5234 Villigen (Switzerland); ETH Zuerich, Chair of Computational Science, Universitaetsstrasse 6, CH-8092 Zuerich (Switzerland)], E-mail: ineichen@inf.ethz.ch

2010-06-20

362

Tree Particle-Mesh: An Adaptive, Efficient, and Parallel Code for Collisionless Cosmological Simulation  

NASA Astrophysics Data System (ADS)

An improved implementation of an N-body code for simulating collisionless cosmological dynamics is presented. TPM (tree particle-mesh) combines the PM method on large scales with a tree code to handle particle-particle interactions at small separations. After the global PM forces are calculated, spatially distinct regions above a given density contrast are located; the tree code calculates the gravitational interactions inside these denser objects at higher spatial and temporal resolution. The new implementation includes individual particle time steps within trees, an improved treatment of tidal forces on trees, new criteria for higher force resolution and choice of time step, and parallel treatment of large trees. TPM is compared to P3M and a tree code (GADGET) and is found to give equivalent results in significantly less time. The implementation is highly portable (requiring a FORTRAN compiler and MPI) and efficient on parallel machines. The source code can be found on the World Wide Web.

Bode, Paul; Ostriker, Jeremiah P.

2003-03-01

363

Parallel 3D Finite Element Particle-in-Cell Simulations with Pic3P  

SciTech Connect

SLAC's Advanced Computations Department (ACD) has developed the parallel 3D Finite Element electromagnetic Particle-In-Cell code Pic3P. Designed for simulations of beam-cavity interactions dominated by space charge effects, Pic3P solves the complete set of Maxwell-Lorentz equations self-consistently and includes space-charge, retardation and boundary effects from first principles. Higher-order Finite Element methods with adaptive refinement on conformal unstructured meshes lead to highly efficient use of computational resources. Massively parallel processing with dynamic load balancing enables large-scale modeling of photoinjectors with unprecedented accuracy, aiding the design and operation of next-generation accelerator facilities. Applications include the LCLS RF gun and the BNL polarized SRF gun.

Candel, A.; Kabel, A.; Lee, L.; Li, Z.; Ng, C.; Schussman, G.; Ko, K.; /SLAC; Ben-Zvi, I.; Kewisch, J.; /Brookhaven

2009-06-19

364

Adaptive Flow Simulation of Turbulence in Subject-Specific Abdominal Aortic Aneurysm on Massively Parallel Computers  

NASA Astrophysics Data System (ADS)

Flow within the healthy human vascular system is typically laminar, but diseased conditions can alter the geometry sufficiently to produce transitional/turbulent flows in regions at (and immediately downstream of) the diseased section. The mean unsteadiness (pulsatile or respiratory cycle) further complicates the situation, making traditional turbulence simulation techniques (e.g., Reynolds-averaged Navier-Stokes simulations (RANSS)) suspect. At the other extreme, direct numerical simulation (DNS), while fully appropriate, can lead to large computational expense, particularly when the simulations must be done quickly since they are intended to affect the outcome of a medical treatment (e.g., virtual surgical planning). To produce simulations in a clinically relevant time frame requires: (1) an adaptive meshing technique that closely matches the desired local mesh resolution in all three directions to the highly anisotropic physical length scales in the flow, (2) efficient solution algorithms, and (3) excellent scaling on massively parallel computers. In this presentation we will demonstrate results for a subject-specific simulation of an abdominal aortic aneurysm using a stabilized finite element method on anisotropically adapted meshes consisting of O(10^8) elements over O(10^4) processors.

Sahni, Onkar; Jansen, Kenneth; Shephard, Mark; Taylor, Charles

2007-11-01

365

The Acceleration of Thermal Protons at Parallel Collisionless Shocks: Three-dimensional Hybrid Simulations  

NASA Astrophysics Data System (ADS)

We present three-dimensional hybrid simulations of collisionless shocks that propagate parallel to the background magnetic field to study the acceleration of protons that forms a high-energy tail on the distribution. We focus on the initial acceleration of thermal protons and compare it with results from one-dimensional simulations. We find that for both one- and three-dimensional simulations, particles that end up in the high-energy tail of the distribution later in the simulation gained their initial energy right at the shock. This confirms previous results but is the first to demonstrate this using fully three-dimensional fields. The result is not consistent with the "thermal leakage" model. We also show that the gyrocenters of protons in the three-dimensional simulation can drift away from the magnetic field lines on which they started due to the removal of ignorable coordinates that exist in one- and two-dimensional simulations. Our study clarifies the injection problem for diffusive shock acceleration.

Guo, Fan; Giacalone, Joe

2013-08-01

366

Using Speculative Execution to Reduce Communication in a Parallel Large Scale Earthquake Simulation  

NASA Astrophysics Data System (ADS)

Earthquake simulations on parallel systems can be communication intensive due to local events (rupture waves) which have global effects (stress transfer). These events require global communication to transmit the effects of increased stress to model elements on other computing nodes. We describe a method of using speculative execution in a large scale parallel computation to decrease communication and improve simulation speed. This method exploits the tendency of earthquake ruptures to remain physically localized even though their effects on stress will be over long ranges. In this method we assume the stress transfer caused by a rupture remains localized and avoid global communication until the rupture has a high probability of passing to another node. We then calculate the stress state of the system to ensure that the rupture in fact remained localized, proceeding if the assumption was correct or rolling back the calculation otherwise. Using this method we are able to reduce communication frequency by 78%, in turn decreasing communication time by up to 66% and improving simulation speed by up to 45%.
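The speculate/verify/rollback pattern described above can be sketched generically; the function names and the toy "rupture front" state below are illustrative, not taken from the paper:

```python
import copy

# Generic sketch of the speculate/verify/rollback pattern: advance under a
# locality assumption, verify it afterwards, and roll back if it failed.
def speculative_step(state, advance, is_still_local):
    checkpoint = copy.deepcopy(state)  # real codes checkpoint selectively
    advance(state)                     # advance assuming the event stays local
    if is_still_local(state):
        return state, True             # assumption held: skip global communication
    return checkpoint, False           # assumption failed: roll back and resync

# Toy usage: a rupture front that advances but stays inside this node's region.
state = {"front": 0.0}
state, ok = speculative_step(
    state,
    advance=lambda s: s.update(front=s["front"] + 0.1),
    is_still_local=lambda s: s["front"] < 1.0,
)
print(state, ok)
```

The savings come from how rarely the second branch is taken: each successful speculation replaces a global exchange with a purely local check.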

Heien, E. M.; Yikilmaz, M. B.; Sachs, M. K.; Rundle, J. B.; Turcotte, D. L.; Kellogg, L. H.

2011-12-01

367

Accelerating Groundwater Flow Simulation in MODFLOW Using JASMIN-Based Parallel Computing.  

PubMed

To accelerate the groundwater flow simulation process, this paper reports our work on developing an efficient parallel simulator through rebuilding the well-known software MODFLOW on JASMIN (J Adaptive Structured Meshes applications Infrastructure). The rebuilding is achieved by designing a patch-based data structure and parallel algorithms, and by making slight modifications to the computational flow and subroutines in MODFLOW. Both the memory requirements and computing efforts are distributed among all processors; to reduce communication cost, data transfers are batched and conveniently handled by adding ghost nodes to each patch. To further improve performance, constant-head/inactive cells are tagged and neglected during the linear solving process, and an efficient load balancing strategy is presented. The accuracy and efficiency are demonstrated through modeling three scenarios. The first application is a field flow problem located at Yanming Lake in China, used to help determine a reasonable quantity of groundwater exploitation. Desirable numerical accuracy and significant performance enhancement are obtained; typically, the tagged program with the load balancing strategy running on 40 cores is six times faster than the fastest MICCG-based MODFLOW program. The second test simulates flow in a highly heterogeneous aquifer; the AMG-based JASMIN program running on 40 cores is nine times faster than the GMG-based MODFLOW program. The third test is a simplified transient flow problem with on the order of tens of millions of cells to examine the scalability. Compared to 32 cores, parallel efficiencies of 77% and 68% are obtained on 512 and 1024 cores, respectively, which indicates impressive scalability. PMID:23600445

Cheng, Tangpei; Mo, Zeyao; Shao, Jingli

2013-04-18

368

Simple LabVIEW DC Circuit Simulation With Parallel Resistors: Overview  

NSDL National Science Digital Library

This is a downloadable simple DC circuit simulation with two resistors in parallel combined with a third resistor. It is useful for studying Ohm's Law. Users can adjust the voltage and the resistors while the current changes in real time, just like the real thing. Users are then asked whether the current increases or decreases as the resistance of the resistors increases. Includes instructions on how to measure DC/AC current. This free program requires Windows 9x, NT, XP or later. Note that this will NOT run on Mac OS.
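The relationship the simulation demonstrates can also be computed directly; a small sketch with illustrative component values:

```python
# The circuit described above: R1 in parallel with R2, in series with R3,
# driven by a DC voltage V. Component values below are illustrative.
def total_current(V, R1, R2, R3):
    Rp = (R1 * R2) / (R1 + R2)    # parallel combination of R1 and R2
    return V / (Rp + R3)          # Ohm's law for the whole loop

I1 = total_current(12.0, 100.0, 100.0, 50.0)  # Rp = 50 Ohm, total 100 Ohm
I2 = total_current(12.0, 200.0, 100.0, 50.0)  # raising R1 raises Rp
print(I1, I2)  # the current drops as the resistance increases
```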

2009-08-21

369

3-D Hybrid Simulation of Quasi-Parallel Bow Shock and Its Effects on the Magnetosphere  

SciTech Connect

A three-dimensional (3-D) global-scale hybrid simulation is carried out for the structure of the quasi-parallel bow shock, in particular the foreshock waves and pressure pulses. The wave evolution and interaction with the dayside magnetosphere are discussed. It is shown that diamagnetic cavities are generated in the turbulent foreshock due to the ion beam plasma interaction, and these compressional pulses lead to strong surface perturbations at the magnetopause and Alfven waves/field line resonance in the magnetosphere.

Lin, Y.; Wang, X.Y. [Physics Department, Auburn University, Auburn, AL 36849-5311 (United States)

2005-08-01

370

A parallel multigrid preconditioner for the simulation of large fracture networks  

SciTech Connect

Computational modeling of a fracture in disordered materials using discrete lattice models requires the solution of a linear system of equations every time a new lattice bond is broken. Solving these linear systems of equations successively is the most expensive part of fracture simulations using large three-dimensional networks. In this paper, we present a parallel multigrid preconditioned conjugate gradient algorithm to solve these linear systems. Numerical experiments demonstrate that this algorithm performs significantly better than the algorithms previously used to solve this problem.

Sampath, Rahul S [ORNL; Barai, Pallab [ORNL; Nukala, Phani K [ORNL

2010-01-01

371

High performance massively parallel direct N-body simulations on large GPU clusters.  

NASA Astrophysics Data System (ADS)

We present direct astrophysical N-body simulations with up to six million bodies using our parallel MPI/CUDA code on large GPU clusters in China, with different kinds of GPU hardware. These clusters are directly linked under the Chinese Academy of Sciences special GPU cluster program. We reach about one third of the peak GPU performance for this code in a real application scenario, with individual hierarchical block time steps, high-order (4th, 6th and 8th) Hermite integration schemes, and a realistic core-halo density structure of the modeled stellar systems.

Berczik, Peter; Nitadori, Keigo; Zhong, Shiyan; Spurzem, Rainer; Hamada, Tsuyoshi; Wang, Xiaowei; Berentzen, Ingo; Veles, Alexander; Ge, Wei

2011-10-01

372

Parallel Implementation of the Integral Transport Equation-Based Radiography Simulation Code  

SciTech Connect

An integral transport equation-based industrial radiography simulation code is parallelized using the Message Passing Interface standard on computers with both distributed- and shared-memory architectures. The algorithm involves partitioning of the problem domain into regions that are connected to each other through interface conditions. This results in a simultaneous set of integral transport equations. Each equation in the set is assigned to a different processor in the platform. The new algorithm is subjected to scalability tests in both cluster and shared-memory architectures for a varying number of processors with different problem domain partition strategies. The results show a high level of scalability with favorable results in both architectures.

Inanc, Feyzi; Vasiliu, Bogdan; Turner, Dave [Iowa State University (United States)

2001-02-15

373

Parallel Simulation of Three-Dimensional Free-Surface Fluid Flow Problems  

SciTech Connect

We describe parallel simulations of viscous, incompressible, free surface, Newtonian fluid flow problems that include dynamic contact lines. The Galerkin finite element method was used to discretize the fully-coupled governing conservation equations, and a "pseudo-solid" mesh mapping approach was used to determine the shape of the free surface. In this approach, the finite element mesh is allowed to deform to satisfy quasi-static solid mechanics equations subject to geometric or kinematic constraints on the boundaries. As a result, nodal displacements must be included in the set of problem unknowns. Issues concerning the proper constraints along the solid-fluid dynamic contact line in three dimensions are discussed. Parallel computations are carried out for an example taken from the coating flow industry, flow in the vicinity of a slot coater edge. This is a three-dimensional free-surface problem possessing a contact line that advances at the web speed in one region but transitions to static behavior in another part of the flow domain. Discussion focuses on parallel speedups for fixed problem size, a class of problems of immediate practical importance.

BAER,THOMAS A.; SUBIA,SAMUEL R.; SACKINGER,PHILIP A.

2000-01-18

374

An Introduction to Parallel Cluster Computing Using PVM for Computer Modeling and Simulation of Engineering Problems  

SciTech Connect

An investigation has been conducted regarding the ability of clustered personal computers to improve the performance of executing software simulations for solving engineering problems. The power and utility of personal computers continue to grow exponentially through advances in computing capabilities such as newer microprocessors, advances in microchip technologies, electronic packaging, and cost-effective gigabyte-size hard drive capacity. Many engineering problems require significant computing power. Therefore, the computation has to be done by high-performance computer systems that cost millions of dollars and need gigabytes of memory to complete the task. Alternatively, it is feasible to provide adequate computing in the form of clustered personal computers. This method cuts the cost and size by linking (clustering) personal computers together across a network. Clusters also have the advantage that they can be used as stand-alone computers when they are not operating as a parallel computer. Parallel computing software to exploit clusters is available for computer operating systems like Unix, Windows NT, or Linux. This project concentrates on the use of Windows NT and the Parallel Virtual Machine (PVM) system to solve an engineering dynamics problem in Fortran.

Spencer, VN

2001-08-29

375

Parallel adaptive fluid-structure interaction simulation of explosions impacting on building structures  

SciTech Connect

We pursue a level set approach to couple an Eulerian shock-capturing fluid solver with space-time refinement to an explicit solid dynamics solver for large deformations and fracture. The coupling algorithms considering recursively finer fluid time steps as well as overlapping solver updates are discussed in detail. Our ideas are implemented in the AMROC adaptive fluid solver framework and are used for effective fluid-structure coupling to the general purpose solid dynamics code DYNA3D. Besides simulations verifying the coupled fluid-structure solver and assessing its parallel scalability, the detailed structural analysis of a reinforced concrete column under blast loading and the simulation of a prototypical blast explosion in a realistic multistory building are presented.

Deiterding, Ralf [ORNL; Wood, Stephen L [University of Tennessee, Knoxville (UTK)

2013-01-01

376

Numerical simulation via parallel-distributed computing of energy absorption by metal deformation  

SciTech Connect

Collapsible steering column designs are credited with saving tens of thousands of lives since their introduction in the late 1960s. The collapsible steering column is a safety feature designed to absorb energy and protect the driver in a head-on collision. One of the most frequently used design concepts employs two telescoping metal tubes that slide over one another as the occupant impacts the steering wheel. Hardened steel ball bearings are embedded in a plastic sleeve located between the two tubes. There are two primary mechanisms for energy absorption during steering column collapse. One is the friction between the bearing and tube surfaces. Another is the gouging of the tubes' surfaces by the bearings. Current analytical models are unable to adequately capture the physics behind this process. In this paper we will present an overview of a parallel finite element code, currently under development, that can be used to simulate the highly nonlinear response of this energy absorbing mechanism. Our parallel algorithms are constructed on a message-passing foundation. The actual message-passing implementation used was the Argonne-developed p4 package. However, other message-passing libraries can easily be accommodated as they are largely identical in function and differ only in syntax. Once the algorithm is restructured as a set of processes communicating through messages, the program can run on systems as diverse as a uniprocessor workstation, multiprocessors with and without shared memory, a group of workstations that communicate over a local network, or any combination of the above. Benchmarks of the parallel code performance on networks of workstations and the IBM SP1 parallel supercomputer will be discussed.

Plaskacz, E.J.; Kulak, R.F.

1995-07-01

377

Parallel 3-d simulations for porous media models in soil mechanics  

NASA Astrophysics Data System (ADS)

Numerical simulations in 3-d for porous media models in soil mechanics are a difficult task for the engineering modelling as well as for the numerical realization. Here, we present a general numerical scheme for the simulation of two-phase models in combination with a material model via the stress response, with a specialized parallel saddle point solver. We give a brief introduction to the theoretical background of the Theory of Porous Media and constitute a two-phase model consisting of a porous solid skeleton saturated by a viscous pore-fluid. The material behaviour of the skeleton is assumed to be elasto-viscoplastic. The governing equations are transferred into a weak formulation suitable for the application of the finite element method. Introducing a formulation in terms of the stress response, we define a clear interface between the assembling process and the parallel solver modules. We demonstrate the efficiency of this approach by challenging numerical experiments realized on the Linux Cluster in Chemnitz.

Wieners, C.; Ammann, M.; Diebels, S.; Ehlers, W.

378

3D Global Simulations of Mode Conversion at the Magnetopause with Quasi-Parallel Shocks  

NASA Astrophysics Data System (ADS)

Following previous 2D and 3D local simulations of mode conversion at the magnetopause near the Alfven resonance surface, we further examine the mode conversion process under the much more complicated conditions of the self-consistent solar wind-magnetosphere interaction using a 3D global hybrid simulation model. During the interaction between the quasi-parallel shock and the Earth's magnetosphere, fast mode compressional structures are present in the bow shock and the magnetosheath and are carried toward the magnetopause boundary by the solar wind. By tracking the fast mode structures that impinge on the magnetopause boundary layer, we analyze the dynamic evolution of the incident waves and their conversion to shear Alfven waves and short-wavelength kinetic Alfven waves (KAWs) during the Alfven resonance processes. The KAWs are found to be localized near the magnetopause while propagating in the azimuthal direction, with sharp increases in the parallel electric field and field-aligned currents. A decrease in the magnetic field and an increase in the density show an anti-phase relation in the trapped KAWs. The wave structure, propagation, and time evolution in the 3D magnetopause are investigated.

Shi, F.; Tan, B.; Wang, X.; Lin, Y.

2011-12-01

379

Parallel grid library with adaptive mesh refinement for development of highly scalable simulations  

NASA Astrophysics Data System (ADS)

As the single CPU core performance is saturating while the number of cores in the fastest supercomputers increases exponentially, the parallel performance of simulations on distributed memory machines is crucial. At the same time, utilizing efficiently the large number of available cores presents a challenge, especially in simulations with run-time adaptive mesh refinement. We have developed a generic grid library (dccrg) aimed at finite volume simulations that is easy to use and scales well up to tens of thousands of cores. The grid has several attractive features: It 1) allows an arbitrary C++ class or structure to be used as cell data; 2) provides a simple interface for adaptive mesh refinement during a simulation; 3) encapsulates the details of MPI communication when updating the data of neighboring cells between processes; and 4) provides a simple interface to run-time load balancing, e.g. domain decomposition, through the Zoltan library. Dccrg is freely available for anyone to use, study and modify under the GNU Lesser General Public License v3. We will present the implementation of dccrg, simple and advanced usage examples and scalability results on various supercomputers and problems.
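The refinement interface described above can be sketched in miniature. This is not the dccrg API, just the bookkeeping idea: cell ids map to user data, and refining a cell replaces it with child ids carrying copies of the parent's data:

```python
# Miniature sketch of run-time cell refinement bookkeeping (NOT the dccrg
# API): refining a cell removes the parent and inserts children that start
# from copies of the parent's data.
def refine(grid, cell_id, next_id):
    data = grid.pop(cell_id)            # remove the parent cell
    children = [next_id, next_id + 1]   # 1D bisection for simplicity
    for c in children:
        grid[c] = dict(data)            # children inherit the parent data
    return children, next_id + 2        # next free id

grid = {1: {"rho": 1.0}, 2: {"rho": 2.0}}
children, next_id = refine(grid, 2, next_id=3)
print(sorted(grid), children)  # [1, 3, 4] [3, 4]
```

A real distributed grid additionally tracks which process owns each id, keeps ghost copies of remote neighbors up to date, and periodically rebalances ownership, which is what the MPI encapsulation and Zoltan interface in dccrg provide.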

Honkonen, I.; von Alfthan, S.; Sandroos, A.; Janhunen, P.; Palmroth, M.

2012-04-01

380

Three-dimensional gyrokinetic particle-in-cell simulation of plasmas on a massively parallel computer: LDRD Core Competency Project  

NASA Astrophysics Data System (ADS)

One of the programs of the Magnetic Fusion Energy (MFE) Theory and Computations Program is studying the anomalous transport of thermal energy across the field lines in the core of a tokamak. We use the method of gyrokinetic particle-in-cell simulation in this study. For this LDRD project we employed massively parallel processing, new algorithms, and new formal techniques to improve this research. Specifically, we sought to take steps toward: researching experimentally-relevant parameters in our simulations, learning parallel computing to have as a resource for our group, and achieving a 100-fold speedup over our starting-point Cray-2 simulation code's performance.

Byers, J. A.; Williams, T. J.; Cohen, B. I.; Dimits, A. M.

1994-04-01

381

Application of a 3D, Adaptive, Parallel, MHD Code to Supernova Remnant Simulations  

NASA Astrophysics Data System (ADS)

We at Michigan have a computational model, BATS-R-US, which incorporates several modern features that make it suitable for calculations of supernova remnant evolution. In particular, it is a three-dimensional MHD model, using a method called the Multiscale Adaptive Upwind Scheme for MagnetoHydroDynamics (MAUS-MHD). It incorporates a data structure that allows for adaptive refinement of the mesh, even in massively parallel calculations. Its advanced Godunov method, a solution-adaptive, upwind, high-resolution scheme, incorporates a new, flux-based approach to the Riemann solver with improved numerical properties. This code has been successfully applied to several problems, including the simulation of comets and of planetary magnetospheres, in the 3D context of the Heliosphere. The code was developed under a NASA computational grand challenge grant to run very rapidly on parallel platforms. It is also now being used to study time-dependent systems such as the transport of particles and energy from solar coronal mass ejections to the Earth. We are in the process of modifying this code so that it can accommodate the very strong shocks present in supernova remnants. Our test case simulates the explosion of a star of 1.4 solar masses with an energy of 1 foe, in a uniform background medium. We have performed runs of 250,000 to 1 million cells on 8 nodes of an Origin 2000. These relatively coarse grids do not allow fine details of instabilities to become visible. Nevertheless, the macroscopic evolution of the shock is simulated well, with the forward and reverse shocks visible in velocity profiles. We will show our work to date. This work was supported by NASA through its GSRP program.

Kominsky, P.; Drake, R. P.; Powell, K. G.

2001-05-01

382

An evaluation of parallelization strategies for low-frequency electromagnetic induction simulators using staggered grid discretizations  

NASA Astrophysics Data System (ADS)

The high computational cost of the forward solution for modeling low-frequency electromagnetic induction phenomena is one of the primary impediments against broad-scale adoption by the geoscience community of exploration techniques, such as magnetotellurics and geomagnetic depth sounding, that rely on fast and cheap forward solutions to make tractable the inverse problem. As geophysical observables, electromagnetic fields are direct indicators of Earth's electrical conductivity - a physical property independent of (but in some cases correlative with) seismic wavespeed. Electrical conductivity is known to be a function of Earth's physiochemical state and temperature, and to be especially sensitive to the presence of fluids, melts and volatiles. Hence, electromagnetic methods offer a critical and independent constraint on our understanding of Earth's interior processes. Existing methods for parallelization of time-harmonic electromagnetic simulators, as applied to geophysics, have relied heavily on a combination of strategies: coarse-grained decompositions of the model domain; and/or, a high-order functional decomposition across spectral components, which in turn can be domain-decomposed themselves. Hence, in terms of scaling, both approaches are ultimately limited by the growing communication cost as the granularity of the forward problem increases. In this presentation we examine alternate parallelization strategies based on OpenMP shared-memory parallelization and CUDA-based GPU parallelization. As a test case, we use two different numerical simulation packages, each based on a staggered Cartesian grid: FDM3D (Weiss, 2006) which solves the curl-curl equation directly in terms of the scattered electric field (available under the LGPL at www.openem.org); and APHID, the A-Phi Decomposition based on mixed vector and scalar potentials, in which the curl-curl operator is replaced operationally by the vector Laplacian. 
We describe progress made in modifying the code to use direct solvers in GPU cores dedicated to each small subdomain, iteratively improving the solution by matching adjacent subdomain boundary solutions, rather than iterative Krylov space sparse solvers as currently applied to the whole domain.
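The subdomain strategy just described (direct solves on small subdomains, iterated until adjacent boundary values match) is a Schwarz-type iteration; a minimal serial sketch on a 1D Poisson problem, with illustrative subdomain sizes and no GPU or FDM3D/APHID specifics:

```python
import numpy as np

# Serial sketch of a multiplicative Schwarz iteration: direct solves on two
# overlapping subdomains, sweeping until the interface values agree.
n = 40
h = 1.0 / (n + 1)
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # -u'' stencil
b = np.full(n, h * h)                                 # -u'' = 1, u(0)=u(1)=0

def sub_solve(u, idx):
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    comp = np.where(~mask)[0]
    # direct solve on the subdomain, with current outside values as boundary data
    rhs = b[idx] - A[np.ix_(idx, comp)] @ u[comp]
    u[idx] = np.linalg.solve(A[np.ix_(idx, idx)], rhs)

u = np.zeros(n)
d1, d2 = np.arange(0, 25), np.arange(15, 40)   # overlapping subdomains
for _ in range(30):                            # sweep until boundaries agree
    sub_solve(u, d1)
    sub_solve(u, d2)
print(np.linalg.norm(A @ u - b))               # residual shrinks each sweep
```

In the GPU setting described above, each `sub_solve` would be a factorized direct solve resident on one core, and only the overlap values would be exchanged between sweeps.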

Weiss, C. J.; Schultz, A.

2011-12-01

383

Massively parallel simulation with DOE's ASCI supercomputers : an overview of the Los Alamos Crestone project  

SciTech Connect

The Los Alamos Crestone Project is part of the Department of Energy's (DOE) Accelerated Strategic Computing Initiative, or ASCI Program. The main goal of this software development project is to investigate the use of continuous adaptive mesh refinement (CAMR) techniques for application to problems of interest to the Laboratory. There are many code development efforts in the Crestone Project, both unclassified and classified codes. In this overview I will discuss the unclassified SAGE and RAGE codes. The SAGE (SAIC adaptive grid Eulerian) code is a one-, two-, and three-dimensional multimaterial Eulerian massively parallel hydrodynamics code for use in solving a variety of high-deformation flow problems. The RAGE CAMR code is built from the SAGE code by adding various radiation packages, improved setup utilities and graphics packages, and is used for problems in which radiation transport of energy is important. The goal of these massively parallel versions of the codes is to run extremely large problems in a reasonable amount of calendar time. Our target is scalable performance to ~10,000 processors on a 1 billion CAMR computational cell problem that requires hundreds of variables per cell, multiple physics packages (e.g. radiation and hydrodynamics), and implicit matrix solves for each cycle. A general description of the RAGE code has been published in [1], [2], [3] and [4]. Currently, the largest simulations we do are three-dimensional, using around 500 million computational cells and running for literally months of calendar time using ~2000 processors. Current ASCI platforms range from several 3-teraOPS supercomputers to one 12-teraOPS machine at Lawrence Livermore National Laboratory, the White machine, and one 20-teraOPS machine installed at Los Alamos, the Q machine. Each machine is a system comprised of many component parts that must perform in unity for the successful run of these simulations.
Key features of any massively parallel system include the processors, the disks, the interconnection between processors, the operating system, libraries for message passing and parallel I/O, and other fundamental units of the system. We will give an overview of the current status of the Crestone Project codes SAGE and RAGE. These codes are intended for general applications without tuning of algorithms or parameters. We have run a wide variety of physical applications, from millimeter-scale laboratory laser experiments to multikilometer-scale asteroid impacts into the Pacific Ocean to parsec-scale galaxy formation. Examples of these simulations will be shown. The goal of our effort is to avoid ad hoc models and attempt to rely on first-principles physics. In addition to the large effort on developing parallel code physics packages, a substantial effort in the project is devoted to improving the computer science and software quality engineering (SQE) of the Project codes, as well as a sizable effort on the verification and validation (V&V) of the resulting codes. Examples of these efforts for our project will be discussed.

Weaver, R. P. (Robert P.); Gittings, M. L. (Michael L.)

2004-01-01

384

Validation Of A Reactive Force Field Included With An Open Source, Massively Parallel Code For Molecular Dynamics Simulations Of RDX  

Microsoft Academic Search

Molecular dynamics (MD) simulations of RDX are carried out using the ReaxFF force field supplied with the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS). Validation of ReaxFF to model RDX is carried out by extracting the (i) crystal unit cell parameters, (ii) bulk modulus and (iii) thermal expansion coefficient and comparing with reported values from both experiments and simulations.

M. Warrier; P. Pahari; S. Chaturvedi

2010-01-01

385

Validation Of A Reactive Force Field Included With An Open Source, Massively Parallel Code For Molecular Dynamics Simulations Of RDX  

NASA Astrophysics Data System (ADS)

Molecular dynamics (MD) simulations of RDX are carried out using the ReaxFF force field supplied with the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS). Validation of ReaxFF to model RDX is carried out by extracting the (i) crystal unit cell parameters, (ii) bulk modulus and (iii) thermal expansion coefficient and comparing with reported values from both experiments and simulations.

Warrier, M.; Pahari, P.; Chaturvedi, S.

2010-12-01

386

The Relevance of Topology in Parallel Simulation of Biological NetworkS.  

PubMed

Important achievements in traditional biology have deepened the knowledge about living systems, leading to an extensive identification of the parts-list of the cell as well as of the interactions among biochemical species responsible for the cell's regulation. This expanding knowledge also introduces new issues. For example, the increasing comprehension of the inter-dependencies between pathways (pathway cross-talk) has resulted, on the one hand, in the growth of informational complexity and, on the other, in a strong lack of information coherence. The overall grand challenge remains unchanged: to be able to assemble the knowledge of every 'piece' of a system in order to figure out the behavior of the whole (integrative approach). In light of these considerations, high performance computing plays a fundamental role in the context of in-silico biology. Stochastic simulation is a renowned analysis tool which, although widely used, is subject to stringent computational requirements, in particular when dealing with heterogeneous and high dimensional systems. Here we introduce and discuss a methodology aimed at alleviating the burden of simulating complex biological networks. Such a method, which springs from graph theory, is based on the principle of fragmenting the computational space of a simulation trace and delegating the computation of fragments to a number of parallel processes. PMID:22331861

Mazza, Tommaso; Ballarini, Paolo; Guido, Rosita; Prandi, Davide

2012-01-30

387

Mechanisms for the convergence of time-parallelized, parareal turbulent plasma simulations  

SciTech Connect

Parareal is a recent algorithm able to parallelize the time dimension in spite of its sequential nature. It has been applied to several linear and nonlinear problems and, very recently, to a simulation of fully-developed, two-dimensional drift wave turbulence. The mere fact that parareal works in such a turbulent regime is in itself somewhat unexpected, due to the characteristic sensitivity of turbulence to any change in initial conditions. This fundamental property of any turbulent system should render the iterative correction procedure characteristic of the parareal method inoperative, but this seems not to be the case. In addition, the choices that must be made to implement parareal (division of the temporal domain, choice of the coarse solver, and so on) are currently made using trial-and-error approaches. Here, we identify the mechanisms responsible for the convergence of parareal in these simulations of drift wave turbulence. We also investigate which conditions these mechanisms impose on any successful parareal implementation. The results reported here should be useful in guiding future implementations of parareal within the much wider context of fully-developed fluid and plasma turbulence simulations.
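The parareal correction iteration itself is standard and compact. The toy below uses forward Euler as the coarse propagator G and a finely substepped Euler as the fine propagator F on dy/dt = -y; these solver choices are illustrative assumptions, not those of the paper:

```python
def coarse(y, t0, t1):
    # cheap, low-accuracy propagator G: one forward-Euler step of dy/dt = -y
    return y + (t1 - t0) * (-y)

def fine(y, t0, t1, substeps=100):
    # expensive, accurate propagator F: many small Euler steps
    h = (t1 - t0) / substeps
    for _ in range(substeps):
        y += h * (-y)
    return y

def parareal(y0, t, iterations):
    n = len(t) - 1
    U = [y0]
    for i in range(n):                      # serial coarse initialization
        U.append(coarse(U[i], t[i], t[i + 1]))
    for _ in range(iterations):
        F = [fine(U[i], t[i], t[i + 1]) for i in range(n)]    # parallel in time
        G = [coarse(U[i], t[i], t[i + 1]) for i in range(n)]
        V = [y0]
        for i in range(n):                  # serial correction sweep
            V.append(coarse(V[i], t[i], t[i + 1]) + F[i] - G[i])
        U = V
    return U
```

The fine solves over the time slices are the embarrassingly parallel part; only the cheap coarse sweep remains sequential, which is what makes parallelization of the time dimension possible at all.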

Reynolds-Barredo, J. [University of Alaska; University Carlos III de Madrid; Newman, David E [University of Alaska; Sanchez, R. [Universidad Carlos III, Madrid, Spain; Samaddar, D. [ITER Organization, Saint Paul Lez Durance, France; Berry, Lee A [ORNL; Elwasif, Wael R [ORNL

2012-01-01

388

Three-dimensional parallel UNIPIC-3D code for simulations of high-power microwave devices  

NASA Astrophysics Data System (ADS)

This paper introduces a self-developed, three-dimensional, parallel, fully electromagnetic particle simulation code, UNIPIC-3D. In this code, the electromagnetic fields are updated using the second-order finite-difference time-domain method, and the particles are moved using the relativistic Newton-Lorentz force equation. The electromagnetic fields and particles are coupled through the current term in Maxwell's equations. Two numerical examples are used to verify the algorithms adopted in this code; the numerical results agree well with theoretical ones. The code can be used to simulate high-power microwave (HPM) devices such as the relativistic backward wave oscillator, the coaxial vircator, and the magnetically insulated line oscillator. UNIPIC-3D is written in the object-oriented C++ language and can be run on a variety of platforms including WINDOWS, LINUX, and UNIX. Users can use the graphical user interface to create the complex geometric structures of the simulated HPM devices, which are automatically meshed by the UNIPIC-3D code. The code has a powerful postprocessor which can display the electric field, magnetic field, current, voltage, power, spectrum, momentum of particles, etc. For comparison, results computed using the two-and-a-half-dimensional UNIPIC code are also provided for the same HPM device parameters; the numerical results from the two codes agree well with each other.
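The abstract names the relativistic Newton-Lorentz push but gives no discretization details. The Boris scheme below is the de facto standard leapfrog particle push in electromagnetic PIC codes, shown here in its nonrelativistic form as an illustrative sketch, not necessarily UNIPIC-3D's exact update:

```python
import numpy as np

def boris_push(x, v, E, B, q_over_m, dt):
    """One nonrelativistic Boris step: half electric kick, magnetic
    rotation, half electric kick, then position drift."""
    v_minus = v + 0.5 * q_over_m * dt * E          # first half kick
    t = 0.5 * q_over_m * dt * B                    # rotation vector
    s = 2.0 * t / (1.0 + np.dot(t, t))
    v_prime = v_minus + np.cross(v_minus, t)       # rotate around B
    v_plus = v_minus + np.cross(v_prime, s)
    v_new = v_plus + 0.5 * q_over_m * dt * E       # second half kick
    return x + dt * v_new, v_new
```

The relativistic variant used by codes of this kind applies the same kick-rotate-kick structure to the momentum p = γmv rather than to v directly.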

Wang, Jianguo; Chen, Zaigao; Wang, Yue; Zhang, Dianhui; Liu, Chunliang; Li, Yongdong; Wang, Hongguang; Qiao, Hailiang; Fu, Meiyan; Yuan, Yuan

2010-07-01

389

Comparing parallel- and simulated-tempering-enhanced sampling algorithms at phase-transition regimes  

NASA Astrophysics Data System (ADS)

Two important enhanced sampling algorithms, simulated (ST) and parallel (PT) tempering, are commonly used when ergodic simulations may be hard to achieve, e.g., due to a phase space separated by large free-energy barriers. This is so for systems around first-order phase transitions, a case still not fully explored with such approaches in the literature. In this contribution we make a comparative study between the PT and ST for the Ising (a lattice gas in the fluid language) and the Blume-Emery-Griffiths (a lattice gas with vacancies) models at phase-transition regimes. We show that although the two methods are equivalent in the limit of sufficiently long simulations, the PT is more advantageous than the ST with respect to all the analyses performed: convergence toward stationarity; frequency of tunneling between phases at coexistence; and decay of time-displaced correlation functions of thermodynamic quantities. Qualitative arguments for why one may expect better results from the PT than the ST near phase-transition conditions are also presented.
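The replica-exchange move at the heart of parallel tempering is a Metropolis test on the product distribution of two replicas. A minimal sketch, assuming a standard canonical-ensemble acceptance rule (the paper's specific models and update schedules are not reproduced here):

```python
import math
import random

def pt_swap_prob(beta_i, beta_j, E_i, E_j):
    """Metropolis acceptance probability for exchanging configurations
    between replicas at inverse temperatures beta_i and beta_j."""
    return min(1.0, math.exp((beta_i - beta_j) * (E_i - E_j)))

def try_swap(replicas, i, j, rng=random):
    """replicas: list of (beta, energy, config) tuples; attempt to
    exchange the configurations (and energies) of replicas i and j."""
    bi, Ei, ci = replicas[i]
    bj, Ej, cj = replicas[j]
    if rng.random() < pt_swap_prob(bi, bj, Ei, Ej):
        replicas[i] = (bi, Ej, cj)
        replicas[j] = (bj, Ei, ci)
        return True
    return False
```

Swaps between adjacent temperatures are what lets a configuration trapped in one phase escape over the free-energy barrier by diffusing to high temperature and back.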

Fiore, Carlos E.; da Luz, M. G. E.

2010-09-01

390

Effect of parallel currents on drift-interchange turbulence: Comparison of simulation and experiment  

SciTech Connect

Two-dimensional (2D) turbulence simulations are reported in which the balancing of the parallel and perpendicular currents is modified by changing the axial boundary condition (BC) to vary the sheath conductivity. The simulations are carried out using the 2D scrape-off-layer turbulence (SOLT) code. The results are compared with recent experiments on the controlled shear de-correlation experiment (CSDX) in which the axial BC was modified by changing the composition of the end plate. Reasonable qualitative agreement is found between the simulations and the experiment. When an insulating axial BC is used, broadband turbulence is obtained and an inverse cascade occurs down to low frequencies and long spatial scales. Robust sheared flows are obtained. By contrast, employing a conducting BC at the plate results in coherent (drift wave) modes rather than broadband turbulence, with a weaker inverse cascade and smaller zonal flows. The dependence of the two instability mechanisms (rotationally driven interchange mode and drift waves) on the axial BC is also discussed.

D'Ippolito, D. A.; Russell, D. A.; Myra, J. R. [Lodestar Research Corporation, 2400 Central Avenue, Boulder, Colorado 80301 (United States); Thakur, S. C.; Tynan, G. R.; Holland, C. [Center for Momentum Transport and Flow Organization, University of California at San Diego, San Diego, California 92093 (United States)

2012-10-15

391

The role of the electron convection term for the parallel electric field and electron acceleration in MHD simulations  

SciTech Connect

There has been great concern about the origin of the parallel electric field in the framework of fluid equations in the auroral acceleration region. This paper proposes a new method to simulate magnetohydrodynamic (MHD) equations that include the electron convection term and shows its efficiency with simulation results in one dimension. We apply a third-order semi-discrete central scheme to investigate the characteristics of the electron convection term, including its nonlinearity. At a steady-state discontinuity, the sum of the ion and electron convection terms balances the ion pressure gradient. We find that the electron convection term works like the gradient of a negative pressure and reduces the ion sound speed or amplifies the sound mode when parallel current flows. The electron convection term enables us to describe a situation in which a parallel electric field and parallel electron acceleration coexist, which is impossible for ideal or resistive MHD.

Matsuda, K.; Terada, N.; Katoh, Y. [Space and Terrestrial Plasma Physics Laboratory, Department of Geophysics, Graduate School of Science, Tohoku University, Sendai, Miyagi 980-8578 (Japan); Misawa, H. [Planetary Plasma and Atmospheric Research Center, Graduate School of Science, Tohoku University, Sendai, Miyagi 980-8578 (Japan)

2011-08-15

392

Massively-parallel FDTD simulations to address mask electromagnetic effects in hyper-NA immersion lithography  

NASA Astrophysics Data System (ADS)

In the Hyper-NA immersion lithography regime, the electromagnetic response of the reticle is known to deviate in a complicated manner from the idealized Thin-Mask-like behavior. Already, this is driving certain RET choices, such as the use of polarized illumination and the customization of reticle film stacks. Unfortunately, full 3-D electromagnetic mask simulations are computationally intensive. And while OPC-compatible mask electromagnetic field (EMF) models can offer a reasonable tradeoff between speed and accuracy for full-chip OPC applications, full understanding of these complex physical effects demands higher accuracy. Our paper describes recent advances in leveraging High Performance Computing as a critical step towards lithographic modeling of the full manufacturing process. In this paper, highly accurate full 3-D electromagnetic simulations of very large mask layouts are conducted in parallel with reasonable turnaround time, using a BlueGene/L supercomputer and a Finite-Difference Time-Domain (FDTD) code developed internally within IBM. A 3-D simulation of a large 2-D layout spanning 5 µm × 5 µm at the wafer plane (and thus 20 µm × 20 µm × 0.5 µm at the mask) results in a simulation with roughly 12.5 GB of memory (grid size of 10 nm at the mask, single-precision computation, about 30 bytes/grid point). FDTD is flexible and easily parallelizable, enabling full simulations of such a large layout in approximately an hour using one BlueGene/L "midplane" containing 512 dual-processor nodes with 256 MB of memory per processor. Our scaling studies on BlueGene/L demonstrate that simulations up to 100 µm × 100 µm at the mask can be computed in a few hours. Finally, we will show that the use of a subcell technique permits accurate simulation of features smaller than the grid discretization, thus improving on the tradeoff between computational complexity and simulation accuracy. 
We demonstrate the correlation of the real and quadrature components that comprise the Boundary Layer representation of the EMF behavior of a mask blank with intensity measurements of the mask diffraction patterns by an Aerial Image Measurement System (AIMS) with polarized illumination. We also discuss how this model can become a powerful tool for assessing the impact of a mask blank on the lithographic process.

Tirapu Azpiroz, Jaione; Burr, Geoffrey W.; Rosenbluth, Alan E.; Hibbs, Michael

2008-03-01

393

Improving parallel scalability for edge plasma transport simulations with neutral gas species  

NASA Astrophysics Data System (ADS)

Simulating the transport of multi-species plasma and neutral species in the edge region of a tokamak magnetic fusion energy device is computationally intensive and difficult due to coupling among various components, strong nonlinearities and a broad range of temporal scales. In addition to providing boundary conditions for the core plasma, such models aid in the understanding and control of the associated plasma/material-wall interactions, a topic that is essential for the development of a viable fusion power plant. The governing partial differential equations are discretized to form a large nonlinear system that typically must be evolved in time to obtain steady-state solutions. Fully implicit techniques using preconditioned Jacobian-free Newton-Krylov methods with parallel domain-based preconditioners are shown to be robust and efficient for the plasma components. Inclusion of neutral gas components, however, increases the condition number of the system to the point where improved parallel preconditioning is needed. Standard algebraic preconditioners that provide sufficient coupling throughout the global domain to handle the neutrals are not generally scalable. We present a new preconditioner, termed FieldSplit, which exploits the character of the neutral equations to improve the scalability of the combined plasma/neutral system.
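The Jacobian-free Newton-Krylov machinery the abstract relies on rests on one trick: the Jacobian-vector product can be approximated by a directional finite difference, so the Jacobian never needs to be formed. A toy sketch follows; in real edge-transport solvers `jac_vec` would be handed to a preconditioned Krylov method such as GMRES rather than used to assemble a dense matrix column by column (all function and variable names here are illustrative):

```python
import numpy as np

def jac_vec(F, u, v, eps=1e-7):
    """Matrix-free Jacobian-vector product: J(u) v ~ (F(u + eps*v) - F(u)) / eps."""
    return (F(u + eps * v) - F(u)) / eps

def newton_jfnk(F, u0, tol=1e-10, max_iter=20):
    """Tiny Newton iteration; for illustration only, it assembles J from
    jac_vec applied to unit vectors, which a production JFNK solver
    deliberately avoids."""
    u = u0.astype(float)
    n = len(u)
    for _ in range(max_iter):
        r = F(u)
        if np.linalg.norm(r) < tol:
            break
        J = np.column_stack([jac_vec(F, u, e) for e in np.eye(n)])
        u = u - np.linalg.solve(J, r)
    return u
```

The scalability bottleneck discussed in the abstract lives not here but in the preconditioner: the Krylov solver needs one that couples the stiff neutral-gas unknowns across subdomains without global all-to-all communication.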

McCourt, M.; Rognlien, T. D.; McInnes, L. C.; Zhang, H.

2012-01-01

394

An Object-Oriented Parallel Particle-in-Cell Code for Beam Dynamics Simulation in Linear Accelerators  

Microsoft Academic Search

We present an object-oriented three-dimensional parallel particle-in-cell (PIC) code for simulation of beam dynamics in linear accelerators (linacs). An important feature of this code is the use of split-operator methods to integrate single-particle magnetic optics techniques with parallel PIC techniques. By choosing a splitting scheme that separates the self-fields from the complicated externally applied fields, we are able to utilize
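The split-operator idea described above, separating the self-fields from the complicated externally applied fields, is commonly realized as second-order Strang splitting. A schematic sketch, with hypothetical map functions standing in for the code's actual optics and space-charge solvers:

```python
def strang_step(state, ext_map, self_kick, dt):
    """One second-order split-operator step: advance through the external
    optics for dt/2, apply the collective self-field impulse over dt,
    then advance through the external optics for dt/2 again."""
    state = ext_map(state, 0.5 * dt)
    state = self_kick(state, dt)
    state = ext_map(state, 0.5 * dt)
    return state
```

The appeal of this splitting is that `ext_map` can be a high-order single-particle transfer map from magnetic optics, while `self_kick` is the only place the parallel PIC field solve is needed.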

Ji Qiang; Robert D. Ryne; Salman Habib; Viktor Decyk

2000-01-01

395

An object-oriented parallel particle-in-cell code for beam dynamics simulation in linear accelerators  

Microsoft Academic Search

In this paper, we present an object-oriented three-dimensional parallel particle-in-cell code for beam dynamics simulation in linear accelerators. A two-dimensional parallel domain decomposition approach is employed within a message passing programming paradigm along with dynamic load balancing. Implementing object-oriented software design provides the code with better maintainability, reusability, and extensibility compared with conventional structure-based code.

J. Qiang; R. D. Ryne; S. Habib; V. Decyk

1999-01-01

396

Parallel Adaptive Simulation of Weak and Strong Transverse-Wave Structures in H2-O2 Detonations  

SciTech Connect

Two- and three-dimensional simulation results are presented that investigate at great detail the temporal evolution of Mach reflection sub-structure patterns intrinsic to gaseous detonation waves. High local resolution is achieved by utilizing a distributed memory parallel shock-capturing finite volume code that employs block-structured dynamic mesh adaptation. The computational approach, the implemented parallelization strategy, and the software design are discussed.

Deiterding, Ralf [ORNL

2010-01-01

397

Non-equilibrium molecular dynamics simulation of nanojet injection with adaptive-spatial decomposition parallel algorithm.  

PubMed

An Adaptive-Spatial Decomposition parallel algorithm was developed to increase computational efficiency for molecular dynamics simulations of nano-fluids. Injection of a liquid argon jet with a scale of 17.6 molecular diameters was investigated. A solid annular platinum injector was solved simultaneously with the liquid injectant by adopting a solid modeling technique which incorporates phantom atoms. The viscous heat was naturally discharged through the solids, so the liquid boiling problem was avoided without separate temperature-controlling methods. Parametric investigations of injection speed, wall temperature, and injector length were made. A sudden pressure drop at the orifice exit causes flash boiling of the liquid departing the nozzle, with strong evaporation on the surface of the liquid, while rendering a slender jet. Elevating the injection speed and the wall temperature activates surface evaporation, concurrent with a reduction in the jet breakup length and the drop size. PMID:19051924

Shin, Hyun-Ho; Yoon, Woong-Sup

2008-07-01

398

Enhancing parallel quasi-static particle-in-cell simulations with a pipelining algorithm  

SciTech Connect

A pipelining algorithm to overcome the limitation on scaling quasi-static particle-in-cell models of relativistic beams in plasmas to a very large number of processors is described. The pipelining algorithm uses multiple groups of processors and optimizes the job allocation on the processors in parallel computing. The algorithm is implemented in the quasi-static code QuickPIC and is shown to scale to over 10^3 processors, increasing the scale and speed by two orders of magnitude over the non-pipelined model. The new approach opens the door to performing full-scale 3D simulations of future plasma wakefield accelerators or full-lifetime models of beam interaction with electron clouds in circular accelerators such as the Large Hadron Collider (LHC) at CERN.

Feng, B. [Departments of Electrical Engineering and Physics and Astronomy, University of Southern California, Los Angeles, CA 90089 (United States)], E-mail: bfeng@usc.edu; Huang, C.; Decyk, V. [Department of Physics and Astronomy, University of California, Los Angeles, CA 90095 (United States); Mori, W.B. [Department of Physics and Astronomy, University of California, Los Angeles, CA 90095 (United States); Department of Electrical Engineering, University of California, Los Angeles, CA 90095 (United States); Muggli, P. [Departments of Electrical Engineering and Physics and Astronomy, University of Southern California, Los Angeles, CA 90089 (United States); Katsouleas, T. [Pratt School of Engineering, Duke University, Durham, NC 27708 (United States)

2009-08-20

399

Parallel 3D-TLM algorithm for simulation of the Earth-ionosphere cavity  

NASA Astrophysics Data System (ADS)

A parallel 3D algorithm for solving time-domain electromagnetic problems with arbitrary geometries is presented. The technique employed is the Transmission Line Modeling (TLM) method implemented in Shared Memory (SM) environments. The benchmarking performed reveals that the maximum speedup depends on the memory size of the problem as well as multiple hardware factors, such as the arrangement of CPUs, cache, or memory. A maximum speedup of 15 has been measured for the largest problem. In certain circumstances of low memory requirements, superlinear speedup is achieved using our algorithm. The method is employed to model the Earth-ionosphere cavity, thus enabling a study of the natural electromagnetic phenomena that occur in it. The algorithm allows complete 3D simulations of the cavity with a resolution of 10 km within a reasonable timescale.

Toledo-Redondo, Sergio; Salinas, Alfonso; Morente-Molinera, Juan Antonio; Méndez, Antonio; Fornieles, Jesús; Portí, Jorge; Morente, Juan Antonio

2013-03-01

400

Implementation of a parallel algorithm for thermo-chemical nonequilibrium flow simulations  

NASA Astrophysics Data System (ADS)

Massively parallel (MP) computing is considered to be the future direction of high performance computing. When engineers apply this new MP computing technology to solve large-scale problems, one major interest is the maximum problem size that an MP computer can handle. To determine the maximum size, it is important to address the code scalability issue. Scalability implies that the code provides an increase in performance proportional to an increase in problem size: if the problem size increases and more computer nodes are utilized, the elapsed time to simulate the problem should ideally not increase much. Hence one important task in the development of MP computing technology is to ensure scalability. A scalable code is an efficient code. In order to obtain good scaled performance, it is necessary to first optimize the code for single-node performance before proceeding to a large-scale simulation with a large number of computer nodes. This paper discusses the implementation of a massively parallel computing strategy and the process of optimization to improve scaled performance. Specifically, we look at domain decomposition, resource management in the code, communication overhead, and problem mapping. By incorporating these improvements and adopting an efficient MP computing strategy, efficiencies of about 85% and 96% have been achieved using 64 nodes on MP computers for perfect gas and chemically reactive gas problems, respectively. A comparison of the performance between MP computers and a vectorized computer, such as the Cray Y-MP, is also presented.
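The quoted efficiencies follow from the usual definition, achieved speedup divided by node count. A one-liner with hypothetical timings, since the paper's raw wall-clock numbers are not given in this abstract:

```python
def parallel_efficiency(t_serial, n_nodes, t_parallel):
    """Scaled parallel efficiency: speedup (t_serial / t_parallel)
    divided by the number of nodes used."""
    return (t_serial / t_parallel) / n_nodes
```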

Wong, C. C.; Blottner, F. G.; Payne, J. L.; Soetrisno, M.

1995-01-01

401

Macro-scale phenomena of arterial coupled cells: a massively parallel simulation  

PubMed Central

Impaired mass transfer characteristics of blood-borne vasoactive species such as adenosine triphosphate in regions such as an arterial bifurcation have been hypothesized as a prospective mechanism in the aetiology of atherosclerotic lesions. Arterial endothelial cells (ECs) and smooth muscle cells (SMCs) respond differentially to altered local haemodynamics and produce coordinated macro-scale responses via intercellular communication. Using a computationally designed arterial segment comprising large populations of mathematically modelled coupled ECs and SMCs, we investigate their response to spatial gradients of blood-borne agonist concentrations and the effect of micro-scale-driven perturbation on the macro-scale. Altering homocellular (between same cell type) and heterocellular (between different cell types) intercellular coupling, we simulated four cases of normal and pathological arterial segments experiencing an identical gradient in the concentration of the agonist. Results show that the heterocellular calcium (Ca2+) coupling between ECs and SMCs is important in eliciting a rapid response when the vessel segment is stimulated by the agonist gradient. In the absence of heterocellular coupling, homocellular Ca2+ coupling between SMCs is necessary for propagation of Ca2+ waves from downstream to upstream cells axially. Desynchronized intracellular Ca2+ oscillations in coupled SMCs are mandatory for this propagation. Upon decoupling the heterocellular membrane potential, the arterial segment loses the inhibitory effect of ECs on the Ca2+ dynamics of the underlying SMCs. The full system comprises hundreds of thousands of coupled nonlinear ordinary differential equations simulated on the massively parallel Blue Gene architecture. The use of massively parallel computational architectures shows the capability of this approach to address macro-scale phenomena driven by elementary micro-scale components of the system.

Shaikh, Mohsin Ahmed; Wall, David J. N.; David, Tim

2012-01-01

402

A Multi-Bunch, Three-Dimensional, Strong-Strong Beam-Beam Simulation Code for Parallel Computers  

SciTech Connect

For simulating the strong-strong beam-beam effect, using Particle-In-Cell codes has become one of the methods of choice. While the two-dimensional problem is readily treatable using PC-class machines, the three-dimensional problem, i.e., a problem encompassing hourglass and phase-averaging effects, requires the use of parallel processors. In this paper, we introduce a strong-strong code NIMZOVICH, which was specifically designed for parallel processors and which is optimally used for many bunches and parasitic crossings. We describe the parallelization scheme and give some benchmarking results.

Cai, Y.; Kabel, A.C.; /SLAC

2005-05-11

403

Using Simulation as a Knowledge Discovery Tool in an Adversary C2 Network.  

National Technical Information Service (NTIS)

This paper discusses a discrete-event simulation model of an adversary social network using Micro Saint Simulation software. The purpose is for knowledge discovery from the many interactions and relationships among and between the adversary players in the...

C. A. Ntuen E. H. Park O. A. Alabi Y. Seong

2009-01-01

404

CONFIG: Qualitative Simulation Tool for Analyzing Behavior of Engineering Devices.  

National Technical Information Service (NTIS)

To design failure management expert systems, engineers mentally analyze the effects of failures and procedures as they propagate through device configurations. CONFIG is a generic device modeling tool for use in discrete event simulation, to support such ...

J. T. Malin B. D. Basham R. A. Harris

1987-01-01

405

Simulations of Implosions with a 3D, Parallel, Unstructured Grid, Radiation-Hydrodynamics Code  

NASA Astrophysics Data System (ADS)

We present results obtained with the code described in [1,2] which solves the equations of compressible hydrodynamics, laser deposition, heat conduction, and one-group radiation transport (flux-limited diffusion approximation). We choose problems of interest to ICF applications: point explosions and spherical implosions. For explosions, the problem couples non-linear heat conduction to hydrodynamics. For implosions, we include radiative effects in order to simulate indirectly driven ICF capsules. The 3D results are obtained on unstructured tetrahedral grids which are domain-decomposed to run on as many as 128 processors. We compare results to 1D spherical simulations and analytic solutions. Results using laser deposition are presented in a companion paper by Kaiser et al. 1. A. I. Shestakov et al, ``The ICF3D Code,'' Lawrence Livermore National Laboratory, Livermore, CA, UCRL-JC-124448, (1997), to appear in Comput. Methods Appl. Mech. Engin. 2. A. I. Shestakov, J. L. Milovich, and D. S. Kershaw, ``Parallelization of a 3D, Unstructured Grid, Hydrodynamic-Diffusion Code,'' Lawrence Livermore National Laboratory, Livermore, CA, UCRL-JC-130988, (1998), submitted to SIAM J. Sci. Comp.

Shestakov, A. I.; Milovich, J. K.; Prasad, M. K.

1998-11-01

406

Progress on H5Part: A Portable High Performance Parallel DataInterface for Electromagnetics Simulations  

SciTech Connect

Significant problems facing all experimental and computational sciences arise from growing data size and complexity. Common to all these problems is the need to perform efficient data I/O on diverse computer architectures. In our scientific application, the largest parallel particle simulations generate vast quantities of six-dimensional data. Such a simulation run produces data for an aggregate data size up to several TB per run. Motivated by the need to address data I/O and access challenges, we have implemented H5Part, an open source data I/O API that simplifies the use of the Hierarchical Data Format v5 library (HDF5). HDF5 is an industry standard for high performance, cross-platform data storage and retrieval that runs on all contemporary architectures from large parallel supercomputers to laptops. H5Part, which is oriented to the needs of the particle physics and cosmology communities, provides support for parallel storage and retrieval of particles, structured and, in the future, unstructured meshes. In this paper, we describe recent work focusing on I/O support for particles and structured meshes and provide data showing performance on modern supercomputer architectures like the IBM POWER 5.

Adelmann, Andreas; Gsell, Achim; Oswald, Benedikt; Schietinger,Thomas; Bethel, Wes; Shalf, John; Siegerist, Cristina; Stockinger, Kurt

2007-06-22

407

Giant Impacts During Planet Formation: Parallel Tree Code Simulations Using Smooth Particle Hydrodynamics  

NASA Astrophysics Data System (ADS)

There is both theoretical and observational evidence that giant planets collided with objects with mass >= Mearth during their evolution. These impacts may help shorten planetary formation timescales by changing the opacity of the planetary atmosphere to allow quicker cooling. They may also redistribute heavy metals within giant planets, affect the core/envelope mass ratio, and help determine the ratio of emitted to absorbed energy within giant planets. Thus, the researchers propose to simulate the impact of a ~ Earth-mass object onto a proto-giant-planet with SPH. Results of the SPH collision models will be input into a steady-state planetary evolution code and the effect of impacts on formation timescales, core/envelope mass ratios, density profiles, and thermal emissions of giant planets will be quantified. The collision will be modelled using a modified version of an SPH routine which simulates the collision of two polytropes. The Saumon-Chabrier and Tillotson equations of state will replace the polytropic equation of state. The parallel tree algorithm of Olson & Packer will be used for the domain decomposition and neighbor search necessary to calculate pressure and self-gravity efficiently. This work is funded by the NASA Graduate Student Researchers Program.

Cohen, R.; Bodenheimer, P.; Asphaug, E.

2000-12-01

408

Progress of Parallel Validation Tools for Fusion Simulations as Applied to Synthetic Diagnostic Efforts  

NASA Astrophysics Data System (ADS)

The verification and validation (V&V) of fusion simulation codes is necessary to ensure proper support of increasingly expensive experiments such as ITER. Synthetic diagnostics are an important and useful tool for these V&V efforts and are the focus of the Parallel Validation Tools for Fusion Simulations project. We will present our effort to develop standards, called schemas, for the data exchange between codes and synthetic diagnostics. We have developed a formal schema (expressed with XML Schema syntax) for the specification of data for visualization and for data exchange. We have also developed a Python tool for verification of HDF5 data against the formal schema. We will present the API for writing and reading HDF5 data compliant with the above standards in Fortran90, IDL, Python, C, and the VisIt visualization tool, enabling the user to decide which tool works best to accomplish their goals. We will present the development of synthetic diagnostics based on this capability. These tools will be applied to the GYRO and GEM codes for synthetic diagnostics using DIII-D experimental profiles.

Vadlamani, Srinath; Shasharina, Sveta; Kruger, Scott; Durant, Mark; Dimitrov, Dimitre; Holland, Chris; Candy, Jeff; Parker, Scott; Chen, Yang; Wan, Weigang; Sanderson, Allen

2011-11-01

409

Monte Carlo Simulation of an Ar RF Parallel Plate Discharge Plasma Employing LPWS  

NASA Astrophysics Data System (ADS)

The geometry, pressure, and power coupling conditions of most plasma sources for semiconductor manufacturing lend themselves to particle simulations such as Monte Carlo simulations (hereafter MCS). Usually the kinetics solvers are coupled to solvers for Poisson's equation. MCS usually employ averaging over discrete regions of parameter space. When the number of discrete regions is increased, two problems result: one is instability of the solution because of statistical fluctuations, and the other is an increase in calculation time. Ventzek and Kitamori (J. Appl. Phys., vol. 75, pp. 3785-3788, 1994) proposed Legendre Polynomial Weighted Sampling (hereafter LPWS), which aims to optimize sampling statistics with an economy of particles. In this paper, we characterize an Ar RF parallel plate discharge using an MCS employing LPWS based on Date's model (T. IEE Japan, 111-A, 11, pp. 962-972, 1991). The method is shown to replicate the behavior of RF discharges with high fidelity.
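LPWS weights samples with Legendre polynomials. For reference, P_n(x) can be generated by the standard Bonnet recurrence; this is generic numerics, not the authors' specific weighting code:

```python
def legendre(n, x):
    """Legendre polynomial P_n(x) via the Bonnet recurrence
    (k+1) P_{k+1}(x) = (2k+1) x P_k(x) - k P_{k-1}(x)."""
    p_prev, p = 1.0, x        # P_0 and P_1
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p
```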

Horie, Ikuya; Suzuki, Takuma; Ohmori, Yoshiyuki; Kitamori, Kazutaka; Maruyama, Koichi

410

Monte Carlo Simulations of Nonlinear Particle Acceleration in Parallel Trans-relativistic Shocks  

NASA Astrophysics Data System (ADS)

We present results from a Monte Carlo simulation of a parallel collisionless shock undergoing particle acceleration. Our simulation, which contains parameterized scattering and a particular thermal leakage injection model, calculates the feedback between accelerated particles ahead of the shock, which influence the shock precursor and "smooth" the shock, and thermal particle injection. We show that there is a transition between nonrelativistic shocks, where the acceleration efficiency can be extremely high and the nonlinear compression ratio can be substantially greater than the Rankine-Hugoniot value, and fully relativistic shocks, where diffusive shock acceleration is less efficient and the compression ratio remains at the Rankine-Hugoniot value. This transition occurs in the trans-relativistic regime and, for the particular parameters we use, occurs around a shock Lorentz factor γ0 = 1.5. We also find that nonlinear shock smoothing dramatically reduces the acceleration efficiency presumed to occur with large-angle scattering in ultra-relativistic shocks. Our ability to seamlessly treat the transition from ultra-relativistic to trans-relativistic to nonrelativistic shocks may be important for evolving relativistic systems, such as gamma-ray bursts and Type Ibc supernovae. We expect a substantial evolution of shock accelerated spectra during this transition from soft early on to much harder when the blast-wave shock becomes nonrelativistic.
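The Rankine-Hugoniot compression ratio against which the nonlinear results are compared is the standard hydrodynamic jump condition. The helper below reproduces the textbook strong-shock limit of 4 for a monatomic gas; it is the reference value, not the paper's nonlinear Monte Carlo result:

```python
def rh_compression(mach, gamma=5.0 / 3.0):
    """Rankine-Hugoniot density compression ratio for a nonrelativistic
    hydrodynamic shock of sonic Mach number `mach` in a gas with
    adiabatic index `gamma`."""
    m2 = mach * mach
    return (gamma + 1.0) * m2 / ((gamma - 1.0) * m2 + 2.0)
```

Efficient nonrelativistic diffusive shock acceleration can push the effective ratio above this value because the escaping and relativistic particle populations soften the effective equation of state.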

Ellison, Donald C.; Warren, Donald C.; Bykov, Andrei M.

2013-10-01

411

DEVSim++ Toolset for Defense Modeling and Simulation and Interoperation  

Microsoft Academic Search

Discrete Event System Specification (DEVS) formalism supports the specification of discrete event models in a hierarchical and modular manner. Efforts have been made to develop simulation environments for the modeling and simulation (M&S) of systems using the DEVS formalism, particularly in defense M&S domains. This paper introduces the DEVSim++ toolset and its applications. The Object-Analysis Index (OAI) matrix is a

Tag Gon Kim; Chang Ho Sung; Su-Youn Hong; Jeong Hee Hong; Chang Beom Choi; Jeong Hoon Kim; Kyung Min Seo; Jang Won Bae

2011-01-01

412

Three-Dimensional Parallel Adaptive Mesh Refinement Simulations of Shock-Driven Turbulent Mixing in Plane and Converging Geometries  

SciTech Connect

This paper presents the use of a dynamically adaptive mesh refinement strategy for the simulations of shock-driven turbulent mixing. Large-eddy simulations are necessary due to the high-Reynolds-number turbulent regime. In this approach, the large scales are simulated directly and the small scales at which the viscous dissipation occurs are modeled. A low-numerical-dissipation centered finite-difference scheme is used in turbulent flow regions, while a shock-capturing method is employed to capture shocks. Three-dimensional parallel simulations of the Richtmyer-Meshkov instability performed in plane and converging geometries are described.

Lombardini, Manuel [California Institute of Technology, Pasadena]; Deiterding, Ralf [ORNL]

2010-01-01

413

Parallel contact detection algorithm for transient solid dynamics simulations using PRONTO3D.  

National Technical Information Service (NTIS)

An efficient, scalable, parallel algorithm for treating material surface contacts in solid mechanics finite element programs has been implemented in a modular way for MIMD parallel computers. The serial contact detection algorithm that was developed previ...

S. W. Attaway; B. A. Hendrickson; S. J. Plimpton

1996-01-01

414

Flatness-Based Control of Parallel Kinematics using Multibody Systems – Simulation and Experimental Results  

Microsoft Academic Search

The development of new machine tools such as parallel and hybrid kinematics leads to new challenges in the design and control of such machines in comparison to conventional ones. Parallel kinematics exhibit inherently nonlinear dynamics over the whole workspace; machine vibrations become important as a lightweight design is used; and independently controlled drives become infeasible due to the parallel setup.

Alexandra Ast; Peter Eberhard

2006-01-01

415

Simulation of lid-driven cavity flows by parallel lattice Boltzmann method using multi-relaxation-time scheme  

Microsoft Academic Search

Two-dimensional near-incompressible steady lid-driven cavity flows (Re = 100-7,500) are simulated using the multi-relaxation-time (MRT) model in the parallel lattice Bhatnagar-Gross-Krook (LBGK) method. Results are compared with those using the single-relaxation-time (SRT) model in the LBGK method and with previous simulation data using Navier-Stokes equations for the same flow conditions. Effects of variation of relaxation parameters in the MRT model, effects
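A minimal single-relaxation-time (SRT/BGK) D2Q9 update conveys the collide-and-stream structure that both the SRT and MRT variants share (a sketch, not the paper's code; the grid size, relaxation time, and periodic streaming are illustrative assumptions — a real cavity flow also needs moving-lid and bounce-back boundaries):

```python
import numpy as np

# D2Q9 lattice: discrete velocities and weights
c = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
w = np.array([4/9] + [1/9] * 4 + [1/36] * 4)

def equilibrium(rho, ux, uy):
    cu = c[:, 0, None, None] * ux + c[:, 1, None, None] * uy
    usq = ux**2 + uy**2
    return rho * w[:, None, None] * (1 + 3*cu + 4.5*cu**2 - 1.5*usq)

def lbgk_step(f, tau):
    rho = f.sum(axis=0)
    ux = (f * c[:, 0, None, None]).sum(axis=0) / rho
    uy = (f * c[:, 1, None, None]).sum(axis=0) / rho
    # SRT collision: relax all populations toward equilibrium with one tau
    f = f - (f - equilibrium(rho, ux, uy)) / tau
    # Streaming: shift each population along its lattice velocity (periodic)
    for i in range(9):
        f[i] = np.roll(np.roll(f[i], c[i, 0], axis=0), c[i, 1], axis=1)
    return f

# Start from rest with a small density perturbation; mass is conserved.
rho0 = 1.0 + 0.01 * np.random.default_rng(1).random((32, 32))
f = equilibrium(rho0, np.zeros((32, 32)), np.zeros((32, 32)))
mass0 = f.sum()
for _ in range(50):
    f = lbgk_step(f, tau=0.8)
```

The MRT model replaces the single `tau` with a diagonal relaxation matrix applied in moment space, which is what lets the relaxation parameters be tuned independently as the abstract describes.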

J.-S. Wu; Y.-L. Shao

2004-01-01

416

Parallel Computing Techniques for Large-Scale Reservoir Simulation of MultiComponent and Multiphase Fluid Flow  

Microsoft Academic Search

Massively parallel computing techniques can overcome limitations of problem size and space resolution for reservoir simulation on single-processor machines. This paper reports on our work to parallelize a widely used numerical simulator, known as TOUGH2, for nonisothermal flows of multicomponent, multiphase fluids in three-dimensional porous and fractured media. We have implemented the TOUGH2 package on a

K. Zhang; Y. S. Wu; C. Ding; K. Pruess; E. Elmroth

2001-01-01

417

A Parallel Implementation of the TOUGH2 Software Package for Large Scale Multiphase Fluid and Heat Flow Simulations  

Microsoft Academic Search

TOUGH2 is a widely used simulation package for solving groundwater flow related problems such as nuclear waste isolation, environmental remediation, and geothermal reservoir engineering. It solves a set of coupled mass and energy balance equations using a finite volume method. The parallel implementation first partitions the unstructured computational domain. For each time step, a set of coupled non-linear equations is

Erik Elmroth; Chris Ding; Yu-Shu Wu; Karsten Pruess

1999-01-01

418

A parallel implementation of the TOUGH2 software package for large scale multiphase fluid and heat flow simulations  

Microsoft Academic Search

TOUGH2 is a widely used simulation package for solving groundwater flow related problems such as nuclear waste isolation, environmental remediation, and geothermal reservoir engineering. It solves a set of coupled mass and energy balance equations using a finite volume method. The parallel implementation first partitions the unstructured computational domain. For each time step, a set of coupled non-linear equations

Erik Elmroth; Chris H. Q. Ding; Yu-Shu Wu; Karsten Pruess

1999-01-01

419

Parallel simulation of three-dimensional complex flows: Application to two-phase compressible flows and turbulent wakes  

Microsoft Academic Search

In this paper, we present parallel simulations of three-dimensional complex flows obtained on an ORIGIN 3800 computer and on homogeneous and heterogeneous (processors of different speeds and RAM) computational grids. The solver under consideration, which is representative of modern numerics used in industrial computational fluid dynamics (CFD) software, is based on a mixed element-volume method on unstructured tetrahedrisations. The

B. Koobus; S. Camarri; M. V. Salvetti; S. Wornom; A. Dervieux

2007-01-01

420

Three-dimensional gyrokinetic particle-in-cell simulation of plasmas on a massively parallel computer: LDRD Core Competency Project  

Microsoft Academic Search

One of the programs of the Magnetic Fusion Energy (MFE) Theory and Computations Program is studying the anomalous transport of thermal energy across the field lines in the core of a tokamak. We use the method of gyrokinetic particle-in-cell simulation in this study. For this LDRD project we employed massively parallel processing, new algorithms, and new formal

J. A. Byers; T. J. Williams; B. I. Cohen; A. M. Dimits

1994-01-01

421

Parallel Finite Element Particle-In-Cell Code for Simulations of Space-Charge Dominated Beam-Cavity Interactions.  

National Technical Information Service (NTIS)

Over the past years, SLAC's Advanced Computations Department (ACD) has developed the parallel finite element (FE) particle-in-cell code Pic3P (Pic2P) for simulations of beam-cavity interactions dominated by space-charge effects. As opposed to standard space...

A. C. Kabel; A. E. Candel; L. Lee; R. Uplenchwar; Y. K. Ko

2007-01-01

422

Life extension simulation of aged reactor pressure vessel material using probabilistic fracture mechanics analysis on a massively parallel computer  

Microsoft Academic Search

This paper describes a probabilistic fracture mechanics (PFM) computer program using the parallel Monte Carlo (MC) algorithm. In the stratified MC algorithm, a sampling space of probabilistic variables such as fracture toughness value, the depth and aspect ratio of an initial semi-elliptical surface crack is divided into a number of small cells. Fatigue crack growth simulations and failure judgements of

S. Yoshimura; M.-Y. Zhang; G. Yagawa

1995-01-01

423

A simulation study of robotic welding system with parallel and serial processes in the metal fabrication industry  

Microsoft Academic Search

This paper presents the usefulness of simulation in studying the impacts of system failures and delays on the output and cycle time of finished weldments produced by a robotic work cell having both serial and parallel processes. Due to multiple processes and overlapped activities, process mapping plays a significant role in building the model. The model replicates a non-terminating welding

Carl R. Williams; P. Chompuming

2002-01-01

424

ParCeL5/ParSSAP: A Parallel Programming Model and Library for Easy Development and Fast Execution of Simulations of Situated Multi-Agent Systems

Microsoft Academic Search

This paper introduces a new parallel programming model for situated multi-agent systems simulations and its parallel library implementation on shared memory MIMD parallel computers. The first goal is to allow users to easily implement situated multi-agent systems, following their natural paradigm: concurrent agent behavior definition and environment update programming. The second goal is to

Stephane Vialle; Eugen Dedu; Claude Timsit

425

A Parallel Programming Model and Library for Easy Development and Fast Execution of Simulations of Situated Multi-Agent Systems

Microsoft Academic Search

This paper introduces a new parallel programming model for situated multi-agent systems simulations and its parallel library implementation on shared memory MIMD parallel computers. The first goal is to allow users to easily implement situated multi-agent systems, following their natural paradigm: concurrent agent behavior definition and environment update programming. The second goal is to

Eugen Dedu; Claude Timsit

426

Mesoscale Simulations of Particulate Flows with Parallel Distributed Lagrange Multiplier Technique  

SciTech Connect

Fluid particulate flows are common phenomena in nature and industry. Modeling of such flows at micro and macro levels, as well as establishing relationships between these approaches, is needed to understand the properties of particulate matter. We propose a computational technique based on the direct numerical simulation of particulate flows. The numerical method is based on the distributed Lagrange multiplier technique following the ideas of Glowinski et al. (1999). Each particle is explicitly resolved on an Eulerian grid as a separate domain, using solid volume fractions. The fluid equations are solved through the entire computational domain; however, Lagrange multiplier constraints are applied inside the particle domain such that the fluid within any volume associated with a solid particle moves as an incompressible rigid body. Mutual forces for the fluid-particle interactions are internal to the system. Particles interact with the fluid via the fluid dynamic equations, resulting in implicit fluid-rigid-body coupling relations that produce realistic fluid flow around the particles (i.e., no-slip boundary conditions). The particle-particle interactions are implemented using explicit force-displacement interactions for frictional inelastic particles, similar to the DEM method of Cundall et al. (1979), with some modifications using the volume of the overlapping region as an input to the contact forces. The method is flexible enough to handle arbitrary particle shapes and size distributions. A parallel implementation of the method is based on the SAMRAI (Structured Adaptive Mesh Refinement Application Infrastructure) library, which allows handling of large numbers of rigid particles and enables local grid refinement. Accuracy and convergence of the presented method have been tested against known solutions for a falling sphere as well as by examining fluid flows through stationary particle beds (periodic and cubic packing). To evaluate code performance and validate the particle contact physics algorithm, we performed simulations of a representative experiment conducted at the University of California at Berkeley for pebble flow through a narrow opening.
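The force-displacement contact idea can be sketched with a linear spring-dashpot normal force between two spheres (an illustrative stand-in, not the paper's method: the abstract uses the overlap *volume* as input to the contact force, whereas this sketch uses the overlap distance, omits tangential friction, and the stiffness `kn` and damping `cn` are hypothetical values):

```python
import numpy as np

def contact_force(x1, x2, v1, v2, r1, r2, kn=1.0e4, cn=5.0):
    """Normal contact force on sphere 1 from sphere 2 (spring-dashpot sketch)."""
    d = np.asarray(x2, float) - np.asarray(x1, float)
    dist = np.linalg.norm(d)
    overlap = (r1 + r2) - dist
    if overlap <= 0.0:
        return np.zeros(3)                     # spheres not touching
    n = d / dist                               # unit normal from 1 toward 2
    vn = np.dot(np.asarray(v2, float) - np.asarray(v1, float), n)
    # Spring pushes sphere 1 away from sphere 2; the dashpot term adds
    # repulsion during approach (vn < 0) and so dissipates energy.
    return (-kn * overlap + cn * vn) * n
```

For two unit spheres whose centers are 1.8 apart, the overlap is 0.2 and the spring alone gives a repulsive force of magnitude `kn * 0.2` on each sphere, directed along the line of centers.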

Kanarska, Y

2010-03-24

427

Monte Carlo simulation of photoelectron energization in parallel electric fields: Electroglow on Uranus  

SciTech Connect

A Monte Carlo simulation of photoelectron energization and energy degradation in H{sub 2} gas in the presence of parallel electric fields has been carried out. Numerical yield spectra which contain information about the electron energy degradation process and can be used to calculate the yield for any inelastic event are obtained. The variation of yield spectra with incident electron energy, electric field, pitch angle, and cutoff limit has been studied. The yield function is employed to determine the photoelectron fluxes. H{sub 2} Lyman and Werner band excitation rates and integrated column intensity are computed for three different electric field profiles taking various low-energy cutoff limits. It is found that an electric field profile with peak value of 4 mV/m at neutral number density of 3{times}10{sup 10} cm{sup {minus}3} produces enhanced volume emission rates of H{sub 2} bands ({lambda} < 1100 {angstrom}) explaining about 20% of the observed electroglow emission on Uranus. The effect of solar zenith angle and solar cycle variation on peak excitation rate is discussed.

Singhal, R.P.; Bhardwaj, A. (Banaras Hindu Univ., Varanasi (India))

1991-09-01

428

Performance Evaluation of Lattice-Boltzmann Magnetohydrodynamics Simulations on Modern Parallel Vector Systems

SciTech Connect

The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to build high-end computing (HEC) platforms, primarily because of their generality, scalability, and cost effectiveness. However, the growing gap between sustained and peak performance for full-scale scientific applications on such platforms has become a major concern in high performance computing. The latest generation of custom-built parallel vector systems have the potential to address this concern for numerical algorithms with sufficient regularity in their computational structure. In this work, we explore two- and three-dimensional implementations of a lattice-Boltzmann magnetohydrodynamics (MHD) physics application on some of today's most powerful supercomputing platforms. Results compare performance between the vector-based Cray X1, Earth Simulator, and newly-released NEC SX-8, and the commodity-based superscalar platforms of the IBM Power3, Intel Itanium2, and AMD Opteron. Overall results show that the SX-8 attains unprecedented aggregate performance across our evaluated applications.

Carter, Jonathan; Oliker, Leonid

2006-01-09

429

Cosmic Ray Acceleration at Cosmological Shocks: Numerical Simulations of CR Modified Plane-Parallel Shocks  

NASA Astrophysics Data System (ADS)

In order to explore the cosmic ray acceleration at the cosmological shocks, we have performed numerical simulations of one-dimensional, plane-parallel, cosmic ray (CR) modified shocks with the newly developed CRASH (Cosmic Ray Amr SHock) numerical code. Based on the hypothesis that strong Alfvén waves are self-generated by streaming CRs, the Bohm diffusion model for CRs is adopted. The code includes a plasma-physics-based ``injection'' model that transfers a small proportion of the thermal proton flux through the shock into low energy CRs for acceleration there. We found that, for strong accretion shocks with Mach numbers greater than 10, CRs can absorb most of the shock kinetic energy and the accretion shock speed is reduced by up to 20%, compared to pure gas dynamic shocks. Although the amount of kinetic energy passed through accretion shocks is small, since they propagate into the low density intergalactic medium, they might possibly provide acceleration sites for ultra-high energy cosmic rays of E > 10^18 eV. For internal/merger shocks with Mach numbers less than 3, however, the energy transfer to CRs is only about 10-20% and so nonlinear feedback due to the CR pressure is insignificant. Considering that the intracluster medium (ICM) can be shocked repeatedly, however, the CRs generated by these weak shocks could be sufficient to explain the observed non-thermal signatures from clusters of galaxies.

Kang, Hyesung

2003-09-01

430

Parallel Higher-order Finite Element Method for Accurate Field Computations in Wakefield and PIC Simulations  

SciTech Connect

Over the past years, SLAC's Advanced Computations Department (ACD), under SciDAC sponsorship, has developed a suite of 3D (2D) parallel higher-order finite element (FE) codes, T3P (T2P) and Pic3P (Pic2P), aimed at accurate, large-scale simulation of wakefields and particle-field interactions in radio-frequency (RF) cavities of complex shape. The codes are built on the FE infrastructure that supports SLAC's frequency domain codes, Omega3P and S3P, to utilize conformal tetrahedral (triangular) meshes, higher-order basis functions and quadratic geometry approximation. For time integration, they adopt an unconditionally stable implicit scheme. Pic3P (Pic2P) extends T3P (T2P) to treat charged-particle dynamics self-consistently using the PIC (particle-in-cell) approach, the first such implementation on a conformal, unstructured grid using Whitney basis functions. Examples from applications to the International Linear Collider (ILC), Positron Electron Project-II (PEP-II), Linac Coherent Light Source (LCLS) and other accelerators will be presented to compare the accuracy and computational efficiency of these codes versus their counterparts using structured grids.

Candel, A.; Kabel, A.; Lee, L.; Li, Z.; Limborg, C.; Ng, C.; Prudencio, E.; Schussman, G.; Uplenchwar, R.; Ko, K.; /SLAC

2009-06-19

431

Parallel strong-strong/strong-weak simulations of beam-beam interaction in hadron accelerators  

SciTech Connect

In this paper, we present a parallel computational tool, BeamBeam3D, developed at Lawrence Berkeley National Laboratory, for strong-strong/strong-weak beam-beam modeling. This tool calculates self-consistently the electromagnetic beam-beam forces for arbitrary distributions during each collision when a strong-strong beam-beam interaction model is used. When a strong-weak model is used, the code has the option of using a Gaussian approximation for the strong beam. BeamBeam3D uses a multiple-slice model, so finite bunch length effects can be studied. The code also includes a Lorentz boost and rotation to treat collisions with finite collision crossing angle. It handles arbitrary closed-orbit separation (static or time dependent) and models long-range beam-beam interactions using a newly developed shifted Green function approach. It can also handle multiple interaction points using externally supplied linear maps between interaction points in the strong-weak model. The code has been used to study beam-beam effects in the RHIC, Tevatron, and LHC. In this paper we will describe the BeamBeam3D code, present example simulations, and describe the code performance.
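The Gaussian approximation mentioned for the weak-strong model has a well-known closed form for a round strong beam, sketched below (an illustration of the physics, not a BeamBeam3D API; the lumped strength `k`, which absorbs factors like 2*N*r0/gamma and the charge signs, is a hypothetical parameter):

```python
import numpy as np

def round_gaussian_kick(x, y, sigma, k):
    """Transverse kick on a weak-beam particle from a round Gaussian
    strong beam of rms size sigma (weak-strong model sketch).

    Near the axis the kick is linear in r (quadrupole-like); far from the
    axis it falls off as 1/r, like the field of a line charge.
    """
    r2 = np.asarray(x, float) ** 2 + np.asarray(y, float) ** 2
    with np.errstate(divide="ignore", invalid="ignore"):
        factor = np.where(r2 > 0.0,
                          k * (1.0 - np.exp(-r2 / (2.0 * sigma ** 2))) / r2,
                          k / (2.0 * sigma ** 2))   # r -> 0 limit
    return factor * x, factor * y
```

A flat (non-round) strong beam instead requires the complex-error-function (Bassetti-Erskine) formula, and the fully strong-strong model described in the abstract computes the fields self-consistently from the actual particle distribution.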

Qiang, Ji; Furman, Miguel; Ryne, Robert D.; Fischer, Wolfram; Sen, Tanaji; Xiao, Meiqin

2003-09-18

432

Multiscale modeling and simulation for polymer melt flows between parallel plates.  

PubMed

The flow behaviors of polymer melt composed of short chains with ten beads between parallel plates are simulated by using a hybrid method of molecular dynamics and computational fluid dynamics. Three problems are solved: creep motion under a constant shear stress and its recovery motion after removing the stress, pressure-driven flows, and the flows in rapidly oscillating plates. In the creep/recovery problem, the delayed elastic deformation in the creep motion and evident elastic behavior in the recovery motion are demonstrated. The velocity profiles of the melt in pressure-driven flows are quite different from those of a Newtonian fluid due to shear thinning. Velocity gradients of the melt become steeper near the plates and flatter at the middle between the plates as the pressure gradient increases and the temperature decreases. In the rapidly oscillating plates, the viscous boundary layer of the melt is much thinner than that of a Newtonian fluid due to the shear thinning of the melt. Three different rheological regimes, i.e., the viscous fluid, viscoelastic liquid, and viscoelastic solid regimes, form over the oscillating plate according to the local Deborah numbers. The melt behaves as a viscous fluid in a region where ωτR ≲ 1, and the crossover between the liquidlike and solidlike regimes takes place around ωτα ≈ 1 (where ω is the angular frequency of the plate and τR and τα are the Rouse and α relaxation times, respectively). PMID:20365855

Yasuda, Shugo; Yamamoto, Ryoichi

2010-03-08

433

A parallel contact detection algorithm for transient solid dynamics simulations using PRONTO3D  

NASA Astrophysics Data System (ADS)

An efficient, scalable, parallel algorithm for treating material surface contacts in solid mechanics finite element programs has been implemented in a modular way for multiple-instruction, multiple-data (MIMD) parallel computers. The serial contact detection algorithm that was developed previously for the transient dynamics finite element code PRONTO3D has been extended for use in parallel computation by utilizing a dynamic (adaptive) load balancing algorithm. This approach is scalable to thousands of computational nodes.

Attaway, S. W.; Hendrickson, B. A.; Plimpton, S. J.; Gardner, D. R.; Vaughan, C. T.; Brown, K. H.; Heinstein, M. W.

434

Parallelizing a Real-Time Steering Simulation for Computer Games with OpenMP  

Microsoft Academic Search

Future computer games need parallel programming to meet their ever growing hunger for performance. We report on our experiences in parallelizing the game-like C++ application OpenSteerDemo with OpenMP. To enable deterministic data-parallel processing of real-time agent steering behaviour, we had to change the high-level design, and refactor interfaces for explicit shared resource access. Our experience is summarized in

Bjoern Knafla; Claudia Leopold

2007-01-01

435

Numerical simulation and optimization on valve-induced water hammer characteristics for parallel pump feedwater system  

Microsoft Academic Search

In this study, the method of characteristics (MOC) was adopted to evaluate the valve-induced water hammer phenomena in a parallel pump feedwater system (PPFS) during the alternate startup process of parallel pumps. Based on closed physical and mathematical equations supplied with reasonable boundary conditions, a code was developed to compute the transient phenomena including the pressure wave vibration, local
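The interior-node update in a standard MOC water hammer scheme follows from the C+ and C- compatibility relations; a minimal sketch (illustrative only, not the PPFS code: `B` and `R` are the pipe characteristic impedance a/(gA) and a lumped friction coefficient, and the pump/valve/reservoir boundary nodes, which drive the actual transient, are left untouched):

```python
import numpy as np

def moc_interior_step(H, Q, B, R):
    """Advance head H [m] and flow Q [m^3/s] at interior pipe nodes by one
    MOC time step, with nodes spaced so that dx = a*dt (Courant condition).
    """
    # C+ characteristic arriving from node i-1, C- from node i+1
    Cp = H[:-2] + B * Q[:-2] - R * Q[:-2] * np.abs(Q[:-2])
    Cm = H[2:]  - B * Q[2:]  + R * Q[2:]  * np.abs(Q[2:])
    Hn, Qn = H.copy(), Q.copy()
    # Intersection of the two characteristics gives the new state
    Hn[1:-1] = 0.5 * (Cp + Cm)
    Qn[1:-1] = (Cp - Cm) / (2.0 * B)
    return Hn, Qn
```

A quick sanity check: a uniform, frictionless steady state (constant H and Q, R = 0) is reproduced exactly, since the C+ and C- relations then intersect back at the same point.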

Wenxi Tian; G. H. Su; Gaopeng Wang; Suizheng Qiu; Zejun Xiao

2008-01-01

436

Verification and Validation of Agent-based Scientific Simulation Models  

Microsoft Academic Search

Most formalized model verification and validation techniques come from industrial and system engineering for discrete-event system simulations. These techniques are widely used in computational science. The agent-based modeling approach is different from discrete event modeling approaches largely used in industrial and system engineering in many aspects. Since the agent-based modeling approach has recently become an attractive and

Xiaorong Xiang; Ryan Kennedy; Gregory Madey; Steve Cabaniss

2005-01-01

437

Dynamics analysis of a cable-driven parallel manipulator for hardware-in-the-loop dynamic simulation  

Microsoft Academic Search

This paper describes a preliminary study of the dynamics of a 6-DOF cable-driven parallel manipulator for a potential application in a ground-based hardware-in-the-loop simulator of microgravity dynamics and contact-dynamics of spacecraft or robotic systems. Two basic dynamics problems are studied. One is the inverse dynamics problem and the other is the rigidity and vibration problem. The study results support the

Ou Ma; Xiumin Diao

2005-01-01

438

Parallel and serial applications of the RETRAN-03 power plant simulation code using domain decomposition and Krylov subspace methods  

Microsoft Academic Search

High-fidelity simulation of nuclear reactor accidents such as the rupture of a main steam line in a pressurized water reactor (PWR) requires three-dimensional core hydrodynamics modeling because of the strong effect channel cross flow has on reactor kinetics. A parallel nested Krylov linear solver was developed and implemented in the RETRAN-03 reactor systems analysis code to make such high-fidelity core

T. J. Downar; J. Y. Wu; J. Steill; R. Janardhan

1997-01-01

439

GPU-Based Parallel Computing for the Simulation of Complex Multibody Systems with Unilateral and Bilateral Constraints: An Overview  

Microsoft Academic Search

This work reports on advances in large-scale multibody dynamics simulation facilitated by the use of the Graphics Processing Unit (GPU). A description of the GPU execution model along with its memory spaces is provided to illustrate its potential for parallel scientific computing. The equations of motion associated with the dynamics of large systems of rigid bodies are introduced and a solution

Alessandro Tasora; Dan Negrut; Mihai Anitescu

440

Design of a Parallel Mechanism Platform for Simulating Six Degrees-of-freedom General Motion Including Continuous 360-degree Spin  

Microsoft Academic Search

This paper presents a new six degree-of-freedom parallel mechanism platform, which can be used as a basis for general motion simulators. The unique feature of the platform is that it enables unlimited continuous 360-degree spin in any rotational axes plus finite X, Y, and Z-axis translation motion. The first part of the paper deals with the kinematic design issue of

Jongwon Kim; Young Man Cho; Frank C. Park; Jang Moo Lee

2003-01-01

441

Comparison of elastic and rigid blade-element rotor models using parallel processing technology for piloted simulations  

Microsoft Academic Search

A piloted comparison of rigid and aeroelastic blade-element rotor models was conducted at the Crew Station Research and Development Facility (CSRDF) at Ames Research Center. A simulation development and analysis tool, FLIGHTLAB, was used to implement these models in real time using parallel processing technology. Pilot comments and qualitative analysis performed both on-line and off-line confirmed that elastic degrees of

G. Hill; R. W. Du Val; J. A. Green; L. C. Huynh

1991-01-01

442

LPIC++ a parallel one-dimensional relativistic electromagnetic Particle-In-Cell code for simulating laser-plasma-interaction  

Microsoft Academic Search

We report on a recently developed electromagnetic relativistic 1D3V (one spatial, three velocity dimensions) Particle-In-Cell code for simulating laser-plasma interaction at normal and oblique incidence. The code is written in C++ and easy to extend. The data structure is characterized by the use of chained lists for the grid cells as well as particles belonging to one cell. The parallel

R. E. W. Pfund; R. Lichters; J. Meyer-Ter-Vehn

1998-01-01

443

PeliGRIFF, a parallel DEM-DLM\\/FD direct numerical simulation tool for 3D particulate flows  

Microsoft Academic Search

The problem of particulate flows at moderate to high concentration and finite Reynolds number is addressed by parallel direct numerical simulation. The present contribution is an extension of the work published in Computers & Fluids 38:1608 (2009), where systems of moderate size in a 2D geometry were examined. At the numerical level, the suggested method is inspired by the framework

Anthony Wachs

444

A two-dimensional numerical simulation of shock-enhanced mixing in a rectangular scramjet flowfield with parallel hydrogen injection  

NASA Astrophysics Data System (ADS)

The effect of shock impingement on the mixing and combustion of a reacting shear-layer is numerically simulated. Hydrogen fuel is injected at sonic velocity behind a backward facing step in a direction parallel to a supersonic freestream vitiated with H2O. The two-dimensional Navier-Stokes equations are solved and explicitly coupled to a chemistry 'package' employing a global, two-step combustion model. The results show that shock impingement enhances the mixing and combustion.

Domel, N. D.; Thompson, D. S.

1991-01-01

445

AMRSim: An Object-Oriented Performance Simulator for Parallel Adaptive Mesh Refinement.  

National Technical Information Service (NTIS)

Adaptive mesh refinement is complicated by both the algorithms and the dynamic nature of the computations. In parallel the complexity of getting good performance is dependent upon the architecture and the application. Most attempts to address the complexi...

2001-01-01

446

LUsim: A Framework for Simulation-Based Performance Modeling and Prediction of Parallel Sparse LU Factorization.  

National Technical Information Service (NTIS)

Sparse parallel factorization is among the most complicated and irregular algorithms to analyze and optimize. Performance depends both on system characteristics such as the floating point rate, the memory hierarchy, and the interconnect performance, as we...

P. Cicotti; S. B. Baden; X. Li

2008-01-01

447

A Parallel Adaptive Finite Element Method for the Simulation of Photon Migration with the Radiative-Transfer-Based Model  

PubMed Central

Whole-body optical molecular imaging of mouse models in preclinical research has been developing rapidly in recent years. In this context, it is essential to develop novel simulation methods of light propagation for optical imaging, especially when a priori knowledge, a large-volume domain, and a wide range of optical properties need to be considered in the reconstruction algorithm. In this paper, we propose a three-dimensional parallel adaptive finite element method with simplified spherical harmonics (SPN) approximation to simulate optical photon propagation in lar